7

I am doing a bit of research on hash functions. I understand the concept: it is a function that is easy to compute one way (you take the number 00011010, for example, and do reasonably simple math with it) but very difficult to reverse. I cannot, however, find any examples of what a one-way hash function would look like. I watched YouTube videos where they gave prime number multiplication as an analogy, and I searched Google, but I cannot find any examples of what an actual function would look like.

I have basic computer programming knowledge but am not an experienced coder.

3 Answers

13

It seems you're talking about cryptographic hash functions, where it's essential that you cannot easily construct any input that will have a given output. That is what "one-way function" means. Hash functions in general (e.g. those used for hash tables) do not have this requirement.

The easiest example of a cryptographic hash function is the Rabin function, modular squaring. It works like this:

  • Take your input as a number (any digital data can easily be interpreted as a binary number).
  • Square it.
  • Take the result modulo N (the remainder after dividing by N), where N is the product of two prime numbers and determines the length of your hash.

Let's use N = 4181.
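The three steps can be sketched in a few lines of Ruby. This is only an illustration: the names `bytes_to_int` and `rabin_hash` are made up for this example, and reading the bytes as one big-endian number is an assumed encoding, not something the scheme prescribes.

```ruby
# Toy modulus from the text. A real N would be hundreds of digits long.
TOY_N = 4181

# Step 1: interpret arbitrary data as one big number
# (assumption: bytes are read most-significant first).
def bytes_to_int(data)
  data.bytes.inject(0) { |acc, byte| (acc << 8) | byte }
end

# Steps 2 and 3: square the number, then keep the remainder modulo n.
def rabin_hash(x, n = TOY_N)
  (x * x) % n
end

digest = rabin_hash(bytes_to_int("Hello world!"))
puts digest # always a value between 0 and TOY_N - 1
```

Going forward takes one multiplication and one division, no matter how large the input is.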

I tell you the hash is 3666. Your job is to find X such that X^2 mod 4181 = 3666. How do you solve that?

You can of course brute force it by checking whether 3666 is a square number, then 4181 + 3666, then 4181*2 + 3666, then 4181*3 + 3666, and so on, but that is going to take forever once N is large.
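A brute-force search can be sketched like this. It is a hedged illustration only: `rabin_preimages` is a made-up name, and instead of testing 3666, 4181 + 3666, ... for squareness it simply tries every X below N, which finds the same answers. For the toy N = 4181 a computer finishes at once, but the loop runs N times, so it stops being practical when N has hundreds of digits.

```ruby
# Try every candidate x in 0...n and keep those whose square
# leaves the remainder `target` when divided by n.
def rabin_preimages(target, n)
  (0...n).select { |x| (x * x) % n == target }
end

solutions = rabin_preimages(3666, 4181)
puts solutions.inspect
```

The cost is what matters here: one squaring to compute the hash, versus up to N squarings to undo it this way.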

You can do some serious math, and find out that you can find a solution quickly if you know the prime factors of N. But you don't, and finding the prime factors for a large number (in a real scenario N would be much larger) also takes forever.

7

All hash functions are one-way. Hash functions map a larg(er) (potentially infinite) input space into a small(er) (usually finite) output space.

If you are familiar with the Pigeonhole Principle, this should immediately tell you that hash functions must be one-way. If you are not familiar with the Pigeonhole Principle, here is a very simple explanation:

The Pigeonhole Principle states that you cannot put 3 socks in 2 drawers without having at least one drawer with at least 2 socks in it.

So, if you have a function that maps a larger input space into a smaller output space, then you will have at least two inputs that map to the same output, ergo, you cannot reverse a hash function.
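The sock argument can be checked mechanically. The sketch below uses a hypothetical helper, `first_collision`, that takes any hash function as a block and reports the first two inputs sharing an output; the "3 socks, 2 drawers" numbers are the ones from the explanation above.

```ruby
# Return the first pair of distinct inputs that hash to the same value,
# or nil if every input lands in its own drawer.
def first_collision(inputs)
  seen = {} # maps hash value -> first input that produced it
  inputs.each do |x|
    h = yield(x)
    return [seen[h], x] if seen.key?(h)
    seen[h] = x
  end
  nil
end

socks = [0, 1, 2]
drawers = 2
pair = first_collision(socks) { |sock| sock % drawers }
puts pair.inspect # => [0, 2]
```

Given only the drawer number, there is no way to say which of the two socks was put in.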

A very simple example of a hash function that does not use any advanced math is this parity function:

# Parity "hash": n is a natural number; returns 0 if n is even, 1 if odd.
def parity_hash(n)
  if n.even?
    0
  else
    1
  end
end

As you can see, it maps a large input space (the natural numbers) into a small output space (the set {0, 1}). And it is one-way: if I tell you that the result is 1, you can't tell me what the input was.

Jörg W Mittag
  • 104,619
4

Here's a simple example:

A hash of the string "Hello world!" is "Hel". If you're given "Hel", you cannot recreate "Hello world!", and yet it is likely not going to clash with many other strings.

Admittedly, this hash isn't very good because if this were a password, knowing the first three letters makes it a lot easier to brute force the original password.
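As a rough sketch, the "first three letters" hash is a one-liner. The name `prefix_hash` is invented for this example.

```ruby
# Keep only the first three characters of the input.
def prefix_hash(text)
  text[0, 3]
end

puts prefix_hash("Hello world!") # => Hel
puts prefix_hash("Helium")       # => Hel
```

The second call shows a clash: both strings produce "Hel", so the output alone cannot tell you which one went in.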

So what if we multiplied each letter value by 3 mod 26?

H (7) * 3 -> V (21)
e (4) * 3 -> m (12)
l (11) * 3 -> h (33 mod 26 = 7)

Now our hash is "Vmh". Granted, you could reverse this, but without knowing that it was multiplied by 3, this already becomes a bit trickier. For a computer this is trivial, but imagine multiplying against enormous prime numbers. It would make finding a pattern virtually impossible, and you'd have to dedicate long computation hours calculating possible values and trying them out.

Converting it to "Vmh" was a trivial matter, but restoring to "Hel" isn't. This is exactly what we want from a hash.

If the user provides the string "Hello world!", without having to save the original string, we can simply apply the hash to "Hello world!" and obtain "Vmh" and then compare that string to the one we have on file.

"Vmh" === "Vmh"  // Bingo!

And in a nutshell this is what hashing is. There are various techniques, but the concept is ultimately the same. Deterministically create an irreversible string of data from an input.
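Here is a small sketch of the toy scheme from this answer, assuming letters are numbered a = 0 through z = 25 and case is preserved. The names `toy_hash` and `matches?` are made up, and leaving non-letters unchanged is my own assumption, since the example only covers letters.

```ruby
# Keep the first three characters, then replace each letter value v
# with (v * 3) mod 26.
def toy_hash(text)
  text[0, 3].each_char.map do |ch|
    next ch unless ch.match?(/[A-Za-z]/) # assumption: non-letters pass through
    base = ch.match?(/[A-Z]/) ? 'A'.ord : 'a'.ord
    ((ch.ord - base) * 3 % 26 + base).chr
  end.join
end

# Check an attempt against the stored hash; the original string is never stored.
def matches?(attempt, stored_hash)
  toy_hash(attempt) == stored_hash
end

# H (7) -> V (21), e (4) -> m (12), l (11) -> 33 mod 26 = 7 -> h
stored = toy_hash("Hello world!")
puts stored                           # => Vmh
puts matches?("Hello world!", stored) # => true
puts matches?("Goodbye", stored)      # => false
```

Two caveats about this toy: any string starting with "Hel" also matches, and the multiplication step on its own is easy to undo, because 3 * 9 = 27 = 1 mod 26, so multiplying by 9 maps "Vmh" back to "Hel". The only information that is truly lost is whatever came after the third letter.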

Neil
  • 22,848