How can I write a sha1 hash function from scratch?

Excerpt§
Learn how to write a SHA1 hash function from scratch and understand the process behind it. This blog post explains each step in detail and provides examples for better understanding.

Introduction§

SHA1 or SHA-1 is a popular one-way cryptographic hash function used widely in security applications. Though the algorithm is complex, it is educational and insightful to understand the core steps involved in implementing a basic SHA1 hash function from scratch. In this post, we will walk through the key stages of developing a SHA1 hashing program in a simplified manner.

Understanding SHA1§

SHA1 takes an arbitrary length input message and produces a 160-bit hash value. Key properties:

Small changes in input lead to significant changes in output
Infeasible to generate original input from hash
Low chance of collisions for different inputs

The SHA1 algorithm operates on binary data and consists of 6 main steps:

Convert input to binary
Pad the input
Break into 512-bit blocks
Initialize variables
Perform hash computation
Output final 160-bit hash

Let’s look at each step:

Step 1: Converting Input to Binary§

The input string is first converted to a binary bit string. This can be done by:

ASCII encoding - converting characters to 8-bit ASCII values
Unicode encoding - converting characters to 16-bit Unicode code points

For example, the string “Hello” becomes

10100100001100101011011000110110001101111
text

in ASCII binary form.

Step 2: Padding the Message§

The binary message is padded to ensure its length is a multiple of 512 bits. Padding involves:

Appending a single 1 bit to message
Adding 0 bits until message is 64 bits shy of next 512 multiple
Appending a 64-bit message length value at the end

So “Hello” with its ASCII binary is padded to become

101001000011001010110110001101100011011110000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001010000001010100100000011001010
text

Step 3: Breaking the Message into Blocks§

The padded message is then broken into 512-bit blocks. If the padded length is 1048 bits, we get two 512-bit blocks:

1Block 1: 0100100001100101011011000110110001101111000000100000000000000000000000000000000000000000000000000000000
2
3Block 2: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001010000001010100100000011001010
text

Step 4: Initializing Variables and Constants§

SHA1 uses five 32-bit variables initialized as:

1var h0 := 0x67452301
2var h1 := 0xEFCDAB89
3var h2 := 0x98BADCFE
4var h3 := 0x10325476
5var h4 := 0xC3D2E1F0
text

It also uses constants generated from the fractional parts of the cube root of primes from 2 to 79.

Step 5: Hash Computation§

This involves four rounds of processing the message blocks:

Message Scheduling - Creating 80 expanded message words
Compression Function - Conducting logical operations usingExpanded message, constants and hash values.
Updating Hash Value - Adding output of compressionfunction to initial hash values.
Output - Converting final hash values to hexadecimal to generate 160-bit hash.

The compression and update steps are repeated for each 512-bit block.

Step 6: Generating the Final Hash§

After all blocks are processed, the final 160-bit hash value is produced by concatenating the 5 updated 32-bit hash variables h0 to h4.

For “Hello”, the final SHA1 hash is

1f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0
text

An free online tool to quickly verify your answers

SHA1 Hash Generator

Generate secure SHA1 hash values for your sensitive data with our user-friendly online tool.

Try Now

Conclusion§

Implementing the SHA1 algorithm from scratch provides valuable insights into its internal workings. By following the essential steps of preprocessing, initialization, hash computation and output generation, we can produce the final SHA1 digest for any input. Learning these cryptographic concepts and methods prepares us for tackling more advanced hash functions down the road.