
- Hash functions are a basic building block of cryptography: they map each input string to an output string (the hash). The input string can be of any length (even the empty string works), but a given hash algorithm always produces a hash of a fixed size. It is also deterministic, i.e. passing the same string as input to the hash function always generates the same hash.
- The fundamental characteristic of a hash function is that it is one-way traffic, i.e. one can get the hash from the input string, but one should not be able to get the input string back from the generated hash.
- This is where hashing differs from a general encryption scheme: encrypted data is meant to be decrypted again with the right key, whereas after hashing the data we cannot retrieve it from the hash.
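- A good hash function should, in practice, produce a distinct value for each input it is given. Because the output has a fixed size while the inputs are unbounded, collisions must exist, so the real requirement is that finding two inputs with the same hash is computationally infeasible. In addition it should be deterministic, i.e. the same hash is generated each time the same input is provided (the hash for a given input doesn't change).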
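The properties above (fixed output size, determinism, different inputs giving different hashes) can be checked with a small sketch. It assumes Python's standard `hashlib` module, which the notes themselves do not mention; `md5_hex` is just a helper name chosen for this example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Inputs of very different lengths, including the empty string.
inputs = [b"", b"a", b"a" * 10_000]
digests = [md5_hex(x) for x in inputs]

for x, d in zip(inputs, digests):
    # The hash length stays the same no matter how long the input is.
    print(f"input length {len(x):5d} -> hash {d} ({len(d)} hex characters)")

# Determinism: hashing the same input twice gives the same hash.
print(md5_hex(b"abc") == md5_hex(b"abc"))
```

Note that nothing here recovers the input from the hash; the library only offers the forward direction.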
Hexadecimal Notation
- Hexadecimal takes less space than binary (one character stands for 4 bits), and hence more data can be fitted into the same amount of space.
- One could pack even more bits into each character to make the string shorter still, but then we run out of human-readable characters.
- Hexadecimal is thus a compromise between the length of the string and the number of bits each character represents.
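To make the trade-off concrete, here is a small sketch that writes the same 128 bits in three notations. Base64 is my own addition for comparison and is not discussed in the notes; the sample value is the MD5 hash of the empty string.

```python
import base64

# 16 bytes = 128 bits (the MD5 hash of the empty string).
digest = bytes.fromhex("d41d8cd98f00b204e9800998ecf8427e")

as_binary = "".join(f"{byte:08b}" for byte in digest)  # 1 bit per character
as_hex = digest.hex()                                  # 4 bits per character
as_base64 = base64.b64encode(digest).decode("ascii")   # 6 bits per character

for name, text in [("binary", as_binary), ("hex", as_hex), ("base64", as_base64)]:
    print(f"{name:>6}: {len(text):3d} characters")
```

The more bits each character carries, the shorter the string, but the larger the alphabet a reader has to deal with.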
MD5 Hash Algorithm
- It generates a 128-bit hash from an arbitrary input string, however long the input data is. Although this hash has been shown to be broken (practical collision attacks are known), it is still widely used to check file integrity and the like.
- It takes the plain text or input in multiples of 512-bit blocks, each of which is further divided into 16 blocks of 32 bits, and produces a 128-bit hash string composed of four 32-bit sub-parts.
Algorithm 🙂 :

The main algorithm of the MD5 hash works on blocks of 512 bits and then proceeds with its main processing part (nothing is actually encrypted, since a hash cannot be reversed). Hence, after dividing the given string into blocks of 512 bits (there might be several such blocks, or none), we pad the leftover string so that the total length finally becomes a multiple of 512 bits.
- The detailed algorithm is explained below :
- Add the Padding : For the leftover string, append a single 1 bit to the end of the string, then append 0 bits until the length of the string is congruent to 448 modulo 512, i.e. we add 0's until the length of the resulting string is 64 bits less than a multiple of 512. The remaining 64 bits are reserved for the length.
- Append the length : Append the 64-bit representation of the length (in bits) of the original string, modulo 2^64. We now get a string that can be broken into blocks of 512 bits.
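The two padding steps above can be sketched as follows. This is a minimal illustration, assuming the message is a whole number of bytes; `md5_pad` is a name made up for this example, and the little-endian layout of the length field follows RFC 1321 rather than anything stated in these notes.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad `message` so that its length is a multiple of 64 bytes (512 bits)."""
    bit_length = (8 * len(message)) % 2**64   # original length in bits, modulo 2^64
    padded = message + b"\x80"                # a single 1 bit followed by seven 0 bits
    # Add 0 bytes until the length is 56 bytes (448 bits) modulo 64 bytes (512 bits).
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Append the length as a 64-bit little-endian integer (assumption: RFC 1321 layout).
    padded += struct.pack("<Q", bit_length)
    return padded


for n in (0, 3, 55, 56, 64):
    print(f"{n:2d} input bytes -> {len(md5_pad(b'a' * n))} padded bytes")
```

Padding is always added, even when the input is already a multiple of 512 bits, which is why a 64-byte input grows to 128 bytes.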
- Initialising The Buffers : After dividing the entire string into 512-bit blocks, we initialize 4 different buffers of 32 bits each as :

| A | 01 | 23 | 45 | 67 |
| B | 89 | ab | cd | ef |
| C | fe | dc | ba | 98 |
| D | 76 | 54 | 32 | 10 |
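A short sketch of how these byte values turn into 32-bit numbers, assuming the bytes are listed low-order byte first (which is how RFC 1321 presents them). The dictionary names are only for this example.

```python
# The four buffers, written as the byte sequences from the table above.
buffer_bytes = {
    "A": bytes.fromhex("01234567"),
    "B": bytes.fromhex("89abcdef"),
    "C": bytes.fromhex("fedcba98"),
    "D": bytes.fromhex("76543210"),
}

# Interpreting each sequence as a little-endian 32-bit integer
# (assumption: the lowest-order byte comes first).
buffer_words = {name: int.from_bytes(raw, "little") for name, raw in buffer_bytes.items()}

for name, word in buffer_words.items():
    print(f"{name} = 0x{word:08x}")
```

Read as little-endian words, the values look reversed compared to the table, e.g. A becomes `0x67452301`.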
- We define the helper functions F, G, H and I, which are used to process the 32-bit buffers, as :
Helper Functions
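For reference, the four helper functions as RFC 1321 specifies them can be written as below. Each takes the three 32-bit buffers B, C and D and works bit by bit, with $\land$, $\lor$, $\lnot$ and $\oplus$ standing for bitwise AND, OR, NOT and XOR.

$$
\begin{aligned}
F(B, C, D) &= (B \land C) \lor (\lnot B \land D) \\
G(B, C, D) &= (B \land D) \lor (C \land \lnot D) \\
H(B, C, D) &= B \oplus C \oplus D \\
I(B, C, D) &= C \oplus (B \lor \lnot D)
\end{aligned}
$$

For example, F acts as a bitwise "if B then C else D": wherever a bit of B is 1 the output takes the bit of C, otherwise the bit of D.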
- Processing the 512-bit blocks :
- In total there are 64 operations, with each function being used in 16 of them. Basically we can say that there are 4 rounds (one for each of the functions F, G, H and I), and each function is applied 16 times.
- We have a predefined set of 64 keys (constants), each of which is used in the operation with the corresponding number.
- The input block of 512 bits is divided into 16 parts of 32 bits each, and each part takes part in one sub-round of every round, as described ahead.
- After every sub-round (one of the 16 sub-rounds that happen for a function) there is an intermixing of the vectors A, B, C and D: the vector B is fed into vector C, C into D, D into A, and A, after some operations (which are specified below), is fed into B for the next sub-round.
- The function specific to the round (F, G, H or I) is applied to B, C and D, and its result is added to the vector A (modular addition, modulo 2^32). The sum then undergoes further modular additions with the key for that operation number (1 - 64) and the message part selected for that sub-round (1 - 16).
- The result is then rotated left (a circular left shift) by the amount specified for that operation, added to vector B (modular addition) and finally fed into the vector B, and the cycle goes on for further rounds of computation.
One sub-round operation
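Putting the steps together, the whole procedure can be sketched in Python as below. This is an illustrative sketch, not a reference implementation: the names `md5_sketch`, `rotl`, `SHIFTS` and `K` are made up for this example, and the shift amounts, the formula for the 64 keys and the order in which message parts are picked are taken from RFC 1321, since the notes above do not list them.

```python
import math
import struct

MASK = 0xFFFFFFFF  # keeps every value within 32 bits (modular addition)

# Left-rotation amounts for the 64 operations (values from RFC 1321).
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4

# The 64 keys: K[i] = floor(2^32 * |sin(i + 1)|), with i + 1 in radians.
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    x &= MASK
    return ((x << n) | (x >> (32 - n))) & MASK


def md5_sketch(message: bytes) -> str:
    # Padding: one 1 bit, 0 bits up to 448 mod 512, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) % 2**64)

    # Initial buffers A, B, C, D.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    for offset in range(0, len(padded), 64):
        # Split the 512-bit block into 16 little-endian 32-bit parts.
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:    # round 1: function F
                f, g = (b & c) | (~b & d), i
            elif i < 32:  # round 2: function G
                f, g = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:  # round 3: function H
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:         # round 4: function I
                f, g = c ^ (b | ~d), (7 * i) % 16
            # Add A, the key and the message part, all modulo 2^32.
            f = (f + a + K[i] + m[g]) & MASK
            # Rotate, add B, and shuffle: D -> A, C -> D, B -> C, result -> B.
            a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
        # Add the block's result to the running buffers.
        a0, b0, c0, d0 = (a0 + a) & MASK, (b0 + b) & MASK, (c0 + c) & MASK, (d0 + d) & MASK

    # The hash is A, B, C, D written out low-order byte first.
    return struct.pack("<4I", a0, b0, c0, d0).hex()


print(md5_sketch(b"abc"))
```

For anything beyond learning, Python's built-in `hashlib.md5` is the better choice; this version only exists to mirror the steps described above.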

Hold up, next coming is the SHA-256 algorithm broken down into easyyyyy steps !!!!!!! 😎