– Hashing algorithm is another name for the hash functions or subroutines that are used for mapping the data sets of fixed length and data sets of variable length.

– For example, the name of an object is of variable length and it could be hashed or mapped to an integer which is of fixed size.

**The hashing algorithms return a value which could range from the following:**

- Hash values
- Hash codes
- Hash sums
- Checksums
- Hashes

– Hashing algorithms are used wherever it is required to compare things such as in searching some item in data base and table look ups.

– These algorithms are used in the detection of the similar or duplicate records in the same file and sometimes might also be used in searching similar patterns in DNA sequences etc.

– Hashing algorithms have a predefined size and are one way algorithms.

– If it was possible to decry-pt the hashing algorithms, then a reversible function would have been possible.

– If such a function was possible, then result could be found using brute force technique.

**– Stability or referential transparency is one of the major characteristics of a good hashing algorithm.**

– For example, if the two equal inputs are evaluated, the result should be the same both the times.

– A number of programming languages offer certain constructs for allowing the users to override the hash functions of an object i.e., the hash codes for the two equal objects must be the same.

– This is necessary especially when an element has to be looked up in the hash table in short time.

– This is so because the same slot will be hashed by both the same elements.

**– Collisions are a common problem whenever hashing algorithms** try to map smaller set of data with a larger set of data.

– Hashing algorithms try to keep mapping as even as possible and collisions become more frequent as the table once fills up.

– Because of this issue, 80 percent of the table size has been set as the limit of the single digit hash values.

– Several properties such as linear probing, double hashing etc. might be required depending up on the algorithm being used.

– Even though this topic emerged in 1950, it still continues to be an area of research.

**Hashing algorithms might be confused with a lot many topics such as:**

- Checksums
- Check digits
- Finger prints
- Randomization functions
- Error correcting codes
- Cryptographic hash functions

– All these concepts are overlapping to some degree.

– Still they have their requirements and purpose and there designing and optimization is done differently.

– Hash functions are mostly used by the hash tables for the quick location of the data records provided its search key.

**– The search key is also called the headword.**

– The search key is mapped to an index by the hash function.

– The position of the record is given by the index.

– Hash tables are then used in the implementation of the dynamic sets and associative arrays.

– Usually, the size of the domain of the hash function is greater than the size of its range and so a number of keys are mapped to the same index.

– Thus, associated with slot of the hash table is a set of records and not just one record.

– This is why the slots of the table have been named as bucket and there values as bucket indices.

**– The purpose of the hash function is to state the location.**

– A good hashing algorithm will provide a narrow searching of 1 or 2 entries in the table.

– An even cache for large data is built using the hashing algorithms.

– The cache thus formed is simpler than the hash table.

– Here, the collision can be avoided since the older item can be rewritten or discarded.

– The same mechanism is applicable in the file comparison tasks.

– Bloom filters cannot function without hashing algorithm.

## Leave a Reply