Hashing Algorithm
- A hash in computer science is an algorithm that performs a complex calculation on some source data.
- The intention is to produce a much smaller value.
- The final value can only be produced from a limited number of inputs.
Pigeon Hole Principle
- Because the hash is smaller than the original data it should be clear that there are more possible sources than there are possible hashes.
- This idea is known as the pigeonhole principle.
Check Digit
- A check digit is produced by a hashing algorithm
- It is used to confirm that the data being processed has been received correctly
- Used extensively in data capture devices such as barcode scanners.
Checksum Hash
- Similar to a check digit but for much bigger data sets
- Downloads that must be correct
- A longer number is produced
- This can be downloaded and compared to a locally produced checksum
Hash Tables
- When data is stored in an organised way, such as a database table.
- A hash can be used to determine where in the file data should be stored.
- To determine where to write the data.
- To determine where the data should be in order to read it.
Other uses of Hashing
- Bloom filters
- Encryption
- Finding duplicate or similar data