TITLE

A FUZZY HASHING APPROACH BASED ON RANDOM SEQUENCES AND HAMMING DISTANCE

AUTHOR(S)
Breitinger, Frank; Baler, Harald
PUB. DATE
January 2012
SOURCE
Proceedings of the Conference on Digital Forensics, Security & L;2012, p89
SOURCE TYPE
Conference Proceeding
DOC. TYPE
Article
ABSTRACT
Hash functions are well-known methods in computer science to map arbitrary large input to bit strings of a fixed length that serve as unique input identifier/fingerprints. A key property of cryptographic hash functions is that even if only one bit of the input is changed the output behaves pseudo randomly and therefore similar files cannot be identified. However, in the area of computer forensics it is also necessary to find similar files (e.g. different versions of a file), wherefore we need a similarity preserving hash function also called fuzzy hash function. In this paper we present a new approach for fuzzy hashing called bbHash. It is based on the idea to 'rebuild' an input as good as possible using a fixed set of randomly chosen byte sequences called building blocks of byte length l (e.g. l=128). The proceeding is as follows: slide through the input byte-by-byte, read out the current input byte sequence of length l, and compute the Hamming distances of all building blocks against the current input byte sequence. Each building block with Hamming distance smaller than a certain threshold contributes the file's bbHash. We discuss (dis-)advantages of our bbHash to further fuzzy hash approaches. A key property of bbHash is that it is the first fuzzy hashing approach based on a comparison to external data structures.
ACCESSION #
82899982

 

Related Articles

  • Hyper Hasher.  // Network Dictionary;2007, p234 

    An encyclopedia entry about "Hyper Hasher" is presented. It is a shareware program developed by Matt LaPlante. Its objective is to allow the use to compute digital files hashes and checksums in more formats than any other computer program.

  • Symbolic Gray Code as a Perfect Multiattribute Hashing Scheme for Partial Match Queries. Chang, C.C.; Lee, R.C.T.; Du, M.W. // IEEE Transactions on Software Engineering;May82, Vol. 8 Issue 3, p235 

    In this paper. we shall show that the symbolic Gray code hashing mechanism is not only good for best matching, but also good for partial match queries. Essentially, we shall propose a new hashing scheme, called bucket-oriented symbolic Gray code, which can be used to produce any arbitrary...

  • Cryptanalysis of the Tillich-Zémor Hash Function. Grassl, Markus; Ilić, Ivana; Magliveras, Spyros; Steinwandt, Rainer // Journal of Cryptology;Winter2011, Vol. 24 Issue 1, p148 

    CRYPTO '94, Tillich and Zémor proposed a family of hash functions, based on computing a suitable matrix product in groups of the form $SL_{2}(\mathbb{F}_{2^{n}})$. We show how to construct collisions between palindromic bit strings of length 2 n+2 for Tillich and Zémor's construction. The...

  • Efficient Cryptographic key Generation from Fingerprint using Symmetric Hash Functions. Nandini, C.; Shylaja, B. // International Journal of Research & Reviews in Computer Science;Aug2011, Vol. 2 Issue 4, p937 

    Biometrics gives a lot of methods in high-secure applications while using natural, user-friendly and fast authentication. Cryptography is intended to ensure the secrecy and authenticity of message. The cryptographic key will be long, so it is difficult to remember, protecting the confidentiality...

  • EXPLORING FORENSIC IMPLICATIONS OF THE FUSION DRIVE. Gupta, Shruti; Rogers, Marcus // Journal of Digital Forensics, Security & Law;2014, Vol. 9 Issue 2, p145 

    This paper explores the forensic implications of Apple's Fusion Drive. The Fusion Drive is an example of auto-tiered storage. It uses a combination of a ash drive and a magnetic drive. Data is moved between the drives automatically to maximize system performance. This is different from...

  • On-line Ciphers and the Hash-CBC Constructions. Bellare, M.; Boldyreva, A.; Knudsen, L.; Namprempre, C. // Journal of Cryptology;Oct2012, Vol. 25 Issue 4, p640 

    We initiate a study of on-line ciphers. These are ciphers that can take input plaintexts of large and varying lengths and will output the i th block of the ciphertext after having processed only the first i blocks of the plaintext. Such ciphers permit length-preserving encryption of a data...

  • Cryptanalysis of SHA-0 and Reduced SHA-1. Biham, Eli; Chen, Rafi; Joux, Antoine // Journal of Cryptology;Winter2015, Vol. 28 Issue 1, p110 

    We present new techniques for the cryptanalysis of hash functions. Our contributions are two-fold: both on the search level of the compression function and on the meta-structure. The former led to the neutral bits technique, while the latter led to the multi-block technique. The usefulness of...

  • TESTING FRAMEWORK FOR MOBILE DEVICE FORENSICS TOOLS. Anobah, Maxwell; Saleem, Shahzad; Popov, Oliver // Journal of Digital Forensics, Security & Law;2014, Vol. 9 Issue 2, p221 

    The proliferation of mobile communication and computing devices, in particular smart mobile phones, is almost paralleled with the increasing number of mobile device forensics tools in the market. Each mobile forensics tool vendor, on one hand claims to have a tool that is best in terms of...

  • Digital Forensics. Garfinkel, Simson L. // American Scientist;Sep/Oct2013, Vol. 101 Issue 5, p370 

    The article looks at digital forensics, or the study of evidence from electronic equipment with digital data storage including computers and cell phones in the context of criminal or civil-law investigations, as of 2013. It discusses best-practice standards in areas such as collection and...

Share

Read the Article

Courtesy of THE LIBRARY OF VIRGINIA

Sorry, but this item is not currently available from your library.

Try another library?
Sign out of this library

Other Topics