Document Type


Publication Date


Subject: LCSH

Local area networks (Computer networks)--Traffic, Cyber forensics, Computer forensics, Hashing (Computer science)


Computer Engineering | Computer Sciences | Electrical and Computer Engineering | Forensic Science and Technology | Information Security


Hash functions are established and well known in digital forensics, where they are commonly used for proving integrity and for file identification (i.e., hashing all files on a seized device and comparing the fingerprints against a reference database). However, with respect to the latter operation, an active adversary can easily overcome this approach because traditional hashes are designed to be sensitive to any change in the input: the output changes significantly if even a single bit is flipped. Therefore, researchers developed approximate matching, a rather new and less prominent area that was conceived as a more robust counterpart to traditional hashing. Since the conception of approximate matching, the community has constructed numerous algorithms, extensions, and additional applications for this technology, and is still working on novel concepts to improve the status quo. In this survey article, we conduct a high-level review of the existing literature from a non-technical perspective and summarize the existing body of knowledge in approximate matching, with special focus on bytewise algorithms. Our contribution gives researchers and practitioners an overview of the state of the art of approximate matching so that they may understand the capabilities and challenges of the field. In short, we present the terminology, use cases, classification, requirements, testing methods, algorithms, applications, and a list of primary and secondary literature.
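
The single-bit sensitivity mentioned in the abstract can be illustrated with a minimal sketch. It flips one bit of a sample input and compares the SHA-256 digests. It then computes a toy byte n-gram Jaccard similarity for the same pair of inputs. The toy measure is an assumption made purely for illustration and is not one of the approximate matching algorithms surveyed in the article. The sample data, function names, and the n-gram length of 4 are likewise hypothetical.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def ngram_set(data: bytes, n: int = 4) -> set:
    """Collect every distinct run of n consecutive bytes."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def toy_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of byte n-gram sets (illustrative stand-in only)."""
    sa, sb = ngram_set(a, n), ngram_set(b, n)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample input: 1024 bytes, then the same input with one bit flipped.
original = bytes(range(256)) * 4
modified = flip_bit(original, 4000)

digest_original = hashlib.sha256(original).digest()
digest_modified = hashlib.sha256(modified).digest()

# The cryptographic digests no longer match, so an exact-match lookup fails.
print("digests equal:", digest_original == digest_modified)
print("digest bits that differ (of 256):",
      differing_bits(digest_original, digest_modified))

# The toy similarity score still reports the two inputs as nearly identical.
print("toy n-gram similarity:", round(toy_similarity(original, modified), 3))
```

The contrast is the point: an exact-match database lookup fails on the modified input, while a similarity-based comparison can still relate it to the original.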


Copyright (c) 2016 Journal of Digital Forensics, Security and Law. This work is licensed under a Creative Commons Attribution 4.0 International License.

Dr. Baggili was appointed to the University of New Haven's Elder Family Endowed Chair in 2015.


Publisher Citation

Harichandran, Vikram S., Frank Breitinger, and Ibrahim Baggili. "Bytewise Approximate Matching: The Good, The Bad, and The Unknown." Journal of Digital Forensics, Security and Law, 11, no. 2 (2016): 59-78.
