Cyber forensics, Computer forensics, Hashing (Computer science)
Computer Engineering | Computer Sciences | Electrical and Computer Engineering | Forensic Science and Technology
Hash functions are widespread in computer science and have a wide range of applications, such as ensuring integrity in cryptographic protocols, structuring database entries (hash tables) or identifying known files in forensic investigations. Besides their cryptographic requirements, a fundamental property of hash functions is efficient and easy computation, which is especially important in digital forensics due to the large amount of data that needs to be processed when working on cases. In this paper, we relate the runtime efficiency of common hashing algorithms (MD5, SHA family) to their implementations. Our empirical comparison covers C with OpenSSL, Python, Ruby and Java on Windows and Linux, and C with the WinCrypto API on Windows. The purpose of this paper is to recommend appropriate programming languages and libraries for coding tools that include intensive hashing processes. In each programming language, we compute the MD5, SHA-1, SHA-256 and SHA-512 digests on datasets from 2 MB to 1 GB. For each language, algorithm and data size, we perform multiple runs and compute the average elapsed time. In our experiment, we observed that OpenSSL and languages utilizing OpenSSL (Python and Ruby) perform better across all the hashing algorithms and data sizes on Windows and Linux. However, on Windows, the performance of Java (Oracle JDK) and C with WinCrypto is comparable to OpenSSL and better for SHA-512.
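The measurement procedure described in the abstract (hash a buffer with each algorithm, repeat several times, average the elapsed time) can be illustrated with a minimal Python sketch. This is not the authors' benchmark harness: the function names `average_hash_time` and `benchmark`, the run count, the small buffer sizes, and the zero-filled in-memory data are all assumptions chosen to keep the example short, whereas the paper reports datasets from 2 MB to 1 GB.

```python
import hashlib
import time

# Algorithms named in the abstract, using hashlib's identifiers.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def average_hash_time(algorithm, data, runs=5):
    """Hash `data` `runs` times; return (mean seconds per run, hex digest)."""
    total = 0.0
    digest = ""
    for _ in range(runs):
        start = time.perf_counter()
        digest = hashlib.new(algorithm, data).hexdigest()
        total += time.perf_counter() - start
    return total / runs, digest


def benchmark(sizes_mb=(2, 4), runs=3):
    """Return {(algorithm, size in MB): mean seconds} for each combination.

    The sizes and run count are illustrative assumptions, not the paper's setup.
    """
    results = {}
    for size_mb in sizes_mb:
        # Zero-filled buffer as stand-in data (assumption; content does not
        # change the amount of work these algorithms perform).
        data = bytes(size_mb * 1024 * 1024)
        for name in ALGORITHMS:
            mean_seconds, _ = average_hash_time(name, data, runs)
            results[(name, size_mb)] = mean_seconds
    return results


if __name__ == "__main__":
    for (name, size_mb), seconds in benchmark().items():
        print(f"{name:>7} {size_mb:>5} MB  {seconds:.4f} s")
```

Timing only the digest call on data already held in memory keeps disk I/O out of the measurement. A tool that hashes files from storage would see different absolute numbers.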
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Gurjar, Satyendra; Baggili, Ibrahim; Breitinger, Frank; and Fischer, Alice E., "An Empirical Comparison of Widely Adopted Hash Functions in Digital Forensics: Does the Programming Language and Operating System Make a Difference?" (2015). Electrical & Computer Engineering and Computer Science Faculty Publications. 30.
Gurjar, S., Baggili, I., Breitinger, F. and Fischer, A. (2015). An empirical comparison of widely adopted hash functions in digital forensics: does the programming language and operating system make a difference? Proceedings of the Conference on Digital Forensics, Security and Law, CDFSL 2015, Daytona Beach, Fla., May 19-21, 2015, pp. 57-68.