A new edit distance for fuzzy hashing applications.


Similarity-preserving hashing applications, also known as fuzzy hashing functions, help to analyse the content of digital devices by performing a resemblance comparison between different files. In practice, similarity matching is a two-step process: first a signature is generated for each of the files under comparison, and then the signatures themselves are compared. Even though ssdeep is the best-known application in this field, the edit distance algorithm that ssdeep uses to compare signatures is not well suited to certain scenarios. In this contribution we present a new edit distance algorithm that better reflects the similarity of two strings and that fuzzy hashing applications can use to improve their results.
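To make the second step of that process concrete, the following is a minimal sketch of a signature comparison based on an edit distance. It uses the classic unit-cost Levenshtein distance purely for illustration: it is neither the algorithm used by ssdeep nor the new algorithm proposed in the paper. The function names and the 0-100 score normalisation are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic unit-cost Levenshtein distance (illustrative stand-in only)."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if ca == cb else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def similarity_score(sig_a: str, sig_b: str) -> float:
    """Map the distance between two signatures to a 0-100 score.

    The normalisation by the longer signature is an assumption made
    for this sketch, not a formula taken from the paper.
    """
    longest = max(len(sig_a), len(sig_b))
    if longest == 0:
        return 100.0  # two empty signatures are treated as identical
    return 100.0 * (1 - levenshtein(sig_a, sig_b) / longest)
```

In a real fuzzy hashing tool the two arguments would be the signatures produced in the first step, not the raw file contents.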

Publication type: 
Published in: 
The 2015 World Congress in Computer Science, Computer Engineering, and Applied Computing, The 2015 International Conference on Security and Management, WORLDCOMP'15-SAM'15. Las Vegas, Nevada, USA, pp. 326-332
Publication date: 
July 2015
CeDInt Authors: 
Other Authors: 
V. Gayoso Martínez, F. Hernández Álvarez, L. Hernández Encinas