From a4e5e5e97400585c3c1b34a0bee23d963dcf2b5b Mon Sep 17 00:00:00 2001
From: Vasan Dilaksan
Date: Fri, 12 Sep 2025 16:05:21 +0400
Subject: [PATCH 1/3] Docs: Rework README.md for improved readability

---
 README    | 110 ------------------------------------------------------
 README.md | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+), 110 deletions(-)
 delete mode 100644 README
 create mode 100644 README.md

diff --git a/README b/README
deleted file mode 100644
index 5ea6112..0000000
--- a/README
+++ /dev/null
@@ -1,110 +0,0 @@
-
-**** FUZZY HASHING API ****
-
-This file documents the fuzzy hashing API. Information on how to use the
-fuzzy hashing program ssdeep can be found in the man page. On *nix
-systems you can view this file with:
-
-$ man ./ssdeep.1
-
-Windows users can get the ssdeep usage information from README.TXT.
-
-
-** Using the API in Your Own Progrms **
-
-You can use the fuzzy hashing API in your own programs by doing
-the following:
-
-1. Include the fuzzy hashing header
-
-#include
-
-
-2. Call one of the functions:
-
-* Fuzzy hashing a buffer of text:
-
-int fuzzy_hash_buf(const unsigned char *buf,
-                   uint32_t buf_len,
-                   char *result);
-
-This function computes the fuzzy hash of the buffer 'buf' and stores the
-result in result. You MUST allocate result to hold FUZZY_MAX_RESULT
-characters before calling this function. The length of the buffer should
-be passed in via buf_len. It is the user's responsibility to append the
-filename, if any, to the output. The function returns zero on success,
-one on error.
-
-
-* Fuzzy hashing a file:
-
-There are in fact two ways to fuzzy hash a file. If you already
-have an open file handle you can use:
-
-int fuzzy_hash_file(FILE *handle,
-                    char *result);
-
-This function computes the fuzzy hash of the file pointed to by handle
-and stores the result in result. You MUST allocate result to hold
-FUZZY_MAX_RESULT characters before calling this function.
It is the
-user's responsibility to append the filename to the output.
-The function returns zero on success, one on error.
-
-The other function to hash a file takes a file name:
-
-int fuzzy_hash_filename(const char * filename,
-                        char * result);
-
-Like the function above, this function stores the fuzzy hash result
-in the parameter result. You MUST allocate result to hold
-FUZZY_MAX_RESULT characters before calling this function.
-
-
-* Compare two fuzzy hash signatures:
-
-int fuzzy_compare(const char *sig1, const char *sig2);
-
-This function returns a value from 0 to 100 indicating the match
-score of the two signatures. A match score of zero indicates the \
-signatures did not match.
-
-
-3. Compile
-
-To compile the program using gcc:
-
-  $ gcc -Wall -I/usr/local/include -L/usr/local/lib sample.c -lfuzzy
-
-Using mingw:
-
-  C:\> gcc -Wall -Ic:\path\to\includes sample.c fuzzy.dll
-
-Using Microsoft Visual C (MSVC):
-
-To paraphrase the MinGW documentation,
-http://www.mingw.org/mingwfaq.shtml#faq-msvcdll:
-
-The Windows ssdeep package includes a Win32 DLL and a .def file. Although
-MSVC users can't use the DLL directly, they can easily create a .lib file
-using the Microsoft LIB tool:
-
-  C:\> lib /machine:i386 /def:fuzzy.def
-
-You can then compile your program using the resulting library:
-
-  C:\> cl sample.c fuzzy.lib
-
-
-
-** Sample Program **
-
-A sample program that uses the API is in sample.c.
-
-
-
-** See Also **
-
-- Jesse D. Kornblum, "Identifying almost identical files using context
-triggered piecewise hashing", Digital Investigaton, 3(S):91-97,
-September 2006, http://dx.doi.org/10.1016/j.diin.2006.06.015,
-The Proceedings of the 6th Annual Digital Forensic Research Workshop
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..a475f96
--- /dev/null
+++ b/README.md
@@ -0,0 +1,103 @@
+# FUZZY HASHING API
+
+This file documents the fuzzy hashing API.
Information on how to use the fuzzy hashing program ssdeep can be found in the man page. On *nix systems you can view this file with:
+
+```bash
+$ man ./ssdeep.1
+```
+
+Windows users can get the ssdeep usage information from `README.TXT`.
+
+---
+
+## Using the API in Your Own Programs
+
+You can use the fuzzy hashing API in your own programs by doing the following:
+
+### 1. Include the fuzzy hashing header
+
+```c
+#include
+```
+
+### 2. Call one of the functions:
+
+#### Fuzzy hashing a buffer of text:
+
+```c
+int fuzzy_hash_buf(const unsigned char *buf,
+                   uint32_t buf_len,
+                   char *result);
+```
+
+This function computes the fuzzy hash of the buffer `buf` and stores the result in `result`. You **MUST** allocate `result` to hold `FUZZY_MAX_RESULT` characters before calling this function. The length of the buffer should be passed in via `buf_len`. It is the user's responsibility to append the filename, if any, to the output. The function returns zero on success, one on error.
+
+#### Fuzzy hashing a file:
+
+There are in fact two ways to fuzzy hash a file. If you already have an open file handle you can use:
+
+```c
+int fuzzy_hash_file(FILE *handle,
+                    char *result);
+```
+
+This function computes the fuzzy hash of the file pointed to by `handle` and stores the result in `result`. You **MUST** allocate `result` to hold `FUZZY_MAX_RESULT` characters before calling this function. It is the user's responsibility to append the filename to the output. The function returns zero on success, one on error.
+
+The other function to hash a file takes a file name:
+
+```c
+int fuzzy_hash_filename(const char * filename,
+                        char * result);
+```
+
+Like the function above, this function stores the fuzzy hash result in the parameter `result`. You **MUST** allocate `result` to hold `FUZZY_MAX_RESULT` characters before calling this function.
+
+#### Compare two fuzzy hash signatures:
+
+```c
+int fuzzy_compare(const char *sig1, const char *sig2);
+```
+
+This function returns a value from 0 to 100 indicating the match score of the two signatures. A match score of zero indicates the signatures did not match.
+
+### 3. Compile
+
+#### To compile the program using gcc:
+
+```bash
+$ gcc -Wall -I/usr/local/include -L/usr/local/lib sample.c -lfuzzy
+```
+
+#### Using mingw:
+
+```cmd
+C:\> gcc -Wall -Ic:\path\to\includes sample.c fuzzy.dll
+```
+
+#### Using Microsoft Visual C (MSVC):
+
+To paraphrase the MinGW documentation, http://www.mingw.org/mingwfaq.shtml#faq-msvcdll:
+
+The Windows ssdeep package includes a Win32 DLL and a .def file. Although MSVC users can't use the DLL directly, they can easily create a .lib file using the Microsoft LIB tool:
+
+```cmd
+C:\> lib /machine:i386 /def:fuzzy.def
+```
+
+You can then compile your program using the resulting library:
+
+```cmd
+C:\> cl sample.c fuzzy.lib
+```
+
+---
+
+## 💻 Sample Program
+
+A sample program that uses the API is in `sample.c`.
+
+---
+
+## 📚 See Also
+
+Jesse D. Kornblum, "Identifying almost identical files using context triggered piecewise hashing", Digital Investigation, 3(S):91-97, September 2006, http://dx.doi.org/10.1016/j.diin.2006.06.015, The Proceedings of the 6th Annual Digital Forensic Research Workshop
\ No newline at end of file
From 56f83982d2feac6c7fa21430ec0cfd388bdb0c95 Mon Sep 17 00:00:00 2001
From: Vasan Dilaksan
Date: Fri, 12 Sep 2025 18:38:16 +0400
Subject: [PATCH 2/3] Fix: Address collaborator feedback on README

---
 README.md => README | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
 rename README.md => README (98%)

diff --git a/README.md b/README
similarity index 98%
rename from README.md
rename to README
index a475f96..8e2a916 100644
--- a/README.md
+++ b/README
@@ -2,7 +2,7 @@
 
 This file documents the fuzzy hashing API.
Information on how to use the fuzzy hashing program ssdeep can be found in the man page. On *nix systems you can view this file with:
 
-```bash
+```console
 $ man ./ssdeep.1
 ```
 
@@ -64,13 +64,13 @@ This function returns a value from 0 to 100 indicating the match score of the tw
 
 #### To compile the program using gcc:
 
-```bash
+```console
 $ gcc -Wall -I/usr/local/include -L/usr/local/lib sample.c -lfuzzy
 ```
 
 #### Using mingw:
 
-```cmd
+```console
 C:\> gcc -Wall -Ic:\path\to\includes sample.c fuzzy.dll
 ```
 
@@ -80,13 +80,13 @@ To paraphrase the MinGW documentation, http://www.mingw.org/mingwfaq.shtml#faq-m
 
 The Windows ssdeep package includes a Win32 DLL and a .def file. Although MSVC users can't use the DLL directly, they can easily create a .lib file using the Microsoft LIB tool:
 
-```cmd
+```console
 C:\> lib /machine:i386 /def:fuzzy.def
 ```
 
 You can then compile your program using the resulting library:
 
-```cmd
+```console
 C:\> cl sample.c fuzzy.lib
 ```
 
From 351c5cae51244d957b9f9ec46a44d72e29680cb6 Mon Sep 17 00:00:00 2001
From: Vasan Dilaksan
Date: Sat, 13 Sep 2025 10:10:31 +0400
Subject: [PATCH 3/3] Fix: Add missing newline at end of file

---
 README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README b/README
index 8e2a916..9b7f036 100644
--- a/README
+++ b/README
@@ -100,4 +100,4 @@ A sample program that uses the API is in `sample.c`.
 
 ## 📚 See Also
 
-Jesse D. Kornblum, "Identifying almost identical files using context triggered piecewise hashing", Digital Investigation, 3(S):91-97, September 2006, http://dx.doi.org/10.1016/j.diin.2006.06.015, The Proceedings of the 6th Annual Digital Forensic Research Workshop
\ No newline at end of file
+Jesse D. Kornblum, "Identifying almost identical files using context triggered piecewise hashing", Digital Investigation, 3(S):91-97, September 2006, http://dx.doi.org/10.1016/j.diin.2006.06.015, The Proceedings of the 6th Annual Digital Forensic Research Workshop.
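
Note on the patched README: it says that appending the filename to the hash output is the caller's responsibility, and that a `fuzzy_compare` score of zero means the signatures did not match. Below is a minimal sketch of those two caller-side steps. Both helpers (`format_hash_line`, `signatures_match`) are hypothetical and not part of the fuzzy hashing API, and the `hash,"filename"` layout is an assumption about the desired output rather than something this patch specifies.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT part of the fuzzy hashing API.
 * Writes hash,"filename" into out (layout is an assumption).
 * Returns 0 on success and 1 on error, mirroring the
 * zero-on-success convention the README describes. */
int format_hash_line(char *out, size_t out_len,
                     const char *hash, const char *filename)
{
    if (out == NULL || hash == NULL || filename == NULL || out_len == 0)
        return 1;

    int written = snprintf(out, out_len, "%s,\"%s\"", hash, filename);

    /* snprintf reports the length it needed; a value at or beyond
     * out_len means the line did not fit and was truncated. */
    if (written < 0 || (size_t)written >= out_len)
        return 1;
    return 0;
}

/* Hypothetical helper: per the README, a score of zero means the
 * signatures did not match; any positive score counts as a match. */
int signatures_match(int score)
{
    return score > 0;
}
```

In a real program, `hash` would be the `result` buffer filled by `fuzzy_hash_buf`, `fuzzy_hash_file`, or `fuzzy_hash_filename`, and `score` would be the value returned by `fuzzy_compare`.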