auditor

auditor is a forensic tool for fast integrity auditing that uses cryptographic hash functions.

It is similar to other popular tools (fsum, hashdeep, sha256sum, etc.), but with features to make digital data auditing simpler and faster.

auditor supports several hash algorithms. The default is sha256, which has been recommended by NIST since 2015 (see NIST Policy on Hash Functions). The thash method is enabled by default, but it can be disabled.

Usage

With auditor installed on your system, you can use the subcommands hash, check and info, described below. See the Manual section for all options. A manual in Portuguese is also available.

hash is the first step of forensic auditing. It generates audit files that allow verifying file integrity.

Basic usage of subcommand hash

auditor hash input_path

This will:

  1. Hash ALL files in input_path with default algorithm (sha256)
  2. Generate audit files:
    • Audit_FullList: contains the integrity data (hash, size, name) of the files in input_path. Default name: _auditor_hashes.txt
    • Audit_Stamp: contains only the integrity data of Audit_FullList. Default name: _auditor_stamp.txt. This is the file that needs to be printed or digitally signed so that the whole integrity chain can be checked later.
  3. Show integrity data of Audit_FullList
  4. After hash, you can run the check command to verify integrity. To make future integrity checks trustworthy, follow the advice in Important!
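
As a rough illustration of steps 1 and 2 above, the following Python sketch hashes the files in a folder, writes a full list, and then writes a stamp that holds the integrity data of that list. It is not auditor's implementation: it uses the normal (non-thash) method with sha256, only looks at files directly inside the folder, and assumes the line layout described under Output Formats. The function names sha256_of, audit_line and write_audit_files are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_line(path, relative_name):
    """Build one line: hash_value ?ALGORITHM|file_size*relative_filepath."""
    return f"{sha256_of(path)} ?SHA256|{os.path.getsize(path)}*{relative_name}"


def write_audit_files(root, full_name="_auditor_hashes.txt",
                      stamp_name="_auditor_stamp.txt"):
    """Write a full list for the files directly under root, then a stamp for it."""
    skip = {full_name, stamp_name}  # assumption: audit files are not listed
    names = sorted(n for n in os.listdir(root)
                   if n not in skip and os.path.isfile(os.path.join(root, n)))
    full_path = os.path.join(root, full_name)
    with open(full_path, "w", encoding="utf-8", newline="\n") as out:
        for name in names:
            out.write(audit_line(os.path.join(root, name), name) + "\n")
    # The stamp is a single line describing the full list itself.
    stamp = audit_line(full_path, full_name)
    with open(os.path.join(root, stamp_name), "w", encoding="utf-8", newline="\n") as out:
        out.write(stamp + "\n")
    return stamp
```

The point of the two levels is that any change to a data file changes the full list, and any change to the full list changes the stamp, so only the short stamp has to be printed or signed.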



Other examples:

  1. Overwrite audit files (-o), use blockSize 10MB and 'whirlpool' hash function

    auditor hash input_path -o -b 10MB -a whirlpool
  2. Overwrite audit files (-o), disable 'thash method' (-d), and use blake3 hash function (-a)

    auditor hash input_path -o -d -a blake3
  3. Use 100 workers (-w 100), overwrite audit files (-o), use blockSize 10MB (-b 10MB) and 'whirlpool' hash function (-a whirlpool)

    auditor hash input_path -o -b 10MB -a whirlpool -w 100 
  4. Overwrite audit files (-o), include only txt files (-i "**/*.txt") only in root folder (-u 1)

    auditor hash input_path -o -i "**/*.txt" -u 1
  5. Just generate hashes, but don’t create any files

    auditor hash input_path -l

See the Manual section for all options!


check is the second step of forensic auditing. It checks the integrity of the data using the information in the audit files, and can be used after hash has been performed.

Basic usage of subcommand check

auditor check input_path

This will:

  1. Recalculate integrity data of Audit_FullList and check against audit file Audit_Stamp
  2. Check integrity of Audit_FullList against all original files in input_path
  3. Show the recalculated integrity data of Audit_FullList to be compared with the original one created with hash.
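
A minimal Python sketch of how one listed file could be checked against its recorded size and hash is shown here. It is an illustration only, not auditor's code: it assumes sha256 with the normal (non-thash) method, and the function name verify_file is hypothetical. The size is compared first, so a file whose size differs is rejected without being hashed.

```python
import hashlib
import os


def verify_file(path, expected_hash, expected_size, chunk_size=1024 * 1024):
    """Return True if the file matches the recorded size and SHA-256 digest."""
    # Cheap test first: a size mismatch already proves the file changed.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # Hex digests are compared case-insensitively.
    return digest.hexdigest() == expected_hash.lower()
```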

Other examples:

  1. Check in quiet mode (-q) and stop on first error (-x), using default audit files.

    auditor check input_path -q -x
  2. Check F:\data_path using audit files with specific names (-f for <Audit_FullList> and -s for <Audit_Stamp>)

    auditor check F:\data_path -f C:\other_path\personal_fullList.txt -s C:\other_path\personal_stamp.txt -q -x 
  3. Check integrity of just one file in <input_path> against some audit file

    auditor check F:\data_path\file1.txt -f C:\other_path\some_audit.txt -q -x 

See the Manual section for all options!



info does not perform a hash integrity check. It only tests the audit files and the content of input_path, and shows useful information. It can be used after hash has been performed.

Example of Forensic Info

auditor info input_path

This will:

  1. Verify that the audit files Audit_Stamp and Audit_FullList exist.
  2. Verify that the files listed in Audit_Stamp and Audit_FullList exist and that each listed size matches the size found in input_path.
  3. Verify that all files in input_path are listed in Audit_FullList.
  4. Recalculate the integrity data of Audit_FullList and check it against the audit file Audit_Stamp.
  5. Report the size of each file and the total size of input_path.
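
The synchronization tests in steps 2, 3 and 5 can be pictured with a short Python sketch. It is a simplified stand-in, not what auditor runs: the function name sync_report is hypothetical, the listed entries are passed in as a dictionary of relative path to size, paths are assumed to use "/" as separator, and the two default audit files are assumed not to be listed themselves.

```python
import os


def sync_report(root, listed,
                audit_names=("_auditor_hashes.txt", "_auditor_stamp.txt")):
    """Compare {relative_path: size} entries with the files found under root."""
    on_disk = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if rel not in audit_names:  # assumption: audit files are skipped
                on_disk[rel] = os.path.getsize(full)
    return {
        # listed in the audit file but no longer present
        "missing": sorted(p for p in listed if p not in on_disk),
        # present, but with a different size than recorded
        "size_mismatch": sorted(p for p in listed
                                if p in on_disk and on_disk[p] != listed[p]),
        # present on disk but never listed
        "unlisted": sorted(p for p in on_disk if p not in listed),
        "total_size": sum(on_disk.values()),
    }
```

No file content is read here, which mirrors the statement that info does not check integrity.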


Manual

Usage

Usage: auditor.exe <SUBCOMMAND> [OPTIONS]

To see version: auditor.exe --version or auditor.exe -V

To see help: auditor.exe --help or auditor.exe -h

SUBCOMMAND:

One of the following subcommands:

  • hash: Hashes <input_path> and generates audit files (<Audit_FullList>/<Audit_Stamp>)
  • check: Checks the integrity of <input_path> against data in audit files (<Audit_FullList>/<Audit_Stamp>)
  • info: Only tests if audit files and <input_path> are synchronized. This doesn't check the integrity!

Subcommand hash

Hashes <input_path> and generates audit files (<Audit_FullList>/<Audit_Stamp>)

Usage: auditor.exe hash <INPUT_PATH> [OPTIONS]

Arguments:

  • <INPUT_PATH>: Path to the data that requires integrity assurance (it will be hashed).

Options:

  • -n, --audit-basename <AUDIT_BASENAME>: If specified, changes the <audit_basename> of the audit files. Use --help for details. [default: _auditor_hashes.txt]
  • -f, --audit-full <AUDIT_FULLLIST_FILE>: If specified, uses this whole path as <Audit_FullList>, which can be anywhere. Use --help for details.
  • -s, --audit-stamp <AUDIT_STAMP_FILE>: If specified, uses this whole path as <Audit_Stamp>, which can be anywhere. Use --help for details.
  • -o, --overwrite-audit-files: Enables Overwrite mode, which will delete existing audit files and create new ones.
  • -b, --block <BLOCKSIZE>: BlockSize for thash mode. Use a number followed by KB, MB, GB or TB, or use auto. With auto, the block size for files larger than 50MB is the file size divided by the number of workers. Ex: 10MB. [default: 50MB]
  • -a, --alg-hash <ALGORITHM_TO_HASH>: Algorithms to hash: sha256, sha512, whirlpool, blake3, k12 (kangarootwelve), sha3_256, sha3_512, keccak256, keccak512. (sha256 and sha512 are recommended by NIST at release date of this version, --help for more info) [default: sha256]
  • -d, --disable-thash: Disables 'thash method' mode. This forces files to be hashed with the normal method, which can be significantly slower for big files. See http://thash.org to learn more.
  • -k, --no-stamp: Don’t create the <Audit_Stamp> (but will create the <Audit_FullList>).
  • -l, --no-audit: Just calculate hashes, but don’t create the <Audit_FullList> or the <Audit_Stamp>.
  • -u, --max-depth <MAX_RECURSIVE_DIR_DEPTH>: Maximum recursive directory depth level. 1: Only one level (current dir), 2: Two levels, etc. Default: '0' (infinite, no limit).
  • -i, --include-glob-pattern <INCLUDE_GLOB_PATTERN>...: Include only files that match the glob patterns. If not used, all files are included. Can be used multiple times. Enclose each pattern in " characters. Examples: "**/*.txt" "**/*.{txt,doc}" "**/*file1*".
  • -e, --exclude-glob-pattern <EXCLUDE_GLOB_PATTERN>...: Exclude files that match the glob pattern. It is applied to the included files. Can be used multiple times. Enclose each pattern in " characters. Examples: "**/*.txt" "**/*.{txt,doc}" "**/*file1*".
  • -x, --stop: Stops and fails immediately on any error
  • -p, --ignore-permissions-errors: Proceed when encountering access permission errors with folders or files. If the '--stop' flag is used without this flag, such errors will cause the operation to fail.
  • -q, --quiet: Runs the program in quiet mode
  • -w, --n-workers <N_WORKERS>: Number of worker threads. Default is number of cores on your computer.
  • -c, --n-max-concur <N_MAX_CONCUR>: Maximum number of concurrent accesses to the same file.
  • -z, --fast-disk: Use with media where concurrent disk access is very fast, such as SSDs, to improve performance.
  • -v, --verbosity: Increases the verbosity level. -v: minimum, -vv: some, -vvv: many
  • -h, --help: Print help (see more with '--help')
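
To make the --block values concrete, here is a small Python sketch of how a size such as 10MB could be turned into bytes, and how the auto rule described above could be computed. Several points are assumptions, since the manual does not state them: decimal units (1 MB = 1,000,000 bytes), rounding up when dividing by the number of workers, and the function names parse_block_size and auto_block_size. What auto does for files of 50MB or less is not documented here, so the sketch returns None in that case.

```python
import re

# Assumption: decimal units; the tool may use powers of 1024 instead.
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
AUTO_THRESHOLD = 50 * UNITS["MB"]


def parse_block_size(text):
    """Turn a value such as '10MB' into a byte count; return None for 'auto'."""
    if text.lower() == "auto":
        return None
    match = re.fullmatch(r"(\d+)(KB|MB|GB|TB)", text.upper())
    if match is None:
        raise ValueError(f"invalid block size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def auto_block_size(file_size, n_workers):
    """Block size under 'auto' for files above the 50MB threshold, else None."""
    if file_size <= AUTO_THRESHOLD:
        return None  # behaviour for smaller files is not described in the manual
    return -(-file_size // n_workers)  # ceiling division (rounding is an assumption)
```

For example, under these assumptions an 800 MB file hashed with 8 workers would get blocks of 100 MB, so each worker handles one block.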

Subcommand check

Checks the integrity of <input_path> against data in audit files (<Audit_FullList>/<Audit_Stamp>)

Usage: auditor.exe check <INPUT_PATH> [OPTIONS]

Arguments:

  • <INPUT_PATH>: Path to the directory that contains the audit files and the original data to check the entire integrity chain.

Options:

  • -n, --audit-basename <AUDIT_BASENAME>: If specified, changes the <audit_basename> of the audit files. Use --help for details. [default: _auditor_hashes.txt]
  • -f, --audit-full <AUDIT_FULLLIST_FILE>: If specified, uses this whole path as <Audit_FullList>, which can be anywhere. Use --help for details.
  • -s, --audit-stamp <AUDIT_STAMP_FILE>: If specified, uses this whole path as <Audit_Stamp>, which can be anywhere. Use --help for details.
  • -k, --no-stamp: Does not check <Audit_Stamp>; only checks the file hashes in <Audit_FullList>.
  • --strict: Checks that <input_path> matches the audit files exactly. Any warning or error causes the check to fail.
  • -u, --max-depth <MAX_RECURSIVE_DIR_DEPTH>: Maximum recursive directory depth level. 1: Only one level (current dir), 2: Two levels, etc. Default: '0' (infinite, no limit).
  • -i, --include-glob-pattern <INCLUDE_GLOB_PATTERN>...: Include only files that match the glob patterns. If not used, all files are included. Can be used multiple times. Enclose each pattern in " characters. Examples: "**/*.txt" "**/*.{txt,doc}" "**/*file1*".
  • -e, --exclude-glob-pattern <EXCLUDE_GLOB_PATTERN>...: Exclude files that match the glob pattern. It is applied to the included files. Can be used multiple times. Enclose each pattern in " characters. Examples: "**/*.txt" "**/*.{txt,doc}" "**/*file1*".
  • -x, --stop: Stops and fails immediately on any error
  • -p, --ignore-permissions-errors: Proceed when encountering access permission errors with folders or files. If the '--stop' flag is used without this flag, such errors will cause the operation to fail.
  • -q, --quiet: Runs the program in quiet mode
  • -w, --n-workers <N_WORKERS>: Number of worker threads. Default is number of cores on your computer.
  • -c, --n-max-concur <N_MAX_CONCUR>: Maximum number of concurrent accesses to the same file.
  • -z, --fast-disk: Use with media where concurrent disk access is very fast, such as SSDs, to improve performance.
  • -v, --verbosity: Increases the verbosity level. -v: minimum, -vv: some, -vvv: many
  • -h, --help: Print help (see more with '--help')

Subcommand info

Only tests if audit files and <input_path> are synchronized. This doesn't check the integrity!

Usage: auditor.exe info [OPTIONS] <INPUT_PATH>

Arguments:

  • <INPUT_PATH>: Path to the directory that contains the audit files and the original data.

Options:

  • -n, --audit-basename <AUDIT_BASENAME>: If specified, changes the <audit_basename> of the audit files. Use --help for details. [default: _auditor_hashes.txt]
  • -f, --audit-full <AUDIT_FULLLIST_FILE>: If specified, uses this whole path as <Audit_FullList>, which can be anywhere. Use --help for details.
  • -s, --audit-stamp <AUDIT_STAMP_FILE>: If specified, uses this whole path as <Audit_Stamp>, which can be anywhere. Use --help for details.
  • -k, --no-stamp: Does not check <Audit_Stamp>; only checks the file hashes in <Audit_FullList>.
  • -h, --help: Print help (see more with '--help')

About the integrity of data

To make sure the whole chain of integrity can be checked reliably in the future, you should:

  1. Save all data, including the audit files, and either print the contents of Audit_Stamp or digitally sign this file. If you don't do this, someone can simply change the data and generate new audit files.
  2. In the future, when someone performs a check, the content of the audit file Audit_Stamp MUST BE the same as the printed or digitally signed version from step 1. If it does not match, the integrity check is not valid.

If you don't have a digital certificate, you can use a free timestamping authority to sign the file online, such as freetsa.org (using Online Signature).
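
The comparison required in step 2 can be done by eye, but a small Python sketch shows the idea for a trusted copy kept as text (for example, typed back from the printed version). This is an illustration under assumptions, not part of auditor: the names normalize and stamp_matches are hypothetical, and ignoring line-ending differences and trailing spaces is a choice made here because a printed copy cannot preserve them.

```python
def normalize(text):
    """Unify line endings and drop trailing whitespace on every line."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip("\n")


def stamp_matches(stamp_path, trusted_text):
    """Return True if the current Audit_Stamp content equals the trusted copy."""
    # newline="" keeps the original line endings so normalize() handles them.
    with open(stamp_path, "r", encoding="utf-8", newline="") as fh:
        current = fh.read()
    return normalize(current) == normalize(trusted_text)
```

If this returns False, the stamp on disk is not the one that was printed or signed, and a successful check against it proves nothing.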

Output Formats

The format of the audit files is simple. Each line contains:

hash_value ?algorithm[<thash-BlockSize>]|file_size[:hex]*relative_filepath
where the parts in [ ] are optional:

hash_value: value of hash.

algorithm[<thash-BlockSize>]: algorithm used to hash, stored in capital letters to maintain compatibility with some other tools. The <thash-BlockSize> parameter is optional and indicates that the thash method was used with that BlockSize. BlockSize must be in KB, MB, GB or TB. Ex: 10MB.

file_size: size of the original file when it was hashed. It speeds up check: when the size doesn't match, there is no need to hash a big file that is already known to differ from the original.

[:hex]: Optional flag indicating that the filepath is in hex format. This is necessary because characters such as '\n', '\r' or '\0' are permitted in filenames on some operating systems, and hex encoding avoids problems when formatting the results.

relative_filepath: The relative filepath of file hashed.

Example 1: using method thash with algorithm sha256 and BlockSize 50MB:

281d5d93464f1165ea7c403ca99d63ff4bf9a360864f8df4bd0e8e6c03774e98 ?SHA256<thash-50MB>|500000*file_hashed.bin

Example 2: using normal method, just with algorithm blake3.

7357b67824d086dc53f5e1ded565f500456bea1812783f1fbcddc08fddc3944c ?BLAKE3|2233:hex*1aCb344356e4e2b2b6

Other formats can be implemented in the future.
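
For readers who want to process audit files with their own scripts, the line format above can be parsed with a regular expression. The Python sketch below is one possible reading of the format, not code taken from auditor: the names LINE_RE, AuditEntry and parse_audit_line are hypothetical, and it assumes that a :hex path is the hexadecimal encoding of the raw path bytes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# hash_value ?ALGORITHM[<thash-BlockSize>]|file_size[:hex]*relative_filepath
LINE_RE = re.compile(
    r"(?P<hash>[0-9a-fA-F]+) \?"
    r"(?P<alg>[A-Z0-9_]+)"
    r"(?:<thash-(?P<block>\d+(?:KB|MB|GB|TB))>)?"
    r"\|(?P<size>\d+)(?P<hex>:hex)?"
    r"\*(?P<path>.+)"
)


@dataclass
class AuditEntry:
    hash_value: str
    algorithm: str
    block_size: Optional[str]  # e.g. "50MB" when thash was used, else None
    file_size: int
    path_is_hex: bool
    relative_filepath: str     # exactly as written in the audit file

    def path_bytes(self) -> bytes:
        """Raw path bytes (assumption: ':hex' means hex-encoded bytes)."""
        if self.path_is_hex:
            return bytes.fromhex(self.relative_filepath)
        return self.relative_filepath.encode("utf-8")


def parse_audit_line(line: str) -> AuditEntry:
    """Parse one audit line or raise ValueError if it does not fit the format."""
    match = LINE_RE.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not an audit line: {line!r}")
    return AuditEntry(
        hash_value=match["hash"].lower(),
        algorithm=match["alg"],
        block_size=match["block"],
        file_size=int(match["size"]),
        path_is_hex=match["hex"] is not None,
        relative_filepath=match["path"],
    )
```

Applied to Example 1, this yields algorithm SHA256, block size 50MB, size 500000 and path file_hashed.bin; applied to Example 2, it yields algorithm BLAKE3, no block size, size 2233 and a hex-flagged path.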

Download

v.0.3.4 - Windows x64

sha256:4807037412cbabf69c403635e763b2be7c2019cb2a1a134376c71e1a1c55e67c

download

v.0.3.4 - Linux x64

sha256:609267455fed1c869b94ec03c2a87e639f1aed77ea9475cb392396a527e74f10

download

Disclaimer: auditor is provided as software under development, without ANY kind of warranty or support. Use it at your own risk.

Benchmarks

Using hyperfine, tests between auditor, fsum and hashdeep64 were performed and results are shown below.

Machine configs:
OS: Windows 11 Home 64-bit
Processor: AMD Ryzen 7 7800X3D (4.20 GHz)
RAM: 64 GB (Corsair Vengeance DDR5, 5200 MHz)
Disk: 2 TB M.2 NVMe SSD (Corsair MP600 Pro)

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| .\fsum.exe -dC:\thash\data\completo\ -sha256 -R *.* | 249.565 | 249.565 | 249.565 | 5.94 |
| .\auditor.exe hash C:\thash\data\completo\ -o | 42.049 | 42.049 | 42.049 | 1.00 |
| .\hashdeep64.exe -d -r -j 48 -p 50m C:\thash\data\completo\ | 171.027 | 171.027 | 171.027 | 4.07 |
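
The Relative column is each mean time divided by the fastest mean time. As a quick check of the figures above, this short Python snippet recomputes it from the reported means; the dictionary keys are just shorthand labels for the three commands.

```python
# Mean times in seconds, copied from the benchmark results above.
means = {
    "fsum": 249.565,
    "auditor": 42.049,
    "hashdeep64": 171.027,
}

fastest = min(means.values())
# Ratio to the fastest tool, formatted with two decimals like the table.
relative = {tool: f"{mean / fastest:.2f}" for tool, mean in means.items()}

for tool, value in relative.items():
    print(f"{tool}: {value}")
```
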
Have suggestions or found a bug? Contact us at: [email protected]