xxhash-addon

Yet another xxhash addon for Node.js


Yet another xxhash addon for Node.js, which can be up to 50 times faster than crypto MD5.
IMPORTANT: xxhash-addon v2 is finally here. It is close to a complete rework of the project, with a heavy focus on performance and consistency. The FAQ has some very good info that you may not want to miss.
|Platform |Build Status |
|------------|---------|
|AppVeyor (Windows - Release build) | Build status |
|Actions (Ubuntu, macOS, Windows - Release and ASan builds) | .github/workflows/ci.yml |
Overview
xxhash-addon is a native addon for Node.js (>=8.6.0) written using N-API. It 'thinly' wraps xxhash v0.8.2, which supports a newer algorithm, XXH3, that has been shown to outperform its predecessor.
IMPORTANT: As of xxhash v0.8.0, XXH3 and XXH128 are considered stable. See the upstream CHANGELOG for the formal announcement. xxhash-addon v1.4.0 is the first release that ships stable XXH3 and XXH128.
Why v2?
  1. Greatly improved performance backed by benchmarks (see the tables below).
  2. Better consistency and smaller code size thanks to pure C-style wrapping.

The following results were generated by running the benchmark.js file. Duration (ms) is the time taken to digest 10GB of randomly filled buffer using the streaming methods (update() and digest()) of each hash function.
npm run benchmark
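The measurement described above can be approximated with a few lines of Node.js. The following is a minimal sketch, not the project's benchmark.js: the helper name `timeStreamingDigest` is hypothetical, it reuses one random chunk instead of filling 10GB, and it times SHA1 from Node's built-in crypto module so that it runs without the addon installed.

```javascript
const crypto = require('crypto');

// Hypothetical helper (not part of xxhash-addon): time how long a hasher
// takes to digest `totalBytes` of data, fed in `chunkBytes` pieces.
function timeStreamingDigest(createHasher, totalBytes, chunkBytes) {
  // Simplification: the same random chunk is fed repeatedly.
  const chunk = crypto.randomBytes(chunkBytes);
  const hasher = createHasher();
  const start = process.hrtime.bigint();
  for (let fed = 0; fed < totalBytes; fed += chunkBytes) {
    hasher.update(chunk);
  }
  const digest = hasher.digest();
  const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { durationMs, digest };
}

// 64 MiB instead of 10GB so the sketch finishes quickly.
const result = timeStreamingDigest(
  () => crypto.createHash('sha1'),
  64 * 1024 * 1024,
  1024 * 1024
);
console.log(`SHA1: ${result.durationMs.toFixed(1)} ms`);
```

Any object with update() and digest() can be timed this way, so a factory such as `() => new XXHash3(Buffer.from([0, 0, 0, 0]))` should slot in, assuming the addon is installed.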

  • On an ARM MacBook Pro (16-inch, 2021): M1 Pro, 16GB of Mem, macOS Monterey 12.4, Node.js v16.15.1

|Hash func |Length (bits) |Duration (ms) | Note |
|------------|---------|---------|---------|
|MD5 (node:crypto) |128 |19653 | |
|SHA1 (node:crypto) |160 |4380 | |
|BLAKE2s256 (node:crypto) |256 |18293 |BLAKE2s is very slow on Node.js, which does not match the upstream xxHash benchmark. |
|XXH64 (xxhash-addon) |64 |732 |Compiled with -O2. |
|XXH3 (xxhash-addon) |64 |350 |Compiled with -O2. On ARM, XXH3 is about 2 times faster than XXH64 and over 50 times faster than MD5. |

  • On an Intel Mac mini (2018): Core i3, 8GB of Mem, macOS Monterey 12.4, Node.js v16.15.0

|Hash func |Length (bits) |Duration (ms) | Note |
|------------|---------|---------|---------|
|MD5 (node:crypto) |128 |15187 | |
|SHA1 (node:crypto) |160 |10568 | |
|BLAKE2s256 (node:crypto) |256 |27334 |BLAKE2s is very slow on Node.js, which does not match the upstream xxHash benchmark. |
|XXH64 (xxhash-addon) |64 |1038 |Compiled with -O2. |
|XXH3 (xxhash-addon) |64 |767 |Compiled with -O2. A significant improvement with XXH3, and even more impressive on ARM. |

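The speedup figures quoted in the notes follow directly from the durations. As a small worked example, the sketch below recomputes them from the ARM numbers; the helper names are hypothetical, and "10GB" is taken at face value as 10 units of whatever GB the benchmark uses.

```javascript
// Durations (ms) copied from the ARM MacBook Pro results above.
const armDurationsMs = { MD5: 19653, SHA1: 4380, BLAKE2s256: 18293, XXH64: 732, XXH3: 350 };
const DATA_GB = 10; // amount of data digested in each run

// How many times faster the candidate is than the baseline.
function speedup(baselineMs, candidateMs) {
  return baselineMs / candidateMs;
}

// Throughput in GB per second for one run.
function throughputGBps(durationMs) {
  return DATA_GB / (durationMs / 1000);
}

const xxh3VsMd5 = speedup(armDurationsMs.MD5, armDurationsMs.XXH3);
const xxh3VsXxh64 = speedup(armDurationsMs.XXH64, armDurationsMs.XXH3);
const xxh3Throughput = throughputGBps(armDurationsMs.XXH3);

console.log(xxh3VsMd5.toFixed(1));      // 56.2
console.log(xxh3VsXxh64.toFixed(1));    // 2.1
console.log(xxh3Throughput.toFixed(1)); // 28.6
```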
Features
  • xxhash-addon exposes xxhash's API in a friendly way for downstream consumption (see the Example of Usage section).
  • Covering all 4 variants of the algorithm: XXH32, XXH64, XXH3 64-bit, XXH3 128-bit.
  • Supporting XXH3 secret.
  • Consistently producing canonical (big-endian) form of hash values as per xxhash's recommendation.
  • The addon is extensively sanity-checked against xxhash's sanity test suite to ensure that generated hashes are correct and match those of xxhsum (the official utility of xxhash). See the file xxhash-addon.test.js for how xxhash-addon is tested.
  • Checked for memory-safety and UB issues with ASan and UBSan. See the CI for how this is done.
  • Benchmarks are publicly available.
  • Minimal dependency: the package does not depend on any other npm packages.
  • TypeScript support. xxhash-addon is strongly recommended to be used with TypeScript. Definitely check FAQ before using the addon.
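One practical consequence of the canonical (big-endian) form is that a digest's hex string and its numeric value agree when the Buffer is read with big-endian readers. A minimal sketch, using hand-written stand-in buffers rather than real xxhash output so that it runs without the addon:

```javascript
// Stand-ins for digests; real ones would come from digest() or hash().
const digest64 = Buffer.from('0123456789abcdef', 'hex'); // 8 bytes, as for a 64-bit hash
const digest32 = Buffer.from('deadbeef', 'hex');         // 4 bytes, as for XXH32

// Big-endian reads give the same value that the hex string spells out.
const asBigInt = digest64.readBigUInt64BE(0);
const asNumber = digest32.readUInt32BE(0);

console.log(digest64.toString('hex'));                // 0123456789abcdef
console.log(asBigInt.toString(16).padStart(16, '0')); // 0123456789abcdef
console.log(asNumber);                                // 3735928559
```

For a 128-bit digest, `BigInt('0x' + digest.toString('hex'))` is one way to get a single numeric value.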
Installation
npm install xxhash-addon

Note: This native addon must be compiled on installation. If you do not have the Node.js build toolchain, install it first:
On a Windows machine
npm install --global --production windows-build-tools

On a Debian/Ubuntu machine
sudo apt-get update && sudo apt-get install python g++ make

On a RHEL/CentOS machine
If you are on RHEL 6 or 7, you need to install GCC/G++ >= 6.3 via devtoolset- for the module to compile. See SCL.
On a Mac
Install Xcode command line tools
Example
const { XXHash32, XXHash3 } = require('xxhash-addon');

// Hash a string using the static one-shot method.
const salute = 'hello there';
const buf_salute = Buffer.from(salute);
console.log(XXHash32.hash(buf_salute).toString('hex'));

// Digest a byte-stream (hash a buffer piece by piece).
const hasher32 = new XXHash32(Buffer.from([0, 0, 0, 0]));
hasher32.update(buf_salute.slice(0, 3));
console.log(hasher32.digest().toString('hex'));
hasher32.update(buf_salute.slice(3));
console.log(hasher32.digest().toString('hex'));

// Reset the hasher for another hashing.
hasher32.reset();

// Using secret for XXH3
// Same constructor call syntax, but hasher switches to secret mode whenever
// it gets a buffer of at least 136 bytes.
const hasher3 = new XXHash3(require('fs').readFileSync('package-lock.json'));
FAQ
    1. Why TypeScript?
  • Short answer: for much better performance and security.
  • Long answer:
Dynamic type checks are expensive enough to hurt performance. Without TypeScript, the streaming update() method had to check whether the buffer passed to it was an actual Node.js Buffer. Failing to detect the Buffer type could cause V8 to hit a CHECK and crash the Node process. Such a dynamic type check could degrade the performance of xxhash-addon by 10-15% in my own benchmark on a low-end Intel Mac mini (on Apple Silicon, the difference is negligible).
So how does TypeScript (TS) help? Static type checking: a non-Buffer argument is rejected at compile time, so the check does not have to run on every call.
There is still a theoretical catch. TS's type system is structural, so in a corner case you could have a class that is structurally like Buffer and pass an instance of that class to update() without a compile error. This is an extreme case that should never happen in practice. Nevertheless, there are official techniques to 'force' nominal typing. See https://www.typescriptlang.org/play#example/nominal-typing for an in-depth example.
If you don't use TS then you probably want to enable run-time type check of xxhash-addon. Uncomment the line # "defines": [ "ENABLE_RUNTIME_TYPE_CHECK" ] in binding.gyp and re-compile the addon. Use this at your own risk.
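For plain JavaScript callers who would rather not rebuild the addon, a guard in user code is another option. The sketch below is an assumption-laden illustration, not something xxhash-addon provides: `checkedUpdate` is a hypothetical wrapper and `fakeHasher` is a stand-in so the example runs without the addon. Note that it reintroduces a per-call check on the JavaScript side.

```javascript
// Hypothetical wrapper (not part of xxhash-addon): reject anything that is
// not a Node.js Buffer before it reaches the native update() method.
function checkedUpdate(hasher, data) {
  if (!Buffer.isBuffer(data)) {
    throw new TypeError('update() expects a Node.js Buffer');
  }
  hasher.update(data);
}

// Stand-in hasher that records the length of each chunk it receives.
const seen = [];
const fakeHasher = { update(data) { seen.push(data.length); } };

checkedUpdate(fakeHasher, Buffer.from('hello there')); // accepted

let rejected = false;
try {
  checkedUpdate(fakeHasher, 'hello there'); // a string, not a Buffer
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(seen, rejected); // [ 11 ] true
```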
Development
This is for people who are interested in creating a PR.
How to clone?
git clone https://github.com/ktrongnhan/xxhash-addon
cd xxhash-addon
git submodule update --init
npm install jest --save-dev
npm run debug:build
npm run benchmark
npm test
Note: debug:build compiles and links with Address Sanitizer (-fsanitize=address). npm test may not work out-of-the-box on macOS.
How to debug asan build?
You may have trouble running tests with the ASan build. Here is my snippet to get it running under lldb on macOS.
$ lldb node node_modules/jest/bin/jest.js
(lldb) env DYLD_INSERT_LIBRARIES=/Library/Developer/CommandLineTools/usr/lib/clang/13.1.6/lib/darwin/libclang_rt.asan_osx_dynamic.dylib
(lldb) env ASAN_OPTIONS=detect_leaks=1
(lldb) breakpoint set -f src/addon.c -l 100
(lldb) run
(lldb) next

OR
DYLD_INSERT_LIBRARIES=$(clang --print-file-name=libclang_rt.asan_osx_dynamic.dylib) ASAN_OPTIONS=detect_leaks=1 node node_modules/jest/bin/jest.js

Key takeaways:
  • If you see an error saying ASan Interceptor is loaded too late, set the env DYLD_INSERT_LIBRARIES. You need to use an absolute path to your Node.js binary and to jest.js as well. Curious why? An interesting article.
  • ASan doesn't detect memory leaks on macOS by default. You may want to turn this on with the env ASAN_OPTIONS=detect_leaks=1.

If you are debugging on Linux with GCC as your default compiler, here is a helpful one-liner:
$ LD_PRELOAD=$(gcc -print-file-name=libasan.so) LSAN_OPTIONS=suppressions=suppr.lsan DEBUG=1 node node_modules/jest/bin/jest.js

How to upgrade xxHash?
Everything should be set up already. Just pull from the release branch of xxHash.
git submodule update --remote
git status
git add xxHash
git commit -m "Bump xxHash to..."
git push origin your_name/upgrade_deps
API reference

Streaming Interface

export interface XXHash {
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
}
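
Because the interface is just update(), digest() and reset(), generic helpers can be written against it. The sketch below is a hypothetical example (`hashChunks` is not part of the addon); it uses SHA1 from Node's built-in crypto module as a stand-in hasher, since that object also exposes update() and digest(), which lets the example run without the addon.

```javascript
const crypto = require('crypto');

// Hypothetical helper: feed an iterable of Buffers to any object that has
// update() and digest(), then return the resulting digest.
function hashChunks(chunks, hasher) {
  for (const chunk of chunks) {
    hasher.update(chunk);
  }
  return hasher.digest();
}

const data = Buffer.from('hello there');
const pieces = [data.subarray(0, 3), data.subarray(3)];

// Hashing piece by piece matches hashing the whole buffer at once.
const streamed = hashChunks(pieces, crypto.createHash('sha1'));
const oneShot = crypto.createHash('sha1').update(data).digest();
console.log(streamed.equals(oneShot)); // true
```

An xxhash-addon hasher such as `new XXHash64(Buffer.from([0, 0, 0, 0]))` would be passed in the same position, assuming it behaves as the interface above declares.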

XXHash32

export class XXHash32 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash64

export class XXHash64 implements XXHash {
  constructor(seed: Buffer); // Buffer must be 4 or 8 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash3

export class XXHash3 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}

XXHash128

export class XXHash128 implements XXHash {
  constructor(seed_or_secret: Buffer); // Seed: Buffer must be 4 or 8 bytes long. Secret: Buffer must be at least 136 bytes long.
  update(data: Buffer): void;
  digest(): Buffer;
  reset(): void;
  static hash(data: Buffer): Buffer; // One-shot with default seed (zero).
}
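
The length rules in the constructor comments for XXHash3 and XXHash128 can be checked up front in user code. This is a minimal sketch under stated assumptions: `describeXXH3Input` is a hypothetical helper, and throwing for any other length is this sketch's choice, since how the addon itself treats such buffers is not stated here.

```javascript
const crypto = require('crypto');

const MIN_SECRET_BYTES = 136; // minimum secret length given in the comments above

// Hypothetical helper (not part of xxhash-addon): report whether a Buffer
// would count as a seed or a secret under the documented length rules.
function describeXXH3Input(buf) {
  if (!Buffer.isBuffer(buf)) {
    throw new TypeError('expected a Buffer');
  }
  if (buf.length === 4 || buf.length === 8) {
    return 'seed';
  }
  if (buf.length >= MIN_SECRET_BYTES) {
    return 'secret';
  }
  // Assumption: any other length is treated as an error in this sketch.
  throw new RangeError(`unsupported length: ${buf.length} bytes`);
}

const zeroSeed = Buffer.alloc(8);                        // 8 zero bytes
const secret = crypto.randomBytes(MIN_SECRET_BYTES);     // random 136-byte secret

console.log(describeXXH3Input(zeroSeed)); // seed
console.log(describeXXH3Input(secret));   // secret
```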
Licence
The project is licensed under BSD-2-Clause.