sqd

Execute Unix commands in parallel across multiple processes.

installation

$ npm install -g sqd

usage

$ sqd -c command [--debug] [--exit] [-p nProcess] [-s separator_command] <input file> [output file]

grep with 8 processes (results are written to STDOUT)
sqd -c "grep -e something" -p 8 input.txt
sed with 4 processes (the default), writing results to output.txt
sqd -c "sed -e y/ATCG/atcg/" input.txt output.txt

With the separator option (-s), sqd can also handle binary files
sqd -c "samtools view -" -s bam input.bam

reducing
sqd -c "node foobar.js" sample.txt --reduce

in foobar.js
if (process.env.sqd_map) {
  // map phase
  process.stdin.on("data", function(data) {
    // do something; this work is spread across multiple processes
  });
}
else if (process.env.sqd_reduce) {
  // reduce phase
  process.stdin.on("data", function(data) {
    // do something that reduces the results
  });
}
process.stdin.resume();
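
As a more concrete illustration of the skeleton above, here is a minimal sketch of a foobar.js that counts lines. It assumes that each map process receives its chunk on stdin and that the reduce process receives the concatenated map outputs on stdin; neither detail is spelled out in this README. The helper names (countLines, sumCounts, readAll) are hypothetical.

```javascript
// Hypothetical foobar.js: count lines in parallel, then sum the partial counts.

// Map step: count the newline characters in one chunk of text.
function countLines(text) {
  let n = 0;
  for (let i = 0; i < text.length; i++) {
    if (text[i] === "\n") n++;
  }
  return n;
}

// Reduce step: sum the partial counts (assumed to arrive one number per line).
function sumCounts(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .reduce((total, line) => total + Number(line), 0);
}

// Buffer all of stdin, then print the handler's result once input ends.
function readAll(handler) {
  let buf = "";
  process.stdin.setEncoding("utf8");
  process.stdin.on("data", (chunk) => { buf += chunk; });
  process.stdin.on("end", () => { console.log(handler(buf)); });
}

// Only touch stdin when sqd has spawned this script for one of its phases.
if (process.env.sqd_map) {
  readAll(countLines);
} else if (process.env.sqd_reduce) {
  readAll(sumCounts);
}
```

The sketch buffers input until the "end" event because a "data" event may deliver only part of a chunk, so per-event results would not be reliable.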

options

  • -p: the number of processes
  • --debug: debug mode (shows timing and temporary files)
  • --exit: exit when a child process emits an error or writes to stderr
  • -s: (see separator section)
  • --reduce: reduce the results by running the same command again, this time with the environment variable sqd_reduce set to "1"

additional environment variables in child processes

Note that all values are passed as strings.
  • sqd_n: process number assigned by sqd, different for each child process
  • sqd_start: start position of the part of the file passed to the child process
  • sqd_end: end position of the part of the file passed to the child process
  • sqd_command: command string (common)
  • sqd_input: input file name (common)
  • sqd_tmpfile: path to the tmpfile used in the child process
  • sqd_debug: whether debug mode is on, '1' or '0' (common)
  • sqd_hStart: start position of the header (common)
  • sqd_map: "1" unless the process is spawned for reducing; undefined otherwise
  • sqd_reduce: "1" if the process is spawned for reducing; undefined otherwise
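
Because every value arrives as a string, a child script will usually want to convert the numeric ones before using them. The following is a minimal sketch of such a conversion. It assumes the variable names carry an underscore, as in the foobar.js example (sqd_map, sqd_reduce), and that sqd_n, sqd_start, sqd_end and sqd_hStart hold decimal integers. The function name readSqdEnv is hypothetical.

```javascript
// Hypothetical helper: turn sqd's string-valued environment variables
// into typed values. Variables that are not set stay undefined.
function readSqdEnv(env) {
  const toInt = (v) => (v === undefined ? undefined : parseInt(v, 10));
  return {
    n: toInt(env.sqd_n),           // process number (assumed integer)
    start: toInt(env.sqd_start),   // start position (assumed integer)
    end: toInt(env.sqd_end),       // end position (assumed integer)
    command: env.sqd_command,
    input: env.sqd_input,
    tmpfile: env.sqd_tmpfile,
    debug: env.sqd_debug === "1",  // '1' or '0' becomes a boolean
    hStart: toInt(env.sqd_hStart), // header start (assumed integer)
    isMap: env.sqd_map === "1",
    isReduce: env.sqd_reduce === "1",
  };
}

const sqdInfo = readSqdEnv(process.env);
if (sqdInfo.debug) {
  // Log to stderr so the message does not mix with results on stdout.
  console.error("sqd child", sqdInfo.n, "range", sqdInfo.start, sqdInfo.end);
}
```

Note that writing to stderr may interact with the --exit option, so a real script might prefer to skip such logging.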

separator

sqd requires a separator, which splits a given input file into multiple chunks. The separator tells sqd how to split the file, using JSON.
The JSON keys are
  • positions: start positions of each chunk in the file

"positions": [133, 271, 461, 631]

  • header: range of the header section of the file, null when there is no header section

"header": [0, 133]

  • size: file size (optional)

"size": 34503

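To make the JSON keys above concrete, here is a minimal sketch of how a separator for newline-delimited text might compute them. It is not one of sqd's own separators. It assumes that positions are byte offsets, that the first chunk starts where the header ends, and that chunks should begin at the start of a line. How sqd invokes a separator command and reads its output is not covered here. The function names are hypothetical.

```javascript
// Hypothetical separator logic for newline-delimited text.

// Return chunk start offsets, each aligned to the start of a line.
function lineAlignedPositions(buf, nChunks, headerEnd = 0) {
  const positions = [headerEnd];
  const step = Math.ceil((buf.length - headerEnd) / nChunks);
  for (let i = 1; i < nChunks; i++) {
    const target = headerEnd + i * step;
    if (target >= buf.length) break;
    // Find the first newline at or after target - 1, so that a target
    // already sitting on a line start is kept as is.
    const nl = buf.indexOf(0x0a, target - 1);
    if (nl === -1 || nl + 1 >= buf.length) break;
    const pos = nl + 1;
    if (pos > positions[positions.length - 1]) positions.push(pos);
  }
  return positions;
}

// Build the object with the three keys described above.
function describeChunks(buf, nChunks, headerEnd = 0) {
  return {
    positions: lineAlignedPositions(buf, nChunks, headerEnd),
    header: headerEnd > 0 ? [0, headerEnd] : null,
    size: buf.length,
  };
}

// Example: a 3-byte header line followed by four 3-byte records.
const sample = Buffer.from("#h\naa\nbb\ncc\ndd\n");
console.log(JSON.stringify(describeChunks(sample, 2, 3)));
```

Aligning chunk starts to line boundaries keeps line-oriented commands such as grep or sed from seeing a record cut in half.
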
available separators

sqdm --much more memory

$ sqdm [memory=4000MB] -c command [--debug] [--exit] [-p nProcess] [-s separator_command] <input file> [output file]

sqd with 8000 MB (≈ 8 GB) of memory
sqdm 8000 -c "cat" sample.txt