faops is a lightweight tool for operating on sequences in the FASTA format.
This tool can be regarded as a combination of faCount, faSize,
faFrag, faRc, faSomeRecords, faFilter and faSplit from
UCSC Jim Kent's utilities.
Compared to Kent's fa* utilities, faops:
- is much smaller (kilobytes vs. megabytes)
- is easy to compile (only one external dependency)
- is well tested
- consists of a single executable file
- can operate on gzipped (bgzipped) files
- can be run under all major OSes (including Windows).
faops is also inspired and influenced by, and borrows from,
seqtk and
ufasta.
$ ./faops
Usage: faops <command> [options] <arguments>
Version: 0.8.21
Commands:
help print this message
count count base statistics in FA file(s)
size count total bases in FA file(s)
masked masked (or gaps) regions in FA file(s)
frag extract sub-sequences from a FA file
rc reverse complement a FA file
one extract one fa record
some extract some fa records
order extract some fa records by the given order
replace replace headers from a FA file
filter filter fa records
split-name splitting by sequence names
split-about splitting to chunks about specified size
n50 compute N50 and other statistics
dazz rename records for dazz_db
interleave interleave two PE files
region extract regions from a FA file
Options:
There're no global options.
Type "faops command-name" for detailed options of each command.
Options *MUST* be placed just after command.
- Reverse complement

      faops rc test/ufasta.fa out.fa       # prepend RC_ to names
      faops rc -n test/ufasta.fa out.fa    # keep original names

- Extract sequences with names in list.file, one name per line

      faops some test/ufasta.fa list.file out.fa

- Same as above, but from stdin and to stdout

      cat test/ufasta.fa | faops some stdin list.file stdout

- Sort by header strings

      faops order test/ufasta.fa \
          <(cat test/ufasta.fa | grep '>' | sed 's/>//' | sort) \
          out.fa

- Sort by lengths

      faops order test/ufasta.fa \
          <(faops size test/ufasta.fa | sort -n -r -k2,2 | cut -f 1) \
          out2.fa

- Tidy fasta file to 80 characters of sequence per line

      faops filter -l 80 test/ufasta.fa out.fa

- All content written on one line

      faops filter -l 0 test/ufasta.fa out.fa

- Convert fastq to fasta

      faops filter -l 0 in.fq out.fa

- Compute N50, clean result

      faops n50 -H test/ufasta.fa

- Compute N75

      faops n50 -N 75 test/ufasta.fa

- Compute N90, sum and average of contigs with estimated genome size

      faops n50 -N 90 -S -A -g 10000 test/ufasta.fa
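For readers unfamiliar with the statistic, the sketch below shows the usual definition of Nx: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches x% of all bases. It is only an illustration in plain `sort` and `awk`, not how faops computes it, and faops may handle edge cases differently. The helper name `nx` and the sample lengths are made up; the input is assumed to be a tab-separated name/length table, which is the layout the sort-by-lengths example above relies on when it pipes `faops size` through `sort -k2,2` and `cut -f 1`.

```shell
# Hypothetical helper (not part of faops): print Nx for a tab-separated
# "name<TAB>length" table read from stdin.
# Usage: nx 50 < sizes.tsv
nx() {
    # Longest sequences first, keyed on the second (length) column.
    sort -t "$(printf '\t')" -k2,2nr |
    awk -F '\t' -v x="$1" '
        { len[NR] = $2; total += $2 }
        END {
            target = total * x / 100
            # Walk down the sorted lengths until x% of all bases is covered.
            for (i = 1; i <= NR; i++) {
                cum += len[i]
                if (cum >= target) { print len[i]; exit }
            }
        }'
}

# Made-up lengths for illustration: the total is 100 bases, so N50 is the
# length at which the running sum first reaches 50.
printf 'a\t40\nb\t30\nc\t20\nd\t10\n' | nx 50   # prints 30
```

With a real file, something like `faops size test/ufasta.fa | nx 50` should be comparable to `faops n50 -H test/ufasta.fa`, assuming both follow the definition above.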
faops can be compiled under Linux, macOS (gcc or clang) and Windows
(MinGW).
    git clone https://github.com/wang-q/faops
    cd faops
    make

Alternatively, install with Homebrew:

    brew install wang-q/tap/faops

Testing is done with bats. Useful articles:
- https://blog.engineyard.com/2014/bats-test-command-line-tools
- http://blog.spike.cx/post/60548255435/testing-bash-scripts-with-bats
    # brew install bats-core
    make test

Dependencies:

- zlib
- kseq.h and khash.h from klib (bundled)
Qiang Wang <wang-q@outlook.com>
This software is copyright (c) 2014 by Qiang Wang.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.