Some one-liners

I find myself wrangling fastqs a lot. Here are some useful one-liners I’ve saved over time.

To subsample the first 1000 reads (4000 lines, at four lines per read) of all the fastq.gz files in a directory and write them to a subdirectory called test (which must already exist):

find . -maxdepth 1 -name "*.fastq.gz" -print -exec sh -c 'zcat < "$1" | head -n 4000 | gzip > "test/$(basename "$1")"' _ {} \;
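A quick way to sanity-check a subsample is to count its reads: a FASTQ record is four lines, so the read count is the line count divided by four. Below is a minimal sketch of that check. The file name demo.fastq.gz and the synthetic reads are made up for illustration.

```shell
# Build a throwaway gzipped FASTQ with 1200 synthetic reads (hypothetical data).
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 1200; i++) printf("@read%d\nACGT\n+\nIIII\n", i) }' | gzip > "$workdir/demo.fastq.gz"

# Keep the first 4000 lines, then count reads as lines / 4.
n_reads=$(zcat < "$workdir/demo.fastq.gz" | head -n 4000 | awk 'END { print NR / 4 }')
echo "$n_reads"
```

If the count is not a whole number, the file was cut in the middle of a record.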

To turn a multi-line fasta into a single-line one (each sequence on one line under its header):

awk '/^>/ { if (NR > 1) printf("\n"); print; next } { printf("%s", $0) } END { printf("\n") }' file.fasta
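To see what the linearization does, here is a small sketch on a two-record fasta with wrapped sequence lines. The records are made up, and the awk program here only prints a separating newline before the second and later headers, so the output has no leading blank line.

```shell
# Tiny two-record FASTA with wrapped sequence lines (made-up data).
fasta=$(mktemp)
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\nAA\n' > "$fasta"

# Join wrapped lines so each record becomes one header line plus one sequence line.
linear=$(awk '/^>/ { if (NR > 1) printf("\n"); print; next } { printf("%s", $0) } END { printf("\n") }' "$fasta")
echo "$linear"
```

Each record ends up as exactly two lines, which makes the file easy to process with line-oriented tools like grep or paste.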