Some one-liners
I find myself wrangling fastqs a lot. Here are some useful one-liners I’ve saved over time.
To subsample the first 4000 reads of all the fastq.gz files in a directory and write them to a subdirectory:
find * -name "*.fastq.gz" -print -exec sh -c "zcat < {} | head -n 4000 > test/{}" \;
To turn a multi-line fasta into a single line one:
awk '/^>/ {printf("\n%s\n",$0);next; } { printf("%s",$0);} END {printf("\n");}' file.fasta