### cd /mnt/pulsar/files/staging/276083/working
### /mnt/pulsar/server/dependencies/_conda/envs/__snippy@3.2/bin/snippy --outdir out --cpus 2 --ref foo.gbk --mapqual 60 --mincov 10 --minfrac 0.9 --pe1 /mnt/pulsar/files/staging/276083/inputs/dataset_446048.dat --pe2 /mnt/pulsar/files/staging/276083/inputs/dataset_446049.dat
### samtools faidx reference/ref.fa
### bwa index reference/ref.fa
[bwa_index] Pack FASTA... 0.02 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.44 seconds elapse.
[bwa_index] Update BWT... 0.02 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.14 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 0.816 sec; CPU: 0.652 sec
### mkdir reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa
### mkdir reference/ref && bgzip -c reference/ref.gff > reference/ref/genes.gff.gz
### snpEff build -c reference/snpeff.config -dataDir . -gff3 ref
WARNING: All frames are zero! This seems rather odd, please check that 'frame' information in your 'genes' file is accurate.
### (bwa mem -v 2 -M -R '@RG\tID:snps\tSM:snps' -t 2 reference/ref.fa /mnt/pulsar/files/staging/276083/inputs/dataset_446048.dat /mnt/pulsar/files/staging/276083/inputs/dataset_446049.dat | samtools view -u -T reference/ref.fa -F 3844 -q 60 | samtools sort --reference reference/ref.fa > snps.bam)
[W::bseq_read] the 2nd file has fewer sequences.
[W::bseq_read] the 2nd file has fewer sequences.
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (217, 268, 339)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 583)
[M::mem_pestat] mean and std.dev: (282.53, 90.73)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 705)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (964, 2283, 7194)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 19654)
[M::mem_pestat] mean and std.dev: (3576.18, 3007.58)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 25884)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[W::bseq_read] the 2nd file has fewer sequences.
[gzclose] buffer error
### samtools index snps.bam
### samtools depth -aa -q 20 snps.bam | bgzip > snps.depth.gz
### tabix -s 1 -b 2 -e 2 snps.depth.gz
### fasta_generate_regions.py reference/ref.fa.fai 104067 > reference/ref.txt
### freebayes-parallel reference/ref.txt 2 -p 1 -q 20 -m 60 --min-coverage 10 -V -f reference/ref.fa snps.bam > snps.raw.vcf
### /mnt/pulsar/server/dependencies/_conda/envs/__snippy@3.2/bin/snippy-vcf_filter --minqual 10 --mincov 10 --minfrac 0.9 snps.raw.vcf > snps.filt.vcf
Parsing: snps.raw.vcf
Types: snp=39 complex=4
Passed 43/81 variants (53.09%)
### snpEff ann -no-downstream -no-upstream -no-intergenic -no-utr -c reference/snpeff.config -dataDir . -noStats ref snps.filt.vcf > snps.vcf
### bgzip -c snps.vcf > snps.vcf.gz
### tabix -p vcf snps.vcf.gz
### /mnt/pulsar/server/dependencies/_conda/envs/__snippy@3.2/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab
Loading reference: reference/ref.fa
Loaded 1 sequences.
Loading features: reference/ref.gff
Parsing variants: snps.vcf
Converted 43 SNPs to TAB format.
### vcf-consensus snps.vcf.gz < reference/ref.fa > snps.consensus.fa
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ]
INFO=
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' ]
INFO=
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' ]
INFO=
### /mnt/pulsar/server/dependencies/_conda/envs/__snippy@3.2/bin/snippy-vcf_filter --subs snps.filt.vcf > snps.filt.subs.vcf
Parsing: snps.filt.vcf
Types: complex=4 snp=39
Passed 43/43 variants (100.00%)
### bgzip -c snps.filt.subs.vcf > snps.filt.subs.vcf.gz
### tabix -p vcf snps.filt.subs.vcf.gz
### vcf-consensus snps.filt.subs.vcf.gz < reference/ref.fa > snps.consensus.subs.fa