Compute sequence length
(Galaxy Version 1.0.3)
Tool Parameters
Sequences
Stripping the description truncates the FASTA header to just the sequence ID; otherwise the full header, including the description, is kept. Stripping is applied before the 'How many characters to keep' option.
Help
What it does
This tool computes the length of each FASTA sequence in the file. The output file has two tab-separated columns per line: the FASTA title and the length of the sequence. The option How many characters to keep? lets you keep only a specified number of characters from the beginning of each FASTA title.
Example
Suppose you have the following FASTA formatted sequences from a Roche (454) FLX sequencing run:
>EYKX4VC02EQLO5 length=108 xy=1826_0455 region=2 run=R_2007_11_07_16_15_57_
TCCGCGCCGAGCATGCCCATCTTGGATTCCGGCGCGATGACCATCGCCCGCTCCACCACG
TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
>EYKX4VC02D4GS2 length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAA
Running this tool while setting How many characters to keep? to 14 will produce this:
EYKX4VC02EQLO5	108
EYKX4VC02D4GS2	60
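The behaviour in this example can be approximated with a short Python sketch. It is not the tool's actual source code: the function name `fasta_lengths` and the parameter `keep_first` are hypothetical names chosen for illustration, and the sketch assumes that a value of 0 means "keep the whole title".

```python
import io


def fasta_lengths(handle, keep_first=0):
    """Yield (title, length) for each FASTA record read from `handle`.

    `keep_first` is a hypothetical stand-in for 'How many characters to
    keep?'; 0 is assumed to mean that the title is not truncated.
    """
    title = None
    length = 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if title is not None:
                yield title, length
            title = line[1:]
            if keep_first > 0:
                title = title[:keep_first]
            length = 0
        elif title is not None:
            # Sequence lines may wrap, so lengths are summed across lines.
            length += len(line)
    if title is not None:
        yield title, length


EXAMPLE = """\
>EYKX4VC02EQLO5 length=108 xy=1826_0455 region=2 run=R_2007_11_07_16_15_57_
TCCGCGCCGAGCATGCCCATCTTGGATTCCGGCGCGATGACCATCGCCCGCTCCACCACG
TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
>EYKX4VC02D4GS2 length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAA
"""

rows = list(fasta_lengths(io.StringIO(EXAMPLE), keep_first=14))
for title, length in rows:
    # Two tab-separated columns: title, then sequence length.
    print(f"{title}\t{length}")
```

Keeping 14 characters works here only because both IDs happen to be exactly 14 characters long.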
However, if your IDs are not all the same length, you may wish to just keep the fasta ID, and not the description:
>EYKX4VC02EQLO5 length=108 xy=1826_0455 region=2 run=R_2007_11_07_16_15_57_
TCCGCGCCGAGCATGCCCATCTTGGATTCCGGCGCGATGACCATCGCCCGCTCCACCACG
TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
>EYKX4VC length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAA
Running this tool with Strip fasta description from header set to True and How many characters to keep? set to 0 will produce:
EYKX4VC02EQLO5	108
EYKX4VC	60
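To illustrate why stripping the description helps when IDs differ in length, here is a minimal Python sketch of how a header might be turned into a title. The function `format_title` and its parameters are hypothetical and are not taken from the tool's source; the sketch assumes the sequence ID is everything up to the first whitespace, and that 0 characters to keep means no truncation. Stripping is applied first, matching the order described in the option's help text.

```python
def format_title(header, strip_description=False, keep_first=0):
    """Return the title to report for one FASTA header line.

    Hypothetical helper: the parameter names mirror the tool options
    but are not the tool's real identifiers.
    """
    title = header[1:] if header.startswith(">") else header
    title = title.strip()
    if strip_description:
        # Assumption: the ID is the first whitespace-delimited token.
        parts = title.split(None, 1)
        title = parts[0] if parts else ""
    if keep_first > 0:
        # Truncation runs after stripping.
        title = title[:keep_first]
    return title


short_header = ">EYKX4VC length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_"

# Fixed-width truncation pulls part of the description into a short ID.
print(format_title(short_header, keep_first=14))           # EYKX4VC length

# Stripping the description keeps just the ID, whatever its length.
print(format_title(short_header, strip_description=True))  # EYKX4VC
```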