CPAT
coding potential assessment (Galaxy Version 3.0.5+galaxy1)
Help
Purpose
CPAT is a bioinformatics tool that predicts the coding probability of an RNA from characteristics of its sequence. To do so, CPAT calculates scores for the following four linguistic features from a set of known protein-coding genes and a set of non-coding genes:
- ORF size
- ORF coverage
- Fickett TESTCODE
- Hexamer usage bias
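As a rough illustration of the first two features, the sketch below finds the longest open reading frame on the sense strand and reports its size and its coverage of the transcript. It is not CPAT's own implementation. The function names are hypothetical, and the conventions used here (ATG as the only start codon, the stop codon excluded from the ORF length, sense strand only) are assumptions made for the example.

```python
# Minimal sketch of ORF size and ORF coverage (assumed conventions, not CPAT code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the sense strand.

    `end` is the index of the stop codon, so the stop codon itself is not
    counted in the ORF length (an assumption of this sketch).
    Returns (0, 0) when no complete ORF is found.
    """
    seq = seq.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i - start > best[1] - best[0]:
                    best = (start, i)
                start = None  # look for the next ORF in this frame
    return best


def orf_features(seq):
    """Return (orf_size, orf_coverage) for one transcript sequence."""
    start, end = longest_orf(seq)
    size = end - start
    coverage = size / len(seq) if seq else 0.0
    return size, coverage
```

For example, `orf_features("CCATGAAATTTTAGGG")` finds the ORF `ATGAAATTT` and returns a size of 9 and a coverage of 9/16.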
CPAT then builds a logistic regression model using these four features as predictor variables and the “protein-coding status” as the response variable. After the model's performance has been evaluated and a probability cutoff determined, it can be used to classify new RNA sequences.
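The prediction step of such a model can be sketched as follows: the four feature scores are combined linearly and passed through the logistic function to give a coding probability, which is then compared with a cutoff. The coefficients, the intercept, the 0.5 cutoff, and the function and key names below are all hypothetical values chosen for illustration. A real model is fitted to training data, and its cutoff is chosen after evaluating performance.

```python
import math

# Hypothetical coefficients for illustration only; not values from a fitted model.
INTERCEPT = -3.0
WEIGHTS = {
    "orf_size": 0.005,
    "orf_coverage": 2.0,
    "fickett": 1.5,
    "hexamer": 4.0,
}


def coding_probability(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic model: probability = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(weights[name] * features[name] for name in weights)
    # Numerically stable form of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(features, cutoff=0.5):
    """Label a transcript by comparing its probability with an assumed cutoff."""
    return "coding" if coding_probability(features) >= cutoff else "noncoding"
```

A call such as `classify({"orf_size": 900, "orf_coverage": 0.8, "fickett": 1.0, "hexamer": 0.3})` returns `"coding"` under these made-up coefficients. With a fitted model, the same structure applies, but the numbers come from training.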