Introducing Transeq
Now that we have a basic understanding of the command line and why we want to use it, let’s look at an example of a bioinformatics command line tool.
Transeq is a tool that translates DNA sequences to their corresponding amino acid sequences.
The transeq command is part of the EMBOSS (European Molecular Biology Open Software Suite) package, which is a collection of bioinformatics tools for sequence analysis.
An online GUI version of the tool is available at https://www.ebi.ac.uk/Tools/st/emboss_transeq/.
An Example
Below is an example of a command to run transeq on a sequence file:
transeq -sequence example.fasta -outseq example_translated.fasta -frame 6 -clean
- The command starts with the name of the program executable, transeq.
- The argument -sequence takes your input file of sequences. Here we use the name of our sequence file, example.fasta.
- The argument -outseq lets you define a filename for the output, which is created when the program is run. We have called it example_translated.fasta.
- The option -frame requires a value for the translation frame. The value 6 here means translate all six frames.
- The option -clean does not require any additional information because it is a boolean flag: it is switched on by being present in the command and off by being absent. This option changes the “*” character, which marks a stop codon, to “X” (an unknown residue).
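To see the pieces of the command working together, here is a minimal sketch that writes a small input file and then runs the command from above. The sequence name seq1 and the 12-base sequence are made up for illustration, and the sketch assumes transeq may or may not be installed, so it checks for it first.

```shell
# Write a tiny FASTA file: a header line starting with ">" and one sequence line.
# The name "seq1" and the sequence itself are made-up illustration values.
printf '>seq1\nATGGCCAAATAA\n' > example.fasta

# Run the translation only if transeq is available on this machine.
if command -v transeq >/dev/null 2>&1; then
    transeq -sequence example.fasta -outseq example_translated.fasta -frame 6 -clean
    # Print the translated output file.
    cat example_translated.fasta
else
    echo "transeq not found; install EMBOSS first" >&2
fi
```

Checking with command -v first means the sketch prints a short message when EMBOSS is missing, rather than failing with a "command not found" error.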
As with most command line programs, a list of the options available for the tool can be found by running it with -help or -h in the terminal, for example:
transeq -help
This will output something like:
Translate nucleic acid sequences
Version: EMBOSS:6.6.0.0
Standard (Mandatory) qualifiers:
[-sequence] seqall Nucleotide sequence(s) filename and optional
format, or reference (input USA)
[-outseq] seqoutall [<sequence>.<format>] Protein sequence
set(s) filename and optional format (output
USA)
Additional (Optional) qualifiers:
-frame menu [1] Frame(s) to translate (Values: 1 (1); 2
(2); 3 (3); F (Forward three frames); -1
(-1); -2 (-2); -3 (-3); R (Reverse three
frames); 6 (All six frames))
-table menu [0] Code to use (Values: 0 (Standard); 1
(Standard (with alternative initiation
codons)); 2 (Vertebrate Mitochondrial); 3
(Yeast Mitochondrial); 4 (Mold, Protozoan,
Coelenterate Mitochondrial and
Mycoplasma/Spiroplasma); 5 (Invertebrate
Mitochondrial); 6 (Ciliate Macronuclear and
Dasycladacean); 9 (Echinoderm
Mitochondrial); 10 (Euplotid Nuclear); 11
(Bacterial); 12 (Alternative Yeast Nuclear);
13 (Ascidian Mitochondrial); 14 (Flatworm
Mitochondrial); 15 (Blepharisma
Macronuclear); 16 (Chlorophycean......
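Because the help text is long, it can be handy to pull out only the lines for one option. The sketch below uses a small helper function, show_option, which is a hypothetical name introduced here and not part of EMBOSS. It assumes the help text may be written to standard error, so it merges both output streams before filtering with grep.

```shell
# show_option TOOL OPTION
# Hypothetical helper: print the lines of TOOL's -help text that mention OPTION.
show_option() {
    # 2>&1 merges stderr into stdout, in case the tool writes its help to stderr.
    # grep -e treats the next argument as the pattern even if it starts with "-".
    "$1" -help 2>&1 | grep -e "$2"
}

# Look up the -frame option, but only if transeq is installed.
if command -v transeq >/dev/null 2>&1; then
    show_option transeq -frame
else
    echo "transeq not found; install EMBOSS first" >&2
fi
```

Note that grep prints only the matching lines, so a description that wraps over several lines will be cut short. Read the full help text when you need the complete list of values.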
How do I try it?
If you want to try this program out yourself, you can install it as part of the EMBOSS package. On a Debian-based Linux system such as Ubuntu, where apt-get is available, run:
sudo apt-get update
sudo apt-get install emboss
If you are using a Mac you can download EMBOSS using MacPorts.
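Whichever way you install EMBOSS, a quick check like the sketch below can confirm that the transeq executable is on your PATH before you try the example. It relies only on the shell built-in command -v and makes no assumption about where EMBOSS was installed.

```shell
# command -v prints the path of an executable if the shell can find it,
# and returns a non-zero status if it cannot.
if transeq_path=$(command -v transeq); then
    echo "transeq found at: $transeq_path"
else
    echo "transeq not found on PATH"
fi
```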