Example of a Bioinformatic command line tool

Introducing Transeq

Now that we have a basic understanding of the command line and why we want to use it, let’s look at an example of a bioinformatic command line tool.

Transeq is a tool that translates DNA sequences to their corresponding amino acid sequences.

The transeq command is a part of the EMBOSS (European Molecular Biology Open Software Suite) package, which is a collection of bioinformatics tools for sequence analysis.

An online version of the tool with a graphical interface is available at https://www.ebi.ac.uk/Tools/st/emboss_transeq/.

An Example

Below is an example of a command to run transeq on a sequence file:

transeq -sequence example.fasta -outseq example_translated.fasta -frame 6 -clean
  • The command starts with the name of the program executable, transeq.
  • The argument -sequence takes the input file of sequences. Here we use our sequence file, example.fasta.
  • The argument -outseq sets the filename for the output, which is created when the program runs. We have called it example_translated.fasta.
  • The option -frame takes a value for the translation frame. The value 6 means translate all six frames (three forward and three reverse).
  • The option -clean takes no value because it is a boolean flag: it is switched on by its presence in the command and off by its absence. It changes the “*” character, which marks a stop codon, to “X” (an unknown residue).
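
To try the command end to end, you need an input file. The sketch below writes a small, made-up DNA record to example.fasta (the sequence and the name seq1 are invented for illustration, not real data) and then runs the same command, but only if transeq is installed on your machine.

```shell
# Write a tiny, made-up DNA record to example.fasta (illustrative data only).
cat > example.fasta <<'EOF'
>seq1 hypothetical test sequence
ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG
EOF

# Run the translation only if transeq is available on this machine.
if command -v transeq >/dev/null 2>&1; then
    transeq -sequence example.fasta -outseq example_translated.fasta -frame 6 -clean
    # Print the translated sequences that transeq wrote.
    cat example_translated.fasta
else
    echo "transeq not found; install EMBOSS first" >&2
fi
```

Removing -clean from the command and comparing the two output files is a quick way to see what the flag changes.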

Most command line programs list their available options when run with -help or -h. For example:

transeq -help

This will output something like the following:

Translate nucleic acid sequences
Version: EMBOSS:6.6.0.0

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outseq]            seqoutall  [<sequence>.<format>] Protein sequence
                                  set(s) filename and optional format (output
                                  USA)

   Additional (Optional) qualifiers:
   -frame              menu       [1] Frame(s) to translate (Values: 1 (1); 2
                                  (2); 3 (3); F (Forward three frames); -1
                                  (-1); -2 (-2); -3 (-3); R (Reverse three
                                  frames); 6 (All six frames))
   -table              menu       [0] Code to use (Values: 0 (Standard); 1
                                  (Standard (with alternative initiation
                                  codons)); 2 (Vertebrate Mitochondrial); 3
                                  (Yeast Mitochondrial); 4 (Mold, Protozoan,
                                  Coelenterate Mitochondrial and
                                  Mycoplasma/Spiroplasma); 5 (Invertebrate
                                  Mitochondrial); 6 (Ciliate Macronuclear and
                                  Dasycladacean); 9 (Echinoderm
                                  Mitochondrial); 10 (Euplotid Nuclear); 11
                                  (Bacterial); 12 (Alternative Yeast Nuclear);
                                  13 (Ascidian Mitochondrial); 14 (Flatworm
                                  Mitochondrial); 15 (Blepharisma
                                  Macronuclear); 16 (Chlorophycean......
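
The help text is long, so it can be useful to pull out only the lines for the option you care about. The sketch below defines a small helper, option_help (a hypothetical name, not part of EMBOSS), that passes the help text through grep. It assumes the tool accepts -help, and it merges stderr into stdout with 2>&1 in case the help text is written there.

```shell
# option_help TOOL OPTION: print the help lines that mention OPTION,
# plus the three lines that follow each match.
option_help() {
    tool=$1
    option=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; install it first" >&2
        return 1
    fi
    # "--" stops grep from treating the option name as one of its own flags.
    "$tool" -help 2>&1 | grep -A 3 -- "$option"
}

# Show the help entry for -frame (prints a notice if transeq is not installed).
option_help transeq -frame || true
```

The same helper works for other options, for example option_help transeq -table.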

How do I try it?

If you want to try this program yourself, you can install it as part of the EMBOSS package. On a Debian- or Ubuntu-based Linux system, run:

sudo apt-get update
sudo apt-get install emboss

If you are using a Mac, you can install EMBOSS using MacPorts.
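
After installing, it is worth checking that the terminal can actually find the program. The sketch below defines check_tool (a hypothetical helper name), which reports where a command lives on your PATH. The MacPorts line in the comment assumes the port is named emboss; port search emboss will confirm the name.

```shell
# check_tool NAME: report whether NAME can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

# On a Mac with MacPorts, the install step is assumed to be:
#   sudo port install emboss
check_tool transeq || echo "Install EMBOSS, then open a new terminal and try again."
```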