BBB How-to Resource
Opening and exporting alignments
Navigating in the alignment
Working with gene annotations
Alignment and Search Tools
Import Analysis From File
Import and View mRNA Expression Data
Open an alignment file
BBB is capable of opening five different file formats:
- Base-by-Base or XML (abbreviated BBB)
- ClustalW files with the extension .aln
- Fasta files
- Genbank files
- EMBL files
BBB or XML files, unlike Clustal and Fasta files, can store additional information about the sequences they contain, including gene start/stop locations, comments, and various other annotations; such information can be accessed and displayed with BBB. To open an alignment, select “Open Alignment” from the File menu, then click on “From Your Local File”. Select the file you wish to open, and click “Open”.
You can also drag a valid file from your file browser into the main BBB window to open the file.
Note: If you are using an accepted file format and the file does not open, check that the file extension is correct, for example filename.gbk for Genbank or filename.fasta for FASTA.
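The extension check in the note above can be scripted before a file is opened. The Python sketch below is not part of BBB. Only .aln, .gbk and .fasta are named on this page; the other extensions in the table (.bbb, .xml, .embl) are assumptions made for illustration, and the function name guess_format is hypothetical.

```python
from pathlib import Path

# Hypothetical extension table: only .aln, .gbk and .fasta are named on this
# page; the remaining entries are assumptions for illustration.
KNOWN_EXTENSIONS = {
    ".bbb": "Base-by-Base",
    ".xml": "Base-by-Base",
    ".aln": "ClustalW",
    ".fasta": "Fasta",
    ".gbk": "Genbank",
    ".embl": "EMBL",
}


def guess_format(filename):
    """Return the format name suggested by the file extension, or None."""
    return KNOWN_EXTENSIONS.get(Path(filename).suffix.lower())


if __name__ == "__main__":
    for name in ("genome.gbk", "alignment.aln", "notes.txt"):
        print(name, "->", guess_format(name))
```

A result of None suggests renaming the file with one of the expected extensions before trying to open it again.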
Export a sequence selection to a different file
Suppose that you wish to analyze a subsection of a pre-existing alignment over a reduced number of sequences. The easiest way to do this is to export your region of interest to a new file. First, select the desired sequences from the Sequence List on the left (hold down the CTRL or Apple key to select more than one). Then switch to the Select mouse mode (using the Toolbar button) and click and drag to select the region of the sequence you wish to export.
(NOTE: You must switch to Select mode first, or you will introduce large gaps into your alignment!)
This region will become highlighted on all the selected sequences. Finally, select “Export Selection (Marked Sequences)” from the File menu; you can then choose the file format (see above) and enter a name for your file.
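Conceptually, exporting a selection means taking the same column range from each marked sequence and writing the result to a new file. The Python sketch below illustrates that idea outside BBB; the function name, the dictionary layout and the choice of Fasta output are assumptions, not a description of how BBB implements the export.

```python
def export_selection(alignment, names, start, end):
    """Return Fasta text for columns start..end (1-based, inclusive).

    `alignment` maps sequence names to aligned strings of equal length;
    `names` lists the marked sequences to export.
    """
    records = []
    for name in names:
        region = alignment[name][start - 1:end]
        records.append(f">{name}\n{region}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    alignment = {
        "strain_a": "ACGT-ACGT",
        "strain_b": "ACGTTACGT",
        "strain_c": "AAAAAAAAA",
    }
    # Export columns 4-6 of two of the three sequences.
    print(export_selection(alignment, ["strain_a", "strain_b"], 4, 6), end="")
```

Gap characters inside the selected region are kept, so the exported sequences remain aligned to each other.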
Import genome sequences from the viral databases available at Virology.ca
From the “File” menu, choose “Add Sequences to Alignment” -> “From VOCs Database”. Select the database and, from the list that appears, your genomes of interest (hold down the Apple key (Mac) or CTRL (PC) to select multiple genomes). Click OK to upload the sequences to your alignment. (Note that these sequences will be unaligned; for help with aligning sequences, see our Aligning Sequences tutorial.) To view the file you just created, close the current alignment and open your new file.
Export an image of the alignment
Any BBB alignment can be exported to an image. Select “Export Alignment Image” from the File menu. A window containing image options will open; adjust these if desired, then click “OK”.
Export a text overview of the alignment
Any BBB alignment can be exported to a dense text format. This is useful if you have a large sequence and wish to scan across it quickly. To do this, select “Export Alignment Overview” from the File menu. The export includes difference comparisons and marks each difference by writing the amino acid or nucleotide in lower case. The comparison algorithm is chosen automatically from the comparison type currently used to view the sequence.
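The lower-case marking of differences can be illustrated with a short Python sketch. It assumes the simplest possible comparison, each sequence against a single reference sequence, which may not match the comparison type BBB is using when you export. The function name is hypothetical.

```python
def mark_differences(reference, sequence):
    """Lower-case every residue of `sequence` that differs from `reference`.

    Both strings are expected to be aligned (equal length); matching
    positions stay in upper case.
    """
    marked = []
    for ref_char, seq_char in zip(reference.upper(), sequence.upper()):
        marked.append(seq_char if seq_char == ref_char else seq_char.lower())
    return "".join(marked)


if __name__ == "__main__":
    print(mark_differences("ACGTAC", "ACCTAG"))
```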
Switching between Mouse Modes
BBB contains three different mouse modes, controlled by the Toolbar (the fourth, fifth, and sixth buttons from the left). It is important to keep these three modes straight, as using the wrong one can damage your alignment!
– The Edit mode allows you to create gaps of any length (in the selected sequences) by clicking and dragging. DO NOT MISTAKE THIS FOR SELECT MODE.
– The Block Glue mode allows you to shrink or remove gaps (in the selected sequences) by clicking and dragging.
– The Select mode allows you to select regions within one or more sequences.
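To see why the Edit and Block Glue modes change the alignment while Select mode does not, it can help to think of each sequence as a string with gap characters. The Python sketch below is only an analogy under that assumption; the function names and the exact gap handling are illustrative and do not describe BBB's internals.

```python
GAP = "-"


def insert_gap(sequence, position, length):
    """Edit-mode analogue: insert `length` gap characters before `position` (0-based)."""
    return sequence[:position] + GAP * length + sequence[position:]


def close_gap(sequence, position, length):
    """Glue-mode analogue: remove up to `length` gap characters starting at `position`.

    Only gap characters are removed; residues are never deleted.
    """
    end = position
    while end < len(sequence) and end - position < length and sequence[end] == GAP:
        end += 1
    return sequence[:position] + sequence[end:]


if __name__ == "__main__":
    gapped = insert_gap("ACGT", 2, 3)
    print(gapped)
    print(close_gap(gapped, 2, 2))
```

Inserting a gap shifts every downstream residue, which is why dragging in Edit mode by mistake can misalign a whole region.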
View gene annotations
When opening sequence files from the VOCs database, gene annotations are automatically uploaded to your alignment together with the sequences. To locate genes, either click on the relevant toolbar buttons to move the display to the last/next gene, or skip directly to the gene of interest by selecting it in the pull-down menu located on the top right of the display. (NOTE: Be sure to select the sequence of interest first, or the wrong group of genes will be displayed.) Click on the “Three Frame Translation” button (located to the right of the Sequence List) to view all three possible translations of the gene sequence; the annotated open reading frame is highlighted in red.
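For readers unfamiliar with a three-frame translation, the Python sketch below shows the idea on the forward strand using the standard genetic code. It is an illustration only: BBB's own translation settings are not documented here, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons of an ungapped DNA string; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return the translations of the forward strand in frames 1, 2 and 3."""
    ungapped = dna.upper().replace("-", "")
    return [translate(ungapped[offset:]) for offset in range(3)]


if __name__ == "__main__":
    for frame, protein in enumerate(three_frame_translation("ATGGCCTAA"), start=1):
        print(f"frame {frame}: {protein}")
```

Stop codons are shown as an asterisk, so an open reading frame appears as an uninterrupted run of letters in one of the three frames.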
Import genes from a Genbank file
First, select the sequence you want to import the genes into. Select “Import genes” from the Tools menu, and click on “From feature file”. Find and select the appropriate Genbank file on your hard drive. Now input the gene prefix you desire to use for naming genes (e.g. genome name) and click “OK”. Repeat the same steps for the other sequence(s).
Finally, to obtain a statistical comparison of all the genes in the two strains, choose “CDC statistics” from the Reports menu.
Align entire sequences or subsequence regions
To align sequences in BBB, first select the sequences you wish to align by clicking on their names in the Sequence List (hold down the CTRL or Apple key to select more than one). Choose the “Select” mouse mode from the toolbar and select the sequence region that you wish to align.
Finally, go to “Align Selection” under the Tools menu and choose the method of alignment you wish to use (ClustalW, TCoffee, or Muscle (proteins only)). Adjust the alignment parameters if necessary and wait for the alignment to be generated (this may take several minutes or longer for large alignments). When the results window appears, click “Ok” to accept the alignment.
Search for a sequence motif, either exact (regular expression search) or inexact (fuzzy motif search)
In BBB it is possible both to search for an exact sequence motif (regular expression) and a motif with several mismatches (fuzzy motif). Both functions are available from the Tools menu.
– Suppose we wish to find a CCTGGC pattern with no mismatches. To perform this search, first select the sequences you wish to search (click on their names in the Sequence List; hold down the CTRL or Apple key to select more than one). Then, from the Tools menu, select “Search” and click on “Regular Expression Search”. In the box that appears, enter the expression (CCTGGC) that you wish to search for.
– Now suppose we want to find all matches to CCTGGC that are exact or almost exact, that is, matches with at most one nucleotide altered. Again, begin by selecting the sequences to be searched. Then, from the Tools menu, select “Search” and click on “Fuzzy Motif Search”. In the box that appears, enter the expression (CCTGGC) that you wish to search for and, in the box below, set the maximum number of mismatches to 1.
To search for an “N” character in the sequence, simply type “N” into the search expression. If you wish to insert a wildcard character, use either “X” or “.”.
In either case, the program searches both the top and bottom strands of the selected DNA sequence for the given expression. When the search is complete, it returns a list of matches. You can jump to the location of any match by double-clicking on it in the result list. Finally, the list of fuzzy search results can be saved to a text file by clicking on “Save Fuzzy Results”.
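The two kinds of search described above can be sketched in Python as follows. This is an illustration of the concepts, not BBB's implementation: the function names are hypothetical, positions are reported as 0-based start coordinates on the top strand, and the exact search uses Python's re module, which reports non-overlapping matches only. The wildcard handling (“X” or “.”) follows the description on this page.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the bottom strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def _strands(seq):
    seq = seq.upper()
    return (("top", seq), ("bottom", reverse_complement(seq)))


def exact_search(seq, motif):
    """Regular-expression search on both strands; X is treated as a wildcard."""
    pattern = re.compile(motif.upper().replace("X", "."))
    hits = []
    for strand, text in _strands(seq):
        for match in pattern.finditer(text):
            # Convert bottom-strand coordinates back to the top strand.
            start = match.start() if strand == "top" else len(text) - match.end()
            hits.append((strand, start))
    return hits


def fuzzy_search(seq, motif, max_mismatches):
    """Report every window that differs from the motif in at most max_mismatches positions."""
    motif = motif.upper()
    hits = []
    for strand, text in _strands(seq):
        for i in range(len(text) - len(motif) + 1):
            window = text[i:i + len(motif)]
            mismatches = sum(
                1 for base, wanted in zip(window, motif)
                if wanted not in "X." and base != wanted
            )
            if mismatches <= max_mismatches:
                start = i if strand == "top" else len(text) - (i + len(motif))
                hits.append((strand, start, mismatches))
    return hits


if __name__ == "__main__":
    print(exact_search("AACCTGGCTTGCCAGGAA", "CCTGGC"))
    print(fuzzy_search("AACCTGACTT", "CCTGGC", 1))
```

In the first call the motif occurs once on each strand; the bottom-strand hit corresponds to the reverse complement GCCAGG on the top strand.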
Apply analysis to a sequence from a file
To import your analysis, format it as follows:
The analysis must be preceded by a closing angle bracket “>”. All fields must be separated by pipes. The first number is the start position of the subsequence you are marking. The second number is its end position. The next field indicates the strand, either POSITIVE or NEGATIVE. The next field is the label that will be applied to the specified region; this can be a number or a phrase such as “promoter element”. If you do not wish to label the region, you must still include the pipes, with a single space between them. The final field is a hex code that determines the colour of the analysis; these codes can be found online. Finally, the document should be saved in plain text format before you attempt to import it. A single file may contain many such entries, each on a new line, with a single closing angle bracket as the first line of the file.
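The format described above can be generated with a short script. The Python sketch below is one reading of that description: a first line containing only “>”, followed by one pipe-separated entry per line in the order start, end, strand, label, colour. The function names are hypothetical, and writing the colour with a leading “#” is an assumption, since the page does not say whether BBB expects it.

```python
def format_entry(start, end, strand, label, colour):
    """Build one pipe-separated analysis entry."""
    if strand not in ("POSITIVE", "NEGATIVE"):
        raise ValueError("strand must be POSITIVE or NEGATIVE")
    # An unlabelled region still needs its pipes, with a space between them.
    return f"{start}|{end}|{strand}|{label or ' '}|{colour}"


def build_analysis_file(entries):
    """Return the file text: a single '>' line followed by one entry per line."""
    lines = [">"] + [format_entry(*entry) for entry in entries]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical regions; the '#' prefix on the colours is an assumption.
    text = build_analysis_file([
        (100, 250, "POSITIVE", "promoter element", "#FF0000"),
        (300, 320, "NEGATIVE", "", "#0000FF"),
    ])
    print(text, end="")
```

Writing the returned text to a plain text file gives something that can then be tried with the import step; if the import fails, the colour notation is the first thing to check.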
Once you have saved your analysis in this format, select the sequence that you wish to apply it to in Base-By-Base. Go to the ‘File’ menu and select ‘Import Analysis from File…’. You will be asked to select the file in which you saved your analysis; it will then be applied to the sequence automatically.
Please note that this analysis can be applied to any sequence. There is no warning if the base pairs in your analysis are not present in the selected sequence, so be careful to apply the correct analysis to your sequence.
Importing MOCHIview mRNA data files
BBB can now visualize mRNA expression data in MOCHIview format. To do so:
- Select View in the menu bar, expand the Display mRNA Expression Data item, and select Load and Display mRNA Expression Data.
- Select an appropriately formatted file. This opens the zoomed-in viewport, which shows the expression over the available sequence range.
- Select View –> Display mRNA Expression Data –> Display Sequence Expression Summary. This opens the summary view.
Add a sequence to an existing alignment with Mafft --add
BBB now provides access to the Mafft command line option --add. This function aligns a new sequence to a set of previously aligned sequences, with much better performance than running Mafft on the full set again. To use the feature, first align a set of sequences normally using BBB, or import a file consisting of aligned sequences. Then navigate to the “Advanced” menu and click on “mafft --add”. BBB will prompt for a new sequence in Fasta format and then perform the alignment. The results are applied automatically to the BBB sequence viewer.
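For comparison, the same operation can be run with Mafft directly on the command line, where the general form is mafft --add new_sequences existing_alignment, with the result written to standard output. The Python sketch below wraps that call. It assumes mafft is installed and on the PATH; the file names and function names are hypothetical, and how BBB itself invokes Mafft is not described here.

```python
import subprocess


def build_mafft_add_command(new_sequences, existing_alignment):
    """Return the argument list for: mafft --add NEW EXISTING."""
    return ["mafft", "--add", new_sequences, existing_alignment]


def run_mafft_add(new_sequences, existing_alignment, output_path):
    """Run Mafft and write the combined alignment (its standard output) to output_path."""
    command = build_mafft_add_command(new_sequences, existing_alignment)
    with open(output_path, "w") as output:
        subprocess.run(command, stdout=output, check=True)


if __name__ == "__main__":
    # Hypothetical file names; only the command is printed here.
    print(" ".join(build_mafft_add_command("new_genome.fasta", "aligned.fasta")))
```

Calling run_mafft_add with real files produces a Fasta alignment that can then be opened in BBB like any other alignment file.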