Regular Expression, Fuzzy Motif, GFS Searches
Regular Expression Search
How to perform a regular expression search on a sequence.
You can run a regular expression search from the sequence display window: select Reg. Expression Search from the Analysis menu. The following dialog will then appear.
Enter your regular expression in the textbox, and click on the OK button to perform the search, or the Cancel button to close the dialog window without performing the search.
Regular expressions let you search for precise patterns, which may include optional sections and/or repeated sequences. For detailed help on regular expressions, please see The Perl Regular Expression page.
Examples of Regular Expression Searching
Regular Expression | What it matches |
---|---|
ACT | ACT |
[AC]T | A or C followed by T |
AC[^T]ACT | AC followed by anything BUT a T followed by ACT |
ACT* | AC followed by 0 or more T’s |
(ACT)* | ACT repeated 0 or more times |
(ACT)+ | ACT repeated 1 or more times |
(ACT)? | ACT repeated 0 or 1 times |
(ACT){n} | ACT repeated n times |
(ACT){n,} | ACT repeated at least n times |
(ACT){n,m} | ACT repeated at least n times but not more than m times |
((AC)[TA]){n} | AC followed by T or A, repeated n times |
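The patterns in the table can be tried outside VGO with any Perl-style regular expression engine. The following is a minimal sketch using Python's standard `re` module; the sequence and the helper name `find_hits` are made up for illustration, and VGO's own handling of case, overlapping hits, and coordinates may differ.

```python
import re

# Hypothetical example sequence (not taken from VGO).
sequence = "GGACTACTACTTTGCAACT"

def find_hits(pattern, seq):
    """Return (start, end, text) for every non-empty, non-overlapping match.

    Positions are 0-based and the end is exclusive, as in Python slicing.
    """
    return [(m.start(), m.end(), m.group(0))
            for m in re.finditer(pattern, seq)
            if m.group(0)]  # skip the empty matches that '*' and '?' allow

for pattern in ["ACT", "[AC]T", "ACT*", "(ACT)+", "(ACT){2,3}"]:
    print(pattern, find_hits(pattern, sequence))
```

Patterns such as `(ACT)*` can match the empty string at every position, which is why the sketch filters out empty hits before reporting.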
Fuzzy Motif Search
How to do a fuzzy motif search on a sequence.
You can run a fuzzy motif search from the sequence display window: select Fuzzy Motif Search from the Analysis menu. The following dialog will then appear.
Enter your fuzzy motif in the top textbox and the number of mismatches to allow in the lower textbox. Click on the OK button to perform the search, or the Cancel button to close the dialog window without performing the search.
The Fuzzy Motif Search lets you enter a pattern (see below for an explanation of the pattern grammar) together with the maximum number of mismatches tolerated in a search hit. VGO then searches the marked sequences for this motif and displays the hits ordered by their location along the sequence. In addition to the ambiguity introduced by mismatches, the pattern may contain IUB ambiguity codes, which are listed in the table below.
Examples of Fuzzy Motif Searching
Fuzzy Expression | What it matches |
---|---|
ACT | an A, C, T pattern |
[AC]T | an A or a C followed by a T |
{AC}T | Any base other than an A or a C, followed by a T |
Note: When counting mismatches, [] and {} each count as a single match or mismatch. Likewise, when matching T(2,4), if only 1 T is found, this counts as a single mismatch.
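The mismatch counting described in the note can be illustrated with a short sketch. This is not VGO's implementation: the function names `parse_motif` and `fuzzy_search` are hypothetical, the search is a plain sliding window, and the repeat syntax such as T(2,4) and the IUB ambiguity codes are deliberately left out to keep the example small.

```python
def parse_motif(motif):
    """Split a motif such as '[AC]T' or '{AC}T' into one entry per position.

    Each entry is (letters, negate): '[..]' means any of the letters,
    '{..}' means anything except the letters, a bare letter means itself.
    """
    closers = {"[": "]", "{": "}"}
    positions = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch in closers:
            end = motif.index(closers[ch], i)
            positions.append((set(motif[i + 1:end]), ch == "{"))
            i = end + 1
        else:
            positions.append(({ch}, False))
            i += 1
    return positions

def fuzzy_search(motif, seq, max_mismatches):
    """Return (start, window, mismatches) for windows within the mismatch limit."""
    positions = parse_motif(motif)
    width = len(positions)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # A bracketed or braced group occupies one position, so it
        # contributes at most one mismatch, as the note describes.
        mismatches = sum(
            1 for (letters, negate), base in zip(positions, window)
            if (base in letters) == negate
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

print(fuzzy_search("ACT", "GGACTAGT", 1))
```

With one mismatch allowed, the search for ACT in the made-up sequence GGACTAGT reports both the exact hit ACT and the near hit AGT.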
Table of IUPAC Ambiguity Codes
IUPAC-IUB/GCG Code | Meaning | Complement |
---|---|---|
A | A | T |
C | C | G |
G | G | C |
T/U | T | A |
M | A or C | K |
R | A or G | Y |
W | A or T | W |
S | C or G | S |
Y | C or T | R |
K | G or T | M |
V | A or C or G | B |
H | A or C or T | D |
D | A or G or T | H |
B | C or G or T | V |
X/N | G or A or T or C | X |
. | Not G or A or T or C | . |
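The table above translates directly into lookup tables. The sketch below, which assumes nothing about how VGO stores these codes, expands an ambiguous motif into an equivalent regular expression and computes its reverse complement; the names `IUPAC_BASES`, `COMPLEMENT`, `motif_to_regex`, and `reverse_complement` are illustrative only. The "." code is omitted from the expansion because it stands for something other than the four bases.

```python
import re

# Bases covered by each code, following the table above.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "X": "ACGT", "N": "ACGT",
}

# Complement of each code. The table lists X for the combined X/N row;
# keeping N as N here is an assumption made for this sketch.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "U": "A",
    "M": "K", "R": "Y", "W": "W", "S": "S", "Y": "R", "K": "M",
    "V": "B", "H": "D", "D": "H", "B": "V",
    "X": "X", "N": "N", ".": ".",
}

def motif_to_regex(motif):
    """Expand ambiguity codes into character classes, e.g. 'ACRY' -> 'AC[AG][CT]'."""
    parts = []
    for code in motif.upper():
        bases = IUPAC_BASES[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

def reverse_complement(motif):
    """Complement every code, then reverse the result."""
    return "".join(COMPLEMENT[code] for code in reversed(motif.upper()))

pattern = motif_to_regex("ACRY")
print(pattern, re.findall(pattern, "TTACGTACAC"))
print(reverse_complement("ACRY"))
```

Searching with both the expanded motif and the expansion of its reverse complement is one way to cover both strands of a sequence.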
GFS Search
How to do a GFS search on a sequence.
You can run a GFS search from the sequence display window: select GFS from the Analysis menu. The following dialog will then appear.
Enter your mass list in the textbox, or click on the Load Mass List File button and select a file that contains the list of masses. When you are ready, click on the OK button to perform the search, or the Cancel button to close the dialog window without performing the search. Depending on your preferences, another dialog window may then appear where you can enter the parameters for the GFS search.