

Organizing sequence data

If the file is read successfully, you will see:



This should look somewhat familiar: it resembles Windows Explorer in some ways. Groups are initially assigned colors arbitrarily (unless you have saved a favorite set - see below for further details). The sequences are the terminal branches, and can be dragged from one group to another. Initially, groups are identified using the initial letter(s) of the sequence names. Groups cannot be nested within each other. You cannot rename a sequence, but you can rename groups, and group names are displayed in the plot legends. Colors can be changed by right-clicking the colored squares. The buttons on the right should be self-explanatory. The order in which the sequences appear is the order in which they will appear in the plot legend. Only sequences with checked boxes will be analyzed. Hiding a sequence has no effect on the input file: SimPlot will not alter the input file unless you deliberately overwrite it and ignore the warnings that a file is about to be overwritten.

One of the sequences in the demo file is a subtype A/C mosaic, but is listed as a member of the A group.



This occurs because (a) its name begins with that letter, and (b) the current setting is to use only the first letter of a sequence name to determine its group.  To change this, we click the grouping button:



We can then use one of two grouping methods: by the first N characters of the sequence name, or by the characters prior to the Nth occurrence of the chosen separator. In this case either could be used, but for illustrative purposes we'll increase the number of characters for the first option:



After clicking 'OK', you can see that the button has changed to indicate your selection:



And here is the result:



Any time you want to see all of the group members in expanded form, just click the "Expand Groups" check box.
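
The two grouping rules described above can be illustrated with a minimal Python sketch. This is not SimPlot's own code: the function names, the sequence names, and the handling of names that contain fewer than N separators (here the whole name is used) are all assumptions made for illustration.

```python
def group_by_prefix(name, n):
    """Group key: the first n characters of the sequence name."""
    return name[:n]


def group_by_separator(name, sep, n):
    """Group key: the characters before the nth occurrence of sep (n >= 1).

    Assumption: a name with fewer than n separators is used whole.
    """
    parts = name.split(sep)
    if len(parts) <= n:
        return name
    return sep.join(parts[:n])


def assign_groups(names, key):
    """Collect sequence names under their group key, keeping input order."""
    groups = {}
    for name in names:
        groups.setdefault(key(name), []).append(name)
    return groups


# Hypothetical sequence names; "AC_RW009" stands in for the A/C mosaic.
names = ["A_U455", "AC_RW009", "B_HXB2", "C_ETH2220"]

by_one = assign_groups(names, lambda s: group_by_prefix(s, 1))
by_two = assign_groups(names, lambda s: group_by_prefix(s, 2))
by_sep = assign_groups(names, lambda s: group_by_separator(s, "_", 1))
```

With one character, the mosaic falls into the "A" group alongside the pure subtype A sequence. With two characters, or with "_" as the separator, it gets a group of its own.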

Note that SimPlot suppresses underscore characters ("_") at the end of group names. This cleans up the appearance of the group names, and should be considered when naming sequences. If you first determine the longest group name that will be needed, then edit the sequence names so that they all begin with that many characters (padded as needed with underscore characters), group names will be assigned automatically using the steps above.
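
The naming convention can be prepared outside SimPlot before the file is loaded. The following Python sketch is only an illustration under stated assumptions: the helper names and the example group labels are hypothetical, and the stripping function merely imitates the suppression of trailing underscores described here.

```python
def pad_name(group, rest, width):
    """Build a sequence name whose first `width` characters hold the group
    label, padded on the right with underscores."""
    return group.ljust(width, "_") + rest


def display_group_name(raw):
    """Imitate the suppression of trailing underscores in a group name."""
    return raw.rstrip("_")


# Hypothetical group labels and sequence identifiers.
labels = ["A", "AC", "CRF01"]
width = max(len(label) for label in labels)  # longest group name needed

padded = [
    pad_name("A", "KE_Q23", width),
    pad_name("AC", "RW009", width),
    pad_name("CRF01", "TH_1", width),
]

# Grouping by the first `width` characters, then dropping trailing underscores.
shown = [display_group_name(name[:width]) for name in padded]
```

Here `width` is 5, so grouping by the first five characters gives every sequence a full-length group key, and the padding disappears from the displayed group names.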