Variance Table Tutorial
The Variance Table gives you a new summary analysis of your data that focuses on differences. The differences may be polymorphisms, SNPs, mutations, or just bases that require editing in order to resolve differences in a consensus sequence. The Variance Table has a great deal of flexibility. You can compare just two sequences to generate the simplest Variance Table, or you can even create a Variance Table that summarizes all of the differences between each consensus sequence in your project and a common Reference Sequence.
You can use the Variance Table to validate your data, because each cell in the Variance Table is linked to the data used to generate the base call in that cell. The sorting tools in the Variance Table make it easy to find novel SNPs or to identify regions prone to base calling errors. You can also export reports based on the Variance Table.
COMMANDS TO GENERATE A VARIANCE TABLE
- Sequence > Compare To compares one or more selected sequences to the Consensus, Top Sequence, or Reference Sequence.
- Sequence > Compare To compares all of the sequences from a single contig selection to the Consensus, Top Sequence, or Reference Sequence.
- Contig > Compare Consensus to Reference compares all of the consensus sequences from one or many selected contigs to a common assembled Reference Sequence.
USING THE VARIANCE TABLE TO IDENTIFY SNPS
In the following example, you use the Variance Table to quickly identify and report SNPs from 47 patient samples.
- Launch Sequencher to open an Untitled Project.
- Go to the File menu and select Import>Sequencher Project…
- Select the Variance Table project and select Open.
The Variance Table project contains 95 sequences. In addition to a Reference Sequence, the sequences include auto-sequencing data from 47 different samples with two different primers under a variety of conditions.
Trimming
The first step in analyzing data originating from an automated sequencer is trimming. This is particularly true for the sequences of PCR products.
- Select all of the sequences in the project by choosing Select>Select All.
- From the Sequence menu, select Trim Ends…
Sequencher will open the Ends Trimming window and display a graphic representation of the proposed trim for each of the sequences.
- Click on the Change Trim Criteria button.
- Change the settings to match the following three selections:
On the 5’ end
On the 3’ end
Post Fix
- Click OK to close the Ends Trimming Criteria window.
- Then, click the Trim Checked Items button on the Ends Trimming window.
- Confirm the Trim and close the Ends Trimming window.
Note the change in the values in the Quality column in the Project window.
Defining the Assembly Parameters
The name of each sequence is written in the following format: Well_ID_Primer_P#. A typical example is J06_14667_0045E1SNP88F_P1.
- Resize the Name column by dragging the Name column header, so you can read the full name.
Note that each of the four elements of the sequence name is separated from the adjacent elements by an underscore, and each element of the sequence name can become a handle to organize, assemble, and name your contigs.
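To make the idea of handles concrete, here is a minimal Python sketch that splits the example name from above on the underscore delimiter. It only illustrates the naming convention and is not how Sequencher itself parses names. The function `split_handles` and the handle labels are hypothetical names chosen for this example.

```python
from typing import Dict

# Handle labels follow the tutorial's naming format: Well_ID_Primer_P#.
HANDLE_NAMES = ("Well", "ID", "Primer", "P#")


def split_handles(sequence_name: str, delimiter: str = "_") -> Dict[str, str]:
    """Split a sequence name into its named handles (illustrative helper)."""
    parts = sequence_name.split(delimiter)
    if len(parts) != len(HANDLE_NAMES):
        raise ValueError(
            f"expected {len(HANDLE_NAMES)} elements, got {len(parts)}: {sequence_name!r}"
        )
    return dict(zip(HANDLE_NAMES, parts))


handles = split_handles("J06_14667_0045E1SNP88F_P1")
print(handles["Primer"])  # 0045E1SNP88F
```

Because every element is addressable by name, any one of them can serve as the key for grouping sequences into contigs.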
- Click on the Assembly Parameters button at the top of the Project window.
- Click on the Name Settings button and confirm that the Name Delimiters is set to _Underscore.
- Change the names of the Handles to match the following window.
- Select the third Handle radio button, Primer.
- Click OK to close the Assemble by Name Settings dialog.
The names of the handles that you create in the Assemble by Name Settings dialog populate the Handle pull down menu in the Assemble By Name group box.
- Enable the Assemble by Name function by clicking on the Enabled check box.
- Leave the other parameters at their default settings: Dirty Data, Use ReAligner on, Prefer 3’ Gap Placement, 85% match, minimum 20 base overlap.
- Click OK to close the Assembly Parameters window.
Note that you now have a Handle column in your Project window that lists four primers used to sequence these PCR products.
- With all sequences still selected, click on the To Reference by Name button.
Sequencher provides you with a list of the Expected Contigs:
- Click on the Assemble button followed by the Close button.
Your Project window now contains the three contigs described in the previous window.
Creating a Variance Table
The Variance Table provides an overview of the data relative to a specified primary sequence. You may assign the Reference Sequence as the primary sequence, or you can use the Consensus Sequence, or the top sequence in the contig instead. Since we assembled to a Reference Sequence, we will use this as our primary sequence. Our initial view of the Variance Table displays the putative SNPs for the 47 patient samples.
- Select only the reverse primer contig 0045E1SNP88R.
- From the Sequence menu, select the command Compare To > Reference Sequence.
Sequencher generates the following view of the Variance Table:
In our table, the names of the sequences are quite long, so all are truncated in the default column size. The variants included in the table are listed in the default order, so that the columns are in the order that sequences appear in the contig, and the rows are ordered according to the positions of the differences relative to the Reference Sequence. Even when just displaying a fraction of the whole table, it is clear that there are only three positions with variants, 5, 235, and 476, and only position 476 has more than one variant. The cell in the bottom right corner indicates that there are a total of 16 variants listed in the table.
- Click on the Total button at the bottom left corner of the Variance Table.
All the columns of sequences with variants now move over to the left of the table, so that the sequence with the most variants is the left-most sequence. With this data, no sequence has more than two variants.
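The logic behind the table and the Total sort can be sketched in a few lines of Python. This is a simplified illustration, not Sequencher's implementation. It assumes the sequences are already aligned to the primary sequence and have the same length. The sequences, sample names, and the functions `variance_table` and `sort_by_total` are all invented for this example.

```python
from typing import Dict, List


def variance_table(reference: str, samples: Dict[str, str]) -> Dict[str, Dict[int, str]]:
    """Map each sample name to {1-based position: base that differs}."""
    table: Dict[str, Dict[int, str]] = {}
    for name, sequence in samples.items():
        table[name] = {
            position: base
            for position, (ref_base, base) in enumerate(zip(reference, sequence), start=1)
            if base != ref_base
        }
    return table


def sort_by_total(table: Dict[str, Dict[int, str]]) -> List[str]:
    """Order sample names so the one with the most variants comes first."""
    return sorted(table, key=lambda name: len(table[name]), reverse=True)


# Invented, pre-aligned toy data (not taken from the tutorial project).
reference = "GAAAATCG"
samples = {
    "sample_a": "GAAAATCG",  # matches the reference everywhere
    "sample_b": "GAAARTCG",  # one difference, at position 5
    "sample_c": "GAAARTCA",  # differences at positions 5 and 8
}

table = variance_table(reference, samples)
order = sort_by_total(table)
total = sum(len(variants) for variants in table.values())
print(order, total)  # ['sample_c', 'sample_b', 'sample_a'] 3
```

Cells that match the reference are simply absent from each sample's dictionary, which mirrors the empty cells in the table.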
- Click once on the resize icon next to the bottom left Total button.
The table expands all of the columns to display the full name for each sequence.
- Click on the resize button again.
The column widths in the table change to only display a single base’s width.
The pink shading in the headers and footers tells you that the sequence in that column does not span the entire length of the Reference. In the second column, specifically at base position 5, the sequence does not cover the Reference, and so the cell in the table is marked with a pink X.
- Move your cursor to the line that divides the columns between each sequence name until the arrow icon becomes the resize column icon. Drag the column divider for any individual cell and note that the column resizes to the chosen size.
In addition to column width and sorting, there are many ways that you can customize the display of data in the Variance Table to increase the efficiency of your analysis.
- Search for Variance Table in Sequencher Help to display the following description of the Variance Table.
- Close the Help for Variance Table window.
Reviewing the Data by Primer Sequence
Each of the assembled contigs includes the results of sequencing each of the patient PCR products with one primer. In some studies, the only sample data available is from one primer. In this case, the majority of the samples were sequenced with two primers. We will first look at the data, one primer at a time, and we will then combine the data from the two primers to do our final analysis.
The Review mode of the Variance Table lets you use the table of differences to navigate to areas of interest in your contig. When you click on the Review button in the button bar, or when you double-click on a cell in the table, Sequencher opens the Contig and Chromatogram Editor windows in the assigned default positions. The data displayed in each of the editor windows updates to reflect your selection in the Variance Table.
- Resize the column widths to a single base wide by clicking the resize icon three times.
- Double-click on the top left-most cell where there is an “R”.
Sequencher opens the Contig and Chromatogram Editors to display the data for the cell that you selected.
- Move your selection using one of your arrow keys or your mouse.
The display in both editor windows updates to match your selection in the Variance Table. The cells that do not contain a base, because they match the Reference Sequence, still link to their supporting data. This enables you to quickly view a chromatogram from a variant base and then compare it to a base that matches the Reference at the same position. Simply move your selection horizontally from one cell to the next.
The data in the Variance Table from the reverse primer, 0045E1SNP88R, are supported by the chromatogram data so no editing is necessary. This is not the case for the forward primer.
- Close the Contig Editor and then the Variance Table.
- Select the forward primer contig, 0045E1SNP88F.
- From the Sequence menu, select Compare To>Reference Sequence.
This Variance Table was derived from the same set of PCR products as the previous Variance Table but with sequencing reactions primed in the opposite orientation. Every difference reported in the previous Variance Table, from the reverse primer, is also reported here, but instead of 16 total differences there are 19. The concentration of low quality base calls and disagreements with the Reference Sequence at base position 5 flags this region for further attention.
- Double-click on the first cell with a difference at base 5.
This resizes the Variance Table window and opens the Contig and Contig Chromatogram Editors in Review mode for the selected base. You can see that the chromatograms and the base calls in this region are less than ideal. While holding down the Option key, use the right arrow key to view the chromatograms and base calls at base position 5.
The Reference Sequence has four “A’s” at bases 2, 3, 4, and 5. In the forward primer, this area has partial coverage, but in no sequence are there four “A’s” in either the current or original data. This is an area that is best reviewed in combination with the reverse sequence.
You will notice as you navigate through the cells that, when you select a cell with a pink X indicating no coverage at that position, Sequencher displays all traces that do cover the selected position in the Chromatogram Editor. In contrast, if you select a cell that has no chromatogram data associated with a base call, as will occur if you select the Reference, Sequencher displays an empty Chromatogram Editor.
- Close the Contig Editor and then the Variance Table.
- Select the remaining contig, 1011MARa.
- From the Sequence menu, select Compare To>Reference Sequence.
There are no differences between the sample sequence and the Reference Sequence in this Contig.
- Close the Variance Table window.
Reviewing the Data by Contig
We have now looked at the sequence data for each of the primers in this study. We identified regions where the data were weak, around base 5, and we identified likely variants, at positions 235 and 476. We are now ready to synthesize all of the data related to each sample into 46 separate contigs and one Variance Table.
- Select all the contents of the Project window by choosing Select > Select All.
- From the Contig menu, choose Dissolve Contig and then select the Dissolve Contig button.
For each of the contigs that you dissolved, you will now have an extra Reference. These will not interfere with your assembly, but if you prefer to clean up your project, select the extra Reference Sequences and hit the Delete key.
- Select all the contents of the Project window by choosing Select > Select All.
- Click on the Assembly Parameters button.
- From the Handle pull down menu in the Assemble by Name group box, change the Handle from Primer to ID and select OK.
- Click on the To Reference by Name button at the top of the Project window.
- Click on the Assemble button in the Assembly Preview window and Close on the Assembly Completed window.
You will build 47 different contigs, each containing a forward and a reverse primer sequence and a Reference Sequence.
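The effect of switching the Handle from Primer to ID can be illustrated with a short Python sketch that groups names by one underscore-delimited element. This is only a model of the grouping idea, not Sequencher's assembly logic. Apart from J06_14667_0045E1SNP88F_P1, the names below (in particular the well labels) are made up for illustration, and `group_by_handle` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_handle(names: List[str], handle_index: int, delimiter: str = "_") -> Dict[str, List[str]]:
    """Group sequence names by the element at handle_index (0-based)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[name.split(delimiter)[handle_index]].append(name)
    return dict(groups)


names = [
    "J06_14667_0045E1SNP88F_P1",  # example name from the tutorial
    "J07_14667_0045E1SNP88R_P1",  # hypothetical well label
    "A01_15320_0045E1SNP88F_P1",  # hypothetical well label
    "A02_15320_0045E1SNP88R_P1",  # hypothetical well label
]

by_primer = group_by_handle(names, 2)  # Well=0, ID=1, Primer=2, P#=3
by_id = group_by_handle(names, 1)

print(sorted(by_primer))  # ['0045E1SNP88F', '0045E1SNP88R']
print(sorted(by_id))      # ['14667', '15320']
```

Grouping by Primer yields one group per primer, each spanning many samples. Grouping by ID yields one group per sample, each holding that sample's forward and reverse reads.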
- Maintaining your selection on all of the contigs, from the Contig menu, choose the Compare Consensus to Reference command.
You have now created a new Variance Table that compares the consensus sequence from each of the selected contigs to the Reference Sequence. Combining the data from both primers creates more samples that have full Reference coverage, noted by the grey shading. The same three positions, 5, 235, and 476, are flagged as having variants in the contig table as were reported in the sequence tables, but again the numbers are different.
The nature of this table is that it reports a few differences across a large number of sequences. Therefore, it might be best configured to stretch horizontally across the top of your screen, rather than vertically to the left. One of the advantages of the Variance Table Review mode is that you can optimize the organization of your windows, and then preserve that orientation in subsequent Sequencher sessions.
- Click on the Review button to bring up the Contig and Chromatogram Editors.
- Rearrange the windows so that the Variance Table stretches along the top, Contig Editor is in the middle, and the Chromatogram Editor is at the bottom.
- From the Window menu, choose Remember Window Layout>Variance Review.
The window configuration that you define now will become your default layout the next time you bring the Variance Table into the Review mode.
- Place your cursor in the top left-most position in the Variance Table and, while holding down the Alt key, press the right arrow key.
The Alt + arrow combination directs your cursor to the next difference position in the Variance Table. The editors now display the contig bases and chromatogram data for sample 15320, the first contig in the table that has a difference from the Reference at base position 5. The base call is an “R” in both the forward and reverse direction and the call is supported by the chromatograms.
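Conceptually, this jump skips every cell that matches the Reference. The Python sketch below models that behavior for a single row of the table. It assumes a row is a list of cell strings, with an empty string wherever a sample matches. How Sequencher actually stores cells or handles the end of a row is not described here, so treat `next_difference` and the sample row as hypothetical.

```python
from typing import List, Optional


def next_difference(cells: List[str], current: int) -> Optional[int]:
    """Return the index of the next non-empty cell to the right, or None."""
    for index in range(current + 1, len(cells)):
        if cells[index]:  # an empty string means the cell matches the Reference
            return index
    return None


# Hypothetical row for one Reference position: "" means the sample matches,
# "R" is a variant call, and "X" marks a sample with no coverage there.
row = ["", "", "R", "", "X", "", "R"]

first = next_difference(row, 0)
second = next_difference(row, first)
print(first, second)  # 2 4
```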
- With your selection still at position 5 of 15320, again hold the Alt key and press the right arrow key.
You have now jumped to Sample 16037 where there is a gap in the Variance Table. There is only data from the forward primer at this position, and it is not very strong. At this time, you can use the Contig and Chromatogram Editors as you normally would in Sequencher.
- Move your cursor to the first consensus base in the contig editor with coverage in both the forward and reverse direction, the “C” at position 12, so that you can see the trimmed data from the Reverse Primer.
The Reverse Primer sequence has four clearly defined A peaks around base position 5 where the Forward Primer sequence has an amorphous green blob.
Sequencher always stores two copies of your data: your Experimental Data, as you imported it into Sequencher, and your Current Data. In a chromatogram, the Experimental Data lies below the Current Data, where you can select the original base calls and selectively revert the Current Data to the original calls.
- Select the original bases in the Reverse Primer sequence so that you at least highlight the four “A” peaks around the colon in the forward primer.
- From the Sequence menu, select Revert To Experimental Data.
The Current Data in the Reverse Sequence now includes the four “A” bases around position 5.
- Place your cursor at consensus position 5, above the bullet, and type A.
The disagreement is now resolved in the contig, but the Variance Table still shows the difference.
- In order to update the Variance Table with the new edit, hit the Refresh button in the Variance Table.
Note that the total number of differences for position 5 has been reduced from 2 to 1.
- Move your selection throughout the Variance Table.
You will find that the rest of the variants in the table are supported by the data, with the exception of 16195, which has a pink X at position 5. You can use the Revert To Experimental Data command again, or you can just report that sequence as having partial coverage.
Exporting the Variance Table
Now that you have reviewed the results in the Variance Table, you can export them. Sequencher provides two export formats. We maintained the original export for the comparison list that Sequencher exported in previous versions. We also added a new export format that includes the coverage information for each column in the table.
- Click on the Export button on the button bar. (This function is not available if you are running in demo mode.)
Sequencher will bring up the following Export dialog.
You can choose to export the Entire Table, or filter your export to Selected Columns or Selected Rows.
- Click on the To File button.
- Browse to your desired location and Save the exported table.
Sequencher uses the default name of VarianceTable_YYMMDD-HHMM.txt.
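If you post-process exports with scripts, you can build a name in the same pattern yourself. The Python sketch below assumes that YYMMDD-HHMM means a two-digit year, zero-padded fields, and a 24-hour clock. That interpretation is an assumption, and `default_export_name` is a hypothetical helper, not part of Sequencher.

```python
from datetime import datetime


def default_export_name(when: datetime) -> str:
    """Build a VarianceTable_YYMMDD-HHMM.txt style name (assumed 24-hour clock)."""
    return f"VarianceTable_{when.strftime('%y%m%d-%H%M')}.txt"


print(default_export_name(datetime(2024, 3, 7, 14, 5)))  # VarianceTable_240307-1405.txt
```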
The saved table is in a tab delimited format that you can import into databases or open in a spreadsheet. Below is a portion of the table in Excel.
Exported with the data from the Variance Table is a more detailed description of the coverage. When you are reporting differences in a sequence, it is important to distinguish between the positions reported to have no differences because they match the primary sequence, and those that have no data at a given position. The coverage information in the export has two formats. In the table above, there is a row that globally defines coverage for each sequence as “Full” or “Incomplete”. Above the table there is header information that more specifically details the coverage of the Reference by the data from each sample sequence.
The Comparison Range is equal to the bases as numbered in the primary sequence, in this case the Reference bases -39..511. The three sequences displayed in this snapshot of the Variance Table export in Excel all partially cover the Comparison Range on the 5’ end, and 1011MARa also has partial coverage on the 3’ end.
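The distinction between “Full” and “Incomplete” coverage can be expressed as a simple range check. The Python sketch below uses the Comparison Range -39..511 from this example. The covered ranges for the samples are invented for illustration, and `coverage_status` is a hypothetical helper rather than anything Sequencher exposes.

```python
from typing import Tuple

Range = Tuple[int, int]  # (first base, last base), inclusive


def coverage_status(comparison: Range, covered: Range) -> str:
    """Label a sample "Full" only if it spans the whole comparison range."""
    first, last = comparison
    covered_first, covered_last = covered
    if covered_first <= first and covered_last >= last:
        return "Full"
    return "Incomplete"


comparison_range = (-39, 511)  # Reference bases from the tutorial export

# The covered ranges below are invented for illustration only.
print(coverage_status(comparison_range, (-39, 511)))  # Full
print(coverage_status(comparison_range, (-12, 511)))  # Incomplete (5' end short)
print(coverage_status(comparison_range, (-12, 498)))  # Incomplete (both ends short)
```

A sample labeled “Incomplete” may still show no differences in the table, which is why the detailed header ranges matter when you report results.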
The other export available from the Variance Table replicates the format exported from previous versions of Sequencher. The earlier versions of Sequencher could only compare one sequence selection to another; therefore, this export is based on column selections, from one column to all. We want to export only those sequences with differences.
- Click on the Total button in the bottom left corner of the Variance Table to sort all of the sequences that have differences to the left of the table.
- Select the first sixteen sequences that have differences by clicking on the sequence name of the first and by Shift + clicking on the sequence name of the last.
You can make your selection by holding the Ctrl key down while you click on the desired columns, or you can select a range of columns by holding the Shift key while you make your selection.
- Click on the Export button.
- Choose the radio button Selected Columns.
- Choose the Individual Reports report format from the drop down list.
- Click To File.
- Browse to the location of your choice and save the files.
Below is an example of one of the 16 text documents created from the Export > Selected Columns command.
IN SUMMARY
In the previous example, the Variance Table immediately identified the 17 interesting base calls out of a total of 25,585. The Review mode then provided you with a link to the data supporting those calls and the tools to make corrections, if necessary. The Variance Table gives you both broader and more discriminating ways to evaluate your contig and project data, and when you are finished, you can export your conclusions for further analysis.
- Close the project without saving and exit Sequencher if desired.