Drawing gene structure

This information is available at http://wwwchg.duhs.duke.edu/grasp/index.html?how_to_create_a_gene_schematic.html

Example: neuropeptide Y (NPY) gene on chromosome 7

http://wwwchg.duhs.duke.edu/grasp/embim15.png

Useful links:
SNPSelector (public version): http://snpselector.duhs.duke.edu/hqsnp36.html


I. Gene structure
-
Use your favorite public genome browser (e.g. UCSC or Ensembl) to get the intron/exon structure of your gene.

-
If multiple isoforms/alternative transcripts exist, clarify with the person requesting the schematic which transcript to illustrate.
-
Get a screenshot of the gene structure as shown in the public genome browser (SnagIT software is a handy tool for this).  Copy the image into a blank Visio file.  PowerPoint can be substituted for Visio, as it has many of the same drawing functions.
Ensembl: http://wwwchg.duhs.duke.edu/grasp/embim16.gif

UCSC: http://wwwchg.duhs.duke.edu/grasp/embim17.png

-
Use the Visio template file to ‘hand draw’ the gene structure.  The cylinders used for exons should be dark for coding regions and light gray for non-coding regions.  In Ensembl, non-coding exonic regions are unshaded boxes and coding regions are shaded.  In UCSC, non-coding regions are smaller boxes while coding regions are full height.
-
Note the direction in which your gene is transcribed.  Ensembl: if the gene is on the forward strand it will be above the blue contig bar; if it’s below then it is on the reverse strand.  UCSC uses arrows to show direction of transcription.

-
If you want to show the length of each exon and intron, go to Ensembl’s ‘exon info’ from the main gene report page. Under exon info, you can see the length of each intron and exon.  You can also verify where the coding regions are (untranslated exonic sequence is in purple text; coding regions are in black text).
http://wwwchg.duhs.duke.edu/grasp/embim18.png
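
If you only have exon start/end coordinates rather than the ‘exon info’ page, the lengths can be worked out by hand or with a few lines of Python.  This is a minimal sketch, assuming 1-based inclusive coordinates listed along the chromosome; the numbers below are made-up placeholders, not real NPY coordinates.

```python
# Hypothetical exon coordinates (start, end), 1-based and inclusive.
# Replace these placeholders with the values from your genome browser.
exons = [(1000, 1150), (2500, 2620), (4000, 4400)]


def exon_lengths(exons):
    """Length in bp of each exon, assuming inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length in bp of each intron: the gap between consecutive exons."""
    ordered = sorted(exons)
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


print("exons:  ", exon_lengths(exons))
print("introns:", intron_lengths(exons))
```

Compare the numbers against Ensembl’s ‘exon info’ before labelling the figure, since browsers differ in whether coordinates are 0-based or 1-based.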

-
Line up your Visio exon and intron shapes below the screenshot from your genome browser.  Where translation begins (the start of the coding region), place the ‘ATG’ start codon text box and the arrow indicating direction.  For genes on the reverse strand, the ATG arrow will point to the left instead of to the right.  Alternatively, flip the schematic so that 5’ is always on the left; this also reverses the order of your SNPs, which is easy to do in Visio.

http://wwwchg.duhs.duke.edu/grasp/embim19.gif
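
If you prefer to size shapes by number instead of lining them up by eye, the mapping from genomic position to page position is a simple linear scale.  The sketch below is only an illustration: the function name, the 8-unit page width, and the coordinates are all assumptions, and the optional mirroring corresponds to flipping a reverse-strand gene so that 5’ is on the left.

```python
def to_page_x(pos, region_start, region_end, page_width, reverse_strand=False):
    """Map a genomic position to a horizontal offset on the drawing.

    page_width is in whatever unit your drawing uses (an assumption here).
    With reverse_strand=True the axis is mirrored so that 5' is on the left.
    """
    fraction = (pos - region_start) / (region_end - region_start)
    if reverse_strand:
        fraction = 1.0 - fraction
    return fraction * page_width


# Made-up region and exons, for illustration only.
region_start, region_end = 1000, 5000
exons = [(1000, 1400), (3000, 3200), (4600, 5000)]

for start, end in exons:
    left = to_page_x(start, region_start, region_end, 8.0)
    right = to_page_x(end, region_start, region_end, 8.0)
    print(f"exon {start}-{end}: left edge {left:.2f}, width {right - left:.2f}")
```

The same function can be reused for SNP arrows, since a SNP is just a single position on the same scale.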

II. SNP location
-
Once you have the basic gene structure you need to add in the SNP positions.

-
Go to UCSC table browser: http://genome.ucsc.edu/cgi-bin/hgTables.  For a good tutorial on using UCSC table browser, visit the Open Helix site http://www.openhelix.com/downloads/ucsc/ucsc_home.shtml.  
http://wwwchg.duhs.duke.edu/grasp/embim20.png
-
Above is a screenshot of the settings you want on the UCSC table browser.
o
Assembly=Mar. 2006, which is build 36.  If you want build 35, select assembly=May 2004.

o
Group=Variation and Repeats.
o
Track=SNPs (128) – this refers to dbSNP build 128.  This build number may change as UCSC loads updated builds of dbSNP.

-
If you have a long list of SNPs you can use the ‘upload list’ option.  However, it’s usually more convenient to just paste in your list of SNPs.

http://wwwchg.duhs.duke.edu/grasp/embim21.png

-
The database is case-sensitive, so be sure to use lower case ‘rs’ in your list of SNP names.  Click ‘submit’.  Be patient, as it takes a minute or two for the list to process.
-
Note: Occasionally UCSC will not be able to find your SNP name.  This is usually because you are using an outdated rs#; SNPs are sometimes merged into a new rs# in newer dbSNP builds.  To check this, go to the dbSNP homepage, http://www.ncbi.nlm.nih.gov/SNP/.  In the ‘Search by IDs on All Assemblies’ section, paste in your SNP name, again with lowercase ‘rs’.  The search result should tell you whether the SNP has been merged into an updated rs#.

http://wwwchg.duhs.duke.edu/grasp/embim22.png
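
Because the SNP names must use a lowercase ‘rs’, it can help to tidy the list before pasting it in.  Here is a minimal sketch; the function name and the sample rs numbers are made up for illustration.  It only fixes case, whitespace and duplicates, and it cannot tell you whether an rs# has been merged (that still needs the dbSNP check described above).

```python
import re

# An SNP name is expected to look like 'rs' followed by digits.
RS_PATTERN = re.compile(r"^rs\d+$")


def clean_snp_list(raw_names):
    """Lower-case names, strip whitespace, and drop blanks and duplicates.

    Returns (valid, rejected). 'valid' keeps first-seen order; 'rejected'
    holds anything that does not look like 'rs' followed by digits.
    """
    valid, rejected, seen = [], [], set()
    for name in raw_names:
        candidate = name.strip().lower()
        if not candidate:
            continue
        if not RS_PATTERN.match(candidate):
            rejected.append(name.strip())
            continue
        if candidate not in seen:
            seen.add(candidate)
            valid.append(candidate)
    return valid, rejected


# Placeholder names, not real SNPs of interest.
raw = ["RS12345", " rs67890 ", "rs12345", "", "SNP_A-1"]
valid, rejected = clean_snp_list(raw)
print("\n".join(valid))        # one name per line, ready to paste
print("rejected:", rejected)
```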

-
Back in the UCSC table browser, your list has been uploaded.  Set the output format to ‘custom track’ and click the ‘get output’ button.  This will take you to the following settings page:

http://wwwchg.duhs.duke.edu/grasp/embim23.png

-
You can either accept the default track name and description, or add in your own.  Click ‘get custom track in genome browser.’
-
You will be taken to the UCSC genome browser.   Either search for your gene name or type in the exact coordinates (chr#:bp start-bp end).  If you know you have some flanking SNPs, be sure you have zoomed out wide enough to see all of your SNPs.
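
To be sure the window covers every flanking SNP, you can build the coordinate string from the outermost positions.  This is a rough sketch: the function name, the 500 bp padding, and the coordinates are assumptions chosen for illustration, not values for any real gene.

```python
def browser_position(chrom, gene_start, gene_end, snp_positions, padding=500):
    """Build a 'chr#:start-end' string spanning the gene and all SNPs.

    'padding' adds a little room on each side so the outermost SNPs
    are not drawn right at the edge of the window.
    """
    positions = [gene_start, gene_end] + list(snp_positions)
    lo = max(min(positions) - padding, 1)   # never go below position 1
    hi = max(positions) + padding
    return f"{chrom}:{lo}-{hi}"


# Placeholder coordinates for illustration only.
print(browser_position("chr7", 10000, 18000, [9200, 12000, 19500]))
```

Paste the resulting string into the position box of the genome browser and adjust the zoom from there.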

http://wwwchg.duhs.duke.edu/grasp/embim24.png

-
Above you can see the NPY gene and the SNP list uploaded through the Table Browser. Your custom track should always appear at the top.  Note the red arrow pointing to the track description.
-
Note: To turn off or delete this custom track when you are done, either go to ‘manage custom tracks’ or click on the link to your track name under ‘Custom Tracks.’  Both will give you the option to delete the track.  If you just want to hide the track, select ‘hide’ from the drop-down menu.

http://wwwchg.duhs.duke.edu/grasp/embim25.png

-
Take a screenshot of your SNP track, including the gene structure below.  Paste this into your Visio document and resize it to match the gene structure you have already drawn.
-
In Visio, place the arrows directly over the SNPs on the UCSC screenshot.  Once you’ve placed all the arrows, use the ‘Lasso select’ tool and then align shapes.  You can then move the arrows down off the screenshot as a set.
http://wwwchg.duhs.duke.edu/grasp/embim26.png

http://wwwchg.duhs.duke.edu/grasp/embim27.png


III. SNP function
-
SNP function can be indicated using either text color or arrow shapes (or some combination thereof).  Below are examples:
http://wwwchg.duhs.duke.edu/grasp/embim28.png
-
You can get functional classification of your SNPs using:
o
CHG SNP Selector (“Function” column in the output).

o
UCSC Table Browser.  To do this, repeat the steps used to get the custom track.  Instead of selecting ‘custom track’ as the output, select ‘all fields from selected table’ from the drop-down menu.

http://wwwchg.duhs.duke.edu/grasp/embim29.png
If you specify an output file name, the data will be saved to a text file at that name and location.  If you leave ‘output file’ blank, the data will display in your browser.  Go to Page>Save as and save as a text file.  You can then open this file in Excel.
o
Other online bioinformatics resources such as GeneCruiser (http://genecruiser.broad.mit.edu/genecruiser3/pages/index.jsf) or SNPper (http://snpper.chip.org/).
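
If you would rather not go through Excel, the text file saved from the Table Browser can also be read with a short script.  This is a minimal sketch, assuming the file is tab-separated, has a header line beginning with ‘#’, and contains ‘name’ and ‘func’ columns; check your own download, because that layout is an assumption here.  The rs numbers and function values in the sample are made-up placeholders.

```python
import csv
import io

# Stand-in for the saved Table Browser output (assumed layout, fake data).
sample = (
    "#chrom\tchromStart\tchromEnd\tname\tfunc\n"
    "chr7\t100\t101\trs0000001\tintron\n"
    "chr7\t250\t251\trs0000002\tcoding-synon\n"
    "chr7\t400\t401\trs0000003\tuntranslated\n"
)


def read_snp_functions(handle):
    """Return {SNP name: function} from a tab-separated table with a header."""
    reader = csv.reader(handle, delimiter="\t")
    header = [col.lstrip("#") for col in next(reader)]
    name_idx = header.index("name")
    func_idx = header.index("func")
    return {row[name_idx]: row[func_idx] for row in reader if row}


functions = read_snp_functions(io.StringIO(sample))
for name, func in functions.items():
    print(name, func)
```

For a real file, replace `io.StringIO(sample)` with something like `open("snps.txt", newline="")`, where `snps.txt` is whatever name you gave the output file.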

-
Once you have SNP function added to your Excel list of SNP names, you can sort by function and change the SNP text color in batch.  I recommend setting the font of your SNP names to Arial 11 pt bold.
-
Copy the cell containing your SNP name and paste onto the Visio file and rotate the text box as desired.  You can change the text size within Visio, but if it doesn’t fit in the text box you may have to resize to avoid text wrapping.  It’s best to test out which font size looks best in Visio, then set all your font preferences in Excel before pasting to Visio.

-
You can also use the ‘align shapes’ tool in Visio to align your text boxes.  If you do, use the third ‘vertical alignment’ option, as your SNP names will likely be different lengths.

http://wwwchg.duhs.duke.edu/grasp/embim30.png
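
The sort-and-color step can also be planned outside Excel.  The sketch below groups SNP names by function and attaches a text color to each; the color scheme, the function labels and the rs numbers are all hypothetical, so substitute whatever convention your figure uses.

```python
# Hypothetical color scheme; anything not listed falls back to the default.
FUNCTION_COLORS = {
    "coding-synon": "green",
    "untranslated": "blue",
}


def labels_by_function(snp_functions, colors, default="black"):
    """Return (function, SNP name, color) rows sorted by function, then name."""
    rows = [
        (func, name, colors.get(func, default))
        for name, func in snp_functions.items()
    ]
    return sorted(rows)


# Placeholder SNPs for illustration only.
snps = {
    "rs0000001": "intron",
    "rs0000002": "coding-synon",
    "rs0000003": "untranslated",
}

rows = labels_by_function(snps, FUNCTION_COLORS)
for func, name, color in rows:
    print(f"{name}\t{func}\t{color}")
```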

IV. Final steps
-
Once you have the gene structure and the SNP locations within the gene, the level of detail you add is up to you.  Some options include exon and intron lengths in base pairs, a scale bar, and start/stop codons.  Remember that while ATG is the standard start codon, the stop codon (TAA, TAG or TGA) differs from gene to gene.  Check the sequence in one of the genome browsers.
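
If you have copied the coding sequence out of a genome browser, a quick check of the first and last codons looks like this.  It is a minimal sketch that assumes the sequence is the coding strand written 5’ to 3’ and uses the standard genetic code; the function name and the example sequence are made up.

```python
# Stop codons in the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def start_and_stop(cds):
    """Return (start codon, stop codon, start is ATG?, stop is valid?)."""
    seq = cds.strip().upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        raise ValueError("expected a coding sequence whose length is a multiple of 3")
    start, stop = seq[:3], seq[-3:]
    return start, stop, start == "ATG", stop in STOP_CODONS


# Made-up sequence for illustration: ATG GCT TGC TGA
start, stop, start_ok, stop_ok = start_and_stop("atggcttgctga")
print("start codon:", start, "(ATG)" if start_ok else "(not ATG, check strand)")
print("stop codon: ", stop, "(valid)" if stop_ok else "(not a stop codon)")
```

If the start is not ATG or the last codon is not a stop, the most likely causes are a reverse-strand sequence that has not been reverse-complemented or a selection that includes untranslated sequence.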

-
Visio is very similar to PowerPoint and other Microsoft Office applications.  You can add text, draw shapes, group/ungroup objects, and much more.  You can insert new pages into your Visio document to keep different versions of your gene figure (similar to adding worksheets to an Excel workbook).  This way you can have multiple versions under the same file name.
-
Once you have the figure finalized in Visio, you can output it to several image file formats.  Use the lasso or area select tool to select the figure.  Under save as, select the desired image file format.  If saving as a .jpg, change the quality to 100% (default is 75%).

-
You can also copy out of Visio and paste directly onto a PowerPoint slide.  However, the pasted result is usually lower in quality than the exported image file.


 HOPE THIS HELPS........
