- Contents
- 1. Preface
- Rodger Staden, January 1994.
- 1.1 Preface to third edition (November, 1993)
- 1.2 Preface to second edition (November, 1992)
- 1.3 Preface to first edition (March 1992)
- 2. Introduction
- Table of contents
- 1.Introduction
- 2.Materials
- 2.1Versions.
- 2.1.1UNIX version.
- 2.1.2Other UNIX versions.
- 2.2Terminals.
- 2.3Digitizers.
- 2.4Sequencing machines and film readers.
- 3.User Interfaces
- 3.1The xterm interface
- 3.1.1.Menus and option selection
- 3.1.2Execution and dialogue
- 3.1.3Help
- 3.1.4.Quitting
- 3.1.5.Making selections
- 3.1.5.1.Choosing between opposites.
- 3.1.5.2. Choosing one from many.
- 3.1.5.3.Choosing at least one from many.
- 3.1.6.Input of numerical values
- 3.1.7.Input of character strings
- 3.2.The X interface
- 3.3.Use of the bell
- 3.4.Printing and saving results in files
- 3.5.Use of feature tables
- 3.6.Use of graphics
- 3.6.1.The drawing board and plot positions
- 3.6.2.The plot interval
- 3.6.3.The window length
- 3.6.4.Use of the cross hair
- 3.6.5Drawing scales on plots
- 3.6.6Saving graphics
- 3.6.7 PostScript files
- 3.7.The active region
- 3.8.Files of file names
- 4.Character Sets
- 4.1Character sets for finished sequences
- 4.2Symbols used in gel readings
- 5.Sequence Formats
- 5.1Personal sequence files
- 5.2Sequence libraries
- 5.2.1 Introduction to the EMBL CDROM indexing method
- 5.2.2 Organisation of the sequence library files.
- 5.2.3 Installing the sequence libraries.
- 5.2.3.1 Installing from the EMBL CDROM
- 5.2.3.2 Installing all libraries other than those on the EMBL CDROM
- 6.Conventions Used In The Manual
- 7.NOTES
- 7.1
- 8.References
- 3. Sequence Input, Editing and Sequence Library Use
- Table of contents
- 1.Introduction
- 1.1Introduction to sequence input and editing
- 1.2Introduction to keyboard input
- 1.3Introduction to input from digitizer
- 1.4Introduction to editing single sequences
- 1.5Introduction to using the sequence libraries
- 2.Methods
- 2.1Sequence input from keyboard
- 2.2Sequence input from digitizer
- 2.3Sequence input from the Pharmacia A.L.F.
- 2.4Sequence input from the ABI 373A.
- 2.5Editing a nucleic acid sequence using restriction sites and a translation and base numbering as landmarks.
- 2.6Searching the freetext (or author, or taxon) index of a sequence library
- 2.7Using accession numbers to retrieve data from a sequence library
- 2.8Displaying the annotations for an entry in a sequence library
- 2.9Reading a sequence from a sequence library
- 2.10Worked example of sequence library access
- 3.NOTES
- 4.References
- 4. Managing Sequencing Projects
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Starting a project database
- 2.2Screening against restriction enzyme recognition sequences
- 2.3Screening against vector sequences and repeat families
- 2.3.1Clipping off vector sequences
- 2.3.2Screening for "vectors"
- 2.3.3 Screening for repeat families
- 2.4Entering readings into the project database (Assembly)
- 2.5Searching for internal joins
- 2.6Editing in XBAP
- 2.6.1Scrolling through the contig
- 2.6.2Editing operations
- 2.6.3Use of buttons
- 2.6.4Displaying traces for readings from fluorescent sequencing machines
- 2.6.5Extending reads with the hidden data
- 2.6.6Using the pop-up menu
- 2.6.7Annotating readings
- 2.6.8Creating a new annotation
- 2.6.9Editing an existing annotation
- 2.6.10Deleting an annotation
- 2.6.11Searching
- 2.6.11.1Search by position
- 2.6.11.2Search by reading name
- 2.6.11.3Search by tag type
- 2.6.11.4Search by annotation
- 2.6.11.5Search by sequence
- 2.6.11.6Search by problem
- 2.6.11.7Search by quality
- 2.7Joining contigs interactively using XBAP
- 2.8Selecting primers and templates
- 2.8.1Selecting primers and templates interactively
- 2.9Examining the "quality" of a contig
- 2.10Using graphical displays to examine contigs
- 2.11Check assembly
- 2.12Examining the positions of reads from the same template
- 2.13Disassembling and breaking contigs
- 2.13.1Disassembling contigs
- 2.13.2Breaking a contig
- 2.14Finding and labelling repeats
- 2.15Filling single stranded regions with hidden data
- 2.16Shuffling pads
- 2.17Checking for editing mistakes
- 2.18Displaying a contig
- 2.19Highlighting differences between readings and the consensus
- 2.20Screen editing contigs in SAP
- 2.21Automatic editing of contigs in SAP
- 2.22Using the original editor in SAP
- 3. NOTES
- 3.1 Finding the cloning site.
- 3.2 Finding the primer site
- 3.2.1 The forward primer.
- 3.2.2 The reverse primer
- 3.3 How to do the calculations in a single step
- 3.3.1 The forward primer
- 3.3.2 The reverse primer
- 4.References
- 5. Analysing Sequences to Find Genes
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1The uneven positional base frequencies method.
- 2.2The positional base preferences method
- 2.2.1Using the global standard
- 2.2.2Using a nonglobal standard
- 2.3The codon usage method
- 2.4Searching for open reading frames
- 2.5Searching for tRNA genes
- 3.Notes
- 4.References
- 6. Searching for Motifs in Nucleic Acid Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Searching for percentage matches to consensus sequences
- 2.2Searching for consensus sequences using a score matrix
- 2.3Using weight matrices for searching nucleotide sequences
- 2.3.1Creating a weight matrix file from a set of aligned sequences
- 2.3.2Searching using a weight matrix
- 2.4Using "hardwired" motif searches.
- 2.4.1Searching for splice junctions
- 3.Notes
- 4.References
- 7. Using Patterns to Analyse Nucleic Acid Sequences
- Table of contents
- 1.Introduction
- 2. Methods
- 2.1Creating a pattern file containing an exact match motif and weight matrix motif.
- 2.2Searching a sequence using a pattern file
- 2.3Comparing a sequence against a library of patterns
- 2.4Searching sequence libraries for patterns
- 3.Notes
- 4.References
- 8. Searching for Restriction Sites
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Search for restriction enzyme sites and list them enzyme by enzyme
- 2.2Search for restriction enzyme sites and list them by position
- 2.3Search for restriction enzyme sites and list their names above the sequence
- 2.4Search for restriction enzyme sites and plot their positions
- 2.5Finding restriction enzymes that cut infrequently
- 2.6Producing a back translation from a protein sequence
- 3.Notes
- 9. Statistical and Structural Analysis of Nucleotide Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Calculating the base composition
- 2.2Calculating the dinucleotide composition
- 2.3Calculating the codon composition
- 2.4 Creating a codon usage file
- 2.5Plotting the base composition
- 2.6Searching for anomalous compositions
- 2.7Search for anomalous word usage
- 2.8Calculate codon constraint
- 2.9Searching for stem-loop structures
- 2.10Searching for long range inverted repeats
- 2.11Searching for long range repeats
- 2.12Searching for repeated words
- 2.13Searching for possible Z DNA
- 3.Notes
- 4.References
- 10. Translating and Listing Nucleic Acid Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Listing the sequence with all six reading frames translated
- 2.2Listing the sequence with its open reading frames translated
- 2.3Listing the sequence with defined segments translated
- 2.4Listing the sequence with translated segments defined from a feature table
- 2.5Producing a file of protein sequences for all open reading frames.
- 2.6Producing a file of protein sequences for segments defined from a feature table
- 3.Notes
- 11. Statistical and Structural Analysis of Protein Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Plotting hydrophobicity
- 2.2Plotting charge
- 2.3Plotting hydrophobic moment and hydrophobicity
- 2.4Drawing helical wheels
- 2.5Producing a Robson secondary structure prediction
- 2.6Calculating the composition and molecular weight of a sequence.
- 3.Notes
- 4.References
- 12. Searching for Motifs in Protein Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Searching for exact matches.
- 2.2Searching for percentage matches to sequences
- 2.3Searching for sequences using a score matrix
- 2.4Using weight matrices for searching protein sequences
- 2.4.1Creating a weight matrix file from a set of aligned sequences
- 2.4.2Searching using a weight matrix
- 3.Notes
- 4.References
- 13. Using Patterns to Analyse Protein Sequences
- Table of contents
- 1.Introduction
- 1.1Introduction to the PROSITE motif library
- 1.1.1 Browsing through PROSITE.
- 2.Methods
- 2.1Creating a pattern file containing a weight matrix motif and a membership of a set motif.
- 2.2Searching a sequence using a pattern file
- 2.3Comparing a sequence against a library of patterns including PROSITE
- 2.4Searching libraries for patterns
- 2.5Preparing the PROSITE motif library for use by the programs
- 3.Notes
- 4.References
- 14. Comparing Sequences
- Table of contents
- 1.Introduction
- 2.Methods
- 2.1Producing a dot matrix plot (or list) of exact matches
- 2.2Producing a dot matrix plot using the proportional algorithm
- 2.3Producing a dot matrix plot using the quick scan algorithm
- 2.4Producing a list of all matching segments using the proportional algorithm
- 2.5Calculating the expected scores for the proportional algorithm
- 2.6Calculating the observed scores for the proportional algorithm
- 2.7Producing an optimal alignment
- 2.8Comparing a sequence against a library of sequences
- 3.Notes
- 4.References