
Staden Package Program Summary

Assembly

Assembly program

gap4 Performs assembly, contig joining, assembly checking, repeat searching, experiment suggestion, read pair analysis and contig editing. Has graphical views of contigs, templates, readings and traces, all of which scroll in register. The contig editor searches and the experiment suggestion routines use phred confidence values to calculate the confidence of the consensus sequence, and hence identify only those places that require visual trace inspection or extra data. The result is extremely rapid finishing and a consensus of known accuracy.
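
As a rough illustration of what a phred confidence value means, the following Python sketch converts between a confidence value Q and its implied error probability, using the standard phred relation P = 10^(-Q/10), and lists positions that fall below a threshold. It is not gap4's consensus algorithm; the function names and the threshold of 20 are assumptions chosen for illustration.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by phred value q: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred value implied by error probability p: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def low_confidence_positions(confidences, threshold=20):
    """Return 0-based positions whose confidence is below the threshold.

    The threshold is a hypothetical choice; such positions are the kind
    of place that might call for trace inspection or extra data.
    """
    return [i for i, q in enumerate(confidences) if q < threshold]
```

A confidence of 20 corresponds to an estimated 1 in 100 chance that the base is wrong, and 30 to 1 in 1000.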

Preparing sequence trace data for analysis or assembly

pregap4 Provides a graphical user interface for setting up the processing required to prepare trace data for assembly or analysis, and a way to automate that processing. The processes that can be set up include trace format conversion, quality analysis, vector clipping, contaminant screening and repeat searching.

Sequence screening

vector_clip Finds and marks (with tags) vector segments of sequence readings stored as Experiment Files. Rapid and sensitive, and usually used via pregap4.
screen_seq Searches sequence readings stored as Experiment Files for matches against sets of possible contaminant sequences. Typically used to look for E. coli or yeast contamination. Very fast, and usually used via pregap4.
repe Finds and marks (with tags) known repeat sequences (e.g. Alus) in sequence readings stored as Experiment Files. Usually used via pregap4.

Trace viewing

trev A rapid and flexible viewer and editor for ABI, ALF or SCF trace files. Provides good support for interaction with Experiment Files.

Mutation detection

trace_diff Automatically locates point mutations by comparing new traces against a reference trace. Handles any number of files in a single run and prepares results which can be viewed in gap4.
gap4 For viewing aligned sequences and traces and checking automatic mutation assignments. Can subtract traces and display their differences.


Sequence analysis

Comparison

sip4 Compares pairs of sequences in many ways, often presenting its results graphically. Has very rapid dot matrix analysis, global and local alignment, plus a sliding sequence window linked to the graphical plots. Can compare nucleic acid against nucleic acid, protein against protein, and protein against nucleic acid. Includes a sequence library browser.

Nucleotide analysis

nip4 Analyses nucleotide sequences to find genes, restriction sites, motifs, etc. Performs translations, finds open reading frames, counts codons, etc. Many results are presented graphically and a sliding sequence window is linked to the graphics cursor. Includes a sequence library browser.


Sequence library access

slim Searches the indexes for EMBL, SWISS-PROT, GenBank, PIR and FASTA format libraries.
nip4 Includes a sequence library browser.
sip4 Includes a sequence library browser.


Sequence trace and reading file manipulation

ABI files

getABIstring Displays arbitrary string fields from an ABI trace file.
getABIcomment Displays the comments from an ABI trace file. Equivalent to getABIstring CMNT.
getABISampleName Displays the sample name (reading name) stored in an ABI trace file. Equivalent to getABIstring SMPL.
getABIdate Displays the run date from an ABI trace file.

ALF files

alfsplit Splits a Pharmacia ALF gel file into multiple files. This is necessary before processing by pregap4.

SCF files

makeSCF Converts existing trace files (in any format) into SCF files.
scf_info Displays details stored in the header of an SCF file.
scf_dump Displays the entire SCF file contents in a human readable format.
scf_update Converts between SCF file versions (2 to 3 and vice versa).
get_scf_field Extracts data from the SCF comment section.
eba Estimates the base accuracy of each base in an SCF file.

Sequence quality clipping

qclip Performs simple quality clipping of Experiment Files based on confidence values or on the sequence composition.
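
The idea of clipping on confidence values can be sketched in Python as follows. This is a generic sliding-window approach under stated assumptions, not qclip's actual algorithm: the function name, the window length and the minimum mean confidence are all hypothetical.

```python
def quality_clip(confidences, window=5, min_mean=15):
    """Return a half-open (left, right) range of bases to retain.

    A window is "good" if its mean confidence is at least min_mean.
    The retained region runs from the start of the first good window
    to the end of the last good window; (0, 0) means nothing passed.
    """
    n = len(confidences)
    good_starts = [
        start
        for start in range(0, n - window + 1)
        if sum(confidences[start:start + window]) / window >= min_mean
    ]
    if not good_starts:
        return (0, 0)
    return (good_starts[0], good_starts[-1] + window)
```

Because the decision is made per window, a few low-confidence bases can survive at the edges of the retained region; a smaller window or a higher threshold clips more tightly.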

Gap4 database utilities

convert Converts between the various assembly database formats.
copy_db Copies and garbage collects gap4 databases.

Other sequencing utilities

extract_seq Extracts the sequence component from trace files or Experiment Files.
init_exp Extracts the sequence and related information from trace files to output in Experiment File format.


Scripting utilities

stash General purpose scripting interface to Gap4, Nip4, Sip4 and Slim. Based around Tcl and Tk, but with many additional commands. May also be used for producing graphical scripts and interfaces.


Misc

splitseq_da Splits large sequences into a set of overlapping smaller sequences. Outputs the sequences in Experiment File format with attributes suitable for input using Directed Assembly.
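
Splitting a sequence into overlapping pieces can be sketched in Python as below. This illustrates the general technique only and is not splitseq_da's implementation: the function name, the default piece size and overlap are hypothetical, and the sketch returns plain strings rather than writing Experiment Files.

```python
def split_overlapping(seq, size=1000, overlap=100):
    """Split seq into pieces of at most `size` bases.

    Consecutive pieces share `overlap` bases. Returns a list of
    (start_offset, piece) pairs, where start_offset is the 0-based
    position of the piece within the original sequence.
    """
    if size <= 0 or overlap < 0 or overlap >= size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not seq:
        return []
    step = size - overlap
    pieces = []
    start = 0
    while True:
        pieces.append((start, seq[start:start + size]))
        if start + size >= len(seq):
            break  # this piece reached the end of the sequence
        start += step
    return pieces
```

Keeping the start offset with each piece preserves the information needed to place the pieces back at known positions relative to one another.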


This page is maintained by staden-package.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/overview.html