Eba


DESCRIPTION

Eba will calculate numerical estimates of base accuracy for each base in an SCF or ZTR file. The figures calculated should not be considered as reliable and better values can be obtained from phred or ATQA.
To estimate base accuracies, eba performs the following calculation for each base: it calculates the area under the peaks for each base type, then divides the area under the called base by the largest area under the other three bases. Since the 2002 release these values have been normalised to the phred scale (this was achieved by comparing the original eba values and phred values for 4.6 million base calls of Sanger Centre data).
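The ratio described above can be sketched in Python. This is only an illustration under stated assumptions, not eba's actual code: the function names, the dictionary-of-lists trace layout, and the fixed sample window around each peak are all hypothetical, and the phred-scale normalisation is omitted because its mapping is not given here.

```python
# Hypothetical sketch of the raw (un-normalised) eba ratio for one base call.
# Assumption: traces maps each base type to its list of trace samples, and
# [start, end) is the sample window taken to cover the peak of the called base.

def area(samples, start, end):
    """Approximate the area under a trace by summing samples in [start, end)."""
    return sum(samples[start:end])

def raw_accuracy(traces, called, start, end):
    """Area under the called base divided by the largest area of the other three."""
    called_area = area(traces[called], start, end)
    largest_other = max(
        area(traces[base], start, end) for base in "ACGT" if base != called
    )
    if largest_other == 0:
        # No competing signal at all; the ratio is unbounded.
        return float("inf")
    return called_area / largest_other

traces = {
    "A": [0, 10, 20, 10, 0],
    "C": [0, 1, 2, 1, 0],
    "G": [0, 0, 1, 0, 0],
    "T": [0, 2, 4, 2, 0],
}
ratio = raw_accuracy(traces, "A", 0, 5)
```

A clean peak with little signal in the other three traces gives a large ratio, while overlapping peaks push the ratio towards 1 or below.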
With no filename as an argument eba reads from standard input and writes to standard output. This enables eba to be used as a filter, or to estimate base accuracies for unwritable files. If a file is specified on the command line then the accuracy figures will be written to this file.
extract_seq extracts the sequence information from binary trace files, Experiment files, or from the old Staden format plain files. The input can be read either from files or from standard input, and the output can be written to either a file or standard output. Multiple input files can be specified. The output contains the sequences split onto lines of at most 60 characters each.
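The 60-character line splitting mentioned above can be illustrated with a short Python sketch. The function name split_sequence is hypothetical and this is not extract_seq's own code; it only shows the output layout the description implies.

```python
# Hypothetical sketch: split a sequence into lines of at most 60 characters,
# mirroring the output layout described for extract_seq.

def split_sequence(sequence, width=60):
    """Return the sequence as a list of chunks no longer than width."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]

lines = split_sequence("ACGT" * 40)  # 160 bases
for line in lines:
    print(line)
```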

find_renz may be used to determine the position at which an enzyme cuts a sequence. Its use as a command line utility is primarily designed for internal use within pregap4 and as a user utility for producing vector-primer files for use with vector_clip. As such it is dedicated to finding one and only one such cut site, and considers no cut sites or multiple cut sites to be an error.
Only one enzyme may be specified, which is given by the enzyme name (upper or lower case is not important). One or more filenames may be specified. If an enzyme does not cut a sequence the message "Enzyme not found in sequence" will be sent to stderr. If an enzyme cuts a sequence more than once the message "Found more than one match" will be sent to stderr. Otherwise output is produced to stdout. This means that wildcards may be used (find_renz -vp smai *.seq >> vpfile) with the output redirected without needing to consider whether the enzyme is suitable for all files matching the wildcard pattern.
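The "one and only one cut site" rule can be sketched in Python as follows. This is a minimal illustration, not find_renz itself: the function name is hypothetical, the recognition site is passed in directly rather than looked up by enzyme name, and reporting the 0-based start of the recognition site is an assumption about the position convention. The two error messages are the ones quoted above; SmaI recognises CCCGGG.

```python
# Hypothetical sketch of the single-cut-site check described for find_renz.
# Assumption: the position reported is the 0-based start of the recognition site.

def find_single_site(sequence, site):
    """Return the position of the only occurrence of site, else raise ValueError."""
    seq = sequence.upper()      # matching is case-insensitive
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by 1 so overlapping sites count
    if not positions:
        raise ValueError("Enzyme not found in sequence")
    if len(positions) > 1:
        raise ValueError("Found more than one match")
    return positions[0]

position = find_single_site("aattcccgggtt", "CCCGGG")
```

Raising an error for zero or multiple matches, rather than returning a list, is what lets a caller process many files and keep only the unambiguous results.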


OPTIONS


Directs reading of experiment file to attempt extraction of sequence from the referenced (LN and LT line types) trace file. Without this option, or when the trace file cannot be found, the sequence output is that listed in the Experiment File. This option has no effect for other input format types.


-abi, -alf, -scf, -exp, -pln




Specify an input file format. This is not usually required as extract_seq will automatically determine the correct input file type. This option is supplied in case the automatic determination is incorrect (which is possible, but has never been observed).


-good_only




When reading an experiment file or SCF file containing clip marks, output only the good sequence which is contained within the boundaries marked by the QL, QR, SL, SR, CL, CR and CS line types.


-clip_cosmid




When the -good_only argument is specified this controls whether the cosmid sequence should be considered good data. Without this argument cosmid sequence is considered good.


-fasta_out




Specifies that the output should be in FASTA format.


-output file




The sequence will be written to file instead of standard output.


-vp




Specifies that the output should be in a format suitable for saving to a vector-primer file (to use with vector_clip). Without this only the cut site position is listed.
