Hi Everyone,
I'm trying to use CAP3. I have two small files of paired reads: one with forward reads and the other with reverse reads. I've been going through the manual to figure out how to specify which reads are forward and which are reverse, but I'm a little confused by it. Has anyone else used this who could help me out? Specifically, I don't understand what they mean by "dots" in the read names or how to set that up.
The information from the manual that pertains to what I'm doing:
Input to CAP3
CAP3 takes as input a file of sequence reads in FASTA format.
If the names of reads contain a dot ('.'), CAP3 requires that
the names of reads sequenced from the same subclone contain
the same substring up to the first dot.
CAP3 takes two optional files: a file of quality values
in FASTA format and a file of forward-reverse constraints.
The file of quality values must be named "xyz.qual", and
the file of forward-reverse constraints must be named "xyz.con",
where "xyz" is the name of the sequence file.
CAP3 uses the same format of a quality file as Phrap.
Each line of the constraint file specifies one forward-reverse constraint
of the form:
ReadA ReadB MinDistance MaxDistance
where ReadA and ReadB are names of two reads, and
MinDistance and MaxDistance are distances (integers) in base pairs.
The constraint is satisfied if ReadA in forward orientation occurs
in a contig before ReadB in reverse orientation, or
ReadB in forward orientation occurs in a contig before ReadA
in reverse orientation, and their distance is between MinDistance
and MaxDistance.
CAP3 works better if a lot more constraints are used.
We have a separate program named "formcon" to generate
a constraint file from the sequence file.
The program takes an input file of fragments in FASTA format
and two integers (minimum distance and maximum distance in bp).
The minimum and maximum distances specify a lower and
an upper limit on the subclone length, respectively.
It produces a file of forward-reverse constraints for CAP3.
It is assumed that a pair of forward and reverse reads must
contain a dot in their names and a pair of forward and reverse reads
have a common name up to the first dot.
Because CAP3 uses reads whose ends are clipped, instead of raw reads,
to measure their distance, the distance seen by CAP3 could be different
from the insert size by 1000 to 1500 bp. For example,
if the insert size is 2000 to 3000 bp, we recommend that you use
500 for the minimum distance and 4000 for the maximum distance.
The results are in the file with name ending in ".con".
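My best reading of the "dots" part, as a rough sketch I have not run through CAP3: each forward/reverse pair has to share the same name up to the first dot, so the two files would be merged into one FASTA file with renamed reads. The Python below assumes the two files list the pairs in the same order, and the ".f" / ".r" suffixes and the function names are my own placeholders, not something the manual prescribes. It also builds constraint lines in the "ReadA ReadB MinDistance MaxDistance" form, using the 500 and 4000 values from the manual's example.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first whitespace-delimited token as the read name.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def pair_reads(forward, reverse, min_dist, max_dist):
    """Rename paired reads to share a prefix up to the first dot.

    Assumes forward[i] and reverse[i] come from the same subclone.
    Returns (merged_records, constraint_lines).
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse files differ in read count")
    merged, constraints = [], []
    for (f_name, f_seq), (_r_name, r_seq) in zip(forward, reverse):
        # Remove any existing dots so the first dot is the one we add.
        base = f_name.replace(".", "_")
        fwd, rev = base + ".f", base + ".r"  # hypothetical suffixes
        merged.append((fwd, f_seq))
        merged.append((rev, r_seq))
        constraints.append(f"{fwd} {rev} {min_dist} {max_dist}")
    return merged, constraints


# Toy data standing in for the two input files.
fwd = read_fasta([">read1", "ACGT", ">read2.x", "GGCC"])
rev = read_fasta([">read1", "TTAA", ">read2.x", "CCGG"])
merged, cons = pair_reads(fwd, rev, 500, 4000)

# Text for the merged sequence file and its constraint file.
fasta_text = "".join(f">{name}\n{seq}\n" for name, seq in merged)
con_text = "\n".join(cons) + "\n"
```

If the merged sequences were saved as a file called "xyz", the constraint text would go into "xyz.con" per the manual. Whether formcon expects particular suffixes after the dot is something I couldn't tell from the excerpt.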
Any help would be appreciated, thanks!