SEQanswers

Old 11-23-2010, 08:45 AM   #1
harrb
Junior Member
 
Location: Germany

Join Date: Feb 2008
Posts: 5
documentation for ABySS

Hello,

I am a new user of ABySS. I have 50 million Solexa 100 bp paired-end transcriptome reads that I want to assemble de novo. Velvet could not handle my dataset, but my first run with ABySS worked. However, I am having a hard time understanding the output.
I have four questions:

1) I could not find any documentation on the output files created. Can someone maybe direct me to a page where the various files are explained?

2) I also do not quite understand whether ABySS takes the paired-end information into account for the alignment. Does someone know? I ran the "pe" mode.

3) What is the number of mismatches allowed in a read to still be assembled to a contig?

4) Below I have pasted a few contigs from the output file that ABySS created (i.e. xxx_contigs.fa):


>29036 60 1326
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTCTG
>29037 61 114
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGTTG
>29038 62 268
TCGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTGTG
>29039 60 1139
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGCTG
>29040 61 51
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGCCG
>29041 60 13
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGCTT
>29042 61 183
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGGTG
>29043 61 44
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTCCG
>29044 61 70
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTTTG
>29045 69 20
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGCTCATAAATGCA
>29046 69 34
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTCTCATAAATGCA
>29047 69 20
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTCTTGTAAATGCA

It is quite clear that these contigs are identical except for a few single-base mismatches. Why are they not merged into a single contig?
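
A quick way to check this observation is to line the contigs up and count the differing bases. The following is a minimal Python sketch, not part of ABySS: `parse_fasta` and `mismatches_after_seed` are hypothetical helper names, and the only assumption about the header is that the second field is the contig length (which matches the sequences above). The third header field is read but not interpreted. Only the first three contigs are included to keep it short.

```python
# Three of the contigs pasted above, copied verbatim.
CONTIGS = """
>29036 60 1326
GGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTCTG
>29037 61 114
CGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGGTTG
>29038 62 268
TCGGCTCGAGGGTATCTAGAGTCACCAAAGCTGCCGGGCGGGCCCGGGGTGGGTTTGGTGTG
"""


def parse_fasta(text):
    """Yield (contig_id, declared_length, third_field, sequence) tuples."""
    header, seq = None, []
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield (*header, "".join(seq))
            cid, length, third = line[1:].split()[:3]
            header, seq = (cid, int(length), int(third)), []
        elif line:
            seq.append(line)
    if header is not None:
        yield (*header, "".join(seq))


def mismatches_after_seed(ref, other, seed_len=20):
    """Anchor `other` on the first seed_len bases of `ref`, then count
    mismatching positions over the overlapping region.
    Returns None if the seed does not occur in `other`."""
    offset = other.find(ref[:seed_len])
    if offset < 0:
        return None
    return sum(a != b for a, b in zip(ref, other[offset:]))


records = list(parse_fasta(CONTIGS))
ref_id, _, _, ref_seq = records[0]

for cid, declared_len, _, seq in records:
    # Sanity check: the second header field should equal the sequence length.
    assert declared_len == len(seq), cid
    diff = mismatches_after_seed(ref_seq, seq)
    print(f"contig {cid}: length {len(seq)}, mismatches vs {ref_id}: {diff}")
```

The seed-based anchoring is a deliberate simplification: it handles the extra leading bases seen in some of these contigs, but not insertions or deletions inside the sequence, for which a proper aligner would be needed.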

Thanks so much for your help!

Last edited by harrb; 11-23-2010 at 08:58 AM.
Old 11-23-2010, 09:18 AM   #2
Bioinfo
Member
 
Location: canada

Join Date: Jul 2010
Posts: 15

Hi,
I hope this paper helps.
Best
Attached Files
File Type: pdf abyss.pdf (205.6 KB, 134 views)