SEQanswers

Old 01-31-2012, 05:03 AM   #1
aloliveira
Member
 
Location: Brazil

Join Date: Aug 2010
Posts: 47
Newbler and read coverage

Hello,

I am running Newbler under several parameters and have some questions.

1) Is there any parameter that specifies the coverage of the sample used by Newbler in the assembly process?

2) If there are areas with high coverage (e.g. repetitive regions), are regions with low coverage simply excluded from the assembly process because they are treated as sequencing errors (due to their low coverage)?


I'm trying to assemble a genome that has major discrepancies in coverage and my results are not very good.


Thanks in advance,
André
Old 01-31-2012, 05:22 AM   #2
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356
Default

You can tell Newbler your expected genome coverage with the '-e' parameter.

http://contig.wordpress.com/2010/07/...-a-hidden-one/
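
For anyone unsure what value to pass to '-e': a minimal sketch of the usual back-of-the-envelope estimate, total sequenced bases divided by genome size. The function name and the example numbers are made up for illustration, and the genome size is an estimate you have to supply yourself.

```python
def expected_depth(read_lengths, genome_size):
    """Estimate average depth as total bases / genome size.

    read_lengths: iterable of read lengths in bases
    genome_size:  estimated genome size in bases (your own assumption)
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical example: 1000 reads of 400 bp on a 20 kb genome
depth = expected_depth([400] * 1000, 20_000)
print(f"expected depth: {depth:.1f}x")
```

With strongly uneven coverage, as described in the first post, this average can be far from the depth of any particular region, so treat it as a starting point only.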

In my experience, Newbler will not discard low-coverage regions just because there are higher-coverage regions; I regularly get contigs with 1x coverage, for example.

Are you sure that you actually have reads for these low coverage regions in your dataset?

Another option you might like to play with is '-urt'.
http://contig.wordpress.com/2011/03/...version-2-5-3/
Old 02-02-2012, 02:37 AM   #3
flxlex
Moderator
 
Location: Oslo, Norway

Join Date: Nov 2008
Posts: 415
Default

If there are regions with extreme coverage (hundreds of thousands of times), you might want to filter out some of the reads that go into these regions, and assemble again with a more evenly covered read dataset.
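
One way to do this kind of filtering without knowing the regions in advance is a k-mer based approach in the style of digital normalization. This is a swapped-in technique, not something Newbler does and not necessarily what is meant above. A rough sketch, where the function names, k and the cutoff are all arbitrary choices for illustration:

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_reads(reads, k=20, cutoff=20):
    """Keep a read only while the median count of its k-mers is below cutoff.

    reads: iterable of (name, sequence) tuples.
    Simplifications: reverse complements are not merged, all counts are
    held in memory, and reads shorter than k are always kept.
    """
    counts = Counter()
    kept = []
    for name, seq in reads:
        read_kmers = kmers(seq.upper(), k)
        if not read_kmers:
            kept.append((name, seq))
            continue
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append((name, seq))
            counts.update(read_kmers)  # only kept reads add to the counts
    return kept


# Hypothetical data: 100 copies of one read plus one unique read
reads = [(f"rep{i}", "ACGTTGCAAGCT") for i in range(100)]
reads.append(("uniq", "GGGGCCCCTTTT"))
print(len(normalize_reads(reads, k=4, cutoff=5)))
```

For real datasets a dedicated tool is a better idea; the sketch only shows the principle of capping the depth in over-represented regions while leaving low-coverage reads alone.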
Old 02-02-2012, 02:40 AM   #4
aloliveira
Member
 
Location: Brazil

Join Date: Aug 2010
Posts: 47
Default

Thanks for all the answers. The -urt parameter really works for my problem. I will also try your suggestion, flxlex.

André
Old 01-29-2014, 06:01 AM   #5
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

Hello!

I have the output of a Newbler run (I did not launch it myself) and I want to know the depth of coverage at every position or, even better, where the reads were placed in the scaffolds (or contigs).

From what I've seen, I could get this information from the 454AlignmentInfo.tsv file, but I don't have it because the -info option was not used (I'm trying to avoid running Newbler again).

I could also get the information on where the reads were placed (contig, start and end) from the 454ReadStatus.txt file, but only for the reads that were completely assembled into a single contig. (Is this correct?) For the rest of the reads there is information on where the 5' and 3' ends mapped, but I can't tell whether these reads also map to some other contig, or how much of the read length was mapped (when Partially Assembled).

Is the coverage information somewhere else?
Thank you for your help, any inputs are appreciated.

Nuria
Old 01-30-2014, 05:10 AM   #6
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

Hi again!

After digging into 454ReadStatus.txt I am starting to realize that I cannot get information about where every read was placed.

In my case 94.27% of the reads have their 5' start and 3' end positions assigned to the same contig. I thought I could use these start and end points as coordinates, but apparently I was wrong. Only 5% of these reads have both ends on the same strand (5' strand + and 3' strand +, or 5' strand - and 3' strand -). (All reads were Assembled reads, except one that was PartiallyAssembled.)

So I'm giving up on this, but I still don't understand why, according to 454ReadStatus.txt, most reads are not completely colinear with a contig fragment, which is what would seem reasonable to me. Does anyone know why?

I assume coverage information for each base is only available from 454AlignmentInfo.tsv. Am I correct? Is there another way?

I've read other threads about 454ReadStatus, and Newbler output files in general, but in my opinion they were not discussing exactly my issue.

Thank you for your time
Old 01-30-2014, 05:37 AM   #7
aloliveira
Member
 
Location: Brazil

Join Date: Aug 2010
Posts: 47
Default

Nuria,

Maybe this website can help you.

http://contig.wordpress.com/2010/04/...raph-txt-file/

Everything we need to know about newbler is there.
Best regards,
André
Old 01-30-2014, 05:51 AM   #8
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

Thank you André!

Yes, I've been reading flxlex's blog too. That is where I found out about 454AlignmentInfo.tsv, the file I don't have.

But I could not find an alternative, just like I could not find an explanation for my opposite strands problem.

Maybe there is no alternative to AlignmentInfo for getting the coverage, but I had to try, and to ask here. :|
Old 02-03-2014, 01:00 AM   #9
flxlex
Moderator
 
Location: Oslo, Norway

Join Date: Nov 2008
Posts: 415
Default

You would expect the read orientations in the 454ReadStatus.txt file for reads that start and end in the same contig to be either '+' and '-', or '-' and '+'. That was put in 'by definition'. Are you sure the 5% with the same strand map to the same contig?

And I think you are in fact stuck without the 454AlignmentInfo.tsv file, unless you have the 454Contigs.ace file, which has all the read positions.
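
If the 454Contigs.ace file is available, per-position depth can be derived from it. A minimal sketch, assuming the standard ACE layout (CO lines with contig name and padded length, AF lines with each read's padded start, and a QA line after each RD record whose last two numbers are the aligned part of the read). The function name is made up, positions are padded consensus coordinates, and tag blocks (CT/RT/WA) containing free text are not handled.

```python
def ace_depth(lines):
    """Return {contig: [depth at each padded consensus position]}."""
    depth = {}
    starts = {}      # read name -> padded start in the current contig
    contig = None
    read = None
    for line in lines:
        f = line.split()
        if not f:
            continue
        if f[0] == "CO" and len(f) >= 3:
            contig = f[1]
            depth[contig] = [0] * int(f[2])
            starts = {}
        elif f[0] == "AF" and len(f) >= 4 and contig is not None:
            starts[f[1]] = int(f[3])
        elif f[0] == "RD" and len(f) >= 2:
            read = f[1]
        elif f[0] == "QA" and len(f) >= 5 and read in starts:
            align_start, align_end = int(f[3]), int(f[4])
            if align_start < 1:
                continue  # -1 marks a read with no aligned part
            first = starts[read] + align_start - 1
            last = starts[read] + align_end - 1
            cov = depth[contig]
            # Clip to the contig, since reads may overhang the consensus
            for pos in range(max(first, 1), min(last, len(cov)) + 1):
                cov[pos - 1] += 1
    return depth


# Usage (file name as written by Newbler, per the post above):
# with open("454Contigs.ace") as handle:
#     depths = ace_depth(handle)
```

For an assembly with millions of reads the ACE file will be very large, so streaming it line by line as above is preferable to loading it whole.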
Old 02-03-2014, 02:58 AM   #10
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

Sorry, actually only 70/20199527 reads with both ends in the same contig are on the same strand. I was also counting the repeats and singletons, which have no strand information.

Thank you, for your answer and your blog

Nuria
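
For anyone wanting to repeat this count: a small sketch that tallies the 5'/3' strand combinations for reads whose two ends fall in the same contig, skipping reads without strand information. It assumes 454ReadStatus.txt is tab-separated with the columns Accno, Read Status, 5' Contig, 5' Position, 5' Strand, 3' Contig, 3' Position, 3' Strand, and that unplaced reads (repeats, singletons and so on) have fewer columns. Check this against your own file; the function name is made up.

```python
from collections import Counter


def tally_strands(lines):
    """Count 5'/3' strand pairs for reads with both ends in one contig."""
    tally = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8 or fields[0] == "Accno":
            continue  # header line, or a read without placement columns
        contig5, strand5 = fields[2], fields[4]
        contig3, strand3 = fields[5], fields[7]
        if contig5 != contig3:
            continue
        if strand5 not in {"+", "-"} or strand3 not in {"+", "-"}:
            continue  # no usable strand information
        tally[strand5 + strand3] += 1
    return tally


# Usage:
# with open("454ReadStatus.txt") as handle:
#     print(tally_strands(handle))
```

Following the explanation in post #9, almost all reads should land in the "+-" and "-+" bins.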
Old 02-17-2014, 08:50 AM   #11
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

So, to avoid re-assembling the genome (which would give a different output), I am trying to use GS Reference Mapper (v2.6) with the same reads used in the assembly (I know it is not the same, but I hope it is a good proxy).

I used the -bam option because I want to know the coverage of some regions, and I only know how to do this with bedtools. But every time I launch the mapper with the -bam option I get an error. If I launch it in exactly the same way but without the -bam option, it works nicely.

nohup runProject -bam myprojectname >myproj_status &
Error: An internal error (segmentation fault) has occurred in the computation.

My guess is that this is related to the following:

Quote:
Originally Posted by flxlex
I got the impression that SAM support for newbler was kind of experimental still. So, please go ahead and report a bug to Roche/454!

If so, has someone found a way to use the 454AlignmentInfo.tsv file to find out the coverage of particular regions?

Any help is appreciated

Thank you
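
On the 454AlignmentInfo.tsv question, a possible starting point: a sketch that averages one depth column over a region of a contig. It assumes the file has a header line starting with "Position", a line starting with ">" and the contig name before each contig's rows, and one row per position with the columns Position, Consensus, Quality Score, Unique Depth, Align Depth, Signal, StdDeviation. Verify this against your own file. The function name and the choice of the Align Depth column (index 4) are illustrative.

```python
def region_depth(lines, contig, start, end, depth_col=4):
    """Mean depth over contig:start-end (1-based, inclusive).

    depth_col is the 0-based column index; 4 is assumed to be
    'Align Depth' and 3 'Unique Depth'.
    """
    current = None
    total = 0
    n = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("Position"):
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            continue
        if current != contig:
            continue
        fields = line.split()
        pos = int(fields[0])
        if start <= pos <= end:
            total += int(fields[depth_col])
            n += 1
    return total / n if n else 0.0


# Usage (contig name and coordinates are placeholders):
# with open("454AlignmentInfo.tsv") as handle:
#     print(region_depth(handle, "contig00001", 1000, 2000))
```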
Old 02-19-2014, 06:49 AM   #12
HeyIamNuria
Member
 
Location: Barcelona

Join Date: Dec 2012
Posts: 19
Default

OK, in case someone else encounters this problem:

I found that if I try to map both 454 reads and FASTA sequences, gsMapper fails to create a BAM file, but if I map only 454 reads it creates the BAM file.

Nuria