SEQanswers

Old 09-02-2010, 07:26 PM   #1
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default SRMA Error

Hello everyone, I'm attempting to use the SRMA tool and am running into a problem. I've followed the instructions in the user guide but can't seem to get past this error, "SAMRecord contig does not match the current reference sequence contig". Here's the command and full error message. Thanks for the help.

-David Murdock

java -Xmx2g -jar /users/bainbrid/projects/NimblegenCapturePipeline/projects/SRMA/srma-0.1.7/srma-0.1.7.jar I=NS.1.dupesmarked.bam O=NS.1.realign.bam R=/users/bainbrid/projects/NimblegenCapturePipeline/bucket/human.build36.fa


[Thu Sep 02 21:14:18 CDT 2010] srma.SRMA REFERENCE=/users/bainbrid/projects/NimblegenCapturePipeline/bucket/human.build36.fa OFFSET=20 MIN_MAPQ=0 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 MAXIMUM_TOTAL_COVERAGE=100 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false MAX_HEAP_SIZE=8192 MAX_QUEUE_SIZE=65536 NUM_THREADS=1 TMP_DIR=/tmp/dm147882 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
Allele coverage cutoffs:
coverage: 1 minimum allele coverage: 0
coverage: 2 minimum allele coverage: 0
coverage: 3 minimum allele coverage: 0
coverage: 4 minimum allele coverage: 1
coverage: 5 minimum allele coverage: 1
coverage: 6 minimum allele coverage: 1
coverage: 7 minimum allele coverage: 2
coverage: 8 minimum allele coverage: 2
coverage: 9 minimum allele coverage: 3
coverage: >9 minimum allele coverage: 3
java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
at srma.Graph.addSAMRecord(Graph.java:49)
at srma.SRMA$GraphThread.run(SRMA.java:596)
Please report bugs to srma-help@lists.sourceforge.net
Old 09-02-2010, 10:35 PM   #2
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Could you post the reference sequence contig names, as well as the SAM header? There may be a mismatch of names going on in your input files. Once I rule that out, I can start debugging SRMA. Thank you for your patience.
Old 09-03-2010, 07:30 AM   #3
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

Nils, here's the SAM header:

@HD VN:1.0 GO:none SO:coordinate
@SQ SN:chr1 LN:247249719
@SQ SN:chr2 LN:242951149
@SQ SN:chr3 LN:199501827
@SQ SN:chr4 LN:191273063
@SQ SN:chr5 LN:180857866
@SQ SN:chr6 LN:170899992
@SQ SN:chr7 LN:158821424
@SQ SN:chr8 LN:146274826
@SQ SN:chr9 LN:140273252
@SQ SN:chr10 LN:135374737
@SQ SN:chr11 LN:134452384
@SQ SN:chr12 LN:132349534
@SQ SN:chr13 LN:114142980
@SQ SN:chr14 LN:106368585
@SQ SN:chr15 LN:100338915
@SQ SN:chr16 LN:88827254
@SQ SN:chr17 LN:78774742
@SQ SN:chr18 LN:76117153
@SQ SN:chr19 LN:63811651
@SQ SN:chr20 LN:62435964
@SQ SN:chr21 LN:46944323
@SQ SN:chr22 LN:49691432
@SQ SN:chrX LN:154913754
@SQ SN:chrY LN:57772954
@SQ SN:chrM LN:16571
@PG ID:bfast VN:0.6.4d

And here are the reference contig names:
>chr10
>chr11
>chr12
>chr13
>chr14
>chr15
>chr16
>chr17
>chr18
>chr19
>chr1
>chr20
>chr21
>chr22
>chr2
>chr3
>chr4
>chr5
>chr6
>chr7
>chr8
>chr9
>chrM
>chrX
>chrY

The only thing I see is that the ref's contigs aren't sorted. Could this be the problem? Thanks.
-David
Old 09-03-2010, 06:56 PM   #4
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by dmurdock View Post
The only thing I see is that the ref's contigs aren't sorted. Could this be the problem? Thanks.
-David
That's it. The reference should be in the same order as the SAM header (not sure why it isn't?).
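A quick way to spot this kind of ordering mismatch before running SRMA is to compare the @SQ names in the SAM header with the '>' names in the FASTA, position by position. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the header and FASTA lines have already been read into lists of strings.

```python
from typing import Iterable, List, Optional, Tuple


def sam_header_contigs(header_lines: Iterable[str]) -> List[str]:
    """Return contig names from @SQ lines, in header order."""
    names = []
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        # Fields are whitespace-separated; SN: carries the contig name.
        for field in line.split()[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fasta_contigs(fasta_lines: Iterable[str]) -> List[str]:
    """Return contig names from '>' lines, in file order."""
    return [
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and len(line.strip()) > 1
    ]


def first_order_mismatch(
    header_names: List[str], fasta_names: List[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, header_name, fasta_name) at the first difference, or None."""
    for i in range(max(len(header_names), len(fasta_names))):
        h = header_names[i] if i < len(header_names) else None
        f = fasta_names[i] if i < len(fasta_names) else None
        if h != f:
            return (i, h, f)
    return None


# Small demo using a shortened version of the names posted above.
header = [
    "@HD VN:1.0 SO:coordinate",
    "@SQ SN:chr1 LN:247249719",
    "@SQ SN:chr2 LN:242951149",
    "@SQ SN:chr10 LN:135374737",
]
fasta = [">chr10", ">chr1", ">chr2"]
print(first_order_mismatch(sam_header_contigs(header), fasta_contigs(fasta)))
# prints (0, 'chr1', 'chr10')
```

A result of None would mean the two lists agree in both names and order.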
Old 09-03-2010, 07:53 PM   #5
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

Thanks Nils. I'll make the change to the ref and let you know how it goes.
David
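For anyone needing to make the same change, one possible way to reorder a FASTA so that its records follow the SAM header order is sketched below. This is an illustration rather than the method David used: `reorder_fasta` is a hypothetical helper, and it assumes contig names are unique and that the wanted order (for example the SN: names from the header) is passed in as a list. It copies byte ranges instead of loading sequences, so a whole genome does not have to fit in memory.

```python
from typing import Dict, List, Tuple


def reorder_fasta(src_path: str, dst_path: str, order: List[str]) -> None:
    """Write the records of src_path to dst_path in the given contig order."""
    # First pass: note the byte offset at which each record starts.
    starts: List[Tuple[str, int]] = []
    pos = 0
    with open(src_path, "rb") as src:
        for line in src:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                starts.append((name, pos))
            pos += len(line)
    end_of_file = pos

    # Each record spans from its own start to the next record's start.
    spans: Dict[str, Tuple[int, int]] = {}
    for i, (name, start) in enumerate(starts):
        stop = starts[i + 1][1] if i + 1 < len(starts) else end_of_file
        spans[name] = (start, stop)

    missing = [name for name in order if name not in spans]
    if missing:
        raise KeyError("contigs not found in FASTA: " + ", ".join(missing))

    # Second pass: copy the records out in the requested order.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for name in order:
            start, stop = spans[name]
            src.seek(start)
            remaining = stop - start
            last_byte = b"\n"
            while remaining > 0:
                chunk = src.read(min(remaining, 1 << 20))
                if not chunk:
                    break
                dst.write(chunk)
                last_byte = chunk[-1:]
                remaining -= len(chunk)
            # Guard against a final record that lacked a trailing newline.
            if last_byte != b"\n":
                dst.write(b"\n")
```

A call might look like `reorder_fasta("human.build36.fa", "human.build36.sorted.fa", ["chr1", "chr2", ...])`, with the list taken from the header. Any index built from the old FASTA would presumably need to be regenerated afterwards.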
Old 09-07-2010, 09:46 AM   #6
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

Thanks for the suggestion; it worked after sorting the reference! I'm now having an issue when using the RANGES option with a file containing several regions to realign. I generated such a file, but SRMA seems to exit when the chromosome changes. It finishes without reporting an error, but it doesn't go beyond the first chromosome listed. If I make a separate file for each chromosome, each one runs fine; the problem only appears when they are together. Here's the region file; judging from the output, it stops after the last chr1 region (chr1 232667941 232668141).

chr1 1889866 1890066
chr1 12561395 12561595
chr1 34999494 34999694
chr1 43681831 43682031
chr1 74810345 74810545
chr1 74810352 74810552
chr1 89245929 89246129
chr1 143585219 143585419
chr1 144037256 144037456
chr1 150462252 150462452
chr1 156418029 156418229
chr1 169823406 169823606
chr1 227839825 227840025
chr1 232667941 232668141
chr2 15481809 15482009
chr2 24240582 24240782
chr2 26330529 26330729
chr2 38054620 38054820
chr2 73528635 73528835
chr2 95210667 95210867

And here's the output:
Allele coverage cutoffs:
coverage: 1 minimum allele coverage: 0
coverage: 2 minimum allele coverage: 0
coverage: 3 minimum allele coverage: 0
coverage: 4 minimum allele coverage: 1
coverage: 5 minimum allele coverage: 1
coverage: 6 minimum allele coverage: 1
coverage: 7 minimum allele coverage: 2
coverage: 8 minimum allele coverage: 2
coverage: 9 minimum allele coverage: 3
coverage: >9 minimum allele coverage: 3
Records processsed: 265 (last chr1:1890066-1890115)
Records processsed: 328 (last chr1:12561570-12561619)
Records processsed: 426 (last chr1:34999694-34999743)
Records processsed: 506 (last chr1:43682031-43682080)
Records processsed: 644 (last chr1:74810552-74810601)
Records processsed: 888 (last chr1:89246127-89246176)
Records processsed: 915 (last chr1:143585419-143585468)
Records processsed: 923 (last chr1:144037447-144037496)
Records processsed: 1022 (last chr1:150462408-150462457)
Records processsed: 1055 (last chr1:156418200-156418249)
Records processsed: 1109 (last chr1:169823567-169823616)
Records processsed: 1210 (last chr1:227840018-227840067)
Records processsed: 1257 (last chr1:232668141-232668190)
SRMA complete
Total memory usage: 249MB
Total execution time: 0h : 1m : 7s

[Tue Sep 07 11:42:05 CDT 2010] srma.SRMA REFERENCE=/users/bainbrid/projects/NimblegenCapturePipeline/projects/mendelianDisease/ref/hsap_36.1_hg18.fa RANGES=realign.coords OFFSET=20 MIN_MAPQ=0 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 MAXIMUM_TOTAL_COVERAGE=100 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false MAX_HEAP_SIZE=8192 MAX_QUEUE_SIZE=65536 NUM_THREADS=1 TMP_DIR=/tmp/dm147882 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000

I'm wondering if it's some sort of sorting issue but I can't seem to figure it out. Thanks!
-David
Old 09-07-2010, 02:21 PM   #7
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

I think you have found a bug in Picard (I just sent an email to their developers mailing list). Picard seems to want the index of an "aln.bam" file to be named "aln.bai", whereas samtools produces them with the name "aln.bam.bai". Like I said, I have initiated a discussion with the Picard developers. A quick hack would be to create a symbolic link ("ln -s aln.bam.bai aln.bai"). Let me know if that doesn't work for you.
Old 09-07-2010, 02:38 PM   #8
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by nilshomer View Post
I think you have found a bug in Picard (I just sent an email to their developers mailing list). Picard seems to want the index of an "aln.bam" file to be named "aln.bai", whereas samtools produces them with the name "aln.bam.bai". Like I said, I have initiated a discussion with the Picard developers. A quick hack would be to create a symbolic link ("ln -s aln.bam.bai aln.bai"). Let me know if that doesn't work for you.
My criticism of Picard was unfounded; it works with the latest Git/SVN repositories. Can you try the latest SRMA Git version? Also, could you check that there are reads mapped to chromosome 2, and maybe try a "RANGES" file with two ranges on each chromosome?
Old 09-08-2010, 01:32 PM   #9
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

I installed srma-0.1.8.jar and unfortunately I'm still having the same problem.
I've found that if the ranges file is sorted by chr and coordinate, then SRMA will run on multiple chromosomes. However, if a region's coordinates are not after the previous region's (regardless of which chr it came from), it will not realign it. Thus the following will work completely:

chr1 1889766 1890166
chr1 12561295 12561695
chr2 15481709 15482109
chr2 24240482 24240882
chr3 49338000 49338400
chr3 49544055 49544455

But the following will stop at the last chr2 region:

chr1 1889766 1890166
chr1 12561295 12561695
chr2 15481709 15482109
chr2 24240482 24240882
chr2 95210567 95210967
chr3 49338000 49338400
chr3 49544055 49544455

It won't throw an error but just doesn't include the latter regions in the output file. Any thoughts?
-David
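The behaviour David describes can be turned into a small check that flags which regions of a ranges file would be dropped. The sketch below only models the description in this post (a region is at risk when its start is not greater than the highest start seen so far, ignoring the chromosome); it is not SRMA's actual logic, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]


def parse_ranges(lines: Iterable[str]) -> List[Region]:
    """Parse whitespace-separated 'chr start end' lines."""
    regions = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def regions_at_risk(regions: List[Region]) -> List[Region]:
    """Return regions whose start does not exceed the highest start so far."""
    at_risk = []
    highest_start = -1
    for chrom, start, end in regions:
        if start <= highest_start:
            # The chromosome is deliberately ignored, as in the description.
            at_risk.append((chrom, start, end))
        else:
            highest_start = start
    return at_risk


works = parse_ranges([
    "chr1 1889766 1890166",
    "chr1 12561295 12561695",
    "chr2 15481709 15482109",
    "chr2 24240482 24240882",
    "chr3 49338000 49338400",
    "chr3 49544055 49544455",
])
stops = works[:4] + [("chr2", 95210567, 95210967)] + works[4:]

print(regions_at_risk(works))
# prints []
print(regions_at_risk(stops))
# prints [('chr3', 49338000, 49338400), ('chr3', 49544055, 49544455)]
```

Under this model the first file has no regions at risk, while in the second both chr3 regions fall behind the chr2 region starting at 95210567, which matches the observation that the run stops at the last chr2 region.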
Old 09-08-2010, 01:47 PM   #10
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

The solution for now is to run one "RANGE" command per region.
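One way to script that workaround is to write each region to its own file and build one SRMA command per file. This is a rough sketch under several assumptions: it uses the RANGES= option from David's log with single-region files (the value format of the "RANGE" option is not shown in this thread, so it is not used here), it assumes tab-separated "chr start end" lines are accepted, and the helper names and output file naming are invented for illustration.

```python
import os
from typing import List, Tuple

Region = Tuple[str, int, int]


def write_single_region_files(regions: List[Region], out_dir: str) -> List[str]:
    """Write one ranges file per region and return the file paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, (chrom, start, end) in enumerate(regions):
        path = os.path.join(out_dir, f"region_{i:04d}.coords")
        with open(path, "w") as fh:
            # Tab separation is an assumption; the thread shows whitespace.
            fh.write(f"{chrom}\t{start}\t{end}\n")
        paths.append(path)
    return paths


def srma_commands(
    jar: str, in_bam: str, ref: str, range_files: List[str], out_dir: str
) -> List[List[str]]:
    """Build one argument list per ranges file (nothing is executed here)."""
    commands = []
    for path in range_files:
        stem = os.path.splitext(os.path.basename(path))[0]
        out_bam = os.path.join(out_dir, stem + ".realign.bam")
        commands.append([
            "java", "-Xmx2g", "-jar", jar,
            f"I={in_bam}", f"O={out_bam}", f"R={ref}", f"RANGES={path}",
        ])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. The per-region output BAMs would still need to be combined afterwards, which is outside the scope of this sketch.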

Quote:
Originally Posted by dmurdock View Post
I've found that if the ranges file is sorted by chr and coordinate then srma will run on multiple chromosomes. However if a region's coordinates are not after the previous region's (regardless of which chr it came from) it will not realign it. Thus the following will work completely:

chr1 1889766 1890166
chr1 12561295 12561695
chr2 15481709 15482109
chr2 24240482 24240882
chr3 49338000 49338400
chr3 49544055 49544455

But the following will stop at the last chr 2 region:

chr1 1889766 1890166
chr1 12561295 12561695
chr2 15481709 15482109
chr2 24240482 24240882
chr2 95210567 95210967
chr3 49338000 49338400
chr3 49544055 49544455

It won't throw an error but just doesn't include the latter regions in the output file. Any thoughts?
-David
I see it now, I will see what I can do.

Last edited by nilshomer; 09-08-2010 at 01:47 PM. Reason: Enlightenment
Old 09-08-2010, 01:48 PM   #11
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

Thanks, I'll do that!
-David
Old 09-08-2010, 02:02 PM   #12
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

I was able to reproduce the bug, so it should be fixed now. Once you confirm, I will package up a new release. Thank you for your patience!
Old 09-08-2010, 02:19 PM   #13
dmurdock
Junior Member
 
Location: texas

Join Date: Mar 2010
Posts: 9
Default

It works great! I was able to realign ~ 150 small regions across the whole genome. Thanks for your help.
-David
Old 09-08-2010, 05:58 PM   #14
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

No problem. Bugs/features wouldn't be fixed/added without users like you. Having it open source makes it easier to fix and release.
Old 10-21-2010, 06:02 PM   #15
rlh
Member
 
Location: China

Join Date: Oct 2010
Posts: 12
Default

Quote:
Originally Posted by dmurdock View Post
chr1 1889866 1890066
chr1 12561395 12561595
chr1 34999494 34999694
chr1 43681831 43682031
chr1 74810345 74810545
chr1 74810352 74810552
chr1 89245929 89246129
chr1 143585219 143585419
chr1 144037256 144037456
chr1 150462252 150462452
chr1 156418029 156418229
chr1 169823406 169823606
chr1 227839825 227840025
chr1 232667941 232668141
chr2 15481809 15482009
chr2 24240582 24240782
chr2 26330529 26330729
chr2 38054620 38054820
chr2 73528635 73528835
chr2 95210667 95210867
I am new to this field and just want to ask how to generate this file. Which tool should I use?
I would really appreciate any advice.
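The thread does not say how David built his file, but the regions shown are plain "chr start end" lines, and those in the first file are each 200 bases wide. One plausible way to produce such a file is to start from a list of positions of interest and pad each by a fixed flank. The sketch below assumes exactly that: `make_ranges`, the `sites` input, and the 100-base flank are illustrative choices, and the output is sorted by the contig order given (for example the SAM header order) and then by coordinate.

```python
from typing import List, Tuple


def make_ranges(
    sites: List[Tuple[str, int]], contig_order: List[str], flank: int = 100
) -> List[str]:
    """Turn (chrom, position) sites into sorted 'chr<TAB>start<TAB>end' lines."""
    rank = {name: i for i, name in enumerate(contig_order)}
    regions = []
    for chrom, pos in sites:
        if chrom not in rank:
            raise ValueError(f"unknown contig: {chrom}")
        # Clamp at 1 so a site near the contig start never goes below it.
        regions.append((chrom, max(1, pos - flank), pos + flank))
    # Sort by contig order first, then by coordinate within each contig.
    regions.sort(key=lambda r: (rank[r[0]], r[1], r[2]))
    return [f"{chrom}\t{start}\t{end}" for chrom, start, end in regions]


sites = [("chr2", 15481909), ("chr1", 12561495), ("chr1", 1889966)]
for line in make_ranges(sites, ["chr1", "chr2"]):
    print(line)
```

The lines can then be written to a text file and passed to SRMA with the RANGES option, as in David's command log.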