
  • #16
    Originally posted by ohofmann
    Nils, congratulations on getting the publication out!

    I'm about to give this a try on an odd data set -- 2 kb of genomic sequence at an average (but far from uniform) coverage of around 100,000X. It's a sequencing mixture, and the lower cutoff of variation we'd like to be able to detect is at around 0.5% (after error correction), or 500 observations.

    Other than the biological samples we also have a mix of known genomic frequencies and defined indel regions to optimize parameters. Can you think of a realistic set of starting parameters?
    It wasn't designed for such high coverage so all bets are off.
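For scale, the numbers in the question can be checked with a few lines of Python. This is only an arithmetic sketch: the function name `min_supporting_reads` is made up for illustration, and the comparison value of 100 is the `MAXIMUM_TOTAL_COVERAGE` setting echoed in the 0.1.15 log further down this thread, whose exact effect on such data is not asserted here.

```python
def min_supporting_reads(coverage: int, min_fraction: float) -> int:
    """Reads expected to support a variant present at min_fraction of the coverage."""
    return round(coverage * min_fraction)


coverage = 100_000     # average depth quoted in the question
min_fraction = 0.005   # 0.5% detection cutoff quoted in the question

# Number of observations at the detection cutoff.
print(min_supporting_reads(coverage, min_fraction))

# Ratio of the quoted depth to the MAXIMUM_TOTAL_COVERAGE=100 value
# echoed in the log later in the thread (shown for scale only).
print(coverage // 100)
```

The first value matches the "500 observations" in the question; the second shows how far the quoted depth sits above the coverage setting echoed in that log.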



    • #17
      Hi Nils

      Just wondering: can SRMA be used to rescue orphaned reads? We have a variable-insert library because we are sequencing the 5' and 3' ends of transcripts, so the distance between the mates (<--- --->) depends on the length of the transcript. For the initial mapping I am using Mosaik, which I believe does a better job with variable-insert mate-pair data.

      After mapping we still see 40% orphaned reads where one read maps and the other doesn't. I am wondering if SRMA can rescue these reads.

      Thanks!
      -Abhi



      • #18
        No, SRMA is not for read rescue. It is for re-aligning the reads to create a better consensus.



        • #19
          OK, good to know. I will start a new thread for my question then.

          Best,
          -Abhi



          • #20
            Dead project now? Are there other alternatives that work on the whole genome?



            • #21
              Originally posted by ymc
              Dead project now? Are there other alternatives that work on the whole genome?
              It's not a dead project, feel free to post questions and bug reports etc.



              • #22
                Originally posted by ymc
                Dead project now? Are there other alternatives that work on the whole genome?
                I have used it and it is fast. I have sometimes had trouble with files in the 100GB range but generally it works fine.

                We have also parallelized the GATK implementation of LR, if you are interested. I am not sure which is better at realigning. I remember comparing SRMA and GATK LR and there were differences, but it was not clear to me whether one was consistently better than the other. I suspect that Nils would be a better source of info on that.



                • #23
                  Tried several bams with 0.1.16 but all I got was this:

                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)
                  at java.util.ArrayList$SubList.add(ArrayList.java:965)



                  • #24
                    Originally posted by ymc
                    Tried several bams with 0.1.16 but all I got was this:

                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    at java.util.ArrayList$SubList.add(ArrayList.java:965)
                    Could you post the full error message?



                    • #25
                      I have been interested in this tool for some time but never got it working:
                      Input is a sorted bam.

                      java -Xmx16g -jar srma-0.1.15.jar I=491_full_s.bam O=srma_491.bam R=../NC_002516.fna
                      [Fri Aug 17 10:00:54 CEST 2012] srma.SRMA INPUT=[491_full_s.bam] OUTPUT=[srma_491.bam] REFERENCE=../NC_002516.fna OFFSET=20 MIN_MAPQ=0 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 MAXIMUM_TOTAL_COVERAGE=100 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false MAX_HEAP_SIZE=8192 MAX_QUEUE_SIZE=65536 GRAPH_PRUNING=false NUM_THREADS=1 TMP_DIR=/tmp/colin2 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
                      java.util.NoSuchElementException
                      at java.util.Scanner.nextLine(Scanner.java:1503)
                      at net.sf.picard.reference.FastaSequenceIndex.parseIndexFile(FastaSequenceIndex.java:131)
                      at net.sf.picard.reference.FastaSequenceIndex.<init>(FastaSequenceIndex.java:55)
                      at net.sf.picard.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:95)
                      at srma.SRMA.doWork(SRMA.java:131)
                      at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:156)
                      at srma.SRMA.main(SRMA.java:98)
                      Please report bugs to [email protected]


                      The fasta index file looks like this:

                      more ../NC_002516.fna.fai
                      NC_002516.2 6264404 58 70 71

                      Cheers for any help.



                      • #26
                        There are thousands of lines of these error messages. If I copy the stderr output, it will be too many lines. You can replicate my problem by downloading the paired-end reads from

                        ftp://ftp.1000genomes.ebi.ac.uk/vol1...sequence_read/

                        and then align them using bwa. I got the same bug with SRR098401_*.filt.fastq.gz and SRR035330_*.filt.fastq.gz
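One way to make a log like this postable is to collapse runs of identical consecutive lines into a single line with a count, so that whatever precedes the repeated frames (usually the exception name) stays visible. The sketch below is a generic helper, not part of SRMA; the function name `collapse_repeats` and the `srma.stderr` file name in the usage note are hypothetical.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse runs of identical consecutive lines into one line plus a repeat count."""
    collapsed = []
    stripped = (raw.rstrip("\n") for raw in lines)
    for text, run in groupby(stripped):
        count = sum(1 for _ in run)
        if count == 1:
            collapsed.append(text)
        else:
            collapsed.append(f"{text}  [repeated {count} times]")
    return collapsed


if __name__ == "__main__":
    # The frame below is the one quoted earlier in this thread.
    frame = "at java.util.ArrayList$SubList.add(ArrayList.java:965)"
    for entry in collapse_repeats([frame + "\n"] * 10):
        print(entry)
```

To apply it to a saved log, something like `collapse_repeats(open("srma.stderr"))` would do, assuming stderr was redirected to that file when running SRMA.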



                        • #27
                          Originally posted by colindaven
                          I have been interested in this tool for some time but never got it working:
                          Input is a sorted bam.

                          java -Xmx16g -jar srma-0.1.15.jar I=491_full_s.bam O=srma_491.bam R=../NC_002516.fna
                          [Fri Aug 17 10:00:54 CEST 2012] srma.SRMA INPUT=[491_full_s.bam] OUTPUT=[srma_491.bam] REFERENCE=../NC_002516.fna OFFSET=20 MIN_MAPQ=0 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 MAXIMUM_TOTAL_COVERAGE=100 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false MAX_HEAP_SIZE=8192 MAX_QUEUE_SIZE=65536 GRAPH_PRUNING=false NUM_THREADS=1 TMP_DIR=/tmp/colin2 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
                          java.util.NoSuchElementException
                          at java.util.Scanner.nextLine(Scanner.java:1503)
                          at net.sf.picard.reference.FastaSequenceIndex.parseIndexFile(FastaSequenceIndex.java:131)
                          at net.sf.picard.reference.FastaSequenceIndex.<init>(FastaSequenceIndex.java:55)
                          at net.sf.picard.reference.IndexedFastaSequenceFile.<init>(IndexedFastaSequenceFile.java:95)
                          at srma.SRMA.doWork(SRMA.java:131)
                          at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:156)
                          at srma.SRMA.main(SRMA.java:98)
                          Please report bugs to [email protected]


                          The fasta index file looks like this:

                          more ../NC_002516.fna.fai
                          NC_002516.2 6264404 58 70 71

                          Cheers for any help.
                          It looks like your FASTA index is broken. Can you try rebuilding?
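Rebuilding is typically done with `samtools faidx ../NC_002516.fna`, which rewrites the `.fai` next to the reference. To inspect an existing index first, a FASTA `.fai` line should have five tab-separated columns: name, length, offset, bases per line, and bytes per line. The sketch below checks one line against that layout. The function name `check_fai_line` is hypothetical, and whether a malformed line is what actually triggers the exception above is an assumption; the space-separated line shown in the post may simply be how the forum rendered tabs.

```python
def check_fai_line(line: str) -> list:
    """Return a list of problems found in one line of a FASTA .fai index."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 5:
        return [f"expected 5 tab-separated fields, found {len(fields)}"]

    problems = []
    name, length, offset, linebases, linewidth = fields
    if not name:
        problems.append("empty sequence name")

    numeric = (
        ("length", length),
        ("offset", offset),
        ("linebases", linebases),
        ("linewidth", linewidth),
    )
    for label, value in numeric:
        if not value.isdigit():
            problems.append(f"{label} is not a non-negative integer: {value!r}")

    # linewidth counts the line terminator, so it cannot be smaller than linebases.
    if not problems and int(linewidth) < int(linebases):
        problems.append("linewidth is smaller than linebases")
    return problems


if __name__ == "__main__":
    # Same values as the index line in the post, written with real tabs.
    print(check_fai_line("NC_002516.2\t6264404\t58\t70\t71\n"))
```

An empty list means the line looks structurally sound; it does not prove the offsets match the FASTA file, which is why rebuilding remains the more reliable fix.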

                          Originally posted by ymc
                          There are thousands of lines of these error messages. If I copy the stderr output, it will be too many lines. You can replicate my problem by downloading the paired-end reads from

                          ftp://ftp.1000genomes.ebi.ac.uk/vol1...sequence_read/

                          and then align them using bwa. I got the same bug with SRR098401_*.filt.fastq.gz and SRR035330_*.filt.fastq.gz
                          I am sorry, but please try reducing your read set to a manageable test case. Otherwise, I charge $5K USD/hour.
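A simple way to shrink the input is to keep only the first N records of each FASTQ file before aligning. The sketch below assumes gzipped FASTQ with exactly four lines per record and mate files that list reads in the same order, so taking the same N from both keeps the pairs in sync. The function name `head_fastq` and the file names in the usage comment are hypothetical.

```python
import gzip
from itertools import islice


def head_fastq(src: str, dst: str, n_reads: int) -> int:
    """Copy the first n_reads records (4 lines each) from one gzipped FASTQ to another.

    Returns the number of complete records written.
    """
    lines_written = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4


# Usage (hypothetical file names), run once per mate file with the same n_reads:
#   head_fastq("reads_1.fastq.gz", "small_1.fastq.gz", 100_000)
#   head_fastq("reads_2.fastq.gz", "small_2.fastq.gz", 100_000)
```

If the error survives on the reduced set, halving N repeatedly narrows it further; if it disappears, the offending reads lie later in the file and a different slice is needed.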



                          • #28
                            Empty VCF file

                            I used SRMA to realign my reads and then ran variant calling. SRMA itself ran fine, but the resulting VCF file was empty. Is there anything I can do to fix this problem?

