Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • @luisczul

    Samtools indicates that the error happens to line 164507. What does that line look like?As for the second problem, it seems like a bug. You are using very short reference. Could you send me an example? Thanks.

    @henry
    Are you generating results with "samse -n"? With -n, the output is NOT sam. You can tell this from the bwa manual page.

    These three questions are less relevant to samtools. They are mostly related to bwa.

    Comment


    • Originally posted by lh3 View Post
      @luisczul

      Samtools indicates that the error happens to line 164507. What does that line look like?As for the second problem, it seems like a bug. You are using very short reference. Could you send me an example? Thanks.

      @henry
      Are you generating results with "samse -n"? With -n, the output is NOT sam. You can tell this from the bwa manual page.

      These three questions are less relevant to samtools. They are mostly related to bwa.
      hi, lh3,
      I ran ./bwa samse -n 255 ref.fa tissue.sai tissue.fastq> tissue.sam
      I will try bwa without setting -n.
      Thank you for your help.

      Best

      Jing

      Comment


      • Originally posted by lh3 View Post
        For NGS data analysis, an aligner tends to be successful when it comes with utilities for comprehensive downstream analyses such as reference based assembly, SNP/indel calling and alignment viewer. Eland/GAPipeline, Soap and Maq are such examples. Unfortunately, it is non-trivial to implement all these downstream analyses and implementing these for each aligner would be a waste of time and human resources as well. Mostly we want to separate alignment from the downstream analyses after the alignment. To achieve this, we need a generic alignment format that makes all aligners happy. NovoAlign and Bowtie can output Maq alignment format to take the advantage of Maq downstream data processing. However, Maq format does not really suit the goal. It does not support longer reads nor alignment with more than one indel and it is too specific to Maq. To solve this problem, the 1000Genome Project Committee decided to develop a generic alignment format. And now the first version of specification and implementation have come out.

        The new alignment format, SAM (Sequence Alignment/Map), is the collaborative result of several major genome centres. It eliminates the major defects of Maq format while retaining its advantages. We also migrated and improved various downstream data processing implemented in Maq/Maqview, such as indexing, pileup, viewer and consensus caller. For more information, please check website:



        I hope samtools may help aligner developers to promote their own software: once a program can generate alignment in SAM format, Maq-like downstream analysis will be available right now.
        With SAMtools, can I extract the number of unique aligned reads, the number of aligned reads with one mismatches, the number of aligned reads with two mismatches?
        With SAMtools, can I extract the number of aligned reads in each chromosome?

        Thank you a lot.

        Jing

        Comment


        • Does anyone have a sense for why the output of 'samtools flagstat' doesn't match the read numbers in the original fastq file? Total reads reported is actually the number of reads mapped, according to maq.

          The other numbers (mapped, paired) also seem out. This happens both when converting from maq with maq2sam, and with SSAHA2's sam output. I've compared the output produced by 'maq map' vs samtools flagstat, and the numbers are different.

          Comment


          • Originally posted by tgolubch View Post
            Does anyone have a sense for why the output of 'samtools flagstat' doesn't match the read numbers in the original fastq file? Total reads reported is actually the number of reads mapped, according to maq.

            The other numbers (mapped, paired) also seem out. This happens both when converting from maq with maq2sam, and with SSAHA2's sam output. I've compared the output produced by 'maq map' vs samtools flagstat, and the numbers are different.
            I agree with your assessment. It is based on the aligner, since most aligners do not include unmapped reads in their SAM file. This does not happen with my own software BFAST. If you want a more accurate assessment, see the "dbamstats" utility in the DNAA package.

            Nils

            Comment


            • Thanks, Nils, that clears up the first question. I'm still not sure why the other numbers are different.

              Re BFAST - looks interesting. Can BFAST produce SAM output files?

              Comment


              • DNAA and SAM

                Does DNAA have any documentation?

                Comment


                • Originally posted by tgolubch View Post
                  Thanks, Nils, that clears up the first question. I'm still not sure why the other numbers are different.

                  Re BFAST - looks interesting. Can BFAST produce SAM output files?
                  BFAST outputs to the SAM format, as well as UCSC MAF, GFF, as well as other formats. Also note it includes unmapped reads in the SAM file, and for SOLiD reads it retains the color space information, which is critically important for variant discovery.

                  Originally posted by ech View Post
                  Does DNAA have any documentation?
                  "Release early, Release often" ~ Eric S. Raymond

                  So this is the "early" step. Unfortunately there is very little actual documentation (beyond the code) and I hope to get it up soon. PM me if you want examples or if you want to know how use the package to solve your problems.

                  Comment


                  • Is there a way to install BFAST without root access? I don't want it going into /usr/bin (and don't have access to it on the work server in any case).

                    Comment


                    • Originally posted by tgolubch View Post
                      Is there a way to install BFAST without root access? I don't want it going into /usr/bin (and don't have access to it on the work server in any case).
                      Technically you can't install it to /usr/bin without root access. I assume you have a typically Unix system and you use BASH.

                      A useful trick to "fake" an install is to make your own "bin" directory and add the "bin" directory to your path. If your "bin" is found at "$HOME/bin", add this line to your "$HOME/.bash_profile" (assuming bash):
                      Code:
                      PATH=$PATH:/home/<username>/bin
                      Now you can put everyting in "$HOME/bin".

                      You can install BFAST in "$HOME/bin" by using the "prefix" option in the "configure" script:
                      Code:
                      ./configure --prefix=$HOME
                      (This is not a typo, --prefix=$HOME/bin would install it in $HOME/bin/bin).
                      Now run:
                      Code:
                      make
                      make install
                      and everything should be installed into $HOME/bin.

                      Nils

                      Comment


                      • hi heng
                        I used bwa to generate a sam file.It works well with samtools, but when I use picard ExampleSamUsage, it reprots Exception:
                        Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MRNM specified but flags indicate unpaired; File e:\temp\aln.sam; Line 1
                        Line: >chrY_1 0 chrY 2154703 37 45M = 2154703 0 ATGGTGGCGAGCGCCTGTAGTCCCAGCTACTCGGGAGGCAGGAGA hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:45
                        at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:168)
                        at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
                        at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:316)
                        at net.sf.samtools.SAMTextReader$RecordIterator.<init>(SAMTextReader.java:202)
                        at net.sf.samtools.SAMTextReader$RecordIterator.<init>(SAMTextReader.java:188)
                        at net.sf.samtools.SAMTextReader.getIterator(SAMTextReader.java:124)
                        at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:197)
                        at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:39)
                        at net.sf.samtools.example.ExampleSamUsage.convertReadNamesToUpperCase(ExampleSamUsage.java:54)
                        at net.sf.samtools.example.ExampleSamUsage.main(ExampleSamUsage.java:93)
                        Java Result: 1
                        Dose picard is well tested to use now?

                        Comment


                        • Picard is well tested. It reports error because it is more rigorous than samtools. This error is actually a bwa problem, not picard. You may try the latest bwa-0.5.2. I believe I have fixed the issue.

                          Comment


                          • Hi,

                            I obtained complete genome sequence data in the form of “.bam” alignment files from ftp://ftp-trace.ncbi.nih.gov/1000genomes/ and I tried to extract a sub alignment of a region of chromosome 1 with the "view" command.


                            ./samtools view -q 30 -t ../human_b36_male.fa.fai ../NA12878.SLX.maq.SRP000031.2009_08.bam chr1:150,792,101–150,884,101

                            I get the following:

                            [sam_header_read2] 114 sequences loaded.
                            [main_samview] random alignment retrieval only works for indexed BAM files.

                            Could anyone help me fix this problem? Thanks a lot.
                            M.

                            Comment


                            • Originally posted by Jeckow View Post
                              Hi,

                              I obtained complete genome sequence data in the form of “.bam” alignment files from ftp://ftp-trace.ncbi.nih.gov/1000genomes/ and I tried to extract a sub alignment of a region of chromosome 1 with the "view" command.


                              ./samtools view -q 30 -t ../human_b36_male.fa.fai ../NA12878.SLX.maq.SRP000031.2009_08.bam chr1:150,792,101–150,884,101

                              I get the following:

                              [sam_header_read2] 114 sequences loaded.
                              [main_samview] random alignment retrieval only works for indexed BAM files.

                              Could anyone help me fix this problem? Thanks a lot.
                              M.
                              It looks like you either have not sorted the BAM file (samtools sort) or you have you have sorted the BAM file but have not created the index (samtools index).

                              Comment


                              • Originally posted by lh3 View Post
                                Picard is well tested. It reports error because it is more rigorous than samtools. This error is actually a bwa problem, not picard. You may try the latest bwa-0.5.2. I believe I have fixed the issue.
                                Yes, I used bwa-0.5.0, bwa-0.5.2 have no such bugs at all.

                                Excellent wrok!!!

                                Comment

                                Latest Articles

                                Collapse

                                • seqadmin
                                  Advancing Precision Medicine for Rare Diseases in Children
                                  by seqadmin




                                  Many organizations study rare diseases, but few have a mission as impactful as Rady Children’s Institute for Genomic Medicine (RCIGM). “We are all about changing outcomes for children,” explained Dr. Stephen Kingsmore, President and CEO of the group. The institute’s initial goal was to provide rapid diagnoses for critically ill children and shorten their diagnostic odyssey, a term used to describe the long and arduous process it takes patients to obtain an accurate...
                                  12-16-2024, 07:57 AM
                                • seqadmin
                                  Recent Advances in Sequencing Technologies
                                  by seqadmin



                                  Innovations in next-generation sequencing technologies and techniques are driving more precise and comprehensive exploration of complex biological systems. Current advancements include improved accessibility for long-read sequencing and significant progress in single-cell and 3D genomics. This article explores some of the most impactful developments in the field over the past year.

                                  Long-Read Sequencing
                                  Long-read sequencing has seen remarkable advancements,...
                                  12-02-2024, 01:49 PM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by seqadmin, 12-17-2024, 10:28 AM
                                0 responses
                                34 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 12-13-2024, 08:24 AM
                                0 responses
                                49 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 12-12-2024, 07:41 AM
                                0 responses
                                34 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 12-11-2024, 07:45 AM
                                0 responses
                                46 views
                                0 likes
                                Last Post seqadmin  
                                Working...
                                X