
  • #16
    When using bowtie on Galaxy you can choose the 'full parameter' list. This gives you additional parameters, among them the ability to trim bases from either or both of the 3' and 5' ends of your reads. Not trimming would greatly decrease the number of sequences you can align.
    There are also programs available where you can 'clip off' adaptor sequences from the fastq file.

    Comment


    • #17
      Well, since I get the same quality values when I download the .fastq files from NCBI or when I use fastq-dump on the .sra file, why wouldn't I just use the downloadable .fastq files? The reads are 25bp so there's no need to trim them at all.
      However I don't know if the encoding could be a problem since the NCBI encoding is "Sanger" and the reads are supposed to be Illumina reads (but fastq-dump does not affect the encoding apparently, in this case).

      Comment


      • #18
        Since Illumina 1.8, the quality encoding is basically Sanger (Phred+33). If you have an older Illumina format, you can convert it to Sanger in Galaxy: go to NGS: QC and Manipulation, then use FASTQ Groomer, which converts FASTQ files between the various formats. But I think NCBI has already done this for you.
        I think you have to have Sanger format to use bowtie in Galaxy.
        This page explains a little more about quality scores, although it does not appear to be up to date with the newer Illumina 1.8+ format: http://en.wikipedia.org/wiki/FASTQ_format
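
To check which encoding a FASTQ file uses before grooming it, one option is to look at the range of quality characters. The sketch below is only a heuristic, and the function name `guess_phred_offset` is made up for illustration. It assumes that characters below ASCII 59 (';') can only occur with the Sanger-style +33 offset, and that characters above ASCII 74 ('J') suggest the older +64 offset. When neither appears, the result is left undecided.

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from FASTQ quality strings.

    Returns None when the observed characters fit both encodings
    or when no quality characters were supplied.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    if min(codes) < 59:
        # Below ';' is impossible with a +64 offset (Solexa starts at ';').
        return 33
    if max(codes) > 74:
        # Above 'J' is unusual for +33 Illumina data, so +64 is more likely.
        return 64
    return None  # ambiguous: inspect more reads


if __name__ == "__main__":
    print(guess_phred_offset(["IIII#5", "IIIIII"]))
    print(guess_phred_offset(["hhhhfffe"]))
```

Passing the quality lines of a few thousand reads is usually enough; a handful of reads may all fall in the ambiguous range.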

        Comment


        • #19
          Hello again,

          Can someone explain to me, in simple words, what the samse -n option does in BWA?

          The man page says:

          "-n INT Maximum number of alignments to output in the XA tag for reads paired
          properly. If a read has more than INT hits, the XA tag will not be written.
          [3]
          "

          XA corresponds to alternative hits.
          I do not know exactly what "tags" are...

          I guess that, using the default value (3), BWA will report up to 3 alternative hits for a given read. But what about reads that produce more than 3 hits?
          a) Will the read be discarded (no hits reported)
          b) Will the first (random) 3 hits be reported (other hits discarded)
          c) Something else?

          Thank you very much in advance for your help.
          -a-

          Comment


          • #20
            Originally posted by asheenlevrai View Post
            I guess that, using the default value (3), BWA will report up to 3 alternative hits for a given read. But what about reads that produce more than 3 hits?
            a) Will the read be discarded (no hits reported)
            b) Will the first (random) 3 hits be reported (other hits discarded)
            c) Something else?

            Thank you very much in advance for your help.
            -a-
            You guessed it right for the default value, i.e. BWA will report up to 3 alternative hits for a given read.

            However, if a read has more than 3 hits, by default BWA leaves the XA tag out of the read's information (it reports a single randomly selected hit; the read is not removed from the SAM file). To overcome this, you can raise the number of hits reported by BWA using the -n INT option (e.g. with -n 100, the XA tag is written for reads that have <=100 hits; for any read with more than 100 hits, the XA tag is not written).

            I hope this makes sense.
            Thanks,
            P

            Comment


            • #21
              I'm not sure I got it right.

              if there are more than 3 hits for a given read, then the XA tag will be removed and...
              a) only 1 randomly chosen read will be reported (in the sam file)?
              b) all hits will be reported, but as "independent" hits rather than as alternative hits?

              thanx again
              -a-

              Comment


              • #22
                Originally posted by asheenlevrai View Post
                I'm not sure I got it right.

                if there are more than 3 hits for a given read, then the XA tag will be removed and...
                a) only 1 randomly chosen read will be reported (in the sam file)?
                b) all hits will be reported, but as "independent" hits rather than as alternative hits?

                thanx again
                -a-
                What I meant was (as far as I understand how BWA works) that with the default values, if BWA finds more than one hit but <=3 hits for a read, it still reports only 1 hit for that read (a single line in the SAM file), but it adds the XA tag to that read, listing the other hits.

                However, if there are more than 3 hits, BWA doesn't include the XA tag with the read information.

                E.g., let's say that after alignment BWA found 2 hits for ABC:test:read (the read name). The 1st hit is at chr1:110228419 and the second hit is at chr10:96234821. In the SAM file, the row for this read would look something like this (each XA entry is chr,strand and position,CIGAR,NM followed by a semicolon):

                ABC:test:read 0 chr1 110228419 37 100M * ... XA:Z:chr10,+96234821,100M,10;

                However, if there are more than 3 hits for this read, the SAM file would look something like this:

                ABC:test:read 0 chr1 110228419 37 100M * ...

                BWA doesn't report alternative hits as separate rows. It just reports them in the XA tag for the read (if no. of hits <= -n INT).

                Does this make more sense?
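
To look at the alternative hits directly rather than through a viewer, the XA tag can be pulled out of each SAM row. This is a minimal sketch, not part of BWA or SAMtools: the function name `parse_xa` is hypothetical, and it assumes tab-separated SAM rows and the XA layout described in the BWA manual, `chr,pos,CIGAR,NM;` with the strand given as the sign of `pos`.

```python
def parse_xa(sam_line):
    """Return the alternative hits listed in the XA tag of one SAM row.

    Each hit is a dict with chrom, strand, pos, cigar and nm.
    Returns an empty list when the row carries no XA tag.
    """
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at column 12
        if not field.startswith("XA:Z:"):
            continue
        hits = []
        for entry in field[len("XA:Z:"):].split(";"):
            if not entry:
                continue  # skip the empty piece after the last ';'
            chrom, pos, cigar, nm = entry.split(",")
            hits.append({
                "chrom": chrom,
                "strand": "-" if pos.startswith("-") else "+",
                "pos": abs(int(pos)),
                "cigar": cigar,
                "nm": int(nm),
            })
        return hits
    return []


if __name__ == "__main__":
    row = "\t".join([
        "ABC:test:read", "0", "chr1", "110228419", "37", "100M",
        "*", "0", "0", "ACGT", "IIII",
        "XA:Z:chr10,+96234821,100M,1;",
    ])
    for hit in parse_xa(row):
        print(hit)
```

A row with no XA tag gives an empty list, which covers both reads with a single hit and reads with more hits than the -n limit, so the two cases cannot be told apart from the tag alone.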

                Comment


                • #23
                  Originally posted by aggp11 View Post
                  BWA doesn't report alternative hits as separate rows. It just reports them in the XA tag for the read (if no. of hits <= -n INT).

                  Does this make more sense?
                  Thank you. I didn't try to look at the SAM files "directly" actually. I just tried to open them in a graphical visualization program (which I am not familiar with, yet). I'll check that out...

                  I was wondering if the alternative hits (the ones on the same row as the 1st hit, listed in the XA tag) would be displayed in the visualization program. I guess it depends on the program...

                  Thank you very much for your help
                  -a-

                  Comment


                  • #24
                    I used the Galaxy website to process the data. I generated SAM files (alignments) from the different fastq files and then used SAM Tools to convert SAM to BAM. Finally, I merged the BAM files. There is no "merge SAM files" tool in SAM Tools on the Galaxy website. I bet there's a good reason for that... I'll check if there's a way to merge my SAM files, so that I can compare the reads' XA tags with the output from the visualization program I will use (I haven't decided which one yet; I've been unsuccessful with most of them...).
                    -a-

                    Comment


                    • #25
                      When I try to use IGV with a BAM file, it says
                      "Could not load index file for: /file_path
                      An index file is required for SAM & BAM files."
                      I don't know what this index file might be...
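
The index in question is normally a `.bai` file stored next to the BAM, typically produced by running `samtools index` on a coordinate-sorted BAM. The sketch below only checks whether such a file is present under the two common names (`reads.bam.bai` or `reads.bai`); the helper name `find_bam_index` and the injectable `exists` argument are assumptions made for illustration.

```python
import os


def find_bam_index(bam_path, exists=os.path.isfile):
    """Return the path of the .bai index next to bam_path, or None.

    `exists` is the check used to test for a file; it defaults to
    os.path.isfile and can be swapped out for testing.
    """
    candidates = [
        bam_path + ".bai",                      # reads.bam.bai
        os.path.splitext(bam_path)[0] + ".bai",  # reads.bai
    ]
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    index = find_bam_index("merged.bam")  # hypothetical file name
    print(index if index else "no index found; the BAM needs to be indexed")
```

If nothing is found, sorting the BAM by coordinate and indexing it should give the viewer what it needs.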

                      Comment


                      • #26
                        Originally posted by aggp11 View Post
                        However, if there are more than 3 hits, BWA doesn't include the XA tag with the read information.

                        ...

                        BWA doesn't report alternative hits as separate rows. It just reports them in the XA tag for the read (if no. of hits <= -n INT).

                        Does this make more sense?
                        I guess reads can be evaluated on their mapping quality, right?
                        if they have a positive mapping quality value, then they're unique reads...
                        if they have a mapping quality value of 0, then there are alternative hits...
                        if they have a mapping quality value of 0 and XA tag(s), then there are at most 3 alternative hits...
                        right?
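
One way to test this guess against real data is to count how many mapped reads fall into each combination of "mapping quality above 0" and "has an XA tag". The sketch below assumes plain-text, tab-separated SAM rows (for example from converting the BAM back to SAM); the function name `tally_mapq_xa` is hypothetical. It makes no claim about which combinations BWA can produce; it only reports what is in the file.

```python
from collections import Counter


def tally_mapq_xa(sam_lines):
    """Count mapped reads by (MAPQ > 0, has XA tag).

    Header lines, blank lines, truncated rows and unmapped reads
    (flag bit 4 set) are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment row
        if int(fields[1]) & 4:
            continue  # unmapped read
        mapq_positive = int(fields[4]) > 0
        has_xa = any(f.startswith("XA:Z:") for f in fields[11:])
        counts[(mapq_positive, has_xa)] += 1
    return counts


if __name__ == "__main__":
    import sys
    for (mapq_positive, has_xa), n in sorted(tally_mapq_xa(sys.stdin).items()):
        print(f"MAPQ>0={mapq_positive}\tXA={has_xa}\t{n}")
```

A non-zero count for the (True, True) combination would mean that reads with a positive mapping quality and an XA tag do occur in the file.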

                        Comment


                        • #27
                          Something I don't understand:
                          In my BAM file, I have reads with a positive Mapping Quality (not =0) and XA tags... How is that even possible?

                          Comment


                          • #28
                            Does anyone know the exact command to simulate reads in Stampy?

                            I'm doing
                            -S <filename>.fastQ -g hg -h hg

                            Please tell me if that's correct.

                            Comment
