  • Cutadapt Poor Mapping

    Hello.

    I am still researching this issue, but essentially I am mapping Cutadapt-trimmed RNA-seq reads. When I trim them according to the manual and align them as single-end reads with the unstranded library type, I get a mapping rate of 91%.

    However, if I align them with the first-strand or second-strand library type using tophat2, the mapping rate is 4%.

    What is the cause of this? I do have my QC reports.

  • #2
    This sounds like it could be an issue related to concordant reads.

    Have you tried mapping the reads using bowtie2 (which might give you more verbose statistics)? What is the fragment size of the reads (and did you specify that in the tophat2 command)? Did you prime tophat with a transcriptome GTF file?


    • #3
      Read Lengths / Fragment Size

      Hello, thank you for the response.

      I am not sure how to use the bowtie2 mapper, so I will have to look into that.

      As for your questions:

      1) I am not too sure how to find the fragment size. Below is the PDF of the sequence length distribution from FastQC; the red line is the median sequence length. There is not a single length; I have a distribution of lengths. As for the parameter for this in TH2, I set the mate inner distance to 100. Which parameter should I use to account for the fragment size?

      2) Yes, I did include the GTF transcriptome. I am almost certain that was not the issue, because the single-end reads had fantastic results.

      Thank you again.

      Could you give some insight into how concordant reads work? It seems as though the lengths are not matching up, or the distance between the two reads is out of bounds, and thus the matching fails.

      • #4
        I am not too sure how to find the fragment size. Below is the PDF of the sequence length distribution from FastQC; the red line is the median sequence length. There is not a single length; I have a distribution of lengths.
        Fragment size (the length of the template from which the reads are sequenced) doesn't necessarily relate to the read length (which is what you are describing). You need to ask the person who did the sequencing for that information. With 150bp paired-end reads, I would expect a fragment size of about 400bp, which gives a mate inner distance of 400 - 2 x 150 = 100bp (i.e. what you specified, and that is the correct parameter), but you do need to find that out.

        Could you give some insight into how concordant reads work? It seems as though the lengths are not matching up, or the distance between the two reads is out of bounds, and thus the matching fails.
        The bowtie2 manual discusses this:

        A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expected relative orientation, or aren't within the expected distance range, or both), the pair is said to align "discordantly". Discordant alignments may be of particular interest, for instance, when seeking structural variants.
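
To make the definition above concrete, here is a minimal sketch of the concordant/discordant decision. It is not bowtie2's actual code: the `MateAlignment` type and `classify_pair` function are hypothetical names, it assumes both mates align uniquely to the same reference sequence in forward/reverse orientation, and it ignores bowtie2's extra rules for mates that overlap, contain, or dovetail each other. The 0 and 500 defaults mirror bowtie2's documented defaults for -I and -X.

```python
from dataclasses import dataclass


@dataclass
class MateAlignment:
    start: int     # leftmost aligned reference position (0-based)
    end: int       # one past the rightmost aligned reference position
    forward: bool  # True if the mate aligned to the forward strand


def classify_pair(m1, m2, min_frag=0, max_frag=500):
    """Label a pair of uniquely aligned mates (simplified illustration)."""
    # The upstream mate is whichever one starts first on the reference.
    left, right = (m1, m2) if m1.start <= m2.start else (m2, m1)
    # Expected orientation: upstream mate forward, downstream mate reverse.
    orientation_ok = left.forward and not right.forward
    # Fragment length spans from the outer end of one mate to the other.
    fragment_len = max(m1.end, m2.end) - min(m1.start, m2.start)
    distance_ok = min_frag <= fragment_len <= max_frag
    return "concordant" if orientation_ok and distance_ok else "discordant"


# Two 150bp mates spanning a 400bp fragment, in the expected orientation.
print(classify_pair(MateAlignment(100, 250, True), MateAlignment(350, 500, False)))
# Same mates, but too far apart for the default 500bp maximum.
print(classify_pair(MateAlignment(100, 250, True), MateAlignment(1000, 1150, False)))
```

Under these assumptions the first pair is concordant and the second is discordant, purely because of the distance check.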


        • #5
          So I contacted the lab that conducted the sequencing and found that the average insert size is 170bp. Should I change the TH2 alignment parameters?
          Thanks again.


          • #6
            Is that total length 470bp (in which case inner distance should be 170), or total length 170bp (inner distance -130)? Either way, it would be a good idea to set the tophat value to what it actually is.
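
The arithmetic behind those two cases can be sketched as follows. The function name `mate_inner_distance` is made up for illustration, and the 150bp read length is the figure assumed earlier in the thread; after Cutadapt trimming the real reads vary in length, so this is only an approximation.

```python
def mate_inner_distance(fragment_len, read_len):
    """Inner distance = total fragment length minus both mates."""
    return fragment_len - 2 * read_len


READ_LEN = 150  # assumed untrimmed read length, as discussed in this thread

# Case 1: "170" is already the inner distance, so the fragment is 470bp.
print(mate_inner_distance(470, READ_LEN))  # 170
# Case 2: "170" is the total fragment length, so the mates overlap.
print(mate_inner_distance(170, READ_LEN))  # -130
# The earlier guess of a 400bp fragment.
print(mate_inner_distance(400, READ_LEN))  # 100
```

A negative result means the two mates overlap each other on the fragment.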


            • #7
              I just ran the bowtie2 aligner:

              31584036 reads; of these:
                31584036 (100.00%) were paired; of these:
                  6417877 (20.32%) aligned concordantly 0 times
                  12924721 (40.92%) aligned concordantly exactly 1 time
                  12241438 (38.76%) aligned concordantly >1 times
                  ----
                  6417877 pairs aligned concordantly 0 times; of these:
                    583467 (9.09%) aligned discordantly 1 time
                  ----
                  5834410 pairs aligned 0 times concordantly or discordantly; of these:
                    11668820 mates make up the pairs; of these:
                      9583027 (82.13%) aligned 0 times
                      1226455 (10.51%) aligned exactly 1 time
                      859338 (7.36%) aligned >1 times
              84.83% overall alignment rate


              This is strange, because TH2 gives an alignment rate of 4% for both library types. I am not sure how to interpret this.
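
As a sanity check on the summary above, the overall rate can be reproduced from the individual counts. This is a sketch based on the assumption that the rate is counted per mate (each aligned pair contributes two mates); the function name `overall_alignment_rate` is made up for illustration.

```python
def overall_alignment_rate(total_pairs, conc_once, conc_multi, disc_once,
                           mates_once, mates_multi):
    """Percentage of mates that aligned in any way (pairs count twice)."""
    paired_mates = 2 * (conc_once + conc_multi + disc_once)
    single_mates = mates_once + mates_multi
    return 100.0 * (paired_mates + single_mates) / (2 * total_pairs)


# Counts copied from the bowtie2 summary posted above.
rate = overall_alignment_rate(
    total_pairs=31584036,
    conc_once=12924721,
    conc_multi=12241438,
    disc_once=583467,
    mates_once=1226455,
    mates_multi=859338,
)
print(f"{rate:.2f}% overall alignment rate")  # 84.83% overall alignment rate
```

Note that about 79.7% of pairs (40.92% + 38.76%) already align concordantly here, so the reads themselves map well; the 4% figure is more likely tied to how tophat2 was configured than to the trimmed reads.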


              • #8
                Calculating the Mate Inner Distance

                The lab responded and said the average insert size is 170 bp. However, how do I work out the total length and the inner distance from that?

                Thank you in advance.


                • #9
                  however, how do I find the total length inner distance?
                  Statistics relating to fragment length are confusing, and each program seems to choose a different statistic for its testing. "Average insert size" could be the same thing as "inner distance" as defined by tophat, or it could be the total fragment size (including sequences at both ends), or it could be the distance from the start of one end to the start of the other end. I've already given answers for the first two cases (which are the most likely).

                  I just ran the bowtie aligner... this is strange because TH2 give alignment rate as 4% for both library types
                  Okay, good. Now you need to tweak the fragment length using the bowtie2 parameters -I and -X to match what tophat2 was doing. That depends on what the mean fragment length was when the sequencing was carried out, which is difficult to determine unless you know precisely what the lab's "average insert size" statistic refers to.
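
One way to pick trial values for -I and -X is sketched below. bowtie2 measures these limits on the whole fragment, including both mates, so an inner distance has to be converted back to a fragment length first. The helper `bowtie2_insert_window`, the standard deviation of 20, and the three-standard-deviation width are all illustrative assumptions, not what tophat2 does internally.

```python
def bowtie2_insert_window(inner_dist, read_len, std_dev=20, n_sd=3):
    """Return (minins, maxins) fragment-length bounds for bowtie2 -I / -X."""
    # Convert the inner distance back to a full fragment length.
    fragment = inner_dist + 2 * read_len
    # Allow a hypothetical tolerance of n_sd standard deviations either side.
    minins = max(0, fragment - n_sd * std_dev)
    maxins = fragment + n_sd * std_dev
    return minins, maxins


READ_LEN = 150  # assumed untrimmed read length, as discussed in this thread

# If "170" is the inner distance (470bp fragment):
print(bowtie2_insert_window(170, READ_LEN))   # (410, 530)
# If "170" is the total fragment length (inner distance -130):
print(bowtie2_insert_window(-130, READ_LEN))  # (110, 230)
```

Running bowtie2 once with each window and comparing the concordant rates is one way to see which interpretation of the lab's figure fits the data better.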
