  • #16
    The verbose option crashes the terminal every time I use it (after a few hours, but before the mapping is done), so I can't say what it reports.

    The fastqc plots of kmer and overrepresented sequences are rather odd looking, but as this is a transcriptome it is unclear to me what the expectation should be, although from what I can tell by googling around what I have is not unusual for a transcriptome. I did not use barcodes for this dataset.

    I have tried doing some additional quality filtering to see if that makes a difference. I will report back on that.

    Just mapping the reads that do map and completing the pipeline does produce contigs that blast appropriately.


    • #17
      Additional QC and the addition of the -v 3 option resulted in 0.034% of the reads mapping in bowtie, so I still have no idea what's wrong.


      • #18
        Originally posted by sasignor
        Just mapping the reads that do map and completing the pipeline does produce contigs that blast appropriately.
        Well, that's not the point - obviously these came from the right species. I rather meant to take the reads that do not map (or all of them, for simplicity) and feed them through e.g. Trinity. Then take some of the contigs and blast them against nt to see what species you have sequenced. If you do it for, let's say, 10 million reads, this goes really fast and should give you an idea whether contamination is a problem...
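
A minimal sketch of the subsampling step suggested above, assuming plain four-line FASTQ records (no wrapped sequence lines). The function names `fastq_records` and `head_reads` are hypothetical helpers, not part of Trinity or bowtie; the subset they return would then be written to a file and passed to the assembler.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines.

    Assumption: every record is exactly four lines
    (header, sequence, '+', qualities).
    """
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record
            record = []


def head_reads(lines: Iterable[str], n_reads: int) -> List[str]:
    """Return the lines belonging to the first n_reads records."""
    out: List[str] = []
    for record in islice(fastq_records(lines), n_reads):
        out.extend(record)
    return out
```

Called on an open file handle with `n_reads=10_000_000`, this would produce the 10-million-read subset mentioned in the post. Taking the first reads rather than a random sample is a simplification.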


        • #19
          This is the output of FastQC for one of the files I am using - the other is comparable. Again - it does look unusual for a genome but as far as I can tell not for a transcriptome, and not for transcriptomes I have successfully aligned in the past.
          Attached Files


          • #20
            Lots of poly-T/A in there.

            Try using bowtie2 (not bowtie) in '--local' mode.


            • #21
              Yeah, lots of poly-T - my transcriptome data doesn't look like that, for sure. The weird GC-content can't be biological IMHO - everything up to base ~55 looks crazy.
              Did you look hard for adaptors? Perhaps you have a lot of weirdly ligated fragments in there, or the sequencing run had problems - talk to your provider.
              A shot in the dark may be to clip off the first ~55 bases of each read (i.e. keep only bases 55-95 or so) and try to map that...
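
One way to try the clipping idea above, as a rough sketch. It reads the suggestion as "discard the bases before position 55 and keep roughly positions 55-95", which is an interpretation of the post rather than something it spells out. `clip_window` is a hypothetical helper, and positions are treated as 1-based and inclusive.

```python
from typing import Tuple

# A FASTQ record as (header, sequence, separator, qualities).
Record = Tuple[str, str, str, str]


def clip_window(record: Record, start: int = 55, end: int = 95) -> Record:
    """Keep only bases start..end (1-based, inclusive) of one record.

    The quality string is clipped with the same window so that
    sequence and qualities stay the same length.
    """
    header, seq, plus, qual = record
    lo = start - 1  # convert 1-based start to a 0-based slice index
    return (header, seq[lo:end], plus, qual[lo:end])
```

Reads shorter than `start` come back empty and would need to be dropped before mapping.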


              • #22
                Hello sasignor,
                I am wondering if you diagnosed the problem.
                I am facing a similar issue and would love to hear about your progress.
                Thanks


                • #23
                  Originally posted by sasignor
                  I am attempting to align a transcriptome sequenced with HiSeq to a reference using bowtie. The parameters I am using are:

                  bowtie -S -p 2 reference -q --phred64-quals

                  And none of the reads align. They also do not align if I do not include the quality parameter, or any modification such as the --ff suggested in related postings.

                  I have checked for adapter contamination and found very little. The reads were cleaned using ngs backbone, although I am not using that pipeline for anything downstream of cleaning. Some have also reported that their reads do not align in paired-end mode but do in single-end mode; my reads do not align in either case. Around 30% of the reads align in bwa.

                  Does anyone have any idea why this is the case?

                  Thanks!
                  Sarah
                  Hi Sarah,

                  Your reads look as if they were produced with CASAVA v1.8, which reports Phred+33 Q-scores (Illumina 1.9/Sanger). If that is the case, removing the --phred64-quals option from the bowtie command may do the trick (phred 33 is default).

                  Cheers,

                  Fernando
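
The encoding question raised above can be checked directly from the quality strings. The following is a heuristic sketch, not a bowtie feature: `guess_phred_offset` is a hypothetical helper, and the cutoffs assume Illumina-style data, where Phred+33 qualities run from '!' (ASCII 33) to about 'J' (ASCII 74) and the +64 encodings never go below ';' (ASCII 59).

```python
from typing import Iterable, Optional


def guess_phred_offset(quality_lines: Iterable[str]) -> Optional[int]:
    """Guess the quality offset (33 or 64) from FASTQ quality strings.

    Returns None when the observed characters fit both encodings
    or when no quality characters were supplied.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' cannot occur in the +64 encodings.
        return 33
    if highest > 74:
        # Characters above 'J' are not expected in Illumina Phred+33 data.
        return 64
    return None
```

A result of 33 on a sample of the reads would support dropping `--phred64-quals`. A result of None means the sample was too small or too uniform to decide.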


                  • #24
                    It looks to me as if the first 10 bases of all your reads are similar, if not identical. Have you checked the overrepresented sequences output from FastQC? It may also tell you what kind of contamination (e.g. adaptors) the overrepresented sequences come from.
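
A quick way to test that impression about the read starts, sketched under the assumption that the read sequences are already available as plain strings. `common_prefixes` is a hypothetical helper, and this is only a rough check rather than a substitute for the FastQC report.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def common_prefixes(
    sequences: Iterable[str], length: int = 10, top: int = 5
) -> List[Tuple[str, int]]:
    """Count the most frequent read prefixes of the given length.

    Reads shorter than `length` are ignored.
    """
    counts = Counter(seq[:length] for seq in sequences if len(seq) >= length)
    return counts.most_common(top)
```

If a handful of prefixes account for most of the reads, that points to a shared adaptor or primer sequence at the 5' end.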

                    Try cleaning up reads before aligning them with bowtie, e.g. clip adaptors, trim low-quality bases, trim polyA tails, remove reads with low-complexity regions, etc. Prinseq or seqclean can do this job.
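
To make the poly-A trimming step concrete, here is a small sketch. It does not reproduce how Prinseq or seqclean work; the function names and the minimum run length of 10 are assumptions chosen for illustration. It removes a trailing run of A's or a leading run of T's and clips the quality string to match.

```python
from typing import Tuple


def trim_poly_a_tail(seq: str, qual: str, min_run: int = 10) -> Tuple[str, str]:
    """Remove a trailing run of A's if it is at least min_run long."""
    stripped = seq.rstrip("A")
    if len(seq) - len(stripped) < min_run:
        return seq, qual
    return stripped, qual[: len(stripped)]


def trim_poly_t_head(seq: str, qual: str, min_run: int = 10) -> Tuple[str, str]:
    """Remove a leading run of T's if it is at least min_run long."""
    stripped = seq.lstrip("T")
    run = len(seq) - len(stripped)
    if run < min_run:
        return seq, qual
    return stripped, qual[run:]
```

Unlike the dedicated tools, this only handles exact homopolymer runs at the very ends of a read, with no tolerance for sequencing errors inside the run.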

                    Sunny
                    Last edited by Sun-SEQ; 08-01-2012, 12:44 AM.


                    • #25
                      Also might be worth checking with your sequence provider to make sure they sent you the right dataset.
