
  • adapter trimming and length/otogenics

    I have received some ChIP-seq data from the company otogenics. They provided two FASTQ files: based on the FastQC reports, one appears to contain the adapter sequence and one does not. In these FastQC reports, the read length also does not change with the trimming of the adapter. I am wondering how this is possible, and what program they may have used to trim the adapter. Any help would be greatly appreciated.

    Thanks
    Leanne

  • #2
    You can't trim adapter sequence without changing the read length, but you can either throw away reads that contain adapter sequence or (theoretically) produce reads that never had adapter sequence in the first place. Also, if you have, for example, a fragment library and a long mate-pair library, they may have different adapters.
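
The difference between trimming and discarding can be sketched in a few lines of Python. This is only an illustration: the adapter string and the two toy reads are assumptions, not taken from the poster's data, and real trimmers also handle mismatches and partial matches.

```python
ADAPTER = "AGATCGGAAGAGC"  # assumed adapter prefix, for illustration only


def trim_adapter(seq, adapter=ADAPTER):
    """Cut the read at the first exact adapter match, so the read gets shorter."""
    pos = seq.find(adapter)
    return seq if pos == -1 else seq[:pos]


def discard_if_adapter(seqs, adapter=ADAPTER):
    """Drop whole reads that contain the adapter; surviving reads keep their length."""
    return [s for s in seqs if adapter not in s]


reads = [
    "ACGTACGTACGTACGTACGT",  # 20 bases, no adapter
    "ACGTACG" + ADAPTER,     # 7 bases of insert followed by adapter, 20 bases total
]

trimmed_lengths = [len(trim_adapter(r)) for r in reads]
kept_lengths = [len(r) for r in discard_if_adapter(reads)]

print(trimmed_lengths)  # lengths now differ between reads
print(kept_lengths)     # fewer reads, but all still full length
```

Trimming leaves the read count unchanged and makes the lengths vary; discarding keeps the lengths uniform but lowers the read count.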

    • #3
      Originally posted by lwhitmore
      I have received some ChIP-seq data from the company otogenics. They provided two FASTQ files: based on the FastQC reports, one appears to contain the adapter sequence and one does not. In these FastQC reports, the read length also does not change with the trimming of the adapter. I am wondering how this is possible, and what program they may have used to trim the adapter. Any help would be greatly appreciated.

      Thanks
      Leanne
      I have used fastx_toolkit for trimming adapters: http://hannonlab.cshl.edu/fastx_toolkit/
      It works well, and for trimming adapters you should use its FASTA/Q Clipper tool.

      • #4
        What is the length of your reads, and do all the reads in each file have the same length according to the FastQC reports?

        • #5
          mastal,
          The length of my reads is 100 bases, and all the reads have the same length in both FastQC reports, before and after trimming.

          • #6
            What do the two FASTQ files represent: before and after trimming, two different samples, or R1 and R2 of paired-end reads?

            • #7
              Before and after trimming, on one sample, for single-end reads.

              • #8
                How many reads in each sample?

                • #9
                  7811028 reads

                  • #10
                    Could you post the first few lines of each file?

                    It seems impossible that the reads would all have the same length after trimming as before if anything was actually trimmed, and if nothing was trimmed, you would expect the fastqc report to still show the adapter sequence in the over-represented reads.
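
One quick way to check this without FastQC is to tally the read lengths directly. Below is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record; the in-memory records are made up for illustration and `read_length_counts` is a hypothetical helper name.

```python
from collections import Counter
import io


def read_length_counts(handle):
    """Count read lengths in an uncompressed, 4-lines-per-record FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts


# Toy data standing in for a real file (three records, two distinct lengths).
example = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGTA\n+\nIIIII\n"
    "@r3\nACGTACGT\n+\nIIIIIIII\n"
)
counts = read_length_counts(example)
for length, n in sorted(counts.items()):
    print(length, n)
```

For a real file, the same function can be given an open file handle. If every read comes back as 100 bases in both files, nothing was trimmed.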

                    • #11
                      From the first file, with the adaptor
                      @HWI-ST1129:515:H8V3LADXX:1:1101:2601:1960 1:N:0:CCGTCC
                      NAGAAATTTGGAAAATCAAATGCTTGAAGTAAGAGGACGATATTAAAACTTTTGTAACCAGAGACTACTTTAAGAAAAATCTGCTACTACTTTAACAAAG
                      +
                      #1DFFFHHHHHJJJJJJJJJJJJJJIJHHIIIEIHIHHGHIJJIJJJJJJJJIIJGJJJJJJJJJJJHHHHHHHFFFFEEDEEEEDDDDDEDDDDDDD
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 1:N:0:CCGTCC
                      NACAGAGCCTCGCTCTGTCTCCCAGGCTGGATGGAGTGCAGTGGCGCGATGTTGGCTCACTTCAAGCTCCGCGTCCTGTGTTCATGCCATTCTTCTGCCT
                      +
                      #1=DFFFFHHHHHJJJJJJJJJJJJJJJJJHIJJJJHIJIJGHJJJJJJJJJJJHHHHHFFFFFFEEEEEDDDDDDDDCDDDDEEDDDDDDEEEDDDDDD
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4380:1977 1:N:0:CCGTCC
                      NTCCTCCCAAGAGAGATAGAGGAAGGAAAGGGAGAGATGGGACCACCACAGTGAGCAAATGGATCAGATTATTACTCTAAAATGTTCTTTTAGATCGGAA


                      From the second FASTQ file, without the adaptor
                      ACAATGACACTTAGCATTTACTGTGTTAGTTAACATTTAGCAGATCTTTGTTAAAGTAGTAGCAGATTTTTCTTAAAGTAGTCTCTGGTTACAAAAGTTT
                      +
                      CC@FFFFFHHHHHJJJJJJIJJIFHIIJJHIIJJJJIJJJIJIJJJJJIJIJJJJJBGIIGIJJJJJIJJJJIJJJJJCHIEIIIIJH?HEHHFDFFCEE
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 2:N:0:CCGTCC
                      ATTAGCCGGGCATAGTGGCAGGGGCCTGTAGTCCCAGCTACTCGGTAGGCTGAGGCAGAAGAATGGCATGAACACAGGGCGCGGGGCTTGAAGTGAGCCA
                      +
                      @@@DDDDDHHHHHIIHIIIIII0??D1)9990?DH#################################################################
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4380:1977 2:N:0:CCGTCC
                      AAAAGAACATTTTAGAGTAATAATCTGATCCATTTGCTCACTGTGGTGGTCCCATCTCTCCCTTTCCTTCCTCTATCTCTCTTGGGAGGAAAGATCGGAA

                      • #12
                        Looks like you have paired-end reads from the same sample.

                        This part of the header - 2:N:0:CCGTCC - tells you it's the second read of a pair.

                        So if your reads are all the same length, that would suggest that they haven't been trimmed.
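
The header check described above can be done programmatically. This is a rough sketch, assuming the headers follow the Illumina CASAVA 1.8+ layout, where the part after the space is `read:is_filtered:control_number:index`. The function name `parse_illumina_header` is hypothetical; the two headers are copied from the posted files.

```python
def parse_illumina_header(header):
    """Split a CASAVA 1.8+ style FASTQ header into its read name and comment fields."""
    name, _, comment = header.lstrip("@").partition(" ")
    read_number, is_filtered, control_bits, index = comment.split(":")
    return {
        "name": name,                    # instrument:run:flowcell:lane:tile:x:y
        "read": int(read_number),        # 1 = first read of the pair, 2 = second
        "filtered": is_filtered == "Y",  # Y means the read failed the filter
        "control": int(control_bits),
        "index": index,
    }


h1 = parse_illumina_header("@HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 1:N:0:CCGTCC")
h2 = parse_illumina_header("@HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 2:N:0:CCGTCC")

# Same cluster coordinates, different read numbers: an R1/R2 pair.
print(h1["read"], h2["read"], h1["name"] == h2["name"])
```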

                        • #13
                          Ahh, OK. Sorry that I didn't pick up on that; I am very new to sequence analysis.

                          One more question, if you don't mind:
                          Why wouldn't the second file have an over-represented sequence (or an adapter)?

                          Thanks Again!

                          • #14
                            The sequences in the R2 file will be different from the sequences in the R1 file.

                            It's possible that there are other over-represented sequences that are more abundant in the R2 file, so the adapter sequence doesn't make it into the top over-represented sequences.

                            Is it the FastQC report that has flagged the sequences as adapter sequences?
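
Rather than relying only on the over-represented sequences list, one could look for adapter at the 3' end of reads in both files. The sketch below assumes the adapter begins with `AGATCGGAAGAGC` (the start of the common Illumina adapter, which may not be the one used for this library) and only handles exact matches. The two read tails are the last bases of the third record posted from each file; `adapter_overlap` is a hypothetical helper.

```python
ADAPTER = "AGATCGGAAGAGC"  # assumed adapter prefix; may differ for this library


def adapter_overlap(seq, adapter=ADAPTER, min_len=5):
    """Return how many bases at the 3' end of the read look like adapter (exact match only)."""
    pos = seq.find(adapter)
    if pos != -1:
        return len(seq) - pos  # full adapter prefix found; everything after it is adapter
    # Otherwise look for a partial adapter running off the end of the read.
    for k in range(min(len(adapter), len(seq)) - 1, min_len - 1, -1):
        if seq.endswith(adapter[:k]):
            return k
    return 0


r1_tail = "GTTCTTTTAGATCGGAA"  # end of the third posted R1 read
r2_tail = "GGGAGGAAAGATCGGAA"  # end of the third posted R2 read
no_adapter = "ACGTACGTACGTACGT"

for label, seq in [("R1", r1_tail), ("R2", r2_tail), ("none", no_adapter)]:
    print(label, adapter_overlap(seq))
```

Run over all reads in each file, a count of non-zero overlaps would show whether R2 carries adapter as well, even if FastQC does not list it among the top over-represented sequences.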

                            • #15
                              Yes, it was the FastQC report that flagged the sequence as an adapter.
