
  • Illumina 1.9 read lengths and trimming

    Hi all.

    I have had the genome of a bacterium I am working with sequenced by my university's sequencing facility. It was sequenced on a MiSeq and I have paired-end reads. I have received from them the raw fastq files and files that they have trimmed using sickle and scythe.

    I have run all the files through FastQC, and it has told me that the read lengths are as follows.

    Untrimmed_1 = 236 - 251
    Untrimmed_2 = 35 - 251
    Trimmed_1 = 20 - 251
    Trimmed_2 = 20 - 251

    It is my understanding that the reads, at least the untrimmed ones, should all be the same length?
    I also received a few warnings on all files for Kmer content, but I think this might be due to my organism having a low GC content (~25%).

    I would like to know whether these read lengths are acceptable. Should I look at trimming them to the same length? Is there any particularly good software for this?

    To be honest, I have very little idea what I need to do. If anyone has any good links to papers or other information about trimming files etc., I would really appreciate it.

    Thanks.

  • #2
    Do you know if the data was processed (before the trimming) on the MiSeq itself or in BaseSpace, or after the run using bcl2fastq/CASAVA?


    • #3
      Unfortunately not. All I received was an email with the files that basically said thanks for your custom.


      • #4
        I have a feeling that some trimming occurred during pre-processing of the data (either on the MiSeq or in BaseSpace), so what you received were not the original full-length reads. It probably does not matter much, since you would have removed those bases yourself during post-processing. One thing that can result in shorter reads is inserts that were shorter than you thought they were: when an insert is shorter than the read length, the read runs into the adapter, and adapter trimming then shortens it.
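
        To see how many reads are actually shorter than full length (rather than just the min/max range FastQC reports), the lengths can be tallied directly. The following is a minimal sketch, assuming a standard FASTQ with exactly four lines per record; the function names are made up for illustration and the file name is whatever you pass on the command line.

```python
import gzip
import sys
from collections import Counter
from typing import Iterable, Iterator, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, transparently handling .gz files."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record is the sequence
            yield len(line.rstrip("\r\n"))


def length_summary(lines: Iterable[str]) -> dict:
    """Summarise read lengths: count, min, max and fraction at the max length."""
    counts = Counter(read_lengths(lines))
    total = sum(counts.values())
    if total == 0:
        return {"reads": 0, "min": None, "max": None,
                "full_length_fraction": None, "counts": counts}
    longest = max(counts)
    return {
        "reads": total,
        "min": min(counts),
        "max": longest,
        "full_length_fraction": counts[longest] / total,
        "counts": counts,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python read_lengths.py reads_R2.fastq.gz  (hypothetical file name)
    with open_fastq(sys.argv[1]) as handle:
        summary = length_summary(handle)
    print({k: v for k, v in summary.items() if k != "counts"})
```

        If nearly all reads sit at the maximum length and only a small fraction are shorter, that would be consistent with a few short inserts having been adapter-trimmed upstream.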

        If you are looking to assemble the data then SPAdes is a good option.
        Last edited by GenoMax; 09-29-2014, 07:44 AM.


        • #5
          There are plenty of threads for various trimming programs. BBDuk is the simplest option.
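
          For anyone wondering what a trimmer does to produce variable read lengths, here is a minimal sketch of 3'-end quality trimming with a minimum-length filter. It is an illustration only and is not the algorithm used by BBDuk, sickle or scythe; the function name and the default thresholds of 20 are assumptions chosen for the example. It assumes Phred+33 quality encoding, which is what FastQC labels "Illumina 1.9".

```python
from typing import Optional, Tuple

PHRED_OFFSET = 33  # Phred+33 encoding (assumed; reported by FastQC as "Illumina 1.9")


def trim_3prime(seq: str, qual: str,
                min_quality: int = 20,
                min_length: int = 20) -> Optional[Tuple[str, str]]:
    """Trim low-quality bases from the 3' end of one read.

    Returns the trimmed (sequence, quality) pair, or None if the read
    ends up shorter than min_length and should be discarded.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end while the last kept base is below threshold.
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_quality:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2, so the last two bases are removed.
print(trim_3prime("ACGTACGTAC", "IIIIIIII##", min_length=5))
```

          A minimum-length filter like this would explain a lower bound such as 20 in trimmed files. With paired-end data, a real trimmer also has to keep the two files in sync when one mate is discarded, which this sketch does not attempt.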


          • #6
            Thanks for your help.


            • #7
              Originally posted by jellybaby83 (original post, quoted in full; see the first post above)

              If you have raw fastq files where all the reads are of the same length, you may use skewer for adapter trimming.

              See http://www.biomedcentral.com/1471-2105/15/182/ for your reference.
