
  • How to use Scythe in order to cut adapter sequences?

    I read the manual, and it says you need a FASTA file as input.

    I am looking at the overrepresented sequences in the FastQC reports for my paired-end reads, but I don't see an option in Scythe to remove them this way.

    How can I do this?

  • #2
    You need a fasta file of your adapter sequences, which will depend on the library preps you did, as well as any barcodes you added. What chemistry did you use and what was the sequencing platform?

    Also, you will most likely have to change the quality scoring to match your fastq files (for instance, with '-q 33' if you used the newest Illumina pipeline).
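
To check which quality encoding a FASTQ file uses before picking a setting, one common heuristic is to look at the range of quality characters. The sketch below assumes plain 4-line FASTQ records; `guess_phred_offset` is a hypothetical helper name, not part of Scythe, and the cutoffs (ASCII 59 and 74) are rule-of-thumb values that can be ambiguous for files containing only high-quality bases.

```python
def guess_phred_offset(handle, max_records=10000):
    """Guess the quality offset (33 or 64) of a 4-line-per-record FASTQ stream.

    Returns None when the observed characters fit both encodings.
    This is a heuristic for illustration, not a definitive test.
    """
    lowest = None
    highest = None
    for i, line in enumerate(handle):
        if i // 4 >= max_records:
            break
        if i % 4 != 3:  # only the 4th line of each record holds qualities
            continue
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return None
    if lowest < 59:   # characters this low only occur with a +33 offset
        return 33
    if highest > 74:  # characters this high are typical of a +64 offset
        return 64
    return None       # ambiguous: every quality seen fits both encodings
```

Usage: open the file with `open("reads.fastq")` (a placeholder filename) and pass the handle to `guess_phred_offset`.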



    • #3
      I'm a biologist/computer scientist, not a chemist.

      The sequencing platform was Sanger.

      The minimum quality score should be 20, but I don't know how to specify this in the program. The minimum read length should be 50 base pairs, and I don't know how to specify this either.



      • #4
        There are no adapters in Sanger sequencing for you to trim (they're used in next-gen sequencing).

        BTW, have a look at Lucy, which I recall being commonly used to quality trim Sanger sequencing (It's been a few years since I've done Sanger sequencing).
        Last edited by dpryan; 10-08-2013, 12:38 PM. Reason: Add link to Lucy



        • #5
          I agree with Devon that Lucy would work for what you want (I think it uses Phred scores, so you'd need those).

          Like Devon said, you don't have adapters in Sanger sequencing. You might have primer sequences on the very ends of your sequences that you can trim, but those would be specific to the region you're amplifying.

          You also wouldn't use Scythe to do hard quality-score trimming, even on next-gen data; it's just for identifying adapter contamination on the ends of reads, where base-calling quality is typically low. You'd instead use something like sickle or Trimmomatic. If you have Phred scores for your Sanger sequences, then I guess sickle might work on those too.
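
To make the distinction concrete, here is a minimal sketch of what threshold-based quality trimming does, using the minimum quality of 20 and minimum length of 50 mentioned earlier in the thread. It assumes Phred+33 quality strings and simply cuts low-quality bases from the 3' end; this is a simplification for illustration, not the algorithm sickle or Trimmomatic actually implement, and the function names are hypothetical.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_read(seq, qual, min_q=20, min_len=50, offset=33):
    """Trim the read, then return it only if it is still at least min_len long.

    Returns None for reads that end up too short.
    """
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual, min_q, offset)
    if len(trimmed_seq) < min_len:
        return None
    return trimmed_seq, trimmed_qual
```

For real data, a dedicated trimmer is the better choice; the sketch only shows where the two thresholds enter.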



          • #6
            How to put together an adapter file for Scythe?

            I would still like some information on this, if possible. I'm new to sequence analysis, and I'm having trouble figuring out how to put together the adapters file.

            I have a barcode file, and this is what I have found online:

            adaptors.fasta: Provide contaminant sequences as a fasta-formatted file.
            See ´/usr/share/doc/scythe/illumina_adaptors.fa´.
            N.B.: Index/Barcode sequences should be substituted for Ns in the example adaptor file.

            And this is what they say in the README:

            In the case of the original Solexa/Illumina adapter sequences, we've seen barcodes "upstream" of forward reads (in which case the reverse complement of the barcode will appear before the adapter sequence at the 3'-end of reverse reads - replacing the [NNNNNN]). We've also seen barcodes upstream of reverse reads (in which case the reverse complement of the barcode will appear before the adapter sequence at the 3'-end of forward reads - replacing the [MMMMMM]). Your definition of the barcode may be someone else's reverse-complemented barcode, and the barcode may or may not be 6 bases.

            But where do the NNNs and MMMs go in the sequences? They don't show an example. They only give you an illumina_adapters.fa file like this:

            >multiplexing-forward
            GATCGGAAGAGCACACGTCT
            >solexa-forward
            AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
            >truseq-forward-contam
            AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
            >truseq-reverse-contam
            AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
            >nextera-forward-read-contam
            CTGTCTCTTATACACATCTCCGAGCCCACGAGAC
            >nextera-reverse-read-contam
            CTGTCTCTTATACACATCTGACGCTGCCGACGA
            >solexa-reverse
            AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG

            Thank you for the time to reply to this.
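
The thread leaves open where the barcode placeholder sits in each adapter, so that has to come from your library-prep documentation. Once you have an adapter template containing a run of Ns, though, the substitution the README describes (inserting the reverse complement of the barcode) can be sketched as below. The template `ACGTNNNNNNTTGCA` and the record name are made up for illustration and are not real adapter sequences; the helper names are hypothetical too.

```python
import re

# Translation table for complementing DNA, including N and lowercase bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fill_placeholder(template, barcode, revcomp=True):
    """Replace the first run of Ns in the template with the barcode.

    With revcomp=True the barcode is reverse-complemented first. Whether
    that is right for your data depends on how your barcodes are defined.
    """
    match = re.search(r"N+", template)
    if match is None:
        raise ValueError("template contains no run of Ns to replace")
    insert = reverse_complement(barcode) if revcomp else barcode
    return template[:match.start()] + insert + template[match.end():]


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Hypothetical example: one made-up template and one 6-base barcode.
adapter = fill_placeholder("ACGTNNNNNNTTGCA", "ATCACG")
fasta_text = to_fasta([("example-adapter-with-barcode", adapter)])
```

Writing `fasta_text` to a file gives something you could pass to Scythe as the adapter file, provided the template and barcode orientation match your library.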

