  • filter sequences from rRNA, tRNA

    Hi all,

    After a first look (I ran a quick clone survey of my small RNA library for Illumina), things seem to be working. But I want to measure the degree of contamination from other sequences, like degraded rRNA, tRNA, or even E. coli sequences. I think a good starting point would be to BLAST my seqs against specific ribosomal RNA databases and/or an E. coli database (instead of the whole nt db).
    So my question is: where can I find these databases? I've been looking for a while through NCBI/EMBL but couldn't find anything. Or does anyone have a better idea for checking this contamination?

  • #2
    Rfam: http://rfam.sanger.ac.uk/


    • #3
      Thanks for the information!

      But I have another question. When I have Illumina reads and want to filter out contaminating seqs (e.g. E. coli), that is, to remove the unwanted reads, how can I do that? Is there any program/script for that purpose?

      Thanks!


      • #4
        The best way is to keep the reads that align with your reference genome.


        • #5
          Originally posted by NicoBxl
          The best way is to keep the reads that align with your reference genome.
          That's the best way, but you could always align your reads against E. coli or whatever your putative contamination is, and use the bowtie parameter --un, which saves the unaligned reads.
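
A minimal sketch of that idea, written as a small Python helper that assembles the bowtie command line. The index name, file names, and thread count are placeholders rather than anything from this thread, and the index is assumed to have been built beforehand with bowtie-build from the contaminant sequences. The flags are those of bowtie 1; some newer releases expect the index to be passed with -x, so check `bowtie --help` for your version.

```python
def build_bowtie_filter_cmd(index_prefix, reads_fastq, clean_fastq,
                            hits_out="contaminant_hits.sam", threads=1):
    """Build a bowtie (v1) command that aligns reads to a contaminant index
    and writes the reads that fail to align to `clean_fastq`."""
    return [
        "bowtie",
        "-q",                 # input reads are in FASTQ format
        "-p", str(threads),   # number of alignment threads
        "-S",                 # write alignments in SAM format
        "--un", clean_fastq,  # reads that fail to align are written here
        index_prefix,         # hypothetical index built with bowtie-build
        reads_fastq,          # raw reads to screen
        hits_out,             # alignments to the contaminant (can be discarded)
    ]


# All names below are placeholders for illustration only.
cmd = build_bowtie_filter_cmd("ecoli_index", "raw_reads.fastq",
                              "clean_reads.fastq", threads=4)
print(" ".join(cmd))

# To actually run it (requires bowtie on PATH and an existing index):
# import subprocess
# subprocess.run(cmd, check=True)
```

The file given to --un then holds the reads that did not match the contaminant, and those are the ones to carry forward.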


          • #6
            Originally posted by Gators
            That's the best way, but you could always align your reads against E. coli or whatever your putative contamination is, and use the bowtie parameter --un, which saves the unaligned reads.
            Of course. Very clever, thanks!


            • #7
              I am analyzing some Illumina libraries that appear to have a lot of ribosomal RNA contamination.

              I'm using Bowtie to align the reads only to a specific set of sequences, and because of the differing amounts of rRNA contamination in each sample, each of them maps a different percentage of reads to the dataset (some map half of what others do), ranging from 1% to 0.3%.

              I wonder if the amount of rRNA contamination in the preparation of a library can have an impact on the apparent expression level of a gene -- even though one normalizes its counts against the total number of reads that mapped.

              What's your opinion on this subject?

              Carmen
              Last edited by carmeyeii; 12-21-2012, 08:22 AM.
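
On the normalization question above, a toy calculation may help frame the discussion. It is only a sketch with made-up numbers. It assumes that rRNA reads simply displace all other reads proportionally, that 1% of the non-rRNA reads hit the target set, and that a hypothetical gene X takes 2% of the mapped reads. None of these figures come from the libraries described in this post.

```python
def per_million(count, denominator):
    """Scale a raw count to 'per million' of the chosen denominator."""
    return count * 1_000_000 / denominator


def simulate_library(total_reads, rrna_percent):
    """Toy library: rRNA reads displace everything else proportionally."""
    non_rrna = total_reads * (100 - rrna_percent) // 100
    mapped = non_rrna // 100    # assumed: 1% of non-rRNA reads hit the target set
    gene_x = mapped * 2 // 100  # assumed: gene X takes 2% of the mapped reads
    return {
        "mapped": mapped,
        "gene_x": gene_x,
        "gene_x_per_million_mapped": per_million(gene_x, mapped),
        "gene_x_per_million_total": per_million(gene_x, total_reads),
    }


# Two hypothetical libraries of equal depth with 40% and 80% rRNA reads.
results = {name: simulate_library(10_000_000, pct)
           for name, pct in (("A", 40), ("B", 80))}

for name, lib in results.items():
    print(name, lib)
```

Under these assumptions, gene X comes out the same in both libraries when scaled by mapped reads, and differs threefold when scaled by total reads. The heavily contaminated library still has fewer informative reads, so its counts are noisier, and real rRNA depletion may not be this uniform.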


              • #8
                Hi,

                I'm trying to use Bowtie, as Gators suggested, to clean my raw reads of rRNA contamination, but because it's my first time using Bowtie I'm a little bit lost.
                Can anyone suggest a script for that using the --un parameter?
                Thanks,


                • #9
                  Hi to all,

                  I finally managed to write a Bowtie script that works. To generate my indexes I downloaded all the Porifera rRNA sequences from NCBI.
                  Now I have 2 questions:

                  - As input, should I use my raw FASTQ reads (without any trimming, either adapter or quality trimming), or is it better to clip the adapters and filter by quality first and then try to remove the rRNA contamination?

                  - I'm trying to print the general statistics using the -t parameter, but I guess that because I'm submitting the script to a queue I'm not getting anything. How can I obtain the general statistics information?

                  Time loading forward index: 00:00:00
                  Time loading mirror index: 00:00:00
                  Seeded quality full-index search: 00:00:00
                  # reads processed: 1000
                  # reads with at least one reported alignment: 699 (69.90%)
                  # reads that failed to align: 301 (30.10%)
                  Reported 699 alignments to 1 output stream(s)
                  Time searching: 00:00:00
                  Overall time: 00:00:00

                  Thanks,
                  Alicia.
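
One possible way to recover those statistics, sketched under the assumption that the summary ends up in a text file: bowtie prints this summary to standard error, so on a queue it usually lands in the job's error log, or it can be captured explicitly by redirecting stderr (for example with `2> bowtie_stats.log`, where the file name is a placeholder). The helper below is hypothetical and simply pulls the read counts out of such a log.

```python
import re

# Patterns for the summary lines bowtie (v1) prints, as quoted in the post above.
SUMMARY_PATTERNS = {
    "processed": re.compile(r"^\s*# reads processed: (\d+)", re.M),
    "aligned": re.compile(
        r"^\s*# reads with at least one reported alignment: (\d+)", re.M),
    "failed": re.compile(r"^\s*# reads that failed to align: (\d+)", re.M),
}


def parse_bowtie_summary(log_text):
    """Extract read counts from a bowtie summary; missing lines are skipped."""
    stats = {}
    for key, pattern in SUMMARY_PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            stats[key] = int(match.group(1))
    return stats


# Example using the summary quoted in the post above.
example_log = """\
# reads processed: 1000
# reads with at least one reported alignment: 699 (69.90%)
# reads that failed to align: 301 (30.10%)
Reported 699 alignments to 1 output stream(s)
"""

stats = parse_bowtie_summary(example_log)
print(stats)
print("fraction aligned to rRNA index:", stats["aligned"] / stats["processed"])
```

In a real run the text would come from the captured log, e.g. `parse_bowtie_summary(open("bowtie_stats.log").read())`.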
