
  • Pooling libraries across multiple lanes

    Dear SEQanswers experts - I have a set of 24 samples for RNA-seq. The analysis I need to do is pairwise (i.e. 12 pairs of samples) for differential gene expression. I need to run them on a single HiSeq 2500 flowcell (this is all we have money for).

    So I could run three samples per lane across the flowcell, which is fine, but it means that some pairs will be compared across lanes, and this feels like it would cause problems in the data later on, with any lane effects causing noise in some of the pairs.

    A long time ago an Illumina tech guy suggested pooling all libraries across all lanes as a flexible way of reducing any lane bias. It would also mean I can cross compare any sample to any other without lane bias.
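The trade-off between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: samples are numbered 0 to 23, paired as (0, 1), (2, 3), and so on, and in the three-per-lane layout the lanes are filled in sample order. The function names are hypothetical.

```python
from fractions import Fraction

N_SAMPLES = 24
N_LANES = 8


def lanes_three_per_lane(sample):
    """Lanes a sample is run on when each lane holds three consecutive samples."""
    return {sample // (N_SAMPLES // N_LANES)}


def lanes_pooled(sample):
    """Lanes a sample is run on when one pool of all libraries goes on every lane."""
    return set(range(N_LANES))


def split_pairs(lane_of):
    """Pairs whose two members were not sequenced on the same set of lanes."""
    pairs = [(i, i + 1) for i in range(0, N_SAMPLES, 2)]
    return [p for p in pairs if lane_of(p[0]) != lane_of(p[1])]


def flowcell_fraction(lane_of, samples_per_lane):
    """Share of the whole flowcell one sample receives, assuming equimolar pooling."""
    return Fraction(len(lane_of(0)), samples_per_lane * N_LANES)


if __name__ == "__main__":
    print(split_pairs(lanes_three_per_lane))  # [(2, 3), (8, 9), (14, 15), (20, 21)]
    print(split_pairs(lanes_pooled))          # []
    print(flowcell_fraction(lanes_three_per_lane, 3))  # 1/24
    print(flowcell_fraction(lanes_pooled, 24))         # 1/24
```

Under these assumptions, four of the twelve pairs straddle a lane boundary in the three-per-lane layout, while the single-pool layout gives every sample the same lanes and the same nominal share of the flowcell.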

    So my questions are:
    1. If I pooled my samples together into a single pool and ran this pool across all 8 lanes, does that seem reasonable to you?
    2. Do many people do this cross-lane pooling regularly?
    3. And if so, at which stage during data processing do you re-combine the data from multiple FASTQ files into single-sample data? Can this be done during the initial deconvolution (demultiplexing)?


    I plan to use a TopHat2/htseq-count/DESeq2 pipeline for analysis.

    Thanks for any input.

    Matt

  • #2
    1. Yes, this is the recommended method.
    2. We do this with most* of our samples, though we tend to only use 2-4 lanes because we don't need the depth that you apparently do.
    3. Our demultiplexing pipeline merges things automatically (in fact, some versions of bcl2fastq can do this automatically).


    I should probably note that I've never personally seen a big lane effect. We actually split across lanes in case there's a technical failure of one of them.

    *Well, when a project needs multiple lanes. Many projects only need a single lane.
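If the demultiplexing step does leave one file per lane, the merge mentioned in point 3 can be done afterwards by concatenating the gzipped files per sample. Below is a minimal Python sketch, assuming the files follow the usual Illumina-style naming `<sample>_S<n>_L<lane>_<read>_001.fastq.gz`; the function names and directory layout are hypothetical. (bcl2fastq2 also has a `--no-lane-splitting` option that avoids the per-lane split in the first place; check the documentation for the version your facility runs.)

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <sample>_S<n>_L<lane>_<read>_001.fastq.gz
LANE_FILE = re.compile(
    r"^(?P<sample>.+_S\d+)_L(?P<lane>\d{3})_(?P<read>[RI]\d)_001\.fastq\.gz$"
)


def group_by_sample(fastq_dir):
    """Map (sample, read) to that sample's per-lane files, in lane order."""
    groups = defaultdict(list)
    # Sorting by name puts L001, L002, ... in order within each sample and read.
    for path in sorted(Path(fastq_dir).glob("*.fastq.gz")):
        match = LANE_FILE.match(path.name)
        if match:
            groups[(match["sample"], match["read"])].append(path)
    return groups


def merge_lanes(fastq_dir, out_dir):
    """Concatenate per-lane files into one file per sample and read.

    Gzip files can be joined byte-for-byte: the result is a valid
    multi-member gzip file, so no decompression is needed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), paths in group_by_sample(fastq_dir).items():
        target = out_dir / f"{sample}_{read}_001.fastq.gz"
        with open(target, "wb") as out:
            for path in paths:
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[(sample, read)] = target
    return merged
```

Because R1 and R2 are concatenated in the same lane order, read pairs stay in step between the two merged files.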


    • #3
      Hi Matt,

      as long as your index strategy allows it, pooling all samples is absolutely fine and the way to go for your problem. So, as long as all your samples have different indices (or a different combination of indices), you can pool them.

      If you are not sure, just create a sample sheet using the Illumina Experiment Manager software. It tells you if you run into problems with non-unique indices or color balancing.

      We do it regularly with genomes on our HiSeq 4000. We pool 6 genomes and put the pool on all 8 lanes.
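The index-uniqueness part of that check can also be sketched by hand. The Python below is an illustration, not a replacement for Illumina Experiment Manager: it does not look at color balance, the `min_distance` default of 3 is an assumption (chosen so that indices should stay distinguishable if the demultiplexer tolerates one mismatch), and summing the distance over both index reads is a simplification.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError(f"index lengths differ: {a!r} vs {b!r}")
    return sum(x != y for x, y in zip(a, b))


def index_conflicts(samples, min_distance=3):
    """Return (sample1, sample2, distance) for every pair of samples whose
    indices are closer than min_distance.

    samples maps a sample name to an (i7, i5) tuple; use "" for i5 when
    the libraries are single-indexed.
    """
    conflicts = []
    for (name1, idx1), (name2, idx2) in combinations(samples.items(), 2):
        distance = sum(hamming(a, b) for a, b in zip(idx1, idx2))
        if distance < min_distance:
            conflicts.append((name1, name2, distance))
    return conflicts
```

For example, with made-up single indices, `index_conflicts({"s1": ("ATCACG", ""), "s2": ("CGATGT", ""), "s3": ("ATCACG", "")})` reports `("s1", "s3", 0)` because those two samples share an index and could not be told apart in a pool.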


      • #4
        Fantastic news - thanks for the input! I'll plan it and test the barcodes/indices etc.

        I'm not sure how the demultiplexing pipeline works in our sequencing core (this is the only bit I won't be doing myself), but I'll liaise with them and find a way.

        So, Devon:

        "Our demultiplexing pipeline merges things automatically (in fact, some versions of bcl2fastq can do this automatically)."

        So this generates one FASTQ file for each sample directly from the demultiplexing? That would be great.

        Matt


        • #5
          That is what you will get from your facility.
          They will run Illumina's bcl2fastq software and you will get a FASTQ file for each sample individually (or two if you do paired-end sequencing).

          Good luck with it. I had to learn all these things the hard way too and I know it can be really confusing to start with.
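When the files arrive, a quick sanity check for paired-end data is that the two files of a sample hold the same number of reads. A minimal Python sketch, assuming gzipped FASTQ with the standard four lines per record; the function names are hypothetical.

```python
import gzip


def count_reads(fastq_gz):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        lines = sum(1 for _ in handle)
    if lines % 4:
        raise ValueError(f"{fastq_gz}: line count {lines} is not a multiple of 4")
    return lines // 4


def check_pair(r1_path, r2_path):
    """Return (counts_match, n_reads_r1, n_reads_r2) for a paired-end sample."""
    n1 = count_reads(r1_path)
    n2 = count_reads(r2_path)
    return n1 == n2, n1, n2
```

A mismatch usually points to a truncated transfer or to files that were merged in a different lane order.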
