  • Hello and a Question: 50 or 100 bp reads?

    Greetings all,

    I'm a 'senior' grad student at UCB working on a maize genetics/epigenetics project. I've prepared a couple of libraries that we are planning to have sequenced here on one of our campus facility's nice new HiSeq 2000 machines! I'm validating them right now by small-scale cloning, but from the size of most of the inserts, it looks like they are exactly what we expected, so all systems are go.

    I'm quite new to this whole deep sequencing technique, but I'm very excited to start the learning process of how to analyze these data sets! On advice from this excellent post (http://seqanswers.com/forums/showthr...good+computers), which explains my situation exactly, I am slowly but surely working through the Unix and Perl for Biologists primer (http://korflab.ucdavis.edu/Unix_and_Perl/). Hopefully I'll have at least a novice understanding of programming by the time we get our reads.

    But more importantly, a question: Should I get 50 or 100 bp reads for these libraries?

    Here are some details and issues that we are dealing with:

    The libraries were prepared using the small RNA adapters, so they will have to be done with single reads. Our main goal is to compare the two libraries, which represent two biological samples (WT vs. mutant), quantitatively, so getting fairly deep coverage is important to our analysis. However, we are working with the highly repetitive maize genome, so we also want to maximize the number of reads we can unambiguously map to the genome. In fact, reads that contain repetitive sequence AND unique sequence (e.g., the insertion site of a transposon or other repeat into a unique genomic region) may be of particular interest, so capturing as many of these sites as possible would be super. I'm guessing that longer reads would help in this respect.

    From the Bioanalyzer traces for the libraries, it looks like the most *abundant* inserts are ~75 and ~56 bp, i.e., that's where the peaks are. The insert size range is ~30-230 bp though (I cut out between 100-300 bp on the gel). Does the range really matter here? What percentage of the reads we get can we expect to come from the 75 and 56 bp inserts? And from the larger inserts that we capture, can we expect to get decent enough coverage to be able to compare the two libraries at a particular region?
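One rough way to frame the range question is to ask what fraction of inserts a given read length would cover end to end, since anything shorter than the read will run into the 3' adapter. The sketch below is only an illustration: the function name `fraction_fully_read` is hypothetical, and the insert-size counts are made-up placeholder numbers, not values taken from the Bioanalyzer traces.

```python
def fraction_fully_read(insert_counts, read_length):
    """Return the fraction of inserts whose full length fits in one read.

    insert_counts maps insert size (bp) to the number of inserts of that size.
    Inserts no longer than read_length are read completely (and any remaining
    cycles read into the 3' adapter); longer inserts are only partly read.
    """
    total = sum(insert_counts.values())
    if total == 0:
        raise ValueError("insert_counts must contain at least one insert")
    covered = sum(n for size, n in insert_counts.items() if size <= read_length)
    return covered / total


# PLACEHOLDER distribution for illustration only (not measured data).
example_counts = {30: 10, 56: 30, 75: 40, 150: 20}

for length in (50, 100):
    share = fraction_fully_read(example_counts, length)
    print(f"{length} bp reads fully cover {share:.0%} of inserts")
```

With a real size distribution, the same comparison would show how much extra insert sequence 100 bp reads recover over 50 bp reads, and how many reads would need adapter trimming.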

    I would just automatically go with 100 bp reads I guess, but am wondering: from what people have seen, is coverage significantly reduced with an increase in read length?

    It looks like there are many programs out there which recognize and trim adapter sequences from Illumina reads, for the reads that sequence INTO the 3' adapters. So it seems like that wouldn't be TOO big of a problem.
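For what it's worth, the basic idea behind 3' adapter trimming can be sketched in a few lines. This is a minimal exact-match illustration, not how any particular trimming program works: real tools also tolerate sequencing errors and use quality scores. The function name `trim_3prime_adapter`, the `min_overlap` parameter, and the adapter string are all hypothetical placeholders; the actual adapter sequence depends on the library prep kit.

```python
def trim_3prime_adapter(read, adapter, min_overlap=5):
    """Remove a 3' adapter from a read using exact matching only.

    Case 1: the whole adapter occurs in the read (insert shorter than the
            read by at least the adapter length); cut at its first position.
    Case 2: only the start of the adapter fits before the read ends; cut if
            at least min_overlap adapter bases match the end of the read.
    Reads with no detectable adapter are returned unchanged.
    """
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Try the longest partial adapter prefix first, then shorter ones.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


# PLACEHOLDER adapter for illustration only; substitute the kit's sequence.
ADAPTER = "AGCTTCGGATCC"
insert = "ACGTACGTAC"

print(trim_3prime_adapter(insert + ADAPTER, ADAPTER))      # full adapter present
print(trim_3prime_adapter(insert + ADAPTER[:6], ADAPTER))  # partial adapter at the end
print(trim_3prime_adapter(insert, ADAPTER))                # no adapter
```

The `min_overlap` threshold is a trade-off: a lower value trims more true adapter fragments but also clips more genuine insert bases that happen to match the start of the adapter.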

    Any advice/help on this would be much appreciated!
