  • Using 50bp short-reads to find a translocation

    I know of a translocation that occurred (I also know the sequence that was translocated) in my sequenced DNA, but I'm not sure where it was translocated to, and the reads of the translocated sequence line up against the reference in the original location. Is there any way to find a translocation using short-reads?

  • #2
    Hi Agc,
    It would help if you supplied more information:

    1) Is it genome re-sequencing or transcriptome data?
    2) Is it paired-end or single reads?
    3) What is the coverage/number of reads generated?
    4) What species is it?

    I don't know if there is a ready-made solution for you, but my bodge-up-and-leg-it approach would be:

    1) Reads that span the translocation breakpoint won't map to your reference sequence. For example, using human: the first 25bp of a read might map to chromosome 1 and the last 25bp to chromosome Y.
    2) Assuming your situation is similar to my example, I would take all reads that did not map. For each of these reads, take the first 21bp and the last 21bp and map them to the genome (maybe with BLAT or BLAST, altering the parameters). See which reads have their first 21bp and last 21bp mapping to different chromosomes. These are your candidate translocation regions, and hopefully one stands out as real (lots of reads mapping to it).
    3) Take the sequences at each putative translocation, 49bp from each chromosome, and join them to make a BLAST database of pseudo-translocation sequences.
    4) BLAST all reads against the translocation sequences and record the reads that align over their full length. With 49bp on each side, a 50bp read aligned over its full length must overlap the translocation point by at least 1 base. The most likely real translocation will have the most reads mapping to it.
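Step 2 could be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: the function names, the layout of `end_hits` and the 100bp position binning are assumptions made for illustration, and the mapping of the 21bp ends (with BLAT, BLAST or another aligner) is assumed to have been done separately and parsed into `end_hits`.

```python
from collections import Counter

END_LEN = 21  # length of each read end, as suggested in step 2


def split_read_ends(reads, end_len=END_LEN):
    """Yield (read_id, head, tail) for each (read_id, sequence) pair.

    Reads too short to give two non-overlapping ends are skipped.
    """
    for read_id, seq in reads:
        if len(seq) >= 2 * end_len:
            yield read_id, seq[:end_len], seq[-end_len:]


def candidate_junctions(end_hits, bin_size=100):
    """Tally reads whose two ends map to different chromosomes.

    end_hits is a hypothetical structure built from the aligner output:
    read_id -> {"head": (chrom, pos), "tail": (chrom, pos)}.
    Positions are binned so that reads spanning the same junction
    are counted together (a junction near a bin edge may be split).
    """
    counts = Counter()
    for hits in end_hits.values():
        if "head" not in hits or "tail" not in hits:
            continue  # one end did not map
        (chrom1, pos1), (chrom2, pos2) = hits["head"], hits["tail"]
        if chrom1 != chrom2:
            counts[(chrom1, pos1 // bin_size, chrom2, pos2 // bin_size)] += 1
    return counts.most_common()
```

The candidate at the top of the returned list is the one supported by the most reads, which is the "stands out as real" criterion from step 2.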

    This is my rough approach, which will get you the right answer but involves a bit of BLAST, BLAT, Perl/Python magic and some result filtering.
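Steps 3 and 4 might look something like this. To stay self-contained the sketch replaces BLAST with exact forward-strand string matching, which is a simplification and not what the post proposes; `make_junction` and `count_spanning_reads` are hypothetical names. It also shows why 49bp flanks suit 50bp reads: a read that fits entirely inside the joined sequence has to cross the breakpoint.

```python
FLANK = 49  # read length minus one, for 50bp reads


def make_junction(left_seq, right_seq, flank=FLANK):
    """Join the bases left of the breakpoint to the bases right of it.

    Returns the pseudo-translocation sequence and the index of the
    first base that comes from the right-hand side.
    """
    left = left_seq[-flank:]
    return left + right_seq[:flank], len(left)


def count_spanning_reads(junction, breakpoint, reads):
    """Count reads that match the junction exactly and cross the breakpoint.

    Only the first exact match of each read is considered; a real
    aligner would also handle mismatches and the reverse strand.
    """
    n = 0
    for read in reads:
        start = junction.find(read)
        # The read must cover at least one base on each side.
        if start != -1 and start < breakpoint < start + len(read):
            n += 1
    return n
```

Running `count_spanning_reads` once per candidate junction and ranking the candidates by count would give the comparison described in step 4.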

    I expect the NGS experts here know of better/easier solutions, probably with already-developed software, so give it a day or two before embarking on my solution.

    Good luck.

    :-)

    PS. If it is paired-end, this will help a lot, as one mate of a pair will map to one chromosome and the other mate to another chromosome (there should definitely be software to help with that), or you can just parse the SAM output from TopHat or Bowtie.
    Last edited by poisson200; 07-22-2010, 04:08 AM.
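For the paired-end case, parsing the SAM output could be sketched as below. It relies only on the standard SAM columns (FLAG in column 2, RNAME in column 3, the mate's reference name in column 7, where `=` means "same as RNAME") and the flag bits 0x1 (paired), 0x4 (read unmapped) and 0x8 (mate unmapped). The function name is hypothetical, and this is a rough filter under those assumptions rather than a replacement for a dedicated structural-variant caller.

```python
def discordant_pairs(sam_lines):
    """Yield (read name, chrom, pos, mate chrom, mate pos) once per pair
    whose two mates are both mapped, but to different references."""
    seen = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, rnext, pnext = int(fields[3]), fields[6], int(fields[7])
        if not flag & 0x1 or flag & 0x4 or flag & 0x8:
            continue  # unpaired, or one of the mates is unmapped
        if rnext in ("=", "*") or rnext == rname:
            continue  # mate on the same reference, or unknown
        if qname in seen:
            continue  # already reported from the other mate's record
        seen.add(qname)
        yield qname, rname, pos, rnext, pnext
```

It can be fed an open file handle directly, for example `for hit in discordant_pairs(open(path)):` with `path` pointing at the aligner's SAM file.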



    • #3
      Thanks for the quick reply!

      1) Genome re-sequencing
      2) Single ~50bp reads
      3) Not sure where I can obtain that information.
      4) S. cerevisiae

      The translocation occurred within the same chromosome, but I'll try to develop the idea of using the unmapped reads. I'd still find a ready-made solution or any other suggestions very helpful.
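On point 3, coverage can be estimated from the read file itself as (number of reads × read length) / genome size. The sketch below assumes the reads are in a standard four-lines-per-record FASTQ file and takes the S. cerevisiae genome as roughly 12 Mb; both function names and the read count in the usage note are made up for illustration.

```python
def count_fastq_reads(lines):
    """Count reads in FASTQ text, assuming exactly four lines per record."""
    return sum(1 for _ in lines) // 4


def estimate_coverage(n_reads, read_len, genome_size):
    """Average depth of coverage: total sequenced bases over genome size."""
    return n_reads * read_len / genome_size
```

As an illustration only, `estimate_coverage(5_000_000, 50, 12_000_000)` for a hypothetical run of five million 50bp reads comes out at roughly 21x.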



      • #4
        It will still work: map the 21bp read ends and find the reads whose two ends both map to the chromosome in question. Look at the distance between the two ends. Reads that span the translocation site have one end mapping on each side of it, and most of them should show a similar distance between ends.
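The distance check for the same-chromosome case could be sketched like this. For an ordinary 50bp read the two 21bp ends start 29bp apart, so anything far from that is a candidate, and reads crossing the same junction should pile up at one distance. The input layout, the tolerance and the bin size are assumptions for illustration, and both ends are assumed to map to the same strand.

```python
from collections import Counter


def unusual_end_distances(end_hits, read_len=50, end_len=21, tolerance=5):
    """Return read_id -> distance for reads whose two ends map further
    apart (or closer) than expected for a contiguous read.

    end_hits is a hypothetical mapping:
    read_id -> (head start position, tail start position),
    both on the chromosome in question.
    """
    expected = read_len - end_len  # 29bp for a contiguous 50bp read
    unusual = {}
    for read_id, (head_pos, tail_pos) in end_hits.items():
        distance = abs(tail_pos - head_pos)
        if abs(distance - expected) > tolerance:
            unusual[read_id] = distance
    return unusual


def common_distances(distances, bin_size=50):
    """Bin the distances and list the bins from most to least populated."""
    counts = Counter((d // bin_size) * bin_size for d in distances.values())
    return counts.most_common()
```

A single well-populated bin at the top of `common_distances` would be the signal to look at; scattered one-off distances are more likely mapping noise.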
