
  • Exome sequencing: Illumina? SOLiD? Read length? Pair-Ended?

    Hi All,

    My collaborators are interested in detecting SNPs in some cancer samples. Exome sequencing seems like a good start, but we do not know much about exome sequencing and analysis. We would appreciate advice on the following questions:

    1) Should we use the Illumina or the SOLiD platform? We would like to use the one with better sequencing QUALITY.
    2) What read length should we use? Is longer always better?
    3) I am not sure whether paired-end information is useful for SNP detection, but I guess we had better use paired-end.
    4) Could you recommend good software for identifying potential SVs from the exome sequencing data?

    Thank you very much.

  • #2
    1) I don't think it matters, but more tools support Illumina data and more people use Illumina.
    2) Yes; we do 100 bp PE on a HiSeq2000, for instance.
    3) Yes, but you will find it more useful for detecting indels. Lots of tools expect paired-end data and there is no reason not to use it.
    4) Samtools or GATK, run after alignment, are both popular tools for calling SNPs. SNVmix might be more appropriate for cancer samples. For annotation, try Annovar, SnpEff or Ensembl's VEP.

    Consider doing a paired tumour/normal study if possible.
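To illustrate why a matched normal sample helps, here is a toy sketch that compares the bases observed at one position in the normal and tumour samples. It is not how Samtools, GATK or SNVmix work; the function names, the thresholds (`min_depth`, `min_vaf`, `max_normal_vaf`) and the output labels are all hypothetical choices made for illustration.

```python
from collections import Counter


def allele_fraction(bases, alt):
    """Fraction of observed bases equal to `alt` (0.0 if there is no coverage)."""
    depth = len(bases)
    return bases.count(alt) / depth if depth else 0.0


def classify_site(normal_bases, tumour_bases, ref,
                  min_depth=10, min_vaf=0.1, max_normal_vaf=0.02):
    """Label one position using the bases seen in each sample.

    `normal_bases` and `tumour_bases` are strings of base calls covering
    the position, e.g. "AAAAGAAG". All thresholds are illustrative
    assumptions, not recommended values.
    """
    if len(normal_bases) < min_depth or len(tumour_bases) < min_depth:
        return "low_depth"

    # Most common non-reference base in the tumour, if any.
    non_ref = Counter(b for b in tumour_bases if b != ref)
    if not non_ref:
        return "reference"
    alt, _ = non_ref.most_common(1)[0]

    tumour_vaf = allele_fraction(tumour_bases, alt)
    normal_vaf = allele_fraction(normal_bases, alt)

    if tumour_vaf < min_vaf:
        return "reference"
    if normal_vaf <= max_normal_vaf:
        # Variant is present in the tumour but essentially absent in the normal.
        return "candidate_somatic"
    return "germline_or_shared"
```

Without the normal sample, the last two cases cannot be told apart, which is the main argument for sequencing both.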
    Last edited by Bukowski; 12-07-2011, 08:39 AM.



    • #3
      Thank you for your advice, Bukowski! One more question: if we want to call CNVs from the exome sequencing data, what tool do you recommend? I know CNV detection is more challenging with exome data alone than with whole-genome data.



      • #4
        I would probably be looking at ExomeCNV for that:



        And I'm pretty sure that will require paired tumour/normal data, but check.
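The general idea behind depth-based CNV detection from matched samples can be sketched as a per-exon log2 ratio of tumour to normal read depth, after scaling each sample by its total depth. This is a toy illustration of that idea only, not ExomeCNV's actual algorithm; the function name, the input dictionaries and the `pseudocount` value are assumptions.

```python
import math


def log2_depth_ratios(tumour_depth, normal_depth, pseudocount=0.5):
    """Return {exon: log2(tumour / normal)} after library-size scaling.

    `tumour_depth` and `normal_depth` map exon names to read counts.
    The pseudocount avoids taking the log of zero for uncovered exons.
    """
    t_total = sum(tumour_depth.values())
    n_total = sum(normal_depth.values())
    if t_total == 0 or n_total == 0:
        raise ValueError("both samples need at least one read")

    ratios = {}
    for exon, n in normal_depth.items():
        t = tumour_depth.get(exon, 0)
        # Scale by total depth so that a deeper library is not read as a gain.
        t_norm = (t + pseudocount) / t_total
        n_norm = (n + pseudocount) / n_total
        ratios[exon] = math.log2(t_norm / n_norm)
    return ratios


if __name__ == "__main__":
    tumour = {"exon1": 100, "exon2": 100, "exon3": 200}
    normal = {"exon1": 100, "exon2": 100, "exon3": 100}
    for exon, ratio in log2_depth_ratios(tumour, normal).items():
        print(exon, round(ratio, 3))
```

Note that with this simple scaling a large gain in one exon pushes the ratios of the others below zero, one reason real tools use more careful normalisation and segmentation.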



        • #5
          Can anyone tell me the pipeline for exome sequencing data analysis?



          • #6
            Originally posted by Jayu
            Can anyone tell me the pipeline for exome sequencing data analysis?
            That depends on how you want to do the analysis.

            Depending on how paranoid or pedantic you are, you can do a readjustment of read sequences based on the original intensity data. After that, you can do some pre-filtering or trimming of reads to exclude unlikely sequences.

            Your happiness with the current exon boundary annotation of your genome will determine if you can go straight to mapping, or if there needs to be some sort of assisted (or possibly de-novo) assembly first.

            If you care about isoforms, you will need to use a tool that can identify and distinguish different isoforms and estimate isoform proportions. This may be better achieved with a genome mapping with something that can split reads with very large gaps (something like Tophat). Otherwise you could map to the transcriptome, bearing in mind that isoform identification is much more difficult in that case.

            Once you have reads (or estimated reads), they need to be normalised to account for sampling variation and other types of random and systematic error. After that you can finally get around to the actual data analysis, which will generally be up to the researcher.
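As a minimal sketch of the normalisation step for count data, the simplest option is to scale each feature's read count by the total number of reads in the sample (counts per million). The function name and input format are assumptions, and this only corrects for sequencing depth, not for the other systematic errors mentioned above.

```python
def counts_per_million(counts):
    """Scale raw read counts so that they sum to one million.

    `counts` maps a feature name (gene, exon, ...) to its raw read count.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {feature: c * 1_000_000 / total for feature, c in counts.items()}


if __name__ == "__main__":
    sample = {"geneA": 250, "geneB": 750}
    print(counts_per_million(sample))
```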



            • #7
              That sounds an awful lot like a recipe for RNA-Seq analysis, not exome analysis. The poster (who shouldn't be tacking questions on to other people's threads) might be interested in http://seqanswers.com/wiki/How-to/exome_analysis



              • #8
                Originally posted by Bukowski
                That sounds an awful lot like a recipe for RNA-Seq analysis not exome analysis.
                Er, yes. Sorry, I got a little carried away there....



                • #9
                  I have a question not related to the thread though..

                  I assembled my Illumina data using SOAP, and now I want to carry out expression analysis using the Rseq tool. It only accepts SAM format, so I downloaded SAMtools to convert my SOAP output to SAM. Can anyone tell me how to run it and do the conversion? The tutorial has been of no use so far!
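For orientation, the target of such a conversion is a tab-separated SAM alignment line with eleven mandatory fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL). The sketch below builds one such line for a single-end read. It does not parse SOAP output; `sam_record` and its parameters are hypothetical, and the default CIGAR assumes an ungapped, full-length match.

```python
def sam_record(qname, rname, pos, seq, qual, reverse=False, mapq=255, cigar=None):
    """Build one SAM alignment line for an unpaired read.

    pos   -- 1-based leftmost mapping position on the reference
    seq   -- read sequence as it appears on the forward reference strand
             (the caller must reverse-complement reverse-strand reads)
    mapq  -- 255 means "mapping quality not available"
    cigar -- defaults to a full-length match, e.g. "36M" (an assumption)
    """
    if len(seq) != len(qual):
        raise ValueError("SEQ and QUAL must have the same length")

    flag = 16 if reverse else 0          # 16 = read maps to the reverse strand
    cigar = cigar or f"{len(seq)}M"

    fields = [
        qname, str(flag), rname, str(pos), str(mapq), cigar,
        "*", "0", "0",                   # RNEXT, PNEXT, TLEN: unused for unpaired reads
        seq, qual,
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    print(sam_record("read1", "chr1", 100, "ACGT", "IIII"))
```

A complete SAM file also needs header lines (for example `@SQ` lines naming each reference sequence), which this sketch leaves out.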



                  • #10
                    I have a question not related to the thread though..
                    This was just recently posted in this thread:

                    The poster (who shouldn't be tacking questions on to other people's threads)
                    Please try to do what this comment suggests and start new threads for unrelated questions. It makes searching the forums much easier for other future browsers of questions and answers.



                    • #11
                      This is actually my 1st post, can't figure out how to start a new thread. I'll try to post it there. Thanks. If this has been answered before, kindly pass me the link to the thread.



                      • #12
                        This is actually my 1st post, can't figure out how to start a new thread
                        From the SEQanswers home page, click on the red 'Forums' link at the left, then click on the forum name, then click on the 'New Thread' button. You can also click on the link at the top of a thread page (SEQanswers > Bioinformatics > Bioinformatics) to go to the forum page.

