  • whole exome sequencing analysis

    Hello everybody.
    I am new to this forum, so I hope I am using it properly.
    I am working on whole exome sequencing data and I am fairly new to bioinformatics, so I am running into a lot of problems.
    I started the analysis by aligning with bwa and obtained my SAM and BAM files.
    After that I tried to run samtools mpileup (after sorting my BAM files and filtering out reads that were not aligned and paired), but the software stalls after a while.
    Is there a way to check whether my SAM or BAM files are OK?
    Or can you suggest other pipelines that are more suitable for whole exome sequencing?
    Thank you all.

  • #2
    A couple of good links for exome pipelines: http://seqanswers.com/wiki/How-to/exome_analysis and https://www.broadinstitute.org/gatk/...best-practices

    Mpileup can take some time to run. What do you mean by "the software stalls after a while"? Can you see the process consuming CPU cycles in a process monitor (e.g. top)?

    • #3
      I am working with PuTTY, connected to a university server.
      When I say that samtools mpileup stalls, I mean that the connection between my computer and the server shuts down after roughly three quarters of an hour because no process is running any more. Mpileup runs for a while, as I can see from the output BCF file growing in size, but at a certain point the process stops, no more output is written, and the connection closes.
      PuTTY is set to close the connection when no process is running.
      It seems that samtools reaches a point in the BAM file that it can no longer process.

      • #4
        If you feel that there is something wrong with your BAM file then you can use the ValidateSamFile tool from Picard to check it: http://broadinstitute.github.io/pica...alidateSamFile
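A quick first check before a full validation is whether the BAM file ends with the empty BGZF end-of-file block; if that block is missing, the file was most likely truncated while it was being written. The following is a minimal sketch in Python, not a replacement for ValidateSamFile: the function name `has_bgzf_eof` is hypothetical, and the check says nothing about the records inside the file.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243"
    "0200" "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF block.

    Hypothetical helper: a False result suggests a truncated BAM,
    a True result only means the end of the file looks intact.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

If this returns False for a BAM file, regenerating the file is probably the first thing to try before investigating mpileup itself.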

        Another possibility is that firewalls between you and the university server are set to terminate an SSH session after a period of "inactivity". If that is what is happening, you should submit the mpileup job with the "nohup" command (http://linux.101hacks.com/unix/nohup-command/) so that it keeps running in the background even if your SSH session terminates for any reason. I am assuming that you are not using a compute cluster or job scheduler.
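The same idea as "nohup" can be sketched in Python: start the job in its own session so that it does not depend on the terminal that launched it, and send its output to files. This is only an illustration under assumptions: the helper name `launch_detached` and all file names are hypothetical, and `start_new_session` works on POSIX systems only.

```python
import subprocess


def launch_detached(cmd, stdout_path, stderr_path):
    """Start `cmd` so that it survives the end of the login session.

    Hypothetical helper. stdout and stderr go to the given files and
    stdin is disconnected, so the job needs nothing from the terminal.
    """
    out = open(stdout_path, "wb")
    err = open(stderr_path, "wb")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=out,
            stderr=err,
            start_new_session=True,  # new session, detached from the terminal
        )
    finally:
        # The child keeps its own copies of the file descriptors.
        out.close()
        err.close()


# Example call with hypothetical file names (not executed here):
# launch_detached(["samtools", "mpileup", "-f", "ref.fa", "sample.bam"],
#                 "sample.pileup", "mpileup.log")
```

The launching script can exit right after the call; progress can then be followed from a later session by looking at the output and log files.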

        • #5
          If you have paired-end fastq.gz files, you can upload them directly to HiPipe (http://hipipe.ncgm.sinica.edu.tw/) for exome analysis. HiPipe, which is driven by high-performance computing, has a few pre-configured pipelines available for NGS data analysis, such as whole genome variant analysis and RNA-seq analysis. However, most of the pipelines are for human data only.

          • #6
            OK, thank you all for your valuable suggestions. I am trying everything you suggested and will report back with the result.
