  • Cpipe Test Error

    Hi all,
    I hope this is the right place to post my issue.

    I'm using Cpipe for a qualitative analysis in a university project.

    After installation, I have to run the self-test. The test runs correctly until it reaches the "recall precision test". At that point the pipeline prints this output:

    [bam_header_read] EOF marker is absent. The input is probably truncated.
    [M::main_mem] read 509156 sequences (50000164 bp)...
    [samopen] no @SQ lines in the header.
    [sam_read1] missing header? Abort!

    Cleaned up file align/NA12878CHR22_31fc6a75c408009fd2aaf62076fe0c304fc3931a_L001.bam to .bpipe/trash/NA12878CHR22_31fc6a75c408009fd2aaf62076fe0c304fc3931a_L001.bam
    ERROR: Command failed with exit status = 1 :

    Samtools gives me this error: "[samopen] no @SQ lines in the header. [sam_read1] missing header? Abort!" How can I fix that?

    I know this isn't a forum about Cpipe, but I have read that this is a common issue with samtools.

    I use samtools ver. 0.1.19 and BWA ver. 0.7.5a.

    Thanks in advance for any answers.

  • #2
    It would probably be helpful if you could give the exact command lines you are using, describe the data you are working with, and the goal of the project. Also, the full screen output is generally much more useful than just a few lines.

    I'm not familiar with Cpipe, but it looks like you are using a bam file with no header, which is always a problem... so it's helpful to know where that bam file came from.
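
One quick way to confirm whether a header is really missing is to look for @SQ lines in the SAM text before it reaches samtools. The following is a minimal sketch, assuming a POSIX shell with awk; has_sq_lines is a hypothetical helper name and is not part of samtools, bwa, or Cpipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools, bwa, or Cpipe).
# Reads SAM text on stdin and returns 0 only if the header
# contains at least one @SQ (reference sequence) line.
has_sq_lines() {
    awk '
        /^@SQ\t/ { found = 1 }   # a reference sequence header line
        !/^@/    { exit }        # first non-header line: stop reading
        END      { exit(found ? 0 : 1) }
    '
}
```

Any command that writes SAM to stdout can be piped into it. For an existing BAM file, the output of samtools view -H could be checked the same way, assuming the file is readable at all.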

    • #3
      set -o pipefail; /home/pmirabelli/cpipe2017/cpipe/tools/bwa/0.7.5a/bwa mem -M -t 5 -k 19 -R "@RG\tID:NA12878CHR22_L001\tPL:illumina\tPU:1\tLB:null\tSM:NA12878CHR22" /home/pmirabelli/cpipe2017/cpipe/hg19/ucsc.hg19.fasta /home/pmirabelli/cpipe2017/cpipe/batches/recall_precision_test/data/NA12878CHR22_L001_R1.fastq.gz /home/pmirabelli/cpipe2017/cpipe/batches/recall_precision_test/data/NA12878CHR22_L001_R2.fastq.gz | /home/pmirabelli/cpipe2017/cpipe/tools/samtools/0.1.19/samtools view -F 0x100 -bSu - | /home/pmirabelli/cpipe2017/cpipe/tools/samtools/0.1.19/samtools sort - align/NA12878CHR22_31fc6a75c408009fd2aaf62076fe0c304fc3931a_L001

      This is the command.

      Some posts on the Internet suggest using the -t option with the reference FASTA file as its argument. The problem is that I do not know where this error comes from, especially since a previous run of the pipeline gave the correct result at this same step.
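
For context on the -t suggestion: to the best of my knowledge, samtools view -t expects a tab-delimited list of reference names and lengths (for example the .fai index written by samtools faidx), while -T is the option that takes the FASTA file itself. The sketch below only illustrates what such a list looks like; fasta_lengths is a hypothetical helper written for this illustration, and in practice samtools faidx would normally be used instead.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not a samtools command).
# Reads FASTA on stdin and prints "name<TAB>length" for each sequence,
# the same information held in the first two columns of a .fai index.
fasta_lengths() {
    awk '
        /^>/ {
            # New record: print the previous one, if any.
            if (name != "") printf "%s\t%d\n", name, len
            name = substr($1, 2)   # drop ">" and any description text
            len = 0
            next
        }
        { len += length($0) }      # sequence line: add its length
        END { if (name != "") printf "%s\t%d\n", name, len }
    '
}
```

Whether adding -t actually resolves the error here is a separate question; it supplies a header that the incoming SAM lacks but does not explain why the header was missing.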

      • #4
        When debugging pipelines, or more generally whenever piping offers little advantage, I strongly encourage you to write a pipeline that does not use any piping. Instead, have each stage write a file, and let the next stage wait until that file is complete before executing.

        Also, I'm not sure what this has to do with Cpipe; you're just using bwa-mem and samtools, so asking about "Cpipe" in your title is reducing the scope of potential answerers. bwa-mem and samtools are very commonly used but I've never heard of Cpipe.

        I suggest you change the code to read and write files consecutively without any piping. It's not even clear right now which samtools process is failing, since there are two of them, and both are an obsolete version. I suggest you use the latest samtools version; recent versions detect the input format automatically, for example, so the -S flag is no longer needed.
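
The file-per-stage approach could look roughly like the sketch below. It assumes a POSIX shell; run_stage is a hypothetical helper, and REF, R1, R2 and the output file names in the usage comment are placeholders rather than the paths from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of bwa, samtools, or Cpipe).
# Usage: run_stage OUTPUT_FILE COMMAND [ARGS...]
# Runs COMMAND with stdout sent to OUTPUT_FILE.tmp, and renames it to
# OUTPUT_FILE only if the command exited with status 0 and wrote
# something. Otherwise it removes the partial file and returns 1.
run_stage() {
    out=$1
    shift
    if "$@" > "$out.tmp" && [ -s "$out.tmp" ]; then
        mv "$out.tmp" "$out"
    else
        echo "stage failed: $*" >&2
        rm -f "$out.tmp"
        return 1
    fi
}

# Usage sketch (placeholders; most options from the original command are
# omitted for brevity; samtools syntax as in version 0.1.19):
#   run_stage aln.sam bwa mem -M "$REF" "$R1" "$R2" &&
#   run_stage aln.bam samtools view -F 0x100 -bS aln.sam &&
#   samtools sort aln.bam aln.sorted
```

Because each stage stops the chain on failure, the first error message identifies the failing tool, and the intermediate file from the previous stage is left on disk for inspection.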
