
  • DBG2OLC - parameter f cannot identify my PacBio reads

    Hi guys!
    I'm trying to run DBG2OLC with contigs built with platanus plus PacBio subreads (fastq) at 13x genome coverage, but it looks like the f parameter doesn't recognize my PacBio reads. Do I have to create some kind of info file to pass to the f parameter?

    My command is:

    ./DBG2OLC LD 1 Contigs /data/scratch/xxx/data/pair_end/platinus/PE.lf_contig.fa k 17 KmerCovTh 2 MinOverlap 10 AdaptiveTh 0.001 f /data/scratch/xxx/data/PacBio/subreads/BVjE7HVA_filtered_subreads-1.fastq f /data/scratch/xxx/data/PacBio/subreads/q0QwkDI0_SMRT_09-10_filtered_subreads.fastq

    and the end of my log file:

    Scoring method: 3
    Match method: 2
    Loading long read index
    Loading file: ReadsInfoFrom_BVjE7HVA_filtered_subreads-1.fastq
    Loading file: ReadsInfoFrom_q0QwkDI0_SMRT_09-10_filtered_subreads.fastq
    0 reads loaded.

    I appreciate your help! I really can't understand why it's not loading: the path is right, and so is the format. (I also tried renaming the read files to end in .fq, but that did not work either!)

    Thank you so so much!!!

  • #2
    The following is the part of the code dbg2olc uses to parse the option 'f':

    Code:
    		if (strcmp(argv[i], "f") == 0)
    		{
    			i++;
    			LongReadFilenameVec.push_back(argv[i]);
    			string name1= argv[i];
    			int n = 0;
    			for (n = name1.size() - 1; n >= 0; --n)
    			{
    				if (name1[n] == '\\'||name1[n]=='/')
    				{
    					break;
    				}
    			}
    			name1 = name1.substr(n + 1, name1.size());
    
    			name1="ReadsInfoFrom_" + name1;
    			LongReadsInfoFiles.push_back(name1);
    			name1 = argv[i];
    			n = 0;
    			for (n = name1.size() - 1; n >= 0; --n)
    			{
    				if (name1[n] == '\\' || name1[n] == '/')
    				{
    					break;
    				}
    			}
    			name1 = name1.substr(n + 1, name1.size());
    
    			name1 = "NonContainedReadsFrom_" + name1;
    			NonContainedReadsFiles.push_back(name1);
    			continue;
    		}
    It looks like it takes the basename of that string and prepends the prefixes "ReadsInfoFrom_" and "NonContainedReadsFrom_", which is why the log file said:

    Loading file: ReadsInfoFrom_BVjE7HVA_filtered_subreads-1.fastq
    Loading file: ReadsInfoFrom_q0QwkDI0_SMRT_09-10_filtered_subreads.fastq

    Reading its manual, I believe the authors assume that you run "SparseAssembler" first with your data. It generates ReadsInfoFrom_XXX and NonContainedReadsFrom_XXX for you.
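
    To make the name derivation concrete, here is a minimal sketch of what the quoted parser appears to do with each f argument. The helper names (baseName, readsInfoName, nonContainedName) are hypothetical and not part of DBG2OLC; only the two prefixes and the split on '/' or '\' come from the code above.

```cpp
#include <string>

// Hypothetical helper (not part of DBG2OLC): drop everything up to and
// including the last '/' or '\\', mirroring the backward scan in the parser.
std::string baseName(const std::string& path) {
    const auto pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Name of the per-read info file derived from an 'f' argument.
std::string readsInfoName(const std::string& path) {
    return "ReadsInfoFrom_" + baseName(path);
}

// Name of the non-contained reads file derived from the same argument.
std::string nonContainedName(const std::string& path) {
    return "NonContainedReadsFrom_" + baseName(path);
}
```

    Note that the directory part of the path is discarded, so the derived names carry no directory at all. Presumably they are looked up relative to the directory DBG2OLC is run from, which would explain "0 reads loaded" when those files do not exist there yet.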


    • #3
      Are you sure you can already set your LD parameter to 1? If you've never had a successful run, I think it's supposed to be 0.
      Last edited by kehey; 01-05-2016, 12:40 AM. Reason: . -> ?


      • #4
        Kehey and bowhan,

        I'm sorry for the delay! You were completely right: for the first run, LD must be 0.

        It ran like a charm and really helped with my repetitive genome!!

        Thank you, guys!
