  • GMAP-GSNAP Run Facing Issues

    Hello All,

    I am trying to work out whether my GMAP installation is correct, because my runs keep failing. I have tested both the 10-30-2017 and 11-15-2017 versions of GMAP and hit identical problems with each. I have run on a local machine and on an LSF cluster, and in both environments the runs failed because of memory exhaustion. I will only describe the local machine failures here. I have attached some output and screenshots; if anyone needs more information to troubleshoot, please let me know.

    LOCAL MACHINE FAILURES: (See "System Configuration.txt" for the machine's CPU and memory configuration)

    Command:
    Code:
    nohup /gsap/tools/bin/gmap -d B73v4_genome_masked -D /anno/sanyalab/GMAP/GMAP-DB/ -f gff3_gene -F -t 6 -n 10 -K 50000 --min-identity=0.95 --min-trimmed-coverage=0.90 /anno/foo/PUB_DATASETS/ZEA/Zea_mays_EST.fasta.clean > /anno/foo/GMAP/RESULT/Zea_mays_EST_B73v4.gff3 2>nohup.out &
    Comments: I built the maize B73v4_genome_masked GMAP database with the 10-30-2017 version of GMAP. The evidence set is all ESTs belonging to the genus Zea. I cleaned these ESTs with seqclean before running the GMAP command.

    Results: Only an incomplete gff3 file is produced, because the machine's memory limit is reached (the machine has 128 GB of RAM) and the GMAP job is killed (see "Capture1.png", a screenshot taken just before the job is killed). The B73 genome is 2.0 GB in size and the EST evidence set is 916 MB, so the memory consumption seems very high for an input of that size. The error file ("nohup.out") is attached; its last lines indicate that execution simply stops when the memory maximum is hit. I tried the same command with the 11-15-2017 version and got the same result (I did not retain the result file). I also rebuilt the GMAP database with the 11-15-2017 version, thinking there might be version-to-version differences, but the run failed in the same way.
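
    File size alone does not show whether a handful of unusually long records sit in the EST set. As a rough sketch (not a diagnosis of the failure), the helper below summarizes a FASTA file by record count, total bases, and longest record. The function name `fasta_length_stats` is hypothetical, and the idea that sequence length matters for this failure is an assumption, not something the logs confirm.

```shell
# Hypothetical helper: summarize a FASTA file.
# Prints the record count, total bases, and the ID and length of the longest record.
fasta_length_stats() {
    awk '
        function finish_record() {
            n++; total += len
            if (len > max) { max = len; max_id = id }
        }
        /^>/ {
            # A header line closes the previous record and starts a new one.
            if (seen) finish_record()
            seen = 1; id = substr($1, 2); len = 0
            next
        }
        {
            # Sequence line: ignore whitespace and carriage returns.
            gsub(/[ \t\r]/, "")
            len += length($0)
        }
        END {
            if (seen) finish_record()
            printf "records=%d total_bases=%d max_len=%d max_id=%s\n", n, total, max, max_id
        }
    ' "$1"
}
```

    It could be called as `fasta_length_stats /anno/foo/PUB_DATASETS/ZEA/Zea_mays_EST.fasta.clean` and again on the cDNA file, so the two evidence sets can be compared on more than size.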

    I repeated the same command with an earlier GMAP version, 2014-06-10. This time it passed. The error file is "nohup3.out". The last lines of that file suggest a successful run, and I got a 1.7G results file, compared with the 69M incomplete file from the failed run.

    Thinking that I might have better luck running cDNA with the new GMAP version, I ran the following command locally:

    Command:
    Code:
    nohup /gsap/tools/gmap-2017-11-15/bin/gmap -d B73v4_genome_masked -D /anno/sanyalab/GMAP/GMAP-DB/ -f gff3_gene -F -t 14 -n 10 -K 50000 --min-identity=0.95 --min-trimmed-coverage=0.90 /anno/foo/PUB_DATASETS/PASA/ZEA/Zea_mays_cDNA.fasta > /anno/foo/GMAP/RESULT/Zea_mays_cDNA_B73v4_gmap2.gff3 2>nohup7.out &
    Comments: I built the maize B73v4_genome_masked GMAP database with the 11-15-2017 version of GMAP. The evidence set is all cDNAs belonging to the genus Zea. No sequence cleaning was done before running the command.

    Result: The run was successful. The error file "nohup7.out" is attached.

    I know GMAP can handle EST data, but these results confuse me. I am unsure whether the program has a memory management issue or whether something on my side is broken. The "config.site" I am using is attached. Please advise on what I need to do.
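
    One way to narrow down whether particular input records are involved is to split the EST file into smaller chunks and run each chunk separately. The following is only a sketch of that splitting step: the function name `split_fasta`, the chunk size, and the output naming scheme are all hypothetical, and it does not call GMAP itself.

```shell
# Hypothetical helper: split a FASTA file into chunks of at most N records each.
# Usage: split_fasta INPUT.fasta RECORDS_PER_CHUNK OUTPUT_PREFIX
# Writes OUTPUT_PREFIX.0001.fasta, OUTPUT_PREFIX.0002.fasta, ...
split_fasta() {
    awk -v per_chunk="$2" -v prefix="$3" '
        /^>/ {
            # Start a new chunk file every per_chunk records.
            if (count % per_chunk == 0) {
                if (out != "") close(out)
                chunk++
                out = sprintf("%s.%04d.fasta", prefix, chunk)
            }
            count++
        }
        # Lines before the first header are skipped because no file is open yet.
        out != "" { print > out }
    ' "$1"
}
```

    Each chunk could then be passed to the same gmap command shown above, one at a time, while watching memory use, to see whether the growth is tied to specific chunks or happens regardless of input.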

    Thank you
    Abhijit
