Seqanswers Leaderboard Ad

**schelhorn** · 09-26-2011, 11:15 PM

Umm, I might be wrong since I have't run your configuration, but shouldn't redirecting the stdout output into a file be all you need? SAM is a line-based ASCII format, probably that's exactly the running output you're seeing. If that is the case just append

Code:

> mysamfile.sam

at the end of the command and you should be good. Btw, having a bioinformatician in the department helps, too

**twu** · 10-03-2011, 01:04 PM

Hi,

I am the developer of GSNAP. The program as you have run it will send the output to stdout (which is the stream of alignments you saw). You should therefore send the output into a file, as suggested above, using the "> outputfile" convention for Unix.

Alternatively, there is a flag called --split-output that will send output to individual files, according to how they aligned (i.e., concordant unique, halfmapping multiple, and so on).

Regards,

Thomas Wu

**gpcr** · 12-15-2011, 06:02 PM

gsnap runn error?

Can any one solve this error:

Code:

gsnap --split-output bcsf -A sam -B 4 -t 8 -k 15 --read-group-id=SN388 --read-group-name=bcsf --input-buffer-size=3000 -D /data/gmap/hg19 -d hg19 -N 1 -s /data/gmap/hg19/hg19.maps/hg19.refGene.splicesitesfile.iit /seq/L005.ft.R1_1.fastq /seq/L005.ft.R2_2.fastq

Error message:

Novel splicing (-N) and known splicing (-s) both turned on => assume reads are RNA-Seq
Allocating memory for compressed genome...done (1,160,885,244 bytes, 1.79 sec)
Looking for index files in directory /data/gmap/hg19
Gammaptrs file is hg19.ref12153gammaptrs
Offsetscomp file is hg19.ref12153offsetscomp
Positions file is hg19.ref12153positions
Allocating memory for ref gammaptrs, kmer 15, interval 3...done (67,108,868 bytes, 0.11 sec)
Allocating memory for ref offsets, kmer 15, interval 3...done (349,277,996 bytes, 0.53 sec)
Allocating memory for ref positions, kmer 15, interval 3...done (3,815,118,780 bytes, 5.79 sec)
Reading splicing file /data/gmap/hg19/hg19.maps/hg19.refGene.splicesitesfile.iit locally...found donor and acceptor tags, so treating as splicesites file
splice distances present...390347 unique splicesites...390347 valid splicesites...splicetrie_obs has 644549 entries...splicetrie_max has 48518627 entries...done
GMAP modes: pairsearch, terminal, improvement
Starting alignment
Illegal instruction

**twu** · 12-15-2011, 08:25 PM

Illegal instruction in GSNAP

The illegal instruction error is occurring because the environment where you built GSNAP is different from the environment where you are running it. When you build GSNAP, the configure command sees if it can compile and run the program with the flag "-mpopcnt", which uses a machine-level instruction to count the number of bits in a word and results in a minor increase in speed. However, if you build on a machine where -mpopcnt works and then you move the program to a machine where the built-in popcount instruction does not exist, the machine will complain about an illegal instruction.

The solution is to rebuild GSNAP, this time running configure with the flag --disable-popcnt, which will not compile the programs with the "-mpopcnt" flag, even if it works in your build environment.

If you have any further questions or problems, please feel free to send email directly to me at [email protected].

Regards,

Thomas Wu

**gpcr** · 12-16-2011, 08:09 AM

@Twu; I recompiled gmap in the current computer its working fine so far...Thanks!!!

**ParthavJailwala** · 03-19-2012, 09:13 AM

Hi Thomas (@twu),

We have been trying to use GSNAP for alignment, followed by cufflinks for downstream analysis on RNA-seq data. When we look at the SAM file produced by GSNAP (alignment done with splicing on), the 'XS:A: ' tag should have either 'XS:A:+' or 'XS:A:-' to represent strand information for splicing reads; however, we also notice numerous alignment entries to have an 'XS:A:?'. These entries are not recognized by cufflinks as it expects either a '+' or a '-' in the XS:A tag. Any clues on what we should do to make the SAM file compatible with cufflinks? Thanks

**gpcr** · 03-19-2012, 10:47 AM

I think new version might have rectified 'XS:A:?'.
check here for the updated version

GMAP and GSNAP

http://research-pub.gene.com/gmap/

"Eliminated reporting of XS:A:? in cases of long deletions being considered incorrectly as introns "

**ParthavJailwala** · 03-19-2012, 12:39 PM

@gpcr,
Thanks for that pointer....it turns out we had the path to the older version in the .bash_profile.
We will run GSNAP using the latest version to see if the XS tag errors are resolved.

Thanks

**twu** · 03-20-2012, 02:38 PM

Also, I am about to release a new version later today (version 2012-03-20). I believe this new version should completely eliminate all occurrences of "XS:A:?".

**ParthavJailwala** · 03-20-2012, 06:52 PM

Originally posted by twu View Post

Also, I am about to release a new version later today (version 2012-03-20). I believe this new version should completely eliminate all occurrences of "XS:A:?".

Hi Thomas,

Thanks for your response. I checked the GSNAP webpage, and yes, there is a new version indeed (2012-03-20). We will use this version and see if the 'XS:A:?' related errors are resolved.

Thanks

**ParthavJailwala** · 03-22-2012, 11:36 AM

Parse error in SAM file from GSNAP

I used the latest version of GSNAP with default flags to produce a SAM file. When trying to convert SAM to BAM for input to cufflinks, I get this error:

'Parse error in line 19446415: missing colon in auxiliary data'
Aborted

Then, when I look at the read at that line number, I get this:
>$sed -n 19446415p MC01.sam

HTML Code:

3PO5HQ1:154:D0FV2ACXX:2:1104:9525:98311	387	15	82664506	1	100M1S	=	82664506	0	CTTAGATCTTATTCATCAGCCTGCTGAACAGTTCCTTTTTCAGAGACATAGATACCATCCAAAAATTTCCTGATATCCTTGTTTTTAACTGTTGTGGCTTT	CCCFFFFFHHHHHJJJJJJJJJJJJJJIJJJIIIJJJJJJIJIIJJJJJJJJJJJJJIJIJJIGIJJJHIJJJJJJEHIHHHHHHFFFFFEEEEEDDDDD3MD:Z:100	NH:i:4	HI:i:3	NM:i:0	SM:i:1	XQ:i:40	X2:i:40

What is wrong with this read entry in the SAM file that is throwing the error above?

Thanks

**ppoudel** · 01-09-2013, 01:43 AM

hi,

what does -n and -Q option mean in GSNAP, does -n mean the number of places that a read maps to a genome? I would like to write only the uniquely mapped reads to the output file.

**ParthavJailwala** · 01-09-2013, 06:30 AM

Originally posted by ppoudel View Post

hi,

what does -n and -Q option mean in GSNAP, does -n mean the number of places that a read maps to a genome? I would like to write only the uniquely mapped reads to the output file.

Hi ppoudel,

As per the GSNAP README, there is a -n flag that the user can specify for the number of hits that GSNAP should show. I did not find the -Q option, but I guess you imply the "--quiet-if-excessive" which is linked to the -n option, as per this paragraph from the README:
Hope that helps.

Thanks
Parthav

------------------------------------------------------------------------------------
After the query line, each of the genomic hits is shown, up to the
'-n' parameter. If too many hits were found (more than the '-n'
parameter), the behavior depends on whether the "--quiet-if-excessive"
flag is given to GSNAP. If not, then the first n hits will be printed
and the rest will not be printed. If the "--quiet-if-excessive" flag
is given to GSNAP, then no hits will be printed if the number exceeds
n.

Each of the genomic hits contains one or more alignment segments,
which is a region of continuous one-to-one correspondence (matches or
mismatches) between the query and the genome. Multiple segments occur
when the alignment contains an insertion, deletion, or splice. The
first segment is marked by a space (" ") at the beginning of the line,
while the second and following segments are marked by a comma (",") at
the beginning of the line. (In the current implementation of GSNAP
that allows only a single indel or splice, the number of segments is
at most two.)
---------------------------------------------------------------------

**ParthavJailwala** · 01-09-2013, 06:31 AM

Re: GSNAP output

Originally posted by ppoudel View Post

hi,

what does -n and -Q option mean in GSNAP, does -n mean the number of places that a read maps to a genome? I would like to write only the uniquely mapped reads to the output file.

Hi ppoudel,

As per the GSNAP README, there is a -n flag that the user can specify for the number of hits that GSNAP should show. I did not find the -Q option, but I guess you imply the "--quiet-if-excessive" which is linked to the -n option, as per this paragraph from the README:
Hope that helps.

Thanks
Parthav

------------------------------------------------------------------------------------
After the query line, each of the genomic hits is shown, up to the
'-n' parameter. If too many hits were found (more than the '-n'
parameter), the behavior depends on whether the "--quiet-if-excessive"
flag is given to GSNAP. If not, then the first n hits will be printed
and the rest will not be printed. If the "--quiet-if-excessive" flag
is given to GSNAP, then no hits will be printed if the number exceeds
n.

Each of the genomic hits contains one or more alignment segments,
which is a region of continuous one-to-one correspondence (matches or
mismatches) between the query and the genome. Multiple segments occur
when the alignment contains an insertion, deletion, or splice. The
first segment is marked by a space (" ") at the beginning of the line,
while the second and following segments are marked by a comma (",") at
the beginning of the line. (In the current implementation of GSNAP
that allows only a single indel or splice, the number of segments is
at most two.)
---------------------------------------------------------------------

Topics	Statistics	Last Post
Expanding the Horizons of Cellular Research with the Single Cell Atlas by seqadmin Started by seqadmin, Today, 11:49 AM	0 responses 13 views 0 likes	Last Post by seqadmin Today, 11:49 AM
Genetic Variants and Diabetes Risk in Childhood Cancer Survivors by seqadmin Started by seqadmin, Yesterday, 08:47 AM	0 responses 16 views 0 likes	Last Post by seqadmin Yesterday, 08:47 AM
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 61 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 60 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM

Seqanswers Leaderboard Ad

Announcement

GSNAP Output

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News