**dpryan** · 12-11-2013, 07:59 AM

There's nothing really to interpret. That's just the name the sequencer gave to the read (it has nothing to do with alignment). If you really must know, usually the first part is the machine ID and the last part denotes which lane was used and where on the flow-cell the read was seen. That's the case for data from Illumina machines at least.

Late addition: BTW, the Wikipedia page on FASTQ happens to mention Illumina read name formats.
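As a rough sketch of what those parts are: assuming the colon-separated naming used by newer Illumina software (instrument, run number, flowcell ID, lane, tile, x, y, which is the layout described on that Wikipedia page), a read name can be split like this. The class and function names are made up for illustration, and older Illumina name formats will not fit this layout.

```python
from typing import NamedTuple


class IlluminaReadName(NamedTuple):
    """Fields of a colon-separated Illumina read name (assumed 7-field layout)."""
    instrument: str
    run_number: int
    flowcell_id: str
    lane: int
    tile: int
    x: int
    y: int


def parse_illumina_qname(qname: str) -> IlluminaReadName:
    """Split a QNAME into its parts; raise if it does not have 7 fields."""
    parts = qname.split(":")
    if len(parts) != 7:
        raise ValueError(f"expected 7 colon-separated fields, got {len(parts)}")
    instrument, run, flowcell, lane, tile, x, y = parts
    return IlluminaReadName(
        instrument, int(run), flowcell, int(lane), int(tile), int(x), int(y)
    )


# Read name taken from the SAM lines posted later in this thread.
name = parse_illumina_qname("M00830:112:000000000-A6EGB:1:1101:16729:1705")
print(name)
```

None of these fields says anything about where the read aligned; that information is in the other SAM columns.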

**GenoMax** · 12-11-2013, 08:04 AM

Here are a couple of pages with SAM format details.

SAM - Genome Analysis Wiki

http://genome.sph.umich.edu/wiki/SAM

Dave's Wiki | SAM

http://davetang.org/wiki/tiki-index.php?page=SAM

What you have posted above looks like an Illumina sequence ID.

Edit: Did not see Devon's message when I posted this. See the "Illumina sequence identifiers" section for details: http://en.wikipedia.org/wiki/FASTQ_format

**KnowNothing2** · 12-11-2013, 09:12 AM

So then why aren't there any gene-specific identifiers in this genome alignment, when all the other SAM examples I see output from Bowtie 2 have these identifiers?

**dpryan** · 12-11-2013, 09:18 AM

You need to post the whole line for us to know what you're talking about, not just the QNAME field.

**KnowNothing2** · 12-11-2013, 09:28 AM

Originally posted by dpryan:

You need to post the whole line for us to know what you're talking about, not just the QNAME field.

Here are my first 3 reads.

M00830:112:000000000-A6EGB:1:1101:16729:1705 16 chr2 156079200 22 21S29M * 0 0 AGACGTGTGCTCTTCCGATCTACACAGGGCTTGAGCAGTTGCGAACACGT B/B/B0B1A0000000AA212D110BAA1113BBA1FA11>1DFC1A1>1 AS:i:53 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:2T26 YT:Z:UU

M00830:112:000000000-A6EGB:1:1101:18463:1733 0 chr17 39846570 36 35M15S * 0 0 TGCGTGCATTTATCAGATCAAAACCAACCCGGTGAAATCGGAAGCGCCCA AAAAA1>1BFFBEG331BB1111A00000000A001AAB///////A/A/ AS:i:70 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:35 YT:Z:UU

M00830:112:000000000-A6EGB:1:1101:16633:1749 4 * 0 0 * * 0 0 CGTGCATTCATCAGATCAAAACCGACCCGGTGAGATCGGAAGAGCACACT >AAAA1BDFBFFBBBGC11111A00A0A00A0/01DB/////00B1B0A0 YT:Z:UU

**dpryan** · 12-11-2013, 09:32 AM

Right, so the first two reads map and the third doesn't. The original read in question isn't included among those you listed.
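For reference, the mapped/unmapped call comes from the FLAG column (the second field), not from the read name. A minimal sketch, assuming whitespace-separated fields as in the lines pasted above (real SAM is tab-separated) and checking only two of the FLAG bits defined in the SAM specification; the function name is just for illustration:

```python
FLAG_UNMAPPED = 0x4   # read did not align
FLAG_REVERSE = 0x10   # read aligned to the reverse strand


def describe_alignment(sam_line: str) -> str:
    """Summarise where a SAM record aligned, using FLAG, RNAME and POS."""
    fields = sam_line.split()
    flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    strand = "-" if flag & FLAG_REVERSE else "+"
    return f"{rname}:{pos} ({strand})"


# First six columns of the three reads posted above.
reads = [
    "M00830:112:000000000-A6EGB:1:1101:16729:1705 16 chr2 156079200 22 21S29M",
    "M00830:112:000000000-A6EGB:1:1101:18463:1733 0 chr17 39846570 36 35M15S",
    "M00830:112:000000000-A6EGB:1:1101:16633:1749 4 * 0 0 *",
]
summaries = [describe_alignment(r) for r in reads]
for s in summaries:
    print(s)
```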

**KnowNothing2** · 12-11-2013, 09:37 AM

Originally posted by dpryan:

Right, so the first two reads map and the third doesn't. The original read in question isn't included among those you listed.

Right, so now how would I go about determining which genes the first two correspond to? Theoretically, these should all be exonic.

**dpryan** · 12-11-2013, 09:46 AM

Why don't you tell us what your biological goal is? I'm guessing that this is RNA-seq data and you eventually want counts per gene for downstream statistics. In that case, just use htseq-count or featureCounts (from the Subread package). Actually, htseq-count will even annotate the individual reads for you if you really want (normally you'd only do that to debug a problem).
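To give an idea of what those tools do under the hood, here is a toy sketch of assigning an alignment to a feature by coordinate overlap. It is not how htseq-count or featureCounts are implemented, and the feature names and coordinates in `TOY_FEATURES` are invented purely so the example runs; a real run would take the features from a GTF/GFF annotation for the same genome build. The reference span is computed from POS and the CIGAR string, counting only the operations that consume reference bases.

```python
import re

# CIGAR operations that advance along the reference (per the SAM spec).
REF_CONSUMING = set("MDN=X")


def reference_span(pos: int, cigar: str) -> tuple[int, int]:
    """Return the 1-based inclusive reference interval covered by an alignment."""
    length = sum(
        int(n)
        for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return pos, pos + length - 1


def overlapping_features(chrom, start, end, features):
    """Names of all features on `chrom` that overlap [start, end]."""
    return [
        name
        for f_chrom, f_start, f_end, name in features
        if f_chrom == chrom and f_start <= end and start <= f_end
    ]


# HYPOTHETICAL annotation: made-up intervals and names, not real genes.
TOY_FEATURES = [
    ("chr2", 156079000, 156079500, "toy_gene_A"),
    ("chr17", 39846000, 39846590, "toy_gene_B"),
]

# First read posted above: chr2, POS 156079200, CIGAR 21S29M.
start, end = reference_span(156079200, "21S29M")
hits = overlapping_features("chr2", start, end, TOY_FEATURES)
print(start, end, hits)
```

The soft-clipped bases (21S) do not count toward the reference span, so only the 29 matched bases are used for the overlap test.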

**shi** · 12-11-2013, 01:42 PM

Just to add to Devon's post: featureCounts can also output detailed assignment results for each read when the -R option is specified, although this read-level output includes only the read names (the other fields in the SAM/BAM files are omitted).

**dpryan** · 12-11-2013, 03:21 PM

Originally posted by shi:

Just add to Devon's post: featureCounts can also output detailed assignment results for each read when -R option is specified, although it only includes read names in this read-level output (other fields in SAM/BAM files are omitted).

One of these days I really should fully familiarize myself with featureCounts

Aligned to UCSC genome using Bowtie2, how do I interpret QNAME in SAM file?
