Thanks to everyone who contributes to these basic threads. I know they may get tedious for the old-timers, but they're vital for newer folks. Even if the same info is technically available at reference sites, it's often easier to understand when the answer is to a specific, basic question, rather than organized as vendor documentation or generalized teaching material!
Unconfigured Ad
Collapse
X
-
This helped but not enough ! I think I get it but then questions pop up - this mates in a pair, paired read stuff pops up when I look a the specification for the SAM format output from the bowtie program. There are flags values that are returned to indicate if the aligned read is one of pair and/or first one in a pair and/or second one in a pair and so on..
But how does it, meaning bowtie, know ? Am I right in thinking that is specified in the input record ? I can see the .fastq input format has a place holder for this, so I'm guessing that other input formats eg. sra also have it ? And if they don't then there's no way for bowtie ( or other algorithms ) to derive it ?
Seeing as this is all equipment specific I'm going to need to look at some videos that describe the front parts of the this entire operation. Any ideas where ? I was hoping to just pick it up and work with it from the point of sequences of characters - an IT perspective - but guess not.
Comment
-
-
Hi,Originally posted by edilana.gomes View PostHi!!
Can anyone clarify the difference between mate pair, pair end and single end reads?
Thanks.
Mate-pairs and paired-end reads have been covered. Single end reads are just that... a single read from one end of each sheared DNA fragment.
Scott.
Comment
-
-
i could be wrong but i think u can get ur answer by decoding the name of the fastq read... i.e the string following the > symbol.Originally posted by raonyguimaraes View PostHello,
Suppose I have two reads from an exome in fastq. How to determine if they are pair-end or single-end or mate-pair ?
-A
Comment
-
-
Paired end or mate pair reads have to have a means of knowing which two reads go together. Usually, this is by the name. In Illumina at least, reads are normally named by their coordinates on the flowcell. So if you don't have two reads with the same coordiantes, you've got single end.Originally posted by raonyguimaraes View PostHello,
Suppose I have two reads from an exome in fastq. How to determine if they are pair-end or single-end or mate-pair ?
Paired ends run towards each other, and are about 100-500 bp apart. Mate pairs run away from each other, and tend to be a few kb apart, but I belive sometimes they are contaminated with ordinary paired end data.
Comment
-
-
I have some questions about aligning paired end sequencing reads. I am using BWA sampe function to align my paired end reads and it worked, but surprisingly almost all reads are being paired with reads on a different chromosomes, resulting in a lot of "improper reads". I don't understand why BWA did that and I wonder if it was because I used the command "bwa sampe -a 15000 -A" to force bwa to not run smith waterman alignment for unmapped reads.
Also, if paired end reads share the same x and y coordinates, which are indicated by the first line of their fastq files, why doesn't bwa just pair them up by their coordinates? That seems like the most straightforward way to find the right pair to me.
Comment
-
-
Hi all I am still struggling with using BWA to align my paired end reads. I used the command:
bwa sampe -P -s hg19.fasta CATTCG_1.sai CATTCG_3.sai CATTCG_1.fastq CATTCG_3.fastq > CATTCG_PE.sam
and the first few lines of the program running look like this:
The program seems to be running fine and quite quickly, but when I look at the output file, I see something like this:[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 10.96 sec
[bwa_sai2sam_pe_core] changing coordinates of 6 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.99 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (3520, 39961, 70863)
[infer_isize] low and high boundaries: 94 and 205549 for estimating avg and std
[infer_isize] inferred external isize from 27 pairs: 37726.370 +/- 35311.353
[infer_isize] skewness: 0.341; kurtosis: -1.395; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 251007 (6.04 sigma)
[bwa_sai2sam_pe_core] time elapses: 10.87 sec
[bwa_sai2sam_pe_core] changing coordinates of 178 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.97 sec
[bwa_sai2sam_pe_core] 524288 sequences have been processed.
It is really peculiar that most if not all read pairs have been mapped to different chromosomes. How is this possible? When I use samtools to filter out the correctly paired reads, I only obtained a very small file.DJB775P1_0215:5:1101:1262:2347#0 65 chr13 92966174 37 94M chr5 33346129 0 AATAACCACCTAGATAAATGTTCACTCATCTCGCCTGTCTAGCCTGTCTTGAGGCCGGTTTCATCATGAGTCACTCCACCAATTACTTCAAAAC cgggeghhhhfhhfffgffegbcfgffdhfffhhhdb^efgfddfhhhhffS\eefgg\W`c]RZ^__GU]UMMZ_Z]\_a^_TR_bbbbYYYW XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1495:2155#0/3 129 chr5 33346129 37 94M chr13 92966174 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceggggiiiiiiiiiiiiihiiiiihiiihihifgffhiig`fgfghhfhghffhhihigfgfgeeceacabbcccb`b_bcccccc_X[`b^Y XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1465:2351#0 81 chr14 44601925 37 94M chr5 33346129 0 TCTCCTACCTCCTCTCCCTTATAGAAATCCCTGTGATTCCATTAGTCTCACCTGGATAAACCAGAGTATTCTTATTATCCCAAGATCCCCATCT XTR^YGG\^ZZZRbb]]ccc_db]d^dgfcb`bZZe_bSfc`fgf_fc^Iffhhhhfhfagddhfhhhhfffgeefddfedbbd_agfcgebe\ XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:38A55
DJB775P1_0215:5:1101:1483:2169#0/3 161 chr5 33346129 37 94M chr14 44601925 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceYbgae_egihffhhd_^efhihfhhfXcghfcgacgfI^[cba\eecgZ_HWWHLaZ`VVVb`gacccZb_RZ]`]bbcbbbb^bc^`X][S XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1317:2430#0 81 chr9 8239023 37 94M chr12 51288201 0 TTAAGTATTAAATGACATAAAACCTATAAAGCACATAGCAGGTAAATGTGGTAAACTCTTGATAAATGTTATTGTTATCATCATCATCATCACT b`]a^VHRcaggggbgeghgc\bhgefbZ\MW[gfgce^[gfff^^aa^OXeaYIae[hhhhgf^[d[hgfge_hhhgd_hY^bfdbcegecZc XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1461:2198#0/3 161 chr12 51288201 37 94M chr9 8239023 0 CTGTTGGCTGGAATGTAAAATGGTGCAGCTGCTGTGGAAAACTGCATGGCAGTTCCTAGAAAAATTAAAAATAGAATTACCATATGATCCAGCA egggggefdf`egg[bdgh`]gh^dbe`dfhhbffbgIIX^^e_fgffhabgH\\_\HM\d]dUGV\\VV_ZVVHHUZ__bbc]`BBBBBBBBB XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0
Can someone tell me why my reads are paired so weirdly? Any help is greatly appreciated!
Comment
-
-
If the reads have wildly different names, they aren't supposed to be paired with eaach other.
bwa assumes that the first read of the first fq goes with the first read of the second fq, and so on. That doesn't appear to be the case here, that's why your "pairs" are all over the place.
Comment
-
-
Hey,
I just wanna ask about paired-end data filtering. Do I need to filter read 1 and read 2 separately or combine read 1 and read 2 then filter? Because later on I want to use the filter data for RNA-seq analysis, using Tophat and cufflink. But, the Tophat require the read 1 and read 2 as input not as paired-end.
Comment
-
Latest Articles
Collapse
-
by SEQadmin2
Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.
The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
...-
Channel: Articles
06-02-2026, 10:05 AM -
-
by SEQadmin2
With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.
Introduction
Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...-
Channel: Articles
05-22-2026, 06:42 AM -
ad_right_rmr
Collapse
News
Collapse
| Topics | Statistics | Last Post | ||
|---|---|---|---|---|
|
Sequencing the Two-Toed Sloth Genome Reveals Jumping Genes Tied to Its Extreme Metabolism
by SEQadmin2
Started by SEQadmin2, 06-09-2026, 11:58 AM
|
0 responses
15 views
0 reactions
|
Last Post
by SEQadmin2
06-09-2026, 11:58 AM
|
||
|
Started by SEQadmin2, 06-05-2026, 10:09 AM
|
0 responses
26 views
0 reactions
|
Last Post
by SEQadmin2
06-05-2026, 10:09 AM
|
||
|
Started by SEQadmin2, 06-04-2026, 08:59 AM
|
0 responses
37 views
0 reactions
|
Last Post
by SEQadmin2
06-04-2026, 08:59 AM
|
||
|
Started by SEQadmin2, 06-02-2026, 12:03 PM
|
0 responses
61 views
0 reactions
|
Last Post
by SEQadmin2
06-02-2026, 12:03 PM
|

Comment