08-21-2020, 06:59 AM   #1
statsteam
Member
 
Location: Californica

Join Date: Sep 2009
Posts: 19
Paired end seq read lengths are different

Hi all,

I recently acquired a dataset from GEO (HiSeq 2500, accession: GSE107029). It is a paired-end dataset, but read_1 is 94 bp while read_2 is 100 bp. I have never seen HiSeq 2500 paired-end data in which read_1 and read_2 have different lengths, so I am wondering if anyone can help me understand why they differ here.

Here are a few reads from read_1 and read_2. I downloaded the data using:
fastq-dump --split-files SRR6300667

Read_1 (SRR6300667_1.fastq)
@SRR6300667.1 DHCDZDN1:3:1101:1145:1177 length=94
CGGAATGCAGCAATCAATGTCGTCGGAAGATCCTGAATAAATCCTACTGTATCTGAAAGAAGAACACTGTAGCCGCTTGGCAGGACCATTTTTC
+SRR6300667.1 DHCDZDN1:3:1101:1145:1177 length=94
DFDHHH<EEFGGIHIIHCGFDGDF@GHFADFHGICFAEHGHECCAG@GG;EH>CCEA73?;B>@CCCCCCCCCCBBBB???C?BB@@?CCEEC#
@SRR6300667.2 DHCDZDN1:3:1101:1178:1247 length=94
GGCTCCCCCCTGCAAATGAGCCCCAGCCTTCTCCATGGTGGTGAAGACGCCAGTGGACTCCACGACGTACTCAGCGCCAGCATCGCCCCACTTG
+SRR6300667.2 DHCDZDN1:3:1101:1178:1247 length=94
FHHHHHJJJJJJJIIJJJJJIJJJJJIJJJJJJIJJJJGHHAEHIHIIHHFFCEEEEDDDDDDDDDDDABDDDDDDDDBDDDDDBDDDDDDDD@
@SRR6300667.3 DHCDZDN1:3:1101:1313:1046 length=94
TCCTTTAGCTGACCACTTCTTCAAGTAGGCCGGGGATACAAAATCCTTTTGCATGAGGAAAGCTGAAATTCCACACAGGTACCACAAGATATTA
+SRR6300667.3 DHCDZDN1:3:1101:1313:1046 length=94
EHHHHHEGBGGCHIJGHIFHIHIIIIIIJJJHIIJAHGFGIIJJCFGGGIIBCHHEHGFDEFFEEECCCEDCCCCBDDD:@CCACBBDCDDEED


Read_2 (SRR6300667_2.fastq)
@SRR6300667.1 DHCDZDN1:3:1101:1145:1177 length=100
CGATGACCAGAAAAATGGTCCTGCCAAGCGGCTACAGTGTTCTTCTTTCAGATACAGTAGGATTTATTCAGGATCTTCCGACGACATTGATTGCTGCATT
+SRR6300667.1 DHCDZDN1:3:1101:1145:1177 length=100
@<ADDDDHBHFFEGGGE<CFGHIIIIGCEGDHIGI@GGGCFGHIIIIIHCHAGGHIG@@D>DGHGCACAEEHDFFFFFEDA>B@;,5@3>ADC:A@CCC:
@SRR6300667.2 DHCDZDN1:3:1101:1178:1247 length=100
ATGTTCCAATATGATTCCACCCATGGCAAATTCCATGGCACCGTCAAGGCTGAGAACGGGAAGCTTGTCATCAATGGAAATCCCATCACCATCTTCCAGG
+SRR6300667.2 DHCDZDN1:3:1101:1178:1247 length=100
CCFFFFDHHHDADEHGGGJJJEECHGDFHGIIJCDGHIGIJJFGAHEHGGGHGBHGEHIIIGHFHEHDDDD@EACECEECDDCC>CACD<>CDCCDCCD9
@SRR6300667.3 DHCDZDN1:3:1101:1313:1046 length=100
AGCCATACAGGAGATGGGAAACCACGCTATGATACTTTCTGGAAACATTTTATATTTGTTATGATGGACATTTTGCTCGATTGGAGCATGCATAATATCT
+SRR6300667.3 DHCDZDN1:3:1101:1313:1046 length=100
BCFFFFFHHHHHIHIIIJGJFGHIJJIJJJJIIGGHIJIJJJFJIAHHIHHIJIIJJJJJGIJJIJJJIGIHHHHEHFFFEECECDA?CCDDDDCDDEEF
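
The reads above are only the first three from each file, so they do not show whether every read has the same length. A minimal sketch of how the lengths could be tallied across a whole file, assuming plain uncompressed FASTQ with exactly four lines per record (as fastq-dump writes it); the function name `read_length_counts` and the toy records are made up for illustration:

```python
from collections import Counter
import io


def read_length_counts(handle):
    """Tally sequence lengths in a FASTQ stream with four lines per record."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


# Toy in-memory records (not real data) to show the call.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
print(read_length_counts(example))
```

For the real files, the same function could be called on `open("SRR6300667_1.fastq")` and `open("SRR6300667_2.fastq")`. A single length per file would suggest that read_1 was trimmed by a fixed amount (or sequenced for fewer cycles) before upload, while a spread of lengths would point to adapter or quality trimming.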

I am quite confused.

Thank you,
Statsteam