SEQanswers

Old 11-12-2018, 09:52 PM   #1
Jayesh
Junior Member
 
Location: Pittsburgh

Join Date: Nov 2018
Posts: 2
Problem: "Paired-end" reads in scRNA-seq data (Drop-seq)

Drop-seq produces paired-end data:

Read 1: Cell barcode + UMI (unique molecular identifier)

Read 2: The transcript sequence
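
To make the Read 1 layout concrete, here is a minimal sketch of how a raw Read 1 sequence could be split into its two parts. It assumes the usual Drop-seq convention of a 12-base cell barcode followed by an 8-base UMI; those lengths are an assumption and are not confirmed for this particular sample. The names `split_read1` and `Read1Fields` are made up for illustration.

```python
from typing import NamedTuple

# Assumed layout (common Drop-seq convention, not verified for this sample):
# bases 1-12 are the cell barcode, bases 13-20 are the UMI.
CELL_BARCODE_LEN = 12
UMI_LEN = 8


class Read1Fields(NamedTuple):
    cell_barcode: str
    umi: str


def split_read1(seq: str) -> Read1Fields:
    """Split a Read 1 sequence into cell barcode and UMI."""
    needed = CELL_BARCODE_LEN + UMI_LEN
    if len(seq) < needed:
        raise ValueError(f"Read 1 has {len(seq)} bases, expected at least {needed}")
    # Any bases beyond the barcode and UMI are ignored.
    return Read1Fields(seq[:CELL_BARCODE_LEN], seq[CELL_BARCODE_LEN:needed])


# Hypothetical 20-base Read 1, not taken from the sample in this thread.
fields = split_read1("ACGTACGTACGTTTTTGGGG")
print(fields.cell_barcode, fields.umi)
```

Once Read 1 has been reduced to these two fields and attached to Read 2, only one sequence per spot needs to be stored.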

But I have a question about the sample I am working on.

The sample I am using is the following:

https://trace.ncbi.nlm.nih.gov/Trace...run=SRR6261587
(Check the "Reads" tab)

Since Drop-seq is paired-end, I expect to see two reads per spot. Although this sample is labeled paired-end, it has only one read per spot.

For comparison, here is a different scRNA-seq data set where you can see two reads per spot:

Example sample:

https://trace.ncbi.nlm.nih.gov/Trace...run=SRR8086553 (Check the "Reads" tab)

Where am I going wrong?


I asked one of the main authors of the paper and got the following reply:

"I recommend that you download the aligned BAM files that are hosted in the same GEO record. Read 1 is already processed into the cell and UMI barcodes and held as custom tags (XC and XM) in the BAM files. The cells are already barcode-corrected, so if you use those files, your cell barcodes will line up with mine; if you start from FASTQs, they will not. For most aligners, you can just use the BAM file as input to realign. (It has all reads, even those that did not align.)"

But I could not find any "XM" or "XC" tags in the BAM file.
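
One way to check this systematically is to look at the optional fields of each alignment record (everything from the 12th tab-separated column onward in SAM text). The sketch below assumes the records are available as SAM text lines, for example from `samtools view`; the function names and the two example records are hypothetical and only mirror the tag layout seen in this thread.

```python
def sam_tags(record: str) -> dict:
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as {TAG: VALUE}."""
    fields = record.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional tags start at column 12.
    for item in fields[11:]:
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return tags


def has_dropseq_tags(record: str) -> bool:
    """True if the record carries both the XC (cell) and XM (UMI) tags."""
    tags = sam_tags(record)
    return "XC" in tags and "XM" in tags


# Made-up records with a shortened sequence, for illustration only.
mandatory = ["7", "0", "1", "110", "1", "4M", "*", "0", "0", "GCTG", "IIII"]
plain = "\t".join(mandatory + ["RG:Z:A", "NH:i:1", "NM:i:0"])
tagged = "\t".join(mandatory + ["XC:Z:ACGTACGTACGT", "XM:Z:TTTTGGGG", "NM:i:0"])

print(has_dropseq_tags(plain))   # only RG, NH and NM are present
print(has_dropseq_tags(tagged))  # XC and XM are present
```

If every record in a file returns `False`, the file does not contain the barcode tags the author describes, which would suggest it is not the same BAM file he is referring to.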

To understand his reply you have to be familiar with the Drop-seq processing steps: https://github.com/broadinstitute/Dr...1.2Jan2016.pdf

It looks like they submitted some kind of processed data. I am trying to start from the data at different processing steps, but I could not figure out how far the data has already been processed.
Old 11-13-2018, 05:17 AM   #2
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 499

This type of processing (parsing the barcode and UMI) is standard for scRNA-Seq data. Would you post the header and first 10 lines of the BAM file? That would help us to troubleshoot your problem.
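
Because the header of a BAM file aligned to a genome with many scaffolds can run to thousands of `@SQ` lines, taking the first 10 lines of the whole file returns header only. A small sketch that keeps the full header and then the first few alignment records, assuming the input is SAM text such as the output of `samtools view -h`; the function name `header_and_head` is made up for illustration.

```python
from typing import Iterable, Iterator


def header_and_head(lines: Iterable[str], n_records: int = 10) -> Iterator[str]:
    """Yield every header line, then the first n_records alignment records."""
    seen = 0
    for line in lines:
        if line.startswith("@"):
            # Header lines (@HD, @SQ, @RG, @PG, @CO) come before any record.
            yield line
        else:
            if seen >= n_records:
                break
            yield line
            seen += 1
```

It can be fed `sys.stdin` when the SAM text is piped in, and the result written out line by line.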

Last edited by HESmith; 11-13-2018 at 05:21 AM.
Old 11-13-2018, 07:11 AM   #3
Jayesh
Junior Member
 
Location: Pittsburgh

Join Date: Nov 2018
Posts: 2
Header part of the SAM file

The first 10 lines:
@HD VN:1.4 SO:coordinate
@SQ SN:1 LN:58871917 M5:4ec834d5c957b0204ffb37ac619ac286 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:10 LN:45574255 M5:07b063dca6221fc12a0c7af99a693a0a UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:11 LN:45107271 M5:34028488116d0ce140a9651e56b3361f UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:12 LN:49229541 M5:b93ef3975b8f9e3e291bf14fa725a87b UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:13 LN:51780250 M5:16c8bde090ec09d34d473ee462e266f8 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:14 LN:51944548 M5:d3684e66d05aeeddfef5a365ed1d44ff UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:15 LN:47771147 M5:20a0e5e9ea8953e48ce8e93117a406a4 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:16 LN:55381981 M5:85d5826023b6bde850fea5e42b0d22b5 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:17 LN:53345113 M5:128b86b035cfaa4c62018d7bc2978024 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta


Further down the header (this suggests that the reads have already been aligned to the reference genome with Bowtie2):

@SQ SN:KN150247.1 LN:728 M5:35c483ee725789db7c67a0acd4ec7cb7 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:KN150525.1 LN:650 M5:66c855181f1c9a7e22d6061d7c43964b UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:mCherry LN:1198 M5:05f1786feb0995593cbbb0f2e0822bfb UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@SQ SN:GFP LN:1699 M5:6d8fabfb60ba8de2f53ced3d571125a3 UR:file:/ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82spike.fasta
@RG ID:A SM:ZF6S-DS5b_S3
@RG ID:A-2F1E5D7C SM:ZF6S-DS5b_S3
@PG ID:bowtie2 PN:bowtie2 VN:2.2.1 CL:"/broad/software/free/Linux/redhat_6_x86_64/pkgs/bowtie2_2.2.1/bowtie2-align-s --wrapper basic-0 --phred33 --reorder -p 8 -x /ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82 -S /broad/hptmp/jfarrell_dropseq/ZF6S-DS5b_S3//ZF6S-DS5b_S3.5.aligned.sam -U /broad/hptmp/jfarrell_dropseq/ZF6S-DS5b_S3//ZF6S-DS5b_S3.4.aligner.fastq"
@PG ID:bowtie2-7C029FB6 PN:bowtie2 VN:2.2.1 CL:"/broad/software/free/Linux/redhat_6_x86_64/pkgs/bowtie2_2.2.1/bowtie2-align-s --wrapper basic-0 --phred33 --reorder -p 8 -x /ahg/regevdata/projects/vertebrate_sc/dstools/metadata/dr82/dr82 -S /broad/hptmp/jfarrell_dropseq/ZF6S-DS5b_S3//ZF6S-DS5b_S3.5.aligned.sam -U /broad/hptmp/jfarrell_dropseq/ZF6S-DS5b_S3//ZF6S-DS5b_S3.4.aligner.fastq"
1 16 1 85 11 35M1D27M * 0 0 ACAACATACGACCTCTAAAAAAGGTGCTGTAACATTACCTATATGCAGCACCACTATATGAG E/EE/EEEAEE/EA/<EEEEEEA/<</EE/</AAEEEEEEEEE/E6E6/E///EEEEAAAAA RG:Z:A NH:i:1 NM:i:1
2 16 1 85 15 35M1D27M * 0 0 ACAACATACGACCTCTAAAAAAGGTGCTGTAACATTACCTATATGCAGCACCACTATATGAG A/EE/EE/EEEE/E/EA66E/AA6AA</EEEEEE/EE//E/AA/EEEEAEE/E/EEE//AAA RG:Z:A NH:i:1 NM:i:1
3 16 1 99 6 62M * 0 0 CTAAAAAAGGTGCTGTAACATGTACCTATATGCAGCACCACTATATGAGAGCAGCATAGCAG EEE6A//E<</EA/E/EAEEAEAEAEEEE/<EEEEEEEAEE/EAEEEEEEEEEEEEAAA/AA RG:Z:A NH:i:1 NM:i:1
4 16 1 99 6 62M * 0 0 CTAAAAAAGGTGCTGTAACATGTACCTATATGCAGCACCACTATATGAGAGCAGCATAGCAG 6/EEEE/EEEA/AAEEEEEEEEEEEEEEEEEAEEAEEEEEEEEEEEEEEEEEEEEEEAAAAA RG:Z:A NH:i:1 NM:i:1
5 16 1 99 6 62M * 0 0 CTAAAAAAGGTGCTGTAACATGTACCTATATGCAGCACCACTATATGAGAGCAGCATAGCAG <A/EEEEA/EEEEEE/EEAEEEEE/AEEEEEEEEEEEAEAEEEAEEEEEEEEEEEEEAAAAA RG:Z:A NH:i:1 NM:i:1
6 0 1 108 1 55M * 0 0 GTGCTGTAACATGTACCTATATGCAGCACCACTATATGAGAGCGGCATAGCAGTG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/AEEAEAEEAEE RG:Z:A-2F1E5D7C NH:i:1 NM:i:0
7 0 1 110 1 62M * 0 0 GCTGTAACATGTACCTATATGCAGCACCACTATATGAGAGCGGCATAGCAGTGTTTAGTCAC AAAAAEEEEE/EEEEEEEAEEEEEEEEEEEEEEEEEEE/EE/EEEEEAEAAEEEAEEAEEA< RG:Z:A NH:i:1 NM:i:0
8 16 1 185 35 62M * 0 0 TTATATTAACTTGAAAGTGTGTTTTAGCTATTGAGTTTAAACAAAGGGAGCGGTTTACATTG AEEEEEEEAAEAEEEEEEEEEE<EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA RG:Z:A NH:i:1 NM:i:0
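
The second column of each record above (the FLAG field) is either 0 or 16. A minimal sketch for decoding the relevant bits, using the bit values defined in the SAM specification; the function name `describe_flag` is made up for illustration.

```python
# Bit values from the SAM specification.
FLAG_PAIRED = 0x1          # template has multiple segments (paired-end)
FLAG_REVERSE = 0x10        # read is reverse-complemented
FLAG_FIRST_IN_PAIR = 0x40
FLAG_SECOND_IN_PAIR = 0x80


def describe_flag(flag: int) -> dict:
    """Decode the FLAG bits that matter for the paired-end question."""
    return {
        "paired": bool(flag & FLAG_PAIRED),
        "reverse_strand": bool(flag & FLAG_REVERSE),
        "first_in_pair": bool(flag & FLAG_FIRST_IN_PAIR),
        "second_in_pair": bool(flag & FLAG_SECOND_IN_PAIR),
    }


# The two FLAG values that occur in the records shown above.
for flag in (0, 16):
    print(flag, describe_flag(flag))
```

Neither 0 nor 16 has the paired bit set, so these records are stored as single-end alignments. That would be consistent with the `-U` (unpaired input) option in the Bowtie2 command line recorded in the `@PG` header, and with Read 1 having been removed before alignment.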