Unconfigured Ad

**Chipper** · 04-24-2015, 06:22 AM

My guess is you didn't trim your reads and are reading into the adaptor. Try bwa mem if you have 100 or 125 bp reads.

**priya** · 04-24-2015, 06:31 AM

Originally posted by Chipper View Post

My guess is you didn't trim your reads and are reading into the adaptor. Try bwa mem if you have 100 or 125 bp reads.

I checked through FASTQC and there is nothing poping in "overrepresented sequences"
Is there any other ways to check if there is any problems with adapter contamination..

The read length is 43 bp..

**Chipper** · 04-24-2015, 06:41 AM

Adapter dimers should be found by FASTQC so the only things I can think of then is if you have N:s in the reads or very low quality ends.

**priya** · 04-24-2015, 07:21 AM

Originally posted by Chipper View Post

Adapter dimers should be found by FASTQC so the only things I can think of then is if you have N:s in the reads or very low quality ends.

Thank you for suggestions. I tried to look into those unaligned reads. For that, while running bowtie I add an extra parameter --un to save the unaligned reads.

bowtie -t -p 1 -m 1 $index/mm10 -q -S sample.fastq --un unaligned.fq > sample.sam

If I go through those unaligned reads fastq file, it doesnt contain N:s..

For Example: If i just paste here only the second line (read sequence) from unaligned fastq file omitting the other 3 lines, they looks like below without N:s

ATAAAACTGTATTTTTTTGTGAAGAATCAACAACAAGTGGGAC
CCGGGCTTAGACAGCTCACATGAAAGGAAGGCCGTGCCACCTT
CACTCGTTGTGAATCTATCCACCAAGTCAGATTTGAAAAATGC
CAGTACAGACATAACTTATTAAAGCCTCTAGCAGGACAGCAAA
CACTAAACAGGAACTGCAAACACAAATATGTTTGGCACACAAA
TGTTTTAATTTTTTTTTTACAAATGTATCCATTATTATCGTTG
GGGTAAGTTTGGCGCCGTGAGTGAAGGGGGCTTTGTTGCGGAA
AATCTGTCTGTCCGTCTGTTCGTCTATCTGTCTGTCCGTCTGT
GTGTGTGTGATGGGTCAGGTGTGTGTGTGTGTGTGTGTGATGC
AGAACATATTAGATGAGTGAGTTACACTGAAAAACACATTCGT
CCCCATTAGTTCCTGTCAAGGCAGAAGCTACTCTTCCTGGGGT
ATTATGTCTTTGAGCAAGTTTATGTTTGAGTTAGTGAATTCAT
TTTCTAAATTTTCCACCTTTTTCAGTTTTCCTCGCCATATTTC
CATAATGTGATTTTGCCGTTGTTCTGTCTCTTTATTACATATC
GACCAATGTTTAGTTTCTTAGAAATCAGATGCATGAAATAACC
GCCGAACGACTCCTCTACCTCCTGCACCACTAACGCCCCCAAA

**Richard Finney** · 04-24-2015, 08:36 AM

Sanity check:
reverse complement of

AGAACATATTAGATGAGTGAGTTACACTGAAAAACACATTCGT
is
ACGAATGTGTTTTTCAGTGTAACTCACTCATCTAATATGTTCT

matches perfectly to mm10:chr6:103,649,175-103,649,217

via http://genome.ucsc.edu/cgi-bin/hgBlat

BLAT Search Results

ACTIONS QUERY SCORE START END QSIZE IDENTITY CHRO STRAND START END SPAN
---------------------------------------------------------------------------------------------------
browser details x10 43 1 43 43 100.0% 6 - 103649175 103649217 43

Other reads match mm10 also.

Verify that your reference sequence is good.
Check distribution of chromosomes in your output sam; are any missing?

**Chipper** · 04-24-2015, 11:20 AM

Reading the manual is also a good start:

-m <int>
Suppress all alignments for a particular read or pair if more than <int> reportable alignments exist for it. Reportable alignments are those that would be reported given the -n, -v, -l, -e, -k, -a, --best, and --strata options. Default: no limit. Bowtie is designed to be very fast for small -m but bowtie can become significantly slower for larger values of -m. If you would like to use Bowtie for larger values of -k, consider building an index with a denser suffix-array sample, i.e. specify a smaller -o/--offrate when invoking bowtie-build for the relevant index (see the Performance tuning section for details).

Why would you use -m 1 with 43 bp reads?

**priya** · 04-27-2015, 01:25 AM

Originally posted by Richard Finney View Post

Sanity check:
reverse complement of

AGAACATATTAGATGAGTGAGTTACACTGAAAAACACATTCGT
is
ACGAATGTGTTTTTCAGTGTAACTCACTCATCTAATATGTTCT

matches perfectly to mm10:chr6:103,649,175-103,649,217

via http://genome.ucsc.edu/cgi-bin/hgBlat

BLAT Search Results

ACTIONS QUERY SCORE START END QSIZE IDENTITY CHRO STRAND START END SPAN
---------------------------------------------------------------------------------------------------
browser details x10 43 1 43 43 100.0% 6 - 103649175 103649217 43

Other reads match mm10 also.

Verify that your reference sequence is good.
Check distribution of chromosomes in your output sam; are any missing?

Hi Richard,
Thank you ! I see that the most of reverse complement of reads matches to mm10. It looks little bit strange at this point?

Also I checked my SAM header, it looks like all the chromosomes are included
@SQ SN:chr1 LN:197195432
@SQ SN:chr2 LN:181748087
@SQ SN:chr3 LN:159599783
@SQ SN:chr4 LN:155630120
@SQ SN:chr5 LN:152537259
@SQ SN:chr6 LN:149517037
@SQ SN:chr7 LN:152524553
@SQ SN:chr8 LN:131738871
@SQ SN:chr9 LN:124076172
@SQ SN:chr10 LN:129993255
@SQ SN:chr11 LN:121843856
@SQ SN:chr12 LN:121257530
@SQ SN:chr13 LN:120284312
@SQ SN:chr14 LN:125194864
@SQ SN:chr15 LN:103494974
@SQ SN:chr16 LN:98319150
@SQ SN:chr17 LN:95272651
@SQ SN:chr18 LN:90772031
@SQ SN:chr19 LN:61342430
@SQ SN:chrX LN:166650296
@SQ SN:chrY LN:15902555
@SQ SN:chrM LN:16299

To cross-check: instead of using mm10 index built by me, I tried mm9 bowtie index available on Bowtie page , ftp://ftp.ccb.jhu.edu/pub/data/bowtie_indexes/

But it doesnt make any change in the alignment percentage

mm9- bowtie log
# reads processed: 16561198
# reads with at least one reported alignment: 3864950 (23.34%)
# reads that failed to align: 10834676 (65.42%)
# reads with alignments suppressed due to -m: 1861572 (11.24%)

**priya** · 04-27-2015, 01:30 AM

Originally posted by Chipper View Post

Reading the manual is also a good start:

-m <int>
Suppress all alignments for a particular read or pair if more than <int> reportable alignments exist for it. Reportable alignments are those that would be reported given the -n, -v, -l, -e, -k, -a, --best, and --strata options. Default: no limit. Bowtie is designed to be very fast for small -m but bowtie can become significantly slower for larger values of -m. If you would like to use Bowtie for larger values of -k, consider building an index with a denser suffix-array sample, i.e. specify a smaller -o/--offrate when invoking bowtie-build for the relevant index (see the Performance tuning section for details).

Why would you use -m 1 with 43 bp reads?

In order to obtain uniquely mapped reads , discarding the multiple hits.
I am not sure whether I can introduce more mismatches (default chosen was :2 )as i have short read sequences of 40-43 bp

**Chipper** · 04-27-2015, 02:04 AM

Originally posted by priya View Post

In order to obtain uniquely mapped reads , discarding the multiple hits.
I am not sure whether I can introduce more mismatches (default chosen was :2 )as i have short read sequences of 40-43 bp

Just use BWA and filter on mapping quality (e.g -q 20) to get uniquely mapped reads if you don't understand the bowtie options.
-m 1 will suppress even perfect matches if there are any other reported alignments within the tolerated error rate.

**SylvainL** · 04-27-2015, 02:45 AM

Can you try mapping the unaligned reads on human? Once I got up to 92% human contamination in a ChIPseq library... After checking, the technician didn't use gloves when she took an aliquot to sent to sequencing...

**priya** · 04-27-2015, 11:47 PM

Originally posted by SylvainL View Post

Can you try mapping the unaligned reads on human? Once I got up to 92% human contamination in a ChIPseq library... After checking, the technician didn't use gloves when she took an aliquot to sent to sequencing...

I cross-checked with human genome, but only 0.2 % of unaligned reads are aligned to human genome. I dont think its the contamination issue. I have done sanity check..i could not find huge contamination of other species in the sample..

**SylvainL** · 04-28-2015, 02:08 AM

Ok, can you re-run bowtie with --max overM_reads.fastq and --un Unaligned_reads.fastq just to be sure that the reads which do not map are really not mapping on Mouse? If I remeber well, if you only use --un, all the reads which are excluded from the mapping will be in this file (which may explain why you have reads perfectly mapping on mouse in this file, if they map more than once...)

**priya** · 05-06-2015, 05:53 AM

I found answer to my question.. changing the bowtie or bwa parameters improved my aligning percentages to 2-5% but still left with lot of unaligned reads..
This paper explains it,

303 See Other

http://www.nature.com/srep/2015/150302/srep08635/full/srep08635.html

I tried mapping my unaligned reads with Shrimp2 aligner with default settings, high percentage of unaligned reads got mapped.

**Richard Finney** · 05-06-2015, 11:13 AM

I'm not quite getting the paper ( http://www.nature.com/srep/2015/1503...srep08635.html )

When/where in the process are the reads getting modified so that they can't map?

Topics	Statistics	Last Post
Long-Read RNA Sequencing Uncovers a Hidden Layer of Immune Cell Regulation by SEQadmin2 Started by SEQadmin2, Yesterday, 12:03 PM	0 responses 19 views 0 reactions	Last Post by SEQadmin2 Yesterday, 12:03 PM
DNA Methylation Study Reveals How Epigenetic Changes Pass Between Generations by SEQadmin2 Started by SEQadmin2, Yesterday, 11:40 AM	0 responses 14 views 0 reactions	Last Post by SEQadmin2 Yesterday, 11:40 AM
MetaBeeAI Helps Scientists Process Research Literature Faster by SEQadmin2 Started by SEQadmin2, 05-28-2026, 11:40 AM	0 responses 29 views 0 reactions	Last Post by SEQadmin2 05-28-2026, 11:40 AM
Scientists Solve a 25-Year Mystery in RNA Interference by SEQadmin2 Started by SEQadmin2, 05-26-2026, 10:12 AM	0 responses 31 views 0 reactions	Last Post by SEQadmin2 05-26-2026, 10:12 AM

Unconfigured Ad

Issues with ChipSeq mapping

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News