#1
Member
Location: London, UK Join Date: Nov 2011
Posts: 12
Hi,
I'd appreciate some advice on processing some Illumina libraries. Initial FastQC runs showed the data was not great. I've used cutadapt to trim off adapters, and FastQC shows improvements in all libraries. One library remains a concern because it still has kmer and other issues (I've attached files for kmer content and per base sequence content for both the original and the processed data).

My question is simple: is this good enough? My next step is assembly with Velvet. Does this data need further processing before Velvet? If so, with what? I've considered trimming off the first 10 nt to remove the anomalous per_base_sequence_content trace, but that would do little for the persistent kmers.

If this were your data, what would you do before Velvet assembly?

Thanks,
mgg

For the record, my cutadapt commands are below.
PHP Code:

#2
Junior Member
Location: China Join Date: Aug 2012
Posts: 5
Yeah, I have the same question.
I have a graph very similar to your processed per base sequence content plot.

#3
Senior Member
Location: San Francisco, CA Join Date: Feb 2011
Posts: 286
Looks like you have some base pair bias issues going on from bases 1-10 in your reads. You should trim those off.
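For readers who want to see what "trim those off" means in practice, here is a minimal sketch in plain Python. It is only an illustration and not something posted in this thread: the function name `trim_fastq_records` and the parameter `n_trim` are made up for the example, and it assumes a simple FASTQ file with exactly four lines per record. A dedicated trimming tool would normally be used on real data.

```python
def trim_fastq_records(lines, n_trim=10):
    """Yield (header, sequence, plus, quality) tuples with the first
    n_trim bases and their quality scores removed.

    Assumes 4-line FASTQ records (hypothetical helper, illustration only).
    """
    it = iter(lines)
    # zip over the same iterator four times groups lines into records;
    # an incomplete trailing record is silently dropped.
    for header, seq, plus, qual in zip(it, it, it, it):
        seq = seq.rstrip("\n")
        qual = qual.rstrip("\n")
        # Sequence and quality strings must be cut at the same position.
        yield (header.rstrip("\n"), seq[n_trim:],
               plus.rstrip("\n"), qual[n_trim:])


if __name__ == "__main__":
    example = ["@read1\n", "ACGTACGTACGTAA\n", "+\n", "IIIIIIIIIIHHHH\n"]
    for record in trim_fastq_records(example, n_trim=10):
        print("\n".join(record))
```

Note that this only addresses the per base sequence content bias at the start of the reads. As the original poster points out, it would not necessarily remove kmers that persist further along the read.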

#4
Senior Member
Location: Paris Join Date: Aug 2011
Posts: 239
Hello everybody,
I'm coming back to this topic because it fits my question well. I would like your opinion on my RNA-Seq data (paired-end, 100 bp) generated on an Illumina HiSeq 2000. I've attached the "Per Base Sequence Quality" and "Kmer Content" plots for 3 examples. In the first one, the library was prepared using the polyA method. The next 2 examples were prepared by ribodepletion.

I would like to know whether my data are "good enough" despite these last 2 profiles, and whether there is an explanation for the increase in A/T sequence along the read.

From these examples and some others, I have the feeling that the "Kmer Content" profile depends on the library preparation (ribodepletion vs polyA), the run (samples from the same run show a similar profile) and the sample itself (I observed similar profiles for the same sample sequenced on 2 different runs). Is this true?

Thank you,
Jane

#5
Member
Location: Australia Join Date: Aug 2010
Posts: 54
Quote:

#6
Senior Member
Location: Paris Join Date: Aug 2011
Posts: 239
I'm coming back to my previous question because I still have doubts about the quality of my data. Any feedback would be appreciated.

#7
Senior Member
Location: San Francisco, CA Join Date: Feb 2011
Posts: 286
I didn't see the attachment. But from what you describe, it sounds ok.

#8
Senior Member
Location: Paris Join Date: Aug 2011
Posts: 239
Oops, I forgot to attach the file!

#9
Senior Member
Location: Paris Join Date: Aug 2011
Posts: 239

#10
Senior Member
Location: San Francisco, CA Join Date: Feb 2011
Posts: 286
Looks good enough for mapping. You might want to check whether you have some adapter contamination in the first one. I've often found that weird sudden spikes of particular kmers turn out to be adapters.
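One rough way to look at this outside of FastQC is to count where a given kmer starts within each read. The sketch below is an illustration only and is not how FastQC computes its Kmer Content module: the function name `kmer_position_counts` is hypothetical, and it assumes the reads are already loaded as plain strings. A kmer piling up at one position would hint at an adapter or primer, while a count that drifts along the read (such as AAAAA or TTTTT) would point to something else.

```python
from collections import Counter


def kmer_position_counts(reads, kmer):
    """Count how often `kmer` starts at each 0-based position across reads.

    Hypothetical helper for illustration; overlapping matches are counted.
    """
    counts = Counter()
    for read in reads:
        start = read.find(kmer)
        while start != -1:
            counts[start] += 1
            # Step one base forward so overlapping occurrences are found too.
            start = read.find(kmer, start + 1)
    return counts


if __name__ == "__main__":
    reads = ["AAAAACGT", "CAAAAAGT", "GGGGGGGG"]
    for position, count in sorted(kmer_position_counts(reads, "AAAAA").items()):
        print(position, count)
```

If a spiking kmer turns out to be a substring of the adapter sequences used in the library prep, that would support the adapter explanation.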

#11
Senior Member
Location: Paris Join Date: Aug 2011
Posts: 239
Thank you Wallysb01.
Isn't it surprising to see an increase of AAAAA and TTTTT all along the read? It should be constant, as in the first case, right? Why is there such a difference between polyA and ribodepletion? Do the "normal/good" profiles of these 2 methods always differ?