Yes, it did work. I then split the FASTQ files per barcode using the FASTX barcode splitter. However, that still did not solve my problem of too few reads being aligned by TopHat. Also, the FASTQ files I obtained from FASTX and CASAVA were completely different!
-
Well, my barcodes are not Illumina but NuGEN. I ran CASAVA normally with the added use-bases-mask parameter. It did not complain and generated FASTQ files. When I ran TopHat on these files, it could not align most of the reads: the final read count in the SAM files was in the thousands, or even less in some cases.
I then generated one FASTQ file per lane through CASAVA, ignoring the barcodes, and used the barcode splitter to split that file by barcode.
For any sample, the FASTQ file generated this way did not match the one generated by CASAVA (neither in number of lines nor in contents).
TopHat also aligns these files better than the previous ones, but the line counts of the SAM files are still not in the millions. I am not sure of my results at this point.
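One way to make "did not match" more concrete is to compare the two files by read ID rather than by line count. The following is only a minimal sketch: the function names are hypothetical, and it assumes plain 4-line FASTQ records whose header starts with `@` followed by the read ID and then optional whitespace-separated extra fields.

```python
def read_ids(path):
    """Collect read IDs from a FASTQ file (assumes 4 lines per record)."""
    ids = set()
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            # Line 0, 4, 8, ... of the file is the '@' header of a record.
            if line_number % 4 == 0 and line.strip():
                # Keep only the ID: drop the leading '@' and anything
                # after the first whitespace (e.g. the index field).
                ids.add(line[1:].split()[0])
    return ids


def compare_fastq(path_a, path_b):
    """Count read IDs unique to each file and shared by both."""
    a, b = read_ids(path_a), read_ids(path_b)
    return {
        "only_a": len(a - b),
        "only_b": len(b - a),
        "shared": len(a & b),
    }
```

If most IDs are shared, the two demultiplexing routes largely agree and the differences are probably in the read contents (for example, barcode bases still present in one file). If few IDs are shared, the two routes are assigning reads to samples differently.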
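The line count of a SAM file is only a rough proxy for the number of aligned reads: header lines start with `@`, a read may appear on more than one alignment line, and unmapped reads can be present with FLAG bit 0x4 set. A small sketch that counts distinct mapped read names instead (the function name is hypothetical, and it assumes an uncompressed, tab-separated SAM file):

```python
def count_mapped_reads(sam_path):
    """Count distinct read names with at least one mapped alignment line."""
    mapped = set()
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line, not an alignment
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11:
                continue  # not a complete alignment record
            flag = int(fields[1])
            if flag & 0x4:
                continue  # FLAG bit 0x4 means the read is unmapped
            mapped.add(fields[0])  # QNAME; mates of a pair share one name
    return len(mapped)
```

Comparing this number with the number of records in the input FASTQ (its line count divided by 4) gives a fraction of reads aligned that is easier to interpret than raw line counts.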
-
I ran the barcode splitter as follows:
cat combined.fastq | fastx_barcode_splitter.pl --bcfile ../barcode1.txt --bol --mismatches 1 --prefix "lane1" --suffix ".fastq"
It creates separate FASTQ files, but the barcodes are retained in the reads. So I first removed them (the first 4 bases) using:
fastx_trimmer -i fastqfile -o trim_fastqfile -f 5 -l 50 -Q 33
Then I ran TopHat on the trimmed FASTQ files.
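To sanity-check the split and trim steps on a handful of reads, the logic can be sketched in a few lines. This is an illustration of the idea only, not the FASTX-Toolkit's actual implementation: the function names are hypothetical, mismatches are counted as a plain Hamming distance against the start of the read, and a read that ties between two barcodes is left unassigned.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(sequence, barcodes, max_mismatches=1):
    """Return the sample whose barcode best matches the read start, or None."""
    distances = {}
    for name, barcode in barcodes.items():
        prefix = sequence[:len(barcode)]
        if len(prefix) == len(barcode):
            distances[name] = hamming(prefix, barcode)
    if not distances:
        return None
    best = min(distances.values())
    winners = [n for n, d in distances.items() if d == best]
    # Reject reads that are too far from every barcode or tie between two.
    if best > max_mismatches or len(winners) != 1:
        return None
    return winners[0]


def trim_read(sequence, quality, first=5, last=50):
    """Keep bases first..last (1-based, inclusive) of sequence and quality."""
    return sequence[first - 1:last], quality[first - 1:last]
```

For example, `assign_barcode("ACCTTTTT", {"s1": "ACGT", "s2": "TTGA"})` returns `"s1"` with one mismatch allowed. With barcodes as short as four bases, a single allowed mismatch can make two barcodes indistinguishable for some reads, so it may be worth checking how many reads end up unassigned or ambiguous.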