Which tool (e.g. a Perl script) could I use to map my FASTQ file onto the hairpin.fa file from miRBase?
-
You can use Bowtie (or BWA, or any other short-read aligner). Create an index from the hairpin.fa file with bowtie-build and map your FASTQ file directly against that index. Specify the number of mismatches you want to allow (up to 3).
Make sure that you remove the 3' adapter sequences from your raw reads first, if necessary (the FASTX-Toolkit does that pretty well, but there are other tools).
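To make the two steps above concrete, here is a minimal Python sketch that only assembles the two command lines. The file names and the index prefix are placeholders I made up, and the positional argument order follows the classic bowtie 1 usage that appears in this thread; check `bowtie --help` for your installed version before relying on it.

```python
def bowtie_commands(hairpin_fasta, index_prefix, reads_fastq, out_path, mismatches=1):
    """Return (build_cmd, align_cmd) as argument lists for bowtie 1.

    All paths are placeholders chosen by the caller; nothing is executed here.
    """
    if not 0 <= mismatches <= 3:
        raise ValueError("bowtie -v accepts 0 to 3 mismatches")
    # Step 1: build the index from the reference FASTA.
    build_cmd = ["bowtie-build", hairpin_fasta, index_prefix]
    # Step 2: map the reads. -q says the input is FASTQ, -v N allows at most
    # N mismatches across the whole read.
    align_cmd = ["bowtie", "-q", "-v", str(mismatches),
                 index_prefix, reads_fastq, out_path]
    return build_cmd, align_cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in bowtie_commands("hairpin.fa", "hairpin_index",
                               "reads.fastq", "reads_vs_hairpin.map"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the input files exist.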
-
I made a Bowtie index from hairpin.fa and aligned my reads with bowtie using the following command: $ bowtie <bowtie_hairpin> <my_input>
but I get the following message:
# reads processed: 23178672
# reads with at least one reported alignment: 86 (0.00%)
# reads that failed to align: 23178586 (100.00%)
Reported 86 alignments to 1 output stream(s)
The input file is in fastqsanger format and has been processed by FASTQ Groomer in Galaxy.
-
FASTQ Groomer does not remove the adapter sequence present at the 3' end of your reads (correct me if I am wrong).
What is the size of the sequences you tried to align?
If they are microRNA sequences, most of them should be 20-24 nt long. Most probably your reads are longer than that, in which case you need to remove a variable number of nucleotides (the adapter) from the 3' end.
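A quick way to answer the size question is to tabulate read lengths straight from the FASTQ file. The sketch below assumes the common four-lines-per-record layout (no wrapped sequence lines); the function name is my own.

```python
from collections import Counter


def read_length_counts(fastq_lines):
    """Count reads per sequence length in a 4-line-per-record FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            counts[len(line.strip())] += 1
    return counts


def print_length_table(counts):
    """Print one 'length<TAB>count' row per observed read length."""
    for length in sorted(counts):
        print(f"{length}\t{counts[length]}")
```

Call it as `print_length_table(read_length_counts(open("reads.fastq")))`, where `reads.fastq` is a placeholder name. If nearly all reads share one length equal to the sequencing run length, the adapters are most likely still attached.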
On Galaxy, use the "clip" tool under "fastx-toolkit for fastq data". If you don't know the sequence of your adapter, you can guess it by looking at your file, look up the most commonly used adapters, or use the "trim" tool to cut a fixed number of nucleotides from the end of your reads (not recommended, since you will lose useful nucleotides).
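To show why clipping removes a different number of nucleotides from each read, here is a deliberately simplified sketch of 3' adapter clipping. It uses exact string matching only and is not how the FASTX-Toolkit clipper is implemented; the function name and the `min_overlap` parameter are my own assumptions.

```python
def clip_3prime_adapter(read, adapter, min_overlap=6):
    """Cut the adapter, and anything after it, off the 3' end of a read.

    Exact matching only. The full adapter is searched for first; failing
    that, a partial adapter (a prefix of it, at least min_overlap long)
    that runs off the end of the read is removed.
    """
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read  # no adapter found, leave the read untouched
```

In a real FASTQ file the quality string has to be cut to the same length as the clipped sequence, and a dedicated tool will also tolerate sequencing errors inside the adapter, which this sketch does not.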
Keep me posted,
-
I have clipped the adapters from the 3' end and trimmed the remaining reads so that they are between 18 and 24 nt long. Most of them should therefore be miRNAs.
Shouldn't it work to download one of the files from miRBase, run bowtie-build on it, and then just run
$ bowtie <db_file> <input> <output>
-
Palgrave,
I think hairpin.fa has its sequences in the RNA alphabet (A, U, G, C), whereas your sequence reads probably use A, T, G, C. I don't know whether this matters to bowtie, but you might want to try converting the U to T in the reference, building the index again, and rerunning the alignment.
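A minimal sketch of that U to T conversion, assuming a plain FASTA file in which header lines start with `>`. Headers are left untouched so that names containing the letter U are not altered; the function and output file names are my own.

```python
def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U/u replaced by T/t on sequence lines only."""
    table = str.maketrans("Uu", "Tt")
    for line in lines:
        if line.startswith(">"):
            yield line  # header line: keep as-is
        else:
            yield line.translate(table)


def convert_file(in_path, out_path):
    """Write a DNA-alphabet copy of the FASTA file at in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(rna_fasta_to_dna(src))
```

For example, `convert_file("hairpin.fa", "hairpin_dna.fa")` (the output name is a placeholder), then run bowtie-build on the converted file.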
P