
**kmcarr** · 06-17-2010, 05:06 AM

"Most likely to be true" is a nebulous standard. You can, however, filter a psl file to report only the best and nearly the best hits for a given query. The program pslReps, which should be distributed with BLAT, filters .psl files. There are a number of parameters to adjust the stringency of filtering. Here is a link to some tips given by Jim Kent (author of BLAT and pslReps) on the parameters they use at UCSC. Of course that was in the context of aligning ESTs or full length cDNAs. He makes the point in his response that it is not possible to force pslReps to only report a single alignment for a query (even when using the "-singleHit" option) if there are multiple hits with the same or nearly the same score.

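To make "best and nearly the best hits" concrete, here is a minimal Python sketch of that idea. It is not pslReps's actual algorithm: the function names and the `near_top` default are assumptions chosen for illustration, and the score uses the DNA formula described in the UCSC BLAT FAQ (matches plus half of repMatches, minus mismatches and the number of query and target inserts). Input is assumed to be tab-separated PSL lines; header lines are skipped.

```python
from collections import defaultdict

def psl_score(fields):
    """BLAT-style score for one PSL row (DNA alignments)."""
    matches, mismatches, rep_matches = int(fields[0]), int(fields[1]), int(fields[2])
    q_num_insert, t_num_insert = int(fields[4]), int(fields[6])
    return matches + (rep_matches >> 1) - mismatches - q_num_insert - t_num_insert

def near_best_hits(psl_lines, near_top=0.005):
    """Keep, per query, every hit scoring within `near_top` (a fraction) of that query's best hit.

    Illustrative only; `near_top` here is a hypothetical parameter of this sketch.
    """
    by_query = defaultdict(list)
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[0].isdigit():
            continue  # skip PSL header lines and blank lines
        by_query[fields[9]].append((psl_score(fields), line))  # column 10 is qName
    kept = []
    for hits in by_query.values():
        best = max(score for score, _ in hits)
        kept.extend(line for score, line in hits if score >= best * (1.0 - near_top))
    return kept
```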
**Adamo** · 06-17-2010, 05:17 AM

Yes, "most likely to be true" is a very fuzzy notion here.
I hadn't seen that pslReps/Sort... was distributed with BLAT. I'm still a newbie, and I'm quite confused by all the different software that has been developed...
This is a real mess for someone as new to this field as I am!

Thank you very much for your help, I'll try running pslReps and the other psl tools.

**lifeng.tian** · 07-02-2010, 03:58 PM

You may find the git repo helpful. Here is the link:

Genome Browser Git Access

http://genome.ucsc.edu/admin/git.html

I recently used BLAT in an RNA-seq splice junction detection project. Here are some Perl scripts for running BLAT and parsing the psl results, which might be of help to you:

GitHub - lifengtian/SplicePL: Yet another bioinformatics tool to detect de novo splice junctions from paired-end RNA-seq reads (human genome only)

http://github.com/lifengtian/SplicePL

I tried pslReps for exactly the same problem, but it was not designed for it.


**Adamo** · 07-05-2010, 12:04 AM

Thank you, I think it can be very helpful!

However, I have some questions about how to use the scripts (I'm completely new to biology and bioinformatics...):

Why should I mask the genome? (Actually, I haven't understood this notion yet.) I'll be working on a bacterial one; do I have to mask it too?

I only have single-end reads; is that OK? Will it work if I just use the "--forward=..." option?

As I understand it, I'll have my alignment stored in the "temp" directory after running Blat. Then, what is the command to filter the output.psl so that I obtain only uniquely mapped reads?

Sorry if some questions are a little bit naive...!

**lifeng.tian** · 07-05-2010, 10:54 AM

Please check out this perl script at

SplicePL/scripts/blat_singleend.pl at master · lifengtian/SplicePL

http://github.com/lifengtian/SplicePL/blob/master/scripts/blat_singleend.pl

It will run BLAT on N processes and generate temp/unique and temp/unique.psl
LMK if you have more questions at [email protected]

BTW, you don't need to mask the genome.
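
For readers who want to see what filtering a psl file down to uniquely mapped reads can look like, here is a minimal Python sketch. It is not the logic of blat_singleend.pl, which is not shown in this thread; the function name and the definition of "unique" used here (exactly one hit at or above `min_score`) are assumptions for illustration. The score is the DNA formula from the UCSC BLAT FAQ.

```python
from collections import defaultdict

def unique_hits(psl_lines, min_score=30):
    """Return {read_name: psl_line} for reads with exactly one hit scoring >= min_score.

    Hypothetical helper: "unique" is defined here as a single qualifying hit.
    """
    hits = defaultdict(list)
    for line in psl_lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 21 or not f[0].isdigit():
            continue  # skip PSL header lines and blank lines
        # matches + repMatches/2 - misMatches - qNumInsert - tNumInsert
        score = int(f[0]) + (int(f[2]) >> 1) - int(f[1]) - int(f[4]) - int(f[6])
        if score >= min_score:
            hits[f[9]].append(line)  # column 10 is qName (the read name)
    return {name: lines[0] for name, lines in hits.items() if len(lines) == 1}
```

Under this definition, raising `min_score` can turn a multiply mapped read into a unique one, because weaker secondary hits drop out before the hits are counted.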

**Adamo** · 07-06-2010, 05:22 AM

Thank you again, I'm having a look at your script. It seems quite approachable, even for me!
I'll let you know if I need some more help.

**lifeng.tian** · 07-06-2010, 05:42 AM

Just a reminder: the minscore will determine the final number of unique reads. The default value of 30 is far too low for a bacterial genome and long reads. Assuming a read length of 200 bp, a 90% match requires a minscore of 180.
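
The arithmetic behind that number can be written as a small Python helper. This is only a sketch: `min_score_for` is a hypothetical name, and it treats the score as roughly the number of matching bases, which is the simplification used in the post above.

```python
def min_score_for(read_length, percent_identity=90):
    """Smallest integer score equal to at least `percent_identity` percent of the read length."""
    # Integer ceiling division, so no floating-point rounding is involved.
    return (read_length * percent_identity + 99) // 100

print(min_score_for(200))  # 200 bp read at 90%
```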

**Adamo** · 07-06-2010, 06:28 AM

The thing is, I have reads of different lengths, from 100 bp to 300 bp. Can't I specify a percentage instead of a precise score?

**Adamo** · 07-06-2010, 06:36 AM

Oops, mistake.

**lifeng.tian** · 07-06-2010, 01:21 PM

I modified blat_singleend.pl.
Try running it with --minidentity=90
It will require the match score to be larger than individual_read_length * 0.9.
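
As a rough illustration of a per-read threshold like this, here is a Python sketch. It is not the code in blat_singleend.pl: the function name is hypothetical, and using the PSL `matches` column as the "match score" is an assumption. The read length is taken from the PSL `qSize` column.

```python
def passes_min_identity(psl_line, min_identity=90):
    """True if the hit's matching bases exceed `min_identity` percent of the read length."""
    fields = psl_line.rstrip("\n").split("\t")
    matches = int(fields[0])       # PSL column 1: number of matching bases
    read_length = int(fields[10])  # PSL column 11: qSize, the query (read) length
    # Strictly "larger than", compared in integers to avoid rounding at the boundary.
    return matches * 100 > read_length * min_identity
```

With this rule a 100 bp read needs more than 90 matching bases and a 300 bp read needs more than 270, so one setting covers reads of mixed lengths.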


BLAT - uniquely mapped reads/multiple hits