**htchu.taiwan** · 12-07-2011, 05:44 PM

Hi, friend,

You may try my program: EBARDenovo for RNA-Seq.

https://sourceforge.net/projects/ebardenovo

EBARDenovo is a highly accurate, search-based de novo assembler for paired-end RNA-Seq data, intended for advanced transcriptomic studies.

It's a 64-bit Windows command-line program that requires .NET.

EBARDenovo can assemble lower-expressed transcripts even when their coverage depth is very low (e.g., 1.5).

Frank H.T. Chu from Taiwan

Originally posted by shoegame2001:

I am working on a project that seeks to call SNPs for a non-model organism with no existing reference genome or transcriptome using multiplexed Illumina RNA-seq data.

I used Trinity to assemble a partial 'reference' transcriptome of the most highly expressed transcripts for which we had sufficient coverage, as well as many fragments of lower-expressed transcripts. Then I used BWA to map all data for multiple individuals back to that reference, and finally used GATK to call SNPs.

However, I am running into an issue where reads derived from paralogous genes or a multigene family are mapping back to the same reference contig, creating false SNPs in divergent positions. My evidence of this is that in general one 'allele' (actually a slightly divergent gene) is supported by significantly fewer than half of the reads for a given individual that is called a heterozygote. These 'SNPs' are also generally observed across several individuals, leading me to believe that these are not sequencing/library prep errors.

I think that I will be able to identify these cases with some statistic, but I am wondering if there is a good way to modify the corresponding SAM files to remove the mis-mapped reads, then re-genotype. Has anyone else encountered similar issues, and if so how did you deal with it?
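The evidence described in the quoted post (one 'allele' supported by far fewer than half of the reads in a called heterozygote) can be turned into a per-genotype test. Below is a minimal sketch, assuming the reference and alternate read counts for each heterozygous call have already been pulled out of the variant calls. The function names, the sample names, and the 0.01 cutoff are hypothetical and are not part of GATK or any other tool mentioned in this thread. Note that allele-specific expression can also skew the balance in RNA-seq data, so a flagged site is a candidate for paralog collapse rather than proof of it.

```python
from math import comb


def allele_balance_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Two-sided exact binomial p-value against the 50:50 ratio
    expected for a true heterozygote."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    k = min(ref_reads, alt_reads)
    # With p = 0.5 the binomial is symmetric, so double the lower tail.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


def flag_skewed_hets(het_calls, alpha=0.01):
    """Return the names of samples whose heterozygous call departs
    significantly from 50:50 (alpha is an assumed, arbitrary cutoff)."""
    return [name for name, ref, alt in het_calls
            if allele_balance_pvalue(ref, alt) < alpha]


# Hypothetical counts at one site: (sample, ref reads, alt reads)
calls = [("ind1", 10, 10), ("ind2", 18, 2), ("ind3", 12, 9)]
flagged = flag_skewed_hets(calls)
```

A site where most called heterozygotes are flagged, and in the same direction across individuals, matches the pattern described above and could be dropped before re-genotyping.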

**Nico55** · 12-14-2011, 03:24 PM

I’m in the same boat, my friend. Right now I am using Oases to assemble; after trialing several assembly programs, I found it did the best job with my transcriptomes. I then ran SOAPaligner in conjunction with SOAPsnp. This trial is still underway, and I will update you as soon as I compile my results. I would love to hear if you have made any progress using different programs or pipelines.
Thanks

**rururara** · 03-06-2012, 05:12 AM

RNA-seq SNP-calling without a complete reference

Hi all,

I also tried Oases for de novo transcriptome assembly and am quite satisfied with the output.
But now I am wondering how to obtain SNP positions from a de novo assembly.
Can we just rely on the SNP positions reported by variant callers (e.g., samtools, GigaBayes, FreeBayes), or do we need to write an in-house script?

In my case, I'm working with a diploid plant. Some people say that makes it easier, but for me it's still a challenge.

Hope to hear comments from you guys.
Thanks!

**edge** · 05-30-2012, 09:18 AM

Hi shoegame2001,

Did you figure out a solution to your problem?
I'm currently facing the same problem.
I have Illumina paired-end RNA-seq reads and a reference transcriptome.
However, I have no idea how to get SNP calls from my data set.
Thanks for any advice.

**shoegame2001** · 06-29-2012, 03:39 PM

As far as I can tell, there is no software designed for SNP-calling in RNA-seq data in the absence of a reference genome. Aligning reads back to a de novo assembled transcriptome and then filtering based on the proportion of reads supporting the alternative allele in called heterozygotes as well as deviation from Hardy-Weinberg results in a more reliable SNP set, but I am afraid there are still false positives that slip through.
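The Hardy-Weinberg part of that filter can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used in the post: it takes genotype counts for one biallelic site, applies a chi-square goodness-of-fit test with one degree of freedom, and reports whether heterozygotes are in excess, which is the direction expected when reads from paralogs pile up on one contig. The function name `hwe_test` and the example counts are hypothetical. With only a handful of individuals the chi-square approximation is rough, so an exact test would be preferable in that case.

```python
from math import erfc, sqrt


def hwe_test(n_hom_ref: int, n_het: int, n_hom_alt: int):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic site.

    Returns (chi2, p_value, het_excess).
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotyped individuals at this site")
    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (monomorphic sites).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, n_het > expected[1]


# Hypothetical site where all 10 individuals were called heterozygous.
chi2, p_value, het_excess = hwe_test(0, 10, 0)
```

Sites with a small p-value and `het_excess` set are the ones most likely to be collapsed paralogs; sites that deviate through a heterozygote deficit point to other causes.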

**htchu.taiwan** · 07-04-2012, 12:55 AM

Hi, friends,

You may try my program: EBARDenovo for RNA-Seq.
EBARDenovo can now output SNP locations in the contigs when run with the -P parameter.
Please check:

https://sourceforge.net/projects/ebardenovo

It's a 64-bit Windows command-line program that requires .NET.
A Windows PC with 16 GB of RAM can handle 30-40 GB of FASTQ RNA-Seq data.
In our experiments, EBARDenovo was more accurate than Trinity and Oases.

Hsueh-Ting Chu

Originally posted by shoegame2001:

As far as I can tell, there is no software designed for SNP-calling in RNA-seq data in the absence of a reference genome. Aligning reads back to a de novo assembled transcriptome and then filtering based on the proportion of reads supporting the alternative allele in called heterozygotes as well as deviation from Hardy-Weinberg results in a more reliable SNP set, but I am afraid there are still false positives that slip through.
