SEQanswers

Old 08-21-2018, 11:54 AM   #1
dlhmmll
Junior Member
 
Location: Charlottetown, PE, Canada

Join Date: Apr 2017
Posts: 3
Unidentified Reads in Base Calls

When I look for my .fasta files under MiSeq Analysis > Data > Base Calls, there are always small .fasta files called "unidentified". Why do I get these, what are they, and are they useful? I'm doing meta-transcriptomic sequencing and I care more about the presence of the RNA virus than about the host. The "unidentified" sequences tend to be from the pathogen.

Thanks
Old 08-21-2018, 11:59 AM   #2
dlhmmll
Junior Member
 
Location: Charlottetown, PE, Canada

Join Date: Apr 2017
Posts: 3
Correction: I mean "undetermined"

Correction: I mean "undetermined". I usually don't spike my runs with PhiX either.
Old 08-21-2018, 12:24 PM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,814

If you are referring to the sequences in the "undetermined" files, these are reads whose index sequences do not match any index expected from the sample sheet you provided (the sheet that maps each sample ID to its index). There can be tens of different index sequences in there, some of which come from sequencing/basecalling errors. These reads are generally not useful, unless there was an error in the sample sheet and reads from real samples ended up in this pool. The pool will also contain PhiX reads, since PhiX does not have an index.
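One way to check whether a sample sheet error sent real reads into the undetermined pool is to tally the index sequences in that file and compare the most common ones with the indexes you expected. The following is only a rough sketch. It assumes bcl2fastq-style FASTQ headers in which the last colon-separated field is the index sequence; MiSeq Reporter output may carry a sample number there instead, in which case this will not work as written. The function names, the example reads, and the sample sheet contents are all made up for illustration.

```python
from collections import Counter
import io


def tally_indexes(fastq_lines):
    """Count index sequences taken from the end of each FASTQ header line.

    Assumption: headers look like '@... 1:N:0:ACGTAC', i.e. the index is
    the last colon-separated field after the space (bcl2fastq style).
    """
    counts = Counter()
    for line_number, line in enumerate(fastq_lines):
        if line_number % 4 != 0:  # only every 4th line is a header
            continue
        parts = line.strip().split(" ")
        if len(parts) < 2 or not parts[0].startswith("@"):
            continue  # header without the expected comment field
        counts[parts[-1].rsplit(":", 1)[-1]] += 1
    return counts


def hamming(a, b):
    """Number of mismatching positions, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def closest_expected(index, expected):
    """Return (sample_id, mismatches) for the nearest expected index."""
    return min(
        ((sample, hamming(index, seq)) for sample, seq in expected.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical sample sheet mapping and a tiny made-up undetermined file.
expected = {"sampleA": "ACGTAC", "sampleB": "GGCCTT"}
example_fastq = io.StringIO(
    "@M1:1:FC:1:1101:1:1 1:N:0:ACGTAA\nACGT\n+\nIIII\n"
    "@M1:1:FC:1:1101:1:2 1:N:0:ACGTAA\nTTGA\n+\n@III\n"
    "@M1:1:FC:1:1101:1:3 1:N:0:TTTTTT\nGGCA\n+\nIIII\n"
)

counts = tally_indexes(example_fastq)
for index, n in counts.most_common():
    sample, mismatches = closest_expected(index, expected)
    print(index, n, "closest:", sample, "mismatches:", mismatches)
```

An abundant index that sits one mismatch away from an expected index is more likely a basecalling error, while an abundant index that is far from every expected one may point to a wrong entry in the sample sheet.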
Old 08-22-2018, 12:46 AM   #4
shreedivya
Junior Member
 
Location: chennai

Join Date: Jun 2018
Posts: 5

FASTA is a text-based format for representing nucleotide or peptide sequences, in which nucleotides or amino acids are represented by single-letter codes. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data.
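As a small illustration of that layout, here is a minimal sketch of a FASTA reader in plain Python. The function name and the example records are made up, and it assumes the description line starts with ">" as is conventional.

```python
def parse_fasta(lines):
    """Return a list of (description, sequence) tuples from FASTA lines."""
    records = []
    description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:]
            chunks = []
        elif description is not None:
            chunks.append(line)  # sequence data may span several lines
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Made-up example with one nucleotide and one peptide record.
example = [
    ">read1 example nucleotide record",
    "ACGTAC",
    "GGTTAA",
    ">pep1 example peptide record",
    "MKVL",
]
records = parse_fasta(example)
for description, sequence in records:
    print(description, len(sequence))
```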
__________________
Hi I am anushka. I am working as a Manager in Hotel temple citi
Old 08-22-2018, 04:42 AM   #5
dlhmmll
Junior Member
 
Location: Charlottetown, PE, Canada

Join Date: Apr 2017
Posts: 3

Thanks for the replies
Old 08-30-2018, 01:05 AM   #6
shreedivya
Junior Member
 
Location: chennai

Join Date: Jun 2018
Posts: 5

Nice post.
__________________
Hi I am anushka. I am working as a Manager in Hotel temple citi
Tags
unidentified base, unidentified reads
