SEQanswers

Old 04-01-2020, 02:00 PM   #1
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Empty fastq files after MiSeq is done

Dear all,
I am new to this field and have just done my first sequencing run on a MiSeq. I checked the files and found that all of the generated fastq.gz files are empty, except for two files named Undetermined_****.fastq.gz in the Data/Intensities/BaseCalls/ folder, which contain data (each is about 200 kb).

Could you please help me understand what the problem is?

Thank you
Old 04-02-2020, 02:52 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Sounds like you had a demultiplexing failure. Check your sample sheet one more time (an easy mistake is to provide the indexes as reverse complements) and re-queue demultiplexing. If you are new to sequencing, you should use the Illumina Experiment Manager software (Windows only), available from Illumina, to make the sample sheet.

At worst, your run may have failed. The best way to check either possibility is to contact Illumina tech support. They should be able to take a look at your MiSeq remotely (if your MiSeq is not on the network, they will ask for a few files) and diagnose the problem.
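
To rule out the reverse-complement mistake, a small Python sketch along these lines could be used to compare an index as written in the sample sheet with an index seen in the data. The function names are hypothetical and the example index is only illustrative; nothing here is part of any Illumina tool.

```python
# Hypothetical helpers for checking index orientation; not part of any Illumina software.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orientation_of(observed, expected):
    """Report how an observed index relates to the index written in the sample sheet."""
    observed, expected = observed.upper(), expected.upper()
    if observed == expected:
        return "forward"
    if observed == reverse_complement(expected):
        return "reverse-complement"
    return "no match"


if __name__ == "__main__":
    sheet_index = "GAACGCAA"  # illustrative index, assumed to be what the sample sheet lists
    print(reverse_complement(sheet_index))
    print(orientation_of("TTGCGTTC", sheet_index))
```

If the observed indexes come back as "reverse-complement", the sample sheet entries are the likely culprit rather than the run itself.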
Old 04-02-2020, 03:15 AM   #3
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Thank you.

I will check both possibilities. I looked at the CompletedJobInfo file and it seems to be fine. I have also attached my sample sheet file. I hope you have time to check them.

I really appreciate your efforts.
Attached Files
File Type: xml CompletedJobInfo.xml (4.0 KB, 0 views)
File Type: zip SampleSheet.zip (1,011 Bytes, 1 views)
Old 04-02-2020, 05:39 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

It won't be very useful to look at the sample sheet. If the MiSeq accepted it at run time, then it must have been in the correct format. If your samples did not demultiplex, then it is very likely that the index sequence information in your sample sheet is wrong.

Can you post a screenshot of the Sequencing Analysis Viewer info for this run? Especially the alignment to phiX. That would be useful for diagnosis.
Old 04-02-2020, 05:53 AM   #6
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
It won't be very useful to look at the sample sheet. If the MiSeq accepted it at run time, then it must have been in the correct format. If your samples did not demultiplex, then it is very likely that the index sequence information in your sample sheet is wrong.

Can you post a screenshot of the Sequencing Analysis Viewer info for this run? Especially the alignment to phiX. That would be useful for diagnosis.
Unfortunately, due to COVID-19 measures, I have no access to the MiSeq machine now. I only have the run files. Also, we did not include a phiX control. Is there any file in the run folder that would be representative, so that I can share it with you?

I really appreciate your help.
Old 04-02-2020, 08:08 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Are you able to use the command line? If you are, I can suggest some options for looking at those "Undetermined" files to see what may be going on.

If you are able to open them (on PC/Mac) post a small number of example reads here.
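
One possible way to pull out a few example reads without decompressing the whole file is a short Python sketch like the one below. The function name is hypothetical, and the file path is passed on the command line because the exact Undetermined file name is not given in this thread.

```python
import gzip
import sys
from itertools import islice


def head_fastq(path, n_reads=5):
    """Yield up to n_reads (header, sequence, quality) tuples from a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while n_reads > 0:
            # A FASTQ record is four lines: header, sequence, "+", quality.
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(record) < 4:
                break  # end of file or a truncated final record
            header, sequence, _plus, quality = record
            yield header, sequence, quality
            n_reads -= 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python head_fastq.py <path to an Undetermined fastq.gz file>
    for header, sequence, quality in head_fastq(sys.argv[1]):
        print(header, sequence, "+", quality, sep="\n")
```

Printing only the first handful of records keeps the output small enough to paste into a post.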
Old 04-02-2020, 09:03 AM   #8
jdk787
josh kinman
 
Location: Austin

Join Date: Apr 2014
Posts: 71
Default

If you look at DemultiplexSummaryF1L1.txt, it will tell you which barcodes were used on the run, if the demultiplexing worked.

It's located at
<run folder>\Data\Intensities\BaseCalls\Alignment
__________________
Josh Kinman
Old 04-02-2020, 09:57 AM   #9
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
Are you able to use the command line? If you are, I can suggest some options for looking at those "Undetermined" files to see what may be going on.

If you are able to open them (on PC/Mac) post a small number of example reads here.
Yes, I can. Here are some reads:
@M02006:24:000000000-J397M:1:1101:17423:1618 2:N:0:0
CATTGTTCCTTTTGCTTCTTTCCTTTCCCTCTTCTCTTTCTTTTCTCTTCTCTCCTTCTTTTTTCTTTTTTGCCTCTCTTGTTTTCTTACTCTCTTTCCCCCTCTTTTTTCTCTTTCTTTCTTCTCCCTCTCTCTCCCTCACCTTTTCTAT
+
1111>3333B@3111A13D1331333111A0011A133333111212D212A210111111D1/011111/A1111B01A111002D12111211212110///01B11/012@21221B2212B1100000111100000/0111121B2
@M02006:24:000000000-J397M:1:1101:17369:2159 2:N:0:0
TCGCCGGCCTAGTAGGCCCTTCCCTTAGTCTCCTTCTTCTCGCTGCATTCTGACACCCCGGCTCTCTACTTGGCGCTGCTCAACTTTCTAATGTGGCCGCGCGTCGTGTAGGGCACTAGTGTCGTCCTGCGTGTCGATCTCGGAGGTCGCG
+
11111>>1100B1AG11B00A000111B22211B111D22B///B/A1222221B1///A////0112D211F0@//E/1111021FD2222B2BF/B>///>//</?/011?0//01?1?1</<//.<1>.<>0CA<..0<..--./:--
@M02006:24:000000000-J397M:1:1101:18125:2191 2:N:0:0
CCCCTTTCCCTCTGTCTTCTCCCTCGGCGCCTTCTTTGCTCTTCTCGCCGGCTTCGTTGTCCGCGCGTCGTGTCGTGCCCGCGTGTCCCGCTGGTTGTCGCTCTCTGTTGTCGCCGTCTCCTTCCCCCCCCGCCTGCCCTCCCCTTCTGTT
+
11>11111111>11B333331AA1BA0000A00111A111A1112BA/A//A//0>000B11//>////>///0///</0//////011/////01/?1/////1111<101<..-<-./00<0///<------...../....9.-////
@M02006:24:000000000-J397M:1:1101:21922:3716 2:N:0:0
GTTTTTCCTCGTCTGCCCGGCCTCCGTTCTTTCTTTCTCTTCTCCCCGTGCTGCTTCCCGCTTCTTCTCCTTTGCACCCCTCTTTCTTCCTGGTCCTCTTGGCGTGCGCCTTCTTGCCTCTTCCTCCCTGTCTCCCGTTTGTCCGCACGTG
+
1>111111311>11A1111000000BA100AA22D12BA2212110///B//1A11110/////11221111211210///?01122121@101111111110///?><///111111111111110/01<12>10//00/021/////0.
@M02006:24:000000000-J397M:1:1101:17741:4534 2:N:0:0
CTTCTCTTTTCTTTTCGGCGCTTGCTCCGGTCTCCTTGGCCCCTCCTTCCGTCTCCTGCTTCGCGCTGCCTTCGGTCGCCCCGGGCCCTTCCTTGGCGCTGGGCCGCGCGTCGTGTCGGGCCCGCGTGTCCCCGCGTGTGTCGATCTCGGT
Old 04-02-2020, 10:25 AM   #10
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by jdk787 View Post
If you look at DemultiplexSummaryF1L1.txt, it will tell you which barcodes were used on the run, if the demultiplexing worked.

It's located at
<run folder>\Data\Intensities\BaseCalls\Alignment
Yes, I can see this file. None of the barcodes I used has any hits; the only hits are for different barcodes.
Old 04-02-2020, 10:34 AM   #11
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

@mmhefny,

Are you sure this run was set up to run an index cycle? The read header does not suggest that it was.

Code:
@M02006:24:000000000-J397M:1:1101:21922:3716 2:N:0:0
Normally one should see the index sequence at the end of the fastq header, something like this:

Code:
@M02006:24:000000000-J397M:1:1101:21922:3716 1:N:0:GAACGCAA+CTGGCACT
Can you find the RunInfo.xml file and post the section that shows this part?
Code:
<Reads>
      <Read NumCycles="150" Number="1" IsIndexedRead="N" />
      <Read NumCycles="8" Number="2" IsIndexedRead="Y" />
      <Read NumCycles="8" Number="3" IsIndexedRead="Y" />
      <Read NumCycles="150" Number="4" IsIndexedRead="N" />
    </Reads>
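
The header comparison above can be sketched in Python. This is only an illustration: the function name is hypothetical, and it assumes the four colon-separated fields after the space, as in the two example headers shown in this post.

```python
import re

# Assumption: an index field looks like "GAACGCAA" or "GAACGCAA+CTGGCACT",
# as in the example header shown in this thread.
_INDEX_FIELD = re.compile(r"^[ACGTN]+(?:\+[ACGTN]+)?$")


def parse_fastq_header(header):
    """Split a FASTQ header of the form '@name read:filtered:control:last' into fields."""
    name, _, comment = header.strip().partition(" ")
    read, is_filtered, control, last = comment.split(":", 3)
    return {
        "name": name.lstrip("@"),
        "read": int(read),
        "is_filtered": is_filtered,
        "control": control,
        "last_field": last,
        "has_index_sequence": bool(_INDEX_FIELD.match(last)),
    }


if __name__ == "__main__":
    headers = (
        "@M02006:24:000000000-J397M:1:1101:21922:3716 2:N:0:0",
        "@M02006:24:000000000-J397M:1:1101:21922:3716 1:N:0:GAACGCAA+CTGGCACT",
    )
    for header in headers:
        info = parse_fastq_header(header)
        print(info["last_field"], info["has_index_sequence"])
```

A last field that is not made up of bases means the header carries no index sequence, which on its own does not say whether index reads were sequenced.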
Old 04-02-2020, 10:39 AM   #12
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
@mmhefny,

Are you sure this run was set up to run an index cycle? The read header does not suggest that it was.

Code:
@M02006:24:000000000-J397M:1:1101:21922:3716 2:N:0:0
Normally one should see the index sequence at the end of the fastq header, something like this:

Code:
@M02006:24:000000000-J397M:1:1101:21922:3716 1:N:0:GAACGCAA+CTGGCACT
Can you find the RunInfo.xml file and post the section that shows this part?
Code:
<Reads>
      <Read NumCycles="150" Number="1" IsIndexedRead="N" />
      <Read NumCycles="8" Number="2" IsIndexedRead="Y" />
      <Read NumCycles="8" Number="3" IsIndexedRead="Y" />
      <Read NumCycles="150" Number="4" IsIndexedRead="N" />
    </Reads>
This part of RunInfo.xml is as follows:
Code:
    <Reads>
      <Read NumCycles="151" Number="1" IsIndexedRead="N" />
      <Read NumCycles="8" Number="2" IsIndexedRead="Y" />
      <Read NumCycles="8" Number="3" IsIndexedRead="Y" />
      <Read NumCycles="151" Number="4" IsIndexedRead="N" />
    </Reads>
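
For anyone who prefers to read this section programmatically, here is a minimal Python sketch using the standard library XML parser. The function names are hypothetical, and the embedded snippet simply repeats the values posted above; with a real run folder the text of RunInfo.xml would be passed in instead.

```python
import xml.etree.ElementTree as ET

# The <Reads> section as posted in this thread.
RUNINFO_SNIPPET = """<Reads>
  <Read NumCycles="151" Number="1" IsIndexedRead="N" />
  <Read NumCycles="8" Number="2" IsIndexedRead="Y" />
  <Read NumCycles="8" Number="3" IsIndexedRead="Y" />
  <Read NumCycles="151" Number="4" IsIndexedRead="N" />
</Reads>"""


def summarize_reads(xml_text):
    """Return (read number, cycles, is_index_read) tuples sorted by read number."""
    root = ET.fromstring(xml_text)
    reads = [
        (int(r.get("Number")), int(r.get("NumCycles")), r.get("IsIndexedRead") == "Y")
        for r in root.iter("Read")  # works whether the root is <Reads> or an enclosing element
    ]
    return sorted(reads)


def indexing_scheme(reads):
    """Describe the run by the number of index reads it contains."""
    n_index = sum(1 for _number, _cycles, is_index in reads if is_index)
    return {0: "no index read", 1: "single-indexed", 2: "dual-indexed"}.get(
        n_index, f"{n_index} index reads"
    )


if __name__ == "__main__":
    reads = summarize_reads(RUNINFO_SNIPPET)
    print(reads)
    print(indexing_scheme(reads))
```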

Last edited by GenoMax; 04-02-2020 at 10:54 AM. Reason: Added [code]tags
Old 04-02-2020, 10:53 AM   #13
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Ah, so your run was dual-indexed. I am puzzled as to why your reads don't have any index sequence in the fastq headers.

What do you see in DemuxSummaryF1L1.txt?

Scroll down the file until you see this section:
Code:
### Most Popular Unknown Index Sequences
### Columns: Index_Sequence Hit_Count
CTCATCAC+GGTGGCAC       2940
AGGTCAAG+CGCGTAAT       860
TCCAGGTA+GCTTCAGT       700
Old 04-02-2020, 10:54 AM   #14
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
Ah, so your run was dual-indexed. I am puzzled as to why your reads don't have any index sequence in the fastq headers.

What do you see in DemuxSummaryF1L1.txt?

Scroll down the file until you see this section:
Code:
### Most Popular Unknown Index Sequences
### Columns: Index_Sequence Hit_Count
CTCATCAC+GGTGGCAC       2940
AGGTCAAG+CGCGTAAT       860
TCCAGGTA+GCTTCAGT       700
They are:

### Most Popular Index Sequences
### Columns: Sequence ReverseComplement HitCount
Index
C.CCC.CC GG.GGG.G 96
C.CCT.GC GC.AGG.G 94
C.CGC.CG CG.GCG.G 78
C.GCC.TT AA.GGC.G 65
C.ACC.CT AG.GGT.G 44
C.CCC.CT AG.GGG.G 40
C.ACG.GG CC.CGT.G 38
C.CGC.GG CC.GCG.G 37
C.AGC.GG CC.GCT.G 36
C.CGC.TC GA.GCG.G 33
C.CCC.CG CG.GGG.G 29
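
To see whether the dots cluster at particular positions, a table like this one can be tallied with a short Python sketch. It assumes that each data row is "Sequence ReverseComplement HitCount" separated by whitespace and that "." marks a position without a base call; the function names are hypothetical, and only the first three rows are embedded for brevity.

```python
from collections import Counter

# First three rows of the table posted above.
TABLE = """\
### Most Popular Index Sequences
### Columns: Sequence ReverseComplement HitCount
Index
C.CCC.CC GG.GGG.G 96
C.CCT.GC GC.AGG.G 94
C.CGC.CG CG.GCG.G 78
"""


def parse_index_table(text):
    """Return (sequence, reverse_complement, hit_count) for each data row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip comment lines, the "Index" label, and anything that is not a 3-column data row.
        if line.startswith("#") or len(parts) != 3 or not parts[2].isdigit():
            continue
        rows.append((parts[0], parts[1], int(parts[2])))
    return rows


def no_call_positions(rows):
    """Sum hit counts per 1-based index position where the sequence shows a '.'."""
    counts = Counter()
    for sequence, _revcomp, hits in rows:
        for position, base in enumerate(sequence, start=1):
            if base == ".":
                counts[position] += hits
    return dict(counts)


if __name__ == "__main__":
    print(no_call_positions(parse_index_table(TABLE)))
```

If every row shows dots at the same positions, that points to something shared across all clusters at those cycles rather than to individual bad reads.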
Old 04-02-2020, 10:59 AM   #15
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
Ah, so your run was dual-indexed. I am puzzled as to why your reads don't have any index sequence in the fastq headers.

What do you see in DemuxSummaryF1L1.txt?

Scroll down the file until you see this section:
Code:
### Most Popular Unknown Index Sequences
### Columns: Index_Sequence Hit_Count
CTCATCAC+GGTGGCAC       2940
AGGTCAAG+CGCGTAAT       860
TCCAGGTA+GCTTCAGT       700
This is the file; it does not have this Unknown Index Sequences section.
Attached Files
File Type: txt DemultiplexSummaryF1L1.txt (6.6 KB, 2 views)
Old 04-02-2020, 11:03 AM   #16
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

I am afraid something has gone wrong with this run. Either specific index cycles failed (where you see the . ) or, more likely (hate to say this), your libraries may have failed. Either scenario would make this data unusable.

Are you sure the adapters you used were good and/or there was no issue with the run itself (either hardware or software related)? You could submit a ticket to Illumina tech support. You seem to have access to the full run folder, so they will be able to help you diagnose the problem. Even if you don't have a maintenance contract, the diagnosis should be covered under regular tech support.

The file you attached above has an odd format that I have not seen before. Contact Illumina tech support and see what they have to say.

Last edited by GenoMax; 04-02-2020 at 11:06 AM.
Old 04-02-2020, 11:22 AM   #17
mmhefny
Member
 
Location: Germany

Join Date: Apr 2020
Posts: 10
Default

Quote:
Originally Posted by GenoMax View Post
I am afraid something has gone wrong with this run. Either specific index cycles failed (where you see the . ) or, more likely (hate to say this), your libraries may have failed. Either scenario would make this data unusable.

Are you sure the adapters you used were good and/or there was no issue with the run itself (either hardware or software related)? You could submit a ticket to Illumina tech support. You seem to have access to the full run folder, so they will be able to help you diagnose the problem. Even if you don't have a maintenance contract, the diagnosis should be covered under regular tech support.

The file you attached above has an odd format that I have not seen before. Contact Illumina tech support and see what they have to say.
OK. I will do this, and I hope I can find a solution. You cannot imagine how much effort I have put in to get these results.

Thank you very much for your support. I really appreciate the time you spent answering my questions.

Regards,
Old 04-02-2020, 11:28 AM   #18
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Please come back and post when you hear from tech support. I am curious to see the outcome of this as well.
Old 04-02-2020, 11:39 AM   #19
jdk787
josh kinman
 
Location: Austin

Join Date: Apr 2014
Posts: 71
Default

Quote:
Originally Posted by GenoMax View Post
Ah, so your run was dual-indexed. I am puzzled as to why your reads don't have any index sequence in the fastq headers.
For MiSeq data, MiSeq Reporter (and, I think, BaseSpace) doesn't include the index sequence in the headers.
__________________
Josh Kinman
Old 04-02-2020, 11:51 AM   #20
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077
Default

Quote:
Originally Posted by jdk787 View Post
For MiSeq data, MiSeq Reporter (and, I think, BaseSpace) doesn't include the index sequence in the headers.
I see. We don't use either. Thanks for that info.

In any case, the Demux file seems to be in a different format. Again, this may be due to the use of MiSeq Reporter/BaseSpace.
Tags
fastq files, miseq
