SEQanswers

Old 06-28-2016, 02:07 AM   #1
Julia_m
Junior Member
 
Location: stockholm

Join Date: Jun 2016
Posts: 4
paired end data not aligned in NextSeq 500

Hello!!
I'm quite new to NGS, and I received these FASTQ files (read 1 and read 2) from an Illumina NextSeq 500.
The problem is that the two files somehow don't line up: I compared the coordinates in read 1 and read 2 on the same row, and sometimes they are not the same!
Since I'm quite new: is this possible? And is there any tool or bash script that can help me bring these files back into line?


I really hope somebody can help me!
Old 06-28-2016, 02:23 AM   #2
Krish_143
Member
 
Location: Sweden

Join Date: Jan 2012
Posts: 45

Hi Julia_m,

Is that genomic data or RNA-seq (transcriptomic) data?

If it is not aligning, that could mean that some reads have no mate in the other file.

You have probably trimmed the data and kept all the reads. If the data was trimmed, you should keep only the reads that are present in both files. (Check any trimming software, e.g. cutadapt, and then try mapping.)

Hope it works.
__________________
Krishna
Old 06-28-2016, 02:51 AM   #3
Julia_m
Junior Member
 
Location: stockholm

Join Date: Jun 2016
Posts: 4

cutadapt is for adapter trimming, isn't it?
I mean, my problem is quite different. I have two FASTQ files like these:
READ1.fastq
@NB501365:8:HF3HLAFXX:1:11101:22082:1033 1:N:0:CGAGTA
READ2.fastq
@NB501365:8:HF3HLAFXX:1:11101:22082:1033 2:N:0:CGAGTA

Shouldn't the coordinates (22082:1033 for read 1 and 22082:1033 for read 2) be the same in paired-end data?
Old 06-28-2016, 02:56 AM   #4
Julia_m
Junior Member
 
Location: stockholm

Join Date: Jun 2016
Posts: 4

Sorry, big mistake: READ2.fastq has the following header:
@NB501365:8:HF3HLAFXX:1:11101:64563:1033 2:N:0:CGAGTA
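
One way to confirm whether two files really are out of sync is to walk them in lockstep and compare the read names (the header text before the space). Here is a minimal Python sketch, assuming plain four-line FASTQ records; the function names `read_names` and `first_mismatch` are made up for illustration and do not come from any tool mentioned in this thread.

```python
from itertools import zip_longest


def read_names(lines):
    """Yield the read name (header text before the first space) of each record.

    Assumes strict four-line FASTQ records, so every fourth line is a header.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.rstrip("\n").split(" ", 1)[0]


def first_mismatch(r1_lines, r2_lines):
    """Return (record_number, name1, name2) for the first pair of records
    whose names differ, or None if the two inputs stay in sync throughout.

    If one input is shorter than the other, the missing name is None.
    """
    pairs = zip_longest(read_names(r1_lines), read_names(r2_lines))
    for n, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return n, name1, name2
    return None


if __name__ == "__main__":
    # Headers copied from the posts above; sequence and quality lines are dummies.
    r1 = ["@NB501365:8:HF3HLAFXX:1:11101:22082:1033 1:N:0:CGAGTA\n",
          "ACGT\n", "+\n", "IIII\n"]
    r2 = ["@NB501365:8:HF3HLAFXX:1:11101:64563:1033 2:N:0:CGAGTA\n",
          "TGCA\n", "+\n", "IIII\n"]
    print(first_mismatch(r1, r2))
```

On real data you would pass open file handles instead of lists, e.g. `first_mismatch(open("READ1.fastq"), open("READ2.fastq"))`.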
Old 06-28-2016, 03:11 AM   #5
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,467

Your data looks as it should. The "1:N:0:CGAGTA" portion, as an example, just says "I'm read 1, I didn't fail quality control filtering on the machine, and my barcode was CGAGTA". Read 2 should look the same (and have the same read name), with the exception that there's a 2 rather than a 1 in the second block of text.

So go ahead and quality/adapter trim this dataset (e.g., with "Trim Galore!" or trimmomatic) and then use an aligner (bowtie2, bbmap, hisat2, bwa, STAR, etc.).

Edit: Oh, if the read names are really different (I just now noticed your most recent post) then you'll need to resync the files. There's a tool in BBMap that can do this for you (it has a LOT of convenient tools).
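
For anyone who wants to inspect headers programmatically, the fields described above can be pulled apart with a few lines of Python. This is only a sketch, assuming the Casava 1.8+ header layout seen in this thread (`instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`); `IlluminaHeader`, `parse_header` and `same_cluster` are illustrative names, not part of any library.

```python
from typing import NamedTuple


class IlluminaHeader(NamedTuple):
    instrument: str
    run: str
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int
    read: int          # 1 or 2 for paired-end data
    is_filtered: bool  # True ("Y") means the read FAILED the on-machine filter
    control: str
    index: str         # barcode / sample index sequence


def parse_header(line: str) -> IlluminaHeader:
    """Split a header such as
    '@NB501365:8:HF3HLAFXX:1:11101:22082:1033 1:N:0:CGAGTA' into its fields."""
    name, comment = line.strip().lstrip("@").split(" ", 1)
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, filtered, control, index = comment.split(":")
    return IlluminaHeader(instrument, run, flowcell, int(lane), int(tile),
                          int(x), int(y), int(read), filtered == "Y",
                          control, index)


def same_cluster(h1: IlluminaHeader, h2: IlluminaHeader) -> bool:
    """Mates of one pair share every field before the space (first 7 fields)."""
    return h1[:7] == h2[:7]
```

Two mates of a proper pair should give `same_cluster(...) == True` and differ only in `read`; the headers from posts #3 and #4 differ in the x coordinate, so they would not pass this check.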
Old 06-28-2016, 03:34 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,385

As @Devon said, your R1/R2 files may be out of sync. You can use repair.sh from the BBMap suite to re-sync the files.

You will find an example of that command line, and lots of other things the BBMap suite can do, in this thread.
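
To illustrate what re-syncing means (this is not how repair.sh is implemented, just the general idea), here is a small Python sketch that matches mates by read name and sets aside reads without a partner. It assumes four-line FASTQ records and headers where both mates share the text before the space; it also keeps all of read 2 in memory, so it only suits small files. `fastq_records` and `resync` are hypothetical helper names.

```python
def fastq_records(lines):
    """Yield (name, record) for each FASTQ entry.

    record is a tuple of the entry's four lines; name is the header text
    before the first space. An incomplete trailing record is dropped.
    """
    it = iter(lines)
    for record in zip(it, it, it, it):
        yield record[0].split(" ", 1)[0].rstrip("\n"), record


def resync(r1_lines, r2_lines):
    """Return (pairs, singletons).

    pairs      : list of (record1, record2), in read-1 order, matched by name
    singletons : records whose mate is missing from the other file
    """
    r2_by_name = dict(fastq_records(r2_lines))
    pairs, singletons = [], []
    for name, rec1 in fastq_records(r1_lines):
        rec2 = r2_by_name.pop(name, None)
        if rec2 is None:
            singletons.append(rec1)
        else:
            pairs.append((rec1, rec2))
    # Whatever is left in the dictionary never found a read-1 partner.
    singletons.extend(r2_by_name.values())
    return pairs, singletons
```

Writing `pairs` back out as two files (first elements to one, second elements to the other) gives a synchronized R1/R2 set. For real datasets a dedicated tool such as the one suggested above is the more practical choice.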
Old 07-12-2016, 10:29 PM   #7
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 615

It would still be interesting to know what caused the reads to get out of sync.
Were these reads pre-processed with some tool, or is this the data exactly as sent by the sequencing provider?

Fixing is important; avoiding it, or knowing how to avoid it, is more important ;-)

Just my 2p.
Old 07-23-2016, 12:44 AM   #8
Julia_m
Junior Member
 
Location: stockholm

Join Date: Jun 2016
Posts: 4

Problem solved!! It seems that something went wrong during the unzipping process (I don't know why). If you process everything directly from the gzip files, it's fine!
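
In case it helps someone else: reading straight from the gzip file avoids the separate unzipping step entirely. Below is a rough Python sketch that opens a FASTQ file either compressed or plain and counts its records, which is a cheap sanity check to run on R1 and R2 before mapping. `open_fastq` and `count_records` are illustrative names, and the four-lines-per-record layout is an assumption.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, decompressing on the fly if it ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_records(path):
    """Count FASTQ records, assuming four lines per record.

    Reading a .gz file to the end also makes the gzip module check the
    stream, so a truncated archive typically raises an error (for example
    EOFError) rather than silently yielding fewer reads.
    """
    with open_fastq(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

If `count_records` gives different numbers for the R1 and R2 files, or raises an error for one of them, the pair is not safe to feed to an aligner as it stands.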
All times are GMT -8.