11-16-2020, 01:26 AM   #1
Marko Radojkovic
Junior Member
Location: Netherlands

Join Date: Nov 2020
Posts: 1
NGS data analysis - help

Hello everyone!

I am a first-year PhD student in Biochemistry doing NGS for the first time (it is also a first for our research group).

Long story short: we have two mutant libraries (484 and 1024 unique clones), both having two different barcodes, and we are planning to pool them and sequence them together. The PCR amplicons will be 180 bp long, and they will differ only in a 6 bp stretch in the middle of the sequence.

Sequencing will be done on an Illumina platform; here are the specs:
- Technology: Illumina NovaSeq
- Run type: Paired end
- Read length: 2 x 150 bp
- Guaranteed 5 million read pairs (10 million reads) per package (+/- 3%)
- Guaranteed 1.5 Gb raw data per package (+/- 3%)
- Deliverables: FastQ files (sequences and quality scores)
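
As a rough sanity check on these numbers, the expected depth per variant can be worked out from the specs. The sketch below assumes the two libraries are pooled so that every clone is equally represented and that no reads are lost to adapters, a spike-in, or filtering; both are idealizations, and the library labels are placeholders.

```python
# Back-of-the-envelope depth estimate from the sequencing specs.
# Assumption: all clones are equally represented and no reads are lost.
read_pairs = 5_000_000      # guaranteed read pairs per package
read_length = 150           # bp per read (2 x 150 paired end)
amplicon_length = 180       # bp
library_sizes = {"library_1": 484, "library_2": 1024}  # placeholder labels

# Raw yield: two reads per pair, 150 bp each.
raw_bases = read_pairs * 2 * read_length

# Total number of distinct variants in the pool.
total_variants = sum(library_sizes.values())

# Average number of read pairs per variant under even pooling.
pairs_per_variant = read_pairs / total_variants

# Overlap between forward and reverse read on a 180 bp amplicon.
overlap = 2 * read_length - amplicon_length

print(f"Raw data: {raw_bases / 1e9:.1f} Gb")
print(f"Variants in pool: {total_variants}")
print(f"Average read pairs per variant: {pairs_per_variant:.0f}")
print(f"Read overlap: {overlap} bp")
```

Since each 150 bp read extends well past the middle of a 180 bp amplicon, both reads of a pair should cover the 6 bp variable stretch.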

Since I am doing this for the first time, I have no experience in data analysis.
I would like to do basic things: merging overlapping reads, sorting by tag, and counting each variant and calculating its percentage. Could anyone recommend software for this analysis that is suitable for an inexperienced user like me? It should preferably run on Windows, but Linux would also work.
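
To make those three steps concrete, here is a minimal Python sketch of the logic, not a recommendation of a specific tool. It assumes things the post does not state: that each library is identified by a single barcode at the very start of the amplicon, and that the 6 bp variable stretch sits between two known constant flanks. All function names, barcodes, and flank sequences are hypothetical. Merging here requires an exact overlap match, which real data with sequencing errors would not satisfy reliably.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(r1, r2, min_overlap=20):
    """Merge a read pair by exact overlap; return None if no overlap is found.

    Simplification: no mismatches are tolerated and quality scores are ignored.
    """
    r2_rc = reverse_complement(r2)
    # Try the longest possible overlap first, then shorter ones.
    for olap in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        if r1[-olap:] == r2_rc[:olap]:
            return r1 + r2_rc[olap:]
    return None

def count_variants(pairs, barcodes, left_flank, right_flank, variant_len=6):
    """Merge pairs, sort them by barcode, and count the variable stretch.

    `barcodes` maps a library name to its (assumed) leading barcode sequence.
    """
    counts = {name: Counter() for name in barcodes}
    for r1, r2 in pairs:
        merged = merge_pair(r1, r2)
        if merged is None:
            continue
        # Tag sorting: assign the read to the library whose barcode it starts with.
        library = next(
            (name for name, bc in barcodes.items() if merged.startswith(bc)), None
        )
        if library is None:
            continue
        # Locate the variable stretch between the two constant flanks.
        start = merged.find(left_flank)
        if start == -1:
            continue
        start += len(left_flank)
        variant = merged[start:start + variant_len]
        if len(variant) == variant_len and merged.startswith(
            right_flank, start + variant_len
        ):
            counts[library][variant] += 1
    return counts

def percentages(counter):
    """Convert raw counts to the percentage of each variant within a library."""
    total = sum(counter.values())
    return {v: 100.0 * n / total for v, n in counter.items()} if total else {}
```

In practice the read pairs would come from the two FastQ files (R1 and R2), read in parallel, with the sequence line of each record passed in as `r1` and `r2`.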

Any other advice or recommendations would be more than welcome!

Thank you.

Best regards