Hi, we have PacBio data for a de novo assembly project on viral sequences. The run was sequenced as long reads.
I analyzed the data with smrtpipe LongAmpliconAnalysis; the config file is given below.
The consensus sequences generated by this pipeline are all right for almost all samples: we managed to build contigs of the expected genome length, either using MIRA with the resulting fastqs or manually with the fastas.
But we wanted to give VICUNA a try to extract quasispecies information. I tried to launch it on the fastq resulting from the demultiplexing step (barcoded-fastqs folder); the command and its output are given below.
VICUNA cannot read the fastq, and the error message is rather vague. When I tried with a fastq originating from MiSeq sequencing, it worked fine.
One record from each file (PacBio, then MiSeq) is given below for comparison.
Any clue?
Config file for smrtpipe LongAmpliconAnalysis:
Code:
<?xml version="1.0"?>
<smrtpipeSettings>
  <module name="P_Fetch"/>
  <module name="P_Filter"/>
  <module name="P_Barcode">
    <param name="barcode.fasta">
      <value>barcode_run1.fasta</value>
    </param>
    <param name="mode">
      <value>symmetric</value>
    </param>
    <param name="adapterSidePad">
      <value>2</value>
    </param>
    <param name="insertSidePad">
      <value>2</value>
    </param>
  </module>
  <module name="P_AmpliconAnalysis">
    <param name="minLength">
      <value>2400</value>
    </param>
    <param name="minReadScore">
      <value>0.78</value>
    </param>
    <param name="maxReads">
      <value>700</value>
    </param>
  </module>
</smrtpipeSettings>
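To double-check which parameters are actually set in this file, the modules and their values can be listed with a few lines of Python. This is only a sketch: the helper name `list_module_params` is made up for illustration, and the embedded XML simply copies the config above.

```python
import xml.etree.ElementTree as ET

# Copy of the smrtpipe settings shown above.
SETTINGS_XML = """<?xml version="1.0"?>
<smrtpipeSettings>
  <module name="P_Fetch"/>
  <module name="P_Filter"/>
  <module name="P_Barcode">
    <param name="barcode.fasta"><value>barcode_run1.fasta</value></param>
    <param name="mode"><value>symmetric</value></param>
    <param name="adapterSidePad"><value>2</value></param>
    <param name="insertSidePad"><value>2</value></param>
  </module>
  <module name="P_AmpliconAnalysis">
    <param name="minLength"><value>2400</value></param>
    <param name="minReadScore"><value>0.78</value></param>
    <param name="maxReads"><value>700</value></param>
  </module>
</smrtpipeSettings>"""


def list_module_params(xml_text):
    """Return {module name: {param name: value}} for a settings document."""
    root = ET.fromstring(xml_text)
    modules = {}
    for module in root.findall("module"):
        params = {}
        for param in module.findall("param"):
            # Each <param> wraps its setting in a single <value> element.
            params[param.get("name")] = param.findtext("value", default="").strip()
        modules[module.get("name")] = params
    return modules


if __name__ == "__main__":
    for name, params in list_module_params(SETTINGS_XML).items():
        print(name, params)
```

Modules without parameters (P_Fetch, P_Filter) come back with an empty dict, which makes it easy to see that only the barcode and amplicon steps were customized.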
VICUNA run on the fastq from the demultiplexing step (barcoded-fastqs folder):
Code:
vicuna vicuna_config_11F1.txt
--------------------------------------------------------
Program runs with the following Parameter setting:

===== Trimmer =====
vectorFileName
trimLogFileName
minMSize 9
minInternalMSize 15
maxOverhangSize 4
minReadSize 25

===== Profiler =====
MSAFileName
binNumber 20
kmerLength 15 (encode using 4 bytes)
maxHD 1
minSpan 75
blockNumber 5
rMapFileName

===== Contiger =====
w1 12
w2 5
Divergence 8
max_read_overhang 2
min_profile_col_weight 5
min_consensus_base_ratio 85
max_contig_overhang 10
seed_kmer_len 12
min_contig_overlap 25
min_contig_links 3
min_identity 90
min_perc_polymorphism 5
max_variant_len 20

===== Assembly =====
npFqDir /home/nico/labo/etudes/HEPAC/run_1/deNovo_VICUNA/data
batchSize 2000000
LibSizeLowerBound 100
LibSizeUpperBound 800
min_output_contig_len 300
outputDIR /home/nico/labo/etudes/HEPAC/run_1/deNovo_VICUNA/11F1/
--------------------------------------------------------
Indexing ...
/home/nico/labo/etudes/HEPAC/run_1/deNovo_VICUNA/data/11F1_lbc11--lbc11--11F1_lbc11--lbc11.fastq
create: err reading fastq file: /home/nico/labo/etudes/HEPAC/run_1/deNovo_VICUNA/data/11F1_lbc11--lbc11--11F1_lbc11--lbc11.fastq ... exit
fastq from PacBio:
Code:
@m160321_175808_42263_c100986322550000001823225107191690_s1_X0/32876/1557_2547 0.87 28
AACTTAAGATAGTTGGTGACCATCCGCTGGTGATAGAGCGTGTGCGGGCCTATTGCTGCCACTTTTGTGCTGCTGCTCCACTGCGGCCCCTGAGCCGTCACCAATGCTTATGTCCCATACCCCCGTCGACAAGGTGTATGTTCGTTCACATATTTGGCCCTGGCGGGTCCGCCGTCCCTGTTCCATCAGCCTGCTCTACAAAAATCCACATTTCATGCGTCCCAGTTTCAATATTTGGGACCGGGCTCATGCTCATTTGGTGCCCCCCTGGACGATCAGTGCGTTTTGCTGCTCACGGCTTTTTATGACTTACCTTCGTGGATTGTTACAAGGTTACTGTAGGTGCCCTTGTTGTTCCCTAATGAAGGGTGGAATGCTTCGGAGGAAACGCTGCTCACCCTGTTACACCGCAGCGTACTTGACCATTTTCATCAGCGTTAGCCTCCGTACTCAAAGCTATATCCAAGGCATGCGCCGGCTGGAGGTTGAGCAATGCTCAAAAAATTTATCACAAAAGACCTATAGTTGGCTGTTTTGAGAAGTCTGGCCGTGACTACATCCCCGGCCGTCAGCTTTCAGTTCTATGCACAGTGCCGCCGCTGGCTATCGGCGGGTTTCCACCTCGATCAAGGGTGCCTGTTTTTTTGATGAGTCTGCTCCCTGCCGTTTGTAGGCGTTTCTTAGAAAGGTTTGCAGGTAAGTTCTGCTGTTTTTATGAAGTGGCTGGGGCCAGGAGTGCACCTGCTTTCCTTGGAAACGCAGCTGAGGGCCTGGTTGGTGACCATGCCACGATAATGAAGCTATGGAGGGTTCTGAGGTCGACCAGGCTGAGCCCGCTCATCTTGATGTTTTCTGGGACTTATGGCCGTCCAATGGAAGCCAACCTCGAGGCCCTGTACAGGGCGCTTAACATCCCGCACGATATCGCCGCTTTGGGGCCTCCATGGCGTCGACATCCGC
+
(.(,'(..+-$-,,,-..-++,+#+,-.-.///./.(.-+/./%/)('%/./+.-/)-,,//.++-//)/+.///.+*./-'-.+*$%(..)//,//,//,/)//./%///--&&)///*)$)/-///-.(&./////'/*.//-/.#///*.+-.,$*/+.+/#%+-%-%+/))'++//-/+////&.+/,/++....$%*(-+*/////*///*.$(/',///'*./%/-/&,-,--.).*+,.)//*/.//#.+*-./.#(*$$(.--/+//////%./*-&%-*/..+/..,',/,')*///',/.//,+*+()#*+,+%!*')/%/..*.//+/.,&-.-%-''-.-/(#(').()/(.*/--./.)*//./&./**--)'#.///.$////*+///-//&.+/.+///*//..,./-,//&+),&./)*//-(.$+).%.).)-//%$/*.,////,/*/.)//(../,//&/',+/.+-+//./$///./+,,'.&//'//////-$%.+/,//.-//.-.-/.+.,'.--+).+/,..,&/-///////(/))+..,+//,,)//&)./..*&&''/-/*//,/.+/-,/-/*-(/.-.(.,/,,--)/+//'.//../)*/)',+.).,..#*#+*../.+/%/////-)-&/-*/+,#,--'+*&,+&/.-//%(*/..,+..//.-,*&*'////&.//*+*,*..),(.&+//,*-*+.//-///./),///..*/*/(-**%'/+$//,//./,*)$/.-/*,-///*,/./.,,/+//+*///)..*+#+*&-',.,/,///*.//./)/+/-//.//),//././.+.-.+//.+$///,+-(.....-#(/-.-+(-.(*$..+.).''.....)(&/*././/--+*//-..#...).(+-./*//.//*.'*+,$,*),$%.++/+..*/*-/.(.+(),./
fastq from MiSeq:
Code:
@MISEQ3:10:000000000-AGH1Y:1:1101:19170:2650 1:N:0:CGAGGCTGACTGCATA
GACCATCAGCGGCCCGAAAGGACCGGTCAACCAGATGTACACCAATGTCGACCAAGACTTGGTGGGTTGGCCCGCACCTCCAGGAGTGAAGTCCTTGGCCCCATGCACCTGTGGCTCGTCGGACCTGTTCCTGGTTACCAGGCACGCCGACGTGGTGCCCGTGCGCAGAAGAGGCGACACTCGTGGCGCCCTCCTAAGCCCCAGGCCAATTTCAACTCTTAAGGGGTCGTCCGGTGGGCCACTGCTGTG
+
CDDDDFFFFFDDGGGGGGGGGGHHGGGGGHHHGHHHHHHHHHHHHHHHHHHGGGHHHHHHHHHGGGGGGGHHHGGGGGHHHHHHHGGHHHHHHHHHHGHHHGGHHHHHHHHHHHHHHGGG@DDGGFB=FDGHFHHHHHHH0CGEGDGGGGGGBFGCEFGGGAD?B@?@;FFFFFFFCCF;0BAFEFFFFFFFFFFFFBFFFFFBFFFAFBFFF0FBBFBFFFFFFFFFFFFFFF-@FDF?FFFFEFFFF
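One way to compare the two files before digging into VICUNA itself is to check that every record has the plain four-line layout and to summarize read lengths and quality characters. The sketch below assumes uncompressed fastq with exactly four lines per record. `summarize_fastq` is a hypothetical helper name, and the script makes no claim about how VICUNA's own parser works or why it rejects the file.

```python
import sys


def summarize_fastq(lines):
    """Check four-line FASTQ records and summarize lengths and qualities.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    records = 0
    lengths = []
    qual_chars = set()
    problems = []
    block = []
    for raw in lines:
        block.append(raw.rstrip("\r\n"))
        if len(block) < 4:
            continue
        header, seq, plus, qual = block
        block = []
        records += 1
        if not header.startswith("@"):
            problems.append(f"record {records}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"record {records}: third line does not start with '+'")
        if len(seq) != len(qual):
            problems.append(
                f"record {records}: sequence length {len(seq)} "
                f"!= quality length {len(qual)}"
            )
        lengths.append(len(seq))
        qual_chars.update(qual)
    # Leftover non-empty lines mean the last record was cut short.
    if any(block):
        problems.append("file ends with an incomplete record")
    return {
        "records": records,
        "min_len": min(lengths) if lengths else 0,
        "max_len": max(lengths) if lengths else 0,
        "lowest_qual": min(qual_chars) if qual_chars else None,
        "highest_qual": max(qual_chars) if qual_chars else None,
        "problems": problems,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_fastq.py reads.fastq
    with open(sys.argv[1]) as handle:
        print(summarize_fastq(handle))
```

Running it on both the PacBio and the MiSeq file and comparing the two summaries (maximum read length, lowest quality character, any structural problems) should at least narrow down where the files differ.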