Hi all,
I'm trying to reconstruct the transcriptome from RNA-seq data using Scripture, and I'm getting an error. Below are the commands I've been using and the error log.
I have paired-end data from Illumina. I run TopHat on each file:
Code:
$ nohup tophat -o tophat_out_SRR065506_1 --GTF ../ucsc.gtf ../hg19index/hg19-25chr SRR065506_1.fastq
$ nohup tophat -o tophat_out_SRR065506_2 --GTF ../ucsc.gtf ../hg19index/hg19-25chr SRR065506_2.fastq
I get the standard output files. Each accepted_hits.bam is converted into accepted_hits.sam:
Code:
$ samtools view -h -o tophat_out_SRR065506_1/accepted_hits.sam tophat_out_SRR065506_1/accepted_hits.bam
$ samtools view -h -o tophat_out_SRR065506_2/accepted_hits.sam tophat_out_SRR065506_2/accepted_hits.bam
I now take off the header:
Code:
$ sed '1,2d' tophat_out_SRR065506_1/accepted_hits.sam | sort > tophat_out_SRR065506_1/accepted_hits.sorted.sam
$ sed '1,2d' tophat_out_SRR065506_2/accepted_hits.sam | sort > tophat_out_SRR065506_2/accepted_hits.sorted.sam
There is only one apparent change in the file. I attach screenshots of a small part of the two files.
I then use Scripture's -task makePairedFile:
Code:
$ java -jar scripture.jar -task makePairedFile -pair1 tophat_out_SRR065506_1/accepted_hits.sorted.sam -pair2 tophat_out_SRR065506_2/accepted_hits.sorted.sam -out postTophat/SRR065506.scripturePaired.sam -sorted
$ cat tophat_out_SRR065506_1/accepted_hits.sorted.sam tophat_out_SRR065506_2/accepted_hits.sorted.sam > postTophat/all_tophat_alignments.sam
On completion, I use IGVtools to sort and index the paired and alignment files:
Code:
$ igvtools sort postTophat/all_tophat_alignments.sam all_tophat_alignments.sorted.sam
$ igvtools sort postTophat/SRR065506.scripturePaired.sorted.sam
This completes successfully and I get the .sai files.
I then run Scripture:
Code:
$ java -jar scripture.jar -alignment all_tophat_alignments.sorted.sam -out scriptureResults/chr1.segment -sizeFile ../hg19/hg19.chrom.sizes2 -chr chr1 -chrSequence ../hg19/chr1.fa -pairedEnd SRR065506.scripturePaired.sorted.sam
This is when I get the error. I don't think I've made a major mistake along the way; my best guess is that something went wrong in the format conversion (BAM to SAM), and I'm looking into that now. Log of the error:
Code:
Using Version VPaperR3
Computing weights..... upweighting? false weight: 1.0
Computing alignment global stats for chromosome chr1
Computing alignment global stats for chromosome chr10
Computing alignment global stats for chromosome chr11
Computing alignment global stats for chromosome chr12
Computing alignment global stats for chromosome chr13
Computing alignment global stats for chromosome chr14
Computing alignment global stats for chromosome chr15
Computing alignment global stats for chromosome chr16
Computing alignment global stats for chromosome chr17
Computing alignment global stats for chromosome chr18
Computing alignment global stats for chromosome chr19
Computing alignment global stats for chromosome chr2
Computing alignment global stats for chromosome chr20
Computing alignment global stats for chromosome chr21
Computing alignment global stats for chromosome chr22
Computing alignment global stats for chromosome chr3
Computing alignment global stats for chromosome chr4
Computing alignment global stats for chromosome chr5
Computing alignment global stats for chromosome chr6
Computing alignment global stats for chromosome chr7
Computing alignment global stats for chromosome chr8
Computing alignment global stats for chromosome chr9
Computing alignment global stats for chromosome chrM
Computing alignment global stats for chromosome chrX
Computing alignment global stats for chromosome chrY
Has pairs: true
Has upweighting turned on: false
Computing weights..... upweighting? false weight: 1.0
AlignmentDataModel loaded, initializing model stats
Computing alignment global stats for chromosome chr1
Computing alignment global stats for chromosome chr10
Computing alignment global stats for chromosome chr11
Computing alignment global stats for chromosome chr12
Computing alignment global stats for chromosome chr13
Computing alignment global stats for chromosome chr14
Computing alignment global stats for chromosome chr15
Computing alignment global stats for chromosome chr16
Computing alignment global stats for chromosome chr17
Computing alignment global stats for chromosome chr18
Computing alignment global stats for chromosome chr19
Computing alignment global stats for chromosome chr2
Computing alignment global stats for chromosome chr20
Computing alignment global stats for chromosome chr21
Computing alignment global stats for chromosome chr22
Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields; Line 510022
Line: @SQ SN:chr15 LN:102531392
    at net.sf.samtools.SAMTextReader.reportFatalErrorParsingLine(SAMTextReader.java:169)
    at net.sf.samtools.SAMTextReader.access$400(SAMTextReader.java:40)
    at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:232)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:196)
    at org.broad.igv.sam.reader.SamQueryTextReader$SAMQueryIterator.next(SamQueryTextReader.java:197)
    at org.broad.igv.sam.reader.SamQueryTextReader$SAMQueryIterator.next(SamQueryTextReader.java:141)
    at broad.pda.seq.segmentation.GenericAlignmentDataModel.getCountsPerAlignment(GenericAlignmentDataModel.java:257)
    at broad.pda.seq.segmentation.GenericAlignmentDataModel.getCountsPerAlignment(GenericAlignmentDataModel.java:196)
    at broad.pda.seq.segmentation.GenericAlignmentDataModel.getCountsPerAlignment(GenericAlignmentDataModel.java:1661)
    at broad.pda.seq.segmentation.AlignmentDataModelStats.getNumberOfReadsByChr(AlignmentDataModelStats.java:178)
    at broad.pda.seq.segmentation.AlignmentDataModelStats.computeDataStats(AlignmentDataModelStats.java:133)
    at broad.pda.seq.segmentation.AlignmentDataModelStats.computeGlobalStats(AlignmentDataModelStats.java:122)
    at broad.pda.seq.segmentation.AlignmentDataModelStats.<init>(AlignmentDataModelStats.java:86)
    at broad.pda.seq.segmentation.AlignmentDataModelStats.<init>(AlignmentDataModelStats.java:67)
    at broad.pda.seq.segmentation.ContinuousDataAlignmentModel.main(ContinuousDataAlignmentModel.java:2277)
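The exception names a header line (@SQ SN:chr15 LN:102531392) at line 510022, that is, in the middle of the alignment records. That would be consistent with the format-conversion guess: samtools view -h writes the full header, and sed '1,2d' drops only its first two lines, so if the header is longer than that, the remaining @ lines pass through sort together with the reads. Below is a minimal sketch for checking this, assuming header lines are exactly those that start with '@'. The helper names and the demo records are made up for illustration and are not part of samtools or Scripture.

```shell
# Hypothetical helper: count header lines (those starting with '@') in a SAM file.
count_sam_header_lines() {
    awk '/^@/ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: copy only the alignment records (non-'@' lines) to a new file.
strip_sam_header() {
    awk '!/^@/' "$1" > "$2"
}

# Tiny made-up file standing in for accepted_hits.sam: 3 header lines, 1 record.
demo=$(mktemp)
printf '@HD\tVN:1.0\tSO:coordinate\n' > "$demo"
printf '@SQ\tSN:chr1\tLN:249250621\n' >> "$demo"
printf '@SQ\tSN:chr15\tLN:102531392\n' >> "$demo"
printf 'read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$demo"

count_sam_header_lines "$demo"             # prints 3
strip_sam_header "$demo" "$demo.noheader"
count_sam_header_lines "$demo.noheader"    # prints 0
```

On the real data, a non-zero count for accepted_hits.sorted.sam would suggest that leftover header lines are what the parser is tripping over. Running samtools view without -h is another way to get a SAM file with no header at all.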
Any insights or help would be greatly appreciated.
Thanks!