Hi everybody,
I am trying to create a junction file from the SJ.out.tab file of the STAR aligner.
I would like it to contain only junctions that are covered by at least 5 reads for canonical motifs and at least 8 reads for non-canonical ones.
To get the reads I used the following parameters:
Code:
--outFilterType BySJout --outSJfilterCountTotalMin 10 8 8 8 --outSJfilterReads All --outSJfilterOverhangMin 30 12 12 12 --alignSJoverhangMin 8 --alignSJDBoverhangMin 5 --outFilterMultimapNmax 20
I then ran STAR and modified the SJ.out.tab file in the following way, so that I get a BED file structure:
Code:
# Column 4 of SJ.out.tab is the strand (1 = +, 2 = -); the BED score is $7 + $8.
awk '{
  if ($4 == "2")
    print "chr"$1"\t"$2-$9"\t"$3+$9"\tJUNC000"NR"\t"$7+$8"\t-\t"$2-$9"\t"$3+$9"\t255,0,0\t2\t"$9","$9"\t0,"$3-$2+$9
  else if ($4 == "1")
    print "chr"$1"\t"$2-$9"\t"$3+$9"\tJUNC000"NR"\t"$7+$8"\t+\t"$2-$9"\t"$3+$9"\t0,0,255\t2\t"$9","$9"\t0,"$3-$2+$9
}' "$file" >> "$NEW"
But even though I have tried everything to set a threshold greater than 1 for the "score" (the sum of columns 7 and 8 of the original SJ.out.tab file, $7 and $8 in the awk command), I still get low-scoring junctions in my junction file:
Code:
track name="IFM16h_1.bed" description="IFM16h_1.STAR.SJ.out.tab" visibility=2 useScore=1
chr2L 8082 8227 JUNC0001 26 + 8082 8227 0,0,255 2 35,35 0,110
chr2L 8105 8240 JUNC0002 1 + 8105 8240 0,0,255 2 12,12 0,123
chr2L 11330 11424 JUNC0003 2 - 11330 11424 255,0,0 2 15,15 0,79
I would like to know if there is a way to make sure that only splice junctions with a coverage >= 5 (or 8) end up in the SJ.out.tab file.
Does this mean that the junctions with a coverage below 5 are annotated junctions? (If I understand the parameters correctly)
Thanks
Assa
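For illustration, the coverage cut-off described above could also be applied after the run, directly on SJ.out.tab. This is only a minimal sketch, not a STAR option: the function name filter_sj and its arguments are hypothetical, and it assumes the standard SJ.out.tab layout in which column 5 is the intron motif (0 = non-canonical, 1-6 = canonical) and columns 7 and 8 are the uniquely mapping and multi-mapping read counts.

```shell
# Hypothetical helper (not part of STAR): keep only junctions whose total
# read count (column 7 + column 8) reaches a motif-dependent minimum.
#   $1: path to SJ.out.tab
#   $2: minimum reads for canonical motifs (column 5 = 1..6), default 5
#   $3: minimum reads for non-canonical motifs (column 5 = 0), default 8
filter_sj() {
  awk -v canon="${2:-5}" -v noncanon="${3:-8}" 'BEGIN { FS = "\t" }
  {
    reads = $7 + $8                           # unique + multi-mapping reads
    needed = ($5 == 0) ? noncanon : canon     # threshold depends on the motif
    if (reads >= needed) print                # print the line unchanged
  }' "$1"
}
```

It could be called as `filter_sj IFM16h_1.STAR.SJ.out.tab 5 8 > filtered.SJ.out.tab` (the output name is just an example) before the conversion to BED. Whether to count only column 7 or the sum of columns 7 and 8 is a design choice; the sum is used here because the BED score above is built the same way.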