Hi! I had a hard time finding a good piwi database, so here is what I did. Feel free to use it, and tell me whether it seems reasonable.
http://pirnabank.ibab.ac.in is the most used database. They give you a list of FASTA files corresponding to HG18 coordinates, so I made this script to build a GFF from their website:
Code:
# Download one result page per chromosome (chromosome list as in the original post).
for i in X Y $(seq 1 21); do
    wget -q -O "$i.html" "http://pirnabank.ibab.ac.in/cgi-bin/chr_result.pl?org=Homo_sapiens&table=Human37_piRNA&chr=$i&from=0&to=1000000000"
done

# Parse the downloaded pages and write a single GFF file.
(
echo "##gff-version 3"
echo "## date: $(date +%Y_%m_%d)"
echo "## Chromosomal coordinates of piwi RNA according to piRNAbank catin hsa genome"
for i in $(seq 1 21) X Y; do
    # Join the page into one line, split it at "table", keep the piRNA entries,
    # then turn punctuation into spaces and squeeze runs of spaces.
    sed ':a;N;$!ba;s/\n/ /g' "$i.html" | sed 's/table/\n/g' | grep pir_info | tr ':\-=><|' ' ' | sed 's/  */ /g' > temp
    # Entries without an accession get the placeholder ID "NOID".
    grep -v Accession temp | awk '{print $19,"NOID",$73,$74,$75,$76}' > temptemp
    grep Accession temp | cut -d' ' -f20,61,93,94,95,96 >> temptemp
    # Strip brackets and quotes, then sort numerically starting at the third field.
    tr -d '[]"' < temptemp | sort -nk3 > temp
    awk '{strand="-"; if($6=="Plus"){strand="+"}; printf "chr%s\tpiRNAbank\tpiwi\t%i\t%i\t.\t%s\t.\tID=%s;Name=%s\n",$3,$4,$5,strand,$2,$1}' temp
done
) > pirna.gff
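Since the script depends on fixed field positions in the HTML, it may be worth checking the result before using it. Here is a minimal sketch of such a check. The function name check_gff is my own and hypothetical, and the rules it applies (9 tab-separated columns, integer start and end, start not greater than end, strand "+" or "-") are only my assumption of what a reasonable line of pirna.gff looks like:

```shell
# Hypothetical helper, not part of the original script: print the number of
# feature lines in a GFF file that do not look well formed.
check_gff() {
    awk -F'\t' '
        /^#/ { next }                                   # skip header and comment lines
        NF != 9 ||                                      # a feature line has 9 columns
        $4 !~ /^[0-9]+$/ || $5 !~ /^[0-9]+$/ ||         # start and end must be integers
        $4 + 0 > $5 + 0 ||                              # start must not exceed end
        ($7 != "+" && $7 != "-") { bad++ }              # strand as written by the script
        END { print bad + 0 }                           # 0 means nothing suspicious found
    ' "$1"
}

# Usage (assumes pirna.gff was produced by the script above):
# check_gff pirna.gff
```

A non-zero count probably means the page layout changed and the field numbers in the awk and cut commands need adjusting.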