PDA

View Full Version : Shell Scripting help ***


AnushaC
10-03-2013, 07:12 PM
i ,
can you please answer my question
I am trying to use shell scripting to convert sra file to bam file
I have 3 commands i am using

1) fastq-dump SRR203400.sra
2) bowtie -v 3 -k 2 --sam /raid/references-and-indexes/hg19/bowtie-indexes/hg19 /raid/development/anusha/SRR203400.fastq > /raid/development/anusha/SRR203400.bowtieold.sam
3)samtools view -b -S SRR203400.bowtieold.sam > SRR203400.bowtieold.bam
Now i am trying use shell scripting piping operation

#!/bin/bash
# generating a fastq file using fastq dump
$1=='sraip'

$2=='hg19ref'
fastq-dump sraip |bowtie -v 3 -k 2 --sam hg19ref fastqop > $3
I am getting confused on how to pipe fast-dump output to bow tie option and use it in samtools view

Thanks in advance ,
Anusha.Ch

rhinoceros
10-03-2013, 09:52 PM
#!/bin/bash
# generating a fastq file using fastq dump
sraip=$1
hg19ref=$2
fastq-dump $sraip | bowtie -v 3 -k 2 --sam $hg19ref fastqop > outputfile


or alternatively


#!/bin/bash
# generating a fastq file using fastq dump
fastq-dump $1 | bowtie -v 3 -k 2 --sam $2 fastqop > outputfile


Now you would call your script "script.sh /where/ever/it/is/sraip.file /where/ever/it/is/hg19reffile". Don't know if it works. I'm just demonstrating how you hadn't quite grasped how to use variables. In shell scripts $1 is the first thing after the program name when calling the script, $2 the second, etc. Variables are assigned as variableA="something" and then referred to by $variableA, e.g.


variableA="Hello"
echo $variableA

dpryan
10-04-2013, 12:54 AM
@rhinoceros, yeah, that won't do what you want.

fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 - $3

where $3 is an output name. You need to use the -Z flag so fastq-dump doesn't write instead to a file and pass - as the input file name to bowtie so it knows to read from stdin. In a real script, you'd have the output name generated according to the name of the SRA file. Also, piping input into bowtie won't work for paired-end data (not to mention that you should always QC SRA data prior to alignment as a lot of it is low quality).

AnushaC
10-04-2013, 09:32 AM
Hi ,
I need to pipe third program which is samtools view to make sam to bam


fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 - |samtools view -b -S > $3

what will direct my output to sam tools view.


Thanks in advance ,
Anusha.Ch

dpryan
10-04-2013, 09:35 AM
fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 - |samtools view -b -S > $3


That's almost correct. Have a look at the samtools manual for what you're missing (hint, it's a -).

AnushaC
10-04-2013, 09:39 AM
Hi ,
You guys talking about only two of my outputs but i have three program i am using samtools view to convert sam file to bam file . I am trying but will direct sam to bam output with piping

Thanks,
Anusha.Ch

dpryan
10-04-2013, 09:43 AM
You guys talking about only two of my outputs but i have three program i am using samtools view to convert sam file to bam file.

I'm aware that you're trying to then convert to a BAM file by piping to samtools. Please actually read my reply and then do what I wrote. I could simply write the full command, but then you'd not start to learn how to properly create these commands yourself.

AnushaC
10-04-2013, 09:44 AM
Hi Ryan,
That is not correct . I am getting error


Thanks,
Anusha.Ch

dpryan
10-04-2013, 09:51 AM
That is not correct . I am getting error

Whatever you're typing may be incorrect, but what I told you to do is not (btw, when you receive an error message, it's usually best to actually post it).

AnushaC
10-04-2013, 10:04 AM
Hi Ryan ,



fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools view -u -t -F 4 -$3

will this work do i need to use faidix also

Thanks,
Anusha.Ch

dpryan
10-04-2013, 10:10 AM
fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools view -u -t -F 4 -$3


Again, the samtools command isn't specified correctly. Reread the samtools manual (and why are you using -t?).

AnushaC
10-04-2013, 10:20 AM
Hi Ryan,
I read the manual and If `-t' is
applied, the input file is assumed to be in the SAM format. so i started using -t but it did not work for -t or -T i did try by giving reference file . I did not understand where i am going wrong .

Thanks,
Anusha.Ch

dpryan
10-04-2013, 10:28 AM
That's not what the manual says regarding -t.

AnushaC
10-04-2013, 10:33 AM
By default, this command assumes the file on the command line is in
the BAM format and it prints the alignments in SAM. If `-t' is
applied, the input file is assumed to be in the SAM format. The
file supplied with `-t' is SPACE/TAB delimited with the first two
fields of each line consisting of the reference name and the
corresponding sequence length. The `.fai' file generated by `faidx'
can be used here. This file may be empty if reads are unaligned.

this what i got when i trying to run command then i changed to like this


#!/bin/bash
# generating a fastq file using fastq dump
sraip=$1
hg19ref=$2
fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools view -u -S -$3

It also did not work ,then i tried giving samtools view -u -T $2 -$3 and
samtools view -u -t $2 -$3 none of this worked

AnushaC
10-04-2013, 10:49 AM
Hi ,


fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools $2 -| samtools view -u -t -$3

will this work ?

Thanks,
Anusha.Ch

AnushaC
10-04-2013, 10:52 AM
sorry i mean
fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools faidx $2 -| samtools view -u -t -$3

AnushaC
10-04-2013, 10:54 AM
No it is not working for me . It is saying fail to build fasta index . it is coming like this


anusha@cn1:/raid/development/anusha/python_test/shelltest> ./SRAtoBAM.copy.sh /raid/development/anusha/python_test/shelltest/SRR203400.sra /raid/references-and-indexes/hg19/bowtie-indexes/hg19 /raid/development/anusha/python_test/shelltest/SRR203400.bam
view: invalid option -- '/'

Usage: samtools view [options] <in.bam>|<in.sam> [region1 [...]]

Options: -b output BAM
-h print header for the SAM output
-H print header only (no alignments)
-S input is SAM
-u uncompressed BAM output (force -b)
-1 fast compression (force -b)
-x output FLAG in HEX (samtools-C specific)
-X output FLAG in string (samtools-C specific)
-c print only the count of matching records
-L FILE output alignments overlapping the input BED FILE [null]
-t FILE list of reference names and lengths (force -S) [null]
-T FILE reference sequence file (force -S) [null]
-o FILE output file name [stdout]
-R FILE list of read groups to be outputted [null]
-f INT required flag, 0 for unset [0]
-F INT filtering flag, 0 for unset [0]
-q INT minimum mapping quality [0]
-l STR only output reads in library STR [null]
-r STR only output reads in read group STR [null]
-s FLOAT fraction of templates to subsample; integer part as seed [-1]
-? longer help

Notes:

1. By default, this command assumes the file on the command line is in
the BAM format and it prints the alignments in SAM. If `-t' is
applied, the input file is assumed to be in the SAM format. The
file supplied with `-t' is SPACE/TAB delimited with the first two
fields of each line consisting of the reference name and the
corresponding sequence length. The `.fai' file generated by `faidx'
can be used here. This file may be empty if reads are unaligned.

2. SAM->BAM conversion: `samtools view -bT ref.fa in.sam.gz'.

3. BAM->SAM conversion: `samtools view in.bam'.

4. A region should be presented in one of the following formats:
`chr1', `chr2:1,000' and `chr3:1000-2,000'. When a region is
specified, the input alignment file must be an indexed BAM file.

5. Option `-u' is preferred over `-b' when the output is piped to
another samtools command.

dpryan
10-04-2013, 11:18 AM
sorry i mean
fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 -|samtools faidx $2 -| samtools view -u -t -$3

-t is used if you have a SAM file without a header, which isn't the case here. Also -u is mostly useful if you're then piping the output of samtools to something else that expects BAM as input. What you want is something like:

fastq-dump -Z $1 | bowtie -v 3 -k 2 --sam $2 - | samtools view -bS -o $3 -

swbarnes2
10-04-2013, 02:16 PM
Forget the shell script for a moment. Can you get these commands to work by just typing them one at a time into the command line? You need to figure that out first, before you go stringing commands you don't understand into each other.

AnushaC
10-04-2013, 02:23 PM
Hi swbarness ,
these commands all working meaning my fastq dump ,bow tie and samtools view all i checked on command line . now i have a pipeline for converting single sam to bam file . Now i have 8 of those kind of file each in different subdirectory and in a directory now i want to iterate over each sam file . Hope u understood what i am saying .


Thanks,
Anusha.Ch