SEQanswers

06-19-2014, 09:19 AM   #1
lwhitmore
Member

Location: washington

Join Date: Aug 2013
Posts: 70

Bedtools fastafrombed output

Hey everyone, I have a quick question about my bedtools fastaFromBed output. The sequences I get back contain both uppercase and lowercase letters. Does anyone know why this happens or what it means?

Here is an example:
>chr1:157917-159078
ATAAGGAAGAATTATGGAGAATTTAAAAATCTATGCTATTTATAGGCACCTAGTAACAGCTCAGTAAATATTAGCTGCTACTATTATTATTTTTATGGTAATTTCACTCAATTAAAAACTGTCGTTAAAAATTGCCATTGTCATGGAACATAATGTCTCCTACTGTATAATTGTAGAAACAGATACAATttgtcccttggtatatggggggattagttccagctctcccatttctgtgtataccaaaatccacgcatactcaagttttcaaagtcagtcctgtggaatccacatataACACAAATGGGaaaattagtgaggtgtggtgacaagcacctgtagtcccagctacttgtgaggctgaggcaggaggattgcttgagcccaggaggttgaggctgcagtgagccataattgcaccactacactccagtctgggcaacagagtgagacAGAAGGTTGACTTTTTAATAGAATTTTTCTGTTCACTTGAAGATATGGTCAGGATTGTGGCATATGAAAATTCTTCATAAAATAACTATCTAATCCAATTAATGCTGGAATTGGGAACAGCAGAAGTGTCATCTCAGAGCTACTCGCAATGAAAGGTGATGTCTGGGGCTCAGGTGTGTTGAGGTCCCCATGCCTGGACTATGGGTGCTGAGTGGGATTTACTTGTCCATCCATTTTCTATATTCCAGCACTGGGAAACTAGGGACAGTACTTGTTCTCAAGGGAATCTTCAGCTTAGGTGGCTCTGTAAAAGAGAAATTACATCATTGAAAAATCGTCGCAggtcaggtgaggtggctcatacctataatcccagcccactgggagactaaggcaggaggattccgtgaggccaggagttcaagaccagcctgagcaacacagtgaaacctcatctctacaaaaaattagaaaatgaactgggtgcggtaaaacattcgtatagtcccagctactctggaggctgaaataggaggatcgcttgagcccaggaagtggaagctgcagtgagctctgatctcaccactgcactctagccttggtgacagagtgagaccctgtctcaaGacacacacaaacacacacacacacacacacacacCCCCAATCTCACTCTGTCCAGCCTTGACTAATCAAAAGGGCCTTCTG

Thanks
Leanne
06-19-2014, 09:37 AM   #2
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 7,089

This could reflect exons in upper case and introns in lower case, provided the reference sequence was encoded that way. Something to check.
06-20-2014, 12:38 AM   #3
WhatsOEver
Senior Member

Location: Germany

Join Date: Apr 2012
Posts: 215

These are so-called soft-masked repetitive regions: all lower-case characters represent repeats.

You can reproduce this output using e.g. the Ensembl browser:
-Enter your coordinates
-Click "Export Data" on the left
-Choose "repeat masked (soft)" under the FASTA options (there is also a hard-masked option, which outputs all lower-case characters as 'N')
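
If you want to work with the lowercase (repeat) bases programmatically, here is a minimal Python sketch. It assumes the sequence is already loaded as a plain string; the function names (`masked_intervals`, `hard_mask`, `unmask`) and the short example string are made up for illustration and are not part of bedtools or Ensembl.

```python
import re


def masked_intervals(seq):
    """Return 0-based, half-open (start, end) positions of lowercase runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]


def hard_mask(seq):
    """Replace every lowercase (soft-masked) base with 'N'."""
    return re.sub(r"[a-z]", "N", seq)


def unmask(seq):
    """Drop the masking information by upper-casing the whole sequence."""
    return seq.upper()


# Hypothetical short example, not taken from a real reference genome.
example = "ACAAATGGGaaaattagtGAGG"
print(masked_intervals(example))
print(hard_mask(example))
print(unmask(example))
```

Upper-casing is usually enough if a downstream tool is case-sensitive and you do not care about repeats; hard masking is the option to pick if repeats should be ignored entirely.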
06-20-2014, 09:45 AM   #4
lwhitmore
Member

Location: washington

Join Date: Aug 2013
Posts: 70

Hey, when I try the Ensembl browser, the sequence that comes out is all N's no matter which option I choose.
06-20-2014, 10:06 AM   #5
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 7,089

Quote:
Originally Posted by lwhitmore
Hey, when I try the Ensembl browser, the sequence that comes out is all N's no matter which option I choose.

Are you sure? I just got this from the Ensembl browser using the example from your original post (and WhatsOEver's directions):

Code:
>1 dna:chromosome chromosome:GRCh37:1:157917:159078:1
AATAAGGAAGAATTATGGAGAATTTAAAAATCTATGCTATTTATAGGCACCTAGTAACAG
CTCAGTAAATATTAGCTGCTACTATTATTATTTTTATGGTAATTTCACTCAATTAAAAAC
TGTCGTTAAAAATTGCCATTGTCATGGAACATAATGTCTCCTACTGTATAATTGTAGAAA
CAGATACAATttgtcccttggtatatggggggattagttccagctctcccatttctgtgt
ataccaaaatccacgcatactcaagttttcaaagtcagtcctgtggaatccacatataAC
ACAAATGGGaaaattagtgaggtgtggtgacaagcacctgtagtcccagctacttgtgag
gctgaggcaggaggattgcttgagcccaggaggttgaggctgcagtgagccataattgca
ccactacactccagtctgggcaacagagtgagacAGAAGGTTGACTTTTTAATAGAATTT
TTCTGTTCACTTGAAGATATGGTCAGGATTGTGGCATATGAAAATTCTTCATAAAATAAC
TATCTAATCCAATTAATGCTGGAATTGGGAACAGCAGAAGTGTCATCTCAGAGCTACTCG
CAATGAAAGGTGATGTCTGGGGCTCAGGTGTGTTGAGGTCCCCATGCCTGGACTATGGGT
GCTGAGTGGGATTTACTTGTCCATCCATTTTCTATATTCCAGCACTGGGAAACTAGGGAC
AGTACTTGTTCTCAAGGGAATCTTCAGCTTAGGTGGCTCTGTAAAAGAGAAATTACATCA
TTGAAAAATCGTCGCAggtcaggtgaggtggctcatacctataatcccagcccactggga
gactaaggcaggaggattccgtgaggccaggagttcaagaccagcctgagcaacacagtg
aaacctcatctctacaaaaaattagaaaatgaactgggtgcggtaaaacattcgtatagt
cccagctactctggaggctgaaataggaggatcgcttgagcccaggaagtggaagctgca
gtgagctctgatctcaccactgcactctagccttggtgacagagtgagaccctgtctcaa
GacacacacaaacacacacacacacacacacacacCCCCAATCTCACTCTGTCCAGCCTT
GACTAATCAAAAGGGCCTTCTG
06-20-2014, 10:17 AM   #6
lwhitmore
Member

Location: washington

Join Date: Aug 2013
Posts: 70

Hey, I got it. I guess I wasn't entering the coordinates correctly.
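
A note on coordinates, since the two outputs in this thread differ by one leading base (the Ensembl sequence starts AATAAGG, the bedtools one ATAAGG). That is consistent with BED intervals being 0-based and half-open while Ensembl regions are 1-based and inclusive, assuming the bedtools header simply echoes the BED start and end. The sketch below shows the conversion; the helper names are hypothetical.

```python
def bed_to_one_based(start, end):
    """Convert a 0-based, half-open BED interval to 1-based, inclusive."""
    return start + 1, end


def one_based_to_bed(start, end):
    """Convert a 1-based, inclusive interval to 0-based, half-open BED."""
    return start - 1, end


# BED interval from the first post, expressed as a 1-based region.
print(bed_to_one_based(157917, 159078))

# Ensembl region from post #5, expressed as a BED interval.
print(one_based_to_bed(157917, 159078))

# Interval lengths under each convention for the same pair of numbers.
print(159078 - 157917)      # BED length (end - start)
print(159078 - 157917 + 1)  # 1-based inclusive length
```

So entering the BED start and end unchanged into a 1-based browser returns one extra base at the start of the region.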