  • troubleshoot with tophat2 or problem with my fastq files

    Dear Folks,
    I am using TopHat2 to map my reads. As far as I can tell I am meeting all the requirements. The command I run is:
    /usr/local/bin/tophat2 -p 8 -G ~/path/to/Homo_sapiens.GRCh37.72.gtf -o \
    ~/path/to/Human_mapping_iPS_s7_rep1 \
    --splice-mismatches 1 --max-multihits 30 --microexon-search --fusion-search \
    ~/path/to/bowtie2_index/hg19 \
    ~/path/to/myfile.fastq

    I submit the job on a Grid Engine cluster with qsub -l h_vmem=50G [above_script], and it fails with this error:
    "TopHat requires all reads be either FASTQ or FASTA. Mixing formats is not supported"

    I am a bit frustrated, because my FASTQ file looks fine to me. Here is the first record:

    @SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    NGGCCTTCCCACATTCTTTACACTCATAGGTTTTCTCACCAGTGTGAGTTCTCTTGTGCACAATAAGGTAAGAGCC
    +SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA
    !454478347;09977778<655476;69;8588380745<75;57495945158::=677976:7674:64763-

    Please help!
    manu
    Last edited by manvendra7; 08-19-2013, 12:10 AM. Reason: easier to understand
    Manvendra Singh
    PhD student,
    Mobile DNA, MDC, Berlin

  • #2
    Don't use an asterisk, you need to use the actual file names. Also, I kind of doubt that your annotation file is actually in the root ("/") directory. Finally, why are you giving tophat2 a fasta file as input (the "/path/to/hg19.fa" shouldn't be there)? You need to reread the tophat2 manual.
    Last edited by dpryan; 08-18-2013, 04:34 AM.

    • #3
      dpryan reply

      Dear dpryan,
      Please accept my sincere apologies. I was a little sleepy when I submitted this post.
      There is actually no such problem with my tophat2 call. The script in my post was only meant to give an impression of the options I use in my tophat2 job script. I never used an asterisk; I had just given the path to the directory that holds my bowtie2 index and reference sequence files. My tophat2 command is:

      /usr/local/bin/tophat2 -p 8 -G ~/path/to/Homo_sapiens.GRCh37.72.gtf -o \
      ~/path/to/Human_mapping_iPS_s7_rep1 \
      --splice-mismatches 1 --max-multihits 30 --microexon-search --fusion-search \
      ~/path/to/bowtie2_index/hg19 \
      ~/path/to/myfile.fastq
      Last edited by manvendra7; 08-18-2013, 04:44 PM. Reason: made it easier to understand
      Manvendra Singh
      PhD student,
      Mobile DNA, MDC, Berlin

      • #4
        Hmm, I don't see anything obviously wrong with that. Your best bet is to just make a minimal version of that command:
        Code:
        tophat2 -p 8 ~/path/to/bowtie2_index/hg19 ~/path/to/myfile.fastq
        to ensure that that works (once you're certain it's running properly, you can then just cancel the job). Then start building up the command, a parameter at a time, until it either fails (in which case you've found the issue) or you have the whole thing running (in which case, who knows what the problem was, go have a beer!). Once you've found the problem, kindly post that back here. I can pretty much guarantee that someone else will run into the exact same issue.
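The build-up strategy described above could be scripted roughly as follows. This is only a minimal Python sketch: the helper name `incremental_commands` is made up for illustration, the option groups are copied from the command posted in #3, and nothing is executed. It just prints the command lines to try, in order, starting from the minimal one.

```python
def incremental_commands(base, option_groups, positional):
    """Return a list of commands, each enabling one more option group.

    The first command has no optional groups at all (the minimal call);
    the last one has every group enabled (the full call).
    """
    commands = []
    for n in range(len(option_groups) + 1):
        enabled = [token for group in option_groups[:n] for token in group]
        commands.append(base + enabled + positional)
    return commands


# Values taken from the thread; the paths are the poster's placeholders.
base = ["tophat2", "-p", "8"]
option_groups = [
    ["-G", "~/path/to/Homo_sapiens.GRCh37.72.gtf"],
    ["-o", "~/path/to/Human_mapping_iPS_s7_rep1"],
    ["--splice-mismatches", "1"],
    ["--max-multihits", "30"],
    ["--microexon-search"],
    ["--fusion-search"],
]
positional = ["~/path/to/bowtie2_index/hg19", "~/path/to/myfile.fastq"]

steps = incremental_commands(base, option_groups, positional)
for command in steps:
    # Plain join (no shell quoting) so that "~" still expands when pasted.
    print(" ".join(command))
```

Run each printed command in turn (cancelling the job once it is clearly past the input checks); the first one that fails points at the option group that was just added. If even the first, minimal command fails, the options are not the culprit and the index or the reads file is the next thing to check.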

        • #5
          Dear dpryan,
          Thanks a lot, you always help.
          My problem was my FASTQ file.
          Manvendra Singh
          PhD student,
          Mobile DNA, MDC, Berlin
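The thread does not say what exactly was wrong with the file, so the following is only a generic sanity check, not a reconstruction of the actual fix. It is a minimal Python sketch (the function name `check_fastq` is made up for illustration) that assumes plain four-line FASTQ records and reports records whose header does not start with `@`, whose separator line does not start with `+`, or whose sequence and quality strings differ in length. A FASTA-style record mixed into the file would trip the first check.

```python
def check_fastq(lines):
    """Return a list of (record_number, message) for malformed records.

    Assumes the simple four-line FASTQ layout:
    @header / sequence / +[header] / quality.
    """
    problems = []
    record = []
    count = 0
    for raw in lines:
        record.append(raw.rstrip("\r\n"))
        if len(record) < 4:
            continue
        count += 1
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append((count, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((count, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((count, "sequence and quality lengths differ"))
        record = []
    if record:
        # Leftover lines at the end: truncated record or stray blank line.
        problems.append((count + 1, "incomplete record (fewer than 4 lines)"))
    return problems


# The record quoted in the first post passes these checks.
example = [
    "@SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA\n",
    "NGGCCTTCCCACATTCTTTACACTCATAGGTTTTCTCACCAGTGTGAGTTCTCTTGTGCACAATAAGGTAAGAGCC\n",
    "+SOLEXA-GA05_00009_SRi_AD_MS_BN_VW:7:1:2364:933#ATGAGCA\n",
    "!454478347;09977778<655476;69;8588380745<75;57495945158::=677976:7674:64763-\n",
]
example_problems = check_fastq(example)
print(example_problems)
```

To scan a whole file, pass an open file handle, for example `with open("myfile.fastq") as fh: print(check_fastq(fh)[:10])` (the file name here is a placeholder). Looking only at the first record, as in the original post, cannot rule out a damaged record further down the file.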
