  • Running ~35 bp and >=50 bp RNA-Seq reads

    Hi all,

    I tried TopHat on 35 bp Illumina reads and 50 bp SOLiD reads. It failed with the warning "junction database is empty". I posted this error message with logs before but got no response. I see a few others have posted the same error, but the responses they got were not very useful.

    Could anyone please help me find new splice junctions using 35 bp or 50 bp RNA-Seq libraries (Illumina or SOLiD)?

    TopHat-30bp RNASeq read library Error

    Code:
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Warning: found a read < 20bp in SL001_R00089_RHE011_01pgx2_F3.fasta
    Traceback (most recent call last):
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 1635, in ?
        sys.exit(main())
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 1562, in main
        params.read_params = check_reads(params.read_params, left_reads_list)
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 719, in check_reads
        if line_num % 2 == 1:
    KeyboardInterrupt
    TopHat-50bp RNASeq read library Error

    Code:
    ]$ tophat -o tophat_out_SRR017234_1.fas
    tq /home/bogugk/rnaseq/software/bowtie-0.12.5/indexes/hg18_c SRR017234_1.fastq 
    
    [Wed Aug  4 20:40:52 2010] Beginning TopHat run (v1.0.13)
    -----------------------------------------------
    [Wed Aug  4 20:40:52 2010] Preparing output location tophat_out_SRR017234_1.fastq/
    [Wed Aug  4 20:40:52 2010] Checking for Bowtie index files
    [Wed Aug  4 20:40:52 2010] Checking for reference FASTA file
    [Wed Aug  4 20:40:52 2010] Checking for Bowtie
            Bowtie version:          0.12.5.0
    [Wed Aug  4 20:40:52 2010] Checking reads
            seed length:     35bp
            format:          fastq
            quality scale:   phred33 (default)
    [Wed Aug  4 20:42:21 2010] Mapping reads against hg18_c with Bowtie
    [Wed Aug  4 20:42:21 2010] Joining segment hits
    [Wed Aug  4 20:42:21 2010] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Wed Aug  4 20:44:06 2010] Joining segment hits
    [Wed Aug  4 20:44:06 2010] Reporting output tracks
            [FAILED]
    Error: Report generation failed with err = 1
    Traceback (most recent call last):
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 1635, in ?
        sys.exit(main())
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 1607, in main
        params.gff_annotation)
      File "/home/bogugk/XXX/tophat-1.0.13/bin/tophat", line 1044, in compile_reports
        exit(1)
    TypeError: 'str' object is not callable
    Then I switched to SpliceMap. It also failed, with the error message below.

    SpliceMap error (the same for both the 35 bp and 50 bp libraries):

    Code:
    k@Trivia SpliceMap3313_linux-64]$ ./bin/runSpliceMap run.cfg 
    ---== Welcome to SpliceMap 3.3.1.3 ==---
    Developed by Kin Fai Au and John C. Mu
    http://www.stanford.edu/group/wonglab/SpliceMap/
    __________
    Loading configuration file... 
    Genome Directory: ../SpliceMap3313_linux-64/genome/
    Number of mismatch allowed: 2
    Mapping option: B
    Minimum (25th-percentile) intron size: 20000
    Maximum (99th-percentile) intron size: 400000
    Ref path:  name: all.gene.refFlat.txt
    package path: ./bin/ name: runSpliceMap
    Read format: FASTA
    Number of threads: 2
    Separating non-unique coverage
    Will print Cufflinks SAM file
    COMMAND: mkdir temp
    mkdir: cannot create directory `temp': File exists
    List 1:
    /home/bogugk/rnaseq/shiyan_data/csfasta_Folder/SL001_R00089_RHE011_01pgx2_F3.fasta
    List 2:
    Preparing the reads!...
    Using bases 1 to 1 (inclusive) of each read for mapping.
    1 bases in total.
    I'm sorry, SpliceMap 3.3.1.3 only supports read lengths >= 50
    Please contact: http://www.stanford.edu/group/wonglab/SpliceMap/
    if you need this requirement
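
A side note on the second TopHat traceback above: the `in ?` frames suggest the script ran under Python 2.4 or older, where the built-in `exit` was a plain help string rather than a callable. In that case `exit(1)` raises `TypeError: 'str' object is not callable`, which only masks the real failure reported one line earlier ("Report generation failed"). The sketch below emulates this on a current Python with a hypothetical stand-in string (`legacy_exit`); it is an illustration, not TopHat's own code.

```python
import sys

# Hypothetical stand-in for the pre-2.5 built-in `exit`, which was a help string.
legacy_exit = "Use Ctrl-D (i.e. EOF) to exit."


def fail_with_legacy_exit():
    """Emulate calling the old string-valued `exit` and return the error text."""
    try:
        legacy_exit(1)  # calling a str raises TypeError
    except TypeError as err:
        return str(err)


def fail_with_sys_exit():
    """sys.exit is callable on every Python version and raises SystemExit."""
    try:
        sys.exit(1)
    except SystemExit as err:
        return err.code


if __name__ == "__main__":
    print(fail_with_legacy_exit())
    print(fail_with_sys_exit())
```

Either way, the message to chase is the report-generation failure, not the TypeError itself.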

  • #2
    hmm, could you post a sample of your reads here?

    I have the feeling there could be something particular about them.

    Are all of your reads the same length or have they been trimmed?
    SpliceMap: De novo detection of splice junctions from RNA-seq
    Download SpliceMap Comment here

    • #3
      Hi

      No, I didn't trim the reads in the fastq/fasta files.
      My raw files look like the sample below. Maybe this could be the reason?
      But I thought TopHat had an automatic trimming pipeline?

      Code:
      >a1ghg
      T
      >a2jK
      TGT
      
      >eJHV
      TTTGTGAGATGATGACACAT
      
      >dfhGH
      TATATCATACGATGACTATATACGATGACATGACATGACATGACATGACTGA
      Last edited by repinementer; 08-04-2010, 01:14 PM.

      • #4
        If those are your FASTA files, the reads look to be shorter than 30 bp? Also, why is there a gap in the reads between "TGT" and ">eJHV"?

        I don't think TopHat trims the reads, since they are all required to be the same length.

        Edit: Also, both TopHat and SpliceMap require all the reads to be the same length. This doesn't seem to be the case for you.

        The next version of SpliceMap will remove this requirement, though.
        Last edited by john_mu; 08-04-2010, 01:17 PM.
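
Since the reply above hinges on whether all reads share one length, a quick length tally is a reasonable first check before running either tool. This is a minimal sketch that assumes plain base-space FASTA; the function name `read_lengths` and the inline sample (copied from post #3) are illustrative only and are not part of TopHat or SpliceMap.

```python
from collections import Counter


def read_lengths(fasta_lines):
    """Count sequence lengths in FASTA text given as an iterable of lines."""
    counts = Counter()
    length = None  # stays None until the first header is seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines such as the gap noted in the sample
        if line.startswith(">"):
            if length is not None:
                counts[length] += 1  # close the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        counts[length] += 1  # close the final record
    return counts


sample = """>a1ghg
T
>a2jK
TGT

>eJHV
TTTGTGAGATGATGACACAT

>dfhGH
TATATCATACGATGACTATATACGATGACATGACATGACATGACATGACTGA
""".splitlines()

lengths = read_lengths(sample)
print(sorted(lengths.items()))
print("uniform length" if len(lengths) == 1 else "mixed lengths")
```

More than one distinct length in the tally means the file needs filtering or trimming first. For a real file, pass an open file handle instead of `sample`.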

        • #5
          Hi,

          Thanks for the reply.
          Yes, there is no gap, sorry.
          Yes, the FASTA files contain reads ranging from 1 to 35 bp in length.
          In that case, how do I trim the reads to an equal length, i.e. 35?
          Should I use AMOScmp-shortReads-alignmentTrimmed?
          Last edited by repinementer; 08-04-2010, 01:22 PM.

          • #6
            If all of your reads are different lengths, then I'm not sure how you can use the current tools.

            If you have reads that are >=50 bp but varying in length, I can send you a pre-release version of SpliceMap, which can handle that. My email is johnmu (at) stanford (dot) edu

            • #7
              Yes.

              Maybe I should remove the ones that are >50 bp, for the 50 bp reads?
              Yes, please send it to me (at [email protected]). That would be great!

              Thanks

              • #8
                Originally posted by repinementer:
                Maybe I should remove the ones that are >50 bp, for the 50 bp reads?
                Yes, please send it to me (at [email protected]). That would be great!

                Thanks

                Yes, that would be another option. You can trim all of the reads longer than 50 bp to 50 bp.

                Maybe use something like the FASTX-Toolkit.

                I'll send you an email soon.
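
The trimming option described in this reply can be sketched in a few lines of Python. This is only an illustration under some assumptions: reads are plain base-space FASTA, reads shorter than the target are dropped, and longer reads keep their first `target` bases. The helper names `parse_fasta` and `trim_to_length` are hypothetical, not taken from any of the tools discussed here.

```python
def parse_fasta(lines):
    """Parse FASTA lines into (header, sequence) pairs; blank lines are skipped."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_to_length(records, target=50):
    """Yield reads cut to exactly `target` bases.

    Reads shorter than `target` are dropped (assumption: they cannot be used),
    and longer reads keep only their first `target` bases.
    """
    for header, seq in records:
        if len(seq) >= target:
            yield header, seq[:target]


if __name__ == "__main__":
    demo = [">short", "ACG", ">exact", "ACGTA", ">long", "ACGTACGT"]
    # A small target is used so the demo input stays readable.
    for name, seq in trim_to_length(parse_fasta(demo), target=5):
        print(">" + name)
        print(seq)
```

Every read that survives has the same length, which is what both tools require. Colorspace (csfasta) input would need a colorspace-aware tool instead of this plain string slicing.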

                • #9
                  And a final question: neither TopHat nor SpliceMap seems usable for 35 bp reads. Do you know of any software that can find new splice junctions using ~35 bp reads? Thanks again.

                  • #10
                    TopHat can use 35 bp reads. They just all have to be 35 bp...

                    • #11
                      Thanks a lot.

                      Great then! Thank you for pointing me to FASTX. Very helpful.

                      • #12
                        Sample_FASTA_Untrimmed
                        >1279_16_1960_F3
                        A
                        >1279_16_2010_F3
                        BCCC
                        >1279_16_2027_F3
                        CDDDD
                        >1279_17_27_F3
                        ABABABACACACACACAADADADADABABABABAACACACACACACAACAC
                        >1279_17_39_F3
                        ABABABABABBACCCCCCCCCCCACACAADADADADADDADADADADADAD
                        >1279_17_64_F3
                        CDDDDAAAAAAAAAACADACADACDACADCADCADCABABABABABADADADDDDDADADA
                        Sample_FASTA_trimmed
                        >1279_17_27_F3
                        ABABABACACACACACAADADADADABABABABAACACACACACACAACAC
                        >1279_17_39_F3
                        ABABABABABBACCCCCCCCCCCACACAADADADADADDADADADADADAD
                        >1279_17_64_F3
                        CDDDDAAAAAAAAAACADACADACDACADCADCADCABABABABABADADA
                        After trimming, I ran SpliceMap. It ran without errors but finished within 15 minutes for 40 million reads, and the output files are empty.

                        Could you please tell me why?

                        Thanks

                        • #13
                          I'm not sure what could be wrong.

                          Could you send me your output from the command line and your "run.cfg", and I'll take a look?

                          Also, which version of SpliceMap are you using?

                          You can email them to me at johnmu (at) stanford (dot) edu.

                          Sorry that SpliceMap had problems with your data.