
  • SpliceMap 3.3 Released - Improved long-read sensitivity

    Hi Everybody,

    SpliceMap 3.3 has just been released. The main change is significant improvements to sensitivity when aligning long RNA-seq reads. The following 3.3.x releases will focus on usability.

    New Website: http://www.stanford.edu/group/wongla...Map/index.html

    Please check the website for descriptions of the new features.

    Please let me know if there are any concerns.
    SpliceMap: De novo detection of splice junctions from RNA-seq
    Download SpliceMap Comment here

  • #2
    Hi,
    Thanks for this good tool. It's very fast and easy to use. But it seems that I did something wrong when I used it.
    I followed the tutorial as the website describes and edited the run.cfg file when running my own data. The problem may lie in the genome.
    I chose "bowtie" as the mapper, so I downloaded the prebuilt index from the bowtie website (h_sapiens.1.ebwt, h_sapiens.2.ebwt, h_sapiens.3.ebwt, h_sapiens.4.ebwt, h_sapiens.rev.1.ebwt, h_sapiens.rev.2.ebwt) and also the hg18 chromosome files (chromFa.fa) such as chr1.fa, chr2.fa, .... I then put all of these files into a single directory named "genome". I did not concatenate the FASTA chromosome files into one file (for example with "cat chr*.fa > hg18.fa"), because that didn't work.
    After running, the output file didn't show any mapped results, only the information like "
    @HD VN:1.0 SO:coordinate
    @SQ SN:chr1 LN:247249719
    @SQ SN:chr10 LN:135374737
    @SQ SN:chr11 LN:134452384
    @SQ SN:chr12 LN:132349534
    @SQ SN:chr13 LN:114142980
    @SQ SN:chr14 LN:106368585
    @SQ SN:chr15 LN:100338915
    @SQ SN:chr16 LN:88827254
    @SQ SN:chr17 LN:78774742
    @SQ SN:chr18 LN:76117153
    @SQ SN:chr19 LN:63811651
    @SQ SN:chr2 LN:242951149
    @SQ SN:chr20 LN:62435964
    @SQ SN:chr21 LN:46944323
    @SQ SN:chr22 LN:49691432
    @SQ SN:chr3 LN:199501827
    @SQ SN:chr4 LN:191273063
    @SQ SN:chr5 LN:180857866
    @SQ SN:chr6 LN:170899992
    @SQ SN:chr7 LN:158821424
    @SQ SN:chr8 LN:146274826
    @SQ SN:chr9 LN:140273252
    @SQ SN:chrX LN:154913754
    @SQ SN:chrY LN:57772954
    @PG ID:SpliceMap VN:3.3
    "
    I also tried aligning the reads against a single chromosome (chr21), and that worked as expected, but it fails when I use the whole genome set up as described above.

    I'm sorry for my clumsy understanding. Any kind suggestion is appreciated.


    lix
    Last edited by lix; 06-17-2010, 09:50 PM.
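For anyone with a similar layout, a quick sanity check of the genome directory can rule out the obvious problems before a long run. This is only a minimal sketch: the function name `check_genome_dir` is made up for illustration, the six `.ebwt` suffixes are the ones listed in the post above, and the rule that each FASTA header equals its file name (e.g. `>chr21` in `chr21.fa`) is an assumption about UCSC-style files, not something SpliceMap is documented to require.

```python
from pathlib import Path

# Suffixes of the six Bowtie index files named in the post (h_sapiens.1.ebwt, ...).
EBWT_SUFFIXES = [".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt", ".rev.1.ebwt", ".rev.2.ebwt"]


def check_genome_dir(genome_dir, index_base):
    """Return a list of human-readable problems found in genome_dir (hypothetical helper)."""
    root = Path(genome_dir)
    problems = []

    # Every index file should exist and be non-empty.
    for suffix in EBWT_SUFFIXES:
        index_file = root / (index_base + suffix)
        if not index_file.is_file():
            problems.append(f"missing index file: {index_file.name}")
        elif index_file.stat().st_size == 0:
            problems.append(f"empty index file: {index_file.name}")

    # There should be at least one per-chromosome FASTA file.
    fastas = sorted(root.glob("chr*.fa"))
    if not fastas:
        problems.append("no chr*.fa files found")

    # Assumption: the header line of chrN.fa is exactly ">chrN".
    for fasta in fastas:
        with fasta.open() as handle:
            header = handle.readline().strip()
        if header != ">" + fasta.stem:
            problems.append(f"{fasta.name}: header {header!r} does not match file name")

    return problems
```

Calling `check_genome_dir("genome", "h_sapiens")` returns an empty list when nothing looks wrong. It cannot detect a corrupted index, only missing or empty files.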



    • #3
      Hi Lix,

      Thanks for your interest, I'll try my best to help you.

      First, could you please email me your run.cfg file and a log of your SpliceMap run? johnmu (at) stanford.edu

      I'll respond to you via email and then post here once we have solved the problem.

      Sorry that it did not work straight away.

      John Mu


      • #4
        OK, thank you so much.



        • #5
          Hi,
          Glad to hear that a useful new feature has been added,
          but I have run into a runtime problem:

          ################################
          genome/chr1.fa
          --== SpliceMap 3.3 Junction Discoverer ==--
          Developed by Kin Fai Au and John C. Mu

          ____________

          Loading configuration file... run.cfg
          Reading read-file-list files!...
          Sequencer full read length: 75
          List 1:
          temp/read_1_1
          List 2:
          temp/read_1_2
          ____________
          Reading and indexing reference genome!...
          Chromosone name: chr1
          Chromosone Size: 247249719
          Index creation time: 72.4036 s.
          Index sorting time: 2.02698 s.
          ____________
          Reading and finding search seeds!...
          in ... read_1_1.1-50.seq
          terminate called after throwing an instance of 'std::out_of_range'
          what(): basic_string::substr
          find: `./bin/SpliceMap' terminated by signal 6

          genome/chr8.fa

          What could be the reason, and how do I fix it?
          My OS is Ubuntu 10.04 AMD64.
          I am not sure whether this applies to other Linux kernel versions.

          Thanks very much.



          • #6
            Runtime Error

            Hi, I am trying to use this useful tool to align 75*2 nt RNA-seq data.

            But when I execute this step:
            find genome/ -name "chr*.fa" -print -exec ./bin/SpliceMap run.cfg {} \;

            Loading configuration file... run.cfg
            Reading read-file-list files!...
            Sequencer full read length: 75
            List 1:
            temp/read_1_1
            List 2:
            temp/read_1_2
            ____________
            Reading and indexing reference genome!...
            Chromosone name: chr21
            Chromosone Size: 46944323
            Index creation time: 9.66561 s.
            Index sorting time: 0.435851 s.
            ____________
            Reading and finding search seeds!...
            in ... read_1_1.1-50.seq
            terminate called after throwing an instance of 'std::out_of_range'
            what(): basic_string::substr
            find: `./bin/SpliceMap' terminated by signal 6


            What's the problem, and what can I do to solve it?

            Thanks very much!
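When the `find ... -exec` loop above dies part-way through, it helps to know which chromosome file triggered the abort. The following is a rough Python equivalent of that loop, not part of SpliceMap: `run_per_chromosome` is a hypothetical helper, and unlike `find` it does not descend into subdirectories. "Signal 6" in the log is SIGABRT, which is what an uncaught C++ exception normally raises.

```python
import signal
import subprocess
from pathlib import Path


def run_per_chromosome(command, genome_dir):
    """Run `command + [fasta]` once per chr*.fa file and collect (file name, status) pairs."""
    results = []
    for fasta in sorted(Path(genome_dir).glob("chr*.fa")):
        proc = subprocess.run(command + [str(fasta)])
        if proc.returncode == 0:
            status = "ok"
        elif proc.returncode < 0:
            # On POSIX a negative return code means the child was killed by that signal.
            status = "killed by " + signal.Signals(-proc.returncode).name
        else:
            status = f"exit code {proc.returncode}"
        results.append((fasta.name, status))
    return results
```

With the command from the post this would be called as `run_per_chromosome(["./bin/SpliceMap", "run.cfg"], "genome")`; a crash like the one above would show up as `killed by SIGABRT` next to the offending file.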



            • #7
              Originally posted by lix View Post
              OK, thank you so much.
              Just updating for everyone else. The likely cause was that the original bowtie index was corrupted during download from the bowtie website.
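One way to rule out a corrupted download is to compare a checksum of each index file against a known-good value. This is a minimal sketch that assumes you have a reference MD5 from somewhere (a second download, a colleague's copy); the thread does not say whether checksums are published for the index files. `md5_of_file` is a name made up for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Two copies of, say, `h_sapiens.1.ebwt` that give different digests cannot both be intact, so re-downloading is the first thing to try.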


              • #8
                Originally posted by john_mu View Post
                Just updating for everyone else. The reason for this was possibly because the original bowtie index was corrupted during download from the bowtie website.
                Thanks for this convenient tool. It runs extremely fast and is very easy to use. And thanks to John for the very kind help.
                Last edited by lix; 06-21-2010, 05:18 PM.



                • #9
                  SpliceMap has been updated again!

                  There was a tiny bug in the SAM output (an extra tab at the end), and the code has also been updated to work for people with older versions of g++.
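If you still have SAM files from before this fix, the extra tab can be removed after the fact. The sketch below assumes the stray tab sits at the very end of the affected lines, which is how the note above reads; `strip_trailing_tabs` is a hypothetical helper, not part of SpliceMap.

```python
def strip_trailing_tabs(lines):
    """Yield each line with any tab characters at the end of the line removed."""
    for line in lines:
        body = line.rstrip("\r\n")      # drop the line ending first
        yield body.rstrip("\t") + "\n"  # then drop trailing tabs and restore a newline
```

Typical use is to open the old file for reading and a new file for writing, then call `out.writelines(strip_trailing_tabs(src))`. Lines without a trailing tab pass through unchanged.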


                  • #10
                    Problems with single-reads

                    Hi,

                    first of all thanks for the great tool!

                    As already mentioned on your homepage, there is a bug concerning single reads. I tried to analyze 50-76 bp single reads (the read lengths differ because of earlier read trimming based on Illumina's Read Segment Quality Control Indicator) with SpliceMap 3.3.1.2, and the output log file contains only NaNs.

                    Since I'm a little bit in a hurry (my group is waiting impatiently for the data) and we do not have reads >100 bp, I'd like to ask whether it would be possible to get SpliceMap version 3.2.2. I had a look at the homepage but couldn't find a link to that particular version.

                    By the way, while running the analysis I got a lot of "Skipping read ... because it is less than 4 characters long" warnings. Is this behaviour expected for read lengths between 50 and 76 bp?

                    Thank you very much!
                    Best regards



                    • #11
                      Hi mfischer,

                      I'll be happy to help you get SpliceMap working on your data as soon as possible.

                      Firstly, are you using single reads? If so I already have the fixed code on my computer. It has been tested on some example data of various lengths, so it should be ok. I just haven't had time to run it on the bigger datasets yet.

                      I can send you this version via email if you need it very soon, otherwise if you can wait about 3 days, I will put it on the website.

                      However, you mention the message "Skipping read ... because it is less than 4 characters long". This looks a bit odd to me. Do you mind emailing me your terminal output and your run.cfg? I have not seen this message before, and it looks like it comes from Eland or Bowtie. Which version of each are you using?

                      Regardless, fire me an email (johnmu at stanford.edu) and I'll get back to you soon. However, I'm leaving home (Australia) and heading back to the US tomorrow, so I might be a bit late with my reply.

                      Thanks for the feedback!


                      John Mu

                      EDIT: Also, I don't recommend that you use 3.2.2, because there were some bugs in the SAM output that have been fixed in recent versions. You can see this in the "news" on the website.


                      • #12
                        The single-read bug has been fixed, along with a small bug in the display of the "max_intron" and "min_intron" parameters.



                        I would just like to remind everyone that SpliceMap can't work with reads that are not uniform in length. We will endeavor to add support for this, as well as for read quality information, in future versions.
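Since uniform read length is a hard requirement, it is worth checking a read file before starting a run. A minimal sketch, assuming plain four-line FASTQ records with unwrapped sequence lines; `read_length_counts` is a name invented for this example.

```python
from collections import Counter


def read_length_counts(fastq_lines):
    """Count how many reads have each sequence length in a four-line-per-record FASTQ."""
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # the second line of each record holds the sequence
            counts[len(line.strip())] += 1
    return counts
```

Pass it an open file handle, for example `with open("reads.fastq") as fh: counts = read_length_counts(fh)`. More than one key in the result means the reads are not uniform, which is what you get after quality trimming as in the post above.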


                        • #13
                          Hi,

                          I ran SpliceMap on 2*76 bp paired-end reads (mouse) and it did not produce any splice junctions (the BED files are empty). I suspected that the problem was related to the one Lix had, so I downloaded the mm9 Bowtie index from UCSC again.
                          But the problem was still the same: no junctions. The coverage_all.wig file has more than 1 million entries, and Cufflinks was able to produce a lot of transcripts using the sorted good_hits.sam file.
                          Is this a problem with SpliceMap output processing, or did I make a mistake?
                          The parameters in my run.cfg are:
                          ##########################################
                          genome_dir = /cluster/rnaseq/genome/Mus_musculus_9/
                          > reads_list1
                          /cluster/rnaseq/readfiles/SRR037950_1.fastq
                          <
                          > reads_list2
                          /cluster/rnaseq/readfiles/SRR037950_2.fastq
                          <
                          read_format = FASTQ
                          mapper = bowtie
                          head_clip_length = 0
                          seed_mismatch = 1
                          sam_file = cuff
                          ud_coverage = no
                          bowtie_base_dir = /cluster/rnaseq/genome/Mus_musculus_9/mm9
                          num_threads = 3
                          ##########################################

                          (spliceMap Version 3.3.1.3, Bowtie version 0.12.5)

                          Thanks,
                          C.
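To double-check a configuration like the one above (for example, that every listed read file actually exists), the file can be read into plain Python structures. The syntax handled here is inferred only from the snippet in this post: `key = value` lines, and read lists opened by `> name` and closed by `<`. Treating lines starting with `#` as comments is an assumption, and `parse_run_cfg` is a hypothetical helper, not SpliceMap's own parser.

```python
def parse_run_cfg(text):
    """Parse run.cfg-style text into (settings dict, dict of named path lists)."""
    settings = {}
    lists = {}
    current = None  # name of the list block we are inside, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' lines carry no settings
        if line.startswith(">"):
            current = line[1:].strip()
            lists[current] = []
        elif line == "<":
            current = None
        elif current is not None:
            lists[current].append(line)
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings, lists
```

After `settings, lists = parse_run_cfg(open("run.cfg").read())`, something like `[p for paths in lists.values() for p in paths if not os.path.exists(p)]` (with `import os`) lists any read files that are missing.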



                          • #14
                            hmm... are you saying that you get output in the coverage and SAM file but no junctions? That is extremely odd....

                            Could you email me your terminal output? johnmu (at) stanford (dot) edu

                            I'll take a look and see if I can work out what is going on.

                            Also, I will have a new version out soon (in a few days) with many additional checks to make sure the data and genome files are as expected. Hopefully that will fix your issue.

                            Look forward to hearing from you!



                            • #15
                              Thanks a lot. Unfortunately, I did not save the STDOUT, and the only log file produced by SpliceMap contains:

                              ####Total####alljunction##########
                              total: 0
                              ave: nan
                              1: 0 nan%
                              2-5: 0 nan%
                              6-20: 0 nan%
                              21-50: 0 nan%
                              51-200: 0 nan%
                              201-1000: 0 nan%
                              1000+: 0 nan%

                              ####nNR####alljunction#########

                              This is consistent with no splice junctions having been found. I manually compared the transcript.gtf produced by Cufflinks with a transcript.gtf based on a TopHat run in the genome browser for a few sample regions of chr1, and the results of the SpliceMap run seem to be OK.

                              I just started a new run on the same run.cfg and I'll send you an email with the output when it is finished (this will take > 20h).
                              Last edited by Enrico Palazzo; 08-12-2010, 12:54 PM.
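The `nan` entries in that log are what you would expect if the percentages are computed against a total of zero (0/0 in floating point). The sketch below only illustrates that arithmetic; it is not SpliceMap's code, and `bin_percentages` and the example counts are made up.

```python
import math


def bin_percentages(counts):
    """Return (total, {bin label: percentage}), using NaN for every bin when the total is zero."""
    total = sum(counts.values())
    percentages = {}
    for label, count in counts.items():
        # Python raises on float division by zero, so NaN is produced explicitly here;
        # C or C++ floating-point code would print "nan" for 0.0 / 0.0 without any guard.
        percentages[label] = 100.0 * count / total if total else math.nan
    return total, percentages
```

So a log full of `nan%` lines does not by itself point to a crash; it mainly says that the junction count feeding the summary was zero.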
