  • Originally posted by kentawan View Post
    Thank you very much Simon for the headstart guide. I made significant progress with my data analysis over the week.

    Just one question though: what does the output value of the DNA bisulphite quantitation pipeline actually mean? Does a higher value mean a higher methylation count on a specific probe, or a higher percentage of methylation at that base pair out of all calls (methylated + unmethylated)? My input files are CpG methylation call files from bismark. From my understanding, the bismark CpG calls are all 1bp long, where + strand reads are methylated Cs and - strand reads are unmethylated Cs. Am I right?

    Thanks and regards, hope to hear from you soon.
    The output of the bisulphite pipeline is percentage methylation. The only differences between the pipeline and a simple percentage of forward (methylated) calls over all calls are that the pipeline allows you to remove Cs with very low coverage, and that it then gives each C equal weight when calculating the overall percentage for the region.
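
As a rough illustration of the calculation described above (not SeqMonk's actual code), here is a minimal Python sketch. The function names and the `min_count` parameter are made up for the example; the input is assumed to be one (methylated, unmethylated) pair of call counts per cytosine in the region.

```python
def region_methylation(cytosines, min_count=1):
    """Mean of per-cytosine methylation percentages for one region.

    cytosines: list of (methylated_calls, unmethylated_calls), one per C.
    min_count: hypothetical name for the minimum number of calls a C
               needs before it is included at all.
    """
    per_c = []
    for meth, unmeth in cytosines:
        total = meth + unmeth
        if total < min_count:
            continue  # drop Cs with very low coverage
        per_c.append(100.0 * meth / total)
    if not per_c:
        return None  # no C passed the filter, so the region has no value
    # Every surviving C counts equally, however deeply it was covered.
    return sum(per_c) / len(per_c)


def pooled_methylation(cytosines):
    """The 'simple' alternative: all methylated calls over all calls."""
    meth = sum(m for m, _ in cytosines)
    total = sum(m + u for m, u in cytosines)
    return 100.0 * meth / total if total else None


calls = [(9, 1), (1, 1), (0, 1)]
print(region_methylation(calls, min_count=2))  # 70.0
print(round(pooled_methylation(calls), 1))     # 76.9
```

In the example the third C has a single call and is dropped by the filter, and the remaining two Cs (90% and 50%) are averaged with equal weight. The pooled figure is higher because the deeply covered C dominates it.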

    You're correct that the output of the bismark methylation extractor is a set of 1bp long methylation calls, where the strand quoted in the file represents the methylation state and not the strand of origin. The strand of origin can be determined from the file name: OT and OB strands are reported separately, but we usually recombine them for analysis.
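
To make the "strand means methylation state" convention concrete, here is a small Python sketch that tallies calls per position. It assumes tab-separated lines with the read ID, the +/- state, the chromosome and the position in the first four columns; check this against your own files before relying on it. `tally_calls` is a made-up name.

```python
from collections import defaultdict


def tally_calls(lines):
    """Count methylated (+) and unmethylated (-) calls per position.

    Assumed column order: read id, state (+/-), chromosome, position.
    Lines that do not fit (for example a version header) are skipped.
    Returns {(chromosome, position): [methylated, unmethylated]}.
    """
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[1] not in ("+", "-"):
            continue
        _read_id, state, chrom, pos = fields[:4]
        index = 0 if state == "+" else 1  # "+" is a methylated call
        counts[(chrom, int(pos))][index] += 1
    return dict(counts)


example = [
    "header line without tabs",
    "read1\t+\t1\t100\tZ",
    "read2\t-\t1\t100\tz",
    "read3\t+\t1\t100\tZ",
    "read4\t-\t2\t55\tz",
]
print(tally_calls(example))
```

Recombining the OT and OB files for analysis then amounts to feeding the lines of both files into the same tally.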

    Simon.

    Comment


    • I've just released a new version of seqmonk (v0.28). This is now available from the project site.

      The new release makes a fairly large change in the way bisulphite data is handled by removing the old flag values, and allowing for individual probes to be listed as not having a value.

      We've also added some new QC plots for RNA-Seq and Small RNA data which will be useful in the early stage assessment of new data. There are also a few new options for how data is displayed in the chromosome view.

      We've added some new options and modules to both the probe generators and the quantitation methods, and there is a new "proportion of library" statistical filter which will be of use to some people.

      A full list of the changes can be found in the release notes which, along with the software itself, are available from the project web site.

      Please have a play with the new version and let me know if you hit any problems.

      Comment


      • Hi Simon, I ran the RNAseq Quantification Pipeline with these descriptions. Log value is on. "Transcript features over mRNA. Quantitated with RNA-Seq pipeline quantitation counting reads over exons. Log transformed. Assuming a Non-strand specific library."

        What do the heatmap-coloured peaks represent, especially the values of the peaks? Are they a relative abundance measure of the RNA expressed?

        Thanks in advance and I hope to hear from you soon.

        regards,

        Ziyi

        Comment


        • Originally posted by kentawan View Post
          Hi Simon, I ran the RNAseq Quantification Pipeline with these descriptions. Log value is on. "Transcript features over mRNA. Quantitated with RNA-Seq pipeline quantitation counting reads over exons. Log transformed. Assuming a Non-strand specific library."

          What do the heatmap-coloured peaks represent, especially the values of the peaks? Are they a relative abundance measure of the RNA expressed?

          Thanks in advance and I hope to hear from you soon.
          Yes, in effect it's a relative abundance measure; it's actually a log transformed normalised read count. With the default settings you get values which are comparable for the same gene across different samples, but which aren't comparable for different genes within the same sample. This is what people normally want.
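
For anyone wanting to see the arithmetic behind a "log transformed normalised read count", here is a minimal Python sketch. It assumes normalisation to reads per million of library size and a floor of one read to avoid taking the log of zero; both are assumptions for illustration rather than a statement of exactly what SeqMonk does, and `log2_per_million` is a made-up name.

```python
import math


def log2_per_million(count, library_size, floor=1):
    """log2 of a read count scaled to reads per million of library.

    floor is an assumed guard so that a zero count does not produce
    log2(0); it is not taken from the SeqMonk documentation.
    """
    return math.log2(max(count, floor) * 1_000_000 / library_size)


# Same gene in two samples sequenced to different depths:
sample_a = log2_per_million(200, 10_000_000)
sample_b = log2_per_million(400, 20_000_000)
print(sample_a == sample_b)  # True
```

Scaling by library size is what makes the same gene comparable between samples. Nothing in this sketch corrects for transcript length, so a long transcript will collect more reads than a short one expressed at the same level, which is presumably why values for different genes in one sample are not directly comparable.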

          Comment


          • Hi Simon,
            I'm using Seqmonk v0.27.0 to visualize the methylation level in individual CpGs in three individual data sets. The genomic positions relative to the reference genome, and thus also between the samples, appear to be offset by a couple of bases (possibly due to indels?). This, of course, makes Seqmonk "call" CpGs where there should not be any when quantifying. Can I avoid this in some way? (See attached image)

            Thanks,
            Martina
            Attached Files

            Comment


            • Simon or anyone else,

              I am trying to create and quantitate probes between two features which I have already defined (Gene End and CDS End ie the 3' UTR). How can I do this in Seqmonk?

              Best,
              Christian

              Comment


              • Originally posted by ctstackh View Post
                Simon or anyone else,

                I am trying to create and quantitate probes between two features which I have already defined (Gene End and CDS End ie the 3' UTR). How can I do this in Seqmonk?

                Best,
                Christian
                Hi Christian,

                I had a think about this, but couldn't come up with a way to easily do this directly in SeqMonk. If it was me doing this I'd probably export the two tracks you already have, work out the differences outside the program (you could probably do it easily enough in Excel), and then import that back through the generic text annotation import.

                If you're stuck with this then feel free to send me the coordinates of the tracks you have and I'll make up a file you can re-import directly.

                Simon.
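
The "work out the differences outside the program" step could look something like the Python sketch below. It assumes you have, for each transcript, the gene and CDS coordinates as 1-based inclusive start/end pairs plus the strand; the function name and argument layout are made up for the example.

```python
def three_prime_utr(gene_start, gene_end, cds_start, cds_end, strand):
    """Region between the end of the CDS and the end of the gene.

    Coordinates are assumed to be 1-based, inclusive, with start <= end
    on the forward strand of the genome. Returns (start, end), or None
    when the CDS runs right up to the end of the gene.
    """
    if strand == "+":
        start, end = cds_end + 1, gene_end
    elif strand == "-":
        # On the reverse strand the 3' end is the lower coordinate.
        start, end = gene_start, cds_start - 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return (start, end) if start <= end else None


print(three_prime_utr(100, 1000, 200, 800, "+"))  # (801, 1000)
print(three_prime_utr(100, 1000, 200, 800, "-"))  # (100, 199)
```

The resulting coordinates can be written out as a simple text file for re-import as an annotation track.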

                Comment


                • We have just released seqmonk v0.29.0 onto the project web site.

                  This release adds a bunch of new features. Many of these are improvements to the chromosome view, specifically targeted at studies with large numbers of samples (we're getting a lot of single cell datasets these days). You can now display your quantitated data in some new ways, and can also look at the variability in the quantitations shown when you're displaying a replicate set.

                  We've also added some new probe generation and quantitation options which have proved to be useful for projects we've worked on recently.

                  A related release which prompted some of the new features is that we have also put up the documentation for our methylation analysis course. We'll be running this fairly regularly starting in the new year, but all of the material for the course is available for anyone who wants to look. The course isn't solely focussed on seqmonk for the visualisation and analysis, but this does make up the majority of the practicals, so anyone wanting to use seqmonk to look at methylation data might want to take a look at the material we've put up.

                  Comment


                  • Originally posted by m.olsson View Post
                    Hi Simon,
                    I'm using Seqmonk v0.27.0 to visualize the methylation level in individual CpGs in three individual data sets. The genomic positions relative to the reference genome, and thus also between the samples, appear to be offset by a couple of bases (possibly due to indels?). This, of course, makes Seqmonk "call" CpGs where there should not be any when quantifying. Can I avoid this in some way? (See attached image)

                    Thanks,
                    Martina
                    Sorry Martina - I missed this post.

                    Where did you get the data which you're importing into seqmonk? Normally for bs-seq applications the data you import would be individual methylation calls. The positions seqmonk reads would then be the actual bases of the calls, so there wouldn't be any chance to get the positioning wrong. We would normally use the output of the bismark methylation extractor as input into seqmonk, and we've never seen an issue like this.

                    Have you looked in the genome to see whether the positions you're seeing are CpGs or not? The other possibility would simply be that the different samples are seeing different subsets of the genome. If you're really getting non-C positions imported then you'd need to go back to the imported file and check whether the positions there were wrong. If you're really seeing seqmonk change positions from the imported data to the project file then that would obviously be a bug and we can look at it, but that would seem to be unlikely with data as simple as this.
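
One way to check the positions against the genome outside SeqMonk is sketched below in Python. It assumes you have the chromosome sequence as a plain string and 1-based forward-strand positions; `is_cpg` is a made-up helper, not part of any tool mentioned here.

```python
def is_cpg(sequence, position):
    """True if the 1-based position is the C or the G of a CG dinucleotide.

    A call from the forward strand should sit on the C; a call from the
    reverse strand should sit on the G that pairs with the C opposite.
    """
    index = position - 1
    if index < 0 or index >= len(sequence):
        raise IndexError("position is outside the sequence")
    seq = sequence.upper()
    if seq[index] == "C":
        return seq[index + 1:index + 2] == "G"
    if seq[index] == "G":
        return index > 0 and seq[index - 1] == "C"
    return False


chromosome = "ACGTCA"
for pos in (2, 3, 4, 5):
    print(pos, is_cpg(chromosome, pos))
```

If a large fraction of imported positions fail a check like this, the problem is more likely to be in the input file (or in a mismatch between the genome build used for mapping and the one loaded in the project) than in the import itself.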

                    If you can reduce this problem to a small dataset which illustrates the error then I'd be happy to take a look at it for you.

                    Comment


                    • Thanks!

                      Originally posted by simonandrews View Post
                      Hi Christian,

                      I had a think about this, but couldn't come up with a way to easily do this directly in SeqMonk. If it was me doing this I'd probably export the two tracks you already have, work out the differences outside the program (you could probably do it easily enough in Excel), and then import that back through the generic text annotation import.

                      If you're stuck with this then feel free to send me the coordinates of the tracks you have and I'll make up a file you can re-import directly.

                      Simon.
                      Thank you for the suggestion! I was able to export the necessary tracks, edit, and save them as bed files. I then used the interval and subtract functions in Galaxy to isolate the regions of interest. Next step will be to import into Seqmonk for quantitation.

                      Just thought I would post my solution in case anyone was interested or has a similar problem to solve.
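
For readers without access to Galaxy, the subtraction step can be sketched in a few lines of Python. This is a generic interval subtraction under BED-style half-open coordinates, not Galaxy's implementation, and it assumes both inputs are on the same chromosome. `subtract_intervals` is a made-up name.

```python
def subtract_intervals(interval, to_remove):
    """Return the pieces of `interval` not covered by any interval in `to_remove`.

    All intervals are (start, end) tuples, half-open as in BED files.
    """
    start, end = interval
    pieces = []
    cursor = start  # leftmost base not yet accounted for
    for r_start, r_end in sorted(to_remove):
        if r_end <= cursor:
            continue  # lies entirely before the part still to process
        if r_start >= end:
            break     # sorted input: nothing further can overlap
        if r_start > cursor:
            pieces.append((cursor, r_start))
        cursor = max(cursor, r_end)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


# A gene minus its CDS leaves the two flanking untranslated regions.
print(subtract_intervals((100, 1000), [(200, 800)]))  # [(100, 200), (800, 1000)]
```

Which of the two remaining pieces is the 3' UTR depends on the strand of the gene, so that still has to be selected afterwards.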

                      Best,
                      Christian

                      Comment


                      • v0.29.0 crashing issues

                        Originally posted by simonandrews View Post
                        We have just released seqmonk v0.29.0 onto the project web site.

                        This release adds a bunch of new features. Many of these are improvements to the chromosome view, specifically targeted at studies with large numbers of samples (we're getting a lot of single cell datasets these days). You can now display your quantitated data in some new ways, and can also look at the variability in the quantitations shown when you're displaying a replicate set.

                        We've also added some new probe generation and quantitation options which have proved to be useful for projects we've worked on recently.

                        A related release which prompted some of the new features is that we have also put up the documentation for our methylation analysis course. We'll be running this fairly regularly starting in the new year, but all of the material for the course is available for anyone who wants to look. The course isn't solely focussed on seqmonk for the visualisation and analysis, but this does make up the majority of the practicals, so anyone wanting to use seqmonk to look at methylation data might want to take a look at the material we've put up.
                        Simon,

                        I just wanted to let you know that I started using version 29 today, but I've had to force quit 3 times now because the program becomes non responsive. The issues begin after I generate an annotated probe report then try to use the report to navigate to regions of interest by double clicking them in the report. The program then freezes.

                        OSX 10.8.5 16GB RAM i7 core ( the program uses 10GB RAM)

                        Comment


                        • Originally posted by ctstackh View Post
                          Simon,

                          I just wanted to let you know that I started using version 29 today, but I've had to force quit 3 times now because the program becomes non responsive. The issues begin after I generate an annotated probe report then try to use the report to navigate to regions of interest by double clicking them in the report. The program then freezes.

                          OSX 10.8.5 16GB RAM i7 core ( the program uses 10GB RAM)
                          Hi,

                          How many probes do you have in your report?

                          Comment


                          • Hi Simon or anyone else,

                            Is it possible to edit the chromosome names in SeqMonk? I started my project using genomes downloaded from the Ensembl FTP site. As I proceeded I realised that I needed data from the UCSC Genome Browser (estimated CpG island positions, to be exact, which is an annotation track). The chromosome labelling is different between the two, so I want to edit the names. Is there any way of doing this?

                            regards

                            Ziyi
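
Independently of whether SeqMonk can rename chromosomes itself, one workaround is to rewrite the names in the annotation file before importing it. The Python sketch below assumes a tab-separated file with the chromosome in the first column and UCSC-style names ("chr1", "chrX", "chrM") that need to become Ensembl-style ("1", "X", "MT"). Names containing an underscore (unplaced or alternative sequences) are dropped, since their Ensembl equivalents cannot be derived by simple string editing. Both function names are made up.

```python
def ucsc_to_ensembl(name):
    """Translate a UCSC chromosome name to the Ensembl style, or None."""
    if name == "chrM":
        return "MT"  # the mitochondrial genome is named differently
    if name.startswith("chr") and "_" not in name:
        return name[3:]
    return None  # scaffolds, haplotypes etc. need a proper lookup table


def convert_lines(lines):
    """Rewrite column 1 of tab-separated lines, skipping untranslatable ones."""
    converted = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        new_name = ucsc_to_ensembl(fields[0])
        if new_name is None:
            continue
        fields[0] = new_name
        converted.append("\t".join(fields))
    return converted


islands = ["chr1\t28735\t29810\tCpG: 116", "chr6_apd_hap1\t10\t20\tCpG: 5"]
print(convert_lines(islands))
```

The converted lines can then be saved and loaded as a generic text annotation file.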

                            Comment


                            • Quantitating RRBS coverage with SeqMonk

                              Hi Simon and other SeqMonk developers,

                              Thanks so much for this powerful and comprehensive set of tools!

                              I'm using the bisulphite feature methylation pipeline for an RRBS dataset. Regarding the minimum number of observations required to include a base in the methylation calculation, the pipeline help page recommends:
                              "this value should be a fair reflection of the overall depth of coverage in your data"

                              Backing up a bit, can we use SeqMonk to find the overall depth of coverage for each CpG in the data? This is a stat that I would like to have anyway, as a measure of the quality of each RRBS library that I'm comparing. Initially I thought of creating a probe set containing all CpG dinucleotides, quantitating over these and generating a coverage histogram, but I'm not sure whether this is an option.

                              Any help would be greatly appreciated.

                              Thank you!
                              jd
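
Outside SeqMonk, the per-CpG coverage distribution asked about above can be sketched in Python as follows. The input is assumed to be a dictionary of (methylated, unmethylated) call counts per position, however you choose to build it from your call files; the function names are made up.

```python
from collections import Counter


def coverage_histogram(per_cpg_counts):
    """Map each observed depth to the number of CpGs with that depth.

    per_cpg_counts: {position: (methylated_calls, unmethylated_calls)}
    """
    return Counter(meth + unmeth for meth, unmeth in per_cpg_counts.values())


def median_depth(histogram):
    """Lower median depth across all CpGs in the histogram."""
    total = sum(histogram.values())
    if total == 0:
        return None
    midpoint = (total + 1) // 2
    running = 0
    for depth in sorted(histogram):
        running += histogram[depth]
        if running >= midpoint:
            return depth


counts = {
    ("1", 100): (2, 1),
    ("1", 150): (0, 1),
    ("2", 55): (5, 5),
    ("2", 90): (1, 2),
}
histogram = coverage_histogram(counts)
print(sorted(histogram.items()))  # [(1, 1), (3, 2), (10, 1)]
print(median_depth(histogram))    # 3
```

Note that this only covers CpGs with at least one call. CpGs with no coverage at all never appear in the call files, so they would have to be added from the genome sequence if they matter for the comparison between libraries.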

                              Comment


                              • I've just released seqmonk v0.30.0 on the project web site.

                                This version adds an optional link between seqmonk and a local R installation to allow us to easily run some R based analyses seamlessly from within SeqMonk. Initially the applications we've implemented using this link are DESeq2 and EdgeR for RNA-Seq analysis and logistic regression for replicated bisulphite sequencing analysis. We're open to suggestions for what other packages or tests should be implemented so please shout out if there's anything you think would be particularly useful.

                                Comment
