  • Hello !!!

    Hi,

    I am Rashmi Kulkarni, and I am looking for some initial guidance on working with single-cell RNA-seq data generated with 10x Genomics and sequenced on an Illumina NextSeq. I have the raw data and am struggling to understand where to go from here. I am trying to find open-source programs to handle this. Can anybody direct me to resources on how to do this analysis from scratch, starting with raw reads?

    Thanks,
    Rashmi
    Last edited by GenoMax; 02-17-2017, 04:54 AM. Reason: as suggested in the reply to earlier post!

  • #2
    I recommend adapter-trimming with BBDuk and then mapping with BBMap to produce a SAM file. What you do with the SAM file depends on your experiment. There are programs like edgeR and DESeq for differential expression analysis between samples, but I'm not sure if they were designed with single-cell data in mind.

    You may want to read about RNA-seq in the Biostar Handbook for starters.



    • #3
      The analysis workflow depends on the scRNA-Seq protocol used for library prep. Some methods, such as Drop-Seq and 10x Genomics, require starting the analysis with open-source or free software. These may need other software at the final steps for data presentation, but generally they offer an end-to-end solution. For other methods it would be best to look at the papers that describe the method.

      PS. You will get better responses if you change the title of your post to represent the question.
      Last edited by nucacidhunter; 02-16-2017, 07:35 PM.



      • #4
        Yes, I have updated my title but it doesn't show up. Anyway, the scRNA-seq library was prepared using 10x Genomics technology. Could you please direct me to relevant resources on how, in general, to start from the raw reads? There are tutorials that start from count matrices, but hardly any show the initial steps.

        Thanks a lot,
        Rashmi



        • #5
          Edit: I will leave this post here, but it is not applicable to the original question since the original data is from 10x Genomics, not plain Illumina.

          The procedure of QC, scanning for and trimming adapters, and alignment is more or less the same for most *-seq analyses one may be doing.

          You can see sections 4 and 5 in this WikiBook for a general idea of the process. I recommend that you give FastQC (for QC) and then BBMap suite a try for the steps noted above. Both tools are easy to find/use and have extensive support here if you run into questions.
          Last edited by GenoMax; 02-17-2017, 05:57 AM.



          • #6
            Ok.. will follow those steps and post a question, if required.

            Thanks,
            Rashmi



            • #7
              Originally posted by Rashmi007
              Yes, I have updated my title but it doesn't show up. Anyway, the scRNA-seq library was prepared using 10x Genomics technology. Could you please direct me to relevant resources on how, in general, to start from the raw reads? There are tutorials that start from count matrices, but hardly any show the initial steps.

              Thanks a lot,
              Rashmi
              10x Genomics scRNA-Seq analysis requires BCL files (FASTQ files can be used with some extra steps). The data is processed through the Cell Ranger pipeline, and the results can be presented with Loupe Cell Browser (both are free and supported). Use the following link for the download and guide:

              https://support.10xgenomics.com/sing...erview/welcome

              They also have data sets that have been processed through their pipelines, which can be found at the following link:

              https://support.10xgenomics.com/single-cell/datasets

              The following files from the run folder are required for the 10x Cell Ranger pipeline:
              Data directory (BCL files for the lane)
              InterOp directory
              runParameters.xml
              RTAComplete.txt
              RunInfo.xml
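
Before starting a long pipeline run, it can help to confirm that the run folder is complete. The following is a minimal sketch, not part of Cell Ranger: the function name `missing_run_folder_entries` is hypothetical, and the entry names are taken directly from the list above (exact capitalization may differ between instruments, so treat the names as assumptions to adjust).

```python
from pathlib import Path

# Entries listed in the post above. "Data" and "InterOp" are directories;
# the rest are plain files at the top level of the run folder.
# Names and capitalization are assumptions copied from that list.
REQUIRED_DIRS = ("Data", "InterOp")
REQUIRED_FILES = ("runParameters.xml", "RTAComplete.txt", "RunInfo.xml")


def missing_run_folder_entries(run_folder):
    """Return the names of required entries that are absent from run_folder."""
    root = Path(run_folder)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    missing += [name for name in REQUIRED_FILES if not (root / name).is_file()]
    return missing
```

An empty list means every entry from the list was found; anything else names what still has to be copied from the sequencer.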

              Using any third-party software for the initial data processing will give non-optimal results, because reads originating from a single cell are barcoded, each transcript is marked with a UMI, and both are checked and corrected against a whitelist (they are not random 16- or 10-base barcodes or UMIs, respectively). They have very responsive tech support as well.
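
To illustrate what "checked and corrected against a whitelist" can mean for a cell barcode, here is a minimal sketch that accepts exact matches and rescues barcodes that are one mismatch away from a single whitelist entry. This is an assumption about the general idea only; it is not Cell Ranger's actual algorithm, and the function names and the toy whitelist in the usage note are hypothetical.

```python
def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist):
    """Return the whitelist barcode that `observed` most plausibly came from.

    An exact match is returned unchanged. Otherwise the barcode is corrected
    only if exactly one whitelist entry is a single mismatch away; ambiguous
    or more distant barcodes return None.
    """
    if observed in whitelist:
        return observed
    candidates = [
        bc for bc in whitelist
        if len(bc) == len(observed) and hamming_distance(bc, observed) == 1
    ]
    return candidates[0] if len(candidates) == 1 else None
```

For example, with a toy whitelist `{"AAAACCCC", "GGGGTTTT"}`, the observed barcode `"AAAACCCG"` is corrected to `"AAAACCCC"`, while `"ACGTACGT"` is rejected. A generic trimmer or aligner knows nothing about this step, which is one reason third-party preprocessing loses information.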
              Last edited by nucacidhunter; 02-19-2017, 05:08 PM. Reason: corrected required files name



              • #8
                @Rashmi: I only read your last post and missed the critical part that this is 10x data (Thanks @nucacidhunter for quoting the original post).

                I have not personally used cellranger, but if it is anything like their other software (longranger and supernova, which I have used) then it would not be a trivial thing to get going. You would need a good bit of hardware (preferably access to a cluster). If you are not tech-savvy then definitely talk with your local IT support first.

                Alternatively, you may want to see if the facility that did your 10x work would be willing to run some of these analyses and give you analyzed data that you can look at locally (with Loupe browser).
                Last edited by GenoMax; 02-17-2017, 05:58 AM.



                • #9
                  @GenoMax: Yes, that is precisely the problem. I did look at CellRanger as an option, but its system requirements are too much for my personal PC. We do have a CGC account where we could use it, but that requires wrapping the tool first, which is a bit tedious. That is why I was looking for options other than CellRanger.

                  @nucacidhunter: We have FASTQ files and not BCL files. What can be done in that case?

                  I am particularly interested in this, just to understand the basics: how do I find the UMIs and barcodes in a given sequence? Can I write a program to do that? Maybe this is too ambitious, but I just want to know where to get that information.

                  Thanks,
                  Rashmi
                  Last edited by Rashmi007; 02-17-2017, 12:22 PM.



                  • #10
                    I think your best option is to go back to whoever did the 10x for you and see if they will analyze the data (you may have to pay if this was done at a service facility). If this is a one-time run it would be the most cost/time effective solution for you.



                    • #11
                      Originally posted by Rashmi007
                      @nucacidhunter: We have FASTQ files and not BCL files. What can be done in that case?

                      I am particularly interested in this, just to understand the basics: how do I find the UMIs and barcodes in a given sequence? Can I write a program to do that? Maybe this is too ambitious, but I just want to know where to get that information.
                      Instructions for using FASTQ data files with Cell Ranger:

                      https://support.10xgenomics.com/sing...l2fastq-direct

                      Position of UMIs and barcodes: in the V1 kit, the barcode is read as index 1 (i7) and the UMI is the 10-base Read 2; in the V2 kit, the barcode is bases 1-16 and the UMI is bases 17-26 of Read 1.
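
To answer the "can I write a program" part for the V2 layout just described, here is a minimal sketch that slices the barcode and UMI out of a Read 1 sequence. The function names are hypothetical, the positions are the ones given above (1-based bases 1-16 and 17-26), and no whitelist checking or quality filtering is attempted.

```python
BARCODE_LEN = 16  # bases 1-16 of Read 1 (V2 layout as described above)
UMI_LEN = 10      # bases 17-26 of Read 1


def split_v2_read1(sequence):
    """Return (barcode, umi) taken from the start of a V2 Read 1 sequence."""
    if len(sequence) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode + UMI")
    barcode = sequence[:BARCODE_LEN]
    umi = sequence[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


def fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text lines."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip()
```

For a real file you could pass an open text handle to `fastq_sequences` (for gzipped FASTQ, `gzip.open(path, "rt")` from the standard library) and call `split_v2_read1` on each sequence. This only shows where the information sits; it does not replace the correction steps in the 10x pipeline.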

                      For troubleshooting and more information I would suggest getting in contact with 10x tech support.

                      As GenoMax has pointed out, the easiest way would be to ask the place that prepared and sequenced the libraries to do the preliminary analysis; you can then use the Cell Ranger output files for further analysis if you need to. They should do it for free, as they would have some interest in the performance of the platform and in evaluating their technical skills.



                      • #12
                        I was reading about other options and came across kallisto. By pseudoaligning reads to a reference transcriptome, kallisto generates a transcript count matrix. They give an example for FASTQ files from the 10x Genomics platform (https://pachterlab.github.io/kallisto/10xstarting.html). They have developed Python scripts to handle the barcodes and UMIs in 10x Genomics data. Can I use the kallisto pipeline to generate count matrices? Any suggestions on this?
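
For readers wondering how barcodes and UMIs end up as a count matrix, the core idea is to count distinct UMIs per cell barcode and gene, so that PCR duplicates of the same molecule are counted once. The sketch below assumes each read has already been assigned to a gene by some aligner or pseudoaligner; the `(barcode, umi, gene)` tuple format and the function name are hypothetical, and this is not the kallisto scripts' implementation.

```python
from collections import defaultdict


def umi_count_matrix(assigned_reads):
    """Count distinct UMIs per (cell barcode, gene).

    assigned_reads: iterable of (barcode, umi, gene) tuples.
    Returns a nested dict: {barcode: {gene: number_of_distinct_umis}}.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in assigned_reads:
        # Reads sharing barcode, gene and UMI collapse into one molecule.
        umis_seen[(barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (barcode, gene), umis in umis_seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)
```

Real pipelines additionally correct sequencing errors in barcodes and UMIs before counting, which this sketch leaves out.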

                        Thanks,
                        Rashmi



                        • #13
                          Your best option is definitely to process the raw data using Cell Ranger, the free proprietary software from 10x Genomics. It uses STAR for mapping. I would create a new genome index, because the one they offer for download on their website is generated from outdated builds.

                          I would not trust Cell Ranger's analysis beyond the QC readouts. After generating the raw and filtered UMI matrices, process the unfiltered matrix with an R package like Seurat or Monocle 2. Both are very easy to use, were designed to be compatible with data from droplet devices like 10x, and can give you more reliable results and more control over your workflow than Cell Ranger will.
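
Starting from the unfiltered matrix means you decide yourself which barcodes count as cells. A minimal sketch of that idea, assuming the matrix is held as a nested dict `{barcode: {gene: umi_count}}`: compute total UMIs and detected genes per barcode and keep those above chosen cutoffs. The function names and the default thresholds are hypothetical placeholders, not values recommended by Cell Ranger, Seurat or Monocle.

```python
def per_cell_metrics(counts):
    """Return {barcode: (total_umis, genes_detected)} for a count dict."""
    return {
        barcode: (
            sum(genes.values()),
            sum(1 for n in genes.values() if n > 0),
        )
        for barcode, genes in counts.items()
    }


def keep_cells(counts, min_umis=500, min_genes=200):
    """Keep barcodes whose total UMIs and detected genes meet both cutoffs.

    The default cutoffs are illustrative assumptions; choose them by
    inspecting the distribution of the metrics in your own data.
    """
    metrics = per_cell_metrics(counts)
    return {
        barcode: genes
        for barcode, genes in counts.items()
        if metrics[barcode][0] >= min_umis and metrics[barcode][1] >= min_genes
    }
```

In practice you would plot these metrics first and pick the cutoffs from the data; the R packages mentioned above provide their own functions for this kind of filtering.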



                          • #14
                            Hi hideandSEQ,

                            We now have the matrices. I have a question about QC of the data. Clearly we cannot do QC on how the alignment was performed? You also mention starting with the unfiltered matrices, but I thought the starting point would be the filtered matrices? Where can I find a logical explanation of which QC steps are required and why? I am already working through the Seurat demo tutorial, and I will have a look at Monocle.

                            Thanks,
                            Rashmi
