  • thermophile
    Senior Member
    • Apr 2015
    • 243

    BaseSpace update

    Illumina replaced MiSeq Reporter (MSR) with bcl2fastq as the demultiplexer for MiSeq data on BaseSpace a couple of weeks ago. Since then I've run into a number of cases where MSR and bcl2fastq behave differently, which has meant many demultiplexing failures. To save people time, I thought I'd start a list of the issues I've hit.

    1. MSR allowed "." in sample names and sample IDs; bcl2fastq does not.


    2. MSR treats "N" as a wildcard; bcl2fastq treats it as an exact base. I run a mix of dual 8 bp and TruSeq LT single-index 6 bp libraries. MSR had allowed me to just put NNNNNNNN as the i5 and NN at the end of the i7; this no longer works. You have to use the actual sequences (AT at the end of the i7 and TCTTTCCC for the i5).


    3. bcl2fastq (or BaseSpace) is much, much slower at demultiplexing. It used to take <30 min to rerun a sample sheet; it's taking >4 hours now.


    4. bcl2fastq doesn't allow you to set the index mismatch (at least, the tech support person I talked to didn't know how to set it globally). It tries to allow 1 mismatch and only drops to exact matching if the Hamming distance is <3.
    Last edited by thermophile; 09-20-2016, 07:06 AM.
    Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.
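Items 1 and 2 in the list above can be caught before a sample sheet is uploaded. The following is only a minimal sketch of such a preflight check, assuming the two rules behave as described in the post; the function names and the example sample names are made up for illustration, and replacing "." with "-" is just one possible substitution.

```python
def check_sample(sample_id, sample_name, i7, i5=""):
    """Return a list of problems matching items 1 and 2 of the post."""
    problems = []
    # Item 1: "." in the sample ID or sample name is rejected by bcl2fastq.
    for label, value in (("Sample_ID", sample_id), ("Sample_Name", sample_name)):
        if "." in value:
            problems.append(f"{label} '{value}' contains '.'")
    # Item 2: N in an index is matched as an exact base, not as a wildcard.
    for label, index in (("i7", i7), ("i5", i5)):
        if "N" in index.upper():
            problems.append(f"{label} index '{index}' contains N")
    return problems


def sanitize(name):
    """Replace '.' with '-' (an assumed, illustrative substitution)."""
    return name.replace(".", "-")


# Hypothetical sample sheet rows: one written the old MSR way, one fixed.
old_style = check_sample("soil.1", "soil.1", "ATCACGNN", "NNNNNNNN")
new_style = check_sample(sanitize("soil.1"), sanitize("soil.1"), "ATCACGAT", "TCTTTCCC")
for problem in old_style:
    print(problem)
```

The second row uses the real sequences mentioned in item 2 (AT at the end of the i7, TCTTTCCC for the i5) and should come back with no problems.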
  • kcchan
    Senior Member
    • Jul 2012
    • 186

    #2
    In addition, the BaseSpace apps are no longer free. You must now subscribe to a Professional account to access the apps, and also pay each time you run them.


    • microgirl123
      Senior Member
      • Jun 2012
      • 199

      #3
      Interesting about having to pay to use all of the apps. We're going to have some unhappy customers!


      • thermophile
        Senior Member
        • Apr 2015
        • 243

        #4
        Well, that sucks. I just talked a few users into trying BaseSpace based on the NCBI_SRA app.

        • fanli
          Senior Member
          • Jul 2014
          • 197

          #5
          Originally posted by thermophile View Post
          3. bcl2fastq (or BaseSpace) is much, much slower at demultiplexing. It used to take <30 min to rerun a sample sheet; it's taking >4 hours now.
          Perhaps this is related to the number of cores you allow bcl2fastq to use?

          Originally posted by thermophile View Post
          4. bcl2fastq doesn't allow you to set the index mismatch (at least, the tech support person I talked to didn't know how to set it globally). It tries to allow 1 mismatch and only drops to exact matching if the Hamming distance is <3.
          From the bcl2fastq --help text:
          Code:
            --barcode-mismatches arg (=1)     number of allowed mismatches per index
                                              multiple entries, comma delimited entries, allowed;
                                              each entry is applied to the corresponding index;
                                              last entry applies to all remaining indices
          There is also this, which I have no idea what it does:
          Code:
            --adapter-stringency arg (=0.9)   adapter stringency
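The "1 mismatch unless the Hamming distance is <3" behaviour quoted above follows from a general rule: allowing k mismatches per index is only unambiguous when every pair of indexes differs in at least 2k + 1 positions. Here is a minimal sketch of that check, useful for deciding what to pass to --barcode-mismatches. It illustrates the rule only and is not bcl2fastq's actual implementation; the function names are made up, and it needs at least two indexes of equal length.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length indexes differ."""
    if len(a) != len(b):
        raise ValueError("indexes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


def safe_mismatches(indexes, requested=1):
    """Largest k <= requested such that every pair is at least 2k + 1 apart."""
    distance = min_pairwise_distance(indexes)
    k = requested
    while k > 0 and distance < 2 * k + 1:
        k -= 1
    return k


well_separated = ["ATCACG", "CGATGT", "TTAGGC"]  # closest pair differs at 4 positions
too_close = ["ATCACG", "ATCACT"]                 # the pair differs at 1 position

print(safe_mismatches(well_separated))
print(safe_mismatches(too_close))
```

For a dual-index run, the same check could be run once per index read and the two results joined with a comma, since the help text says each entry applies to the corresponding index.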


          • kcchan
            Senior Member
            • Jul 2012
            • 186

            #6
            Originally posted by microgirl123 View Post
            Interesting about having to pay to use all of the apps. We're going to have some unhappy customers!
            I would imagine some of the third-party developers aren't too happy either, since their apps are now stuck behind a paywall.


            • thermophile
              Senior Member
              • Apr 2015
              • 243

              #7
              $5k to upgrade to Professional, which gives you the privilege of paying for apps.

              • thermophile
                Senior Member
                • Apr 2015
                • 243

                #8
                Originally posted by fanli View Post
                Perhaps this is related to the number of cores you allow bcl2fastq to use?


                From the bcl2fastq --help text:
                Code:
                  --barcode-mismatches arg (=1)     number of allowed mismatches per index
                                                    multiple entries, comma delimited entries, allowed;
                                                    each entry is applied to the corresponding index;
                                                    last entry applies to all remaining indices
                There is also this, which I have no idea what it does:
                Code:
                  --adapter-stringency arg (=0.9)   adapter stringency
                Thanks! I'll have to see if I can change something in the sample sheet to set this.

                • GenoMax
                  Senior Member
                  • Feb 2008
                  • 7142

                  #9
                  Originally posted by thermophile View Post
                  Thanks! I'll have to see if i can change something in the sample sheet to set this
                  Or switch to using bcl2fastq locally instead of BaseSpace
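For anyone taking the local route, here is a rough sketch of how the invocation might be assembled, with the mismatch setting and thread count from earlier in the thread made explicit. The long option names are the ones I recall from the bcl2fastq2 help text, so check `bcl2fastq --help` for your version. The paths and the helper function name are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def bcl2fastq_command(run_folder, output_dir, sample_sheet,
                      barcode_mismatches=(1,), threads=None):
    """Build a bcl2fastq argument list (option names assumed from the v2 help text)."""
    cmd = [
        "bcl2fastq",
        "--runfolder-dir", run_folder,
        "--output-dir", output_dir,
        "--sample-sheet", sample_sheet,
        # One entry per index read, comma delimited, as in the quoted help text.
        "--barcode-mismatches", ",".join(str(m) for m in barcode_mismatches),
    ]
    if threads is not None:
        cmd += ["--processing-threads", str(threads)]
    return cmd


# Hypothetical paths; allow 1 mismatch on the first index and 0 on the second.
cmd = bcl2fastq_command(
    "/data/runs/my_run",
    "/data/fastq/my_run",
    "/data/runs/my_run/SampleSheet.csv",
    barcode_mismatches=(1, 0),
    threads=8,
)
print(shlex.join(cmd))
```

To actually run it, the list can be handed to `subprocess.run(cmd, check=True)` on a machine where bcl2fastq is installed.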


                  • thermophile
                    Senior Member
                    • Apr 2015
                    • 243

                    #10
                    Originally posted by GenoMax View Post
                    Or switch to using bcl2fastq locally instead of BaseSpace
                    I may have to do that, but that means I'll have to build a server for distributing the data to clients.

                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #11
                      Originally posted by thermophile View Post
                      I may have to do that, but that means I'll have to build a server for distributing the data to clients.
                      If you are part of an academic institution, look into tapping a common central compute resource. That way you won't need to become a sysadmin in addition to the other hats you wear (and won't have to worry about security, etc.). If your users use that central compute resource, they would appreciate getting their data delivered to them directly.


                      • fanli
                        Senior Member
                        • Jul 2014
                        • 197

                        #12
                        Originally posted by GenoMax View Post
                        If you are part of an academic institution, look into tapping a common central compute resource. That way you won't need to become a sysadmin in addition to the other hats you wear (and won't have to worry about security, etc.). If your users use that central compute resource, they would appreciate getting their data delivered to them directly.
                        My experience has been that you always still need a bit of sysadmin experience to configure things exactly the way you like. For example, how do you add a new user/client for data access? Sometimes it's just easier to htpasswd it yourself


                        • elutheria
                          Junior Member
                          • Jul 2013
                          • 1

                          #13
                          The only charge for the apps is the cost of the compute on AWS, unless it is a third-party app that costs money to run. Also, there are still Free accounts that come with some credits, so you can trial BaseSpace. If your clients plan on using it a lot for analysis, they will need to upgrade; otherwise, I think they can still receive the data on a free account.


                          • Geneus
                            Member
                            • Dec 2010
                            • 60

                            #14
                            Originally posted by elutheria View Post
                            The only charge for the apps is the cost of the compute on AWS, unless it is a third-party app that costs money to run. Also, there are still Free accounts that come with some credits, so you can trial BaseSpace. If your clients plan on using it a lot for analysis, they will need to upgrade; otherwise, I think they can still receive the data on a free account.
                            Or buy an Illumina sequencer and negotiate free use of BaseSpace for some time period as part of the deal.


                            • ScottC
                              Senior Member
                              • Jan 2008
                              • 244

                              #15
                              A couple of things:

                              - BaseSpace will still do basecalling for free from instrument runs.
                              - The newest version of bcl2fastq2 will now treat N bases properly (as 'wildcards', so to speak).

                              Also, surely this doesn't come as a surprise... as far as I'm aware, it was pretty well communicated a long time ago that it was going to become a pay-per-use service.

                              Cheers,

                              Scott.
