
  • #31
    Hi Simon,

    I have an Intel Mac running OS X 10.6 with Python 2.6. As far as I know, matplotlib is unavailable for 10.6 with Python 2.6. I give up. The matplotlib website asks me to tinker with system files, which I don't want to tamper with. Thanks anyway for your help.

    Abhijit

    • #32
      Hi Abhijit

      Originally posted by gen2prot View Post
      I have an Intel Mac running OS X 10.6 with Python 2.6. As far as I know, matplotlib is unavailable for 10.6 with Python 2.6.
      Yours seems to be a pretty standard configuration, so it would be a pity if matplotlib were not available. If you have Xcode installed, building from source might work. Just try:

      Code:
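      # Download the source tarball; -O names the saved file, since the URL ends in /download
      wget -O matplotlib-0.99.1.2.tar.gz http://sourceforge.net/projects/matplotlib/files/matplotlib/matplotlib-0.99.1/matplotlib-0.99.1.2.tar.gz/download
      # Unpack the tarball and change into the extracted source directory
      tar -xzvf matplotlib-0.99.1.2.tar.gz
      cd matplotlib-0.99.1.1/
      # Build, then install system-wide
      python setup.py build
      sudo python setup.py install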
      Simon

      • #33
        Got it to work... Coffee helps.

        • #34
          Hi Simon,

          I used BWA to do the alignment. If a read maps to a chromosomal junction (the end of one chromosome and the beginning of another), BWA sets flag = 4, but you can still see the other fields ("chr", "CIGAR", "MAPQ", etc.) in the SAM file. This causes a problem for htseq-qa: it cannot process an alignment that has flag = 4 but a mapping position that is not "*". One option is to pre-process my SAM file and exclude all such alignments before passing it to htseq-qa, but I wonder whether this requirement can be turned off in htseq-qa so that I don't have to change my SAM file beforehand. Thank you very much in advance for your advice!

          Yuan

          • #35
            Hi Simon,

            I made a Drosophila Gene Index file and aligned all the Solexa reads against it. I have a 5.4 GB SAM file. Can I use the htseq-qa program on this SAM file, or will it crash because the reads were aligned to genes rather than chromosomes? Secondly, how can I get the read counts in my case? The GTF file that I have gives start and end coordinates with respect to the genome, so it won't work with this SAM output. Any suggestions?

            thanks
            Abhijit

            • #36
              Hi Yuan

              Originally posted by yh253 View Post
              I used BWA to do the alignment. If a read maps to a chromosomal junction (the end of one chromosome and the beginning of another), BWA sets flag = 4, but you can still see the other fields ("chr", "CIGAR", "MAPQ", etc.) in the SAM file. This causes a problem for htseq-qa: it cannot process an alignment that has flag = 4 but a mapping position that is not "*". One option is to pre-process my SAM file and exclude all such alignments before passing it to htseq-qa, but I wonder whether this requirement can be turned off in htseq-qa so that I don't have to change my SAM file beforehand. Thank you very much in advance for your advice!
              I've changed the SAM parser such that it now only writes a warning rather than stopping with an error if this case is encountered. Please try again with version 0.4.4p2.

              However, I don't quite understand how such SAM lines come about. What is a chromosomal junction? How could a read get mapped partly to one and partly to another chromosome?

              Cheers
              Simon

              • #37
                Hi Abhijit

                Originally posted by gen2prot View Post
                I made a Drosophila Gene Index file and aligned all the Solexa reads against it. I have a 5.4 GB SAM file. Can I use the htseq-qa program on this SAM file, or will it crash because the reads were aligned to genes rather than chromosomes?
                Just try it out. But it should work. htseq-count does not care what you have aligned against.

                Secondly, how can I get the read counts in my case? The GTF file that I have gives start and end coordinates with respect to the genome, so it won't work with this SAM output. Any suggestions?
                So, basically, wherever you usually have a chromosome name (i.e., in the third column of your SAM file) you now have a gene name, and you want to count how many reads fall onto each gene? Why don't you just cut out this third column and count how often each gene appears there?
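
The suggestion above (tallying the third SAM column) might look like this in Python. It is only a sketch under stated assumptions: the function name `count_reads_per_reference` and the gene names in the example are hypothetical, the input is assumed to be plain-text SAM, and a read that appears in several alignment lines is counted once per line.

```python
from collections import Counter


def count_reads_per_reference(lines):
    """Tally how often each reference name (SAM column 3) occurs,
    skipping header lines and unmapped records."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue                      # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue                      # malformed or empty line
        flag = int(fields[1])
        rname = fields[2]
        if flag & 0x4 or rname == "*":
            continue                      # read is unmapped
        counts[rname] += 1
    return counts


# Made-up example in which gene names stand in for chromosome names.
example = [
    "@SQ\tSN:geneA\tLN:500\n",
    "read1\t0\tgeneA\t10\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t16\tgeneA\t40\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read3\t0\tgeneB\t7\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
counts = count_reads_per_reference(example)
```

Passing an open file handle instead of the list works the same way, and `counts.most_common()` lists the references by decreasing read count.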

                Of course, your approach has other dangers. For example, how does your aligner handle multiple transcripts? If the same exon appears in several transcripts, the aligner might think it is a repeat.

                Cheers
                Simon

                • #38
                  Hi Simon,

                  Thanks very much for your reply! Quoted from BWA: "Internally BWA concatenates all reference sequences into one long sequence. A read may be mapped to the junction of two adjacent reference sequences. In this case, BWA will flag the read as unmapped, but you will see position, CIGAR and all the tags". This has happened to me: some reads mapped to the end of chrY and the beginning of chrM when I used BWA. However, I didn't see these mappings with Bowtie. This might arise from the mapping strategy adopted by BWA, which converts "N" in the reference sequence into random bases. During mapping, BWA must have converted the "N"s at the end of chrY into random bases that happened to match the beginning of some reads.
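
One way to check whether a given record is such a junction-spanning alignment is to compare the end of the aligned region with the length of the reference sequence, as given in the `LN` field of the `@SQ` header line. The following is a rough Python sketch, not BWA's or HTSeq's own logic; the helper names `reference_span` and `spans_reference_end` and the reference length of 1000 are assumptions made for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment; only the
    operations M, D, N, = and X consume reference positions."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def spans_reference_end(pos, cigar, ref_length):
    """True if an alignment starting at 1-based position `pos` would
    extend past the last base of a reference of length `ref_length`."""
    last_base = pos + reference_span(cigar) - 1
    return last_base > ref_length


# Hypothetical reference of 1000 bases and 36-base reads.
overhanging = spans_reference_end(990, "36M", 1000)   # ends at 1025
contained = spans_reference_end(965, "36M", 1000)     # ends at 1000
```

Records for which the check is true are the ones that would run off the end of one reference sequence and into the next in the concatenated sequence.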

                  Cheers,
                  Yuan
                  Last edited by yh253; 06-05-2010, 06:17 AM.

                  • #39
                    Hi Simon,
                    I've used HTSeq-counts to extract raw counts from Cufflinks output, and it worked correctly for 40/42 of my samples. For some reason, two samples raised the error:

                    Error: 'generator' object has no attribute 'get_line_number_string'
                    [Exception type: AttributeError, raised in count.py:126]


                    I haven't been able to find anything wrong with the data in those two samples. Any ideas? Thank you!

                    • #40
                      Hi

                      Originally posted by rkusko View Post
                      I've used HTSeq-counts to extract raw counts from Cufflinks output, and it worked correctly for 40/42 of my samples. For some reason, two samples raised the error
                      Could you maybe send me (by e-mail to anders(at)embl(dot)de) some excerpts from the data that produce the error? Then I could investigate.

                      Cheers
                      Simon

                      • #41
                        I checked the HTSeq website:
                        A framework to process and analyze data from high-throughput sequencing (HTS) assays

                        There is no binary version available. Could you give me a download link?

                        Thank you

                        Wei

                        • #42
                          Originally posted by townway View Post
                          There is no binary version available. Could you give me a download link?
                          For which operating system?

                          • #43
                            Linux 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux


                            Thank you

                            • #44
                              Apologies if this is a series of dumb questions.

                              I see HTSeq can read SAM files & assume it can also read BAM files. Can it retrieve BAM files from a remote (FTP or HTTP) location as samtools can? Is it using the samtools code or have you reimplemented this in Python?

                              thanks

                              Keith R.

                              • #45
                                Hi Simon,

                                I have a question about the qval in DESeq when I analyze two samples without biological replicates. How does DESeq calculate the pval in that case? I have read your paper, but the formulas are too complicated for me. Could you explain this to me?

                                Thanks a lot!
                                Wei
