  • Hi,

    I got the following strange pileup results and I am at a loss as to what to do.
    My data is RNA-Seq and the SAM file was generated with TopHat (ver. 1.0.14).
    Has anyone had a similar result, or does anyone have an idea how to solve this problem?

    Thanks
    Corthay

    -----------------------------------------------------------
    Supercontig4 3775441 A A 36 0 60 3 .., EGD
    Supercontig4 3775442 A A 36 0 60 3 .., FGC
    Supercontig4 3775443 C C 36 0 60 3 .., EGB
    Supercontig4 3775444 G G 36 0 60 3 .., FGD
    Supercontig4 3775445 G G 36 0 60 3 .., GGD
    Supercontig4 3775446 A A 36 0 19 30 ^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G^"G.., G>GCFGEGFGDGGGD>?E=3EEF#BE?AEB
    Supercontig4 3775447 A A 117 0 19 30 ............................., G?GCGGEGGGDGGGBCBFB<BBG#?DBEGE
    Supercontig4 3775448 G G 36 0 19 30 TTTTTTTTTTTTTTTTTTTTTTTTTTT.., F>G5EDBBEGDFGGBA?D@-B?D#??AEGE
    Supercontig4 3775449 A A 36 0 19 30 CCCCCCCCCCCCCCCCCCCCCCCCCCC.., G:GBFGCGGGCGGFECCA:?B5D?BDCEGE
    Supercontig4 3775450 A A 36 0 19 30 GGGGGGGGGGGGGGGGGGGGGGGGGGG.., G<G?GGEGGG?GGGDCEB::BAFA@ECBDD
    Supercontig4 3775451 T T 36 0 19 30 AAAAAAAAAAAAAAAAAAAAAAAAAAA.., GBGDGGEGGG?GGGEBCDB?:?GA?G?BGA
    -----------------------------------------------------------
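
To make the read-bases column above easier to interpret, here is a minimal Python sketch that decodes it. It relies on the pileup conventions: `.` and `,` are matches to the reference, letters are mismatches, `^` marks the first base of a read and is followed by one character holding that read's mapping quality (ASCII code minus 33), and `$` marks the last base of a read. The function name `decode_pileup_bases` is hypothetical and not part of samtools.

```python
import re
from collections import Counter


def _strip_indels(bases):
    """Remove inline indel descriptions such as +2AG or -1t."""
    out = []
    i = 0
    while i < len(bases):
        if bases[i] in "+-":
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            if j == i + 1:
                # No length follows the sign; skip the stray character.
                i += 1
                continue
            # Skip the sign, the length digits and the indel sequence itself.
            i = j + int(bases[i + 1:j])
        else:
            out.append(bases[i])
            i += 1
    return "".join(out)


def decode_pileup_bases(ref_base, bases):
    """Return (base call counts, mapping qualities of reads starting here)."""
    # The character after '^' is a mapping quality, not a base call.
    start_mapqs = [ord(q) - 33 for q in re.findall(r"\^(.)", bases)]
    stripped = re.sub(r"\^.", "", bases).replace("$", "")
    stripped = _strip_indels(stripped)
    calls = Counter()
    for ch in stripped:
        if ch in ".,":
            calls[ref_base.upper()] += 1   # match on forward/reverse strand
        elif ch.upper() in "ACGTN":
            calls[ch.upper()] += 1         # mismatch
    return calls, start_mapqs


if __name__ == "__main__":
    calls, mapqs = decode_pileup_bases("A", '^"G^"G..,')
    print(dict(calls), mapqs)
```

Applied to the line for position 3775446, each `^"G` would be read as a read that starts at this position with mapping quality 1 and carries a G where the reference has an A. So the block of mismatches comes from a stack of low-mapping-quality reads that all begin at the same coordinate.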



    • Hi Ih3,

      Is there any utility, script, or tool for converting SAM format to the SOAP alignment output format? I am using SOAPsnp, which accepts only the SOAP aligner format.



      • which script adds the header

        Just wondering which script was used to add the header after the maq2sam-long conversion.

        Originally posted by aby
        Okay, I have solved my problems. Seems there is a script to add the header file, and different command options for conversion to Bam.



        • @bioenvisage

          I would encourage you to use SAMtools, GATK, SNVMix, or VarScan. SOAPsnp is great, but the others are as good and easier to use.



          • fasta2sam converter

            Dear All,

            I have generated a mapping assembly in fasta format and I now need to convert it into the sam format accepted by samtools.
            Would anybody know a fasta2sam converter I could get access to?

            Thank you very much for your assistance.

            Carole



            • Originally posted by carole_smadja
              Dear All,

              I have generated a mapping assembly in fasta format and I now need to convert it into the sam format accepted by samtools.
              Would anybody know a fasta2sam converter I could get access to?

              Thank you very much for your assistance.

              Carole

              That isn't possible - FASTA files don't have enough information to describe an assembly, they can only be used to store the raw reads (without qualities or mapping information) or the contigs (without qualities or mapping information).

              What assembly tool did you use? Does it have any other output files?
              Last edited by maubp; 10-22-2010, 04:12 AM. Reason: fixed typo



              • I carried out a NimbleGen array capture experiment followed by 454 sequencing. I first used gsMapper to get a mapping assembly (output: ACE and FASTA alignments). However, I then performed a series of subsequent manipulations: a second assembly using SSAHA2 (FASTA output), some curation steps, and a division of the initial alignment into segments of invariable depth of coverage (still as FASTA). What would you recommend?

                Thanks
                Carole



                • Originally posted by carole_smadja
                  I carried out a NimbleGen array capture experiment followed by 454 sequencing. I first used gsMapper to get a mapping assembly (output: ACE and FASTA alignments). However, I then performed a series of subsequent manipulations: a second assembly using SSAHA2 (FASTA output), some curation steps, and a division of the initial alignment into segments of invariable depth of coverage (still as FASTA). What would you recommend?

                  Thanks
                  Carole

                  Search for ACE to SAM/BAM conversion. It is possible, but you will also
                  need the original SFF files (or FASTQ or QUAL) for the read qualities.



                  • I used maq2sam-long to convert MAQ output to SAM format, but all pairing information is missing from the results: MRNM is "*", and MPOS and ISIZE are 0. Can you recommend how to recover this information? Thanks, Mei



                    • The MACS distribution includes three scripts: elandexport2bed.py, elandmulti2bed.py, and elandresult2bed.py

                      These might be more up-to-date than your scripts and easier to use in some cases.



                      • What is the safest way to reheader a BAM file generated by an alignment to the human_g1k_v37.fasta genome, i.e.
                        cat GRCh37/human_g1k_v37.fasta.fai
                        1 249250621 52 60 61
                        2 243199373 253404903 60 61
                        3 198022430 500657651 60 61
                        4 191154276 701980507 60 61
                        5 180915260 896320740 60 61
                        6 171115067 1080251307 60 61
                        7 159138663 1254218344 60 61
                        8 146364022 1416009371 60 61
                        9 141213431 1564812846 60 61
                        10 135534747 1708379889 60 61
                        11 135006516 1846173603 60 61
                        12 133851895 1983430282 60 61
                        13 115169878 2119513096 60 61
                        14 107349540 2236602526 60 61
                        15 102531392 2345741279 60 61
                        16 90354753 2449981581 60 61
                        17 81195210 2541842300 60 61
                        18 78077248 2624390817 60 61
                        19 59128983 2703769406 60 61
                        20 63025520 2763883926 60 61
                        21 48129895 2827959925 60 61
                        22 51304566 2876892038 60 61
                        X 155270560 2929051733 60 61
                        Y 59373566 3086910193 60 61
                        MT 16569 3147273397 70 71
                        GL000207.1 4262 3147290265 60 61
                        GL000226.1 15008 3147294661 60 61
                        GL000229.1 19913 3147309982 60 61
                        GL000231.1 27386 3147330289 60 61
                        GL000210.1 27682 3147358194 60 61
                        GL000239.1 33824 3147386400 60 61
                        GL000235.1 34474 3147420850 60 61
                        GL000201.1 36148 3147455961 60 61
                        GL000247.1 36422 3147492774 60 61
                        GL000245.1 36651 3147529866 60 61
                        GL000197.1 37175 3147567190 60 61
                        GL000203.1 37498 3147605047 60 61
                        GL000246.1 38154 3147643232 60 61
                        GL000249.1 38502 3147682084 60 61
                        GL000196.1 38914 3147721290 60 61
                        GL000248.1 39786 3147760915 60 61
                        GL000244.1 39929 3147801427 60 61
                        GL000238.1 39939 3147842084 60 61
                        GL000202.1 40103 3147882751 60 61
                        GL000234.1 40531 3147923585 60 61
                        GL000232.1 40652 3147964854 60 61
                        GL000206.1 41001 3148006246 60 61
                        GL000240.1 41933 3148047993 60 61
                        GL000236.1 41934 3148090687 60 61
                        GL000241.1 42152 3148133382 60 61
                        GL000243.1 43341 3148176299 60 61
                        GL000242.1 43523 3148220425 60 61
                        GL000230.1 43691 3148264736 60 61
                        GL000237.1 45867 3148309218 60 61
                        GL000233.1 45941 3148355912 60 61
                        GL000204.1 81310 3148402681 60 61
                        GL000198.1 90085 3148485409 60 61
                        GL000208.1 92689 3148577058 60 61
                        GL000191.1 106433 3148671355 60 61
                        GL000227.1 128374 3148779625 60 61
                        GL000228.1 129120 3148910202 60 61
                        GL000214.1 137718 3149041537 60 61
                        GL000221.1 155397 3149181614 60 61
                        GL000209.1 159169 3149339664 60 61
                        GL000218.1 161147 3149501549 60 61
                        GL000220.1 161802 3149665445 60 61
                        GL000213.1 164239 3149830007 60 61
                        GL000211.1 166566 3149997047 60 61
                        GL000199.1 169874 3150166453 60 61
                        GL000217.1 172149 3150339222 60 61
                        GL000216.1 172294 3150514304 60 61
                        GL000215.1 172545 3150689533 60 61
                        GL000205.1 174588 3150865017 60 61
                        GL000219.1 179198 3151042578 60 61
                        GL000224.1 179693 3151224826 60 61
                        GL000223.1 180455 3151407577 60 61
                        GL000195.1 182896 3151591103 60 61
                        GL000212.1 186858 3151777111 60 61
                        GL000222.1 186861 3151967147 60 61
                        GL000200.1 187035 3152157186 60 61
                        GL000193.1 189789 3152347402 60 61
                        GL000194.1 191469 3152540418 60 61
                        GL000225.1 211173 3152735142 60 61
                        GL000192.1 547496 3152949898 60 61
                        to something that would be more compatible with the UCSC-centric Bioconductor stack, i.e.

                        cat hg19.fasta.fai
                        chr1 249250621 6 50 51
                        chr2 243199373 254235646 50 51
                        chr3 198022430 502299013 50 51
                        chr4 191154276 704281898 50 51
                        chr5 180915260 899259266 50 51
                        chr6 171115067 1083792838 50 51
                        chr7 159138663 1258330213 50 51
                        chr8 146364022 1420651656 50 51
                        chr9 141213431 1569942965 50 51
                        chr10 135534747 1713980672 50 51
                        chr11 135006516 1852226121 50 51
                        chr12 133851895 1989932775 50 51
                        chr13 115169878 2126461715 50 51
                        chr14 107349540 2243934998 50 51
                        chr15 102531392 2353431536 50 51
                        chr16 90354753 2458013563 50 51
                        chr17 81195210 2550175419 50 51
                        chr18 78077248 2632994541 50 51
                        chr19 59128983 2712633341 50 51
                        chr20 63025520 2772944911 50 51
                        chr21 48129895 2837230949 50 51
                        chr22 51304566 2886323449 50 51
                        chrX 155270560 2938654113 50 51
                        chrY 59373566 3097030091 50 51
                        chrM 16571 3157591135 50 51
                        So I want to toss the unscaffolded and haplotyped sequences and rename the rest.
                        --
                        Jeremy Leipzig
                        Bioinformatics Programmer



                        • Perhaps try using the reheader option in samtools. I think you can select the reads you want using the view option and then use the reheader option.
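
As a rough illustration of the filter-then-rename idea, here is a minimal Python sketch that works on SAM text (header plus records) rather than on the BAM directly. It assumes the input comes from something like `samtools view -h` and is converted back to BAM afterwards. The function name `convert_sam_lines` and the `RENAME` table are hypothetical. One caveat taken from the two listings in the question: MT is 16569 bp while chrM is 16571 bp, so they are not the same sequence. The sketch therefore drops MT along with the GL contigs instead of renaming it. It also drops any read whose mate sits on a dropped contig, and it does not touch optional tags that may mention contig names.

```python
# Contigs to keep and their new names: 1..22, X and Y get a "chr" prefix.
# MT is deliberately left out because its length differs from chrM.
RENAME = {str(n): f"chr{n}" for n in range(1, 23)}
RENAME.update({"X": "chrX", "Y": "chrY"})


def convert_sam_lines(lines, rename=RENAME):
    """Yield SAM lines with contigs renamed and unlisted contigs dropped."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@SQ"):
            fields = line.split("\t")
            name = next(f[3:] for f in fields if f.startswith("SN:"))
            if name not in rename:
                continue  # drop the @SQ line of an unwanted contig
            fields = [f"SN:{rename[name]}" if f.startswith("SN:") else f
                      for f in fields]
            yield "\t".join(fields)
        elif line.startswith("@"):
            yield line  # other header lines pass through unchanged
        else:
            fields = line.split("\t")
            rname, rnext = fields[2], fields[6]
            # '*' means no reference; '=' means mate is on the same contig.
            if rname != "*" and rname not in rename:
                continue
            if rnext not in ("*", "=") and rnext not in rename:
                continue
            if rname != "*":
                fields[2] = rename[rname]
            if rnext not in ("*", "="):
                fields[6] = rename[rnext]
            yield "\t".join(fields)


if __name__ == "__main__":
    import sys
    for out_line in convert_sam_lines(sys.stdin):
        print(out_line)
```

Because the kept @SQ lines stay in their original order, a coordinate-sorted input should remain sorted. It is still worth validating and re-indexing the result before handing it to downstream tools.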



                          • Does anyone know what has happened to indel lines in the new samtools mpileup format? In the old pileup, each indel is followed by an additional line carrying some further info - when I run mpileup (samtools 0.1.12a) I don't get any such lines. The old pileup also has a flag ("-i") for outputting only indel variants - there does not seem to be such an option for mpileup.
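
For context, the pileup text format encodes indels inline in the read-bases column as `+<length><sequence>` or `-<length><sequence>`, so positions with indel-supporting reads can be picked out by scanning that column. The Python sketch below does this under the assumption of single-sample, six-column output (chromosome, position, reference base, depth, read bases, qualities). The names `indels_in_bases` and `indel_lines` are hypothetical. This only finds positions where at least one read carries an indel; it does not call variants.

```python
import re

INDEL = re.compile(r"([+-])(\d+)")


def indels_in_bases(bases):
    """Return (sign, sequence) pairs for every indel in a read-bases column."""
    # Drop read-start markers first: the character after '^' is a mapping
    # quality and may itself be '+' or '-'.
    bases = re.sub(r"\^.", "", bases)
    found = []
    pos = 0
    while True:
        match = INDEL.search(bases, pos)
        if match is None:
            break
        length = int(match.group(2))
        seq = bases[match.end():match.end() + length]
        found.append((match.group(1), seq.upper()))
        # Continue after the indel sequence so its letters are not rescanned.
        pos = match.end() + length
    return found


def indel_lines(lines):
    """Yield only the pileup lines whose read-bases column contains an indel."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and indels_in_bases(fields[4]):
            yield line


if __name__ == "__main__":
    import sys
    for kept in indel_lines(sys.stdin):
        sys.stdout.write(kept if kept.endswith("\n") else kept + "\n")
```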



                            • samtools index and alternative alignments

                              I would like to index/search the alternative alignments found by bwa with samtools.

                              Currently it seems that samtools index does not index the alternative alignments from the XA tag. It would be great if this worked (maybe as an option).

                              Another possibility would be for bwa to output alternative alignments as another sam line.

                              Are either of these possible or planned for future releases?

                              Andrew
                              Last edited by andrewm; 10-15-2012, 07:44 AM.
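
Until something like that exists, one workaround is to expand the XA tag yourself before indexing. The Python sketch below covers only the parsing step, assuming the bwa layout `XA:Z:chr,pos,CIGAR,NM;` repeated once per hit, with a signed position giving the strand. The function name `parse_xa` is hypothetical. Turning each hit into a full SAM record would also need the flag adjusted and the sequence reverse-complemented when the strand differs, which is left out here.

```python
def parse_xa(sam_line):
    """Return alternative hits as (rname, strand, pos, cigar, nm) tuples."""
    hits = []
    # Optional tags start at the 12th tab-separated column of a SAM record.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if not field.startswith("XA:Z:"):
            continue
        for entry in field[5:].split(";"):
            if not entry:
                continue  # the tag value ends with a trailing ';'
            # Split from the right so a comma in a contig name cannot
            # shift the last three fields.
            rname, pos, cigar, nm = entry.rsplit(",", 3)
            strand = "-" if pos.startswith("-") else "+"
            hits.append((rname, strand, abs(int(pos)), cigar, int(nm)))
    return hits


if __name__ == "__main__":
    record = ("r1\t0\tchr1\t100\t0\t4M\t*\t0\t0\tACGT\tIIII\t"
              "NM:i:0\tXA:Z:chr2,-500,4M,1;chr3,+42,4M,0;")
    for hit in parse_xa(record):
        print(hit)
```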



                              • Can anyone please tell me how I can install SAMtools on my Windows operating system? Help please. Thanks.
