
  • #31
    yippee

    one down , more to go...

    I have found a solution for the problem above. I'm not sure whether it is a good one, but I am sure it isn't mentioned anywhere in the manuals.

    I switched the AppArmor profile for mysqld to complain mode like this:
    $ sudo aa-complain /usr/sbin/mysqld
    [sudo] password:
    Setting /usr/sbin/mysqld to complain mode.
    $ sudo /etc/init.d/apparmor reload
    Reloading AppArmor profiles : done.
    This enabled mysqld to find/read the files.

    I really did underestimate what the ChIA-PET guys told me, that it is complicated to install, but I won't give up!

    Comment


    • #32
      Originally posted by guoliang
      Good to know you have fixed the issue. The manual is outdated. An updated version of the manual should be expected.
      Hi Guoliang, I am wondering whether there are any recent updates to the ChIA-PET analysis pipeline, or should I stick to the version developed in 2010? Thanks a lot for making the ChIA-PET technology available to everyone.

      Comment


      • #33
        Hi yilongzou,

        there is a somewhat newer version of chiapet here from March 2012.

        But I must admit that this one is not a better version. I spent a lot of time trying to understand how it works and it still doesn't function correctly.
        The documentation is still the old one, with no update in sight.

        I really hope you can figure it out better than I did. Please let me know if you run into any problems, I am sure we can fix them together.

        The pipeline is running now, but my browser (gbrowse or gbrowse2) doesn't work.

        Assa

        Comment


        • #34
          Hi Assa,
          thank you very much for trying to help. I have figured out the linker filter and the mapping step, but the chiapet.py command constantly gives errors. For example, when I set -run 2 (the second step out of eight):

          Traceback (most recent call last):
            File "chiapet.py", line 252, in <module>
              retcode, msg = main()
            File "chiapet.py", line 172, in main
              run_v3(uniqfile, opts.lib, libworkdir, opts.asm, infofile, util.shell)
            File "~/Applications/package-v1/chiapet-pipeline-r261/src/python/main/v3_runner.py", line 69, in run_v3
              config=confile, output=worklib))
            File "~/Applications/package-v1/chiapet-pipeline-r261/src/python/common/util.py", line 157, in __init__
              raise Exception("Execution failed: '{0}'".format(cmd))
          Exception: Execution failed:

          Have you seen these errors before? I read your previous comments and there are similar errors. How did you fix them?
          Thanks again,
          Yilong

          Comment


          • #35
            Sorry for the late response. I have only just seen your mail.

            There are several possible causes for the error. Can you post the complete log file (output) of the run as well as the command you used?

            In my v3_runner.py script line 69 is empty, but I have changed the script so often that I am not sure what was originally on this line. Please also send me the snippet of the script that contains line 69.

            I will try to find out where the problem is this time.

            Thanks
            Assa

            Comment


            • #36
              Did you solve the problem?

              I would really be interested in keeping this channel open, as I still have some problems with the tool.
              I managed to make it run all the way through and it also produces some result files (though not all of them).
              But the biggest problem is that I can't get any display in the browser.

              Gbrowser just doesn't work. I am not sure what the correct options are, but as far as I understand them I did everything right. I have the chiapet tool and I can see some things, but as soon as I go to the browser, I keep getting
              Code:
              The requested URL /gbrowse2/library_Name was not found on this server.
              I would be happy to know whether you managed to run the Gbrowser and, if so, how.

              Thanks
              Assa

              Comment


              • #37
                Hi frymor,

                thank you for all the information on trying to get this pipeline to run. I am also trying to analyze our paired-end ChIA-PET data with this tool, but it seems I haven't managed to make all the necessary changes yet.

                I am still having problems getting the mapping script to finish. It seems I have a problem similar to the one you used to have, where the merging of the .bat files is unsuccessful. What I have done so far is the following:

                1) I compiled and installed the "new" LinkerFilter class and changed the csa_mapper.py to use this one (LGL.chiapet.LinkerFilter) instead of the pre-packaged sg.edu.astar.gis.chiapet.LinkerFilter. The pre-packaged one gave me problems when doing 2)

                2) Like you, I changed the options to start this filter in csa_mapper.py to
                Code:
                t = tshell('''{i} {script} {fasq1} {fasq2} {output} {link1} {link2}
                			--flip-tail --bar-start_1 7 --bar-start_2 7 --bar-length_1 4
                			--bar-length_2 4 1>/dev/null 2>&1'''
                since my barcode is 4 bases long and starts at position 7

                3) I modified the script to skip the deletion of some intermediary files since, for a reason I couldn't figure out, I got permission denied errors which would cause the script to fail.

                To start the processing/mapping I run the following command (I used the linker set b to generate these libraries and also created the index for the mm10 assembly)

                Code:
                python csa_mapper.py --lib=test --run=3-4 --head=head.fastq --tail=tail.fastq --linker=linker_b --asm=mm10
                The filtering seems to work fine. For one homodimer file I get the following output:
                Code:
                NTAGTAAAGACTGGGACAAG    CAGGCATATGGACATAGAAA
                NTAGTAAAGACTGGGACAAGAGTTGGAATGTATATCGCGGCCGCGATATA      GTATATCGCGGCCGCGATATACATTCCAACCAGGCATATGGACATAGAAA
                AGAGTTGGAATGTATATCGCGGCCGCGATATA        TGGTTGGAATGTATATCGCGGCCGCGATATAC
                Score: 38
                AGAGTTGGAATGTATATCGCGG  TGGTTGGAATGTATATCGCGG   ATGT    ATGT
                ---GTTGGAATGTATATCGCGG  --GTTGGAATGTATATCGCGG
                   |||||||||||||||||||    |||||||||||||||||||
                3       22
                0       19
                2       21
                0       19
                I would assume the two "ATGT"s in this case represent the barcode found in the mates, which is fine as well.
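To check the barcode positions by hand, here is a minimal Python sketch. It assumes (this is inferred from the output above, not from the LinkerFilter documentation) that the barcode start is a 1-based position within the aligned linker sequence and that the barcode length is given in bases. The function name extract_barcode is hypothetical.

```python
def extract_barcode(linker_seq, bar_start=7, bar_length=4):
    """Return the barcode from the aligned linker portion of a read.

    Assumption: bar_start is 1-based, as the filter output above suggests.
    """
    begin = bar_start - 1  # convert to a 0-based index
    end = begin + bar_length
    if end > len(linker_seq):
        raise ValueError("linker sequence is too short for the requested barcode")
    return linker_seq[begin:end]


# Aligned linker portion taken from the filter output shown above.
aligned_linker = "GTTGGAATGTATATCGCGG"
print(extract_barcode(aligned_linker))
```

With start 7 and length 4 this picks out the same "ATGT" that the filter reports for both mates.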

                Would you have any tips in trying to get this script to finish? I sure would like to help out in getting the Gbrowser to work in the end as well.

                Cheers

                Comment


                • #38
                  Originally posted by frymor
                  I would like to know how seqmonq works with this kind of data.

                  I have two fastq files, which I can't map, as they still have the two linkers inside them.

                  Did you work with ChIA-PET data in seqmonq?
                  Hi Frymor, this is probably a little off topic now but if you're still interested here is a rough overview of how SeqMonk works with interaction data.

                  Firstly, you need to process and align the data. This isn't done with SeqMonk, instead we typically use another program written in-house called HiCUP (see also the quick start and manual). This does a bunch of quality control steps as well, and gives you a nice report. It's designed for HiC data, but I see no reason why it shouldn't work for ChIA-PET.

                  Once you have your processed and aligned BAM file, you can import it into SeqMonk, checking the 'treat as HiC data' check box. Each ditag will have the two ends displayed as reads within SeqMonk. Many of the methods within SeqMonk will treat them as traditional single reads (which is usually quite useful - for instance, quantifying read position relative to restriction enzyme sites and so on). However, there are a number of HiC specific functions. These include heatmaps and pulling out '4C other ends' from a region, that is, the corresponding ends from fragments found in a region of interest.

                  I hope that all makes sense - Simon has put together a couple of YouTube screencasts about HiC data: "Initial QC and preparation of HiC data in SeqMonk" and "Analysis of HiC interactions in SeqMonk".

                  Shout if you have any questions.

                  Phil (BI bioinformatician doing HiC analysis until recently)

                  Comment


                  • #39
                    Sorry for the late response.

                    I am happy to hear it is running now. As I mentioned earlier, I do not use the “older” (but to be honest, not necessarily worse) version of chiapet, but the newer one from, I think, 2012. Here I have a script named mapper.py (instead of csa_mapper.py), which has only 4 steps. Then I have the chiapet.py script with 6 steps.

                    My library is single-end rather than paired-end, so I also use a separate “plug-in” to filter the linkers and create the link files, which the original mapper script does in its first three steps.

                    So I use a Java plug-in for the linker filtering (FilterLinker) and then do the mapping with the mapper.py script.

                    In the last few runs I have run into a problem, but strangely only with one file. I run a separate analysis for each linker pair, so I have four different files (AA, AB, BA and BB). AA means that chiapet identified the two linkers in the pair as linkerA-linkerA.
                    When I run my chiapet.py script for the AA pairs I keep getting this error message:

                    Code:
                    2014-05-14 12:06:25,650 ERROR [ChIA-PET/bacillus_1_SAA] Execution failed: 'cat /export/Assa/projects/Anita_3C/chiapet/work/bacillus_1_SAA/bacillus_1_SAA.ipet | java -cp /export/Assa/projects/Anita_3C/chiapet/bin:/export/Assa/projects/Anita_3C/chiapet/lib/java/commons-cli-1.2.jar:/export/Assa/projects/Anita_3C/chiapet/lib/java/guava-r05.jar -Xmx2G LGL.chiapet.PetCluster /export/Assa/projects/Anita_3C/chiapet/data/genome/size/NC_000964.3.txt bacillus_1_SAA 100000 1750'
                    2014-05-14 12:06:25,651 ERROR [ChIA-PET/bacillus_1_SAA] ??? Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 42
                    at LGL.chiapet.PetCluster.getProb3(PetCluster.java:737)
                    at LGL.chiapet.PetCluster.weightMatrixScore3(PetCluster.java:672)
                    at LGL.chiapet.PetCluster.generateClusters(PetCluster.java:372)
                    at LGL.chiapet.PetCluster.<init>(PetCluster.java:178)
                    at LGL.chiapet.PetCluster.main(PetCluster.java:749)
                    
                    Traceback (most recent call last):
                      File "/export/Assa/projects/Anita_3C/chiapet/src/python/main/chiapet.py", line 249, in <module>
                        retcode, msg = main()
                      File "/export/Assa/projects/Anita_3C/chiapet/src/python/main/chiapet.py", line 174, in main
                        run_v3(uniqfile, args.lib, libworkdir, args.asm, infofile, shell)
                      File "/export/Assa/projects/Anita_3C/chiapet/src/python/main/v3_runner.py", line 177, in run_v3
                        label='Finding clusters on Intra and Inter PETs')
                      File "/export/Assa/projects/Anita_3C/chiapet/src/python/common/exec_util.py", line 61, in __call__
                        return super(TimedShell, self).__call__(cmd, *args, **kwargs)
                      File "/export/Assa/projects/Anita_3C/chiapet/src/python/common/exec_util.py", line 46, in __call__
                        raise Exception("Execution failed: '{0}'".format(cmd))
                    Exception: Execution failed: 'cat /export/Assa/projects/Anita_3C/chiapet/work/bacillus_1_SAA/bacillus_1_SAA.ipet | java -cp /export/Assa/projects/Anita_3C/chiapet/bin:/export/Assa/projects/Anita_3C/chiapet/lib/java/commons-cli-1.2.jar:/export/Assa/projects/Anita_3C/chiapet/lib/java/guava-r05.jar -Xmx2G LGL.chiapet.PetCluster /export/Assa/projects/Anita_3C/chiapet/data/genome/size/NC_000964.3.txt bacillus_1_SAA 100000 1750'
                    I can’t figure out what this means.
                    The ArrayIndex message doesn't look right to me, as I have only 42 elements in the array.
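One possible reading of the message: Java arrays are zero-based, so an array with 42 elements only has valid indices 0 to 41, and the number in ArrayIndexOutOfBoundsException is the offending index rather than the array length. Python lists behave the same way for non-negative indices, which the small sketch below illustrates. Which array PetCluster is indexing here cannot be told from the log alone, and the helper name safe_get is hypothetical.

```python
# A list with 42 elements has valid indices 0..41, just like a Java array.
values = list(range(42))


def safe_get(seq, index):
    """Return seq[index], or None when the index is out of range."""
    try:
        return seq[index]
    except IndexError:
        return None


print(len(values))           # number of elements
print(safe_get(values, 41))  # last valid index
print(safe_get(values, 42))  # one past the end, so None
```

So an index of 42 being rejected would be consistent with an array that holds exactly 42 elements.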

                    When I run the same script (with no changes) for the other three files (AB, BA and BB), it runs through with no problems. For lack of a manual I don’t even know where to start looking for this error.

                    Do you have any idea?

                    Assa

                    Comment


                    • #40
                      Hi Assa,

                      wish I could help you, but I haven't tried using the updated pipeline at all so far. Not that it helps, but from what I can tell, this error for one of the files occurs at step 3 of chiapet.py, when clusters are made to distinguish between self-ligating and inter-ligating PETs?

                      Is there a specific reason you use the "updated" pipeline? As I've mentioned to you in an email, I got the original one working all the way through the mapping and the 8 steps of chiapet.py for my sample now, with the exception of one specific substep in step 7, for which a completely out-of-date R package (rimage) is needed. I spent some time trying to install it from the sources but was unsuccessful, so I am just skipping the plotting of that matrix for now by commenting out the corresponding lines in stats_runner.py. So as long as we can get your data properly formatted into the .link files, it should be no problem to use the old pipeline for it. Actually, a simple flattened file with one line per pair of ends, tab-separated, should be sufficient for that.
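As a rough illustration of such a flattened file, the Python sketch below reads two FASTQ files in lockstep and writes one tab-separated line per read pair. The function names, the output file name and the assumption that both files list the mates in the same order are illustrative only. Whether the pipeline's .link files need additional columns is not confirmed here.

```python
import itertools


def read_sequences(fastq_path):
    """Yield the sequence line of each 4-line FASTQ record."""
    with open(fastq_path) as handle:
        # The sequence is the second line of every 4-line record.
        for line in itertools.islice(handle, 1, None, 4):
            yield line.strip()


def write_pairs(head_fastq, tail_fastq, out_path):
    """Write 'head<TAB>tail' per read pair and return the number of pairs."""
    count = 0
    with open(out_path, "w") as out:
        pairs = zip(read_sequences(head_fastq), read_sequences(tail_fastq))
        for head, tail in pairs:
            out.write(f"{head}\t{tail}\n")
            count += 1
    return count
```

A call such as write_pairs("head.fastq", "tail.fastq", "pairs.txt") would then produce one line per pair. Note that zip stops at the shorter file, so unequal read counts are silently truncated.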

                      I am now working on getting the Gbrowser to work, but also in parallel I am trying to convert the output into formats that can be used by other browsers as well. I am however a wet-lab person and only do the bioinformatics analyses for my data on the side, so I don't know how quickly I will manage to make progress there.

                      Cheers

                      Comment


                      • #41
                        To be honest, I switched to the newer version because I was hoping it would run more smoothly.
                        The error happens when I run the PetCluster.java script.
                        What irritates me is the fact that it happens in only one of the four files, which all use the same parameters.

                        If you want, I can send you the rimage package (rimage_0.5-8.2.tar.gz). I installed it manually (R CMD INSTALL) from the command line after downloading it locally. I am working with R 2.14.1 under my chiapet pipeline.

                        Assa

                        Comment


                        • #42
                          Hi,

                          I've tried to install it from source, but I cannot manage to get the necessary fftw headers to be found. I tried it with several different versions of R, 2.14.1 being one of them, but I'll try again. The problem is that this machine is multi-user and I've installed the headers into an include folder in my home directory. So it is very possible that I'm just not managing to get R to search there instead of the system folders.

                          I will make an archive of the pipeline I am currently using with the changes I've made and will send it to you for testing.

                          Cheers

                          Comment


                          • #43
                            OK, I got it installed by giving the --with-fftw-include and --with-fftw-lib paths. So finally the whole pipeline runs all the way through.

                            I'm on to the browser issue now.

                            Comment


                            • #44
                              Hi,
                              Can somebody help me with using the ChIA-PET tool? My first question is related to the CTCF ChIA-PET data the authors published in 2011.
                              Although the experiment is paired-end, there is only one file in GEO (GSE28247), with reads of length 72 bp. The spot-descriptor tag describes the layout as: fwd 1-36 and rev 37-72.
                              Does this mean that I have to split this file into fwd and rev tags and then use them as input to the ChIA-PET tool as head and tail sequences for mapping?

                              Many thanks,
                              Munazah
                              Last edited by SMZA; 10-22-2016, 07:53 AM.
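If splitting turns out to be the right approach, a minimal Python sketch could look like this. It cuts each 72 bp read and its quality string into bases 1-36 (head) and 37-72 (tail), following the layout given in the spot descriptor. The function and file names are hypothetical, and whether the tail half also needs to be reverse-complemented for the pipeline is not settled here.

```python
def split_record(header, seq, plus, qual, split_at=36):
    """Split one FASTQ record into head and tail records at split_at."""
    head = (header, seq[:split_at], plus, qual[:split_at])
    tail = (header, seq[split_at:], plus, qual[split_at:])
    return head, tail


def split_fastq(in_path, head_path, tail_path, split_at=36):
    """Write the first split_at bases to head_path and the rest to tail_path."""
    n = 0
    with open(in_path) as src, \
            open(head_path, "w") as head_out, \
            open(tail_path, "w") as tail_out:
        while True:
            # Read one 4-line FASTQ record (header, sequence, '+', quality).
            record = [src.readline().rstrip("\n") for _ in range(4)]
            if not record[0]:
                break  # end of file
            head, tail = split_record(*record, split_at=split_at)
            head_out.write("\n".join(head) + "\n")
            tail_out.write("\n".join(tail) + "\n")
            n += 1
    return n
```

Both output files keep the original read name, so the mates stay in the same order and can be matched up again later.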

                              Comment


                              • #45
                                Originally posted by SMZA
                                Hi,
                                Can somebody help me with using the ChIA-PET tool? My first question is related to the CTCF ChIA-PET data the authors published in 2011.
                                Although the experiment is paired-end, there is only one file in GEO (GSE28247), with reads of length 72 bp. The spot-descriptor tag describes the layout as: fwd 1-36 and rev 37-72.
                                Does this mean that I have to split this file into fwd and rev tags and then use them as input to the ChIA-PET tool as head and tail sequences for mapping?

                                Many thanks,
                                Munazah
                                Hi,

                                I downloaded the ChIA-PET data from https://www.ncbi.nlm.nih.gov/sra/SRX1210752, which was provided with "CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription. Tang et al. Cell, 2015, 163(7): 1611-1627.".

                                The authors state that the DNA products were subjected to size selection and paired-end sequencing (2×150 bp) on an Illumina Hi-Seq 2500.

                                However, the read lengths in the downloaded fastq range from 70 to 302 bp. This confused me a lot.

                                Do you have any suggestions?

                                Thanks
                                Cao

                                Comment
