What I want to do now is go from an Entrez gene_id to the relevant “sequence features” – the size of the coding region, length of the gene, chromosome, RNA length, and so on. It doesn’t look like the info is archived locally, so the possible avenues are BioMart or NCBI via web services to [...]
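For the NCBI route, a minimal sketch of building an E-utilities `esummary` request for a batch of Entrez gene IDs could look like this. The helper name and the example IDs are illustrative assumptions, nothing is fetched here, and which of the wanted features the summary actually contains would still need checking.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esummary endpoint.
EUTILS_ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def gene_summary_url(gene_ids):
    """Build an esummary URL for a batch of Entrez gene IDs (hypothetical helper)."""
    params = {
        "db": "gene",                              # query the Gene database
        "id": ",".join(str(g) for g in gene_ids),  # comma-separated ID list
        "retmode": "json",                         # ask for a JSON response
    }
    return EUTILS_ESUMMARY + "?" + urlencode(params)


if __name__ == "__main__":
    # Example IDs only; replace with the gene_ids of interest.
    print(gene_summary_url([672, 7157]))
```

Batching the IDs into one request keeps the number of web-service calls down, which matters if the full gene list is long.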
Archive for the ‘Uncategorized’ Category
Gene Properties to Disease
Posted in Uncategorized on April 15, 2009
Work Report
Posted in Uncategorized on February 23, 2009
Bugfix – propagating the makefile generation code. Makefiles are no longer generated separately from the files that use them.
BUG TO FIX – AUC computation doesn’t stop on error. Perhaps we should make this a submake, so we can compute all the scores simultaneously?
Computing – gene-genevalues for old
TODO – biopython MDMR compute. Use MDMR data in [...]
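The "doesn’t stop on error" behaviour noted above can be illustrated with a minimal sketch that runs a list of commands and aborts at the first non-zero exit. How the real AUC step is invoked is not shown in these notes, so the `run_steps` helper and the stand-in commands are assumptions for illustration only.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; raise on the first non-zero exit."""
    completed = []
    for cmd in steps:
        # check=True raises CalledProcessError on failure,
        # so the steps after a failing one never run.
        subprocess.run(cmd, check=True)
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    # Stand-in commands: one that succeeds and one that exits with code 3.
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    try:
        run_steps([ok, bad, ok])
    except subprocess.CalledProcessError as err:
        print("stopped at failing step, exit code", err.returncode)
```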
Progress Report
Posted in Uncategorized on February 10, 2009
Mouse general background compute completed. Disease/GeneBG compute in progress.
cmp-digenei2 for mouse being run
TODO: adapt cmp-digenei2 for BG computes, grep mus results
Weekend Work
Posted in Uncategorized on February 8, 2009
GeneBG seems to be OK; trying to get the diseaseBG off the ground. Need to investigate why PMID 9753684 is flagged with the MeSH terms “Chromosome Aberration” and “AGAMOUS Protein, Arabidopsis”, but isn’t being picked up by the mesh-disease…
Ah. I see the problem – it’s a matter of odd intersections in the tree – Mosaicism is [...]
New Results
Posted in Uncategorized on February 6, 2009
Seems pretty similar to the cmp-digenei result. Should note that CTD validation and training validation are unchanged, since those only depend on the predictions.
Mouse direct connections are done. Write a simple cmd-line tool to grep results? What other organisms would be useful use cases – yeast perhaps?
I really need a web interface – somewhere people [...]
Update
Posted in Uncategorized on February 5, 2009
digenei2 – use this for the new background tweaking
cmp-digenei2 running – added tmp directory stub (for auc.sh)
running mouse version on digenei3
Currently In Progress
Posted in Uncategorized on February 4, 2009
-mus TAXON_NAME test run in digenei1/ (new TAXON_NAME code)
-hum first compute in digenei3/
TO RUN – cmp-digenei2 once digenei3 is complete
TO FIX – reference to taxon_id=9606 (e.g. in direct_gd_predict.mk)
TO REVISE – maybe use comesh counts (comesh for disease) to get the background stats (save computation)
Mouse Organism Prediction
Posted in Uncategorized on February 3, 2009
Change organism references from hum to TAXON_FILTER. Hard-code the extraction of mouse genes, but everything else should follow pretty straightforwardly from that.
Extract the organism-specific filtering from the general computation (get_pval.mk) in direct-predict to the main Makefile, so that switching between organisms only requires changing the TAXON_FILTER.
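The idea of keeping the organism choice in one switchable setting can be sketched as follows. This is not the Makefile code itself: the `(gene_id, taxon_id)` record layout, the `filter_genes` helper, and the `hum`/`mus` keys are assumptions made for illustration. The taxon IDs are the standard NCBI Taxonomy identifiers.

```python
# NCBI Taxonomy IDs: 9606 is Homo sapiens, 10090 is Mus musculus.
TAXON_IDS = {"hum": 9606, "mus": 10090}


def filter_genes(records, taxon_filter):
    """Keep the gene IDs whose taxon matches the chosen organism."""
    wanted = TAXON_IDS[taxon_filter]
    return [gene_id for gene_id, taxon_id in records if taxon_id == wanted]


if __name__ == "__main__":
    # Made-up (gene_id, taxon_id) pairs for illustration.
    records = [(1, 9606), (2, 10090), (3, 9606)]
    print(filter_genes(records, "mus"))
```

With the filter applied once at the top level, the downstream computation never needs to mention a specific taxon ID.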
Status Update
Posted in Uncategorized on January 30, 2009
Paper: Skeleton of the methods section for the profile comparisons – we should update that with descriptions of the distance functions.
Fixes: Fixed a sorting error in the all-mesh-refs.txt code, which would have affected some p-value computations. Should probably double-check whether similar errors exist elsewhere (e.g. in the profile comparison code or the direct association code).
Computation: digenei3 is [...]
Bug Detected
Posted in Uncategorized on January 21, 2009
This needs to be fixed – badly. txt/direct-gene-disease/all-mesh-refs seems to have a sorting problem, especially obvious around Antigens, CD (and the various CD#s). The -g option does not do what I thought it did when I rewrote the BIGSORT macro.
This does affect a pretty high-level MeSH term count, though. Might be the ideal time to add [...]
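A quick way to catch this kind of problem is a sortedness check over the file. The sketch below assumes the file is meant to be in plain bytewise order (what `sort` gives under `LC_ALL=C`); the order the downstream code really expects may differ, and the `first_unsorted` helper is hypothetical.

```python
def first_unsorted(lines):
    """Return (index, previous, current) for the first out-of-order pair, or None."""
    previous = None
    for index, line in enumerate(lines):
        # Compare raw bytes so the order matches a bytewise (C locale) sort.
        key = line.rstrip("\n").encode("utf-8")
        if previous is not None and key < previous:
            return index, previous.decode("utf-8"), key.decode("utf-8")
        previous = key
    return None


if __name__ == "__main__":
    # In-memory stand-in for the lines of a sorted reference file.
    sample = ["Antigens, CD\n", "Antigens, CD1\n", "Antigens, CD2\n"]
    print(first_unsorted(sample))
```

Running a check like this right after the sort step would flag a broken ordering before it reaches the p-value computations.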