MDMR – Nicholas Schork. Multivariate distance matrix regression: explain a gene distance matrix using parameters (e.g. micro-array chip, or for us, MeSH annotation = disease). BioPython function exists. http://www.pnas.org/content/103/51/19430.abstract?ck=nck Profile comparison used to make the distance matrix? Also possible – look for PubMed co-citation (an independent distance matrix, not involving MeSH?). Regenerating GeneRIF. BG-profiles computed, appear to [...]
Archive for the ‘progress’ Category
WIP
Posted in progress on February 20, 2009
Towards Profile comparison results
Posted in progress, todo on April 22, 2008
Ballpark runtime figures – I took about 30 nuclear receptor genes and ran those against the 4000 MeSH Disease profiles – this took about 30 minutes. Extrapolating to the approximately 10000 human genes with GeneRIFs makes the runtime about 7 days. Plan for tomorrow is to pull out an example for and mock up the [...]
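The extrapolation above can be checked with a few lines of arithmetic. This is only a sketch: it assumes runtime scales linearly with the number of genes, and the variable names are made up for illustration.

```python
# Figures quoted in the post, treated as rough measurements.
sample_genes = 30        # nuclear receptor genes in the trial run
sample_minutes = 30      # wall-clock minutes for that run
total_genes = 10_000     # approximate human genes with GeneRIFs

# Assumption: cost per gene stays constant as the gene list grows.
minutes_per_gene = sample_minutes / sample_genes
total_minutes = minutes_per_gene * total_genes
total_days = total_minutes / (60 * 24)

print(f"{minutes_per_gene:.1f} min/gene, about {total_days:.1f} days in total")
# prints "1.0 min/gene, about 6.9 days in total"
```

That is where the "about 7 days" figure comes from: 10000 minutes is just under a week.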
Biting the Bullet
Posted in progress, SQL on June 11, 2007
A bit anticlimactic – I had previously waited forever for Entrez Gene and all the GeneRIFs to convert from XML to text files. So rather than doing that myself, I wrote the code to load from text files on the Entrez Gene FTP server. Have other people do most of my processing and all [...]
Biweekly Progress Report
Posted in agenda, progress on May 30, 2007
WASSERMAN/OUELLETTE LABORATORY SUMMARY FORM Name: Warren Cheung Date: 2007-05-30 Progress Draft of Proposal Modularity Setup for Entrez Gene to PubMed “muscle” subset Other Activities Reading statistics – multiple testing correction, Fisher/hypergeometric distribution. GO term analysis MSL Poster Goals Complete Proposal Gene-to-muscle analysis Model for Properties Roadblocks/Needs MySQL limitations [...]
Biweekly Progress Report
Posted in progress on May 11, 2007
Name: Warren A. Cheung Progress Code to download all TF genes (all species, including human) Can generate links between genes and PubMed articles up to 3 degrees of separation Other Activities First Committee Meeting Booked Qualifying Exam for August 9th, Room 174 at UBiC Completed first pass of related work for proposal Goals generate numerical [...]
Optimise
Posted in progress, SQL, todo on April 28, 2007
Sounds like a Transformers yell… "Transformers, Optimise!" Anyway, the thing to fix: all the CREATE TABLE AS SELECT… statements create tables without keys. No key is definitely bad. Trying this for the one that really matters – the gene_term_related_citations query. EXPLAIN seems happier, but proof will be in the pudding (or the running, in this case). Dang…optimised query [...]
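A minimal sketch of the missing-key problem, using Python's built-in sqlite3 as a stand-in for MySQL (an assumption made so the example runs anywhere; the MySQL syntax and EXPLAIN output differ). Only the table name gene_term_related_citations comes from the post; the gene_citation source table, its columns and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical source table that does have a key.
cur.execute(
    "CREATE TABLE gene_citation ("
    "gene_id INTEGER, pmid INTEGER, PRIMARY KEY (gene_id, pmid))"
)
cur.executemany(
    "INSERT INTO gene_citation VALUES (?, ?)",
    [(gene, pmid) for gene in range(100) for pmid in range(10)],
)

# CREATE TABLE ... AS SELECT copies the rows but not the keys or indexes.
cur.execute(
    "CREATE TABLE gene_term_related_citations AS "
    "SELECT gene_id, pmid FROM gene_citation"
)


def indexes(table):
    """Names of the indexes SQLite knows about for a table."""
    return [row[1] for row in cur.execute(f"PRAGMA index_list({table})")]


def plan(sql):
    """Human-readable query plan (SQLite's counterpart to EXPLAIN)."""
    rows = cur.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT pmid FROM gene_term_related_citations WHERE gene_id = 42"

indexes_before = indexes("gene_term_related_citations")
plan_before = plan(query)   # a full table scan, since there is no index

# Add the key after the fact.
cur.execute(
    "CREATE INDEX idx_gtrc_gene ON gene_term_related_citations (gene_id)"
)
indexes_after = indexes("gene_term_related_citations")
plan_after = plan(query)    # now a lookup through idx_gtrc_gene

print(indexes_before, "->", indexes_after)
print(plan_before)
print(plan_after)
```

The exact wording of the plan depends on the SQLite version, but the change from a scan to an index search is the same kind of improvement EXPLAIN reports in MySQL once the derived table has a key.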
A cunning plan
Posted in progress, SQL on April 26, 2007
Time to get cracking on the optimisation. The dual plans are to (1) expand the TF database with taxonids (and eventually load all of Entrez Gene) and (2) optimise queries using table indexes. Looks optimistic – I like orders of magnitude improvements. Just hacked the Entrez Gene fetcher to extract the taxonid from the XML. I should look [...]
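For illustration, pulling a taxon id out of a gene record might look roughly like the sketch below. This is not the fetcher from the post: the XML is a hand-written fragment modelled on the Entrez Gene layout (a Dbtag whose Dbtag_db is "taxon"), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; real records are far larger.
SAMPLE = """
<Entrezgene>
  <Entrezgene_source>
    <BioSource>
      <BioSource_org>
        <Org-ref>
          <Org-ref_taxname>Homo sapiens</Org-ref_taxname>
          <Org-ref_db>
            <Dbtag>
              <Dbtag_db>taxon</Dbtag_db>
              <Dbtag_tag>
                <Object-id>
                  <Object-id_id>9606</Object-id_id>
                </Object-id>
              </Dbtag_tag>
            </Dbtag>
          </Org-ref_db>
        </Org-ref>
      </BioSource_org>
    </BioSource>
  </Entrezgene_source>
</Entrezgene>
"""


def extract_taxon_id(xml_text):
    """Return the first taxon id in the record as an int, or None."""
    root = ET.fromstring(xml_text)
    for dbtag in root.iter("Dbtag"):
        # Only the cross-reference labelled "taxon" holds the taxonomy id.
        if dbtag.findtext("Dbtag_db") == "taxon":
            value = dbtag.findtext("Dbtag_tag/Object-id/Object-id_id")
            if value is not None:
                return int(value)
    return None


taxon_id = extract_taxon_id(SAMPLE)
print(taxon_id)  # prints 9606
```

Stored as its own column, a taxonid like this is also a natural candidate for one of the table indexes in the second half of the plan.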