MDMR (multivariate distance matrix regression) – Nicholas Schork. Explains a gene distance matrix using parameters (e.g. microarray chip or, for us, MeSH annotation = disease). A BioPython function exists.
http://www.pnas.org/content/103/51/19430.abstract?ck=nck
Profile comparison used to make the distance matrix? Also possible – look for PubMed co-citation (an independent distance matrix, not involving MeSH?)
Regenerating generif.
BG-profiles computed, appear to be less effective. Some [...]
Archive for the ‘progress’ Category
WIP
Posted in progress on February 20, 2009
Towards Profile comparison results
Posted in progress, todo on April 22, 2008
Ballpark runtime figures – I took about 30 nuclear receptor genes and ran them against the 4000 MeSH Disease profiles, which took about 30 minutes. Extrapolating linearly to the approximately 10,000 human genes with GeneRIFs puts the runtime at about 7 days.
Plan for tomorrow is to pull out an example for and mock up the distance [...]
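The ballpark extrapolation above can be checked with a few lines of arithmetic. This is only a sketch: it assumes runtime scales linearly with the number of genes, and the variable names are made up for illustration.

```python
# Figures quoted in the post (all approximate).
genes_timed = 30        # nuclear receptor genes in the trial run
minutes_taken = 30      # wall-clock time for the trial run
genes_total = 10_000    # human genes with GeneRIFs

# Assumption: cost per gene stays constant as the gene set grows.
minutes_per_gene = minutes_taken / genes_timed
total_minutes = minutes_per_gene * genes_total
total_days = total_minutes / (60 * 24)

print(f"{total_days:.1f} days")  # prints "6.9 days", i.e. about 7 days
```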
Biting the Bullet
Posted in SQL, progress on June 11, 2007
A bit anticlimactic – I had previously waited forever for Entrez Gene and all the GeneRIFs to convert from XML to text files. So rather than doing that myself, I wrote the code to load from text files on the Entrez Gene FTP server. Have other people do most of my [...]
Biweekly Progress Report
Posted in agenda, progress on May 30, 2007
WASSERMAN/OUELLETTE LABORATORY
SUMMARY FORM
Name: Warren Cheung
Date: 2007-05-30
Progress
Draft of Proposal
Modularity
Setup for Entrez Gene to PubMed “muscle” subset
Other Activities
Reading
statistics – multiple testing correction, Fisher/hypergeometric distribution.
GO term analysis
MSL Poster
Goals
Complete [...]
Biweekly Progress Report
Posted in progress on May 11, 2007
Name: Warren A. Cheung
Progress
Code to download all TF genes (all species, including human)
Can generate links between genes and PubMed articles up to 3 degrees of separation
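As a rough illustration of the "3 degrees of separation" item, a breadth-first search over a gene/article graph finds everything within a fixed number of hops. This is a sketch, not the code from the report: the toy graph, the `links_within` helper and the choice to count each gene-to-article link as one degree are all assumptions.

```python
from collections import deque

def links_within(graph, start, max_degrees=3):
    """Return {node: degrees} for every node within max_degrees hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_degrees:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # the start node is not a "link"
    return distances

# Hypothetical toy data: genes cite articles, articles mention genes.
graph = {
    "gene:A": ["pmid:1"],
    "pmid:1": ["gene:A", "gene:B"],
    "gene:B": ["pmid:1", "pmid:2"],
    "pmid:2": ["gene:B", "gene:C"],
    "gene:C": ["pmid:2", "pmid:3"],
    "pmid:3": ["gene:C"],
}

reachable = links_within(graph, "gene:A")
print(reachable)  # gene:C is four hops away, so it is left out
```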
Other Activities
First Committee Meeting
Booked Qualifying Exam for August 9th, Room 174 at UBiC
Completed first pass of related work for proposal
Goals
Generate numerical results linking genes to brain disease MeSH terms, via [...]
Optimise
Posted in SQL, progress, todo on April 28, 2007
Sounds like a Transformers yell… “Transformers, Optimise!”
Anyway, the thing to fix: all the CREATE TABLE AS SELECT… statements create tables without keys. No key is definitely bad. Trying this for the one that really matters – the gene_term_related_citations query. EXPLAIN seems happier, but proof will be in the pudding (or [...]
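The keyless-table problem is easy to reproduce in miniature. The sketch below uses Python's built-in sqlite3 module, not whichever database the post was written against, so it stays self-contained. The table and index names (`gene_citation`, `gene_citation_copy`, `idx_copy_gene`) are hypothetical, and SQLite's EXPLAIN QUERY PLAN stands in for EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical source table: one row per (gene, PubMed citation) link.
cur.execute(
    "CREATE TABLE gene_citation ("
    "gene_id INTEGER, pmid INTEGER, PRIMARY KEY (gene_id, pmid))"
)
cur.executemany(
    "INSERT INTO gene_citation VALUES (?, ?)",
    [(g, p) for g in range(100) for p in range(g, g + 5)],
)

# CREATE TABLE ... AS SELECT copies the rows but none of the keys or indexes.
cur.execute(
    "CREATE TABLE gene_citation_copy AS SELECT gene_id, pmid FROM gene_citation"
)
indexes_on_copy = cur.execute("PRAGMA index_list('gene_citation_copy')").fetchall()

def plan(sql):
    """Join the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in cur.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT pmid FROM gene_citation_copy WHERE gene_id = 42"
plan_before = plan(query)   # full scan: nothing to look gene_id up in

# Adding the missing key after the fact lets the planner use it.
cur.execute("CREATE INDEX idx_copy_gene ON gene_citation_copy (gene_id)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan scans the whole table and the second names the new index.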
A cunning plan
Posted in SQL, progress on April 26, 2007
Time to get cracking on the optimisation. The dual plans are to:
Expand the TF database with taxonids (and eventually load all of Entrez Gene)
Optimise queries using table indexes
Looks optimistic – I like order-of-magnitude improvements.
Just hacked the Entrez Gene fetcher to extract the taxonid from the XML. I should look into [...]
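For a sense of what pulling the taxon id out of an Entrez Gene record can look like, here is a minimal sketch using the standard library's ElementTree. It is not the fetcher from the post: the XML fragment is a hand-written, heavily trimmed sample modelled on the Entrez Gene layout, and `extract_taxon_id` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed sample (assumption: real records carry the taxon id
# in a Dbtag element whose Dbtag_db is "taxon").
SAMPLE_XML = """
<Entrezgene>
  <Entrezgene_source>
    <BioSource>
      <BioSource_org>
        <Org-ref>
          <Org-ref_taxname>Homo sapiens</Org-ref_taxname>
          <Org-ref_db>
            <Dbtag>
              <Dbtag_db>taxon</Dbtag_db>
              <Dbtag_tag>
                <Object-id>
                  <Object-id_id>9606</Object-id_id>
                </Object-id>
              </Dbtag_tag>
            </Dbtag>
          </Org-ref_db>
        </Org-ref>
      </BioSource_org>
    </BioSource>
  </Entrezgene_source>
</Entrezgene>
"""

def extract_taxon_id(xml_text):
    """Return the taxon id as an int, or None if the record has none."""
    root = ET.fromstring(xml_text)
    for dbtag in root.iter("Dbtag"):
        if dbtag.findtext("Dbtag_db") == "taxon":
            value = dbtag.findtext("Dbtag_tag/Object-id/Object-id_id")
            if value is not None:
                return int(value)
    return None

print(extract_taxon_id(SAMPLE_XML))  # 9606 for the sample above
```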