Archive for the ‘progress’ Category

WIP

MDMR – multivariate distance matrix regression (Nicholas Schork). Explains a gene distance matrix using parameters (e.g. micro-array chip or, for us, MeSH annotation = disease). A BioPython function exists.
http://www.pnas.org/content/103/51/19430.abstract?ck=nck
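As a rough illustration of what MDMR tests, here is a minimal sketch of the pseudo-F statistic for the simplest case: a single categorical predictor (e.g. annotated with a disease MeSH term or not). In that special case the statistic can be computed directly from sums of squared pairwise distances, with significance estimated by permuting the labels. This is not the BioPython function mentioned above; the function names, the toy distance matrix and the labels are all made up for illustration.

```python
import random
from itertools import combinations

def pseudo_f(dist, labels):
    """Pseudo-F for one categorical predictor, using only pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity computed per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permutation_p(dist, labels, n_perm=999, seed=0):
    """Observed pseudo-F and a permutation p-value from shuffled labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (invented): four genes, two groups, small distances within a group.
dist = [
    [0, 1, 3, 3],
    [1, 0, 3, 3],
    [3, 3, 0, 1],
    [3, 3, 1, 0],
]
labels = ["disease", "disease", "other", "other"]
f_stat, p_value = permutation_p(dist, labels)
print(f_stat, p_value)
```

With continuous or multiple predictors the general form (Gower-centred matrix plus the regression hat matrix) would be needed; this sketch only covers the grouped case.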
Profile comparison used to make the distance matrix?  Also possible – look for PubMed co-citation (an independent distance matrix, not involving MeSH?)
Regenerating generif.
BG-profiles computed, appear to be less effective.  Some [...]

Read Full Post »

Ballpark runtime figures – I took about 30 nuclear receptor genes and ran those against the 4000 MeSH Disease profiles – this took about 30 minutes.  Extrapolating to the approximately 10000 human genes with GeneRIFs makes the runtime about 7 days.
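The extrapolation is simple linear scaling, sketched below. It assumes runtime grows in proportion to the number of genes and that the 4000 profiles stay fixed; the variable names are only for illustration.

```python
# Figures from the timing run described above.
genes_timed = 30
minutes_taken = 30
genes_total = 10_000

# Assumption: cost per gene is constant, so runtime scales linearly.
minutes_per_gene = minutes_taken / genes_timed
total_minutes = genes_total * minutes_per_gene
total_days = total_minutes / (60 * 24)

print(f"{total_days:.1f} days")  # prints "6.9 days"
```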
Plan for tomorrow is to pull out an example for and mock up the distance [...]

Read Full Post »

Biting the Bullet

A bit anticlimactic: I had previously waited forever for Entrez Gene and all the GeneRIFs to convert from XML to text files. So rather than doing that myself, I wrote the code to load from text files on the Entrez Gene FTP server. Have other people do most of my [...]
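A minimal sketch of loading GeneRIFs from a tab-delimited text file instead of XML. It assumes a five-column layout (tax ID, gene ID, comma-separated PubMed IDs, timestamp, GeneRIF text), so check the header of the actual download before relying on it. The function name and the sample rows are hypothetical placeholders, not real annotations.

```python
import io
from collections import defaultdict

# Placeholder rows in the assumed layout; IDs and text are invented.
SAMPLE = (
    "#Tax ID\tGene ID\tPubMed ID (PMID) list\tlast update timestamp\tGeneRIF text\n"
    "9606\t1001\t111,222\t2007-01-01 00:00\tplaceholder statement one\n"
    "9606\t1001\t333\t2007-01-02 00:00\tplaceholder statement two\n"
    "10090\t2002\t444\t2007-01-03 00:00\tplaceholder mouse statement\n"
)

def load_generifs(handle, tax_id="9606"):
    """Map gene ID -> list of (PubMed IDs, GeneRIF text) for one species."""
    rifs = defaultdict(list)
    for line in handle:
        if line.startswith("#"):
            continue  # skip the header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] != tax_id:
            continue  # malformed row or a different species
        rifs[fields[1]].append((fields[2].split(","), fields[4]))
    return rifs

rifs = load_generifs(io.StringIO(SAMPLE))
print(len(rifs["1001"]))
```

For a real gzipped download, the same function should work on a handle from `gzip.open(path, "rt")`.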

Read Full Post »

WASSERMAN/OUELLETTE LABORATORY
SUMMARY FORM
 
Name:  Warren Cheung
 
Date: 2007-05-30
 
Progress

Draft of Proposal
Modularity

Setup for Entrez Gene to PubMed “muscle” subset

 
Other Activities

Reading

statistics – multiple testing correction, Fisher/hypergeometric distribution.
GO term analysis

MSL Poster

 
Goals

Complete [...]

Read Full Post »

Name: Warren A. Cheung
Progress

Code to download all TF genes (all species, including human)
Can generate links between genes and pubmed articles up to 3 degrees of separation
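One way to picture the "degrees of separation" links is a breadth-first search over a gene/article graph with a depth cut-off. This is only a sketch under assumptions: each gene-to-article link counts as one degree, the graph is held in memory as a dict, and the node names are invented. The actual code may well do this with database joins instead.

```python
from collections import deque

def within_degrees(graph, start, max_degree=3):
    """Return {node: degree} for every node within max_degree hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_degree:
            continue  # do not expand past the cut-off
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen

# Toy bipartite graph (invented): genes link to articles and vice versa.
graph = {
    "geneA": ["pmid1"],
    "pmid1": ["geneA", "geneB"],
    "geneB": ["pmid1", "pmid2"],
    "pmid2": ["geneB", "geneC"],
    "geneC": ["pmid2"],
}
reachable = within_degrees(graph, "geneA")
print(reachable)
```

Here geneC sits four hops from geneA, so it falls outside the 3-degree limit.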

Other Activities

First Committee Meeting
Booked Qualifying Exam for August 9th, Room 174 at UBiC
Completed first pass of related work for proposal

Goals

generate numerical results linking genes to brain disease MeSH terms, via [...]

Read Full Post »

Optimise

Sounds like a Transformers yell… “Transformers, Optimise!”
Anyway, the thing to fix: all the CREATE TABLE AS SELECT… statements create tables without keys. No key is definitely bad. Trying this for the one that really matters – the gene_term_related_citations query. EXPLAIN seems happier, but proof will be in the pudding (or [...]
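To make the keyless-table problem concrete, here is a small sketch using Python's built-in sqlite3 as a stand-in for whatever database the post is using. The table name gene_term_related_citations comes from the post; the source table, the columns, the index name and the rows are all invented. It builds a table with CREATE TABLE AS SELECT, then compares the query plan before and after adding an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citation (gene_id INTEGER, term TEXT, pmid INTEGER)")
conn.executemany(
    "INSERT INTO citation VALUES (?, ?, ?)",
    [(g, f"term{g % 5}", 1000 + g) for g in range(200)],
)

# CREATE TABLE ... AS SELECT copies the rows but no keys or indexes.
conn.execute(
    "CREATE TABLE gene_term_related_citations AS "
    "SELECT gene_id, term, pmid FROM citation"
)

QUERY = "SELECT pmid FROM gene_term_related_citations WHERE gene_id = 42"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

before = plan(QUERY)  # expected to be a full table scan
conn.execute(
    "CREATE INDEX idx_gtrc_gene ON gene_term_related_citations (gene_id)"
)
after = plan(QUERY)   # expected to use the new index

print(before)
print(after)
```

The exact plan wording differs between SQLite versions and between database engines, so treat the printed text as indicative only.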

Read Full Post »

A cunning plan

Time to get cracking on the optimisation. The dual plans are to

Expand the TF database with taxonids (and eventually load all of Entrez Gene)
Optimise queries using table indexes

Looks optimistic – I like orders of magnitude improvements.
Just hacked the Entrez Gene fetcher to extract the taxonid from the XML. I should look into [...]
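A rough sketch of pulling the taxon ID out of an Entrez Gene XML record with the standard library. The element nesting in the sample is an assumption about the record layout, trimmed down to the one branch of interest, and the function name is hypothetical; the real fetcher may go about it differently.

```python
import xml.etree.ElementTree as ET

# Trimmed-down record; the nesting shown here is assumed, not verified.
SAMPLE_XML = """
<Entrezgene>
  <Entrezgene_source>
    <BioSource>
      <BioSource_org>
        <Org-ref>
          <Org-ref_db>
            <Dbtag>
              <Dbtag_db>taxon</Dbtag_db>
              <Dbtag_tag>
                <Object-id>
                  <Object-id_id>9606</Object-id_id>
                </Object-id>
              </Dbtag_tag>
            </Dbtag>
          </Org-ref_db>
        </Org-ref>
      </BioSource_org>
    </BioSource>
  </Entrezgene_source>
</Entrezgene>
"""

def extract_taxon_id(xml_text):
    """Return the taxon ID string from the first 'taxon' Dbtag, or None."""
    root = ET.fromstring(xml_text)
    for dbtag in root.iter("Dbtag"):
        if dbtag.findtext("Dbtag_db") == "taxon":
            return dbtag.findtext("Dbtag_tag/Object-id/Object-id_id")
    return None

taxon_id = extract_taxon_id(SAMPLE_XML)
print(taxon_id)  # 9606 is the NCBI taxonomy ID for human
```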

Read Full Post »