Things that I’ve been reminded to do after seeing things in RECOMB:
MeSH term attachment (general paper) – this data is running! HA! After that, we can run the validation!
Stability check – how predictions change with respect to missing annotations and misannotations
Bayesian model for predicting term attachment – P(term|papers) = P(term)P(papers|term)/P(papers)
Break down the AUC by term (can actually ignore the tree and do it per term…) – for this I should probably rewrite the AUC calculator as an object…
Hausdorff distance == likelihood
PageRank/social network analysis (especially for the author data!)
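For the per-term AUC item in the list above, here is a minimal sketch of what the AUC calculator could look like as an object. It is not the existing calculator: the class name PerTermAUC, its methods, and the term IDs in the usage lines are all hypothetical. It ignores the tree and assumes each prediction arrives as a (term, score, is_positive) triple. The AUC is computed with the rank-sum (Mann-Whitney) formulation, using average ranks for tied scores.

```python
from collections import defaultdict


class PerTermAUC:
    """Hypothetical helper: collects scored predictions per term and
    reports one AUC per term (the MeSH tree is ignored)."""

    def __init__(self):
        # term -> list of (score, is_positive) pairs
        self._pairs = defaultdict(list)

    def add(self, term, score, is_positive):
        """Record one prediction for a term."""
        self._pairs[term].append((float(score), bool(is_positive)))

    def auc(self, term):
        """Rank-sum AUC for one term, or None if the term lacks
        either positive or negative examples."""
        pairs = sorted(self._pairs.get(term, []), key=lambda p: p[0])
        n_pos = sum(1 for _, y in pairs if y)
        n_neg = len(pairs) - n_pos
        if n_pos == 0 or n_neg == 0:
            return None
        rank_sum = 0.0
        i = 0
        while i < len(pairs):
            # Find the block of tied scores starting at position i.
            j = i
            while j < len(pairs) and pairs[j][0] == pairs[i][0]:
                j += 1
            # 1-based ranks i+1 .. j share their average rank.
            avg_rank = (i + 1 + j) / 2.0
            rank_sum += avg_rank * sum(1 for _, y in pairs[i:j] if y)
            i = j
        u = rank_sum - n_pos * (n_pos + 1) / 2.0
        return u / (n_pos * n_neg)

    def all_aucs(self):
        """AUC for every term seen so far."""
        return {term: self.auc(term) for term in self._pairs}


# Usage with made-up scores for a made-up term ID.
calc = PerTermAUC()
for score, label in [(0.9, True), (0.8, False), (0.7, True), (0.1, False)]:
    calc.add("TERM_A", score, label)
per_term = calc.all_aucs()
```

Keeping the accumulation and the AUC computation in one object means the same instance can later be fed the tree-aware groupings too, if that turns out to be worth comparing.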
As for results, it seems that the validation set for pharma-chem/disease annotations is ZERO – I should generate the histogram of annotations over time (this is probably interesting in general) – histogram is in progress. Potential alternate avenues are attaching all MeSH terms rather than just disease terms, or looking at the attachment of new pharmacological actions (new entries in txt/mesh/mesh_pharma.txt). Will need to compute chem<->all MeSH profiles.
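A rough sketch of the annotation-over-time histogram, assuming the annotations can be read as (year, term) pairs. That input format, the function names, and the sample records are assumptions for illustration, not the actual data layout.

```python
from collections import Counter


def annotations_per_year(records):
    """Count annotations per year.

    records: iterable of (year, term) pairs (assumed format).
    Returns a dict {year: count} ordered by year.
    """
    counts = Counter(year for year, _term in records)
    return dict(sorted(counts.items()))


def render_histogram(counts, width=40):
    """Render the counts as a plain-text bar chart, one line per year."""
    peak = max(counts.values(), default=0)
    lines = []
    for year, n in counts.items():
        bar = "#" * (round(width * n / peak) if peak else 0)
        lines.append(f"{year} {bar} {n}")
    return "\n".join(lines)


# Usage with made-up records.
sample = [(2001, "term_a"), (2001, "term_b"), (2003, "term_c")]
counts = annotations_per_year(sample)
chart = render_histogram(counts, width=4)
```

If the years with zero new annotations line up with the empty validation set, that would show directly in the counts.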
Also want to