Archive for the ‘MeSH’ Category
Co-occurring MeSH term computation…Almost…
Posted in MeSH, cluster on April 1, 2008 | Leave a Comment »
Finally feel like I’m getting close to generating the co-occurrence numbers for MeSH in PubMed articles. In some sense, the computations are all but done – all I’m doing now is getting grand totals.
There are a lot of totals to do… It’s currently at “Food Additives” …
On the upside, I was super worried about how [...]
Need Moar Speed
Posted in MeSH, PubMed on March 18, 2008 | Leave a Comment »
I guess I should be running things through the fine-toothed comb of careful analysis.
The current idea of file-level joins doesn’t seem to be going so well.
Really, we ought to be able to do it all in memory…well, given that there are 24355 (2.4e4) terms, that makes the co-occurrence matrix on the order of 6e8 cells … not completely [...]
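A quick back-of-the-envelope check of the in-memory idea. This is only a sketch: the term count is the one quoted above, while the dense layout and the 4-byte cell width are my assumptions.

```python
# Rough size of a dense term-by-term co-occurrence matrix.
N_TERMS = 24355        # number of MeSH terms quoted in the post
BYTES_PER_COUNT = 4    # assumption: one 32-bit integer per cell

full_cells = N_TERMS * N_TERMS
# Co-occurrence is symmetric, so the upper triangle (diagonal included) suffices.
triangle_cells = N_TERMS * (N_TERMS + 1) // 2


def gib(cells, width=BYTES_PER_COUNT):
    """Convert a cell count to GiB for the given cell width in bytes."""
    return cells * width / 2**30


print(f"full matrix:    {full_cells:.2e} cells, {gib(full_cells):.2f} GiB")
print(f"upper triangle: {triangle_cells:.2e} cells, {gib(triangle_cells):.2f} GiB")
```

Under those assumptions the full matrix comes to a bit over 2 GiB and the triangle to roughly half that, so whether it fits depends on the machine and on how sparse the real counts turn out to be.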
Python Mesh-Child File generation – Success!
Posted in MeSH, PubMed on March 18, 2008 | Leave a Comment »
Successfully generated the mesh-parent datafiles in reasonable time. Loading all the results into a database, which produces a very large table (pubmed_mesh_parent), is slower but still reasonable. Querying the table is still pretty slow right now – attempting to optimise the indexing (right now it’s indexed on pmid,term, so I’m creating another index on term), but feeling [...]
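Why a second index on term might help: a composite index on (pmid, term) is ordered by pmid first, so a lookup by term alone generally cannot seek into it. Here is a small sketch using SQLite as a stand-in (the post doesn’t say which database is in use). The table and column names come from the post; the index names and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pubmed_mesh_parent (pmid INTEGER, term TEXT)")
# The existing composite index described in the post (index name is hypothetical).
conn.execute("CREATE INDEX idx_pmid_term ON pubmed_mesh_parent (pmid, term)")
rows = [(1, "Neoplasms"), (1, "Food Additives"), (2, "Neoplasms"), (3, "Asthma")]
conn.executemany("INSERT INTO pubmed_mesh_parent VALUES (?, ?)", rows)

query = "SELECT pmid FROM pubmed_mesh_parent WHERE term = ?"


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, ("Neoplasms",))
    return [row[-1] for row in cursor]


plan_before = plan(query)
# The extra single-column index (index name is hypothetical).
conn.execute("CREATE INDEX idx_term ON pubmed_mesh_parent (term)")
plan_after = plan(query)

pmids = sorted(r[0] for r in conn.execute(query, ("Neoplasms",)))
print("before:", plan_before)
print("after: ", plan_after)
print("pmids: ", pmids)
```

Other database engines report their plans differently, so treat this as an illustration of the idea rather than a benchmark of the real table.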
Plan of attack
Posted in MeSH, PubMed on March 6, 2008 | Leave a Comment »
Long term rewrite – Separate datafiles and workfiles from projectfiles to simplify backup
Will probably need a program to handle the join efficiently.
Thinking of writing in Python:
Read (mesh-child) file:
Each line becomes a dictionary entry (key=term), with the other term on the line added to that key’s value (a set)
(Reverse? Child is the key, parent is the set of [...]
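A minimal sketch of that read step, covering both directions. The file layout (one tab-separated parent/child pair per line) and the function name are assumptions, and the sample terms are invented for illustration rather than taken from the real MeSH tree.

```python
import io
from collections import defaultdict


def read_term_pairs(lines, reverse=False):
    """Build {key: set(values)} from 'parent<TAB>child' lines.

    With reverse=False the parent is the key and the children form the set;
    with reverse=True the child is the key and the parents form the set.
    """
    mapping = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        parent, child = line.split("\t")
        if reverse:
            mapping[child].add(parent)
        else:
            mapping[parent].add(child)
    return dict(mapping)


# Stand-in for the (mesh-child) file; a real run would pass an open file handle.
sample = io.StringIO(
    "Neoplasms\tCarcinoma\n"
    "Neoplasms\tSarcoma\n"
    "Carcinoma\tAdenocarcinoma\n"
)
children = read_term_pairs(sample)
sample.seek(0)
parents = read_term_pairs(sample, reverse=True)
print(children)
print(parents)
```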
Reviewing The Big Problem
Posted in MeSH on March 5, 2008 | Leave a Comment »
Currently, I’m trying to find mesh terms overrepresented in articles with a particular disease term.
Turns out I’ve made a couple errors on this front, so I’m going to rewrite the solution here so that I can remember my current take on it.
I want articles with the disease term, or any of its children.
From [...]
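The “disease term or any of its children” selection can be sketched as a walk down the hierarchy followed by a union of article sets. The two dictionaries, the function names, and the toy data are all assumptions made for illustration.

```python
def descendants(term, children):
    """Return the term itself plus every term below it in the hierarchy."""
    seen = {term}
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:  # guards against terms reachable by two paths
                seen.add(child)
                stack.append(child)
    return seen


def articles_for(term, children, term_to_pmids):
    """Union of PMIDs annotated with the term or any of its descendants."""
    pmids = set()
    for t in descendants(term, children):
        pmids |= term_to_pmids.get(t, set())
    return pmids


# Toy data: parent -> children, and term -> PMIDs annotated with it.
children = {"Disease A": {"Subtype A1", "Subtype A2"}, "Subtype A1": {"Variant A1x"}}
term_to_pmids = {
    "Disease A": {101},
    "Subtype A1": {102, 103},
    "Variant A1x": {103, 104},
    "Unrelated": {999},
}
disease_articles = articles_for("Disease A", children, term_to_pmids)
print(sorted(disease_articles))
```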
Moar Speed
Posted in MeSH, PubMed on February 20, 2008 | Leave a Comment »
Since things are still chugging slowly on the servers, started looking at ideas on how to make everything faster. After all, there’s more than a couple CPUs sitting around twiddling their fingers – I ought to think of ways to have them all play. That and making the solution more scalable – too many [...]
Co-occurrence numbers
Posted in MeSH, PubMed on February 1, 2008 | Leave a Comment »
I’ll probably have to re-run the MeSH co-occurrence numbers, due to errors (blech!) – looks like single quotes are no good, I’ll have to go to double quotes. Actually, I can probably resubset it so that I do disease MeSH-MeSH co-occurrence, since this is to compute disease profiles. That should speed things up dramatically (by [...]
Generating Disease to MeSH term profiles
Posted in MeSH, PubMed on January 28, 2008 | Leave a Comment »
What we would like is a disease-specific MeSH term profile, i.e. for a given disease, which MeSH terms are commonly associated with it.
The first step is finding how many references for a given MeSH term co-occur with the disease MeSH term.
Since MeSH has a structure, this is more specifically (for a given disease term and a given [...]
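A minimal sketch of that first counting step, ignoring the MeSH hierarchy for the moment. The input shape (a PMID-to-terms dictionary), the function name, and the sample data are assumptions for illustration.

```python
from collections import Counter


def cooccurrence_profile(disease, pmid_to_terms):
    """Count, per MeSH term, the references that also carry the disease term."""
    counts = Counter()
    for terms in pmid_to_terms.values():
        if disease in terms:
            # Each reference contributes at most one count per co-occurring term.
            counts.update(t for t in terms if t != disease)
    return counts


pmid_to_terms = {
    1: {"Asthma", "Adult", "Food Additives"},
    2: {"Asthma", "Adult"},
    3: {"Neoplasms", "Adult"},
}
profile = cooccurrence_profile("Asthma", pmid_to_terms)
print(profile.most_common())
```

Turning these raw counts into “overrepresented” terms would still need each term’s overall frequency as a baseline, which this sketch does not attempt.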
Doh Moment
Posted in MeSH, PubMed on November 13, 2007 | Leave a Comment »
Very Important – medline baseline files are stored in a different directory than the updates. And here I thought they’d all gone away due to the end of year maintenance. Anyways, doing a high speed wget grab of all the baseline files…we’ll see how the processing goes.
On the flipside, all Pubmed articles with a [...]
OMIM Gene-Disease to Entrez Gene-MeSH
Posted in Entrez, MeSH, OMIM on November 3, 2007 | 1 Comment »
OMIM Gene to OMIM Disease: The OMIM MorbidMap links loci to disease phenotypes
Entrez Gene mim2gene: Links Entrez Gene to genotype and phenotype OMIM entries
MeSH to OMIM: Doing this by converting to UMLS concepts (prototype in KNIME)