<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>DNA Helix</title>
	<atom:link href="http://dnahelix.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://dnahelix.wordpress.com</link>
	<description>Simply Complicated</description>
	<lastBuildDate>Fri, 01 Apr 2011 22:37:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='dnahelix.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>DNA Helix</title>
		<link>http://dnahelix.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://dnahelix.wordpress.com/osd.xml" title="DNA Helix" />
	<atom:link rel='hub' href='http://dnahelix.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Ideas from RECOMB</title>
		<link>http://dnahelix.wordpress.com/2011/04/01/ideas-from-recomb/</link>
		<comments>http://dnahelix.wordpress.com/2011/04/01/ideas-from-recomb/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 21:21:31 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=347</guid>
		<description><![CDATA[Things that I&#8217;ve been reminded to do after seeing things in RECOMB: MeSH term attachment (general paper) &#8211; this data is running!  HA! After that,  we can run the validation! Stability check &#8211; predictions change with respect to missing annotation, misannotation Bayesian mode for predicting term attachment &#8211; P(term&#124;papers) = P(term)P(papers&#124;term)/P(papers) Break down the AUC [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=347&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Things that I&#8217;ve been reminded to do after seeing things in RECOMB:</p>
<p>MeSH term attachment (general paper) &#8211; this data is running!  HA! After that,  we can run the validation!</p>
<p>Stability check &#8211; predictions change with respect to missing annotation, misannotation</p>
<p>Bayesian model for predicting term attachment &#8211; P(term|papers) = P(term)P(papers|term)/P(papers)</p>
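<p>A minimal sketch of that Bayes calculation, just to pin the formula down. The function name and all of the numbers are made up for illustration and do not come from the real data:</p>

```python
def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(term|papers) = P(term) * P(papers|term) / P(papers)."""
    if evidence <= 0:
        raise ValueError("P(papers) must be positive")
    return prior * likelihood / evidence


# Hypothetical probabilities, purely illustrative.
p_term = 0.02               # P(term)
p_papers_given_term = 0.30  # P(papers|term)
p_papers = 0.05             # P(papers)

p_term_given_papers = posterior(p_term, p_papers_given_term, p_papers)
print(p_term_given_papers)
```

<p>The hard part in practice is presumably estimating P(papers|term) and P(papers), which this sketch simply takes as given.</p>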
<p>Break down the AUC by term (can actually ignore the tree and do it per term&#8230;) &#8211; for this I should probably rewrite the AUC calculator as an object&#8230;</p>
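<p>A rough sketch of what an AUC calculator "as an object" might look like, grouping scores per term and ignoring the tree. The class name, method names and term IDs are all hypothetical, and the pairwise AUC here is the simple quadratic version, not necessarily what the existing calculator does:</p>

```python
from collections import defaultdict


class PerTermAUC:
    """Collects (score, label) pairs per term and reports an AUC for each."""

    def __init__(self):
        self._pairs = defaultdict(list)

    def add(self, term, score, is_positive):
        self._pairs[term].append((score, bool(is_positive)))

    def auc(self, term):
        pos = [s for s, y in self._pairs[term] if y]
        neg = [s for s, y in self._pairs[term] if not y]
        if not pos or not neg:
            return None  # AUC is undefined without both classes
        # Probability that a random positive outranks a random negative
        # (ties count half). Quadratic, but fine for a sketch.
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    def all_aucs(self):
        return {term: self.auc(term) for term in self._pairs}


calc = PerTermAUC()
# Hypothetical predictions: (term, score, was the attachment real?)
for term, score, label in [
    ("term_a", 0.9, True),
    ("term_a", 0.2, False),
    ("term_a", 0.6, True),
    ("term_a", 0.7, False),
    ("term_b", 0.4, True),
]:
    calc.add(term, score, label)

aucs = calc.all_aucs()
print(aucs)
```

<p>Terms with only positives or only negatives come back as None, which is probably worth counting separately when breaking the results down.</p>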
<p>Hausdorff distance == likelihood</p>
<p>PageRank/social network analysis (especially for the author data!)</p>
<p>As for results,  seems that the validation set for pharma-chem/disease annotations is ZERO &#8211; I should generate the histogram of annotation over time (this is probably interesting in general) &#8211; Histogram is in progress.  Also potential alternate avenues are doing attachment of all MeSH terms rather than just disease,  or looking at the attachment of new pharmacological actions &#8211; txt/mesh/mesh_pharma.txt new entries.  Will need to compute chem&lt;-&gt;all MeSH profiles</p>
<p>Also want to</p>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/04/01/ideas-from-recomb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>2011 Build GO!</title>
		<link>http://dnahelix.wordpress.com/2011/03/24/2011-build-go/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/24/2011-build-go/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 04:06:39 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=343</guid>
		<description><![CDATA[Database set up, PubMed files transferred &#8211; this only leaves Entrez Gene and the MeSH files to be grabbed (in integrator/Archive now!) the getMeSH script hardcodes the year being grabbed, so had to switch these to the 2011 files. Had to remember to set up the database files &#8211; if the database access script fails [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=343&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Database set up, PubMed files transferred &#8211; this only leaves Entrez Gene and the MeSH files to be grabbed (in integrator/Archive now!)</p>
<p>the getMeSH script hardcodes the year being grabbed, so had to switch these to the 2011 files.</p>
<p>Had to remember to set up the database files &#8211; if the database access script fails silently (as it does if you call a database that&#8217;s not in .dbrc) you get weird errors.</p>
<p>Looks like some of the previous builds are almost done&#8230;and looks like I might need to rebuild the geneRIF bits?</p>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/24/2011-build-go/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Multiple things on the go</title>
		<link>http://dnahelix.wordpress.com/2011/03/23/multiple-things-on-the-go/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/23/multiple-things-on-the-go/#comments</comments>
		<pubDate>Thu, 24 Mar 2011 02:12:38 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=339</guid>
		<description><![CDATA[Still waiting for word on the paper &#8211; should probably follow up tonight? Downloaded the 2011 PubMed files &#8211; need to set up a wcdb5 to house it.  Currently scp&#8217;ing the baseline over to chickenwire,  then need to move it into position for the build.  Also need versions of Entrez Gene, MeSH, etc&#8230; ALSO, need [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=339&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Still waiting for word on the paper &#8211; should probably follow up tonight?</p>
<p>Downloaded the 2011 PubMed files &#8211; need to set up a wcdb5 to house it.  Currently scp&#8217;ing the baseline over to chickenwire,  then need to move it into position for the build.  Also need versions of Entrez Gene, MeSH, etc&#8230;</p>
<p>ALSO, need to update the website and mention exactly which versions of which files are live on the databases.</p>
<p>Grabbed pubmed-chem-term.txt and put it into integrator/mesh-chem.  Will match against the drugbank database, get a list of non-matching pharma.  Also,  get list of compounds with pharmaco action, and see how much that loses.</p>
<p>Re: circular make.  Digenei4 seems to have choked with a &#8220;directory doesn&#8217;t exist&#8221; error for a directory that exists.  Maybe a node with a file system problem?</p>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/23/multiple-things-on-the-go/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Hunting for prerequisites,  2011 Build</title>
		<link>http://dnahelix.wordpress.com/2011/03/22/hunting-for-prerequisites-2011-build/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/22/hunting-for-prerequisites-2011-build/#comments</comments>
		<pubDate>Tue, 22 Mar 2011 22:37:34 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=335</guid>
		<description><![CDATA[Suddenly realised that RECOMB is nearly upon us &#8211; if I want to put in some new figures, now&#8217;s the time! Seems like there&#8217;s a circular/non-updating portion in the Makefile&#8230;Or hopefully it&#8217;s more I&#8217;ve been twiddling with the makefiles so it&#8217;s been unhappy.  Here&#8217;s hoping a final build will suffice to finish things off.  And [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=335&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Suddenly realised that RECOMB is nearly upon us &#8211; if I want to put in some new figures, now&#8217;s the time!</p>
<p>Seems like there&#8217;s a circular/non-updating portion in the Makefile&#8230;Or hopefully it&#8217;s more I&#8217;ve been twiddling with the makefiles so it&#8217;s been unhappy.  Here&#8217;s hoping a final build will suffice to finish things off.  And then it&#8217;s probably time to build a new version with the new pubmed&#8230;Let&#8217;s go and download that now, shall we?</p>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/22/hunting-for-prerequisites-2011-build/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>End of the week Progress</title>
		<link>http://dnahelix.wordpress.com/2011/03/18/end-of-the-week-progress/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/18/end-of-the-week-progress/#comments</comments>
		<pubDate>Sat, 19 Mar 2011 01:31:55 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=332</guid>
		<description><![CDATA[Waiting on final confirmation for the paper revisions &#8211; will likely submit tomorrow. Fixing silly keyErrors &#8211; unicode causing sed to barf, running the same regex through perl will hopefully fix that digenei0 is making without errors, but doesn&#8217;t seem to be &#8220;done&#8221; &#8211; is there a circular dependency?!<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=332&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Waiting on final confirmation for the paper revisions &#8211; will likely submit tomorrow.</p>
<p>Fixing silly keyErrors &#8211; unicode causing sed to barf, running the same regex through perl will hopefully fix that</p>
<p>digenei0 is making without errors, but doesn&#8217;t seem to be &#8220;done&#8221; &#8211; is there a circular dependency?!</p>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/18/end-of-the-week-progress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Online Overrepresentation</title>
		<link>http://dnahelix.wordpress.com/2011/03/08/online-overrepresentation/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/08/online-overrepresentation/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 23:55:12 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=329</guid>
		<description><![CDATA[Previously was using the PHP based PDL package,  but that seems to break once the number of articles gets large (once past 51 articles). Installed R to the web server &#8211; we can run it directly using &#8220;R --vanilla --slave&#8221;,  but that seems more than a bit slow &#8211; getting results takes a good chunk [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=329&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Previously was using the PHP based PDL package,  but that seems to break once the number of articles gets large (once past 51 articles).</p>
<p>Installed R to the web server &#8211; we can run it directly using &#8220;R --vanilla --slave&#8221;,  but that seems more than a bit slow &#8211; getting results takes a good chunk of time, probably because it costs a few seconds to compute each p-value, and you have to do it for every one of the MeSH terms.  Maybe a couple of minutes to process them all.  Maybe I should look at figuring out how to batch it all up into one giant computation &#8211; maybe make an array that can be read in &#8211; which might allow for parallel processing, or at least save on loadup time for R.  Otherwise,  maybe there&#8217;s a lightweight stats package that could be used instead?</p>
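<p>One possible shape for the "batch it all up" idea, sketched in plain Python rather than R. This assumes the p-values are hypergeometric over-representation tests, which is a guess on my part; the function names, term names and counts are hypothetical. The point is only that every term gets scored in one process, so the interpreter start-up cost is paid once:</p>

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n articles from N, of which K carry the term."""
    total = comb(N, n)
    top = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, top + 1))
    return hits / total


def batch_p_values(term_counts, n, N):
    """term_counts maps term -> (k hits in the query set, K hits overall)."""
    return {
        term: hypergeom_upper_tail(k, n, K, N)
        for term, (k, K) in term_counts.items()
    }


# Hypothetical toy numbers: 10 articles overall, 3 in the query set.
p_values = batch_p_values({"term_a": (2, 4), "term_b": (0, 4)}, n=3, N=10)
print(p_values)
```

<p>Exact integer arithmetic like this is slow for very large corpora, so a log-gamma based version or a vectorised call in R itself would likely be the next step.</p>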
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/08/online-overrepresentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Paper compressing</title>
		<link>http://dnahelix.wordpress.com/2011/03/07/paper-compressing/</link>
		<comments>http://dnahelix.wordpress.com/2011/03/07/paper-compressing/#comments</comments>
		<pubDate>Mon, 07 Mar 2011 22:23:31 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=327</guid>
		<description><![CDATA[From time to time I need to squeeze things in a paper to get things to fit.  I might as well list my usual ideas in case I need them in the future: shorten paragraphs.  Particularly,  make sure the last line is as full as possible &#8211; if there&#8217;s only a couple hanging words,  perhaps [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=327&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>From time to time I need to squeeze things in a paper to get things to fit.  I might as well list my usual ideas in case I need them in the future:</p>
<ul>
<li>shorten paragraphs.  Particularly,  make sure the last line is as full as possible &#8211; if there&#8217;s only a couple hanging words,  perhaps a little editing can cut enough to save that line.</li>
<li>move figures as close as possible to text.  Also related is to crop the figures to eliminate empty space</li>
<li>remove sentences which duplicate content &#8211; any time a phrase is repeated,  see if it is really necessary, or if it is possible to rearrange to avoid extra occurrences</li>
<li>simplify &#8211; shorter, more direct wording saves space and also helps clarity; avoid &#8220;weasel wording&#8221;</li>
<li>Often times a lot of small short words are unnecessary</li>
<li>use common abbreviations/abbreviate long complex phrases</li>
<li>Avoid multiline titles/headings (since the font for those is big!)</li>
<li>watch out for extra carriage returns between sections</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/03/07/paper-compressing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Plans/solutions from lab meeting</title>
		<link>http://dnahelix.wordpress.com/2011/02/25/planssolutions-from-lab-meeting/</link>
		<comments>http://dnahelix.wordpress.com/2011/02/25/planssolutions-from-lab-meeting/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 22:04:23 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=318</guid>
		<description><![CDATA[Solution for authors too big Use only one score, and only keep the &#8220;highest k&#8221; &#8211; DO NOT save it all To IMPLEMENT:  modify the profile comparison code to store only the top k lines Need to overlap pharm list with the drugbank list to make sure we&#8217;re not losing too many Messed up something [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dnahelix.wordpress.com&amp;blog=1020283&amp;post=318&amp;subd=dnahelix&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Solution for authors too big</p>
<ul>
<li>Use only one score, and only keep the &#8220;highest k&#8221; &#8211; DO NOT save it all</li>
<li>To IMPLEMENT:  modify the profile comparison code to store only the top k lines</li>
</ul>
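<p>A minimal sketch of the "keep only the highest k" idea, assuming the comparison code yields (score, line) pairs one at a time. The function name and the sample data are hypothetical; heapq.nlargest only ever holds k items, so nothing else is saved:</p>

```python
import heapq


def top_k_lines(scored_lines, k):
    """Return the k highest-scoring (score, line) pairs, best first."""
    return heapq.nlargest(k, scored_lines, key=lambda pair: pair[0])


# Hypothetical scores; an iterator stands in for a stream of comparisons.
scored = iter([(0.1, "a"), (0.9, "b"), (0.5, "c"), (0.7, "d")])
best = top_k_lines(scored, 2)
print(best)
```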
<p>Need to overlap pharm list with the drugbank list to make sure we&#8217;re not losing too many</p>
<ul>
<li>Messed up something here it seems &#8211; only 302 of the drugbank generic names map to chem terms (ignoring case)</li>
<li>Actually &#8211; only 397 of the chem terms are mapping from chem-mesh-refs.txt</li>
<li>and only 827 of the all-chem-refs.txt entries are mapping</li>
<li>CAS number matching 794 records</li>
<li>SIGH&#8230;looks like ~/drugcards-sorted is not properly sorted. BOO</li>
<li>OK NEW STATS</li>
<li>all-chem-refs matches 2666 of the drugcards</li>
<li>1029 of pharma-chem matches the drugcards</li>
<li>1731 are in all-chem but not in pharma-chem</li>
<li>94 are in pharma-chem but not in all-chem&#8230;WAITASEC WHAT??</li>
<li>DOH &#8211; have to be careful with joins regarding whitespace &#8211; use the -t param to set the field separator, so that names are NOT split on whitespace</li>
<li><strong>910 in all-chem match drugcards</strong></li>
<li><strong>803 in pharma-chem match drugcards</strong></li>
<li>surprisingly &#8211; 27 pharma-chem are NOT in all-chem??  Isn&#8217;t pharma-chem a subset of all-chem? or is all-chem-refs something else entirely vs pharma-chem-refs?</li>
<li>INTERESTING &#8211; pharma is NOT a subset of all-chem..although in the pipeline it is used as a filter so doesn&#8217;t quite matter.  weird though.</li>
</ul>
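<p>The whitespace problem with joins can be reproduced in a few lines of Python, which might make a handy cross-check on the counts. This is only a sketch: it assumes tab-separated files whose first field is the name, and the sample names and second columns are invented:</p>

```python
def name_keys(lines, sep="\t"):
    """First field of each non-blank line, lower-cased, split ONLY on sep."""
    return {
        line.rstrip("\n").split(sep)[0].strip().lower()
        for line in lines
        if line.strip()
    }


# Hypothetical stand-ins for the drugcards and all-chem-refs files.
drugcards = ["Acetylsalicylic acid\tid1\n", "Ibuprofen\tid2\n"]
all_chem = ["acetylsalicylic acid\t123\n", "Caffeine\t456\n"]

# Case-insensitive match on the full name.
matched = name_keys(drugcards) & name_keys(all_chem)

# Splitting on any whitespace truncates multi-word names to their first word.
naive = {line.split()[0].lower() for line in drugcards}

print(matched, naive)
```

<p>With the naive split, "Acetylsalicylic acid" collapses to "acetylsalicylic", which is exactly the sort of thing that would throw the match counts off.</p>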
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/02/25/planssolutions-from-lab-meeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Ref breaking</title>
		<link>http://dnahelix.wordpress.com/2011/02/24/ref-breaking/</link>
		<comments>http://dnahelix.wordpress.com/2011/02/24/ref-breaking/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 00:03:08 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=313</guid>
		<description><![CDATA[Seems that there&#8217;s some wonkiness happening when converting/merging some of the entries in ./txt/direct_gene_disease/all-short-author-refs.txt Weird that it only appeared in digenei4 &#8211; maybe some weird names in recent author entries triggered this. ALLGO\xc1\xba\x85ER, M maps to ALLGO&#124;1&#60;C1&#62;&#60;BA&#62;&#60;85&#62;ER, M but as can be seen above,  the &#8220;&#124;1&#8221; portion is for some reason next to ALLGO rather [...]]]></description>
			<content:encoded><![CDATA[<p>Seems that there&#8217;s some wonkiness happening when converting/merging some of the entries in</p>
<p>./txt/direct_gene_disease/all-short-author-refs.txt</p>
<p>Weird that it only appeared in digenei4 &#8211; maybe some weird names in recent author entries triggered this.</p>
<p>ALLGO\xc1\xba\x85ER, M</p>
<p>maps to</p>
<p>ALLGO|1&lt;C1&gt;&lt;BA&gt;&lt;85&gt;ER, M</p>
<p>but as can be seen above, the &#8220;|1&#8221; portion is for some reason next to ALLGO rather than at the end.</p>
<p>Seems to be an issue with the UNIQ_COUNT?</p>
<p>TROUBLESHOOT &#8211; checking out digenei0</p>
<p>For some reason, that version builds cleanly.  Maybe it&#8217;s just a delete and retry&#8230;WONKINESS</p>
<p>Looks like it&#8217;s fixed.  HUZZAH!</p>
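<p>To catch this kind of breakage earlier, a check along these lines could flag any entry whose count suffix is not at the very end. This is a sketch only: it assumes the suffix is always a pipe followed by digits, and both function names are hypothetical rather than anything in the actual pipeline.</p>

```python
import re

# Assumed suffix format: a pipe followed by one or more digits, at the very end.
SUFFIX_AT_END = re.compile(rb"\|\d+$")


def append_uniq_suffix(name, count):
    """Append the uniqueness count after the complete name, whatever bytes it contains."""
    return name + b"|" + str(count).encode("ascii")


def suffix_misplaced(entry):
    """True if the entry contains a pipe but does not end with a |digits suffix."""
    return b"|" in entry and SUFFIX_AT_END.search(entry) is None


good = append_uniq_suffix(b"ALLGO\xc1\xba\x85ER, M", 1)
bad = b"ALLGO|1\xc1\xba\x85ER, M"  # suffix landed in the middle, as in the broken build

print(suffix_misplaced(good), suffix_misplaced(bad))
```

<p>Working on raw bytes keeps the check independent of whatever encoding the odd author names turn out to be in.</p>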
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/02/24/ref-breaking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
		<item>
		<title>Profile Compute</title>
		<link>http://dnahelix.wordpress.com/2011/02/24/profile-compute/</link>
		<comments>http://dnahelix.wordpress.com/2011/02/24/profile-compute/#comments</comments>
		<pubDate>Thu, 24 Feb 2011 23:54:38 +0000</pubDate>
		<dc:creator>warrenac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://dnahelix.wordpress.com/?p=310</guid>
		<description><![CDATA[Target is to do pharma to EVERYTHING today.  Should be doable since pharma is pretty small. Currently duped the disease-chem block Next &#8211; dupe the gene-disease block,  then add a pharma-pharma block. Code looks straightforward so far &#8211; change the dependency inputs, ADD A BUILD DIRECTORY. Well, code is written.  I guess I&#8217;ll wait till [...]]]></description>
			<content:encoded><![CDATA[<p>Target is to do pharma to EVERYTHING today.  Should be doable since pharma is pretty small.</p>
<p>Currently duped the disease-chem block</p>
<p>Next &#8211; dupe the gene-disease block,  then add a pharma-pharma block.</p>
<p>Code looks straightforward so far &#8211; change the dependency inputs, ADD A BUILD DIRECTORY.</p>
<p>Well, code is written.  I guess I&#8217;ll wait till the previous build on the pharma-disease completes,  then run the full thing.</p>
<p>HMMM&#8230;noticed that the build directory is actually the profile splitting directory.  Thinking perhaps it might be more efficient to not have a separate split directory for each output, but rather just one split directory per split input.  This does mean that the cleanup procedure can&#8217;t delete the split files though&#8230;HMMMMM</p>
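<p>One way around the cleanup problem would be to count how many outputs still need each shared split directory and only delete it when the last one finishes. A rough sketch, assuming one split directory per input; the class and method names are hypothetical and not part of the existing build code.</p>

```python
import shutil
import tempfile
from pathlib import Path


class SharedSplitDirs:
    """Track one split directory per input and delete it after its last user is done."""

    def __init__(self):
        self._users = {}  # split directory path -> number of outputs still using it

    def acquire(self, split_dir):
        """Register one more output that reads from split_dir."""
        self._users[split_dir] = self._users.get(split_dir, 0) + 1

    def release(self, split_dir):
        """Mark one output as finished; remove the directory once nobody needs it."""
        self._users[split_dir] -= 1
        if self._users[split_dir] == 0:
            del self._users[split_dir]
            shutil.rmtree(split_dir)


# Two hypothetical outputs sharing one split input (a temp dir stands in for it).
split_dir = Path(tempfile.mkdtemp(prefix="split_pharma_"))
dirs = SharedSplitDirs()
dirs.acquire(split_dir)
dirs.acquire(split_dir)

dirs.release(split_dir)
still_there = split_dir.exists()  # first output done, directory is kept
dirs.release(split_dir)
gone = not split_dir.exists()  # last output done, directory is deleted
print(still_there, gone)
```

<p>The trade-off is that the counts live in memory, so a crashed build would leave split files behind for a manual sweep.</p>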
]]></content:encoded>
			<wfw:commentRss>http://dnahelix.wordpress.com/2011/02/24/profile-compute/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/43ec5822eaddbadf25fcee221ebcc9ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">warrenac</media:title>
		</media:content>
	</item>
	</channel>
</rss>
