What I want to do now is go from Entrez gene_id and grab the relevant “sequence features”: the size of the coding region, the length of the gene, the chromosome, the RNA length, and so on. It doesn’t look like this info is archived locally, so the possible avenues to try are BioMart or NCBI web services, either of which should convert the gene_ids to features. Then it’ll be a matter of doing some strange merge of gene_id/term/ValidYN with the result, keyed on gene_id.
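
As a rough sketch of the NCBI route and the merge step, something like the following could work. It only builds the request URL for the E-utilities ESummary endpoint (no network call is made here), and it assumes the features have already been parsed into a dict keyed by gene_id. The helper names `build_esummary_url` and `merge_features`, the field names `chromosome` and `gene_length`, and all the values in the usage example are my own placeholders, not anything returned by NCBI.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESummary endpoint; db=gene selects the Entrez Gene database.
EUTILS_ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def build_esummary_url(gene_ids):
    """Build an ESummary request URL for a batch of Entrez gene IDs."""
    params = {
        "db": "gene",
        "id": ",".join(str(g) for g in gene_ids),
        "retmode": "json",
    }
    # safe="," keeps the comma-separated ID list readable in the URL.
    return f"{EUTILS_ESUMMARY}?{urlencode(params, safe=',')}"


def merge_features(rows, features):
    """Left-join gene_id/term/ValidYN rows with a gene_id -> features dict.

    Rows whose gene_id has no features are kept, with None in the
    feature columns, so nothing is silently dropped.
    """
    merged = []
    for row in rows:
        feats = features.get(row["gene_id"], {})
        merged.append({
            **row,
            "chromosome": feats.get("chromosome"),    # hypothetical field name
            "gene_length": feats.get("gene_length"),  # hypothetical field name
        })
    return merged


if __name__ == "__main__":
    # Placeholder data for illustration only, not real annotations.
    rows = [
        {"gene_id": 1, "term": "term_a", "ValidYN": "Y"},
        {"gene_id": 3, "term": "term_b", "ValidYN": "N"},
    ]
    features = {1: {"chromosome": "19", "gene_length": 1000}}

    print(build_esummary_url([r["gene_id"] for r in rows]))
    for record in merge_features(rows, features):
        print(record)
```

Doing it as a left join means a gene_id that comes back with no features still shows up in the output, which makes the gaps easy to spot before deciding whether to try BioMart for those.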