Programming Language: Perl
Modules required: LWP::Simple

This script retrieves the full articles (or the abstracts, if the full article is not available) that are relevant to a search term from the PubMed Central (pmc) database.

The results are sorted in reverse chronological order.

There is also an option to limit the number of results returned (very handy).

The output is in XML format: one file per article.

An example of a Perl module for transforming XML files: XML::Simple
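
As a rough illustration of what that transformation can look like, here is a minimal sketch that reads a small XML string into a Perl data structure with XML::Simple. It assumes XML::Simple (and an XML parser it can use) is installed; the sample XML is invented for illustration and only imitates the general shape of a search response.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Invented sample in the shape of a search response, for illustration only.
my $xml = '<eSearchResult><Count>2</Count><IdList><Id>1111111</Id><Id>2222222</Id></IdList></eSearchResult>';

# ForceArray keeps Id as a list even when there is only a single hit;
# KeyAttr => [] switches off XML::Simple's automatic array-to-hash folding.
my $data = XMLin( $xml, ForceArray => ['Id'], KeyAttr => [] );

print "Count: $data->{Count}\n";
print "ID: $_\n" for @{ $data->{IdList}{Id} };
```

The root element is dropped by default, so the child elements become the top-level hash keys.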

LWP::Simple is required to execute the search.

Installing new modules: this is easily done with cpan.
Here is the documentation for cpan.
Or, here is documentation for installing modules manually.
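
Before installing anything, it can help to check which modules are already present. The following is a small sketch of one way to do that; module_available is a hypothetical helper name, not something provided by Perl or cpan.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# (module_available is a hypothetical helper written for this example.)
sub module_available {
    my ($module) = @_;
    # Turn Some::Module into Some/Module.pm, the path form that require accepts
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('LWP::Simple', 'XML::Simple') {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'missing (try: cpan ' . $module . ')';
}
```

A missing module can then be installed from the command line with cpan followed by the module name.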

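#!/usr/bin/perl

use strict;
use warnings;
use LWP::Simple;
# URI::Escape ships with the URI distribution, which LWP already depends on
use URI::Escape qw(uri_escape);

# search term to find
my $search_term = 'breast cancer';

# maximum number of results to retrieve
my $retmax = 10;

my $utils   = 'http://www.ncbi.nlm.nih.gov/entrez/eutils';
my $db_name = 'pmc';

# Submit the search and retrieve the XML-based results.
# The term is URL-encoded because it may contain spaces.
my $esearch_result = get( $utils . '/esearch.fcgi?db=' . $db_name . '&retmax=' . $retmax . '&term=' . uri_escape($search_term) );
die "esearch request failed\n" unless defined $esearch_result;

# paper IDs: one numeric ID per <Id> element
my @ids = ( $esearch_result =~ m|<Id>(\d+)</Id>|g );

# loop through all the IDs and
# get the individual papers (or the abstracts, if the full text is not available)
foreach my $id (@ids) {

	# get all details for each paper - full text if available
	my $efetch = $utils . '/efetch.fcgi?db=' . $db_name . '&id=' . $id;
	my $paper  = get($efetch);
	unless ( defined $paper ) {
		warn "efetch request failed for ID $id\n";
		next;
	}

	# print to an XML file (file name generated from database name and current paper ID)
	my $file = "$db_name$id.xml";
	open( my $out, '>', $file ) or die "Cannot open $file: $!";
	print {$out} $paper;
	close $out or die "Cannot close $file: $!";
}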
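
The two text-handling steps of the script, building the search URL and pulling the paper IDs out of the response, can be tried without any network access. The sketch below is one way to do that using core Perl only. The helper names (percent_encode, build_esearch_url, extract_ids) and the sample response are made up for this example, and the percent-encoding step assumes a plain ASCII search term.

```perl
use strict;
use warnings;

# Percent-encode everything except the RFC 3986 unreserved characters.
# Assumption: the input is plain ASCII.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Build a search URL from its parts (hypothetical helper).
sub build_esearch_url {
    my ($base, $db, $retmax, $term) = @_;
    return "$base/esearch.fcgi?db=$db&retmax=$retmax&term=" . percent_encode($term);
}

# Pull the numeric IDs out of an XML search response.
sub extract_ids {
    my ($xml) = @_;
    my @ids = $xml =~ m|<Id>(\d+)</Id>|g;
    return @ids;
}

# Invented sample response, for illustration only.
my $sample = <<'XML';
<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>1111111</Id>
    <Id>2222222</Id>
  </IdList>
</eSearchResult>
XML

print build_esearch_url('http://www.ncbi.nlm.nih.gov/entrez/eutils', 'pmc', 10, 'breast cancer'), "\n";
print join(',', extract_ids($sample)), "\n";
```

Keeping these steps in small functions makes it easy to check the query string and the ID pattern before sending real requests.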