ArXiv Search in Python
The arXiv API gives access to the Cornell e-print server's archive of scientific articles. A query is specified by keywords, a comma-separated list of article IDs, a start position, and the number of articles to retrieve (the default is 10). The response is returned in Atom format.
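To make the query parameters concrete, here is a minimal sketch of how such a request URL could be assembled. The endpoint and the parameter names search_query, id_list, start and max_results are those of the public arXiv API; the helper build_query_url is a hypothetical name used only for illustration and is not part of the script described here.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint, as documented in the arXiv API user manual.
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query="", id_list="", start=0, max_results=10):
    """Return the request URL for one arXiv API query (hypothetical helper)."""
    params = {"start": start, "max_results": max_results}
    # Only send the optional parameters that are actually set.
    if search_query:
        params["search_query"] = search_query
    if id_list:
        params["id_list"] = id_list
    return ARXIV_API_URL + "?" + urlencode(params)


# Build (but do not send) a request for up to 100 articles mentioning unparticles.
url = build_query_url(search_query="all:unparticle", max_results=100)
print(url)
```

Fetching this URL, for example with urllib.request.urlopen, should return the Atom document for the query.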
Let us import everything from the script, after having copied it to our Python path:
from search_arxiv import *
For a relatively small number of articles on a given topic, e.g. on unparticles, the call single_query(search_query='all:unparticle', id_list='', start=0, max_res=100) can be used. (The defaults are search_query='', id_list='', start=0, max_res=10.)
For a larger number of articles, saved in separate Atom files, arxiv_query(search_query='', id_list='', start=0, total_res=None, max_res=100) can be used. The default arguments are shown here; with total_res=None, all the articles matching a given query are retrieved in chunks of max_res. (The wait time between chunks is 3 seconds, as suggested in the manual.)
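The chunking logic can be sketched as follows. This is not the script's actual implementation, only an illustration of the idea under some assumptions: fetch_in_chunks, fetch_chunk, wait and sleep are hypothetical names, the total number of results is assumed to be known in advance, and the function that performs the actual request is passed in by the caller.

```python
import time


def fetch_in_chunks(fetch_chunk, total_res, max_res=100, wait=3.0, sleep=time.sleep):
    """Call fetch_chunk(start, size) for successive chunks until total_res
    results are covered, pausing `wait` seconds between requests.

    Hypothetical helper: fetch_chunk stands in for whatever performs one query.
    """
    results = []
    start = 0
    while start < total_res:
        # The last chunk may be smaller than max_res.
        size = min(max_res, total_res - start)
        results.append(fetch_chunk(start, size))
        start += size
        # Pause only if another request will follow.
        if start < total_res:
            sleep(wait)
    return results


# Dry run with a stand-in fetcher and no real waiting: 250 results in chunks of 100.
chunks = fetch_in_chunks(lambda start, size: (start, size),
                         total_res=250, max_res=100, sleep=lambda s: None)
print(chunks)  # [(0, 100), (100, 100), (200, 50)]
```

Passing the fetcher and the sleep function as arguments keeps the loop easy to try out without touching the network.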
The results can be conveniently parsed with the feedparser module.
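As a dependency-free illustration of what parsing the Atom output involves, the sketch below uses the standard library's xml.etree.ElementTree rather than feedparser. The sample feed is a hand-written, heavily shortened placeholder and not real arXiv data, and parse_entries is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom elements.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written placeholder feed (not a real API response).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>Placeholder title</title>
    <author><name>A. Author</name></author>
  </entry>
</feed>
"""


def parse_entries(atom_text):
    """Return a list of dicts with id, title and authors for each Atom entry."""
    root = ET.fromstring(atom_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS).strip(),
            "authors": [
                author.findtext("atom:name", default="", namespaces=ATOM_NS)
                for author in entry.findall("atom:author", ATOM_NS)
            ],
        })
    return entries


for item in parse_entries(SAMPLE_FEED):
    print(item["id"], item["title"], item["authors"])
```

With feedparser, the same fields should be available on the entries of the parsed feed, without any explicit namespace handling.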