Tuesday, July 12, 2011

How to send search requests to Journal Websites

When I search for papers, I simply use google search. Sometimes I use google scholar, but oddly enough, the former usually gives me better quality results. But in some areas of study people use journal websites directly, maybe because terminologies they use are quite general, thus they want to exclude non-academic websites. (Maybe they don't like Google? Actually I don't really know why :D)

However, you do not want to visit every journal website for every single query. If there are ten journals in your area of study, then visiting all ten journal websites should consume a lot of your precious time. So a friend of mine wants to make a web-page which can send search request to any journal website she wants to use.

If the request is done in GET, it is easy figure out what parameters you need. For example, if you search "graphs" in Google, the address bar on your browser shows the following URL:

http://www.google.de/search?sourceid=chrome&ie=UTF-8&q=graphs

So you can simply substitute "graphs" part by the word of your choice to request search to google. But when you search 'graphs' in APS (American Physics Society) website, it simply shows

http://publish.aps.org/search

because the request is done in POST. There are two ways of finding out parameters used in POST, I think. The first is to look at HTML source code of the web page which requests search. But usually HTML source codes are very unreadable, so reading it is not very fun. The other way is usually more efficient: to look at sent request. You may use fancy network monitoring tools, but actually it suffices to use Google Chrome.

To do this, click on 'Wrench' icon next to the address bar, go to 'Tools' menu, and turn on developer tools. Then a fancy window appears on bottom of the browser. Choose 'Network' tab, then you can see something like the following:

 From 'Form Data', you can see which parameters were requested in 'POST' message. In this case, it seems that 'q%5Bclauses%5D%5B%5D%5Bfield%5D' denotes the type of the field, and 'q%5Bclauses%5D%5B%5D%5Bvalue%5D' is the value of the field. Thus, by using the following URL:

http://publish.aps.org/search/query?q%5Bclauses%5D%5B%5D%5Bfield%5D=abstitle&q%5Bclauses%5D%5B%5D%5Bvalue%5D=graphs

you can search for 'graphs' in APS website. Similarly,

http://jcp.aip.org/search?key=JCPSA6&societykey=AIP&coden=JCPSA6&q=graphs&displayid=AIP&sortby=newestdate&faceted=faceted&sortby=newestdate&CP_Style=false&alias=&searchzone=2

Oh, in this case search was done in GET...!?

http://pubs.acs.org/action/doSearch?action=search&searchText=graphs&qsSearchArea=searchText&type=within&publication=40001010

Oh... it was also done in GET... So there was only one case that was done in POST... Why did I start writing this article in the first place... OTL Anyways, this is how to do it.

No comments:

Post a Comment