By Ken Mankoff
I’ve been working in the scientific domain since 1998, even though I only officially became a scientist in 2013 when I completed my Ph.D. When I began my Ph.D. I had two wishes: 1) open access publication, and 2) that data and software be considered, in addition to publications (and number of citations), as part of the official measure of a scientist's productivity and impact.
I was thrilled when, approximately two years ago, a popular movement toward open access publications began, recently formalized by presidential decree. Additionally, the National Science Foundation standard resume now has a section where one can list software or data products. With the growing popularity of DOIs, data can often be cited directly, and a growing number of journals suggest or require that data and code be submitted with the manuscript.
Since my two wishes, neither of which is easy to implement (one benefited from a presidential decree), appear to have almost come true, I have one more that I would like the scientific community to implement: geo-spatial-temporal-tagged publications.
More specifically: if a publication contains a map, it should also provide a KML (or equivalent) file that contains the same information as the map in digital form, and that file should be searchable. Maps should not be only printed graphics, just as source code is released as actual ASCII text files rather than as screenshots of your editor. Any publication with a geospatial component on a scale smaller than the entire planet should publish its geospatial and temporal information to one clearinghouse. Picture a world where Google Scholar, in addition to letting you search on author or year of publication, supports searching on latitude, longitude (perhaps Google Scholar Maps?), and year of data.
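To make the idea of a map "in digital form" concrete, here is a minimal sketch of what a publication's KML file might contain: one Placemark holding the study area as a polygon and the data period as a TimeSpan. The function name, the title, the DOI, and the coordinates are all hypothetical placeholders (the box is only a rough illustration near Pine Island Glacier, not a surveyed outline), and a real clearinghouse would likely want richer metadata.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def q(tag):
    """Return the namespace-qualified form of a KML tag name."""
    return f"{{{KML_NS}}}{tag}"


def publication_to_kml(title, doi, ring, begin, end):
    """Build a KML string with one Placemark for a publication's study area.

    ring is a list of (lon, lat) pairs; begin and end are ISO 8601 dates
    giving the period covered by the data (not the publication date).
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(q("kml"))
    doc = ET.SubElement(kml, q("Document"))
    placemark = ET.SubElement(doc, q("Placemark"))
    ET.SubElement(placemark, q("name")).text = title
    ET.SubElement(placemark, q("description")).text = f"doi:{doi}"

    span = ET.SubElement(placemark, q("TimeSpan"))
    ET.SubElement(span, q("begin")).text = begin
    ET.SubElement(span, q("end")).text = end

    polygon = ET.SubElement(placemark, q("Polygon"))
    outer = ET.SubElement(polygon, q("outerBoundaryIs"))
    linear_ring = ET.SubElement(outer, q("LinearRing"))
    # A KML ring is closed by repeating the first vertex at the end.
    closed = list(ring) if ring[0] == ring[-1] else list(ring) + [ring[0]]
    ET.SubElement(linear_ring, q("coordinates")).text = " ".join(
        f"{lon},{lat}" for lon, lat in closed
    )
    return ET.tostring(kml, encoding="unicode")


# Hypothetical example values, for illustration only.
kml_text = publication_to_kml(
    title="Hypothetical glacier study",
    doi="10.1234/example",
    ring=[(-102.0, -75.5), (-98.0, -75.5), (-98.0, -74.5), (-102.0, -74.5)],
    begin="2009-01-01",
    end="2009-12-31",
)
print(kml_text)
```

A file like this could be opened directly in any KML viewer, and a clearinghouse could index the polygon and the time span without having to interpret a printed figure.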
Whenever one changes scientific fields, or even just changes fjords in the same field, there is a high cost to learn the material for the new subject area. Much of that cost goes to finding the seminal publications for the new field, and citation count is one useful metric to determine key publications. But finding publications that specifically address the new location (neighborhood, glacier, sector of ocean, etc.) is also an important part of learning the details of the new site. Much of what we study contains a spatial component, and if one could search publications based on that spatial and temporal data, it would make learning about new fields, or new geographic areas in your existing field, much easier.
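As a rough illustration of what such a search might look like, the sketch below filters publication records by a latitude/longitude box and a range of data years. The record type, field names, and sample entries are all made up for this example, and the simple overlap test assumes boxes that do not cross the antimeridian; a real service would presumably use a proper spatial index.

```python
from dataclasses import dataclass


@dataclass
class PublicationFootprint:
    """Hypothetical record: a paper's bounding box and the years its data cover."""
    title: str
    west: float
    south: float
    east: float
    north: float
    year_start: int
    year_end: int


def search(records, west, south, east, north, year_start, year_end):
    """Return records whose bounding box and year range both overlap the query.

    Assumes west <= east everywhere (no antimeridian crossing).
    """
    hits = []
    for r in records:
        overlaps_space = (
            r.west <= east and r.east >= west
            and r.south <= north and r.north >= south
        )
        overlaps_time = r.year_start <= year_end and r.year_end >= year_start
        if overlaps_space and overlaps_time:
            hits.append(r)
    return hits


# Made-up sample records, for illustration only.
library = [
    PublicationFootprint("Hypothetical glaciology paper", -102, -76, -98, -74, 2007, 2009),
    PublicationFootprint("Hypothetical oceanography paper", -105, -75, -100, -73, 1994, 1994),
    PublicationFootprint("Hypothetical paper elsewhere", 10, 60, 12, 62, 2008, 2008),
]

# Everything touching this box with data from 2005 to 2010.
results = search(library, -103, -76, -99, -74, 2005, 2010)
print([r.title for r in results])
```

The same filter works across disciplines, since it only looks at where and when a study took place, not at its subject.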
Not only would this allow researchers to study new geographic areas or new domains more easily, but it would also support inter-disciplinary research. For example, I could zoom in to a well-studied fjord where I know the glaciology component, and also see oceanographic and atmospheric papers that model or study the same region. Although my domain and my examples use Earth Science disciplines, this proposal would benefit all fields with a geospatial component, for example economics, health and medicine, linguistics, etc.
When I began studying the Pine Island Glacier, my new field site in 2009, I produced a template of the system I describe above. For each publication I drew the points, lines, or area it covered, and recorded both its publication date and the time-stamp or time-range it addressed. Theoretical future modeling studies were assigned a start date of 2015. This rough alpha-prototype can be seen here: http://kenmankoff.com/maps/PIG/timemap/. The software used in my prototype is timemap, which is based on the MIT Simile Timeline. You can easily view your own papers in a timeline-only mode (no map) if you use paper management software such as BibDesk, Mendeley, Papers, or Zotero. If you use one of these or something similar, export your library to a BibTeX file, import it into Zotero, and then select “Create Timeline” from the “Tools” menu. The software allows you to filter and select keywords to highlight in different colors.
The final product would need to be auto-generated for all papers across all disciplines. One method to achieve this would be for publishers to require location and time to be included with the metadata that is part of the paper submission: author name and affiliation, title, abstract, etc. The place and date could then be encoded into the PDF, just as modern PDFs have the author name encoded, or the information could be encoded in the DOI record available at http://dx.doi.org/. Many people know that DOIs, when entered at that website, link to the “official” version of the publication, but I recently discovered that the site can also be accessed programmatically, without a web browser, and that an “official” BibTeX record can be requested. Date and place could therefore be encoded and delivered through http://doi.org. Finally, a website would need to provide a front-end to display the publication time map, either graphically (using Google Maps) or textually, for example the way Google Scholar currently supports searching metadata.
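For readers who have not tried requesting a BibTeX record from the DOI resolver, the sketch below shows one way to do it from Python using only the standard library. It relies on HTTP content negotiation: the request asks for the media type application/x-bibtex instead of a web page. The function names and the DOI in the usage comment are placeholders of mine, and whether a record comes back depends on the agency that registered the DOI.

```python
import urllib.parse
import urllib.request


def build_bibtex_request(doi):
    """Build a request that asks the DOI resolver for BibTeX rather than HTML."""
    url = "https://doi.org/" + urllib.parse.quote(doi)
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi, timeout=10):
    """Send the request (redirects are followed) and return the body as text."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (requires network access; replace the placeholder with a real DOI):
# print(fetch_bibtex("10.1234/example"))
```

If location and time were added to the same record, a client like this could retrieve them alongside the author and title without any new infrastructure on the reader's side.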
I intend to write proposals to NSF and other funding agencies to work on this project in the future. If anyone reading this is interested in collaborating, or at least benefiting from the results of this proposal, please contact me. The software already exists to support searching on these new dimensions. The only things missing are 1) a research project demonstrating that this method is effective and 2) a requirement from NSF or the President that all funded projects submit geospatial data for their publications, just as NSF requires all funded projects to include a data management plan.
Edit: I just discovered that this idea has been implemented in the health domain via the Health GeoJunction.