What’s sure to happen? What is contingent or uncertain? What’s already happening?
The linkages between articles and underlying data are a very real and active (though far from fully settled) area of development today. All the models — supplementary materials as part of articles, disciplinary repositories and institutional repositories — are in use today.
The problems raised by visual displays of data in scientific articles are becoming more widely recognized, and they frustrate more readers each day. Some of these problems (graphs, for example) are easily solved in prototypes. Others, such as the appropriate way to express data provenance, remain active and challenging research areas. But the real challenges here are about deployment, scale, adoption, and standards. We have decades of examples of systems that support the creation and use of digital documents far more powerful and expressive than those in common use today, but for a wide variety of reasons these systems never reached critical mass in deployment and adoption; reaching it has proven hugely difficult to predict or ensure at Internet scale.
Text mining is a reality today, at least on a limited basis, and is producing some results of real value. Here, I suspect, the barriers to progress will have more to do with business models for those journals that aren’t open access. Some open access journals actually package up a compressed archive of all their articles and invite interested parties to simply copy the files and compute away; clearly this is not going to be as straightforward for a commercial publisher.
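To make "copy the files and compute away" concrete, here is a minimal sketch of what such a computation might look like. It assumes, purely for illustration, that the archive is a tar file whose members are plain-text articles ending in `.txt`; real journal dumps vary in format and layout, and the function names `count_terms` and `build_demo_archive` are hypothetical.

```python
import io
import re
import tarfile
from collections import Counter


def count_terms(fileobj, terms):
    """Count how many articles in a tar archive mention each term.

    `fileobj` is a seekable binary file object holding a (possibly
    compressed) tar archive. Assumption: articles are UTF-8 .txt members.
    """
    wanted = {t.lower() for t in terms}
    counts = Counter()
    with tarfile.open(fileobj=fileobj) as tar:
        for member in tar:
            # Skip directories and anything that is not a plain-text article.
            if not (member.isfile() and member.name.endswith(".txt")):
                continue
            raw = tar.extractfile(member).read()
            text = raw.decode("utf-8", errors="replace").lower()
            words = set(re.findall(r"[a-z0-9]+", text))
            # Each article contributes at most 1 to a term's count.
            counts.update(wanted & words)
    return counts


def build_demo_archive():
    """Build a tiny in-memory .tar.gz standing in for a journal's dump."""
    articles = {
        "a1.txt": "Kinase inhibition reduces tumor growth.",
        "a2.txt": "We observe kinase activity in yeast.",
        "notes.pdf": "kinase",  # ignored: not a .txt member
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in articles.items():
            data = body.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    result = count_terms(build_demo_archive(), ["kinase", "tumor", "genome"])
    for term in ("kinase", "tumor", "genome"):
        print(term, result[term])
```

For an archive on disk, the same function could be called with `open(path, "rb")`. Serious literature mining would of course go well beyond counting words, but even this much is possible only when the full text can simply be copied.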
One thing is clear: in the cyberinfrastructure, the future of the article will be shaped not only by the requirements to describe changing scientific practices, but also by the changing nature of the authoring and usage environments for the scholarly literature that describes this changing science.
There is now a wealth of publications dealing with data and e-science. Particularly valuable are the article “The Data Deluge” by Hey & Trefethen;5 the NSF/NSB report on Long-Lived Datasets;6 the report of the NSF/ARL workshop “To Stand the Test of Time”;7 the current NSF Office of Cyberinfrastructure vision document;8 and a UK perspective.9 On institutional repositories, see the Lynch article in the ARL Bimonthly Report;10 for more information on the implications of literature mining, see, for example, the article in the book Open Access: Key Strategic, Technical and Economic Aspects.11
2. See Donald Kennedy’s editorial at www.sciencemag.org/cgi/content/summary/314/5804/1353 and the report itself at www.sciencemag.org/cgi/content/full/314/5804/1353/DC1
3. See Peter Murray-Rust’s Blog, wwmm.ch.cam.ac.uk/blogs/murrayrust/, and the various publications and presentations referenced there.
4. See, for example, www.arl.org/sparc/meetings/ala06/ for PowerPoint and audio.
5. Hey, T., Trefethen, A. “The Data Deluge: An e-Science Perspective,” Grid Computing: Making the Global Infrastructure a Reality. (Chichester: Wiley, 2003), pp. 809-24. www.rcuk.ac.uk/cmsweb/downloads/rcuk/research/esci/datadeluge.pdf
6. National Science Board. “Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century,” National Science Foundation, 2005. www.nsf.gov/pubs/2005/nsb0540/start.jsp
7. Association of Research Libraries. “To Stand the Test of Time: Long-term Stewardship of Digital Data Sets in Science and Engineering,” Association of Research Libraries, 2006. www.arl.org/pp/access/nsfworkshop.shtml
8. National Science Foundation Office of Cyberinfrastructure, www.nsf.gov/oci
9. Lyon, L. “Dealing with Data: Roles, Rights, Responsibilities and Relationships (Consultancy report),” UKOLN and the Joint Information Systems Committee (JISC), 2006. www.jisc.ac.uk/whatwedo/programmes/programme_digital_repositories/proj...
10. Lynch, C.A. “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age,” ARL Bimonthly Report, Vol 226 (February 2003), pp. 1-7. www.arl.org/newsltr/226/ir.html
11. Lynch, C.A. “Open Computation: Beyond Human-Reader-Centric Views of Scholarly Literatures,” Open Access: Key Strategic, Technical and Economic Aspects. Neil Jacobs (Ed.), (Oxford: Chandos Publishing, 2006), pp. 185-193. www.cni.org/staff/cliffpubs/OpenComputation.pdf