CTWatch
August 2007
The Coming Revolution in Scholarly Communications & Cyberinfrastructure
Paul Ginsparg, Cornell University

On a one-decade time scale, it is likely that more research communities will join some form of global unified archive system, free of the partitioning and access restrictions familiar from the paper medium, for the simple reason that it is the best way to communicate knowledge and hence to create new knowledge. The genomic and related resources described above are naturally interlinked by virtue of their common hosting by a single organization, a situation very different from that described earlier for astronomy research. For most disciplines, the key to progress will be the development of common web service protocols, common languages (e.g., for manipulating and visualizing data), and common data interchange standards, to facilitate distributed forms of the above resources. The adoption of these protocols will be hastened as independent data repositories adopt dissemination of seamlessly discoverable content as their raison d'être. The test parsings described above have natural analogs in all fields: astronomical objects and experiments in astronomy; mathematical terms and theorems in mathematics; physical objects, terminology, and experiments in physics; chemical structures and experiments in chemistry; and so on. Many of the external databases that would provide targets for such automated markup already exist.
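As a rough illustration of the automated markup mentioned above, the following minimal Python sketch links recognized terms in a passage of text to records in an external database. Everything specific here is an assumption made for illustration: the `ENTITY_INDEX` dictionary stands in for a real external database, the identifiers are invented, and `resolver.example.org` is a placeholder rather than an actual service.

```python
import re

# Hypothetical stand-in for an external database of named entities.
# The identifiers and URL pattern are illustrative assumptions only.
ENTITY_INDEX = {
    "Crab Nebula": "astro:M1",
    "Higgs boson": "phys:higgs",
    "benzene": "chem:benzene",
}
BASE_URL = "https://resolver.example.org/"  # placeholder, not a real service


def mark_up(text, index=ENTITY_INDEX, base_url=BASE_URL):
    """Wrap each known entity in `text` with a link to its database record."""
    if not index:
        return text
    # Longest names first so that overlapping terms prefer the longer match.
    names = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    lookup = {name.lower(): ident for name, ident in index.items()}

    def link(match):
        # Preserve the author's original capitalization in the link text.
        ident = lookup[match.group(1).lower()]
        return '<a href="{}{}">{}</a>'.format(base_url, ident, match.group(1))

    return pattern.sub(link, text)


print(mark_up("We observed the Crab Nebula and a sample of benzene."))
```

A dictionary lookup of this kind is only the simplest possible approach. A production system would need disambiguation and a shared protocol for querying the distributed databases, which is exactly where common interchange standards would matter.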

One of the surprises of the past two decades is how little progress has been made in the underlying document format employed. Equation-intensive physicists, mathematicians, and computer scientists now generally create PDF from TeX, a methodology based on a pre-1980s print-on-paper mentality and not optimized for network distribution. The implications of widespread usage of newer document formats such as Microsoft's Office Open XML or the OASIS OpenDocument format, and the attendant ability to extract semantic information and modularize documents, are scarcely appreciated by the research communities. Machine learning techniques familiar from artificial intelligence research will assist in the extraction of metadata and classification information, helping authors and improving services based on the cleaned metadata. Semantic analysis of the document bodies will facilitate the automated interlinking to external resources described above and lead to improved navigation and discovery services for readers. A related question is what authoring tools and functions should be added to word processing software, both commercial and otherwise, to provide an optimal environment for scientific authorship. Many of the interoperability protocols for distributed database systems will equally accommodate individual authoring clients or their proxies, and we can expect many new applications beyond real-time automated markup and autonomous reference finding.
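To make the point about extractable semantic information concrete, here is a minimal Python sketch that reads basic metadata from an OpenDocument file. It relies on the fact that an OpenDocument file is a ZIP archive whose `meta.xml` member carries Dublin Core and OpenDocument metadata elements. The function name `read_odf_metadata` and the choice of fields are assumptions made for illustration, not part of any standard tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside an OpenDocument meta.xml file.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}


def read_odf_metadata(source):
    """Return title, creator, and keywords from an OpenDocument file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("meta.xml"))
    meta = root.find("office:meta", NS)
    if meta is None:
        return {"title": None, "creator": None, "keywords": []}
    return {
        "title": meta.findtext("dc:title", default=None, namespaces=NS),
        "creator": meta.findtext("dc:creator", default=None, namespaces=NS),
        "keywords": [k.text for k in meta.findall("meta:keyword", NS) if k.text],
    }
```

Calling `read_odf_metadata("paper.odt")`, where `paper.odt` is a hypothetical file name, would return a dictionary of whatever metadata the author's word processor recorded. Nothing comparable can be recovered as directly from a PDF generated from TeX, which is the contrast the paragraph above draws.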

Every generation thinks it's somehow unique, but there are nonetheless objective reasons to believe that we are witnessing an essential change in the way information is accessed and in the way it is communicated, both to and from the general public and among research professionals. These fundamental methodological changes will lead to a terrain 10-20 years from now that differs from today's more than today's differs from that of 10-20 years ago, and more than in any comparable time period.



Reference this article
Ginsparg, P. "Next-Generation Implications of Open Access," CTWatch Quarterly, Volume 3, Number 3, August 2007. http://www.ctwatch.org/quarterly/articles/2007/08/next-generation-implications-of-open-access/
