CTWatch
August 2007
The Coming Revolution in Scholarly Communications & Cyberinfrastructure
Paul Ginsparg, Cornell University

1. Background

The e-print arXiv 1 was initiated in August 1991 to accommodate the needs of a segment of the High Energy Physics theoretical research community. (For general background, see Ginsparg, P. "Winners and losers....". 2) Because this community had a prior history of paper preprint distribution (for a history, see O'Connell, H. B. "Physicists Thriving..."3), it was natural for it to port its prepublication distribution to the internet, even before the widespread adoption of the World Wide Web in the mid-1990s. A primary motivation for this electronic initiative was to level the research playing field by providing equal and timely access to new research results to researchers all over the globe and at all levels, from students to postdocs to professors.

arXiv has since effectively transformed the research communication infrastructure of multiple fields of physics and will likely continue to play a prominent role in a unified set of global resources for physics, mathematics, and computer science. It has grown to contain over 430,000 articles (as of July 2007), with over 56,000 new submissions expected in calendar year 2007, and over 45 million full text downloads per year. It is an international project, with dedicated mirror sites in 16 countries and collaborations with U.S. and foreign professional societies and other international organizations, and it has also provided a crucial life-line for isolated researchers in developing countries. It also helped initiate, and continues to play a leading role in, the growing movement for open access to scholarly literature. The arXiv is entirely scientist-driven: articles are deposited by researchers when they choose - prior to, simultaneously with, or after peer review - and the articles are immediately available to researchers throughout the world.

As a pure dissemination system, it operates at a cost 100 to 1000 times lower than that of a conventionally peer-reviewed system.4 This is the real lesson of the move to electronic formats and distribution: not that everything should somehow be free, but that with many of the production tasks automatable or off-loadable to the authors, the editorial costs will then dominate the costs of an unreviewed distribution system by many orders of magnitude. Even with the majority of science research journals now on-line, researchers continue to enjoy both the rapid availability of the materials, even if not yet reviewed, and open archival access to the same materials, even if held in parallel by conventional publishers. The methodology works within copyright law, as long as the depositor has the authority to deposit the materials and assign a non-exclusive license to distribute at the time of deposition, since such a license takes precedence over any subsequent copyright assignment.

The site has never been a random UseNet newsgroup or blogspace-like free-for-all. From the outset, arXiv.org relied on a variety of heuristic screening mechanisms, including a filter on institutional affiliation of submitter, to ensure insofar as possible that submissions are at least "of refereeable quality." That means they satisfy the minimal criterion that they would not be peremptorily rejected by any competent journal editor as nutty, offensive, or otherwise manifestly inappropriate, and would instead at least in principle be suitable for review. These mechanisms are an important - if not essential - component of why readers find the arXiv site so useful.

The arXiv repository functions are flexible enough either to co-exist with the pre-existing publication system or to help it evolve into something better optimized for researcher needs. A recent study5 suggests the extent to which researchers continue to use the conventional publication venues in parallel with arXiv distribution. While there are no comprehensive editorial operations administered by the site, the vast majority of the 56,000 new articles per year are nonetheless subject to some form of review, whether by journals, conference organizers, or thesis committees. Physics and astronomy journals have learned to take active advantage of the prior availability of the materials, and the resulting symbiotic relation was not anticipated 15 years ago. Though the most recently submitted articles have not necessarily yet undergone formal review, most can, would, or do eventually satisfy editorial requirements somewhere.

arXiv's success has relied upon its highly efficient use of both author and administrative effort, and it has served its large and ever-growing user base with only a fixed-size skeletal staff. In this respect, it long anticipated many of the current "Web 2.0" and social networking trends: providing a framework in which a community of users can deposit, share, and annotate content provided by others.

Reference this article
Ginsparg, P. "Next-Generation Implications of Open Access," CTWatch Quarterly, Volume 3, Number 3, August 2007. http://www.ctwatch.org/quarterly/articles/2007/08/next-generation-implications-of-open-access/
