Web search engines have become a generally accepted feature of contemporary life. When you think about it, this is rather surprising, since the things have only been around for about ten years (the BG considers 1994 to have been the natal year for these products). Engines were devised to overcome a very serious problem with the first-generation Internet/web: it was very difficult to find material on a topic without a lot of tedious surfing, helped along by strong doses of luck. The story of web search engines, how they became what they are, and their effect on the larger culture is an interesting saga, since it blends high-tech geekery, capitalism, psychology, marketing hype, and idealistic visions of an information society into one gooey mass. Explaining it all calls for a chronicler who understands the technical side, is familiar with the rather bizarre mental map of Internetland, and is also something of a social scientist. In fact, it calls for a whole railroad carload of such people, working thoughtfully and carefully. We may have a down payment on a really useful study of the emergence of search engines in the form of a book by John Battelle, called The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture. The author has good credentials, having been one of the original editors of WIRED magazine, the New Yorker of the digerati, and also one of the founders of the now-defunct Industry Standard, a well-regarded Internet industry dope sheet which didn't survive the dot-com massacres. Battelle's book is not the first study of Google as a company or as a cultural phenomenon, but his survey of other search products, his insider connections, and his status may give him another perspective. Anyway, we'll watch it.
The National Center for Biotechnology Information (NCBI) has been doing interesting things to make the wealth of biological data available to bench researchers and clinicians. The PubMed interface to MEDLINE has links across the top of the screen, and on a drop-down menu, to databanks such as the Taxonomy Browser, GENOME, and Online Mendelian Inheritance in Man (OMIM). Reports of research in genetics, protein science, molecular biology, and other related disciplines offer the prospect of deeper understanding of the properties of life, and of greater insight into the causes of, and possible therapies for, human and animal diseases. Keeping the flood of information accessible and letting it serve as a basis for further biological discovery is one of the missions of the NCBI. A new search toolbar has been devised which can reside on a workstation. The tool lets a researcher search either the entire fleet of NCBI databanks at once or a particular databank suited to the query. The toolbar can be downloaded from the NCBI website. The download doesn't take long, and the tool works very well.
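For those who prefer scripting to toolbars, the same fleet of databanks can also be reached through NCBI's Entrez E-utilities web interface. Here is a minimal sketch, in Python, of querying several Entrez databanks with a single search term; this is the BG's own illustration (the example term and the particular databanks chosen are assumptions), not part of the NCBI toolbar itself.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL of the NCBI Entrez E-utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def entrez_search(db, term, retmax=5):
    """Run an Entrez ESearch query against one databank; return matching record IDs."""
    params = urllib.parse.urlencode({"db": db, "term": term, "retmax": retmax})
    with urllib.request.urlopen(f"{EUTILS}/esearch.fcgi?{params}") as resp:
        tree = ET.parse(resp)
    # ESearch returns an XML <IdList> of matching record <Id> elements.
    return [e.text for e in tree.findall(".//Id")]

# Query a few of the databanks mentioned above with a sample term.
for db in ("pubmed", "omim", "genome"):
    print(db, entrez_search(db, "cystic fibrosis"))
```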
The National Library of Medicine’s TECHNICAL BULLETIN has a descriptive article with links to the download:
The Blogging Grouch extends to all readers of LibraryLink, and to their families, friends, pets, houseplants and other life forms as may be thereunto attached, his very best wishes for a happy, healthy, safe and, if possible, relaxing Thanksgiving Day holiday. Don’t forget to help with the dishes. And, try to take a walk afterwards.
Senior faculty from two of California’s major universities have circulated an open letter to their colleagues everywhere, urging them to consider charging the publishers of high-priced journals a service fee to cover the work done by university faculty and support staff in the acquisition, preparation, and release of manuscripts. Theodore Bergstrom of UC Santa Barbara and R. Preston McAfee of Caltech point out that universities support extensive editorial functions carried out by researchers and staff on manuscripts which are then published in journals at what the authors consider exorbitant subscription prices. The authors explain their position this way: previously, publishers took care of the business end of publication and academics supplied the content, and at least some support work by academic staff was accepted as part of the unwritten deal. Publishers made a modest profit, and scholars got exposure for themselves and their institutions by contributing to the scholarly record. Win-win, as they say nowadays. McAfee and Bergstrom are convinced that commercial publishers have abandoned their side of the covenant through irresponsible and predatory pricing. These are not new charges, but the suggested countermeasure certainly is: send them a bill. Universities should create a list of offending publishers and charge them editorial costs. Publishers who are not gouging would not be affected. On reflection, it’s hard to duck the conclusion that schools have been doing a lot of free labor for publishers, so maybe M and B are on to something. It’s only fair to get some dough back for the very considerable services editors and their staffs perform, when those services are helping big publishers hit big profit targets. Stop being a bunch of saps and chumps. Sock it to ‘em.
The Association of Research Libraries (ARL) has endorsed the course of action recommended recently by a team representing various college and university libraries, meeting under the auspices of the prestigious and influential Mellon Foundation. While the move to endorse the Mellon Report is no surprise, the relative promptness of the ARL’s reaction is a sign that the body, which represents more than 120 research libraries located in North America, takes the recommendations very seriously. As they should, since the long-term preservation of, and consequent access to, scholarly materials published in electronic journals is by no means guaranteed. The ARL has assigned the drafting of suitable objectives and action items based on the Mellon Report to its Scholarly Communication Steering Committee. The more attention is brought to this matter, the better. Electronic publication of academic journals is no longer an experiment. Many of the kinks have been worked out. There is no longer any excuse for delay in dealing with the preservation problem, since the amount of digital content is already very great and is growing rapidly.
Nature continues to worry about plagiarism. Some reports suggest a sharp increase in scientific misconduct cases involving alleged plagiarism. The BG will assent to the proposition that many authors are lifting text from other authors, or from their own past work, and inserting it without proper attribution and acknowledgement into their current manuscripts or grant applications. The folks at Nature are at the center of science journal publishing and enjoy the benefits of an extensive and active “bush telegraph” apart from what they can learn from open sources, so they probably have a good idea of what’s going on. The authors suggest that laboratory mentors insist that younger researchers get into good citation habits right away, at the bench top, and not wait until publication deadlines are approaching. Attitudes toward acknowledging prior work may be looser than in the past, as a result of some kind of “all pals together” attitude. And the presence of all that e-text, so easy to snip out and move around, may be a very big temptation. Nature wants journal editors and research institutions to pursue alleged cases more vigorously, but it seems that the limits of the possible are very soon reached. And who is to say that only younger investigators are responsible for the uptick? The established Old Walruses may be energetic snippers themselves, especially of their own stuff. When you have two or three grant applications to finish, things can get very tense.
The Mellon Foundation, the one with all the bucks, has released a report which calls for serious and immediate attention to the problems involved in preserving the growing body of content found in scholarly electronic journals. The report results from a meeting held at the Foundation’s HQ in NYC on September 9th. Present were “academic librarians, university administrators and others”. The group had a very, very heavy Northeast slant: there was one rep from the Left Coast, two from the Midwest, and two from the South, sort of, if you count Duke and Virginia. But despite the geographic skew, the committee came up with some interesting conclusions.
The Meat: Content contained in electronic journals exists on the computers of publishing companies, which may not be able or willing to maintain its integrity into the future, and somebody should do something about this. Libraries used to own the journals they bought, and the robustness of print on paper and the dispersion of copies among many libraries assured a nice degree of preservation. Now they lease content, more and more of which sits on the servers of fewer companies, and in a rather fragile form. Librarians have been saying this for a while now, but the Mellon report does a big service in putting its name behind efforts to deal with a situation that will soon move from serious to critical. Most of the Academy is utterly clueless about this question, since they are dazzled by the convenience of desktop access and just assume that Somebody is taking care of it. Sorry, no. The technical challenges are pretty tough, the intellectual property concerns are thorny, and the available Somebodys are short of cash, a lot of which will be needed to float the development and deployment of proper measures.
At a time when libraries are canceling print versions of academic journals, the doubts about the long-term preservation of digital materials are enough to make us wonder if we are in for another Great Forgetting, like the one that accompanied the end of the Roman Empire, in the West at least.
The BG has been wondering what’s going on with institutional repositories (IRs), so was happy to find a story about them in Research Information, a site managed by Europa Science Ltd. An IR is created to allow the archiving, in digital form, of materials created at a particular college or university. The purpose of an IR is to ensure preservation and improve access to scholarly materials, in a kind of alternative form of publication. Content description schemes and harvesting protocols can ensure that content in an IR is searchable and retrievable. When viewed in the context of rising materials prices and stunted library acquisitions budgets, the IR concept can start to look pretty good. Nadya Anscombe’s article covers the basics well, and then goes on to survey where IRs have been started and by whom. European countries are moving right along, with Germany and the UK roughly at the head of the herd. Some of the German universities have several functioning IRs. She also provides a quick overview of the software products now being used in IR hosting, and the Grouch was surprised to see how many products are available, about half of which are from the USA. Anscombe remarks that the biggest problems with an IR have more to do with motivating authors to see the value in such a mechanism and provide content than with technical matters. Libraries may have a big role to play in establishing and maintaining an IR, at least in some analyses. But the BG hastens to point out, in time to spoil the fun, that an IR can start to cost big money very soon if it’s done properly. Running it out of somebody’s back pocket is not realistic for a serious operation, and if it’s not to be taken seriously, why do it?
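To make “harvesting protocols” concrete: IR platforms commonly expose their metadata through OAI-PMH, so that any harvester can pull Dublin Core records and index them, whatever software the institution happens to run. Below is a minimal sketch, in Python, of the harvesting side; the repository endpoint URL is hypothetical, and the code is the BG’s own illustration, not something drawn from Anscombe’s article.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH spec and by Dublin Core.
OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"

def list_titles(base_url, limit=10):
    """Fetch one page of ListRecords (oai_dc metadata) and print record titles."""
    params = urllib.parse.urlencode({"verb": "ListRecords",
                                     "metadataPrefix": "oai_dc"})
    with urllib.request.urlopen(f"{base_url}?{params}") as resp:
        tree = ET.parse(resp)
    for record in tree.findall(f".//{{{OAI}}}record")[:limit]:
        title = record.find(f".//{{{DC}}}title")
        print(title.text if title is not None else "(no title)")

# Hypothetical endpoint; any OAI-PMH-compliant repository would do.
list_titles("https://repository.example.edu/oai/request")
```

The point of the exercise: because the protocol is the same everywhere, a record deposited in any compliant IR is findable by any harvester, which is exactly what makes the IR a plausible alternative channel of publication.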