Not much of what I am going to say is new. It's just a story, a case history that should nail down some facts for you.
You know (or you should by now) the old cliche about low cost, high quality, and timeliness in publishing -- pick any two.
Forget it. Nothing in the world of electronic publishing is low cost. Maybe eventually, but not in the foreseeable future and certainly not in time to help with next year's budget pass.
Before I get any further, I'd like to establish some common ground in terminology. Have you all had occasion to speak with in-house computer department staff, or with software or hardware sales representatives? If you have, then you have heard them use the phrase "No problem!" when you asked about capabilities. And, by now, you know that this is a technical phrase and that it means --
"Whatever you want to do, it will take twice as long and cost three times as much as you ever imagined."
We've all been talking about electronic journals and full-text for about fifteen years. And, a few have actually been in existence all that time: the American Chemical Society electronic versions of their paper journals, Stevan Harnad's Psycoloquy, and others. Of course, they didn't have any tables or figures, and ASCII text is not exactly a delight to read, but they were there. Most of them also struggled to get submissions, because, even more than a new paper journal, the electronic journals appeared ephemeral and not substantial enough to count when it came to tenure decisions. And, they all cost a lot, if only in volunteer labor. None of them broke even, as far as I know, without that volunteer labor and grant money.
I came up in my career from PsycINFO and the world of online secondary services. When I transferred to the Publications Department, I thought it would only be a couple of years before we could offer the APA journals in some electronic form. After all, there was "no problem" in doing so.
Well, we've had the journal tables of contents and the abstracts up on our web site for over a year, and we've put up a few full-text articles we think have general interest. (They probably do -- we get almost a million hits a month on our site and the journals get about 300,000 of those.)
Even that little bit of information is slow getting to the screen at this time, because we use printer files in many cases and because existing staff and a few freelancers are doing the HTML coding. "Downsizing" is everywhere, and although I am getting some staff increases, it is never enough to keep up with the increases in work. Sound familiar? It's the old "do more with the same or less" school of management.
And what about real electronic journals, not just bits and pieces? Well, the technology has improved substantially, particularly in the last few years, and there are more electronic journals, and electronic versions of paper journals -- on CD-ROMs and on the Internet. So why aren't we all out there? What's been the hangup? Some people are out there -- like MIT Press -- where are the rest of us?
Lord knows, the library community has said "No more money for subscriptions, but we've got lots for computers!"
The technology is there, but it is not always easy to use or cheap.
To illustrate: We could do page images on CD-ROMs by digitizing pages through a scanner. But, that's expensive, takes up a lot of disk space and is not searchable. And, everyone tells me they want searchable full-text. For that we could use an OCR -- Optical Character Recognition -- scanner to digitize the text. They seem to be fairly accurate -- the one we have gives 99.4% accuracy.
Sound good? The OCR vendor told us, "Hey, no problem." But, reality is not so attractive. Take an average page with 3,115 characters and you will get 18-20 errors on that one page. And, they are not just broken characters. The OCR has trouble with some letters and numbers. Randomly, of course. Well, that's just not acceptable.
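The arithmetic behind that error count can be sketched in a few lines. This is a rough illustration only: it assumes the vendor's 99.4% figure is a per-character accuracy and that errors are spread evenly, and the function name `expected_errors` is made up for the example.

```python
def expected_errors(chars_per_page: int, accuracy: float) -> float:
    """Expected number of wrong characters on a page, assuming
    `accuracy` is the share of characters recognized correctly."""
    return chars_per_page * (1.0 - accuracy)


# Figures from the talk: 3,115 characters per page at 99.4% accuracy.
per_page = expected_errors(3115, 0.994)
print(round(per_page, 1))  # about 18.7 errors on a single page
```

Under those assumptions the result lands at the low end of the 18-20 range quoted above, and every one of those errors would still need a proofreader to find it.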
Then we looked at scanning in the figures and photos and going for computerized text. The two most popular choices these days seem to be Adobe PDF and/or SGML coding. We have chosen to SGML code the journals for a number of reasons -- among them the belief that SGML has long-term applicability, that it allows us to programmatically convert to HTML files for the Internet and to use the ICADD standard to produce materials for the visually-impaired, and generally, to preserve options down the line.
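To give a feel for what "programmatically convert to HTML" can mean at its very simplest, here is a minimal sketch. The SGML element names (`atl`, `sec`, `p`) and the mapping are hypothetical, not taken from the APA DTDs, and the code ignores attributes, entities, and omitted end tags, all of which a real conversion would have to handle.

```python
import re

# Hypothetical SGML element names mapped to HTML tags. A real mapping
# would be derived from the journal DTD, which is not shown in this talk.
TAG_MAP = {"atl": "h1", "sec": "h2", "p": "p"}


def sgml_to_html(text: str) -> str:
    """Rename known attribute-free SGML start and end tags to HTML tags."""
    def rename(match: re.Match) -> str:
        slash, name = match.group(1), match.group(2).lower()
        html = TAG_MAP.get(name)
        # Leave unknown tags untouched so nothing is silently dropped.
        return f"<{slash}{html}>" if html else match.group(0)

    return re.sub(r"<(/?)([A-Za-z][A-Za-z0-9]*)>", rename, text)


sample = "<atl>Book Review</atl><p>A short paragraph.</p>"
print(sgml_to_html(sample))
```

The point of the sketch is only that once the structure is coded consistently, a program can re-target it to another format without anyone rekeying the text.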
Have you ever seen an SGML coded document? Usually, when you get a demonstration, the salespeople show you something nice and simple. You wonder, how hard can it be? A simple code for the beginning and end of a paragraph?
Of course, the first page of a document can get a bit more complex, to say the least, with levels of coding that start to significantly interfere with what you can read on the screen. I suppose it is possible to get authors to do this kind of coding, but, I don't know: I have trouble getting them to write English. I mean, they won't even spell check.
And when things get really complex, like for references or a table, there is no way to get authors to handle it. If you look at an example you will see what looks like pure code and a coded table can take up three or four screens.
Eventually there may be word-processing programs that all the authors use that will provide us with SGML files automatically. In the meantime, publishers need to do it, and that raises a money issue. If I train my staff to do it, they are going to want more money, and it might not be in our best interests -- good copyeditors don't necessarily make good coders, and good coders may well not have a good feel for language.
So, that leaves freelancers and service providers, most notably printers who are looking for ways to preserve and add to typesetting revenues. Can they provide me with SGML coded files for my journals? Hey, no problem.
Of course, in order to do SGML coding, there is a preliminary step you have to go through. It is called the development of a Document Type Definition or DTD. What's a DTD? It's sort of an outline of the different SGML codes that will handle a certain set of publication specifications. How complicated can that be? Very complicated. Even a simple DTD for a book review article can look like pure code, and it can get worse. Much worse.
It's six months later and we have finally (I hope) finished developing the DTDs for APA journal articles. The bill has not yet come in.
Ten years have passed since I originally thought there would be "no problem" in developing electronic journals. That's about right: It has taken twice as long, and although the bills aren't all in yet, I am somehow quite sure the costs will be at least three times as much as anticipated.
What now?
We will be going back and getting SGML coded files for all 1996 issues of our book review journal shortly. As for the other 32 APA journals, all their 1997 issues will be SGML coded. We will probably offer some of our members an electronic test subscription in 1997 to check things out. For 1998, we'll offer a group of journals and probably by 1999, we'll have all of them available.
So, the BIG question arises: How will I charge for these subscriptions?
We all know that I won't have postage, paper, or binding costs for them. So, I should be able to provide them for less, right? But what about all that coding? The printers are not going to provide these files to me for free. They've invested a lot in software to convert files, staff training to do tagging, programming time to convert DTDs, etc. We'll have to pay for that one way or another.
One quote I got for SGML coding was for about $5.50 - $5.75 per printed page, which translates for a 600-page quarterly journal into another $6,000 a year. For my biggest journal, that's another $14,000. Of course, I'm not sure about that, because we haven't done it yet and all I've heard is "No problem," so you know I am not confident. (I do figure that as soon as they figure out that women, and particularly women off-shore, can be trained on the coding, the costs will come down.) But, no matter what, I will still have to print the journal, because not everyone in the world is ready for an electronic-only journal, including the US library community.
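For anyone who wants to check a per-page quote against an annual total, the calculation can be run backward. The sketch below is illustrative only: the function name `implied_pages` is invented here, and the talk does not say how the page counts break down by issue.

```python
def implied_pages(annual_cost: float, rate_low: float, rate_high: float):
    """Range of coded pages per year implied by an annual cost
    and a per-page rate quoted as a low-high range."""
    # The higher rate buys fewer pages, so it gives the lower bound.
    return annual_cost / rate_high, annual_cost / rate_low


# Figures from the talk: $5.50 - $5.75 per printed page.
for total in (6_000, 14_000):
    low, high = implied_pages(total, 5.50, 5.75)
    print(f"${total:,} a year -> about {low:,.0f} to {high:,.0f} pages")
```

At the quoted rates, $6,000 corresponds to somewhere around 1,040 to 1,090 coded pages a year, and $14,000 to roughly 2,430 to 2,550.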
There's some thought that electronic journals are going to -- somehow -- solve all the problems of library budgets, that libraries will pay less, have less to store, get more of what they want, and get more useful information. There are trade-offs: no postage, paper, or binding, but added complex coding, linkages, and maintenance of an online file in useful forms. That last is new. Publishers used to print and that was the end of it; in the electronic world, they may have to take over the archival function.
I may sound a little cynical and you may hear a subtext that there's going to be a whopping price increase, but I actually don't think so. I can't promise you what all the publishers will be doing, but the general feeling seems to be that for the first couple of years, at least, the price for electronic products will be anywhere from 90-120% of the paper subscription, and that much will depend on whether you are and continue to be a subscriber to the paper product, the linkages created, and the historical file maintained.
We all share an interest in having budgets that don't vary dramatically in ways that hurt us. The potential of electronic journals for scholarship is tremendous. The ability to create linkages should make for some fascinating discoveries. There are all sorts of possibilities.
But -- is it a roller coaster ride that will be fun but leave us off where we got on? Having spent some money and gained a few grey hairs? Or, is it like travel and education that broadens our horizons and leads us to new and wonderful ways of looking at the world? Having spent some money and gotten a few more grey hairs?
The potential is huge and exciting. And, our mutual interest in taking advantage of that potential is what will help us work out any difficulties along the way.
IP News, Fall 1996