Tuesday, 30 January 2007

AAP PR campaign: opinion

The last week or so has seen an explosion of discussion over the hiring by the Association of American Publishers of a well-known PR firm whose director is known as the 'pit bull' of the PR community. Others have covered the details, so I won't go over them here. Instead, check out the coverage at Peter Suber's blog.

It's already been pretty heavily commented on, so I wasn't going to add anything, but I've not yet seen the words of warning that immediately sprang to my mind when I read about this. Most commentary has been along the lines of "they know they're backed into a corner, and they're fooling nobody". While I agree that those of us on the other side of the fence are not fooled by this, it is not us that they are concerned with. If Microsoft want to outdo Apple, they don't market to Apple employees, saying "we're better than you, so just give up". Whether or not we know that this is just FUD is irrelevant - it is the people who ultimately make the decisions that are the targets of a campaign like this, and those people are our practicing academics, and, to a degree, members of the public.

We are all aware that people will believe the most ridiculous things if they're told them the right way, and being a top academic does not change that (I've seen some "interesting" opinions on OA from very senior staff). The battle is between the links twixt us and the academics and the links twixt publishers and the academics. If the AAP can convince their authors that OA is bad/wrong/immoral/censorship then we have a serious problem on our hands.

An analogous situation might be the Linux vs Windows argument, which has been raging for some time in this PR zone. Linux might be in the right, but at every step of the way Microsoft have yet more cards to play to maintain their stranglehold monopoly. I don't think we've seen the end of this; in fact, I would say that we are only just over the Fuseki, and the middle-game is now underway. We cannot allow ourselves to relax in the knowledge that the publishers have admitted that we're right, and a threat, because we've always known that.

How does a loose community (by necessity) such as the Open Access community combat a well-directed organisation which is seriously motivated to see itself prevail? If you know the answer to that, then it won't just be this dispute which we can solve.

DSpace Architecture Review Report

Following a meeting at MIT, in Cambridge, Massachusetts in October 2006, the findings of the DSpace Architecture Review have now been presented at Open Repositories 2007:


Many thanks to John Mark Ockerbloom for all his work, and for guiding the architecture review group.

Monday, 29 January 2007

The Institutional Repository: sales figures for year 1

Self-indulgent though it might be, I am pleased to report the first annual sales figures for the book The Institutional Repository, which I co-wrote with Theo Andrew and John MacColl of the University of Edinburgh: it has sold a total of 737 copies this year. This is well in excess of what I had anticipated, so big thanks to any and all of you out there who purchased a copy.

Open Repositories 2007: preliminary feedback

I was, for various reasons, unable to attend the Open Repositories 2007 conference in San Antonio last week. Although the presentations themselves don't appear to have made it online yet (I'll post when they do), there has been plenty of blogging going on, especially over at Jim Downing's blog:


And to save the clicks, here are some that he's linked to:


Plus some great summaries from Dorothea Salo, for which, as a non-attendee, I am very grateful:


Promise I'll be there for Open Repositories 2008 in Southampton. It is only just up the road!

EDIT: All the links were broken the first time I published this. All fixed now.

Thursday, 25 January 2007

EPrints 3.0 Released

The release of EPrints.org 3.0 was announced yesterday at the Open Repositories 2007 conference in San Antonio


DSpace developer though I am, the Open Source voice in me reminds us that diversity in software and breadth of choice is part of the point, so congratulations to the EPrints.org team on their latest release.

Mind you, despite the press release saying "EPrints is already the world’s leading software for producing open access institutional repositories", the Registry of Open Access Repositories (ROAR) lists (at time of writing) 218 EPrints.org repositories and 223 DSpace ones. I'm just saying ;)

Thursday, 18 January 2007

Knowledge Exchange Workshop 16 - 17 January

I have just returned from a very interesting workshop organised by the Knowledge Exchange organisation, on the topic of Interoperability and Institutional Repositories. There were around 70 experts from the 4 countries involved in Knowledge Exchange (UK, Denmark, The Netherlands, and Germany) discussing the following broad topics in the context of interoperability:

  • e-theses


  • Research Paper Metadata

  • Usage Statistics

  • Exchanging Research Information

  • Author Identification

The findings of each group should be made public shortly, and I will be sure to post the location of any resources that I am aware of.

In the meantime I can present only the outline of the findings of the group I was in: Exchanging Research Information. This was focussed around the possibility for integration or interoperation between Current Research Information Systems (CRIS) and Open Access Repositories (OAR). There were representatives from both communities, and a large part of the meeting was for each of us to understand the other. The Common European Research Information Format was introduced to us, in light of the upcoming release of the latest revision.

It was initially felt, especially by the CRIS community, that interactions between CRIS and OARs would be very one way, and that the CRIS would simply make available the relevant information for the OARs. This doesn't strike me as being the definition of Interoperability, and so it was necessary for us to examine what was really the relationship between the data held by each system.

The approach that was taken was that a simple use case was analysed for the following features:

1) What information it would need to encompass
2) Where the information could be obtained
3) Where the information would be of interest

The use case is the traditional repository use case of "Deposit", although it was necessary to formulate this in a more general way as a "Publication Registration Process". This allowed us to successfully abstract away from where the User Interface for such a registration process lay, and thus to take away some of the arguments over whether this was the domain of the CRIS or the OAR.

Throughout the meeting, the discussion was very wide ranging, but out of it were extracted some important similarities and differences between CRIS and OARs. The most basic formulation of the key difference is as follows: the CRIS's primary interest is in high-quality, accurate metadata, while the OAR's primary interest is in content, and it can live with lower-quality metadata. This exposes two things: how the CRIS can be of benefit to the OAR, and how the problem domains do not overlap quite as much as it might first appear. My conclusion from this is that the interoperability we are talking about is actually about finding the layer at which these two systems' domains can be stitched together for the benefit of the research community.

With this discussion under our belts, then, we enumerated first the information that CRIS are interested in, and then the information that OARs are interested in. The following list is not exhaustive, but gives an example of the differing perspectives:

  • CRIS

    • project information

    • bibliographic metadata

    • researcher role

    • scientific impact

  • OAR

    • bibliographic metadata

    • administrative metadata (technical, preservation, etc)

    • collection/group information

    • full-text / content

    • {persistent} identifier
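The two lists above can be compared as plain sets to show how small the overlap is. This is only an illustrative sketch: the category names are copied from the lists, while the variable names are my own and do not come from any CRIS or repository software.

```python
# Illustrative only: categories are taken from the lists above;
# the variable names are hypothetical.
cris_interests = {
    "project information",
    "bibliographic metadata",
    "researcher role",
    "scientific impact",
}
oar_interests = {
    "bibliographic metadata",
    "administrative metadata",
    "collection/group information",
    "full-text / content",
    "persistent identifier",
}

# The shared layer is where the two systems could be stitched together.
shared = cris_interests & oar_interests
cris_only = cris_interests - oar_interests
oar_only = oar_interests - cris_interests

print("shared:", sorted(shared))
print("CRIS only:", sorted(cris_only))
print("OAR only:", sorted(oar_only))
```

With these example lists, bibliographic metadata is the only category that appears on both sides.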

The resulting analysis of the use case showed that information needed to come from all corners to achieve this process, including the special case of author information, which may come to the process from yet another system, albeit via the CRIS.

The general consensus of the meeting is that a working group needs to look closely at the interactions going on in this and other use cases, and specify some set of interfaces and content models that can allow for interchange of the relevant data. This should be followed by a reference implementation and service. It was proposed that the basis for a project looking at these issues might consider e-theses and other grey literature, as they may prove to be the easiest place to start.

It was good to see plenty of crossover between this and other strands. The bibliographic metadata obviously mattered to the Research Paper Metadata group, while starting with e-theses and grey literature will matter to the E-Theses group. That author names may have to come from some third-party system may well be connected to the Author Identification group, and since interoperability is of the essence, you can barely go any distance before considering at least the base problems which OAI-PMH addresses.

All in, an interesting meeting, and I'm looking forward to seeing the reports that will be published by the group moderators in due time.

Wednesday, 17 January 2007

EC Petition for Open Access

The Knowledge Exchange organisation has set up an online petition to be sent to the European Commission in support of the recent recommendations in the following study:

Study on the Economic and Technical Evolution of the Scientific Publication Markets of Europe

Currently, the recommendations are being lobbied against by publisher groups, so Knowledge Exchange feel that the other side needs to be represented. Therefore, if you would like to support the study, you can find some more information and option to sign at the following location:


The principal concern is to support recommendation A1:


Saturday, 13 January 2007

ORE Technical Committee Meeting 11 - 12 January

On 11 and 12 January, 13 members of the ORE Technical Committee met at Columbia University in New York for the first face-to-face meeting of this project. Attendees were (in no particular order): Tony Hammond (Nature Publishing), Michael Nelson (Old Dominion University), Pete Johnstone (Eduserv, on behalf of Andy Powell), Ray Plante (NCSA), David Fulker (UCAR), Richard Jones (Imperial College London), Peter Murray (OhioLINK), Jeff Young (OCLC), Rob Sanderson (University of Liverpool), Tim DiLauro (Johns Hopkins University), Simeon Warner (Cornell), and of course Herbert van de Sompel (LANL) and Carl Lagoze (Cornell).

The results of this meeting are due to be reported at Open Repositories 2007 at the end of this month, once they have been formalised from the complex debate and discussion that occurred at the meeting, so I won't attempt to summarise outcomes in any detail.

We began with an overview of the problem domain: compound digital objects in a heterogeneous environment, which must be operable within the web architecture. One of the core outcomes of the project, therefore, will be a specification for describing these objects, and their internal and external relationships. Each of the attending committee members was given the opportunity to present their thoughts on the initial documentation for the project. These ranged from commentary on a privately circulated white paper on the project through to suggestions on implementation technologies or methodologies that might be appropriate.
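To make "compound object with internal and external relationships" a little more concrete, here is a rough sketch of how I picture it. It is purely my own illustration and not anything the committee has specified: the class, the method names, the relation labels and the example.org URIs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CompoundObject:
    """A hypothetical compound object: part URIs plus relationships."""
    uri: str
    parts: set = field(default_factory=set)
    # Each relationship is a (subject, relation, target) tuple of strings.
    relationships: list = field(default_factory=list)

    def add_part(self, part_uri: str) -> None:
        self.parts.add(part_uri)

    def relate(self, subject: str, relation: str, target: str) -> None:
        # The subject must be the object itself or one of its parts.
        if subject != self.uri and subject not in self.parts:
            raise ValueError(f"{subject} is not part of {self.uri}")
        self.relationships.append((subject, relation, target))

    def internal(self) -> list:
        # Relationships whose target is another part of this object.
        return [r for r in self.relationships if r[2] in self.parts]

    def external(self) -> list:
        # Relationships pointing at resources outside this object.
        return [r for r in self.relationships if r[2] not in self.parts]


article = CompoundObject("http://example.org/object/1")
article.add_part("http://example.org/object/1/paper.pdf")
article.add_part("http://example.org/object/1/data.csv")
article.relate("http://example.org/object/1/paper.pdf", "describes",
               "http://example.org/object/1/data.csv")
article.relate("http://example.org/object/1/paper.pdf", "cites",
               "http://example.org/other/42")

print("internal:", article.internal())
print("external:", article.external())
```

The only point of the sketch is the split between relationships that stay inside the object and those that point out onto the wider web; whatever the project eventually specifies may look nothing like this.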

On the second day of the meeting we moved on to start formalising the goals for the various aspects of the project. This included our communication channels, our use cases, what we understand by the format that will help us describe structures and relationships, and our forthcoming work and subsequent meetings.

Communication for the project will happen through private mailing lists and a wiki. All outcomes from the project will be pushed out to the ORE website, and later there may be a project blog when there are findings to disseminate. We also specified 6 use cases and assigned members of the technical committee to examine the use case titles and write some working "stories" which we will be able to develop further. These use cases should be ready in time for presentation at Open Repositories 2007.

Overall, it feels like we covered significant ground in just two short days, although I for one found the results of the meeting quite complex, and in need of some significant work to draw coherent conclusions from. Carl and Herbert will be carrying out this analysis in the coming weeks, at which point the meeting results will be made available.

Tuesday, 9 January 2007

UK and Ireland DSpace User Group Videoconference presentation

Videos and presentation slides are now available from the UK&I DSUG held on 24 November 2006.

Presentation: http://cadair.aber.ac.uk/dspace/handle/2160/281

Videos: http://cadair.aber.ac.uk/dspace/handle/2160/290

Many thanks to Stuart Lewis at the University of Wales Aberystwyth for organising this event, and for making the presentations and videos available in their institutional repository.

At the moment, these presentations are only available in Windows Media and Real Media formats, due to limitations at the video editing suite in Aberystwyth.

The presentations given at this meeting were under the following titles:

  • Inside, Outside, Where Have We Been? The Who - of DSpace
    development in Trinity College Dublin (along with the why, the
    what and the how)

  • Distributing repository functions with DSpace [yours truly]

  • Next Steps for the China Digital Museum Project

  • What OR did next, or administering admins in a hosted repository

  • Thanks Google! A love-hate relationship

  • An update from the DSpace Architecture and Technology Review [yours truly]