
Monday, 22 March 2010

An Analytical Anniversary

Today is my anniversary.  I have been at Symplectic Ltd for one of your Earth "years".  And a very busy one it has been, what with writing repository integration tools for our research management system to deposit content into DSpace, EPrints and Fedora, plus supporting the integration into a number of other platforms.  I thought it would be fun to do a bit of a breakdown of the code that I've written from scratch in the last 12 months (which I'm counting as 233 working days).  I'm going to do an analysis of the following areas of productivity:

  • lines of code
  • lines of inline code commentary
  • number of A4 pages of documentation (end user, administrator and technical)
  • number of version control commits

Let's start from the bottom and work upwards.

Number of version control commits

Total: 700

Per day: 3

I tend to commit units of work, so this might suggest that I do 3 bits of functionality every day.  In reality I quite often also commit quick bug fixes (so that I can record the details of the fix in the commit log), or commit at the end of a day/week when I want to know that my code is safe from hardware theft, nuclear disaster, etc.

Number of A4 pages of documentation

Total: 72

Per day: 0.31

Not everyone writes their documentation in A4 form any more, and it's true that some of my dox take the form of web pages, but as a commercial software house we tend to produce nicely formatted end-user and administrator documentation.  In addition, at a geek level I rather enjoy a well-laid-out printable document, so I do my technical dox that way too.

The amount of documentation is relatively small, but it doesn't take into account a lot of informal documentation.  More importantly, though, as we approach the end of the first version of our Repository Tools software, the documentation is still in development.  I expect the number of pages to triple or quadruple over the next few weeks.

Lines of Code and Lines of Commentary

I wrote a script which analysed my outputs.  Ironically, it's written in Python, which isn't one of the languages that I use professionally, so it isn't included in this analysis (and nor are any of my personal programming projects).  The analysis covers all of my final code as it stands on my anniversary (23rd March), and does not take into account prototyping or refactoring of any kind.  Note also that blank lines are not counted.
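The actual script is linked at the end of this post; if you just want the flavour of how such a count works, here is a minimal sketch of the same idea in Python.  The mapping of extensions to comment markers is my own simplification (it only catches whole-line comments, so Java block comments, for instance, would need extra handling):

```
import os
import sys

# Simplified comment markers per file extension (an assumption for this sketch)
MARKERS = {
    ".java": "//",
    ".jsp": "<%--",
    ".xsl": "<!--",
    ".xml": "<!--",
    ".pl": "#",
    ".pm": "#",
}

def count_file(path, marker):
    """Count code, comment and blank lines in one file."""
    code = comments = blank = 0
    with open(path, errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                blank += 1
            elif stripped.startswith(marker):
                comments += 1
            else:
                code += 1
    return code, comments, blank

def count_tree(root):
    """Walk a source tree and accumulate per-extension totals."""
    totals = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in MARKERS:
                c, m, b = count_file(os.path.join(dirpath, name), MARKERS[ext])
                code, com, blank, files = totals.get(ext, (0, 0, 0, 0))
                totals[ext] = (code + c, com + m, blank + b, files + 1)
    return totals

if __name__ == "__main__":
    for ext, (code, com, blank, files) in sorted(count_tree(sys.argv[1]).items()):
        print(f"{ext} ({files} files) :: Lines of Code: {code}; "
              f"Lines of Inline Comments: {com}; Blank: {blank}")
```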

Line Counts:

XML (107 Files) :: Lines of Code: 17819; Lines of Inline Comments: 420

XML isn't really programming, but it was interesting to see how much I actually work with it.  This figure is not used in any of the statistics below.  Some of these files are large metadata documents and some are configuration (Maven build files, Ant build files, web server config, etc.).


XSLT (36 Files) :: Lines of Code: 8502; Lines of Inline Comments: 2762
JAVA (181 Files) :: Lines of Code: 22350; Lines of Inline Comments: 7565
JSP (16 Files) :: Lines of Code: 2847; Lines of Inline Comments: 1
PERL (58 Files) :: Lines of Code: 6506; Lines of Inline Comments: 1699
---------------
TOTAL (291 Files) :: Lines of Code: 40205; Lines of Inline Comments: 12027

I remember once being told that 30k lines of code a year was pretty reasonable for a developer.  I feel quite chuffed!


Lines of code/comments per day:

XSLT :: Lines of Code: 36; Lines of Inline Comments: 12
JAVA :: Lines of Code: 96; Lines of Inline Comments: 32
JSP :: Lines of Code: 12; Lines of Inline Comments: 0
PERL :: Lines of Code: 28; Lines of Inline Comments: 7
---------------
TOTAL :: Lines of Code: 173; Lines of Inline Comments: 52

It looks much less impressive when you look at it on a daily basis.  We just have to remember that this is 173 wonderful lines of code every day!

Comment to code ratio (comments/code):

XSLT :: 0.33
JAVA :: 0.34
JSP :: 0
PERL :: 0.26
---------------
TOTAL :: 0.30

It was interesting to see that my commenting ratio is fairly stable at about 30% of the overall codebase size.  I didn't plan that or anything.  This includes block comments for classes and methods, and inline programmer documentation.  The reason for the shortfall in Perl is suggested below.  Notice that I wrote virtually no comments in the JSPs, because I only use that code for testing and it's less carefully curated.

Some Perl comments don't start with a # at all - they are POD-style block comments opening with an =xxx directive and closing with =cut, which are difficult to pick out in a simple line-based analysis. Therefore the Perl code line counts are an overestimate and the comment counts an underestimate. More likely figures, given a 0.33 comment to code ratio, are:

PERL (58 Files) :: Lines of Code: 5498; Lines of Inline Comments: 2707
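For what it's worth, the adjustment isn't conceptually difficult: a small state machine that flips into "comment mode" on an =directive and back out again on =cut would do it. A hedged sketch (not part of the published script):

```
def count_perl(path):
    """Count Perl code and comment lines, treating =xxx ... =cut POD
    blocks as commentary rather than code (blank lines are ignored,
    as in the rest of the analysis)."""
    code = comments = 0
    in_pod = False
    with open(path, errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue
            if in_pod:
                comments += 1
                if stripped.startswith("=cut"):
                    in_pod = False
            elif line.startswith("="):
                # A =directive at the start of a line opens a POD block
                in_pod = True
                comments += 1
            elif stripped.startswith("#"):
                comments += 1
            else:
                code += 1
    return code, comments
```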

Amount of testing code (testing/production):

9937 / 30268 = 0.33

This is the total amount of code that I wrote to test the other code that I wrote.  So nearly 10k lines of code are there purely to demonstrate that the other 30k lines of code are working.  I'm not going to suggest that this 33% is a linear relationship as the projects increase in size, but maybe we'll find out next year.  Incidentally, the test code that I analysed was the third version of my test framework, so in reality I wrote quite a few more lines of code (perhaps 3 or 4k) before reaching the final version used above.

Note that I'm a big fan of Behaviour Driven Development, and this does tend to cause testing code to be fairly extensive in its own right.

Number of new files per day:

XSLT :: 0.15
JAVA :: 0.78
JSP :: 0.07
PERL :: 0.25
---------------
TOTAL :: 1.25

In reality, of course, I create lots and lots of new files over a short period of time, and then nothing for ages.


Average file length:

Excluding blank lines: 179
Including blank lines: 211
Spaciousness (including/excluding): 1.18

What is spaciousness?  It's a measure of how I tend to space my code.  Everyone, I have noticed, is fairly different in this regard - I wonder what other people's spaciousness is?
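In code terms the definition is just this (a trivial helper, reproducing the averages above):

```
def spaciousness(total_lines, non_blank_lines):
    """Ratio of all lines (blank lines included) to non-blank lines."""
    return total_lines / non_blank_lines

# Using the averages above: 211 / 179 comes out at about 1.18
print(round(spaciousness(211, 179), 2))
```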

Source Code

Do you want to have a go at this yourself?  Blogger doesn't make attaching files particularly easy, so you can get this from the nice folks at pastebin, who say this shouldn't ever time out: http://pastebin.com/GVkHd7tB.

Monday, 9 June 2008

ORE software libraries from Foresite

The Foresite [1] project is pleased to announce the initial release of two software libraries for constructing, parsing, manipulating and serialising OAI-ORE [2] Resource Maps. These libraries are being written in Java and Python, can be used generically to provide advanced functionality to OAI-ORE aware applications, and are compliant with the latest release (0.9) of the specification. The software is open source, released under a BSD licence, and is available from a Google Code repository:

http://code.google.com/p/foresite-toolkit/

You will find that the implementations are not absolutely complete yet, and are lacking good documentation for this early release, but we will be continuing to develop this software throughout the project and hope that it will be of use to the community immediately and beyond the end of the project.

Both libraries support parsing and serialising in: ATOM, RDF/XML, N3, N-Triples, Turtle and RDFa

Foresite is a JISC [3] funded project which aims to produce a demonstrator and test of the OAI-ORE standard by creating Resource Maps of journals and their contents held in JSTOR [4], and delivering them as ATOM documents via the SWORD [5] interface to DSpace [6]. DSpace will ingest these resource maps, and convert them into repository items which reference content which continues to reside in JSTOR. The Python library is being used to generate the resource maps from JSTOR and the Java library is being used to provide all the ingest, transformation and dissemination support required in DSpace.
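To give a flavour of what a Resource Map actually contains, here is a minimal sketch built directly with rdflib and the ORE vocabulary rather than with the Foresite libraries themselves; the URIs and data are invented purely for illustration:

```
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

ORE = Namespace("http://www.openarchives.org/ore/terms/")

# Invented example URIs: a Resource Map describing an Aggregation of two
# article resources, loosely mirroring the JSTOR/DSpace scenario above.
rem = URIRef("http://example.org/rem/journal-issue-1")
agg = URIRef("http://example.org/aggregation/journal-issue-1")
article1 = URIRef("http://example.org/articles/1")
article2 = URIRef("http://example.org/articles/2")

g = Graph()
g.bind("ore", ORE)
g.bind("dcterms", DCTERMS)

# The Resource Map describes the Aggregation...
g.add((rem, RDF.type, ORE.ResourceMap))
g.add((rem, ORE.describes, agg))
g.add((rem, DCTERMS.creator, Literal("Example Creator")))

# ...and the Aggregation aggregates the articles.
g.add((agg, RDF.type, ORE.Aggregation))
g.add((agg, ORE.isDescribedBy, rem))
g.add((agg, ORE.aggregates, article1))
g.add((agg, ORE.aggregates, article2))

print(g.serialize(format="xml"))
```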

Please feel free to download and play with the source code, and let us have your feedback via the Google group:

foresite@googlegroups.com

Richard Jones & Rob Sanderson

[1] Foresite project page: http://foresite.cheshire3.org/
[2] OAI-ORE specification: http://www.openarchives.org/ore/0.9/toc
[3] Joint Information Systems Committee (JISC): http://www.jisc.ac.uk/
[4] JSTOR: http://www.jstor.org/
[5] Simple Web Service Offering Repository Deposit (SWORD):
http://www.ukoln.ac.uk/repositories/digirep/index/SWORD
[6] DSpace: http://www.dspace.org/

Thursday, 24 January 2008

CRIG Flipchart Outputs

The JISC CRIG meeting which I previously live-blogged from has now had its outputs written up as a series of annotated slides on Flickr, which can be found here:

http://www.flickr.com/photos/wocrig/

The process by which this was achieved was an intense round of brainstorming sessions, culminating in a room full of flip chart sheets, one per topic. We then performed a Dotmocracy, and the results that you see on the Flickr page are the ideas which made it through that process with some interest invested in them.

Thursday, 13 December 2007

The Data Access Layer Divide

Warning: technical post.

One of the things that has been giving me consternation this week is the division between the data storage layer and the application layer. A colleague of mine has been working hard on this problem for some months for DSpace, and his work will form the backbone of the 1.6 release next year. As a new HP Labs employee, I'm just getting involved in this work too, with my focus currently on identifiers for objects in the system (not just content objects, but everything from access policies to user accounts).

We are replacing the default Handle mechanism for exposing URLs in DSpace with an entirely portable identification mechanism which should support whatever identifier scheme you want to put on top of it. DSpace is going to provide its own local identification through UUIDs, so that we can decouple the identification of artifacts in the system from the specific implementation of the storage engine. At the moment, database ids are passed around and used with little thought. But what happens if the data storage layer is replaced with something which doesn't use database ids? It's not even slightly inconceivable. Hence the introduction of the UUID.

Now, here's where it gets tricky. The UUID becomes an application level identifier for system artifacts. Fine. The database is free to give columns in tables integer ids, and use them to maintain its own referential integrity. Fine.

I have several questions, and some half-answers for you:

- Why is this a problem?

Suppose I have two modules which store things in the database. Let's use a DSpace example of Item and Bitstream objects (DSpace object model sticklers: I know what I'm about to say isn't really true; it's for the purposes of example): I want to store the Item, I want to store the Bitstream, and I want to preserve the relationship between them. Therefore, the Item storage module needs to know how to identify the Bitstream (or vice versa). If I want, I can use the UUIDs (nice long strings), which may have implications for my database performance; why use a relational database if I'm going to burden it with looking up long strings when it could be using nice small integers?

So the problem is: how does the Item get to find out the Bitstream storage id?

- How far up the API can I pass the database id?

The answer to this is "not very far". In fact, it looks like I can't even pass it as far as the DAO API.

- Can I use a RelationalDatabase interface?

The best solution I've come up with so far is to allow my DAO to implement a RelationalDatabase interface, so that other DAO implementations can inspect it to see if they can get database ids out of it. Is that a good solution? I don't know, I'm asking you!
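To make the idea concrete, here is a deliberately language-agnostic sketch of the pattern, written in Python rather than DSpace's Java and with invented class names; the point is only the interface-inspection trick, not the real DSpace DAO API:

```
from abc import ABC, abstractmethod
from uuid import UUID, uuid4


class RelationalDatabase(ABC):
    """Marker interface: a DAO backed by a relational store can expose
    the integer storage id behind an application-level UUID."""

    @abstractmethod
    def database_id(self, uuid: UUID) -> int:
        ...


class BitstreamDAO(RelationalDatabase):
    def __init__(self):
        self._id_map = {}  # maps UUID -> integer storage id

    def store(self, uuid: UUID) -> None:
        self._id_map[uuid] = len(self._id_map) + 1  # stand-in for a DB insert

    def database_id(self, uuid: UUID) -> int:
        return self._id_map[uuid]


class ItemDAO:
    def __init__(self, bitstream_dao) -> None:
        self._bitstream_dao = bitstream_dao

    def link_bitstream(self, item_uuid: UUID, bitstream_uuid: UUID) -> None:
        if isinstance(self._bitstream_dao, RelationalDatabase):
            # Same storage engine: use the cheap integer id as a foreign key.
            fk = self._bitstream_dao.database_id(bitstream_uuid)
        else:
            # Different engine: fall back to the application-level UUID.
            fk = str(bitstream_uuid)
        print(f"linking item {item_uuid} to bitstream via {fk!r}")


bitstreams = BitstreamDAO()
b = uuid4()
bitstreams.store(b)
ItemDAO(bitstreams).link_bitstream(uuid4(), b)
```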

- What's the point?

At the moment the DSpace API is awash with references to the database id. It's fine for the time being, and most people will never get upset about it. But it bothers engineers, and it will bother people who want to try and implement novel storage technologies behind DSpace.

The title of this post reflects my current feeling that these two particular layers of the system, the application and the data storage, have, at some point, to collide; can we really engineer it so that no damage occurs? Answers on a postcard.

Wednesday, 12 December 2007

BMC and the Free Open Repository Trial

Our good buddies on BioMed Central's Open Repository team have released the latest upgrade to their service, and are offering 3 month trial repositories for evaluation. From the DSpace home page:


BioMed Central announced the latest upgrades to Open Repository, the open access publisher's hosted repository solution. Open Repository offers institutions a cost effective repository solution (setup, hosting and maintenance) which includes new DSpace features, customization options and an improved user interface. Along with the announcement of the upgrades, Open Repository is offering a free 3-month pilot repository, so institutions can test the suitability of the service without obligation. See the full articles in Weekly News Digest and in Alpha Galileo.

Tuesday, 11 December 2007

Multi-lingualism and the masses

Multi-lingualism, and the provision of multi-lingual services, is one of those problems that just keeps on giving. Like digging a hole in sand which just keeps filling with water as fast as you can shovel it out again, or the loose thread which unravels your clothes when you pull on it. I remember being told, back at the start, that multi-lingualism was a solved problem; that i18n allowed us to keep our language separate from our application.

When the first major work was done on DSpace to convert the UI away from being strictly UK to being internationalised, there was great cause for celebration. This initial step was extremely large, and DSpace has reaped the benefits of having an internationalised UI, with translations into 19 languages at the time of writing. It's also helped me, among others, understand where else we might want to go with the internationalisation of the platform, and what the issues are. This post is designed to allow me to enumerate the issues that I've so far come up against or across, to suggest some directions where possible, but mostly just to help organise my thoughts.

So let's start with the UI. It turns out that there are a couple of questions which immediately come to the fore once you have a basically international interface. The first is whether display semantics should be embedded in your international tags. My gut reaction was, of course, no ... but suppose, for example, emphasised text needs to be done differently in different locales? The second is the granularity of the language tags, and the way that they appear on the page. Suppose it is better in one language to reverse the order of two distinct tags, to dispense with one altogether, or to add additional ones? All of these require modifications in the pages which call the language-specific messages, not in the messages themselves. Is there a technical solution to these problems? (I don't know, by the way, but I'm open to suggestion; one partial idea is sketched below.)
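One partial answer to the ordering question is to parameterise the messages with named placeholders, so that each translation can rearrange (or omit) the substituted values itself, instead of the page fixing the order. A rough sketch of that idea, with invented keys and strings (this is not the DSpace message format):

```
# Hypothetical message catalogues: each locale is free to reorder, rephrase
# or drop the substituted values, because the placeholders are named.
MESSAGES = {
    "en": {"item.submitted": "{title} was submitted by {submitter}."},
    "xx": {"item.submitted": "{submitter} submitted: {title}"},  # order reversed
}

def message(locale, key, **values):
    catalogue = MESSAGES.get(locale, MESSAGES["en"])
    template = catalogue.get(key, MESSAGES["en"][key])  # fall back to English
    return template.format(**values)

print(message("xx", "item.submitted", title="My Thesis", submitter="R. Jones"))
```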

We also have the problem of wholesale documentation. User and Administrator help, and system documentation. Not only are they vast, but they are often changing, and maintaining many versions of them is a serious undertaking. It seems inappropriate to use i18n tagging to do documentation, so a different approach is necessary. The idea of the "language pack" would be to include not only custom i18n tags, but also language specific documentation, and all of the other things that I'm going to waffle about below.

Something else happens in the UI which is nothing to do with the page layout. Data is displayed. It is not uncommon to see DSpace instances with hacked attempts at creating multi-lingual application data such as Community and Collection structures, because the tools simply don't yet exist to manage them properly. For example:

https://gupea.ub.gu.se/dspace/community-list

where the English and Swedish terms are included in the single field for the benefit of their national and international readership.

Capturing all data in a multi-lingual way is very very hard, mostly because of the work involved. But DSpace should be offering multi-lingual administrator controlled data such as Communities and Collections, and at least offering the possibility of multi-lingual items. The application challenges here are to:


  • Capture the data in multiple languages

  • Store the data in multiple languages

  • Offer administrator tools for adding translations (automated?)

  • Disseminate in the correct language.


Dissemination in the correct language ought not to be too much hassle through the UI (and DSpace already offers tools to switch UI language), but I wonder how much of a difficulty this would be for packaging? Or other types of interoperability? Do we need to start adding language qualifiers to everything? And what happens if the language you are interested in isn't available, or is only partial for what you are looking at? Defining a fall-back chain shouldn't be too hard, but perhaps that fall-back chain is user specific; suppose I'm English, but I also understand German and French: I don't want the application to fall back from English to Russian, for example.
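A user-specific fall-back chain is easy enough to express in code; the hard part is capturing and storing the preferences and the translations in the first place. A tiny illustrative sketch (the language codes and values are invented):

```
def pick_translation(values, preferences, default="en"):
    """Choose the best available translation of a metadata value.

    `values` maps language codes to the value in that language;
    `preferences` is the user's own fall-back chain, e.g. ["en", "de", "fr"].
    """
    for lang in preferences:
        if lang in values:
            return values[lang]
    # Nothing in the user's chain: fall back to the default, then to anything.
    return values.get(default) or next(iter(values.values()))

title = {"sv": "Exempeltitel", "de": "Beispieltitel"}
print(pick_translation(title, ["en", "de", "fr"]))  # returns the German value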

This post was actually motivated by a discussion I have been having about multi-lingual taxonomies, and using URIs to store the vocabulary terms, instead of the terms themselves. In this particular space, URIs are a good solution, because they are tied to a specific, recognised wording. It does place a burden on the UI, though, to be able to hide the URI from the user during deposit and dissemination.

But the same approach could, in theory, be used to offer multi-lingual browse and search results across an entire database. Imagine: each indexable field is collected in its many languages, a single (internal) URI is assigned to that cluster of terms, and that URI is stored instead of the value. With a lot of computational effort you could produce a map of URIs to all the same terms in all the different languages in the database and their corresponding digital objects, which you could offer to your users through search or browse interfaces (I'd not like to be the one to have to implement this, and iron out the wrinkles which I'm blatantly overlooking here).
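As a toy illustration of that idea (entirely invented data, and glossing over all of the wrinkles just mentioned):

```
# A miniature, invented index: each internal URI stands for a cluster of
# equivalent terms in several languages, and items are indexed against the
# URI rather than against any single term.
TERMS = {
    "info:local/term/1": {"en": "chemistry", "sv": "kemi", "de": "Chemie"},
    "info:local/term/2": {"en": "physics", "sv": "fysik", "de": "Physik"},
}
INDEX = {
    "info:local/term/1": ["item/101", "item/102"],
    "info:local/term/2": ["item/103"],
}

def search(query):
    """Match the query against every language variant of every cluster,
    then resolve the cluster URI to the items indexed under it."""
    hits = []
    for uri, variants in TERMS.items():
        if query.lower() in (v.lower() for v in variants.values()):
            hits.extend(INDEX[uri])
    return hits

print(search("kemi"))        # ['item/101', 'item/102']
print(search("chemistry"))   # the same items, resolved via the same URI
```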

There are many other corner areas of applications which include language-specifics, and it's going to take me a while to gather the list of what they are. Here are a few which aren't covered by the above:

  • system configuration

  • code exceptions and errors

  • application email notifications


A second major step has been taken for DSpace 1.5 with regard to multi-lingualism, in the form of Claudia Jürgen's work on submission configuration, help files, emails and front page news. The natural progression would be onto multi-lingual application metadata, and from there the stars ...

Friday, 7 December 2007

CRIG Meeting Day 2 (2)

Topics for today:

http://www.ukoln.ac.uk/repositories/digirep/index/CRIG_Unconference#Friday_December_7th

The ones that interest me the most are probably these:

- Death to Packages

Not really Death to Packages, but let's not forget that packaging sometimes isn't what we want to do or what we can do.

- Get What?

This harks to my ORE interest, as to what is available under the URLs, and what that means for something like content negotiation.

- One Put to Multiple Places

Really important to distributed information systems (e.g. EThOSnet integration into local institutions). Also, this relates, for me, to the unpackaging question, because it introduces differences between what systems might all be expecting.

- Web 2.0 interfaces (ok, ok)

I'm interested in web services. Yes it's a bit trendy. But it is useful.

- Core Services of a Repository

For repository core architecture, this is important. With my DSpace hat on, I'd like to see what sorts of things an internal service architecture or API ought to be able to support.

CRIG Meeting Day 2 (1)

It's first thing on day two. I'm late because I have to get all the way across town, which takes a surprisingly long time in London. I should have just stayed at a nearby hotel. Oh well.

The remainder of yesterday was interesting. Live blogging is difficult, as the conference is extremely mobile. Today I will have to pick a spot and hide in a corner to get you up to date.

In the afternoon we discussed the CRIG scenarios, and then implemented something called a Dotmocracy, which involves sticking dots (like house points at school) next to the topics that came up which we were interested in. When we start up today, the first order of business will be to see which topics made the cut. From what I saw at the end of the day, this will include Federated Searching, Google Search, and package deconstruction (my personal favourite this week).

As a brief aside, one running theme has been "no more standards". As it happens, I disagree with this. We're never going to get everything thinking the same and working the same. That's why there are so many standards, and why new ones get made all the time. It's the way of the world. With a standard, though, once you have implemented one, you at least have a way of telling people what you did, unlike the home-grown, undocumented solutions which are the alternative.

Right, I suppose I'd better get my skates on.

Thursday, 6 December 2007

CRIG Meeting Day 1 (2)

http://en.wikipedia.org/wiki/Unconference

See also Jim Downing's live blogging.

We've just done a round of preliminary unconferencing, where the CRIG Podcast topics were brainstormed onto flip charts. Not sure how useful that's going to be, but I'm going to approach the whole thing with an open mind. I've got my marker pen, my balloon, and my three dots.

wish me luck ...

CRIG Meeting Day 1 (1)

Some live blogging; may be slightly malformed, as this is happening inline, with no post-editing.

http://www.ukoln.ac.uk/repositories/digirep/index/CRIG_Unconference

Les Carr and Jim Downing have introduced us to the first day of the CRIG workshop. We're unconferencing, which means there isn't a programme! We're going to try to stay at the abstract, high-level discussion and not talk about technology.

David Flanders outlines the meeting philosophy. The intended outputs of the meeting include ideas (blue sky), standards and scenarios, and how they can be linked together. The outputs will be taken to OR08. The best way for a group to produce good stuff is for everyone to think about themselves. Makes me think of an article I read recently:

http://www7.nationalgeographic.com/ngm/0707/feature5/index.html

We are not about creating new specs.

Julie then brings us some stuff about SWORD. See my previous post on this. We are going to have implementations for arXiv, White Rose Research Online and Jorum, plus a SPECTRa deposit client, and later an article in Ariadne and a presentation at OR08.

Break time ... tea and coffee!

Friday, 30 November 2007

CRIG Podcast

A couple of weeks ago the JISC CRIG (Common Repository Interfaces Group) organised a series of telephone debates on areas important to it. These have now been edited into short commentaries which might be of interest to you, and are aimed at priming and informing the upcoming "unconference" to be held on 6/7 December in London:

http://www.ukoln.ac.uk/repositories/digirep/index/CRIG_Podcasts

The "unconference" will take place at Birkbeck College in Bloomsbury, London. Take a listen, and enjoy. Yours truly appears in the "Get and Put within Repositories" and the "Object Interoperability" discussions.

Wednesday, 24 October 2007

my my where did the summer go

OK, ok, it's been a long long time since I updated. Did I say at the beginning that this was an experiment in seeing if I was capable of maintaining a blog? If I didn't, I should have done.

But there's a good reason that I've not updated for a while: I've been working flat out on the Imperial College Digital Repository, Spir@l, and am pleased to finally announce in a quiet way that we are officially LIVE:

http://spiral.imperial.ac.uk/

On the outside it doesn't look too serious. A standard-looking DSpace, I hear you say, with an Imperial College site template on it. And you'd be right. But that's only the tip of the iceberg.

Without wishing to blow my own trumpet (modesty is the third or fourth best thing about me), please do check out the article which I co-wrote with my good colleague Fereshteh Afshari:

http://hdl.handle.net/10044/1/493

And you may also be interested in my presentation at the recent DSpace User Group Meeting in Rome 2007 (more on that later, maybe):

http://www.aepic.it/conf/viewabstract.php?id=200&cf=11

I could probably be persuaded to write a little here about how it works; maybe you'll even get snippets from the monolithic technical documentation that I'm in the middle of writing.

Oh, and there's more news, but now that I've got your attention again you'll have to wait for the next instalment.

Thursday, 10 May 2007

EThOSnet Kick-Off

On Tuesday of this week the EThOSnet Project Board met for the first time to kick off this significant new project. For background, this project is the successor to the EThOS project, which in turn grew out of the Scottish projects: Theses Alive at Edinburgh, DAEDALUS at Glasgow, and Electronic Theses at the Robert Gordon University.

The aim of EThOSnet is to take the work done under EThOS and bring it up to a point where UK institutions can actually start to become early adopters, to start to digitise the back-catalogue of print theses in the UK, investigate technology for the current and the future incarnations of the system, and to basically kick-start a genuinely viable service for deposit and dissemination of UK theses.

At this stage, the project does not have a Project Manager, which is causing minor hold-ups initially, but the Project Director, Clare Jenkins, Director of Library Services at Imperial College, has stepped in to hold things together until one is appointed (we are expecting to hear very soon). In the interim, the Project Board has also been put in place to check that all 7 work packages have the things they need to get going.

Of these 7 work packages, the first and last are concerned with project management and exit strategy, and the meat of the project will take place in packages 2-6. Details of these work packages are available in the project proposal, which will hopefully be available on the JISC website soon.

A quick summary, then, of some of the changes and more concrete decisions that we made during the meeting:


  • We have set a pleasingly high target of 20,000 digitised theses and 3,000 born-digital theses by the end of the project. These will be sourced from the many institutions who have already expressed an interest in adopting the service, before the project is even under way!

  • The first port of call for the technology is to smooth the workflow of the existing software tools for repository users. I would hope to have something which works well for DSpace available quickly, and general enough to be part of the main distribution. EPrints is already fully compliant, and Fedora has representatives from the University of Hull looking after it.

  • Communications will be done primarily through a soon-to-exist project wiki, and it is hoped that the existing E-Theses UK list will be used more heavily than it is already. Imperial College has agreed to host the existing ethos website, the wiki, and potentially the toolkit if necessary (currently hosted at RGU).

  • Toolkit development will be ongoing, with work being done on it within a wiki, but with the option to move to some XML format for the final product.



This is a very big project, and I can't possibly represent everything that came out of Tuesday's meeting here. In the near future expect to see links to the project wiki appear and more information to come out.

Monday, 19 March 2007

Repository 66 and the Google Map Adventure

Tim Brody from the University of Southampton has just blogged some URLs to add repository locations to Google Earth.

I thought it would be worth adding that the University of Aberystwyth's repository guru Stuart Lewis has been running Repository66 for a couple of months now, with the same premise (except that you don't have to download Google Earth to use it).

Wednesday, 7 March 2007

IR Manager site and mailing list

Dorothea Salo has produced a new site and a potential set of resources for non-software-specific IR management issues:

http://oaresearch.org/

It has a weblog, forum and mailing list. It will be interesting to see if this takes off alongside the many other disparate resources for repository managers, such as the software-specific lists like DSpace General and the broader-ranging lists like American Scientist and SPARC OA. Perhaps, in the long run, the forum might be the source of some kind of generalised How-To or FAQ for repository management, which would be a valuable resource.

Friday, 23 February 2007

JISC Capital Circular 4/06 outcomes

Today has been an exciting day. Projects that I am potentially involved in which have so far been announced as funded under the last round of JISC bids from November last year are as follows:

SWORD - Repository Deposit API development work in association with Aberystwyth, Southampton, Hull, Cambridge, Birkbeck (University of London), the National Library of Wales, and Intrallect, as a DSpace advisor and developer

EThOSnet - A major e-theses project following on from the great work of the recently completed EThOS project. Imperial is pleased to be leading this project, with partners from the following institutions: Leicester, Warwick, the British Library, Nottingham, Hull, Glasgow, Birmingham, National Library of Scotland, Edinburgh, Southampton, Cranfield, Robert Gordon University, Aberystwyth, Cardiff, Loughborough, National Library of Wales, and Exeter. What a team, and what a great looking project. My role is yet to be formalised, but hopefully somewhere in the area of the software development ;)

The future for repositories at Imperial looks bright. Today we completed our first UAT for our upcoming IR service "Spir@l", and we are due, over the course of this year, to go live with that service and with our own internal e-theses management system; the outcomes of these two projects will no doubt play a role in shaping our repository environment, which I hope will rapidly become one to be proud of.

Tuesday, 30 January 2007

DSpace Architecture Review Report

Following a meeting at MIT, in Cambridge, Massachusetts in October 2006, the findings of the DSpace Architecture Review have now been presented at Open Repositories 2007:

http://wiki.dspace.org/index.php/ArchReviewReport

Many thanks to John Mark Ockerbloom for all his work, and for guiding the architecture review group.

Monday, 29 January 2007

The Institutional Repository: sales figures for year 1

Self-indulgent though it might be, I am pleased to report that The Institutional Repository, the book I co-wrote with Theo Andrew and John MacColl of the University of Edinburgh, has sold a total of 737 copies in its first year. This is well in excess of what I had anticipated, so big thanks to any and all of you out there who purchased a copy.

Open Repositories 2007: preliminary feedback

I was, for various reasons, unable to attend the Open Repositories 2007 conference in San Antonio last week. Although the presentations themselves don't appear to have made it online yet (I'll post when they do), there has been plenty of blogging going on, especially over at Jim Downing's blog:

http://wwmm.ch.cam.ac.uk/blogs/downing/?p=65
http://wwmm.ch.cam.ac.uk/blogs/downing/?p=67
http://wwmm.ch.cam.ac.uk/blogs/downing/?p=68
http://wwmm.ch.cam.ac.uk/blogs/downing/?p=69

And to save the clicks, here are some that he's linked to:

http://dltj.org/2007/01/open-source-for-open-repositories/
http://blog.zsr.wfu.edu/pd/?p=38
http://nx650.wordpress.com/2007/01/23/at-open-repositories-san-antonio/

Plus some great summaries from Dorothea Salo, for which, as a non-attendee, I am very grateful:

http://cavlec.yarinareth.net/archives/2007/01/28/simile/
http://cavlec.yarinareth.net/archives/2007/01/28/manakin-and-geospatial-metadata/
http://cavlec.yarinareth.net/archives/2007/01/28/themes-in-manakin/
http://cavlec.yarinareth.net/archives/2007/01/27/custom-interfaces-with-content-packaging-plugins/
http://cavlec.yarinareth.net/archives/2007/01/27/dspace-preservation-repository-design/
http://cavlec.yarinareth.net/archives/2007/01/27/dspace-configurable-submission-system/
http://cavlec.yarinareth.net/archives/2007/01/27/pf-dspace/
http://cavlec.yarinareth.net/archives/2007/01/25/introducing-new-services-with-dspace/
http://cavlec.yarinareth.net/archives/2007/01/25/dspace-at-ncsu/
http://cavlec.yarinareth.net/archives/2007/01/23/dspace-for-managing-digitized-collections/
http://cavlec.yarinareth.net/archives/2007/01/23/manakin-ui/
http://cavlec.yarinareth.net/archives/2007/01/23/dspace-the-next-generation/

Promise I'll be there for Open Repositories 2008 in Southampton. It is only just up the road!

EDIT: All the links were broken the first time I published this. All fixed now.

Thursday, 25 January 2007

EPrints 3.0 Released

The release of EPrints.org 3.0 was announced yesterday at the Open Repositories 2007 conference in San Antonio:

http://www.ecs.soton.ac.uk/about/news/1148

DSpace developer though I am, the Open Source voice in me reminds us that diversity in software and breadth of choice is part of the point, so congratulations to the EPrints.org team on their latest release.

Mind you, despite the press release saying "EPrints is already the world’s leading software for producing open access institutional repositories", the Registry of Open Access Repositories (ROAR) lists (at the time of writing) 218 EPrints.org repositories and 223 DSpace ones. I'm just saying ;)