Thursday, May 12, 2005
Random thoughts on DSpace
There is an article about Dutch universities opening their research to the web. A random sample of the listed repositories showed that the platform behind them is DSpace. From the DSpace site:
"A groundbreaking digital repository system, DSpace captures, stores, indexes, preserves and redistributes an organization's research material in digital formats."
The not-so-random thought is that there is a clear opportunity to layer a good topic map engine over DSpace and, from that, to support research-oriented communities.



4 Comments:
I've seen evaluations of DSpace, Fedora and others, and ... they all have problems. I actually don't understand why there must be special repository software; all they are is a database (with or without the actual content) and some metadata slapped on it, with a possible semantic datamodel of sorts in the background.
Maybe I seem a bit grumpy about this, but this is why I was drawn to Topic Maps to begin with; sharing the data and datamodel, letting specialist applications deal with the interactions. Why is this practice being taken up so slowly? It is starting to bug me. Seriously.
Alexander, while I agree in principle, the aspect that to my mind confers "digital library" rather than simply "database with metadata" is the integration and modularization of high-level services. In looking at Greenstone 3, the whole architecture is based on XML (actually, SOAP) messages sent in a call-and-response mode over HTTP. One service (e.g., the search feature) doesn't need to know the details of another (e.g., the repository). When one adds in the ability to handle common protocols like Z39.50 and OAI-PMH, we start to see emergent abilities arising from the combination of these lower-level features.
I know we both agree that adding a Topic Map engine to a digital library is a Very Interesting Prospect. Conal Tuohy has recently done something like this with the New Zealand Electronic Text Centre, as he announced on the topicmapmail list. I'm very excited by his work in this area.
Doing this with DSpace would be considerably more work than with Greenstone 3, since in the latter the entire interface is via XML rather than having to dig into code.
I'm still at an early stage of evaluation (and I think open source digital library software is in general still at a very early stage) but I believe there's an interesting road in front of us. I'm not grumpy about things because I see Topic Maps as a potential service layer on top of any digital library. It's perhaps the one big thing missing.
I was going to ask, "Isn't DSpace based on RDF?" But I think I'm confusing the related Simile project, which overlaps in some way (or persons?) with DSpace.
At any rate, Topic Maps may be redundant.
Actually, my problem with DSpace and the whole "institutional repository" notion is the "work vs. benefit" problem (see Jon Grudin's excellent Groupware and Social Dynamics: Eight Challenges for Developers). Why would anyone go through the effort of contributing anything to the repository? The amount of mandatory metadata that I have to provide to use these things is ridiculous. And is it integrated with the maintenance of my academic CV (which is actually important to me)?
Anecdotally, I've heard that it is not really worth deploying these things unless a substantial amount is invested in the maintenance of these resources. At MIT there is at least one full-time secretary per academic unit who is responsible primarily for keeping DSpace up-to-date.
In reality, I don't believe these systems, as they stand, add much to the world. If we could develop something similar that actually does something that people need to do on a day-to-day basis (e.g. a reading-list/citation management system like CiteULike) and integrate that into library services, then I think we're on the right track. I'll post more about this later.