Data curation, eScience and the White Rose Grid

On Thursday November 13th I attended an ‘e-Science Collaborative Workshop’, hosted by the White Rose Grid e-Science Centre in Sheffield.  The event was focussed on the notion of ‘data curation’ and included a number of practical presentations of curation ‘in action’, as well as more informational presentations from those working to support digital preservation and curation.  A very quick overview follows, and I’m hoping they will put up the presentations soon so that others can have a look.

Joanna Schmidt, who co-organised the event, gave an overview of the White Rose Grid:

Graham Pryor talked about the services of the Digital Curation Centre:

Martin Lewis covered the work of the UK Research Data Service, in particular a feasibility study to look at all aspects of the research data lifecycle, from creation to retrieval, through review and re-use.

Darren Treanor gave the best presentation of the day, about a project to build an open access repository of very high-quality images of tissue slide samples, which are used by pathologists to identify cancer and the like.  It’s an amazing resource, not just because the images are extremely detailed, but also because the team could easily have closed off this content; instead they put in a bit of extra effort to anonymise the samples and make them freely available. Virtual Pathology:

Sarah Jones talked about the Data Audit Framework, a methodology and tool for auditing an institution’s research data management.

Mike Meredith from the Virtual Vellum project talked more about technology than the tool he has built, which is an open source application for viewing groups of images in JPEG 2000 format.  It looks interesting and potentially useful for the digital library project:

Frank Gibson presented CARMEN: Code analysis, repository & modelling for e-neuroscience.
CARMEN is an e-Science Pilot Project funded by the Engineering and Physical Sciences Research Council (UK). “It will deliver a virtual laboratory for neurophysiology, enabling sharing and collaborative exploitation of data, analysis code and expertise. Neural activity recordings (signals and image series) are the primary data types”.  This is a large project with 20 scientific investigators across 11 institutions and £5m in funding.

Oh, and I presented on the digital library.  Presentation here:

Overall, it was an interesting event – good to see different things happening across White Rose and to meet a mixture of people involved in managing large datasets.

