As part of our series of interviews with the Data Science Group, today we meet Enrico Daga, Project Officer for Linked Data, who tells us why understanding data licensing information is so important.
What is linked data?
Linked data is a novel approach to publishing information on the World Wide Web in such a way that machines can both understand it and link it to other data.
A good analogy is that data can be linked in the same way that web pages are linked. But rather than a page containing links, the actual data itself has links and points to other data.
This approach has led to the creation of a new Web of Data that is made of large interlinked databases, for example DBpedia and Europeana. The LinkedUp project is a good example of this in the higher education sector.
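To make the web-page analogy concrete, here is a minimal sketch of data that links to other data. It assumes the common subject-predicate-object way of writing linked data statements; every identifier in it (the `ex:` names, the `follow` helper) is hypothetical and invented for illustration, not taken from any real dataset.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical, for illustration only.
TRIPLES = [
    ("ex:course/1", "ex:title", "Introduction to Linked Data"),
    ("ex:course/1", "ex:taughtBy", "ex:person/ada"),
    ("ex:person/ada", "ex:name", "Ada Example"),
    ("ex:person/ada", "ex:worksAt", "ex:org/university"),
    ("ex:org/university", "ex:name", "Example University"),
]


def follow(subject, predicate, triples):
    """Follow a link: return what a subject points to via a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Start at a course and hop from one piece of data to the next,
# much as a reader clicks from one web page to another.
teacher = follow("ex:course/1", "ex:taughtBy", TRIPLES)[0]
employer = follow(teacher, "ex:worksAt", TRIPLES)[0]
print(follow(employer, "ex:name", TRIPLES))  # ['Example University']
```

The point of the sketch is that the object of one statement can be the subject of another, so a machine can move between records without a person telling it where to look.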
What does your role involve?
My role is to develop and maintain the Linked Open Data platform for The Open University. You can view this at data.open.ac.uk.
It’s worth mentioning that The Open University was the first academic institution in the UK to have a linked open data platform, and to be part of the Web of Data.
Of course, I work on the MK Data Hub too, managing the way information is catalogued, shared and linked.
What are you working on right now?
Right now, I’m researching methods to organise the policies and processes that relate to data in the MK Data Hub so that they’re computable or, in simpler terms, manageable.
There’s a lot of data coming in and out of the MK Data Hub. All the data is open, but in some cases there are certain policies that people need to adhere to.
We need to publish data licensing information so that users understand what they’re allowed to do with the data, and how licensing might affect what they are able to develop with it.
For example, you could use the data but not in a commercial setting. Or you could use the data, but you may need to minimise it to remove private information.
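As a rough illustration of what "computable" policies could look like, the two examples above can be written as checks that a program runs before using a dataset. This is only a sketch under stated assumptions: the `Policy` class, its fields and the sample record are all hypothetical, and it does not describe how the MK Data Hub actually represents its policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical machine-readable licence terms for one dataset."""
    commercial_use_allowed: bool
    private_fields: frozenset = frozenset()


def may_use(policy, commercial):
    """Allow any non-commercial use; commercial use only if permitted."""
    return policy.commercial_use_allowed or not commercial


def minimise(record, policy):
    """Return a copy of the record without the fields marked private."""
    return {k: v for k, v in record.items() if k not in policy.private_fields}


# Hypothetical policy: no commercial use, and owner names are private.
policy = Policy(commercial_use_allowed=False,
                private_fields=frozenset({"owner_name"}))
record = {"sensor_id": "s-17", "reading": 21.5, "owner_name": "A. Person"}

print(may_use(policy, commercial=True))   # False
print(may_use(policy, commercial=False))  # True
print(minimise(record, policy))           # {'sensor_id': 's-17', 'reading': 21.5}
```

Once the rules are data rather than prose, a developer's system can check them automatically instead of relying on someone reading the licence text.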
Why is this work important?
Supporting users in understanding how policies can affect their systems is hugely important.
If we don’t overcome the licensing and policy issues surrounding data, then users won’t be able to benefit from it, whether that’s valuable insights or the building blocks for developers to work with.
It’s essentially about building trust, which is an integral part of data sharing. No-one’s going to share data if they can’t be certain that users will respect their rules around its use.
Data governance is key in both safeguarding the data and building trust, so that we can all continue to benefit from the ever-increasing amount of information now available to us.