Spotlight Story: Data Quality

Today, MK:Smart Research Associate Shuangyan Liu tells us how she makes data easy to interrogate, share and integrate to build smart applications.


What are you working on right now?

I’m currently responsible for the quality of the data stored in the MK Data Hub.

As a term, data quality is fairly self-explanatory: it refers to how good the data is. However, there are many different definitions of data quality and a wide range of metrics for measuring it.

Across these definitions, data are generally considered to be high quality if they’re fit for purpose.

So while I’m responsible for ensuring the information stored by the MK Data Hub is accurate, I’m also concerned with whether that data can be integrated (or linked) to form applications. To that end, the data’s RDF description also needs to be accurate.


What’s RDF?

RDF stands for Resource Description Framework, a W3C standard that is widely used to describe web resources.

The information stored in the MK Data Hub is underpinned by RDF technology that describes each resource in a triple format: subject, predicate, object.

For example:

  • Subject: Milton Keynes
  • Predicate: total population
  • Object: 229941 (the statistic)
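To make the triple format concrete, here is a minimal Python sketch of the population example above. It is an illustration only, not how the MK Data Hub stores its data: the `http://example.org/` namespace, the `Triple` class and the `to_ntriples` helper are all hypothetical names invented for this example. The sketch writes the triple out as one line of N-Triples, a simple plain-text syntax for RDF.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Placeholder namespace for illustration; not the MK Data Hub's real URIs.
EX = "http://example.org/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def to_ntriples(triple: Triple) -> str:
    """Serialise a triple whose object is an integer literal as N-Triples."""
    return (
        f"<{triple.subject}> <{triple.predicate}> "
        f'"{triple.obj}"^^<{XSD_INTEGER}> .'
    )


# Subject: Milton Keynes, predicate: total population, object: 229941.
population = Triple(EX + "MiltonKeynes", EX + "totalPopulation", "229941")
line = to_ntriples(population)
print(line)
```

Because the subject and predicate are URIs rather than free text, another dataset that uses the same URI for Milton Keynes can be linked to this statement directly.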

When the RDF description is accurate, users can share the data and its descriptions more easily, and developers can link the data together in order to build applications.


How do you ensure the RDF description is accurate?

Much of the information can be checked against trusted, external resources. For example, you can cross-reference Milton Keynes’ latitude and longitude against external resources to ensure the accuracy of a dataset on Milton Keynes’ geography.

In practice, RDF triples about Milton Keynes’ geography can be validated using data from external repositories such as GeoNames, Ordnance Survey and OpenStreetMap.
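A cross-check of this kind could be sketched roughly as follows. This is an assumed approach for illustration, not the MK Data Hub's actual validation code: the function names, the 5 km tolerance and the coordinate values are all made up for the example, and a real check would fetch the reference value from one of the external repositories. The sketch measures the great-circle distance between a stored coordinate and a reference coordinate using the haversine formula, and flags the stored value if the two are too far apart.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def agrees(stored: tuple, reference: tuple, tolerance_km: float = 5.0) -> bool:
    """True if the stored (lat, lon) lies within tolerance_km of the reference."""
    return haversine_km(*stored, *reference) <= tolerance_km


# Illustrative, approximate coordinates only; not values from any real dataset.
stored_value = (52.04, -0.76)      # what the hub's triple might say
reference_value = (52.05, -0.75)   # what an external source might say

if agrees(stored_value, reference_value):
    print("Stored coordinates match the reference within tolerance.")
else:
    print("Mismatch: flag this triple for review.")
```

The tolerance matters because different sources may place a city's reference point in slightly different spots, so an exact equality test would raise false alarms.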

While some of the data is available externally, which means we can check its accuracy, the value of the MK Data Hub is that such a rich resource of Milton Keynes’ data is stored in one hub—one place—in a language that makes it easy to interrogate, share and integrate to form smart applications.

And through a variety of sensors across Milton Keynes, the Data Hub is also capturing a wide range of new information that can easily be linked to other data on the city.