Linked data

In computing, linked data (often capitalized as Linked Data) is a method of publishing structured data so that it can be interlinked and become more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.[1]
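As a rough illustration of how data from different sources can be connected and queried, the sketch below models statements as (subject, predicate, object) triples, which is the shape of RDF's data model, and merges two small sources that share a URI. Every `example.org` and `example.com` URI, the population value, and the `match` helper are hypothetical. A real deployment would more likely use an RDF library and SPARQL than plain Python sets.

```python
# Each statement is a (subject, predicate, object) triple, mirroring RDF's data model.
# All URIs and values below are made-up placeholders.
ALICE = "http://example.org/people/alice"
SPRINGFIELD = "http://example.com/places/springfield"

source_a = {
    (ALICE, "http://example.org/vocab/name", "Alice"),
    (ALICE, "http://example.org/vocab/livesIn", SPRINGFIELD),
}
source_b = {
    (SPRINGFIELD, "http://example.com/vocab/population", 30000),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Both sources use the same URI for the place, so a set union connects them.
merged = source_a | source_b

# Query across sources: where does Alice live, and what is known about that place?
home = match(merged, s=ALICE, p="http://example.org/vocab/livesIn")[0][2]
facts_about_home = match(merged, s=home)
print(facts_about_home)
```

The join works only because both publishers chose the same identifier for the place, which is the motivation for the principles that follow.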

Tim Berners-Lee, director of the World Wide Web Consortium (W3C), coined the term in a design note about the Semantic Web project.[2]

Principles

Tim Berners-Lee outlined four principles of linked data in his Design Issues: Linked Data note,[2] paraphrased along the following lines:

  1. Use URIs to name (identify) things.
  2. Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced").
  3. Provide useful information about what a name identifies when it's looked up, using open standards such as RDF and SPARQL.
  4. Refer to other things using their HTTP URI-based names when publishing data on the Web.
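Principles 2 and 3 are often realised through HTTP content negotiation, where a client asks for a machine-readable representation of the thing a URI names. The sketch below only prepares such a request with Python's standard library and does not send it. The URI is a placeholder, the helper names are hypothetical, and the choice of Turtle with an RDF/XML fallback is an assumption rather than something the principles require.

```python
from urllib.parse import urlparse
from urllib.request import Request

def is_http_uri(uri: str) -> bool:
    """Principle 2: the name should be an HTTP(S) URI so that it can be looked up."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def build_lookup_request(uri: str) -> Request:
    """Prepare (but do not send) a GET request asking for an RDF serialization."""
    if not is_http_uri(uri):
        raise ValueError(f"not an HTTP URI: {uri}")
    # The Accept header asks for Turtle first and RDF/XML as a fallback.
    return Request(uri, headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"})

request = build_lookup_request("http://example.org/people/alice")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup. Whether the server honours the `Accept` header depends on how the publisher has configured it.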

Tim Berners-Lee gave a presentation on linked data at the TED 2009 conference.[3] In it, he restated the linked data principles as three "extremely simple" rules:

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. If I take one of these HTTP names and I look it up [..] I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. When I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.
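The third rule is what lets a client move from one description to the next. The sketch below imitates that with an in-memory dictionary standing in for the Web, so nothing is fetched over the network. All names, properties and values are hypothetical, and testing for an `http://` prefix is a simplification of how a real client would tell names from plain values.

```python
from collections import deque

# A stand-in for the Web: looking up a name returns (property, value) pairs.
FAKE_WEB = {
    "http://example.org/alice": [
        ("http://example.org/vocab/height", "170 cm"),
        ("http://example.org/vocab/knows", "http://example.org/bob"),
    ],
    "http://example.org/bob": [
        ("http://example.org/vocab/attended", "http://example.org/events/conf"),
    ],
    "http://example.org/events/conf": [
        ("http://example.org/vocab/label", "A conference"),
    ],
}

def look_up(name):
    """Rule 2: looking up an HTTP name returns data about that thing."""
    return FAKE_WEB.get(name, [])

def discover(start):
    """Rule 3: follow values that are themselves HTTP names, breadth-first."""
    seen, queue = [start], deque([start])
    while queue:
        for _prop, value in look_up(queue.popleft()):
            if value.startswith("http://") and value not in seen:
                seen.append(value)
                queue.append(value)
    return seen

print(discover("http://example.org/alice"))
```

Starting from a single name, the traversal reaches the person and the event only because each relationship points at another HTTP name rather than at a bare string.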

Components

Linked open data

Linked open data is linked data that is open content.[4][5][6] Tim Berners-Lee gives a clear definition of linked open data by contrasting it with linked data: he defines linked data by identifying its four components, and then adds a fifth rule, open content, to define linked open data.[7][8] Large linked open data sets include DBpedia and Freebase.

History

The term "linked open data" has been in use since at least February 2007, when the "Linking Open Data" mailing list[9] was created.[10] The mailing list was initially hosted by the SIMILE project[11] at the Massachusetts Institute of Technology.

Linking Open Data community project

The Linking Open Data cloud diagram shows which Linking Open Data datasets are connected, as of August 2014. It was produced by the Linked Open Data Cloud project, which was started in 2007. Some sets may include copyrighted data which is freely available.[12]

The goal of the W3C Semantic Web Education and Outreach group's Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links.[13][14] By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. A detailed statistical breakdown was published in 2014.[15]
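The figures above count two different things: all triples, and the subset that link one dataset to another. A minimal sketch of that distinction follows, assuming that an RDF link is a triple whose subject and object belong to different datasets and that dataset membership can be read off a URI prefix. The dataset names, prefixes and triples are hypothetical and have no connection to the published statistics.

```python
# Hypothetical datasets, each identified by the URI prefix it publishes under.
DATASETS = {
    "people": "http://people.example.org/",
    "places": "http://places.example.org/",
}

TRIPLES = [
    ("http://people.example.org/alice", "http://people.example.org/vocab/knows",
     "http://people.example.org/bob"),
    ("http://people.example.org/alice", "http://people.example.org/vocab/basedNear",
     "http://places.example.org/springfield"),
    ("http://places.example.org/springfield", "http://places.example.org/vocab/name",
     "Springfield"),
]

def dataset_of(term):
    """Return the dataset whose prefix the term starts with, or None for literals."""
    for name, prefix in DATASETS.items():
        if term.startswith(prefix):
            return name
    return None

def count_links(triples):
    """Count triples whose subject and object live in different datasets."""
    links = 0
    for subject, _predicate, obj in triples:
        src, dst = dataset_of(subject), dataset_of(obj)
        if src is not None and dst is not None and src != dst:
            links += 1
    return links

print(len(TRIPLES), "triples,", count_links(TRIPLES), "RDF link(s)")
```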

European Union projects

A number of European Union projects involve linked data. These include the linked open data around the clock (LATC) project,[16] the PlanetData project,[17] the DaPaaS (Data-and-Platform-as-a-Service) project,[18] and the Linked Open Data 2 (LOD2) project.[19][20][21] Data linking is one of the main goals of the EU Open Data Portal, which makes available thousands of datasets for anyone to reuse and link.

Datasets

  • DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts described by 1 billion triples, including abstracts in 11 different languages
  • GeoNames – a dataset providing RDF descriptions of more than 7,500,000 geographical features worldwide
  • UMBEL – a lightweight reference structure of 20,000 subject concept classes and their relationships derived from OpenCyc, which can act as binding classes to external data; also has links to 1.5 million named entities from DBpedia and YAGO
  • FOAF – a dataset describing persons, their properties and relationships
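To give a feel for what entries in such datasets look like, the sketch below writes three statements about a person using FOAF terms (`foaf:Person`, `foaf:name`, `foaf:knows`) and prints them in N-Triples syntax. The two people and their `example.org` URIs are invented, the `iri` and `literal` helpers are hypothetical, and the escaping covers only backslashes and double quotes, so it is not a complete N-Triples serializer.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def iri(value):
    """Wrap a URI in angle brackets, as N-Triples writes IRIs."""
    return f"<{value}>"

def literal(value):
    """Quote a plain string, escaping backslashes and double quotes only."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# Invented people with placeholder URIs.
alice = "http://example.org/people/alice#me"
bob = "http://example.org/people/bob#me"

statements = [
    (iri(alice), iri(RDF_TYPE), iri(FOAF + "Person")),
    (iri(alice), iri(FOAF + "name"), literal("Alice")),
    (iri(alice), iri(FOAF + "knows"), iri(bob)),
]

# One statement per line, each terminated by " ." as in N-Triples.
n_triples = "\n".join(f"{s} {p} {o} ." for s, p, o in statements)
print(n_triples)
```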

Dataset instance and class relationships

Clickable diagrams that show the individual datasets and their relationships within the DBpedia-spawned LOD cloud are available.[22][23]

References

  1. Solving Semantic Interoperability Conflicts in Cross–Border E–Government Services.
  2. Tim Berners-Lee. Linked Data. Design Issues note, W3C.
  3. Tim Berners-Lee. Presentation on linked data, TED 2009.
  4. [citation text missing]
  5. [citation text missing]
  6. [citation text missing]
  7. [citation text missing]
  8. [citation text missing]
  9. "Linking Open Data" mailing list.
  10. [citation text missing]
  11. SIMILE project, Massachusetts Institute of Technology.
  12. Linking open data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
  13. Linking Open Data
  14. [citation text missing]
  15. http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
  16. Linked open data around the clock (LATC)
  17. PlanetData
  18. DaPaaS
  19. Linking Open Data 2 (LOD2)
  20. [citation text missing]
  21. [citation text missing]
  22. Instance relationships amongst datasets
  23. Class relationships amongst datasets
