Rechtspraak.js
A bunch of utility functions to work with the open data in Rechtspraak.nl and create pretty, well-formed JSON-LD.
Written in TypeScript, compiled to a JavaScript CommonJS module.
Why?
Rechtspraak.nl publishes information about a large number of Dutch court judgments. Although the source XML suggests that the data is valid RDF, it is not. Furthermore, Rechtspraak.nl provides no schema for its documents other than an incomplete PDF in natural language. So it's hard to know what to expect, especially for some of the more esoteric metadata fields.
The purpose of this project is to formalize the data model of Rechtspraak.nl. I have done this by analyzing all existing documents (~2 million) on Rechtspraak.nl to generate a JSON Schema and TypeScript typings for the metadata associated with the court judgments. I have corrected some common errors in the source files (mostly to do with URIs that are not properly encoded) and generated valid JSON-LD (which is compatible with RDF).
Data
A dump of the metadata in sanitized JSON-LD is available at https://rechtspraak.lawreader.nl/_all.
You can use most of the CouchDB view API, e.g. https://rechtspraak.lawreader.nl/_all?limit=100&skip=50 limits your request to 100 docs after the first 50. Note that you can also use startkey to paginate faster: https://rechtspraak.lawreader.nl/_all?startkey=%22ECLI:NL:CBB:2015:5%22&limit=50 fetches the first 50 docs starting at ECLI:NL:CBB:2015:5. The documents are ordered alphabetically by their ids.
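As a rough sketch of the pagination parameters described above, the query URLs could be assembled like this. The helper name buildPageUrl and the PageOptions interface are made up for illustration and are not part of this package; only the base URL and the limit, skip and startkey parameters come from the text.

```typescript
// Base URL taken from the text above; everything else here is illustrative.
const BASE_URL = "https://rechtspraak.lawreader.nl/_all";

// Hypothetical options object, not part of the package's API.
interface PageOptions {
  limit?: number;    // maximum number of docs to return
  skip?: number;     // number of docs to skip
  startkey?: string; // document id (an ECLI) to start from
}

// Hypothetical helper: builds a query URL from the given options.
function buildPageUrl(options: PageOptions = {}): string {
  const params: string[] = [];
  if (options.startkey !== undefined) {
    // The key is sent as a JSON string, so it is wrapped in double quotes
    // before percent-encoding (the quotes become %22, as in the example URL).
    params.push("startkey=" + encodeURIComponent(JSON.stringify(options.startkey)));
  }
  if (options.limit !== undefined) params.push("limit=" + options.limit);
  if (options.skip !== undefined) params.push("skip=" + options.skip);
  return params.length > 0 ? BASE_URL + "?" + params.join("&") : BASE_URL;
}

// To walk the whole dump with startkey, one would presumably pass the last id
// of the previous page as the next startkey (that id is then returned again).
const firstPage = buildPageUrl({ limit: 100, skip: 50 });
const fromKey = buildPageUrl({ startkey: "ECLI:NL:CBB:2015:5", limit: 50 });
```

Note that encodeURIComponent also encodes the colons in the id as %3A, which is a different spelling of the same query than the example URL above.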
This URL will load the complete knowledge graph of Rechtspraak.nl, making use mostly of dcterms and schema.org. I've invented my own URIs where appropriate. I'm planning to make them resolvable as well.
Types
JSON Schema
Rechtspraak.nl metadata gotchas
- Some dcterms:type triples don't have a resourceIdentifier, e.g. ECLI:NL:RBMNE:2016:1637: <dcterms:type rdf:language="nl" resourceIdentifier="">Uitspraak</dcterms:type>
- Some docs miss .nl in the URI, e.g. ECLI:NL:CBB:2002:AD9059: psi:type="http://psi.rechtspraak/conclusie"
- Many URIs aren't encoded properly, most notably the "gevolg" URIs, e.g. http://psi.rechtspraak.nl/gevolg#(Gedeeltelijke) vernietiging en zelf afgedaan. According to the official URI specification, spaces are illegal in URIs.
  - This also applies to some references, e.g. in http://data.rechtspraak.nl/uitspraken/content?id=ECLI:NL:HR:1992:AA2957: 1.0:v:BWB:BWBV0001506&artikel=7 (oud)&g=1992-12-23
  - Most dramatically, the URI http://psi.rechtspraak.nl/procedure#[line feed]tussenbeschikking contains line feeds (see ECLI:NL:RBMNE:2016:1780)
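The URI problems listed above could be repaired along these lines. This is a minimal sketch and not necessarily how this package does it: the function name sanitizeUri is hypothetical, and the rules (drop line feeds, restore the missing .nl, percent-encode spaces and non-ASCII characters) are assumptions based only on the gotchas described here.

```typescript
// Hypothetical helper, written for illustration only.
function sanitizeUri(raw: string): string {
  return raw
    // Line feeds inside a URI are dropped entirely.
    .replace(/[\r\n]+/g, "")
    .trim()
    // Some documents use "psi.rechtspraak" without the ".nl" suffix.
    .replace(/^http:\/\/psi\.rechtspraak\//, "http://psi.rechtspraak.nl/")
    // Percent-encode spaces and anything outside printable ASCII.
    // Existing %XX escapes are left alone, so nothing is double-encoded.
    .replace(/[^\x21-\x7E]+/g, (illegal) => encodeURIComponent(illegal));
}

const gevolg = sanitizeUri(
  "http://psi.rechtspraak.nl/gevolg#(Gedeeltelijke) vernietiging en zelf afgedaan"
);
const procedure = sanitizeUri("http://psi.rechtspraak.nl/procedure#\ntussenbeschikking");
const conclusie = sanitizeUri("http://psi.rechtspraak/conclusie");
```

The sketch deliberately leaves other characters that are not allowed in URIs (such as angle brackets or curly braces) untouched; a stricter version would need to handle those as well.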
Some issues derived from an earlier report:
- In general, the W3C RDF validator crashes on input documents.
- The subject of a triple is not always clear. There are two dcterms:modified properties described, and it is unclear which one refers to the date on which the document was modified and which one to the date on which the metadata was modified.
- Values are usually not typed, for example in the case of dates.
- Resource identifiers are not always used, when they easily can be. An example is the dcterms:coverage property. This might not seem important, such as in the case of dcterms:accessRights, which is fixed to the string literal public. But RDF processors typically do not treat two equal string literals as the same concept: URIs are used for that. (Also, properties in the Dublin Core normally define a range, which usually implies URIs.)
- There are some ECLI identifiers that turn up when searching for documents that have a body, but actually do not have a body. Encountered are:
Property-specific issues:
- dcterms:references prefixes the resourceIdentifier attribute with the namespace of the corpus that the referent is in. This is not properly formed RDF.
- dcterms:subject: when a judgment is about multiple fields, a resource identifier is given that contains both subjects concatenated. An example is http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht_socialezekerheidsrecht. It makes more sense to have one URI for 'bestuursrecht' and one URI for 'socialezekerheidsrecht'.
- psi:zaaknummer doesn't seem to split lists of identifiers correctly. A string like 97/8236 TW, 97/8241 TW is probably two case numbers, not one.
- The XML defines a prefix that refers to the relative URI bwb-dl. Prefixing to relative URIs is a practice that has been deprecated by W3C.
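To illustrate the dcterms:subject and psi:zaaknummer issues above, here is a small sketch of how the concatenated values might be split. Both function names are hypothetical, and the separators are assumptions read off the two examples: an underscore between subjects in the URI fragment, and a comma between case numbers. Real data may contain exceptions to either rule.

```typescript
// Hypothetical helper: splits a concatenated subject URI into one URI per field.
// Assumes "_" only ever separates fields within the fragment.
function splitSubjects(uri: string): string[] {
  const hash = uri.indexOf("#");
  if (hash === -1) return [uri];
  const base = uri.slice(0, hash + 1);
  const fields = uri
    .slice(hash + 1)
    .split("_")
    .filter((field) => field.length > 0);
  // Fall back to the original URI when the fragment yields no fields.
  return fields.length > 0 ? fields.map((field) => base + field) : [uri];
}

// Hypothetical helper: splits a comma-separated list of case numbers.
function splitCaseNumbers(value: string): string[] {
  return value
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

const subjects = splitSubjects(
  "http://psi.rechtspraak.nl/rechtsgebied#bestuursrecht_socialezekerheidsrecht"
);
const caseNumbers = splitCaseNumbers("97/8236 TW, 97/8241 TW");
```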
License
GPL v3. Note that this is a viral open source license. If you create derivatives, you must publish your code under compatible license terms. Please support free software.