feat(pipeline): surface per-distribution RDF-validity verdicts #476
Open
ddeboer wants to merge 1 commit into
Force-pushed from 7dd5b6e to faf55f5
Wire @lde/distribution-health into the pipeline so each distribution's validity is reported as a plain TypeScript verdict via a new ProgressReporter.distributionValidated(distribution, verdict) callback. The pipeline emits no RDF and coins no vocabulary; consumers map the verdict to their own RDF (see netwerk-digitaal-erfgoed/dataset-register#2103).

- Deep verdict from the import outcome: invalid (parse-error) on import failure, surfaced even when the dataset is then skipped, so an invalid distribution is recorded rather than silently dropped; valid on success, or empty when the import yielded no triples.
- Shallow verdict from the probe's body validation, per probed distribution.
- Each verdict carries the distribution's observed source fingerprint; the fingerprint is also added to the reachability result (DistributionAnalysisResult.fingerprint) so it is the shared key across the reachability and validity rails.

Part of netwerk-digitaal-erfgoed/dataset-register#2103. Closes #469.
Force-pushed from faf55f5 to af2273e
Closes #469. Part of the distribution-health feature: netwerk-digitaal-erfgoed/dataset-register#2103.
What
Wires @lde/distribution-health into @lde/pipeline so every distribution the pipeline touches gets an RDF-validity verdict, surfaced as a plain TypeScript value through a new reporter callback. The pipeline emits no RDF and coins no vocabulary; consumers map the verdict to their own RDF (the “RDF emission & vocabulary boundary” decision in #2103).
New surface: the ProgressReporter.distributionValidated(distribution, verdict) callback.
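For orientation, the reporter surface might look roughly like the sketch below. It is not the PR's diff: only ProgressReporter.distributionValidated(distribution, verdict), the valid / invalid / empty outcomes, the parse-error reason and the fingerprint come from this description. The type names and fields (Distribution.accessUrl, ValidityVerdict, status, reason, depth) and the CollectingReporter class are assumptions made for illustration.

```typescript
// Hypothetical shapes: field names are assumptions, not the actual
// @lde/distribution-health or @lde/pipeline types.
interface Distribution {
  accessUrl: string;
}

type ValidityVerdict =
  | { status: 'valid' | 'empty'; depth: 'deep' | 'shallow'; fingerprint?: string }
  | {
      status: 'invalid';
      reason: string; // 'parse-error' is the reason named in this PR
      depth: 'deep' | 'shallow';
      fingerprint?: string;
    };

// Only the callback name and its two parameters are taken from the PR.
interface ProgressReporter {
  distributionValidated(distribution: Distribution, verdict: ValidityVerdict): void;
}

// A consumer turns the plain verdict into whatever it needs; here a log line.
function describeVerdict(verdict: ValidityVerdict): string {
  if (verdict.status === 'invalid') {
    return `invalid (${verdict.reason})`;
  }
  return verdict.status;
}

// Example consumer-side reporter that collects one line per verdict.
class CollectingReporter implements ProgressReporter {
  readonly lines: string[] = [];

  distributionValidated(distribution: Distribution, verdict: ValidityVerdict): void {
    this.lines.push(`${distribution.accessUrl}: ${describeVerdict(verdict)}`);
  }
}
```

A consumer that emits RDF would replace describeVerdict with its own mapping; the pipeline itself stays vocabulary-free.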
Behaviour
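- Deep verdict from the import outcome: invalid (parse-error) on import failure, surfaced even when the dataset is then skipped (the previously-silent drop that motivated #2103); valid on success, or empty when the import yielded no triples.
- Shallow verdict from the probe's body validation, per probed distribution.
- Each verdict carries the distribution's observed source fingerprint; the fingerprint is also added to the reachability result (DistributionAnalysisResult.fingerprint) so it is the shared key across the reachability and validity rails.
Tests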
311 pipeline tests pass (5 new, behaviour-asserted through the reporter): deep invalid / valid / empty, shallow invalid, and the reachability fingerprint. Lint + typecheck clean.
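The three deep cases covered by the tests (invalid, valid, empty) follow a simple rule that can be sketched as below. This is an illustration under assumptions, not the shared mapper from the PR: ImportOutcome, tripleCount, DeepVerdict and deepVerdict are hypothetical names, and the fingerprint is passed in as an opaque string.

```typescript
// Hypothetical import outcome: either the import failed, or it succeeded
// and yielded some number of triples (possibly zero).
type ImportOutcome =
  | { kind: 'failed'; error: string }
  | { kind: 'successful'; tripleCount: number };

// Hypothetical deep verdict, keyed by the observed source fingerprint.
type DeepVerdict =
  | { status: 'invalid'; reason: 'parse-error'; fingerprint: string }
  | { status: 'valid'; fingerprint: string }
  | { status: 'empty'; fingerprint: string };

function deepVerdict(outcome: ImportOutcome, fingerprint: string): DeepVerdict {
  if (outcome.kind === 'failed') {
    // Reported even if the dataset is subsequently skipped.
    return { status: 'invalid', reason: 'parse-error', fingerprint };
  }
  if (outcome.tripleCount === 0) {
    // Import succeeded but produced no triples.
    return { status: 'empty', fingerprint };
  }
  return { status: 'valid', fingerprint };
}
```

Because each verdict carries the same fingerprint that the reachability result exposes, a consumer can join the two by that key.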
Notes
- … ImportSuccessful to feed the shared mapper. A possible later cleanup is to compute the verdict where the real import outcomes live (importResolver) and thread the fingerprint in; left as-is for now since it's localised.
- … def.nde.nl mapping lands in the consumers (epic tasks 4 & 5).