R.I.P.
๐ป
Ghosted
Shift schema drift left: policy-aware compile-time contracts for typed JVM and Spark pipelines
April 18, 2026 ยท Grace Period ยท + Add venue
Abstract
Schema drift in data pipelines is often caught only when a job touches real data. Typed-Dataset layers close part of this gap but require wholesale adoption; table-level enforcement systems close another part but operate at write time against a stored schema. We present a small Scala 3 framework that occupies the seam: it proves producer-to-contract structural compatibility under explicit policies at compile time, derives Spark schemas from the same contract types, and re-checks the actual DataFrame schema at the sink boundary before write. The artifact fuses the compile-time witness with a policy-aware runtime comparator that adds a nested-collection-optionality check Spark's built-in comparators omit and implements structural subset semantics for backward- and forward-compatible field sets. Evaluation covers compile-time proofs, runtime policy tests, builder-path end-to-end tests, and reproducible benchmarks on two environments. This is a narrow, honest mechanism artifact; the broader claim that compile-time structural contracts deliver measurable productivity or reliability in practice is stated as motivation and left for future work.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Programming Languages
R.I.P.
๐ป
Ghosted
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
R.I.P.
๐ป
Ghosted
Glow: Graph Lowering Compiler Techniques for Neural Networks
R.I.P.
๐ป
Ghosted
Learnable Programming: Blocks and Beyond
R.I.P.
๐ป
Ghosted
Scenic: A Language for Scenario Specification and Scene Generation
R.I.P.
๐ป
Ghosted