Knows: Agent-Native Structured Research Representations

April 19, 2026 Β· Grace Period Β· + Add venue

⏳ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Guangsheng Yu, Xu Wang arXiv ID 2604.17309 Category cs.AI: Artificial Intelligence Citations 0
Abstract
Research artifacts are distributed primarily as reader-oriented documents like PDFs. This creates a bottleneck for increasingly agent-assisted and agent-native research workflows, in which LLM agents need to infer fine-grained, task-relevant information from lengthy full documents, a process that is expensive, repetitive, and unstable at scale. We introduce Knows, a lightweight companion specification that binds structured claims, evidence, provenance, and verifiable relations to existing research artifacts in a form LLM agents can consume directly. Knows addresses the gap with a thin YAML sidecar (KnowsRecord) that coexists with the original PDF, requiring no changes to the publication itself, and validated by a deterministic schema linter. We evaluate Knows on 140 comprehension questions across 20 papers spanning 14 academic disciplines, comparing PDF-only, sidecar-only, and hybrid conditions across six LLM agents of varying capacity. Weak models (0.8B--2B parameters) improve from 19--25\% to 47--67\% accuracy (+29 to +42 percentage points) when reading sidecar instead of PDF, while consuming 29--86\% fewer input tokens; an LLM-as-judge re-scoring confirms that weak-model sidecar accuracy (75--77\%) approaches stronger-model PDF accuracy (78--83\%). Beyond this controlled evaluation, a community sidecar hub at https://knows.academy/ has already indexed over ten thousand publications and continues to grow daily, providing independent evidence that the format is adoption-ready at scale.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence