Tokenisation is NP-Complete

December 19, 2024 · Declared Dead · 🏛 arXiv.org

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Philip Whittington, Gregor Bachmann, Tiago Pimentel
arXiv ID: 2412.15210
Category: cs.DS (Data Structures & Algorithms)
Cross-listed: cs.CL, cs.FL
Citations: 6
Venue: arXiv.org
Last Checked: 4 months ago
Abstract
In this work, we prove the NP-completeness of two variants of tokenisation, defined as the problem of compressing a dataset to at most $\delta$ symbols by either finding a vocabulary directly (direct tokenisation), or selecting a sequence of merge operations (bottom-up tokenisation).
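To make the bottom-up variant concrete, here is a minimal sketch of checking one candidate solution: given a merge sequence, apply it to each string in the dataset and test whether the total symbol count is at most the budget. Checking a candidate is the easy part; the hardness result concerns finding a good merge sequence. The function names (`apply_merge`, `compressed_length`, `meets_budget`), the greedy left-to-right merge rule, and the toy dataset are illustrative assumptions, not taken from the paper, whose formal definitions may differ (for example, in how merges are applied or how the number of merges is bounded).

```python
from typing import List, Sequence, Tuple

Merge = Tuple[str, str]


def apply_merge(symbols: List[str], pair: Merge) -> List[str]:
    """Replace non-overlapping occurrences of `pair`, scanning left to right.

    The left-to-right rule is an assumption of this sketch.
    """
    merged: List[str] = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def compressed_length(dataset: Sequence[str], merges: Sequence[Merge]) -> int:
    """Total number of symbols after applying `merges` in order to each string."""
    total = 0
    for text in dataset:
        symbols = list(text)  # start from single characters
        for pair in merges:
            symbols = apply_merge(symbols, pair)
        total += len(symbols)
    return total


def meets_budget(dataset: Sequence[str], merges: Sequence[Merge], delta: int) -> bool:
    """True if the merge sequence compresses the dataset to at most `delta` symbols."""
    return compressed_length(dataset, merges) <= delta


# Hypothetical toy instance.
dataset = ["abab", "abc"]
merges = [("a", "b"), ("ab", "ab")]
print(compressed_length(dataset, merges))
print(meets_budget(dataset, merges, delta=3))
```

Each check runs in time polynomial in the dataset size and the number of merges, which is consistent with the problem lying in NP. A brute-force solver would have to try many merge sequences, and the number of candidates grows quickly with the number of merges allowed.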