The Mechanism of Additive Composition

November 26, 2015 · Declared Dead · 🏛 Machine-mediated learning

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Ran Tian, Naoaki Okazaki, Kentaro Inui
arXiv ID: 1511.08407
Category: cs.CL: Computation & Language
Cross-listed: cs.LG
Citations: 28
Venue: Machine-mediated learning
Last Checked: 4 months ago

Abstract

Additive composition (Foltz et al., 1998; Landauer and Dumais, 1997; Mitchell and Lapata, 2010) is a widely used method for computing meanings of phrases, which takes the average of vector representations of the constituent words. In this article, we prove an upper bound for the bias of additive composition, which is the first theoretical analysis of compositional frameworks from a machine learning point of view. The bound is written in terms of collocation strength; we prove that the more exclusively two successive words tend to occur together, the more accurately their additive composition can be guaranteed to approximate the natural phrase vector. Our proof relies on properties of natural language data that are empirically verified, and can be theoretically derived from an assumption that the data is generated from a Hierarchical Pitman-Yor Process. The theory endorses additive composition as a reasonable operation for calculating meanings of phrases, and suggests ways to improve additive compositionality, including: transforming entries of distributional word vectors by a function that meets a specific condition, constructing a novel type of vector representations to make additive composition sensitive to word order, and utilizing singular value decomposition to train word vectors.
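
The operation the abstract describes, averaging the vectors of the constituent words, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `additive_composition` and `cosine` are hypothetical, and the three-dimensional word vectors are made-up toy values rather than vectors trained on a corpus. The cosine helper is included only to show one common way to compare a composed vector with another vector, such as a phrase vector observed directly from data.

```python
import math


def additive_composition(vectors):
    """Return the element-wise average of equal-length word vectors."""
    if not vectors:
        raise ValueError("need at least one word vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all word vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy vectors, invented for illustration only (assumption, not real data).
word_vectors = {
    "machine": [1.0, 0.0, 2.0],
    "learning": [0.0, 2.0, 2.0],
}

# Compose the two-word phrase by averaging its constituent word vectors.
phrase = additive_composition(
    [word_vectors["machine"], word_vectors["learning"]]
)
```

Here `phrase` holds the composed vector for the two-word phrase. In the setting of the abstract, the quantity of interest is how far such a composed vector lies from the natural phrase vector, which the paper bounds in terms of collocation strength; this sketch only shows the averaging step itself.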