Self-Consistency from Only Two Samples: CoT-PoT Ensembling for Efficient LLM Reasoning

April 19, 2026 · Grace Period · 🏛 Findings of ACL 2026

Authors: Raman Saparkhan, Majd Hawasly, Md Rizwan Parvez, Mohammad Raza
arXiv ID: 2604.17433
Category: cs.CL (Computation & Language)
Cross-listed: cs.AI, cs.LG
Citations: 0
Venue: Findings of ACL 2026

Abstract

Self-consistency (SC) is a popular technique for improving the reasoning accuracy of large language models by aggregating multiple sampled outputs, but the extensive sampling it requires makes it computationally expensive. We introduce a hybrid ensembling approach that leverages the complementary strengths of two distinct modes of reasoning: Chain-of-Thought (CoT) and Program-of-Thought (PoT). We describe a general framework for combining these two forms of reasoning in self-consistency, as well as particular strategies for both full sampling and early stopping. We show that CoT-PoT ensembling not only improves overall accuracy but also reduces the number of samples required for SC by a factor of 9.3. In particular, the majority of tasks (78.6%) can be addressed with only two samples, which has not been possible with any prior SC method.
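To make the two-sample idea concrete, here is a minimal Python sketch of one way CoT-PoT early stopping could work. It is an illustration only, not the authors' algorithm: the abstract does not specify the stopping rule, so the rule used here (stop as soon as the first CoT answer and the first PoT answer agree, otherwise alternate the two modes up to a budget and take a majority vote) is an assumption. The names `cot_pot_self_consistency`, `sample_cot`, `sample_pot`, and `max_samples` are hypothetical; the two samplers stand in for LLM calls that return a normalized final answer, or `None` when no answer could be extracted (for example, when a generated program fails to run).

```python
from collections import Counter
from typing import Callable, Optional, Tuple

# Hypothetical sampler type: one call = one LLM sample, returning a
# normalized final answer string, or None if no answer was produced.
Sampler = Callable[[], Optional[str]]


def cot_pot_self_consistency(
    sample_cot: Sampler,
    sample_pot: Sampler,
    max_samples: int = 10,
) -> Tuple[Optional[str], int]:
    """Return (answer, samples_used) under an assumed early-stopping rule."""
    votes: Counter = Counter()
    used = 0

    # Step 1: draw one sample from each reasoning mode.
    for sampler in (sample_cot, sample_pot):
        answer = sampler()
        used += 1
        if answer is not None:
            votes[answer] += 1

    # Step 2 (assumed rule): if both modes produced the same answer,
    # accept it and stop after only two samples.
    if len(votes) == 1 and sum(votes.values()) == 2:
        return next(iter(votes)), used

    # Step 3: otherwise keep sampling, alternating CoT and PoT,
    # until the sample budget is exhausted.
    samplers = (sample_cot, sample_pot)
    while used < max_samples:
        answer = samplers[used % 2]()
        used += 1
        if answer is not None:
            votes[answer] += 1

    # Step 4: plain majority vote over all valid answers.
    if not votes:
        return None, used
    return votes.most_common(1)[0][0], used
```

In this sketch, agreement between a natural-language chain and an executed program is treated as strong enough evidence to skip further sampling. That is the intuition the abstract points to, but the paper's actual criteria may differ.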