All papers
Why Can't RNNs Learn Math? Automata-Inspired RNNs for Exact Computation
Simon Bührer, Preprint, July 2025
Abstract
RNNs are Turing-complete in theory but usually fail at exact arithmetic in practice. This work compiles p-stack automata into trainable RNNs, using stack splitting so the state grows linearly instead of exponentially, and a five-layer Clipped-ReLU network that reproduces the automaton exactly. It then looks at why training is so hard: the loss has a narrow V-shaped basin around the optimum that gradient descent tends to miss. Initialising close to that basin raises accuracy from 71.7% to about 90% across 13 arithmetic and bitwise operations.
Tags
- Program Induction
- Automata Theory
- Exact Computation
- RNNs