Simon Bührer
All papers

Why Can't RNNs Learn Math? Automata-Inspired RNNs for Exact Computation

Simon Bührer, Preprint, July 2025

Abstract

RNNs are Turing-complete in theory but usually fail at exact arithmetic in practice. This work compiles p-stack automata into trainable RNNs, using stack splitting so the state grows linearly instead of exponentially, and a five-layer Clipped-ReLU network that reproduces the automaton exactly. It then looks at why training is so hard: the loss has a narrow V-shaped basin around the optimum that gradient descent tends to miss. Initialising close to that basin raises accuracy from 71.7% to about 90% across 13 arithmetic and bitwise operations.

Tags

  • Program Induction
  • Automata Theory
  • Exact Computation
  • RNNs