Simon Bührer

Papers

6 publications, most recent first.

  1. 2026

    BitLogic: A Training Framework for Gradient-Based FPGA-Native Neural Networks

    TMLR

    An end-to-end, gradient-based framework for training neural networks that run natively on FPGA lookup tables, replacing multiply-accumulate arithmetic with differentiable LUT nodes that use binary computation and sparse connectivity. It comes with modular PyTorch APIs and hardware-aware components, and it exports trained models to synthesisable RTL with verified bit-accurate equivalence. On CIFAR-10 it reaches 72.3% accuracy using under 0.3M logic gates, with single-sample latency below 20 ns.

  2. 2026

    GIC-DLC: Differentiable Logic Circuits for Hardware-Friendly Grayscale Image Compression

    AAAI 2026 Workshop on ML4Wireless

    Neural image codecs compress better than hand-built formats like PNG and JPEG-XL, but they are too expensive for energy-constrained edge devices. GIC-DLC is a grayscale codec that trains lookup-table logic circuits instead, keeping much of the flexibility of a neural network while running as a cheap Boolean circuit. On grayscale benchmarks it compresses better than traditional codecs while using less energy and running faster, which makes it practical for low-power imaging on phones, cameras, and drones.

  3. 2025

    Recurrent Deep Differentiable Logic Gate Networks

    EdgeFM @ ICML / ICLR 2026 workshop track

    The first recurrent network built from differentiable logic gates, which extends learnable Boolean computation to sequence-to-sequence tasks. On WMT'14 English-to-German it reaches 5.00 BLEU during training, close to a GRU baseline at 5.41, and drops to 4.39 BLEU once the gates are discretised to hard logic for inference. The goal is sequence models that map directly onto logic hardware.

  4. 2025

    Why Can't RNNs Learn Math? Automata-Inspired RNNs for Exact Computation

    Preprint

    RNNs are Turing-complete in theory but usually fail at exact arithmetic in practice. This work compiles p-stack automata into trainable RNNs, using stack splitting so the state grows linearly instead of exponentially, and a five-layer Clipped-ReLU network that reproduces the automaton exactly. It then looks at why training is so hard: the loss has a narrow V-shaped basin around the optimum that gradient descent tends to miss. Initialising close to that basin raises accuracy from 71.7% to about 90% across 13 arithmetic and bitwise operations.

  5. 2025

    Investigating the Role of Samples in Catastrophic Forgetting

    Deep Learning course project, ETH Zürich

    A study of how the choice of replayed examples affects catastrophic forgetting in continual learning. It compares three ways of deciding which samples to keep in the replay buffer: confidence-based learning-speed estimation, weighted prioritisation, and sensitivity-aware sampling based on the Memory-Perturbation Equation. On CIFAR-10 with a ResNet-18, all three end up close to the Goldilocks baseline without clearly beating it. The confidence-based scores track Goldilocks well (Pearson r up to 0.83) but use more memory.

  6. 2024

    ClassFormer: Transformers for Multivariate Time Series Classification

    BSc Thesis, ETH Zürich

    A Transformer for multivariate time-series classification. It uses continuous wavelet transforms to add frequency information, patch-wise embeddings to keep the sequence length manageable, and a three-stage attention mechanism that attends across time, channels, and frequency. On 18 UEA datasets it is competitive with eight standard baselines and best on several of them, including perfect accuracy on Epilepsy. A learned masking scheme adds a further accuracy gain and makes training more stable.
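
As a rough illustration of why patch-wise embeddings keep the sequence length manageable in the ClassFormer entry above, the sketch below groups consecutive time steps of a multivariate series into one token each. It is not taken from the thesis: the function name `patchify`, the non-overlapping layout, and the choice to drop trailing steps that do not fill a patch are all assumptions made for the example.

```python
def patchify(series, patch_len):
    """Split a (time, channels) series into non-overlapping patches.

    Each patch is flattened into one token of length patch_len * channels,
    so the sequence length drops from T to T // patch_len.
    Assumption: trailing steps that do not fill a whole patch are dropped.
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    n_patches = len(series) // patch_len
    tokens = []
    for p in range(n_patches):
        window = series[p * patch_len:(p + 1) * patch_len]
        # Flatten the window: all channels of step 0, then step 1, and so on.
        tokens.append([value for step in window for value in step])
    return tokens


# Hypothetical toy input: 8 time steps, 2 channels.
series = [[float(t), float(-t)] for t in range(8)]
tokens = patchify(series, patch_len=4)
print(len(series), "time steps ->", len(tokens), "tokens")
```

With a patch length of 4, the attention layers would see a quarter as many positions, and since self-attention cost grows quadratically with sequence length, the saving is larger than the reduction in tokens alone suggests.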