Papers

6 publications, most recent first.

2026
BitLogic: A Training Framework for Gradient-Based FPGA-Native Neural Networks
TMLR
An end-to-end, gradient-based framework for training neural networks that run natively on FPGA lookup tables, replacing multiply-accumulate arithmetic with differentiable LUT nodes that use binary computation and sparse connectivity. It comes with modular PyTorch APIs and hardware-aware components, and it exports trained models to synthesisable RTL with verified bit-accurate equivalence. On CIFAR-10 it reaches 72.3% accuracy using under 0.3M logic gates, with single-sample latency below 20 ns.
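As a rough illustration of what a differentiable LUT node can look like, the sketch below evaluates a k-input lookup table by multilinear interpolation over soft bits, then hardens the table to 0/1 entries for export. This is an assumed, generic formulation, not BitLogic's actual API: the names `soft_lut` and `harden`, the 0.5 threshold, and the example table are all hypothetical.

```python
from itertools import product


def soft_lut(inputs, table):
    """Multilinear interpolation of a k-input lookup table (illustrative only).

    inputs: k soft bits in [0, 1]
    table:  2**k entries in [0, 1]; inputs[0] is the most significant index bit
    """
    k = len(inputs)
    assert len(table) == 2 ** k
    out = 0.0
    for index, bits in enumerate(product((0, 1), repeat=k)):
        # Weight of this table entry: product of x (bit = 1) or 1 - x (bit = 0).
        weight = 1.0
        for x, b in zip(inputs, bits):
            weight *= x if b else 1.0 - x
        out += weight * table[index]
    return out


def harden(table, threshold=0.5):
    """Round learned table entries to hard 0/1 values (assumed threshold)."""
    return [1 if t > threshold else 0 for t in table]


# Hypothetical learned 2-input table that is close to XOR.
xor_like = [0.1, 0.9, 0.8, 0.2]
hard = harden(xor_like)
```

On exact 0/1 inputs the interpolation selects a single table entry, so the hardened node behaves like an ordinary LUT, while fractional inputs and entries keep the output differentiable during training.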
2026
GIC-DLC: Differentiable Logic Circuits for Hardware-Friendly Grayscale Image Compression
AAAI 2026 Workshop on ML4Wireless
Neural image codecs compress better than hand-built formats like PNG and JPEG-XL, but they are too expensive for energy-constrained edge devices. GIC-DLC is a grayscale codec that trains lookup-table logic circuits instead, keeping much of the flexibility of a neural network while running as a cheap Boolean circuit. On grayscale benchmarks it compresses better than traditional codecs while using less energy and running faster, which makes it practical for low-power imaging on phones, cameras, and drones.
- arXiv
- AAAI 2026 ML4Wireless
2025
Recurrent Deep Differentiable Logic Gate Networks
EdgeFM @ ICML / ICLR 2026 workshop track
The first recurrent network built from differentiable logic gates, which extends learnable Boolean computation to sequence-to-sequence tasks. On WMT'14 English-to-German it reaches 5.00 BLEU during training, close to a GRU baseline at 5.41, and drops to 4.39 BLEU once the gates are discretised to hard logic for inference. The goal is sequence models that map directly onto logic hardware.
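The gap between the relaxed and discretised scores can be pictured with a toy soft gate. The sketch below follows one common formulation of differentiable logic gates, a softmax-weighted mixture over real-valued gate relaxations that is replaced by its most likely gate at inference. It is an assumption for illustration, not the paper's code: only four of the sixteen two-input gates are included, and the logits are made up.

```python
import math

# Real-valued relaxations of a few two-input gates (illustrative subset).
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
    "NAND": lambda a, b: 1 - a * b,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a, b, logits):
    """Training-time output: probability-weighted mix of all candidate gates."""
    probs = softmax(logits)
    return sum(p * gate(a, b) for p, gate in zip(probs, GATES.values()))


def hard_gate(logits):
    """Inference-time choice: name of the single most likely gate."""
    names = list(GATES)
    return names[max(range(len(logits)), key=lambda i: logits[i])]


# Hypothetical learned logits that favour XOR.
logits = [0.2, -1.0, 2.5, 0.0]
chosen = hard_gate(logits)
```

With these logits the hard gate is XOR, but the soft gate still mixes in the other candidates, so its output on inputs (1, 0) is slightly below 1. Small mismatches of this kind, accumulated over many gates and time steps, are one plausible source of a drop after discretisation.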
2025
Why Can't RNNs Learn Math? Automata-Inspired RNNs for Exact Computation
Preprint
RNNs are Turing-complete in theory but usually fail at exact arithmetic in practice. This work compiles p-stack automata into trainable RNNs, using stack splitting so the state grows linearly instead of exponentially, and a five-layer Clipped-ReLU network that reproduces the automaton exactly. It then looks at why training is so hard: the loss has a narrow V-shaped basin around the optimum that gradient descent tends to miss. Initialising close to that basin raises accuracy from 71.7% to about 90% across 13 arithmetic and bitwise operations.
- PDF
- Code
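To see why clipped-ReLU units can reproduce discrete computation exactly, consider binary addition with a carry as the recurrent state. The sketch below is not the paper's five-layer construction or its stack-automaton compilation. It is a minimal, assumed example in which each step uses only a sum of inputs and one clipped-ReLU unit, and the helper names are hypothetical.

```python
def clipped_relu(x):
    """Clipped ReLU: identity on [0, 1], saturating at 0 and 1."""
    return min(max(x, 0), 1)


def full_adder(a, b, carry_in):
    """One exact addition step on 0/1 inputs using a single clipped unit."""
    total = a + b + carry_in             # linear pre-activation, in {0, 1, 2, 3}
    carry_out = clipped_relu(total - 1)  # 1 exactly when total >= 2
    sum_bit = total - 2 * carry_out      # linear read-out, equals total mod 2
    return sum_bit, carry_out


def add_bits(x_bits, y_bits):
    """Add two equal-length little-endian bit lists; the carry is the state."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]


# 3 + 5 = 8, written little-endian.
result = add_bits([1, 1, 0], [1, 0, 1])
```

On binary inputs the clipped unit lands exactly on 0 or 1, so no rounding error accumulates across steps. Nearby weights generally do not have this property, which is consistent with the narrow basin described above.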
2025
Investigating the Role of Samples in Catastrophic Forgetting
Deep Learning course project, ETH Zürich
A study of how the choice of replayed examples affects catastrophic forgetting in continual learning. It compares three ways of deciding which samples to keep in the replay buffer: confidence-based learning-speed estimation, weighted prioritisation, and sensitivity-aware sampling based on the Memory-Perturbation Equation. On CIFAR-10 with a ResNet-18, all three end up close to the Goldilocks baseline without clearly beating it. The confidence-based scores track Goldilocks well (Pearson r up to 0.83) but use more memory.
- PDF
- Code
2024
ClassFormer: Transformers for Multivariate Time Series Classification
BSc Thesis, ETH Zürich
A Transformer for multivariate time-series classification. It uses continuous wavelet transforms to add frequency information, patch-wise embeddings to keep the sequence length manageable, and a three-stage attention mechanism that attends across time, channels, and frequency. On 18 UEA datasets it is competitive with eight standard baselines and best on several of them, including perfect accuracy on Epilepsy. A learned masking scheme adds a further accuracy gain and makes training more stable.
- PDF
- Code
