Machine Learning

40-64× Faster Than JAX, PyTorch, and TensorFlow

ML frameworks are general-purpose differentiable programming tools. For quantitative finance workloads — Monte Carlo, path-dependent payoffs, iterative calibration — a domain-specific approach is 40-64× faster.

Dmitri Goloubentsev
· 2 min read
JAX PyTorch TensorFlow benchmark AAD Monte Carlo performance
Why ML Frameworks Fail the Quant Finance Test

A common question: why not just use JAX or PyTorch for adjoint differentiation in finance? They support automatic differentiation, they’re well maintained, and they have large communities.

The answer is performance. We benchmarked on a representative quant workload: 50,000 Monte Carlo paths, 500 time steps, with full adjoint Greeks.
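To make the shape of such a workload concrete, here is a minimal pure-Python sketch of one Monte Carlo path with adjoint (reverse-sweep) Greeks. The post does not state the model or payoff used in the benchmark, so everything here is an assumption for illustration: geometric Brownian motion, a European call, the parameter values, and the names `path_value_and_adjoints` and `monte_carlo`.

```python
import math
import random


def path_value_and_adjoints(s0, sigma, r, strike, maturity, zs):
    """Discounted call payoff on one GBM path, with d/ds0 and d/dsigma from a reverse sweep."""
    n = len(zs)
    dt = maturity / n
    sqrt_dt = math.sqrt(dt)
    drift = (r - 0.5 * sigma * sigma) * dt

    # Forward sweep: keep the spot and growth factor of every step for the reverse sweep.
    spots = [s0]
    growth = []
    for z in zs:
        g = math.exp(drift + sigma * sqrt_dt * z)
        growth.append(g)
        spots.append(spots[-1] * g)

    df = math.exp(-r * maturity)
    value = df * max(spots[-1] - strike, 0.0)

    # Reverse sweep: propagate the adjoint of the spot back through the time steps.
    s_bar = df if spots[-1] > strike else 0.0
    sigma_bar = 0.0
    for i in range(n - 1, -1, -1):
        g_bar = s_bar * spots[i]  # adjoint of the growth factor at step i
        sigma_bar += g_bar * growth[i] * (-sigma * dt + sqrt_dt * zs[i])
        s_bar *= growth[i]        # adjoint of the spot entering step i
    return value, s_bar, sigma_bar


def monte_carlo(n_paths, n_steps, s0=100.0, sigma=0.2, r=0.01,
                strike=100.0, maturity=1.0, seed=0):
    """Average price, delta and vega over n_paths simulated paths."""
    rng = random.Random(seed)
    pv = delta = vega = 0.0
    for _ in range(n_paths):
        zs = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
        v, d, g = path_value_and_adjoints(s0, sigma, r, strike, maturity, zs)
        pv += v
        delta += d
        vega += g
    return pv / n_paths, delta / n_paths, vega / n_paths


# Deliberately far smaller than 50,000 x 500: pure Python is only for illustration.
price, delta, vega = monte_carlo(n_paths=2000, n_steps=50)
```

One forward and one reverse sweep per path yield all sensitivities at once, which is what makes adjoint Greeks attractive when the number of risk factors is large.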

Results

| Framework | Time (s) | Relative to AADC |
|---|---|---|
| AADC | 0.166 | 1× (baseline) |
| JAX (XLA JIT) | 6.82 | 41× slower |
| PyTorch | 10.62 | 64× slower |
| TensorFlow | 9.51 | 57× slower |
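The relative factors can be recomputed directly from the reported timings. A short check, using only the times listed above (the dictionary name `times_s` is just for illustration):

```python
# Reported wall-clock times in seconds, copied from the results above.
times_s = {
    "AADC": 0.166,
    "JAX (XLA JIT)": 6.82,
    "PyTorch": 10.62,
    "TensorFlow": 9.51,
}

baseline = times_s["AADC"]

# Ratio of each framework's time to the AADC time.
ratios = {name: t / baseline for name, t in times_s.items()}

for name, ratio in ratios.items():
    print(f"{name:<14} {times_s[name]:>7.3f} s  {ratio:5.1f}x")
```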

With kernel reuse (record once, evaluate many times with different inputs), the gap widens to 64×.

Why the gap exists

ML frameworks are designed for neural network training: large matrix multiplications, batch normalization, convolutions. Their automatic differentiation is optimized for these patterns.

Quantitative finance workloads look different:

  • Path-dependent logic with branches and state (barrier options, callable bonds)
  • Iterative calibration with solver loops
  • Scalar operations accumulated over thousands of time steps
  • Double precision throughout (ML frameworks default to float32)
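A small sketch of what such logic looks like for a single path: an up-and-out barrier call with a scalar update per time step, a data-dependent branch, and state carried across steps. The dynamics (geometric Brownian motion), discrete barrier monitoring, and the function name `up_and_out_call` are assumptions made for illustration, not details taken from the benchmark.

```python
import math


def up_and_out_call(s0, strike, barrier, sigma, r, maturity, zs):
    """Discounted payoff of an up-and-out call on one discretely monitored GBM path."""
    dt = maturity / len(zs)
    drift = (r - 0.5 * sigma * sigma) * dt
    vol = sigma * math.sqrt(dt)

    spot = s0
    alive = True  # path-dependent state carried across time steps
    for z in zs:
        spot *= math.exp(drift + vol * z)  # scalar update, one step at a time
        if spot >= barrier:                # data-dependent branch
            alive = False
            break

    if not alive:
        return 0.0
    return math.exp(-r * maturity) * max(spot - strike, 0.0)


# Same terminal spot, different histories: only the second path touches the barrier.
quiet = up_and_out_call(100.0, 90.0, 120.0, 0.2, 0.01, 1.0, [0.0, 0.0, 0.0, 0.0])
spiky = up_and_out_call(100.0, 90.0, 120.0, 0.2, 0.01, 1.0, [3.0, -3.0, 0.0, 0.0])
```

Python floats are IEEE double precision, so the sketch also matches the last bullet; the payoff depends on the whole path rather than on the terminal value alone, which is why it cannot be written as one large matrix operation without extra work.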

The compiled kernel approach records the computation once, JIT-compiles it to native AVX-vectorized machine code, and replays it millions of times. ML frameworks, even with JIT tracing, still route each call through their own runtime and pay dispatch and framework overhead each time.
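The record-once, replay-many idea can be illustrated with a toy tape in pure Python. This is not AADC's API and it compiles nothing: the `Tape` and `Var` classes and the `exp` helper are hypothetical names, and the replay loop is an interpreter standing in for the compiled kernel. It only shows the separation between recording the operations once and evaluating value and gradient many times with new inputs.

```python
import math


class Tape:
    """Toy recording of scalar operations (a stand-in for a compiled kernel)."""

    def __init__(self):
        self.ops = []      # (opcode, output slot, input slots)
        self.n_slots = 0
        self.inputs = []

    def new_slot(self):
        self.n_slots += 1
        return self.n_slots - 1

    def input(self):
        v = Var(self, self.new_slot())
        self.inputs.append(v.slot)
        return v

    def emit(self, op, *args):
        out = self.new_slot()
        self.ops.append((op, out, args))
        return Var(self, out)

    def replay(self, values, output):
        """Forward then reverse sweep over the recorded ops; returns (value, gradient)."""
        val = [0.0] * self.n_slots
        for slot, x in zip(self.inputs, values):
            val[slot] = x
        for op, out, a in self.ops:
            if op == "add":
                val[out] = val[a[0]] + val[a[1]]
            elif op == "mul":
                val[out] = val[a[0]] * val[a[1]]
            elif op == "exp":
                val[out] = math.exp(val[a[0]])
        adj = [0.0] * self.n_slots
        adj[output.slot] = 1.0
        for op, out, a in reversed(self.ops):
            if op == "add":
                adj[a[0]] += adj[out]
                adj[a[1]] += adj[out]
            elif op == "mul":
                adj[a[0]] += adj[out] * val[a[1]]
                adj[a[1]] += adj[out] * val[a[0]]
            elif op == "exp":
                adj[a[0]] += adj[out] * val[out]
        return val[output.slot], [adj[s] for s in self.inputs]


class Var:
    """Handle to a tape slot; arithmetic on it appends ops to the tape."""

    def __init__(self, tape, slot):
        self.tape, self.slot = tape, slot

    def __add__(self, other):
        return self.tape.emit("add", self.slot, other.slot)

    def __mul__(self, other):
        return self.tape.emit("mul", self.slot, other.slot)


def exp(v):
    return v.tape.emit("exp", v.slot)


# Record f(x, y) = exp(x * y) + x once ...
tape = Tape()
x, y = tape.input(), tape.input()
f = exp(x * y) + x

# ... then replay it for as many input sets as needed, without re-tracing.
results = [tape.replay(inputs, f) for inputs in [(1.0, 2.0), (0.5, -1.0)]]
```

In this toy the recorded operations are straight-line code, so the cost of building the tape is paid once and every replay is a plain loop over a fixed list of operations.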

When ML frameworks are the right choice

  • Neural network training (their core strength)
  • Prototyping and exploration (better tooling, visualization)
  • When the team already has PyTorch expertise and the workload is small enough that 40× doesn’t matter

When they’re not

  • Production pricing with more than 1,000 risk factors
  • Real-time risk (latency matters)
  • Monte Carlo with path-dependent payoffs
  • Any workload where you need Greeks through complex financial logic

Benchmark methodology: same hardware, same data, same algorithm. AADC uses a compiled kernel with AVX2 on 8 threads. JAX uses XLA JIT compilation. PyTorch uses torch.compile. TensorFlow uses tf.function with XLA. Implemented using AADC (matlogica.com).
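For readers who want to run a comparison of their own, a minimal timing harness might look like the sketch below. It is not the harness used for the numbers above: the function name `best_time`, the warm-up call that keeps one-off compilation out of the measurement, and the best-of-N policy are all assumptions, since the post does not say how compile time or repetitions were handled.

```python
import time


def best_time(fn, *args, warmup=1, repeats=5):
    """Smallest wall-clock time of fn(*args) over `repeats` runs, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn(*args)  # untimed: absorbs one-off costs such as JIT compilation
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Example with a stand-in workload; replace with the pricing function under test.
elapsed = best_time(sum, range(100_000))
```

With frameworks that dispatch work asynchronously, the timed callable has to wait for the result before returning (in JAX, for example, by calling `block_until_ready()` on the output), otherwise only the dispatch time is measured.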
