Open-source C++ AAD libraries introduce significant overhead when computing Greeks. For Monte Carlo workloads, multi-threaded bump-and-revalue (75 ms) outperforms all tested AAD libraries.
MatLogica AADC provides the optimal solution: AAD speed with native SIMD vectorization and multi-threading support.
Important Finding: For Monte Carlo workloads, open-source C++ AAD libraries can be slower than bump-and-revalue, especially when the inner loop is tight. Multi-threaded bump-and-revalue (75 ms) beats all tested AAD libraries for Greeks computation.
100 trades × 1,000 scenarios × 252 timesteps
Speedup relative to single-threaded optimised bump-and-revalue (986 ms baseline)
Without Greeks computation - all libraries perform similarly
Key insight: For price-only computation, AAD libraries have minimal overhead. The cost comes during the reverse pass for Greeks.
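To make the forward/reverse distinction concrete, here is a minimal sketch of tape-based reverse-mode AD in C++. It is not the implementation of any benchmarked library: `Tape`, `Var`, `make_input`, `compound` and `gradient` are hypothetical names chosen for illustration. The forward pass computes values and appends one entry per operation to the tape; the Greeks only appear once `gradient` walks that tape backwards.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One recorded operation: up to two parents and the local partial
// derivatives of the result with respect to each parent.
struct Node {
    std::size_t parent[2];
    double partial[2];
};

// Hypothetical minimal tape: a growing vector of recorded operations.
struct Tape {
    std::vector<Node> nodes;
    std::size_t push(std::size_t p0, double d0, std::size_t p1, double d1) {
        nodes.push_back(Node{{p0, p1}, {d0, d1}});
        return nodes.size() - 1;
    }
};

// Active variable: a value plus its position on the tape.
struct Var {
    Tape* tape;
    std::size_t index;
    double value;
};

// Register an independent input (it points at itself with zero weight).
inline Var make_input(Tape& tape, double value) {
    const std::size_t i = tape.nodes.size();
    tape.push(i, 0.0, i, 0.0);
    return {&tape, i, value};
}

inline Var operator+(const Var& a, const Var& b) {
    return {a.tape, a.tape->push(a.index, 1.0, b.index, 1.0), a.value + b.value};
}

inline Var operator*(const Var& a, const Var& b) {
    return {a.tape, a.tape->push(a.index, b.value, b.index, a.value), a.value * b.value};
}

inline Var exp(const Var& a) {
    const double e = std::exp(a.value);
    return {a.tape, a.tape->push(a.index, e, a.index, 0.0), e};
}

// Toy "path": repeated multiplication, one tape entry per timestep.
inline Var compound(Var s, const Var& growth, int steps) {
    for (int t = 0; t < steps; ++t) s = s * growth;
    return s;
}

// Reverse pass: walk the tape backwards, pushing adjoints to the parents.
inline std::vector<double> gradient(const Tape& tape, const Var& output) {
    std::vector<double> adjoint(tape.nodes.size(), 0.0);
    adjoint[output.index] = 1.0;
    for (std::size_t i = tape.nodes.size(); i-- > 0;) {
        const Node& n = tape.nodes[i];
        const double a = adjoint[i];
        adjoint[n.parent[0]] += n.partial[0] * a;
        adjoint[n.parent[1]] += n.partial[1] * a;
    }
    return adjoint;
}
```

In this sketch a 252-step `compound` call appends 252 tape entries for a single path, which hints at why tape memory traffic can dominate when the inner loop is tight. Production libraries use far more compact tapes and other optimisations, so treat this only as a picture of the mechanism.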
Why open-source AAD is slow for Monte Carlo
Code changes required for each approach
OpenMP bump-and-revalue scales near-linearly with threads
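As a rough illustration of what OpenMP bump-and-revalue can look like, the sketch below prices an Asian call by Monte Carlo and obtains Delta, Vega and Rho from one-sided bumps. It is not the benchmark code: the arithmetic-average call payoff, the bump sizes, the per-path seeding scheme and the names `Inputs`, `Greeks`, `asian_call_price` and `bump_and_revalue` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Market and contract inputs for one trade (illustrative names).
struct Inputs {
    double spot, strike, rate, vol, maturity;
};

struct Greeks {
    double price, delta, vega, rho;
};

// Monte Carlo price of an arithmetic-average Asian call under GBM.
// Each path seeds its own generator from the path index, so every
// revaluation sees the same random numbers regardless of how OpenMP
// splits the paths (results can differ only by summation rounding).
double asian_call_price(const Inputs& in, int paths, int steps) {
    const double dt = in.maturity / steps;
    const double drift = (in.rate - 0.5 * in.vol * in.vol) * dt;
    const double diffusion = in.vol * std::sqrt(dt);
    double payoff_sum = 0.0;

    #pragma omp parallel for reduction(+ : payoff_sum) schedule(static)
    for (int p = 0; p < paths; ++p) {
        std::mt19937_64 rng(static_cast<std::uint64_t>(p) + 1);
        std::normal_distribution<double> normal(0.0, 1.0);
        double s = in.spot;
        double running_sum = 0.0;
        for (int t = 0; t < steps; ++t) {
            s *= std::exp(drift + diffusion * normal(rng));
            running_sum += s;
        }
        payoff_sum += std::max(running_sum / steps - in.strike, 0.0);
    }
    return std::exp(-in.rate * in.maturity) * payoff_sum / paths;
}

// One-sided bump-and-revalue: one base valuation plus one revaluation
// per bumped input, so three Greeks cost four pricing runs.
Greeks bump_and_revalue(const Inputs& in, int paths, int steps) {
    const double h_spot = 1e-4 * in.spot, h_vol = 1e-4, h_rate = 1e-4;
    Inputs up_spot = in; up_spot.spot += h_spot;
    Inputs up_vol = in;  up_vol.vol += h_vol;
    Inputs up_rate = in; up_rate.rate += h_rate;
    const double base = asian_call_price(in, paths, steps);
    return {base,
            (asian_call_price(up_spot, paths, steps) - base) / h_spot,
            (asian_call_price(up_vol, paths, steps) - base) / h_vol,
            (asian_call_price(up_rate, paths, steps) - base) / h_rate};
}
```

Compile with `-fopenmp` to enable the parallel loop; without it the pragma is ignored and the code runs serially. Because paths are independent, the only shared state is the reduction variable, which is the property that lets this approach scale with thread count. Seeding a Mersenne Twister per path is simple but not cheap, and a tuned implementation would likely use a different random number strategy.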
When AAD Beats Bump-and-Revalue:
For standard Delta/Rho/Vega (3 Greeks), multi-threaded bump-and-revalue is often faster.
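A back-of-envelope cost model helps explain this. The sketch below measures wall-clock cost in units of one single-threaded pricing run and assumes ideal thread scaling. The function names are hypothetical, and `overhead` (the cost of recording plus the reverse pass relative to one plain pricing run) is an assumed parameter, not a number measured in this benchmark.

```cpp
#include <cassert>

// Cost of one-sided bumping: a base run plus one run per bumped input,
// spread over `threads` threads (ideal scaling assumed).
inline double bump_cost(int inputs, int threads) {
    return static_cast<double>(inputs + 1) / threads;
}

// Cost of AAD: `overhead` is a hypothetical multiplier covering the
// recording pass and the reverse pass, independent of the input count.
inline double aad_cost(double overhead, int threads) {
    return overhead / threads;
}

// Smallest number of inputs at which AAD becomes cheaper than one-sided
// bumping when both run on the same number of threads.
inline int break_even_inputs(double overhead) {
    int n = 1;
    while (n + 1 <= overhead) ++n;
    return n;
}
```

Under these assumptions, three Greeks on 16 threads cost `bump_cost(3, 16) = 0.25` pricing runs, so an AAD library has to keep its total overhead below 4x on the same thread count to win. AAD pulls ahead as the number of sensitivities grows, because its cost does not depend on the input count.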
Strengths and considerations for each AAD library
Best open-source option
Compiled library
COIN-OR project
Modern C++17 - Too slow
Avoid for Production:
MatLogica AADC combines AAD correctness with native SIMD vectorization and multi-threading - the best of both worlds for Monte Carlo Greeks.
All benchmarks executed on identical hardware for fair comparison
| Setting | Value |
|---|---|
| CPU | Intel Xeon Platinum 8280L @ 2.70GHz |
| Cores | 28 cores/socket, 2 sockets, 112 threads |
| Compiler | GCC 12.2.0 with -O3 -std=c++17 |
| OS | Linux 6.1.0-13-amd64 (Debian) |
| Trades | 100 |
| Scenarios | 1,000 |
| Timesteps | 252 |
| Threads | 1 (baseline), 16 (parallel) |
Asian Option Monte Carlo with GBM dynamics
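For reference, the model can be written out as follows. This is a sketch under assumptions: the page says only "Asian Option" with GBM dynamics, so the arithmetic average and the call payoff are assumed here, as are the symbols $S$ (spot), $K$ (strike), $r$ (rate), $\sigma$ (volatility), $T$ (maturity) and $N = 252$ timesteps with $\Delta t = T/N$.

```latex
% GBM step under the risk-neutral measure (exact log-normal update)
S_{t_{i+1}} = S_{t_i} \exp\!\left( \left(r - \tfrac{1}{2}\sigma^2\right)\Delta t
              + \sigma \sqrt{\Delta t}\, Z_i \right),
\qquad Z_i \sim \mathcal{N}(0, 1)

% Discounted expected payoff of an arithmetic-average Asian call (assumed payoff)
V = e^{-rT}\, \mathbb{E}\!\left[ \max\!\left( \frac{1}{N} \sum_{i=1}^{N} S_{t_i} - K,\; 0 \right) \right]
```

The Monte Carlo estimate replaces the expectation with an average over the 1,000 scenarios, and Delta, Vega and Rho are the derivatives of $V$ with respect to $S_{t_0}$, $\sigma$ and $r$.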