What is AADC?
AADC (Automatic Adjoint Differentiation Compiler) is a just-in-time compiler that transforms mathematical computations into highly optimized machine code. While automatic differentiation is a core capability, AADC provides two distinct benefits:
Dual Purpose: Acceleration and Differentiation
Performance Acceleration: AADC can significantly speed up repeated mathematical computations, even when derivatives aren’t needed. It compiles your calculations once and generates optimized binary code that leverages modern CPU features like SIMD vectorization (AVX2/AVX512) and multi-threading. This can provide substantial speedups for computationally intensive code.
Automatic Differentiation: When derivatives are required, AADC computes both the original function and all its partial derivatives in a single execution pass. The reverse-mode automatic differentiation, when combined with JIT acceleration, often computes gradients faster than the original code computes just the function value.
How AADC Achieves “Faster than Original Code” Performance
Traditional automatic differentiation theory suggests that computing derivatives requires at least as much work as computing the original function, with reverse-mode AD typically costing 2-4x the function evaluation time. AADC sidesteps this barrier in practice: the theoretical ratio still holds relative to the compiled function, but because the compiled kernel runs far faster than the original code, the combined forward-and-reverse pass can beat the original function evaluation outright. Several key innovations make this possible:
JIT Compilation Acceleration
When AADC records your computation, it doesn’t just build a computational graph—it compiles optimized machine code. This compiled version often runs significantly faster than the original code because:
- Eliminated overhead: Function call overhead, loop overhead, and conditional checks are minimized
- Optimized instruction sequences: Mathematical operations are compiled into efficient CPU instruction patterns
- Memory access optimization: Data layout and access patterns are optimized for cache efficiency and CPU register use
SIMD Vectorization
AADC automatically vectorizes operations to process multiple values simultaneously:
- AVX2: Processes 4 double-precision values in parallel
- AVX512: Processes 8 double-precision values in parallel
- Minimal code changes required: Existing code can stay written as scalar operations and automatically benefits from vectorization when compiled
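The effect can be pictured with a NumPy analogy (AADC itself compiles native AVX kernels; this snippet only illustrates the idea of batch evaluation): a payoff written once in scalar style is applied to eight values at a time, the width of one AVX512 lane group.

```python
import numpy as np

# A payoff written once, in scalar style...
def payoff(s, k):
    return np.maximum(s - k, 0.0)

# ...evaluated on a batch of 8 spot values at once, mirroring how a
# compiled AVX512 kernel processes 8 doubles per instruction.
spots = np.array([90.0, 95.0, 100.0, 105.0, 110.0, 115.0, 120.0, 125.0])
values = payoff(spots, 100.0)  # all 8 payoffs computed in one vectorized call
```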
Combined Forward and Reverse Efficiency
During the reverse pass, AADC:
- Reuses forward computations: Intermediate values computed during the forward pass are efficiently reused
- Optimized adjoint propagation: The reverse sweep is compiled to minimize memory access and maximize instruction throughput
- Eliminates redundant operations: Common subexpressions and redundant calculations are eliminated
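A stripped-down reverse-mode sketch in plain Python (illustrative only, not AADC's machinery) shows the pattern described above: the forward pass stores intermediate values, and the reverse sweep reuses them instead of recomputing.

```python
import math

# Minimal reverse-mode sketch for f(x, y) = sin(x) * y.
# The forward pass keeps intermediates; the reverse sweep reuses them.
def f_with_grad(x, y):
    s = math.sin(x)        # forward: intermediate kept for reuse
    v = s * y              # forward: function value
    # reverse sweep: seed the output adjoint and propagate backwards
    dv = 1.0
    ds = dv * y            # adjoint of s
    dy = dv * s            # reuses the forward intermediate s
    dx = ds * math.cos(x)  # chain rule through sin(x)
    return v, dx, dy
```

Value and all partial derivatives come out of a single forward-plus-reverse pass, which is what makes the per-derivative cost largely independent of the number of inputs.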
Practical Example
Consider a Monte Carlo option pricing calculation:
- Original code (1,000 paths): 100 ms
- AADC forward only: 15 ms (6.7x speedup)
- AADC forward + all gradients: 20 ms (5x speedup vs. original)
In this example, computing the function value plus all derivatives with AADC takes only 20% of the time required by the original code to compute just the function value.
When AADC Works Well
AADC is particularly effective for computational patterns common in quantitative finance and scientific computing, which differ significantly from machine learning workloads:
AADC’s Sweet Spot: Computational graphs with very large numbers of “light” scalar operations (millions of simple additions, multiplications, and function calls). Machine learning frameworks are not designed for this pattern and perform poorly on such fine-grained operations.
Machine Learning Tools’ Sweet Spot: Small numbers of “heavy” tensor operations (matrix multiplications, convolutions on large arrays). These frameworks excel at coarse-grained operations but carry significant per-operation overhead for scalar computations.
Specific Applications Where AADC Excels:
- Quantitative finance models: Option pricing, risk calculations, portfolio optimization
- Gradient-based optimization: Machine-learning-style training and calibration when models are built from fine-grained scalar operations rather than large tensor kernels
- Scientific computing: Numerical optimization, parameter estimation
- Monte Carlo simulations: Especially those with continuous mathematical operations
- Analytical functions: Problems dominated by continuous mathematical expressions
- Legacy code modernization: Improving and accelerating complex legacy analytics code, with AADC successfully applied to multi-million line of code projects
These domains typically involve smooth, differentiable functions with relatively straightforward control flow.
When AADC Is Not Suitable
AADC has limitations and isn’t appropriate for all computational problems:
- Heavily branching algorithms: Code with extensive conditional logic based on computed values
- Discrete state models: Algorithms that rely on complex discrete decision trees
- String processing or symbolic computation: Non-numerical operations
- Database operations or I/O intensive tasks: Operations outside mathematical computation
The key limitation is that AADC requires the computational graph to be relatively continuous and predictable. Applications with significant algorithmic branching or discrete state transitions may still benefit from AAD capabilities for parameter calibration and sensitivity analysis, but often require careful analytical modifications such as smoothing techniques to transform discrete problems into continuous ones.
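As a sketch of such a smoothing technique (illustrative Python, not AADC code): a hard digital payoff 1[s > k] has a zero derivative almost everywhere, which is useless for sensitivities, but it can be replaced by a sigmoid of width eps that recovers the hard branch as eps shrinks while staying differentiable.

```python
import math

# Hard digital payoff: discontinuous at s == k, derivative is zero
# almost everywhere, so sensitivities are uninformative.
def digital(s, k):
    return 1.0 if s > k else 0.0

# Common smoothing: replace the step with a sigmoid of width eps.
# Smaller eps approximates the hard branch more closely but steepens
# the gradient; the choice is a bias/variance trade-off.
def digital_smoothed(s, k, eps=1.0):
    return 1.0 / (1.0 + math.exp(-(s - k) / eps))
```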
Language Support
AADC currently supports three programming languages:
- C++: Full-featured implementation with complete access to all capabilities
- Python: Native integration allowing seamless use within Python workflows
- C#: .NET-compatible implementation for enterprise and Windows environments
In addition, mixed C++/Python projects are supported: user analytics can be implemented across both languages, with AADC recording computations seamlessly regardless of which language defines each component.
Each language binding provides appropriate interfaces while maintaining the same underlying compilation and optimization engine.
Technical Approach
Unlike traditional automatic differentiation tools that build computational tapes, AADC directly compiles native binary kernels optimized for your specific hardware and specific run-time problem definition. Using available information at runtime significantly boosts performance by essentially forming a custom program for the specific task configuration. This approach enables both the performance acceleration and efficient derivative computation, but requires that your mathematical operations can be effectively represented in this compiled form.
The “recording” phase happens once when you define your computation, after which the generated kernels can be executed repeatedly with different input values, making it particularly efficient for scenarios requiring many function evaluations.
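The record-once, execute-many workflow can be sketched with a toy operator-overloading tape in plain Python (all names here are hypothetical illustrations; AADC compiles the recording to native machine code rather than interpreting it):

```python
class Tape:
    """Holds the operations recorded during the single tracing pass."""
    def __init__(self, n_inputs):
        self.n_inputs, self.ops = n_inputs, []

class Var:
    """Traces arithmetic onto the tape via operator overloading."""
    def __init__(self, tape, slot):
        self.tape, self.slot = tape, slot

    def _bin(self, op, other):
        t = self.tape
        out = Var(t, t.n_inputs + len(t.ops))   # next free value slot
        t.ops.append((op, self.slot, other.slot, out.slot))
        return out

    def __add__(self, other): return self._bin('add', other)
    def __mul__(self, other): return self._bin('mul', other)

def record(fn, n_inputs):
    """Run fn once on tracing variables; the tape captures its operations."""
    tape = Tape(n_inputs)
    out = fn(*[Var(tape, i) for i in range(n_inputs)])
    return tape, out.slot

def replay(tape, out_slot, inputs):
    """Re-execute the recorded tape on fresh inputs, without re-tracing."""
    vals = list(inputs) + [0.0] * len(tape.ops)
    for op, a, b, out in tape.ops:
        vals[out] = vals[a] + vals[b] if op == 'add' else vals[a] * vals[b]
    return vals[out_slot]

# Record f(x, y) = x*y + x once, then replay with different inputs.
tape, out = record(lambda x, y: x * y + x, 2)
```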
Performance Expectations
While AADC can provide dramatic speedups for suitable problems, performance gains depend on several factors:
- Mathematical complexity: More complex functions benefit more from compilation
- Repetition count: Benefits increase with more function evaluations
- Hardware features: Modern CPUs with AVX512 see larger gains
- Code structure: AADC can transform complex object-oriented code with irregular memory access patterns into highly cache-efficient compiled kernels
- Function continuity: AADC handles two types of branches differently: (1) “stochastic” branches that depend on function inputs require analytical changes to linearize computations, bringing both differentiability and performance gains; (2) “static” branches that depend on problem configuration are effectively eliminated during the recording stage
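The distinction between the two branch types can be illustrated in plain Python (toy code, not AADC's API): a configuration flag is an ordinary value at recording time, so only the taken side ever enters the recording, while an input-dependent condition must be rewritten as a smooth, branch-free blend.

```python
import math

# Static branch: `is_call` is fixed configuration at recording time,
# so the untaken side simply never appears in the recorded computation.
def make_payoff(is_call):
    if is_call:
        return lambda s, k: s - k
    return lambda s, k: k - s

# Stochastic branch: the condition depends on a recorded input, so it is
# replaced by a smooth 0/1 weight blending the two sides.
def smooth_select(cond_value, if_true, if_false, eps=1.0):
    w = 1.0 / (1.0 + math.exp(-cond_value / eps))  # smooth stand-in for 1[cond > 0]
    return w * if_true + (1.0 - w) * if_false
```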
For typical quantitative applications, speedups of 10-100x are common, with the “faster than primal” differentiation providing additional value for optimization and sensitivity analysis workflows.