5-Minute Quick Start

Get up and running with AADC in minutes by building a simple program that records a mathematical expression and JIT (just-in-time) compiles it into an efficient kernel. The kernel processes multiple inputs (scenarios) at once and computes derivatives.

What You’ll Learn

By the end of this guide, you’ll have:

Created a working AADC program that computes f = exp(x/y + z) at different input values
Automatically calculated its partial derivatives ∂f/∂x, ∂f/∂y, and ∂f/∂z
Understood AADC’s vectorized computation capabilities
Verified your setup is working correctly

Prerequisites

AADC installed (see Installation Guide)
Basic C++ knowledge
CPU with AVX2 support (Intel Haswell or later, AMD Zen or later)

Complete Working Example

File: aadc-getting-started/Ex1HelloWorld.cpp

#include <cassert>   // assert
#include <cstdint>   // uint64_t
#include <iostream>
#include <memory>    // std::shared_ptr
#include <aadc/aadc.h>

void exampleHelloWorld()
{
    // Step 1: Declare variables using AADC's active type
    idouble x(0.5), y(1.1), z, f;

    // AADC's idouble has the same memory footprint as native double
    assert(sizeof(double) == sizeof(idouble));

    // Step 2: Use idouble as you would native double
    z = x + sin(y);
    std::cout << "z: " << z << std::endl;

    // Step 3: Create AADC kernel object
    typedef __m256d mmType;  // AVX2 vectorization type
    aadc::AADCFunctions<mmType> aadc_kernel;  // Controls recording and stores compiled graph
    
    // Step 4: Start recording the computational graph
    aadc_kernel.startRecording();
    
    // Mark input variables
    aadc::AADCArgument x_arg(x.markAsInput());
    aadc::AADCArgument y_arg(y.markAsInput());
    aadc::AADCArgument z_arg(z.markAsInput());
    
    // Step 5: Execute function with active types to record operations
    f = std::exp(x/y + z);  // AADC traces and records each elementary operation
    
    // Mark output variable
    aadc::AADCResult f_res(f.markAsOutput());
    aadc_kernel.stopRecording();
    
    // Step 6: Create workspace for computations
    std::shared_ptr<aadc::AADCWorkSpace<mmType>> ws(aadc_kernel.createWorkSpace());
    
    // Step 7: Set input values (vectorized - 4 evaluations at once).
    // _mm256_set_pd lists elements from highest to lowest, so element 0 of x is 4.0.
    ws->setVal(x_arg, _mm256_set_pd(1.0, 2.0, 3.0, 4.0));
    ws->setVal(y_arg, 2.0);  // Same value (2.0) used for all 4 vector elements
    ws->setVal(z_arg, _mm256_set_pd(0.1, 0.2, 0.3, 0.4));
    
    // Step 8: Execute forward pass (compute function values)
    aadc_kernel.forward(*ws);
    
    // Verify forward pass results
    std::cout << "\nForward pass results:" << std::endl;
    for (uint64_t i = 0; i < aadc::mmSize<mmType>(); ++i) {
        std::cout << "f(" << ws->valp(x_arg)[i] 
                  << ", " << ws->valp(y_arg)[i]
                  << ", " << ws->valp(z_arg)[i]
                  << ") = " << ws->valp(f_res)[i] << std::endl;
    }
    
    // Step 9: Prepare for reverse pass (gradient computation)
    ws->setDiff(f_res, 1.0);  // Set adjoint seed
    
    // Step 10: Execute reverse pass (compute derivatives)
    aadc_kernel.reverse(*ws);
    
    // Display computed derivatives
    std::cout << "\nComputed derivatives:" << std::endl;
    for (uint64_t i = 0; i < aadc::mmSize<mmType>(); ++i) {
        std::cout << "Point " << i << ": "
                  << "∂f/∂x = " << ws->diffp(x_arg)[i] << ", "
                  << "∂f/∂y = " << ws->diffp(y_arg)[i] << ", "
                  << "∂f/∂z = " << ws->diffp(z_arg)[i] << std::endl;
    }
    
    std::cout << "\n✓ Example completed successfully!" << std::endl;
}

int main() {
    exampleHelloWorld();
    return 0;
}

Build and Run

Linux

cd aadc-getting-started
mkdir build && cd build
cmake ..
make
./Ex1HelloWorld

Windows (Visual Studio)

Open visualstudio/20XX/AADC.sln
Build -> Build Solution
Run the Ex1HelloWorld project

Expected Output

z: 1.39121

Forward pass results:
f(4, 2, 0.4) = 11.0232
f(3, 2, 0.3) = 6.04965
f(2, 2, 0.2) = 3.32012
f(1, 2, 0.1) = 1.82212

Computed derivatives:
Point 0: ∂f/∂x = 5.51159, ∂f/∂y = -11.0232, ∂f/∂z = 11.0232
Point 1: ∂f/∂x = 3.02482, ∂f/∂y = -4.53724, ∂f/∂z = 6.04965
Point 2: ∂f/∂x = 1.66006, ∂f/∂y = -1.66006, ∂f/∂z = 3.32012
Point 3: ∂f/∂x = 0.911059, ∂f/∂y = -0.45553, ∂f/∂z = 1.82212

✓ Example completed successfully!

Step-by-Step Breakdown

Steps 1-2: Variable Declaration and Basic Operations

idouble x(0.5), y(1.1), z, f;
z = x + sin(y);

We declare variables using AADC’s active type idouble instead of native double and use them in ordinary mathematical expressions. AADC captures only operations executed during the recording phase. Because z = x + sin(y) runs before startRecording(), it is evaluated as regular arithmetic and does not become part of the kernel; z enters the kernel later as an input.

Step 3: Create AADC Kernel

typedef __m256d mmType;
aadc::AADCFunctions<mmType> aadc_kernel;

The kernel object manages the entire AADC workflow. The template parameter __m256d specifies AVX2 vectorization, enabling computation of 4 function evaluations simultaneously. For AVX512 systems, you could use __m512d for 8-way vectorization.

Step 4: Recording Phase Setup

aadc_kernel.startRecording();
aadc::AADCArgument x_arg(x.markAsInput());
// ... mark other inputs

Recording captures the computational graph. Variables marked as inputs become the independent variables and can be used for differentiation. Each markAsInput() call returns an argument handle used later to set values and retrieve derivatives.

Step 5: Function Execution and Output Marking

f = std::exp(x/y + z);
aadc::AADCResult f_res(f.markAsOutput());
aadc_kernel.stopRecording();

This is where the actual function is executed and recorded. AADC traces each elementary operation (division, addition, exponential) and builds the computational graph. Marking the output creates a handle for retrieving function values and setting adjoint seeds. After stopRecording(), the kernel is fixed and cannot be modified: it represents a compiled, optimized version of your function.

Step 6: Workspace Creation

std::shared_ptr<aadc::AADCWorkSpace<mmType>> ws(aadc_kernel.createWorkSpace());

The workspace provides memory and execution context for the compiled kernels. Multiple workspaces can share the same kernel, enabling parallel execution across different threads.

Steps 7-8: Forward Pass Execution

ws->setVal(x_arg, _mm256_set_pd(1.0, 2.0, 3.0, 4.0));
aadc_kernel.forward(*ws);

Input values are set per vector element: an __m256d value supplies four distinct inputs, while a scalar such as 2.0 is broadcast to all four elements. The forward pass executes the compiled function kernel, computing f(x,y,z) at all input points simultaneously. Results are stored in the workspace and accessed via the output handle.

Steps 9-10: Reverse Pass for Derivatives

ws->setDiff(f_res, 1.0);
aadc_kernel.reverse(*ws);

The reverse pass implements the adjoint method for computing derivatives. Setting the output adjoint to 1.0 yields ∂f/∂x, ∂f/∂y, and ∂f/∂z in a single reverse pass, for all four input points. With several outputs, the adjoint seeds act as weights: the reverse pass returns the derivatives of the seed-weighted sum of the outputs.

Key Concepts Explained

Active Types

idouble replaces double in code you want to differentiate (can be done globally for quick integration)
Same memory footprint and performance as native types
Automatically tracks operations for derivatives and valuation acceleration

Recording Phase

Call startRecording() to begin capturing operations
Mark variables as inputs using markAsInput()
Execute your function - AADC traces all operations on active types
Mark variables as outputs using markAsOutput()
Call stopRecording() to finish and compile the computational graph

Only operations executed between startRecording() and stopRecording() are captured for valuation and differentiation.

Execution Phase

Forward pass: Computes function values at specified input points
Reverse pass: Computes derivatives using adjoint (reverse-mode) automatic differentiation
Uses efficient vectorized operations (4 evaluations simultaneously with AVX2)

Vectorization

AADC evaluates your function at multiple points simultaneously:

Single function definition → multiple parallel evaluations
One kernel call covers 4 evaluations with AVX2 (8 with AVX512), which speeds up repeated calculations
Ideal for Monte Carlo simulations and optimization

Troubleshooting

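Build errors: Check that your compiler version is the correct one for your AADC release and that the build can find the AADC headers and libraries.

Runtime errors: An illegal-instruction crash typically means the CPU lacks AVX2 support. Otherwise, verify that the example matches your AADC version exactly.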

Performance issues: Use Release build configuration for optimal performance.