AADCWorkSpace

The AADCWorkSpace class provides execution context and memory management for compiled AADC kernels. Each workspace maintains input values, intermediate variables, output results, and adjoint values required for automatic differentiation.

Overview

AADCWorkSpace is a templated class defined in aadc/aadc.h that manages:

Input/output value storage: Setting kernel inputs and retrieving results
Adjoint input/output storage: Setting adjoint seeds and retrieving derivatives
Memory allocation: Providing vectorized memory for compiled kernels
Execution state: Maintaining state between forward and reverse passes

#include <aadc/aadc.h>

typedef __m256d mmType;  // AVX2 vectorization (4 doubles)
aadc::AADCFunctions<mmType> aadc_kernel;
std::shared_ptr<aadc::AADCWorkSpace<mmType>> workspace = aadc_kernel.createWorkSpace();

Template Parameters

The template parameter must match the vectorization type used in the corresponding AADCFunctions:

Type	Elements	Description
`__m256d`	4 doubles	AVX2 vectorization
`__m512d`	8 doubles	AVX512 vectorization

Memory Layout: Each workspace allocates vectorized memory in which a single logical variable occupies one SIMD vector: 4 adjacent doubles for __m256d, 8 for __m512d. Each element of the vector holds the value for one independent scenario.

Creation and Lifecycle

Workspace Creation

Workspaces are created through the AADCFunctions::createWorkSpace() method:

aadc::AADCFunctions<mmType> aadc_kernel;
// ... recording and compilation ...

std::shared_ptr<aadc::AADCWorkSpace<mmType>> ws = aadc_kernel.createWorkSpace();

Memory Management

Automatic allocation: The workspace allocates the memory it requires
Shared ownership: createWorkSpace() returns a std::shared_ptr, so the workspace is released when its last owner goes out of scope
Thread isolation: Each thread’s workspace maintains independent state

Input Value Management

Setting Input Values

Input values are set using variable handles returned during recording:

// During recording
aadc::AADCArgument spot_arg = spot.markAsInput();
aadc::AADCArgument vol_arg = volatility.markAsInput();

// During execution
// Vectorized: _mm256_set_pd takes elements from highest to lowest, so element 0 is 110.0
workspace->setVal(spot_arg, _mm256_set_pd(95.0, 100.0, 105.0, 110.0));
workspace->setVal(vol_arg, 0.25);  // Broadcast to all vector elements

Value Setting Methods

// Vectorized input (full SIMD width)
AADCWorkSpace& setVal(const AADCArgument& var_arg, const mmType& val);

// Scalar input (broadcast to all elements)
AADCWorkSpace& setVal(const AADCArgument& var_arg, double val);

// Named input using string identifiers
AADCWorkSpace& setVal(const ArgID& var_id, const mmType& val);
AADCWorkSpace& setVal(const ArgID& var_id, double val);

// Scalar-specific inputs
AADCWorkSpace& setVal(const AADCScalarArgument& var_arg, double val);

Vectorized Input Examples

// Setting different values for each vector element
// (_mm256_set_pd lists elements high to low: element 0 = 120.0, element 3 = 90.0)
__m256d spot_prices = _mm256_set_pd(90.0, 100.0, 110.0, 120.0);
workspace->setVal(spot_arg, spot_prices);

// Setting the same value for all elements
workspace->setVal(rate_arg, 0.05);  // All elements = 0.05

// Using named variables (requires string identifiers during recording)
workspace->setVal(aadc::ArgID("underlying_price"), spot_prices);
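
The argument order of _mm256_set_pd is easy to get backwards: the intrinsic takes its arguments from the highest element down to the lowest, so the last argument lands in element 0. The sketch below models this with a plain array rather than the real intrinsic, so it compiles without AVX flags. Lanes4, set_pd_like, broadcast, and lane are hypothetical helper names used only for illustration; they are not part of AADC or <immintrin.h>, and the sketch assumes valp(...)[i] indexes elements in memory order.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one __m256d: four adjacent doubles.
using Lanes4 = std::array<double, 4>;

// Mirrors the argument order of _mm256_set_pd(e3, e2, e1, e0):
// the LAST argument is stored in element 0.
Lanes4 set_pd_like(double e3, double e2, double e1, double e0) {
    return Lanes4{e0, e1, e2, e3};
}

// Mirrors a scalar setVal: the same value is copied to every element.
Lanes4 broadcast(double v) {
    Lanes4 out{};
    out.fill(v);
    return out;
}

// Reads one element the way a double* from valp would be indexed.
double lane(const Lanes4& x, std::size_t i) {
    assert(i < x.size());
    return x[i];
}
```

With the spot prices from the example above, set_pd_like(90.0, 100.0, 110.0, 120.0) puts 120.0 in element 0 and 90.0 in element 3. If memory order reads more naturally, the real intrinsic _mm256_setr_pd takes its arguments from element 0 upward.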

Output Value Access

Retrieving Results

Results are accessed using output handles from recording:

// During recording
aadc::AADCResult price_result = option_price.markAsOutput();

// After forward execution
for (int i = 0; i < aadc::mmSize<mmType>(); ++i) {
    std::cout << "Option price[" << i << "] = " 
              << workspace->valp(price_result)[i] << std::endl;
}

Value Access Methods

// Vector access
mmType& val(const AADCResult& result_arg);
const mmType& val(const AADCResult& result_arg) const;

// Pointer access to underlying array
double* valp(const AADCResult& result_arg);
const double* valp(const AADCResult& result_arg) const;

// Named result access
mmType& val(const ResID& var_id);
const double* valp(const ResID& var_id) const;

Adjoint Management

Setting Adjoint Seeds

Before reverse pass execution, set adjoint seeds for output variables:

// Set unit seed for computing first derivatives
workspace->setDiff(price_result, 1.0);  // Broadcast to all elements

// Set vectorized seeds for multiple scenarios
__m256d seeds = _mm256_set_pd(1.0, 2.0, 1.0, 0.5);
workspace->setDiff(price_result, seeds);

// Reset all adjoints (for example, before seeding a new reverse pass)
workspace->resetDiff();

Retrieving Derivatives

After reverse pass execution, access computed derivatives:

// Access derivatives for input variables
for (int i = 0; i < aadc::mmSize<mmType>(); ++i) {
    std::cout << "Delta[" << i << "] = " << workspace->diffp(spot_arg)[i] << std::endl;
    std::cout << "Vega[" << i << "] = " << workspace->diffp(vol_arg)[i] << std::endl;
}
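
To make the role of the seed concrete, here is a library-independent sketch of what a forward and reverse pass compute per element for the toy function price = spot * vol. ToyAdjointWorkspace, forward_pass, reverse_pass, and run_seeded are hypothetical names and do not reflect AADC's implementation. The sketch also assumes that input adjoints are accumulated with +=, which is a common convention but an assumption here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // models __m256d
using Lanes = std::array<double, kLanes>;

// Hypothetical toy workspace: values and adjoints for two inputs, one output.
struct ToyAdjointWorkspace {
    Lanes spot{}, vol{}, price{};        // values
    Lanes d_spot{}, d_vol{}, d_price{};  // adjoints

    void resetDiff() {
        d_spot.fill(0.0);
        d_vol.fill(0.0);
        d_price.fill(0.0);
    }
};

// Forward pass: price = spot * vol, element by element.
void forward_pass(ToyAdjointWorkspace& ws) {
    for (std::size_t i = 0; i < kLanes; ++i)
        ws.price[i] = ws.spot[i] * ws.vol[i];
}

// Reverse pass: propagate the output seed back to the inputs.
void reverse_pass(ToyAdjointWorkspace& ws) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        ws.d_spot[i] += ws.d_price[i] * ws.vol[i];   // d price / d spot = vol
        ws.d_vol[i]  += ws.d_price[i] * ws.spot[i];  // d price / d vol  = spot
    }
}

// Runs one seeded evaluation: set inputs, seed the output, forward, reverse.
ToyAdjointWorkspace run_seeded(const Lanes& spot, double vol, const Lanes& seed) {
    ToyAdjointWorkspace ws;
    ws.spot = spot;
    ws.vol.fill(vol);   // scalar input broadcast to all elements
    ws.resetDiff();
    ws.d_price = seed;  // seed the output adjoint
    forward_pass(ws);
    reverse_pass(ws);
    return ws;
}
```

In this model each element is independent: a seed of 1.0 yields the plain derivative for that scenario, while a seed of 2.0 or 0.5 scales that scenario's derivatives by the same factor.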

Adjoint Methods

// Set adjoint values
void setDiff(const AADCResult& result_arg, const mmType& val);
void setDiff(const AADCResult& result_arg, double val);
void setDiff(const AADCArgument& input_arg, const mmType& val);

// Access adjoint values
mmType& diff(const AADCArgument& input_arg);
const mmType& diff(const AADCArgument& input_arg) const;
double* diffp(const AADCArgument& input_arg);

// Reset all adjoints
void resetDiff();

Advanced Features

Named Variable Access

When variables are marked with string identifiers during recording, they can be accessed by name. This is more convenient than keeping handles around, but each access incurs a string lookup, so avoid it in performance-critical loops.

// During recording with names
spot.markAsInput("spot_price");
volatility.markAsInput("volatility");
option_price.markAsOutput("option_value");

// Access by name during execution
workspace->setVal(aadc::ArgID("spot_price"), 100.0);
workspace->setVal(aadc::ArgID("volatility"), 0.25);

double result = workspace->valp(aadc::ResID("option_value"))[0];
double delta = workspace->diffp(aadc::ArgID("spot_price"))[0];

Multi-dimensional Indexing

Variables can be marked with multi-dimensional indices:

// During recording with indices
assets.markAsInput("asset_prices", 0);  // Asset index
corr_matrix.markAsInput("correlations", i, j);  // Matrix indices

// Access with matching indices
workspace->setVal(aadc::ArgID("asset_prices", 0), 100.0);
workspace->setVal(aadc::ArgID("correlations", 1, 2), 0.75);

Array Input Operations

For setting multiple related inputs efficiently:

template<typename IdxIter, typename ValIter>
void setVal(IdxIter indx_begin, ValIter val_begin, ValIter val_end);

// Example usage
std::vector<aadc::AADCArgument> asset_args = {asset1_arg, asset2_arg, asset3_arg};
std::vector<double> asset_prices = {95.0, 100.0, 105.0};
workspace->setVal(asset_args.begin(), asset_prices.begin(), asset_prices.end());
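
The iterator overload takes no end iterator for the index range, so the value range presumably drives the loop and the caller must supply at least as many handles as values. The sketch below shows that reading of the signature on a toy container. ToyIndexedWorkspace and its slots member are hypothetical, plain integer indices stand in for AADCArgument handles, and the real overload may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy workspace: one scalar slot per input index.
struct ToyIndexedWorkspace {
    std::vector<double> slots;

    explicit ToyIndexedWorkspace(std::size_t n) : slots(n, 0.0) {}

    // Single-input setter.
    void setVal(std::size_t index, double value) {
        assert(index < slots.size());
        slots[index] = value;
    }

    // Same shape as the iterator overload above: the value range drives
    // the loop and the index iterator advances in lockstep with it.
    template <typename IdxIter, typename ValIter>
    void setVal(IdxIter indx_begin, ValIter val_begin, ValIter val_end) {
        for (; val_begin != val_end; ++val_begin, ++indx_begin)
            setVal(*indx_begin, *val_begin);
    }
};
```

Under this reading, the i-th value is written to the input named by the i-th handle, and inputs that are not listed keep their previous values.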

Memory Layout and Performance

Vectorized Memory Structure

Each variable in the workspace occupies vectorized memory:

// For AVX2 (__m256d), each variable stores 4 doubles
aadc::AADCArgument spot_arg;  // in practice, the handle returned by spot.markAsInput() during recording
double* spot_values = workspace->valp(spot_arg);
// spot_values[0], spot_values[1], spot_values[2], spot_values[3]
// represent 4 different scenarios processed simultaneously

Memory Requirements

Workspace memory usage includes:

Value storage: work_array_size × sizeof(mmType) for function values
Adjoint storage: work_array_size × sizeof(mmType) for derivatives
Stack memory: stack_size × sizeof(mmType) for intermediate values
Checkpoint memory: Additional storage for checkpointing (if used)
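
As a rough worked example of the arithmetic above, the three sized components can be summed into a lower bound on workspace memory. The helper name estimate_workspace_bytes and the sample sizes in the note that follows are made up for illustration; how work_array_size and stack_size are obtained for a given kernel is not covered here, and checkpoint memory is excluded.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: lower bound on workspace bytes from the value,
// adjoint, and stack components. Checkpoint memory is not included.
// vector_bytes is sizeof(mmType): 32 for __m256d, 64 for __m512d.
std::size_t estimate_workspace_bytes(std::size_t work_array_size,
                                     std::size_t stack_size,
                                     std::size_t vector_bytes) {
    assert(vector_bytes == 32 || vector_bytes == 64);
    const std::size_t values   = work_array_size * vector_bytes;
    const std::size_t adjoints = work_array_size * vector_bytes;
    const std::size_t stack    = stack_size * vector_bytes;
    return values + adjoints + stack;
}
```

For a made-up kernel with work_array_size = 1000 and stack_size = 200, this gives (2 × 1000 + 200) × 32 = 70,400 bytes per workspace with __m256d and 140,800 bytes with __m512d. Multiply by the number of threads, since each thread holds its own workspace.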

Performance Considerations

Thread locality: Each thread should use its own workspace
Memory access patterns: Sequential access to vectorized data is most efficient
Reuse: Workspaces can be reused across multiple kernel executions
Avoid sharing: Do not share a workspace between threads, and do not use it with a kernel other than the one that created it
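
Reuse pairs naturally with vectorization: a scenario list longer than the SIMD width can be processed in batches through one workspace. The following is a minimal sketch of that batching logic in plain C++, with toy_kernel standing in for a forward execution and a single reused Batch buffer standing in for the workspace. All names are hypothetical, and padding the final batch by repeating the last scenario is one possible choice rather than anything AADC prescribes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kWidth = 4;  // models __m256d
using Batch = std::array<double, kWidth>;

// Hypothetical stand-in for one forward kernel execution: squares each element.
Batch toy_kernel(const Batch& in) {
    Batch out{};
    for (std::size_t i = 0; i < kWidth; ++i) out[i] = in[i] * in[i];
    return out;
}

// Evaluates all scenarios by reusing one input buffer across batches.
// The final batch is padded by repeating its last scenario; padded
// elements are computed but never copied into the result.
std::vector<double> run_batched(const std::vector<double>& scenarios,
                                std::size_t* executions = nullptr) {
    std::vector<double> results(scenarios.size());
    Batch input{};  // reused for every batch, like a reused workspace
    std::size_t count = 0;
    for (std::size_t base = 0; base < scenarios.size(); base += kWidth) {
        const std::size_t n = std::min(kWidth, scenarios.size() - base);
        for (std::size_t i = 0; i < kWidth; ++i)
            input[i] = scenarios[base + std::min(i, n - 1)];
        const Batch out = toy_kernel(input);
        std::copy(out.begin(), out.begin() + n, results.begin() + base);
        ++count;
    }
    if (executions) *executions = count;
    return results;
}
```

Ten scenarios at width 4 take three executions, with two padded elements in the last batch. Padding with a valid scenario, rather than leaving stale or zero values, keeps the unused elements from triggering spurious floating-point problems in kernels that divide or take logarithms.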

Thread Safety

Safe Usage Patterns

// Shared kernel (thread-safe after compilation)
aadc::AADCFunctions<mmType> shared_kernel = compile_pricing_kernel();

void thread_worker(int thread_id) {
    // Each thread creates its own workspace
    auto workspace = shared_kernel.createWorkSpace();
    
    // Thread-safe execution
    set_thread_inputs(*workspace, thread_id);
    shared_kernel.forward(*workspace);
    // Seed the output adjoints with workspace->setDiff(...) before the reverse pass
    shared_kernel.reverse(*workspace);
    process_thread_results(*workspace, thread_id);
}

Thread Safety Rules

Workspace isolation: Don’t share workspace objects between threads
Kernel sharing: Compiled kernels can be safely shared between threads
No synchronization needed: Operations on a workspace that is used by only one thread require no locking