Hull-White · CSA Collateral · CVA/DVA · MatLogica Benchmark

XVA Sensitivities
AADC vs Bump & Revalue

Benchmark on interest rate swap portfolios under a Hull-White model with CSA collateral rules. AADC computes all 535 CVA/DVA sensitivities in the time it takes bump & revalue to compute fewer than 5.

3,300×
Peak Speedup
535
Risk Factors (O(1))
$11.7M
Annual Cloud Savings
<1e-14
Sensitivity Accuracy
CPU with AVX-256 support · 16 threads · Hull-White 1-factor · 30yr horizon · 122 pricing times

Benchmark Configuration

Parameter Value
Model Hull-White 1-factor with mean reversion
Instruments Interest rate swaps (fixed + floating legs)
Collateral CSA rules with threshold and minimum transfer amounts
Time horizon 30 years (10,950 days)
Model step 30 days (365 steps)
Pricing frequency Every 3rd model step (122 pricing times)
Risk factors 535 (r0, sigma, 251 MR curve, 141+141 survival curves)
Hardware CPU with AVX-256 support, 16 threads

Note: This is an isolated benchmark example using a single-factor Hull-White model with interest rate swaps. Production XVA libraries typically involve additional complexity — multi-factor models, exotic instruments, cross-currency exposures, MVA/KVA, and regulatory constraints — all of which are well understood by MatLogica. Production deployments also benefit from additional optimisations (kernel partitioning, incremental recompilation, warm caching across trading desks) not exercised here. As a result, real-world speedups vary by implementation, but remain comfortably above 10× even for the most complex portfolios.

Performance: AADC vs Bump & Revalue

Full sensitivity set — 535 risk factors, all computed simultaneously

BUMP & REVALUE

Finite Differences

O(N × M)

N+1 full MC re-simulations for N risk factors. Each sensitivity requires a complete Monte Carlo reprice.

AADC

Adjoint Algorithmic Differentiation

O(M)

1 forward + 1 reverse sweep per MC path. All 535 gradients computed simultaneously — cost independent of N.

Head-to-Head: Wall-Clock Time (535 Sensitivities)

Log scale — includes all CVA + DVA Greeks

AADC Timing Breakdown

Phase 5 Trades 50 Trades 200 Trades
Kernel compilation (one-time) 0.77s 1.87s 8.10s
Evaluation (prices + 535 Greeks) 0.024s ~0.20s ~0.88s
Relative perf vs primal 18.75% 0.78% 0.70%

Relative performance = (AADC forward + 2 reverse sweeps) / (single double forward pass). Values below 100% mean the full gradient computation costs less than a single repricing.

Bump & Revalue Timing

Portfolio Primal (1 run) 535 Bumps
5 trades 0.129s ~69s
50 trades 25.8s ~3.8 hr
200 trades 125.0s ~18.6 hr

Total time = (N + 1) × single MC time. Cost grows linearly with the number of risk factors.

Scaling: Cost Ratio vs Number of Risk Factors

Bump & revalue cost grows linearly — AADC stays constant at ~2× primal

Warm run: When only market data changes (rate curves, survival probabilities), AADC reuses the compiled kernel — 0s compilation overhead. For 50 trades at 10K paths: cold start 18.2s → warm run 13.9s (1.3× faster).

Cloud Cost: 500,000 Trades

The O(N) vs O(1) scaling difference at production scale

Assumptions

Portfolio 500,000 IR swaps
MC Paths 50,000
Risk Factors 535 (full set)
Batch Window 4 hours
Instance c5.9xlarge (36 vCPU)
Spot Pricing $0.55/hr
Bump & Revalue
~21,100
Nodes for 4hr window
vCPU-hours ~3,040,000
Per nightly run (spot) ~$46,400
Annual (252 days, spot) ~$11.7M
vs
AADC
7
Nodes for 4hr window
vCPU-hours ~910
Per nightly run (spot) ~$19
Annual (252 days, spot) ~$4.8K
~$11.7M
Annual cloud savings (spot pricing)

AADC's full-year cloud spend is less than the cost of a single bump & revalue run.

Cost Scaling with Risk Factor Count

Risk Factors (N) B&R Annual (spot) AADC Annual (spot) B&R / AADC
50 ~$1.1M ~$4.8K 229×
535 ~$11.7M ~$4.8K 2,438×
2,000 ~$43.7M ~$4.8K 9,104×

Estimates assume linear scaling of B&R compute with risk factor count. AADC evaluation cost is independent of N. Infrastructure overhead excluded. Actual costs vary with instance selection, region, and reserved pricing. GPU instances can reduce B&R wall-clock time but do not change the O(N) scaling.

Accuracy Validation

AADC produces mathematically exact derivatives — to machine precision

AADC

< 1e-14 Error vs primal
  • Exact derivatives via chain rule
  • No bump-size parameter to tune
  • No subtractive cancellation
  • Machine-precision accuracy

Bump & Revalue

~1e-6 Typical accuracy
  • Approximation via finite differences
  • Accuracy depends on bump size (1e-8)
  • Subtractive cancellation in floating-point
  • Matches AADC to ~6 digits

Sensitivity Agreement

Metric AADC Bump & Revalue (1e-8)
CVA match to primal Exact (< 1e-14) N/A
DVA match to primal Exact (< 1e-14) N/A
dCVA/d(sigma) agreement Reference ~6 digits
dCVA/d(r0) agreement Reference ~6 digits

Workflow Scenarios

A

End-of-Day Full Risk Report

Compute CVA/DVA + all 535 sensitivities for the entire portfolio.

B&R 536× primal cost
AADC ~2× primal + compilation
B

Intraday Market Data Update

Rate curves or survival probabilities change — re-compute CVA + sensitivities.

B&R 536 MC simulations
AADC Kernel reused (0s compilation)
C

New Trade Added

A new trade of a type already in the portfolio is added.

B&R Full re-computation
AADC Recompile + evaluation

Integration: Drop-in Replacement

aadc::idouble replaces double directly — no templates required

BEFORE Standard C++ (double)
double rate = 0.05;
double price = notional * exp(-rate * T);
double cva = max(price, 0.0) * (1.0 - survival_prob);
AFTER AADC drop-in (idouble)
aadc::idouble rate = 0.05;
aadc::idouble price = notional * exp(-rate * T);
aadc::idouble cva = max(price, 0.0) * (1.0 - survival_prob);
1

Replace types

double → aadc::idouble in the valuation path

2

Record tape

startRecording() / stopRecording() around computation

3

Mark I/O

markAsDiff() for inputs, markAsOutput() for outputs

4

Compute Greeks

forward() + reverse() → all sensitivities

No refactoring needed. aadc::idouble overloads all arithmetic operators and math functions. The template approach (XVAProblem<T>) used in this benchmark is an engineering convenience for maintaining both code paths side-by-side — it is not an AADC requirement.

Code Examples from the Benchmark

Full source from the benchmark — same business logic runs with double or aadc::idouble

IDENTICAL XVAProblem<T> — same code for double and aadc::idouble
// XVAProblem.h — same code for both double and aadc::idouble

template<typename T>
class XVAProblem {
public:
    void simulatePath(const std::vector<T>& random_vec) {
        m_model->initT0(m_model_times[0]);
        HullWhiteIrMarket<T> market = m_model->getIrMarket();
        m_CSA_object->initCollateral();

        int price_time_index = 0;
        for (size_t i = 0; i < m_model_times.size(); i++) {
            if (i > 0) m_model->nextQT(random_vec[i], m_model_times[i]);
            if (m_is_pricing[i]) {
                double t = m_model_times[i];
                market = m_model->getIrMarket();
                m_total_price = 0;
                for (int ti = 0; ti < numTrades; ++ti) {
                    m_total_price += m_fixed_leg[ti]->getPrice(market, t);
                    m_total_price += m_float_leg[ti]->getPrice(market, t);
                }
                m_CSA_object->next(m_total_price);
                T CSA_total_price = m_total_price - m_CSA_object->getCollateral();

                m_PEE[price_time_index] += max(CSA_total_price, 0.);
                m_NEE[price_time_index] += min(0., CSA_total_price);
                ++price_time_index;
            }
        }
    }

    void computeXVAMeasures() {
        // Trapezoidal rule for CVA/DVA integration over the pricing grid.
        // ctrparty_* / company_* are survival probabilities at adjacent
        // pricing times, read from the two survival curves (setup elided).
        for (int i = 0; i < i_end - 1; i++) {
            m_CVA += (m_PEE[i] + m_PEE[i+1]) * (ctrparty_prev - ctrparty_next) * 0.5;
            m_DVA += (m_NEE[i] + m_NEE[i+1]) * (company_prev - company_next) * 0.5;
        }
    }
};

The template parameter T is the only difference. Supporting classes (HullWhiteMarketModel, CSARules, FixedLeg, FloatLeg) are all templated on T.

O(N × M) bumpAndRevalue() — 536 full MC re-simulations
// XVAJobRequest.h — bumpAndRevalue()

void bumpAndRevalue(const json& request_data) {
    double bump_size = 1e-8;

    // For EACH mean reversion curve point (251 points):
    for (int i = 0; i < xva.getModel()->getMeanRev()->getVals().size(); i++) {
        json bump_data(request_data);
        bump_data["Currencies"]["EUR"]["HWMeanReversionCurve"]["bump_index"] = i;
        bump_data["Currencies"]["EUR"]["HWMeanReversionCurve"]["bump_size"]  = bump_size;

        XVAProblem<double> bumped_xva;
        bumped_xva.initData(bump_data);  // Re-initialize with bumped parameter

        // Run FULL Monte Carlo simulation again
        calcRiskByBump(risk_results, base_results, bumped_xva,
                       "MeanRev", i, bump_size, mc_iterations, randoms);
    }

    // For EACH survival curve point (141 + 141 points):
    for (int i = 0; i < xva.getCtrpSurvCurv()->getZeroRatesVector().size(); i++) {
        json bump_data(request_data);
        bump_data["CounterPartySurvivalCurve"]["bump_index"] = i;
        bump_data["CounterPartySurvivalCurve"]["bump_size"]  = bump_size;
        // ... full MC re-simulation ...
    }
    // ... same for company survival, sigma, r0 ...
}

// Each bump requires the full MC loop:
void calcRiskByBump(json& risk_results, const json& base_results,
                    XVAProblem<double>& XVA, ...) {
    XVA.prepareSimulations();
    for (int mc_i = 0; mc_i < mc_iterations; mc_i++) {
        XVA.simulatePath(randoms[mc_i]);  // Full path simulation
    }
    XVA.computeXVAMeasures();
    // Finite difference: (bumped_CVA - base_CVA) / bump_size
    risk_results["CVA"][risk_id][risk_index] =
        (XVA.getCVA() / mc_iterations - base_results["CVA"].get<double>()) / bump_size;
}

Total work: (535 + 1) full MC simulations = 536× the cost of a single pricing run. Each sensitivity is independent but requires a complete reprice.

COMPILE ONCE compileAADFunction() — record tape + mark all 535 risk factors
// XVAJobRequest.h — compileAADFunction()

void compileAADFunction(const json& request_data) {
    // Random variables as input (no derivatives needed for these)
    std::vector<idouble> aad_random_vec(XVA.numberOfRandomVars());

    m_aad_funcs->startRecording();

        // Mark random inputs
        markVectorAsInput(m_random_arg, aad_random_vec, false);

        // Mark market data as variable inputs (can change without recompilation)
        markVariableInputs(getArgumentsMap(), true);

        // Create the XVA problem with aadc::idouble — same code as double version
        XVAProblem<idouble> aad_XVA;
        aad_XVA.initData(m_data);

        idouble::CheckPoint();

        // Mark ALL risk factors for differentiation (one-time setup)
        m_xva_diff_args.ir_crvs[0].r0    = aad_XVA.getModel()->getR0().markAsDiff();
        m_xva_diff_args.ir_crvs[0].sigma  = aad_XVA.getModel()->getSigma().markAsDiff();
        for (int i = 0; i < meanRevSize; i++)
            m_xva_diff_args.ir_crvs[0].mr_crv.push_back(
                aad_XVA.getModel()->getMeanRev()->getVals()[i].markAsDiff());
        for (int i = 0; i < survivalCurveSize; i++)
            m_xva_diff_args.company_surv_crv.default_rates.push_back(
                aad_XVA.getCompSurvCurv()->getZeroRatesVector()[i].markAsDiff());
        // ... counterparty survival curve ...

        // Record ONE forward pass (same simulatePath + computeXVAMeasures)
        aad_XVA.prepareSimulations();
        aad_XVA.simulatePath(aad_random_vec);
        aad_XVA.computeXVAMeasures();

        // Mark outputs
        m_res_args.CVA = aad_XVA.getCVA().markAsOutput();
        m_res_args.DVA = aad_XVA.getDVA().markAsOutput();

    m_aad_funcs->stopRecording();
    // Kernel is now compiled to optimized AVX2/AVX-512 machine code
}

markAsDiff() tells AADC which inputs need gradients. All 535 risk factors are marked once during recording — the compiled kernel computes all their gradients on every reverse sweep.

O(M) aADExecution() — multi-threaded AVX forward + reverse sweeps
// XVAJobRequest.h — aADExecution()

void aADExecution(const json& request_data, int threads_num) {
    auto threadWorker = [&](int th_i) {
        // Each thread gets its own AADC workspace
        auto ws = m_aad_funcs->createWorkSpace();

        // Set market data inputs (can update without recompilation)
        for (auto& [path, arg] : request_variable_inputs)
            ws->val(arg.second) = mmSetConst<mmType>(request_data[path]);

        m_aad_funcs->forward(*ws, 0, 0);  // Initialize once

        for (int mc_i = 0; mc_i < AVX_iterations; mc_i++) {
            // Set random numbers (4 paths at once with AVX-256)
            setAVXVector(*ws, m_random_arg, mm_randoms[mc_i + AVX_iterations * th_i]);

            // Forward sweep: computes prices for 4 MC paths simultaneously
            m_aad_funcs->forward(*ws, 1, -1);

            // Reverse sweep for CVA sensitivities (ALL 535 gradients at once)
            ws->resetDiff();
            ws->diff(m_res_args.CVA) = mmSetConst<mmType>(1);
            m_aad_funcs->reverse(*ws, 1, -1);

            // Reverse sweep for DVA sensitivities (ALL 535 gradients at once)
            ws->resetDiff();
            ws->diff(m_res_args.DVA) = mmSetConst<mmType>(1);
            m_aad_funcs->reverse(*ws, 1, -1);
        }
    };

    // Launch threads — each processes its share of MC paths
    for (int i = 0; i < threads_num; i++)
        threads.push_back(new std::thread(threadWorker, i));
    for (auto& t : threads) t->join();
}

// Kernel caching: zero-cost market data updates
void processRequest(const json& request_data, ...) {
    auto cached = find_in_cache(m_data);
    if (cached != cache.end()) {
        m_aad_funcs = cached->aad_funcs;  // REUSE: ~0ms
    } else {
        compileAADFunction(request_data);  // COMPILE: ~1-8s
        cache.push_back(new_kernel);
    }
    aADExecution(request_data, threads_num);
}

AVX-256 processes 4 MC paths per SIMD instruction. Each thread works independently on its share of paths. Market data updates reuse the cached kernel with zero recompilation overhead.

Summary of Changes Required

Aspect Bump & Revalue AADC
Type change None (native double) double → aadc::idouble
Business logic Unchanged Unchanged
Sensitivity loop 536 full MC re-runs 0 (reverse sweep)
One-time setup None Tape + compile (~1-8s)
Per-path work 1 forward pass 1 fwd + 2 reverse
Parallelism Single-threaded Multi-threaded + AVX
Market data update Re-run everything No recompilation

Conclusion

O(1)

Sensitivity Cost

AADC reverse sweeps — cost independent of N risk factors

360–3,300×

Speedup

Over bump & revalue for 535 parameters

<1e-14

Accuracy

Machine precision vs ~6 digits for finite differences

$11.7M

Annual Savings

Cloud cost reduction at 500K-trade production scale

Head-to-Head Comparison

Dimension Bump & Revalue AADC
Code changes None (native double) double → aadc::idouble + tape recording
Sensitivity cost O(N) full MC re-simulations O(1) reverse sweeps
Speedup (535 params) Baseline 360–3,300×
Accuracy ~6 digits (bump-dependent) Machine precision (<1e-14)
Market data update Full re-computation Kernel reuse (0s compilation)
Cloud cost (500K trades) ~$11.7M/yr (spot) ~$4.8K/yr (spot)
SIMD vectorization Not applicable AVX-256/512 (4–8 paths/cycle)
Multi-threading Parallel but N× work Parallel with 1× work

AADC transforms XVA risk computation from an O(N) problem into an O(1) problem with respect to the number of risk factors, while maintaining mathematical exactness and requiring minimal code changes to the existing valuation library.

Risk Factor Inventory

Risk Factor Count Description
r0 (short rate) 1 Hull-White initial rate
sigma (volatility) 1 Hull-White volatility
Mean reversion curve 251 Piecewise-linear, 50yr horizon, 0.2yr step
Counterparty survival curve 141 14,000-day horizon, 100-day step
Company survival curve 141 14,000-day horizon, 100-day step
Total 535

About This Benchmark

The benchmark: Hull-White 1-factor model with IR swaps, CSA collateral rules, 535 risk factors (r0, sigma, mean reversion curve, counterparty and company survival curves), 30-year horizon.

Important context: This benchmark was produced with Intel in 2019 to demonstrate AADC's capabilities on a clean, well-understood model. It deliberately uses vanilla IR swaps to isolate the AAD performance gains from model complexity. It doesn't cover more complex exotics like Bermudans, TARFs, or multi-factor hybrids.

If there's a specific asset class that would make this more convincing for your book, we'd be happy to extend the benchmark — let us know what would resonate.

Intel × MatLogica XVA White Paper XVA Pricing Application for Financial Services (PDF)
MatLogica Benchmark · Hull-White 1-factor · CSA Collateral · CVA/DVA · All timings from actual runs

Cloud cost estimates extrapolated linearly from benchmark baselines. Actual costs vary with instance selection, region, and pricing model.

Frequently Asked Questions

What speedup does AADC achieve over bump & revalue for XVA sensitivities?

AADC achieves 360-3,300× speedup over bump & revalue for computing all 535 CVA/DVA sensitivities. For 200 trades, AADC computes all sensitivities in ~0.88 seconds vs ~18.6 hours for bump & revalue. The speedup scales with the number of risk factors because AADC cost is O(M) while bump & revalue is O(N × M).

How much cloud cost can AADC save for XVA computation?

At production scale (500,000 IR swaps, 50K MC paths, 535 risk factors), AADC reduces annual cloud spend from ~$11.7M to ~$4,800 — a 2,438× cost reduction. AADC needs 7 c5.9xlarge nodes to complete within a 4-hour batch window, compared to ~21,100 nodes for bump & revalue.

How accurate are AADC sensitivities compared to bump & revalue?

AADC produces mathematically exact derivatives to machine precision (<1e-14 error vs primal). Bump & revalue achieves approximately 6 digits of accuracy and is sensitive to bump-size selection. AADC eliminates subtractive cancellation errors inherent in finite-difference methods.

What is the integration effort to switch from bump & revalue to AADC?

AADC is a drop-in replacement: change 'double' to 'aadc::idouble' in the valuation path. No business logic changes are required. Integration steps: replace types, record tape with startRecording()/stopRecording(), mark inputs/outputs, then call forward()+reverse() for all sensitivities.

Does AADC need to recompile when market data changes?

No. When only market data changes (rate curves, survival probabilities), AADC reuses the compiled kernel with zero compilation overhead. For 50 trades at 10K paths, warm run time drops from 18.2s (cold start) to 13.9s. Only structural changes (new trade types) require recompilation.

What risk factors are included in this XVA benchmark?

The benchmark computes sensitivities to 535 risk factors: r0 (Hull-White initial rate), sigma (volatility), 251 mean-reversion curve points (50yr horizon, 0.2yr step), 141 counterparty survival curve points, and 141 company survival curve points (14,000-day horizon, 100-day step).

How does AADC XVA performance scale with portfolio size?

AADC evaluation time scales linearly with portfolio size: 0.024s for 5 trades, ~0.20s for 50 trades, ~0.88s for 200 trades. Kernel compilation is a one-time cost (0.77s to 8.1s) amortized across all subsequent evaluations. Relative performance improves with size: from 18.75% at 5 trades to 0.70% at 200 trades.

What production workflows does AADC XVA support?

AADC supports three main XVA workflows: (A) End-of-day full risk reports computing all 535 sensitivities at ~2× primal cost + compilation, (B) Intraday market data updates with kernel reuse (no recompilation), and (C) New trade additions requiring only recompile + evaluation.