@lmondada
Collaborator

@lmondada lmondada commented Jan 9, 2026

This PR's main change is to remove the tracking of the current execution context from the various components of the runtime (qpu, quantum_platform, simulators and execution managers) and to store it instead in a single place, as a thread-local variable in the new runtime/common/ExecutionContext.cpp file.

Uniform handling of ExecutionContext

Having had trouble tracking where state is set and reset (e.g. exceptions not cleaning up the state, or the context being set in some components of the runtime but not all), I have homogenised how the execution context is handled across the runtime. Instead of manually setting and resetting the context in the various components, I suggest using a Python-style with construct:

ctx = ExecutionContext("sample")
with ctx:
    kernel()

The intent of this is to make it clear that all code run within such a block can expect the execution context to be set, whereas for all code outside of it, no context exists. It also separates syntactically code run "on the host" and code run "on (a simulation of the) device".

The syntax above is valid in Python. In C++, use quantum_platform::with_execution_context, which can be used as follows:

cudaq::ExecutionContext ctx("sample");
cudaq::get_platform().with_execution_context(ctx, [&]() {
    kernel();
});

Under the hood: Context Lifecycle

At a high level, context setting and resetting was (most of the time) handled as follows:

setExecutionContext(ctx)  →  [execute kernel]  →  resetExecutionContext()

These set and reset calls were propagated across the various components of the runtime (QPU, simulators, ExecutionManagers, platforms etc.). However, these functions had many more responsibilities than just setting the context:

  • In simulators and execution managers, resetContext was used to "flush" the state and update the context; calling it twice resulted in errors or undefined behaviour.
  • On the other hand, when an exception was thrown or the run was otherwise aborted, resetContext was often not called at all (and could not be called, as it would attempt to finalise the context state). As a result, invalidated context pointers were not always reset.
  • Some QPUs forwarded set and reset calls to execution managers and/or simulators, but others didn't. This meant that at any given point, context pointers across the runtime could be pointing at different states. Similarly, execution contexts on different threads were not always tracked separately.
  • For get_state, CircuitSimulator::resetExecutionContext is in charge of moving the simulationState into the context after the execution. However, it is only called after DefaultExecutionManager::resetExecutionContext, which is in charge of deallocating qubits. Thus qubits must be marked for deallocation in the simulator, but their actual deallocation must be deferred so that the simulationState remains valid long enough.
  • Execution modes such as tracer and resource-count would often overwrite the execution context global but never reset it; restoring it actually caused further post-processing to break.
  • Execution managers had hooks handleExecutionContextChanged and handleExecutionContextEnded, which were triggered precisely at set and reset time.

To resolve and unify this, I suggest the following context lifecycle:

configureExecutionContext(ctx)  →  setExecutionContext(ctx)  →  beginExecution()
    →  [execute kernel]  →
finalizeExecutionContext(ctx)  →  endExecution()  →  resetExecutionContext()

I have adjusted all runtime components to use these method names consistently.
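To make the proposed ordering concrete, here is a minimal, self-contained sketch of the six steps. It is not the code in this PR: Context, Component, the free function with_execution_context and trace_run are hypothetical stand-ins that only record the order of calls. The behaviour on failure (skipping finalizeExecutionContext but still running endExecution and the reset) is an assumption made for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an execution context; `log` records call order.
struct Context {
  std::string name;
  std::vector<std::string> log;
};

// Thread-local pointer to the active context (nullptr when none is set).
thread_local Context *current_context = nullptr;

// Hypothetical runtime component (think QPU or simulator) that only logs.
struct Component {
  void configureExecutionContext(Context &ctx) {
    ctx.log.push_back("configure");
  }
  void beginExecution() {
    assert(current_context && "context must be set before beginExecution");
    current_context->log.push_back("begin");
  }
  void finalizeExecutionContext(Context &ctx) { ctx.log.push_back("finalize"); }
  void endExecution() { current_context->log.push_back("end"); }
};

// Sketch of the lifecycle: configure -> set -> begin -> kernel ->
// finalize -> end -> reset.
template <typename Kernel>
void with_execution_context(Component &c, Context &ctx, Kernel &&kernel) {
  c.configureExecutionContext(ctx);
  current_context = &ctx; // setExecutionContext
  ctx.log.push_back("set");
  c.beginExecution();

  auto teardown = [&] {
    c.endExecution();
    current_context = nullptr; // resetExecutionContext
    ctx.log.push_back("reset");
  };
  try {
    kernel();
    c.finalizeExecutionContext(ctx);
  } catch (...) {
    teardown(); // assumption: skip finalize on failure, but still clean up
    throw;
  }
  teardown();
}

// Runs a kernel that optionally throws and returns the recorded call order.
std::vector<std::string> trace_run(bool kernel_throws) {
  Component component;
  Context ctx{"sample", {}};
  try {
    with_execution_context(component, ctx, [&] {
      current_context->log.push_back("kernel");
      if (kernel_throws)
        throw std::runtime_error("kernel failed");
    });
  } catch (const std::runtime_error &) {
    // Swallowed so that the trace can be inspected.
  }
  assert(current_context == nullptr);
  return ctx.log;
}
```

In this sketch the thread-local pointer is cleared on both the normal path and the exceptional path, which is the property the old set/reset pairs did not guarantee.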

These functions work in pairs and have well-defined responsibilities:

1. configureExecutionContext(ctx) & finalizeExecutionContext(ctx)

Propagated across the runtime components (QPUs, simulators, execution managers, platforms).

These methods should be (morally) const on the components of the runtime (QPU, simulators etc.). Their purpose is to mutate ctx for the upcoming execution and to flush all data to it after execution, respectively. I have not always marked them const, as they may mutate internal state (such as queued gates).

configureExecutionContext is called before setting the context and finalizeExecutionContext is called before de-initializing the runtime components.

2. beginExecution and endExecution

Propagated across the runtime components (QPUs, simulators, execution managers, platforms).

These methods are used by the components of the runtime to initialise/deinitialise resources for the execution of the kernel. When beginExecution is called, the execution context has already been set, so that the component can be initialised accordingly.

3. setExecutionContext & resetExecutionContext

These do what they say: setExecutionContext sets the thread-local variable to point to the current context, and resetExecutionContext clears it. The context remains owned by the calling scope. These are NOT propagated to any runtime components. Further, there should no longer be any reason to call these functions directly: use with_execution_context (see above) to guarantee that the runtime is always left in a consistent state.
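One common way to obtain this kind of guarantee in C++ is an RAII guard around a thread-local pointer. The sketch below is an illustration under stated assumptions, not the implementation in runtime/common/ExecutionContext.cpp: Context, ScopedContext, with_context and current_context are hypothetical names. It shows that the pointer is restored even when the kernel throws, and that the context object itself stays owned by the caller.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an execution context.
struct Context {
  std::string name;
};

// One pointer per thread; nullptr means "no context is set".
thread_local Context *current_context = nullptr;

// RAII guard: installs the context on construction and restores the
// previous value on destruction, including during stack unwinding.
class ScopedContext {
public:
  explicit ScopedContext(Context &ctx) : previous_(current_context) {
    current_context = &ctx;
  }
  ~ScopedContext() { current_context = previous_; }
  ScopedContext(const ScopedContext &) = delete;
  ScopedContext &operator=(const ScopedContext &) = delete;

private:
  Context *previous_;
};

// Runs `f` with `ctx` set for the duration of the call only.
// The guard never takes ownership of `ctx`.
template <typename F>
void with_context(Context &ctx, F &&f) {
  ScopedContext guard(ctx);
  std::forward<F>(f)();
}

// Returns true if no context is left behind after a throwing kernel.
bool context_cleared_after_throw() {
  Context ctx{"sample"};
  try {
    with_context(ctx, [] {
      assert(current_context != nullptr); // set while the kernel runs
      throw std::runtime_error("kernel failed");
    });
  } catch (const std::runtime_error &) {
    // The guard's destructor has already run at this point.
  }
  return current_context == nullptr;
}
```

Because each thread has its own current_context in this sketch, a context set on one thread is never visible on another.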

Comment on lines +174 to 206
if (ctx.name == "sample") {
CUDAQ_INFO("Sampling");
auto shots = ctx.shots;
auto sampleResult =
qpp::sample(shots, state, ids, sampleQudits.begin()->levels);
cudaq::ExecutionResult counts;
for (auto [result, count] : sampleResult) {
std::stringstream bitstring;
for (const auto &quditRes : result) {
bitstring << quditRes;
}
// Add to the sample result
// in mid-circ sampling mode this will append 1 bitstring
counts.appendResult(bitstring.str(), count);
// Reset the string.
bitstring.str("");
bitstring.clear();
}
ctx.result.append(counts);
} else if (ctx.name == "extract-state") {
CUDAQ_INFO("Extracting state");
// If here, then we care about the result qudit, so compute it.
for (auto &q : sampleQudits) {
const auto measurement_tuple = qpp::measure(
state, qpp::cmat::Identity(q.levels, q.levels), {q.id},
/*qudit dimension=*/q.levels, /*destructive measmt=*/false);
const auto measurement_result = std::get<qpp::RES>(measurement_tuple);
const auto &post_meas_states = std::get<qpp::ST>(measurement_tuple);
const auto &collapsed_state = post_meas_states[measurement_result];
state = Eigen::Map<const qpp::ket>(collapsed_state.data(),
collapsed_state.size());
}
Collaborator Author

This looks like a big diff: it is in fact the same old code, just unindented (removing the outer if)

Comment on lines -287 to -305
/// @brief Set the Execution Context
void __quantum__rt__setExecutionContext(cudaq::ExecutionContext *ctx) {
__quantum__rt__initialize(0, nullptr);

if (ctx) {
ScopedTraceWithContext("NVQIR::setExecutionContext", ctx->name);
CUDAQ_INFO("Setting execution context: {}{}", ctx ? ctx->name : "basic",
ctx->hasConditionalsOnMeasureResults ? " with conditionals"
: "");
nvqir::getCircuitSimulatorInternal()->setExecutionContext(ctx);
}
}

/// @brief Reset the Execution Context
void __quantum__rt__resetExecutionContext() {
ScopedTraceWithContext("NVQIR::resetExecutionContext");
CUDAQ_INFO("Resetting execution context.");
nvqir::getCircuitSimulatorInternal()->resetExecutionContext();
}
Collaborator Author

I haven't found a use case for these and I don't think we should encourage setting the context manually. Happy to revert if I have missed the reason for their existence.

Comment on lines -33 to +45
- bool limit_pathfinding = cudaq::getEnvBool("CUDAQ_TENSORNET_FIND_LIMIT", true);
+ bool limit_pathfinding =
+ cudaq::getEnvBool("CUDAQ_TENSORNET_FIND_LIMIT", true);
  if (!limit_pathfinding) {
  CUDAQ_INFO("Disabling cutensornet smart opt");
- setenv("CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SMART_OPTION","0",1);
+ setenv("CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SMART_OPTION", "0", 1);
  }

- bool deterministic = cudaq::getEnvBool("CUDAQ_TENSORNET_FIND_DETERMINISTIC",
- false);
+ bool deterministic =
+ cudaq::getEnvBool("CUDAQ_TENSORNET_FIND_DETERMINISTIC", false);
  if (deterministic) {
  CUDAQ_INFO("Enabling deterministic tensornet with 1 pathfinding thread");
- setenv("CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_HYPER_NUM_THREADS",
- "1",1);
+ setenv("CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_HYPER_NUM_THREADS", "1",
+ 1);
Collaborator Author

Sorry for the noisy diff here. No changes were made, but it seems formatting is not enforced on .inc files.

I can revert this if desired but it is probably useful to have this formatted correctly for the future.

@sacpis
Collaborator

sacpis commented Jan 9, 2026

@lmondada In the following flow

configureExecutionContext(ctx)  →  setExecutionContext(ctx)  →  beginExecution()
    →  [execute kernel]  →
finalizeExecutionContext(ctx)  →  endExecution()  →  resetExecutionContext()

Shouldn't finalizeExecutionContext happen after endExecution?

@lmondada
Collaborator Author

lmondada commented Jan 9, 2026

haha, I agree with you @sacpis that it would be more symmetric and aesthetically pleasing that way - and trust me, I've tried :P

The issue is that finalizeExecutionContext currently requires the components to still be initialized. Think for instance of the queue of gates in simulators: if you freed that memory (in endExecution) you wouldn't be able to flush the gates and apply them in finalizeExecutionContext.

In the end, I see no harm in the ordering as I suggested it, so I've left it as it is. Maybe we'll be able to clean this up as part of the refactor.
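The gate-queue argument can be illustrated with a toy model. This is only a sketch: ToySimulator, its enqueue method and the Context struct are hypothetical and much simpler than the real simulators. The queue is allocated in beginExecution and freed in endExecution, so flushing it into the context only works while the component is still initialised.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical context that receives the flushed gates.
struct Context {
  std::vector<std::string> applied_gates;
};

// Toy simulator whose gate queue only exists between begin and end.
class ToySimulator {
public:
  void beginExecution() {
    queue_ = std::make_unique<std::vector<std::string>>();
  }
  void enqueue(const std::string &gate) {
    if (!queue_)
      throw std::logic_error("enqueue called outside an execution");
    queue_->push_back(gate);
  }
  // Flushes the queued gates into the context; needs a live queue.
  void finalizeExecutionContext(Context &ctx) {
    if (!queue_)
      throw std::logic_error("finalize called after endExecution");
    for (const auto &gate : *queue_)
      ctx.applied_gates.push_back(gate);
    queue_->clear();
  }
  void endExecution() { queue_.reset(); } // frees the queue

private:
  std::unique_ptr<std::vector<std::string>> queue_;
};

// Ordering proposed in the PR: finalize first, then end.
std::size_t gates_flushed_with_pr_ordering() {
  ToySimulator sim;
  Context ctx;
  sim.beginExecution();
  sim.enqueue("h");
  sim.enqueue("cx");
  sim.finalizeExecutionContext(ctx);
  sim.endExecution();
  return ctx.applied_gates.size();
}

// Swapped ordering: the queue is gone before it can be flushed.
bool swapped_ordering_fails() {
  ToySimulator sim;
  Context ctx;
  sim.beginExecution();
  sim.enqueue("h");
  sim.endExecution();
  try {
    sim.finalizeExecutionContext(ctx);
  } catch (const std::logic_error &) {
    return true;
  }
  return false;
}
```

In this toy model, making the symmetric ordering work would require finalizeExecutionContext to stop depending on resources owned by the execution, which is roughly the clean-up deferred to the refactor.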


if (!isInTracerMode() && count != state->getNumQubits())
// TODO: remove execution context from CircuitSimulator
auto executionContext = cudaq::getExecutionContext();
Collaborator

I am a bit confused as to why we have auto executionContext = cudaq::getExecutionContext(); in every method.

Collaborator Author

Before, executionContext was a member attribute of CircuitSimulator. Now it has to be accessed through cudaq::getExecutionContext.

Adding this statement at the beginning of the methods means the rest of the method body does not have to be changed. The better solution would be to rewrite them to not depend on executionContext altogether (hence the TODOs). That will be part of the refactor.
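As a rough illustration of this transitional pattern, the sketch below uses stand-in types: ExecutionContext, tls_context, the global getExecutionContext and ToySimulator::shotsToRun are all hypothetical and only mimic the shape of the real code. The local variable reuses the name of the former member, so the rest of the method body reads exactly as before.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for cudaq::ExecutionContext.
struct ExecutionContext {
  std::string name;
  std::size_t shots;
};

// Stand-in for the thread-local storage and its accessor.
thread_local ExecutionContext *tls_context = nullptr;
ExecutionContext *getExecutionContext() { return tls_context; }

class ToySimulator {
public:
  // Previously this method would have read a member named
  // `executionContext`; the class no longer stores one.
  std::size_t shotsToRun() const {
    // TODO: remove the dependency on the execution context altogether.
    auto executionContext = getExecutionContext();

    // Unchanged body: still refers to `executionContext` by name.
    if (executionContext && executionContext->name == "sample")
      return executionContext->shots;
    return 1;
  }
};
```

The cost of this approach is one accessor call per method; the benefit is a mechanical change that leaves the method bodies untouched until the refactor.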

@lmondada lmondada force-pushed the lm/globalcontext branch 5 times, most recently from 58166f4 to 1e39402 Compare January 13, 2026 13:55
github-actions bot pushed a commit that referenced this pull request Jan 13, 2026

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

github-actions bot pushed a commit that referenced this pull request Jan 19, 2026
@lmondada lmondada force-pushed the lm/globalcontext branch 2 times, most recently from b680cc7 to e3a4886 Compare January 21, 2026 14:01
Signed-off-by: Luca Mondada <luca@mondada.net>