We deployed an LLM to a secure enclave and verified exactly what code is running: an industry first for full-source bootstrapped, deterministic, fully verifiable, and end-to-end encrypted AI inference.
Table of contents
- The problem with AI privacy
- Current approaches fall short
- What if you could verify everything?
- The deployment
- Inspecting the build
- What this proves
- True end-to-end encryption with STEVE
- Early access and general availability
- Helpful resources
The problem with AI privacy
Large Language Models have transformed nearly every industry, but a fundamental problem remains: how does one use an LLM without exposing sensitive data to third parties?
Tech companies, and AI companies in particular, have a poor track record with user data. Prompts may be logged, used for training, shared with contractors, or retained indefinitely. The host may also bias the LLM's output through hidden prompts. Privacy policies can change at any time. Once data leaves your control, there is no way to verify how it's handled.
Current approaches fall short
There is significant demand for AI applications that better protect user data privacy. Some attempts use Trusted Execution Environments (TEEs) to isolate data, providing remote attestation as proof. But these solutions fall short: their “proofs” only demonstrate that the deployed code hasn’t changed, not what that code actually is.
Without full verifiability, trust still rests with the operator alone. But promises are not good enough; we need a concrete way to verify the safety of data sent to third-party servers.
Additionally, many confidential compute solutions today terminate TLS outside the enclave, leaving data exposed on the host that runs the enclave. This defeats the entire point of enclaves: the data is readable by an untrusted system outside the secure boundary. To mitigate this, data must remain encrypted until it is inside the enclave.
Below, we cover how Caution's approach mitigates both of these risks.
What if you could verify everything?
Imagine being able to inspect the exact code powering an online service, and being able to prove it can’t mishandle your data: no logging, no saving, no undesirable behavior.
In this demo, Caution deploys a verifiable AI inference app, letting you prove precisely what code runs inside a secure enclave.
The verifiable AI inference demo runs a CPU-based LLM and is not optimized for performance. The goal is to demonstrate the verification workflow, not production inference efficiency. Full GPU-backed inference is planned once EnclaveOS V2 is production-ready. With the right partners, we could accelerate this. If you’re interested, please reach out.
The deployment
For verification to work, the LLM application's build must be deterministic, so anyone can reproduce the exact enclave image. Our prior enclave experiments with LLMs made this straightforward.
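A quick way to check this property is to build the same commit twice and compare the resulting measurement files. Here is a minimal sketch in Rust, assuming two independent build outputs laid out like the eif-stage tree shown later (the build-a and build-b paths are hypothetical):

use std::fs;

fn main() -> std::io::Result<()> {
    // Two independent rebuilds of the same commit must produce
    // byte-identical PCR measurements; anything else means the
    // build is not deterministic and cannot be verified.
    let first = fs::read("build-a/output/enclave.pcrs")?;
    let second = fs::read("build-b/output/enclave.pcrs")?;
    assert_eq!(first, second, "non-deterministic build detected");
    println!("Builds are reproducible");
    Ok(())
}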
The deployment uses the standard Caution platform workflow:
caution init
git push caution main
caution verify
Build and deploy output
❯ git push caution test-enclave
...
Deployment successful!
Application: http://<redacted>:8080
Attestation: http://<redacted>:5000/attestation
Run 'caution verify' to verify the application attestation.
Verification output
❯ caution verify
Verifying enclave attestation...
Challenge nonce (sent): dc695fd5e10b2f0887a0ec163520127b40455defaa31686c4dcee77884c1177c
Requesting attestation...
Verifying attestation...
✓ Certificate chain verified against AWS Nitro root CA
✓ All certificates are within validity period
✓ COSE signature verification passed
✓ Nonce verified (prevents replay attacks)
Challenge nonce (received): dc695fd5e10b2f0887a0ec163520127b40455defaa31686c4dcee77884c1177c
✓ Attestation verified successfully
Remote PCR values (from deployed enclave):
PCR0: 267a49a97b94b57e11ef1fe59c798415d61157c68563d6b2901ef17a48c0c4b82f66c45fc0a156bcf014b742b75a277f
PCR1: 267a49a97b94b57e11ef1fe59c798415d61157c68563d6b2901ef17a48c0c4b82f66c45fc0a156bcf014b742b75a277f
PCR2: 21b9efbc184807662e966d34f390821309eeac6802309798826296bf3e8bec7c10edb30948c90ba67310f7b964fc500a
Manifest information:
App source: https://git.distrust.co/public/llmshell/archive/bd4d093ae51663e21ed29ab2607324080a8704d5.tar.gz (git archive)
Enclave source: https://git.distrust.co/public/enclaveos/archive/attestation_service.tar.gz (git archive)
Reproducing build from current directory...
Build artifacts available at: /home/user/.cache/caution/build/.tmp802BZp/eif-stage
You can review everything that went into building this enclave:
• Containerfile.eif - The complete build recipe
• app/ - Your application files
• enclave/ - EnclaveOS source (attestation-service, init)
• run.sh - Generated startup script
• manifest.json - Build provenance information
Expected PCR values:
PCR0: 267a49a97b94b57e11ef1fe59c798415d61157c68563d6b2901ef17a48c0c4b82f66c45fc0a156bcf014b742b75a277f
PCR1: 267a49a97b94b57e11ef1fe59c798415d61157c68563d6b2901ef17a48c0c4b82f66c45fc0a156bcf014b742b75a277f
PCR2: 21b9efbc184807662e966d34f390821309eeac6802309798826296bf3e8bec7c10edb30948c90ba67310f7b964fc500a
Comparing PCR values...
✓ Attestation verification PASSED
The deployed enclave matches the expected PCRs.
This means the code running in the enclave is exactly what you expect.
Powered by: Caution (https://caution.co)
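Under the hood, the decisive step is simple: the Nitro attestation document (whose signature chain was just verified) reports the PCR measurements of the running enclave, while the local reproducible build yields the expected values. A minimal sketch of that comparison in Rust, with hypothetical struct and function names rather than Caution's actual implementation:

#[derive(Debug, Clone, PartialEq, Eq)]
struct PcrSet {
    pcr0: String, // measurement of the enclave image file
    pcr1: String, // measurement of the kernel and boot ramfs
    pcr2: String, // measurement of the application
}

fn verify_measurements(remote: &PcrSet, expected: &PcrSet) -> Result<(), String> {
    for (name, got, want) in [
        ("PCR0", &remote.pcr0, &expected.pcr0),
        ("PCR1", &remote.pcr1, &expected.pcr1),
        ("PCR2", &remote.pcr2, &expected.pcr2),
    ] {
        if !got.eq_ignore_ascii_case(want) {
            return Err(format!("{name} mismatch: the enclave runs a different build"));
        }
    }
    Ok(())
}

fn main() {
    // In the real flow these come from the attestation document and the
    // local rebuild; hard-coded placeholders keep the sketch self-contained.
    let remote = PcrSet {
        pcr0: "267a49a9…".into(),
        pcr1: "267a49a9…".into(),
        pcr2: "21b9efbc…".into(),
    };
    let expected = remote.clone();
    match verify_measurements(&remote, &expected) {
        Ok(()) => println!("✓ Attestation verification PASSED"),
        Err(e) => eprintln!("✗ {e}"),
    }
}

If any register differs, the enclave is running a different build than the one reproduced locally, and verification fails.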
Inspecting the build
The files for local reproduction are stored in the cache directory, containing every line of code used to build the software:
❯ tree -I app /home/user/.cache/caution/build/.tmp802BZp/eif-stage/
/home/user/.cache/caution/build/.tmp802BZp/eif-stage/
├── app
│   ├── <omitted full app file system...>
├── build.log
├── Containerfile.eif
├── enclave
│   ├── attestation-service
│   │   ├── Cargo.toml
│   │   └── src
│   │       └── main.rs
│   ├── Cargo.lock
│   ├── Cargo.toml
│   ├── Containerfile
│   ├── init.sh
│   ├── LICENSE.md
│   ├── Makefile
│   ├── README.md
│   ├── src
│   │   ├── aws
│   │   │   ├── Cargo.toml
│   │   │   └── src
│   │   │       └── lib.rs
│   │   ├── init
│   │   │   ├── Cargo.lock
│   │   │   ├── Cargo.toml
│   │   │   └── init.rs
│   │   └── system
│   │       ├── Cargo.toml
│   │       └── src
│   │           └── lib.rs
│   └── udhcpc-script.sh
├── manifest.json
├── output
│   ├── enclave.eif
│   ├── enclave.pcrs
│   └── rootfs.cpio.gz
└── run.sh
69 directories, 1432 files
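The manifest.json ties this tree back to the provenance shown in the verify output: it records the pinned source archives the image was built from. A minimal sketch of reading it with serde_json (the field names are assumptions inferred from that output, not a documented schema):

// Requires serde (with the "derive" feature) and serde_json.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Manifest {
    app_source: String,     // pinned git archive URL of the application
    enclave_source: String, // pinned git archive URL of EnclaveOS
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string("manifest.json")?;
    let manifest: Manifest = serde_json::from_str(&raw)?;

    // Because the URLs pin exact commits, anyone can fetch the same
    // sources and rebuild the image to reproduce the PCR values.
    println!("App source:     {}", manifest.app_source);
    println!("Enclave source: {}", manifest.enclave_source);
    Ok(())
}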
What this proves
With the Caution platform, you can:
- Deploy an LLM to a hardware-isolated enclave, ensuring your data never leaves the secure environment
- Reproduce the exact deployment locally, rebuilding the enclave image from source all the way down to the kernel: the entire tech stack
- Verify the running code matches via cryptographic proof through PCR comparison
This enables the first fully verifiable LLM deployment. No trust required: full verification covers every line of code down to the kernel, proving the LLM can’t perform undesirable actions with your data.
True end-to-end encryption with STEVE
Verifiability solves a major problem: knowing exactly what code is running inside a secure enclave. But there's a second problem that is often not adequately addressed: TLS termination.
In typical enclave deployments, TLS terminates at a reverse proxy or load balancer outside the enclave. The traffic is then forwarded to the enclave in plaintext. This means the host system within which the secure enclave runs, the very thing the enclave is supposed to protect against, can read every request and response.
While enclave remote attestation without end-to-end encryption still preserves integrity and prevents things like inference manipulation by advertisers, it is useless for confidentiality (in spite of what some marketing teams might imply). A compromised host, a malicious cloud operator, or an attacker with infrastructure access can intercept all data before it ever reaches the protected environment.
Caution solves this with STEVE (Secure Transport Encryption Via Enclave), a freely licensed open source solution that adds a second encryption layer terminating exclusively inside the enclave.
STEVE uses X25519 key exchange with Ed25519 signatures bound to the enclave’s attestation. Clients verify they’re communicating with the attested enclave before establishing an encrypted channel. The host never sees plaintext application data.
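As a rough client-side sketch of the handshake shape described above (using the ed25519-dalek, x25519-dalek, hkdf, sha2, and rand_core crates purely for illustration; the message layout and derivation labels are ours, not STEVE's actual wire format):

use ed25519_dalek::{Signature, Signer, SigningKey, Verifier, VerifyingKey};
use hkdf::Hkdf;
use rand_core::OsRng;
use sha2::Sha256;
use x25519_dalek::{EphemeralSecret, PublicKey};

fn derive_session_key(
    // Ed25519 key extracted from the *verified* attestation document,
    // so it is known to belong to the attested enclave.
    enclave_signing_key: &VerifyingKey,
    // Ephemeral X25519 public key sent by the enclave...
    enclave_dh_public: [u8; 32],
    // ...together with a signature over it.
    signature: &Signature,
) -> Result<[u8; 32], Box<dyn std::error::Error>> {
    // 1. Refuse to key-exchange with anything the attested enclave did
    //    not sign: this is what binds the channel to the enclave.
    enclave_signing_key.verify(&enclave_dh_public, signature)?;

    // 2. Standard X25519 Diffie-Hellman with our own ephemeral key.
    let client_secret = EphemeralSecret::random_from_rng(OsRng);
    let _client_public = PublicKey::from(&client_secret); // sent to the enclave
    let shared = client_secret.diffie_hellman(&PublicKey::from(enclave_dh_public));

    // 3. Derive a symmetric session key; all application traffic is then
    //    encrypted under it and is opaque to the host.
    let hk = Hkdf::<Sha256>::new(None, shared.as_bytes());
    let mut session_key = [0u8; 32];
    hk.expand(b"steve-demo session key", &mut session_key)
        .map_err(|e| format!("HKDF expand failed: {e}"))?;
    Ok(session_key)
}

fn main() {
    // Demo wiring only: in reality the enclave generates these, and the
    // client obtains the verifying key from the attestation document.
    let enclave_identity = SigningKey::generate(&mut OsRng);
    let enclave_dh_secret = EphemeralSecret::random_from_rng(OsRng);
    let enclave_dh_public = *PublicKey::from(&enclave_dh_secret).as_bytes();
    let sig = enclave_identity.sign(&enclave_dh_public);
    let key = derive_session_key(&enclave_identity.verifying_key(), enclave_dh_public, &sig)
        .expect("handshake failed");
    println!("derived {}-byte session key", key.len());
}

The key point is step 1: because the signing key comes out of a verified attestation document, a successful handshake proves the other end of the channel is the attested enclave itself, not the host in front of it.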
What makes this powerful is that the end-to-end encryption leverages hardware-backed keys accessible only inside the secure enclave. In other words, the security of this setup rests on keys provided by the confidential compute hardware itself.
For client-side applications, a service worker handles encryption transparently, requiring no application changes. For this LLM deployment, prompts and responses are encrypted from the browser all the way into the enclave.
This combination of full verifiability and true end-to-end encryption is what sets Caution apart from other confidential compute solutions.
Early access and general availability
Caution is currently available in alpha access for teams testing and deploying reproducible enclaves. Learn more at alpha.caution.co.
We are developing EnclaveOS for broader attestation hardware support and superior isolation beyond AWS Nitro. Here’s what’s coming in 2026:
- Managed cloud platform
- Multi-hardware attestation (TPM 2.0, Intel TDX, AMD SEV-SNP)
- Support for additional cloud backends and bare metal
- Multi-cloud deployments from a single configuration
We invite developers building and operating verifiable compute to join our open Community space on Matrix to ask questions, share ideas, and help us shape the future of verifiable compute.
Helpful resources
- Introduction to the Caution platform (blog)
- Caution platform: codeberg.org/caution
- AI inference app: codeberg.org/caution/demo-ai-inference
- EnclaveOS: git.distrust.co/public/enclaveos
- STEVE: git.distrust.co/public/steve