OLLM and Phala Partner to Bring Confidential AI Models to OLLM AI Gateway

GENEVA, Switzerland — December 17, 2025 — Leads & Copy — OLLM and Phala have partnered to integrate confidential AI models into the OLLM AI Gateway, giving users the ability to run private inference on frontier models backed by cryptographic TEE attestation. Every workload runs encrypted inside a confidential computing chip, protecting data from input to output.

Phala operates a hardware-secured confidential AI cloud built on GPU trusted execution environments (TEEs). Models and data are processed inside the GPU enclave, keeping prompts, outputs, and model weights encrypted during execution. Each workload is paired with a cryptographic attestation, verifiable via Phala’s Trust Center, confirming it ran on genuine TEE hardware. The TEE typically introduces a performance overhead of roughly 0.5% to 5%, making full privacy viable for real-world workloads.
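Conceptually, verifying an attestation means checking that a report from the enclave claims genuine TEE hardware and that its enclave measurement matches a known-good value. The sketch below is a minimal illustration of that check; the field names and the expected measurement are invented for the example and do not reflect Phala's actual Trust Center schema.

```python
import hashlib

# Hypothetical known-good enclave measurement (in practice this would be
# published by the operator and rooted in hardware-signed evidence).
EXPECTED_MEASUREMENT = hashlib.sha256(b"trusted-gpu-enclave-image").hexdigest()

def verify_attestation(report: dict) -> bool:
    """Accept a workload only if its attestation report claims genuine
    TEE hardware and its measurement matches the expected value."""
    return (
        report.get("tee_type") == "gpu-tee"
        and report.get("measurement") == EXPECTED_MEASUREMENT
        and report.get("hardware_genuine") is True
    )

# Example report shaped the way a client might receive it (hypothetical).
report = {
    "tee_type": "gpu-tee",
    "measurement": EXPECTED_MEASUREMENT,
    "hardware_genuine": True,
}
print(verify_attestation(report))  # True for a matching report
```

A report with any mismatched field (a different measurement, or an unverified hardware claim) would be rejected, which is what lets a relying party trust that the workload actually ran inside the enclave it claims.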

Through this partnership, OLLM’s Confidential AI Gateway users can select Phala’s encrypted models and access them alongside other providers through a single, unified API. This includes Phala-hosted encrypted models such as Llama 3.3, Gemma 3, DeepSeek, and GPT-oss, with more confidential models becoming available as the catalog expands. Every request to a Phala-backed model on OLLM is processed inside a hardware-isolated enclave, and the resulting cryptographic proof of confidentiality is visible in the OLLM scanner.
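As a rough sketch of what "a single, unified API" typically looks like for a model gateway: many gateways expose an OpenAI-compatible chat-completions endpoint, with the provider selected via the model identifier. The endpoint URL, model ID, and environment variable below are illustrative placeholders, not confirmed OLLM values.

```python
import json
import os
import urllib.request

# Placeholder endpoint and model ID -- assumptions for illustration only.
GATEWAY_URL = "https://api.example-gateway.com/v1/chat/completions"
MODEL_ID = "phala/llama-3.3-70b-instruct"

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for the gateway."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }

def send_request(payload: dict, api_key: str) -> dict:
    """POST the payload to the gateway and return the parsed JSON reply."""
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_request("Summarize this quarter's risk report.")
print(payload["model"])  # the Phala-hosted model the gateway would route to

# Only attempt the network call when a key is configured.
api_key = os.environ.get("GATEWAY_API_KEY")
if api_key:
    reply = send_request(payload, api_key)
    print(reply["choices"][0]["message"]["content"])
```

Switching between a Phala-backed model and another provider's model would then be a one-line change to the model identifier, which is the appeal of routing all providers through one gateway API.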

According to Ahmad Shadid, founder and CEO of OLLM, enterprises want the benefits of AI but cannot compromise on data confidentiality or control. He says aggregating Phala’s confidential AI cloud into the OLLM Confidential AI Gateway gives builders a simple way to tap into hardware-secured models with verifiable privacy, without rewriting their stack or accepting double-digit performance hits.

The announcement follows OLLM’s recent partnership with NEAR Protocol, which brought NEAR’s confidential computing infrastructure and NEAR AI Cloud into the OLLM Gateway. These integrations position OLLM as a neutral access layer for confidential AI, offering not only developers but also security, risk, and compliance teams an enterprise-ready AI gateway when they need strong data privacy and consistent observability across providers.

A bank’s risk team can run an internal copilot over sensitive transaction data while proving that neither the cloud provider nor the model host can see the raw inputs. A healthcare provider can summarize patient records for clinicians without moving data outside a confidential computing environment. Public-sector and Web3 teams can analyze user behavior or on-chain activity with identities, prompts, and outputs sealed inside TEEs, and still demonstrate compliance to regulators and partners.

Marvin Tong, CEO of Phala, says that AI adoption shouldn’t come at the cost of trust, and partnering with OLLM lets teams use powerful models while keeping data private by default, not by policy.

Phala’s technology has already processed over 1.34B LLM tokens in a single day via partners such as OpenRouter, underscoring that confidential AI is ready for production-scale workloads. By making these models available through OLLM, the two teams aim to lower the barrier for any developer or enterprise that needs strong privacy guarantees but wants the simplicity of an API-first gateway.

Shadid adds that with Phala and NEAR now live inside the OLLM Gateway, builders don’t have to choose between performance and privacy and can see and verify how their AI is secured every time a request is made.

Wahid Pessarlay, PR Manager, can be reached at pr@o.xyz.

Source: OLLM

