Minerva - Using LLMs with Noreja
Options for Using Large Language Models in Noreja
Noreja offers several clearly defined options for using Large Language Models, addressing different requirements around data protection, control, quality, and operational responsibility. These options can primarily be distinguished by who operates and manages the model: an external provider, Noreja itself, or the customer.
With cloud-based, publicly available LLMs, model operation is fully handled by the respective vendor. In this setup, Noreja accesses the models via standardized interfaces and leverages their high model maturity, scalability, and continuous improvement.
Alternatively, Noreja provides its own Noreja-managed LLM setups. These include both Noreja Application LLMs and dedicated, customer-specific LLM instances operated by Noreja and orchestrated through the Noreja AI Center. This approach combines centralized model selection and control with clearly defined data protection and integration mechanisms, enabling the targeted use of different model providers.
For maximum control, customers can also integrate their own LLM into the Noreja platform (“Bring Your Own LLM”). In this scenario, full operational responsibility remains with the customer. Noreja acts as an integration and orchestration layer, enabling the connection of additional systems, agents, or APIs without assuming responsibility for model operation or inference.
These three options form the foundation for flexible LLM usage within Noreja. Regardless of the chosen operating model, a key consideration is the technical distinction between locally deployed and cloud-based models.
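The three operating models described above can be pictured as interchangeable backends behind a single orchestration layer. The following Python sketch is purely illustrative: the class, registry, endpoints, and names are hypothetical and do not reflect Noreja's actual AI Center interfaces.

```python
from dataclasses import dataclass

# Hypothetical registry: each operating model resolves to an endpoint
# plus a flag indicating who carries operational responsibility.
@dataclass(frozen=True)
class LLMBackend:
    name: str
    endpoint: str
    operated_by: str  # "vendor", "noreja", or "customer"

BACKENDS = {
    "cloud": LLMBackend("public-cloud-llm", "https://api.vendor.example/v1", "vendor"),
    "managed": LLMBackend("noreja-managed-llm", "https://ai-center.noreja.example/v1", "noreja"),
    "byo": LLMBackend("customer-llm", "https://llm.customer.example/v1", "customer"),
}

def select_backend(mode: str) -> LLMBackend:
    """Resolve a chosen operating model to its backend configuration."""
    return BACKENDS[mode]
```

In this picture, switching from a Noreja-managed model to "Bring Your Own LLM" changes only the backend entry; the orchestration layer on top stays the same.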
Local LLMs (On-Premise) vs. Cloud-Based Models
Noreja supports both locally deployed Large Language Models and cloud-hosted models, enabling flexible choices based on requirements around data protection, output quality, and usage scenarios.
Quality
The most significant quality differences emerge in pure LLM interactions without tool calling or external system integration. Cloud-based models typically benefit from substantially larger training datasets, more powerful model architectures, and continuous optimization. This results in higher linguistic quality, more stable reasoning, better handling of ambiguity, and more consistent responses to complex or open-ended queries.
Local LLMs can also deliver strong results in well-defined, structured use cases but tend to reach qualitative limits more quickly without additional control mechanisms. In particular, for longer contexts, open-ended prompts, or implicit logical reasoning, output quality depends heavily on model size, available hardware, and individual configuration.
Data Control
Local models offer maximum control over data and execution. All inputs, contextual information, and model outputs remain entirely within the organization’s own infrastructure. No external data transfer occurs, making local LLMs well suited for privacy-sensitive process, personal, or enterprise data and for meeting internal compliance, security, and governance requirements.
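As a concrete illustration of data staying in-house, a local deployment is typically addressed over an OpenAI-compatible HTTP interface on the organization's own infrastructure. The sketch below only builds such a request; the URL, port, and model name are illustrative assumptions, and nothing is sent over the network.

```python
import json
import urllib.request

# Illustrative local inference server (e.g. an OpenAI-compatible runtime
# on localhost); all data in this request stays on the local machine.
LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"

body = json.dumps({
    "model": "llama-3.1-8b-instruct",  # illustrative local model name
    "messages": [{"role": "user", "content": "Classify this process step."}],
}).encode("utf-8")

# Construct the request object without sending it.
req = urllib.request.Request(
    LOCAL_ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would perform the actual call; it is
# omitted here so the sketch stays self-contained.
```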
Cloud-based models are accessed via APIs, with submitted data processed within the infrastructure of the respective provider. Leading vendors implement extensive technical and organizational safeguards, including encrypted transmission, isolated processing, and contractual commitments not to use customer data for training or model improvement. Even when enterprise or zero-retention options are available, data processing remains tied to external systems and their technical and legal frameworks.
In practice, Noreja enables the targeted use of both approaches: local LLMs for controlled, data-sensitive, and clearly structured scenarios, and cloud-based models where maximum language quality, robustness, and deep reasoning—particularly in pure LLM interactions—are required.
OpenAI Data Control
The following information is based on the official OpenAI documentation.
Use of Data by OpenAI
- Since March 1, 2023, data submitted via the OpenAI API is not used to train or improve OpenAI models, unless the customer explicitly opts in.
- Customer data remains the property of the customer.
Types of Stored Data
When using the API, the following categories of data may be processed:
a) Abuse Monitoring Logs
- Used to detect misuse and enforce OpenAI’s usage policies.
- May contain prompts, responses, and derived metadata.
- Default retention period: up to 30 days, unless legal obligations require longer retention.
b) Application State
- Temporary storage required for certain API features to function (e.g., multi-turn conversations, audio outputs, background processing).
- Retention depends on the specific endpoint used.
Advanced Data Controls
Subject to approval, OpenAI offers the following options:
Modified Abuse Monitoring (MAM)
- Customer content is excluded from abuse monitoring logs (with rare exceptions for image/file inputs).
- Full API functionality remains available.
- Customers are responsible for ensuring compliance with OpenAI’s usage policies.
Zero Data Retention (ZDR)
- Customer content is excluded from abuse monitoring logs.
- The `store` parameter is technically enforced as `false`.
- Certain features are restricted or incompatible (e.g., Background Mode, Extended Prompt Caching, OpenAI-hosted containers).
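What ZDR enforces globally can also be expressed per request: the `store` parameter controls whether a call leaves application state behind. The helper below is a hypothetical sketch that only assembles a request payload; the model name and function are illustrative.

```python
# Sketch: explicitly opting a single request out of server-side storage
# via the `store` parameter (the same parameter ZDR forces to false).
def build_chat_request(prompt: str, *, store: bool = False) -> dict:
    return {
        "model": "gpt-4o",  # illustrative model name
        "messages": [{"role": "user", "content": prompt}],
        "store": store,  # False -> no application-state retention for this call
    }

payload = build_chat_request("Summarize this process model.")
```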
Endpoint-Specific Retention (Overview)
Retention behavior differs by endpoint; key behaviors include:
- Audio outputs are stored for up to 1 hour (to support conversation context).
- 30-day retention by default when `store` is enabled; Background Mode stores data for ~10 minutes.
- Some objects are retained until deleted; deleted objects are removed after 30 days.
- Some endpoints are ZDR-compatible when using supported models.
- Manual or automated deletion is available for stored objects.
- Certain endpoints are not compatible with data retention controls.
Image and File Uploads
- Images and files are automatically scanned for illegal content (e.g., CSAM).
- If flagged, content may be retained for manual review — even if ZDR or MAM is enabled.
Data Residency
- Project-level configuration allows selecting a data region (e.g., EU or US).
- Customer content is stored in the selected region where technically required.
- System data (e.g., billing, metadata, usage statistics) is not subject to data residency controls.
- Not all endpoints support regional processing.
Enterprise Key Management (EKM)
- Customer content can be encrypted using customer-managed keys (BYOK).
- Supported: AWS KMS, Google Cloud KMS, Azure Key Vault.
- Not compatible with certain endpoints (e.g., Assistants API).
Summary
- No model training with API data without explicit opt-in
- Default log retention: up to 30 days
- Optional Zero Data Retention (with functional limitations)
- Configurable regional data storage
- Enterprise-grade encryption (BYOK) available