Helm Chart

The Oracle AI Optimizer and Toolkit (the AI Optimizer) was specifically designed to run on infrastructure supporting microservices architecture, including Kubernetes. A Helm Chart is provided to make the deployment easier.

To use the AI Optimizer Helm Chart:

  1. Build, Tag, and Push the AI Optimizer Images
  2. Configure the values.yaml
  3. Deploy!

Go Local

A full example of running the AI Optimizer in a local Kubernetes cluster using Docker container “nodes” via the Kind tool is provided.

Images

You will need to build the AI Optimizer container images and stage them in a container registry, such as the OCI Container Registry (OCIR).

  1. Build the AI Optimizer images:

    From the source code src/ directory:

    podman build --arch amd64 -f client/Dockerfile -t ai-optimizer-client:latest .
    
    podman build --arch amd64 -f server/Dockerfile -t ai-optimizer-server:latest .
  2. Tag the AI Optimizer images:

    Tag the images as required by your container registry. For example, if using the OCIR registry in US East (Ashburn) with a namespace of testing:

    podman tag ai-optimizer-client:latest iad.ocir.io/testing/ai-optimizer-client:latest
    podman tag ai-optimizer-server:latest iad.ocir.io/testing/ai-optimizer-server:latest
  3. Push the AI Optimizer images:

    Push the images to your container registry. If required, login to the registry first. For example, if using the OCIR registry in US East (Ashburn) with a namespace of testing:

    podman login iad.ocir.io
    
    podman push iad.ocir.io/testing/ai-optimizer-client:latest
    podman push iad.ocir.io/testing/ai-optimizer-server:latest

    You will use the URL for the pushed images when configuring the values.yaml.

Configure values.yaml

The values.yaml file allows you to customize the deployment by overriding settings such as image versions, resource requests, service configurations, and more. You can modify this file directly or supply your own overrides during installation using the -f or --set flags.

Only a subset of the most important settings is documented here; review the values.yaml file for more configuration options.

Global Settings

The global: section contains values that are shared across the chart and its subcharts.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| global.api | object | | Either provide the apiKey directly or provide a secretName referring to an existing Secret containing the API key. |
| global.api.apiKey | string | "" | Key for making API calls to the server. Recommended to supply at the command line or use secretName to avoid storing it in the values file. Example: "abcd1234opt5678" |
| global.api.secretKey | string | "apiKey" | Key name inside the Secret that contains the API key when secretName is defined. |
| global.api.secretName | string | "" | Name of the Secret that stores the API key. This allows you to keep the API key out of the values file and manage it securely via Secrets. Example: "optimizer-api-keys" |
| global.baseUrlPath | string | "/" | URL path appended to the host. Example: "/test" results in URLs like http://hostname/test/… |
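For example, to keep the API key out of values.yaml entirely, reference a pre-created Secret (the Secret name here is illustrative; create it first, e.g. with kubectl create secret generic):

```yaml
# Illustrative values.yaml override: read the API key from an existing
# Secret named "optimizer-api-keys" (hypothetical name) instead of
# embedding it via global.api.apiKey.
global:
  api:
    secretName: "optimizer-api-keys"
    secretKey: "apiKey"   # key inside the Secret (chart default)
  baseUrlPath: "/"
```

The referenced Secret must already exist in the release namespace before the chart is installed.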

Server Settings

The server: section contains values that are used to configure the AI Optimizer API Server.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| server.replicaCount | int | 1 | Number of desired pod replicas for the Deployment when autoscaling is disabled |
| server.image.repository | string | "localhost/ai-optimizer-server" | Image Repository |
| server.image.tag | string | "latest" | Image Tag |
| server.imagePullSecrets | list | [] | Secret name containing image pull secrets |

Server Database Settings

Configure the Oracle Database used by the AI Optimizer API Server.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| server.database.type | string | "" | Either SIDB-FREE, ADB-FREE, or ADB-S |
| server.database.image | object | | For SIDB-FREE/ADB-FREE, location of the image and its tag; exclude for ADB-S |
| server.database.image.repository | string | "" | For SIDB-FREE/ADB-FREE, repository location of the image |
| server.database.image.tag | string | "latest" | For SIDB-FREE/ADB-FREE, tag of the image |
| server.database.authN | object | | Required. Application User Authentication/Connection Details. If defined, used to create the user defined in the authN secret |
| server.database.authN.secretName | string | "db-authn" | Name of Secret containing the authentication/connection details |
| server.database.authN.usernameKey | string | "username" | Key in secretName containing the username |
| server.database.authN.passwordKey | string | "password" | Key in secretName containing the password |
| server.database.authN.serviceKey | string | "service" | Key in secretName containing the connection service name |
| server.database.privAuthN | object | | Optional. Privileged User Authentication/Connection Details. If defined, used to create the user defined in the authN secret |
| server.database.privAuthN.secretName | string | "db-priv-authn" | secretName containing the privileged user (i.e. ADMIN/SYSTEM) password |
| server.database.privAuthN.passwordKey | string | "password" | Key in secretName containing the password |
| server.database.oci_db | object | | Optional. For ADB-S, OCID of the Autonomous Database; exclude for SIDB-FREE/ADB-FREE |
| server.database.oci_db.ocid | string | "" | OCID of the DB |

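As a sketch, the application-user Secret matching the authN defaults above could look like this (the username, password, and service values are purely illustrative):

```yaml
# Hypothetical Secret matching the chart's authN defaults:
# secretName "db-authn" with keys "username", "password", "service".
apiVersion: v1
kind: Secret
metadata:
  name: db-authn
type: Opaque
stringData:
  username: "optimizer"         # illustrative application username
  password: "example-password"  # illustrative password
  service: "FREEPDB1"           # illustrative connection service name
```

Apply it to the release namespace before installing the chart; a db-priv-authn Secret for the privileged user follows the same pattern with just a password key.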
Examples

SIDB-FREE

A containerized single-instance Oracle Database:

  database:
    type: "SIDB-FREE"
    image:
      repository: container-registry.oracle.com/database/free
      tag: latest

ADB-FREE

A containerized Autonomous Oracle Database:

  database:
    type: "ADB-FREE"
    image:
      repository: container-registry.oracle.com/database/adb-free
      tag: latest-23ai

ADB-S

A pre-deployed Oracle Autonomous Database (requires the OraOperator to be installed in the cluster):

  database:
    type: "ADB-S"
    oci_db: 
      ocid: "ocid1.autonomousdatabase.oc1..."
Server Oracle Cloud Infrastructure Settings

Configure Oracle Cloud Infrastructure used by the AI Optimizer API Server for access to Object Storage and OCI GenAI Services.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| server.oci_config.oke | bool | false | Enable Workload Identity Principals (WIP); WIP must be implemented on the cluster |
| server.oci_config.tenancy | string | "" | Tenancy OCID. Required when specifying keySecretName. |
| server.oci_config.user | string | "" | User OCID. Required when specifying keySecretName. |
| server.oci_config.fingerprint | string | "" | Fingerprint. Required when specifying keySecretName. |
| server.oci_config.region | string | "" | Region. Required when oke is true. |
| server.oci_config.fileSecretName | string | "" | Secret containing an OCI config file and the key_file(s). Use the scripts/oci_config.py script to help create the secret based on an existing ~/.oci/config file |
| server.oci_config.keySecretName | string | "" | Secret containing a single API key corresponding to the above tenancy configuration. This is used by the OraOperator when not running in OKE |

Examples

OKE with Workload Identity Principals

  oci_config:
    oke: true
    region: "us-ashburn-1"

Secret generated using scripts/oci_config.py

  oci_config:
    fileSecretName: "oci-config-file"

Manual Configuration with Secret containing API Key

  oci_config:
    tenancy: "ocid1.tenancy.oc1.."
    user: "ocid1.user.oc1.."
    fingerprint: "e8:65:45:4a:85:4b:6c:.."
    region: "us-ashburn-1"
    keySecretName: "my-api-key"

Server 3rd-Party Model Settings

Configure 3rd-Party AI Models used by the AI Optimizer API Server. Create Kubernetes Secret(s) to hold the 3rd-Party API Keys.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| server.models.cohere | object | {"secretKey":"apiKey","secretName":""} | Cohere API Key |
| server.models.openAI | object | {"secretKey":"apiKey","secretName":""} | OpenAI API Key |
| server.models.perplexity | object | {"secretKey":"apiKey","secretName":""} | Perplexity API Key |
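For instance, to wire in an OpenAI key stored in a pre-created Secret (the Secret name here is an assumption; the secretKey matches the chart default):

```yaml
# Illustrative values.yaml override for a 3rd-party model API key.
server:
  models:
    openAI:
      secretName: "openai-api-key"  # hypothetical existing Secret
      secretKey: "apiKey"           # key inside the Secret (chart default)
```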

Client Settings

The client: section contains values that are used to configure the AI Optimizer frontend web client.

The frontend web client can be disabled by setting global.enableClient to false.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| client.replicaCount | int | 1 | Number of desired pod replicas for the Deployment when autoscaling is disabled |
| client.imagePullSecrets | list | [] | Secret name containing image pull secrets |
| client.image.repository | string | "localhost/ai-optimizer-client" | Image Repository |
| client.image.tag | string | "latest" | Image Tag |

Client Features Settings

Disable specific AI Optimizer features in the frontend web client.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| client.features.disableTestbed | bool | false | Disable the Test Bed |
| client.features.disableApi | bool | false | Disable the API Server Administration/Monitoring |
| client.features.disableTools | bool | false | Disable Tools such as Prompt Engineering and Split/Embed |
| client.features.disableDbCfg | bool | false | Disable Tools Database Configuration |
| client.features.disableModelCfg | bool | false | Disable Tools Model Configuration |
| client.features.disableOciCfg | bool | false | Disable OCI Configuration |
| client.features.disableSettings | bool | false | Disable the Import/Export of Settings |
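A sketch of a locked-down client, hiding the administration and configuration screens while leaving the Test Bed available:

```yaml
# Illustrative values.yaml override: restrict the web client to end-user
# features only (all keys default to false).
client:
  features:
    disableApi: true       # hide API Server Administration/Monitoring
    disableDbCfg: true     # hide Database Configuration
    disableModelCfg: true  # hide Model Configuration
    disableOciCfg: true    # hide OCI Configuration
```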

Ollama Settings

The ollama: section contains values that are used to automatically install Ollama and optionally pull models.

The Ollama functionality can be enabled by setting global.enableOllama to true.

It is recommended to enable this functionality only when you have access to a GPU worker node. Use the scheduling and resource constraints to ensure the Ollama resources run on that GPU.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| ollama.replicaCount | int | 1 | Number of desired pod replicas for the Deployment |
| ollama.image.repository | string | "docker.io/ollama/ollama" | Image Repository |
| ollama.image.tag | string | "latest" | Image Tag |
| ollama.models.enabled | bool | true | Enable automatic pulling of models |
| ollama.models.modelPullList | list | ["llama3.1","mxbai-embed-large"] | List of models to automatically pull |
| ollama.resources | object | {} | Requests and limits for the container. Often used to ensure the pod runs on a GPU worker |
| ollama.nodeSelector | object | {} | Constrain pods to specific nodes. Often used to ensure the pod runs on a GPU worker |
| ollama.affinity | object | {} | Rules for scheduling pods. Often used to ensure the pod runs on a GPU worker |
| ollama.tolerations | list | [] | For scheduling pods on tainted nodes. Often used to ensure the pod runs on a GPU worker |
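A sketch of pinning Ollama to a GPU worker using the scheduling values above (the node label and taint are assumptions; adjust them to match your cluster):

```yaml
# Illustrative values.yaml override: schedule Ollama on a GPU worker node.
ollama:
  models:
    enabled: true
    modelPullList:
      - llama3.1
      - mxbai-embed-large
  resources:
    limits:
      nvidia.com/gpu: 1     # request one GPU for the container
  nodeSelector:
    node-type: gpu          # hypothetical label applied to the GPU worker
  tolerations:
    - key: nvidia.com/gpu   # hypothetical taint on the GPU worker
      operator: Exists
      effect: NoSchedule
```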

Deploy

Once your values.yaml has been configured and you have a Kubernetes cluster available, deploy the Helm Chart:

  1. Add the Helm Repository:

    helm repo add ai-optimizer https://oracle-samples.github.io/ai-optimizer/helm
  2. Apply the values.yaml file:

    helm upgrade --install ai-optimizer \
      ai-optimizer/ai-optimizer \
      --namespace ai-optimizer \
      --values values.yaml

Kind Example

Give the Helm Chart a spin using a locally installed Kind for experimentation and development.

  1. Install Kind locally

    There are many ways to install Kind; refer to the official documentation for more information.

  2. Create a Cluster

    kind create cluster -n ai-optimizer
  3. Build the Images

    Build the AI Optimizer Images per the above instructions. There’s no need to tag or push them.

  4. Load the images into the Kind cluster

    kind load docker-image ai-optimizer-client:latest -n ai-optimizer
    kind load docker-image ai-optimizer-server:latest -n ai-optimizer
    Top Tip
    Pull and load the database and Ollama images before deploying the Helm Chart. This will speed up the deployment:
    
    podman pull docker.io/ollama/ollama:latest
    podman pull container-registry.oracle.com/database/free:latest
    
    kind load docker-image docker.io/ollama/ollama:latest -n ai-optimizer
    kind load docker-image container-registry.oracle.com/database/free:latest -n ai-optimizer
  5. (Optional) Configure for Oracle Cloud Infrastructure

    If you already have an OCI API configuration file, use the scripts/oci_config.py helper script to turn it into a secret for OCI connectivity:

    kubectl create namespace ai-optimizer
    python scripts/oci_config.py --namespace ai-optimizer

    Run the script's output to create the secret.

  6. Create a values-kind.yaml file

    OCI: Remove the server.oci_config specification if skipping the above optional step.

    server:
      replicaCount: 1
      image:
        repository: localhost/ai-optimizer-server
        tag: latest
      oci_config:
        fileSecretName: "oci-config-file"
      database:
        type: "SIDB-FREE"
        image:
          repository: container-registry.oracle.com/database/free
          tag: latest
    client:
      replicaCount: 1
      image:
        repository: localhost/ai-optimizer-client
        tag: latest
    ollama:
      enabled: true
      replicaCount: 1
      models:
        enabled: true
  7. Add the Helm Repository

    helm repo add ai-optimizer https://oracle-samples.github.io/ai-optimizer/helm
  8. Deploy the Helm Chart

    helm upgrade \
      --create-namespace \
      --namespace ai-optimizer \
      --install ai-optimizer ai-optimizer/ai-optimizer \
      --set global.api.apiKey="my-api-key" \
      --values ./values-kind.yaml
  9. Wait for all Pods to be “Running”

    kubectl -n ai-optimizer get all

    The Ollama pod may take some time as it pulls models.

  10. Create a port-forward to access the environment:

    kubectl -n ai-optimizer port-forward services/ai-optimizer-client-http 8501:80
  11. Open your browser to http://localhost:8501