Microservices

The Oracle AI Optimizer and Toolkit (the AI Optimizer) was specifically designed to run on infrastructure supporting microservices architecture, including Kubernetes.

Oracle Kubernetes Engine

The following example shows running the AI Optimizer in Oracle Kubernetes Engine (OKE). The Infrastructure as Code (IaC) provided in the source opentofu directory was used to provision the infrastructure in Oracle Cloud Infrastructure (OCI).
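
If provisioning the infrastructure yourself, a typical OpenTofu workflow from that directory is sketched below; variable values such as the region and compartment are assumed to be supplied interactively or through a .tfvars file, per the IaC documentation.

    cd opentofu
    tofu init      # download the required providers and modules
    tofu plan      # review the resources that will be created
    tofu apply     # provision the OKE cluster and supporting resources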


The command to connect to the OKE cluster will be output as part of the IaC.
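
The exact command is emitted by the IaC output; for reference, it generally takes the following form, with the cluster OCID and region as placeholders:

    oci ce cluster create-kubeconfig \
      --cluster-id <cluster_ocid> \
      --file $HOME/.kube/config \
      --region <region> \
      --token-version 2.0.0 \
      --kube-endpoint PUBLIC_ENDPOINT

    # Confirm access to the cluster
    kubectl get nodes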

Images

You will need to build the AI Optimizer container images and stage them in a container registry, such as the OCI Container Registry (OCIR).

  1. Build the AI Optimizer images:

    From the src/ directory of the source code:

    podman build --arch amd64 -f client/Dockerfile -t ai-optimizer-client:latest .
    
    podman build --arch amd64 -f server/Dockerfile -t ai-optimizer-server:latest .
  2. Log into your container registry:

    More information on authenticating to OCIR can be found in the OCI Container Registry documentation.

    podman login <registry-domain>

    Example:

    podman login iad.ocir.io

    You will be prompted for a username and a password; for OCIR, the password is an OCI auth token.

  3. Push the AI Optimizer images:

    More information on pushing images to OCIR can be found in the OCI Container Registry documentation.

    Example (the values for <client_repository> and <server_repository> are provided from the IaC):

    podman tag ai-optimizer-client:latest <client_repository>:latest
    podman push <client_repository>:latest
    
    podman tag ai-optimizer-server:latest <server_repository>:latest
    podman push <server_repository>:latest
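
    For illustration only, an OCIR repository path takes the form <region-key>.ocir.io/<tenancy-namespace>/<repository-name>, so a hypothetical push of the client image might look like:

    podman tag ai-optimizer-client:latest iad.ocir.io/<tenancy-namespace>/ai-optimizer-client:latest
    podman push iad.ocir.io/<tenancy-namespace>/ai-optimizer-client:latest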

Ingress

To access the AI Optimizer GUI and API Server, you can either use a port-forward or an Ingress service. For demonstration purposes, the OCI Native Ingress Controller, which was enabled on the OKE cluster as part of the IaC, will be used for public Ingress access.
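
As a minimal sketch of the port-forward option (usable only after the AI Optimizer has been deployed, see below), the Services can be reached without exposing the Load Balancer at all; the Service names vary with the Helm release, so discover them first:

    # List the Services created by the Helm chart
    kubectl get svc -n ai-optimizer

    # Forward a local port to a Service (substitute the actual name and port)
    kubectl port-forward -n ai-optimizer svc/<service_name> <local_port>:<service_port>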

The Flexible LoadBalancer was provisioned using the IaC. This example will create the Listeners and Backends to expose port 80 for the AI Optimizer GUI and port 8000 for the AI Optimizer API Server on the existing LoadBalancer.

It is HIGHLY recommended to protect these ports with Network Security Groups (NSGs).

The native_ingress.yaml manifest below has five values that should be supplied:

  • <lb_compartment_ocid> - OCID of the LoadBalancer Compartment
  • <lb_subnet_ocid> - OCID of the Subnet for the LoadBalancer
  • <lb_ocid> - OCID of the LoadBalancer provisioned by IaC
  • <lb_nsg_ocid> - NSG OCIDs to protect the Load Balancer ports
  • <lb_reserved_ip_ocid> - A reserved IP address for the Load Balancer

These will be output as part of the IaC; the reservedPublicAddressId and network-security-group-ids entries can be removed from the manifest if you are not reserving an IP or protecting the Load Balancer.

  1. Create a native_ingress.yaml:
    apiVersion: v1
    kind: Namespace
    metadata:
      name: ai-optimizer
    ---
    apiVersion: "ingress.oraclecloud.com/v1beta1"
    kind: IngressClassParameters
    metadata:
      name: native-ic-params
      namespace: ai-optimizer
    spec:
      compartmentId: <lb_compartment_ocid>
      subnetId: <lb_subnet_ocid>
      loadBalancerName: "ai-optimizer-lb"
      reservedPublicAddressId: <lb_reserved_ip_ocid>
      isPrivate: false
      maxBandwidthMbps: 1250
      minBandwidthMbps: 10
    ---
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: native-ic
      namespace: ai-optimizer
      annotations:
        ingressclass.kubernetes.io/is-default-class: "true"
        oci-native-ingress.oraclecloud.com/network-security-group-ids: <lb_nsg_ocid>
        oci-native-ingress.oraclecloud.com/id: <lb_ocid>
        oci-native-ingress.oraclecloud.com/delete-protection-enabled: "true"
    spec:
      controller: oci.oraclecloud.com/native-ingress-controller
      parameters:
        scope: Namespace
        namespace: ai-optimizer
        apiGroup: ingress.oraclecloud.com
        kind: IngressClassParameters
        name: native-ic-params
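  2. Apply the manifest after supplying the placeholder values:

    kubectl apply -f native_ingress.yaml

    # Confirm the IngressClass was registered
    kubectl get ingressclass native-ic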

The AI Optimizer

The AI Optimizer can be deployed using the Helm chart provided with the source: AI Optimizer Helm Chart. A list of all values can be found in values_summary.md.

If you deployed a GPU node pool as part of the IaC, you can deploy Ollama and enable a Large Language Model and an Embedding Model out-of-the-box.
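
As a quick optional check that the GPU pool is usable (the node name is a placeholder; the nvidia.com/gpu resource is advertised on GPU worker nodes):

    kubectl get nodes
    kubectl describe node <gpu_node_name> | grep nvidia.com/gpu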

  1. Create the ai-optimizer namespace, if it was not already created by the native_ingress.yaml above:

    kubectl create namespace ai-optimizer
  2. Create a secret to hold the API Key:

    kubectl create secret generic api-key \
      --from-literal=apiKey=$(openssl rand -hex 32) \
      --namespace=ai-optimizer
  3. Create a secret to hold the Database Authentication:

    The command has two values that should be supplied:

    • <adb_password> - Password for the ADB ADMIN User
    • <adb_service> - The Service Name (e.g. ADBDB_TP)
    kubectl create secret generic db-authn \
      --from-literal=username='ADMIN' \
      --from-literal=password='<adb_password>' \
      --from-literal=service='<adb_service>' \
      --namespace=ai-optimizer

    These will be output as part of the IaC.

    While the example shows the ADMIN user, it is advisable to create a new non-privileged database user.

  4. Create the values.yaml file for the Helm Chart:

    The values.yaml has four values that should be supplied:

    • <lb_reserved_ip> - A reserved IP address for the Load Balancer
    • <adb_ocid> - Autonomous Database OCID
    • <client_repository> - Full path to the repository for the AI Optimizer Image
    • <server_repository> - Full path to the repository for the API Server Image

    These will be output as part of the IaC.

    If using the IaC for OCI, it is not required to specify an ImagePullSecret as the cluster nodes are configured with the Image Credential Provider for OKE.

    global:
      api:
        secretName: "api-key"
    
    # -- API Server configuration
    server:
      enabled: true
      image:
        repository: <server_repository>
        tag: "latest"
    
      ingress:
        enabled: true
        className: native-ic
        annotations:
          nginx.ingress.kubernetes.io/upstream-vhost: "<lb_reserved_ip>"
          oci-native-ingress.oraclecloud.com/http-listener-port: "8000"
          oci-native-ingress.oraclecloud.com/protocol: TCP
    
      service:
        http:
          type: "NodePort"
    
      # -- Oracle Autonomous Database Configuration
      adb:
        enabled: true
        ocid: "<adb_ocid>"
        mtlsWallet: ""
        authN:
          secretName: "db-authn"
    
    client:
      enabled: true
      image:
        repository: <client_repository>
        tag: "latest"
    
      ingress:
        enabled: true
        className: native-ic
        annotations:
          nginx.ingress.kubernetes.io/upstream-vhost: "<lb_reserved_ip>"
          oci-native-ingress.oraclecloud.com/http-listener-port: "80"
          oci-native-ingress.oraclecloud.com/protocol: TCP
    
      service:
        http:
          type: "NodePort"
    
      features:
        disableTestbed: "false"
        disableApi: "false"
        disableTools: "false"
        disableDbCfg: "true"
        disableModelCfg: "false"
        disableOciCfg: "true"
        disableSettings: "true"
    
    ollama:
      enabled: true
      models:
        - llama3.1
        - mxbai-embed-large
      resources:
        limits:
          nvidia.com/gpu: 1
  5. Deploy the Helm Chart:

    From the helm/ directory:

    helm upgrade \
      --install ai-optimizer . \
      --namespace ai-optimizer \
      -f values.yaml
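  6. Verify the deployment; exact resource names depend on the chart values:

    # All pods in the ai-optimizer namespace should reach the Running state
    kubectl get pods -n ai-optimizer

    # Once the OCI Native Ingress Controller has reconciled the listeners and
    # backends, the Ingress resources report the Load Balancer address
    kubectl get ingress -n ai-optimizer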

Cleanup

To remove the AI Optimizer from the OKE Cluster:

  1. Uninstall the Helm Chart:

    helm uninstall ai-optimizer -n ai-optimizer
  2. Delete the ai-optimizer namespace:

    kubectl delete namespace ai-optimizer
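  3. If the native_ingress.yaml from the Ingress section was applied, delete those resources as well; the IngressClass is cluster-scoped and is not removed with the namespace:

    kubectl delete -f native_ingress.yaml --ignore-not-found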