1 - Installation

This describes the different installation options for Kiali.

1.1 - Quick Start

You can quickly install and try Kiali via one of the following two methods.

Install via Istio Addons

If you downloaded Istio, the easiest way to install and try Kiali is by running:

kubectl apply -f ${ISTIO_HOME}/samples/addons/kiali.yaml

To uninstall:

kubectl delete -f ${ISTIO_HOME}/samples/addons/kiali.yaml --ignore-not-found

Install via Helm

To install the latest version of Kiali Server using Helm, run the following command:

helm install \
  --namespace istio-system \
  --set auth.strategy="anonymous" \
  --repo https://kiali.org/helm-charts \
  kiali-server \
  kiali-server

To uninstall:

helm uninstall --namespace istio-system kiali-server

Access to the UI

Run the following command:

kubectl port-forward svc/kiali 20001:20001 -n istio-system

Then, access Kiali by visiting \https://localhost:20001/ in your preferred web browser.

1.2 - Installation Guide

The next few sections describe the different installation methods available for Kiali.

The recommended way to deploy Kiali is via the Kiali Operator, either using Helm Charts or OperatorHub.

The Kiali Operator is a Kubernetes Operator and manages your Kiali installation. It watches the Kiali Custom Resource (Kiali CR), a YAML file that holds the deployment configuration.

1.2.1 - Prerequisites

Service Mesh Compatibility

Each Kiali release is tested against the most recent Istio release. In general, Kiali tries to maintain compatibility with older Istio releases and Kiali versions later than those posted in the below table may work, but such combinations are not tested and will not be supported. Known incompatibilities are noted in the compatibility table below.

Istio Version Compatibility

Istio
Kiali
Notes
1.11 1.38.1 or later
1.10 1.34.1 to 1.37.x
1.9 1.29.1 to 1.33.x
1.8 1.26.0 to 1.28.x Istio 1.8 removes all support for mixer/telemetry V1, as does Kiali 1.26.0. Use earlier versions of Kiali for mixer support.
1.7 1.22.1 to 1.25.x Istio 1.7 istioctl will no longer install Kiali. Use the Istio samples/addons all-in-one yaml or the Kiali Helm Chart for quick demo installs. Istio 1.7 is out of support.
1.6 1.18.1 to 1.21.x Istio 1.6 introduces CRD and Config changes, Kiali 1.17 is recommended for Istio < 1.6.


Maistra Version Compatibility

Maistra
SMCP CR
Kiali
Notes
2.0 2.0 1.24 Using Maistra 2.0 to install service mesh control plane 2.0 requires Kiali Operator v1.36. Other operator versions are not compatible.
2.0 1.1 1.12 Using Maistra 2.0 to install service mesh control plane 1.1 requires Kiali Operator v1.36. Other operator versions are not compatible.
1.1 1.1 1.12 Using Maistra 1.1 to install service mesh control plane 1.1 requires Kiali Operator v1.36. Other operator versions are not compatible.
n/a 1.0 n/a Service mesh control plane 1.0 is out of support.


Browser Version Requirements

Kiali requires a modern web browser and supports the last two versions of Chrome, Firefox, Safari or Edge.

Hardware Requirements

Any machine capable of running a Kubernetes based cluster should also be able to run Kiali.

However, Kiali tends to grow in resource usage as your cluster grows. Usually the more namespaces and workloads you have in your cluster, the more memory you will need to allocate to Kiali.

Platform-specific requirements

OpenShift

If you are installing on OpenShift, you must grant the cluster-admin role to the user that is installing Kiali. If OpenShift is installed locally on the machine you are using, the following command should log you in as user system:admin which has this cluster-admin role:

$ oc login -u system:admin

Google Cloud Private Cluster

Private clusters on Google Cloud have network restrictions. Kiali needs your cluster’s firewall to allow access from the Kubernetes API to the Istio Control Plane namespace, for both the 8080 and 15000 ports.

To review the master access firewall rule:

gcloud compute firewall-rules list --filter="name~gke-${CLUSTER_NAME}-[0-9a-z]*-master"

To replace the existing rule and allow master access:

gcloud compute firewall-rules update <firewall-rule-name> --allow <previous-ports>,tcp:8080,tcp:15000

1.2.2 - Installation via Helm

Introduction

Helm is a popular tool that lets you manage Kubernetes applications. Applications are defined in a package named Helm chart, which contains all of the resources needed to run an application.

Kiali has a Helm Charts Repository at https://kiali.org/helm-charts. Two Helm Charts are provided:

  • The kiali-operator Helm Chart installs the Kiali operator which in turn installs Kiali when you create a Kiali CR.
  • The kiali-server Helm Chart installs a standalone Kiali without the need of the Operator nor a Kiali CR.

Although the kiali-server Helm Chart is actively maintained, it is only provided for convenience. The recommended method to install Kiali is by using the kiali-operator Helm Chart to install the Operator and then creating a Kiali CR to let the operator deploy Kiail.

Make sure you have the helm command available by following the Helm installation docs.

Adding the Kiali Helm Charts repository

Add the Kiali Helm Charts repository with the following command:

$ helm repo add kiali https://kiali.org/helm-charts

If you already added the repository, you may want to update your local cache to fetch latest definitions by running:

$ helm repo update

Installing Kiali using the Kiali operator

Once you’ve added the Kiali Helm Charts repository, you can install the latest Kiali Operator along with the latest Kiali server by running the following command:

$ helm install \
    --set cr.create=true \
    --set cr.namespace=istio-system \
    --namespace kiali-operator \
    --create-namespace \
    kiali-operator \
    kiali/kiali-operator

The --namespace kiali-operator and --create-namespace flags instructs to create the kiali-operator namespace (if needed), and deploy the Kiali operator on it. The --set cr.create=true and --set cr.namespace=istio-system flags instructs to create a Kiali CR in the istio-system namespace. Since the Kiali CR is created in advance, as soon as the Kiali operator starts, it will process it to deploy Kiali.

The Kiali Operator Helm Chart is configurable. Check available options and default values by running:

$ helm show values kiali/kiali-operator

The kiali-operator Helm Chart mirrors all settings of the Kiali CR as chart values that you can configure using regular --set flags. For example, the Kiali CR has a spec.namespace setting which you can configure in the kiali-operator Helm Chart by passing the --set cr.namespace flag as shown in the previous helm install example.

For more information about the Kiali CR, see the Creating and updating the Kiali CR page.

Operator-Only Install

To install only the Kiali Operator, omit the --set cr.create and --set cr.namespace flags of the helm command previously shown. For example:

$ helm install \
    --namespace kiali-operator \
    --create-namespace \
    kiali-operator \
    kiali/kiali-operator

This will omit creation of the Kiali CR, which you will need to link:https://v1-41.kiali.io/docs/installation/installation-guide/creating-updating-kiali-cr/[create later to install Kiali Server]. This option is good if you plan to do large customizations to the installation.

Installing Multiple Instances of Kiali

By installing a single Kiali operator in your cluster, you can install multiple instances of Kiali by simply creating multiple Kiali CRs. For example, if you have two Istio control planes in namespaces istio-system and istio-system2, you can create a Kiali CR in each of those namespaces to install a Kiali instance in each control plane.

If you wish to install multiple Kiali instances in the same namespace, or if you need the Kiali instance to have different resource names than the default of kiali, you can specify spec.deployment.instance_name in your Kiali CR. The value for that setting will be used to create a unique instance of Kiali using that instance name rather than the default kiali. One use-case for this is to be able to have unique Kiali service names across multiple Kiali instances in order to be able to use certain routers/load balancers that require unique service names.

Standalone Kiali installation

To install the Kiali Server without the operator, use the kiali-server Helm Chart:

$ helm install \
    --namespace istio-system \
    kiali-server \
    kiali/kiali-server

The kiali-server Helm Chart mirrors all settings of the Kiali CR as chart values that you can configure using regular --set flags. For example, the Kiali CR has a spec.server.web_fqdn setting which you can configure in the kiali-server Helm Chart by passing the --set server.web_fqdn flag as follows:

$ helm install \
    --namespace istio-system \
    --set server.web_fqdn=example.com \
    kiali-server \
    kiali/kiali-server

Upgrading Helm installations

If you want to upgrade to a newer Kiali version (or downgrade to older versions), you can use the regular helm upgrade commands. For example, the following command should upgrade the Kiali Operator to the latest version:

$ helm upgrade \
    --namespace kiali-operator \
    --reuse-values \
    kiali-operator \
    kiali/kiali-operator

WARNING: No migration paths are provided. However, Kiali is a stateless application and if the helm upgrade command fails, please uninstall the previous version and then install the new desired version.

Managing configuration of Helm installations

After installing either the kiali-operator or the kiali-server Helm Charts, you may be tempted to manually modify the created resources to modify the installation. However, we recommend using helm upgrade to update your installation.

For example, assuming you have the following installation:

$ helm list -n kiali-operator
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
kiali-operator  kiali-operator  1               2021-09-14 18:00:45.320351026 -0500 CDT deployed        kiali-operator-1.40.0   v1.40.0

Notice that the current installation is version 1.40.0 of the kiali-operator. Let’s assume you want to use your own mirrors of the Kiali Operator container images. You can update your installation with the following command:

$ helm upgrade \
    --namespace kiali-operator \
    --reuse-values \
    --set image.repo=your_mirror_registry_url/owner/kiali-operator-repo \
    --set image.tag=your_mirror_tag \
    --version 1.40.0 \
    kiali-operator \
    kiali/kiali-operator

Uninstalling

Removing the Kiali operator and managed Kialis

If you used the kiali-operator Helm chart, first you must ensure that all Kiali CRs are deleted. For example, the following command will agressively delete all Kiali CRs in your cluster:

$ kubectl delete kiali --all --all-namespaces

The previous command may take some time to finish while the Kiali operator removes all Kiali installations.

Then, remove the Kiali operator using a standard helm uninstall command. For example:

$ helm uninstall --namespace kiali-operator kiali-operator
$ kubectl delete crd kialis.kiali.io

Known problem: uninstall hangs (unable to delete the Kiali CR)

Typically this happens if not all Kiali CRs are deleted prior to uninstalling the operator. To force deletion of a Kiali CR, you need to clear its finalizer. For example:

$ kubectl patch kiali kiali -n istio-system -p '{"metadata":{"finalizers": []}}' --type=merge

Removing standalone Kiali

If you installed a standalone Kiali by using the kiali-server Helm chart, use the standard helm uninstall commands. For example:

$ helm uninstall --namespace istio-system kiali-server

1.2.3 - Installation via OperatorHub

Introduction

The OperatorHub is a website that contains a catalog of Kubernetes Operators. Its aim is to be the central location to find Operators.

The OperatorHub relies in the Operator Lifecycle Manager (OLM) to install, manage and update Operators on any Kubernetes cluster.

The Kiali Operator is being published to the OperatorHub. So, you can use the OLM to install and manage the Kiali Operator installation.

Installing the Kiali Operator using the OLM

Go to the Kiali Operator page in the OperatorHub: https://operatorhub.io/operator/kiali.

You will see an Install button at the right of the page. Press it and you will be presented with the installation instructions. Follow these instructions to install and manage the Kiali Operator installation using OLM.

Afterwards, you can create the Kiali CR to install Kiali.

Installing the Kiali Operator in OpenShift

The OperatorHub is bundled in the OpenShift console. To install the Kiali Operator, simply go to the OperatorHub in the OpenShift console and search for the Kiali Operator. Then, click on the Install button and follow the instruction on the screen.

Afterwards, you can create the Kiali CR to install Kiali.

1.2.4 - Creating and updating the Kiali CR

The Kiali Operator watches the Kiali Custom Resource (Kiali CR), a YAML file that holds the deployment configuration. Creating, updating, or removing a Kiali CR will trigger the Kiali Operator to install, update, or remove Kiali.

The Operator provides comprehensive defaults for all properties of the Kiali CR. Hence, the minimal Kiali CR does not have a spec:

apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali

Assuming you saved the previous YAML to a file named my-kiali-cr.yaml, and that you are installing Kiali in the same default namespace as Istio, create the resource with the following command:

$ kubectl apply -f my-kiali-cr.yaml -n istio-system

Once created, the Kiali Operator should shortly be notified and will process the resource, performing the Kiali installation. You can check installation progress by inspecting the status attribute of the created Kiali CR:

$ kubectl describe kiali -n istio-system
Name:         kiali
Namespace:    istio-system
Labels:       <none>
Annotations:  <none>
API Version:  kiali.io/v1alpha1
Kind:         Kiali

  (...some output is removed...)

Status:
  Conditions:
    Last Transition Time:  2021-09-15T17:17:40Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
  Deployment:
    Instance Name:  kiali
    Namespace:      istio-system
  Environment:
    Is Kubernetes:       true
    Kubernetes Version:  1.21.2
    Operator Version:    v1.40.0
  Progress:
    Duration:  0:00:16
    Message:   5. Creating core resources
Events:        <none>

We recommended to download the example Kiali CR YAML file that is available in the Operator’s GitHub repository. This example file contains and describes all available settings. Then, edit the downloaded file being very careful to maintain proper formatting. Incorrect indentation is a common problem!

Once you created a Kiali CR, you can manage your Kiali installation by editing the resource using the usual Kubernetes tools:

$ kubectl edit kiali kiali -n istio-system

1.2.5 - Accessing and exposing Kiali

Introduction

After Kiali is succesfully installed you will need to make Kiali accessible to users. This page describes some popular methods of exposing Kiali for use.

If exposing Kiali in a custom way, you may need to set some configurations to make Kiali aware of how users will access Kiali.

Accessing Kiali using port forwarding

You can use port-forwarding to access Kiali by running any of these commands:

# If you have oc command line tool
oc port-forward svc/kiali 20001:20001 -n istio-system
# If you have kubectl command line tool
kubectl port-forward svc/kiali 20001:20001 -n istio-system

These commands will block. Access Kiali by visiting \https://localhost:20001/ in your preferred web browser.

Accessing Kiali through an Ingress

By default, the installation exposes Kiali creating an Ingress resource. Find out your Ingress IP or domain name and use it to access Kiali by visiting this URL in your web browser: https://_your_ingress_ip_or_domain_/kiali.

To find your Ingress IP or domain name, as per the Kubernetes documentation, try the following command (doesn’t work if using Minikube):

kubectl get ingress kiali -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

If it doesn’t work, unfortunately, it depends on how and where you had setup your cluster. There are several Ingress controllers available and some cloud providers have their own controller or preferred exposure method. Please, check the documentation of your cloud provider. You may need to customize the pre-installed Ingress rule, or to expose Kiali using a different method.

Customizing the Ingress resource

By default, a catch-all Ingress resource is created to route traffic to Kiali. Most likely, you will need a more specific Ingress resource that routes traffic to Kiali only on a specific domain or path. To do this, you can specify route settings.

Alternatively, and for more advanced Ingress configurations, you can provide your own Ingress declaration in the Kiali CR. For example:

spec:
  deployment:
    override_ingress_yaml:
      metadata:
        annotations:
          nginx.ingress.kubernetes.io/secure-backends: "true"
          nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
      spec:
        rules:
        - http:
            paths:
            - path: /kiali
              backend:
                serviceName: kiali
                servicePort: 20001

Accessing Kiali in Minikube

If you [enabled the Ingress controller]https://kubernetes.io/docs/tasks/access-application-cluster/ingress-minikube/#enable-the-ingress-controller, the default Ingress resource created by the installation (mentioned in the previous section) should be enough to access Kiali. The following command should open Kiali in your default web browser:

xdg-open https://$(minikube ip)/kiali

Accessing Kiali through a LoadBalancer or a NodePort

By default, the Kiali service is created with the ClusterIP type. To use a LoadBalancer or a NodePort, you can change the service type in the Kiali CR as follows:

spec:
  deployment:
    service_type: LoadBalancer

Once the Kiali operator to updates the installation, you should be able to use the kubectl get svc -n istio-system kiali command to retrieve the external address (or port) to access Kiali. For example, in the following output Kiali is assigned the IP 192.168.49.201, which means that you can access Kiali by visiting \http://192.168.49.201:20001 in a browser:

NAME    TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)                          AGE
kiali   LoadBalancer   10.105.236.127   192.168.49.201   20001:31966/TCP,9090:30128/TCP   34d

Accessing Kiali through an Istio Ingress Gateway

If you want to take advantage of Istio’s infrastructure, you can expose Kiali using an Istio Ingress Gateway. The Istio documentation provides a good guide explaining how to expose the sample add-ons. Even if the Istio guide is focused on the sample add-ons, the steps are the same to expose a Kiali installed using this Installation guide.

Accessing Kiali in OpenShift

By default, installation exposes Kiali through a Route. The following command should open Kiali in your default web browser:

xdg-open https://$(oc get routes -n istio-system kiali -o jsonpath='{.spec.host}')/console

Specifying route settings

Either if you are using your own exposure method, or if you are using one of the methods mentioned in this page, you may need to configure the route that is being used to access Kiali.

In the Kiali CR, route settings are broken in several attributes. For example, to specify that Kiali is being accessed under the \https://apps.example.com:8080/dashboards/kiali URI, you would need to set the following:

spec:
  server:
    web_fqdn: apps.example.com
    web_port: 8080
    web_root: /dashboards/kiali
    web_schema: https

If you are letting the installation to create an Ingress resource for you, the Ingress will be adjusted to match these route settings. If you are using your own exposure method, this is only making Kiali aware about what is its public endpoint.

It is possible to omit these settings and Kiali may be able to discover some of these configurations, depending on your exposure method. For example, if you are exposing Kiali via LoadBalancer or NodePort service types, Kiali can discover most of these settings. If you are using some kind of Ingress, Kiali will honor X-Forwarded-Proto, X-Forwarded-Host and X-Forwarded-Port HTTP headers if they are properly injected in the request.

The web_root receives special treatment, because this is the path where Kiali will serve itself (both the user interface and its api). This is useful if you are serving multiple applications under the same domain. It must begin with a slash and trailing slashes must be omitted. The default value is /kiali for Kubernetes and / for OpenShift.

Configuring listening ports

It is possible to configure the listening ports of the Kiali service to use your preferred ones:

spec:
  server:
    port: 80 # Main port for accessing Kiali
    metrics_port: 8080

1.2.6 - Advanced installation options

Multiple Istio control planes in the same cluster

Currently, Kiali can manage only one Istio control plane. However, there are certain cases where you may have more than one Istio control plane in your cluster. One of such cases is when performing a Canary upgrade of Istio.

In these cases, you will need to configure in Kiali which control plane you want to manage. This is done by configuring the name of the components of the control plane. This is configured in the Kiali CR and the default values are the following:

spec:
  external_services:
    istio:
      config_map_name: "istio"
      istiod_deployment_name: "istiod"
      istio_sidecar_injector_config_map_name: "istio-sidecar-injector"

If you want to manage both Istio control planes, simply install two Kiali instances and point each one to a different Istio control plane.

Installing a Kiali Server of a different version than the Operator

When you install the Kiali Operator, it will be configured to install a Kiali Server that is the same version as the operator itself. For example, if you have Kiali Operator v1.34.0 installed, that operator will install Kiali Server v1.34.0. If you upgrade (or downgrade) the Kiali Operator, the operator will in turn upgrade (or downgrade) the Kiali Server.

There are certain use-cases in which you want the Kiali Operator to install a Kiali Server whose version is different than the operator version. Read the following section «Using a custom image registry» section to learn how to configure this setup.

Using a custom image registry

Kiali is released and published to the Quay.io container image registry. There is a repository hosting the Kiali operator images and another one for the Kiali server images.

If you need to mirror the Kiali container images to some other registry, you still can use Helm to install the Kiali operator as follows:

$ helm install \
    --namespace kiali-operator \
    --create-namespace \
    --set image.repo=your.custom.registry/owner/kiali-operator-repo
    --set image.tag=your_custom_tag
    --set allowAdHocKialiImage=true
    kiali-operator \
    kiali/kiali-operator

Then, when creating the Kiali CR, use the following attributes:

spec:
  deployment:
    image_name: your.custom.registry/owner/kiali-server-repo
    image_version: your_custom_tag

Development Install

This option installs the latest Kiali Operator and Kiali Server images which are built from the master branches of Kiali GitHub repositories. This option is good for demo and development installations.

helm install \
  --set cr.create=true \
  --set cr.namespace=istio-system \
  --set cr.spec.deployment.image_version=latest \
  --set image.tag=latest \
  --namespace kiali-operator \
  -- create-namespace \
  kiali-operator \
  kiali/kiali-operator

2 - Features

An overview of important Kiali features.

2.1 - Topology

How Kiali visualizes the mesh topology.

Kiali offers multiple ways for users to examine their mesh Topology. Each combines several information types to help users quickly evaluate the health of their service architecture.

Overview

Kiali’s default page is the topology Overview. It presents a high-level view of the namespaces accessible to Kiali, for this user. It combines service and application information, along with telemetry, validations and health, to provide a holistic summary of system behvior. The Overview page provides numerous filtering, sorting and presentation options. From here users can perform namespace-level Actions, or quickly navigate to more detailed views.

Topology namespace overview

Graph

The Kiali Graph offers a powerful visualization of your mesh traffic. The topology combines real-time request traffic with your Istio configuration information to present immediate insight into the behavior of service mesh, allowing you to quickly pinpoint issues. Multiple Graph Types allow you to visualize traffic as a high-level service topology, a low-level workload topology, or as an application-level topology.

Graph nodes are decorated with a variety of information, pointing out various route routing options like virtual services and service entries, as well as special configuration like fault-injection and circuit breakers. It can identify mTLS issues, latency issues, error traffic and more. The Graph is highly configurable, can show traffic animation, and has powerful Find and Hide abilities.

You can configure the graph to show the namespaces and data that are important to you, and display it in the way that best meets your needs.

Topology graph

Health

Colors in the graph represent the health of your service mesh. A node colored red or orange might need attention. The color of an edge between components represents the health of the requests between those components. The node shape indicates the type of component such as services, workloads, or apps.

The health of nodes and edges is refreshed automatically based on the user’s preference. The graph can also be paused to examine a particular state, or replayed to re-examine a particular time period.

Topology graph health

Side-Panel

The collapsible side-panel summarizes the current graph selection, or the graph as a whole. A single-click will select the node, edge, or box of interest. The side panel provides:

  • Charts showing traffic and response times.
  • Health details.
  • Links to fully-detailed pages.
  • Response Code and Host breakdowns.
  • Traces involving the selected component.

Topology graph side-panel app Topology graph side-panel service Topology graph side-panel workload

Node Detail

A single-click selects a graph node. A double-click drills in to show the node’s Detail Graph. The node detail graph visualizes traffic from the point-of-view of that node, meaning it shows only the traffic reported by that node’s Istio proxy.

You can return back to the main graph, or double-click to change to a different node’s detail graph.

Topology graph node detail

Traffic Animation

Kiali offers several display options for the graph, including traffic animation.

For HTTP traffic, circles represent successful requests while red diamonds represent errors. The more dense the circles and diamonds the higher the request rate. The faster the animation the faster the response times.

TCP traffic is represented by offset circles where the speed of the circles indicates the traffic speed.

Topology graph animation

Graph Types

Kiali offers four different traffic-graph renderings:

  • The workload graph provides the a detailed view of communication between workloads.

  • The app graph aggregates the workloads with the same app labeling, which provides a more logical view.

  • The versioned app graph aggregates by app, but breaks out the different versions providing traffic breakdowns that are version-specific.

  • The service graph provides a high-level view, which aggregates all traffic for defined services.

Topology graph type workload Topology graph type app Topology graph type versioned app Topology graph service

Replay

Graph replay allows you to replay traffic from a selected past time-period. This gives you a chance to thoroughly examine a time period of interest, or share it with a co-worker. The graph is fully bookmarkable, including replay.


Operation Nodes

Istio v1.6 introduced Request Classification. This powerful feature allows users to classify requests into aggregates, called “Operations” by convention, to better understand how a service is being used. If configured in Istio the Kiali graph can show these as Operation nodes. The user needs only to enable the “Operation Nodes” display option. Operations can span services, for example, “VIP” may be configured for both CarRental and HotelRental services. To see total “VIP” traffic then display operation nodes without service nodes. To see “VIP” traffic specific to each service then also enable the “Service Nodes” display option.

When selected, an Operation node also provides a side-panel view. And when double-clicked a node detail graph is also provided.

Because operation nodes represent aggregate traffic they are not compatible with Service graphs, which themselves are already logical aggregates. For similar reasons response time information is not available on edges leading into or out of operation nodes. But by selecting the edge the response time information is available in the side panel (if configured).

Operation nodes are represented as pentagons in the Kiali graph:

Topology graph operation

2.2 - Health

Health

Colors in the graph represent the health of your service mesh. A node colored red or orange might need attention. The color of an edge between components represents the health of the requests between those components. The node shape indicates the type of component such as services, workloads, or apps.

The health of nodes and edges is refreshed automatically based on the user’s preference. The graph can also be paused to examine a particular state, or replayed to re-examine a particular time period.

Visualize the health of your mesh

Health Configuration

Kiali calculates health by combining the individual health of several indicators, such as pods and request traffic. The global health of a resource reflects the most severe health of its indicators.

Health Indicators

The table below lists the current health indicators and whether the indicator supports custom configuration for its health calculation.

Indicator Supports Configuration
Pod Status No
Traffic Health Yes

Icons and colors

Kiali use icons and colors to indicate the health of resources and associated request traffic.

  • No Health Information (NA)
  • Healthy
  • Degraded
  • Failure

Default Values

Request Traffic

By default Kiali uses the traffic rate configuration shown below. Application errors have minimal tolerance while client errors have a higher tolerance reflecting that some level of client errors is often normal (e.g. 404 Not Found):

  • For http protocol 4xx are client errors and 5xx codes are application errors.
  • For grpc protocol all 1-16 are errors (0 is success).

So, for example, if the rate of application errors is >= 0.1% kiali will show Degraded health and if > 10% will show Failure health.

# ...
  health_config:
    rate:
      - namespace: ".*"
        kind: ".*"
        name: ".*"
        tolerance:
          - code: "^5\\d\\d$"
            direction: ".*"
            protocol: "http"
            degraded: 0
            failure: 10
          - code: "^4\\d\\d$"
            direction: ".*"
            protocol: "http"
            degraded: 10
            failure: 20
          - code: "^[1-9]$|^1[0-6]$"
            direction: ".*"
            protocol: "grpc"
            degraded: 0
            failure: 10
# ...

Configuration

Custom health configuration is specified in the Kiali CR. To see the supported configuration syntax for health_config visit Kiali CR.

Kiali applies the first matching rate configuration (namespace, kind, etc) and calculates the status for each tolerance. The reported health will be the status with highest priority (see below).

Rate OptionDefinitionDefault
namespaceMatching Namespaces (regex).* (match all)
kindMatching Resource Types (workload|app|service) (regex).* (match all)
nameMatching Resource Names (regex).* (match all)
ToleranceArray of tolerances to apply.
Tolerance Option Definition Default
code Matching Response Status Codes (regex) [1] required
direction Matching Request Directions (inbound|outbound) (regex) .* (match all)
protocol Matching Request Protocols (http|grpc) (regex) .* (match all)
degraded Degraded Threshold(% matching requests >= value) 0
failure Failure Threshold (% matching requests >= value) 0

[1] The status code typically depends on the request protocol. The special code -, a single dash, is used for requests that don’t receive a response, and therefore no response code.

Kiali reports traffic health with the following top-down status priority :

Priority Rule (value=% matching requests) Status
1 value >= FAILURE threshold FAILURE
2 value >= DEGRADED threshold AND value < FAILURE threshold DEGRADED
3 value > 0 AND value < DEGRADED threshold HEALTHY
4 value = 0 HEALTHY
5 No traffic No Health Information

Examples

These examples use the repo _https://github.com/kiali/demos/tree/master/error-rates_.

In this repo we can see 2 namespaces: alpha and beta (Demo design).

Alpha

Where nodes return the responses (You can configure responses here):

App (alpha/beta) Code Rate
x-server 200 9
x-server 404 1
y-server 200 9
y-server 500 1
z-server 200 8
z-server 201 1
z-server 201 1

The applied traffic rate configuration is:
# ...
health_config:
  rate:
   - namespace: "alpha"
     tolerance:
       - code: "404"
         failure: 10
         protocol: "http"
       - code: "[45]\\d[^\\D4]"
         protocol: "http"
   - namespace: "beta"
     tolerance:
       - code: "[4]\\d\\d"
         degraded: 30
         failure: 40
         protocol: "http"
       - code: "[5]\\d\\d"
         protocol: "http"
# ...

After Kiali adds default configuration we have the following (Debug Info Kiali):

{
  "healthConfig": {
    "rate": [
      {
        "namespace": "/alpha/",
        "kind": "/.*/",
        "name": "/.*/",
        "tolerance": [
          {
            "code": "/404/",
            "degraded": 0,
            "failure": 10,
            "protocol": "/http/",
            "direction": "/.*/"
          },
          {
            "code": "/[45]\\d[^\\D4]/",
            "degraded": 0,
            "failure": 0,
            "protocol": "/http/",
            "direction": "/.*/"
          }
        ]
      },
      {
        "namespace": "/beta/",
        "kind": "/.*/",
        "name": "/.*/",
        "tolerance": [
          {
            "code": "/[4]\\d\\d/",
            "degraded": 30,
            "failure": 40,
            "protocol": "/http/",
            "direction": "/.*/"
          },
          {
            "code": "/[5]\\d\\d/",
            "degraded": 0,
            "failure": 0,
            "protocol": "/http/",
            "direction": "/.*/"
          }
        ]
      },
      {
        "namespace": "/.*/",
        "kind": "/.*/",
        "name": "/.*/",
        "tolerance": [
          {
            "code": "/^5\\d\\d$/",
            "degraded": 0,
            "failure": 10,
            "protocol": "/http/",
            "direction": "/.*/"
          },
          {
            "code": "/^4\\d\\d$/",
            "degraded": 10,
            "failure": 20,
            "protocol": "/http/",
            "direction": "/.*/"
          },
          {
            "code": "/^[1-9]$|^1[0-6]$/",
            "degraded": 0,
            "failure": 10,
            "protocol": "/grpc/",
            "direction": "/.*/"
          }
        ]
      }
    ]
  }
}

What are we applying?

  • For namespace alpha, all resources

  • Protocol http if % requests with error code 404 are >= 10 then FAILURE, if they are > 0 then DEGRADED

  • Protocol http if % requests with others error codes are> 0 then FAILURE.

  • For namespace beta, all resources

  • Protocol http if % requests with error code 4xx are >= 40 then FAILURE, if they are >= 30 then DEGRADED

  • Protocol http if % requests with error code 5xx are > 0 then FAILURE

  • For other namespaces kiali apply defaults.

  • Protocol http if % requests with error code 5xx are >= 20 then FAILURE, if they are >= 0.1 then DEGRADED

  • Protocol grpc if % requests with error code match /^[1-9]$|^1[0-6]$/ are >= 20 then FAILURE, if they are >= 0.1 then DEGRADED

Alpha Beta

2.3 - Detail Views

Kiali provides list and detail views for your mesh components.

Kiali provides filtered list views of all your service mesh definitions. Each view provides health, details, YAML definitions and links to help you visualize your mesh. There are list and detail views for:

  • Applications
  • Istio Configuration
  • Services
  • Workloads

Detail list apps Detail list service Detail list workload Detail list Istio config

Selecting an object from the list will bring you to its detail page. For Istio Config, Kiali will present its YAML, along with contextual validation information. Other mesh components present a variety of Tabs.

Overview Tab

Overview is the default Tab for any detail page. The overview tab provides detailed information, including health status, and a detailed mini-graph of the current traffic involving the component. The full set of tabs, as well as the detailed information, varies based on the component type.

Each Overview provides:

  • links to related components and linked Istio configuration.
  • health status.
  • validation information.
  • an Action menu for actions that can be taken on the component.

And also type-specfic information. For example:

  • Service detail includes Network information.
  • Workload detail provides backing Pod information.

Detail overview app Detail overview service Detail overview workload

Both Workload and Service detail can be customized to some extent, by adding additional details supplied as annotations. This is done through the additional_display_details field in the Kiali CR.

Detail overview additional details

Traffic

The Traffic Tab presents a service, app, or workload’s Inbound and Outbound traffic in a table-oriented way:

Detail traffic

Logs

Workload detail offers a special Logs tab. Kiali offers a special unified logs view that lets users correlate app and proxy logs. It can also add-in trace span information to help identify important traces based on associated logging. More powerful features include substring or regex Show/Hide, full-screen, and the ability to set proxy log level without a pod restart.

Detail logs

Metrics

Each detail view provides Inbound Metrics and/or Outbound Metrics tabs, offering predefined metric dashboards. The dashboards provided are tailored to the relevant application, workload or service level. Application and workload detail views show request and response metrics such as volume, duration, size, or tcp traffic. The service detail view shows request and response metrics for inbound traffic only.

Kiali allows the user to customize the charts by choosing the charted dimensions. It can also present metrics reported by either source or destination proxy metrics. And for troublshooting it can overlay trace spans.

Detail metric inbound Detail metrics outbound

Traces

Each detail view provides a Traces tab with a native integration with Jaeger. For more, see Tracing.

Dashboards

Kiali will display additional tabs for each applicable Built-In Dashboard or Custom Dashboard.

Built-in dashboards

Kiali comes with built-in dashboards for several runtimes, including Envoy, Go, Node.js, and others.

Envoy

The most important built-in dashboard is for Envoy. Kiali offers the Envoy tab for many workloads. The Envoy tab is actually a Built-In Dashboard, but it is very common as it applies to any workload injected with, or that is itself, an Envoy proxy. Being able to inspect the Envoy proxy is invaluable when troublshooting your mesh. The Envoy tab itself offers five subtabs, exposing a wealth of information.

Detail Envoy

Istio’s Envoy sidecars supply some internal metrics, that can be viewed in Kiali. They are different than the metrics reported by Istio Telemetry, which Kiali uses extensively. Some of Envoy’s metrics may be redundant.

Note that the enabled Envoy metrics can be tuned, as explained in the Istio documentation: it’s possible to get more metrics using the statsInclusionPrefixes annotation. Make sure you include cluster_manager and listener_manager as they are required.

For example, sidecar.istio.io/statsInclusionPrefixes: cluster_manager,listener_manager,listener will add listener metrics for more inbound traffic information. You can then customize the Envoy dashboard of Kiali according to the collected metrics.

Go

Contains metrics such as the number of threads, goroutines, and heap usage. The expected metrics are provided by the Prometheus Go client.

Example to expose built-in Go metrics:

        http.Handle("/metrics", promhttp.Handler())
        http.ListenAndServe(":2112", nil)

As an example and for self-monitoring purpose Kiali itself exposes Go metrics.

The pod annotation for Kiali is: kiali.io/dashboards: go

Node.js

Contains metrics such as active handles, event loop lag, and heap usage. The expected metrics are provided by prom-client.

Example of Node.js metrics for Prometheus:

const client = require('prom-client');
client.collectDefaultMetrics();
// ...
app.get('/metrics', (request, response) => {
  response.set('Content-Type', client.register.contentType);
  response.send(client.register.metrics());
});

Full working example: https://github.com/jotak/bookinfo-runtimes/tree/master/ratings

The pod annotation for Kiali is: kiali.io/dashboards: nodejs

Quarkus

Contains JVM-related, GC usage metrics. The expected metrics can be provided by SmallRye Metrics, a MicroProfile Metrics implementation. Example with maven:

    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-smallrye-metrics</artifactId>
    </dependency>

The pod annotation for Kiali is: kiali.io/dashboards: quarkus

Spring Boot

Three dashboards are provided: one for JVM memory / threads, another for JVM buffer pools and the last one for Tomcat metrics. The expected metrics come from Spring Boot Actuator for Prometheus. Example with maven:

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
      <groupId>io.micrometer</groupId>
      <artifactId>micrometer-core</artifactId>
    </dependency>
    <dependency>
      <groupId>io.micrometer</groupId>
      <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>

Full working example: https://github.com/jotak/bookinfo-runtimes/tree/master/details

The pod annotation for Kiali with the full list of dashboards is: kiali.io/dashboards: springboot-jvm,springboot-jvm-pool,springboot-tomcat

By default, the metrics are exposed on path /actuator/prometheus, so it must be specified in the corresponding annotation: prometheus.io/path: "/actuator/prometheus"

Thorntail

Contains mostly JVM-related metrics such as loaded classes count, memory usage, etc. The expected metrics are provided by the MicroProfile Metrics module. Example with maven:

    <dependency>
      <groupId>io.thorntail</groupId>
      <artifactId>microprofile-metrics</artifactId>
    </dependency>

Full working example: https://github.com/jotak/bookinfo-runtimes/tree/master/productpage

The pod annotation for Kiali is: kiali.io/dashboards: thorntail

Vert.x

Several dashboards are provided, related to different components in Vert.x: HTTP client/server metrics, Net client/server metrics, Pools usage, Eventbus metrics and JVM. The expected metrics are provided by the vertx-micrometer-metrics module. Example with maven:

    <dependency>
      <groupId>io.vertx</groupId>
      <artifactId>vertx-micrometer-metrics</artifactId>
    </dependency>
    <dependency>
      <groupId>io.micrometer</groupId>
      <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>

Init example of Vert.x metrics, starting a dedicated server (other options are possible):

      VertxOptions opts = new VertxOptions().setMetricsOptions(new MicrometerMetricsOptions()
          .setPrometheusOptions(new VertxPrometheusOptions()
              .setStartEmbeddedServer(true)
              .setEmbeddedServerOptions(new HttpServerOptions().setPort(9090))
              .setPublishQuantiles(true)
              .setEnabled(true))
          .setEnabled(true));

Full working example: https://github.com/jotak/bookinfo-runtimes/tree/master/reviews

The pod annotation for Kiali with the full list of dashboards is: kiali.io/dashboards: vertx-client,vertx-server,vertx-eventbus,vertx-pool,vertx-jvm

Custom Dashboards

When the built-in dashboards don’t offer what you need, it’s possible to create your own. See Custom Dashboard Configuration for more information.

2.4 - Tracing

How Kiali integrates Distributed Tracing with Jaeger.

Kiali offers a native integration with Jaeger for Distributed Tracing. As such, users can access Jaeger’s trace visualizations. But more than that, Kiali incorporates tracing into several correlated views, making your investment in trace data even more valuable.

For a quick glimpse at Kiali tracing features, see below. For a detailed explanation of tracing in Kiali, see this 3-part Trace my mesh blog series,


Workload detail

When investigating a workload, click the Traces tab to visualize your traces in a chart. When selecting a trace Kiali presents a tab for trace detail, and a tab for span details. Kiali always tries to surface problem areas, Kiali uses a heatmap approach to help the user identify problem traces or spans.

Trace detail Span detail

Metric Correlation

Kiali offers span overlays on Metric charts. The user can simply enable the spans option to generate the overlay. Clicking any span will navigate back to the Traces tab, focused on the trace of interest.

Metrics with Tracing

Graph Correlation

Kiali users often use the Graph Feature to visualize their mesh traffic. In the side panel, When selecting a graph node, the user will be presented with a Traces tab, which lists available traces for the time period. When selecting a trace the graph will display an overlay for the trace’s spans. And the side panel will display span details and offer links back to the trace detail views.

Graph with Tracing

Logs Correlation

Kiali works to correlate the standard pillars of observability: traces, metrics and logs. Kiali can present a unified view of log and trace information, in this way letting users use logs to identify traces of interest. When enabling the spans option Kiali adds trace entries to the workload logs view. Below, in time-sorted order, the user is presented with a unified view of application logs (in white), Envoy proxy logs (in gold), and trace spaces (in blue). Clicking a span of interest brings you to the detail view for the trace of interest.

Logs with Tracing

2.5 - Validations

Validations Overview

Kiali performs a set of validations to the most common Istio Objects such as Destination Rules, Service Entries, and Virtual Services. Those validations are done in addition to the existing ones performed by Istio’s Galley component. Most validations are done inside a single namespace only, any exceptions, such as gateways, are properly documented.

Galley validations are mostly syntactic validations based on the object syntax analysis of Istio objects while Kiali validations are mostly semantic validations between different Istio objects. Kiali validations are based on the runtime status of your service mesh, Galley validations are static ones and doesn’t take into account what is configured in the mesh.

Istio Config Validation

Check the complete list of validations for further information.

AuthorizationPolicy

KIA0101 - Namespace not found for this rule

AuthorizationPolicy enables access control on workloads. Each policy effects only to a group of request. For instance, all requests started from a workload on a list of namespaces. The present validation points out those rules referencing a namespace that don’t exist in the cluster.

Resolution

Either remove the namespace from the list, correct if there is any typo or create a new namespace.

Severity

Warning

Example

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: default
spec:
 selector:
   matchLabels:
     app: httpbin
     version: v1
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep"]
   - source:
       namespaces:
         - default
         - non-existing # warning
         - unexisting # warning
   to:
   - operation:
       methods: ["GET"]
       paths: ["/info*"]
   - operation:
       methods: ["POST"]
       paths: ["/data"]
   when:
   - key: request.auth.claims[iss]
     values: ["https://accounts.google.com"]

See Also

KIA0102 - Only HTTP methods and fully-qualified gRPC names are allowed

An AuthorizationPolicy has an Operation field where is defined the oprations allowed for a request. In the method field are listed all the allowed methods that request can have. This validation appears when a problem is found in there. The only methods accepted are: either HTTP valid methods or fully-qualified names of gRPC service in the form of “/package.service/method”

Resolution

Either change or remove the violating method. It has to be either a HTTP valid method or a fully-qualified names of a gRPC service in the form of “/package.service/method”

Severity

Warning

Example

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: default
spec:
 selector:
   matchLabels:
     app: httpbin
     version: v1
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep"]
   - source:
       namespaces:
         - default
   to:
   - operation:
       methods:
         - "GET"
         - "/package.service/method"
         - "WRONG" # Warning
         - "non-fully-qualified-grpc" # Warning
       paths: ["/info*"]
   - operation:
       methods: ["POST"]
       paths: ["/data"]
   when:
   - key: request.auth.claims[iss]
     values: ["https://accounts.google.com"]

See Also

KIA0104 - This host has no matching entry in the service registry

AuthorizationPolicy enables access control on workloads. Each policy effects only to a group of request going to a specific destination. For instance, allow all the request going to details host.

The present validation points out those rules referencing a host that don’t exist in the authorization policy namespace. Kiali considers services and service entries. Those hosts that refers to hosts outside of the object namespace will be presented with an unknow error.

Resolution

Either remove the host from the list, correct if there is any typo or deploy a new service or service entry.

Severity

Error

Example

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: default
spec:
 selector:
   matchLabels:
     app: httpbin
     version: v1
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep"]
   - source:
       namespaces:
         - default
   to:
   - operation:
       hosts:
         - wrong # Error
         - ratings
         - details.default
         - reviews.default.svc.cluster.local
         - productpage.outside # Unknown
         - google.com # Service Entry present. No error
         - google.org # Service Entry not present, wrong domain. Error.
       methods:
         - "GET"
         - "/package.service/method"
       paths: ["/info*"]
   - operation:
       methods: ["POST"]
       paths: ["/data"]
   when:
   - key: request.auth.claims[iss]
     values: ["https://accounts.google.com"]

See Also

Destination rules

KIA0201 - More than one DestinationRules for the same host subset combination

Istio applies traffic rules for services after the routing has happened. These can include different settings such as connection pooling, circuit breakers, load balancing, and detection. Istio can define the same rules for all services under a host or different rules for different versions of the service.

This validation warning could be a result of duplicate definition of the same subsets as well as from rules that apply to all subsets. Also, a combination of one Destination Rule (DR) applying to all subsets and another defining behavior for only some subsets triggers this validation warning.

Istio silently ignores the duplicate subsets and merge these destination rules without letting the user know. Only the first seen rule (by Istio) per subset is used and information from multiple definitions is not merged. While the routing might work correctly, this is most likely a configuration error. It may lead to a undesired behavior if one of the offending rules is removed or modified and that is probably not the intention of the deployer of this service. Also, if the two offending destination rules have different policies for traffic routing the wrong one might be used.

Resolution

Either merge the settings to a single DR or split the subsets in such a way that they do not interleave. This ensures that the routing behavior stays consistent.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr1
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr2
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1

See Also

KIA0202 - This host has no matching entry in the service registry (service, workload or service entries)

Istio applies traffic rules for services after the routing has happened. These can include different settings such as connection pooling, circuit breakers, load balancing, and detection. Istio can define the same rules for all services under a host or different rules for different versions of the service. The host must a service that is defined in the platform’s service registry or as a ServiceEntry. Short names are extended to include ‘.namespace.cluster’ using the namespace of the destination rule, not the service itself. FQDN is evaluated as is. It is recommended to use the FQDN to prevent any confusion.

If the host is not found, Istio ignores the defined rules.

Resolution

Correct the host to point to a correct service, in this namespace or with FQDN to other namespaces, or deploy the missing service to the mesh.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: notpresent
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3

See Also

KIA0203 - This subset’s labels are not found in any matching host

Istio applies traffic rules for services after the routing has happened. These can include different settings such as connection pooling, circuit breakers, load balancing, and detection. Istio can define the same rules for all services under a host or different rules for different versions of the service. The host must a service that is defined in the platform’s service registry or as a ServiceEntry. Short names are extended to include ‘.namespace.cluster’ using the namespace of the destination rule, not the service itself. FQDN is evaluated as is. It is recommended to use the FQDN to prevent any confusion.

Subsets can override the global settings defined in the DR for a host.

If the host is not found, Istio ignores the defined rules.

If the not found subset is not referenced in any Virtual Service, the severity of this error is changed to Info.

Resolution

Correct the host to point to a correct service, in this namespace or with FQDN to other namespaces, or deploy the missing service to the mesh. If the hostname is equal to the one used otherwise in the DR, consider removing the duplicate host resolution.

Also, verify that the labels are correctly matching a workload with the intended service.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v10
  - name: v2
    labels:
      notfoundlabel: v2
  - name: v3
    labels:
      version: v3

See Also

KIA0204 - mTLS settings of a non-local Destination Rule are overridden

Istio allows you to define DestinationRule at three different levels: mesh, namespace and service level. A mesh may have multiple DRs. In case of having two DestinationRules on the first one is at a lower level than the second one, the first one overrides the TLS values of the second one.

This validation appears only when autoMtls is disabled.

Resolution

This validation aims to warn Kiali users that they may be disabling/enabling mTLS from the higher DestinationRule. Merging the TLS settings to one of the DestinationRules is the only way to fix this validation. However, this is a valid scenario so it might be impossible to remove this warning.

Severity

Warning

Example

apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3

See Also

KIA0205 - PeerAuthentication enabling mTLS at mesh level is missing

Istio has the ability to define mTLS communications at mesh level. In order to do that, Istio needs one DestinationRule and one PeerAuthentication. The DestinationRule configures all the clients of the mesh to use mTLS protocol on their connections. The PeerAuthentication defines what authentication methods that can be accepted on the workload of the whole mesh. If the PeerAuthentication is not found or doesn’t exist and the mesh-wide DestinationRule is on ISTIO_MUTUAL mode, all the communication returns 500 errors.

This validation appears only when autoMtls is disabled.

Resolution

Add a PeerAuthentication within the istio-system namespace without specifying targets but setting peers mtls mode to STRICT or PERMISSIVE. The PeerAuthentication should be like this.

Severity

Error

Example

# AutoMtls disabled, no PeerAuthentication at mesh-level defined
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

See Also

KIA0206 - PeerAuthentication enabling namespace-wide mTLS is missing

Istio has the ability to define mTLS communications at namespace level. In order to do that, Istio needs both a DestinationRule and a PeerAuthentication targeting all the clients/workloads of the specific namespace. The PeerAuthentication allows mTLS authentication method for all the workloads within a namespace. The DestinationRule defines all the clients within the namespace to start communications in mTLS mode. If the PeerAuthentication is not found and the DestinationRule is on STRICT mode in that namespace but there is the DestinationRule enabling mTLS, all the communications within that namespace returns 500 errors.

This validation appears only when autoMtls is disabled.

Resolution

A PeerAuthentication enabling mTLS method is needed for the workloads in the namespace. Otherwise all the clients start mTLS connections that those workloads won’t be ready to manage. Add a PeerAuthentication without specifying targets but setting mTLS mode to STRICT or PERMISSIVE in the same namespace as the DestinationRule.

Severity

Error

Example

apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "enable-mtls"
  namespace: "bookinfo"
spec:
  host: "*.bookinfo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

See Also

KIA0207 - PeerAuthentication with TLS strict mode found, it should be permissive

Istio needs both a DestinationRule and PeerAuthentication to enable mTLS communications. The PeerAuthentication configures the authentication method accepted for all the targeted workloads. The DestinationRule defines which is the authentication method that the clients of specific workloads has to start communications with.

Resolution

Kiali has found that there is a DestinationRule sending traffic without mTLS authentication method. There are two different ways to fix this situation. You can either change the PeerAuthentication applying to PERMISSIVE mode or change the DestinationRule to start communications using mTLS.

Severity

Error

Example

apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "disable-mtls"
  namespace: "bookinfo"
spec:
  host: "*.bookinfo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE
---
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "default"
  namespace: "bookinfo"
spec:
  peers:
  - mtls:
      mode: STRICT

See Also

KIA0208 - PeerAuthentication enabling mTLS found, permissive mode needed

Istio needs both a DestinationRule and PeerAuthentication to enable mTLS communications. The PeerAuthentication configures the authentication method accepted for all the targeted workloads. The DestinationRule defines which is the authentication method that the clients of specific workloads has to start communications with.

Kiali found a DestinationRule starting communications without TLS but there was a PeerAuthentication allowing all services in the mesh to accept only requests in mTLS.

Resolution

There are two ways to fix this situation. You can either change the PeerAuthentication to enable PERMISSIVE mode to all the workloads in the mesh or change the DestinatonRule to enable mTLS instead of disabling it (change the mode to ISTIO_MUTUAL).

Severity

Error

Example

apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "default"
  namespace: "bookinfo"
spec:
  host: "*.bookinfo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE
---
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  mtls:
    mode: STRICT

See Also

KIA0209 - DestinationRule Subset has not labels

A DestinationRule subset without labels may miss the destination endpoint linked with a specific workload.

Resolution

Validate that a subset is properly configured.

Severity

Warning

See Also

Gateways

KIA0301 - More than one Gateway for the same host port combination

Gateway creates a proxy that forwards the inbound traffic for the exposed ports. If two different gateways expose the same ports for the same host, this creates ambiguity inside Istio as either of these gateways could handle the traffic. This is most likely a configuration error. This check is done across all namespaces the user has access to.

There is one exception: when both gateways points to a different ingress. Then the ambiguity doesn’t exist and, in consequence, no validation is shown. Kiali considers that two gateways points to the same ingress if they share the exact same selector.

Resolution

Remove the duplicate gateway entries or merge the two gateway definitions into a single one.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway # Validation shown
  namespace: bookinfo
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway-copy # Validation shown
  namespace: bookinfo2
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway-diff-ingress # No validations shown
  namespace: bookinfo
spec:
  selector:
    istio: ingressgateway-pub # Using different ingress
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---

See Also

KIA0302 - No matching workload found for gateway selector in this namespace

This validation checks the current namespace for matching workloads as this is recommended, and potentially in the future required, by the Istio. Excluded from this check are the default “istio-ingressgateway” and “istio-egressgateway” workloads which are included in Istio by default.

Although your traffic might be correctly routed to a workload in other namespace, this is not a guaranteed behavior and thus a warning is flagged in such cases also.

Resolution

Deploy the missing workload or fix the selector to target a correct location.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway
  namespace: bookinfo
spec:
  selector:
    app: nonexisting # workload doesn't exist in the namespace
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"

See Also

Mesh Policies

KIA0401 - Mesh-wide Destination Rule enabling mTLS is missing

Istio has the ability to define mTLS communications at mesh level. In order to do that, Istio needs one DestinationRule and one PeerAuthentication. The DestinationRule configures all the clients of the mesh to use mTLS protocol on their connections. The PeerAuthentication defines what authentication methods can be accepted on the workload of the whole mesh. If the DestinationRule is not found or doesn’t exist and the PeerAuthentication is on STRICT mode, all the communication returns 500 errors.

This validation appears only when autoMtls is disabled.

Resolution

Add a DestinationRule with “*.cluster” host and ISTIO_MUTUAL as tls trafficPolicy mode. The DestinationRule should be like this.

Severity

Error

Example

# Make sure there isn't any DestinationRule enabling meshwide mTLS
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  mtls:
    mode: STRICT

See Also

PeerAuthentication

KIA0501 - Destination Rule enabling namespace-wide mTLS is missing

Istio has the ability to define mTLS communications at namespace level. In order to do that, Istio needs one DestinationRule and one PeerAuthentication. The DestinationRule configures all the clients of the namespace to use mTLS protocol on their connections. The PeerAuthentication defines what authentication methods can be accepted on a specific group of workloads. PeerAuthentications without target field specified will target all the workloads within its namespace. If the DestinationRule is not found or doesn’t exist in the namespace and the namespace-wide PeerAuthentication is on STRICT mode, all the communication will return 500 errors.

This validation appears only when autoMtls is disabled.

Resolution

Add a DestinationRule with “*.namespace.svc.cluster.local” host and ISTIO_MUTUAL as tls trafficPolicy mode. The DestinationRule should be like this.

Severity

Error

Example

# Make sure there isn't any DestinationRule enabling meshwide mTLS
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "bookinfo"
spec:
  mtls:
    mode: STRICT

See Also

KIA0505 - Destination Rule disabling namespace-wide mTLS is missing

PeerAuthentication objects are used to define the authentication methods that a set of workloads can accept: Mutual, Istio Mutual, Simple or Disabled.

This validation warns the scenario where there is one PeerAuthentication at namespace level with DISABLE mode but there is DestinationRule at namespace or mesh level enabling mTLS. With this scenario, all the traffic flowing between the services in that namespace will fail.

Resolution

You can either change the namespace/mesh-wide Destination Rule to DISABLE mode or change the current PeerAuthentication to allow mTLS (mode STRICT or PERMISSIVE).

Severity

Error

Example

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "bookinfo"
spec:
  mtls:
    mode: DISABLE
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "enable-mtls"
  namespace: bookinfo
spec:
  host: "*.bookinfo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

See Also

KIA0506 - Destination Rule disabling mesh-wide mTLS is missing

PeerAuthentication objects are used to define the authentication methods that a set of workloads can accept: Mutual, Istio Mutual, Simple or Disabled.

This validation warns the scenario where there is one PeerAuthentication at mesh level with DISABLE mode but there is DestinationRule at mesh level enabling mTLS. With this scenario, all the traffic flowing between the services in that namespace will fail.

Resolution

You can either change the mesh-wide Destination Rule to DISABLE mode or change the current PeerAuthentication to allow mTLS (mode STRICT or PERMISSIVE).

Severity

Error

Example

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  mtls:
    mode: DISABLE
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "enable-mtls"
  namespace: bookinfo
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

See Also

Ports

KIA0601 - Port name must follow [-suffix] form

Istio requires the service ports to follow the naming form of ‘protocol-suffix’ where the ‘-suffix’ part is optional. If the naming does not match this form (or is undefined), Istio treats all the traffic TCP instead of the defined protocol in the definition. Dash is a required character between protocol and suffix. For example, ‘http2foo’ is not valid, while ‘http2-foo’ is (for http2 protocol).

Resolution

Rename the service port name field to follow the form and the traffic flows correctly.

Severity

Error

Example

apiVersion: v1
kind: Service
metadata:
  name: ratings-java-svc
  namespace: bookinfo
  labels:
    app: ratings
    service: ratings-svc
spec:
  ports:
  - port: 9080
    name: wrong-http
  selector:
    app: ratings-java
    version: v1

See Also

Services

KIA0701 - Deployment exposing same port as Service not found

Service definition has a combination of labels and port definitions that are not matching to any workloads. This means the deployment will be unsuccessful and no traffic can flow between these two resources. The port is read from the Service ‘TargetPort’ definition first and if undefined, the ‘Port’ field is used as Kubernetes defaults the ‘TargetPort’ to ‘Port’. If the ‘TargetPort’ is using a integer, the port numbers are compared and if the ‘TargetPort’ is a string, the deployment’s portName is used for comparison.

Resolution

Fix the port definitions in the workload or in the service definition to ensure they match.

Severity

Warning

Example

Invalid example with port definitions unmatched:

apiVersion: v1
kind: Service
metadata:
  name: ratings-java-svc
  namespace: ratings-java
  labels:
    app: ratings
    service: ratings-svc
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: ratings-java
    version: v1
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ratings-java
  namespace: ratings-java
  labels:
    app: ratings-java
    version: v1
spec:
  replicas: 1
  template:
    metadata:
      annotations:
         sidecar.istio.io/inject: "true"
      labels:
        app: ratings-java
        version: v1
    spec:
      containers:
      - name: ratings-java
        image: pilhuhn/ratings-java:f
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080

Valid example using targetPort definition matching:

apiVersion: v1
kind: Service
metadata:
  name: ratings-java-svc
  namespace: ratings-java
  labels:
    app: ratings
    service: ratings-svc
spec:
  ports:
  - port: 9080
    targetPort: 8080
    name: http
  selector:
    app: ratings-java
    version: v1
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ratings-java
  namespace: ratings-java
  labels:
    app: ratings-java
    version: v1
spec:
  replicas: 1
  template:
    metadata:
      annotations:
         sidecar.istio.io/inject: "true"
      labels:
        app: ratings-java
        version: v1
    spec:
      containers:
      - name: ratings-java
        image: pilhuhn/ratings-java:f
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080

See Also

ServiceMesh Policies

KIA0801 - Mesh-wide Destination Rule enabling mTLS is missing

Maistra has the ability to define mTLS communications at mesh level. In order to do that, Maistra needs one DestinationRule and one ServiceMeshPolicy. The DestinationRule configures all the clients of the mesh to use mTLS protocol on their connections. The ServiceMeshPolicy defines what authentication methods can be accepted on the workload of the whole mesh. If the DestinationRule is not found or doesn’t exist and the ServiceMeshPolicy is on STRICT mode, all the communication returns 500 errors.

Resolution

Add a DestinationRule named as default with “*.cluster” host and ISTIO_MUTUAL as tls trafficPolicy mode. The DestinationRule should be like this.

Severity

Error

Example

apiVersion: "maistra.io/v1"
kind: "ServiceMeshPolicy"
metadata:
  name: default
  namespace: control-plane-ns
spec:
  peers:
  - mtls: {}

See Also

ServiceRoles and ServiceRoleBindings

KIA0901 - Unable to find all the defined services

Services can be listed with an exact match, prefix match as well as suffix match. Using “*” refers to all the services in this namespace. This error indicates the services list is pointing to a service that can not be found from this namespace.

Resolution

Deploy the missing services or fix the services list to point to correct services.

Severity

Error

Example

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: details-reviews-viewer
  namespace: bookinfo
spec:
  rules:
  - services: ["wrongservice.bookinfo.svc.cluster.local", "reviews.bookinfo.svc.cluster.local"]
    methods: ["GET"]
    constraints:
      - key: "destination.labels[version]"
        values: ["v1"]

See Also

KIA0902 - ServiceRole can only point to current namespace

Services can be listed with an exact match, prefix match as well as suffix match. Using “*” refers to all the services in this namespace. Although FQDN can be used, the namespace of the services must be the same as ServiceRole’s deployment.

Resolution

If the services in question are located in another namespace, deploy this ServiceRole to that namespace, or deploy the services to this namespace.

Severity

Error

Example

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: details-reviews-viewer
  namespace: bookinfo
spec:
  rules:
  - services: ["details.test.svc.cluster.local", "reviews.bookinfo.svc.cluster.local"]
    methods: ["GET"]
    constraints:
      - key: "destination.labels[version]"
        values: ["v1"]

See Also

KIA0903 - ServiceRole does not exists in this namespace

ServiceRoleBinding assigns a ServiceRole to subjects. As such, it must refer to a ServiceRole object and this object can only reside in the same namespace. Cross-namespace refers are not valid.

Resolution

Deploy the missing ServiceRole to the same namespace.

Severity

Error

Example

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: details-reviews-viewer
  namespace: default
spec:
  rules:
  - services: ["details.bookinfo.svc.cluster.local", "reviews.bookinfo.svc.cluster.local"]
    methods: ["GET"]
    constraints:
      - key: "destination.labels[version]"
        values: ["v1"]
---
apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: bind-details-reviews
  namespace: bookinfo
spec:
  subjects:
  - user: "cluster.local/ns/bookinfo/sa/bookinfo-productpage"
  roleRef:
    kind: ServiceRole
    name: "details-reviews-viewer"

See Also

Sidecars

KIA1003 - Invalid host format. ‘namespace/dnsName’ format expected

The Sidecar resources are used for configuring the sidecar proxies in the service mesh. IstioEgressListener specifies the properties of an outbound traffic listener on the sidecar proxy attached to a workload instance.

The hosts list is the list of hosts that will be exposed to the workload. Each host in the list must have the namespace/dnsName format.

Resolution

Make sure the host has the namespace/dnsName format. See more info in the documentation link right below.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: notmatching
  namespace: bookinfo
spec:
  workloadSelector:
    labels:
      app: reviews
  egress:
  - port:
      number: 3306
      protocol: MYSQL
      name: egressmysql
    captureMode: NONE
    bind: 127.0.0.1
    hosts:
    - "*/mysql.foo.com"
    - "noslashsymbolpresent" # unsupported format


See Also

KIA1004 - This host has no matching entry in the service registry

The Sidecar resources are used for configuring the sidecar proxies in the service mesh. IstioEgressListener specifies the properties of an outbound traffic listener on the sidecar proxy attached to a workload instance.

In the hosts field, there is the list of hosts exposed to the workload. Each host in the list have the namespace/dnsName format where both namespace and dnsName may have non-obvious values. namespace may be either ., ~, * or an actual namespace name. dnsName has to be a FQDN representing a service, virtual service or a service entry. This FQDN may use the wildcard character.

See more information about the syntax of both namespace and dnsName into istio documentation.

Resolution

Make sure there is a service, virtual service or service entry matching with the host.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: servicenotfound
  namespace: bookinfo
spec:
  workloadSelector:
    labels:
      app: reviews
  egress:
  - port:
      number: 3306
      protocol: MYSQL
      name: egressmysql
    captureMode: NONE
    bind: 127.0.0.1
    hosts:
    - "bookinfo/*.bookinfo.svc.cluster.local" # Bookinfo running into bookinfo ns
    - "default/kiali.io" # Service entry present in the namespace
    - "bookinfo/bogus.bookinfo.svc.cluster.local" # Bogus service into bookinfo doesn't exist
    - "bogus-ns/reviews.bookinfo.svc.cluster.local" # Cross-namespace validation: unable to verify validity


See Also

KIA1006 - Global default sidecar should not have workloadSelector

The Sidecar resources are used for configuring the sidecar proxies in the service mesh. By default, all the sidecars are configured with the default sidecar instance specified in the control plane namespace (usually istio-system). In case there are sidecar resources in the namespaces where your applications are, this default sidecar resource won’t be considered. The sidecar in your namespace will be applied.

Having workloadSelector in your global default sidecar won’t make any effect in the other sidecars living outside of the control plane namespace.

Resolution

Make sure you don’t have the workloadSelector in this global sidecar resource. In case you need specific settings for specific workloads, move those settings to the sidecar resources in your application namespaces.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: default
  namespace: istio-system
spec:
  workloadSelector: # Default sidecar can't have labels
    labels:
      version: v1
  egress:
  - port:
      number: 3306
      protocol: MYSQL
      name: egressmysql
    captureMode: NONE
    bind: 127.0.0.1
    hosts:
    - "bookinfo/reviews.bookinfo.svc.cluster.local"
    - "bookinfo/details.bookinfo.svc.cluster.local"

See Also

VirtualServices

KIA1101 - DestinationWeight on route doesn’t have a valid service (host not found)

VirtualService routes matching requests to a service inside your mesh. Routing can also match a subset of traffic to a certain version of it for example. Any service inside the mesh must be targeted by its name, the IP address are only allowed for hosts defined through a Gateway. Host must be in a short name or FQDN format. Short name will evaluate to VS' namespace, regardless of where the actual service might be placed.

If the host is not found, Istio ignores the defined rules. However, if a subset with a Destination Rule is not found it affects all the subsets and all the routings. As such, care must be taken that the Destination rule is available before deploying the Virtual Service.

Resolution

Correct the host to point to a correct service (in this namespace or with FQDN to other namespaces), deploy the missing service to the mesh or remove the configuration linking to that non-existing service.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: details
spec:
  hosts:
  - details
  http:
  - route:
    - destination:
        host: nonexistentsvc
        subset: v2

See Also

KIA1102 - VirtualService is pointing to a non-existent gateway

By default, VirtualService routes apply to sidecars inside the mesh. The gateway field allows to override that default and if anything is defined, the VS applies to those selected. ‘mesh’ is a reserved gateway name and means all the sidecars in the mesh. If one wishes to apply the VS to gateways as well as the sidecars, the ‘mesh’ keyword must be used as one of the gateways. Incorrect gateways mean that the VS is not applied correctly.

Resolution

Fix the possible gateway field to target all necessary gateways or remove the field if the default ‘mesh’ is enough.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: details
spec:
  hosts:
  - details
  gateways:
  - non-existent-gateway
  http:
  - route:
    - destination:
        host: details
        subset: v1

See Also

KIA1103 - VirtualService doesn’t define any route protocol

VirtualService is a defined set of rules for routing certain type of traffic to target destinations with rules. At least one, ‘tcp’, ‘http’ or ‘tls’ must be defined.

Resolution

This appears to be a configuration error. Fix the definition.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: details
spec:
  hosts:
  - details

See Also

KIA1104 - The weight is assumed to be 100 because there is only one route destination

Istio assumes the weight to be 100 when there is only one HTTPRouteDestination or RouteDestination. The warning is present because there is one route with a weight less than 100.

Resolution

Either remove the weight field or you might want to add another RouteDestination with an specific weight.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 10

See Also

KIA1105 - This subset is already referenced in another route destination

Istio allows you to apply rules over the traffic targetting to a specific service. In order to achieve that, it is necessary to add those rules into either http, tcp or tls fields in a VirtualService. In each field it is possible to specify rules for redirection or forwarding traffic. Those rules are the RouteDestination and HTTPRouteDestination structs. Each structs defines where the traffic is shifted to using the subset field. This warning message refers to the fact of referencing one subset more than one time within the same route. Galley, Istio module in charge of configuration validation, allows the subset duplicity. However, the mesh it might become broken when there are different duplicates. Also, the presented warning might help spoting a typo.

Resolution

Make sure there is only one reference to the same subset for each RouteDestination. Either HTTPRouteDestination or RouteDestination.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 80
    - destination:
        host: reviews
        subset: v2 # duplicate
      weight: 10
    - destination:
        host: reviews
        subset: v2 # duplicate
      weight: 10

See Also

KIA1106 - More than one Virtual Service for same host

A VirtualService defines a set of traffic routing rules to apply when a host is addressed. Each routing rule defines matching criteria for traffic of a specific protocol. If the traffic is matched, then it is sent to a named destination service (or subset/version of it) defined in the registry.

Resolution

This is a valid configuration only if two VirtualServices share the same host but are bound to a different gateways, sidecars do not accept this behavior. There are several caveats when using this method and defining the same parts in multiple Virtual Service definitions is not recommended. While Istio will merge the configuration, it does not guarantee any ordering for cross-resource merging and only the first seen configuration is applied (rest ignored). As recommended, each VS definition should have a ‘catch-all’ situation, but this can only be defined in a definition affecting the same host.

Severity

Warning

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-cp
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10

See Also

KIA1107 - Subset not found

VirtualService routes matching requests to a service inside your mesh. Routing can also match a subset of traffic to a certain version of it for example. The subsets referred in a VirtualService have to be defined in one DestinationRule.

If one route in the VirtualService points to a subset that doesn’t exist Istio won’t be able to send traffic to a service.

Resolution

Fix the routes that points to a non existing subsets. It might be fixing a typo in the subset’s name or defining the missing subset in a DestinationRule.

Severity

Error

Example

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: nosubset
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10

See Also

KIA1108 - Preferred nomenclature: /

A virtual service may include a list of gateways which the defined routes should be applied to. Gateways in other namespaces may be referred to by /; specifying a gateway with no namespace qualifier is the same as specifying the VirtualService’s namespace.

Resolution

Move the nomenclature of the gateways into the supported Istio form: /

Example

kind: VirtualService
apiVersion: networking.istio.io/v1alpha3
metadata:
  name: bookinfo
  namespace: bookinfo
spec:
  hosts:
    - '*'
  gateways:
    - bookinfo-gateway.bookinfo.svc.cluster.local # unsupported format
    - bookinfo/bookinfo-gateway # works
  http:
    - match:
        - uri:
            exact: /productpage
        - uri:
            prefix: /static
        - uri:
            exact: /login
        - uri:
            exact: /logout
        - uri:
            prefix: /api/v1/products
      route:
        - destination:
            host: productpage
            port:
              number: 9080
---
kind: Gateway
apiVersion: networking.istio.io/v1alpha3
metadata:
  name: bookinfo-gateway
  namespace: bookinfo
spec:
  servers:
    - hosts:
        - '*'
      port:
        name: http
        number: 80
        protocol: HTTP
  selector:
    istio: ingressgateway


See Also

Generic

KIA0001 - Unable to verify the validity, cross-namespace validation is not supported for this field

In certain cases, Kiali is unable to validate the field since it spans another namespace to which the validator is not capable of looking at. In such cases, Kiali will mark this field with a grey icon indicating that the fields correctness could not be verified. This does not necessarily mean there is an error, but that the user should be careful and do the validation manually.

KIA0002 - More than one selector-less object in the same namespace

This validation refers to the usage of the selector. Selector-less Istio objects are those objects that don’t have the selector field specified. Therefore, objects that apply to all the workloads of a namespace (or whole mesh if the namespace is the same as the control plane namespace).

This validation warns you that you have two different objects living in the same namespace. This may leave an non-deterministic or unexpected behavior on the workloads of the namespace.

Resolution

The natural solution is to merge both objects. In case there are different behaviors you want to apply, consider to define the selector field targeting a specific set of workloads.

Severity

Error

Example

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "bookinfo"
spec:
  mtls:
    mode: STRICT
---
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "duplicate"
  namespace: "bookinfo"
spec:
  mtls:
    mode: STRICT

See Also

KIA0003 - More than one object applied to the same workload

This validation refers to the usage of the selector. In this field are defined the labels of the workloads that this object will be applied to. It might be one or more workloads in the same namespace.

This validation warns the scenario where there are two different objects applying to the same workload(s). This may leave an undeterministic or unexpected behavior on the workloads of the namespace.

Resolution

There isn’t a standard solution for that. It is a good practice not to have multiple rules of the same kind applying to the same workloads. Otherwise you would end up having interferences between objects and having troubles when debugging. The first approach would be to merge both objects into one if possible. The second approach would be to reorganize the objects of the same kind in a way that each one only applies to a different set of workloads. Applying no change into the objects is also an option although not desiderable.

Severity

Error

Example

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "productpage"
  namespace: "bookinfo"
spec:
  selector:
    matchLabels:
      app: productpage
  mtls:
    mode: STRICT
---
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "productpage-disable-80"
  namespace: "bookinfo"
spec:
  selector:
    matchLabels:
      app: productpage
      version: v1
  mtls:
    mode: STRICT
  portLevelMtls:
    80:
      mode: DISABLE

See Also

KIA0004 - No matching workload found for the selector in this namespace

This validation warns the scenario where there are not workloads matching with the selector labels. In other terms, this object doesn’t have any implication into the mesh.

Resolution

There are three scenarios: either change the labels to match an existing workload (useful with typos), deploy a workload that match with those labels or safely remove this object.

Severity

Warning

Example

apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "nomatchingworkloads"
  namespace: "bookinfo"
spec:
  selector:
    matchLabels:
      app: wrong-typo
      version: v1
  mtls:
    mode: STRICT
  portLevelMtls:
    80:
      mode: DISABLE

See Also

2.6 - Wizards

Kiali is more than observability, it also helps you to configure, update and validate your Istio service mesh.

Istio Wizards

Kiali provides actions to create, update and delete Istio configuration, driven by wizards.

Actions can be applied to a Service

Service Details Actions

Actions can also be applied to a Workload

Workload Details Actions

Also, actions are available for an entire Namespace

Namespace Actions

Service Actions

Traffic Management: Request Routing

The Request Routing Wizard allows creating multiple routing rules.

  • Every rule is composed of a Request Matching and a Routes To section.
  • The Request Matching section can add multiple filters using HEADERS, URI, SCHEME, METHOD or AUTHORITY Http parameters.
  • The Request Matching section can be empty, in this case any http request received is matched against this rule.
  • The Routes To section can specify the percentage of traffic that is routed to a specific workload.

Request Routing

Istio applies routing rules in order, meaning that the first rule matching an HTTP request performs the routing. The Matching Routing Wizard allows changing the rule order.

Traffic Management: Fault Injection

The Fault Injection Wizard allows injecting faults to test the resiliency of a Service.

  • HTTP Delay specification is used to inject latency into the request forwarding path.
  • HTTP Abort specification is used to prematurely abort a request with a pre-specified error code.

Fault Injection

Traffic Management: Traffic Shifting

The Traffic Shifting Wizard allows selecting the percentage of traffic that is routed to a specific workload.

Traffic Shifting

Traffic Management: Request Timeouts

The Request Timeouts Wizard sets up request timeouts in Envoy, using Istio.

  • HTTP Timeout defines the timeout for a request.
  • HTTP Retry describes the retry policy to use when an HTTP request fails.

Request Timeouts

Traffic Management: Gateways

Traffic Management Wizards have an Advanced Options section that can be used to extend the scenario.

One available Advanced Option is to expose a Service to external traffic through an existing Gateway or to create a new Gateway for this Service.

Request Timeouts

Traffic Management: Circuit Breaker

Traffic Management Wizards allows defining Circuit Breakers on Services as part of the available Advanced Options.

  • Connection Pool defines the connection limits for an upstream host.
  • Outlier Detection implements the Circuit Breaker based on the consecutive errors reported.

Circuit Breakers

Security: Traffic Policy

Traffic Management Advanced Options allows defining Security and Load Balancing settings.

  • TLS related settings for connections to the upstream service.
  • Automatically generate a PeerAuthentication resource for this Service.
  • Load balancing policies to apply for a specific destination.

Traffic Policy

Workload Actions

Automatic Sidecar Injection

A Workload can be individually annotated to control the Sidecar Injection.

A default scenario is to indicate this at Namespace level but there can be cases where a Workload shouldn’t be part of the Mesh or vice versa.

Kiali allows users to annotate the Deployment template and propagate this configuration into the Pods.

Automatic Sidecar Injection

Namespace Actions

The Kiali Overview page offers several Namespace actions, in any of its views: Expanded, Compacted or Table.

Namespace Actions

Show

Show actions navigate from a Namespace to its specific Graph, Applications, Workloads, Services or Istio Config pages.

Automatic Sidecar Injection

When Automatic Sidecar Injection is enabled in the cluster, a Namespace can be labeled to enable/disable the injection webhook, controlling whether new deployments will automatically have a sidecar.

Canary Istio upgrade

When Istio Canary revision is installed, a Namespace can be labeled to that canary revision, so the sidecar of canary revision will be injected into workloads of the namespace.

Security: Traffic Policies

Kiali can generate Traffic Policies based on the traffic for a namespace.

For example, at some point a namespace presents a traffic graph like this:

Traffic Policies: Graph

And a user may want to add Traffic Policies to secure that communication. In other words, to prevent traffic other than that currently reflected in the Graph’s Services and Workloads.

Using the Create Traffic Policies action on a namespace, Kiali will generate AuthorizationPolicy resources per every Workload in the Namespace.

Traffic Policies: Graph

2.7 - Istio Configuration

Kiali is more than observability, it also helps you to configure, update and validate your Istio service mesh.

The Istio configuration view provides advanced filtering and navigation for Istio configuration objects such as Virtual Services and Gateways. Kiali provides inline config edition and powerful semantic validation for Istio resources.

Istio Config List

Validations

Kiali performs a set of validations to the most common Istio Objects such as Destination Rules, Service Entries, and Virtual Services. Those validations are done in addition to the existing ones performed by Istio’s Galley component. Most validations are done inside a single namespace only, any exceptions, such as gateways, are properly documented.

Galley validations are mostly syntactic validations based on the object syntax analysis of Istio objects while Kiali validations are mostly semantic validations between different Istio objects. Kiali validations are based on the runtime status of your service mesh, Galley validations are static ones and doesn’t take into account what is configured in the mesh.

Check the complete list of validations for further information.

Istio Config Validation

Istio Forms

Istio Wizards provide a way to apply a Service Mesh pattern and let Kiali to generate the Istio Configuration. Kiali also offers actions to create Istio Config for Gateways and Security scenarios.

These actions are located under the Istio Config page.

Create Istio Config

Istio Security Forms

Kiali allows creation of Istio AuthorizationPolicy resources:

AuthorizationPolicy

Istio PeerAuthentication resources:

PeerAuthentication

Istio RequestAuthentication resources:

RequestAuthentication

Istio Traffic Forms

Kiali uses Istio Wizards to generate Istio Traffic config for a specific Service, but Kiali also allows creation of Gateway, ServiceEntry and Sidecar Istio resources for more generic scenarios.

Istio Gateway resources:

Gateway

Istio ServiceEntry resources:

ServiceEntry

Istio Sidecar resources:

Sidecar

2.8 - Security

Kiali gives support to better understand how mTLS is used in Istio meshes. Find those helpers in the graph, the masthead menu, the overview page and specific validations.

Masthead indicator

At the right side of the Masthead, Kiali shows a lock when the mesh has strictly enabled mTLS for the whole service mesh. It means that all the communications in the mesh uses mTLS.

mTLS mesh-wide enabled strictly Custom Vertx Metrics

Kiali shows a hollow lock when either the mesh is configured in PERMISSIVE mode or there is a misconfiguration in the mesh-wide mTLS configuration.

Overview locks

The overview page shows all the available namespaces with aggregated data. Besides the health and validations, Kiali shows also the mTLS status at namespace-wide. Similar to the masthead, it shows a lock when strict mTLS is enabled or a hollow lock when permissive.

Overview page: showing mTLS at namespace-wide

Graph

The mTLS method is used to establish communication between microservices. In the graph, Kiali has the option to show which edges are using mTLS and with what percentatge during the selected period. When an edge shows a lock icon it means at least one request with mTLS enabled is present. In case there are both mTLS and non-mTLS requests, the side-panel will show the percentage of requests using mTLS.

Enable the option in the Display dropdown, select the security badge.

Graph shows edges which uses mTLS

Validations

Kiali has different validations to help troubleshoot configurations related to mTLS such as DestinationRules and PeerAuthentications.

Validation supporting mTLS configuration

2.9 - Istio Status

Istio Component Status

The Istio service mesh architecture is comprised of several components, from istiod to Jaeger. Each component must work as expected for the mesh to work well overall. Kiali regularly checks the status of each Istio component to ensure the mesh is healthy.

Istio components status: components not healthy or found

A component status will be one of: Not found, Unreachable, Not healthy and Healthy. The Not found status means that Kiali is not able to find the deployment. The Unreachable status means that Kiali hasn’t been succesfuly able to communicate with the component (Prometheus, Grafana and Jaeger). The Not healthy status means that the deployment doesn’t have the desired amount of replicas running. The Healthy status is when the component is not in the previous ones, plus, healthy components won’t be shown in the list.

Regarding the severity of each component, there are only to options: core or add-on. The core components are those shown as errors (in red) whereas the add-ons are displayed as warnings (in orange).

By default, Kiali checks the following components installed in the control plane namespace: istiod, ingress, egress; and prometheus, grafana and jaeger accessible thought their services.

Certificates Information Indicators

In some situations, it would be useful to get information about the certificates used by internal mTLS, for example:

  • Know whether the default CA is used or if there is another CA configured
  • Check the certificates issuer and their validity timestamps to troubleshoot any issue with certificates

The certificates shown depends on how Istio is configured. The following cases are possible:

The following is an example of viewing the default case:

Certificates information

Note that displaying this configuration requires permissions to read secrets (istio-ca-secret by default, possibly cacerts or any secret configured when using DNS certificates).

Having these permissions may concern users. For this reason, this feature is implemented as a feature flag and not only can be disabled, avoiding any extra permissions to read secrets, but also a list of secrets can be configured to explicitly grant read permissions for some secrets in the control plane namespace. By default, this feature is enabled with a Kiali CR configuration equivalent to the following:

spec:
  kiali_feature_flags:
    certificates_information_indicators:
      enabled: true
      secrets:
      - cacerts
      - istio-ca-secret

You can extend this default configuration with additional secrets, remove secrets you don’t want, or disable the feature.

If you add additional secrets, the Kiali operator also needs the same privileges in order to configure Kiali successfully. If you used the Helm Charts to install the operator, specify the secretReader value with the required secrets:

$ helm install \
    --namespace kiali-operator \
    --create-namespace \
    --set "secretReader={cacerts,istio-ca-secret}"
    kiali-operator \
    kiali/kiali-operator

If you installed the operator via the OperatorHub you need to update the operator privileges as a post-installation step, as follows:

$ kubectl patch $(kubectl get clusterroles -o name | grep kiali-operator) --type "json" -p '[{"op":"add","path":"/rules/0","value":{"apiGroups":[""],"resources":["secrets"],"verbs":["get"],"resourceNames":["secret-name-to-be-read"]}}]'

Replace secret-name-to-be-read with the secret name you want the operator to read and restart the Kiali operator pod after running the previous command.

2.10 - Advanced Mesh Deployment and Multi-cluster support

A basic Istio mesh deployment has a single control plane with a single data plane, deployed on a single Kubernetes cluster. But Istio supports a variety of advanced deployment models. It allows a mesh to span multiple primary (control plane) and/or remote (data plane only) clusters, and can use a single or multi-network approach. The only strict rule is that within a mesh service names are unique. A non-basic mesh deployment generally involves multiple clusters. See installation instructions for more detail on installing advanced mesh deployments.

Kiali v1.29 introduces experimental support for advanced deployment models. The recommended Kiali deployment model is to deploy an instance of Kiali along with each Istio primary control plane. Each Kiali instance will work as it has in the past, with the configured Kubernetes, Prometheus and Jaeger instances. It will concern itself with the istio config managed by the local primary. But, there are two new additions:

List View: Mesh Discovery

Kiali will attempt to discover the clusters configured in the mesh. And it will identify the home cluster, meaning the cluster on which it is installed and from which it presents its traffic, traces and Istio config. In the following example there are two clusters defined in the mesh, Kukulcan and Tzotz. Kukulcan is identified as the home cluster in three places: the browser tab (not shown), the masthead, and with a star icon in the clusters list:

Mesh list view

Graph View: Cluster and Namespace Boxing

Starting in v1.8 Istio provides cluster names in the traffic telemetry for multi-cluster installations. The Kiali graph can now use this information to better visualize clusters and namespaces. The Display menu now offers two new options: Cluster Boxes and Namespace Boxes. When enabled, either separately or together, the graph will generate boxes to help more easily identify the relevant nodes and edges, and to see traffic traveling between them.

Each new box type supports selection and will provide a side-panel summary of traffic. Below we see a Bookinfo traffic graph for when Bookinfo services are deployed across the Kukulcan and Tzotz clusters. The Kukulcan cluster box is selected. Because Kukulcan is the Kiali home cluster, you see traffic from the perspective of Kukulcan. Note that you do not see the internal traffic on Tzotz (the requests from Reviews to Ratings service). To see traffic from the Tzotz cluster point of view, you would open a Kiali session on Tzotz, assuming you have privileges.

Multi-cluster traffic graph

3 - Configuration

3.1 - Authentication

Kiali supports five authentication mechanisms:

  • anonymous: gives free access to Kiali.
  • header: requires Kiali to run behind a reverse proxy that is responsible for injecting the user’s token or a token with impersonation.
  • openid: requires authentication through a third-party service to access Kiali.
  • openshift: requires authentication through the OpenShift authentication to access Kiali.
  • token: requires user to provide a Kubernetes ServiceAccount token to access Kiali.

The default authentication mechanism for OpenShift clusters is openshift. For all other Kubernetes clusters, the default mechanism is token.

Read the dedicated page of each authentication mechanism (by clicking on the bullets) to learn more. All mechanisms other than anonymous support Role-based access control.

3.1.1 - Header strategy

Introduction

The header strategy assumes a reverse proxy is in front of Kiali, such as OpenUnison or OAuth2 Proxy, injecting the user’s identity into each request to Kiali as an Authorization header. This token can be an OpenID Connect token or any other token the cluster recognizes. All requests to Kubernetes will be made with this token, allowing Kiali to use the user’s own RBAC context.

In addition to a user token, the header strategy supports impersonation headers. If the impersonation headers are present in the request, then Kiali will act on behalf of the user specified by the impersonation (assuming the token supplied in the Authorization header is authorized to do so).

The header strategy takes advantage of the cluster’s RBAC. See the Role-based access control documentation for more details.

Set-up

The header strategy will work with any Kubernetes cluster. The token provided must be supported by that cluster. For instance, most “on-prem” clusters support OpenID Connect, but cloud hosted clusters do not. For clusters that don’t support a token, the impersonation headers can be injected by the reverse proxy.

spec:
  auth:
    strategy: header

The header strategy doesn’t have any additional configuration.

HTTP Header

The header strategy looks for a token in the Authorization HTTP header with the Bearer prefix. The HTTP header should look like:

Authorization: Bearer TOKEN

Where TOKEN is the appropriate token for your cluster. This TOKEN will be submitted to the API server via a TokenReview to validate the token ONLY on the first access to Kiali. On subsequent calls the TOKEN is passed through directly to the API server.

Security Considerations

Network Policies

A policy should be put in place to make sure that the only “client” for Kiali is the authenticating reverse proxy. This helps limit potential abuse and ensures that the authenticating reverse proxy is the source of truth for who accessed Kiali.

Short Lived Tokens

The authenticating reverse proxy should inject a short lived token in the Authorization header. A shorter lived token is less likely to be abused if leaked. Kiali will take whatever token is passed into the reqeuest, so as tokens are regenerated Kiali will use the new token.

Impersonation

TokenRequest API

The authenticating reverse proxy should use the TokenRequest API instead of static ServiceAccount tokens when possible while using impersonation. The ServiceAccount that can impersonate users and groups is privileged and having it be short lived cuts down on the possibility of a token being leaked while it’s being passed between different parts of the infrastructure.

Drop Incoming Impersonation Headers

The authenticating proxy MUST drop any headers it receives from a remote client that match the impersonation headers. Not only do you want to make sure that the authenticating proxy can’t be overriden on which user to authenticate, but also what groups they’re a member of.

3.1.2 - Anonymous strategy

Introduction

The anonymous strategy removes any authentication requirement. Users will have access to Kiali without providing any credentials.

Although the anonymous strategy doesn’t provide any access protection, it’s valid for some use-cases. Some examples known from the community:

  • Exposing Kiali through a reverse proxy, where the reverse proxy is providing a custom authentication mechanism.
  • Exposing Kiali on an already limited network of trusted users.
  • When Kiali is accessed through kubectl port-forward or alike commands that allow usage of the cluster’s RBAC capabilities to limit access.
  • When developing Kiali, where a developer has a private instance on his own machine.

Set-up

To use the anonymous strategy, use the following configuration in the Kiali CR:

spec:
  auth:
    strategy: anonymous

The anonymous strategy doesn’t have any additional configuration.

Access control

When using the anonymous strategy, the content displayed in Kiali is based on the permissions of the Kiali service account. By default, the Kiali service account has cluster wide access and will be able to display everything in the cluster.

OpenShift

If you are running Kiali in OpenShift, access can be customized by changing privileges to the Kiali ServiceAccount. For example, to reduce permissions to individual namespaces, first, remove the cluster-wide permissions granted by default:

  oc delete clusterrolebindings kiali

Then grant the kiali role only in needed namespaces. For example:

  oc adm policy add-role-to-user kiali system:serviceaccount:istio-system:kiali-service-account -n ${NAMESPACE}

View only

You can tell the Kiali Operator to install Kiali in “view only” mode (this does work for either OpenShift or Kubernetes). You do this by setting the view_only_mode to true in the Kiali CR, which allows Kiali to read service mesh resources found in the cluster, but it does not allow any change:

spec:
  deployment:
    view_only_mode: true

3.1.3 - OpenID Connect strategy

Introduction

The openid authentication strategy lets you integrate Kiali to an external identity provider that implements OpenID Connect, and allows users to login to Kiali using their existing accounts of a third-party system.

If your Kubernetes cluster is also integrated with your OpenId provider, then Kiali’s openid strategy can offer role-based access control (RBAC) through the Kubernetes authorization mechanisms. See the RBAC documentation for more details.

Currently, Kiali supports the authorization code flow (preferred) and the implicit flow of the OpenId Connect spec.

Requirements

If you want to enable usage of the OpenId’s authorization code flow, make sure that the Kiali’s signing key is 16, 24 or 32 byte long. If you setup a signing key of a different size, Kiali will only be capable of using the implicit flow. If you install Kiali via the operator and don’t set a custom signing key, the operator should create a 16 byte long signing key.

If you don’t need RBAC support, the only requirement is to have a working OpenId Server where Kiali can be configured as a client application.

If you do need RBAC support, you need either:

The first option is preferred if you can manipulate your cluster API server startup flags, which will result in your cluster to also be integrated with the external OpenID provider.

The second option is provided for cases where you are using a managed Kubernetes and your cloud provider does not support configuring OpenID integration. Kiali assumes an implementation of a Kubernetes API server. For example, a community user has reported to successfully configure Kiali’s OpenID strategy by using kube-oidc-proxy which is a reverse proxy that handles the OpenID authentication and forwards the authenticated requests to the Kubernetes API.

Set-up with RBAC support

Assuming you already have a working Kubernetes cluster with OpenId integration (or a working alternative like kube-oidc-proxy), you should already had configured an application or a client in your OpenId server (some cloud providers configure this app/client automatically for you). You must re-use this existing application/client by adding the root path of your Kiali instance as an allowed/authorized callback URL. If the OpenID server provided you a client secret for the application/client, or if you had manually set a client secret, issue the following command to create a Kubernetes secret holding the OpenId client secret:

kubectl create secret generic kiali --from-literal="oidc-secret=$CLIENT_SECRET" -n $NAMESPACE

where $NAMESPACE is the namespace where you installed Kiali and $CLIENT_SECRET is the secret you configured or provided by your OpenId Server. If Kiali is already running, you may need to restart the Kiali pod so that the secret is mounted in Kiali.

Then, to enable the OpenID Connect strategy, the minimal configuration you need to set in the Kiali CR is like the following:

spec:
  auth:
    strategy: openid
    openid:
      client_id: "kiali-client"
      issuer_uri: "https://openid.issuer.com"

This assumes that your Kubernetes cluster is configured with OpenID Connect integration. In this case, the client-id and issuer_uri attributes must match the --oidc-client-id and --oidc-issuer-url flags used to start the cluster API server. If these values don’t match, users will fail to login to Kiali.

If you are using a replacement or a reverse proxy for the Kubernetes API server, the minimal configuration is like the following:

spec:
  auth:
    strategy: openid
    openid:
      api_proxy: "https://proxy.domain.com:port"
      api_proxy_ca_data: "..."
      client_id: "kiali-client"
      issuer_uri: "https://openid.issuer.com"

The value of client-id and issuer_uri must match the values of the configuration of your reverse proxy or cluster API replacement. The api_proxy attribute is the URI of the reverse proxy or cluster API replacement (only HTTPS is allowed). The api_proxy_ca_data is the public certificate authority file encoded in a base64 string, to trust the secure connection.

Set-up with no RBAC support

Register Kiali as a client application in your OpenId Server. Use the root path of your Kiali instance as the callback URL. If the OpenId Server provides you a client secret, or if you manually set a client secret, issue the following command to create a Kubernetes secret holding the OpenId client secret:

kubectl create secret generic kiali --from-literal="oidc-secret=$CLIENT_SECRET" -n $NAMESPACE

where $NAMESPACE is the namespace where you installed Kiali and $CLIENT_SECRET is the secret you configured or provided by your OpenId Server. If Kiali is already running, you may need to restart the Kiali pod so that the secret is mounted in Kiali.

Then, to enable the OpenID Connect strategy, the minimal configuration you need to set in the Kiali CR is like the following:

spec:
  auth:
    strategy: openid
    openid:
      client_id: "kiali-client"
      disable_rbac: true
      issuer_uri: "https://openid.issuer.com"

Additional configurations

Configuring the displayed user name

The Kiali front-end will, by default, retrieve the string of the sub claim of the OpenID token and display it as the user name. You can customize which field to display as the user name by setting the username_claim attribute of the Kiali CR. For example:

spec:
  auth:
    openid:
      username_claim: "email"

If you enabled RBAC, you will want the username_claim attribute to match the --oidc-username-claim flag used to start the Kubernetes API server, or the equivalent option if you are using a replacement or reverse proxy of the API server. Else, any user-friendly claim will be OK as it is purely informational.

Configuring requested scopes

By default, Kiali will request access to the openid, profile and email standard scopes. If you need a different set of scopes, you can set the scopes attribute in the Kiali CR. For example:

spec:
  auth:
    openid:
      scopes:
      - "openid"
      - "email"
      - "groups"

The openid scope is forced. If you don’t add it to the list of scopes to request, Kiali will still request it from the identity provider.

Configuring authentication timeout

When the user is redirected to the external authentication system, by default Kiali will wait at most 5 minutes for the user to authenticate. After that time has elapsed, Kiali will reject authentication. You can adjust this timeout by setting the authentication_timeout with the number of seconds that Kiali should wait at most. For example:

spec:
  auth:
    openid:
      authentication_timeout: 60 # Wait only one minute.

Using an OpenID provider with a self-signed certificate

If your OpenID provider is using a self-signed certificate, you can disable certificate validation by setting the insecure_skip_verify_tls to true in the Kiali CR:

spec:
  auth:
    openid:
      insecure_skip_verify_tls: true

However, if your organization or internal network has an internal trusted certificate authority (CA), and your OpenID server is using a certificate issued by this CA, you can configure Kiali to trust certificates from this CA, rather than disabling verification. For this, create a ConfigMap named kiali-cabundle containing the root CA certificate (the public component) under the openid-server-ca.crt key:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kiali-cabundle
  namespace: istio-system # This is Kiali's install namespace
data:
  openid-server-ca.crt: <the public component of your CA root certificate encoded in base64>

After restarting the Kiali pod, Kiali will trust this root certificate for all HTTPS requests related to OpenID authentication.

Using an HTTP/HTTPS Proxy

In some network configurations, there is the need to use proxies to connect to the outside world. OpenID requires outside world connections to get metadata and do key validation, so you can configure it by setting the http_proxy and https_proxy keys in the Kiali CR. They use the same format as the HTTP_PROXY and HTTPS_PROXY environment variables.

spec:
  auth:
    openid:
      http_proxy: http://USERNAME:PASSWORD@10.0.1.1:8080/
      https_proxy: https://USERNAME:PASSWORD@10.0.0.1:8080/

Passing additional options to the identity provider

When users click on the Login button on Kiali, a redirection occurs to the authentication page of the external identity provider. Kiali sends a fixed set of parameters to the identity provider to enable authentication. If you need to add an additional set of parameters to your identity provider, you can use the additional_request_params setting of the Kiali CR, which accepts key-value pairs. For example:

spec:
  auth:
    openid:
      additional_request_params:
        prompt: login

The prompt parameter is a standard OpenID parameter. When the login value is passed in this parameter, the identity provider is instructed to ask for user credentials regardless if the user already has an active session because of a previous login in some other system.

If your OpenId provider supports other non-standard parameters, you can specify the ones you need in this additional_request_params setting.

Take into account that you should not add the client_id, response_type, redirect_uri, scope, nonce nor state parameters to this list. These are already in use by Kiali and some already have a dedicated setting.

Provider-specific instructions

Using with Keycloak

When using OpenId with Keycloak, you will need to enable the Standard Flow Enabled option on the Client (in the Administration Console):

Client configuration screen on Keycloak

The Standard Flow described on the options is the same as the authorization code flow from the rest of the documentation.

If you get an error like Client is not allowed to initiate browser login with given response_type. Implicit flow is disabled for the client., it means that your signing key for Kiali is not a standard size (16, 24 or 32 bytes long).

Enabling the Implicit Flow Enabled option of the client will make the problem go away, but be aware that the implicit flow is less secure, and not recommended.

Using with Google Cloud Platform / GKE OAuth2

If you are using Google Cloud Platform (GCP) and its products such as Google Kubernetes Engine (GKE), it should be straightforward to configure Kiali’s OpenID strategy to authenticate using your Google credentials.

First, you’ll need to go to your GCP Project and to the Credentials screen which is available at (Menu Icon) > APIs & Services > Credentials.

Credentials Screen on in GCP Project

On the Credentials screen you can select to create a new OAuth client ID.

Select OAuth on Credentials Screen

On the Create OAuth client ID screen, set the Application type to Web Application and enter a name for your key.

Select Web Application

Then enter in the Authorized Javascript origins and Authorized redirect URIs for your project. You can enter in localhost as appropriate during testing. You can also enter multiple URIs as appropriate.

Enter URLs

After clicking Create you’ll be shown your newly minted client id and secret. These are important and needed for your Kiali CR yaml and Kiali secrets files.

Get Credentials

You’ll need to update your Kiali CR file to include the following auth block.

spec:
  auth:
    strategy: "openid"
    openid:
      client_id: "<your client id from GCP>"
      disable_rbac: true
      issuer_uri: "https://accounts.google.com"
      scopes: ["openid", "email"]
      username_claim: "email"

Finally you will need to create a secret, if you don’t have one already, that sets the oidc-secret for the openid flow.

apiVersion: v1
kind: Secret
metadata:
  name: kiali
  namespace: istio-system
  labels:
    app: kiali
type: Opaque
data:
  oidc-secret: "<base64 encode your client secret from GCP and enter here>"

Once all these settings are complete just set your Kiali CR and the Kiali secret to your cluster. You may need to refresh your Kiali Pod to set the Secret if you add the Secret after the Kiali pod is created.

Using with Azure: AKS and AAD

AKS has support for a feature named AKS-managed Azure Active Directory, which enables integration between AKS and AAD. This has the advantage that users can use their AAD credentials to access AKS clusters and can also use Kubernetes RBAC features to assign privileges to AAD users.

However, Azure is implementing this integration via the Kubernetes Webhook Token Authentication rather than via the Kubernetes OpenID Connect Tokens authentication (see the Azure AD integration section in AKS Concepts documentation). Because of this difference, authentication in AKS behaves slightly different from a standard OpenID setup, but Kiali’s OpenID authentication strategy can still be used with full RBAC support by following the next steps.

First, enable the AAD integration on your AKS cluster. See the official AKS documentation to learn how. Once it is enabled, your AKS panel should show the following:

AKS-managed AAD is enabled,700

Create a web application for Kiali in your Azure AD panel:

  1. Go to AAD > App Registration, create an application with a redirect url like \https://<your-kiali-url>
  2. Go to Certificates & secrets and create a client secret.
    1. After creating the client secret, take note of the provided secret. Create a Kubernetes secret in your cluster as mentioned in the «setup-with-rbac,Set-up with RBAC support» section. Please, note that the suggested name for the Kubernetes Secret is kiali. If you want to customize the secret name, you will have to specify your custom name in the Kiali CR. See the comments for the secret_name configuration in the sample Kiali CR.
  3. Go to API Permissions and press the Add a permission button. In the new page that appears, switch to the APIs my organization uses tab.
  4. Type the following ID in the search field: 6dae42f8-4368-4678-94ff-3960e28e3630 (this is a shared ID for all Azure clusters). And select the resulting entry.
  5. Select the Delegated permissions square.
  6. Select the user.read permission.
  7. Go to Authentication and make sure that the Access tokens checkbox is ticked.

Access tokens enabled

Then, create or modify your Kiali CR and include the following settings:

spec:
  auth:
    strategy: "openid"
    openid:
      client_id: "<your Kiali application client id from Azure>"
      issuer_uri: "https://sts.windows.net/<your AAD tenant id>/"
      username_claim: preferred_username
      api_token: access_token
      additional_request_params:
        resource: "6dae42f8-4368-4678-94ff-3960e28e3630"

You can find your client_id and tenant_id in the Overview page of the Kiali App registration that you just created. See this documentation for more information.

3.1.4 - OpenShift strategy

Introduction

The openshift authentication strategy is the preferred and default strategy when Kiali is deployed on an OpenShift cluster.

When using the openshift strategy, a user logging into Kiali will be redirected to the login page of the OpenShift console. Once the user provides his OpenShift credentials, he will be redireted back to Kiali and will be logged in if the user has enough privileges.

The openshift strategy takes advantage of the cluster’s RBAC. See the Role-based access control documentation for more details.

Set-up

Since openshift is the default strategy when deploying Kiali in OpenShift, you shouldn’t need to configure anything. If you want to be verbose, use the following configuration in the Kiali CR:

spec:
  auth:
    strategy: openshift

The openshift strategy doesn’t have any additional configuration. The Kiali operator will make sure to setup the needed OpenShift OAuth resources to register Kiali as a client.

3.1.5 - Token strategy

Introduction

The token authentication strategy allows a user to login to Kiali using the token of a Kubernetes ServiceAccount. This is similar to the login view of Kubernetes Dashboard.

The token strategy takes advantage of the cluster’s RBAC. See the Role-based access control documentation for more details.

Set-up

Since token is the default strategy when deploying Kiali in Kubernetes, you shouldn’t need to configure anything, unless your cluster is OpenShift. If you want to be verbose or if you need to enable the token strategy in OpenShift, use the following configuration in the Kiali CR:

spec:
  auth:
    strategy: token

The token strategy doesn’t have any additional configuration.

3.2 - Namespace Management

Introduction

The default Kiali installation (as mentioned in the Installation guide) gives Kiali access to all namespaces available in the cluster.

It is possible to restrict Kiali to a set of desired namespaces by providing a list of the ones you want, excluding the ones you don’t want, or filtering by a label selector. You can use a combination of these options.

Accessible Namespaces

You can configure which namespaces are accessible and observable through Kiali. You can use regex expressions which will be matched against the operator’s visible namespaces. If not set in the Kiali CR, the default makes accessible all cluster namespaces, with the exception of a predefined set of the cluster’s system workloads.

The list of accessible namespaces is specified in the Kiali CR via the accessible_namespaces setting, under the main deployment section. As an example, if Kiali is to be installed in the istio-system namespace, and is expected to monitor all namespaces prefixed with mycorp_, the setting would be:

spec:
  deployment:
    accessible_namespaces:
    - istio-system
    - mycorp_.*

This configuration accepts the special pattern accessible_namespaces: "**" which denotes that Kiali is given access to all namespaces in the cluster.

When installing multiple Kiali instances into a single cluster, accessible_namespaces must be mutually exclusive. In other words, a namespace set must be matched by only one Kiali CR. Regular expressions must not have overlapping patterns.

Maistra supports multi-tenancy and the accessible_namespaces extends that feature to Kiali. However, explicit naming of accessible namespaces can benefit non-Maistra installations as well - with it Kiali does not need cluster roles and the Kiali Operator does not need permissions to create cluster roles.

Excluded Namespaces

The Kiali CR tells the Kiali Operator which accessible namespaces should be excluded from the list of namespaces provided by the API and UI. This can be useful if wildcards are used when specifying link:#_accessible_namespaces[Accessible Namespaces]. This setting has no effect on namespace accessibility. It is only a filter, not security-related.

For example, if the accessible_namespaces configuration includes mycorp_.* but it is not desirable to see test namespaces, the following configuration can be used:

api:
  namespaces:
    exclude:
      - mycorp_test.*

Namespace Selectors

To fetch a subset of the available namespaces, Kiali supports an optional Kubernetes label selector. This selector is especially useful when spec.deployment.accessible_namespaces is set to ["+++**+++"] but you want to reduce the namespaces presented in the UI’s namespace list.

The label selector is defined in the Kiali CR setting spec.api.namespaces.label_selector.

The example below selects all namespaces that have a label kiali-enabled: true:

api:
  namespaces:
    label_selector: kiali-enabled=true

For further information on how this label_selector interacts with spec.deployment.accessible_namespaces read the technical documentation.

To label a namespace you can use the following command. For more information see the link:https://kubernetes.io/docs/concepts/overview/working-with-objects/labels[Kubernete’s official documentation].

  kubectl label namespace my-namespace kiali-enabled=true

Note that when deploying multiple control planes in the same cluster, you will want to set the label selector’s value unique to each control plane. This allows each Kiali instance to select only the namespaces relevant to each control plane. Because in this “soft-multitenancy” mode spec.deployment.accessible_namespaces is typically set to an explicit set of namespaces (i.e. not ["+++**+++"]), you do not have to do anything with this label_selector. This is because the default value of label_selector is kiali.io/member-of: <spec.istio_namespace> when spec.deployment.accessible_namespaces is not set to the “all namespaces” value ["+++**+++"]. This allows you to have multiple control planes in the same cluster, with each control plane having its own Kiali instance. If you set your own Kiali instance name in the Kiali CR (i.e. you set spec.deployment.instance_name to something other than kiali), then the default label will be kiali.io/<spec.deployment.instance_name>.member-of: <spec.istio_namespace>.

3.3 - Role-based access control

Introduction

Kiali supports role-based access control (RBAC) when you are using either the openid, openshift or token authentication strategies.

If you are using the anonymous strategy, RBAC isn’t supported, but you still can limit privileges if your cluster is OpenShift. See the access control section in Anonymous strategy .

Kiali uses the RBAC capabilities of the underlying cluster. Thus, RBAC is accomplished by using the standard RBAC features of the cluster, which is through ClusterRoles, ClusterRoleBindings, Roles and RoleBindings resources. Read the Kubernetes RBAC documentation for details. If you are using OpenShift, read the OpenShift RBAC documentation.

In general, Kiali will give access to the resources granted to the account used to login. Specifically, depending on the authentication strategy, this translates to:

Authentication Strategy Access To
openid resources granted to the user of the third-party authentication system
openshift resources granted to the OpenShift user
token resources granted to the ServiceAccount whose token was used to login

For example, if you are using the token strategy, you would grant cluster-wide privileges to a ServiceAccount with this command:

$ kubectl create clusterrolebinding john-binding --clusterrole=kiali --serviceaccount=mynamespace:john

and if you are using openshift or openid strategies, you could assign privileges with any of these commands:

$ kubectl create rolebinding john-openid-binding --clusterrole=kiali --user="john@example.com" --namespace=mynamespace
$ oc adm policy add-role-to-user kiali john -n mynamespace # For OpenShift clusters

Please read your cluster RBAC documentation to learn how to assign privileges.

Minimum required privileges to login

The get namespace privilege in some namespace is the minimum privilege needed in order to be able to login to Kiali. This means you need the following minimal Role bound to the user that wants to login:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
rules:
- apiGroups: [""]
  resources:
  - namespaces
  verbs:
  - get

This minimal Role will allow a user to login. Kiali may work partially, but some pages may be blank or show erroneous information, and errors will be logged constantly. You will need a broader set of privileges so that Kiali works fine.

Privileges required for Kiali to work correctly

The default installation of Kiali creates a ClusterRole with the needed privileges to take the most advantage of all Kiali features. Inspect the privileges with

kubectl describe clusterrole kiali

Alternatively, check in the Kiali Operator source code. See either the Kubernetes role.yaml template file, or the OpenShift role.yaml template file.

You can use this ClusterRole to assign privileges to users requiring access to Kiali. You can assign privileges either in one namespace, which will result in users being able to see only resources in that namespace; or assign cluster-wide privileges.

For example, to assign privileges to the john user and limiting access to the myApp namespace, you could run either:

$ kubectl create rolebinding john-binding --clusterrole=kiali --user="john" --namespace=myApp
$ oc adm policy add-role-to-user kiali john -n myApp # For OpenShift clusters

But if you need to assign cluster-wide privileges, you could run either:

$ kubectl create clusterrolebinding john-admin-binding --clusterrole=kiali --user="john"
$ oc adm policy add-cluster-role-to-user kiali john # For OpenShift clusters

In case you need to assign a more limited set of privileges than the ones present in the Kiali ClusterRole, create your own ClusterRole or Role based off the privileges in Kiali’s ClusterRole and remove the privileges you want to ban. You must understand that some Kiali features may not work properly because of the reduced privilege set.

3.4 - Virtual Machine workloads

Introduction

Kiali graph visualizes both Virtual Machine workloads (WorkloadEntry) and pod-based workloads, running inside a Kubernetes cluster. You must ensure that the Istio Proxy is running, and correctly configured, on the Virtual Machine. Also, Prometheus must be able to scrape the metrics endpoint of the Istio Proxy running on the VM. Kiali will then be able to read the traffic telemetry for the Virtual Machine workloads, and incorporate the VM workloads into the graph.

Configuring Prometheus to scrape VM-based Istio Proxy

Once the Istio Proxy is running on a Virtual Machine, configuring Prometheus to scrape the VM’s Istio Proxy metrics endpoint is the only configuration Kiali needs to display traffic for the VM-based workload. Configuring Prometheus will vary between environments. Here is a very simple example of a Prometheus configuration that includes a job to scrape VM based workloads:

- job_name: bookinfo-vms
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /stats/prometheus
  scheme: http
  follow_redirects: true
  static_configs:
  - targets:
    - details-v1:15020
    - productpage-v1:15020
    - ratings-v1:15020
    - reviews-v1:15020
    - reviews-v2:15020
    - reviews-v3:15020

3.5 - Custom Dashboards

Custom Dashboards require some configuration to work properly.

Declaring a custom dashboard

When installing Kiali, you define your own custom dashboards in the Kiali CR spec.custom_dashboards field. Here’s an example of what it looks like:

custom_dashboards:
- name: vertx-custom
  title: Vert.x Metrics
  runtime: Vert.x
  discoverOn: "vertx_http_server_connections"
  items:
  - chart:
      name: "Server response time"
      unit: "seconds"
      spans: 6
      metrics:
      - metricName: "vertx_http_server_responseTime_seconds"
        displayName: "Server response time"
      dataType: "histogram"
      aggregations:
      - label: "path"
        displayName: "Path"
      - label: "method"
        displayName: "Method"
  - chart:
      name: "Server active connections"
      unit: ""
      spans: 6
      metricName: "vertx_http_server_connections"
      dataType: "raw"
  - include: "micrometer-1.1-jvm"
  externalLinks:
  - name: "My custom Grafana dashboard"
    type: "grafana"
    variables:
      app: var-app
      namespace: var-namespace
      version: var-version

The name field corresponds to what you can set in the pod annotation kiali.io/dashboards.

The rest of the field definitions are:

  • runtime: optional, name of the related runtime. It will be displayed on the corresponding Workload Details page. If omitted no name is displayed.
  • title: dashboard title, displayed as a tab in Application or Workloads Details
  • discoverOn: metric name to match for auto-discovery. If omitted, the dashboard won’t be discovered automatically, but can still be used via pods annotation.
  • items: a list of items, that can be either chart, to define a new chart, or include to reference another dashboard
    • chart: new chart object
      • name: name of the chart
      • chartType: type of the chart, can be one of line (default), area, bar or scatter
      • unit: unit for Y-axis. Free-text field to provide any unit suffix. It can eventually be scaled on display. See specific section below.
      • unitScale: in case the unit needs to be scaled by some factor, set that factor here. For instance, if your data is in milliseconds, set 0.001 as scale and seconds as unit.
      • spans: number of “spans” taken by the chart, from 1 to 12, using bootstrap convention
      • metrics: a list of metrics to display on this single chart:
        • metricName: the metric name in Prometheus
        • displayName: name to display on chart
      • dataType: type of data to be displayed in the chart. Can be one of raw, rate or histogram. Raw data will be queried without transformation. Rate data will be queried using promQL rate() function. And histogram with histogram_quantile() function.
      • min and max: domain for Y-values. When unset, charts implementations should usually automatically adapt the domain with the displayed data.
      • xAxis: type of the X-axis, can be one of time (default) or series. When set to series, only one datapoint per series will be displayed, and the chart type then defaults to bar.
      • aggregator: defines how the time-series are aggregated when several are returned for a given metric and label set. For example, if a Deployment creates a ReplicaSet of several Pods, you will have at least one time-series per Pod. Since Kiali shows the dashboards at the workload (ReplicaSet) level or at the application level, they will have to be aggregated. This field can be used to fix the aggregator, with values such as sum or avg (full list available in Prometheus documentation). However, if omitted the aggregator will default to sum and can be changed from the dashboard UI.
      • aggregations: list of labels eligible for aggregations / groupings (they will be displayed in Kiali through a dropdown list)
        • label: Prometheus label name
        • displayName: name to display in Kiali
        • singleSelection: boolean flag to switch between single-selection and multi-selection modes on the values of this label. Defaults to false.
      • groupLabels: a list of Prometheus labels to be used for grouping. Similar to aggregations, except this grouping will be always turned on.
      • sortLabel: Prometheus label to be used for the metrics display order.
      • sortLabelParseAs: set to int if sortLabel needs to be parsed and compared as an integer instead of string.
    • include: to include another dashboard, or a specific chart from another dashboard. Typically used to compose with generic dashboards such as the ones about MicroProfile Metrics or Micrometer-based JVM metrics. To reference a full dashboard, set the name of that dashboard. To reference a specific chart of another dashboard, set the name of the dashboard followed by $ and the name of the chart (ex: include: "microprofile-1.1$Thread count").
  • externalLinks: a list of related external links (e.g. to Grafana dashboards)
    • name: name of the related dashboard in the external system (e.g. name of a Grafana dashboard)
    • type: link type, currently only grafana is allowed
    • variables: a set of variables that can be injected in the URL. For instance, with something like namespace: var-namespace and app: var-app, an URL to a Grafana dashboard that manages namespace and app variables would look like: http://grafana-server:3000/d/xyz/my-grafana-dashboard?var-namespace=some-namespace&var-app=some-app. The available variables in this context are namespace, app and version.

In Kiali, labels for grouping are aggregated in the top toolbar, so if the same label refers to different things depending on the metric, you wouldn’t be able to distinguish them in the UI. For that reason, ideally, labels should not have too generic names in Prometheus. For instance labels named “id” for both memory spaces and buffer pools would better be named “space_id” and “pool_id”. If you have control on label names, it’s an important aspect to take into consideration. Else, it is up to you to organize dashboards with that in mind, eventually splitting them into smaller ones to resolve clashes.

Dashboard scope

The custom dashboards defined in the Kiali CR are available for all workloads in all namespaces.

Additionally, new custom dashboards can be created for a given namespace or workload, using the dashboards.kiali.io/templates annotation.

This is an example where a “Custom Envoy” dashboard will be available for all applications and workloads for the default namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    dashboards.kiali.io/templates: |
      - name: custom_envoy
        title: Custom Envoy
        discoverOn: "envoy_server_uptime"
        items:
          - chart:
              name: "Pods uptime"
              spans: 12
              metricName: "envoy_server_uptime"
              dataType: "raw"

This other example will create an additional “Active Listeners” dashboard only on details-v1 workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: details-v1
  labels:
    app: details
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: details
      version: v1
  template:
    metadata:
      labels:
        app: details
        version: v1
      annotations:
        dashboards.kiali.io/templates: |
          - name: envoy_listeners
            title: Active Listeners
            discoverOn: "envoy_listener_manager_total_listeners_active"
            items:
              - chart:
                  name: "Total Listeners"
                  spans: 12
                  metricName: "envoy_listener_manager_total_listeners_active"
                  dataType: "raw"
    spec:
      serviceAccountName: bookinfo-details
      containers:
      - name: details
        image: docker.io/istio/examples-bookinfo-details-v1:1.16.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9080
        securityContext:
          runAsUser: 1000

Units

Some units are recognized in Kiali and scaled appropriately when displayed on charts:

  • unit: "seconds" can be scaled down to ms, µs, etc.
  • unit: "bytes-si" and unit: "bitrate-si" can be scaled up to kB, MB (etc.) using SI / metric system. The aliases unit: "bytes" and unit: "bitrate" can be used instead.
  • unit: "bytes-iec" and unit: "bitrate-iec" can be scaled up to KiB, MiB (etc.) using IEC standard / IEEE 1541-2002 (scale by powers of 2).

Other units will fall into the default case and be scaled using SI standard. For instance, unit: "m" for meter can be scaled up to km.

Prometheus Configuration

Kiali custom dashboards work exclusively with Prometheus, so it must be configured correctly to pull your application metrics.

If you are using the demo Istio installation with addons, your Prometheus instance should already be correctly configured and you can skip to the next section; with the exception of Istio 1.6.x where you need customize the ConfigMap, or install Istio with the flag --set meshConfig.enablePrometheusMerge=true.

Using another Prometheus instance

You can use a different instance of Prometheus for these metrics, as opposed to Istio metrics. This second Prometheus instance can be configured from the Kiali CR when using the Kiali operator, or ConfigMap otherwise:

# ...
external_services:
  custom_dashboards:
    prometheus:
      url: URL_TO_PROMETHEUS_SERVER_FOR_CUSTOM_DASHBOARDS
    namespace_label: kubernetes_namespace
  prometheus:
    url: URL_TO_PROMETHEUS_SERVER_FOR_ISTIO_METRICS
# ...

For more details on this configuration, such as Prometheus authentication options, check this page.

You must make sure that this Prometheus instance is correctly configured to scrape your application pods and generates labels that Kiali will understand. Please refer to this documentation to setup the kubernetes_sd_config section. As a reference, here is how it is configured in Istio.

It is important to preserve label mapping, so that Kiali can filter by app and version, and to have the same namespace label as defined per Kiali config. Here’s a relabel_configs that allows this:

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace

Pod Annotations and Auto-discovery

Application pods must be annotated for the Prometheus scraper, for example, within a Deployment definition:

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
  • prometheus.io/scrape tells Prometheus to fetch these metrics or not
  • prometheus.io/port is the port under which metrics are exposed
  • prometheus.io/path is the endpoint path where metrics are exposed, default is /metrics

Kiali will try to discover automatically dashboards that are relevant for a given Application or Workload. To do so, it reads their metrics and try to match them with the discoverOn field defined on dashboards.

But if you can’t rely on automatic discovery, you can explicitly annotate the pods to associate them with Kiali dashboards.

spec:
  template:
    metadata:
      annotations:
        # (prometheus annotations...)
        kiali.io/dashboards: vertx-server

kiali.io/dashboards is a comma-separated list of dashboard names that Kiali will look for. Each name in the list must match the name of a built-in dashboard or the name of a custom dashboard as defined in the Kial CR’s spec.custom_dashboards.

3.6 - Jaeger / Distributed Tracing

Below are some commonly used configuration options for Kiali’s Jaeger integration for Tracing.

For advanced tracing integration, you can also refer to the Kiali CR external_services.tracing section. For Helm installations, it is valid for the config map as well.

In-Cluster URL

In order to fully integrate with Jaeger, Kiali needs a URL that can be resolved from inside the cluster, typically using Kubernetes DNS. This is the in_cluster_url configuration. For instance, for a Jaeger service named tracing, within istio-system namespace, Kiali config would be:

  external_services:
    tracing:
      in_cluster_url: 'http://tracing.istio-system/jaeger'

External URL

When not using an In-Cluster configuration, Kiali can be configuring with an external URL. This can be useful, for example, if Jaeger is not accessible from the Kiali pod. Doing so will enable links from Kiali to the Jaeger UI. This URL needs to be accessible from the browser (it’s used for links generation). Example:

  external_services:
    tracing:
      in_cluster_url: 'http://tracing.istio-system/jaeger'
      url: 'http://my-jaeger-host/jaeger'

Once this URL is set, Kiali will show an additional item to the main menu:

Distributed Tracing View

4 - Tutorials

Kiali Tutorials.

The following tutorials are designed to help users understand how to use Istio with Kiali, features, configuration, etc. They are highly recommended!

4.1 - Travel Demo Tutorial

Learn how to use Kiali to configure, observe and manage Istio.

This tutorial uses the Kiali Travel Demo to teach Kiali and Istio features.

4.1.1 - Prerequisites

Platform Setup

This tutorial assumes you will have access to a Kubernetes cluster with Istio installed.

This tutorial has been tested using:

Install Istio

Once you have your Kubernetes cluster ready, follow the Istio Getting Started, window="_blank" to install and setup a demo profile that will be used in this tutorial.

DNS entries can be added in a basic way to the /etc/hosts file but you can use any other DNS service that allows to resolve a domain with the external Ingress IP.

Update Kiali

Istio defines a specific Kiali version as an addon.

In this tutorial we are going to update Kiali to the latest release version.

Assuming you have installed the addons following the Istio Getting Started guide, you can uninstall Kiali with the command:

kubectl delete -f ${ISTIO_HOME}/samples/addons/kiali.yaml --ignore-not-found

There are multiple ways to install a recent version of Kiali, this tutorial follows the Quick Start using Helm Chart.

helm install \
  --namespace istio-system \
  --set auth.strategy="anonymous" \
  --repo https://kiali.org/helm-charts \
  kiali-server \
  kiali-server

Access the Kiali UI

The Istio istioctl client has an easy method to expose and access Kiali:

${ISTIO_HOME}/bin/istioctl dashboard kiali

There are other alternatives to expose Kiali or other Addons in Istio. Check Remotely Accessing Telemetry Addons for more information.

After the Prerequisites you should be able to access Kiali. Verify its version by clicking the “?” icon and selecting “About”:

Verify Kiali Access

4.1.2 - Install Travel Demo

Deploy the Travel Demo

This demo application will deploy several services grouped into three namespaces.

Note that at this step we are going to deploy the application without any reference to Istio.

We will join services to the ServiceMesh in a following step.

To create and deploy the namespaces perform the following commands:

kubectl create namespace travel-agency
kubectl create namespace travel-portal
kubectl create namespace travel-control

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_agency.yaml) -n travel-agency
kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_portal.yaml) -n travel-portal
kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_control.yaml) -n travel-control

Check that all deployments rolled out as expected:

$ kubectl get deployments -n travel-control
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
control   1/1     1            1           85s

$ kubectl get deployments -n travel-portal
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
travels   1/1     1            1           91s
viaggi    1/1     1            1           91s
voyages   1/1     1            1           91s

$ kubectl get deployments -n travel-agency
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
cars-v1         1/1     1            1           96s
discounts-v1    1/1     1            1           96s
flights-v1      1/1     1            1           96s
hotels-v1       1/1     1            1           96s
insurances-v1   1/1     1            1           96s
mysqldb-v1      1/1     1            1           96s
travels-v1      1/1     1            1           96s

Understanding the demo application

Travel Portal namespace

The Travel Demo application simulates two business domains organized in different namespaces.

In a first namespace called travel-portal there will be deployed several travel shops, where users can search for and book flights, hotels, cars or insurance.

The shop applications can behave differently based on request characteristics like channel (web or mobile) or user (new or existing).

These workloads may generate different types of traffic to imitate different real scenarios.

All the portals consume a service called travels deployed in the travel-agency namespace.

Travel Agency namespace

A second namespace called travel-agency will host a set of services created to provide quotes for travel.

A main travels service will be the business entry point for the travel agency. It receives a destination city and a user as parameters and it calculates all elements that compose a travel budget: airfare, lodging, car reservation and travel insurance.

Each service can provide an independent quote and the travels service must then aggregate them into a single response.

Additionally, some users, like registered users, can have access to special discounts, managed as well by an external service.

Service relations between namespaces can be described in the following diagram:

Travel Demo Design

Travel Portal and Travel Agency flow

A typical flow consists of the following steps:

. A portal queries the travels service for available destinations. . Travels service queries the available hotels and returns to the portal shop. . A user selects a destination and a type of travel, which may include a flight and/or a car, hotel and insurance. . Cars, Hotels and Flights may have available discounts depending on user type.

Travel Control namespace

The travel-control namespace runs a business dashboard with two key features:

  • Allow setting changes for every travel shop simulator (traffic ratio, device, user and type of travel).
  • Provide a business view of the total requests generated from the travel-portal namespace to the travel-agency services, organized by business criteria as grouped per shop, per type of traffic and per city.

Travel Dashboard

4.1.3 - First Steps

Missing Sidecars

The Travel Demo has been deployed in the previous step but without installing any Istio sidecar proxy.

In that case, the application won’t connect to the control plane and won’t take advantage of Istio’s features.

In Kiali, we will see the new namespaces in the overview page:

Overview

But we won’t see any traffic in the graph page for any of these new namespaces:

Empty Graph

If we examine the Applications, Workloads or Services page, it will confirm that there are missing sidecars:

Missing Sidecar

Enable Sidecars

In this tutorial, we will add namespaces and workloads into the ServiceMesh individually step by step.

This will help you to understand how Istio sidecar proxies work and how applications can use Istio’s features.

We are going to start with the control workload deployed into the travel-control namespace:

Enable Auto Injection per Namespace

Enable Auto Injection per Workkload

Understanding what happened:

(i) Sidecar Injection

(ii) Injection Concepts

Open Travel Demo to Outside Traffic

The control workload now has an Istio sidecar proxy injected but this application is not accessible from the outside.

In this step we are going to expose the control service using an Istio Ingress Gateway which will map a path to a route at the edge of the mesh.

For minikube we will check the External IP of the Ingress gateway:

$ kubectl get services/istio-ingressgateway -n istio-system
NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                                                                      AGE
istio-ingressgateway   LoadBalancer   10.101.6.144   10.101.6.144   15021:30757/TCP,80:32647/TCP,443:30900/TCP,31400:30427/TCP,15443:31072/TCP   19h

And we will add a simple entry to the /etc/hosts of the tutorial machine with the desired DNS entry:

...
10.101.6.144 control.travel-control.istio-cluster.org
...

Then, from this machine, the url control.travel-control.istio-cluster.org will resolve to the External IP of the Ingress Gateway of Istio.

For OpenShift we will expose the Ingress gateway as a service:

$ oc expose service istio-ingressgateway -n istio-system
$ oc get routes -n istio-system
NAME                   HOST/PORT                                  PATH   SERVICES               PORT    TERMINATION          WILDCARD
istio-ingressgateway   <YOUR_ROUTE_HOST>                                 istio-ingressgateway   http2                        None

Then, from this machine, the host <YOUR_ROUTE_HOST> will resolve to the External IP of the Ingress Gateway of Istio. For OpenShift we will not define a DNS entry, instead, where you see control.travel-control.istio-cluster.org in the steps below, subsitute the value of <YOUR_ROUTE_HOST>.

Request Routing Wizard

Use “Add Route Rule” button to add a default rule where any request will be routed to the control workload.

Routing Rule

Use the Advanced Options and add a gateway with host control.travel-control.istio-cluster.org and create the Istio config.

Create Gateway

Verify the Istio configuration generated.

Istio Config

Test Gateway

Travel Control Graph

Understanding what happened:

  • External traffic enters into the cluster through a Gateway
  • Traffic is routed to the control service through a VirtualService
  • Kiali Graph visualizes the traffic telemetry reported from the control sidecar proxy
    • Only the travel-control namespace is part of the mesh

(i) Istio Gateway

(ii) Istio Virtual Service

4.1.4 - Observe

Enable Sidecars in all workloads

An Istio sidecar proxy adds a workload into the mesh.

Proxies connect with the control plane and provide Service Mesh functionality.

Automatically providing metrics, logs and traces is a major feature of the sidecar.

In the previous steps we have added a sidecar only in the travel-control namespace’s control workload.

We have added new powerful features but the application is still missing visibility from other workloads.

Missing Sidecars

That control workload provides good visibility of its traffic, but telemetry is partially enabled, as travel-portal and travel-agency workloads don’t have sidecar proxies.

In the First Steps of this tutorial we didn’t inject the sidecar proxies on purpose to show a scenario where only some workloads may have sidecars.

Typically, Istio users annotate namespaces before the deployment to allow Istio to automatically add the sidecar when the application is rolled out into the cluster. Perform the following commands:

kubectl label namespace travel-agency istio-injection=enabled
kubectl label namespace travel-portal istio-injection=enabled

kubectl delete -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_agency.yaml) -n travel-agency
kubectl delete -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_portal.yaml) -n travel-portal

// Wait until all services, deployments and pods are deleted from the travel-agency and travel-portal namespaces

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_agency.yaml) -n travel-agency
kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_portal.yaml) -n travel-portal


Verify that travel-control, travel-portal and travel-agency workloads have sidecars deployed:

Updated Workloads

Updated Telemetry

Graph walkthrough

The graph provides a powerful set of Graph Features to visualize the traffic topology of the service mesh.

In this step, we will show how to use the Graph to show relevant information in the context of the Travel Demo application.

Our goal will be to identify the most critical service of the demo application.

Graph Request Distribution

Review the status of the mesh, everything seems healthy, but also note that hotels service has more load compared to other services inlcuded in the travel-agency namespace.

Hotels Normal Trace

Combining telemetry and tracing information will show that there are traces started from a portal that involve multiple services but also other traces that only consume the hotels service.

Hotels Single Trace

Travels Zoom

The graph can focus on an element to study a particular context in detail. Note that a contextual menu is available using right-click, to easily shortcut the navigation to other sections.

Application details

Kiali provides Detail Views to navigate into applications, workloads and services.

These views provide information about the structure, health, metrics, logs, traces and Istio configuration for any application component.

In this tutorial we are going to learn how to use them to examine the main travels application of our example.

Travels Application

An application is an abstract group of workloads and services labeled with the same “application” name.

From Service Mesh perspective this concept is significant as telemetry and tracing signals are mainly grouped by “application” even if multiple workloads are involved.

At this point of the tutorial, the travels application is quite simple, just a travels-v1 workload exposed through the travels service. Navigate to the travels-v1 workload detail by clicking the link in the travels application overview.

Travels-v1 Workload

Travels-v1 Metrics

The Metrics tab provides a powerful visualization of telemetry collected by the Istio proxy sidecar. It presents a dashboard of charts, each of which can be expanded for closer inspection. Expand the Request volume chart:

Travels-v1 Request Volume Chart

Metrics Settings provides multiple predefined criteria out-of-the-box. Additionally, enable the spans checkbox to correlate metrics and tracing spans in a single chart.

We can see in the context of the Travels application, the hotels service request volume differs from that of the other travel-agency services.

By examining the Request Duration chart also shows that there is no suspicious delay, so probably this asymmetric volume is part of the application business' logic.

The Logs tab provides a unified view of application container logs with the Istio sidecar proxy logs. It also offers a spans checkbox, providing a correlated view of both logs and tracing, helping identify spans of interest.

From the application container log we can spot that there are two main business methods: GetDestinations and GetTravelQuote.

In the Istio sidecar proxy log we see that GetDestinations invokes a GET /hotels request without parameters.

Travels-v1 Logs GetDestinations

However, GetTravelQuote invokes multiple requests to other services using a specific city as a parameter.

Travels-v1 Logs GetTravelQuote

Then, as discussed in the Travel Demo design, an initial query returns all available hotels before letting the user choose one and then get specific quotes for other destination services.

That scenario is shown in the increase of the hotels service utilization.

Now we have identified that the hotels service has more use than other travel-agency services.

The next step is to get more context to answer if some particular service is acting slower than expected.

The Traces tab allows comparison between traces and metrics histograms, letting the user determine if a particular spike is expected in the context of average values.

Travels-v1 Traces

In the same context, individual spans can be compared in more detail, helping to identify a problematic step in the broader scenario.

Travels-v1 Spans

4.1.5 - Connect

Request Routing

The Travel Demo application has several portals deployed on the travel-portal namespace consuming the travels service deployed on the travel-agency namespace.

The travels service is backed by a single workload called travels-v1 that receives requests from all portal workloads.

At a moment of the lifecycle the business needs of the portals may differ and new versions of the travels service may be necessary.

This step will show how to route requests dynamically to multiple versions of the travels service.

To deploy the new versions of the travels service execute the following commands:

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travels-v2.yaml) -n travel-agency
kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travels-v3.yaml) -n travel-agency

Travels-v2 and travels-v3

As there is no specific routing defined, when there are multiple workloads for travels service the requests are uniformly distributed.

Travels graph before routing

The Traffic Management features of Istio allow you to define Matching Conditions for dynamic request routing.

In our scenario we would like to perform the following routing logic:

  • All traffic from travels.uk routed to travels-v1
  • All traffic from viaggi.it routed to travels-v2
  • All traffic from voyages.fr routed to travels-v3

Portal workloads use HTTP/1.1 protocols to call the travels service, so one strategy could be to use the HTTP headers to define the matching condition.

But, where to find the HTTP headers ? That information typically belongs to the application domain and we should examine the code, documentation or dynamically trace a request to understand which headers are being used in this context.

There are multiple possibilities. The Travel Demo application uses an Istio Annotation feature to add an annotation into the Deployment descriptor, which adds additional Istio configuration into the proxy.

Istio Config annotations

In our example the HTTP Headers are added as part of the trace context.

Then tracing will populate custom tags with the portal, device, user and travel used.

Travels Service Request Routing

We will define three “Request Matching” rules as part of this request routing. Define all three rules before clicking the Create button.

In the first rule, we will add a request match for when the portal header has the value of travels.uk.

Define the exact match, like below, and click the “Add Match” button to update the “Matching selected” for this rule.

Add Request Matching

Move to “Route To” tab and update the destination for this “Request Matching” rule. Then use the “Add Route Rule” to create the first rule.

Route To

Add similar rules to route traffic from viaggi.it to travels-v2 workload and from voyages.fr to travels-v3 workload.

When the three rules are defined you can use “Create” button to generate all Istio configurations needed for this scenario. Note that the rule ordering does not matter in this scenario.

Rules Defined

The Istio config for a given service is found on the “Istio Config” card, on the Service Details page.

Service Istio Config

Once the Request Routing is working we can verify that outbound traffic from every portal goes to the single travels workload. To see this clearly use a “Workload Graph” for the “travel-portal” namespace, enable “Traffic Distribution” edge labels and disable the “Service Nodes” Display option:

Travel Portal Namespace Graph

Note that no distribution label on an edge implies 100% of traffic.

Examining the “Inbound Traffic” for any of the travels workloads will show a similar pattern in the telemetry.

Travels v1 Inbound Traffic

Using a custom time range to select a large interval, we can see how the workload initially received traffic from all portals but then only a single portal after the Request Routing scenarios were defined.

Kiali Wizards allow you to define high level Service Mesh scenarios and will generate the Istio Configuration needed for its implementation (VirtualServices, DestinationRules, Gateways and PeerRequests). These scenarios can be updated or deleted from the “Actions” menu of a given service.

To experiment further you can navigate to the travels service and update your configuration by selecting “Request Routing”, as shown below. When you have finished experimenting with Routing Request scenarios then use the “Actions” menu to delete the generated Istio config.

Update or Delete

Fault Injection

The Observe step has spotted that the hotels service has additional traffic compared with other services deployed in the travel-agency namespace.

Also, this service becomes critical in the main business logic. It is responsible for querying all available destinations, presenting them to the user, and getting a quote for the selected destination.

This also means that the hotels service may be one of the weakest points of the Travel Demo application.

This step will show how to test the resilience of the Travel Demo application by injecting faults into the hotels service and then observing how the application reacts to this scenario.

Fault Injection Action

Select an HTTP Delay and specify the “Delay percentage” and “Fixed Delay” values. The default values will introduce a 5 seconds delay into 100% of received uuests.

HTTP Delay

Telemetry is collected from proxies and it is labeled with information about the source and destination workloads.

In our example, let’s say that travels service (“Service A” in the Istio diagram below) invokes the hotels service (“Service B” in the diagram). Travels is the “source” workload and hotels is the “destination” workload. The travels proxy will report telemetry from the source perspective and hotels proxy will report telemetry from the destination perspective. Let’s look at the latency reporting from both perspectives.

Istio Architecture

The travels workload proxy has the Fault Injection configuration so it will perform the call to the hotels service and will apply the delay on the travels workload side (this is reported as source telemetry).

We can see in the hotels telemetry reported by the source (the travels proxy) that there is a visible gap showing 5 second delay in the request duration.

Source Metrics

But as the Fault Injection delay is applied on the source proxy (travels), the destination proxy (hotels) is unaffected and its destination telemetry show no delay.

Destination Metrics

The injected delay is propagated from the travels service to the downstream services deployed on travel-portal namespace, degrading the overall response time. But the downstream services are unaware, operate normally, and show a green status.

Degraded Response Time

As part of this step you can update the Fault Injection scenario to test different delays. When finished, you can delete the generated Istio config for the hotels service.

Traffic Shifting

In the previous Request Routing step we have deployed two new versions of the travels service using the travels-v2 and travels-v3 workloads.

That scenario showed how Istio can route specific requests to specific workloads. It was configured such that each portal deployed in the travel-portal namespace (travels.uk, viaggi.it and voyages.fr) were routed to a specific travels workload (travels-v1, travels-v2 and travels-v3).

This Traffic Shifting step will simulate a new scenario: the new travels-v2 and travels-v3 workloads will represent new improvements for the travels service that will be used by all requests.

These new improvements implemented in travels-v2 and travels-v3 represent two alternative ways to address a specific problem. Our goal is to test them before deciding which one to use as a next version.

At the beginning we will send 80% of the traffic into the original travels-v1 workload, and will split 10% of the traffic each on travels-v2 and travels-v3.

Traffic Shifting Action

Create a scenario with 80% of the traffic distributed to travels-v1 workload and 10% of the traffic distributed each to travels-v2 and travels-v3.

Split Traffic

Travels Graph

Istio Telemetry is grouped per logical application. That has the advantage of easily comparing different but related workloads, for one or more services.

In our example, we can use the “Inbound Metrics” and “Outbound Metrics” tabs in the travels application details, group by “Local version” and compare how travels-v2 and travels-v3 are working.

Compare Travels Workloads Compare Travels Workloads

The charts show that the Traffic distribution is working accordingly and 80% is being distributed to travels-v1 workload and they also show no big differences between travels-v2 and travels-v3 in terms of request duration.

As part of this step you can update the Traffic Shifting scenario to test different distributions. When finished, you can delete the generated Istio config for the travels service.

TCP Traffic Shifting

The Travel Demo application has a database service used by several services deployed in the travel-agency namespace.

At some point in the lifecycle of the application the telemetry shows that the database service degrades and starts to increase the average response time.

This is a common situation. In this case, a database specialist suggests an update of the original indexes due to the data growth.

Our database specialist is suggesting two approaches and proposes to prepare two versions of the database service to test which may work better.

This step will show how the “Traffic Shifting” strategy can be applied to TCP services to test which new database indexing strategy works better.

To deploy the new versions of the mysqldb service execute the commands:

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/mysql-v2.yaml) -n travel-agency
kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/mysql-v3.yaml) -n travel-agency

TCP Traffic Shifting Action

Create a scenario with 80% of the traffic distributed to mysqldb-v1 workload and 10% of the traffic distributed each to mysqldb-v2 and mysqldb-v3.

TCP Split Traffic

MysqlDB Graph

Note that TCP telemetry has different types of metrics, as “Traffic Distribution” is only available for HTTP/gRPC services, for this service we need to use “Traffic Rate” to evaluate the distribution of data (bytes-per-second) between mysqldb workloads.

TCP services have different telemetry but it’s still grouped by versions, allowing the user to compare and study pattern differences for mysqldb-v2 and mysqldb-v3.

Compare MysqlDB Workloads

The charts show more peaks in mysqldb-v2 compared to mysqldb-v3 but overall a similar behavior, so it’s probably safe to choose either strategy to shift all traffic.

As part of this step you can update the TCP Traffic Shifting scenario to test a different distribution. When finished, you can delete the generated Istio config for the mysqldb service.

Request Timeouts

In the Fault Injection step we showed how we could introduce a delay in the critical hotels service and test the resilience of the application.

The delay was propagated across services and Kiali showed how services accepted the delay without creating errors on the system.

But in real scenarios delays may have important consequences. Services may prefer to fail sooner, and recover, rather than propagating a delay across services.

This step will show how to add a request timeout for one of the portals deployed in travel-portal namespace. The travel.uk and viaggi.it portals will accept delays but voyages.fr will timeout and fail.

Repeat the Fault Injection step to add delay on hotels service.

Add a rule to add a request timeout only on requests coming from voyages.fr portal:

  • Use the Request Matching tab to add a matching condition for the portal header with voyages.fr value.
  • Use the Request Timeouts tab to add an HTTP Timeout for this rule.
  • Add the rule to the scenario.

Request Timeout Rule

A first rule should be added to the list like:

Voyages Portal Rule

Add a second rule to match any request and create the scenario. With this configuration, requests coming from voyages.fr will match the first rule and all others will match the second rule.

Any Request Rule

Create the rule. The Graph will show how requests coming from voyages.fr start to fail, due to the request timeout introduced.

Requests coming from other portals work without failures but are degraded by the hotels delay.

Travels Graph

This scenario can be visualized in detail if we examine the “Inbound Metrics” and we group by “Remote app” and “Response code”.

Travels Inbound Metrics Travels Inbound Metrics

As expected, the requests coming from voyages.fr don’t propagate the delay and they fail in the 2 seconds range, meanwhile requests from other portals don’t fail but they propagate the delay introduced in the hotels service.

As part of this step you can update the scenarios defined around hotels and travels services to experiment with more conditions, or you can delete the generated Istio config in both services.

Circuit Breaking

Distributed systems will benefit from failing quickly and applying back pressure, as opposed to propagating delays and errors through the system.

Circuit breaking is an important technique used to limit the impact of failures, latency spikes, and other types of network problems.

This step will show how to apply a Circuit Breaker into the travels service in order to limit the number of concurrent requests and connections.

In this example we are going to deploy a new workload that will simulate an important increase in the load of the system.

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_loadtester.yaml) -n travel-portal

The loadtester workload will try to create 50 concurrent connections to the travels service, adding considerable pressure to the travels-agency namespace.

Loadtester Graph

The Travel Demo application is capable of handling this load and in a first look it doesn’t show unhealthy status.

Loadtester Details

But in a real scenario an unexpected increase in the load of a service like this may have a significant impact in the overall system status.

Use the “Traffic Shifting” Wizard to distribute traffic (evenly) to the travels workloads and use the “Advanced Options” to add a “Circuit Breaker” to the scenario.

Traffic Shifting with Circuit Breaker

The “Connection Pool” settings will indicate that the proxy sidecar will reject requests when the number of concurrent connections and requests exceeds more than one.

The “Outlier Detection” will eject a host from the connection pool if there is more than one consecutive error.

In the loadtester versioned-app Graph we can see that the travels service’s Circuit Breaker accepts some, but fails most, connections.

Remember, that these connections are stopped by the proxy on the loadtester side. That “fail sooner” pattern prevents overloading the network.

Using the Graph we can select the failed edge, check the Flags tab, and see that those requests are closed by the Circuit breaker.

Loadtester Flags Graph

If we examine the “Request volume” metric from the “Outbound Metrics” tab we can see the evolution of the requests, and how the introduction of the Circuit Breaker made the proxy reduce the request volume.

Loadtester Outbound Metrics

As part of this step you can update the scenarios defined around the travels service to experiment with more Circuit Breaker settings, or you can delete the generated Istio config in the service.

Understanding what happened:

(i) Circuit Breaking

(ii) Outlier Detection

(iii) Connection Pool Settings

(iv) Envoy’s Circuit breaking Architecture

Mirroring

This tutorial has shown several scenarios where Istio can route traffic to different versions in order to compare versions and evaluate which one works best.

The Traffic Shifting step was focused on travels service adding a new travels-v2 and travels-v3 workloads and the TCP Traffic Shifting showed how this scenario can be used on TCP services like mysqldb service.

Mirroring (or shadowing) is a particular case of the Traffic Shifting scenario where the proxy sends a copy of live traffic to a mirrored service.

The mirrored traffic happens out of band of the primary request path. It allows for testing of alternate services, in production environments, with minimal risk.

Istio mirrored traffic is only supported for HTTP/gRPC protocols.

This step will show how to apply mirrored traffic into the travels service.

We will simulate the following:

  • travels-v1 is the original traffic and it will keep 80% of the traffic
  • travels-v2 is the new version to deploy, it’s being evaluated and it will get 20% of the traffic to compare against travels-v1
  • But travels-v3 will be considered as a new, experimental version for testing outside of the regular request path. It will be defined as a mirrored workload on 50% of the original requests.

Mirrored Traffic

Note that Istio does not report mirrored traffic telemetry from the source proxy. It is reported from the destination proxy, although it is not flagged as mirrored, and therefore an edge from travels to the travels-v3 workload will appear in the graph. Note the traffic rates reflect the expected ratio of 80/20 between travels-v1 and travels-v2, with travels-v3 at about half of that total.

Mirrored Graph

This can be examined better using the “Source” and “Destination” metrics from the “Inbound Metrics” tab.

The “Source” proxy, in this case the proxies injected into the workloads of travel-portal namespace, won’t report telemetry for travels-v3 mirrored workload.

Mirrored Source Metrics

But the “Destination” proxy, in this case the proxy injected in the travels-v3 workload, will collect the telemetry from the mirrored traffic.

Mirrored Destination Metrics

As part of this step you can update the Mirroring scenario to test different mirrored distributions.

When finished you can delete the generated Istio config for the travels service.

4.1.6 - Secure

Authorization Policies and Sidecars

Security is one of the main pillars of Istio features.

The Istio Security High Level Architecture provides a comprehensive solution to design and implement multiple security scenarios.

In this tutorial we will show how Kiali can use telemetry information to create security policies for the workloads deployed in a given namespace.

Istio telemetry aggregates the ServiceAccount information used in the workloads communication. This information can be used to define authorization policies that deny and allow actions on future live traffic communication status.

Additionally, Istio sidecars can be created to limit the hosts with which a given workload can communicate. This improves traffic control, and also reduces the memory footprint of the proxies.

This step will show how we can define authorization policies for the travel-agency namespace, in the Travel Demo application, for all existing traffic in a given time period.

Once authorization policies are defined, a new workload will be rejected if it doesn’t match the security rules defined.

In this example we will use the loadtester workload as the “intruder” in our security rules.

If we have followed the previous tutorial steps, we need to undeploy it from the system.

kubectl delete -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_loadtester.yaml) -n travel-portal

We should validate that telemetry has updated the travel-portal namespace and “Security” can be enabled in the Graph Display options.

Travel Portal Graph

Every workload in the cluster uses a Service Account.

travels.uk, viaggi.it and voyages.fr workloads use the default cluster.local/ns/travel-portal/sa/default ServiceAccount defined automatically per namespace.

This information is propagated into the Istio Telemetry and Kiali can use it to define a set of AuthorizationPolicy rules, and Istio Sidecars.

The Sidecars restrict the list of hosts with which each workload can communicate, based on the current traffic.

The “Create Traffic Policies” action, located in the Overview page, will create these definitions.

Create Traffic Policies

This will generate a main DENY ALL rule to protect the whole namespace, and an individual ALLOW rule per workload identified in the telemetry.

Travel Agency Authorization Policies

It will create also an individual Sidecar per workload, each of them containing the set of hosts.

Travel Agency Sidecars

As an example, we can see that for the travels-v1 workload, the following list of hosts are added to the sidecar.

Travels V1 Sidecar

If the loadtester workload uses a different ServiceAccount then, when it’s deployed, it won’t comply with the AuthorizationPolicy rules defined in the previous step.

kubectl apply -f <(curl -L https://raw.githubusercontent.com/kiali/demos/master/travels/travel_loadtester.yaml) -n travel-portal

Now, travels workload will reject requests made by loadtester workload and that situation will be reflected in Graph:

Loadtester Denied

This can also be verified in the details page using the Outbound Metrics tab grouped by response code (only the 403 line is present).

Loadtester Denied Metrics

Inspecting the Logs tab confirms that loadtester workload is getting a HTTP 403 Forbidden response from travels workloads, as expected.

Loadtester Logs

AuthorizationPolicy resources are defined per workload using matching selectors.

As part of the example, we can show how a ServiceAccount can be added into an existing rule to allow traffic from loadtester workload into the travels-v1 workload only.

AuthorizationPolicy Edit

As expected, now we can see that travels-v1 workload accepts requests from all travel-portal namespace workloads, but travels-v2 and travels-v3 continue rejecting requests from loadtester source.

Travels v1 AuthorizationPolicy

Using “Outbound Metrics” tab from the loadtester workload we can group per “Remote version” and “Response code” to get a detailed view of this AuthorizationPolicy change.

Travels v1 AuthorizationPolicy

According to Istio Sidecar documentation, Istio configures all mesh sidecar proxies to reach every mesh workload. After the sidecars are created, the list of hosts is reduced according to the current traffic. To verify this, we can look for the clusters configured in each proxy.

As an example, looking into the cars-v1 workload, we can see that there is a reduced number of clusters with which the proxy can communicate.

Cars v1 clusters

As part of this step, you can update the AuthorizationPolicies and Istio Sidecars generated for the travel-agency namespace, and experiment with more security rules. Or, you can delete the generated Istio config for the namespace.

4.1.7 - Uninstall Travel Demo

To uninstall the Travel Demo application perform the following commands:

kubectl delete namespace travel-agency
kubectl delete namespace travel-portal
kubectl delete namespace travel-control

5 - Architecture and Terms

High-level description of the Kiali architecture and a glossary of common terms.

5.1 - Architecture

Kiali architecture

Kiali is composed of two components: a back-end application running in the container application platform, and a user-facing front-end application. Kiali depends on external services and components provided by the container application platform and Istio.

The following diagram illustrates the components involved in Kiali and its interactions:

Kiali architecture

Kiali back-end

The back-end is the application that runs in the container application platform. It’s written in Go. The code can be found at kiali/kiali GitHub repository.

This is the component that communicates with Istio parts, retrieves and processes data, and exposes this data to the front-end.

The back-end doesn’t need storage. The back-end configuration is managed via the Kiali CR when Kiali is installed via the Kiali operator, or via a configmap when installed via Helm.

Kiali front-end

The front-end is a single page web application, built using Patternfly, React, Typescript and Redux. The code can be found at kiali/kiali-ui GitHub repository.

In a standard deployment, the back-end serves the front-end. Then, the front-end queries the Kiali back-end in order to get data and present it to the user.

There are limited options for personalization, the front-end is mainly stateless. Some data may be persisted, such as session credentials, but this data is stored in the browser and won’t be available in other browsers nor other devices.

Istio Service Mesh

Kiali is a management console for Istio, and as such, Istio is a requirement. It provides and controls the service mesh. Kiali and Istio are installed separately.

Kiali needs to retrieve Istio data and configurations, which are exposed through Prometheus and the cluster API. This is the reason the diagram shows a dashed line: to denote an indirect dependency.

Prometheus

Prometheus is an Istio dependency. When Istio telemetry is enabled, metrics data is stored in Prometheus. Kiali uses the data stored in Prometheus to figure out the mesh topology, show metrics, calculate health, show possible problems, etc.

Kiali communicates directly with Prometheus and assumes the metrics used by Istio Telemetery. It’s a hard dependency for Kiali, and many Kiali features will not work without it.

Currently, Kiali relies on Istio’s default metrics set. Make sure that these default metrics are always in place. Some metric customization is possible as long as the Kiali requirements are still met. For the current list of required metrics see this FAQ entry.

Cluster API

Kiali uses the API of the container application platform (cluster API) in order to fetch and resolve service mesh configurations.

Container application platforms where Kiali is known to work are http://www.okd.io/[OKD] and http://kubernetes.io/[Kubernetes]. Kiali shoud also work on the derivatives of these platforms. If you want to learn the cluster API, check the https://docs.okd.io/latest/rest_api/index.html[OKD REST API reference] and the Kubernetes API reference.

Kiali queries the cluster API to retrieve, for example, definitions for namespaces, services, deployments, pods, and other entities. Kiali also makes queries to resolve relationships between the different cluster entities.

The cluster API is also queried to retrieve Istio configurations like virtual services, destination rules, route rules, gateways, and quotas.

Jaeger

Jaeger is optional. When available, Kiali will be able to direct the user to Jaeger’s tracing data. If you need this feature, make sure Kiali is link:https://v1-41.kiali.io/docs/features/tracing/[properly configured for Jaeger integration].

Tracing data will be available only if Istio’s distributed tracing is enabled.

Grafana

Grafana is optional. When available, the metrics pages of Kiali will show a link to direct the user to the same metric in Grafana. If you need this feature, make sure Kiali is properly configured for Grafana integration.

Kiali has basic metric capabilities. It can show the default Istio metrics for workloads, apps and services. It allows to apply some groupings to the provided metrics and fetch metrics for different time ranges. However, Kiali doesn’t allow to customize the views nor customize the Prometheus queries. If you need these capabilities, you’ll want to install Grafana. Follow the Istio documentation to install Grafana if you need it.

5.2 - Terminology

5.2.1 - Concepts

Application

Is a logical grouping of Workloads defined by the application labels that users apply to an object. In Istio it is defined by the Label App. Istio Label Requirements.

Application Name

It’s the name of the Application deployed in your environment. This name is provided by the Label App on the Workload.

Deployment

A deployment is a replication controller based on a user defined template called a Deployment configuration. Deployments are created manually or in response to triggered events.

Istio object/configuration Type

This is the type specified in the Istio Config. This could be any of the following types: Gateway, Virtual Service, DestinationRule, ServiceEntry, Rule, Quota or QuotaSpecBinding.

Istio Sidecar

For more information see the Istio Sidecar definition in Istio Sidecar Documentation.

Label

It’s a user-created tag to identify a set of objects.

An empty label selector (that is, one with zero requirements) selects every object in the collection.

A null label selector (which is only possible for optional selector fields) selects no objects.

For example, Istio uses the Label App & Label Version on a Workload to specify the version and the application.

Label App

This is the ‘app’ label on an object. For more information, see Istio Label Requirements.

Label Version

This is the ‘version’ label on an object. For more information, see Istio Label Requirements.

Namespace

Namespaces are intended for use in environments with many users spread across multiple teams, or projects.

Namespaces are a way to divide cluster resources between multiple users.

Quota

A limited or fixed number or amount of resources.

ReplicaSet

Ensures that a specified number of pod replicas are running at any one time.

Service

A Service is an abstraction which defines a logical set of Pods and a policy by which to access them. A Service is determined by a Label.

Service Entry

For more information see the Service Entry definition in Istio Service Entry Documentation.

VirtualService

For more information see the Virtual Service definition in Istio VirtualService Documentation.

Workload

For more information see the Istio Workload definition.

5.2.2 - Networking

Destination

For more information see the Destination definition in the Istio Glossary.

Destination Rule

For more information see the Istio Destination Rule Documentation.

Endpoint

A communication endpoint is a type of communication network node. It is an interface exposed by a communicating party or by a communication channel.

Error Rate

It’s the percentage of errors in the traffic to a specific object for a Rate Interval.

Gateway

For more information see the Gateway definition in the Istio Gateway Documentation.

Inbound Metrics

Metrics on requests received by a given Workload, Service or Application.

Outbound Metrics

Metrics on requests emitted by a given Workload or Application.

Port

For more information see the Istio Port Documentation.

Rate Interval

It’s an amount of time. By Default in Kiali last 10 minutes.

Rule

It’s an object that manages external access to the services in a cluster, typically HTTP.

Source

For more information see the Source definition in the Istio Glossary.

Subset

For more information see the Istio Subset Documentation.

5.2.3 - Observability

Envoy

A proxy that Istio starts for each pod in the service mesh. For more information see the [Istio Envoy Documentation]https://istio.io/docs/ops/deployment/architecture/#envoy).

Envoy Health

A health check performed by Envoy proxies, for inbound and outbound traffic: see membership_healthy and membership_total from Envoy documentation.

6 - FAQ

Frequently Asked Questions about Kiali.

6.1 - General

Why is the Workload or Application Detail page so slow or not responding?

We have identified a performance issue that happens while visiting the Workload or Application detail page, related to discovering metrics in order to show custom dashboards. This issue was originally reported here and is now tracked there.

To summarize, Kiali might be very slow to fetch some metrics from Prometheus, it might even run out of memory, and so does Prometheus.

The immediate workaround you can take in that situation is to disable dashboards discovery from config:

  external_services:
    custom_dashboards:
      discovery_enabled: "false"

But we would also recommend that you consider a more robust setup for Prometheus, like the one described in this Istio guide (see also this Kiali blog post), in order to decrease the metrics cardinality.

As explained in the tracking issue, a modification of the Prometheus API should soon be available and, hopefully, would allow Kiali to get what it needs at a much lower cost.

What do I need to run Kiali in a private cluster?

Private clusters have higher network restrictions. Kiali needs your cluster to allow TCP traffic between the Kubernetes API service and the Istio Control Plane namespace, for both the 8080 and 15000 ports. This is required for features such as Health and Envoy Dump to work as expected.

Make sure that the firewalls in your cluster allow the connections mentioned above.

Check section Google Kubernetes Engine (GKE) Private Cluster requirements in the Installation Guide.

How do I access Kiali UI?

See Accessing Kiali in the Installation guide.

Does Kiali support Internet Explorer?

No version of Internet Explorer is supported with Kiali. Users may experience some issues when using Kiali through this browser.

Kiali does not work - What do i do?

If you are hitting a problem, whether it is listed here or not, do not hesitate to open a GitHub Discussion to ask about your situation. If you are hitting a bug, or need a feature, you can vote (using emojis) for any existing bug or feature request found in the GitHub Issues. This will help us prioritize the most needed fixes or enhancements. You can also create a new issue.

How do I obtain the logs for Kiali?

Kiali operator logs can be obtained from within the Kiali operator pod. For example, if the operator is installed in the kiali-operator namespace:

KIALI_OPERATOR_NAMESPACE="kiali-operator"
kubectl logs -n ${KIALI_OPERATOR_NAMESPACE} $(kubectl get pod -l app=kiali-operator -n ${KIALI_OPERATOR_NAMESPACE} -o name)

Kiali server logs can be obtained from within the Kiali server pod. For example, if the Kiali server is installed in the istio-system namespace:

KIALI_SERVER_NAMESPACE="istio-system"
kubectl logs -n ${KIALI_SERVER_NAMESPACE} $(kubectl get pod -l app=kiali -n ${KIALI_SERVER_NAMESPACE} -o name)

Note that you can configure the logger in the Kiali Server via these settings in the Kiali CR:

  • log_format supports “text” and “json”.
  • log_level supports “trace”, “debug”, “info”, “warn”, “error”, “fatal”.
  • time_field_format supports a link:https://golang.org/pkg/time/[golang time format]
  • sampler_rate defines a basic log sampler setting as an integer. With this setting every “sampler_rate”-th message will be logged. By default, every message is logged.

For example,

spec:
  deployment:
    logger:
      log_level: info
      log_format: text
      sampler_rate: "1"
      time_field_format: "2006-01-02T15:04:05Z07:00"

Which Istio metrics and attributes are required by Kiali?

To reduce Prometheus storage some users want to customize the metrics generated by Istio. This can break Kiali if the pruned metrics and/or attributes are used by Kiali in its graph or metric features.

Kiali currently requires the following metrics and attributes (note, this assumes Telemetry V2 in being used):

Metric Notes
istio_requests_total used throughout Kiali and the primary metric for http/grpc traffic graph generation
istio_request_bytes_sum used in metrics displays
istio_request_duration_milliseconds_bucket used throughout Kiali for response time calculation. used by iter8 extension.
istio_request_duration_milliseconds_sum used throughout Kiali for response time calculation. used by iter8 extension.
istio_response_bytes_sum used in metrics displays
istio_tcp_received_bytes_total used in metrics displays
istio_tcp_sent_bytes_total used throughout Kiali and the primary metric for tcp traffic graph generation


Attribute Metric Notes
connection_security_policy istio_requests_total used only when graph Security display option is enabled
destination_canonical_revision istio_requests_total
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_tcp_sent_bytes_total
destination_canonical_service istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
destination_principal istio_requests_total used only when graph Security display option is enabled
istio_tcp_sent_bytes_total
destination_service istio_requests_total
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_tcp_sent_bytes_total
destination_service_name istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
destination_service_namespace istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
destination_workload istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
destination_workload_namespace istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
grpc_response_status istio_requests_total used only when request_protocol=“grpc”
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
job istio_request_duration_milliseconds_sum used only by iter8 extension
reporter istio_requests_total both “source” and “destination” metrics are used by Kiali
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
request_operation istio_requests_total used only when request classification is configured. “request_operation” is the default attribute, it is configurable.
istio_request_bytes_sum
istio_response_bytes_sum
request_protocol istio_requests_total
istio_request_bytes_sum
istio_response_bytes_sum
response_code istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
response_flags istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
source_canonical_revision istio_requests_total
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_tcp_sent_bytes_total
source_canonical_service istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
source_principal istio_requests_total
istio_tcp_sent_bytes_total
source_workload istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total
source_workload_namespace istio_requests_total
istio_request_bytes_sum
istio_request_duration_milliseconds_bucket
istio_request_duration_milliseconds_sum
istio_response_bytes_sum
istio_tcp_received_bytes_total
istio_tcp_sent_bytes_total

What are the minimum privileges to login when using RBAC?

The get namespace privilege is required for Kiali login when using an RBAC-enabled authentication strategy. The user needs the privilege in at least one namespace. The Kiali Operator will provide a ClusterRole named either kiali or kiali-viewer with the needed privilege. Users can be bound to this role.

When using a customized Role or ClusterRole then the following rule is required for Kiali login:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: custom-kiali-role
rules:
- apiGroups: [""]
  resources:
  - namespaces
  verbs:
  - get

Although required for login, this privilege is not sufficient for Kiali to function well in general.

What is the License?

See here for the Kiali license.

Why isn’t my namespace in the Namespace Selection dropdown?

When deploying Kiali with the Kiali operator, by default some namespaces are excluded from the list of namespaces provided by the API and UI. Kiali filters out these namespaces and you will not see them in the Namespace Selection dropdown. You can adjust which namespaces are excluded by setting the spec.api.namespaces.exclude field on the Kiali CR.

In addition, you must ensure that Kiali has access to the namespaces you are interested in by setting the spec.deployment.accessible_namespaces field on the Kiali CR accordingly. Setting spec.api.namespaces.exclude alone does not give Kiali access to the namespaces. See the Namespace Management guide for more information.

Kiali also caches namespaces by default for 10 seconds. If the cache is enabled, it might take up to the spec.kubernetes_config.cache_token_namespace_duration in order for a newly added namespace to be seen by Kiali.

Workload “is not found as” messages

Kiali queries Deployment ,ReplicaSet, ReplicationController, DeploymentConfig, StatefulSet, Job and CronJob controllers. Deployment, ReplicaSet and StatefulSet are always queried, but ReplicationController, DeploymentConfig, Job and CronJobs are excluded by default for performance reasons.

To include them, update the list of excluded_workloads from the Kiali config.

#    ---
#    excluded_workloads:
#    - "CronJob"
#    - "DeploymentConfig"
#    - "Job"
#    - "ReplicationController"
#

An empty list will tell Kiali to query all type of known controllers.

6.2 - Graph

Why are my TCP requests disconnected in the graph?

Some users are surprised when requests are not connected in the graph. This is normal Istio telemetry for TCP requests if mTLS is not enabled. For HTTP requests, the requests will be connected even without MTLS, because Istio uses headers to exchange workload metadata between source and destination. With the disconnected telemetry you will see an edge from a workload to a terminal service node. That’s the first hop. And then another edge from “Unknown” to the expected destination service/workload. In the graph below, this can be seen for the requests from myapp to redis and mongodb:

Disconnected graph for non-mTLS TCP requests

Why my external HTTPS traffic is showing as TCP?

Istio can’t recognize HTTPS request that go directly to the service, the reason is that these requests are encrypted and are recognized as TCP traffic.

You can however configure your mesh to use TLS origination for your egress traffic. This will allow to see your traffic as HTTP instead of TCP.

Why is the graph badly laid out?

The layout for Kiali Graph may render differently, depending on the data to display (number of graph nodes and their interactions) and it’s sometimes difficult, not to say impossible, to have a single layout that renders nicely in every situation. That’s why Kiali offers a choice of several layout algorithms. However, we may still find some scenarios where none of the proposed algorithms offer a satisfying display. If Kiali doesn’t render your graph layout in a satisfactory manner please switch to another layout option. This can be done from the Graph Toolbar located on the bottom left of the graph.

If Kiali doesn’t produce a good graph for you, don’t hesitate to open an issue in GitHub or reach out via the usual channels.

Why are there many unknown nodes in the graph?

In some situations you can see a lot of connections from an “Unknown” node to your services in the graph, because some software external to your mesh might be periodically pinging or fetching data. This is typically the case when you setup Kubernetes liveness probes, or have some application metrics pushed or exposed to a monitoring system such as Prometheus. Perhaps you wouldn’t like to see these connections because they make the graph harder to read.

From the Graph page, you can filter them out by typing node = unknown in the Graph Hide input box.

Graph Hide

For a more definitive solution, there are several ways to prevent Istio from gathering this kind of telemetry.

The first is to have these endpoints (like /health or /metrics) exposed on a different port and server than your main application, and to not declare this port in your Pod’s container definition as containerPort. This way, the requests will be completely ignored by the Istio proxy, as mentioned in Istio documentation (at the bottom of that page).

The second way is to modify Istio’s Prometheus rule to explicitly exclude some requests based on the User Agent. This is the Rule resource named promhttp located in istio-system. To edit it:

kubectl edit rule promhttp -n istio-system

Then locate the match field under spec section. Change it to filter out, for instance, the Kubernetes probes:

match: (context.protocol == "http" || context.protocol == "grpc") && (match((request.useragent | "-"), "kube-probe*") == false)

Why do I have missing edges?

Kiali builds the graph from Istio’s telemetry. If you don’t see what you expect it probably means that it has not been reported in Prometheus. This usually means that:

1- The requests are not actually sent.

2- Sidecars are missing.

3- Requests are leaving the mesh and are not configured for telemetry.

For example, If you don’t see traffic going from node A to node B, but you are sure there is traffic, the first thing you should be doing is checking the telemetry by querying the metrics, for example, if you know that MyWorkload-v1 is sending requests to ServiceA try looking for metrics of the type:

istio_requests_total{destination_service="ServiceA"}

If telemetry is missing then it may be better to take it up with Istio.

Which lock icons should I see when I enable the Kiali Graph Security Display option?

Sometimes the Kiali Graph Security Display option causes confusion. The option is disabled by default for optimal performance, but enabling the option typically adds nominal time to the graph rendering. When enabled, Kiali will determine the percentage of mutual TLS (mTLS) traffic on each edge. Kiali will only show lock icons on edges with traffic for edges that have > 0% mTLS traffic.

Kiali determines the mTLS percentage for the edges via the connection_security_policy attribute in the Prometheus telemetry. Note that this is destination telemetry (i.e. reporter="destination").

Why can’t I see traffic leaving the mesh?

See Why do I have missing edges?, and additionally consider whether you need to create a ServiceEntry (or several) to allow the requests to be mapped correctly.

You can check this article on how to visualize your external traffic in Kiali for more information.

Why do I see traffic to PassthroughCluster?

Requests going to PassthroughCluster (or BlackHoleCluster) are requests that did not get routed to a defined service or service entry, and instead end up at one of these built-in Istio request handlers. See Monitoring Blocked and Passthrough External Service Traffic for more information.

Unexpected routing to these nodes does not indicate a Kiali problem, you’re seeing the actual routing being performed by Istio. In general it is due to a misconfiguration and/or missing Istio sidecar. Less often but possible is an actual issue with the mesh, like a sync issue or evicted pod.

Use Kiali’s Workloads list view to ensure sidecars are not missing. Use Kiali’s Istio Config list view to look for any config validation errors.

How do I inspect the underlying metrics used to generate the Kiali Graph?

It is not uncommon for the Kiali graph to show traffic that surprises the user. Often the thought is that Kiali may have a bug. But in general Kiali is just visualizing the metrics generated by Istio. The next thought is that the Istio telemetry generation may have a bug. But in general Istio is generating the expected metrics given the defined configuration for the application.

To determine whether there is an actual bug it can be useful to look directly at the metrics collected by and stored in the Prometheus database. Prometheus provides a basic console that can be opened using the istioctl dashboard command:

> istioctl dashboard prometheus

The above command, assuming Istio and Prometheus are in the default istio-system namespace, should open the Prometheus console in your browser.

Kiali uses a variety of metrics but the primary request traffic metrics for graph generation are these:

  • istio_requests_total
  • istio_tcp_sent_bytes_total

The Prometheus query language is very rich but a few basic queries is often enough to gather time-series of interest.

Here is a query that returns time-series for HTTP or GRPC requests to the reviews service in Istio’s BookInfo sample demo:

istio_requests_total{reporter="source", destination_service_name="reviews"}

And here is an example of the results:

Prometheus Console - all attributes

The query above is good for dumping all of the attributes but it can be useful to aggregate results by desired attributes. The next query will get the request counts for the reviews service broken down by source and destination workloads:

sum(istio_requests_total{reporter="source", destination_service_name="reviews"}) by (source_workload, destination_workload)

Prometheus Console - aggregation

The first step to explaining your Kiali graph is to inspect the metrics used to generate the graph. Kiali devs may ask for this info when working with you to solve a problem, so it is useful to know how to get to the Prometheus console.

Why don’t I see response times on my service graph?

Users can select Response Time to label their edges with 95th percentile response times. The response time indicates the amount of time it took for the destination workload to handle the request. In the Kiali graph the edges leading to service nodes represent the request itself, in other words, the routing. Kiali can show the request rate for a service but response time is not applicable to be shown on the incoming edge. Only edges to app, workload, or service entry nodes show response time because only those nodes represent the actual work done to handle the request. This is why a Service graph will typically not show any response time information, even when the Response Time option is selected.

Because Service graphs can show Service Entry nodes the Response Time option is still a valid choice. Edges to Service Entry nodes represent externally handled requests, which do report the response time for the external handling.

Why does my workload graph show service nodes?

Even when Display Service Nodes is disabled a workload graph can show service nodes. Display Service Nodes ensures that you will see the service nodes between two other nodes, providing an edge to the destination service node, and a subsequent edge to the node handling the request. This option injects service nodes where they previously would not be shown. But Kiali will always show a terminal service node when the request itself fails to be routed to a destination workload. This ensures the graph visualizes problem areas. This can happen in a workload or app graph. Of course in a service graph the Display Service Nodes option is simply ignored.

6.3 - Distributed Tracing

Why is Jaeger unreachable or Kiali showing the error “Could not fetch traces”?

Istio components status indicator shows “Jaeger unreachable”:

Jaeger unreachable

While on any Tracing page, error “Could not fetch traces” is displayed:

Could not fetch traces

Apparently, Kiali is unable to connect to Jaeger. Make sure tracing is correctly configured in the Kiali custom resource or ConfigMap (View full CR description).

      tracing:
        auth:
          type: none
        enabled: true
        in_cluster_url: 'http://tracing.istio-system/jaeger'
        url: 'http://jaeger.example.com/'
        use_grpc: true

You need especially to pay attention to the in_cluster_url field, which is how Kiali backend contacts the Jaeger service. In general, this URL is written using Kubernetes domain names in the form of http://service.namespace, plus eventually a path.

If you’re not sure about this URL, try to find your Jaeger service and its exposed ports:

$ kubectl get services -n istio-system
...
tracing      ClusterIP      10.108.216.102   <none>        80/TCP      47m
...

To validate this URL, you can try to curl its API via Kiali pod, by appending /api/traces to the configured URL (in the following, replace with the appropriate Kiali pod):

$ kubectl exec -n istio-system -it kiali-556fdb8ff5-p6l2n -- curl http://tracing.istio-system/jaeger/api/traces

{"data":null,"total":0,"limit":0,"offset":0,"errors":[{"code":400,"msg":"parameter 'service' is required"}]}

If you see some returning JSON as in the above example, congrats, it should be well configured!

If instead of that you see some blocks of mixed HTML/Javascript mentioning JaegerUI, then probably the host+port are correct but the path isn’t.

A common mistake is to forget the /jaeger suffix, which is often used in Jaeger deployments.

It may also happen that you have a service named jaeger-query, exposing port 16686, instead of the more common tracing service on port 80. In that situation, set in_cluster_url to http://jaeger-query.istio-system:16686/jaeger.

If Jaeger needs an authentication, make sure to correctly configure the auth section.

Note that in general, Kiali will connect to Jaeger via GRPC, which provides better performances. If for some reason it cannot be done (e.g. Jaeger being behind a reverse-proxy that doesn’t support GRPC, or that needs more configuration in that purpose), it is possible to switch back to using the http/json API of Jaeger by setting use_grpc to false.

If for some reason the GRPC connection fails and you think it shouldn’t (e.g. your reverse-proxy supports it, and the non-grpc config works fine), please get in touch with us.

In addition to the embedded integration that Kiali provides with Jaeger, it is possible to show external links to the Jaeger UI. To do so, the external URL must be configured in the Kiali CR or ConfigMap (field url).

    tracing:
      # ...
      url: "http://jaeger.example.com/"

When configured, this URL will be used to generate a couple of links to Jaeger within Kiali. It’s also visible in the About modal:

About menu

About modal

Jaeger integration disabled

On the Application detail page, the Traces tab might redirect to Jaeger via an external link instead of showing the Kiali Tracing view. It happens when you have the url field configured, but not in_cluster_url, which means the Kiali backend will not be able to connect to Jaeger.

To fix it, configure in_cluster_url in Kiali CR or ConfigMap.

6.4 - Validations

Which formats does Kiali support for specifying hosts?

Istio highly recommends that you always use fully qualified domain names (FQDN) for hosts in DestinationRules. However, Istio does allow you to use other formats like short names (details or details.bookinfo). Kiali only supports FQDN and simple service names as host formats: for example details.bookinfo.svc.cluster.local or details. Validations using the details.bookinfo format might not be accurate.

In the following example it should show the validation of “More than one Destination Rule for the same host subset combination”. Because of the usage of the short name reviews.bookinfo Kiali won’t show the warning message on both destination rules.

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr1
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  - name: v3
    labels:
      version: v3
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr2
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      version: v1

See the recomendation Istio gives regarding host format: “To avoid potential misconfigurations, it is recommended to always use fully qualified domain names over short names."

For best results with Kiali, you should use fully qualified domain names when specifying hosts.

6.5 - Authentication

How to obtain a token when logging in via token auth strategy

When configuring Kiali to use the token auth strategy, it requires users to log into Kiali as a specific user via the user’s service account token. Thus, in order to log into Kiali you must provide a valid Kubernetes token.

You can extract a service account’s token from the secret that was created for you when you created the service account.

For example, if you want to log into Kiali using Kiali’s own service account, you can get the token like this:

kubectl get secret -n istio-system $(kubectl get sa kiali-service-account -n istio-system -o jsonpath={.secrets[0].name}) -o jsonpath={.data.token} | base64 -d

Note that this example assumes you installed Kiali in the istio-system namespace.

Once you obtain the token, you can go to the Kiali login page and copy-and-paste that token into the token field. At this point, you have logged into Kiali with the same permissions as that of the Kiali server itself (note: this gives the user the permission to see everything).

Create different service accounts with different permissions for your users to use. Each user should only have access to their own service accounts and tokens.

How to configure the originating port when Kiali is served behind a proxy (OpenID support)

When using OpenID strategy for authentication and deploying Kiali behind a reverse proxy or a load balancer, Kiali needs to know the originating port of client requests. You may need to setup your proxy to inject a X-Forwarded-Port HTTP header when forwarding the request to Kiali.

For example, when using an Istio Gateway and VirtualService to expose Kiali, you could use the link:https://istio.io/latest/docs/reference/config/networking/virtual-service/#Headers[headers property] of the route:

spec:
  gateways:
  - istio-ingressgateway.istio-system.svc.cluster.local
  hosts:
  - kiali.abccorp.net
  http:
  - headers:
      request:
        set:
          X-Forwarded-Port: "443"
    route:
    - destination:
        host: kiali
        port:
          number: 20001

6.6 - Installation

Operator fails due to cannot list resource "clusterroles" error

When the Kiali Operator installs a Kiali Server, the Operator will assign the Kiali Server the proper roles/rolebindings so the Kiali Server can access the appropriate namespaces. The Kiali Operator will check to see if the Kiali CR setting deployment.accessible_namespaces has a value of ['**']. If it does, this means the Kiali Server is to be given access to all namespaces in the cluster, including namespaces that will be created in the future. In this case, the Kiali Operator will create and assign ClusterRole/ClusterRoleBinding resources to the Kiali Server. But in order to do this, the Kiali Operator must itself be given permission to create those ClusterRole and ClusterRoleBinding resources. When you install the Kiali Operator via OLM, these permissions are automatically granted. However, if you installed the Kiali Operator with the Operator Helm Chart, and if you did so with the value clusterRoleCreator set to false then the Kiali Operator will not be given permission to create cluster roles. In this case, you will be unable to install a Kiali Server if your Kiali CR has deployment.accessible_namespaces set to ['**'] - you will get an error similar to this:

Failed to list rbac.authorization.k8s.io/v1, Kind=ClusterRole:
clusterroles.rbac.authorization.k8s.io is forbidden:
User "system:serviceaccount:kiali-operator:kiali-operator"
cannot list resource "clusterroles" in API group
"rbac.authorization.k8s.io" at the cluster scope

Thus, if you do not give the Kiali Operator the permission to create cluster roles, you must tell the Operator which specific namespaces the Kiali Server can access (you cannot use [**']). When specific namespaces are specified in deployment.accessible_namespaces, the Kiali Operator will create Role and RoleBindings (not the “Cluster” kinds) and assign them to the Kiali Server.

What values can be set in the Kiali CR?

A Kiali CR is used to tell the Kiali Operator how and where to install a Kiali Server in your cluster. You can install one or more Kiali Servers by creating one Kiali CR for each Kiali Server you want the Operator to install and manage. Deleting a Kiali CR will uninstall its associted Kiali Server.

All the settings that you can define in a Kiali CR are fully documented here. If you are using a specific version of the Operator, the Kiali CR that is valid for that version can be found in the version tag within the github repository (e.g. Operator v1.25.0 supported these Kiali CR settings).

Note that that example Kiali CR is formatted in such a way that if you uncomment a line (by deleting the line’s leading # character), that line’s setting will be at the appropriate indentation level. This allows you to create an initial Kiali CR by simply uncommenting lines of the settings you want to define. The comments are also specifically laid out in such a way that each major top-level group of settings are in their own major comment section, and each value in each commented setting is the default value (so if you do not define that setting yourself, you will know what its default value is).

How to configure some operator features at runtime

Once the Kiali Operator is installed, you can change some of its configuration at runtime in order to utilize certain features that the Kiali Operator provides. These features are configured via environment variables defined in the operator’s deployment.

Perform the following steps to configure these features in the Kiali Operator:

  1. Determine the namespace where your operator is located and store that namespace name in $OPERATOR_NAMESPACE. If you installed the operator via helm, it may be kiali-operator. If you installed the operator via OLM, it may be openshift-operators. If you are not sure, you can perform a query to find it:
OPERATOR_NAMESPACE="$(kubectl get deployments --all-namespaces  | grep kiali-operator | cut -d ' ' -f 1)"
  1. Determine the name of the environment variable you need to change in order to configure the feature you are interested in. Here is a list of currently supported environment variables you can set:
  • ALLOW_AD_HOC_KIALI_NAMESPACE: must be true or false. If true, the operator will be allowed to install the Kiali Server in any namespace, regardless of which namespace the Kiali CR is created. If false, the operator will only install the Kiali Server in the same namespace where the Kiali CR is created - any attempt to do otherwise will cause the operator to abort the Kiali Server installation.
  • ALLOW_AD_HOC_KIALI_IMAGE: must be true or false. If true, the operator will be allowed to install the Kiali Server with a custom container image as defined in the Kiali CR’s spec.deployment.image_name and/or spec.deployment.image_version. If false, the operator will only install the Kiali Server with the default image. If a Kiali CR is created with spec.deployment.image_name or spec.deployment.image_version defined, the operator will abort the Kiali Server installation.
  • ANSIBLE_DEBUG_LOGS: must be true or false. When true, turns on debug logging within the Operator SDK. For details, see the docs here.
  • ANSIBLE_VERBOSITY_KIALI_KIALI_IO: Controls how verbose the operator logs are - the higher the value the more output is logged. For details, see the docs here.
  • ANSIBLE_CONFIG: must be /etc/ansible/ansible.cfg or /opt/ansible/ansible-profiler.cfg. If set to /opt/ansible/ansible-profiler.cfg a profiler report will be dumped in the operator logs after each reconciliation run.
  1. Store the name of the environment variable you want to change in $ENV_NAME:
ENV_NAME="ANSIBLE_CONFIG"
  1. Store the new value of the environment variable in $ENV_VALUE:
ENV_VALUE="/opt/ansible/ansible-profiler.cfg"
  1. The final step depends on how you installed the Kiali Operator:
  • If you installed the operator via helm, simply set the environment variable on the operator deployment directly:
oc -n ${OPERATOR_NAMESPACE} set env deploy/kiali-operator "${ENV_NAME}=${ENV_VALUE}"
  • If you installed the operator via OLM, you must set this environment variable within the operator’s CSV and let OLM propagate the new environment variable value down to the operator deployment:
oc -n ${OPERATOR_NAMESPACE} patch $(oc -n ${OPERATOR_NAMESPACE} get csv -o name | grep kiali) --type=json -p "[{'op':'replace','path':"/spec/install/spec/deployments/0/spec/template/spec/containers/0/env/$(oc -n ${OPERATOR_NAMESPACE} get $(oc -n ${OPERATOR_NAMESPACE} get csv -o name | grep kiali) -o jsonpath='{.spec.install.spec.deployments[0].spec.template.spec.containers[0].env[*].name}' | tr ' ' '\n' | cat --number | grep ${ENV_NAME} | cut -f 1 | xargs echo -n | cat - <(echo "-1") | bc)/value",'value':"\"${ENV_VALUE}\""}]"

How can I inject an Istio sidecar in the Kiali pod?

By default, Kiali will not have an Istio sidecar. If you wish to deploy the Kiali pod with a sidecar, you have to define the sidecar.istio.io/inject=true label in the spec.deployment.pod_labels setting in the Kiali CR. For example:

spec:
  deployment:
    pod_labels:
      sidecar.istio.io/inject: "true"

If you are utilizing CNI in your Istio environment (for example, on OpenShift), Istio will not allow sidecars to work when injected in pods deployed in the control plane namespace, e.g. istio-system. (1) (2) (3). In this case, you must deploy Kiali in its own separate namespace. On OpenShift, you can do this using the following instructions.

Determine what namespace you want to install Kiali and create it. Give the proper permissions to Kiali. Create the necessary NetworkAttachmentDefinition. Finally, create the Kiali CR that will tell the operator to install Kiali in this new namespace, making sure to add the proper sidecar injection label as explained earlier.

NAMESPACE="kialins"

oc create namespace ${NAMESPACE}

oc adm policy add-scc-to-group privileged system:serviceaccounts:${NAMESPACE}

cat <<EOM | oc apply -f -
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: istio-cni
  namespace: ${NAMESPACE}
EOM

cat <<EOM | oc apply -f -
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: ${NAMESPACE}
spec:
  istio_namespace: istio-system
  auth:
    strategy: anonymous
  deployment:
    accessible_namespaces: [ '**' ]
    pod_labels:
      sidecar.istio.io/inject: "true"
EOM

After the operator installs Kiali, confirm you have two containers in your pod. This indicates your Kiali pod has its proxy sidecar successfully injected.

$ oc get pods -n ${NAMESPACE}
NAME                    READY   STATUS    RESTARTS   AGE
kiali-56bbfd644-nkhlw   2/2     Running   0          43s

6.7 - Istio Component Status

How can I add one component to the list?

If you are interested in adding one more component to the Istio Component Status tooltip, you have the option to add one new component into the Kiali CR, under the external_services.istio.component_status field.

For each component there, you will need to specify the app label of the deployment’s pods, the namespace and whether is a core component or add-on.

One component is ‘Not found’ but I can see it running. What can I do?

The first thing you should do is check into the Kiali CR for the istio.component_status field.

Kiali looks for a Deployment for which its pods have the app label with the specified value in the CR, and lives in that namespace. The app label name may be changed from the default (app) and it is specified in the istio_labels.app_label_name in the Kiali CR.

Ensure that you have specified correctly the namespace and that the deployment’s pod template has the specified label.

One component is ‘Unreachable’ but I can see it running. What can I do?

Kiali considers one component as Unreachable when the component responds to a GET request with a 4xx or 5xx response code.

The URL where Kiali sends a GET request to is the same as it is used for the component consumption. However, Kiali allows you to set a specific URL for health check purposes: the health_check_url setting.

In this example, Kiali uses the Prometheus url for both metrics consumption and health checks.

external_services:
  prometheus:
    url: "http://prometheus.istio-system:9090"

In case that the prometheus.url endpoint doesn’t return 2XX/3XX to GET requests, you can use the following settings to specify which health check URL Kiali should use:

external_services:
  prometheus:
    health_check_url: "http://prometheus.istio-system:9090/healthz"
    url: "http://prometheus.istio-system:9090"

Please check the Kiali CR for more information. Each external service component has its own health_check_url and is_core setting to tailor the experience in the Istio Component Status feature.