A Guide To Kubernetes Logs That Isn't A Vendor Pitch

A guide to logging at each cluster layer with a focus on AuditPolicy.

Published: June 1, 2024

Reading Time: 27 minutes

One of the frustrating aspects of researching topics in the Kubernetes/cloud-native world is having to trek through the vast sea of SEO-optimized articles that are nothing more than rehashed vendor marketing or regurgitated Kubernetes documentation, thinly veiled as a technical guide. It sometimes reminds me of walking through the vendor floor at Black Hat. I get it, and I’m sure many of the products are great, but sometimes I just want to understand a concept, not pay someone to understand it for me.

Kubernetes logging is not entirely straightforward (are logs ever straightforward?). In this post I’ll discuss logging at each of the “layers” of a Kubernetes cluster and why you should probably spend some time looking into and tuning AuditPolicy if you’re attempting to collect logs from a Kubernetes cluster.

I work in offensive security, so the lens I see logging through is definitely biased. I don’t want to look at logs all day, but part of being a good red teamer is understanding what a full attack can look like from a log perspective (and how to not show up in the logs). In a perfect world, we would just log everything that even thinks about touching our cluster. If you go down this route, you’ll quickly realize that the amount of logs generated by each layer of a cluster is absurd.

Furthermore, your time could be better spent collecting logs elsewhere, such as netflow. Even if you could store all the logs a cluster generates, you still have to parse through them to make them useful, and at a certain point, having too many logs just gives an attacker more hay to hide their needle in.

What does logging mean in the Kubernetes world?

Despite there being some overlap, I feel there is a need to separate logs into two categories: logs that are helpful for debugging, which I will be referring to as debug logs, and logs that are useful for security which I will refer to as security logs.

  • Debug Logs: Logs that are helpful to investigate when something isn’t working properly during setup. These answer the “Why?” behind an issue. E.g.: Why is this server crashing? Oh, its CPU is at 100%.
  • Security Logs: Logs that are helpful to investigate during a security incident. These answer: Who? What? When? Where? E.g.: Graham created a pod called PWND on the control node on Friday at 4:00 pm.

Introducing: The Beefy 4 Layer Kuburrito

Breaking Kubernetes down into layers greatly simplifies how to think about a problem. The “4Cs of cloud-native security” is the model I use for this:

  1. Code Security: Is the code deployed into a pod secure? Is it vulnerable to SQL injection, command injection, or any other type of vulnerability in the OWASP Top 10?
  2. Container Security: Is the container you’re launching hosting your application trusted? Where did you get the image from? Is the container running as root?
  3. Cluster Security: Is your cluster configured with the principles of least privilege in mind? Is RBAC in use? Are secrets being stored appropriately?
  4. Cloud Security: Is the infrastructure hosting the cluster secure? Have the nodes been patched? Are they running SSH with a default password? Is access to the API server restricted?

Using this model, we can separate the many different types of logs that can be gathered from a Kubernetes cluster into each of these buckets.

  1. Code Logging: Logging done at the application level (e.g.: Graham made a GET request to a web server).
  2. Container Logging: Logs produced about the container running an application (e.g.: Container X is pulling Image Y).
  3. Cluster Logging: Logs at the Kubernetes cluster layer and its components (e.g.: Service A issued a GET for the secret super_secret).
  4. Cloud Level Logging: Logging at the cloud provider level, or logs for a managed Kubernetes cluster (e.g.: Graham logged into the management interface at 2am).

So what do logs look like at each of these layers? What generates them? Where do we collect them? Should we even collect them?

Code Level Logging

Getting logs from the applications running in a Kubernetes pod is a bit more difficult than collecting logs from an application running inside a VM for a few reasons. The first issue we run into is that pods are ephemeral: when they are deleted, recreated, or removed by the cluster, the logs inside of them are deleted too. If an attacker exploited a web server and then the pod crashed, we wouldn’t have a way to see the logs.

Exec into a pod

Probably the most unhinged way you can inspect logs in a container is by execing into the pod and looking for the logs manually by running kubectl exec -it <pod_name> -- bash. This is probably not something you should ever do unless you’re really in the weeds with troubleshooting or you’re just tinkering. Doing this in a production cluster is almost always a terrible idea.
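If you do go spelunking this way, it looks roughly like the following. This is just a sketch: it assumes the container image actually ships a shell and that the app writes something under /var/log/nginx (as the nginx example later in this post does).

# Exec into the pod (not something to make a habit of, especially in production)
kubectl exec -it <pod_name> -- sh

# Then poke around for whatever the app writes to disk
ls /var/log/nginx/
tail -n 50 /var/log/nginx/access.log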

kubectl logs

The standard way of viewing logs from a Kubernetes Pod is to run kubectl logs <pod_name>. This will display STDOUT and STDERR for the application running in your pod (assuming it’s configured to output text to these file descriptors).

This is great for debug logs, but not great for security logs as they’re not collected in a SIEM. Additionally, there is no way for us to inspect logs that are not written to STDOUT or STDERR. What if we want to view logs from a file such as /var/log/syslog? Utilizing kubectl logs doesn’t allow us to do so.
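A few kubectl logs flags worth knowing when you’re using it for debugging (all standard kubectl; placeholders follow the convention used above):

# Follow logs live, like tail -f
kubectl logs -f <pod_name>

# Pick a specific container when the pod has more than one
kubectl logs <pod_name> -c <container_name>

# Grab logs from the previous instance of a crashed or restarted container
kubectl logs <pod_name> --previous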

Sidecar Containers

Sidecar containers are another way we can grab logs from a container running inside a pod. Remember, each Pod can have one or more containers inside of it which we can define in the Pod manifest as follows:

# Modified from https://www.airplane.dev/blog/kubernetes-sidecar-container (rip)
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp
  labels:
    app: webapp
spec:
  containers:
    # Define the main application, nginx
    - name: main-application
      image: nginx
      # Mount /var/log/nginx
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    # Create a second container inside the pod
    - name: sidecar-container
      image: busybox
      # Read /var/log/nginx/access.log to STDOUT every 30 seconds
      command: ["sh","-c","while true; do cat /var/log/nginx/access.log; sleep 30; done"]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
  volumes:
    - name: shared-logs
      emptyDir: {}

In this example, we are creating a Pod manifest that defines both an Nginx application and a sidecar that simply reads the /var/log/nginx/access.log file to STDOUT. Note that this is just a demonstration; in practice the sidecar would do something more useful with the logs (such as shipping them to a SIEM).

You’ll notice that the sidecar can view the /var/log/nginx/access.log file because we’ve set up volumeMounts in our Pod manifest that let both containers access the same shared-logs volume.

This is a much more robust solution for collecting logs from a pod, and it’s actually what many vendor products do to collect logs from your Pods. The power of sidecar containers lies in the fact that you don’t have to modify your application: as long as the application produces logs somewhere, you can collect them with a sidecar.
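Once the sidecar is running, those file-based logs become reachable through the normal kubectl logs path (the pod and container names here are the ones from the manifest above):

# Read the nginx access log via the sidecar's STDOUT
kubectl logs simple-webapp -c sidecar-container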

Container Level Logging

Logs from the container layer of a cluster are generally debug logs. This means that while they can be used when investigating security incidents, they’re probably not the first place you should look unless you have a very specific reason.

Container Runtime

A container runtime is responsible for (among other things) running containers on the nodes of a Kubernetes cluster. Some popular container runtimes include Docker, containerd, and CRI-O. Depending on which container runtime you’re working with, the logs may contain different information. You can identify which container runtime your nodes are using by running kubectl get nodes -o wide.

Collecting logs at the container level generally means collecting logs from the container runtime. These are typically written to /var/log/pods/* on each node, with symlinks to them under /var/log/containers/*.
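A quick way to poke at this yourself (the paths are the common defaults for a kubeadm/containerd setup; your distribution may differ):

# Which runtime is each node using?
kubectl get nodes -o wide

# On the node itself: per-pod log directories...
ls /var/log/pods/

# ...and the flat, symlinked view that most log shippers point at
ls -l /var/log/containers/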

Taking a look at the kube-apiserver logs, they’re not immediately helpful from a security perspective. The general rule of thumb I have for determining if a log is a debug log or a security log is asking myself “If I were a SOC analyst and I saw this log, would I know what is being communicated?”. In this case, the answer is clear: “Absolutely not”:

12024-05-31T19:05:51.584763355Z stderr F I0531 19:05:51.584688       1 trace.go:236] Trace[1866683030]: "Update" accept:application/vnd.kubernetes.protobuf, */*,audit-id:c033a34d-3b3a-4b2c-a871-ec998630828d,client:192.168.1.201,
2<snip_for_brevity>
3user-agent:kube-controller-manager/v1.30.1 (linux/amd64) kubernetes/6911225/leader-election,verb:PUT (31-May-2024 19:05:51.046) (total time: 537ms):

Cluster Level Logging

Logging at the cluster level is where things get a little freaky. Logging at the cluster level means collecting information about events from the orchestration components themselves. In this case, we’re talking about things like the Kubelet and API server.

Kubelet Logs

Logs from the Kubelet display information on what actions the Kubelet is taking. If you remember from Kubernetes 101, the kubelet is a process that runs on each node in a cluster and is responsible for launching the containers for Pods that the scheduler assigns to that node. It also periodically reports the status of the Pods running on the node back to the API Server.
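If you want to look at these yourself, the kubelet typically runs as a systemd service on kubeadm-built nodes, so journalctl is the usual way in (adjust for however your nodes actually run the kubelet):

# Follow the kubelet's logs on a node
journalctl -u kubelet -f

# Or scope to a window you care about when correlating with an incident
journalctl -u kubelet --since "1 hour ago"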

1Jun 02 00:39:08 kubecontrol kubelet[43672]: I0602 00:39:08.105506   43672 reconciler_common.go:247] "operationExecutor.VerifyController AttachedVolume started for volume \"etc-pki\" (UniqueName: \"kubernetes.io/host-path/ab9e569b3b5df9381f0b4449875b2fa0-etc-pki\") pod \"kube-apiserver-kubecontrol\" (UID: \"ab9e569b3b5df9381f0b4449875b2fa0\") " pod="kube-system/kube-apiserver-kubecontrol"

These can be useful for both debugging and security purposes; however, I wouldn’t make them the first thing you collect from a security perspective, as they contain a lot of jargon that a SOC analyst probably won’t understand without sitting down with a Kubernetes engineer. If we’re curious about information pertaining to the API server, there is a far better way of gathering it: AuditPolicy.

AuditPolicy

Kubernetes AuditPolicy is probably what you’re looking for if you’re attempting to collect logs from a Kubernetes cluster to send to a SIEM, but be forewarned, it’s a little more complicated than simply turning it on and pointing it at your SIEM of choice.

AuditPolicy logs (which I’ll be referring to as audit logs) are generated by the API server when traffic traverses it. If you remember from Kubernetes 101, all requests must traverse the API server, making this a great place to collect security logs. According to the Kubernetes documentation:

Auditing allows cluster administrators to answer the following questions:

  • what happened?
  • when did it happen?
  • who initiated it?
  • on what did it happen?
  • where was it observed?
  • from where was it initiated?
  • to where was it going?

That’s quite a bit of information. It’s probably TOO much information to collect. For context, if you decide to collect all of these logs in a production cluster, you’re looking at potentially hundreds or even thousands of gigabytes of logs per day.

There are 4 different levels of log data AuditPolicy allows us to collect:

  • None: Don’t log events that match the rule (obviously…)
  • Metadata: Logs request metadata (requesting user, timestamp, resource, verb, and so on) but not the request or response body
  • Request: Logs event metadata and the body of the request sent to the API server, but does not record the body of the response from the API server
  • RequestResponse: Logs the event metadata, the request sent to the API server, AND the response from the API server

Each level adds additional log data on top of the previous one. For example, requests logged at the RequestResponse level also include the information from the Request and Metadata levels. Below I’ll show some examples of each logging level. Beware, there are a lot of logs in the next section!

Metadata

Sample log when only collecting at the Metadata level:

 1// 
 2// Metadata Information
 3//
 4{
 5  "kind": "Event",
 6  "apiVersion": "audit.k8s.io/v1",
 7  "level": "Metadata",
 8  "auditID": "18190867-edaa-48a4-95c5-6935576a9939",
 9  "stage": "RequestReceived",
10  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
11  "verb": "create",
12  "user": {
13    "username": "kubernetes-admin",
14    "groups": [
15      "kubeadm:cluster-admins",
16      "system:authenticated"
17    ]
18  },
19  "sourceIPs": [
20    "192.168.1.167"
21  ],
22  // Want something fun to look into? What userAgent do other attack tools use?
23  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
24  "objectRef": {
25    "resource": "pods",
26    "namespace": "default",
27    "apiVersion": "v1"
28  },
29  "requestReceivedTimestamp": "2024-05-31T20:30:27.956279Z",
30  "stageTimestamp": "2024-05-31T20:30:27.956279Z"
31}
32{
33  "kind": "Event",
34  "apiVersion": "audit.k8s.io/v1",
35  "level": "Metadata",
36  "auditID": "18190867-edaa-48a4-95c5-6935576a9939",
37  "stage": "ResponseComplete",
38  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
39  "verb": "create",
40  "user": {
41    "username": "kubernetes-admin",
42    "groups": [
43      "kubeadm:cluster-admins",
44      "system:authenticated"
45    ]
46  },
47  "sourceIPs": [
48    "192.168.1.167"
49  ],
50  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
51  "objectRef": {
52    "resource": "pods",
53    "namespace": "default",
54    "name": "priv-pod",
55    "apiVersion": "v1"
56  },
57  "responseStatus": {
58    "metadata": {},
59    "code": 201
60  },
61  "requestReceivedTimestamp": "2024-05-31T20:30:27.956279Z",
62  "stageTimestamp": "2024-05-31T20:30:27.982521Z",
63  "annotations": {
64    "authorization.k8s.io/decision": "allow",
65    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"kubeadm:cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"kubeadm:cluster-admins\"",
66    "pod-security.kubernetes.io/enforce-policy": "privileged:latest"
67  }
68}

Request

Sample log when collecting at the Request level:

  1// 
  2// Metadata Information
  3//
  4{
  5  "kind": "Event",
  6  "apiVersion": "audit.k8s.io/v1",
  7  "level": "Request",
  8  "auditID": "5fd8a404-29b3-4518-93d8-e77135a426fa",
  9  "stage": "RequestReceived",
 10  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
 11  "verb": "create",
 12  "user": {
 13    "username": "kubernetes-admin",
 14    "groups": [
 15      "kubeadm:cluster-admins",
 16      "system:authenticated"
 17    ]
 18  },
 19  "sourceIPs": [
 20    "192.168.1.167"
 21  ],
 22  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
 23  "objectRef": {
 24    "resource": "pods",
 25    "namespace": "default",
 26    "apiVersion": "v1"
 27  },
 28  "requestReceivedTimestamp": "2024-05-31T20:34:55.398698Z",
 29  "stageTimestamp": "2024-05-31T20:34:55.398698Z"
 30}
 31{
 32  "kind": "Event",
 33  "apiVersion": "audit.k8s.io/v1",
 34  "level": "Request",
 35  "auditID": "5fd8a404-29b3-4518-93d8-e77135a426fa",
 36  "stage": "ResponseComplete",
 37  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
 38  "verb": "create",
 39  "user": {
 40    "username": "kubernetes-admin",
 41    "groups": [
 42      "kubeadm:cluster-admins",
 43      "system:authenticated"
 44    ]
 45  },
 46  "sourceIPs": [
 47    "192.168.1.167"
 48  ],
 49  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
 50  "objectRef": {
 51    "resource": "pods",
 52    "namespace": "default",
 53    "name": "priv-pod",
 54    "apiVersion": "v1"
 55  },
 56  "responseStatus": {
 57    "metadata": {},
 58    "code": 201
 59  },
 60//
 61// Request Information 
 62//
 63  "requestObject": {
 64    "kind": "Pod",
 65    "apiVersion": "v1",
 66    "metadata": {
 67      "name": "priv-pod",
 68      "namespace": "default",
 69      "creationTimestamp": null,
 70      "annotations": {
 71        "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{},\"name\":\"priv-pod\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"nginx\",\"name\":\"priv-pod\",\"securityContext\":{\"privileged\":true}}],\"hostNetwork\":true}}\n"
 72      }
 73    },
 74    "spec": {
 75      "containers": [
 76        {
 77          "name": "priv-pod",
 78          "image": "nginx",
 79          "resources": {},
 80          "terminationMessagePath": "/dev/termination-log",
 81          "terminationMessagePolicy": "File",
 82          "imagePullPolicy": "Always",
 83          "securityContext": {
 84            "privileged": true
 85          }
 86        }
 87      ],
 88      "restartPolicy": "Always",
 89      "terminationGracePeriodSeconds": 30,
 90      "dnsPolicy": "ClusterFirst",
 91      "hostNetwork": true,
 92      "securityContext": {},
 93      "schedulerName": "default-scheduler",
 94      "enableServiceLinks": true
 95    },
 96    "status": {}
 97  },
 98  "requestReceivedTimestamp": "2024-05-31T20:34:55.398698Z",
 99  "stageTimestamp": "2024-05-31T20:34:55.415504Z",
100//
101// END Request Information 
102//
103  "annotations": {
104    "authorization.k8s.io/decision": "allow",
105    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"kubeadm:cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"kubeadm:cluster-admins\"",
106    "pod-security.kubernetes.io/enforce-policy": "privileged:latest"
107  }
108}

RequestResponse

Sample log when collecting information at the RequestResponse level:

  1//
  2// Metadata Information
  3// 
  4{
  5  "kind": "Event",
  6  "apiVersion": "audit.k8s.io/v1",
  7  "level": "RequestResponse",
  8  "auditID": "1bc5391b-4896-4ac4-a919-34c7d869fbb7",
  9  "stage": "RequestReceived",
 10  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
 11  "verb": "create",
 12  "user": {
 13    "username": "kubernetes-admin",
 14    "groups": [
 15      "kubeadm:cluster-admins",
 16      "system:authenticated"
 17    ]
 18  },
 19  "sourceIPs": [
 20    "192.168.1.167"
 21  ],
 22  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
 23  "objectRef": {
 24    "resource": "pods",
 25    "namespace": "default",
 26    "apiVersion": "v1"
 27  },
 28  "requestReceivedTimestamp": "2024-05-31T20:49:52.847213Z",
 29  "stageTimestamp": "2024-05-31T20:49:52.847213Z"
 30}
 31{
 32  "kind": "Event",
 33  "apiVersion": "audit.k8s.io/v1",
 34  "level": "RequestResponse",
 35  "auditID": "1bc5391b-4896-4ac4-a919-34c7d869fbb7",
 36  "stage": "ResponseComplete",
 37  "requestURI": "/api/v1/namespaces/default/pods?fieldManager=kubectl-client-side-apply&fieldValidation=Strict",
 38  "verb": "create",
 39  "user": {
 40    "username": "kubernetes-admin",
 41    "groups": [
 42      "kubeadm:cluster-admins",
 43      "system:authenticated"
 44    ]
 45  },
 46  "sourceIPs": [
 47    "192.168.1.167"
 48  ],
 49  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
 50  "objectRef": {
 51    "resource": "pods",
 52    "namespace": "default",
 53    "name": "priv-pod",
 54    "apiVersion": "v1"
 55  },
 56  "responseStatus": {
 57    "metadata": {},
 58    "code": 201
 59  },
 60//
 61// Request Information
 62//
 63  "requestObject": {
 64    "kind": "Pod",
 65    "apiVersion": "v1",
 66    "metadata": {
 67      "name": "priv-pod",
 68      "namespace": "default",
 69      "creationTimestamp": null,
 70      "annotations": {
 71        "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{},\"name\":\"priv-pod\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"nginx\",\"name\":\"priv-pod\",\"securityContext\":{\"privileged\":true}}],\"hostNetwork\":true}}\n"
 72      }
 73    },
 74    "spec": {
 75      "containers": [
 76        {
 77          "name": "priv-pod",
 78          "image": "nginx",
 79          "resources": {},
 80          "terminationMessagePath": "/dev/termination-log",
 81          "terminationMessagePolicy": "File",
 82          "imagePullPolicy": "Always",
 83          "securityContext": {
 84            "privileged": true
 85          }
 86        }
 87      ],
 88      "restartPolicy": "Always",
 89      "terminationGracePeriodSeconds": 30,
 90      "dnsPolicy": "ClusterFirst",
 91      "hostNetwork": true,
 92      "securityContext": {},
 93      "schedulerName": "default-scheduler",
 94      "enableServiceLinks": true
 95    },
 96    "status": {}
 97  },
 98//
 99// RequestResponse Information 
100//
101  "responseObject": {
102    "kind": "Pod",
103    "apiVersion": "v1",
104    "metadata": {
105      "name": "priv-pod",
106      "namespace": "default",
107      "uid": "34946cec-ef89-470b-9496-da357d082966",
108      "resourceVersion": "358120",
109      "creationTimestamp": "2024-05-31T20:49:52Z",
110      "annotations": {
111        "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{},\"name\":\"priv-pod\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"nginx\",\"name\":\"priv-pod\",\"securityContext\":{\"privileged\":true}}],\"hostNetwork\":true}}\n"
112      },
113      "managedFields": [
114        {
115          "manager": "kubectl-client-side-apply",
116          "operation": "Update",
117          "apiVersion": "v1",
118          "time": "2024-05-31T20:49:52Z",
119          "fieldsType": "FieldsV1",
120          "fieldsV1": {
121            "f:metadata": {
122              "f:annotations": {
123                ".": {},
124                "f:kubectl.kubernetes.io/last-applied-configuration": {}
125              }
126            },
127            "f:spec": {
128              "f:containers": {
129                "k:{\"name\":\"priv-pod\"}": {
130                  ".": {},
131                  "f:image": {},
132                  "f:imagePullPolicy": {},
133                  "f:name": {},
134                  "f:resources": {},
135                  "f:securityContext": {
136                    ".": {},
137                    "f:privileged": {}
138                  },
139                  "f:terminationMessagePath": {},
140                  "f:terminationMessagePolicy": {}
141                }
142              },
143              "f:dnsPolicy": {},
144              "f:enableServiceLinks": {},
145              "f:hostNetwork": {},
146              "f:restartPolicy": {},
147              "f:schedulerName": {},
148              "f:securityContext": {},
149              "f:terminationGracePeriodSeconds": {}
150            }
151          }
152        }
153      ]
154    },
155    "spec": {
156      "volumes": [
157        {
158          "name": "kube-api-access-hnmnl",
159          "projected": {
160            "sources": [
161              {
162                "serviceAccountToken": {
163                  "expirationSeconds": 3607,
164                  "path": "token"
165                }
166              },
167              {
168                "configMap": {
169                  "name": "kube-root-ca.crt",
170                  "items": [
171                    {
172                      "key": "ca.crt",
173                      "path": "ca.crt"
174                    }
175                  ]
176                }
177              },
178              {
179                "downwardAPI": {
180                  "items": [
181                    {
182                      "path": "namespace",
183                      "fieldRef": {
184                        "apiVersion": "v1",
185                        "fieldPath": "metadata.namespace"
186                      }
187                    }
188                  ]
189                }
190              }
191            ],
192            "defaultMode": 420
193          }
194        }
195      ],
196      "containers": [
197        {
198          "name": "priv-pod",
199          "image": "nginx",
200          "resources": {},
201          "volumeMounts": [
202            {
203              "name": "kube-api-access-hnmnl",
204              "readOnly": true,
205              "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
206            }
207          ],
208          "terminationMessagePath": "/dev/termination-log",
209          "terminationMessagePolicy": "File",
210          "imagePullPolicy": "Always",
211          "securityContext": {
212            "privileged": true
213          }
214        }
215      ],
216      "restartPolicy": "Always",
217      "terminationGracePeriodSeconds": 30,
218      "dnsPolicy": "ClusterFirst",
219      "serviceAccountName": "default",
220      "serviceAccount": "default",
221      "hostNetwork": true,
222      "securityContext": {},
223      "schedulerName": "default-scheduler",
224      "tolerations": [
225        {
226          "key": "node.kubernetes.io/not-ready",
227          "operator": "Exists",
228          "effect": "NoExecute",
229          "tolerationSeconds": 300
230        },
231        {
232          "key": "node.kubernetes.io/unreachable",
233          "operator": "Exists",
234          "effect": "NoExecute",
235          "tolerationSeconds": 300
236        }
237      ],
238      "priority": 0,
239      "enableServiceLinks": true,
240      "preemptionPolicy": "PreemptLowerPriority"
241    },
242    "status": {
243      "phase": "Pending",
244      "qosClass": "BestEffort"
245    }
246  },
247  "requestReceivedTimestamp": "2024-05-31T20:49:52.847213Z",
248  "stageTimestamp": "2024-05-31T20:49:52.866921Z",
249//
250// END RequestResponse Information 
251//
252  "annotations": {
253    "authorization.k8s.io/decision": "allow",
254    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"kubeadm:cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"kubeadm:cluster-admins\"",
255    "pod-security.kubernetes.io/enforce-policy": "privileged:latest"
256  }
257}

Wonderful. What’s the big deal, can’t we just log everything? Not quite. Due to the sheer number of logs AuditPolicy can generate in a real cluster, you can end up with WAY more logs than you’ll ever be able to parse (and probably store…). The best way to configure audit policy is to tune it for what you’re looking for.

Are you very worried that someone is going to create a privileged pod, but for some reason you can’t deploy an admission controller to stop them? AuditPolicy can at least give you the logs needed to alert on that behavior, but it’s only as powerful as you make it through the rules you configure.
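As a sketch of what that kind of rule could look like (how policies are wired up is covered below): the Metadata level alone won’t show you the pod spec, so catching securityContext.privileged means paying for Request-level logging on pod writes, since that’s the level where the submitted manifest lands in requestObject.

# Sketch: capture the submitted pod manifest so privileged: true shows up in requestObject
- level: Request
  verbs: ["create", "update", "patch"]
  resources:
    - group: ""
      resources: ["pods"]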

How to configure AuditPolicy

Unfortunately, configuring an AuditPolicy is a little more complicated to set up than simply running kubectl apply -f auditpolicy.yaml like you can with most Kubernetes resources.

At this point, you should choose your audit backend. There are two to choose from:

  • Webhook: Logs are sent to an external HTTP API
  • Log: Logs are written to a user-defined file on the node

In this example we’ll simply be writing the logs to a file on the node. These can later be collected and sent to a SIEM with a logging agent.
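If you’d rather skip the file-and-agent hop, the webhook backend is configured with its own kube-apiserver flag pointing at a kubeconfig-format file that describes your collector. Roughly like this (a sketch we won’t use in this post; the endpoint is made up):

# Extra kube-apiserver flag for the webhook backend
- --audit-webhook-config-file=/etc/kubernetes/audit/webhook.yaml

# /etc/kubernetes/audit/webhook.yaml -- a kubeconfig-format file pointing at your collector
apiVersion: v1
kind: Config
clusters:
- name: audit-sink
  cluster:
    server: https://siem.example.internal/k8s-audit   # hypothetical collector endpoint
contexts:
- name: default
  context:
    cluster: audit-sink
current-context: default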

The first thing we need to do is create a simple AuditPolicy manifest to get started with. The following policy is what I used in the examples above to demonstrate Metadata-level logging for pod creation. It’s fairly simple: it looks for pod creation events and drops all other logs.

# policy.yaml located in /etc/kubernetes/audit/
apiVersion: audit.k8s.io/v1
kind: Policy
rules:

- level: Metadata
  # Look for creation events...
  verbs: ["create"]
  resources:
    - group: ""
      # For these resources
      resources: ["pods", "pods/status"]

# Do not log anything else
- level: None

This policy.yaml needs to be placed on the node in a location accessible to the kube-apiserver pod, and making it visible to that Pod takes a few steps:

  1. Define a few new flags in our /etc/kubernetes/manifests/kube-apiserver.yaml manifest (this is what defines the parameters for our API server, which is launched as a static pod), specifically:
# Define the file the audit policy should be read from
- --audit-policy-file=/etc/kubernetes/audit/policy.yaml
# Define where to write the audit log
- --audit-log-path=/etc/kubernetes/audit/audit.log
# Define the max size (in MB) of the audit log before it is rotated.
# For a production cluster you're gonna need way more than 500MB
- --audit-log-maxsize=500
# Define how many old, rotated audit log files to retain
- --audit-log-maxbackup=3
  2. We need to create and mount a volume into the kube-apiserver pod. Remember how I said that policy.yaml needs to be accessible to the kube-apiserver? Well, the kube-apiserver runs as a pod, which means that (by default) it cannot access anything on the Node. This presents a problem: we need it to both read /etc/kubernetes/audit/policy.yaml and be able to write the log files to the node’s file system; otherwise, if the kube-apiserver pod died or got recreated, we would lose our logs. Creating the volumeMount and hostPath volume is fairly straightforward. Add the following lines to /etc/kubernetes/manifests/kube-apiserver.yaml:
# Place under the volumeMounts section of /etc/kubernetes/manifests/kube-apiserver.yaml
- mountPath: /etc/kubernetes/audit
  name: audit

# Place under the volumes section of /etc/kubernetes/manifests/kube-apiserver.yaml
- hostPath:
    path: /etc/kubernetes/audit
    type: DirectoryOrCreate
  name: audit

Note: You should be as specific as possible with your mounts. You should NOT simply mount the entire filesystem or all of /etc/kubernetes/. For more info, see the Kubenomicon.

Your /etc/kubernetes/manifests/kube-apiserver.yaml should now look roughly like this:

  1# /etc/kubernetes/manifests/kube-apiserver.yaml with AuditPolicy Configured
  2# To log to /etc/kubernetes/audit/audit.log
  3apiVersion: v1
  4kind: Pod
  5metadata:
  6  annotations:
  7    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.1.201:6443
  8  creationTimestamp: null
  9  labels:
 10    component: kube-apiserver
 11    tier: control-plane
 12  name: kube-apiserver
 13  namespace: kube-system
 14spec:
 15  containers:
 16  - command:
 17    - kube-apiserver
 18    - --advertise-address=192.168.1.201
 19    - --allow-privileged=true
 20    - --authorization-mode=Node,RBAC
 21    - --client-ca-file=/etc/kubernetes/pki/ca.crt
 22    - --enable-admission-plugins=NodeRestriction
 23    - --enable-bootstrap-token-auth=true
 24    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
 25    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
 26    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
 27    - --etcd-servers=https://127.0.0.1:2379
 28    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
 29    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
 30    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
 31    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
 32    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
 33    - --requestheader-allowed-names=front-proxy-client
 34    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
 35    - --requestheader-extra-headers-prefix=X-Remote-Extra-
 36    - --requestheader-group-headers=X-Remote-Group
 37    - --requestheader-username-headers=X-Remote-User
 38    - --secure-port=6443
 39    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
 40    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
 41    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
 42    - --service-cluster-ip-range=10.96.0.0/12
 43    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
 44    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
 45    # Notice the new audit flags below
 46    - --audit-policy-file=/etc/kubernetes/audit/policy.yaml
 47    - --audit-log-path=/etc/kubernetes/audit/audit.log
 48    - --audit-log-maxsize=500
 49    - --audit-log-maxbackup=3
 50    image: registry.k8s.io/kube-apiserver:v1.30.1
 51    imagePullPolicy: IfNotPresent
 52    livenessProbe:
 53      failureThreshold: 8
 54      httpGet:
 55        host: 192.168.1.201
 56        path: /livez
 57        port: 6443
 58        scheme: HTTPS
 59      initialDelaySeconds: 10
 60      periodSeconds: 10
 61      timeoutSeconds: 15
 62    name: kube-apiserver
 63    readinessProbe:
 64      failureThreshold: 3
 65      httpGet:
 66        host: 192.168.1.201
 67        path: /readyz
 68        port: 6443
 69        scheme: HTTPS
 70      periodSeconds: 1
 71      timeoutSeconds: 15
 72    resources:
 73      requests:
 74        cpu: 250m
 75    startupProbe:
 76      failureThreshold: 24
 77      httpGet:
 78        host: 192.168.1.201
 79        path: /livez
 80        port: 6443
 81        scheme: HTTPS
 82      initialDelaySeconds: 10
 83      periodSeconds: 10
 84      timeoutSeconds: 15
 85    volumeMounts:
 86    # Notice the new volumeMounts info we added
 87    - mountPath: /etc/kubernetes/audit      
 88      name: audit
 89    - mountPath: /etc/ssl/certs
 90      name: ca-certs
 91      readOnly: true
 92    - mountPath: /etc/ca-certificates
 93      name: etc-ca-certificates
 94      readOnly: true
 95    - mountPath: /etc/pki
 96      name: etc-pki
 97      readOnly: true
 98    - mountPath: /etc/kubernetes/pki
 99      name: k8s-certs
100      readOnly: true
101    - mountPath: /usr/local/share/ca-certificates
102      name: usr-local-share-ca-certificates
103      readOnly: true
104    - mountPath: /usr/share/ca-certificates
105      name: usr-share-ca-certificates
106      readOnly: true
107  hostNetwork: true
108  priority: 2000001000
109  priorityClassName: system-node-critical
110  securityContext:
111    seccompProfile:
112      type: RuntimeDefault
113  volumes:
114  # Notice the new hostPath mounts
115  - hostPath:                              
116      path: /etc/kubernetes/audit          
117      type: DirectoryOrCreate               
118    name: audit
119  - hostPath:
120      path: /etc/ssl/certs
121      type: DirectoryOrCreate
122    name: ca-certs
123  - hostPath:
124      path: /etc/ca-certificates
125      type: DirectoryOrCreate
126    name: etc-ca-certificates
127  - hostPath:
128      path: /etc/pki
129      type: DirectoryOrCreate
130    name: etc-pki
131  - hostPath:
132      path: /etc/kubernetes/pki
133      type: DirectoryOrCreate
134    name: k8s-certs
135  - hostPath:
136      path: /usr/local/share/ca-certificates
137      type: DirectoryOrCreate
138    name: usr-local-share-ca-certificates
139  - hostPath:
140      path: /usr/share/ca-certificates
141      type: DirectoryOrCreate
142    name: usr-share-ca-certificates
143status: {}

As a reminder, the actual audit policy we placed in /etc/kubernetes/audit/ earlier (and used in the examples above) looks like this:

# policy.yaml located in /etc/kubernetes/audit/
apiVersion: audit.k8s.io/v1
kind: Policy
rules:

- level: Metadata
  verbs: ["create"]
  resources:
    - group: ""
      resources: ["pods", "pods/status"]

- level: None

The only rule this policy contains matches pod creation events (which is something you should be auditing, as many privilege escalation techniques require an attacker to create a pod). Notice that the rules work a bit like a firewall: they’re evaluated from the top down, and the final rule says “log everything not matched above at the level None,” which is to say, don’t log it at all.
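Order matters. As a contrived illustration (same file, rules flipped), putting the catch-all first would swallow everything before the pod rule ever gets a chance to match:

# DON'T do this: the catch-all matches every request first...
rules:
- level: None
# ...so this rule is never reached and pod creations go unlogged
- level: Metadata
  verbs: ["create"]
  resources:
    - group: ""
      resources: ["pods", "pods/status"]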

Ready for the dumb part? On the cluster I’m working with (created using kubeadm and running containerd), there is not a great way to tell the API server to respect our new configuration parameters. It would be nice if there was a command to restart the API server. Unfortunately, neither deleting the pod and waiting for it to be recreated nor running touch /etc/kubernetes/manifests/kube-apiserver.yaml seems to reliably get the new configuration applied.

The only reliable way I’ve found to get the kube-apiserver pod recreated with the updated configuration is to run mv kube-apiserver.yaml /tmp from /etc/kubernetes/manifests/, wait a few seconds, and move it back with mv /tmp/kube-apiserver.yaml . If you know of a better way of doing this, please let me know!
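Spelled out, the bounce looks roughly like this (assuming a kubeadm control plane node where the static pod manifests live in /etc/kubernetes/manifests/; whether crictl is available depends on your node setup):

cd /etc/kubernetes/manifests/
# The kubelet notices the manifest is gone and tears down the static pod
mv kube-apiserver.yaml /tmp
sleep 10
# Moving it back makes the kubelet recreate the pod with the new flags
mv /tmp/kube-apiserver.yaml .
# Optional sanity check that the API server container came back
crictl ps | grep kube-apiserver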

Anyway. Your /etc/kubernetes/audit/ directory should now contain your policy.yaml, and once the API server comes back up, it will start writing audit.log alongside it.

Let’s test our audit policy by creating a new pod and making sure we catch the creation event. To do so, we can tail the log file on the node and pipe it to jq so the formatting is a little easier to read (tail -f audit.log | jq), then run kubectl apply -f <pod_manifest>.yaml or kubectl run policytest --image=nginx. We immediately see that our log file has been populated with details about the pod creation.

If we take a closer look at this log, we can see that two different stages are being captured. We can verify this by running cat audit.log | jq '.stage'.
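If you want a quick tally of which stages you’re actually capturing (and how many of each), a one-liner like this does the trick:

# Count audit events per stage
cat audit.log | jq -r '.stage' | sort | uniq -c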

Four different stages can be recorded:

  1. RequestReceived: Generated as soon as the API server receives the request
  2. ResponseStarted: Generated once the response headers are sent but before the body completes; this only happens for long-running requests such as watch
  3. ResponseComplete: Generated when the response body has been sent
  4. Panic: Generated when something goes wrong and the request panics

This is great, but if we’re tuning our logs, we might only be interested in the ResponseComplete stage, which tells us something actually happened. We also want to update our AuditPolicy to capture when someone gets secrets from the cluster. Let’s modify our policy.yaml file to reflect that:

🚨 Note that we’ll also be introducing a gap in detection that we’ll discuss later, see if you can spot it

apiVersion: audit.k8s.io/v1
kind: Policy
# Here we're explicitly saying don't log the stage where the API server receives the request
omitStages:
  - "RequestReceived"
rules:

- level: Metadata
  verbs: ["create"]
  resources:
    - group: ""
      resources: ["pods", "pods/status"]


# Here we're logging activity associated with running something like `kubectl get secret secret123`
- level: Metadata
  verbs: ["get"]
  resources:
    - group: ""
      resources: ["secrets"]

- level: None

Now, when we create a pod, we will only see the log for it at the ResponseComplete stage and not both stages, cutting the number of audit events for that rule in half. (Note that there are still reasons you might want both, but if you’re struggling with log volume, this may make things a bit more bearable.)

We can also see that our log now shows get requests for secrets when someone runs kubectl get secret <secret_name>. Amazing, right?

Unfortunately, there is a large detection gap here that is VERY easy to overlook. Remember how in our AuditPolicy file we specified that we wanted to log any requests with the get verb for the secrets resource? Kubernetes has a weird quirk here: running kubectl get secret <secret_name> is indeed a get verb as you would expect; however, running kubectl get secrets is NOT technically a get action. It’s a list action, because it lists all the secrets, and you’ll notice that our audit policy does not collect the list verb. This means that running kubectl get secrets will not be logged at all under our current AuditPolicy:

# Here we're logging activity associated with running something like kubectl get secret secret123
- level: Metadata
  # We are not collecting "list" actions
  verbs: ["get"]
  resources:
    - group: ""
      resources: ["secrets"]

Running kubectl get secrets doesn’t seem like something that needs to be logged, because it only lists the secrets, it doesn’t provide the actual sensitive data, right? Well… actually, no. You can very easily see the data if you run kubectl get secrets -o yaml (or -o json). Despite this, since we didn’t specify the list verb in our AuditPolicy, this action will NOT be logged, even though we’ve just accessed every secret in the namespace.
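To make that concrete (with a made-up secret value): the single list call below hands back every secret’s data in the namespace, and the values are only base64 encoded, not encrypted.

# One "list" request returns every secret's data in the namespace
kubectl get secrets -o yaml

# Secret values are just base64
echo "c3VwZXJfc2VjcmV0Cg==" | base64 -d   # -> super_secret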

Luckily this is a very simple fix if you’re aware of this odd quirk. All we need to do is update our AuditPolicy to capture the list verb on secrets as well:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:

- level: Metadata
  verbs: ["create"]
  resources:
    - group: ""
      resources: ["pods", "pods/status"]

- level: Metadata
  verbs: ["get", "list"] # Added "list"
  resources:
    - group: ""
      resources: ["secrets"]

- level: None

Now when we run kubectl get secrets, our log file contains that request. Depending on what you’re running in your cluster, this may be very noisy, as lots of things list secrets in clusters (which… is a whole different topic).

 1{
 2  "kind": "Event",
 3  "apiVersion": "audit.k8s.io/v1",
 4  "level": "Metadata",
 5  "auditID": "36d2e2dd-3348-4ce1-ad9b-d364c871324a",
 6  "stage": "ResponseComplete",
 7  "requestURI": "/api/v1/namespaces/default/secrets?limit=500",
 8  //
 9  // Notice the list verb has been logged
10  //
11  "verb": "list",
12  "user": {
13    "username": "kubernetes-admin",
14    "groups": [
15      "kubeadm:cluster-admins",
16      "system:authenticated"
17    ]
18  },
19  "sourceIPs": [
20    "192.168.1.167"
21  ],
22  "userAgent": "kubectl/v1.28.9 (linux/amd64) kubernetes/587f5fe",
23  "objectRef": {
24    "resource": "secrets",
25    "namespace": "default",
26    "apiVersion": "v1"
27  },
28  "responseStatus": {
29    "metadata": {},
30    "code": 200
31  },
32  "requestReceivedTimestamp": "2024-06-02T00:39:37.042236Z",
33  "stageTimestamp": "2024-06-02T00:39:37.123574Z",
34  "annotations": {
35    "authorization.k8s.io/decision": "allow",
36    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"kubeadm:cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"kubeadm:cluster-admins\""
37  }
38}

Audit logs are extremely powerful, but only if you tune them correctly. Keep an eye on The Kubenomicon for ideas on what you should be auditing to catch attackers.

Kubernetes Events

Kubernetes events can be viewed by running kubectl get events. These events are created whenever something changes at the cluster level. For example, if I create a Pod that pulls an nginx image, events will document that process. In this case, the pod was successfully created, so there were no issues.
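A couple of handy ways to slice events when you’re troubleshooting (the pod name is just the policytest example from earlier):

# Events for the whole namespace, oldest first
kubectl get events --sort-by=.metadata.creationTimestamp

# Only the events related to a specific pod
kubectl get events --field-selector involvedObject.name=policytest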

If there was an error pulling the image, these events would show that. In this example, I’ve made a typo in the image name, so Kubernetes (or really the container runtime) can’t pull the image.

By default, Kubernetes events are stored for 1 hour (although this can be configured). Kubernetes events are great for troubleshooting a Kubernetes cluster, but if you’re relying on your SOC to investigate information using kubectl, you should probably rethink your logging architecture.

Cloud Level Logging

Logging at the cloud level is anything “above” the cluster level; mostly this means log artifacts generated by your cloud provider. I’m not going to cover this in much detail because it’s different for each provider, and I typically work with clusters that aren’t managed by a cloud provider, so I don’t have much to say on the topic that you can’t just read in your provider’s documentation.

Tail -f

Phew, that was a lot of logs we just waded through. As you can see, it’s a little less straightforward to collect logs from all the layers of a Kubernetes cluster than it is to collect logs from a normal virtual machine, but it’s certainly possible.

I hope the main idea you take away from this is that collecting ALL the logs is not super useful in most instances. It’s very important to understand what you want to collect from each layer and to tune your logging to align with it.

Here are some things you should look into if you’re interested in learning more about Kubernetes logging: