A security context defines privilege and access control settings for a Pod or Container. Security context settings include:
Discretionary Access Control: Permission to access an object, like a file, is based on user ID (UID) and group ID (GID).
Security Enhanced Linux (SELinux): Objects are assigned security labels.
Running as privileged or unprivileged.
Linux Capabilities: Give a process some privileges, but not all the privileges of the root user.
AppArmor: Use program profiles to restrict the capabilities of individual programs.
Seccomp: Limit a process’s access to open file descriptors.
AllowPrivilegeEscalation: Controls whether a process can gain more privileges than its parent process. This bool directly controls whether the no_new_privs
flag gets set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged OR 2) has CAP_SYS_ADMIN
.
For more information about security mechanisms in Linux, see Overview of Linux Kernel Security Features
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using Minikube, or you can use one of these Kubernetes playgrounds:
To check the version, enter kubectl version
.
To specify security settings for a Pod, include the securityContext
field
in the Pod specification. The securityContext
field is a
PodSecurityContext object.
The security settings that you specify for a Pod apply to all Containers in the Pod.
Here is a configuration file for a Pod that has a securityContext
and an emptyDir
volume:
security-context.yaml
|
---|
|
In the configuration file, the runAsUser
field specifies that for any Containers in
the Pod, the first process runs with user ID 1000. The fsGroup
field specifies that
group ID 2000 is associated with all Containers in the Pod. Group ID 2000 is also
associated with the volume mounted at /data/demo
and with any files created in that
volume.
Create the Pod:
kubectl create -f https://k8s.io/docs/tasks/configure-pod-container/security-context.yaml
Verify that the Pod’s Container is running:
kubectl get pod security-context-demo
Get a shell to the running Container:
kubectl exec -it security-context-demo -- sh
In your shell, list the running processes:
ps aux
The output shows that the processes are running as user 1000, which is the value of runAsUser
:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
1000 1 0.0 0.0 4336 724 ? Ss 18:16 0:00 /bin/sh -c node server.js
1000 5 0.2 0.6 772124 22768 ? Sl 18:16 0:00 node server.js
...
In your shell, navigate to /data
, and list the one directory:
cd /data
ls -l
The output shows that the /data/demo
directory has group ID 2000, which is
the value of fsGroup
.
drwxrwsrwx 2 root 2000 4096 Jun 6 20:08 demo
In your shell, navigate to /data/demo
, and create a file:
cd demo
echo hello > testfile
List the file in the /data/demo
directory:
ls -l
The output shows that testfile
has group ID 2000, which is the value of fsGroup
.
-rw-r--r-- 1 1000 2000 6 Jun 6 20:08 testfile
Exit your shell:
exit
To specify security settings for a Container, include the securityContext
field
in the Container manifest. The securityContext
field is a
SecurityContext object.
Security settings that you specify for a Container apply only to
the individual Container, and they override settings made at the Pod level when
there is overlap. Container settings do not affect the Pod’s Volumes.
Here is the configuration file for a Pod that has one Container. Both the Pod
and the Container have a securityContext
field:
security-context-2.yaml
|
---|
|
Create the Pod:
kubectl create -f https://k8s.io/docs/tasks/configure-pod-container/security-context-2.yaml
Verify that the Pod’s Container is running:
kubectl get pod security-context-demo-2
Get a shell into the running Container:
kubectl exec -it security-context-demo-2 -- sh
In your shell, list the running processes:
ps aux
The output shows that the processes are running as user 2000. This is the value
of runAsUser
specified for the Container. It overrides the value 1000 that is
specified for the Pod.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
2000 1 0.0 0.0 4336 764 ? Ss 20:36 0:00 /bin/sh -c node server.js
2000 8 0.1 0.5 772124 22604 ? Sl 20:36 0:00 node server.js
...
Exit your shell:
exit
With Linux capabilities,
you can grant certain privileges to a process without granting all the privileges
of the root user. To add or remove Linux capabilities for a Container, include the
capabilities
field in the securityContext
section of the Container manifest.
First, see what happens when you don’t include a capabilities
field.
Here is configuration file that does not add or remove any Container capabilities:
security-context-3.yaml
|
---|
|
Create the Pod:
kubectl create -f https://k8s.io/docs/tasks/configure-pod-container/security-context-3.yaml
Verify that the Pod’s Container is running:
kubectl get pod security-context-demo-3
Get a shell into the running Container:
kubectl exec -it security-context-demo-3 -- sh
In your shell, list the running processes:
ps aux
The output shows the process IDs (PIDs) for the Container:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 4336 796 ? Ss 18:17 0:00 /bin/sh -c node server.js
root 5 0.1 0.5 772124 22700 ? Sl 18:17 0:00 node server.js
In your shell, view the status for process 1:
cd /proc/1
cat status
The output shows the capabilities bitmap for the process:
...
CapPrm: 00000000a80425fb
CapEff: 00000000a80425fb
...
Make a note of the capabilities bitmap, and then exit your shell:
exit
Next, run a Container that is the same as the preceding container, except that it has additional capabilities set.
Here is the configuration file for a Pod that runs one Container. The configuration
adds the CAP_NET_ADMIN
and CAP_SYS_TIME
capabilities:
security-context-4.yaml
|
---|
|
Create the Pod:
kubectl create -f https://k8s.io/docs/tasks/configure-pod-container/security-context-4.yaml
Get a shell into the running Container:
kubectl exec -it security-context-demo-4 -- sh
In your shell, view the capabilities for process 1:
cd /proc/1
cat status
The output shows capabilities bitmap for the process:
...
CapPrm: 00000000aa0435fb
CapEff: 00000000aa0435fb
...
Compare the capabilities of the two Containers:
00000000a80425fb
00000000aa0435fb
In the capability bitmap of the first container, bits 12 and 25 are clear. In the second container,
bits 12 and 25 are set. Bit 12 is CAP_NET_ADMIN
, and bit 25 is CAP_SYS_TIME
.
See capability.h
for definitions of the capability constants.
Note: Linux capability constants have the form CAP_XXX
. But when you list capabilities in your Container manifest, you must omit the CAP_
portion of the constant. For example, to add CAP_SYS_TIME
, include SYS_TIME
in your list of capabilities.
To assign SELinux labels to a Container, include the seLinuxOptions
field in
the securityContext
section of your Pod or Container manifest. The
seLinuxOptions
field is an
SELinuxOptions
object. Here’s an example that applies an SELinux level:
...
securityContext:
seLinuxOptions:
level: "s0:c123,c456"
Note: To assign SELinux labels, the SELinux security module must be loaded on the host operating system.
The security context for a Pod applies to the Pod’s Containers and also to
the Pod’s Volumes when applicable. Specifically fsGroup
and seLinuxOptions
are
applied to Volumes as follows:
fsGroup
: Volumes that support ownership management are modified to be owned
and writable by the GID specified in fsGroup
. See the
Ownership Management design document
for more details.
seLinuxOptions
: Volumes that support SELinux labeling are relabeled to be accessible
by the label specified under seLinuxOptions
. Usually you only
need to set the level
section. This sets the
Multi-Category Security (MCS)
label given to all Containers in the Pod as well as the Volumes.
Warning: After you specify an MCS label for a Pod, all Pods with the same label can access the Volume. If you need inter-Pod protection, you must assign a unique MCS label to each Pod.