
Failed to start containerd task due to operation not permitted accessing /proc/thread-self/fd/ #13293

@WilfSilver

Description

Environmental Info:
K3s Version: v1.33.6+k3s1

Node(s) CPU architecture, OS, and Version:

Linux ctrl1 6.17.9-300.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 24 23:31:27 UTC 2025 x86_64 GNU/Linux
Linux ctrl2 6.17.8-300.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 14 01:47:12 UTC 2025 x86_64 GNU/Linux
Linux ctrl3 6.17.8-300.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 14 01:47:12 UTC 2025 x86_64 GNU/Linux
Linux agent1 6.17.9-300.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 24 23:31:27 UTC 2025 x86_64 GNU/Linux
Linux agent2 6.17.9-300.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Nov 24 23:31:27 UTC 2025 x86_64 GNU/Linux

Cluster Configuration:
3 servers (all bare metal) and 2 agents (all VMs), all running Fedora 43 Server (see the naming scheme above). The cluster uses a dual-stack configuration with etcd for the control plane (see Steps To Reproduce for the specifics).

Describe the bug:
Summary and potential cause: I believe this is related to a runc issue, opencontainers/runc#5007, which was fixed in runc 1.3.4. I could be wrong, though, as the error message is slightly different.

I have installed security-profiles-operator on the cluster to manage SELinux policies. However, around 1–2 weeks ago the spod instances started failing to boot (I do have automated upgrades turned on), with the error message:

reopen exec fifo: get safe /proc/thread-self/fd handle: unsafe procfs detected: openat2 fsmount:fscontext:proc/thread-self/fd/: operation not permitted

With the state of the pod being:

failed to start containerd task "f73c95bbd93515cf412c48f1a541e1b1e54ce9fbc637f5d77deab0c00b8a3f99": cannot start a stopped process

The especially confusing part is that other containers using the same image (including the init container in the same pod) do not have this issue.
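For anyone who wants to probe the failing call directly on an affected node: the error comes from runc trying to obtain a safe procfs handle via openat2(2). Below is a minimal, hedged sketch of that class of call (assumptions: Linux with kernel >= 5.6; syscall number 437 is uniform across architectures for post-5.6 syscalls). This is illustrative only — real runc additionally layers RESOLVE_* restrictions and a freshly fsmount()ed procfs on top, which is where the "unsafe procfs detected" message actually originates.

```python
# Issue an openat2(2) call against /proc/thread-self/fd, the path named in
# the error. An EPERM here (errno 1) would match the "operation not
# permitted" in the report.
import ctypes
import os

SYS_OPENAT2 = 437            # arch-uniform number for this newer syscall
AT_FDCWD = -100
O_PATH = 0o10000000          # obtain a handle without read/write access

class OpenHow(ctypes.Structure):
    """Mirror of struct open_how from linux/openat2.h."""
    _fields_ = [("flags", ctypes.c_uint64),
                ("mode", ctypes.c_uint64),
                ("resolve", ctypes.c_uint64)]

libc = ctypes.CDLL(None, use_errno=True)
# resolve=0 keeps the sketch simple; runc passes RESOLVE_* restrictions here.
how = OpenHow(flags=O_PATH, mode=0, resolve=0)
fd = libc.syscall(SYS_OPENAT2, AT_FDCWD, b"/proc/thread-self/fd",
                  ctypes.byref(how), ctypes.sizeof(how))
err = ctypes.get_errno() if fd < 0 else 0
print(f"openat2(/proc/thread-self/fd) -> fd={fd}, errno={err}")
if fd >= 0:
    os.close(fd)
```

If this prints errno=1 (EPERM) on an affected node but succeeds elsewhere, that would point at something (e.g. a seccomp or LSM policy) blocking the syscall for that particular container, which fits the theory below.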

Steps To Reproduce:

  • Set up the load balancer:
    $ sudo dnf install nginx nginx-mod-stream
    $ sudo vi /etc/nginx/nginx.conf
    load_module /usr/lib64/nginx/modules/ngx_stream_module.so;
    events {}
    
    stream {
      upstream k3s_servers {
        server ctrl1.mydomain.org:6443;
        server ctrl2.mydomain.org:6443;
        server ctrl3.mydomain.org:6443;
      }
      server {
        listen 6445;
        proxy_pass k3s_servers;
      }
    }
    $ sudo semanage port -a -t http_port_t -p tcp 6445
    $ sudo semanage port -a -t http_port_t -p udp 6445
    $ sudo setsebool -P httpd_can_network_connect 1
    $ sudo systemctl enable --now nginx
    
  • Install K3s on the nodes:
    # Initial server node
    curl -sfL https://get.k3s.io | sh -s - server --disable=traefik --cluster-cidr=10.42.0.0/16,2001:db8:42::/56 --service-cidr=10.43.0.0/16,2001:db8:43::/112 --flannel-ipv6-masq --tls-san k8s.mydomain.org --cluster-init --selinux
    # Other two server nodes
    curl -sfL https://get.k3s.io | K3S_TOKEN=server_token sh -s - server --disable=traefik --cluster-cidr=10.42.0.0/16,2001:db8:42::/56 --service-cidr=10.43.0.0/16,2001:db8:43::/112 --flannel-ipv6-masq --tls-san k8s.mydomain.org --server https://k8s.mydomain.org:6445 --selinux
    # Agent nodes
    curl -sfL https://get.k3s.io | K3S_URL=https://k8s.mydomain.org:6445 K3S_TOKEN=server_token sh -s - agent --selinux
  • Install security-profiles-operator (see the docs). Specifically, I installed it with Helm (NOTE: cert-manager must be installed first):
    $ vim values.yaml
    # current selinuxd does not work on fedora 43, so I have my own image
    selinuxdImage:
      default:
        registry: "docker.io"
        repository: "wilfsilver/selinuxd-fedora"
        tag: "f43"
      fedora:
        registry: "docker.io"
        repository: "wilfsilver/selinuxd-fedora"
        tag: "f43"
    enableSelinux: true
    verbosity: 1
    $ helm install -f values.yaml security-profiles-operator --namespace security-profiles-operator https://github.com/kubernetes-sigs/security-profiles-operator/releases/download/v0.9.0/security-profiles-operator-0.9.0.tgz
    
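As a side note, the dual-stack ranges passed via --cluster-cidr and --service-cidr above can be sanity-checked with nothing but the Python standard library. The networks below are copied verbatim from the install commands; this is purely an illustrative check, not something K3s requires:

```python
# Verify the pod (cluster) and service CIDRs from the k3s flags above do
# not overlap within each address family.
import ipaddress

cluster_cidrs = ["10.42.0.0/16", "2001:db8:42::/56"]
service_cidrs = ["10.43.0.0/16", "2001:db8:43::/112"]

overlaps = []
for c in map(ipaddress.ip_network, cluster_cidrs):
    for s in map(ipaddress.ip_network, service_cidrs):
        # ip_network.overlaps() is only defined within one address family.
        if c.version == s.version and c.overlaps(s):
            overlaps.append((c, s))

print("overlapping ranges:", overlaps)  # → overlapping ranges: []
```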

Expected behavior:

  • The spod DaemonSet deploys correctly, without issues

Actual behavior:

  • The spod pods of the DaemonSet cannot start after successfully running their init containers (see the error messages in the description)

Additional context / logs:

Below is the full definition of the DaemonSet auto-generated by spod. As discussed in the linked runc issue, the problem could be related to the seccomp profile on the first defined container (which is the one that is erroring out).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "25"
  creationTimestamp: "2025-11-23T22:00:50Z"
  generation: 25
  name: spod
  namespace: security-profiles-operator
  ownerReferences:
  - apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: SecurityProfilesOperatorDaemon
    name: spod
    uid: ce158e96-2751-4e84-b9b2-6aa5cc76ac11
  resourceVersion: "10413597"
  uid: 2994f0cd-ae3c-4292-81f6-192c105bea25
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: security-profiles-operator
      name: spod
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: "2025-11-24T11:32:13Z"
        openshift.io/scc: privileged
      creationTimestamp: null
      labels:
        app: security-profiles-operator
        name: spod
    spec:
      containers:
      - args:
        - daemon
        - --with-selinux=true
        - --with-recording=false
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: OPERATOR_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: SPOD_NAME
          value: spod
        - name: KUBELET_DIR
          value: /var/lib/kubelet
        - name: HOME
          value: /home
        - name: SPO_VERBOSITY
          value: "1"
        image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 1
          httpGet:
            path: /healthz
            port: liveness-port
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: security-profiles-operator
        ports:
        - containerPort: 8085
          name: liveness-port
          protocol: TCP
        resources:
          limits:
            ephemeral-storage: 200Mi
            memory: 128Mi
          requests:
            cpu: 100m
            ephemeral-storage: 50Mi
            memory: 64Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsGroup: 65535
          runAsUser: 65535
          seLinuxOptions:
            type: spc_t
          seccompProfile:
            localhostProfile: security-profiles-operator.json
            type: Localhost
        startupProbe:
          failureThreshold: 10
          httpGet:
            path: /healthz
            port: liveness-port
            scheme: HTTP
          periodSeconds: 3
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kubelet/seccomp/operator
          name: host-operator-volume
        - mountPath: /etc/selinux.d
          name: selinux-drop-dir
        - mountPath: /var/run/selinuxd
          name: selinuxd-private-volume
        - mountPath: /tmp/security-profiles-operator-recordings
          name: profile-recording-output-volume
        - mountPath: /var/run/grpc
          name: grpc-server-volume
        - mountPath: /home
          name: home-volume
        - mountPath: /tmp
          name: tmp-volume
        - mountPath: /var/run/secrets/metrics
          name: metrics-cert-volume
          readOnly: true
      - args:
        - daemon
        - --datastore-path
        - /var/run/selinuxd/selinuxd.db
        - --socket-path
        - /var/run/selinuxd/selinuxd.sock
        - --socket-uid
        - "0"
        - --socket-gid
        - "65535"
        env:
        - name: KUBELET_DIR
          value: /var/lib/kubelet
        - name: SPO_VERBOSITY
          value: "1"
        image: docker.io/wilfsilver/selinuxd-fedora:f43
        imagePullPolicy: Always
        name: selinuxd
        resources:
          limits:
            ephemeral-storage: 400Mi
            memory: 1Gi
          requests:
            cpu: 100m
            ephemeral-storage: 200Mi
            memory: 512Mi
        securityContext:
          capabilities:
            add:
            - CHOWN
            - FOWNER
            - FSETID
            - DAC_OVERRIDE
          readOnlyRootFilesystem: true
          runAsGroup: 0
          runAsUser: 0
          seLinuxOptions:
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/selinux.d
          name: selinux-drop-dir
          readOnly: true
        - mountPath: /var/run/selinuxd
          name: selinuxd-private-volume
        - mountPath: /sys/fs/selinux
          name: host-fsselinux-volume
        - mountPath: /etc/selinux
          name: host-etcselinux-volume
        - mountPath: /var/lib/selinux
          name: host-varlibselinux-volume
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - non-root-enabler
        - --runtime=
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: KUBELET_DIR
          value: /var/lib/kubelet
        - name: SPO_VERBOSITY
          value: "1"
        image: gcr.io/k8s-staging-sp-operator/security-profiles-operator:latest
        imagePullPolicy: Always
        name: non-root-enabler
        resources:
          limits:
            ephemeral-storage: 50Mi
            memory: 64Mi
          requests:
            cpu: 100m
            ephemeral-storage: 10Mi
            memory: 32Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - CHOWN
            - FOWNER
            - FSETID
            - DAC_OVERRIDE
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsUser: 0
          seLinuxOptions:
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib
          name: host-varlib-volume
        - mountPath: /opt/spo-profiles
          name: operator-profiles-volume
          readOnly: true
        - mountPath: /host
          name: host-root-volume
      - args:
        - |
          set -x
          chown 65535:0 /etc/selinux.d
          chmod 750 /etc/selinux.d
          semodule -i /usr/share/selinuxd/templates/*.cil
          semodule -i /opt/spo-profiles/selinuxd.cil
          semodule -i /opt/spo-profiles/selinuxrecording.cil
        command:
        - bash
        - -c
        env:
        - name: KUBELET_DIR
          value: /var/lib/kubelet
        - name: SPO_VERBOSITY
          value: "1"
        image: docker.io/wilfsilver/selinuxd-fedora:f43
        imagePullPolicy: Always
        name: selinux-shared-policies-copier
        resources:
          limits:
            ephemeral-storage: 50Mi
            memory: 1Gi
          requests:
            cpu: 100m
            ephemeral-storage: 10Mi
            memory: 32Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - CHOWN
            - FOWNER
            - FSETID
            - DAC_OVERRIDE
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsUser: 0
          seLinuxOptions:
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/selinux.d
          name: selinux-drop-dir
        - mountPath: /opt/spo-profiles
          name: operator-profiles-volume
          readOnly: true
        - mountPath: /sys/fs/selinux
          name: host-fsselinux-volume
        - mountPath: /etc/selinux
          name: host-etcselinux-volume
        - mountPath: /var/lib/selinux
          name: host-varlibselinux-volume
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: spod
      serviceAccountName: spod
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
      volumes:
      - hostPath:
          path: /var/lib
          type: Directory
        name: host-varlib-volume
      - hostPath:
          path: /var/lib/security-profiles-operator
          type: DirectoryOrCreate
        name: host-operator-volume
      - configMap:
          defaultMode: 420
          name: security-profiles-operator-profile
        name: operator-profiles-volume
      - emptyDir: {}
        name: selinux-drop-dir
      - emptyDir: {}
        name: selinuxd-private-volume
      - hostPath:
          path: /sys/fs/selinux
          type: Directory
        name: host-fsselinux-volume
      - hostPath:
          path: /etc/selinux
          type: Directory
        name: host-etcselinux-volume
      - hostPath:
          path: /var/lib/selinux
          type: Directory
        name: host-varlibselinux-volume
      - hostPath:
          path: /tmp/security-profiles-operator-recordings
          type: DirectoryOrCreate
        name: profile-recording-output-volume
      - hostPath:
          path: /var/log/audit
          type: DirectoryOrCreate
        name: host-auditlog-volume
      - hostPath:
          path: /var/log
          type: DirectoryOrCreate
        name: host-syslog-volume
      - name: metrics-cert-volume
        secret:
          defaultMode: 420
          secretName: metrics-server-cert
      - hostPath:
          path: /sys/kernel/debug
          type: Directory
        name: sys-kernel-debug-volume
      - hostPath:
          path: /sys/kernel/security
          type: Directory
        name: sys-kernel-security-volume
      - hostPath:
          path: /sys/kernel/tracing
          type: Directory
        name: sys-kernel-tracing-volume
      - hostPath:
          path: /etc/os-release
          type: File
        name: host-etc-osrelease-volume
      - emptyDir: {}
        name: tmp-volume
      - emptyDir: {}
        name: grpc-server-volume
      - hostPath:
          path: /
          type: Directory
        name: host-root-volume
      - emptyDir: {}
        name: home-volume
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 100%
    type: RollingUpdate

Metadata

Labels: kind/upstream-issue (This issue appears to be caused by an upstream bug)

Status: Accepted