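Security Best Practices: RBAC + NetworkPolicy + Image Security
A close look at the Kubernetes security model: RBAC access control, network policies, Pod security, and image hardening.
Overview
Security is a core concern for any production Kubernetes cluster. This article walks through the layers of the cloud-native security stack.
Learning objectives:
- Understand the Kubernetes security model (defense in depth)
- Configure RBAC access control
- Isolate workloads on the network with NetworkPolicy
- Apply Pod security controls (Pod Security Admission, the replacement for PodSecurityPolicy, which was removed in Kubernetes 1.25)
- Understand image security and scanning practices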
Security model overview
Defense in depth
┌────────────────────────────────────────────────────────────────┐
│  Kubernetes security layers                                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Cluster boundary                                              │
│  └── Namespace isolation                                       │
│      ├── NetworkPolicy (network isolation between services)    │
│      ├── Pod Security (runtime restrictions on containers)     │
│      ├── RBAC (identity and access control)                    │
│      └── Secrets (sensitive data protection)                   │
│                                                                │
│  Foundation: physical / cloud infrastructure security          │
│                                                                │
└────────────────────────────────────────────────────────────────┘
Security principles
┌────────────────────────────────────────────────────────────────┐
│  Security design principles                                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Principle of Least Privilege                                  │
│  - Grant only the permissions a task actually needs            │
│  - Avoid binding cluster-admin                                 │
│                                                                │
│  Defense in Depth                                              │
│  - Layer multiple security controls                            │
│  - A failure in one layer must not compromise the whole        │
│                                                                │
│  Zero Trust                                                    │
│  - Trust no request by default                                 │
│  - Authenticate and authorize every source                     │
│                                                                │
│  Secure by Default                                             │
│  - Ship with safe default values                               │
│  - Prefer explicit configuration over implicit behavior        │
│                                                                │
└────────────────────────────────────────────────────────────────┘
RBAC access control
The RBAC model
┌────────────────────────────────────────────────────────────────┐
│  RBAC core concepts                                            │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Verbs      get, list, create, update, delete                  │
│  Resources  pods, services, configmaps, secrets, ...           │
│  Roles      Role, ClusterRole                                  │
│  Subjects   User, Group, ServiceAccount                        │
│  Bindings   RoleBinding, ClusterRoleBinding                    │
│                                                                │
│  verbs + resources ──▶ Role / ClusterRole                      │
│  subjects ──▶ RoleBinding / ClusterRoleBinding ──▶ role        │
│                                                                │
│  Role: permissions within one namespace                        │
│  ClusterRole: cluster-wide permissions                         │
│                                                                │
└────────────────────────────────────────────────────────────────┘
Role and RoleBinding
# namespace-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespace-reader
  namespace: production
rules:
  - apiGroups: [""]   # "" is the core API group
    resources: ["pods", "services", "configmaps"]
    verbs: ["get", "list", "watch"]
---
# namespace-reader-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: namespace-reader
  namespace: production
subjects:
  - kind: User
    name: alice@example.com
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io   # required field of roleRef
  kind: Role
  name: namespace-reader
ClusterRole and ClusterRoleBinding
# node-reader-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader
rules:
  # Read node information (for monitoring)
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
# node-reader-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-reader-binding
subjects:
  - kind: ServiceAccount
    name: monitoring-agent
    namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io   # required field of roleRef
  kind: ClusterRole
  name: node-reader
Common permission patterns
# Read-only access (auditing, monitoring)
# Caution: the wildcard also covers Secrets; list resources explicitly if that is not intended.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: readonly
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]
---
# Application developer access (deploy, scale)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Namespace administrator
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespace-admin
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
  # Note: RBAC is additive and has no deny rules, so a narrower rule cannot
  # exclude anything the wildcard above already grants. A namespaced Role also
  # never grants access to cluster-scoped resources such as clusterroles,
  # so the rule below has no practical effect.
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterroles", "clusterrolebindings"]
    verbs: ["get", "list"]
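RBAC rules only ever add permissions: a request is allowed if any rule in any bound role matches it, and no rule can subtract from another. The sketch below is a deliberately simplified Python model of that evaluation. `PolicyRule` and `is_allowed` are hypothetical names chosen for illustration, and real RBAC also considers `resourceNames`, subresources, and `nonResourceURLs`, which are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """Simplified stand-in for one entry in a Role's `rules` list."""
    api_groups: tuple
    resources: tuple
    verbs: tuple


def _matches(allowed: tuple, value: str) -> bool:
    # "*" is the RBAC wildcard; otherwise require an exact match.
    return "*" in allowed or value in allowed


def is_allowed(rules, api_group: str, resource: str, verb: str) -> bool:
    """Return True if ANY rule grants the request; there are no deny rules."""
    return any(
        _matches(r.api_groups, api_group)
        and _matches(r.resources, resource)
        and _matches(r.verbs, verb)
        for r in rules
    )


namespace_admin = [
    PolicyRule(("*",), ("*",), ("*",)),
    # A narrower rule takes nothing away from the wildcard rule above.
    PolicyRule(("rbac.authorization.k8s.io",), ("rolebindings",), ("get", "list")),
]
developer = [
    PolicyRule(("apps",), ("deployments",), ("get", "list", "watch", "update", "patch")),
]

# The wildcard still grants delete, despite the narrower second rule.
print(is_allowed(namespace_admin, "rbac.authorization.k8s.io", "rolebindings", "delete"))
# No rule grants delete on deployments, so the request is denied.
print(is_allowed(developer, "apps", "deployments", "delete"))
```

To restrict a role, remove or narrow the rule that grants too much. Against a live cluster, `kubectl auth can-i` asks the API server the same question directly.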
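Aggregated ClusterRoles
# Aggregate several ClusterRoles into one ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: aggregate-viewer
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.example.com/aggregate-to-view: "true"
rules: []   # filled in automatically by the control plane
---
# Opt in to aggregation with a label.
# Only ClusterRoles are aggregated; a namespaced Role carrying the label is ignored.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: myapp-reader
  labels:
    rbac.example.com/aggregate-to-view: "true"
rules:
  - apiGroups: ["myapp.example.com"]
    resources: ["myapps"]
    verbs: ["get", "list", "watch"]
# These rules are merged into aggregate-viewer automatically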
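Conceptually, aggregation is a label-selector union: the control plane finds every ClusterRole whose labels match one of the `clusterRoleSelectors` and copies its rules into the aggregated role. The following Python sketch models that behaviour under simplifying assumptions (only `matchLabels`, roles represented as plain dicts); `matches_labels` and `aggregate_rules` are hypothetical helper names, not part of any Kubernetes client.

```python
def matches_labels(selector: dict, labels: dict) -> bool:
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def aggregate_rules(selectors: list, cluster_roles: list) -> list:
    """Collect the rules of every ClusterRole matched by at least one selector."""
    merged = []
    for role in cluster_roles:
        labels = role.get("labels", {})
        if any(matches_labels(sel, labels) for sel in selectors):
            for rule in role["rules"]:
                if rule not in merged:  # keep the result free of duplicates
                    merged.append(rule)
    return merged


selectors = [{"rbac.example.com/aggregate-to-view": "true"}]
cluster_roles = [
    {
        "name": "myapp-reader",
        "labels": {"rbac.example.com/aggregate-to-view": "true"},
        "rules": [{"apiGroups": ["myapp.example.com"], "resources": ["myapps"],
                   "verbs": ["get", "list", "watch"]}],
    },
    {
        "name": "node-reader",  # no aggregation label, so it is skipped
        "labels": {},
        "rules": [{"apiGroups": [""], "resources": ["nodes"],
                   "verbs": ["get", "list", "watch"]}],
    },
]

aggregated = aggregate_rules(selectors, cluster_roles)
print(aggregated)
```

Because the union is recomputed whenever labelled ClusterRoles change, anything written by hand into the `rules` of an aggregated ClusterRole is overwritten.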
NetworkPolicy
Default network policies
# Deny all ingress traffic
# (a NetworkPolicy applies to the namespace it is created in)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}   # selects every Pod in the namespace
  policyTypes:
    - Ingress       # no ingress rules listed, so all ingress is denied
---
# Deny all egress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}
  policyTypes:
    - Egress
---
# Deny both ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
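The reason a default-deny policy combines cleanly with later allow policies is that NetworkPolicy is purely additive: a Pod that no policy selects accepts everything, while a selected Pod accepts only what the union of its policies allows. Here is a minimal Python model of that decision for ingress inside one namespace. The dict layout and the name `ingress_allowed` are assumptions made for this sketch; ports, namespaces, and `ipBlock` peers are ignored.

```python
def ingress_allowed(pod_labels: dict, source_labels: dict, policies: list) -> bool:
    """Decide whether a source Pod may reach a target Pod in the same namespace.

    Each policy is a dict: {"pod_selector": {...}, "allow_from": [{...}, ...]}.
    An empty selector ({}) matches every Pod, as in the NetworkPolicy API.
    """
    def selects(selector: dict, labels: dict) -> bool:
        return all(labels.get(k) == v for k, v in selector.items())

    applicable = [p for p in policies if selects(p["pod_selector"], pod_labels)]
    if not applicable:
        # No policy selects the Pod: it is not isolated and accepts everything.
        return True
    # Isolated: allowed only if some applicable policy has a matching peer.
    return any(
        selects(peer, source_labels)
        for p in applicable
        for peer in p["allow_from"]
    )


default_deny = {"pod_selector": {}, "allow_from": []}
allow_frontend = {"pod_selector": {"app": "backend"}, "allow_from": [{"app": "frontend"}]}

backend, frontend, batch = {"app": "backend"}, {"app": "frontend"}, {"app": "batch"}

print(ingress_allowed(backend, frontend, []))                              # no policy at all
print(ingress_allowed(backend, frontend, [default_deny]))                  # default deny only
print(ingress_allowed(backend, frontend, [default_deny, allow_frontend]))  # deny plus allow rule
print(ingress_allowed(backend, batch, [default_deny, allow_frontend]))     # unlisted source
```

Only the first and third calls return True, which is the behaviour the default-deny pattern relies on: start closed, then open individual paths with additional policies.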
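Microservice network policies
# frontend-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic from the ingress controller's namespace
    # (kubernetes.io/metadata.name is set automatically on every namespace)
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
  # A mapping may hold only one `egress` key, so all egress rules share one list
  egress:
    # Allow access to backend
    - to:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 8080
    # Allow DNS (UDP, plus TCP for large responses)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53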
# backend-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic from frontend
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  # A mapping may hold only one `egress` key, so all egress rules share one list
  egress:
    # Allow access to the database
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - protocol: TCP
          port: 5432
    # Allow access to Redis
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    # Allow DNS (UDP, plus TCP for large responses)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
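Namespace isolation
# namespace-isolation.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: namespace-isolation
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic only from Pods in the same namespace.
    # (An empty namespaceSelector would match every namespace instead.)
    - from:
        - podSelector: {}
  # A mapping may hold only one `egress` key, so all egress rules share one list
  egress:
    # Allow traffic to Pods in the same namespace
    - to:
        - podSelector: {}
    # Allow DNS
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
    # Allow selected IP ranges (allowlist); these private CIDRs are examples
    - to:
        - ipBlock:
            cidr: 10.0.0.0/8
        - ipBlock:
            cidr: 192.168.0.0/16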
Pod security
Pod Security Standards (PSS)
# baseline-policy.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/warn-version: latest
---
# restricted-policy.yaml (stricter)
apiVersion: v1
kind: Namespace
metadata:
  name: secure-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.29
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
Pod Security Context
# secure-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true        # refuse to start as root
    runAsUser: 1000           # run as this UID
    runAsGroup: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault    # use the container runtime's default seccomp profile
    supplementalGroups:
      - 1000
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        allowPrivilegeEscalation: false   # disallow privilege escalation
        readOnlyRootFilesystem: true      # read-only root filesystem
        capabilities:
          drop:
            - ALL                         # drop all Linux capabilities
        seccompProfile:
          type: RuntimeDefault
      resources:
        limits:
          memory: "256Mi"
          cpu: "500m"
        requests:
          memory: "128Mi"
          cpu: "100m"
SecurityContext comparison
┌────────────────────────────────────────────────────────────────┐
│  Pod vs. container securityContext                             │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Pod securityContext (pod level)                               │
│  ├── runAsUser / runAsGroup / fsGroup                          │
│  ├── supplementalGroups                                        │
│  ├── seccompProfile                                            │
│  └── sysctls                                                   │
│                                                                │
│  Container securityContext (container level)                   │
│  ├── runAsUser (overrides the pod-level value)                 │
│  ├── capabilities                                              │
│  ├── allowPrivilegeEscalation                                  │
│  ├── readOnlyRootFilesystem                                    │
│  └── seccompProfile (overrides the pod-level value)            │
│                                                                │
│  Precedence: container > pod (for fields set at both levels)   │
│                                                                │
└────────────────────────────────────────────────────────────────┘
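To make the precedence rule concrete, the sketch below resolves the effective value of fields that can be set at both levels and then checks a Pod spec against a small subset of the `restricted` Pod Security Standard. It is an illustration only: `restricted_violations` is a hypothetical helper, the Pod spec is a plain dict, and the real profile has further requirements (volume types, init and ephemeral containers, allowed capability additions) that are not covered.

```python
def restricted_violations(pod_spec: dict) -> list:
    """Return human-readable violations for a subset of the `restricted` profile."""
    problems = []
    pod_sc = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        sc = container.get("securityContext", {})
        name = container.get("name", "<unnamed>")
        # Container-level values take precedence over pod-level ones.
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        seccomp = sc.get("seccompProfile", pod_sc.get("seccompProfile", {}))
        if run_as_non_root is not True:
            problems.append(f"{name}: runAsNonRoot must be true")
        if sc.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            problems.append(f"{name}: capabilities.drop must include ALL")
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            problems.append(f"{name}: seccompProfile.type must be RuntimeDefault or Localhost")
    return problems


secure_pod = {
    "securityContext": {"runAsNonRoot": True, "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [{
        "name": "app",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
    }],
}
insecure_pod = {"containers": [{"name": "app"}]}

print(restricted_violations(secure_pod))    # empty list: nothing to report
for line in restricted_violations(insecure_pod):
    print(line)
```

In a cluster, Pod Security Admission performs the authoritative check. A local pre-check like this is only useful for catching obvious problems before a manifest is applied.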
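Image security
Image scanning
# Install Trivy (Homebrew)
brew install aquasecurity/trivy/trivy
# Scan an image for vulnerabilities
trivy image myapp:1.0
# Scan a project directory (for example in a CI/CD pipeline) for vulnerabilities and misconfigurations
# (older Trivy releases spelled this --security-checks vuln,config)
trivy fs --scanners vuln,misconfig /path/to/project
# Filter by severity
trivy image --severity HIGH,CRITICAL myapp:1.0
# Write a JSON report
trivy image --format json --output report.json myapp:1.0
# Report only vulnerabilities that already have a fix available
trivy image --ignore-unfixed myapp:1.0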
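Secure image policy
# Secure Deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      containers:
        - name: app
          image: myapp:1.0
          imagePullPolicy: Always   # re-resolve the tag every time a container starts
          # A tag is mutable, so imagePullPolicy alone does not pin the content.
          # Recommended: reference the image by SHA digest instead of a tag
          # image: myapp@sha256:abc123...
      # Credentials for pulling from a private registry.
      # This does not restrict which registries may be used.
      imagePullSecrets:
        - name: my-registry-secret
---
# Allowlist of image repositories.
# A ConfigMap enforces nothing by itself; an admission controller or
# policy engine has to read it and reject images from other repositories.
apiVersion: v1
kind: ConfigMap
metadata:
  name: allowed-images
  namespace: production
data:
  allowed-repositories.yaml: |
    allowed:
      - myregistry.com/*
      - docker.io/bitnami/*
      - gcr.io/distroless/*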
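The two image rules from this section, pin by digest and pull only from approved repositories, are straightforward to express as a check. The Python sketch below shows the kind of logic an admission webhook or CI step might apply. `check_image` is a hypothetical function, the patterns are the ones from the allowlist above, and short names such as `myapp` are not expanded to `docker.io/library/myapp` the way a container runtime would.

```python
from fnmatch import fnmatchcase

# Patterns taken from the allowlist above.
ALLOWED = ["myregistry.com/*", "docker.io/bitnami/*", "gcr.io/distroless/*"]


def check_image(image: str, allowed=ALLOWED) -> list:
    """Return a list of policy problems for one image reference."""
    problems = []
    if "@sha256:" not in image:
        problems.append("not pinned by digest")
    # Compare everything before the digest against the allowlist patterns.
    repository = image.split("@", 1)[0]
    if not any(fnmatchcase(repository, pattern) for pattern in allowed):
        problems.append("repository not in allowlist")
    return problems


# Placeholder digest for illustration only; it does not identify a real image.
placeholder_digest = "sha256:" + "a" * 64

print(check_image("myregistry.com/team/app@" + placeholder_digest))
print(check_image("myregistry.com/team/app:1.0"))
print(check_image("myapp:1.0"))
```

The first reference passes both checks, the second is in an allowed repository but uses a mutable tag, and the third fails both.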
Security context example
# production-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: myapp:1.0
          imagePullPolicy: Always
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp   # writable scratch space; the root filesystem is read-only
      volumes:
        - name: tmp
          emptyDir: {}
      # Do not share host namespaces with the Pod (false is the default; stated explicitly here)
      hostPID: false
      hostNetwork: false
      hostIPC: false
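Secret security
Managing Secrets
# Encrypt Secrets with SOPS
# .sops.yaml
creation_rules:
  - path_regex: secrets/.*
    encrypted_regex: ^(data|stringData)$   # encrypt only the secret values, keep metadata readable
    age: <public-key>
---
# Encrypted Secret as stored in Git.
# Decrypt it (for example with `sops --decrypt`) before applying; the API server cannot read this form.
apiVersion: v1
kind: Secret
metadata:
  name: encrypted-db-credentials
  namespace: production
  annotations:
    sops: "true"
data:
  password: ENC[AES256_GCM,...]
sops:
  kms: []
  gcp_kms: []
  azure_kv: []
  age:
    - recipient: <public-key>
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        ...
        -----END AGE ENCRYPTED FILE-----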
External secret management
# External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: vault-backend
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: username
      remoteRef:
        key: secret/data/db
        property: username
    - secretKey: password
      remoteRef:
        key: secret/data/db
        property: password
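Audit logging
Audit policy
# audit-policy.yaml
# Rules are evaluated in order and the first match decides the level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Skip noisy kube-proxy watch requests on endpoints
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: ""
        resources: ["endpoints"]
  # Secrets: metadata only, so secret values never land in the audit log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # ConfigMap changes: log request and response bodies
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["configmaps"]
  # Workload changes in selected namespaces: log request and response bodies.
  # Placed before the general Metadata rule below, which would otherwise match first.
  - level: RequestResponse
    namespaces: ["production", "staging"]
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "apps"
        resources: ["deployments", "statefulsets"]
  # Remaining requests on these resources: metadata only
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods", "services"]
      - group: "apps"
        resources: ["deployments"]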
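Audit rules are evaluated top to bottom and the first matching rule sets the level, so a specific rule only takes effect if it appears before a more general one that matches the same request. The Python sketch below models just that ordering behaviour. `audit_level` is a hypothetical helper, rules and events are plain dicts, and API groups, users, and wildcards are left out.

```python
def audit_level(rules: list, event: dict) -> str:
    """Return the level of the first rule matching the event ("None" if no rule matches)."""
    for rule in rules:
        # A rule attribute that is absent places no constraint on the event.
        if "namespaces" in rule and event["namespace"] not in rule["namespaces"]:
            continue
        if "verbs" in rule and event["verb"] not in rule["verbs"]:
            continue
        if "resources" in rule and event["resource"] not in rule["resources"]:
            continue
        return rule["level"]
    return "None"


general_rule = {"level": "Metadata", "resources": ["pods", "services", "deployments"]}
specific_rule = {
    "level": "RequestResponse",
    "namespaces": ["production", "staging"],
    "resources": ["deployments", "statefulsets"],
    "verbs": ["create", "update", "patch", "delete"],
}
event = {"namespace": "production", "resource": "deployments", "verb": "update"}

# General rule first: it shadows the specific rule.
print(audit_level([general_rule, specific_rule], event))
# Specific rule first: the detailed level applies.
print(audit_level([specific_rule, general_rule], event))
```

The first call yields `Metadata` and the second `RequestResponse`, even though both policies contain exactly the same two rules.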
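Audit configuration
# kube-apiserver flags
# --audit-policy-file=/etc/kubernetes/audit-policy.yaml
# --audit-log-path=/var/log/kubernetes/audit.log
# --audit-log-maxage=30
# --audit-log-maxbackup=10
# --audit-log-maxsize=100
# kube-apiserver reads the policy from a file on the control-plane node, not from a ConfigMap.
# A ConfigMap like this one only helps if your tooling mounts or copies it to that path.
apiVersion: v1
kind: ConfigMap
metadata:
  name: audit-policy
  namespace: kube-system
data:
  audit-policy.yaml: |
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
      # Metadata level keeps secret values out of the audit log
      - level: Metadata
        resources:
          - group: ""
            resources: ["secrets"]
        verbs: ["create", "update", "patch", "delete"]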
Security tool integration
OPA Gatekeeper
# Disallow privilege escalation in containers
# (assumes the matching ConstraintTemplate from the Gatekeeper policy library is installed)
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPAllowPrivilegeEscalationContainer
metadata:
  name: psp-allow-no-privilege-escalation
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    exemptImages:
      - docker.io/library/*
---
# Require a read-only root filesystem
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPReadOnlyRootFilesystem
metadata:
  name: psp-readonly-root-filesystem
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    exemptImages:
      - docker.io/library/*
Falco runtime monitoring
# Falco configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: falco
data:
  falco.yaml: |
    log_level: info
    json_output: true   # program_output below pipes JSON events to jq
    program_output:
      enabled: true
      keep_alive: false
      program: >-
        jq '{utc: .time, container: .output_fields["container.image.repository"], command: .output_fields["proc.cmdline"]}'
  falco_rules.yaml: |
    # The macros spawned_process, container, and container_started and the list
    # shell_binaries are assumed to come from Falco's default rules file.
    - rule: Terminal shell in container
      desc: A shell was spawned in a container
      condition: >
        spawned_process and
        container and
        proc.name in (shell_binaries)
      output: >
        Terminal shell in container
        (user=%user.name container=%container.image.repository
        cmd=%proc.cmdline)
      priority: NOTICE
    - rule: Privileged container
      desc: A privileged container was started
      condition: >
        container_started and
        container and
        container.privileged=true
      output: >
        Privileged container started
        (user=%user.name container=%container.image.repository
        pod=%k8s.pod.name)
      priority: WARNING
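FAQ and pitfalls
Q1: Insufficient RBAC permissions?
# Troubleshooting steps
# 1. Check the current identity
kubectl auth whoami
# 2. List what a user is allowed to do
kubectl auth can-i --list --as=user@example.com
# 3. Inspect the bindings
kubectl get rolebindings -n production
kubectl get clusterrolebindings
# 4. Simulate a permission check for a service account
kubectl auth can-i get pods --as=system:serviceaccount:default:sa-name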
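Q2: NetworkPolicy has no effect?
# Troubleshooting steps
# 1. Check that the CNI plugin enforces NetworkPolicy.
# There is no `cni` resource type; look for the plugin's Pods instead (usually in kube-system)
kubectl get pods -n kube-system -o wide
# NetworkPolicy needs a CNI plugin that implements it (e.g. Calico, Cilium, Weave Net);
# without one, policies are accepted by the API server but not enforced
# 2. Check that the policy exists
kubectl get networkpolicy -n production
# 3. Check the Pod selector
kubectl describe networkpolicy my-policy -n production
# 4. Check which Pods are selected
kubectl get pods -n production -l app=myapp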
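Q3: How should image vulnerabilities be handled?
# 1. Scan the image
trivy image --severity HIGH,CRITICAL myapp:1.0
# 2. Rebuild on an updated base image (`-f -` reads the Dockerfile from stdin; `.` is the build context)
docker pull alpine:3.19
docker build -t myapp:1.1 -f - . <<EOF
FROM alpine:3.19
COPY app /app
RUN apk add --no-cache ca-certificates
CMD ["/app"]
EOF
# 3. Scan regularly in CI
# Add a Trivy scan step to the CI pipeline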
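A CI gate on scan results can be as small as reading the JSON report written by `trivy image --format json --output report.json ...` and failing the job when blocking findings are present. The Python sketch below assumes the report layout of a top-level `Results` list whose entries may carry a `Vulnerabilities` list. `blocking_vulnerabilities` is a hypothetical helper, and the sample report uses made-up identifiers rather than real CVEs.

```python
def blocking_vulnerabilities(report: dict, severities=("HIGH", "CRITICAL")) -> list:
    """Return (target, vulnerability id, severity) for every finding at a blocking severity."""
    found = []
    for result in report.get("Results") or []:
        # Targets without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities:
                found.append((result.get("Target"), vuln.get("VulnerabilityID"), vuln.get("Severity")))
    return found


# Made-up sample data shaped like a Trivy JSON report; the IDs are placeholders.
sample_report = {
    "Results": [
        {
            "Target": "myapp:1.0 (alpine 3.19)",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample", "Severity": "CRITICAL"},
                {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libexample", "Severity": "LOW"},
            ],
        },
        {"Target": "app/requirements.txt"},
    ]
}

findings = blocking_vulnerabilities(sample_report)
for target, vuln_id, severity in findings:
    print(f"{severity}: {vuln_id} in {target}")
```

In a pipeline, the report would be loaded with `json.load` and the job would exit non-zero when `findings` is not empty. Trivy can also do this on its own: `--exit-code 1` together with `--severity HIGH,CRITICAL` fails the step directly.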
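Q4: How do I audit changes to the cluster?
# Enable audit logging
# kube-apiserver flags
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log
--audit-log-maxage=30
--audit-log-maxbackup=10
# Monitor sensitive operations with Falco.
# This rule reads Kubernetes audit events, so it assumes the k8saudit plugin is configured.
- rule: Modify Kubernetes Secrets or ConfigMaps
  desc: Attempt to modify Kubernetes Secrets or ConfigMaps
  condition: >
    ka.verb in (create, update, patch, delete) and
    ka.target.resource in (secrets, configmaps)
  output: >
    Sensitive resource modified
    (user=%ka.user.name verb=%ka.verb resource=%ka.target.resource
    name=%ka.target.name ns=%ka.target.namespace)
  priority: WARNING
  source: k8s_audit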
Summary
┌────────────────────────────────────────────────────────────────┐
│  Key takeaways                                                 │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Security principles                                           │
│  ├── Least privilege                                           │
│  ├── Defense in depth                                          │
│  └── Zero trust                                                │
│                                                                │
│  RBAC                                                          │
│  ├── Role / RoleBinding: namespace scope                       │
│  ├── ClusterRole / ClusterRoleBinding: cluster scope           │
│  └── Aggregation rules                                         │
│                                                                │
│  NetworkPolicy                                                 │
│  ├── Deny all traffic by default                               │
│  ├── Open only the minimum required paths                      │
│  └── Supports namespace isolation                              │
│                                                                │
│  Pod security                                                  │
│  ├── Pod Security Standards (baseline / restricted)            │
│  ├── securityContext configuration                             │
│  └── Run as non-root, disallow privilege escalation            │
│                                                                │
│  Image security                                                │
│  ├── Scan for vulnerabilities regularly                        │
│  ├── Use minimal base images                                   │
│  └── Pin images by SHA digest rather than tag                  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
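Questions to consider
- How would you design a least-privilege RBAC policy?
- How would you implement network isolation in a multi-tenant cluster?
- How would you build a thorough image security scanning workflow?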
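Coming up next
The next article covers multi-tenant cluster management, including:
- Namespace isolation
- ResourceQuota and LimitRange
- Cluster federation
- Cost management
Stay tuned!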