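In the cloud-native era, Kubernetes has solved the Day 1 operation problems of the infrastructure platform. Day 2 operation covers runtime work such as monitoring, maintenance, and troubleshooting, and within it cloud-native security is drawing more and more attention. Today's protagonist, Falco, exists to safeguard cloud-native runtime security.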
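Falco is an open-source cloud-native runtime security project and currently the de facto standard threat detection engine for Kubernetes. It detects unexpected application behavior and raises alerts on threats at runtime. Falco is also the first runtime security project to join the Cloud Native Computing Foundation (CNCF) at the incubating level, so it is a well-backed project with a promising future.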
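Deploying Falco on Kubernetes is simple: install it with a single helm command, or deploy it with a native DaemonSet yaml. Because Falco monitors system calls at the kernel level, check before installing that the kernel module requirements are met on your cluster nodes; see the official Chinese documentation for details.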
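A recent official blog post mentioned that the latest minikube release, 1.8, ships with the kernel module Falco needs, so installing Falco takes just one helm install command.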
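For well-known reasons, a plain minikube start downloads its artifacts slowly. I have prepared the already-downloaded artifacts as a cache; put them in the appropriate file locations to speed up minikube start (the rest of this post is based on hyperkit virtualization on macOS).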
The two files above can be downloaded here:
Link: https://pan.baidu.com/s/1Orr6pMri2E6L9mrTTpqe-Q Extraction code: 1cj4
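You can then start minikube with the command below. In case the pre-downloaded artifacts do not take effect, it also configures mirrors for fast downloads inside China: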
# --registry-mirror may not take effect; you may need to configure it manually
minikube start --v=7 \
--image-mirror-country='cn' \
--registry-mirror=http://f1361db2.m.daocloud.io \
--image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers
One command does it:
# If you have not added the official helm stable repository yet, you need one more command first:
# helm repo add stable https://kubernetes-charts.storage.googleapis.com/
helm install falco stable/falco
If you want to see what Falco's helm chart actually looks like, you can export it to a file with helm install falco stable/falco --dry-run > falco-helm-chart.yaml and inspect it in full.
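Once the chart is exported, a quick way to get oriented is to list which resource kinds it would create. The snippet below is only a sketch: chart_kinds is a hypothetical helper, and the tiny manifest it runs on is a made-up stand-in for falco-helm-chart.yaml, not the real chart content.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct top-level resource kinds in a
# rendered manifest file. Nested "kind:" keys are indented, so anchoring
# the pattern at the start of the line skips them.
chart_kinds() {
    grep '^kind:' "$1" | sort -u
}

# Made-up stand-in for falco-helm-chart.yaml (assumption, not the real chart).
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
---
apiVersion: v1
kind: ConfigMap
---
apiVersion: apps/v1
kind: DaemonSet
---
apiVersion: v1
kind: ConfigMap
EOF

chart_kinds "$manifest"
```

Against the real export you would simply run chart_kinds falco-helm-chart.yaml.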
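Once the deployment finishes, check the logs of the Falco pod. If you see output like the following, Falco has started successfully. The dkms error in the middle is not fatal here: the falco-probe module already present on the node is found and loaded with modprobe right after it.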
> kubectl logs -l app=falco
* Unloading falco-probe, if present
* Running dkms install for falco
Error! echo
Your kernel headers for kernel 4.19.94 cannot be found at
/lib/modules/4.19.94/build or /lib/modules/4.19.94/source.
* Running dkms build failed, couldn't find /var/lib/dkms/falco/0.20.0+d77080a/build/make.log
* Trying to load a system falco-probe, if present
falco-probe found and loaded with modprobe
Thu Mar 12 12:07:19 2020: Falco initialized with configuration file /etc/falco/falco.yaml
Thu Mar 12 12:07:19 2020: Loading rules from file /etc/falco/falco_rules.yaml:
Thu Mar 12 12:07:20 2020: Loading rules from file /etc/falco/falco_rules.local.yaml:
Thu Mar 12 12:07:21 2020: Starting internal webserver, listening on port 8765
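If you would rather not eyeball the log, the two lines that matter can be checked mechanically. This is a minimal sketch, assuming the log wording stays the same as in the excerpt above; falco_started is a hypothetical helper name, and the sample text is copied from that excerpt.

```shell
#!/bin/sh
# Hypothetical helper: reads Falco pod logs on stdin and succeeds only if
# the kernel probe was loaded AND Falco finished initializing.
falco_started() {
    logs=$(cat)
    printf '%s\n' "$logs" | grep -q 'falco-probe found and loaded' &&
        printf '%s\n' "$logs" | grep -q 'Falco initialized with configuration file'
}

# Two lines taken from the log excerpt above. In a live cluster you would
# pipe the output of: kubectl logs -l app=falco
sample='falco-probe found and loaded with modprobe
Thu Mar 12 12:07:19 2020: Falco initialized with configuration file /etc/falco/falco.yaml'

if printf '%s\n' "$sample" | falco_started; then
    echo "falco is up"
else
    echo "falco not ready"
fi
```

Requiring both lines is deliberate: the dkms error alone says nothing about whether the probe ended up loaded.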
Falco ships with many default monitoring rules; see /etc/falco/falco_rules.yaml in the Falco pod for the full set. Here are a few examples in action:
Running kubectl exec to enter a pod/container triggers the following rule:
- rule: Terminal shell in container
  desc: A shell was used as the entrypoint/exec point into a container with an attached terminal.
  condition: >
    spawned_process and container
    and shell_procs and proc.tty != 0
    and container_entrypoint
  output: >
    A shell was spawned in a container with an attached terminal (user=%user.name %container.info
    shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty container_id=%container.id image=%container.image.repository)
  priority: NOTICE
  tags: [container, shell, mitre_execution]
Monitoring output:
Notice A shell was spawned in a container with an attached terminal (user=root k8s.ns=default k8s.pod=falco-rw8wg container=b915a438710d shell=sh parent=runc cmdline=sh terminal=34816 container_id=b915a438710d image=<NA>) k8s.ns=default k8s.pod=falco-rw8wg container=b915a438710d
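Each alert is a single line of key=value pairs, so pulling out one field takes only a few standard tools. The sketch below assumes the line format shown above; alert_field is a hypothetical helper, and it matches by substring, so values containing spaces (a longer cmdline, for example) would be cut at the first space.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first key=value pair whose
# key matches $1. The alert line is read from stdin.
alert_field() {
    grep -o "$1=[^ )]*" | head -n 1 | cut -d= -f2-
}

# The alert line from the monitoring output above.
alert='Notice A shell was spawned in a container with an attached terminal (user=root k8s.ns=default k8s.pod=falco-rw8wg container=b915a438710d shell=sh parent=runc cmdline=sh terminal=34816 container_id=b915a438710d image=<NA>) k8s.ns=default k8s.pod=falco-rw8wg container=b915a438710d'

printf '%s\n' "$alert" | alert_field k8s.pod   # prints falco-rw8wg
printf '%s\n' "$alert" | alert_field user      # prints root
```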
Running wget sth inside a container triggers:
- rule: Write below root
  desc: an attempt to write to any file directly below / or /root
  condition: >
    root_dir and evt.dir = < and open_write
    and not fd.name in (known_root_files)
    and not fd.directory in (known_root_directories)
    and not exe_running_docker_save
    and not gugent_writing_guestagent_log
    and not dse_writing_tmp
    and not zap_writing_state
    and not airflow_writing_state
    and not rpm_writing_root_rpmdb
    and not maven_writing_groovy
    and not chef_writing_conf
    and not kubectl_writing_state
    and not cassandra_writing_state
    and not galley_writing_state
    and not calico_writing_state
    and not rancher_writing_root
    and not runc_writing_exec_fifo
    and not known_root_conditions
    and not user_known_write_root_conditions
    and not user_known_write_below_root_activities
  output: "File below / or /root opened for writing (user=%user.name command=%proc.cmdline parent=%proc.pname file=%fd.name program=%proc.name container_id=%container.id image=%container.image.repository)"
  priority: ERROR
  tags: [filesystem, mitre_persistence]
Monitoring output:
Error File below / or /root opened for writing (user=root command=wget http://www.baidu.com parent=sh file=/index.html program=wget container_id=ffbf070885d5 image=<NA>) k8s.ns=default k8s.pod=testbox container=ffbf070885d5 k8s.ns=default k8s.pod=testbox container=ffbf070885d5
Installing a package inside a container, as the apk add command in the output below does, triggers:
# Container is supposed to be immutable. Package management should be done in building the image.
- rule: Launch Package Management Process in Container
  desc: Package management process ran inside container
  condition: >
    spawned_process
    and container
    and user.name != "_apt"
    and package_mgmt_procs
    and not package_mgmt_ancestor_procs
    and not user_known_package_manager_in_container
  output: >
    Package management process launched in container (user=%user.name
    command=%proc.cmdline container_id=%container.id container_name=%container.name image=%container.image.repository:%container.image.tag)
  priority: ERROR
  tags: [process, mitre_persistence]
Monitoring output:
Error Package management process launched in container (user=root command=apk add --no-cache mysql-client container_id=cc1cdcea736c container_name=<NA> image=<NA>:<NA>) k8s.ns=default k8s.pod=testbox container=cc1cdcea736c k8s.ns=default k8s.pod=testbox container=cc1cdcea736c
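With several rules firing, a per-priority tally gives a quick overview of what happened. This is a rough sketch, assuming each alert line starts with its priority as in the outputs above; count_by_priority is a hypothetical helper, and the sample lines are the three alerts from this post with their field lists trimmed for brevity.

```shell
#!/bin/sh
# Hypothetical helper: count alert lines on stdin by their first word,
# which in the outputs above is the rule priority.
count_by_priority() {
    awk '{ n[$1]++ } END { for (p in n) print p, n[p] }' | sort
}

# The three alerts shown in this post, with the field lists trimmed.
alerts='Notice A shell was spawned in a container with an attached terminal
Error File below / or /root opened for writing
Error Package management process launched in container'

printf '%s\n' "$alerts" | count_by_priority
# prints:
# Error 2
# Notice 1
```

If your Falco output prefixes each line with a timestamp, the awk field index would need to change accordingly.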
Pretty cool, isn't it? Go try out the power of this cloud-native security bodyguard yourself!
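Below are the Falco talks from the CNCF cloud-native conferences of the last two years. Both are given by experts from Sysdig and cover how Falco works and where to use it. They are excellent material for getting to know Falco, so don't miss them.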
CNCF 2018:
CNCF 2019:
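The video below deserves a special mention: Cloud Native Runtime Security with Falco – Kris Nova, Sysdig & Abhinav Srivastava, Frame.io. Kris is a very cool technologist, and her slides are distinctive too: they are presented from the command line.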
Next time we will look at how to send these alerts out to build a real-time alerting mechanism. Stay tuned.