Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure.
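Features:
Other characteristics:
Summary: Kubernetes schedules, manages, and scales applications (Deployment/DaemonSet/StatefulSet/Job, health checks, auto-scaling, rolling updates); provides a platform for running them (logging, monitoring, service discovery, load balancing, authentication and authorization); and manages, controls, and allocates platform resources (memory, CPU, network, storage, images).
Let's look at the definition of an operating system.
An operating system (OS) is the collection of programs that controls and manages the hardware and software resources of an entire computer system, organizes and schedules the computer's work and the allocation of resources, and provides convenient interfaces and environments to users and other software. Kubernetes is a distributed operating system in exactly this sense: it manages the software and hardware resources of a cluster of computers, organizes the scheduling of programs (containers) and the allocation of resources, and provides convenient interfaces and environments to users and other software.
Most concepts from a single-machine operating system already have, or are getting, a counterpart in k8s. For example, systemctl has a reload operation; k8s does not have one yet, but it is being worked on.
The following passage, on what Kubernetes is not, is interesting and well worth reading; much of it is work that Kubernetes distributors need to consider and complete.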
- Does not provide middleware (e.g., message buses), data-processing frameworks (for example, Spark), databases (e.g., mysql), nor cluster storage systems (e.g., Ceph) as built-in services. Such applications run on Kubernetes.
- Does not have a click-to-deploy service marketplace.
- Does not deploy source code and does not build applications. Continuous Integration (CI) workflow is an area where different users and projects have their own requirements and preferences, so it supports layering CI workflows on Kubernetes but doesn't dictate how layering should work.
- Lets users choose their own logging, monitoring, and alerting systems. (It provides some integrations as proof of concept.)
- Does not provide nor mandate a comprehensive application configuration language/system (for example, jsonnet).
- Does not provide nor adopt any comprehensive machine configuration, maintenance, management, or self-healing systems.
Role | Component | Description |
---|---|---|
Master Components | kube-apiserver | Exposes the Kubernetes API; it is the front end for the Kubernetes control plane. |
Master Components | etcd | Kubernetes' backing store; all cluster data is stored here. |
Master Components | kube-controller-manager | A single binary that includes: (1) Node Controller: notices and responds when nodes go down; (2) Replication Controller: maintains the correct number of pods for every Replication Controller object; (3) Endpoints Controller: populates the Endpoints object (i.e., joins Services and Pods); (4) Service Account & Token Controllers: create default accounts and API access tokens for namespaces; (5) others. |
Master Components | cloud-controller-manager | A binary that runs the controllers which interact with cloud providers, including: (1) Node Controller: checks the cloud provider to determine whether a node has been deleted in the cloud after it stops responding; (2) Route Controller: sets up routes in the underlying cloud infrastructure; (3) Service Controller: creates, updates, and deletes cloud provider load balancers; (4) Volume Controller: creates, attaches, and mounts volumes, and interacts with the cloud provider to orchestrate volumes. |
Master Components | kube-scheduler | Watches newly created pods that have no node assigned, and selects a node for them to run on. |
Master Components | addons | Addons are pods and services that implement cluster features, e.g. DNS (Cluster DNS is a DNS server, in addition to the other DNS server(s) in your environment, which serves DNS records for Kubernetes services), user interface, container resource monitoring, and cluster-level logging. |
Node components | kubelet | The primary node agent. Main functions: (1) watches for pods that have been assigned to its node (either by the apiserver or via a local configuration file); (2) mounts the pod's required volumes; (3) downloads the pod's secrets; (4) runs the pod's containers via docker (or, experimentally, rkt); (5) periodically executes any requested container liveness probes; (6) reports the status of the pod back to the rest of the system, by creating a "mirror pod" if necessary; (7) reports the status of the node back to the rest of the system. |
Node components | kube-proxy | Enables the Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding. |
Node components | docker/rkt | Actually runs the containers. |
Node components | supervisord | A lightweight process babysitting system for keeping kubelet and docker running. |
Node components | fluentd | A daemon which helps provide cluster-level logging. |
Classification
Category | Names |
---|---|
Resource objects | Pod, ReplicaSet, ReplicationController, Deployment, StatefulSet, DaemonSet, Job, CronJob, HorizontalPodAutoscaling |
Configuration objects | Node, Namespace, Service, Secret, ConfigMap, Ingress, Label, ThirdPartyResource, ServiceAccount |
Storage objects | Volume, Persistent Volume |
Policy objects | SecurityContext, ResourceQuota, LimitRange |
Kubernetes Objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe:
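Kubernetes Objects describe the desired state, so the system is state-driven.
Kubernetes objects are the applications, resources, and policies.
Every object has two nested fields, Object Spec and Object Status. Object Spec describes the desired state and Object Status describes the current state; the system works to make Object Status match Object Spec.
The job of the Kubernetes Control Plane is to make each object's actual state match its desired state.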
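The spec/status relationship above can be pictured as a small reconcile loop. The following is only a toy sketch: `ToyObject`, `reconcile`, and `run_until_converged` are hypothetical names invented here, and a real controller works through the API server rather than on an in-memory object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyObject:
    """Hypothetical stand-in for a Kubernetes object with spec and status."""
    spec_replicas: int    # desired state, written by the user
    status_replicas: int  # observed state, written by the controller


def reconcile(obj: ToyObject) -> str:
    """Take one step that moves status toward spec and report the action."""
    if obj.status_replicas < obj.spec_replicas:
        obj.status_replicas += 1
        return "create"
    if obj.status_replicas > obj.spec_replicas:
        obj.status_replicas -= 1
        return "delete"
    return "noop"


def run_until_converged(obj: ToyObject, max_steps: int = 100) -> List[str]:
    """Call reconcile repeatedly until the actual state matches the desired state."""
    actions = []
    for _ in range(max_steps):
        action = reconcile(obj)
        if action == "noop":
            break
        actions.append(action)
    return actions
```

Calling `run_until_converged(ToyObject(spec_replicas=3, status_replicas=1))` performs two "create" steps and then stops, which is the state-driven behavior described above in miniature.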
(Omitted.)
Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but which do not directly imply semantics to the core system.
Labels are not unique: many objects can carry the same label.
Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
The API currently supports two types of selectors: equality-based (e.g. environment = production) and set-based (e.g. environment in (production, qa)).
For examples, see https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
Labels can be used for LIST and WATCH filtering, and for set references in API objects.
Some Kubernetes objects, such as services and replicationcontrollers, also use label selectors to specify sets of other resources, such as pods. However, they only support equality-based requirement selectors:
"selector": {
    "component": "redis"
}
Newer resources, such as Job, Deployment, Replica Set, and Daemon Set, support set-based requirements as well.
selector:
  matchLabels:
    component: redis
  matchExpressions:
    - {key: tier, operator: In, values: [cache]}
    - {key: environment, operator: NotIn, values: [dev]}
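To make the two selector styles concrete, here is a minimal sketch of how a selector with matchLabels and matchExpressions could be evaluated against an object's labels. The function name `selector_matches` is hypothetical and this is not the API server's code; it assumes the four operators In, NotIn, Exists, and DoesNotExist, and that all requirements are ANDed together.

```python
from typing import Dict, List


def selector_matches(labels: Dict[str, str],
                     match_labels: Dict[str, str],
                     match_expressions: List[dict]) -> bool:
    """Return True if the labels satisfy every requirement of the selector."""
    # Equality-based part: every key/value pair must be present.
    for key, value in match_labels.items():
        if labels.get(key) != value:
            return False
    # Set-based part: every expression must hold.
    for expr in match_expressions:
        key = expr["key"]
        operator = expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            if key not in labels or labels[key] not in values:
                return False
        elif operator == "NotIn":
            # A missing key satisfies NotIn.
            if key in labels and labels[key] in values:
                return False
        elif operator == "Exists":
            if key not in labels:
                return False
        elif operator == "DoesNotExist":
            if key in labels:
                return False
        else:
            raise ValueError(f"unknown operator: {operator}")
    return True
```

With the selector shown above, a pod labeled `component=redis, tier=cache, environment=prod` matches, while the same pod with `environment=dev` does not.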
Another use case is using labels to select nodes.
Annotations are for attaching metadata to objects.
They differ from labels:
You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels.
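Complete API details are documented using Swagger v1.2 and OpenAPI (that is, Swagger 2.0).
API versions appear in the path, e.g. /api/v1, and are classified by stability as stable (v1), alpha (v1alpha1), or beta (v2beta3).
API groups exist to make it easier to extend the Kubernetes API.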
Currently there are several API groups in use:
- The core group (oftentimes called "legacy", due to not having an explicit group name), which is at REST path /api/v1 and is not specified as part of the apiVersion field, e.g. apiVersion: v1.
There are currently two ways to extend the API: CustomResourceDefinition and kube-aggregator.
An API group can be enabled or disabled when the apiserver starts, for example:
--runtime-config=extensions/v1beta1/deployments=false,extensions/v1beta1/ingress=false
This part comes from https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md
All JSON objects returned by an API MUST have the following fields:
Object content | Description |
---|---|
Metadata | MUST: namespace, name, uid; SHOULD: resourceVersion, generation, creationTimestamp, deletionTimestamp, labels, annotations |
Spec and Status | Status (current state) is driven toward Spec (desired state); a /status subresource MUST be provided to enable system components to update statuses of resources they manage; Status is often expressed as Conditions |
References to related objects | ObjectReference type |
PATCH is special: it supports three kinds of patch (JSON Patch, Merge Patch, and Strategic Merge Patch).
All compatible Kubernetes APIs MUST support "name idempotency" and respond with an HTTP status code 409 ("Conflict") when a POST names an object that already exists.
Optional fields have the following properties:
Mark them with +optional rather than omitempty.
Use resourceVersion for concurrency control.
All Kubernetes resources have a "resourceVersion" field as part of their metadata.
Kubernetes leverages the concept of resource versions to achieve optimistic concurrency.
The resourceVersion is changed by the server every time an object is modified.
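As an illustration of optimistic concurrency, the toy store below rejects an update whose resourceVersion is stale. `ToyStore` and `ConflictError` are hypothetical names for this sketch; it uses an integer counter for simplicity, whereas clients of the real API should treat resourceVersion as an opaque value.

```python
class ConflictError(Exception):
    """Raised when an update carries a stale resourceVersion (a 409 in the real API)."""


class ToyStore:
    """In-memory store that bumps a version on every write."""

    def __init__(self):
        self._objects = {}  # name -> (resource_version, data)
        self._counter = 0

    def create(self, name, data):
        self._counter += 1
        self._objects[name] = (self._counter, dict(data))
        return self._counter

    def get(self, name):
        version, data = self._objects[name]
        return version, dict(data)

    def update(self, name, data, resource_version):
        """Write only if the caller saw the latest version of the object."""
        current_version, _ = self._objects[name]
        if resource_version != current_version:
            raise ConflictError(
                f"{name}: have version {current_version}, got {resource_version}"
            )
        self._counter += 1
        self._objects[name] = (self._counter, dict(data))
        return self._counter
```

If two clients read the same version and both try to write, the first write succeeds and the second raises `ConflictError`; the losing client is expected to re-read the object and retry.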
When does an API return the Status kind?
Kubernetes will always return the Status kind from any API endpoint when an error occurs. Clients SHOULD handle these types of objects when appropriate.
$ curl -v -k -H "Authorization: Bearer WhCDvq4VPpYhrcfmF6ei7V9qlbqTubUc" https://10.240.122.184:443/api/v1/namespaces/default/pods/grafana
> GET /api/v1/namespaces/default/pods/grafana HTTP/1.1
> User-Agent: curl/7.26.0
> Host: 10.240.122.184
> Accept: */*
> Authorization: Bearer WhCDvq4VPpYhrcfmF6ei7V9qlbqTubUc
>
< HTTP/1.1 404 Not Found
< Content-Type: application/json
< Date: Wed, 20 May 2015 18:10:42 GMT
< Content-Length: 232
<
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "pods \"grafana\" not found",
  "reason": "NotFound",
  "details": {
    "name": "grafana",
    "kind": "pods"
  },
  "code": 404
}
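A client can handle such a response by checking the kind field before using the body. The sketch below is one possible approach, not an official client API: `ApiError` and `parse_response` are hypothetical helpers, and the only assumption about the payload is the field layout shown in the 404 response above.

```python
import json


class ApiError(Exception):
    """Error built from a Kubernetes Status object (hypothetical helper type)."""

    def __init__(self, code, reason, message):
        super().__init__(f"{code} {reason}: {message}")
        self.code = code
        self.reason = reason
        self.message = message


def parse_response(body: str) -> dict:
    """Return the decoded object, or raise ApiError if it is a failure Status."""
    obj = json.loads(body)
    if obj.get("kind") == "Status" and obj.get("status") == "Failure":
        raise ApiError(obj.get("code"), obj.get("reason"), obj.get("message"))
    return obj
```

Keying on `kind` rather than only on the HTTP status code lets the caller surface the server's `reason` and `message` fields directly.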
The API therefore exposes certain operations over upgradeable HTTP connections (described in RFC 2817) via the WebSocket and SPDY protocols.
Two protocols are supported.
Node Status | Description |
---|---|
Addresses | HostName/ExternalIP/InternalIP |
Condition | OutOfDisk / Ready / MemoryPressure / DiskPressure / NetworkUnavailable |
Capacity |
Info |
Node Controller
The node controller is a Kubernetes master component which manages various aspects of nodes.
Responsibilities:
The CCM consolidates all of the cloud-dependent logic from the preceding three components to create a single point of integration with the cloud. The new architecture with the CCM looks like this
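The default pull policy is IfNotPresent, which causes the Kubelet to not pull an image if it already exists.
To force a pull, use imagePullPolicy: Always. The recommended practice is "a versioned tag + IfNotPresent" rather than "latest + Always", because with latest you cannot tell which version is actually running. In practice, however, the pull is delegated to a runtime such as docker, so even with Always large amounts of data are not downloaded again because the layers already exist; from that point of view Always is harmless.
Available options: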
Using Google Container Registry
Using AWS EC2 Container Registry
Using Azure Container Registry (ACR)
Via $HOME/.docker/config.json (open question: what happens when the credentials expire?)
$ kubectl create secret docker-registry myregistrykey --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL
secret "myregistrykey" created.
Without kubectl, you can also create the secret from YAML, using the contents of .docker/config.json.
How to use the resulting imagePullSecrets:
They can be specified in the pod spec, or a ServiceAccount can apply them automatically.
You can use this in conjunction with a per-node .docker/config.json. The credentials will be merged. This approach will work on Google Container Engine (GKE).
apiVersion: v1
kind: Pod
metadata:
  name: foo
  namespace: awesomeapps
spec:
  containers:
    - name: foo
      image: janedoe/awesomeapp:v1
  imagePullSecrets:
    - name: myregistrykey
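Use cases: note the AlwaysPullImages admission controller. It sometimes needs to be enabled, for example in multi-tenant clusters; otherwise a pod may be able to use someone else's image.
There are several ways to expose metadata to a container, as files or as environment variables; see https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/ and the related documents.
The host and port of every service that exists when the container is created are injected into the container as environment variables (as far as I can tell, only services in the same namespace). This guarantees that services can be reached even when the DNS addon is not enabled, although this approach is not reliable.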
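The service environment variables follow the pattern `{SVCNAME}_SERVICE_HOST` and `{SVCNAME}_SERVICE_PORT`, with the service name upper-cased and dashes turned into underscores. A small lookup helper might look like the sketch below; the function names are hypothetical, and the `redis-master` service used in the usage note is only an example.

```python
import os
from typing import Mapping, Optional, Tuple


def service_env_names(service_name: str) -> Tuple[str, str]:
    """Build the host and port variable names for a service."""
    prefix = service_name.upper().replace("-", "_")
    return prefix + "_SERVICE_HOST", prefix + "_SERVICE_PORT"


def lookup_service(service_name: str,
                   environ: Mapping[str, str] = os.environ
                   ) -> Optional[Tuple[str, int]]:
    """Return (host, port) from the environment, or None if not injected."""
    host_key, port_key = service_env_names(service_name)
    host = environ.get(host_key)
    port = environ.get(port_key)
    if host is None or port is None:
        # The service did not exist when the container was created.
        return None
    return host, int(port)
```

Inside a pod, `lookup_service("redis-master")` returns None when the service was created after the container started, which is exactly the unreliability noted above.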
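There are currently two hooks, PostStart and PreStop. If a hook call hangs, the Pod's state transition is blocked.
Two handler types are supported: Exec and HTTP.
As these characteristics show, PostStart and PreStop are currently designed for very lightweight commands. For anything heavier, consider an init container or a defer container (not implemented yet; there is an issue for it).
A hook is generally delivered only once, but this is not guaranteed.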
If a handler fails for some reason, it broadcasts an event.
You can see these events by running kubectl describe pod <pod_name>
What a Pod is: the smallest unit of deployment. It wraps one or more application containers, (shared) storage resources, a network IP, and options.
A Pod encapsulates an application container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.
References:
http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html (use cases for multiple containers in one pod: Sidecar (git, log...), Ambassador (proxy, transparent proxy), Adapter (exporter)...)
http://blog.kubernetes.io/2016/06/container-design-patterns.html
An example:
Multiple containers share:
Pods are designed as relatively ephemeral, disposable entities. Pods do not, by themselves, self-heal. Kubernetes uses a higher-level abstraction, called a Controller, that handles the work of managing the relatively disposable Pod instances.
A Controller can create and manage multiple Pods for you, handling replication and rollout and providing self-healing capabilities at cluster scope. For example, if a Node fails, the Controller might automatically replace the Pod by scheduling an identical replacement on a different Node.
Some examples of Controllers that contain one or more pods include:
Controllers use Pod Templates to make actual pods.
A pod template does not specify a desired state of all replicas, unlike a pod, which specifies the desired state of all containers belonging to it.
A Pod’s status field is a PodStatus object, which has a phase field.
Phase | Description |
---|---|
Pending | The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. |
Running | The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting. |
Succeeded | All Containers in the Pod have terminated in success, and will not be restarted. |
Failed | All Containers in the Pod have terminated, and at least one Container has terminated in failure. |
Unknown | The state of the Pod could not be obtained, typically because of an error communicating with the Pod's host. |
Pod termination
A Pod has a PodStatus, which has an array of PodConditions. Each element of the PodCondition array has a type field and a status field.
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2017-10-28T06:30:03Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2017-10-28T06:30:13Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2017-10-28T06:30:03Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://dd82608cabe226247bcbc8d5fbce6121edf935320486c41046481000dbb7784f
    image: deis/brigade-api:latest
    imageID: docker-pullable://deis/brigade-api@sha256:943cf822adddf6869ff02d2e1a55cbb19c96d01be41e88d1d56bc16a50f5c91f
    lastState: {}
    name: brigade
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2017-10-28T06:30:06Z
A Probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container.
Three kinds of handlers: ExecAction, TCPSocketAction, and HTTPGetAction.
Three possible results: Success, Failure, Unknown.
Two probe types: livenessProbe (tied to the restart policy) and readinessProbe.
todo
- Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.
- Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, for example, web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.
- Use a DaemonSet for Pods that need to run one per machine, because they provide a machine-specific system service.
Worth noting here: if a pod is designed to run to completion, its restartPolicy cannot be Always.
Current pod phase | Container event | Pod restartPolicy | Action on container | Log | Resulting pod phase |
---|---|---|---|---|---|
Running | exits with success | Always | Restart Container | Log completion event | Running |
Running | exits with success | OnFailure | - | Log completion event | Succeeded |
Running | exits with success | Never | - | Log completion event | Succeeded |
Running | exits with failure | Always | Restart Container | Log failure event | Running |
Running | exits with failure | OnFailure | Restart Container | Log failure event | Running |
Running | exits with failure | Never | - | Log failure event | Failed |
Running | oom | Always | Restart Container | Log OOM event | Running |
Running | oom | OnFailure | Restart Container | Log OOM event | Running |
Running | oom | Never | - | Log OOM event | Failed |
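The single-container table above can be condensed into a small decision function. This is a sketch of the table only, not kubelet code: the function names are hypothetical, and an OOM kill is treated as an exit with failure, which is how the table handles it.

```python
def container_action(exited_ok: bool, restart_policy: str) -> str:
    """Return 'restart' or 'none' for a container that has just exited."""
    if restart_policy == "Always":
        return "restart"
    if restart_policy == "OnFailure":
        return "none" if exited_ok else "restart"
    if restart_policy == "Never":
        return "none"
    raise ValueError(f"unknown restartPolicy: {restart_policy}")


def pod_phase_after_exit(exited_ok: bool, restart_policy: str) -> str:
    """Resulting phase of a single-container pod after its container exits."""
    if container_action(exited_ok, restart_policy) == "restart":
        # The container is restarted, so the pod stays Running.
        return "Running"
    return "Succeeded" if exited_ok else "Failed"
```

For a pod with several containers the result also depends on the other containers, so this helper should not be applied to that case as is.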
Current pod phase | Event in container1 | Pod restartPolicy | Action on container | Log | Resulting pod phase |
---|---|---|---|---|---|
Running | exits with failure | Always | Restart Container | Log failure event | Running |
Running | exits with failure | OnFailure | Restart Container | Log failure event | Running |
Running | exits with failure | Never | - | Log failure event | Running; becomes Failed if container2 also exits |
Init containers are commonly used to perform set-up, or to wait for set-up to finish.
Init Containers are exactly like regular Containers, except:
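A pod preset is a way to inject metadata into pods.
With a pod preset, the admission controller transparently modifies the pod spec of a selected class of pods, dynamically injecting information they depend on, such as env and volume mounts.
Behavior:
When a PodPreset is applied to one or more Pods, Kubernetes modifies the pod spec. For Env, EnvFrom, and VolumeMounts, Kubernetes modifies the spec of every container in the Pod; for Volume, Kubernetes modifies the Pod spec.
Example: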
kind: PodPreset
apiVersion: settings.k8s.io/v1alpha1
metadata:
  name: allow-database
  namespace: myns
spec:
  selector:
    matchLabels:
      role: frontend
  env:
    - name: DB_PORT
      value: "6379"
  volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
    - name: cache-volume
      emptyDir: {}
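Reference: http://www.jianshu.com/p/83fe99a5e37f
Admission control that includes PodSecurityPolicy makes it possible to control the creation and modification of resources in the cluster, based on the capabilities those resources are permitted cluster-wide.
If some policy matches, the Pod is accepted. If the request matches no PSP, the Pod is rejected.
https://jimmysong.io/kubernetes-handbook/concepts/pod-security-policy.html
How to mitigate Involuntary Disruptions: request the resources the pod needs explicitly; replicate and spread.
In Kubernetes, to keep a service uninterrupted and its SLA intact, the application should be deployed as a replicated group of pods. A PodDisruptionBudget lets you set the minimum number, or the minimum percentage, of the application's pods that must be running. This ensures that voluntary pod terminations never remove too many pods at once, so the service is not interrupted and its SLA is not degraded.
Use tools that call the Eviction API, such as the kubectl drain command, instead of deleting pods directly, because the Eviction API respects Pod Disruption Budgets.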
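The check that a budget implies can be sketched as follows. The function names are hypothetical and the real eviction logic tracks more state than this; the sketch assumes that a percentage minAvailable is rounded up to a whole pod, and that an eviction is allowed only if the healthy count stays at or above the minimum afterwards.

```python
from typing import Union


def min_available_count(min_available: Union[int, str], desired: int) -> int:
    """Resolve minAvailable (an int or a string like '50%') to a pod count."""
    if isinstance(min_available, str) and min_available.endswith("%"):
        percent = int(min_available[:-1])
        # Ceiling division with integers: 50% of 7 pods becomes 4.
        return -(-desired * percent // 100)
    return int(min_available)


def eviction_allowed(healthy: int, desired: int,
                     min_available: Union[int, str]) -> bool:
    """Would evicting one healthy pod still satisfy the budget?"""
    return healthy - 1 >= min_available_count(min_available, desired)
```

With 3 healthy pods out of 3 desired and minAvailable set to 2, one eviction is allowed; once only 2 pods are healthy, the next eviction request is refused until a replacement becomes ready.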
References:
https://www.kubernetes.org.cn/2486.html
http://ju.outofmemory.cn/entry/327564
Write disruption tolerant applications and use PDBs
ReplicaSets are usually not used directly, but through Deployments.
They are mainly used by Deployments as a mechanism to orchestrate pod creation, deletion and updates.
A ReplicaSet ensures that a specified number of pod replicas are running at any given time.
Some operations:
(Omitted; it is no longer recommended.)
A Deployment controller provides declarative updates for Pods and ReplicaSets.
Pod-template-hash label: this label ensures that child ReplicaSets of a Deployment do not overlap. It is generated by hashing the PodTemplate of the ReplicaSet and using the resulting hash as the label value that is added to the ReplicaSet selector, Pod template labels, and in any existing Pods that the ReplicaSet might have.
Deployment can ensure that only a certain number of Pods may be down while they are being updated. By default, it ensures that at least 1 less than the desired number of Pods are up (1 max unavailable).
Operations: rollout, rollout history/status, undo, and so on.
Proportional scaling: with RollingUpdate (maxSurge, maxUnavailable), the number of pods may briefly exceed the desired count.
$ kubectl get deploy
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   10        10        10           10          50s
$ kubectl set image deploy/nginx-deployment nginx=nginx:sometag
deployment "nginx-deployment" image updated
$ kubectl get rs
NAME                          DESIRED   CURRENT   READY     AGE
nginx-deployment-1989198191   5         5         0         9s
nginx-deployment-618515232    8         8         8         1m
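The bounds that maxSurge and maxUnavailable place on a rolling update can be worked out with a few lines of arithmetic. The sketch below uses hypothetical function names and assumes that a percentage maxSurge rounds up while a percentage maxUnavailable rounds down. The listing above does not state the strategy values; maxSurge=3 and maxUnavailable=2 are assumed here because they would be consistent with the 13 pods in total and 8 ready pods it shows.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int,
                   max_surge: Union[int, str],
                   max_unavailable: Union[int, str]) -> Tuple[int, int]:
    """Return (maximum total pods, minimum available pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable
```

For 10 replicas under those assumed values, `rollout_bounds(10, 3, 2)` gives an upper bound of 13 pods and a lower bound of 8 available pods.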
You can set the .spec.revisionHistoryLimit field in a Deployment to specify how many old ReplicaSets for this Deployment you want to retain.
Note: Canary Deployment is not supported directly at the moment; the recommended approach is to use multiple Deployments.
StatefulSet has replaced PetSet since 1.5. It manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
Stateful means:
Components of a StatefulSet, by example:
- A Headless Service (with a selector), named nginx, is used to control the network domain. This kind of service has no load balancer, kube-proxy does not handle it, and DNS returns the backend endpoints directly.
- The StatefulSet, named web, has a Spec that indicates that 3 replicas of the nginx container will be launched in unique Pods.
- The volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: gcr.io/google_containers/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: my-storage-class
      resources:
        requests:
          storage: 1Gi
Cluster Domain | Service (ns/name) | StatefulSet (ns/name) | StatefulSet Domain | Pod DNS | Pod Hostname |
---|---|---|---|---|---|
cluster.local | default/nginx | default/web | nginx.default.svc.cluster.local | web-{0..N-1}.nginx.default.svc.cluster.local | web-{0..N-1} |
cluster.local | foo/nginx | foo/web | nginx.foo.svc.cluster.local | web-{0..N-1}.nginx.foo.svc.cluster.local | web-{0..N-1} |
kube.local | foo/nginx | foo/web | nginx.foo.svc.kube.local | web-{0..N-1}.nginx.foo.svc.kube.local | web-{0..N-1} |
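The naming pattern in the table can be generated mechanically from the StatefulSet name, its governing service, the namespace, and the cluster domain. The helper below is a hypothetical illustration of that pattern and nothing more.

```python
from typing import List


def statefulset_pod_dns(set_name: str, service: str, namespace: str,
                        replicas: int,
                        cluster_domain: str = "cluster.local") -> List[str]:
    """List the stable DNS names of pods 0..replicas-1 of a StatefulSet."""
    return [
        f"{set_name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]
```

For the web StatefulSet with 3 replicas behind the nginx service in the default namespace, this yields web-0, web-1, and web-2 under nginx.default.svc.cluster.local.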
In Kubernetes 1.7 and later, StatefulSet allows you to relax its ordering guarantees while preserving its uniqueness and identity guarantees via its .spec.podManagementPolicy field.
Update strategies: On Delete; Rolling Updates; Partitions.
A DaemonSet runs one pod per node, acting as a daemon.
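When you delete an object, you can specify whether the object's dependents are also deleted automatically. Deleting dependents automatically is called cascading deletion. There are two modes of cascading deletion: background and foreground.
Foreground deletion: the root object first enters a "deletion in progress" state; the garbage collector then deletes all of the object's dependents; finally the owner object is deleted.
Background deletion: Kubernetes deletes the owner object immediately, and the garbage collector then deletes the dependents in the background.
Deployments must use propagationPolicy: Foreground.
Custom resources currently do not support garbage collection.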
To control the cascading deletion policy, set the deleteOptions.propagationPolicy field on your owner object. Possible values include “Orphan”, “Foreground”, or “Background”.
The default garbage collection policy for many controller resources is orphan, including ReplicationController, ReplicaSet, StatefulSet, DaemonSet, and Deployment.
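The three propagation policies differ mainly in the order of deletion and in what happens to dependents. The toy model below makes that visible; `ToyCluster` is a hypothetical class that tracks a single owner per object, whereas real objects carry a list of ownerReferences and the garbage collector runs asynchronously.

```python
from typing import Dict, List, Optional


class ToyCluster:
    """Tracks which object owns which, and simulates cascading deletion."""

    def __init__(self):
        self.owner_of: Dict[str, Optional[str]] = {}  # object name -> owner name

    def add(self, name: str, owner: Optional[str] = None) -> None:
        self.owner_of[name] = owner

    def dependents(self, owner: str) -> List[str]:
        return [name for name, o in self.owner_of.items() if o == owner]

    def delete(self, name: str, policy: str) -> List[str]:
        """Delete an object and return the names in the order they were removed."""
        if policy == "Orphan":
            # Dependents survive and lose their owner.
            for dep in self.dependents(name):
                self.owner_of[dep] = None
            del self.owner_of[name]
            return [name]
        if policy == "Foreground":
            # Dependents go first, the owner goes last.
            order: List[str] = []
            for dep in self.dependents(name):
                order += self.delete(dep, "Foreground")
            del self.owner_of[name]
            return order + [name]
        if policy == "Background":
            # The owner goes first, dependents are collected afterwards.
            deps = self.dependents(name)
            del self.owner_of[name]
            order = [name]
            for dep in deps:
                order += self.delete(dep, "Background")
            return order
        raise ValueError(f"unknown propagationPolicy: {policy}")
```

Given a Deployment that owns a ReplicaSet that owns two pods, Foreground removes the pods, then the ReplicaSet, then the Deployment, while Orphan removes only the Deployment and leaves the ReplicaSet without an owner.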
todo
todo
This part is starting to look like an "Effective k8s":
todo
(Omitted.)
Resource requests and limits
todo
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
kubernetes.io/hostname
failure-domain.beta.kubernetes.io/zone
failure-domain.beta.kubernetes.io/region
beta.kubernetes.io/instance-type
beta.kubernetes.io/os
beta.kubernetes.io/arch
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: gcr.io/google_containers/pause:2.0
Node affinity is written on the pod and describes which nodes the pod wants.
Network implementations: Contiv, Contrail, Flannel, GCE, L2 networks and linux bridging, Nuage, OpenVSwitch, OVN, Calico, Romana, Weave Net
Kubernetes audit is part of kube-apiserver logging all requests coming to the server.
The kubelet can pro-actively monitor for and prevent against total starvation of a compute resource. In those cases, the kubelet can pro-actively fail one or more pods in order to reclaim the starved resource. When the kubelet fails a pod, it terminates all containers in the pod, and the PodPhase is transitioned to Failed.
Eviction Thresholds: <eviction-signal><operator><quantity>
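A threshold string in this form can be split into its three parts with a regular expression. The following sketch is not kubelet code: the function names are hypothetical, and it assumes that "<" is the only operator and that quantities are plain numbers, percentages, or values with a binary suffix (Ki, Mi, Gi, Ti).

```python
import re
from typing import Tuple

_THRESHOLD = re.compile(
    r"^(?P<signal>[A-Za-z.]+)(?P<operator><)"
    r"(?P<quantity>\d+(?:\.\d+)?(?:%|[KMGT]i)?)$"
)
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_threshold(text: str) -> Tuple[str, str, str]:
    """Split '<eviction-signal><operator><quantity>' into its three parts."""
    match = _THRESHOLD.match(text)
    if match is None:
        raise ValueError(f"not a valid eviction threshold: {text!r}")
    return match.group("signal"), match.group("operator"), match.group("quantity")


def quantity_to_amount(quantity: str, capacity: float) -> float:
    """Convert a quantity to an absolute amount; percentages are of capacity."""
    if quantity.endswith("%"):
        return capacity * float(quantity[:-1]) / 100
    if quantity[-2:] in _UNITS:
        return float(quantity[:-2]) * _UNITS[quantity[-2:]]
    return float(quantity)


def is_breached(threshold: str, available: float, capacity: float) -> bool:
    """True if the available amount has fallen below the threshold."""
    _signal, _operator, quantity = parse_threshold(threshold)
    return available < quantity_to_amount(quantity, capacity)
```

For example, `is_breached("memory.available<1Gi", available, capacity)` is true once the node has less than 1Gi of memory available, and the same function handles a percentage form such as `memory.available<10%`.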
- A soft eviction threshold pairs an eviction threshold with a required administrator-specified grace period.
- A hard eviction threshold has no grace period, and if observed, the kubelet will take immediate action to reclaim the associated starved resource.
Federation makes it easy to manage multiple clusters. It does so by providing 2 major building blocks:
- Sync resources across clusters: Federation provides the ability to keep resources in multiple clusters in sync. This can be used, for example, to ensure that the same deployment exists in multiple clusters.
- Cross cluster discovery: It provides the ability to auto-configure DNS servers and load balancers with backends from all clusters. This can be used, for example, to ensure that a global VIP or DNS record can be used to access backends from multiple clusters.
Setting up Cluster Federation with Kubefed
Rescheduler ensures that critical add-ons are always scheduled. If the scheduler determines that no node has enough free resources to run the critical add-on pod given the pods that are already running in the cluster the rescheduler tries to free up space for the add-on by evicting some pods; then the scheduler will schedule the add-on pod.
A temporary taint "CriticalAddonsOnly" can be set on a node so that it is used only for deploying the critical add-on pod, which prevents other pods from being scheduled onto it.
Static pods are managed directly by the kubelet daemon on a specific node, without the API server observing them. A static pod has no associated replication controller; the kubelet daemon itself watches it and restarts it when it crashes. There is no health check, though. Static pods are always bound to one kubelet daemon and always run on the same node with it.
Kubelet automatically creates a so-called mirror pod on the Kubernetes API server for each static pod, so the pods are visible there, but they cannot be controlled from the API server.
If you are running clustered Kubernetes and are using static pods to run a pod on every node, you should probably be using a DaemonSet!
Static pods can be configured through --pod-manifest-path or --manifest-url.
Safe sysctl: in addition to proper namespacing, a safe sysctl must be properly isolated between pods on the same node.
# Ways to access the REST API
# 1. Through a proxy
kubectl proxy --port=8083 &
curl localhost:8083/api
# 2. Direct access
$ APISERVER=$(kubectl config view | grep server | cut -f 2- -d ":" | tr -d " ")
$ TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t')
$ curl $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure
Several options for connecting to nodes, pods and services from outside the cluster:
# Discovering built-in services
kubectl cluster-info
Types of Kubernetes proxies
Services
For Kubernetes-native applications, Kubernetes offers a simple Endpoints API that is updated whenever the set of Pods in a Service changes. For non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.
type: ExternalName forwards traffic to an external service.
Service types: ClusterIP (the default), NodePort (opens a port on every node that forwards to the service), LoadBalancer (depends on the IaaS; the service gets an EXTERNAL-IP), and ExternalName.
kind: Service
apiVersion: v1
metadata:
  name: my-service
  namespace: prod
spec:
  type: ExternalName
  externalName: my.database.example.com
tutorial
tutorial
tutorial
kubectl exec -ti busybox -- nslookup kubernetes.default
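Original content notice: this article was published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com for removal.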