apiserver是k8s的核心组件,对于apiserver的源码分析分为两期,一期是apiserver主流程的解析,另外一期是apiserver对于kubectl、kube-proxy等请求的处理。
apiserver是kubernetes的api server,运行在集群的master上,提供集群的管理API服务。
入口函数在cmd包下面的apiserver.go文件中
func main() {
rand.Seed(time.Now().UnixNano())
command := app.NewAPIServerCommand()
... ...
logs.InitLogs()
defer logs.FlushLogs()
if err := command.Execute(); err != nil {
os.Exit(1)
}
}
NewAPIServerCommand与kube-scheduler一样,同样使用了cobra.Command的CLI应用,cobra是很强大的工具,kubernetes、docker、istio等很多开源项目使用了cobra。先前在写scheduler的时候,没有说明启动参数是如何传递的,今天先说明下启动参数的解析,启动参数很多,源码分为了generic, etcd, secure serving,insecure serving,auditing, features, authentication, authorization,cloud provider,API enablement,egress selector,admission这些类。解析需要先看app.NewAPIServerCommand的方法:
func NewAPIServerCommand() *cobra.Command {
s := options.NewServerRunOptions()
cmd := &cobra.Command{
... ...
RunE: func(cmd *cobra.Command, args []string) error {
... ...
return Run(completedOptions, genericapiserver.SetupSignalHandler())
},
}
fs := cmd.Flags()
namedFlagSets := s.Flags()
verflag.AddFlags(namedFlagSets.FlagSet("global"))
globalflag.AddGlobalFlags(namedFlagSets.FlagSet("global"), cmd.Name())
options.AddCustomGlobalFlags(namedFlagSets.FlagSet("generic"))
for _, f := range namedFlagSets.FlagSets {
fs.AddFlagSet(f)
}
... ...
return cmd
}
s := options.NewServerRunOptions()是初始化了Server所有的启动参数,这里都是使用了默认值。cmd.Flags()是为command new了一个flagSet, 这里主要关注下namedFlagSets := s.Flags()
, namedFlagSets里面按照上述的分类定义了多个flagSet, 然后添加到了cmd的flagSet中。看下定义的Flag,
ServerRunOptions的Flags方法:
func (s *ServerRunOptions) Flags() (fss cliflag.NamedFlagSets) {
// Add the generic flags.
s.GenericServerRunOptions.AddUniversalFlags(fss.FlagSet("generic"))
s.Etcd.AddFlags(fss.FlagSet("etcd"))
s.SecureServing.AddFlags(fss.FlagSet("secure serving"))
s.InsecureServing.AddFlags(fss.FlagSet("insecure serving"))
s.InsecureServing.AddUnqualifiedFlags(fss.FlagSet("insecure serving")) // TODO: remove it until kops stops using `--address`
s.Audit.AddFlags(fss.FlagSet("auditing"))
s.Features.AddFlags(fss.FlagSet("features"))
s.Authentication.AddFlags(fss.FlagSet("authentication"))
s.Authorization.AddFlags(fss.FlagSet("authorization"))
s.CloudProvider.AddFlags(fss.FlagSet("cloud provider"))
s.APIEnablement.AddFlags(fss.FlagSet("API enablement"))
s.EgressSelector.AddFlags(fss.FlagSet("egress selector"))
s.Admission.AddFlags(fss.FlagSet("admission"))
... ...
随便选取一类进行查看,例如generic:
func (s *ServerRunOptions) AddUniversalFlags(fs *pflag.FlagSet) {
// Note: the weird ""+ in below lines seems to be the only way to get gofmt to
// arrange these text blocks sensibly. Grrr.
fs.IPVar(&s.AdvertiseAddress, "advertise-address", s.AdvertiseAddress, ""+
"The IP address on which to advertise the apiserver to members of the cluster. This "+
"address must be reachable by the rest of the cluster. If blank, the --bind-address "+
"will be used. If --bind-address is unspecified, the host's default interface will "+
"be used.")
fs.StringSliceVar(&s.CorsAllowedOriginList, "cors-allowed-origins", s.CorsAllowedOriginList, ""+
"List of allowed origins for CORS, comma separated. An allowed origin can be a regular "+
"expression to support subdomain matching. If this list is empty CORS will not be enabled.")
fs.IntVar(&s.TargetRAMMB, "target-ram-mb", s.TargetRAMMB,
"Memory limit for apiserver in MB (used to configure sizes of caches, etc.)")
fs.StringVar(&s.ExternalHost, "external-hostname", s.ExternalHost,
"The hostname to use when generating externalized URLs for this master (e.g. Swagger API Docs).")
deprecatedMasterServiceNamespace := metav1.NamespaceDefault
fs.StringVar(&deprecatedMasterServiceNamespace, "master-service-namespace", deprecatedMasterServiceNamespace, ""+
"DEPRECATED: the namespace from which the kubernetes master services should be injected into pods.")
fs.IntVar(&s.MaxRequestsInFlight, "max-requests-inflight", s.MaxRequestsInFlight, ""+
"The maximum number of non-mutating requests in flight at a given time. When the server exceeds this, "+
"it rejects requests. Zero for no limit.")
fs.IntVar(&s.MaxMutatingRequestsInFlight, "max-mutating-requests-inflight", s.MaxMutatingRequestsInFlight, ""+
"The maximum number of mutating requests in flight at a given time. When the server exceeds this, "+
"it rejects requests. Zero for no limit.")
fs.DurationVar(&s.RequestTimeout, "request-timeout", s.RequestTimeout, ""+
"An optional field indicating the duration a handler must keep a request open before timing "+
"it out. This is the default request timeout for requests but may be overridden by flags such as "+
"--min-request-timeout for specific types of requests.")
fs.DurationVar(&s.LivezGracePeriod, "livez-grace-period", s.LivezGracePeriod, ""+
"This option represents the maximum amount of time it should take for apiserver to complete its startup sequence "+
"and become live. From apiserver's start time to when this amount of time has elapsed, /livez will assume "+
"that unfinished post-start hooks will complete successfully and therefore return true.")
... ...
可以看到上面将flag的name和ServerRunOptions的field进行了绑定,后面解析之后会直接赋值给ServerRunOptions的field。那解析是在何时进行的呢? 在空壳的main方法中,还有cmd.execute方法,解析则在这里进行:
func (c *Command) Execute() error {
_, err := c.ExecuteC()
return err
}
// ExecuteC executes the command.
func (c *Command) ExecuteC() (cmd *Command, err error) {
... ...
args := c.args
// Workaround FAIL with "go test -v" or "cobra.test -test.v", see #155
if c.args == nil && filepath.Base(os.Args[0]) != "cobra.test" {
args = os.Args[1:]
}
var flags []string
if c.TraverseChildren {
cmd, flags, err = c.Traverse(args)
} else {
cmd, flags, err = c.Find(args)
}
... ...
err = cmd.execute(flags)
... ...
return cmd, err
}
可以看到,将os.Args取出赋值给args,调用c.Find(args)和c.Traverse(args) 将args解析为cmd和flags, flags为string数组。再调用cmd.execute(flags):
func (c *Command) execute(a []string) (err error) {
... ...
err = c.ParseFlags(a)
if err != nil {
return c.FlagErrorFunc()(c, err)
}
... ...
c.preRun()
... ...
if c.PreRunE != nil {
if err := c.PreRunE(c, argWoFlags); err != nil {
return err
}
} else if c.PreRun != nil {
c.PreRun(c, argWoFlags)
}
if err := c.validateRequiredFlags(); err != nil {
return err
}
if c.RunE != nil {
if err := c.RunE(c, argWoFlags); err != nil {
return err
}
} else {
c.Run(c, argWoFlags)
}
if c.PostRunE != nil {
if err := c.PostRunE(c, argWoFlags); err != nil {
return err
}
} else if c.PostRun != nil {
c.PostRun(c, argWoFlags)
}
for p := c; p != nil; p = p.Parent() {
if p.PersistentPostRunE != nil {
if err := p.PersistentPostRunE(c, argWoFlags); err != nil {
return err
}
break
} else if p.PersistentPostRun != nil {
p.PersistentPostRun(c, argWoFlags)
break
}
}
return nil
}
execute首先parseFlags, 这里parse就是解析出name和value,然后根据name赋值给先前绑定的ServerRunOptions的Field。这里可以看到execute还执行了PreRun, RunE, Run, PostRun的方法,在NewAPIServerCommand也是刚好实现了RunE的方法, RunE调用了Run方法:
func Run(completeOptions completedServerRunOptions, stopCh <-chan struct{}) error {
// To help debugging, immediately log version
klog.Infof("Version: %+v", version.Get())
server, err := CreateServerChain(completeOptions, stopCh)
if err != nil {
return err
}
prepared, err := server.PrepareRun()
if err != nil {
return err
}
return prepared.Run(stopCh)
}
Run方法很简洁,该方法在没有收到stopCh的信号时,会一直运行在prepared.Run里面。
主要执行的有3个方法:
func CreateServerChain(completedOptions completedServerRunOptions, stopCh <-chan struct{}) (*aggregatorapiserver.APIAggregator, error) {
nodeTunneler, proxyTransport, err := CreateNodeDialer(completedOptions)
if err != nil {
return nil, err
}
kubeAPIServerConfig, insecureServingInfo, serviceResolver, pluginInitializer, err := CreateKubeAPIServerConfig(completedOptions, nodeTunneler, proxyTransport)
if err != nil {
return nil, err
}
// If additional API servers are added, they should be gated.
apiExtensionsConfig, err := createAPIExtensionsConfig(*kubeAPIServerConfig.GenericConfig, kubeAPIServerConfig.ExtraConfig.VersionedInformers, pluginInitializer, completedOptions.ServerRunOptions, completedOptions.MasterCount,
serviceResolver, webhook.NewDefaultAuthenticationInfoResolverWrapper(proxyTransport, kubeAPIServerConfig.GenericConfig.LoopbackClientConfig))
if err != nil {
return nil, err
}
apiExtensionsServer, err := createAPIExtensionsServer(apiExtensionsConfig, genericapiserver.NewEmptyDelegate())
if err != nil {
return nil, err
}
kubeAPIServer, err := CreateKubeAPIServer(kubeAPIServerConfig, apiExtensionsServer.GenericAPIServer)
if err != nil {
return nil, err
}
// aggregator comes last in the chain
aggregatorConfig, err := createAggregatorConfig(*kubeAPIServerConfig.GenericConfig, completedOptions.ServerRunOptions, kubeAPIServerConfig.ExtraConfig.VersionedInformers, serviceResolver, proxyTransport, pluginInitializer)
if err != nil {
return nil, err
}
aggregatorServer, err := createAggregatorServer(aggregatorConfig, kubeAPIServer.GenericAPIServer, apiExtensionsServer.Informers)
if err != nil {
// we don't need special handling for innerStopCh because the aggregator server doesn't create any go routines
return nil, err
}
if insecureServingInfo != nil {
insecureHandlerChain := kubeserver.BuildInsecureHandlerChain(aggregatorServer.GenericAPIServer.UnprotectedHandler(), kubeAPIServerConfig.GenericConfig)
if err := insecureServingInfo.Serve(insecureHandlerChain, kubeAPIServerConfig.GenericConfig.RequestTimeout, stopCh); err != nil {
return nil, err
}
}
return aggregatorServer, nil
}
这部分代码比较清晰,CreateNodeDialer是根据配置生成ssh tunneler,用以连接各个node的kubelet。另外就是创建了3个config和3个server:
config:
server:
config和server的对应关系应当一目明了吧,这里是按照创建顺序书写的。
type Config struct { // master.config
GenericConfig *genericapiserver.Config
ExtraConfig ExtraConfig
}
type Config struct { // genericapiserver.Config
// SecureServing is required to serve https
SecureServing *SecureServingInfo
// Authentication is the configuration for authentication
Authentication AuthenticationInfo
// Authorization is the configuration for authorization
Authorization AuthorizationInfo
// LoopbackClientConfig is a config for a privileged loopback connection to the API server
// This is required for proper functioning of the PostStartHooks on a GenericAPIServer
// TODO: move into SecureServing(WithLoopback) as soon as insecure serving is gone
LoopbackClientConfig *restclient.Config
// EgressSelector provides a lookup mechanism for dialing outbound connections.
// It does so based on a EgressSelectorConfiguration which was read at startup.
EgressSelector *egressselector.EgressSelector
// RuleResolver is required to get the list of rules that apply to a given user
// in a given namespace
RuleResolver authorizer.RuleResolver
// AdmissionControl performs deep inspection of a given request (including content)
// to set values and determine whether its allowed
AdmissionControl admission.Interface
CorsAllowedOriginList []string
EnableIndex bool
EnableProfiling bool
EnableDiscovery bool
// Requires generic profiling enabled
EnableContentionProfiling bool
EnableMetrics bool
DisabledPostStartHooks sets.String
// done values in this values for this map are ignored.
PostStartHooks map[string]PostStartHookConfigEntry
// Version will enable the /version endpoint if non-nil
Version *version.Info
// AuditBackend is where audit events are sent to.
AuditBackend audit.Backend
// AuditPolicyChecker makes the decision of whether and how to audit log a request.
AuditPolicyChecker auditpolicy.Checker
// ExternalAddress is the host name to use for external (public internet) facing URLs (e.g. Swagger)
// Will default to a value based on secure serving info and available ipv4 IPs.
ExternalAddress string
//===========================================================================
// Fields you probably don't care about changing
//===========================================================================
// BuildHandlerChainFunc allows you to build custom handler chains by decorating the apiHandler.
BuildHandlerChainFunc func(apiHandler http.Handler, c *Config) (secure http.Handler)
// HandlerChainWaitGroup allows you to wait for all chain handlers exit after the server shutdown.
HandlerChainWaitGroup *utilwaitgroup.SafeWaitGroup
// DiscoveryAddresses is used to build the IPs pass to discovery. If nil, the ExternalAddress is
// always reported
DiscoveryAddresses discovery.Addresses
// The default set of healthz checks. There might be more added via AddHealthChecks dynamically.
HealthzChecks []healthz.HealthChecker
// The default set of livez checks. There might be more added via AddHealthChecks dynamically.
LivezChecks []healthz.HealthChecker
// The default set of readyz-only checks. There might be more added via AddReadyzChecks dynamically.
ReadyzChecks []healthz.HealthChecker
// LegacyAPIGroupPrefixes is used to set up URL parsing for authorization and for validating requests
// to InstallLegacyAPIGroup. New API servers don't generally have legacy groups at all.
LegacyAPIGroupPrefixes sets.String
// RequestInfoResolver is used to assign attributes (used by admission and authorization) based on a request URL.
// Use-cases that are like kubelets may need to customize this.
RequestInfoResolver apirequest.RequestInfoResolver
// Serializer is required and provides the interface for serializing and converting objects to and from the wire
// The default (api.Codecs) usually works fine.
Serializer runtime.NegotiatedSerializer
// OpenAPIConfig will be used in generating OpenAPI spec. This is nil by default. Use DefaultOpenAPIConfig for "working" defaults.
OpenAPIConfig *openapicommon.Config
// RESTOptionsGetter is used to construct RESTStorage types via the generic registry.
RESTOptionsGetter genericregistry.RESTOptionsGetter
// If specified, all requests except those which match the LongRunningFunc predicate will timeout
// after this duration.
RequestTimeout time.Duration
// If specified, long running requests such as watch will be allocated a random timeout between this value, and
// twice this value. Note that it is up to the request handlers to ignore or honor this timeout. In seconds.
MinRequestTimeout int
// This represents the maximum amount of time it should take for apiserver to complete its startup
// sequence and become healthy. From apiserver's start time to when this amount of time has
// elapsed, /livez will assume that unfinished post-start hooks will complete successfully and
// therefore return true.
LivezGracePeriod time.Duration
// ShutdownDelayDuration allows to block shutdown for some time, e.g. until endpoints pointing to this API server
// have converged on all node. During this time, the API server keeps serving, /healthz will return 200,
// but /readyz will return failure.
ShutdownDelayDuration time.Duration
// The limit on the total size increase all "copy" operations in a json
// patch may cause.
// This affects all places that applies json patch in the binary.
JSONPatchMaxCopyBytes int64
// The limit on the request size that would be accepted and decoded in a write request
// 0 means no limit.
MaxRequestBodyBytes int64
// MaxRequestsInFlight is the maximum number of parallel non-long-running requests. Every further
// request has to wait. Applies only to non-mutating requests.
MaxRequestsInFlight int
// MaxMutatingRequestsInFlight is the maximum number of parallel mutating requests. Every further
// request has to wait.
MaxMutatingRequestsInFlight int
// Predicate which is true for paths of long-running http requests
LongRunningFunc apirequest.LongRunningRequestCheck
// MergedResourceConfig indicates which groupVersion enabled and its resources enabled/disabled.
// This is composed of genericapiserver defaultAPIResourceConfig and those parsed from flags.
// If not specify any in flags, then genericapiserver will only enable defaultAPIResourceConfig.
MergedResourceConfig *serverstore.ResourceConfig
//===========================================================================
// values below here are targets for removal
//===========================================================================
// PublicAddress is the IP address where members of the cluster (kubelet,
// kube-proxy, services, etc.) can reach the GenericAPIServer.
// If nil or 0.0.0.0, the host's default interface will be used.
PublicAddress net.IP
// EquivalentResourceRegistry provides information about resources equivalent to a given resource,
// and the kind associated with a given resource. As resources are installed, they are registered here.
EquivalentResourceRegistry runtime.EquivalentResourceRegistry
}
type ExtraConfig struct {
ClusterAuthenticationInfo clusterauthenticationtrust.ClusterAuthenticationInfo
APIResourceConfigSource serverstorage.APIResourceConfigSource
StorageFactory serverstorage.StorageFactory
EndpointReconcilerConfig EndpointReconcilerConfig
EventTTL time.Duration
KubeletClientConfig kubeletclient.KubeletClientConfig
// Used to start and monitor tunneling
Tunneler tunneler.Tunneler
EnableLogsSupport bool
ProxyTransport http.RoundTripper
// Values to build the IP addresses used by discovery
// The range of IPs to be assigned to services with type=ClusterIP or greater
ServiceIPRange net.IPNet
// The IP address for the GenericAPIServer service (must be inside ServiceIPRange)
APIServerServiceIP net.IP
// dual stack services, the range represents an alternative IP range for service IP
// must be of different family than primary (ServiceIPRange)
SecondaryServiceIPRange net.IPNet
// the secondary IP address the GenericAPIServer service (must be inside SecondaryServiceIPRange)
SecondaryAPIServerServiceIP net.IP
// Port for the apiserver service.
APIServerServicePort int
// TODO, we can probably group service related items into a substruct to make it easier to configure
// the API server items and `Extra*` fields likely fit nicely together.
// The range of ports to be assigned to services with type=NodePort or greater
ServiceNodePortRange utilnet.PortRange
// Additional ports to be exposed on the GenericAPIServer service
// extraServicePorts is injectable in the event that more ports
// (other than the default 443/tcp) are exposed on the GenericAPIServer
// and those ports need to be load balanced by the GenericAPIServer
// service because this pkg is linked by out-of-tree projects
// like openshift which want to use the GenericAPIServer but also do
// more stuff.
ExtraServicePorts []apiv1.ServicePort
// Additional ports to be exposed on the GenericAPIServer endpoints
// Port names should align with ports defined in ExtraServicePorts
ExtraEndpointPorts []apiv1.EndpointPort
// If non-zero, the "kubernetes" services uses this port as NodePort.
KubernetesServiceNodePort int
// Number of masters running; all masters must be started with the
// same value for this field. (Numbers > 1 currently untested.)
MasterCount int
// MasterEndpointReconcileTTL sets the time to live in seconds of an
// endpoint record recorded by each master. The endpoints are checked at an
// interval that is 2/3 of this value and this value defaults to 15s if
// unset. In very large clusters, this value may be increased to reduce the
// possibility that the master endpoint record expires (due to other load
// on the etcd server) and causes masters to drop in and out of the
// kubernetes service record. It is not recommended to set this value below
// 15s.
MasterEndpointReconcileTTL time.Duration
// Selects which reconciler to use
EndpointReconcilerType reconcilers.Type
ServiceAccountIssuer serviceaccount.TokenGenerator
ServiceAccountMaxExpiration time.Duration
VersionedInformers informers.SharedInformerFactory
}
master.Config的配置是相当的多,包含genericapiserver.Config和ExtraConfig, genericapiserver.Config的注释里面也说了,在代码里是按照重要级别排序的,前面几个分别是server配置、验证配置、授权配置。ExtraConfig主要包含的是apiserver额外的信息:集群认证信息、etcd配置等。
这里主要方法是CreateKubeAPIServerConfig,该方法返回master.Config:
func CreateKubeAPIServerConfig(
s completedServerRunOptions,
nodeTunneler tunneler.Tunneler,
proxyTransport *http.Transport,
) (
*master.Config,
*genericapiserver.DeprecatedInsecureServingInfo,
aggregatorapiserver.ServiceResolver,
[]admission.PluginInitializer,
error,
) {
genericConfig, versionedInformers, insecureServingInfo, serviceResolver, pluginInitializers, admissionPostStartHook, storageFactory, err := buildGenericConfig(s.ServerRunOptions, proxyTransport)
if err != nil {
return nil, nil, nil, nil, err
}
if _, port, err := net.SplitHostPort(s.Etcd.StorageConfig.Transport.ServerList[0]); err == nil && port != "0" && len(port) != 0 {
if err := utilwait.PollImmediate(etcdRetryInterval, etcdRetryLimit*etcdRetryInterval, preflight.EtcdConnection{ServerList: s.Etcd.StorageConfig.Transport.ServerList}.CheckEtcdServers); err != nil {
return nil, nil, nil, nil, fmt.Errorf("error waiting for etcd connection: %v", err)
}
}
capabilities.Initialize(capabilities.Capabilities{
AllowPrivileged: s.AllowPrivileged,
// TODO(vmarmol): Implement support for HostNetworkSources.
PrivilegedSources: capabilities.PrivilegedSources{
HostNetworkSources: []string{},
HostPIDSources: []string{},
HostIPCSources: []string{},
},
PerConnectionBandwidthLimitBytesPerSec: s.MaxConnectionBytesPerSec,
})
if len(s.ShowHiddenMetricsForVersion) > 0 {
metrics.SetShowHidden()
}
serviceIPRange, apiServerServiceIP, err := master.ServiceIPRange(s.PrimaryServiceClusterIPRange)
if err != nil {
return nil, nil, nil, nil, err
}
// defaults to empty range and ip
var secondaryServiceIPRange net.IPNet
// process secondary range only if provided by user
if s.SecondaryServiceClusterIPRange.IP != nil {
secondaryServiceIPRange, _, err = master.ServiceIPRange(s.SecondaryServiceClusterIPRange)
if err != nil {
return nil, nil, nil, nil, err
}
}
config := &master.Config{
GenericConfig: genericConfig,
ExtraConfig: master.ExtraConfig{
APIResourceConfigSource: storageFactory.APIResourceConfigSource,
StorageFactory: storageFactory,
EventTTL: s.EventTTL,
KubeletClientConfig: s.KubeletConfig,
EnableLogsSupport: s.EnableLogsHandler,
ProxyTransport: proxyTransport,
Tunneler: nodeTunneler,
ServiceIPRange: serviceIPRange,
APIServerServiceIP: apiServerServiceIP,
SecondaryServiceIPRange: secondaryServiceIPRange,
APIServerServicePort: 443,
ServiceNodePortRange: s.ServiceNodePortRange,
KubernetesServiceNodePort: s.KubernetesServiceNodePort,
EndpointReconcilerType: reconcilers.Type(s.EndpointReconcilerType),
MasterCount: s.MasterCount,
ServiceAccountIssuer: s.ServiceAccountIssuer,
ServiceAccountMaxExpiration: s.ServiceAccountTokenMaxExpiration,
VersionedInformers: versionedInformers,
},
}
clientCAProvider, err := s.Authentication.ClientCert.GetClientCAContentProvider()
if err != nil {
return nil, nil, nil, nil, err
}
config.ExtraConfig.ClusterAuthenticationInfo.ClientCA = clientCAProvider
requestHeaderConfig, err := s.Authentication.RequestHeader.ToAuthenticationRequestHeaderConfig()
if err != nil {
return nil, nil, nil, nil, err
}
if requestHeaderConfig != nil {
config.ExtraConfig.ClusterAuthenticationInfo.RequestHeaderCA = requestHeaderConfig.CAContentProvider
config.ExtraConfig.ClusterAuthenticationInfo.RequestHeaderAllowedNames = requestHeaderConfig.AllowedClientNames
config.ExtraConfig.ClusterAuthenticationInfo.RequestHeaderExtraHeaderPrefixes = requestHeaderConfig.ExtraHeaderPrefixes
config.ExtraConfig.ClusterAuthenticationInfo.RequestHeaderGroupHeaders = requestHeaderConfig.GroupHeaders
config.ExtraConfig.ClusterAuthenticationInfo.RequestHeaderUsernameHeaders = requestHeaderConfig.UsernameHeaders
}
if err := config.GenericConfig.AddPostStartHook("start-kube-apiserver-admission-initializer", admissionPostStartHook); err != nil {
return nil, nil, nil, nil, err
}
if nodeTunneler != nil {
// Use the nodeTunneler's dialer to connect to the kubelet
config.ExtraConfig.KubeletClientConfig.Dial = nodeTunneler.Dial
}
if config.GenericConfig.EgressSelector != nil {
// Use the config.GenericConfig.EgressSelector lookup to find the dialer to connect to the kubelet
config.ExtraConfig.KubeletClientConfig.Lookup = config.GenericConfig.EgressSelector.Lookup
}
return config, insecureServingInfo, serviceResolver, pluginInitializers, nil
}
该方法创建了apiserver所需要的所有资源,并将ServerRunOptions里面的参数转化为genericapiserver.Config和ExtraConfig。
创建服务链的第二步是创建ApiExtensionConfig, ApiExtensionConfig与ApiServerConfig几乎一致,在ExtraConfig上是不同的, ApiExtensionConfig的ExtraConfig只有几个字段,少很多。创建ApiExtensionConfig的方法是createAPIExtensionsConfig, 该方法主要做了四件事:
当创建好APIExtensionConfig之后,就是创建APIExtensionServer,调用方法如下:
apiExtensionsServer, err := createAPIExtensionsServer(apiExtensionsConfig, genericapiserver.NewEmptyDelegate())
这里传了一个空的Delegate,和后面创建KubeApiServer不一样,创建KubeApiServer传入了APIExtensionServer的GenericAPIServer。回到createAPIExtensionsServer方法,最终会调用到apiextension-apiserver/apiserver#New():
func (c completedConfig) New(delegationTarget genericapiserver.DelegationTarget) (*CustomResourceDefinitions, error) {
genericServer, err := c.GenericConfig.New("apiextensions-apiserver", delegationTarget)
if err != nil {
return nil, err
}
s := &CustomResourceDefinitions{
GenericAPIServer: genericServer,
}
... ...
if err := s.GenericAPIServer.InstallAPIGroup(&apiGroupInfo); err != nil {
return nil, err
}
... ...
delegateHandler := delegationTarget.UnprotectedHandler()
if delegateHandler == nil {
delegateHandler = http.NotFoundHandler()
}
versionDiscoveryHandler := &versionDiscoveryHandler{
discovery: map[schema.GroupVersion]*discovery.APIVersionHandler{},
delegate: delegateHandler,
}
groupDiscoveryHandler := &groupDiscoveryHandler{
discovery: map[string]*discovery.APIGroupHandler{},
delegate: delegateHandler,
}
establishingController := establish.NewEstablishingController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(), crdClient.Apiextensions())
crdHandler, err := NewCustomResourceDefinitionHandler(
versionDiscoveryHandler,
groupDiscoveryHandler,
s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(),
delegateHandler,
c.ExtraConfig.CRDRESTOptionsGetter,
c.GenericConfig.AdmissionControl,
establishingController,
c.ExtraConfig.ServiceResolver,
c.ExtraConfig.AuthResolverWrapper,
c.ExtraConfig.MasterCount,
s.GenericAPIServer.Authorizer,
c.GenericConfig.RequestTimeout,
time.Duration(c.GenericConfig.MinRequestTimeout)*time.Second,
apiGroupInfo.StaticOpenAPISpec,
c.GenericConfig.MaxRequestBodyBytes,
)
if err != nil {
return nil, err
}
s.GenericAPIServer.Handler.NonGoRestfulMux.Handle("/apis", crdHandler)
s.GenericAPIServer.Handler.NonGoRestfulMux.HandlePrefix("/apis/", crdHandler)
crdController := NewDiscoveryController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(), versionDiscoveryHandler, groupDiscoveryHandler)
namingController := status.NewNamingConditionController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(), crdClient.Apiextensions())
nonStructuralSchemaController := nonstructuralschema.NewConditionController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(), crdClient.Apiextensions())
apiApprovalController := apiapproval.NewKubernetesAPIApprovalPolicyConformantConditionController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(), crdClient.Apiextensions())
finalizingController := finalizer.NewCRDFinalizer(
s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions(),
crdClient.Apiextensions(),
crdHandler,
)
var openapiController *openapicontroller.Controller
if utilfeature.DefaultFeatureGate.Enabled(apiextensionsfeatures.CustomResourcePublishOpenAPI) {
openapiController = openapicontroller.NewController(s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions())
}
s.GenericAPIServer.AddPostStartHookOrDie("start-apiextensions-informers", func(context genericapiserver.PostStartHookContext) error {
s.Informers.Start(context.StopCh)
return nil
})
s.GenericAPIServer.AddPostStartHookOrDie("start-apiextensions-controllers", func(context genericapiserver.PostStartHookContext) error {
// OpenAPIVersionedService and StaticOpenAPISpec are populated in generic apiserver PrepareRun().
// Together they serve the /openapi/v2 endpoint on a generic apiserver. A generic apiserver may
// choose to not enable OpenAPI by having null openAPIConfig, and thus OpenAPIVersionedService
// and StaticOpenAPISpec are both null. In that case we don't run the CRD OpenAPI controller.
if utilfeature.DefaultFeatureGate.Enabled(apiextensionsfeatures.CustomResourcePublishOpenAPI) && s.GenericAPIServer.OpenAPIVersionedService != nil && s.GenericAPIServer.StaticOpenAPISpec != nil {
go openapiController.Run(s.GenericAPIServer.StaticOpenAPISpec, s.GenericAPIServer.OpenAPIVersionedService, context.StopCh)
}
go crdController.Run(context.StopCh)
go namingController.Run(context.StopCh)
go establishingController.Run(context.StopCh)
go nonStructuralSchemaController.Run(5, context.StopCh)
go apiApprovalController.Run(5, context.StopCh)
go finalizingController.Run(5, context.StopCh)
return nil
})
// we don't want to report healthy until we can handle all CRDs that have already been registered. Waiting for the informer
// to sync makes sure that the lister will be valid before we begin. There may still be races for CRDs added after startup,
// but we won't go healthy until we can handle the ones already present.
s.GenericAPIServer.AddPostStartHookOrDie("crd-informer-synced", func(context genericapiserver.PostStartHookContext) error {
return wait.PollImmediateUntil(100*time.Millisecond, func() (bool, error) {
return s.Informers.Apiextensions().InternalVersion().CustomResourceDefinitions().Informer().HasSynced(), nil
}, context.StopCh)
})
return s, nil
}
apiextensions-apiserver/apiserver.go这个文件几乎只写了这个New方法。
Server链中最重要的环节,创建KubeAPIServer,调用如下:
kubeAPIServer, err := CreateKubeAPIServer(kubeAPIServerConfig, apiExtensionsServer.GenericAPIServer)
因为APIServer最终是运行在master中的,最终调用到了master/New()方法:
func (c completedConfig) New(delegationTarget genericapiserver.DelegationTarget) (*Master, error) {
if reflect.DeepEqual(c.ExtraConfig.KubeletClientConfig, kubeletclient.KubeletClientConfig{}) {
return nil, fmt.Errorf("Master.New() called with empty config.KubeletClientConfig")
}
s, err := c.GenericConfig.New("kube-apiserver", delegationTarget)
if err != nil {
return nil, err
}
if c.ExtraConfig.EnableLogsSupport {
routes.Logs{}.Install(s.Handler.GoRestfulContainer)
}
m := &Master{
GenericAPIServer: s,
ClusterAuthenticationInfo: c.ExtraConfig.ClusterAuthenticationInfo,
}
// install legacy rest storage
if c.ExtraConfig.APIResourceConfigSource.VersionEnabled(apiv1.SchemeGroupVersion) {
legacyRESTStorageProvider := corerest.LegacyRESTStorageProvider{
StorageFactory: c.ExtraConfig.StorageFactory,
ProxyTransport: c.ExtraConfig.ProxyTransport,
KubeletClientConfig: c.ExtraConfig.KubeletClientConfig,
EventTTL: c.ExtraConfig.EventTTL,
ServiceIPRange: c.ExtraConfig.ServiceIPRange,
SecondaryServiceIPRange: c.ExtraConfig.SecondaryServiceIPRange,
ServiceNodePortRange: c.ExtraConfig.ServiceNodePortRange,
LoopbackClientConfig: c.GenericConfig.LoopbackClientConfig,
ServiceAccountIssuer: c.ExtraConfig.ServiceAccountIssuer,
ServiceAccountMaxExpiration: c.ExtraConfig.ServiceAccountMaxExpiration,
APIAudiences: c.GenericConfig.Authentication.APIAudiences,
}
if err := m.InstallLegacyAPI(&c, c.GenericConfig.RESTOptionsGetter, legacyRESTStorageProvider); err != nil {
return nil, err
}
}
// The order here is preserved in discovery.
// If resources with identical names exist in more than one of these groups (e.g. "deployments.apps"" and "deployments.extensions"),
// the order of this list determines which group an unqualified resource name (e.g. "deployments") should prefer.
// This priority order is used for local discovery, but it ends up aggregated in `k8s.io/kubernetes/cmd/kube-apiserver/app/aggregator.go
// with specific priorities.
// TODO: describe the priority all the way down in the RESTStorageProviders and plumb it back through the various discovery
// handlers that we have.
restStorageProviders := []RESTStorageProvider{
auditregistrationrest.RESTStorageProvider{},
authenticationrest.RESTStorageProvider{Authenticator: c.GenericConfig.Authentication.Authenticator, APIAudiences: c.GenericConfig.Authentication.APIAudiences},
authorizationrest.RESTStorageProvider{Authorizer: c.GenericConfig.Authorization.Authorizer, RuleResolver: c.GenericConfig.RuleResolver},
autoscalingrest.RESTStorageProvider{},
batchrest.RESTStorageProvider{},
certificatesrest.RESTStorageProvider{},
coordinationrest.RESTStorageProvider{},
discoveryrest.StorageProvider{},
extensionsrest.RESTStorageProvider{},
networkingrest.RESTStorageProvider{},
noderest.RESTStorageProvider{},
policyrest.RESTStorageProvider{},
rbacrest.RESTStorageProvider{Authorizer: c.GenericConfig.Authorization.Authorizer},
schedulingrest.RESTStorageProvider{},
settingsrest.RESTStorageProvider{},
storagerest.RESTStorageProvider{},
flowcontrolrest.RESTStorageProvider{},
// keep apps after extensions so legacy clients resolve the extensions versions of shared resource names.
// See https://github.com/kubernetes/kubernetes/issues/42392
appsrest.RESTStorageProvider{},
admissionregistrationrest.RESTStorageProvider{},
eventsrest.RESTStorageProvider{TTL: c.ExtraConfig.EventTTL},
}
if err := m.InstallAPIs(c.ExtraConfig.APIResourceConfigSource, c.GenericConfig.RESTOptionsGetter, restStorageProviders...); err != nil {
return nil, err
}
if c.ExtraConfig.Tunneler != nil {
m.installTunneler(c.ExtraConfig.Tunneler, corev1client.NewForConfigOrDie(c.GenericConfig.LoopbackClientConfig).Nodes())
}
m.GenericAPIServer.AddPostStartHookOrDie("start-cluster-authentication-info-controller", func(hookContext genericapiserver.PostStartHookContext) error {
kubeClient, err := kubernetes.NewForConfig(hookContext.LoopbackClientConfig)
if err != nil {
return err
}
controller := clusterauthenticationtrust.NewClusterAuthenticationTrustController(m.ClusterAuthenticationInfo, kubeClient)
// prime values and start listeners
if m.ClusterAuthenticationInfo.ClientCA != nil {
if notifier, ok := m.ClusterAuthenticationInfo.ClientCA.(dynamiccertificates.Notifier); ok {
notifier.AddListener(controller)
}
if controller, ok := m.ClusterAuthenticationInfo.ClientCA.(dynamiccertificates.ControllerRunner); ok {
// runonce to be sure that we have a value.
if err := controller.RunOnce(); err != nil {
runtime.HandleError(err)
}
go controller.Run(1, hookContext.StopCh)
}
}
if m.ClusterAuthenticationInfo.RequestHeaderCA != nil {
if notifier, ok := m.ClusterAuthenticationInfo.RequestHeaderCA.(dynamiccertificates.Notifier); ok {
notifier.AddListener(controller)
}
if controller, ok := m.ClusterAuthenticationInfo.RequestHeaderCA.(dynamiccertificates.ControllerRunner); ok {
// runonce to be sure that we have a value.
if err := controller.RunOnce(); err != nil {
runtime.HandleError(err)
}
go controller.Run(1, hookContext.StopCh)
}
}
go controller.Run(1, hookContext.StopCh)
return nil
})
return m, nil
}
KubeAPIServer的new方法和ExtensionAPIServer的New方法大致流程一样, 不过这里传入了ExtensionApiServer的GenericAPIServer:
AppregatorConfig提供了自动注册控制器,将传入进来的KubeApiServer监听的路径自动注册到APIServices中, 后续提供APIServer的同步功能。
func createAggregatorServer(aggregatorConfig *aggregatorapiserver.Config, delegateAPIServer genericapiserver.DelegationTarget, apiExtensionInformers apiextensionsinformers.SharedInformerFactory) (*aggregatorapiserver.APIAggregator, error) {
aggregatorServer, err := aggregatorConfig.Complete().NewWithDelegate(delegateAPIServer)
if err != nil {
return nil, err
}
// create controllers for auto-registration
apiRegistrationClient, err := apiregistrationclient.NewForConfig(aggregatorConfig.GenericConfig.LoopbackClientConfig)
if err != nil {
return nil, err
}
autoRegistrationController := autoregister.NewAutoRegisterController(aggregatorServer.APIRegistrationInformers.Apiregistration().V1().APIServices(), apiRegistrationClient)
apiServices := apiServicesToRegister(delegateAPIServer, autoRegistrationController)
crdRegistrationController := crdregistration.NewCRDRegistrationController(
apiExtensionInformers.Apiextensions().InternalVersion().CustomResourceDefinitions(),
autoRegistrationController)
err = aggregatorServer.GenericAPIServer.AddPostStartHook("kube-apiserver-autoregistration", func(context genericapiserver.PostStartHookContext) error {
go crdRegistrationController.Run(5, context.StopCh)
go func() {
// let the CRD controller process the initial set of CRDs before starting the autoregistration controller.
// this prevents the autoregistration controller's initial sync from deleting APIServices for CRDs that still exist.
// we only need to do this if CRDs are enabled on this server. We can't use discovery because we are the source for discovery.
if aggregatorConfig.GenericConfig.MergedResourceConfig.AnyVersionForGroupEnabled("apiextensions.k8s.io") {
crdRegistrationController.WaitForInitialSync()
}
autoRegistrationController.Run(5, context.StopCh)
}()
return nil
})
if err != nil {
return nil, err
}
err = aggregatorServer.GenericAPIServer.AddBootSequenceHealthChecks(
makeAPIServiceAvailableHealthCheck(
"autoregister-completion",
apiServices,
aggregatorServer.APIRegistrationInformers.Apiregistration().V1().APIServices(),
),
)
if err != nil {
return nil, err
}
return aggregatorServer, nil
}
运行服务分为两步,PrepareRun和Run。
func (s *APIAggregator) PrepareRun() (preparedAPIAggregator, error) {
// add post start hook before generic PrepareRun in order to be before /healthz installation
if s.openAPIConfig != nil {
s.GenericAPIServer.AddPostStartHookOrDie("apiservice-openapi-controller", func(context genericapiserver.PostStartHookContext) error {
go s.openAPIAggregationController.Run(context.StopCh)
return nil
})
}
prepared := s.GenericAPIServer.PrepareRun()
// delay OpenAPI setup until the delegate had a chance to setup their OpenAPI handlers
if s.openAPIConfig != nil {
specDownloader := openapiaggregator.NewDownloader()
openAPIAggregator, err := openapiaggregator.BuildAndRegisterAggregator(
&specDownloader,
s.GenericAPIServer.NextDelegate(),
s.GenericAPIServer.Handler.GoRestfulContainer.RegisteredWebServices(),
s.openAPIConfig,
s.GenericAPIServer.Handler.NonGoRestfulMux)
if err != nil {
return preparedAPIAggregator{}, err
}
s.openAPIAggregationController = openapicontroller.NewAggregationController(&specDownloader, openAPIAggregator)
}
return preparedAPIAggregator{APIAggregator: s, runnable: prepared}, nil
}
这里比较关键的是s.GenericApiServer.PrepareRun方法:
func (s *GenericAPIServer) PrepareRun() preparedGenericAPIServer {
s.delegationTarget.PrepareRun()
if s.openAPIConfig != nil {
s.OpenAPIVersionedService, s.StaticOpenAPISpec = routes.OpenAPI{
Config: s.openAPIConfig,
}.Install(s.Handler.GoRestfulContainer, s.Handler.NonGoRestfulMux)
}
s.installHealthz()
s.installLivez()
err := s.addReadyzShutdownCheck(s.readinessStopCh)
if err != nil {
klog.Errorf("Failed to install readyz shutdown check %s", err)
}
s.installReadyz()
// Register audit backend preShutdownHook.
if s.AuditBackend != nil {
err := s.AddPreShutdownHook("audit-backend", func() error {
s.AuditBackend.Shutdown()
return nil
})
if err != nil {
klog.Errorf("Failed to add pre-shutdown hook for audit-backend %s", err)
}
}
return preparedGenericAPIServer{s}
}
这里实例化了所有的http server。
func (s preparedGenericAPIServer) Run(stopCh <-chan struct{}) error {
delayedStopCh := make(chan struct{})
go func() {
defer close(delayedStopCh)
<-stopCh
time.Sleep(s.ShutdownDelayDuration)
}()
// close socket after delayed stopCh
err := s.NonBlockingRun(delayedStopCh)
if err != nil {
return err
}
<-stopCh
// run shutdown hooks directly. This includes deregistering from the kubernetes endpoint in case of kube-apiserver.
err = s.RunPreShutdownHooks()
if err != nil {
return err
}
// wait for the delayed stopCh before closing the handler chain (it rejects everything after Wait has been called).
<-delayedStopCh
// Wait for all requests to finish, which are bounded by the RequestTimeout variable.
s.HandlerChainWaitGroup.Wait()
return nil
}
调用了NonBlockingRun方法,实例化http server,然后会读取stopCh,如果没有收到stopCh的信号,会始终在这里等待。如果收到停止信息,则进行善后处理。
func (s preparedGenericAPIServer) NonBlockingRun(stopCh <-chan struct{}) error {
// Use an stop channel to allow graceful shutdown without dropping audit events
// after http server shutdown.
auditStopCh := make(chan struct{})
// Start the audit backend before any request comes in. This means we must call Backend.Run
// before http server start serving. Otherwise the Backend.ProcessEvents call might block.
if s.AuditBackend != nil {
if err := s.AuditBackend.Run(auditStopCh); err != nil {
return fmt.Errorf("failed to run the audit backend: %v", err)
}
}
// Use an internal stop channel to allow cleanup of the listeners on error.
internalStopCh := make(chan struct{})
var stoppedCh <-chan struct{}
if s.SecureServingInfo != nil && s.Handler != nil {
var err error
stoppedCh, err = s.SecureServingInfo.Serve(s.Handler, s.ShutdownTimeout, internalStopCh)
if err != nil {
close(internalStopCh)
close(auditStopCh)
return err
}
}
// Now that listener have bound successfully, it is the
// responsibility of the caller to close the provided channel to
// ensure cleanup.
go func() {
<-stopCh
close(s.readinessStopCh)
close(internalStopCh)
if stoppedCh != nil {
<-stoppedCh
}
s.HandlerChainWaitGroup.Wait()
close(auditStopCh)
}()
s.RunPostStartHooks(stopCh)
if _, err := systemd.SdNotify(true, "READY=1\n"); err != nil {
klog.Errorf("Unable to send systemd daemon successful start message: %v\n", err)
}
return nil
}
其中比较重要的方法是: stoppedCh, err = s.SecureServingInfo.Serve(s.Handler, s.ShutdownTimeout, internalStopCh)
func (s *SecureServingInfo) Serve(handler http.Handler, shutdownTimeout time.Duration, stopCh <-chan struct{}) (<-chan struct{}, error) {
if s.Listener == nil {
return nil, fmt.Errorf("listener must not be nil")
}
tlsConfig, err := s.tlsConfig(stopCh)
if err != nil {
return nil, err
}
secureServer := &http.Server{
Addr: s.Listener.Addr().String(),
Handler: handler,
MaxHeaderBytes: 1 << 20,
TLSConfig: tlsConfig,
}
// At least 99% of serialized resources in surveyed clusters were smaller than 256kb.
// This should be big enough to accommodate most API POST requests in a single frame,
// and small enough to allow a per connection buffer of this size multiplied by `MaxConcurrentStreams`.
const resourceBody99Percentile = 256 * 1024
http2Options := &http2.Server{}
// shrink the per-stream buffer and max framesize from the 1MB default while still accommodating most API POST requests in a single frame
http2Options.MaxUploadBufferPerStream = resourceBody99Percentile
http2Options.MaxReadFrameSize = resourceBody99Percentile
// use the overridden concurrent streams setting or make the default of 250 explicit so we can size MaxUploadBufferPerConnection appropriately
if s.HTTP2MaxStreamsPerConnection > 0 {
http2Options.MaxConcurrentStreams = uint32(s.HTTP2MaxStreamsPerConnection)
} else {
http2Options.MaxConcurrentStreams = 250
}
// increase the connection buffer size from the 1MB default to handle the specified number of concurrent streams
http2Options.MaxUploadBufferPerConnection = http2Options.MaxUploadBufferPerStream * int32(http2Options.MaxConcurrentStreams)
if !s.DisableHTTP2 {
// apply settings to the server
if err := http2.ConfigureServer(secureServer, http2Options); err != nil {
return nil, fmt.Errorf("error configuring http2: %v", err)
}
}
klog.Infof("Serving securely on %s", secureServer.Addr)
return RunServer(secureServer, s.Listener, shutdownTimeout, stopCh)
}
实例化http.Server, 设置各种参数,然后调用了RunServer方法:
func RunServer(
server *http.Server,
ln net.Listener,
shutDownTimeout time.Duration,
stopCh <-chan struct{},
) (<-chan struct{}, error) {
if ln == nil {
return nil, fmt.Errorf("listener must not be nil")
}
// Shutdown server gracefully.
stoppedCh := make(chan struct{})
go func() {
defer close(stoppedCh)
<-stopCh
ctx, cancel := context.WithTimeout(context.Background(), shutDownTimeout)
server.Shutdown(ctx)
cancel()
}()
go func() {
defer utilruntime.HandleCrash()
var listener net.Listener
listener = tcpKeepAliveListener{ln.(*net.TCPListener)}
if server.TLSConfig != nil {
listener = tls.NewListener(listener, server.TLSConfig)
}
err := server.Serve(listener)
msg := fmt.Sprintf("Stopped listening on %s", ln.Addr().String())
select {
case <-stopCh:
klog.Info(msg)
default:
panic(fmt.Sprintf("%s due to error: %v", msg, err))
}
}()
return stoppedCh, nil
}
终于看到了err := server.Serve(listener)方法,这里启动对应的服务,监听对应的接口,通过前面传入的Handler实现对应接口请求的处理。
apiserver是k8s中很重要的一个组件,对应源码也比较多,零零散散终于看完了,其中需要对k8s的各个组件有全面的认识之后,在来看源码应当有更好的理解。这篇主要是对启动流程跟着源码进行了解读,下篇将对apiserver的请求流程进行梳理。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。