学习记录---Kubernetes的资源指标管道-metrics api的安装部署

2023-12-13 11:45:30

一、简介

Metrics API,为我们的k8s集群提供了一组基本的指标(资源的cpu和内存),我们可以通过metrics api来对我们的pod开展HPA和VPA操作(主要通过在pod中对cpu和内存的限制实现动态扩展),也可以通过kubectl top的方式,获取k8s中node和pod的cpu及内存使用情况。
node资源分配情况和使用情况:
在这里插入图片描述

pod资源分配情况和使用情况:
在这里插入图片描述

二、使用和部署

从kubernetes的官网上来看,k8s初始化后默认是没有提供metrics api服务的,如果需要使用该服务,则需要部署metrics-server。

针对metrics-server的部署方式,我们可以直接从官网的超链接中获取,而该项目则是在github的kubernetes项目中:

https://github.com/kubernetes-sigs/metrics-server/tree/master

在该项目中有比较明确的安装步骤,该项目安装比较简单,直接用其yaml进行apply即可(这里我们选择使用高可用模式,高可用模式其实就是多使用了几个replicas):

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml

我们先把yaml下载下来看看:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                k8s-app: metrics-server
            namespaces:
            - kube-system
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        #新增,使其不验证k8s提供的ca证书
        - --kubelet-insecure-tls
        #修改,修改为我们可以下载的镜像,如果有私有仓库也可以用私有仓库的地址
        #image: registry.k8s.io/metrics-server/metrics-server:v0.6.4
        image: bitnami/metrics-server:0.6.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: metrics-server
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

这里,我们修改了两个地方:

1) 在deployment中的container的arg参数中加入:- --kubelet-insecure-tls
2) 修改镜像资源

修改镜像资源主要是方便我们可以顺利pull镜像;
加入kubelet-insecure-tls的主要目的是metrics-server不验证k8s提供的ca证书,如果不加该参数,则可能会导致出现以下问题:

pod状态:
在这里插入图片描述

查看描述,会出现Readiness探针异常(Readiness probe failed:Http probe failed with statuscode: 500)
在这里插入图片描述

查看日志,出现以下错误:
在这里插入图片描述

“Failed to scrape node” err=“Get “https://xxx.xxx.xxx.xxx:10250/metrics/resource”: tls: failed to verify certificate: x509: cannot validate certificate for xxx.xxx.xxx.xxx because it doesn’t contain any IP SANs” node=“k8s-slave3”

正常状态:
在这里插入图片描述

文章来源:https://blog.csdn.net/wx370092877/article/details/134843516
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。