
The available resources of the node do not show the virtualized gpu memory. #1469

@rxy0210

Description

What happened:

The available resources of the node do not show the virtualized gpu memory.

What you expected to happen:

Display the virtualized GPU memory as an allocatable resource on the node so that my pod can be scheduled; otherwise it will stay Pending forever.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
hami-yaml

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hami-ascend
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "update", "watch", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hami-ascend
subjects:
  - kind: ServiceAccount
    name: hami-ascend
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: hami-ascend
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hami-ascend
  namespace: kube-system
  labels:
    app.kubernetes.io/component: "hami-ascend"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hami-ascend-device-plugin
  namespace: kube-system
  labels:
    app.kubernetes.io/component: hami-ascend-device-plugin
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: hami-ascend-device-plugin
      hami.io/webhook: ignore
  template:
    metadata:
      labels:
        app.kubernetes.io/component: hami-ascend-device-plugin
        hami.io/webhook: ignore
    spec:
      priorityClassName: "system-node-critical"
      serviceAccountName: hami-ascend
      containers:
        - image: projecthami/ascend-device-plugin:v1.1.0
          imagePullPolicy: IfNotPresent
          name: device-plugin
          resources:
            requests:
              memory: 500Mi
              cpu: 500m
            limits:
              memory: 500Mi
              cpu: 500m
          args:
            - --config_file
            - /device-config.yaml
          securityContext:
            privileged: true
            readOnlyRootFilesystem: false
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
            - name: pod-resource
              mountPath: /var/lib/kubelet/pod-resources
            - name: hiai-driver
              mountPath: /usr/local/Ascend/driver
              readOnly: true
            - name: log-path
              mountPath: /var/log/mindx-dl/devicePlugin
            - name: tmp
              mountPath: /tmp
            - name: ascend-config
              mountPath: /device-config.yaml
              subPath: device-config.yaml
              readOnly: true
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: pod-resource
          hostPath:
            path: /var/lib/kubelet/pod-resources
        - name: hiai-driver
          hostPath:
            path: /usr/local/Ascend/driver
        - name: log-path
          hostPath:
            path: /var/log/mindx-dl/devicePlugin
            type: Directory
        - name: tmp
          hostPath:
            path: /tmp
        - name: ascend-config
          configMap:
            name: hami-scheduler-device
      nodeSelector:
        ascend: "on"

device-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: hami-scheduler-device
  namespace: kube-system
data:
  device-config.yaml: |-
    vnpus:
    - chipName: 310P3
      commonWord: Ascend310P
      resourceName: huawei.com/Ascend310P
      resourceMemoryName: huawei.com/Ascend310P-memory
      memoryAllocatable: 21527
      memoryCapacity: 24576
      aiCore: 8
      aiCPU: 7
      templates:
        - name: vir02
          memory: 6144
          aiCore: 2
          aiCPU: 2

Then the available huawei.com/Ascend310P resources on the node changed from 8 to 24:

  allocatable:
    cpu: "96"
    ephemeral-storage: "1700179318837"
    huawei.com/Ascend310P: "8"
    hugepages-2Mi: "0"
    memory: 526690192Ki
    pods: "110"
  capacity:
    cpu: "96"
    ephemeral-storage: 1844812632Ki
    huawei.com/Ascend310P: "8"
    hugepages-2Mi: "0"
    memory: 526792592Ki
    pods: "110"
->
  allocatable:
    cpu: "96"
    ephemeral-storage: "1700179318837"
    huawei.com/Ascend310P: "24"
    hugepages-2Mi: "0"
    memory: 526690192Ki
    pods: "110"
  capacity:
    cpu: "96"
    ephemeral-storage: 1844812632Ki
    huawei.com/Ascend310P: "24"
    hugepages-2Mi: "0"
    memory: 526792592Ki
    pods: "110"
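Based on the resourceMemoryName defined in the ConfigMap above, I expected the allocatable list to also contain the virtualized memory as an extended resource next to the device count, roughly like this (the value here is only an illustrative placeholder):

  allocatable:
    huawei.com/Ascend310P: "24"
    huawei.com/Ascend310P-memory: "<total virtualized memory>"   # expected, but missing on my node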

But when I created a pod based on the example, the resource huawei.com/Ascend310P-memory was not registered with the kubelet. As a result, all the Pods I create stay Pending. Did I do anything wrong?

    containers:
    - name: npu-pod
      ...
      resources:
        limits:
          huawei.com/Ascend910B: "1"
          # If no memory is specified, the whole card is used by default
          huawei.com/Ascend910B-memory: "4096"
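
For reference, the device-config above only defines the 310P resource names (huawei.com/Ascend310P and huawei.com/Ascend310P-memory), so a pod meant to land on this node would have to request those names rather than the Ascend910B ones. A minimal sketch, assuming the scheduler is otherwise set up for the 310P (the pod name, container name, and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: ascend310p-test              # placeholder name
spec:
  containers:
  - name: npu-container              # placeholder name
    image: my-npu-image:latest       # placeholder image
    resources:
      limits:
        huawei.com/Ascend310P: "1"             # one virtualized 310P device
        huawei.com/Ascend310P-memory: "6144"   # matches the vir02 template memory in the ConfigMap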
  • The output of nvidia-smi -a on your host
  • Your docker or containerd configuration file (e.g: /etc/docker/daemon.json)
  • The hami-device-plugin container logs
  • The hami-scheduler container logs
  • The kubelet logs on the node (e.g: sudo journalctl -r -u kubelet)
  • Any relevant kernel output lines from dmesg

Environment:

  • HAMi version:
  • nvidia driver or other AI device driver version:
  • Docker version from docker version
  • Docker command, image and tag used
  • Kernel version from uname -a
  • Others:

Labels: kind/bug (Something isn't working)