Commit 7943924

uneet7 authored and YashwantGohokar committed
Enable lustre driver and add the relevant documentation
1 parent 9a5427b commit 7943924

File tree

13 files changed (+340, -14 lines)


Makefile

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ else
 VERSION ?= ${VERSION}
 endif
 
-RELEASE = v1.31.1
+RELEASE = v1.31.2
 
 GOOS ?= linux
 ARCH ?= amd64

README.md

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ cloud-provider specific code out of the Kubernetes codebase.
 | v1.29.2 | v1.29 | - |
 | v1.30.1 | v1.30 | - |
 | v1.31.1 | v1.31 | - |
+| v1.31.2 | v1.31 | - |
 
 
 Note:

container-storage-interface.md

Lines changed: 4 additions & 0 deletions
@@ -199,6 +199,10 @@ Check if PVC is now in bound state:
 $ kubectl describe pvc/oci-bv-claim
 ```
 
+## PVCs with Lustre File System
+
+Instructions for provisioning PVCs on the File Storage with Lustre service can be found [here](docs/pvcs-with-lustre.md).
+
 # Troubleshoot
 
 ## FsGroup policy not propagated from pod security context

docs/pvcs-with-lustre-using-csi.md

Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@

# Provisioning PVCs on the File Storage with Lustre Service

The Oracle Cloud Infrastructure File Storage with Lustre service is a fully managed storage service designed to meet the demands of AI/ML training and inference, and of high-performance computing. You use the Lustre CSI plugin to connect clusters to file systems in the File Storage with Lustre service.

You can use the File Storage with Lustre service to provision persistent volume claims (PVCs) by manually creating a file system in the File Storage with Lustre service, then defining and creating a persistent volume (PV) backed by the new file system, and finally defining a new PVC. When you create the PVC, Kubernetes binds the PVC to the PV backed by the File Storage with Lustre service.

The Lustre CSI driver is the overall software that enables Lustre file systems to be used with Kubernetes via the Container Storage Interface (CSI). The Lustre CSI plugin is a specific component within the driver, responsible for interacting with the Kubernetes API server and managing the lifecycle of Lustre volumes.

Note the following:

- The Lustre CSI driver is supported on Oracle Linux 8 (x86) and on Ubuntu 22.04 (x86).
- To use a Lustre file system with a Kubernetes cluster, the Lustre client package must be installed on every worker node that mounts the file system (see the quick check sketched after this list). For more information about Lustre clients, see [Mounting and Accessing a Lustre File System](https://docs.oracle.com/iaas/Content/lustre/file-system-connect.htm).

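A quick way to confirm that the Lustre client tooling is in place on a worker node is sketched below; it assumes the standard Lustre client packaging, so adjust the checks for your distribution:

```bash
# Run on a worker node that will mount the Lustre file system.
command -v mount.lustre   # prints the path to the Lustre mount helper if the client tools are installed
lsmod | grep -i lustre    # lists the Lustre kernel modules if they are currently loaded
```
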
## Provisioning a PVC on an Existing File System

To create a PVC on an existing file system in the File Storage with Lustre service (using Oracle-managed encryption keys to encrypt data at rest):

1. Create a file system in the File Storage with Lustre service, selecting the Encrypt using Oracle-managed keys encryption option. See [Creating a Lustre File System](https://docs.oracle.com/iaas/Content/lustre/file-system-create.htm).

2. Create security rules in either a network security group (recommended) or a security list, for both the Lustre file system and the cluster's worker node subnet. The security rules to create depend on the relative network locations of the Lustre file system and the worker nodes, which act as the clients.

The possible scenarios, the security rules to create, and where to create them are fully described in the File Storage with Lustre service documentation (see [Required VCN Security Rules](https://docs.oracle.com/iaas/Content/lustre/security-rules.htm)).

3. Create a PV backed by the file system in the File Storage with Lustre service, as follows:

a. Create a manifest file to define a PV and, in the `csi:` section, set:

- `driver` to `lustre.csi.oraclecloud.com`
- `volumeHandle` to `<MGSAddress>@<LNetName>:/<MountName>`, where:
  - `<MGSAddress>` is the Management service address of the file system in the File Storage with Lustre service
  - `<LNetName>` is the LNet network name of the file system in the File Storage with Lustre service
  - `<MountName>` is the mount name specified when creating the file system in the File Storage with Lustre service

  For example: `10.0.2.6@tcp:/testlustrefs`

- `fsType` to `lustre`
- (optional, but recommended) `volumeAttributes.setupLnet` to `"true"` if you want the Lustre CSI driver to perform LNet (Lustre Networking) setup before mounting the file system
- (required) `volumeAttributes.lustreSubnetCidr` to the CIDR block of the subnet that contains the worker node VNIC with access to the Lustre file system (typically the worker node subnet in a default setup), to ensure the worker node has network connectivity to the Lustre file system. For example, `10.0.2.0/24`. (See the combined `volumeAttributes` sketch after this list.)
- (optional) `volumeAttributes.lustrePostMountParameters` to set Lustre parameters. For example:

  ```yaml
  volumeAttributes:
    lustrePostMountParameters: '[{"*.*.*MDT*.lru_size": 11200},{"at_history" : 600}]'
  ```

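The sketch below shows how these attributes fit together in the `csi:` section of a PV. It is illustrative only: the values are the example values used on this page, and the full example manifest that follows shows the minimal case with `setupLnet` only.

```yaml
csi:
  driver: lustre.csi.oraclecloud.com
  volumeHandle: "10.0.2.6@tcp:/testlustrefs"
  fsType: lustre
  volumeAttributes:
    setupLnet: "true"
    # Required: CIDR of the subnet containing the worker node VNIC that reaches the Lustre file system.
    lustreSubnetCidr: "10.0.2.0/24"
    # Optional: Lustre parameters to apply, as described in the list above.
    lustrePostMountParameters: '[{"*.*.*MDT*.lru_size": 11200},{"at_history" : 600}]'
```
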
For example, the following manifest file (named `lustre-pv-example.yaml`) defines a PV called `lustre-pv-example` backed by a Lustre file system:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: lustre-pv-example
spec:
  capacity:
    storage: 31Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: lustre.csi.oraclecloud.com
    volumeHandle: "10.0.2.6@tcp:/testlustrefs"
    fsType: lustre
    volumeAttributes:
      setupLnet: "true"
```

b. Create the PV from the manifest file by entering:

```bash
kubectl apply -f <filename>
```

For example:

```bash
kubectl apply -f lustre-pv-example.yaml
```

c. Verify that the PV has been created successfully by entering:

```bash
kubectl get pv <pv-name>
```

For example:

```bash
kubectl get pv lustre-pv-example
```

Example output:

```
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM   STORAGECLASS   REASON   AGE
lustre-pv-example   31Ti       RWX            Retain           Bound                                    56m
```

4. Create a PVC that is provisioned by the PV you have created, as follows:

a. Create a manifest file to define the PVC and set:

- `storageClassName` to `""`

  **Note:** You must specify an empty value for `storageClassName`, even though storage class is not applicable in the case of static provisioning of persistent storage. If you do not specify an empty value for `storageClassName`, the default storage class (`oci-bv`) is used, which causes an error.

- `volumeName` to the name of the PV you created (for example, `lustre-pv-example`)

For example, the following manifest file (named `lustre-pvc-example.yaml`) defines a PVC named `lustre-pvc-example` that will bind to a PV named `lustre-pv-example`:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lustre-pvc-example
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: lustre-pv-example
  resources:
    requests:
      storage: 31Ti
```

**Note:** The `requests: storage:` element must be present in the PVC's manifest file, and its value must match the value specified for the `capacity: storage:` element in the PV's manifest file. Apart from that, the value of the `requests: storage:` element is ignored.

b. Create the PVC from the manifest file by entering:

```bash
kubectl apply -f <filename>
```

For example:

```bash
kubectl apply -f lustre-pvc-example.yaml
```

c. Verify that the PVC has been created and bound to the PV successfully by entering:

```bash
kubectl get pvc <pvc-name>
```

For example:

```bash
kubectl get pvc lustre-pvc-example
```

Example output:

```
NAME                 STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
lustre-pvc-example   Bound    lustre-pv-example   31Ti       RWX                           57m
```

The PVC is bound to the PV backed by the File Storage with Lustre service file system. Data is encrypted at rest, using encryption keys managed by Oracle.

5. Use the new PVC when creating other objects, such as deployments. For example:

a. Create a manifest named `lustre-app-example-deployment.yaml` to define a deployment named `lustre-app-example-deployment` that uses the `lustre-pvc-example` PVC, as follows:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lustre-app-example-deployment
spec:
  selector:
    matchLabels:
      app: lustre-app-example
  replicas: 2
  template:
    metadata:
      labels:
        app: lustre-app-example
    spec:
      containers:
        - args:
            - -c
            - while true; do echo $(date -u) >> /lustre/data/out.txt; sleep 60; done
          command:
            - /bin/sh
          image: busybox:latest
          imagePullPolicy: Always
          name: lustre-app-example
          volumeMounts:
            - mountPath: /lustre/data
              name: lustre-volume
      restartPolicy: Always
      volumes:
        - name: lustre-volume
          persistentVolumeClaim:
            claimName: lustre-pvc-example
```

b. Create the deployment from the manifest file by entering:

```bash
kubectl apply -f lustre-app-example-deployment.yaml
```

c. Verify that the deployment pods have been created successfully and are running by entering:

```bash
kubectl get pods
```

Example output:

```
NAME                                             READY   STATUS    RESTARTS   AGE
lustre-app-example-deployment-7767fdff86-nd75n   1/1     Running   0          8h
lustre-app-example-deployment-7767fdff86-wmxlh   1/1     Running   0          8h
```

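To confirm that the pods are writing to the shared Lustre volume, you can read back the file they append to. This is a sketch: the pod name is taken from the example output above and will differ in your cluster.

```bash
# tail is available in the busybox image used by the example deployment
kubectl exec lustre-app-example-deployment-7767fdff86-nd75n -- tail -n 5 /lustre/data/out.txt
```
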

## Provisioning a PVC on an Existing File System with Mount Options

You can optimize performance and control access to an existing Lustre file system by specifying mount options for the PV. Mount options enable you to fine-tune how pods interact with the file system.

To include mount options:

1. Start by following the instructions in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system).

2. In the PV manifest described in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system), add the `spec.mountOptions` field, which specifies how the PV is mounted by pods.

For example, in the `lustre-pv-example.yaml` manifest file shown in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system), you can include the `mountOptions` field as follows:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: lustre-pv-example
spec:
  capacity:
    storage: 31Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - ro
  csi:
    driver: lustre.csi.oraclecloud.com
    volumeHandle: "10.0.2.6@tcp:/testlustrefs"
    fsType: lustre
    volumeAttributes:
      setupLnet: "true"
```

In this example, the `mountOptions` field is set to `ro`, indicating that pods have read-only access to the file system. For more information about PV mount options, see [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) in the Kubernetes documentation.

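To check the effect of the `ro` option, you can attempt a write through a pod that mounts a PVC bound to this PV; the write typically fails with a "Read-only file system" error. The pod name and mount path below are illustrative.

```bash
kubectl exec <pod-name> -- sh -c 'touch /lustre/data/readonly-check'
```
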
## Encrypting Data At Rest on an Existing File System

The File Storage with Lustre service always encrypts data at rest, using Oracle-managed encryption keys by default. However, you can instead specify at-rest encryption using your own master encryption keys, which you manage yourself in the Vault service.

For more information about creating File Storage with Lustre file systems that use Oracle-managed encryption keys or your own master encryption keys, see [Updating File System Encryption](https://docs.oracle.com/iaas/Content/lustre/file-system-encryption.htm).

manifests/cloud-controller-manager/oci-cloud-controller-manager.yaml

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ spec:
 path: /etc/kubernetes
 containers:
 - name: oci-cloud-controller-manager
-image: ghcr.io/oracle/cloud-provider-oci:v1.31.1
+image: ghcr.io/oracle/cloud-provider-oci:v1.31.2
 command: ["/usr/local/bin/oci-cloud-controller-manager"]
 args:
 - --cloud-config=/etc/oci/cloud-provider.yaml

manifests/container-storage-interface/csi/templates/oci-csi-controller-driver.yaml

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ spec:
 - --fss-csi-endpoint=unix://var/run/shared-tmpfs/csi-fss.sock
 command:
 - /usr/local/bin/oci-csi-controller-driver
-image: ghcr.io/oracle/cloud-provider-oci:v1.31.1
+image: ghcr.io/oracle/cloud-provider-oci:v1.31.2
 imagePullPolicy: IfNotPresent
 env:
 - name: BLOCK_VOLUME_DRIVER_NAME

manifests/container-storage-interface/csi/templates/oci-csi-node-driver.yaml

Lines changed: 40 additions & 1 deletion
@@ -13,6 +13,15 @@ metadata:
 spec:
 fsGroupPolicy: File
 ---
+apiVersion: storage.k8s.io/v1
+kind: CSIDriver
+metadata:
+name: {{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com
+spec:
+attachRequired: false
+podInfoOnMount: false
+fsGroupPolicy: File
+---
 kind: ConfigMap
 apiVersion: v1
 metadata:
@@ -107,6 +116,7 @@ spec:
 - --nodeid=$(KUBE_NODE_NAME)
 - --loglevel=debug
 - --fss-endpoint=unix:///fss/csi.sock
+- --lustre-endpoint=unix:///lustre/csi.sock
 command:
 - /usr/local/bin/oci-csi-node-driver
 env:
@@ -117,11 +127,15 @@ spec:
 fieldPath: spec.nodeName
 - name: PATH
 value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/host/usr/bin:/host/sbin
+- name: LUSTRE_DRIVER_ENABLED
+value: "true"
 - name: BLOCK_VOLUME_DRIVER_NAME
 value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}blockvolume.csi.oraclecloud.com"
 - name: FSS_VOLUME_DRIVER_NAME
 value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}fss.csi.oraclecloud.com"
-image: ghcr.io/oracle/cloud-provider-oci:v1.31.1
+- name: LUSTRE_VOLUME_DRIVER_NAME
+value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com"
+image: ghcr.io/oracle/cloud-provider-oci:v1.31.2
 securityContext:
 privileged: true
 volumeMounts:
@@ -152,6 +166,8 @@ spec:
 - mountPath: /sbin/mount
 name: fss-driver-mounts
 subPath: mount
+- mountPath: /lustre
+name: lustre-plugin-dir
 - name: csi-node-registrar
 args:
 - --csi-address=/csi/csi.sock
@@ -190,6 +206,25 @@ spec:
 name: fss-plugin-dir
 - mountPath: /registration
 name: registration-dir
+- name: csi-node-registrar-lustre
+args:
+- --csi-address=/lustre/csi.sock
+- --kubelet-registration-path=/var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com/csi.sock
+image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.12.0
+securityContext:
+privileged: true
+lifecycle:
+preStop:
+exec:
+command:
+- /bin/sh
+- -c
+- rm -rf /registration/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com /registration/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com-reg.sock
+volumeMounts:
+- mountPath: /lustre
+name: lustre-plugin-dir
+- mountPath: /registration
+name: registration-dir
 dnsPolicy: ClusterFirst
 hostNetwork: true
 restartPolicy: Always
@@ -212,6 +247,10 @@ spec:
 path: /var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}fss.csi.oraclecloud.com
 type: DirectoryOrCreate
 name: fss-plugin-dir
+- hostPath:
+path: /var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com
+type: DirectoryOrCreate
+name: lustre-plugin-dir
 - hostPath:
 path: /var/lib/kubelet
 type: Directory
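
Once the updated manifests are applied, a quick way to confirm that the Lustre driver is registered (assuming the default configuration, with no `customHandle`) is to list the corresponding CSIDriver object:

```bash
kubectl get csidriver lustre.csi.oraclecloud.com
```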

manifests/container-storage-interface/oci-csi-controller-driver.yaml

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ spec:
 - --fss-csi-endpoint=unix://var/run/shared-tmpfs/csi-fss.sock
 command:
 - /usr/local/bin/oci-csi-controller-driver
-image: ghcr.io/oracle/cloud-provider-oci:v1.31.1
+image: ghcr.io/oracle/cloud-provider-oci:v1.31.2
 imagePullPolicy: IfNotPresent
 volumeMounts:
 - name: config

0 commit comments

Comments
 (0)