lustre-csi-driver mount.lustre error: is already mounted #149

Closed
roehrich-hpe opened this issue Apr 11, 2024 · 1 comment

The lustre-csi-driver should be able to recognize when a mount has already completed, and should return without an error.

Here is the 'kubectl describe' event:

Events:
  Type     Reason       Age                  From     Message
  ----     ------       ----                 ----     -------
  Warning  FailedMount  3m9s (x44 over 90m)  kubelet  MountVolume.SetUp failed for volume "lustre4-nnf-dm-system-readwritemany-pv" : rpc error: code = Internal desc = NodePublishVolume - Mount Failed: Error mount failed: exit status 17
Mounting command: mount
Mounting arguments: -t lustre 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount
Output: mount.lustre: according to /etc/mtab 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 is already mounted on /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount
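
For illustration, here is a minimal Go sketch of how a driver might treat this particular failure as success. The exit status 17 matches the Linux errno value EEXIST; that mount.lustre always reports an existing mount this way is an assumption based only on the event above. The names `exitAlreadyMounted` and `alreadyMounted` are hypothetical and are not taken from the lustre-csi-driver source.

```go
package main

import (
	"fmt"
	"strings"
)

// exitAlreadyMounted is the exit status seen in the event above. It equals
// the Linux errno value EEXIST (17). Assumption: mount.lustre uses it
// whenever the file system is already mounted on the target.
const exitAlreadyMounted = 17

// alreadyMounted reports whether a failed mount attempt failed only because
// source is already mounted on target. It checks both the exit status and
// the message text, so an unrelated failure is never mistaken for success.
func alreadyMounted(exitCode int, output, source, target string) bool {
	if exitCode != exitAlreadyMounted {
		return false
	}
	want := source + " is already mounted on " + target
	// Match the end of the message so that a longer path that merely starts
	// with target does not count.
	return strings.HasSuffix(strings.TrimSpace(output), want)
}

func main() {
	src := "2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4"
	tgt := "/var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount"
	out := "mount.lustre: according to /etc/mtab " + src + " is already mounted on " + tgt

	if alreadyMounted(17, out, src, tgt) {
		fmt.Println("mount already in place; NodePublishVolume can return success")
	}
}
```

Checking the message as well as the exit status is a deliberate choice here: EEXIST alone could, in principle, come from some other condition.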

And the node's mount output:

# mount | grep lustre
2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 on /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount type lustre (rw,checksum,flock,nouser_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt)
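
An alternative is to check the mount table before calling mount at all. The Go sketch below parses text in the same `SOURCE on TARGET type FSTYPE (OPTIONS)` layout as the `mount` output above. The function name `isMounted` is hypothetical, and the parsing is simplified: it assumes neither the source nor the target contains the substrings " on " or " type ".

```go
package main

import (
	"fmt"
	"strings"
)

// isMounted reports whether mountOutput (text in the format printed by the
// mount command) contains an entry with exactly this source and target.
func isMounted(mountOutput, source, target string) bool {
	for _, line := range strings.Split(mountOutput, "\n") {
		// Each entry looks like: SOURCE on TARGET type FSTYPE (OPTIONS)
		src, rest, ok := strings.Cut(line, " on ")
		if !ok {
			continue
		}
		tgt, _, ok := strings.Cut(rest, " type ")
		if !ok {
			continue
		}
		if src == source && tgt == target {
			return true
		}
	}
	return false
}

func main() {
	src := "2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4"
	tgt := "/var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount"
	table := src + " on " + tgt + " type lustre (rw,checksum,flock,nouser_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt)\n"

	fmt.Println(isMounted(table, src, tgt))
}
```

With a check like this, a repeated NodePublishVolume request for the same source and target could return early instead of invoking mount again.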

Here are the pod and node summaries:

$ kubectl get pod -n nnf-dm-system nnf-dm-worker-9bhxv -o wide
NAME                  READY   STATUS              RESTARTS   AGE    IP       NODE       NOMINATED NODE   READINESS GATES
nnf-dm-worker-9bhxv   0/2     ContainerCreating   0          103m   <none>   elcap317   <none>           <none>

$ kubectl get node elcap317
NAME       STATUS   ROLES    AGE    VERSION
elcap317   Ready    <none>   102m   v1.29.3

Here's the CSI driver:

$ kubectl get pods -n lustre-csi-system -o wide | grep elcap317
lustre-csi-node-dms74   2/2     Running   0          107m    10.85.148.130   elcap317   <none>           <none>

The CSI driver log shows one successful mount followed by many more mount attempts:

time="2024-04-11T17:03:01Z" level=info msg=Mounted source="2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4" target="/var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount"

[and the rest is a continuous repetition of the following...]

time="2024-04-11T17:05:13Z" level=debug msg="/csi.v1.Node/NodePublishVolume: REQ 0011: VolumeId=2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4, TargetPath=/var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount, VolumeCapability=mount:<fs_type:\"lustre\" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > , Readonly=false, XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
Mounting arguments: -t lustre 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount
Output: mount.lustre: according to /etc/mtab 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 is already mounted on /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount
time="2024-04-11T17:05:13Z" level=debug msg="Mounting arguments: -t lustre 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount"
time="2024-04-11T17:05:13Z" level=debug msg="Output: mount.lustre: according to /etc/mtab 2056@kfi4:2120@kfi4:2184@kfi4:2248@kfi4:/lustre4 is already mounted on /var/lib/kubelet/pods/c1ab40dc-3864-4938-b66a-7490abd6524c/volumes/kubernetes.io~csi/lustre4-nnf-dm-system-readwritemany-pv/mount"

@roehrich-hpe (Contributor, Author)

Fixed by HewlettPackard/lustre-csi-driver#71

github-project-automation bot moved this from 📋 Open to ✅ Closed in Issues Dashboard on Apr 12, 2024