Unsolved
drb45
1 Rookie
•
15 Posts
0
June 10th, 2021 12:00
csi-powerscale fsGroup behavior changed from 1.3.0 to 1.4.0+
It seems that between csi-powerscale v1.3.0.1 and v1.4.0, fsGroup behavior changed in the CSI.
Typically having fsGroup set in a pod spec will cause a recursive chgrp to happen each time the volume is mounted (assuming "rootClientEnabled" is true in the StorageClass, which it is in my case).
In v1.3.0.1 this happened as expected: the group ownership of the volume directory and its contents was set to fsGroup, and reset on mount if it had been changed on the backend. In v1.4.0 this stopped happening - the recursive chgrp no longer occurs, and even new files aren't created with their gid set to the fsGroup value; they are created with gid=0. This is a problem for a workload that relies on the gid. The behavior is the same in v1.5.0.
Was the behavior change intentional? I know fsGroup and NFS have always been an odd combination. I was experimenting with the new ConfigurableFSGroupPolicy feature (fsGroupChangePolicy: "OnRootMismatch") to avoid the recursive chown on every mount for performance reasons, but I still need NEW files to respect fsGroup when created.
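For reference, this is where fsGroupChangePolicy sits alongside the usual fsGroup setting - a minimal sketch with illustrative uid/gid values, not my exact manifest:

securityContext:
  runAsUser: 1000                         # uid of the app user in the image
  fsGroup: 100                            # gid the volume contents should carry
  fsGroupChangePolicy: "OnRootMismatch"   # needs the ConfigurableFSGroupPolicy feature gate on K8s 1.19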
To validate, on a new cluster running RKE 1.19.10 I installed csi-powerscale v1.3.0.1 and confirmed fsGroup was working, then upgraded to v1.4.0 with no other changes made in between.
Thar_J
42 Posts
0
June 11th, 2021 03:00
Hi @drb45
We don't recall any change being made in the driver that would lead to this behavior; however, we will try to reproduce it in our lab and get back to you.
Regards
Thar_J
Thar_J
42 Posts
0
June 23rd, 2021 03:00
Hi @drb45
The following is the fsGroup behaviour we observed in csi-powerscale 1.3, 1.4 and 1.5:
--> Scenario 1: runAsUser, runAsGroup and fsGroup are all set in security-context-pod.yaml.
With all three set, we wrote data from inside the pod and checked the ids of the files in the mount path.
>> The files were created with the same gid as the runAsGroup value.
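For example, with a securityContext along these lines (values illustrative):

securityContext:
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000

files written under the mount path carry gid 3000, i.e. the runAsGroup value.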
--> Scenario 2: runAsGroup is omitted and the pod is created with only runAsUser and fsGroup, as below:
securityContext:
  runAsUser: 1000
  fsGroup: 2000
$ id
uid=1000 gid=0(root) groups=2000
>> With runAsGroup omitted, the gid remains 0 (root), and the process can interact with files that are owned by the root (0) group and that have the required group permissions for that group.
--> Concerning fsGroupChangePolicy:
fsGroupChangePolicy defines the behavior for changing ownership and permissions of the volume before it is exposed inside a Pod. The field only applies to volume types that support fsGroup-controlled ownership and permissions, and it has two possible values: OnRootMismatch (only change permissions and ownership if the ownership and permissions of the root directory do not already match the expected values for the volume) and Always (always change permissions and ownership of the volume when the volume is mounted).
Kindly refer to the Kubernetes documentation for further details:
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
Regards
Thar_J
drb45
1 Rookie
•
15 Posts
0
June 23rd, 2021 10:00
Thanks @Thar_J
I ran a few more tests. This is on K8s 1.19.10 with a Helm install of the CSI (using the csi-installer.sh script). And to keep things clean, none of these tests involve fsGroupChangePolicy - the ConfigurableFSGroupPolicy feature gate is NOT enabled on this cluster.
I have a pod with fsGroup set in its securityContext. The image has a local user 1000, "myuser", and the pod mounts an Isilon PVC at /home/myuser.
With v1.3.0, I created a file in the mounted volume. All good - the group owner of the file is users (100), as expected.
I then deleted that pod (but not the PVC), upgraded to v1.4.0, and redeployed the pod.
Still good - new file also has the right group owner.
Next I created a new PVC, and a second pod which mounts it.
Uh oh - file group owner is root.
Here's what the directories look like on the Isilon cluster...k8s-6b54099747 is the one created with v1.3.0, k8s-789bcdc544 with 1.4.0. Note the difference in group ownership. Under v1.3.0, when the volume got mounted the first time, the directory group owner was set to the fsGroup value. Not so with v1.4.0 and 1.5.0.
Another observation - if I delete and re-launch the pod with the volume that was created by v1.3.0, the recursive fsGroup behavior happens as expected... g+w gets set on existing files, and if I manually create a file in there owned by some other group, it gets reset to gid 100 on the next mount. This is NOT happening with the volume created by v1.4.0. So it seems the difference is not so much the version of the CSI running at mount time, but rather the version that created the volume in the first place.
v1.5.0 behavior is identical to v1.4.0.
drb45
1 Rookie
•
15 Posts
0
June 23rd, 2021 10:00
Another observation that you can see in the Isilon cluster's directory listing above... besides setting the group owner on the volume's directory (k8s-6b54099747) on first mount, v1.3.0 also set the setgid bit (g+s). v1.4.0 and 1.5.0 don't do that either.
Examining the spec of the volumes created by v1.3.0 compared to v1.4.0+, only one thing stands out...v1.3.0 was explicitly setting spec.csi.fsType: ext4, whereas the new versions omit that. Not sure if that matters since the docs say ext4 is the default for fsType.
Thar_J
42 Posts
0
June 24th, 2021 03:00
Hi @drb45
Kindly share the YAML files that you used to create the PVC and Pod.
drb45
1 Rookie
•
15 Posts
0
June 24th, 2021 05:00
PVC:
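A sketch of it (name, size and StorageClass are illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jovyan-home                 # name illustrative
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi                 # size illustrative
  storageClassName: isilon          # the csi-powerscale StorageClass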
The pod is using a jupyter notebook image which has a user "jovyan" with uid 1000:
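Again in sketch form (image tag and names illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: notebook                    # name illustrative
spec:
  securityContext:
    runAsUser: 1000                 # jovyan
    fsGroup: 100                    # users
  containers:
    - name: notebook
      image: jupyter/base-notebook  # image illustrative
      volumeMounts:
        - name: home
          mountPath: /home/jovyan
  volumes:
    - name: home
      persistentVolumeClaim:
        claimName: jovyan-home      # the PVC above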
Thar_J
42 Posts
0
June 25th, 2021 07:00
Hi @drb45
We will get back to you after analyzing the behavior in our lab using the YAML files you shared.
Regards
Thar_J
Thar_J
42 Posts
0
July 9th, 2021 01:00
Hi @drb45
I apologize for the delay in response.
Kindly let us know which user you used to install the driver, and provide the secret.json output.
Also, let us know the privileges assigned to this user.
Regards
Thar_J
drb45
1 Rookie
•
15 Posts
0
July 9th, 2021 12:00
@Thar_J, do you mean the Isilon user that is being used for the API permissions? I can't provide the secret.json contents because I would have to redact it past the point of usability.
The API user is a Local account in the System Access Zone, assigned to a role with permissions as shown below. The top-level directory for volumes is owned by this same user.
A few other notes in case they're helpful: the cluster is on OneFS 8.1.2.0, and the volumes are being created in a non-system Access Zone (isiAccessZone is not "System").
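(For clarity, that's this setting in the driver's Helm values file, as I have it - zone name illustrative:)

isiAccessZone: "myzone"    # a non-System access zone on the cluster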
Thar_J
42 Posts
0
July 22nd, 2021 03:00
Hi @drb45
Kindly share the logs concerning your observation that v1.3.0 was explicitly setting spec.csi.fsType: ext4, whereas the new versions omit it.
Also, I would like to understand which user you used to install the csi-powerscale driver - is it an admin user or a non-root user?
Kindly share the privileges assigned to the user with which you installed the driver.
Regards
Thar_J
Thar_J
42 Posts
0
July 22nd, 2021 03:00
Hi @drb45
Kindly share the StorageClass YAML as well.
We would like to understand how the rootClientEnabled field is set.
Regards
Thar_J
drb45
1 Rookie
•
15 Posts
0
July 22nd, 2021 08:00
@Thar_J,
Re: the fsType thing, this is a snippet from the YAML of a volume created with v1.3. The fsType setting is not present on a volume created with v1.4+:
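In the PV spec it's this field (surrounding fields trimmed):

spec:
  csi:
    fsType: ext4      # present on volumes provisioned by v1.3.0; absent on v1.4.0+
    # driver, volumeHandle, volumeAttributes, etc. omitted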
I used an admin user to install the CSI on the K8s cluster.
And here's the Storage Class with rootClientEnabled true:
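It's essentially the stock csi-powerscale StorageClass with the root-client setting turned on - a sketch with illustrative zone/path values (exact parameter key casing may differ by driver version):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: isilon                      # name illustrative
provisioner: csi-isilon.dellemc.com
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  AccessZone: myzone                # a non-System access zone; name illustrative
  IsiPath: /ifs/data/csi            # base path illustrative
  RootClientEnabled: "true"         # the setting in question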
Thar_J
42 Posts
0
July 23rd, 2021 00:00
Hi @drb45
I would like to have a call concerning this issue. Kindly let me know a time to connect; we work in the IST time zone.
Regards
Thar_J
drb45
1 Rookie
•
15 Posts
0
July 23rd, 2021 13:00
I will PM you.
Thar_J
42 Posts
0
July 26th, 2021 02:00
Hi @drb45
Kindly let us know a time to connect with you so we can have a look at your environment.
Regards
Thar_J