Unsolved

1 Rookie

 • 

6 Posts


July 8th, 2020 18:00

CSI driver: Pod failover with PV attached

Hello, I have a question about pod failover.

CSI Driver for Dell EMC PowerMax 1.2

CentOS 7.6 with Native Multipath.

 

Is it possible to create a StatefulSet/Deployment with volumes from this CSI driver so that the pods in the set can fail over to other nodes when the current node goes down?
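
For reference, here is a rough sketch of the kind of StatefulSet I have in mind (the name pmax-test, the StorageClass powermax-sc and the nginx image are just placeholders, not my real configuration; it also assumes a matching headless Service exists):

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pmax-test
spec:
  serviceName: pmax-test
  replicas: 1
  selector:
    matchLabels:
      app: pmax-test
  template:
    metadata:
      labels:
        app: pmax-test
    spec:
      containers:
      - name: app
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      # placeholder StorageClass backed by the PowerMax CSI driver
      storageClassName: powermax-sc
      resources:
        requests:
          storage: 8Gi
EOF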

1 Rookie

 • 

6 Posts

July 8th, 2020 23:00

I tried both a Deployment and a StatefulSet; neither completed the failover. Both produced similar error output:

######################

Multi-Attach error for volume "pmax-c2e3a49fbf" Volume is already exclusively attached to one node and can't be attached to another

########################

2 Intern

 • 

166 Posts

July 9th, 2020 01:00

Hi @Wellin,

Can you share a bit more detail or logs on the error?

On a hard node failure, Kubernetes cannot know whether only kubelet is down while the stateful pod is still alive and writing, or whether the pod itself is down.

Therefore the pod will be stuck in the Terminating state.

 

If you are in that state and you know the pod is no longer writing, you can force deletion with the following procedure: https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/#force-deletion
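
In practice the force deletion from that page comes down to a single command (replace the pod name and namespace with yours):

kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force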

 

If you want to share more logs or details, feel free to contact me via PM.

 

HTH

1 Rookie

 • 

6 Posts

July 9th, 2020 20:00

Hello @Flo_csI,

    Thanks for asking. The pod is not stuck in Terminating; it was deleted from the down node automatically, and a new pod started being created on a running node with the status ContainerCreating.

    The error is generated at that moment.

######################

Multi-Attach error for volume "pmax-c2e3a49fbf" Volume is already exclusively attached to one node and can't be attached to another

########################

 

    Many thanks.

2 Posts

July 14th, 2020 00:00

Hi Wellin, 

We have also seen this when the system goes to reattach the volume. Normally, if you give it some time, it corrects itself automatically as the attacher and reconciler take over. Give it 10 minutes and see (most of the time it recovers automatically).
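
While you wait, you can keep an eye on the pod events to see when the attach finally goes through, for example (pod name and namespace are yours):

# watch events in the application namespace
kubectl get events -n <namespace> --watch

# or look at the events of the pending pod directly
kubectl describe pod <pod-name> -n <namespace>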

If not, continue on 

  1. What's the status of your vxflexos-controller-0 pod (are all four containers up)? Check the logs of the attacher, which should point you in the right direction about your error (kubectl logs vxflexos-controller-0 -n vxflexos -c attacher --tail 100), and of the driver (kubectl logs vxflexos-controller-0 -n vxflexos -c driver --tail 100). The commands are consolidated just after this list.
  2. Check that the query_mdm command succeeds on all the nodes (where the pod is trying to come up) and make sure the MDM connection is fine.
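
Putting the checks from step 1 together (the pod, container and namespace names assume the default vxflexos install; the drv_cfg path in step 2 is the usual SDC location and may differ on your nodes):

# status of the controller pod and its containers
kubectl get pod vxflexos-controller-0 -n vxflexos

# attacher and driver logs
kubectl logs vxflexos-controller-0 -n vxflexos -c attacher --tail 100
kubectl logs vxflexos-controller-0 -n vxflexos -c driver --tail 100

# on each worker node where the pod may land, check the MDM connection
/opt/emc/scaleio/sdc/bin/drv_cfg --query_mdms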

This should point you in the right direction.

2 Intern

 • 

166 Posts

July 17th, 2020 12:00

@Wellin, you can also check whether the volume attachments are still there after 10 minutes with

kubectl get volumeattachments.storage.k8s.io

This object is created by the driver and normally... it should be removed if the volume is not attached anymore.

You can confirm it with lsblk.
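
For example (the attachment name is whatever shows up in the output of the first command):

# list all CSI volume attachments in the cluster
kubectl get volumeattachments.storage.k8s.io

# inspect a specific attachment to see which node and PV it refers to
kubectl describe volumeattachments.storage.k8s.io <attachment-name>

# on the old node, confirm the block device is gone
lsblk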

 

Let me know

--Florian
