8 Posts
0
1861
September 2nd, 2016 13:00
1.32 to 2.0 upgrade failing: failure to switch to cluster mode
Hi, I am running 1.32 on Windows and trying to upgrade to 2.0.
I have my packages loaded and the query runs fine. The upgrade proceeds until the switch to cluster mode, then fails with a communication error. Any ideas? I am running the IM from Server 2012 R2, and all ScaleIO roles are on 2012 R2 machines.
"slavesToRemove": null,
"tbsToRemove": null,
"newClusterMode": "MDM_CLUSTER_MODE_3_NODES",
"commandID": "MDM Commands<->84",
"untrustedCertificateThumbprint": null,
"commandState": "failed",
"startTime": "2016-09-02T20:34:44.486Z",
"message": "Command failed: Could not switch to cluster mode on x.x.x.x,x.x.x.x due to: Communication error",
"result": null,
"followingCommand": "WaitForNormalClusterModeCommand[working on= (MDM) [x.x.x.x, x.x.x.x]]",
"acceptInState": false,
"allowedPhase": "install",
"archived": false,
"commandParameters": [],
"commandName": ".SwitchToClusterModeCommand"
}
pawelw1
306 Posts
0
September 3rd, 2016 05:00
Hi,
You can check whether all IPs participating in ScaleIO (management, data, etc.) can communicate with each other; also make sure you allowed non-secure communication in the Advanced dropdown list.
Let us know how it goes,
Pawel
ericammon
8 Posts
0
September 3rd, 2016 08:00
Pawel,
I checked that all devices can communicate with each other. I believe I had "allow non-secure communications" ticked when I started the upgrade. Now that the upgrade has failed I am stuck: I can't do anything in any tab aside from 'monitor' because:
'Can't mark the operation as completed in the current running process, please retry or contact support.'
Eric
pawelw1
306 Posts
0
September 3rd, 2016 23:00
Hi Eric,
We have seen this error in the past and in most cases there was a problem with IP communication between the nodes.
Can you make 100% sure that the Gateway host you are using for the upgrade can reach absolutely all ScaleIO IP addresses? A common mistake is leaving some GW interfaces unconfigured, or missing on certain networks.
Thank you,
Pawel
ericammon
8 Posts
0
September 4th, 2016 07:00
I have the GW and ScaleIO interfaces all in the same /24 network. The GW can ping the ScaleIO interfaces of every participating system. I temporarily disabled the firewall on the MDM nodes, and it still doesn't work. I disabled all but the ScaleIO interface on the GW node; it can still ping the ScaleIO addresses on each participating server, but the upgrade fails. Also, this is an upgrade, not a new install: everything worked fine with these IP addresses before the upgrade.
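A successful ping only shows ICMP reachability; the cluster-mode switch needs TCP connections between the Gateway and the MDM nodes. The sketch below is one way to test that from the GW host. The port numbers (6611 for MDM, 9011 for MDM cluster, 9099 for LIA) are assumptions based on commonly cited ScaleIO defaults, not something confirmed in this thread, and the function names are hypothetical, so check them against your own installation.

```python
import socket

# ASSUMPTION: commonly cited ScaleIO default ports; verify for your install.
ASSUMED_PORTS = {"MDM": 6611, "MDM cluster": 9011, "LIA": 9099}


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_hosts(hosts, ports=ASSUMED_PORTS, timeout=3.0):
    """Return {(host, role): reachable} for every host/port combination."""
    return {
        (host, role): tcp_reachable(host, port, timeout)
        for host in hosts
        for role, port in ports.items()
    }

# Example use (addresses are placeholders; use the IPs from --query_cluster):
#   for (host, role), ok in check_hosts(["192.0.2.10", "192.0.2.11"]).items():
#       print(host, role, "open" if ok else "unreachable")
```

Any pair reported as unreachable while ping succeeds would point at a firewall rule or a service that is not listening, rather than at routing.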
ericammon
8 Posts
0
September 4th, 2016 08:00
Is it possible this is an issue with OpenSSL? Are there any configuration docs for OpenSSL with ScaleIO? It doesn't look like it was needed in 1.32.
SanjeevMalhotra
138 Posts
0
September 4th, 2016 12:00
Can you please run the following:
1. RDP to the Primary MDM SVM and run the command:
scli --query_cluster
In the output of the above command, check whether the Primary MDM and the standby MDMs with roles Master and TB are listed there with the correct IP addresses (even though the mode may be 1_node).
If these are correct, we can run the manual command to switch to 3_node mode and check whether it succeeds. If it does, you can proceed further with the upgrade.
pawelw1
306 Posts
0
September 4th, 2016 13:00
Hi Eric,
There might be a problem with OpenSSL; that's why I asked if you checked the "Allow non-secure communication" checkbox in the IM. As per the ScaleIO User Guide, "To use the secure authentication mode, ensure that OpenSSL 64bit 1.0.1 or higher is installed on all the systems". Can you check the installation logs and see if there are any complaints regarding certificates etc.?
Thank you,
Pawel
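The version requirement quoted above can be checked on each node by comparing the banner printed by `openssl version` against 1.0.1. This is a minimal sketch; the function names are hypothetical, and it only checks the version number, not whether the build is 64-bit (the `platform:` line of `openssl version -a` should help with that).

```python
import re

# Minimum version from the User Guide quote: OpenSSL 1.0.1 or higher.
MINIMUM = (1, 0, 1)


def parse_openssl_version(banner):
    """Extract (major, minor, patch) from an `openssl version` banner line."""
    match = re.search(r"OpenSSL\s+(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"unrecognised banner: {banner!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(banner, minimum=MINIMUM):
    """Return True if the banner's numeric version is at least `minimum`.

    The trailing patch letter (for example the "e" in 1.0.1e) is ignored.
    """
    return parse_openssl_version(banner) >= minimum
```

On a node, paste the output of `openssl version` into `meets_minimum(...)`; a result of `False` would mean the secure authentication mode requirement is not met on that system.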
ericammon
8 Posts
1
September 5th, 2016 06:00
When installing the 1.32 environment I didn't have an option for non-secure communication; presumably there is one when upgrading to 2.0.
OpenSSL is installed on the MDM nodes; however, I am not sure whether any additional configuration needs to be done. Is there a best-practices doc for configuring OpenSSL?
Eric
ericammon
8 Posts
0
September 5th, 2016 06:00
All info in the --query_cluster output looks correct.