March 6th, 2017 05:00
Scaleio command: scli --start_upgrade fails
Hello,
I have a cluster running software version 2.0.7120.
I want to upgrade to the latest software version, but the command scli --start_upgrade fails.
# scli --login --username admin --password *******
Logged in. User role is SuperUser. System ID is 7f16affdg085ea76c
# scli --query_cluster
Cluster:
Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3
Master MDM:
ID: 0x7e272f9f63f5e210
IPs: 10.22.26.11, Management IPs: 10.22.26.11, Port: 9011
Version: 2.0.7120
Slave MDMs:
ID: 0x0fa334e15156abe2
IPs: 10.22.26.13, Management IPs: 10.22.26.13, Port: 9011
Status: Normal, Version: 2.0.7120
ID: 0x5cbaf3001f4f1801
IPs: 10.22.26.12, Management IPs: 10.22.26.12, Port: 9011
Status: Normal, Version: 2.0.7120
Tie-Breakers:
ID: 0x6beff5195f712f54
IPs: 10.22.26.32, Port: 9011
Status: Normal, Version: 2.0.7120
ID: 0x197422174da8d613
IPs: 10.22.26.31, Port: 9011
Status: Normal, Version: 2.0.7120
# scli --start_upgrade
This command should only be used when intending to upgrade the entire system to a new product version. Press 'y' and then Enter to confirm.y
Error: MDM failed command. Status: Invalid session. Please login and try again.
Other commands fail the same way:
# scli --query_user --username admin
Error: MDM failed command. Status: Invalid session. Please login and try again.
# scli --query_all
Error: MDM failed command. Status: Invalid session. Please login and try again.
Any ideas?
ScaleIO was deployed at version 2.0.7120.
RHasleton1
March 7th, 2017 10:00
Matas,
Take a look at this KB:
https://support.emc.com/kb/487432
It describes what you are seeing. Essentially, something is logging in and out too many times, too quickly, and that is filling up the login table. If you can find what is causing the login table to fill up and stop it, that should resolve the problem. Look at the events on the Master MDM for a clue (/opt/emc/scaleio/mdm/bin/showevents.py -p | grep "Command login ").
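A minimal sketch of how that grep could be turned into a count, so you can see how quickly login events accumulate. The function name count_login_events is made up for illustration, and it assumes only that each login event line contains the text "Command login ", as in the grep above.

```shell
# Hypothetical helper: count login events read from standard input.
# Assumes each login event line contains the literal text "Command login ".
count_login_events() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are none, so "|| true" keeps the function's status at 0.
    grep -c "Command login " || true
}
```

On the Master MDM this could be fed from the events tool, for example:
/opt/emc/scaleio/mdm/bin/showevents.py -p | count_login_events
Running it twice a minute or so apart gives a rough idea of the login rate.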
Hope that helps,
Rick
pawelw1
March 7th, 2017 00:00
Hi Matas,
If you need to upgrade, the recommended way is through the Installation Manager (IM), which takes care of that for you. I don't think you need to run "start_upgrade" manually, like, ever.
As to your problem - please try "scli --allow_commands_during_upgrade" and see if that helps in your case, but I still believe you should just use the IM.
Cheers,
Pawel
SanjeevMalhotra
March 7th, 2017 01:00
Error: MDM failed command. Status: Invalid session. Please login and try again.
This error means you need to log in to scli again as the admin user.
If you are unable to log in, SSH to one of the slave MDMs and try to switch MDM ownership to it (use the --mdm_ip switch in the commands you run from the slave MDM until ownership has moved to that slave MDM).
As Pawel said, upgrading through the IM is the suggested route unless you want to upgrade manually; in that case the process is given in the deployment guide.
SanjeevMalhotra
March 7th, 2017 03:00
As suggested by Pawel, did you run the command "scli --allow_commands_during_upgrade"?
Matas1
March 7th, 2017 03:00
Hi Pawel, the ScaleIO User Guide defines a manual upgrade process, and the "start_upgrade" command is required for it.
SanjeevMalhotra, I cannot change MDM ownership. I think something is not working.
For example:
[root@dkscmd002prvjay matv]# scli --query_cluster
Cluster:
Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3
Master MDM:
ID: 0x5cbaf3001f4f1801
IPs: 10.22.26.12, Management IPs: 10.22.26.12, Port: 9011
Version: 2.0.7120
Slave MDMs:
ID: 0x7e272f9f63f5e210
IPs: 10.22.26.11, Management IPs: 10.22.26.11, Port: 9011
Status: Normal, Version: 2.0.7120
ID: 0x0fa334e15156abe2
IPs: 10.22.26.13, Management IPs: 10.22.26.13, Port: 9011
Status: Normal, Version: 2.0.7120
Tie-Breakers:
ID: 0x6beff5195f712f54
IPs: 10.22.26.32, Port: 9011
Status: Normal, Version: 2.0.7120
ID: 0x197422174da8d613
IPs: 10.22.26.31, Port: 9011
Status: Normal, Version: 2.0.7120
[root@dkscmd002prvjay matv]# scli --query_all
Error: MDM failed command. Status: Invalid session. Please login and try again.
Any ideas?
Matas1
March 7th, 2017 04:00
Update:
MDM log (/opt/emc/scaleio/mdm/logs/trc.0):
07/03 13:24:24.216991 e0de1eb8:nativeAuthMgr_GetUserInfoFromToken:00596: Failed to find session for authentication. rc=INVALID_SESSION
07/03 13:24:24.216995 e0de1eb8:cliMsg_AuthenticateSession:01328: Failed getting user role
07/03 13:24:24.216997 e0de1eb8:mdmCliMsg_RecvRequestCB:01562: Failed authenticate session for command 4367
07/03 13:24:25.794455 e0bd7eb8:actor_Loop:11662: #### Log sync send - actorId: 013d39903a48fa31, ticks: 1407290160
07/03 13:24:25.794549 e0bceeb8:voter_HandleMeMaster:02627: #### Log sync receive - Sender: actorId: 013d39903a48fa31, ticks: 1407290160, State: voterId: 053e9b5c5006bf81, actorId: 013d39903a48fa31, actorGen 2, degradedGen 45, oosIDs [], IsFrozen 0, bHasLease 1
These entries appear after I try an operation as the admin user, for example:
[root@dkscmd002prvjay bin]# scli --query_all
Error: MDM failed command. Status: Invalid session. Please login and try again.
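To see how often the MDM rejects sessions, the rc=INVALID_SESSION lines in the trace can be bucketed per minute. This is only a sketch: the function name is hypothetical, and it assumes the timestamp layout shown in the trc.0 excerpt above (date in the first field, time in the second).

```shell
# Hypothetical helper: count rc=INVALID_SESSION trace lines per minute.
# Reads an MDM trace log on standard input. Assumes lines start with
# "DD/MM HH:MM:SS.micros", as in the excerpt above.
invalid_sessions_per_minute() {
    awk '/rc=INVALID_SESSION/ {
             # Key is the date plus the HH:MM part of the time field.
             count[$1 " " substr($2, 1, 5)]++
         }
         END { for (m in count) print m, count[m] }' | sort
}
```

It could be run as:
invalid_sessions_per_minute < /opt/emc/scaleio/mdm/logs/trc.0
Each output line is a date, a minute and the number of rejected sessions in that minute.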
Matas1
March 7th, 2017 07:00
Hello, it seems our monitoring system was "bombing" the ScaleIO API and opening too many sessions, so some commands failed.
After stopping the monitoring, we saw log entries like these:
***
07/03 16:13:07.155552 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 372 of type 15
07/03 16:13:07.157933 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 1 of type 15
07/03 16:13:07.159606 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 159 of type 15
07/03 16:13:07.161376 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 231 of type 15
07/03 16:13:07.163032 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 83 of type 15
07/03 16:13:07.164624 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 429 of type 15
***
After SIO cleaned up the old sessions, all commands work fine.
Thanks!
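For anyone whose monitoring causes the same problem, one way to reduce login churn is to log in only when a minimum interval has passed, instead of on every poll. A minimal sketch, assuming the monitoring job is a shell script. The helper name login_due, the stamp file path and the 300-second interval are all hypothetical, and how long an scli login stays valid is not stated in this thread, so the interval would need tuning.

```shell
# Hypothetical helper: decide whether a fresh login is due.
# Succeeds (exit 0) if the stamp file is missing or older than the given
# number of seconds, and refreshes the stamp. Otherwise returns 1.
login_due() {
    stamp=$1
    min_interval=$2
    now=$(date +%s)
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp" 2>/dev/null)
        last=${last:-0}   # treat an empty stamp file as "never logged in"
        [ $((now - last)) -ge "$min_interval" ] || return 1
    fi
    printf '%s\n' "$now" > "$stamp"
    return 0
}
```

In a polling script it might be used like this, with SCLI_PASSWORD being an assumed environment variable:
login_due /tmp/scli_login.stamp 300 && scli --login --username admin --password "$SCLI_PASSWORD"
The queries that follow then reuse the existing login rather than creating a new session on each run.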