
March 6th, 2017 05:00

Scaleio command: scli --start_upgrade fails

Hello,

I have a cluster running software version 2.0.7120.

I want to upgrade to the latest software version, but the command scli --start_upgrade fails.

# scli --login --username admin --password *******

Logged in. User role is SuperUser. System ID is 7f16affdg085ea76c

# scli --query_cluster

Cluster:

    Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3

Master MDM:

    ID: 0x7e272f9f63f5e210

        IPs: 10.22.26.11, Management IPs: 10.22.26.11, Port: 9011

        Version: 2.0.7120

Slave MDMs:

    ID: 0x0fa334e15156abe2

        IPs: 10.22.26.13, Management IPs: 10.22.26.13, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x5cbaf3001f4f1801

        IPs: 10.22.26.12, Management IPs: 10.22.26.12, Port: 9011

        Status: Normal, Version: 2.0.7120

Tie-Breakers:

    ID: 0x6beff5195f712f54

        IPs: 10.22.26.32, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x197422174da8d613

        IPs: 10.22.26.31, Port: 9011

        Status: Normal, Version: 2.0.7120

# scli --start_upgrade

This command should only be used when intending to upgrade the entire system to a new product version. Press 'y' and then Enter to confirm.y

Error: MDM failed command.  Status: Invalid session. Please login and try again.

Other commands fail with the same error, for example:

# scli --query_user --username admin

Error: MDM failed command.  Status: Invalid session. Please login and try again.

# scli --query_all

Error: MDM failed command.  Status: Invalid session. Please login and try again.

Any ideas?

ScaleIO was deployed at version 2.0.7120.

73 Posts

March 7th, 2017 10:00

Matas,

Take a look at this KB:

https://support.emc.com/kb/487432

It outlines what you are seeing. Essentially, something is logging in and out too many times, too quickly, and that is filling up the login table. If you can find what is causing the login table to fill up and stop it, that should resolve the problem. Check the events on the Master MDM for clues (/opt/emc/scaleio/mdm/bin/showevents.py -p | grep "Command login ").
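To see how fast logins are arriving, the grep suggested above can be extended into a small counter. This is only a sketch: the sample lines and their timestamp layout are made up for illustration, since the real showevents.py output format is not shown in this thread, and `count_logins` is a hypothetical helper name.

```python
from collections import Counter

LOGIN_MARKER = "Command login "  # same substring as in the grep above


def count_logins(lines, key_width=0):
    """Count lines containing LOGIN_MARKER.

    With key_width > 0 the matches are grouped by the first key_width
    characters of each line (for example a leading timestamp, if the
    event output has one). Otherwise everything lands under "total".
    """
    counts = Counter()
    for line in lines:
        if LOGIN_MARKER in line:
            key = line[:key_width] if key_width > 0 else "total"
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, NOT real showevents.py output.
    sample = [
        "2017-03-07 13:24 Command login received",
        "2017-03-07 13:24 Command login received",
        "2017-03-07 13:25 Command login received",
        "2017-03-07 13:25 Some other event",
    ]
    # Group by the first 16 characters (date plus hour:minute in this sample).
    for key, n in sorted(count_logins(sample, key_width=16).items()):
        print(n, key)
```

To use it on live data, pass `sys.stdin` instead of `sample` and pipe the showevents.py output into the script. A bucket with an unusually high count points at the time window (and, with luck, the client) that is flooding the login table.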

Hope that helps,

Rick

306 Posts

March 7th, 2017 00:00

Hi Matas,

If you need to upgrade, the recommended way is through the Installation Manager, which takes care of that for you. I don't think you need to run "start_upgrade" manually, like, ever.

As to your problem - please try "scli --allow_commands_during_upgrade" and see if that helps in your case, but I still believe you should just use the IM.

Cheers,

Pawel

March 7th, 2017 01:00

Error: MDM failed command.  Status: Invalid session. Please login and try again.


This error means you need to log in again in scli using the username admin.

If you are unable to log in, SSH to one of the slave MDMs and try to switch the MDM ownership (use the --mdm_ip switch in the commands you run from the slave MDM until you are able to switch the MDM ownership to that slave MDM).

As Pawel suggested, upgrading through the IM is recommended unless you want to upgrade manually; in that case, the process is described in the deployment guide.


March 7th, 2017 03:00

As suggested by Pawel, did you run the command "scli --allow_commands_during_upgrade"?

22 Posts

March 7th, 2017 03:00

Hi Pawel, the ScaleIO User Guide defines a manual upgrade process, and the "start_upgrade" command is required for it.

SanjeevMalhotra, I cannot change the MDM ownership. I think something is not working.

For example:

[root@dkscmd002prvjay matv]# scli --query_cluster

Cluster:

    Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3

Master MDM:

    ID: 0x5cbaf3001f4f1801

        IPs: 10.22.26.12, Management IPs: 10.22.26.12, Port: 9011

        Version: 2.0.7120

Slave MDMs:

    ID: 0x7e272f9f63f5e210

        IPs: 10.22.26.11, Management IPs: 10.22.26.11, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x0fa334e15156abe2

        IPs: 10.22.26.13, Management IPs: 10.22.26.13, Port: 9011

        Status: Normal, Version: 2.0.7120

Tie-Breakers:

    ID: 0x6beff5195f712f54

        IPs: 10.22.26.32, Port: 9011

        Status: Normal, Version: 2.0.7120

    ID: 0x197422174da8d613

        IPs: 10.22.26.31, Port: 9011

        Status: Normal, Version: 2.0.7120

[root@dkscmd002prvjay matv]# scli --query_all

Error: MDM failed command.  Status: Invalid session. Please login and try again.

Any ideas?

22 Posts

March 7th, 2017 04:00

update:

MDM log: /opt/emc/scaleio/mdm/logs/trc.0

07/03 13:24:24.216991 e0de1eb8:nativeAuthMgr_GetUserInfoFromToken:00596: Failed to find session for authentication. rc=INVALID_SESSION

07/03 13:24:24.216995 e0de1eb8:cliMsg_AuthenticateSession:01328: Failed getting user role

07/03 13:24:24.216997 e0de1eb8:mdmCliMsg_RecvRequestCB:01562: Failed authenticate session for command 4367

07/03 13:24:25.794455 e0bd7eb8:actor_Loop:11662: #### Log sync send - actorId: 013d39903a48fa31, ticks: 1407290160

07/03 13:24:25.794549 e0bceeb8:voter_HandleMeMaster:02627: #### Log sync receive - Sender: actorId: 013d39903a48fa31, ticks: 1407290160, State: voterId: 053e9b5c5006bf81, actorId: 013d39903a48fa31, actorGen 2, degradedGen 45, oosIDs [], IsFrozen 0, bHasLease 1

These entries appear after I try an operation as the admin user, such as:

[root@dkscmd002prvjay bin]# scli --query_all

Error: MDM failed command.  Status: Invalid session. Please login and try again.

22 Posts

March 7th, 2017 07:00

Hello, it seems our monitoring system was "bombing" the ScaleIO API and opening too many sessions, so some commands failed.

After stopping the monitoring, we noticed log entries like these:
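One way a monitoring client can avoid this is to log in once and reuse the session for many polls. The sketch below is not ScaleIO-specific: `login` and `query` are hypothetical callables standing in for whatever API client the monitoring uses, and the 300-second maximum age is an arbitrary assumption, not a documented ScaleIO session lifetime.

```python
import time


class SessionCache:
    """Reuse one login token across polls instead of logging in every time."""

    def __init__(self, login, max_age_seconds=300.0, clock=time.monotonic):
        self._login = login              # hypothetical: callable returning a token
        self._max_age = max_age_seconds  # assumed value, tune for the real system
        self._clock = clock
        self._token = None
        self._issued_at = 0.0
        self.login_count = 0             # exposed so login volume can be watched

    def token(self):
        """Return the cached token, logging in only when it is missing or old."""
        now = self._clock()
        if self._token is None or now - self._issued_at >= self._max_age:
            self._token = self._login()
            self._issued_at = now
            self.login_count += 1
        return self._token

    def invalidate(self):
        """Forget the token, e.g. after the server reports an invalid session."""
        self._token = None


def poll(cache, query, rounds):
    """Run query(token) `rounds` times, sharing one session between calls."""
    return [query(cache.token()) for _ in range(rounds)]


if __name__ == "__main__":
    cache = SessionCache(login=lambda: "example-token")  # stand-in login
    poll(cache, query=lambda tok: len(tok), rounds=100)
    print("polls: 100, logins:", cache.login_count)
```

The point of the design is that the number of logins depends on elapsed time and on explicit invalidation, not on the number of polls, so a tighter polling interval does not create more sessions.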

***

07/03 16:13:07.155552 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 372 of type 15

07/03 16:13:07.157933 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 1 of type 15

07/03 16:13:07.159606 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 159 of type 15

07/03 16:13:07.161376 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 231 of type 15

07/03 16:13:07.163032 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 83 of type 15

07/03 16:13:07.164624 8fd8feb8:repType_DestroyObjInternal:04008: Called from authSession_CleanupSession. Destroying object 429 of type 15
***

After SIO cleaned up the old sessions, all commands work fine.
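For anyone checking their own trc.0 for the same pattern, here is a rough sketch that tallies the two kinds of entries quoted in this thread: the rc=INVALID_SESSION failures and the authSession_CleanupSession destroy messages. The regular expression is inferred only from the lines pasted above, so treat the field names (`tag`, `func`, `line`) as guesses rather than a documented format.

```python
import re
from collections import Counter

# Layout inferred from the quoted lines:
# "DD/MM HH:MM:SS.micro <hex tag>:<function>:<line>: <message>"
TRACE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<tag>[0-9a-f]+):(?P<func>\w+):(?P<line>\d+): (?P<msg>.*)$"
)


def parse_trace_line(line):
    """Return the fields of one trace line as a dict, or None if it does not match."""
    match = TRACE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize_sessions(lines):
    """Count invalid-session failures and session-cleanup entries."""
    counts = Counter()
    for line in lines:
        record = parse_trace_line(line)
        if record is None:
            continue
        if "rc=INVALID_SESSION" in record["msg"]:
            counts["invalid_session"] += 1
        elif "authSession_CleanupSession" in record["msg"]:
            counts["session_cleanup"] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the posts above.
    sample = [
        "07/03 13:24:24.216991 e0de1eb8:nativeAuthMgr_GetUserInfoFromToken:00596: "
        "Failed to find session for authentication. rc=INVALID_SESSION",
        "07/03 16:13:07.155552 8fd8feb8:repType_DestroyObjInternal:04008: "
        "Called from authSession_CleanupSession. Destroying object 372 of type 15",
    ]
    print(dict(summarize_sessions(sample)))
```

Many invalid-session entries with no cleanup entries would match the state before the monitoring was stopped; a burst of cleanup entries matches the recovery shown above.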

Thanks !
