Unsolved
February 27th, 2020 03:00
VNX5500 Shelf 0/0 SPB & Battery B Removed
Hi all:
I'm concerned about an (apparent) issue that we had yesterday with the DPE.
The syslog shows that it was an SPS B issue, but strangely the getcrus command does not report a fault in the SP's status.
nas_logviewer /nas/log/sys_log | grep -i power
Feb 26 13:30:24 2020:CS_PLATFORM:NaviEventMonitor:ERROR:3:::::CLARiiON event number 0x7404 Host SPA Storage Array CKM00000000000 SP N/A SoftwareRev 7.33.2 (0.51) BaseRev 05.32.000.5.221 Description Standby Power Supply (Bus 0 Enclosure 0 SPS B) is faulted. See alerts for details..
cat /nas/log/enclosure_status.enclosure_0.xml | grep FAULT
POWER_SUPPLY_A0_FAULT="0"
POWER_SUPPLY_A0_INTERNAL_FAN_1_FAULT="0"
POWER_SUPPLY_A0_INTERNAL_FAN_2_FAULT="0"
PS_A0_PEER_POWER_SUPPLY_FAULT="0"
POWER_SUPPLY_A1_FAULT="0"
POWER_SUPPLY_A1_INTERNAL_FAN_1_FAULT="0"
POWER_SUPPLY_A1_INTERNAL_FAN_2_FAULT="0"
PS_A1_PEER_POWER_SUPPLY_FAULT="0"
POWER_SUPPLY_B0_FAULT="0"
POWER_SUPPLY_B0_INTERNAL_FAN_1_FAULT="0"
POWER_SUPPLY_B0_INTERNAL_FAN_2_FAULT="0"
PS_B0_PEER_POWER_SUPPLY_FAULT="0"
POWER_SUPPLY_B1_FAULT="0"
POWER_SUPPLY_B1_INTERNAL_FAN_1_FAULT="0"
POWER_SUPPLY_B1_INTERNAL_FAN_2_FAULT="0"
PS_B1_PEER_POWER_SUPPLY_FAULT="0"
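With this many flags it is easy to miss one by eye, so here is a minimal Python sketch that scans grep output like the above and lists any flag that is not "0". It assumes that a non-zero value means a fault, which the output above does not confirm, and the function name `faulted_flags` is only illustrative. Note that the flags shown here cover the power supplies only, so an all-clear result says nothing about SPS B.

```python
import re

# Matches lines such as POWER_SUPPLY_A0_FAULT="0"
FLAG_RE = re.compile(r'^\s*([A-Z0-9_]+)="(\d+)"\s*$')


def faulted_flags(text):
    """Return the names of *FAULT* flags whose value is not 0.

    Assumption: a non-zero value means the component is faulted.
    """
    faulted = []
    for line in text.splitlines():
        match = FLAG_RE.match(line)
        if match and "FAULT" in match.group(1) and int(match.group(2)) != 0:
            faulted.append(match.group(1))
    return faulted


# A few lines copied from the output above
sample = '''POWER_SUPPLY_A0_FAULT="0"
POWER_SUPPLY_B0_FAULT="0"
PS_B1_PEER_POWER_SUPPLY_FAULT="0"'''

print(faulted_flags(sample))
```

For the lines pasted above the list comes back empty, which matches what the grep shows: no power supply fault is flagged in this file.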
nas_checkup shows the following:
--------------------------------------------------------------------------------
-------------------------------------Errors-------------------------------------
Control Station: Check if primary is active
Error HC_CS_14505082897: Slot 0 is not the primary Control Station
Action :
1. Use various logs on the system to determine the problem that caused
the Control Stations to fail over. Make sure that you know what the
cause of the failure was and that it has been fixed before
attempting to failback the Control Stations.
2. On the Control Station in slot 1, run command
"/nasmcd/sbin/cs_standby -failback". The Control Station in
slot 1 will reboot, and the Control Station in slot 0 will
become the primary.
3. Wait 5 minutes for services to come up fully and verify that the
problem has been fixed by logging into both Control stations as
nasadmin. When you log into the Control Station in slot 0, it
should log you in normally. When you log into the Control Station
in slot 1, it should tell you that slot 1 is not the primary
Control Station.
Storage System : Check if FLARE is committed
Error HC_BE_14505017375: Backend Storage Requirements Check Failed:
directory does not exist -- /tftpboot
Cannot determine if the Storage System array software is committed.
Action :
See the messages above to determine the problem and correct it; then,
run nas_checkup to verify that the problem has been corrected.
Storage System : Check array model
Error HC_BE_14505017387: Backend Storage Requirements Check Failed:
directory does not exist -- /tftpboot
Unable to determine if the Storage System software is qualified with
this version of NAS.
Action :
Check the messages above to determine the problem, fix it, then run
nas_checkup to verify that the problem has been fixed.
--------------------------------------------------------------------------------
Please, can anyone give me a clue?
Thanks and regards.
PS: The array is not under support.
DELL-Sam L
Moderator
February 27th, 2020 13:00
Hello InfraMED,
What you need to do is check that all the power cables are plugged in and secure. If they are, confirm that there are no amber lights on your SPS or power supplies. You also need to check that the sense cable between the SP and the SPS has not come loose.
Please let us know if you have any other questions.
InfraMED
February 28th, 2020 01:00
Hello @DELL-Sam L :
Thanks for the reply. The cabling check seems OK.
On the other hand, SPCollect shows that SPB hit a "bugcheck" in the code.
PANICS, DUMPS, RESTARTS AND FLARE NOT RESCHEDULING INFORMATION
**********************************************************************************************
SPB has encountered Bugcheck code 05900000 on 02/26/2020 12:31:25.
Bugcheck Name and Definition
*****************************
FF_ASSERT_PANIC - Bugcheck in FCT driver (fast cache) when lookup in the hash table fails - for the element to work on as requested by the peer. Most likely this was due to a prior synchronization failure between the directories on each SP due to a very short timing window that causes an operation to be performed on one storage processor without informing the peer, which might happen during reboot, including NDU reboot.
Recommendation
**************
Debugging information added in R32.015. Fix went to R32.215 but turned out to be partial. Duplicate to AR 682652 for statistics update
SPB has encountered Bugcheck code 05900000 on 11/14/2016 01:10:27.
Bugcheck Name and Definition
*****************************
FF_ASSERT_PANIC - Bugcheck in FCT driver (fast cache) when lookup in the hash table fails - for the element to work on as requested by the peer. Most likely this was due to a prior synchronization failure between the directories on each SP due to a very short timing window that causes an operation to be performed on one storage processor without informing the peer, which might happen during reboot, including NDU reboot.
Recommendation
**************
Debugging information added in R32.015. Fix went to R32.215 but turned out to be partial. Duplicate to AR 682652 for statistics update
B 02/26/20 12:42:23 DGSSP 76008106 The Storage Processor rebooted unexpectedly @ 12:27:20 on 02/26/2020: BugCheck 0, {0000000000000000, 0000000005900000, 0000000000000aa7, 0000000000000000}, Failing Instruction: 0xfffff880464cecf2 in Fct.sys loaded @ 0xfffff880464c6000 [FF_ASSERT_PANIC]
[ BugcheckCode: 5900000 ]
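For anyone comparing several of these reboot events, here is a rough Python sketch that pulls the useful fields out of the event line above: the four bugcheck parameters, the driver name, and the offset of the failing instruction inside the driver image. The regular expressions are written against this one line only, so the layout is an assumption, and `parse_bugcheck` is just an illustrative name.

```python
import re

# The event line exactly as reported by SPCollect above
LINE = (
    "B 02/26/20 12:42:23 DGSSP 76008106 The Storage Processor rebooted "
    "unexpectedly @ 12:27:20 on 02/26/2020: BugCheck 0, "
    "{0000000000000000, 0000000005900000, 0000000000000aa7, 0000000000000000}, "
    "Failing Instruction: 0xfffff880464cecf2 in Fct.sys loaded @ "
    "0xfffff880464c6000 [FF_ASSERT_PANIC]"
)


def parse_bugcheck(line):
    """Extract bugcheck parameters, module, offset and name from one event line."""
    params = re.search(r"\{([0-9a-fA-F, ]+)\}", line)
    instr = re.search(
        r"Failing Instruction: (0x[0-9a-fA-F]+) in (\S+) loaded @ (0x[0-9a-fA-F]+)",
        line,
    )
    name = re.search(r"\[(\w+)\]\s*$", line)
    if not (params and instr and name):
        return None  # the line does not follow the assumed layout
    return {
        "params": [int(p.strip(), 16) for p in params.group(1).split(",")],
        "module": instr.group(2),
        # Offset of the failing instruction from the module load address
        "offset": int(instr.group(1), 16) - int(instr.group(3), 16),
        "name": name.group(1),
    }


info = parse_bugcheck(LINE)
print(info["module"], hex(info["offset"]), hex(info["params"][1]), info["name"])
```

The second parameter is the same 05900000 code that SPCollect reports, and the offset makes it easier to check whether two panics hit the same place in Fct.sys even when the driver was loaded at a different address.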
I understand that this is complicated to fix when the array is not under support.
Many thanks and regards.