Unsolved

5 Posts

1347

April 1st, 2022 03:00

PowerVault MD3600i VD reconstruction stuck at 99%

Hi,

We have a PowerVault MD3600i where two disks failed. We replaced both disks. Virtual Disk reconstruction began immediately on the first disk, which now shows a status of Replaced, while the second disk shows as Unassigned after replacement.

We decided to wait for the first disk's reconstruction to complete, but it reached 99% and has been stuck there. After waiting a day to see whether it would progress, we removed the disk and replaced it with another new disk. This changed nothing: the array reconstruction is still stuck at 99% and has been for about 3 days. Is there any way to stop or kill this reconstruction without losing the data on the array?

What are our options now?

Moderator


9.7K Posts

April 1st, 2022 08:00

Zevth80,

 

What I would suggest is that we get a support bundle to determine what the issue is before we try anything that could risk the data. Pull the support bundle, upload it to upload . dell . com, and then send me a private message with the service tag you used for the upload so that I can locate the logs. Once I have them, I can research what caused this and we can go from there.

 

Let me know. 

 

 

 

5 Posts

April 3rd, 2022 19:00

Hi Chris,

I have uploaded the support bundle and sent you a PM with the service tag.

Thanks.

Moderator


4.1K Posts

April 4th, 2022 01:00

Hello @zevth80,

 

Would you mind going into the Storage Manager where you pulled the logs and checking Recovery Guru for the errors it reports? There is one, VOLUME_GROUP_PARTIALLY_COMPLETE - Recovery Failure Type Code: 200, which needs to be checked.

 

May I ask which drives you replaced? Disks 0 and 11? Are there any errors on disk 4?

5 Posts

April 4th, 2022 02:00

Hi Joey,
Please see the screenshots below for Recovery Guru errors. The disks replaced were 0 and 4, however the array seems to only be rebuilding for disk 0. Disk 4 is still showing as Optimal Unassigned.

zevth80_0-1649065790164.png

zevth80_1-1649066015795.png

 

Moderator


4K Posts

April 4th, 2022 06:00

Hello,

Please let me check with my colleagues what the action plan should be.

I will let you know as soon as possible.

Thanks

Marco

Moderator


9.7K Posts

April 4th, 2022 10:00

Zevth80,

 

Could you confirm whether you performed the Recovery Guru steps shown below?

Assign a physical disk to take over for the missing physical disk by completing the following steps:

a. Refer to the Details area to find out which disk group has been affected.
b. Highlight the affected disk group on the Storage and Copy Services tab in the Array Management Window, and then select the Storage > Disk Group > Replace Physical Disks menu option.
c. Select the missing physical disk that corresponds to the slot location (identified in the Details area).
d. Select an appropriate replacement physical disk, and then click the Replace Physical Disk button.

Note:

  • When the replace operation begins, data is reconstructed on the new physical disk using information on the other physical disks comprising the disk group. This reconstruction should begin automatically. The physical disk's fault indicator lights will go off and the activity indicator lights of the physical disks in the disk group will start flashing.
  • If you are replacing a physical disk in a storage array that contains hot spares, data reconstruction will start on the hot spare before you insert the new physical disk. The data on the replacement physical disk may not be reconstructed until after it has completed the process on the hot spare.

e. When the physical disk reconstruction completes and the disk group is Optimal, go to step 3.

3. Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your Technical Support Representative.



Let me know, as we need to confirm that before moving forward. If you have performed them, tell me and I will respond via private message.

 

 

5 Posts

April 4th, 2022 19:00

Hi Chris, we get the following error when running Replace Physical Disk:

zevth80_0-1649126118426.png

Details are below:

----------------------

OPERATING ENVIRONMENT DATA

Client system name: **removed**
Client OS: Windows Server 2012 R2

Application version: 10.84.G6.58
Storage array management version: 10.84.G6.58
Storage array name: **removed**-mgmt-dell-san
Firmware version: 07.84.47.60
Management class: devmgr.v1084api04.Manager



**************************************************

ERROR DATA

Command sent to RAID controller module in slot: 0
Host name: **removed**
IP address:**removed**

Return code: Error 29 - This operation cannot complete because there was a security authentication failure on a parameter in the command sent to the RAID controller module. Please retry the operation. If this message persists, contact your Technical Support Representative.

Operation when error occurred: PROC_replaceDrive
Timestamp: 5 Apr, 2022 10:33:02 AM

STACK DATA

devmgr.v1084api04.sam.jal.ManagementOperationFailedException:
at devmgr.v1084api04.sam.jal.SYMbolClient.dispatchOperation(SYMbolClient.java:569)
at devmgr.v1084api04.sam.jal.StorageArrayFacade.issueCommand(StorageArrayFacade.java:10491)
at devmgr.v1084api04.sam.jal.StorageArrayFacade.sendCommandCommon(StorageArrayFacade.java:10037)
at devmgr.v1084api04.sam.jal.StorageArrayFacade.sendCommand(StorageArrayFacade.java:9964)
at devmgr.v1084api04.sam.jal.StorageArrayFacade.replaceDrive(StorageArrayFacade.java:11042)
at devmgr.v1084api04.sam.advanced.ReplaceDriveDialog$7.performOp(ReplaceDriveDialog.java:621)
at devmgr.v1084api04.shared.AbstractTaskAdapter.run(AbstractTaskAdapter.java:108)

THREAD DATA

Thread[Reference Handler,10,system]
Thread[Finalizer,8,system]
Thread[Signal Dispatcher,9,system]
Thread[Attach Listener,5,system]
Thread[Java2D Disposer,10,system]
Thread[AWT-Shutdown,5,system]
Thread[AWT-Windows,6,system]
Thread[TimerQueue,5,system]
Thread[Swing-Shell,6,system]
Thread[AWT-EventQueue-0,6,main]
Thread[DestroyJavaVM,5,main]
Thread[SwingWorker-pool-2-thread-1,5,main]
Thread[ChangeDetector,6,main]
Thread[PreferenceStoreChangeDetector,6,main]
Thread[LogMsgThread,6,main]
Thread[Np_Link_Monitor0,6,main]
Thread[DMVGarbageCollectorThread,6,main]
Thread[RecoveryProfile-12,6,main]
Thread[AEN-13,6,main]
Thread[Timer-0,6,main]
Thread[SwingWorker-pool-2-thread-2,5,main]
Thread[Thread-278,6,main]
Thread[SwingWorker-pool-2-thread-3,5,main]
Thread[SwingWorker-pool-2-thread-4,5,main]
Thread[SwingWorker-pool-2-thread-5,5,main]
Thread[SwingWorker-pool-2-thread-6,5,main]
Thread[SwingWorker-pool-2-thread-7,5,main]
Thread[SwingWorker-pool-2-thread-8,5,main]
Thread[SwingWorker-pool-2-thread-9,5,main]
Thread[SwingWorker-pool-2-thread-10,5,main]
Thread[pool-7-thread-52,5,main]
Thread[pool-7-thread-53,5,main]
Thread[pool-7-thread-54,5,main]
Thread[pool-7-thread-55,5,main]
Thread[pool-7-thread-56,5,main]
Thread[pool-7-thread-57,5,main]
Thread[Image Fetcher 0,8,main]
Thread[Image Fetcher 1,8,main]
Thread[Image Fetcher 2,8,main]
Thread[Image Fetcher 3,8,main]

----------------------

Note that throughout this time the array reconstruction has remained stuck at 99%, as it has been for the past week.

zevth80_2-1649126555869.png
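
For anyone comparing several of these Storage Manager error dumps, here is a minimal Python sketch that pulls out the few fields that matter for triage. It is not a Dell tool: the function name `summarize_error` and the dictionary keys are made up for illustration, and it assumes the report keeps the same `Label: value` layout as the dump above.

```python
import re

# Excerpt copied from the error report above (host details already redacted).
REPORT = """\
Command sent to RAID controller module in slot: 0
Return code: Error 29 - This operation cannot complete because there was a security authentication failure on a parameter in the command sent to the RAID controller module. Please retry the operation. If this message persists, contact your Technical Support Representative.

Operation when error occurred: PROC_replaceDrive
Timestamp: 5 Apr, 2022 10:33:02 AM
"""


def summarize_error(report: str) -> dict:
    """Extract slot, error code, operation and timestamp from a report.

    Hypothetical helper: the key names are illustrative, and any field
    that is not found in the text is returned as None.
    """
    patterns = {
        "controller_slot": r"^Command sent to RAID controller module in slot: (\d+)",
        "error_code": r"^Return code: Error (\d+)",
        "operation": r"^Operation when error occurred: (\S+)",
        "timestamp": r"^Timestamp: (.+)$",
    }
    summary = {}
    for key, pattern in patterns.items():
        # MULTILINE lets ^ and $ match at the start and end of each line.
        match = re.search(pattern, report, flags=re.MULTILINE)
        summary[key] = match.group(1).strip() if match else None
    return summary


if __name__ == "__main__":
    for key, value in summarize_error(REPORT).items():
        print(f"{key}: {value}")
```

Running it over each saved report makes it easy to see whether every failed attempt returns the same error code for the same operation.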

 

 

5 Posts

April 5th, 2022 02:00

Hi,

We are currently moving data off the array since it is still accessible; however, I/O on the array is high, so it's slow going. We will attempt any changes once all data has been moved off.

Do you recommend restarting the SAN to stop the disk 0 rebuild and to start the disk 4 rebuild instead? Or should we just delete and recreate the VD?

Moderator


4.1K Posts

April 5th, 2022 02:00

Hi @zevth80,

 

I suspect the rebuild cannot complete because drive 11 has some unreadable sectors. Although disk 0 has not completed its rebuild, disk 4 might still be able to start rebuilding.

 

Is the data accessible? Do you have a backup of the data on the storage?

 

Do you have physical access to the storage? Do you know the LED status of all the drives?

Moderator


4K Posts

April 5th, 2022 06:00

Hello,

Yes, you can first try to start the disk 4 rebuild; if that doesn't work, you can recreate the VD.

Thanks

Marco

Moderator


9.7K Posts

April 5th, 2022 08:00

Zevth80,

 

You don't want to restart the SAN if we don't have to. You can do the Recovery Guru steps from its current state.

Verify the Recovery Guru steps and then we can go from there. We may end up needing to delete and recreate the VD, but we aren't at that point yet.

 

 

 
