Unsolved

November 23rd, 2020 17:00

PS6210 showing 19TB free but both members showing under 1TB.

Nodes have been showing 10,000+ ms write latency. Dell found an issue with low disk space but has not been able to determine where the space is going.

 

Group space is showing 19TB free, but member 1 is under 200GB and member 2 is under 800GB.

 

Volumes are attached to ESXi 7 Update 1.

 

[Attached screenshots: GreyTavistar_0-1606182823988.png, GreyTavistar_1-1606182866360.png, GreyTavistar_2-1606182905035.png]

 

4 Operator

 • 

1.5K Posts

November 23rd, 2020 19:00

Hello, 

 The images haven't shown up yet so I can't comment much. 

 But depending on where you look, there is allocated space and actually written space, which often leads to confusion but doesn't indicate a problem.

  What version of EQL firmware are those arrays running? 

  I would suggest you keep working with Dell Support, as they have access to the diagnostic files from the array.

  Regards, 

Don

4 Operator

 • 

2.3K Posts

November 23rd, 2020 23:00

I can tell you: if free space drops under 200GB or 5%, you will get worse performance. By default, SANHQ will warn you, and there will be a warning in Group Manager as well.

 

Can't help you with the 19TB free space because there are no images. What does the group look like from the command line?

Regards,
Joerg

4 Operator

 • 

2.3K Posts

November 24th, 2020 07:00

Are you sure that you are looking at the right Group Manager?

..

In all the years, I have never seen Group Manager get it wrong in that way. I have certainly seen it be slow to start and close to unusable with a large group and a lot of connections.

Regards,
Joerg

November 24th, 2020 07:00

[Attached screenshots: CC67BA93-F238-40CF-96DF-0D126AA66D45.png, AD7616F5-4B35-428C-9EB5-2AFE46920B78.png, 1F90BEA1-30E6-4CF8-BE9A-3C5F12BD9332.png]

November 24th, 2020 14:00

According to Dell, it's the ghost volume, which was caused by Secure Erase being enabled. We disabled it, but it's taking forever to clear up.

 

 

Iom Meters 11/24/2020 16:33:53 (metadata version 147 (BaseVolCompress), Opt 0Writes, raid accelerator using 1% of disk space)
meter duration 00:19:52 (max lock: 10.0 ms, max cache call: 8.8 ms)
user disk size 19,093,875 MB (1,272,925 pages)
user disk free 432,240 MB (28,816 pages)
user disk used 18,661,635 MB (1,244,109 pages)
user snaps used 1,057,515 MB (70,501 pages)
user temp used 283,920 MB (18,928 pages)
ghost volume 899,629 pages
owner lists in ghost volume 130
ISCSI I/O Meters
Random Total Bytes Avg | Sequential Total Bytes Avg
Reads 798,044 49,840 MB 2,215us | Reads 286,949 18,030 MB 1,027us
Writes 7,595 46,917 MB 49us | Writes 1,020 2,751 MB 313us
LV Meters
Name Cur Max Total <100ms >100ms KB <100ms >100ms Avg <100ms >100ms Max usec
ISCSI Reads 5 16 1,084K 1,084K 915 69,500,139 66.2 GB 57.9 MB 1,901us 1,775us 151ms 597,223
ISCSI Writes 0 10 1,695 1,695 0 98,828 96.5 MB 0 KB 329us 329us 0us 19,510
ISCSI Unmaps 0 1 6,920 6,920 0 50,762,752 48.4 GB 0 KB 20us 20us 0us 166
ISCSI Zeroes 0 0 0 0 0 0 0 KB 0 KB 0us 0us 0us 0
Eqllog Read 0 1 19 19 0 38 38 KB 0 KB 8,483us 8,483us 0us 19,683
Eqllog Write 0 1 1,159 1,159 0 2,318 2.2 MB 0 KB 96us 96us 0us 33,577
DB Data Write 0 1 498 498 0 284 284 KB 0 KB 90us 90us 0us 6,947
Cow Copies 0 0 0 0 0 0 0 KB 0 KB 0us 0us 0us 0
unCow Copies 0 0 0 0 0 0 0 KB 0 KB 0us 0us 0us 0
Move Copies 0 48 66 66 0 29,757 29.0 MB 0 KB 12.3ms 12.3ms 0us 21,137
Zero Writes 0 0 0 0 0 0 0 KB 0 KB 0us 0us 0us 0
User Writes 0 4 38 38 0 841 841 KB 0 KB 169us 169us 0us 815
LvLog Writes 0 1 2 2 0 65 65 KB 0 KB 37.8ms 37.8ms 0us 40,778
Page Meters
Pages New Cur Max Done | Sectors KB Max KB
Cow 1,828 292,916 292,949 1,761 | Cow 3,771,165,193 3,771,360,874
unCow 0 0 0 0 | unCow 0 0
Move 2 0 1 2 | Move 0 15,360
Zero 1,801 375,873 375,998 1,869 | Zero 1,054,592,204 1,055,573,804
Secure 0 0 0 0 | Secure 0 15,360
Total 3,631 668,789 3,632 | Total 4,825,757,397
Page Frees
Pages New Cur Max Done
Freed 1,886 0 1 1,886
Actual Address Space and Usage
Lun Size Free Used Metadata RA
1 RAID-6 Pages: 1,025,594 12,957 1,012,637 151 0
MB: 15,383,910 194,355 15,189,555 2,265 0
0 HOT-6 Pages: 253,092 0 253,092 11 2,530
MB: 3,796,380 0 3,796,380 165 37,950
Totals Pages: 1,278,686 12,957 1,265,729 162 2,530
MB: 19,180,290 194,355 18,985,935 2,430 37,950
Other Information
Resource Total Used Max Used %Used | Page Data Used Max Used
all mem 9,437,184 KB 431,276 KB 431,283 KB 4.569% | alloc 1,244,109 1,244,215
page init 401,408 KB 131,930 KB 131,934 KB 32.866%
page ptrs 346,692 KB 40,347 KB 39,609 KB 11.637%
nvram 65,464 B 386 B 32,754 B 0.589%
lv log 1,228,800 KB 38,231 KB 3.111%
Master State: Appending_Log
LogNum: 0 Rewrites: 0 Marker: 0x5ED160065ED16079 SectorNum: 76462 SectorsUsed: 0
Shadow LV State: Reading_Log
Shadow LogNum: 0 Rewrites: 0 Marker: 0x0000000000000000 SectorNum: 0 SectorsUsed: 76462
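
As a rough sanity check on the output above, the ghost volume size can be estimated from the page counts. This is only a sketch under two assumptions that the output does not state outright: that the MB and page figures on the "user disk" lines describe the same quantity (so their ratio is the page size), and that ghost volume pages are the same size as user pages. The function and constant names are made up for illustration.

```python
# Figures copied from the "Iom Meters" output above (one member).
USER_DISK_SIZE_MB = 19_093_875
USER_DISK_SIZE_PAGES = 1_272_925
USER_DISK_USED_PAGES = 1_244_109
GHOST_VOLUME_PAGES = 899_629


def page_size_mb(size_mb, size_pages):
    """Infer the page size from a line that reports both MB and pages."""
    return size_mb / size_pages


def ghost_volume_summary():
    """Return (page size in MB, ghost volume size in MB, share of used pages).

    Assumes ghost volume pages are the same size as user pages.
    """
    page_mb = page_size_mb(USER_DISK_SIZE_MB, USER_DISK_SIZE_PAGES)
    ghost_mb = GHOST_VOLUME_PAGES * page_mb
    share_of_used = GHOST_VOLUME_PAGES / USER_DISK_USED_PAGES
    return page_mb, ghost_mb, share_of_used


if __name__ == "__main__":
    page_mb, ghost_mb, share = ghost_volume_summary()
    print(f"page size: {page_mb:.0f} MB")
    print(f"ghost volume: {ghost_mb:,.0f} MB")
    print(f"share of used pages: {share:.1%}")
```

If those assumptions hold, the pages work out to 15 MB each, and the ghost volume would account for roughly 13.5 million MB, or about 72% of the used pages on this member.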

November 24th, 2020 16:00

Any reason it would have completely filled up the SAN?  I would have expected it to continue to do housekeeping.

 

Only 8TB is used out of the 20TB of LUNs. There is 36TB of total space on the SAN, and it was completely filled up.

 

The 2nd node doesn't seem to be clearing out any of the ghost volume pages.

4 Operator

 • 

1.5K Posts

November 24th, 2020 16:00

Hello, 

 Thank you for the update. Secure delete of a volume is a low-priority background task, so as not to impact current IO response time. Additionally, as I recall, if there is a reboot, power failure, or failover, it starts the process over.

 Regards,

Don 

4 Operator

 • 

1.5K Posts

November 25th, 2020 07:00

Hello,

I would not want to speculate on a cause. 

 If you still have questions, I suggest you contact support again and follow up with them. 

 Regards, 
Don 
