June 2nd, 2017 10:00
Impacts of reads and writes on a storage system
I have a VNX5800 with a large number of 2 TB NL-SAS drives that I just inherited.
I ran a short collection period through MiTrend.
Of the top 15 LUNs:
14 reside on 2 TB NL-SAS (413 drives out of 637 in the array)
The average read percentage on the 2 TB NL-SAS is 51%, with 4 LUNs peaking between 62% and 84%. The skew is 30% of the LUNs doing 70% of the I/O.
Average I/O size is 100 KB, with a few LUNs peaking near 200 KB.
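For what it's worth, the skew figure comes from ranking LUNs by IOPS and checking what share of the total the busiest 30% carry. A minimal sketch of that calculation, using made-up per-LUN numbers (not the actual MiTrend data):

```python
# Hypothetical per-LUN average IOPS -- illustrative only,
# not taken from the real MiTrend report.
lun_iops = [900, 750, 600, 400, 350, 120, 100, 80, 60, 40]

total = sum(lun_iops)
ranked = sorted(lun_iops, reverse=True)

# Take the busiest 30% of LUNs and see what share of total I/O they carry.
top_n = max(1, round(0.3 * len(ranked)))
share = sum(ranked[:top_n]) / total
print(f"Top {top_n} LUNs carry {share:.0%} of the I/O")  # Top 3 LUNs carry 66% of the I/O
```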
There are several "company standards" we have to adhere to because of the sheer number of arrays we manage (100 PB total), though I manage 1 PB for a specific client:
The standard here is 60 drives per 2 TB pool.
No mixing of drive types within pools.
We do not use FAST tiering between drive types (only within pools).
For roughly every 20 2 TB drives added, one drive is allocated to FAST Cache (though obviously not with every single increment of 20). Today we have 20 x 200 GB FAST Cache drives.
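As a sanity check, that ratio lines up with the drive counts above (numbers taken from the figures in this post, the 1:20 ratio is our internal standard, not a vendor recommendation):

```python
nl_sas_drives = 413   # 2 TB NL-SAS drives behind the top LUNs
cache_ratio = 20      # company standard: ~1 FAST Cache drive per 20 NL-SAS drives

cache_drives = nl_sas_drives // cache_ratio
print(cache_drives)   # 20 -- matches the 20 x 200 GB FAST Cache drives we have
```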
My question: I have a customer running Hadoop, and as I said, the read skew on some of my LUNs is quite high. According to this doc, https://www.emc.com/collateral/white-papers/h12682-vnx-best-practices-wp.pdf, one front-end read I/O translates to one back-end I/O; however, random reads are more likely to result in a cache miss, whereas virtually all writes can be satisfied by the write cache. I've been told this is a terrible setup to run Hadoop on. Can anyone point me in a direction so I can illuminate the problem of running Hadoop in this storage environment?
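To put rough numbers on the read-miss concern, here is a back-of-the-envelope back-end load estimate. All the parameters (front-end IOPS, read-cache hit rate, the RAID 6 write penalty of 6 typical for parity-protected NL-SAS pools) are my assumptions for illustration, not figures from the white paper; sustained writes still destage to disk with the RAID penalty even though the cache absorbs them up front:

```python
def backend_iops(front_iops, read_pct, read_cache_hit, write_penalty):
    """Rough back-end disk IOPS estimate.

    front_iops     : total front-end IOPS
    read_pct       : fraction of I/O that is reads (e.g. 0.51)
    read_cache_hit : fraction of reads served from cache (assumed)
    write_penalty  : back-end I/Os per front-end write (RAID 6 small writes ~ 6)
    """
    reads = front_iops * read_pct
    writes = front_iops * (1 - read_pct)
    # Read misses go to disk 1:1; writes eventually destage with the RAID penalty.
    return reads * (1 - read_cache_hit) + writes * write_penalty

# Illustrative: 10,000 front-end IOPS, 51% reads, 20% read-cache hits, RAID 6
print(backend_iops(10_000, 0.51, 0.20, 6))  # 33480.0
```

Even with a generous read-cache hit rate, a read-heavy random workload lands almost 1:1 on the NL-SAS spindles, which is where the pain would come from.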
I found this tool; would it be helpful? https://0x0fff.com/hadoop-cluster-sizing/


