September 21st, 2015 11:00

VNX FAST VP / FAST Cache - Pool vs. RAID group

Hi

I'm planning to configure a newly purchased VNX 7600 for my company. I need some suggestions on whether to configure a pool [FAST VP] or RAID groups [FAST Cache].

The following disks are available for the pool and RAID groups:

25 SSD disks - 200 GB (corrected)

260 SAS disks - 10K - 900 GB

96 NL-SAS disks - 7.2K - 3 TB

Which combination should I choose?

If I create a pool [FAST VP] with Extreme Performance [4+1], Performance [4+1], and Capacity [6+2] tiers,

what should the pool size be in order to get maximum utilization and high I/O performance? And what should the size of FAST Cache be?

Any suggestions would be highly appreciated.

Thanks in advance,

/aman

1 Attachment

4.5K Posts

September 22nd, 2015 10:00

I've attached a Best Practices guide for VNX - it's for an earlier version (FLARE 31.5), but the basics are the same. Review Chapter 5 (p. 113); it has a lot of information about configuring RAID groups and pools for best performance.

You can also collect a set of spcollects (from both SPA and SPB - or use USM to collect them, which creates a single zipped file) and a couple of sequential NAR files, zip everything into one archive, and submit it to a site that will review the data and create a report showing the current performance of the array (this is provided as a service by EMC).

I would suggest configuring Data Logging with the Archive Interval set to 300 seconds (this creates a new NAR file every 13.1 hours) and Periodic Archiving enabled, and letting it run for some period of time (say a couple of days, or up to 7 days) that covers the times when you experience host latency issues. If you see issues every day, change the Archive Interval to 60 seconds (one new NAR file every 2.6 hours) and collect the NAR files at the end of 24 hours (you should have about 10 or 11 files). Always stop Data Logging before making changes, then start it again once the changes are complete. When submitting the NAR files, only use the ones from either SPA or SPB, not both. Remember, the spcollects and NAR files need to be in one zip file.
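For reference, a minimal sketch of the arithmetic behind those intervals (the per-file sample count is an assumption inferred from the 13.1-hour figure above, not an official constant):

```python
# Arithmetic behind the Archive Interval figures above. SAMPLES_PER_NAR is an
# assumption inferred from the "300 s -> new NAR every 13.1 hours" statement
# in this post, not a documented constant.
SAMPLES_PER_NAR = 157

def nar_coverage(archive_interval_s):
    """Return (hours covered by one NAR file, approx. NAR files per 24 h)."""
    hours_per_file = archive_interval_s * SAMPLES_PER_NAR / 3600.0
    return hours_per_file, 24.0 / hours_per_file

for interval in (300, 60):
    hours, per_day = nar_coverage(interval)
    print(f"{interval:>3} s interval -> ~{hours:.1f} h per NAR, ~{per_day:.0f} NARs/day")
# 300 s -> ~13.1 h per NAR; 60 s -> ~2.6 h per NAR (roughly 9-10 per day)
```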

https://app.mitrend.com/#loginPage

You'll need to create a new user account. The reports created are very detailed.

glen

1 Attachment

15 Posts

September 21st, 2015 23:00

Thanks Ross.

Well, the pool LUNs or RAID group LUNs would be presented to an ESX environment. I've read the best practices PDF about VNX [FAST VP / FAST Cache] - thanks for adding the PDF links.

My question is: should I create a pool with FAST VP [tiering], or RAID groups with FAST Cache?

2. Currently I have a VNX 7500 with 4 pools in a FAST VP configuration [FAST Cache enabled on the pools]. The pool size is more than 100 TB, but I am having performance issues [latency] in our ESX environment.

Additionally:

I have checked the IOPS reports for the current datastores and found an average load of 5,553 IOPS and a peak load of 21,421 IOPS.

Thanks,

Aman

15 Posts

September 22nd, 2015 11:00

Hi Glen, thanks for the valuable input.

15 Posts

September 22nd, 2015 13:00

Hi Glen, I'm having the performance issue only on Tuesdays or Wednesdays. A couple of weeks ago I disabled the Veeam backup, which helped us reduce the I/O workload on the VNX. Moreover, FAST VP relocation needs to move approximately 8 TB of data up to a higher tier, but due to a lack of available I/O the array couldn't complete the relocation.

2 Intern • 715 Posts

September 23rd, 2015 04:00

This is why you need to be aware of your I/O profile and data skew. Your averages and peaks are not really high at all.

When you have such a skew towards what might be one particular workload, you might consider isolating it to its own pool.

15 Posts

September 24th, 2015 03:00

Hi Brett, thanks for responding to the discussion. The problem I am facing is how to distribute IOPS among the disks. It is really difficult to distribute the workload among SSD, SAS, and NL-SAS. I have read some PDFs from EMC, but it is not clear which disks {SAS or NL-SAS} should be used to satisfy the performance and capacity requirements.

8 Posts

September 24th, 2015 13:00

EMC Professional Services would be a good help with this.

However, consider "for a 3-tier pool, start with 10 percent flash, 20 percent SAS, and 70 percent NL-SAS for capacity per tier if skew is not known. Tiers can be expanded after initial deployment to effect a change in the capacity distribution if needed."
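If it helps, here is a minimal sketch of that rule of thumb applied to a hypothetical total pool capacity (the 100 TB figure is just an example, not from this thread):

```python
# Minimal sketch of the "10/20/70" starting split quoted above, applied to a
# hypothetical total pool capacity (the 100 TB figure is only an example).
def three_tier_split(total_tb):
    """Suggested starting capacity per tier when the workload skew is unknown."""
    return {
        "flash (10%)":  total_tb * 0.10,
        "SAS (20%)":    total_tb * 0.20,
        "NL-SAS (70%)": total_tb * 0.70,
    }

print(three_tier_split(100.0))
# {'flash (10%)': 10.0, 'SAS (20%)': 20.0, 'NL-SAS (70%)': 70.0}
```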

4.5K Posts

September 24th, 2015 14:00

With pools, FAST VP (auto-tiering) will re-balance the workload based on the relative temperature of the I/O across the different tiers (SSD, SAS and NL-SAS). But this assumes the I/O workload stays relatively stable. Backups that only run at certain times will cause some data to be marked as hotter than other data, which means that data may be relocated but never accessed again, so it will have to move back down to a lower tier; then the backup runs again and creates new hot slices. This back and forth can cause issues. There are some processes in auto-tiering that look for backup patterns and try not to mark that data for relocation.

Getting an idea of what your hosts are doing is the first step in determining how to balance the workload across the array. You'll want to create separate pools for different workloads - for example, you should not create a single pool for databases and put the data files, logs and temp in the same pool. Logs and temp have different requirements than the database files, so you should have two pools - one for the database files (RAID 5 with FAST Cache) and one for logs and temp (RAID 10 without FAST Cache).

glen

195 Posts

September 25th, 2015 07:00

Food for thought:

Your raw disk composition is ~55% NL-SAS, ~44% 10K, ~0.5% SSD. With only ~0.5% SSD, I would suggest allocating all of it as FAST Cache... unless you happen to have an incredibly important workload that is only ever going to require ~2 TB or less.

From memory, I believe the white papers suggest using 150 IOPS for a 10K drive and 90 IOPS for a 7.2K drive. In the math below I'll use those, but you can use different values if you prefer.

I would give consideration to what size and IOPS make a good fit for your overall ESX workload. It is just as important to offer separation between some guests as it is to give a guest access to IOPS. I would build pools sized against what you have, and expect to have, running as guests.

For example, I would probably construct five symmetric pools each containing ((14+2 R6 of 3 TB) + 5*(8+1 R5 of 900 GB)), and construct an odd-sized pool out of the remainder. That would give me five pools with a net capacity of ~72 TB each, which would each support ~7,000 IOPS, and one smaller pool with ~35 TB and ~4,000 IOPS.

Alternatively, I might construct smaller pools in order to provide more separation between guests; that might look like this:

Seven symmetric pools each containing ((6+2 R6 of 3 TB) + 5*(4+1 R5 of 900 GB)), along with four symmetric pools containing ((6+2 R6 of 3 TB) + 4*(4+1 R5 of 900 GB)). That yields seven pools with ~32 TB and ~3,500 IOPS, and four pools with ~29 TB and ~3,000 IOPS.

Note that the first config gives you ~395 TB of usable capacity, while the second one is closer to 340 TB.
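For what it's worth, a minimal sketch of the arithmetic behind those two layouts, counting only data spindles per RAID group and using the 150/90 IOPS rules of thumb above; the per-drive usable capacities (833 GB and 2,794 GB) are assumed values, not vendor-exact figures:

```python
# Rough pool sizing arithmetic for the two layouts above. Per-drive usable
# capacities (GB) and IOPS are assumptions/rules of thumb, not exact values.
USABLE_GB = {"sas_900g": 833, "nlsas_3t": 2794}
IOPS = {"sas_900g": 150, "nlsas_3t": 90}

def pool(nl_rgs, nl_data_drives, sas_rgs, sas_data_drives):
    """Return (usable TB, rule-of-thumb IOPS), counting data drives only."""
    cap_gb = (nl_rgs * nl_data_drives * USABLE_GB["nlsas_3t"]
              + sas_rgs * sas_data_drives * USABLE_GB["sas_900g"])
    iops = (nl_rgs * nl_data_drives * IOPS["nlsas_3t"]
            + sas_rgs * sas_data_drives * IOPS["sas_900g"])
    return cap_gb / 1000, iops

# First layout: (14+2 R6 NL-SAS) + 5 x (8+1 R5 SAS) per pool
print(pool(1, 14, 5, 8))   # ~72 TB, ~7,260 IOPS per pool
# Second layout, larger variant: (6+2 R6 NL-SAS) + 5 x (4+1 R5 SAS)
print(pool(1, 6, 5, 4))    # ~33 TB, ~3,540 IOPS per pool
# Second layout, smaller variant: (6+2 R6 NL-SAS) + 4 x (4+1 R5 SAS)
print(pool(1, 6, 4, 4))    # ~30 TB, ~2,940 IOPS per pool
```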

Personally I tend to prefer to build pools that are optimized for capacity, and then only place enough 'stuff' in each to consume the available IOPS.  I also rarely use more than ~90% of the net capacity of a pool in order to reduce the tendency to thrash data between tiers as the pool becomes fully utilized.  For both of those reasons I would favor something more like the first alternative...but those opinions are informed by my workloads, and your mileage may vary.

15 Posts

September 27th, 2015 01:00

Hi Glen,

Thanks again for the comments. But in my current environment I have 4,000 VMs, which I cannot put in separate pools; somehow I need to find a common configuration that balances load and performance.

15 Posts

September 27th, 2015 03:00

Hi Zaphod

Thanks for the food for thought. What do you think about this combination?

Usable capacity: SSD = 200 GB --> 186 GB

SAS = 900 GB --> 833 GB

NL-SAS = 3,000 GB --> 2,794 GB

Tier 0 - SSD {1 x 4+1} R5 = 744 GB

Tier 1 - SAS {6 x 4+1} R5 = 19,992 GB

Tier 2 - NL-SAS {4 x 6+2} R6 = 67,056 GB

Total IOPS:

SSD = 4 * 5,250 = 21,000

SAS = 24 * 150 = 3,600

NL-SAS = 24 * 90 = 2,160

Total IOPS: 26,760

Total capacity: 87,792 GB
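A minimal sketch of that arithmetic, using the assumed usable capacities and per-drive IOPS listed above and counting data drives only:

```python
# Reproduces the tier arithmetic above; per-drive usable GB and IOPS are the
# assumed figures from this post, and only data drives are counted.
tiers = {
    #         (RAID groups, data drives per RG, usable GB/drive, IOPS/drive)
    "SSD":    (1, 4, 186, 5250),
    "SAS":    (6, 4, 833, 150),
    "NL-SAS": (4, 6, 2794, 90),
}

total_gb = total_iops = 0
for name, (rgs, data_drives, gb, iops) in tiers.items():
    cap = rgs * data_drives * gb
    perf = rgs * data_drives * iops
    total_gb += cap
    total_iops += perf
    print(f"{name:7s}: {cap:>6} GB, {perf:>6} IOPS")
print(f"Total  : {total_gb} GB, {total_iops} IOPS")  # 87792 GB, 26760 IOPS
```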

I've attached my current pool workload and the average/peak load from the existing VNX 7500 array.

2 Attachments

195 Posts

September 28th, 2015 07:00

Let me start here:

In your pool design, ~78.5% of your computed IOPS are provided by ~0.8% of your pool capacity.  Those SSD IOPS can only work for you if they meet the needs of your I/O.
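A quick check of those two percentages, recomputed from the totals in the proposed design:

```python
# The ~78.5% / ~0.8% skew figures, recomputed from the proposed design above.
ssd_iops, total_iops = 4 * 5250, 26760        # 21,000 of 26,760 IOPS
ssd_gb, total_gb = 744, 87792                 # 744 of 87,792 GB
print(f"SSD share of IOPS:     {ssd_iops / total_iops:.1%}")   # ~78.5%
print(f"SSD share of capacity: {ssd_gb / total_gb:.1%}")       # ~0.8%
```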

In looking at your load breakdown, 50-63% of your I/O are reads.  Data has to be read from where it lives; and I have to suspect that most of the data you read will not fit in Tier 0. So you can't count on those SSDs to provide much of a contribution to your read/random read workload.

In a modern array, writes are acknowledged as soon as they are deemed secure, here that means as soon as the controllers mirror the write buffers.  But they do need to be destaged by the controllers soon enough to not overwhelm the cache.  Having a tier 0 to write new data to is 'nice', but in a well running array it has little to no impact on your host write I/O response times.

So unless you have a (somewhat unusual) workload where the majority of your I/O is concentrated on a very small amount of space, your tier 0 isn't going to help very much with either your read performance or your write performance.

Fast Cache is both more granular, and more dynamic, than tiering. It is better designed to benefit whatever I/O you are doing at the moment, without regard to where the data you are using is stored in the array.

15 Posts

September 28th, 2015 08:00

Hi Zaphod ,

I agree with you, but I see in my report that the cache is at 99% utilization at peak. That's why I thought of having SSD, which would improve read and write I/O, but only for the hot slices that are ready to move to tier 0. The thing is, I want to avoid queue-depth and outstanding-I/O situations where the cache is at 99% and unable to process the I/O queue.

I have attached a screenshot - it might give more info.

1 Attachment

195 Posts

September 28th, 2015 13:00

I'm going to recommend this for your review:

https://www.emc.com/collateral/white-papers/h12102-vnx-fast-vp-wp.pdf

It offers a good description of FAST VP, and it compares/contrasts it with Fast Cache.  I would draw your attention in particular to pages 22-30, and specifically to Table 4 which compares the two technologies.

15 Posts

September 29th, 2015 02:00

Hi "DCsafe"

Thanks for the comments. I have checked the overall skew for the existing VNX 7500: 74% of IOPS go to 29% of the LUNs.
