December 21st, 2011 09:00

EMC Storage System "IO Response Time": Avoiding bottlenecks: cache or disk access most important?

Hi All

There are probably a lot of "noobie" queries here, so apologies in advance, but I'm trying to understand where the bottlenecks for Storage System "IO Response Time" can occur.

My initial research is into how the cache works, so I would like to set out my current understanding and then my follow-on queries. I would be very grateful for comments on any mistakes I have made and for any answers to my queries.

My current understanding.

Guest OS

The guest Operating System (OS) sends an I/O to the Storage System/Array. This will be sized according to the Guest OS block size.

Storage System

The Storage Processors (SPs) receive the I/O and process it.

Potential Bottleneck 1:

If the SPs are overloaded, a queue can form, which would be a cause of a "bottleneck".

The SPs will check the cache to see whether the memory location in question is already stored there. If it is, this is a hit; if it is not, this is a miss. Combined with the I/O type, this gives 4 possible outcomes:

A Read Cache Hit

The data is read from the cache and sent to the Guest OS.

B Read Cache Miss

The data is read from disk, placed in cache and sent to the Guest OS.

Potential Bottleneck 2:

This is the one of the 4 cache-check outcomes where an immediate disk I/O is required rather than the request being serviced from cache. Therefore, intensive random read I/Os could cause a "bottleneck" if they are sufficient to exceed the I/O capacity of the RAID Group/disks.

C Write Cache Hit

The data is written to cache on both SPs (mirrored). Once this is done, an acknowledgement is sent back to the Guest Operating System.

D Write Cache Miss

Free cache is allocated to the memory location in question. The data is written to cache on both SPs (mirrored) and an acknowledgement is sent back to the Guest Operating System.
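
To make sure I've described outcomes A-D unambiguously, here is my mental model written as a small Python sketch. It is purely illustrative pseudologic (a toy dict-based cache), not a claim about how the array actually implements any of this:

    # Toy model of the four cache-check outcomes (A-D above). Illustrative only.
    def handle_io(op, lba, cache, disk):
        """op is 'read' or 'write'; cache and disk are plain dicts keyed by block address."""
        if op == 'read':
            if lba in cache:                 # A: read cache hit
                return cache[lba]            #    data served straight from cache
            data = disk[lba]                 # B: read cache miss -> immediate disk I/O
            cache[lba] = data                #    data placed in cache, then returned
            return data
        else:
            # C / D: the write lands in (mirrored) write cache and is acknowledged
            # immediately; the only difference on a miss (D) is that a free cache
            # page has to be allocated first. The flush to disk happens later.
            cache[lba] = 'dirty'
            return 'ack'

    # e.g. a read miss followed by a read hit on the same block
    cache, disk = {}, {100: 'blockdata'}
    print(handle_io('read', 100, cache, disk))   # miss: fetched from disk, cached
    print(handle_io('read', 100, cache, disk))   # hit: served from cache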

Other important cache behaviours are:

i) For sequential reads, data can be prefetched from disk to the read cache.

ii) Write cache data is flushed (written to disk); there are several flushing regimes (see the sketch after this list):

  • Dirty Pages are the write cache pages waiting to be written to disk.
    • Idle Cache Flushing
      • Occurs when the % Dirty Pages is less than the LWM (Low Water Mark, EMC default = 60%)
    • WaterMark Cache Flushing
      • Occurs when the % Dirty Pages is between the LWM and the High Water Mark (HWM, EMC default = 80%)
    • High WaterMark Flushing
      • Occurs when the % Dirty Pages exceeds the HWM
    • Forced Flushing
      • Potential Bottleneck 3:
      • Occurs when the % Dirty Pages rises above 99%. This will disable the write cache until the % of Dirty Pages drops below ? again.

iii) Data can be "coalesced" in the write cache for more efficient flushing
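
Here is the flushing behaviour in (ii) written out as a small Python sketch, using the LWM/HWM defaults and the 99% forced-flush threshold I quoted above (please correct these figures if they are wrong):

    # Illustrative sketch of the flushing regimes in (ii). Thresholds are the
    # figures quoted above, not values I have verified on an array.
    LWM = 60   # Low Water Mark (% Dirty Pages)
    HWM = 80   # High Water Mark (% Dirty Pages)

    def flush_mode(dirty_pct):
        if dirty_pct > 99:
            return 'forced flushing (write cache disabled)'   # Potential Bottleneck 3
        if dirty_pct > HWM:
            return 'high watermark flushing'
        if dirty_pct >= LWM:
            return 'watermark flushing'
        return 'idle flushing'

    for pct in (10, 65, 85, 100):
        print(pct, '->', flush_mode(pct))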

My queries!

A Is my understanding of how the cache operates above correct?

B We have looked at 2 monitoring systems (SolarWinds & Nimsoft). They use EMC SMI-S and navicli respectively to obtain performance metrics. However, in both cases the documentation on what the metrics actually represent seems limited. Therefore:

i) Do people know whether EMC produces detailed documentation on the metrics that can be obtained using SMI-S and navicli?

ii) Are the following good indicators of performance issues at the storage level (see the sketch after this list)?

  • Forced Flushing, i.e. the HWM is exceeded
  • A high percentage of Read Cache Misses
  • The SPs being over-utilised
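
To show what I mean, this is the kind of simple threshold check I had in mind. The metric names and thresholds here are entirely made up for illustration; they are not actual SMI-S or navicli counters:

    # Hypothetical metric names and thresholds -- for illustration only.
    def flag_issues(metrics):
        """metrics: dict of (made-up) counter names -> values sampled from the array."""
        issues = []
        if metrics['forced_flushes_per_hour'] > 0:
            issues.append('write cache saturating (forced flushing observed)')
        if metrics['read_cache_hit_pct'] < 50:        # assumed threshold
            issues.append('high proportion of read cache misses')
        if metrics['sp_utilization_pct'] > 70:        # assumed threshold
            issues.append('SP approaching saturation')
        return issues

    print(flag_issues({'forced_flushes_per_hour': 3,
                       'read_cache_hit_pct': 42,
                       'sp_utilization_pct': 81}))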

C A lot of documentation (both EMC and other) is devoted to matching Array RAID Group characteristics (usually RAID 5, 6 and 1/0) to the following application I/O characteristics:

  • Sequential versus random
  • Writes versus reads
  • Large block size versus small block size
  • Steady versus bursty
  • Multi-threaded versus single-threaded

However, it seems to me that RAID Group I/O capacity will only be limiting in 2 situations:

  • intensive reads which are cache misses
  • intensive writes which cause forced flushing

Is this correct, or could RAID Group I/O capacity reach its limit during WaterMark Cache Flushing, and even if it could, would this matter?

Therefore, my query is: why is RAID Group design given so much attention, and would it not be more sensible to ensure good cache design so that very little strain is put on the RAID Groups/disks?
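
For context, this is roughly how I have been estimating RAID Group back-end I/O capacity for the cases where the cache cannot absorb the load. The per-disk IOPS figure is a rule-of-thumb assumption and the write penalties are the usual textbook values (4 for RAID 5, 6 for RAID 6, 2 for RAID 1/0):

    # Back-of-the-envelope RAID Group capacity estimate -- all figures are assumptions.
    WRITE_PENALTY = {'raid5': 4, 'raid6': 6, 'raid10': 2}   # back-end I/Os per front-end write

    def max_frontend_iops(disks, iops_per_disk, raid, read_fraction):
        """Front-end IOPS a RAID Group can absorb before its disks saturate."""
        backend_capacity = disks * iops_per_disk
        write_fraction = 1.0 - read_fraction
        # each front-end read costs 1 back-end I/O; each write costs the RAID write penalty
        backend_per_frontend_io = read_fraction + write_fraction * WRITE_PENALTY[raid]
        return backend_capacity / backend_per_frontend_io

    # e.g. a 4+1 RAID 5 group of 15k disks (~180 IOPS each, assumed), 70% random reads
    print(round(max_frontend_iops(5, 180, 'raid5', 0.7)))   # roughly 474 front-end IOPS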

D How can you tell whether actual disk reads/writes are a bottleneck, and what is the best level at which to check this, i.e. LUN, RAID Group or Disk?

E When a guest OS sends an I/O request, is the I/O size always 1 block, or can it be a multiple of blocks?

Also, when the Storage Array reads from or writes to disk, does it keep the I/O the same as what the guest OS sent, or will it split/coalesce it?
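
To illustrate what I mean by split/coalesce, here is a toy example of adjacent dirty blocks being merged into larger back-end writes. This is purely conceptual; I have no idea whether the array does anything like this internally:

    # Toy illustration of write coalescing: contiguous dirty blocks merged into
    # one larger back-end write. Purely conceptual.
    def coalesce(dirty_lbas):
        """Group dirty logical block addresses into contiguous runs."""
        runs = []
        for lba in sorted(dirty_lbas):
            if runs and lba == runs[-1][1] + 1:
                runs[-1][1] = lba           # extend the current run
            else:
                runs.append([lba, lba])     # start a new run
        return [(start, end - start + 1) for start, end in runs]   # (start LBA, length in blocks)

    # five 1-block host writes become two back-end writes of 3 and 2 blocks
    print(coalesce([100, 101, 102, 200, 201]))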

For those of you who have had the patience to get to this point - many thanks! There is a lot here, but I couldn't see how else to contextualise my queries. I hope it's of interest and I look forward to comments/responses!

Thanks

John
