July 28th, 2013 18:00

Ask the Expert Summary: Symmetrix Performance Analysis & Monitoring

Introduction

This article summarizes the 2013 Chinese ATE activity “Symmetrix Performance Analysis & Monitoring”. The original thread is https://community.emc.com/thread/173593.

Detailed information

Question 1:

What kinds of tools can perform Symmetrix Performance Analysis and Monitoring?

Answer:

In general, to monitor Symmetrix performance, we use two tools: Symmetrix Performance Analyzer and EMC Ionix ControlCenter Performance Manager.

EMC Symmetrix Performance Analyzer (SPA) is a simple, intuitive, browser-based user interface for historical trending and analysis of performance data. It was developed as an overlay to the Symmetrix Management Console (SMC); when launched from SMC, the SPA interface opens in its own separate web window. To install SPA, you must first install SMC and then add the SPA license. SPA is included in the SMC installation package, but the SPA license is not. The license is added to SMC and is tied to the Symmetrix serial number (SID). SPA adds an optional layer of data collection, analysis, and presentation tools to the SMC implementation. You can use SPA to:

  • View graphs detailing the system’s performance
  • Drill-down through the data to investigate an issue
  • Monitor performance over time

The key capabilities of Symmetrix Performance Analyzer are:

  • Dashboard view – configurable for user-defined views.
  • Heat map – provides a real-time heat map of the entire array.
  • Real-time monitoring – allows a storage administrator to monitor the system in real time.
  • Advanced diagnostics – allows users to set alerts and thresholds, then drill down to identify the root cause of performance issues.
  • Advanced alerting – allows you to establish different types of alerts, such as critical, informational, and warning.
  • Trend and forecast – provides detailed performance tracking to trend and forecast future storage growth.

EMC Ionix ControlCenter Performance Manager allows the user to collect, manage, view, and analyze historical performance data and performance trends over a period of time.

Performance analysis is done using historical data collected from the following sources:

  • Symmetrix storage arrays
  • CLARiiON storage arrays
  • Hosts
  • Oracle databases
  • Fibre Channel connectivity device ports


Question 2:

How do I evaluate storage read/write performance based on the host, storage, and network configuration to determine whether it meets my requirements? For example, I have a DMX-4 array with 15K RPM FC disks, an 8 Gb/s SAN, and dual HBA paths. I use the DD command to run a read/write stress test. With this configuration, what speed should I expect? With current storage technology, what is the main bottleneck for speed?

Answer:

Generally, there are three conditions affecting performance:

1. Utilization: The workload on the back-end or front-end ports is unbalanced. On the back end, balance can be achieved through disk load balancing with Symmetrix Optimizer or FAST, or through host-side striping, Meta Volumes, and so on. On the front end, PowerPath software can help with load balancing.

2. Limitations.

1) Physical limitations: These include hard disk and port limitations. The resolution is system expansion.

2) Enginuity limitations: Symmetrix Device Write Pending limitations and/or System Write Pending limitations. The solution is to increase memory, or to use Meta Devices or other technology.

3. Unrealistic performance expectations: The solution is to scale out the hardware, or to better understand the performance requirements and load.

Storage read/write speed depends on many factors, so it is difficult to define an absolute performance standard. What you get out of a configuration depends on whether the physical disks are distributed across different back-end DAs, whether there are enough physical hard disks, which type of RAID protection is used, and so on. Those factors determine the performance of the Symmetrix. For example, the performance of 15K RPM FC disks depends on the hard disk configuration: whether the disks are evenly distributed across the back-end DAs of the DMX-4, how the LUN is striped, which RAID protection is used, and so on. Likewise, the performance gain from dual HBA paths depends on how the HBA ports are mapped to the front-end Symmetrix FA ports.

In general, configuring for performance means allocating sufficient resources and keeping all resources balanced and utilized as much as possible. It also helps to keep the configuration from becoming too complicated.

First, I always advise my clients to run a stress test with their own business data; however, not many clients have a test environment in which to do so. So we can only use I/O test software to simulate the client's I/O: IOMeter on Windows platforms or IORate on UNIX platforms. Both are free software that can be downloaded from the Internet. The benefit of this I/O test software is that users can control the number of concurrent processes and the composition of the I/O (for example: I/O size, the ratio of read I/O, the percentage of random I/O, etc.). The problem with the DD command is that it only produces sequential I/O. Even with multiple concurrent DD processes, the concurrency is limited, and it still cannot fully mimic the I/O of the client’s application.

In addition, performance is a composite indicator. Many clients look at only one indicator: some value IOPS (I/Os per second) more, and others value user response time more. In fact, these indicators are all important, and only together with the I/O size, read/write ratio, random/sequential ratio, and other factors can they be used to evaluate storage I/O performance objectively. As the Symmetrix architecture is very complex, its response time under a light load with a small number of I/Os may well be no better than that of a storage array with a simple architecture, so the results of the DD command are not a suitable measure. Therefore, ensure that you select the storage that is appropriate for your requirements.
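To make the idea of an I/O "composition" concrete, here is a minimal Python sketch that generates a list of I/O requests from a given mix. It is only an illustration: the names `WorkloadMix` and `generate_ios` are hypothetical and are not part of IOMeter, IORate, or any EMC tool, and the sketch issues no real I/O.

```python
import random
from dataclasses import dataclass


@dataclass
class WorkloadMix:
    """Hypothetical description of an I/O mix (not a real tool's config)."""
    io_size_kb: int    # size of each I/O in KB
    read_pct: float    # fraction of I/Os that are reads (0.0 to 1.0)
    random_pct: float  # fraction of I/Os that are random (0.0 to 1.0)
    concurrency: int   # workers a real tool would run; unused in this sketch


def generate_ios(mix, count, device_kb, seed=0):
    """Return a list of (operation, offset_kb) tuples that follow the mix."""
    rng = random.Random(seed)
    slots = device_kb // mix.io_size_kb  # number of aligned I/O positions
    ios = []
    next_slot = 0
    for _ in range(count):
        op = "read" if rng.random() < mix.read_pct else "write"
        if rng.random() < mix.random_pct:
            slot = rng.randrange(slots)  # random I/O: jump anywhere
        else:
            slot = next_slot             # sequential I/O: continue from last position
        next_slot = (slot + 1) % slots
        ios.append((op, slot * mix.io_size_kb))
    return ios


# A DD-like pattern: 100% read, 100% sequential, a single stream.
dd_like = WorkloadMix(io_size_kb=8, read_pct=1.0, random_pct=0.0, concurrency=1)
print(generate_ios(dd_like, count=4, device_kb=1024))
```

With `read_pct=1.0` and `random_pct=0.0` the sketch produces only sequential reads, which is the narrow pattern a single DD process exercises. Changing the three ratios is what lets a test tool approximate an application's real I/O profile.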



Question 3:

Compared with midrange storage products, what are the advantages of Hyper Volumes in Symmetrix DMX and of the VMAX storage pool architecture?

Answer:

Previous generations of Symmetrix have used Hyper Volumes. There are several points to consider:

1. Performance. The physical hard drives are divided into Hyper Volumes to make more volumes or devices visible to the host, which allows more concurrent I/Os to be generated.

2. Function. Different protection methods (mirroring, RAID-5, etc.) can be used, and a variety of functions, for example disaster recovery, cloning, and snapshots, can be set up on the physical hard disks according to client demands.

A characteristic of Hyper Volumes is that they require the user to distribute data evenly, which can be done through striping at the storage array, operating system, or database level. If the data is distributed poorly, it may produce a “hot disk”, which prevents the storage array from delivering its full performance.
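As a rough illustration of what a “hot disk” looks like in numbers, the sketch below flags disks whose load is far above the average. The function name, the disk names, and the 1.5x factor are all assumptions made for this example; they are not EMC thresholds.

```python
from statistics import mean


def find_hot_disks(iops_by_disk, factor=1.5):
    """Return disks whose IOPS exceed `factor` times the average of all disks.

    `factor` is an arbitrary illustrative cutoff, not a vendor recommendation.
    """
    avg = mean(iops_by_disk.values())
    return sorted(d for d, iops in iops_by_disk.items() if iops > factor * avg)


# Hypothetical per-disk IOPS: the average is 175, so the cutoff is 262.5.
sample = {"disk1": 100, "disk2": 110, "disk3": 90, "disk4": 400}
print(find_hot_disks(sample))  # ['disk4']
```

In this sample, `disk4` carries more than half of the total load, so it would saturate long before the other three disks are busy.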

FAST VP technology, introduced with VMAX, uses storage pools and tiered storage. First, the storage array can distribute the data evenly by itself. Second, data blocks can be placed according to their I/O characteristics, so that each piece of data is stored in the appropriate storage tier. For example, frequently read data goes to flash drives and archived data to SATA drives.



Question 4:

With VMAX Series thin pools, does configuring more hard disks improve performance? For example, my VMAX has 50 identical hard disk drives; is it possible to create a 50-disk pool?

Answer:

From a performance perspective, the more physical disks there are in the thin pool, the more widely the data of the LUNs bound to the pool is automatically spread, on average across every physical disk. This is the best way to avoid “hot disks” and improve performance. Of course, this configuration also results in different applications sharing the same physical disk pool. Because different applications share the back-end disks of one pool, they may compete for resources, including disk capacity. Depending on client demands, a pool can be designed for one application, for a group of applications, or for one tier in FAST.
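The effect of pool size can be sketched with simple arithmetic. The example below assumes that the load is spread perfectly evenly (real distributions are never perfect) and uses the 150 IOPS figure for a 15K FC disk that is quoted later in this article. The function name and the 3000 IOPS workload are hypothetical.

```python
def per_disk_iops(total_backend_iops, disks_in_pool):
    """Average IOPS each disk must serve, assuming a perfectly even spread."""
    return total_backend_iops / disks_in_pool


DISK_LIMIT_IOPS = 150  # 15K FC figure quoted later in this article

for disks in (10, 50):
    load = per_disk_iops(3000, disks)  # 3000 back-end IOPS is a made-up workload
    print(f"{disks} disks: {load:.0f} IOPS per disk, "
          f"{load / DISK_LIMIT_IOPS:.0%} of the per-disk figure")
```

Under these assumptions a 10-disk pool would need 300 IOPS from each disk, twice the quoted figure, while a 50-disk pool needs only 60 IOPS per disk.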


Symmetrix Performance case sharing and analysis summary

Performance tools on the host generally focus on CPU, memory, paging (or swap), IOPS, average I/O response time, and so on, and do not distinguish between the host's read I/O and write I/O. Storage performance monitoring is focused on I/O. Note the following points from a storage perspective:

1. Read/Write:

Due to the nature of RAID protection, a write usually requires more resources than a read. On the other hand, modern storage arrays have cache, so the write I/O response time is often shorter than the read I/O response time, because the cache algorithm cannot avoid read misses. So if the I/O load is light, the average response time of a random read is generally a few milliseconds and the write I/O response time is generally a few tenths of a millisecond; that is, write I/O is about 10 times faster than read I/O.

In addition, write I/O is easier to optimize on the storage back end (for example, the disk controller). Read hits and writes are always completed in the cache; a read miss is different. Consider a shipping company that works for an online store. When it has to pick up packages, the packages come one at a time and each must be collected immediately: that is the read miss. When it delivers packages, it does not need to ship each one out immediately; it can wait for other packages going to the same location or area and ship them together: that is the write I/O.
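The way cache hits and misses combine into one average response time can be sketched as a weighted average. All latency values below are illustrative assumptions in the ranges mentioned above (a few milliseconds for a read miss, a few tenths of a millisecond for cached operations); they are not measured Symmetrix figures.

```python
def average_response_ms(read_fraction, read_hit_rate, hit_ms, miss_ms, write_ms):
    """Weighted average response time for a mixed read/write workload."""
    # Reads are served either from cache (hit) or from disk (miss).
    read_ms = read_hit_rate * hit_ms + (1 - read_hit_rate) * miss_ms
    # Writes are assumed to complete in cache.
    return read_fraction * read_ms + (1 - read_fraction) * write_ms


# Assumed workload: 70% reads, half of them cache hits.
avg = average_response_ms(read_fraction=0.7, read_hit_rate=0.5,
                          hit_ms=0.5, miss_ms=6.0, write_ms=0.5)
print(f"{avg:.3f} ms")  # 2.425 ms
```

In this example the read misses account for 2.1 ms of the 2.425 ms average, which is why the read hit rate dominates the result.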

2. Random / Sequential I/O:

OLTP system I/O is generally small and random. Data warehouse, data mining, backup/recovery, archiving, streaming media, and similar applications generate large sequential I/O. OLTP systems require IOPS, while data warehousing and similar applications generally require a lot of throughput (MB/s).

Based on the two points above, a random read miss incurs the worst I/O response time. This is where flash drive (SSD) technology comes in.

3. I/O Size:

In theory, the smaller the I/O, the faster any storage system processes it. However, one condition should be added: for the same amount of data (MB), most storage systems handle large I/Os faster than small I/Os.
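The relationship between I/O size, IOPS, and throughput is plain arithmetic, sketched below. The function names are made up for this example.

```python
import math


def throughput_mb_s(iops, io_size_kb):
    """Throughput in MB/s produced by `iops` I/Os per second of `io_size_kb` each."""
    return iops * io_size_kb / 1024


def ios_for_data(data_mb, io_size_kb):
    """Number of I/Os needed to move `data_mb` of data at a given I/O size."""
    return math.ceil(data_mb * 1024 / io_size_kb)


print(throughput_mb_s(150, 8))   # 1.171875 MB/s from 150 small I/Os per second
print(ios_for_data(1024, 8))     # 131072 I/Os to move 1 GB in 8 KB pieces
print(ios_for_data(1024, 256))   # 4096 I/Os to move 1 GB in 256 KB pieces
```

Moving the same 1 GB takes 32 times fewer requests at 256 KB than at 8 KB, so the fixed cost of each request is paid far less often. That is why large I/Os usually move a given amount of data faster, even though each individual large I/O takes longer.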

4. I/O concurrency:

As mentioned, a DD command test cannot really measure the performance of a storage array, because DD is not able to produce a sufficient number of concurrent I/Os. Reducing the dependencies between I/Os in order to improve I/O performance has long been a goal of modern host systems. For example, when a database system reads or writes a record, it no longer needs to lock the entire table; it only needs to lock the particular record, so that other transactions can access other records concurrently. The same goes for the storage system: it can handle an increasing number of concurrent I/O requests before reaching a bottleneck.


When implementing a disaster recovery project, in order to meet the IOPS needs of the client application, it is necessary to ensure an adequate degree of concurrency, by increasing the number of fiber ports and then increasing the number of LUNs. With synchronous disaster recovery, each write I/O must be written to the disaster recovery side before the I/O is complete. Each write I/O response time will therefore be longer, and this delay cannot be avoided. That is the price of disaster recovery. As long as the storage array can complete all the I/Os within the time window the user requires, it meets the user's needs.
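The link between concurrency, response time, and IOPS follows Little's law (outstanding I/Os = IOPS x response time). The sketch below applies it to the two cases above. The response times and the IOPS target are assumptions chosen for illustration, and the function names are hypothetical.

```python
def max_iops(outstanding_ios, response_ms):
    """IOPS ceiling when `outstanding_ios` requests are always in flight."""
    return outstanding_ios * 1000 / response_ms


def outstanding_ios_needed(target_iops, response_ms):
    """Concurrency needed to sustain `target_iops` at a given response time."""
    return target_iops * response_ms / 1000


# A single synchronous stream (one I/O in flight) at an assumed 5 ms per I/O:
print(max_iops(1, 5))    # 200.0
# The same response time with 32 I/Os in flight:
print(max_iops(32, 5))   # 6400.0

# Synchronous replication: if write response time rises from an assumed
# 0.5 ms to 2 ms, sustaining 5000 write IOPS needs 4x the concurrency.
print(outstanding_ios_needed(5000, 0.5))  # 2.5
print(outstanding_ios_needed(5000, 2))    # 10.0
```

This is why a single-stream test cannot show what the array can do, and why a synchronous disaster recovery design adds ports and LUNs: the longer each write takes, the more writes must be in flight to reach the same IOPS.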

5. Cache utilization:

The hit rate has a very obvious impact on performance. On a Symmetrix, write cache usage is measured as the number of write pendings, that is, data that has not yet been destaged from the cache to disk. The Symmetrix system has a Write Pending Limit, which caps the amount of such data that the cache can hold. If the storage array reaches the Write Pending Limit, the read hit rate drops and the back end can no longer destage to disk at its own pace; it becomes busy writing to disk, which decreases storage efficiency.

Finally, the performance requirements depend on the application needs of the users. Some applications can accept a response time of tens of milliseconds, while others need far less. For example, a user who requires a response time of less than a few milliseconds needs to add more than 100 SSDs to the DMX-4 to meet that requirement. The faster the required response time, the lower the utilization at which the storage array must run. For example, a 6 ms response time needs an array utilization of under 30%, and a 10 ms response time needs an array utilization of 50%.

For different types of hard disks, the currently accepted values are:

Disk type      10K FC disk    15K FC disk    SATA
MB/s           10 MB/s        13 MB/s        8 MB/s
IOPS           100            150            50

Note that these average values assume normal circumstances, and that the limit which applies is whichever one is reached first for the given type of I/O. For example, a 15K RPM FC disk doing small I/Os reaches 150 IOPS first, while its throughput may be far from reaching 13 MB/s. Even within one hard drive, the performance of the outer tracks may be more than twice that of the inner tracks on a single platter.
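The “whichever limit is reached first” rule can be turned into a rough sizing sketch. It uses the per-disk values from the table above, ignores RAID overhead and cache hits, and treats the workload as back-end I/O that actually reaches the disks. The function and the sample workloads are hypothetical.

```python
import math

# Per-disk values from the table above.
DISK_LIMITS = {
    "10K FC": {"iops": 100, "mb_s": 10},
    "15K FC": {"iops": 150, "mb_s": 13},
    "SATA": {"iops": 50, "mb_s": 8},
}


def disks_needed(disk_type, backend_iops, io_size_kb):
    """Return (disk count, limiting factor) for a back-end workload."""
    limits = DISK_LIMITS[disk_type]
    mb_s = backend_iops * io_size_kb / 1024
    by_iops = math.ceil(backend_iops / limits["iops"])
    by_mb_s = math.ceil(mb_s / limits["mb_s"])
    # The larger count wins: that is the limit the workload hits first.
    return max(by_iops, by_mb_s), ("IOPS" if by_iops >= by_mb_s else "MB/s")


print(disks_needed("15K FC", backend_iops=3000, io_size_kb=8))    # (20, 'IOPS')
print(disks_needed("15K FC", backend_iops=300, io_size_kb=1024))  # (24, 'MB/s')
```

The small-I/O workload is bound by IOPS (it moves only about 23 MB/s in total), while the large-I/O workload is bound by throughput even though its IOPS figure is ten times lower.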

The write pending percentage in the Symmetrix memory should be 50% or under; reaching 50% during a short I/O peak is acceptable.

The DA IOPS limit is hard to identify and varies case by case, because different storage DAs perform differently and the limit depends on the I/O. I often see DA utilization that is high, sometimes even reaching 90%, without the user noticing a performance problem. Still, we recommend that DA utilization not exceed 50% for best performance.
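The two 50% guidelines above can be expressed as a simple check. This is only a sketch: the function name, the DA names, and the input format are invented for the example and do not correspond to SPA, SMC, or any EMC command output.

```python
def check_thresholds(write_pending_pct, da_utilization_pct, limit_pct=50.0):
    """Return warnings for values above the 50% guideline from the text.

    `da_utilization_pct` maps a (hypothetical) DA name to its utilization.
    """
    warnings = []
    if write_pending_pct > limit_pct:
        warnings.append(
            f"write pending at {write_pending_pct:.0f}% exceeds {limit_pct:.0f}%")
    for da, util in sorted(da_utilization_pct.items()):
        if util > limit_pct:
            warnings.append(
                f"{da} utilization at {util:.0f}% exceeds {limit_pct:.0f}%")
    return warnings


# Made-up sample values for illustration.
for line in check_thresholds(62.0, {"DA-7A": 45.0, "DA-8A": 90.0}):
    print(line)
```

Values exactly at 50% are not flagged, which matches the guideline that 50% is acceptable.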

For the disks that the server accesses, the configuration could be RAID-1 or RAID-6, Meta volumes, and so on, with varying hit rates. There is no limit or recommended value for these configurations, because the I/O ultimately goes to the physical hard disks. Performance capacity planning only considers whether the physical hard disks meet the performance requirements. Overall, the recommendation is to balance the load of the different applications across the disks to avoid “hot disks”.

Author: Fenglin Li

iEMC APJ
