9 Posts
0
1458
March 5th, 2012 16:00
chunk size terminology translation
For a Symmetrix with a Linux host, what is the chunk size of each RAID 5 group presented, given the definition of "chunk size" below, when translated accurately from Linux terminology into EMC terminology?
From Linux Kernel Documentation (for the md driver)...
"The chunk-size deserves an explanation. You can never write completely parallel to a set of disks. If you had two disks and wanted to write a byte, you would have to write four bits on each disk, actually, every second bit would go to disk 0 and the others to disk 1. Hardware just doesn't support that. Instead, we choose some chunk-size, which we define as the smallest "atomic" mass of data that can be written to the devices. A write of 16 kB with a chunk size of 4 kB, will cause the first and the third 4 kB chunks to be written to the first disk, and the second and fourth chunks to be written to the second disk, in the RAID-0 case with two disks. Thus, for large writes, you may see lower overhead by having fairly large chunks, whereas arrays that are primarily holding small files may benefit more from a smaller chunk size."
-- Chunk size is often referred to as "segment size", "amount of data written to or read from each device before moving on to the next", "amount of data in KiB written at once to an array disk", "amount of data written to a physical disk at a time".
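To make the quoted definition concrete, here is a minimal sketch of the round-robin placement it describes for the RAID-0 case. The function name `chunk_layout` is made up for this illustration; it is not part of md or of any EMC tool, and it assumes the write starts at offset 0.

```python
def chunk_layout(write_size, chunk_size, num_disks):
    """Map 0-based chunk indices to disks for a RAID-0 style round-robin write.

    Hypothetical helper for illustration only; assumes the write starts
    at offset 0 of the array.
    """
    num_chunks = -(-write_size // chunk_size)  # ceiling division
    layout = {disk: [] for disk in range(num_disks)}
    for chunk in range(num_chunks):
        layout[chunk % num_disks].append(chunk)
    return layout


KIB = 1024
# The example from the md documentation: 16 kB write, 4 kB chunks, two disks.
print(chunk_layout(16 * KIB, 4 * KIB, 2))  # {0: [0, 2], 1: [1, 3]}
```

Chunks 0 and 2 (the first and third) land on disk 0, and chunks 1 and 3 (the second and fourth) land on disk 1, which matches the description in the quote.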
What is the Symmetrix equivalent for "chunk size"?
What is the default value for this for a Symmetrix DMX?
Quincy561
1.3K Posts
1
March 5th, 2012 20:00
I think you are asking "How much data is on each disk, before the data is moved to the next disk in the raid set?" Or what I call the raid stripe size.
Each member has 4 tracks. So with 3+1 in DMX3 and above, that would be 64k (track) x 4 = 256k on each disk (the last disk being 4 parity tracks). DMX2 and below had 32k track sizes.
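The arithmetic above can be written out as a small sketch. The helper name `per_disk_kib` is hypothetical, and applying the same 4 tracks per member to DMX2 and below is an assumption; the post only states the 32k track size for those models.

```python
TRACKS_PER_MEMBER = 4  # tracks written to one disk before moving to the next


def per_disk_kib(track_size_kib, tracks_per_member=TRACKS_PER_MEMBER):
    """Data placed on one RAID member before moving to the next, in KiB.

    Hypothetical helper for illustration only.
    """
    return track_size_kib * tracks_per_member


print(per_disk_kib(64))  # DMX3 and above: 64k tracks -> 256
print(per_disk_kib(32))  # DMX2 and below: 32k tracks -> 128 (assumes 4 tracks per member)
```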
However since each disk is on a completely independent processor, all 4 disks CAN do IO at the same time.
You should never make the allocation unit less than 8k in DMX3 or above, and you should also make sure your filesystems are aligned with track boundaries whenever possible.
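As a rough way to check the alignment advice, the sketch below tests whether a partition's starting sector falls on a 64k track boundary. The function name `is_track_aligned` and the 512-byte sector size are assumptions made for this example; substitute the values that apply to your host.

```python
TRACK_SIZE = 64 * 1024  # bytes per track on DMX3 and above, per the post
SECTOR_SIZE = 512       # assumed bytes per sector


def is_track_aligned(start_sector, sector_size=SECTOR_SIZE, track_size=TRACK_SIZE):
    """Return True if the byte offset of start_sector is a multiple of the track size.

    Hypothetical helper for illustration only.
    """
    return (start_sector * sector_size) % track_size == 0


print(is_track_aligned(63))   # False: 63 * 512 = 32256 bytes, not on a 64k boundary
print(is_track_aligned(128))  # True: 128 * 512 = 65536 bytes, exactly one track
```

Sector 63 is used here because it was a common default start for older MS-DOS style partition tables; sector 128 is simply the first 64k-aligned start under the assumed sector size.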
MOk52mY3Ym12403
9 Posts
0
March 6th, 2012 05:00
Quincy - thanks this is helpful.
Yes, I was asking how much data is written to/read from each "data disk" before data is written to/read from the next "data disk" in the raid set.
Is the track that you mention equivalent terminology to the extent (strip - stripe without the e) terminology used by SNIA?
re: http://www.snia.org/education/dictionary/s
stripe
[Storage System] The set of strips at corresponding locations of each member extent of a disk array that uses striped data mapping.
Does this mean that on DMX3 and above the default value for each "data disk" is that it gets 256k (4 tracks of 64k each) written to it/read from it before moving on to the next "data disk"?
Additionally, is it correct to assume that this value can be adjusted away from the default at creation time, so as to try to optimize the RAID set configuration to best match the application usage characteristics for that RAID set? How would you do this? By modifying the number of tracks per "data disk" or the size of each track? Or both?
"However since each disk is on a completely independent processor, all 4 disks CAN do IO at the same time." -- This is a nice-to-have performance feature of the Symmetrix and is good to know, but I think it can be considered superfluous to the tuning/matching of Linux disk access patterns to RAID array configuration.
MOk52mY3Ym12403
9 Posts
0
March 6th, 2012 06:00
Thank you.
It would appear that one difference between the Symmetrix DMX and CLARiiON/VNX technologies is that the Symmetrix has the notion of subdividing the data on each data disk into tracks.
It is my understanding at this point that CLARiiON/VNX does not operate with subdivisions, but simply uses a 1-to-1 ratio of "data disks" to what they call "Element Size", which they also do not allow modification of and hard set at 64k.
Thus the "chunk" size can be translated for Linux folks as follows for these two sets of EMC storage platform technologies:
Symmetrix/DMX3 and above: 256k
Symmetrix/DMX2: 128k
CLARiiON/VNX: 64k
Please correct if you see any inaccuracies in the above.
Thanks again.
Quincy561
1.3K Posts
1
March 6th, 2012 06:00
The 256k would be the "chunk" size. It is not modifiable.
MOk52mY3Ym12403
9 Posts
0
March 6th, 2012 07:00
Quincy:
I see this in the Host Connectivity Guide for Linux rev. A28 from Oct. 2011 on page 44.....
◆ RAID 5 Boundaries (four Tracks [256 Blocks] – 128 KB)
They are referencing 256 blocks instead of 256k. Is this documentation simply talking about DMX2 and not DMX3?
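For reference, the figures in the quoted line can be checked with a little arithmetic. This sketch assumes 512-byte blocks, which the guide excerpt above does not state; the helper name `blocks_to_kib` is made up for the example.

```python
BLOCK_SIZE = 512  # assumed bytes per block; not stated in the quoted line


def blocks_to_kib(blocks, block_size=BLOCK_SIZE):
    """Convert a block count to KiB (hypothetical helper for illustration)."""
    return blocks * block_size // 1024


total_kib = blocks_to_kib(256)  # the "256 Blocks" from the guide
per_track_kib = total_kib // 4  # spread over the "four Tracks"
print(total_kib, per_track_kib)  # 128 32
```

Under that assumption, 256 blocks is 128 KB, or 32 KB per track, which lines up with the 32k track size mentioned earlier in the thread for DMX2 and below rather than with the 64k tracks of DMX3 and above.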
Thanks,
Tim
sasi_symmetrix
5 Posts
0
March 22nd, 2012 11:00
Thanks, Quincy, for the detailed explanation.
Can you please clarify what exactly "Each member has 4 tracks" means? Suppose we use 7+1: would the chunk size on each disk then be 64k x 8 = 512k? Please correct me if I am wrong. Thanks in advance.
Quincy561
1.3K Posts
0
March 22nd, 2012 13:00
3+1, 7+1, 6+2 and 14+2 all have the same number of data tracks on each disk: 4.
So 7+1 has 4 data tracks on each disk for a total of 28 data tracks, then 4 parity tracks.
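The same counting can be sketched for the other protection schemes. The helper name `data_per_stripe` is hypothetical, and the 64 KiB track size assumes DMX3 or above; only data tracks are counted, since the thread gives the parity track count for 7+1 only.

```python
TRACKS_PER_MEMBER = 4  # data tracks on each disk, same for every scheme listed
TRACK_KIB = 64         # assumed: DMX3 and above


def data_per_stripe(data_members, tracks_per_member=TRACKS_PER_MEMBER, track_kib=TRACK_KIB):
    """Return (data tracks, data KiB) in one full stripe across the data members.

    Hypothetical helper for illustration only.
    """
    data_tracks = data_members * tracks_per_member
    return data_tracks, data_tracks * track_kib


for name, members in [("3+1", 3), ("7+1", 7), ("6+2", 6), ("14+2", 14)]:
    tracks, kib = data_per_stripe(members)
    print(f"{name}: {tracks} data tracks, {kib} KiB of data per stripe")
```

For 7+1 this gives 28 data tracks, matching the figure above, while the amount on each individual disk stays at 4 tracks (256 KiB) regardless of how many members the set has.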
sasi_symmetrix
5 Posts
0
March 24th, 2012 19:00
Thanks for the quick reply, Quincy.
So, by default, 4 tracks of data are written to each member (disk) of any raid set before going to the next member. Quincy, does the raid stripe size depend on the type of host (Linux, Unix)? Can we change the raid stripe size?
Thanks in advance.
Quincy561
1.3K Posts
1
March 25th, 2012 03:00
Yes, 4 tracks for each disk before moving onto the next disk. The host type doesn't matter.
It cannot be changed.