June 29th, 2010 09:00

EMC Celerra NS-120 NFS Deduplication with VMware vSphere

Months ago I created a 922GB NFS file system with dedupe enabled and presented it to 2 vSphere hosts.  Originally, I stored a large number of ISO files and 10 Windows templates on this datastore and realized dedupe savings.

Last week I moved (via Storage vMotion) all 40’ish thinly provisioned VMs from the NetApp NFS datastore they were on to the Celerra NFS volume.  I also enabled Virtual Provisioning on the NFS file system.

After I waited a few days, the Deduplication Statistics hadn’t updated, so I forced an update by toggling Deduplication from On to Suspended and then back to On.  At that point, the "As Of" statistic updated to the current date, but Space Saved did not.  It remained at 105GB (which after the move is 11% of the original data size).

Does 11% sound like reasonable dedupe savings for a blend of ISO files, 10 Windows templates, 40 Windows VMs, and 16GB of VMkernel swap?  I get the feeling that none of the 700GB of thin provisioned VMs that I moved has been processed by the dedupe engine.
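To put rough numbers on that hunch, here is a back-of-the-envelope sketch. It uses only the figures quoted above (105GB saved, 11%, 700GB moved) and assumes, without confirmation, that the "original data size" behind the 11% includes the 700GB of moved VMs. The variable names are purely illustrative.

```python
# Figures taken from the post above.
space_saved_gb = 105      # "Space Saved" shown in the dedupe statistics
reported_ratio = 0.11     # savings as a fraction of original data size
moved_vm_gb = 700         # thin provisioned VMs moved by Storage vMotion

# Original data size implied by the two reported numbers.
implied_original_gb = space_saved_gb / reported_ratio

# ASSUMPTION: the implied original size includes the moved VMs, so the
# remainder is the older data (ISOs, templates, swap).
older_data_gb = implied_original_gb - moved_vm_gb

# If the moved VMs were skipped entirely, all of the savings would have
# come from the older data alone.
ratio_if_vms_skipped = space_saved_gb / older_data_gb

print(f"Implied original data size: {implied_original_gb:.0f} GB")
print(f"Older data (non-VM): {older_data_gb:.0f} GB")
print(f"Savings ratio on older data only: {ratio_if_vms_skipped:.0%}")
```

Under those assumptions, the savings on the older data alone would work out to roughly 41%, which would be consistent with the moved VMs contributing nothing so far.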

CM screen shot attached for reference purposes.  As I mentioned to Chad Sakac, a "Dedupe now" as well as "Update dedupe stats now" button, as NetApp Data ONTAP has, would be helpful in future versions of DART.

Thank you!

Jas

1 Attachment

June 29th, 2010 11:00

Hi Jason,

The internal policy engine that selects files for deduplication is configured by default to ignore files that have been accessed or modified recently.   Hence it will be ignoring all the VMDK files you recently moved in with Storage vMotion.   Indeed, it will continue to ignore any VMDK files that are in use by running VMs, as they are constantly being read and modified.
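As a rough illustration of that kind of age-based selection (a conceptual sketch only, not the actual DART policy engine), a file can be treated as a candidate only when both its last access time and last modification time are older than some cutoff. The function name and the 30-day threshold are hypothetical and chosen purely for the example.

```python
import time

SECONDS_PER_DAY = 86400

def is_dedupe_candidate(atime, mtime, now, min_idle_days):
    """Return True if the file was neither read nor written in the last
    min_idle_days days. Conceptual sketch, not the real policy engine."""
    cutoff = now - min_idle_days * SECONDS_PER_DAY
    return atime <= cutoff and mtime <= cutoff

now = time.time()
MIN_IDLE_DAYS = 30  # hypothetical threshold, not a documented default

# An ISO file that has been sitting untouched for months.
old_iso = is_dedupe_candidate(
    atime=now - 90 * SECONDS_PER_DAY,
    mtime=now - 200 * SECONDS_PER_DAY,
    now=now,
    min_idle_days=MIN_IDLE_DAYS,
)

# A VMDK written last week by Storage vMotion and in use by a running VM.
migrated_vmdk = is_dedupe_candidate(
    atime=now - 60,  # read a minute ago
    mtime=now - 7 * SECONDS_PER_DAY,
    now=now,
    min_idle_days=MIN_IDLE_DAYS,
)

print("old ISO selected:", old_iso)
print("migrated VMDK selected:", migrated_vmdk)
```

With a filter like this, the idle ISO is selected while the recently written, actively read VMDK is skipped, which matches the behaviour described above.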

You could tune the dedupe settings for the file system so that the policy engine ignores the atime and mtime of the files, and it should then process the VMDK files.   However, that isn't how the system is designed to be used.   Rather, the policy engine is designed to deal with data that is no longer in active use, leaving the active data alone.   Processing of active data, VMware virtual machines in particular, is handled through the EMC Celerra vCenter plug-in.

The plug-in can be found at:

http://powerlink.emc.com/ > Support > Software Downloads and Licensing > Downloads A-B > Adapters for Third-Party Applications

Also, take a look at the following blog article by Chad for more details.

http://virtualgeek.typepad.com/virtual_geek/2010/05/emcs-next-generation-vcenter-plugins.html

Cheers,

Chris

Message was edited by: Chris Stacey to add where to find the plug-ins on Powerlink
