26 Posts
1
4083
January 18th, 2012 13:00
"Shrinking" a file system with deduplication enabled
Let me start by saying that I know that once you extend a file system, you cannot "shrink" it.
I did find this article, which describes a workaround that seems like it would give the same result: http://stujordan.wordpress.com/2011/11/22/shrinking-a-celerra-filesystem/
Here is my dilemma. I have a 1 TB file system that I am copying from one Celerra to another. The source Celerra has dedupe enabled on the file system, so the 1 TB of data has been shrunk to 500 GB. To copy it to the target Celerra, I have to create a 1 TB file system on the target, because the copy writes the files at their original sizes. So once the file system has been copied from Celerra A to Celerra B, I am left with 1 TB of data that has not yet been deduped. I can let dedupe run on the file system on Celerra B, and the 1 TB of data will eventually compress down to 500 GB.
The problem with the procedure in the link above is that once the data is on Celerra B and I create a second, smaller file system to move it to, the files will be "rehydrated" again when I copy from the larger file system to the smaller one, and I am back in the same boat.
I am just wondering if there is a workaround that anyone can think of. One method would be to create a small file system on Celerra B and copy the data in small chunks: copy some, let dedupe run, copy some more, let dedupe run, and so on. That would take a lot of time, though. Any other thoughts?
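To put rough numbers on the chunked approach described above, here is a minimal Python sketch. It is only arithmetic, not a Celerra tool. It assumes that each chunk lands at full size, that dedupe then shrinks it by a uniform ratio (0.5, taken from the 1 TB to 500 GB figure in this post), and that 1 TB is counted as 1000 GB. The function name and parameters are made up for illustration.

```python
def peak_usage_gb(total_gb, chunk_gb, dedupe_ratio):
    """Estimate peak space used when copying in chunks and deduping after each.

    Assumption: every chunk shrinks by the same ratio once dedupe has run.
    """
    stored = 0.0  # space held by data that has already been deduplicated
    copied = 0.0  # logical (original-size) data copied so far
    peak = 0.0
    while copied < total_gb:
        chunk = min(chunk_gb, total_gb - copied)
        # The new chunk arrives at its original size before dedupe runs.
        peak = max(peak, stored + chunk)
        # After dedupe, only the reduced size stays on disk.
        stored += chunk * dedupe_ratio
        copied += chunk
    return peak


if __name__ == "__main__":
    for chunk in (1000, 250, 100):
        print(chunk, "GB chunks -> peak", peak_usage_gb(1000, chunk, 0.5), "GB")
```

Under these assumptions, 100 GB chunks would peak at about 550 GB, so the target file system needs some headroom above the final 500 GB. Real dedupe ratios vary per file, so treat the result as an estimate.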
McK
12 Posts
0
January 18th, 2012 14:00
NDMPcopy, which is available on the Celerra Tools and Apps CD, can be used to copy from Celerra B largeFS to Celerra B smallFS while preserving the deduplicated data.
Mike McKay
EMC Corporation
dynamox
9 Legend
•
20.4K Posts
0
January 18th, 2012 14:00
Are you migrating the file system from Celerra A to Celerra B? If this is for continuous replication, the two file systems must be identical in size.
btrotter
26 Posts
0
January 19th, 2012 05:00
This is not for replication. This is simply to move data from one Celerra to another. Mike, I am going to dig up the NDMPcopy tool. That sounds awesome. Thank you for mentioning that tool. Do you by chance have any tips or suggestions on how to use it, or is it well documented on the CD?
Rainer_EMC
4 Operator
•
8.6K Posts
1
January 19th, 2012 07:00
Ndmpcopy is also available as an extra download from support.emc.com under VNX downloads.
Not sure if this direct link works: https://download.emc.com/downloads/DL32451_NDMPCopy.zip.CMS.DLTYP.0175
btrotter
26 Posts
0
January 19th, 2012 07:00
Thank you all for the assistance!
Rainer_EMC
4 Operator
•
8.6K Posts
1
January 19th, 2012 07:00
Here is a How-To document for ndmpcopy
https://community.emc.com/docs/DOC-7832
Rainer_EMC
4 Operator
•
8.6K Posts
0
January 19th, 2012 07:00
There is a readme, and you can download it from the support site.
Just keep in mind that ndmpcopy currently (like a backup) doesn’t do deleted-file tracking, so if you run it incrementally, it (unlike emcopy) will not delete files/dirs on the destination that have been deleted on the source since the last run.
Rainer
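To illustrate what deleted-file tracking would have to catch, here is a small Python sketch that lists paths present on the destination but no longer on the source. It is not part of ndmpcopy or emcopy. It assumes both file systems are mounted on the host running the script, and the function names are hypothetical. It only reports candidates and deletes nothing.

```python
import os
import sys


def relative_entries(root):
    """Collect every file and directory under root as a path relative to root."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.add(os.path.relpath(full, root))
    return entries


def stale_on_destination(source_root, dest_root):
    """Return sorted paths that exist under dest_root but not under source_root."""
    return sorted(relative_entries(dest_root) - relative_entries(source_root))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python stale.py <source mount point> <destination mount point>
    for path in stale_on_destination(sys.argv[1], sys.argv[2]):
        print(path)
```

Anything this prints is something an incremental ndmpcopy run would leave behind on the destination, which is why a later sync pass with a tool that does handle deletions is needed.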
McK
12 Posts
0
January 19th, 2012 08:00
Sounds like you have the process down.
It is recommended that you not use the CS, because the copy could go over the internal Celerra network or cause high CPU utilization on the CS. If the internal network is overutilized, you could skew boxmonitor's view of the internal health of the Celerra and trigger unwanted events, such as a Data Mover failover for failure to respond to ping. That said, NDMPcopy will run on the CS if you ask it to; just understand the consequences. If the Celerra where the shrinking needs to take place is not yet in production, it may not be worth spinning up a new VM.
The Tools and Apps CD I’m looking at now (6.0.40.5) appears to include ndmpcopy builds for 32-bit SUSE and for 1.2 Linux. The 1.2 Linux build has been run successfully on the CS version of Linux on a pre-prod Celerra with no issues.
McK
12 Posts
0
January 19th, 2012 08:00
Rainer is correct. In this scenario, NDMPcopy is for the bulk move of the deduplicated data from the large Celerra FS to the small Celerra FS. Any subsequent “incremental / delta” copy was done (in our case) with emcopy from the original source to the new small FS, without having to rehydrate the previously deduped data. Thanks
btrotter
26 Posts
0
January 19th, 2012 08:00
What I thought I would do is let the original copy finish (Celerra A smallFS to Celerra B largeFS), then run dedupe on Celerra B largeFS.
Once that is finished, I will create Celerra B smallFS, then run NDMPcopy to go from Celerra B largeFS to Celerra B smallFS. After that is complete, I will run emcopy from Celerra A smallFS to Celerra B smallFS with the /purge option to make sure everything is synced up. I know there is some wasted effort there, but I have already invested a lot of hours in getting the data copied from one Celerra to the other, so that portion is almost done.
I did see the comment about running this from a Linux machine rather than from the CS, but I am not sure I understand why that is necessary. I can create a Linux VM on my ESXi server (any specific distro?), download the ndmpcopy tool to it, and point it at the NAS. Why shouldn't it be run on the CS?
Rainer_EMC
4 Operator
•
8.6K Posts
0
January 19th, 2012 08:00
The recommendation for not running it on the CS has two reasons:
- The CS is an important part of the system and we don’t want it overloaded
- We don’t want the internal networks used for data transfer by mistake
The ndmpcopy utility isn’t EMC-specific. If you find a Windows version for download somewhere, it should work as well.
Rainer