1 Rookie
•
16 Posts
2
10914
November 22nd, 2013 06:00
metadata capacity issues
Hello Everyone,
We have a customer running an Avamar v6.1.1-87 single-node Gen4S server integrated with Data Domain. We also have a secondary site with the same setup acting as the replication target for the Avamar and the DD. Some time in September the primary node's GSAN hit maximum capacity; it was cleaned up soon after with garbage collection, and the customer was advised to reduce their retention policy. After that, the GSAN capacity returned to a very healthy level, but the “cur” size remained high. EMC is saying there is no way to clean this up now and that the only option is to migrate to another node. Has anyone faced this situation before, and what steps were taken to resolve it?
Just a thought: what if we re-kickstart/format the grid, rebuild it, and then migrate the data back from the secondary grid to the primary? Would that clean up the metadata space usage? The secondary grid doesn't have metadata space issues like this one.
thanks much for your help in advance!
ionthegeek
2 Intern
•
2K Posts
0
November 28th, 2013 07:00
You mentioned Avamar 6.1.1-87 but you also mentioned "metadata capacity" which is an Avamar 7 term. Is this in the context of an upgrade to Avamar 7?
Avamar 7 introduced a new metadata capacity measurement (based on the "cur" capacity) that has been causing problems for some customers using Avamar / Data Domain integration. If the customer is not planning to upgrade to Avamar 7, the previous posters are correct and there is no need to worry about the cur capacity. If the customer is planning to upgrade to Avamar 7, the metadata capacity may be an issue.
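If you want to see roughly what that measurement looks at before committing to anything, you can check the raw numbers from the utility node shell. This is only a sketch (not the exact calculation the Avamar 7 tools perform), and it assumes the standard single-node data partition layout (/data01, /data02, ...):

    df -h /data0?          # OS view of each data partition
    du -sm /data0?/cur     # size of the cur directories, which the new metadata measurement is based on
    status.dpn             # overall GSAN capacity/utilization summary

If the cur directories are large relative to the partitions even though the GSAN view looks healthy, that is exactly the situation the Avamar 7 metadata check tends to flag.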
Unfortunately, the only way to reduce the cur capacity of a single node system is to reinitialize it. There are two basic ways to do this:
Which of those options is best is somewhat situational. If you feel that the customer would be better served by a re-kickstart and restore from the target, that is an option.
TimQuan
4 Operator
•
1.2K Posts
0
November 24th, 2013 17:00
Avamar storage nodes mainly store de-duplicated backup data, RAIN parity data, and checkpoint overhead. Capacity on a node is reported in several different views:
OS Capacity is the total space of each data partition. 20% of OS Capacity is used for checkpoint overhead, and 65% of OS Capacity is used to store de-duplicated data and RAIN parity data.
The OS view of capacity utilization is the total amount of data in each data partition, as measured by the OS. This includes the data in the cur directory plus overhead in other directories.
GSAN Capacity is also called User Capacity. It occupies 65% of OS Capacity.
The GSAN view of capacity utilization is defined as the total space allocated to stripes (the cur view), minus the space freed by garbage collection, plus the RAIN parity data stored in each data partition. The GSAN view therefore shows the percentage of GSAN Capacity (not of OS Capacity) that data occupies in each data partition.
The cur view of capacity utilization is the amount of data, as measured by the OS, in the current working directory of each data partition (/data0?/cur). This is approximately equal to the sum of the maximum allocated sizes of all the stripes in the /data0?/cur directories.
When the GSAN view reaches 80%, a pop-up notification informs you that the server has consumed 80% of its available storage capacity.
It is important to note that when data is added to a stripe, the stripe is filled with chunks up to its maximum allocated size (the stripe becomes full of chunks). Even when data is deleted from the stripe, the stripe keeps its maximum size, but new data can be written into the freed space. After garbage collection, expired data in a stripe is deleted, so the GSAN view becomes smaller while the cur view stays the same, as the toy example below illustrates.
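A rough sketch with invented numbers, just to show the difference between the two views:

    allocated_mb=256                          # stripe's allocated size on disk; this is what the cur view counts
    expired_mb=150                            # data removed by garbage collection when retention expires
    live_mb=$((allocated_mb - expired_mb))    # data the GSAN still references
    echo "cur view still counts:  ${allocated_mb} MB"    # unchanged after GC
    echo "GSAN view now counts:   ${live_mb} MB"         # smaller after GC; freed space is reused by new chunks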
Therefore, it is not necessary to pay much attention to the cur view. If the GSAN view and OS view are not high, capacity should not be an issue for your Avamar server. Since your GSAN view and OS view have returned to a healthy level, no further action is required.
DELL-Leo
Community Manager
•
8.7K Posts
0
November 24th, 2013 18:00
Hey,
Please refer to the material below; it clearly explains the capacity utilization of Avamar. Hope this is helpful.
Introduction of GSAN view of capacity utilization and OS view of capacity utilization
https://emc--c.na5.visual.force.com/apex/KB_HowTo?id=kA0700000004PFc
o17Uu33DCF12520
4 Operator
•
1.1K Posts
0
November 24th, 2013 21:00
1. Delete backup files and reduce the retention period.
2. Review the definition of all backup jobs and make sure no unnecessary data is being backed up.
3. Review the output of capacity.sh and check whether the daily capacity change is stable, i.e. the net change is about zero; if not, figure out why (see the sketch after this list).
4. Review the garbage collection log each day and make sure GC is running and removing a reasonable amount of data.
5. If all of that looks fine, the grid may simply be too small for the target data, in which case you could add a new node to the grid to increase the total capacity.
6. As for replication, I don't suppose you are running root-to-root replication, so if you want to try that route, make sure the domains you need are included.
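For steps 3 and 4, something like the following, run as admin on the utility node, should show what you need. Treat it as a sketch; exact flags and output vary by Avamar version:

    capacity.sh              # daily capacity change report; the net change should hover around zero
    avmaint gcstatus --ava   # result of the most recent garbage collection pass
    df -h /data0?            # OS view of the data partitions, as a sanity check

If GC is consistently removing little or nothing, that is usually the first thing to chase down.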
KeyurD
1 Rookie
•
16 Posts
0
December 2nd, 2013 06:00
Thanks, guys, for the prompt responses.
Ian, yes, we are trying to upgrade to v7 and we came across this during the proactive health check script. Can we re-kickstart and then rebuild the node and catalogs from the secondary Avamar? How will the DD behave when we do this rebuild?
ionthegeek
2 Intern
•
2K Posts
0
December 2nd, 2013 09:00
The general process for rebuilding a single node would be something like:
The behaviour of the DD should be the same as with any other replication but I've never personally run this procedure with a DD attached so I don't know what the potential pitfalls might be.
In any case, a replicate restore is a non-trivial operation even when DD is not involved, so if you decide to go that route, I would recommend getting in touch with support after you finish step 2 above. The SR would have to be opened under your partner account -- support can only work with partners on tasks like system re-inits. Once the SR is open, I'd recommend asking the L1 to consult with L2 support because DD integration is involved. You can point them to this forum thread if needed.
A root-to-root replication to a new node might be faster and would almost certainly be easier. System migration using root-to-root replication is a fairly common activity for things like hardware refreshes so that path is well-trod. I believe there is a tech note or procedure generator procedure that covers system migration.
JWeinsheimer
91 Posts
0
January 30th, 2014 15:00
Where can I find documentation relating to the metadata (cur) capacity and these issues when upgrading to 7.0?
edinm
15 Posts
0
May 25th, 2014 13:00
Hi all,
I've got a client with two Avamar 7 systems, one of which has a Data Domain backend. We were planning an upgrade from 7.0.0-427 to 7.0.1-61 (due to some Avamar proxy related issues), but our DD-attached Avamar server stalls at 16% of the upgrade, at the check for available metadata space, which is at 94%. Avamar storage consumption is around 75%.
Have there been any developments in managing the metadata storage, and how can we clean this up (if that is even possible) so that the upgrade can go ahead? A kickstart is not an option.
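For reference, the figures I'm quoting come from something like the following, run on the utility node. It's only a sketch; I don't know exactly how the upgrade workflow computes its metadata percentage:

    du -sm /data0?/cur     # cur (metadata) usage per data partition, in MB
    df -m /data0?          # total size of each data partition, in MB
    status.dpn             # GSAN capacity view, for comparison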
Regards,
Edin