Change of MAC Address on Avamar Virtual Edition 7 servers

Question

We have been evaluating Avamar using 3 x AVE 7 servers successfully for the last few weeks (running on ESXi 5.5).

Due to host hardware failure/changes, the MAC addresses of the network adapters on 2 of the VMs/servers have changed, and cannot be changed back to their original values.

We cannot now get the AVE servers to start back up again correctly.

After correcting the 1st issue observed (Linux changing eth0 to eth1) and getting AVE's networking running again, gsan and other functions will not correctly load, and we observe invalid license messages in the dpnctl.log (below).

We note that the server's MAC address appears in the license files (eg. gsankeydata.xml and _Key.xml file) and so think that we need to correct this. We could manually edit the _Key.xml file, change the MAC address, and re-install it, but is this allowable? If not, should we be requesting a new license file?

Or maybe, as it is just an evaluation, and we have data mostly replicated to another AVE server, we just start a new install?

We appreciate any advice.

2014/06/05-07:19:35 Checking for server ready. Please wait.
2014/06/05-07:19:35 sleep 300
2014/06/05-07:19:35 ERROR: command "wait.dpn --runlevel=fullaccess --hfsaddr=ave1-05tb --verbose" failed, return: 256, exitcode: 1, signal: 0, dumped core: 0
2014/06/05-07:19:35 - - - - - - - - - - - - - - - END
2014/06/05-07:19:35 dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn" - exit status 25
2014/06/05-07:19:35 dpnctl: ERROR: 1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn"
2014/06/05-07:19:35 rm -f /tmp/dpnctl-gsan-restart-status-3905 /tmp/dpnctl-gsan-restart-output-3905
2014/06/05-07:19:35 gsan error log scan:
2014/06/05-07:19:35 - - - - - - - - - - - - - - - BEGIN
2014/06/05-07:19:36 Using /usr/local/avamar/var/probe.xml
2014/06/05-07:19:37 (0.0) ssh -x -o GSSAPIAuthentication=no admin@192.168.150.200 '/usr/bin/perl -ne "m&& (\$1 gt q) && do { print; }" /data01/cur/err.log 2>&1'
2014/06/05-07:19:38 2014/06/05-05:37:47.81459 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate license error invalid license
2014/06/05-07:19:38 2014/06/05-05:38:47.94827 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate license error invalid license
2014/06/05-07:19:38 2014/06/05-05:39:48.10104 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate license error invalid license
2014/06/05-07:19:38 2014/06/05-05:40:48.25609 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate license error invalid license
2014/06/05-07:19:38 2014/06/05-05:41:48.40897 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate license error invalid license
2014/06/05-07:19:38 2014/06/05-05:42:48.50925 {0.0} [licensevalidator:114] ERROR:licensevalidator::validate unknown license error
2014/06/05-07:19:38 2014/06/05-05:42:48.50939 {0.0} [licensevalidator:114] FATAL ERROR:licensevalidator::body no valid license found
2014/06/05-07:19:38 - - - - - - - - - - - - - - - END
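
For anyone wading through a similar log, the licensevalidator entries can be picked out with a few lines of Python. This is only a rough sketch: the regular expression is inferred from the entries quoted above, not from any documented log format, and the function name is made up for illustration.

```python
import re

# Pattern inferred from the log excerpt above (an assumption, not a documented format).
LICENSE_LINE = re.compile(
    r"\[licensevalidator:\d+\]\s+(FATAL ERROR|ERROR):licensevalidator::\w+\s+(.*)"
)

def license_errors(lines):
    """Return (severity, message) pairs for each licensevalidator entry found."""
    hits = []
    for line in lines:
        match = LICENSE_LINE.search(line)
        if match:
            hits.append((match.group(1), match.group(2).strip()))
    return hits

sample = [
    "2014/06/05-05:37:47.81459 {0.0} [licensevalidator:114] "
    "ERROR:licensevalidator::validate license error invalid license",
    "2014/06/05-05:42:48.50939 {0.0} [licensevalidator:114] "
    "FATAL ERROR:licensevalidator::body no valid license found",
    "2014/06/05-07:19:36 Using /usr/local/avamar/var/probe.xml",
]

for severity, message in license_errors(sample):
    print(severity, "|", message)
```

Feeding it the lines of err.log (one entry per line) would list only the license-related failures, which makes it easier to see whether the validator ever reports anything other than an invalid license.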

ionthegeek · Accepted Answer

If this is AVE (not VDP, etc.) and the logs are reporting an invalid license message, the issue is with the license file. Make sure the license file has not been modified in transit to the AVE -- the content is hashed so even whitespace changes will cause the license validation to fail. The license validator is really, really picky.
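
To see why an invisible change is enough to break a hashed file, here is a minimal Python sketch. It uses SHA-256 purely as an illustration; which hash the license validator actually uses is not stated here, and the XML content below is a made-up placeholder, not a real license.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of raw bytes (algorithm chosen for illustration only)."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical license content, not a real Avamar license file.
original = b"<license>\n  <mac>00:50:56:aa:bb:cc</mac>\n</license>\n"

# The same text after a tool rewrote the UNIX line endings as Windows line endings.
mangled = original.replace(b"\n", b"\r\n")

# The text looks identical in an editor, but the bytes (and so the digest) differ.
print(digest(original) == digest(mangled))  # False
```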

Missing or extra newline characters will cause validation to fail. Mail servers are notorious for mangling whitespace. If the license file was opened on a Windows system and copied and pasted across to the server through the terminal, the UNIX line endings may have been replaced with Windows line endings. Copying the zip file across to the system using SCP or SFTP, then decompressing it in place is the most reliable way to ensure the license makes it to the server undamaged.
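
If you suspect the line endings were mangled, a quick check along these lines can confirm it before you re-copy the file. This is a rough Python sketch; the function name and the sample bytes are illustrative only.

```python
def line_ending_report(data: bytes) -> dict:
    """Count line-ending styles in raw file contents."""
    crlf = data.count(b"\r\n")              # Windows line endings
    lone_cr = data.count(b"\r") - crlf      # bare carriage returns
    lf = data.count(b"\n") - crlf           # UNIX line endings
    return {
        "crlf": crlf,
        "lone_cr": lone_cr,
        "lf": lf,
        "ends_with_newline": data.endswith(b"\n"),
    }

# Example with made-up content: one Windows line ending has crept in.
sample = b"<license>\r\n  <key>placeholder</key>\n</license>\n"
print(line_ending_report(sample))
```

Read the file in binary mode (`open(path, "rb").read()`) so that Python does not translate the line endings itself. A non-zero "crlf" or "lone_cr" count on a file that should have UNIX line endings means it was altered somewhere along the way.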

Besides whitespace issues, incorrect ownership and permissions are the most common issue with the license.xml file. On my lab system, the license file is owned by "admin:admin" and has 644 permissions.
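
Ownership and mode can be read off with `ls -l`, or with a short Python sketch like the one below. The function name is made up for illustration, and the fallback to numeric IDs is my own choice for accounts that have no name entry.

```python
import grp
import os
import pwd
import stat

def describe_file(path: str) -> tuple:
    """Return (owner, group, octal mode string) for a file, e.g. ('admin', 'admin', '644')."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)   # no passwd entry: fall back to the numeric uid
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)   # no group entry: fall back to the numeric gid
    mode = format(stat.S_IMODE(st.st_mode), "03o")
    return owner, group, mode
```

On a system set up like my lab system, calling `describe_file` on the license file should return `('admin', 'admin', '644')`.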

No information about licensing is recorded in the checkpoints.

Assuming the license file was generated correctly, the support team should be able to fix the problem in 5 minutes or less, so if you get stuck, please do open a service request. There is no need to wipe your AVEs over this issue.

ionthegeek · Answer

The server software will reject a license file if it's been tampered with. The license files will have to be regenerated with the new hardware information. Please get in touch with your account team for assistance with this.

hugovg2 · Answer

Many thanks! Have just sent off the re-generated gsankeydata.xml files.

ionthegeek · Answer

My pleasure!

hugovg2 · Answer

Mmmmmm, sorry to report that the newly generated licenses didn't resolve the problem (invalid license messages in the dpnctl.log, and gsan & MCS not starting). There are lots of details reported in the log that we do not fully understand, and we are not sure what to do next.

When restarting the server (dpnctl start), we are required to choose whether to restore a checkpoint, and we chose to restore the last fully validated checkpoint. Is this complicating the issue perhaps, as the checkpoint was made with the previous license?

Happy to post the log, or if it looks too hard, we can just start again, with most of the data safely replicated to another server.

Thanks for the help.

hugovg2 · Answer

You are right re getting the licensing process exactly right, thanks!

Another question if I may to ensure this doesn't happen again.

We have been evaluating Avamar using AVE 7 servers, and have successfully used replication between servers, configured using the Avamar Administrator GUI only (which I understand is policy-based replication, handled via the “MCS”).

We desire to employ root-to-root replication to create a simple DR facility, where the primary would be replicated to the secondary daily.

Although we see this method recommended for DR, the main EMC documentation for root-to-root replication is for System Migration (and the replication is handled manually).

After reviewing the available documentation (EMC Support and community based), we are not clear which replication method is the easiest, as there seem to be different ways…

MCS ("policy-based"?) – easiest, GUI based, client data only
EM ("cron-based"?)
manual (using the repl_cron.cfg file, so "cron-based" too?)

Questions..
1. Is “Policy-based” the same as “plug-in-based” replication?
2. As it seems we need to be able to specify the --fullcopy replicate command line option (in the repl_cron.cfg), only a cron-based option is applicable, correct?
3. We read a Forum post that says “... issue can occur on 7.0 systems if both cron-based and plug-in-based replication are being used. If plug-in-based replication is in use, it is recommended to avoid cron-based replication entirely” – so do we need to be careful never to use policy-based replication on the primary server/s?
4. We read forum posts about how to monitor the replicate process for progress, so does this mean it cannot be monitored via the Avamar Administrator GUI (eg. Activity page)?
5. Is the EM the correct GUI monitor for cron-based replications, whether created using EM or manually editing the repl_cron.cfg file?

Thanks again for any help.

ionthegeek · Answer

I'm glad to hear you were able to get the license issue sorted out.

In general, I would recommend against using root-to-root replication. Standard replication (also called root-to-REPLICATE replication) is much more flexible. The only scenario I'm aware of where root-to-root replication has an advantage over standard replication is in the case where the source system has been completely destroyed and is never coming back.

The main advantage of root-to-root is that you can run an Avamar Administrator Server (MCS) restore on the target, fail the source system's DNS name over to the target system and the clients will never notice the change.

Standard replication has a number of advantages over root-to-root replication:

With root-to-root replication, the target cannot be used for operational backups (it can only accept new backups in a DR scenario). With standard replication, you can activate clients to the target grid and perform normal backups if you run into some kind of temporary issue with the source, for example.
Root-to-root replication only permits unidirectional one-to-one replication. Replication can only run source => destination. With standard replication, it's possible to set up bi-directional replication, many-to-one replication and other, more exotic configurations depending on your needs.
All the data required to perform disaster recovery for a replication source system is replicated to the target during standard replication anyway.

The main downside of standard replication is that failing over the clients in case of a disaster is a manual process.

Furthermore, I would not recommend trying to switch from root-to-REPLICATE to root-to-root replication. Changing replication modes has some serious potential pitfalls around client IDs on the replication target. Client ID issues can cause subtle "slow burn" problems on the system that are very difficult to diagnose and take a long time to correct.

Policy-based and plugin-based replication refer to the same thing. I would strongly recommend sticking with policy-based replication. It is much, much easier to manage.

Replication can be monitored through the Avamar Administrator GUI using the activity monitor, including both policy-based and cron-based replication.

hugovg2 · Answer

Thanks for the info & advice, I understand better the limitations of root-to-root replication.

If all of this DR subject is nicely described in an EMC document, I am very happy to learn more there & not bother you, but I can’t find the subject nicely shrink-wrapped, sorry!

“All the data required to perform disaster recovery for a replication source system is replicated to the target during standard replication anyway.”

Sorry but I don’t quite understand this. I see how clients & their data & logs are all nicely replicated to the REPLICATE domain, but I can’t see the Policy objects/configuration that would be required to resume normal backup operations. Without these, after the primary failure, surely we would need to:

1. Re-assign clients to secondary server, which would mean their folders/domains would have to already exist – is there a way for these to be replicated (or export/import) ahead of time?

2. Copy the Policy configuration info (datasets etc.) to secondary server, eg. by...

a. Using “mccli group export” to export them to xml file, then use “mccli mcs import” on the target server to import them – this would need to be done periodically anyway to ensure it's immediately available, yes?

b. We read “However, groups exported in this way do not contain a list of client members” – so this means clients will have to be manually assigned back to their correct Groups unless it was automated via “mccli group add-client” etc.

At this point, on the secondary server, the clients will appear in their own folders/domains, as well as in the REPLICATE folder, but with different CIDs.

When the primary server is up and running again, either in the same state it was in before, or…

1. If a fresh new server install, then it needs to be configured

2. replicate all the data back again

3. Policy configs imported

4. folders/domains re-created

5. client re-assignment process reversed

6. clients assigned to their correct Group/s

The Avamar Administration guide describes “MCS configuration settings” and backing up & restoring “MCS data”, but I am not sure if this is applicable to our DR scenario.

Thanks again for the help – I am only asking this in detail as after losing 2 AVE servers for a few days, I would like to know there’s a good dependable solution we as Avamar learners can make work!

ionthegeek · Answer

Just to clarify, EMC normally performs replicate restore operations.

ionthegeek · Answer

All the data required to perform disaster recovery for a replication source system is replicated to the target during standard replication anyway.

Everything needed to recover the primary server is replicated. Running backups to the replication target in the interim is a separate problem.

I see how clients & their data & logs are all nicely replicated to the REPLICATE domain, but I can’t see the Policy objects/configuration that would be required to resume normal backup operations.

The policies, datasets, etc. are stored in the MCS database. MCS flushes (backups of the MCS configuration and database) are replicated to the target during standard replication. Once the primary system has been returned to service, a "replicate restore" is run to migrate the GSAN accounts and MCS flushes back to the system, then the MCS is restored from flush. This restores the operating configuration of the primary system.

Without these, after the primary failure, surely we would need to:

1. Re-assign clients to secondary server, which would mean their folders/domains would have to already exist – is there a way for these to be replicated (or export/import) ahead of time?

2. Copy the Policy configuration info (datasets etc.) to secondary server, eg. by...

a. Using “mccli group export” to export them to xml file, then use “mccli mcs import” on the target server to import them – this would need to be done periodically anyway to ensure it's immediately available, yes?

b. We read “However, groups exported in this way do not contain a list of client members” – so this means clients will have to be manually assigned back to their correct Groups unless it was automated via “mccli group add-client” etc.

This is all true. However, I assume that the goal during a disaster that affects the primary system is to protect the clients temporarily until the primary system can be brought back into service. A disaster is an exceptional event, so "normal backup operations" may not be the best thing to strive for.

If the goal is just to make sure the clients are protected, a one-for-one duplicate of the source configuration seems like it would create a lot of unnecessary administrative overhead. Instead of trying to resume normal operations on the replication target, why not create a domain called "DR 2014-06-16" and activate the clients there temporarily? You could migrate the datasets from the source but you'd have to be sure to keep them in sync. Alternatively, you could configure some basic "DR datasets" directly on the target and use those.

This would be a little more tricky if you're a service provider or subject to regulatory requirements, etc. but for most customers, I suspect it would be easier to manage DR as I've described above.

When the primary server is up and running again, either in the same state it was in before, or…

1. If a fresh new server install, then it needs to be configured

2. replicate all the data back again

3. Policy configs imported

4. folders/domains re-created

5. client re-assignment process reversed

6. clients assigned to their correct Group/s

Once the source system is back up and running, the general process for bringing the system back into production is as follows:

1. Replicate restore of the GSAN User Accounting System information (this includes the domain structure, client CIDs, etc.) and MC_BACKUPS account containing the MCS data.
2. Restore MCS from the most recent flush to restore clients, groups, datasets, etc.
3. Re-activate clients to source grid.
4. Replicate restore of backup data (optional).

Many customers opt to skip the restore of the backup data completely because it can take considerable time. Some customers restore only the most recent backup because this allows the clients to use their cache files. Some customers restore everything because of regulatory requirements, etc. It depends on the level of risk they're willing to accept.

Unless the source and target are configured for bi-directional replication, any backups created on the target would have to be manually replicated to the newly restored source since -- as you pointed out -- the clients have different CIDs.

My pleasure! I'm always happy to help.

hugovg2 · Answer

Hey thanks for the info & advice Ian, some good options & thoughts there, and I think I see in the Avamar Administrator GUI where to do that REPLICATE-restore now too. Thanks.
