Unsolved
41 Posts
0
496
January 23rd, 2023 14:00
Can I capture backup ssids in a workflow for cloning later?
I've been manually creating lists of ssids/cloneids by running mminfo queries after the fact, to use for nsrclone operations to clone from Data Domain to tape (LTO7). A bit of a time sink since I have to make sure I get everything and not a lot extra, etc. Currently running Networker 19.5.05.
Is there any way to capture the list of ssids created by the backup in a workflow? We just have one backup in a workflow, and sometimes a clone to cloudboost. So the workflow process must make a list of ssids in the backup step for the clone step to know what to clone. And the post in this community, "Overview of Networker Group Cloning Operation", says as much: "Every time a save job completes, it writes the SSID into the file /nsr/tmp/sg/<groupName>. The nsrclone job periodically checks this file to get the cloning source. The temp file is deleted after the save group (workflow) completes."
Has anyone figured out how to capture that temporary list of ssids for later use, after the workflow finishes? It would make the tape cloning a lot easier and much more scriptable. Would be great to have a scripting "hook" in the workflow... maybe there is a way for the policy completion command to copy the clone ssid temp file?
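For example, something along these lines is what I have in mind for a completion command (the group name and destination directory are just placeholders, and I haven't verified whether the temp file is still present at the moment the completion command runs):

#!/bin/sh
# Hypothetical completion hook: snapshot the per-group ssid temp file
# before NetWorker cleans it up. Group name and paths are assumptions.
GROUP="MyNightlyWorkflow"
SRC="/nsr/tmp/sg/${GROUP}"
DEST="/nsr/clone-lists/${GROUP}.$(date +%Y%m%d).ssids"

mkdir -p /nsr/clone-lists
if [ -f "$SRC" ]; then
    cp "$SRC" "$DEST"    # keep the list for a later "nsrclone -S -f" run
else
    echo "ssid temp file $SRC not found (already cleaned up?)" >&2
fi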
Any ideas appreciated.



bingo.1
2.4K Posts
0
January 23rd, 2023 23:00
Well - filtering the SSIDs from the reports is possible but tricky. There are better methods - and you are already on the right track: scripting.
This is what I did some years ago:
- Creating workflows that send data to 2 pools: TO_CLONE & NOT_TO_CLONE
- Writing appropriate scripts which also check the number of copies (instances) of each save set in the TO_CLONE pool.
- Letting the clone script run on a daily basis.
This allows you to even skip the process for a day or two (if you need to run other maintenance tasks on the DD).
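As a rough sketch of what such a daily script boils down to (the pool names, the single-copy check, and the target tape pool here are examples, not the literal names from my setup):

#!/bin/sh
# Sketch: find save sets in the TO_CLONE pool that still only have one
# copy and clone them to tape. Pool names are examples.
mminfo -avot -q "pool=TO_CLONE,copies=1" -r ssid 2>/dev/null | awk 'NR>1 {print $1}' > /tmp/to_clone.ssids

# only start nsrclone if the query actually returned save sets
if [ -s /tmp/to_clone.ssids ]; then
    nsrclone -b TAPE_CLONE_POOL -S -f /tmp/to_clone.ssids
fi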
What about this solution?
barry_beckers
393 Posts
0
January 24th, 2023 00:00
We do something similar, but instead of a script we use a workflow with a protection group that does the querying for the save sets to be cloned, based on one source pool.
Pool-based cloning makes things simple in the sense that you don't care how many ssids are in that pool; you simply clone any ssid that isn't cloned yet. In the past (pre-NW9), data to be cloned was queried based on the group. But that means whenever anything new is created, you have to make sure that group was added to the query as well. Moving towards pool-based cloning makes things way easier. All data that ends up in CLONE_DATA_FROM_THIS_POOL is cloned, whereas all data in DO_NOT_CLONE_DATA_FROM_THIS_POOL is not picked up by any clone-related workflow. As easy as that...
We prefer this over having a clone action directly after the backup action, as that is very inefficient (especially in larger environments): the workflow as a whole runs longer, and more clone actions may be competing for the same source and target resources. In NW 9.2 or so it even flat out stopped working altogether, causing clone actions to hang or fail. After disconnecting the clone actions from the workflows that make the backups, we haven't had any issues like that anymore.
So we have the workflows with the clone actions running once or even twice a day, set to look for all data from the last x days (looking back 3 days or so is probably good enough for most, to find ssids not yet cloned in case there were some cloning failures in between; one can look back further, but that also puts a higher load on the NW server to query for older data). We call it asynchronous cloning, as you separate the cloning from the actual backup. We run the clone jobs mainly during office hours, so that backup and cloning don't interfere with each other too much.
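If you want to double-check what such a query-based clone workflow would pick up, a quick mminfo along these lines (pool name and look-back window are just examples) lists the save sets in the source pool that still only exist as a single copy:

# save sets from the last 3 days in the clone source pool with only one copy
mminfo -avot -t "3 days ago" -q "pool=CLONE_DATA_FROM_THIS_POOL,copies=1" -r "ssid,cloneid,savetime,client,name"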
dstrom
41 Posts
0
January 31st, 2023 08:00
Thank you both for taking the time to reply.
I'm not sure how I would adapt a pool-based selection for cloning actions. Right now we use the pool to choose which DDBoost interface(s) to use for the backup to Data Domain, so the network traffic stays in the subnet and doesn't traverse the network firewall. So the backup operation uses either the vlanA or vlanB pool (clever names, eh?).
We've got about 8 profile/workflow/savegroups set up: some VMs, some NAS-NDMP, some physical servers. Our policy is incremental every day, full on the last Friday of the month. Some select ("production") workflows clone the monthly full backups to cloudboost (AWS S3). About 12TB now; it used to be less, closer to 8TB when we started using cloudboost. Since this runs over the weekend, it's not a problem for us if one nightly incremental gets skipped. Networker hasn't had any problems with these 3 cloudboost cloning operations all running on the same weekend, finished by Sunday evening.
And per our policy, on a quarterly basis I clone all of the monthly full backups to tape, including the ones already cloned to cloudboost. So I manually run a bunch of mminfo commands to generate a list of SSIDs or SSID/CloneID pairs (for the 3 cloudboosted groups), selecting by savegroup, date range, and location (the DD pool, not the cloudboost one). Thank goodness for the history command in Bash.
First a verbose list, which I compare to the backup completion email... check the count of savegroups vs. backed-up volumes/VMs, make sure nothing is missed and nothing extra is included. Then I output the SSID (CloneID) pairs to a file to use as input to an nsrclone command (-S).
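For what it's worth, the manual commands boil down to something like this (the group, pool, dates, and tape pool name are examples, not my real ones):

# verbose list to compare against the backup completion email
mminfo -avot -q "group=ProdServers,pool=vlanA,level=full,savetime>=01/27/2023,savetime<=01/30/2023" -r "client,name,savetime,ssid,cloneid,totalsize"

# same query, reduced to ssid/cloneid pairs (awk drops the header row)
mminfo -avot -q "group=ProdServers,pool=vlanA,level=full,savetime>=01/27/2023,savetime<=01/30/2023" -r "ssid,cloneid" | awk 'NR>1 {print $1 "/" $2}' > /tmp/quarterly_clone.list

# clone those specific copies to the LTO7 tape pool
nsrclone -b LTO7_Clone -S -f /tmp/quarterly_clone.list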
Like barry_beckers above, I do this cloning during the day when the Data Domain is idle. Takes a week or more elapsed time to produce 10 LTO7 tapes, average about 10TB per tape or a bit more.
I was hoping to avoid this manual mminfo process by capturing the ssids from the workflow, but I am not hopeful. There's also the manual process if something gets skipped in the regular full backup process, like if a server goes down and the backup fails. I add a special full backup of that server to the tape clone set, so that a set of tapes theoretically can restore *everything* if there's a real disaster... at least that's the goal.
barry_beckers
393 Posts
0
January 31st, 2023 10:00
Like bingo.1 and I were trying to convey, if you have an approach where the pool you back up to determines whether ALL data in it needs to be cloned or not, then setting that up is rather simple. Some prefer scripting for the cloning, but as stated, when using a protection group with "query" as the selection option, you can choose from various selection criteria. In that query group you simply select one source pool to clone from. Then you create a workflow into which you connect this protection group and define a clone action in which you specify where the data needs to be cloned to. With that you have a workflow that only clones data selected by querying one single source pool.
creating a query protection group:
https://www.dell.com/support/manuals/en-us/networker/nw_p_ddboost_int_guide_19.8/creating-a-query-group?guid=guid-d4e6ddba-1b89-46d2-b3e2-fa10c8ee36c9&lang=en-us
But as said, that requires you to have the backups split up in such a way that you have pools that need to be cloned and pools that don't need to be cloned. For each pool containing data to be cloned, you'd create its own workflow. Mixing things up with multiple source pools seems to have issues even with NW19 (for example when having one pool containing VM vProxy-based backups and one pool containing regular backups), whereas once things are split up into separate workflows with just one source pool each, things work flawlessly. One would think mixing multiple source pools should work, but it has mixed results, hence we simply have a workflow for each source pool to be cloned.
Similarly, you can have a clone action run after the backup action in the same workflow; however, that did not appear to be very scalable in our case with many workflows in use. It worked until, with NW 9.2 or so, it suddenly stopped working; NW simply lost track along the way. Way too many workflows were competing for source and target resources to all have a go at cloning, so we let that kind of cloning go completely. It is also far from efficient, since multiple workflows may be busy reading from and writing to the same devices. Doing it with just one workflow per source pool works just fine. We learned the hard way and are happy with the current results; fighting with NW over cloning has been a challenge for 20+ years, but things have definitely improved for the better...