1 Rookie

 • 

8 Posts

November 17th, 2020 01:00

Networker not mounting tapes from different pools

We are having some trouble with a tape library managed by NetWorker v19.4 and earlier versions.

After clone-to-tape jobs to one tape pool have finished, NetWorker will not mount tapes from another tape pool into the drive that was just used. If you mount a tape from the desired pool manually, the waiting job starts.

This happens with or without pool-to-target-device selections.

Can someone help?

Nico

1 Rookie

 • 

8 Posts

March 11th, 2021 05:00

Thanks @barry_beckers for your detailed answer.

Meanwhile, our problem has been solved.

The first thing I found was that the tape storage node (bagsv32) was not listed under Storage Node Properties -> Configuration -> Advanced Devices -> Clone storage nodes. After adding it there, it was possible to use the tape devices again.

The second source of trouble was a NetWorker client installed in parallel on the tape storage node (and a different version at that). Since I removed it, everything runs fine again.
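For anyone hitting the same symptom: the "Clone storage nodes" list can also be inspected without the GUI, using nsradmin. This is only a sketch; the resource type and attribute name below are assumed from the NMC dialog labels, so verify them with `types` and `show` on your own server first.

```shell
# Hedged sketch -- attribute name assumed from the NMC label, verify first.
# Start nsradmin against the NetWorker server (replace the server name):
nsradmin -s networker-server

# Then, inside the interactive nsradmin prompt:
#   . type: NSR storage node; name: bagsv32
#   show clone storage nodes
#   print
```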

Thanks to all the supporters in this thread for helping me!

Nico

 

2.4K Posts

November 17th, 2020 06:00

This sounds weird. I personally suspect a configuration error. In principle you are following the right idea: do not assign pools to devices.

I have never seen a good reason to use that feature. IMHO, it is the last resort because you can easily end up in a mess and cause confusion. Do not forget: "enablers are also disablers" - if you specify a certain criterion, you block others.

I would use the "nsrjb -vvv -l ..." command to load and mount a tape in one of these devices. Now NW should tell you more about where the trouble is. You could also look at the daemon.raw file.
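A concrete invocation might look like the sketch below. The device path and barcode are placeholders taken from this thread, and the daemon.raw path assumes the default Windows install location; adjust both for your setup.

```shell
# Verbosely load and mount a specific volume into a specific drive
# (device and volume names are examples from this thread):
nsrjb -vvv -l -f rd=bagsv32:\\.\Tape0 000139L6

# daemon.raw is a binary log; render it to plain text before reading it
# (path assumes the default Windows install location):
nsr_render_log "C:\Program Files\EMC NetWorker\nsr\logs\daemon.raw" > daemon.txt
```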

 

1 Rookie

 • 

8 Posts

November 18th, 2020 04:00

Ok, I have tried to load a volume:

nsrjb -vvv -l 000139L6 

gives

11/18/20 08:25:07.669553 Error: Cannot allocate `1' devices for this operation: max active devices for jukebox rd=bagsv32.xxx.group:IBM@0.3.1 = 0

It is a TL4000 library attached to a Windows 2016 server.

Mounting via the GUI works without problems.

But that does not seem to be the reason for all the trouble.

I tried a clone like this:

nsrclone -vvv -b Wochenband -S 2813537987

It gave me:

180206:nsrclone: Step (7 of 16): NSRCLONE_PROCESSING_PHASE_THREE: Sending the media reservation request for the current save set list.
160812:nsrclone: Retry reservation after it was rejected. Retry counter [4]
116284:nsrclone: Sending cancel broker request 96
180876:nsrclone: Request 96 was cancelled.
116283:nsrclone: retrying mmd reservation request!!!

After that I checked the 'Disable RPS Clone' box in the server properties and that solved the problem!

Instead of the error message the job continued with:

180179:nsrclone: Updating the total number of steps from 16 to 5 for the non VM save sets workflow.

Now the library is behaving as expected... But I don't know why the RPS option causes these errors.
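For reference, the same checkbox should also be reachable from the command line via nsradmin. Treat this purely as a sketch: the attribute name is assumed from the GUI label 'Disable RPS Clone' and may differ in your release, so check it with `show` before updating anything.

```shell
nsradmin -s networker-server

# Then, inside the interactive nsradmin prompt
# (attribute name assumed from the GUI label, verify with 'show' first):
#   . type: NSR
#   show disable RPS clone
#   update disable RPS clone: Yes
```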

 

2.4K Posts

November 18th, 2020 05:00

Well done. There are obviously still issues with this option, although it is even enabled by default. But you should obviously disable it when you do not clone between 2 DDs.

You may also look the following document: https://avus-cr.de/19_1_gen_7.pdf

 

 

 

1 Rookie

 • 

8 Posts

January 20th, 2021 23:00

Unfortunately, it turned out that the problem still exists...

After starting another clone job with debug level 9, I noticed these lines:

 

broker_find_usable_save_mmd: no ready or setup mmd available on snode bagsv32
is_device_busy: device rd=bagsv32:\\.\Tape1 assigned to mmd #269893653.
is_device_busy: device rd=bagsv32:\\.\Tape0 is not shared 
find_init_mmd_for_save: write_reserved_mmds: 0 
0 out of 8 mmds are available for save 
broker_find_usable_save_mmd: NSRATTR_CLONE_SAVE_VOL_HINT not set in the request
Failed to allocate save mmd since it was not started. Reject request id 595 
There are no available devices for client `bagsv32' on all usable storage nodes for the requested save operation. 

 

Where is the problem? Should Tape0 be shared? Or is the number of possible mmd processes limited somewhere?

 

2.4K Posts

January 21st, 2021 15:00

IMHO this is the key message:

0 out of 8 mmds are available for save

So all of the allowed nsrmmd processes have been started for this device and are in use. A new one to support the save/clone operation cannot be started.

This points to the fact that you have either hit a limit ('Max nsrmmd count' - although that seems to be valid only for AFTD and DD devices) or that earlier nsrmmd processes have not been properly terminated.

Support should thoroughly check this area.
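A quick way to test the "not properly terminated" theory is to count the nsrmmd processes actually running on the storage node and compare that with the limit from the log (8 in this thread). A minimal sketch:

```shell
# Count running nsrmmd processes on a Linux/Unix storage node
# ('|| true' keeps the count of 0 from aborting a 'set -e' shell):
ps -e -o comm= | grep -c '^nsrmmd' || true

# On a Windows storage node such as bagsv32 (Windows 2016), use instead:
#   tasklist /FI "IMAGENAME eq nsrmmd.exe"
```

If the count already equals the allowed maximum, stale processes are the likely culprit.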

 

 

1 Rookie

 • 

8 Posts

January 25th, 2021 05:00

After deleting and recreating the storage node with the library attached, it is no longer possible to clone to the library. Inventorying the tapes worked, but I can no longer restore from tape or clone to tape.

Unable to start clone session: no matching devices for save of client `bagvsv32'; check storage nodes, devices or pools
No available device on storage node bagvsv32
broker: no matching devices for save

What can I do?

2.4K Posts

January 25th, 2021 07:00

I assume that your pool configuration is wrong (too 'narrow'). This might especially be due to the fact that you assigned devices to pools. That is possible, but it is not really necessary - as mentioned, it will narrow your pool selection criteria. The best example is in fact the pool Default, which you are not allowed to change, but which NW will still be able to use. So avoid the device criteria whenever possible.

The other issue is that you have to be aware that loading/scanning and mounting are different things. You can scan any compatible media in any appropriate device - but if you want to mount it (which you must do before you start a backup/recovery process), this will only be possible in a device which is allowed for the specific pool. That would explain your specific behavior.
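The distinction can be demonstrated with the standard CLI tools. The slot number and device path below are placeholders; the -n flag loads a volume without mounting it, which is exactly what scanner needs.

```shell
# Load slot 5 into a drive WITHOUT mounting it (-n), then scan the media;
# this works regardless of any pool-to-device restrictions:
nsrjb -lnv -S 5 -f rd=bagsv32:\\.\Tape0
scanner -i rd=bagsv32:\\.\Tape0

# A real mount (no -n) is pool-aware and will only succeed on a device
# that is allowed for the volume's pool:
nsrjb -lv -S 5 -f rd=bagsv32:\\.\Tape0
```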

 

March 10th, 2021 07:00

I assume this has already been addressed by now?

As the client bagvsv32 mentioned here is also a storage node, does it have itself listed as the storage node to be used? Otherwise it would try to use the NW server as its storage node.

But as your error states "check storage nodes, devices or pools / No available device on storage node bagvsv32", it does seem to recognize that it should use itself as storage node. So the more likely cause is the pool in question not having the option to use the devices from this storage node. Are there restrictions in the pool assignments? Are the devices from this storage node assigned to the pool in question?

Too bad also that the dreaded "check storage nodes, devices or pools" at times had nothing whatsoever to do with any of that, but was a rather generic error message that could be caused by completely unrelated issues.

We tended to have some separation even way back when we were still using tape (now it is pretty much only Data Domain via DD Boost): OS backups went over the network via a shared storage node, while large database backups used a specific SAN pool that only dedicated storage nodes were supposed to use. This was at a time when we were heavily sharing drives and wanted to use them as effectively as possible, hence sending the much smaller OS backups over the LAN while DB backups went over the SAN. We split everything up into SAN-based pools and non-SAN pools, plus retention-based pools - 1-week, 2-week, 3-week, month, quarter and year pools and whatnot - so that tapes would contain as much active data as possible without having to resort too much to staging to make tapes available again (which we still had to do... often), restricting which device from which storage node, dedicated or shared, was supposed to use which pools.

2.4K Posts

March 10th, 2021 14:00

Just to clarify - if you define a (remote) device, the client is automatically promoted to a storage node. And of course it will automatically be added as such in the configuration/resources.

2.4K Posts

March 11th, 2021 05:00

Just to get this straight - the NW client software is mandatory and must be installed on each NW client. If you later promote a client to a storage node, you only need the storage node package in addition.

This is what you clearly see on a Linux SN. Unfortunately, on Windows you just tell the host which functionality it shall take on. What should read 'Client and Storage Node' just appears as 'Storage Node' - the client software (if not already present) will be installed during the process.

 

March 11th, 2021 06:00

But if I recall correctly (at least up until NW 18), if you created a device and so ended up with a new storage node, the client in question would not have itself defined as a storage node and would use the NW server by default. If that server did not have a backup device in the required pool, the backup would fail. It only required the storage node to have itself defined as a storage node for itself - something I forgot to do when initially setting up a system as a storage node, resulting in the dreaded "check devices, pools and storage nodes" error, which can point to a large number of causes.

2.4K Posts

March 11th, 2021 07:00

I have not worked with remote storage nodes for a while. So I just verified the issue with NW 19.4.0.0 on Windows.

As an old 'NetWorker' I usually do not use the wizard (except for DDBoost devices) but create a new AFTD device directly via its properties. If all software is installed, you simply specify the

  • Name,    like    rd=snode-1:Z:\AFTD_1         and the
  • Media type        adv_file

and this is it. As I mentioned before, the storage node resource will be created automatically. At least it has been the procedure for at least 20 years.
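The same two properties can also be expressed non-interactively as an nsradmin create. This is a sketch under the assumption that the attribute names match the property labels above; quote or double the backslash in the path as your shell requires.

```shell
nsradmin -s networker-server

# Then, inside the interactive nsradmin prompt:
#   create type: NSR device; name: "rd=snode-1:Z:\AFTD_1"; media type: adv_file
#   print type: NSR device; name: "rd=snode-1:Z:\AFTD_1"
```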

However, when I verified the procedure today, I had to learn that Dell/EMC silently changed the rules in the interim. If you do not already have a storage node defined, a window will pop up and clearly state:

  Unable to find storage node for device 'rd=snode-1:Z:\AFTD_1'. Please configure the storage node first.

Of course, the old method mentioned above will then work properly. Well ... with NW you will often encounter surprises.

 

Could someone from Dell/EMC please check when the behavior has been changed and save me some time for verification?

 

2.4K Posts

March 11th, 2021 14:00

Short tests showed that the requirement to define a SN prior to the creation of a device has existed much longer than I expected.

I think this has been necessary since NW 7.0, which was released sometime in 2005, most likely at the same time the 'device access information' was introduced. Unfortunately I no longer have those old Release Notes, so I cannot be more precise.

My parameters/properties

  • Name,    like    rd=snode-1:Z:\AFTD_1         and the
  • Media type        adv_file

are still valid. However, the official way according to the Admin Guide is a bit different. Because you should be able to use any string as a device name, you should specify something like

  • Name,    like         rd=snode-1:AFTD_1         the
  • Device access path    Z:\AFTD_1                          and the
  • Media type                   adv_file

For whatever reason, this does not work. NW will create the device, but if you want to label the volume, it comes up with the following error:

  Cannot open AFTD_1 for reading. Cannot get status on 'AFTD_1'.

This is poor, because this procedure has been listed in the Admin Guide for years.

 

Dell/EMC - please comment ....

 

 

2.4K Posts

March 29th, 2021 09:00

It took me a while until I could look deeper into the error I had seen.

As it turned out, the problem was due to an improper installation of the remote storage node software. Unfortunately, I did not completely understand the true reason, but de- and re-installing the software finally solved the issue. As it turns out, the documentation is correct.

Just thought this might be useful for you.

 
