Unsolved
9 Posts
0
3491
August 15th, 2012 09:00
Why are some processes getting too many locks?
When I try to extend a file system or delete SnapSure checkpoints, I get the error "unable to acquire lock(s), try later". I then have to go to /nas/lock/db, run od -d * to see which process holds the lock, and kill that process. But this keeps happening again and again.
What could cause this to happen so frequently, and is there a solution for it?
We have a NAS700 running 5.6.52.
afp92Tq1w012558
86 Posts
0
August 16th, 2012 00:00
Hi,
Can you please check whether any partition is getting full, using the command df -k?
Also, can you please check the sys_log / server_log for any error messages?
Thanks
Vanitha
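For anyone who wants to script the df -k check suggested above, here is a minimal sketch in Python. It assumes the POSIX-style column layout (as printed by df -kP), and the sample figures and mount points are made up for illustration; they are not taken from this system.

```python
def nearly_full(df_output, threshold=90):
    """Return (mount_point, percent_used) for entries at or above threshold.

    Assumes POSIX 'df -kP' layout: the fifth column is the capacity
    (e.g. '95%') and everything after it is the mount point.
    """
    flagged = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 6 or not parts[4].endswith("%"):
            continue  # not a data row in the expected layout
        pct = int(parts[4].rstrip("%"))
        if pct >= threshold:
            flagged.append((" ".join(parts[5:]), pct))
    return flagged


# Hypothetical sample output, for illustration only.
SAMPLE_DF = """\
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hda3 2063536 1960360 0 100% /
/dev/hda5 2063504 824000 1134680 43% /nas
/dev/hda1 256667 230000 13415 95% /boot
"""

if __name__ == "__main__":
    for mount, pct in nearly_full(SAMPLE_DF):
        print(f"{mount} is {pct}% full")
```

In practice the text would come from running df -kP and capturing its output rather than from a hard-coded sample.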
afp92Tq1w012558
86 Posts
0
August 16th, 2012 02:00
Hi,
Can you please check in sys_log whether the checkpoint is being created while the nasdb backup is running?
Thanks
Vanitha
omcsan
9 Posts
0
August 16th, 2012 02:00
Hi,
I have checked the space and everything is well within limits. I have also checked the server_log but did not find any alerts. One thing I have noticed is that too many SnapSure checkpoint requests are being generated, and I can see these processes in the locked state. I do not know why there are so many SnapSure checkpoint requests, even though the requests arrive when the scheduled backup job triggers. This happens at night, and the checkpoint creation goes on throughout the day.
omcsan
9 Posts
0
August 16th, 2012 09:00
Hi,
I have checked and found that the checkpoints are being created during the nasdb backup. Below are some logs. The backup is completing successfully, though. Does this mean there could be issues?
Aug 16 13:59:55 2012:CS_PLATFORM:NASDB:ERROR:11::::1345121995:Command failed. Command: /nas/bin/server_mountpoint server_2 -delete /root_vdm_25/automaticNDMPCkpts/automaticTempNDMPCkpt641-85-1345115385 Error: Error 2218: server_2 : /root_vdm_25/automaticNDMPCkpts/automaticTempNDMPCkpt641-85-1345115385 : does not exist
Aug 16 13:59:55 2012:CS_PLATFORM:NDMP:INFO:31::::1345121995:command succeeded: delete_ckpt
Aug 16 14:04:30 2012:DART:REP:INFO:47:Slot 2:::1345122270:Scheduler=380_CK200062500340_000A_3500_CK200063700624_0010(alias=Rep_BFS_012_TH), is out of sync.
Aug 16 14:05:00 2012:CS_PLATFORM:NASDB:INFO:300::::1345122300:nasdb_backup: NAS_DB checkpoint is in progress.
Aug 16 14:05:00 2012:DART:REP:INFO:46:Slot 2:::1345122300:Scheduler=380_CK200062500340_000A_3500_CK200063700624_0010(alias=Rep_BFS_012_TH), is in sync.
Aug 16 14:05:16 2012:DART:NDMP:INFO:2:Slot 2:::1345122316:fsid:11515 mount_point:/root_vdm_26/automaticNDMPCkpts/automaticTempNDMPCkpt643-7562-1345117452
Aug 16 14:05:24 2012:CS_PLATFORM:NASDB:INFO:305::::1345122324:nasdb_backup: NAS_DB checkpoint done.
Aug 16 14:05:30 2012:DART:REP:INFO:47:Slot 2:::1345122330:Scheduler=356_CK200062500340_000A_3437_CK200063700624_0010(alias=Rep_BFS_004_TH), is out of sync.
Aug 16 14:05:50 2012:CS_PLATFORM:NDMP:INFO:31::::1345122350:command succeeded: delete_ckpt
Aug 16 14:06:14 2012:DART:REP:INFO:46:Slot 2:::1345122374:Scheduler=356_CK200062500340_000A_3437_CK200063700624_0010(alias=Rep_BFS_004_TH), is in sync.
Aug 16 14:06:20 2012:DART:REP:INFO:47:Slot 2:::1345122380:Scheduler=3007_CK200062500340_000A_3489_CK200063700624_0010(alias=Rep_BFS_020_TH), is out of sync.
Aug 16 14:06:53 2012:CS_PLATFORM:NASDB:INFO:306::::1345122413:nasdb_backup: Celerra database backup done.
Aug 16 14:06:57 2012:DART:REP:INFO:46:Slot 2:::1345122417:Scheduler=3007_CK200062500340_000A_3489_CK200063700624_0010(alias=Rep_BFS_020_TH), is in sync.
Aug 16 14:09:25 2012:DART:NDMP:INFO:1:Slot 2:::1345122565:source_fsid:2718 mount_point:/root_vdm_25/automaticNDMPCkpts/automaticTempNDMPCkpt653-2718-1345122565
Aug 16 14:09:50 2012:DART:REP:INFO:47:Slot 2:::1345122590:Scheduler=365_CK200062500340_000A_3515_CK200063700624_0010(alias=Rep_BFS_007_TH), is out of sync.
Aug 16 14:09:57 2012:DART:CFS:INFO:7:Slot 2:::1345122597:The file system size (fs /root_vdm_18/BFS_005) dropped below the threshold of (90%)
Aug 16 14:10:06 2012:DART:REP:INFO:46:Slot 2:::1345122606:Scheduler=365_CK200062500340_000A_3515_CK200063700624_0010(alias=Rep_BFS_007_TH), is in sync.
Aug 16 14:10:29 2012:CS_PLATFORM:NDMP:INFO:31::::1345122629:command succeeded: create_ckpt
Aug 16 14:10:41 2012:DART:REP:WARNING:24:Slot 2:::1345122641:Source=359_CK200062500340_000A_3449_CK200063700624_0010(alias=Rep_BFS_005_TH), starting transfer. Destination has no space available. Source is retrying.
Aug 16 14:11:56 2012:CS_PLATFORM:CFS:INFO:116::::1345122716:fs auto extension forced ext start: [AFS_008] max_size:911360 hwm:0 (Slot2:1345122711:)
Aug 16 14:11:51 2012:DART:VCS:INFO:1:Slot 2:::1345122711:Replication file system extension fsId: 83 maxSizeInSector: 1866465280
Thanks
Sachin
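To check whether NDMP checkpoint activity overlaps the nasdb backup without reading the log by eye, a rough Python sketch like the one below may help. The field layout in the regular expression is inferred only from the sys_log lines pasted in this thread (it is an assumption, not a documented format), and the function and variable names are hypothetical. It also assumes each log entry sits on a single line.

```python
import re

# Layout inferred from the excerpt:
# date:facility:component:severity:id:slot:::epoch:message
LINE_RE = re.compile(
    r"^\w{3} +\d+ [\d:]{8} \d{4}:[^:]+:(?P<comp>[^:]+):[^:]+:\d+:[^:]*:::"
    r"(?P<epoch>\d+):(?P<msg>.*)$"
)


def parse(lines):
    """Yield (epoch, component, message) for lines matching the layout."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            yield int(m.group("epoch")), m.group("comp"), m.group("msg")


def ndmp_events_during_backup(lines):
    """Return NDMP events whose epoch falls inside a nasdb_backup window."""
    events = list(parse(lines))
    windows, start = [], None
    for epoch, comp, msg in events:
        if comp != "NASDB":
            continue
        if "checkpoint is in progress" in msg:
            start = epoch  # backup window opens
        elif "database backup done" in msg and start is not None:
            windows.append((start, epoch))  # backup window closes
            start = None
    return [
        (epoch, msg)
        for epoch, comp, msg in events
        if comp == "NDMP" and any(s <= epoch <= e for s, e in windows)
    ]


# A few lines copied from the excerpt above.
SAMPLE_SYS_LOG = [
    "Aug 16 13:59:55 2012:CS_PLATFORM:NDMP:INFO:31::::1345121995:command succeeded: delete_ckpt",
    "Aug 16 14:05:00 2012:CS_PLATFORM:NASDB:INFO:300::::1345122300:nasdb_backup: NAS_DB checkpoint is in progress.",
    "Aug 16 14:05:16 2012:DART:NDMP:INFO:2:Slot 2:::1345122316:fsid:11515 mount_point:/root_vdm_26/automaticNDMPCkpts/automaticTempNDMPCkpt643-7562-1345117452",
    "Aug 16 14:05:50 2012:CS_PLATFORM:NDMP:INFO:31::::1345122350:command succeeded: delete_ckpt",
    "Aug 16 14:06:53 2012:CS_PLATFORM:NASDB:INFO:306::::1345122413:nasdb_backup: Celerra database backup done.",
    "Aug 16 14:10:29 2012:CS_PLATFORM:NDMP:INFO:31::::1345122629:command succeeded: create_ckpt",
]

if __name__ == "__main__":
    for epoch, msg in ndmp_events_during_backup(SAMPLE_SYS_LOG):
        print(epoch, msg)
```

On the sample lines, two NDMP events (epochs 1345122316 and 1345122350) fall between the start of the NAS_DB checkpoint and the end of the database backup, which matches the overlap visible in the log.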
afp92Tq1w012558
86 Posts
0
August 17th, 2012 03:00
Hi,
I don't see any failed task messages in the sys_log output pasted above.
Can you please check the cmd_log.err file and see whether any errors are logged regarding "unable to acquire lock(s)" or regarding checkpoints?
Thanks
Vanitha
omcsan
9 Posts
0
August 17th, 2012 03:00
Hi Vanitha,
I have checked the logs, and below are some of the log details. I assume that while the checkpoints are being created, because no space is available, the checkpoints get into a hung state. But I am not sure if I am correct.
2012-08-03 23:26:31.521 db:0:23108:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_1 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-03 23:29:00.863 db:0:24662:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_2 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-03 23:33:07.639 db:0:27410:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_1 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-03 23:39:48.584 db:0:31229:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_2 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-03 23:44:00.210 db:0:1628:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_1 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-03 23:46:47.230 db:0:3602:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_2 QOSsize=20000M: There are not enough free disks available to satisfy the request.
2012-08-04 12:31:47.958 db:0:27891:E: fs_ckpt id=5584 -refresh: unable to acquire lock(s), try later
2012-08-05 05:12:39.324 db:0:18395:E: /nas/bin/nas_fs -delete automaticTempNDMPCkpt362-7532-1344064214 -Force: Internal MAC Socket Error 10: Connection timeout
2012-08-05 05:14:10.196 db:0:11757:E: /nas/bin/nas_fs -delete automaticTempNDMPCkpt362-7532-1344064214 -Force: DpRequest_Max_VS_SuspendedCheckpointsReached
2012-08-05 05:16:17.363 db:0:16406:E: /nas/bin/nas_fs -delete automaticTempNDMPCkpt362-7532-1344064214 -Force: DpRequest_Max_VS_SuspendedCheckpointsReached
Thanks
Sachin
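To see which failure dominates cmd_log.err, the entries can be tallied by error text. The sketch below is a rough Python example; the line layout ("timestamp db:N:PID:E: command: reason") is inferred only from the entries pasted above, so treat it as an assumption, and the function name is hypothetical. It expects each entry on a single line.

```python
import re
from collections import Counter

# Layout inferred from the excerpt: "<date> <time> db:<n>:<pid>:E: <command>: <reason>"
ENTRY_RE = re.compile(r"^\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d+ db:\d+:\d+:E: (?P<rest>.*)$")


def tally_errors(lines):
    """Count error reasons, i.e. the text after the first ': ' following the command."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not follow the assumed layout
        _command, sep, reason = m.group("rest").partition(": ")
        if sep:
            counts[reason] += 1
    return counts


# A few entries copied from the excerpt above.
SAMPLE_CMD_LOG = [
    "2012-08-03 23:26:31.521 db:0:23108:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_1 QOSsize=20000M: There are not enough free disks available to satisfy the request.",
    "2012-08-03 23:29:00.863 db:0:24662:E: /nas/sbin/rootnas_fs -x root_rep_ckpt_123_336566_2 QOSsize=20000M: There are not enough free disks available to satisfy the request.",
    "2012-08-04 12:31:47.958 db:0:27891:E: fs_ckpt id=5584 -refresh: unable to acquire lock(s), try later",
    "2012-08-05 05:14:10.196 db:0:11757:E: /nas/bin/nas_fs -delete automaticTempNDMPCkpt362-7532-1344064214 -Force: DpRequest_Max_VS_SuspendedCheckpointsReached",
]

if __name__ == "__main__":
    for reason, count in tally_errors(SAMPLE_CMD_LOG).most_common():
        print(count, reason)
```

On the full excerpt, the "not enough free disks" reason is by far the most frequent, which is consistent with the suspicion that checkpoint operations are stalling for lack of space.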
afp92Tq1w012558
86 Posts
0
August 17th, 2012 08:00
Hi,
Please check whether there is space in the Celerra storage pool using the command nas_pool -size.
Thanks
Vanitha