3 Posts
0
714
August 22nd, 2014 07:00
avsql - slow scan
Hello all -
I'm a SQL DBA handling my organization's cutover to the SQL plug-in for many of our key instances. The initial "scan" of the databases is extremely slow on some SQL Server instances and fairly fast on others. The one that troubles me in particular is a low-utilization SQL failover cluster with only a handful of databases, which takes 3-5 minutes between databases before even kicking off the "task" that performs the actual incremental backup. How do I diagnose the issue and potentially get this to run faster? Attaching the log of this activity.
Our standard is to guarantee a 15 minute RPO for any database, which was easy with the native SQL backups. Some of these incremental jobs spend more than 15 minutes on the initial "scans" before spawning the backup threads, and only one backup execution can run at a time, so how can I possibly maintain this RPO?
Any ideas or clever work-arounds folks have stumbled upon?
Thanks!
Stogie_Maru
3 Posts
1
September 11th, 2014 08:00
Just want to provide some closure on this, in the event anyone else experiences the issue and stumbles across this thread...
EMC engineering provided a fix binary with a "search-interval" switch, which reduces the amount of historical data searched for the most recent viable Full backup. By default, the plug-in searches the backup history one week at a time, examining the last four weeks in turn, and falls back to the whole history if no Full backup is found in those four attempts. Specifying either "day" or "hour" drastically reduced the time taken to process an Incremental backup execution. They also provided a different fix binary with a "count" switch that works in a similar fashion: I could supply an integer value as the initial number of historical backups searched for the last viable Full, and it would keep widening the search by a factor of that value until one was found. That also achieved good results.
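To illustrate why a smaller search interval helps, here is a rough Python sketch of a windowed search like the one described above. It is only a model built from that description, not EMC's code: the function name, the record layout, the "records examined" cost measure, and the sample history are all made up for illustration.

```python
from datetime import datetime, timedelta

def find_last_full(backups, now, interval=timedelta(weeks=1), max_windows=4):
    """Return (most recent full backup or None, number of records examined).

    backups: list of (timestamp, kind) tuples; kind is "full" or "incremental".
    Hypothetical model: look back one interval-sized window at a time, up to
    max_windows windows, then fall back to scanning the whole history.
    """
    examined = 0
    for i in range(max_windows):
        newest = now - i * interval
        oldest = now - (i + 1) * interval
        # Records that fall inside this window are the ones "examined".
        window = [b for b in backups if oldest < b[0] <= newest]
        examined += len(window)
        fulls = [b for b in window if b[1] == "full"]
        if fulls:
            return max(fulls, key=lambda b: b[0]), examined
    # No full backup in any window: scan the entire history.
    examined += len(backups)
    fulls = [b for b in backups if b[1] == "full"]
    return (max(fulls, key=lambda b: b[0]) if fulls else None), examined


# Made-up history: log backups every 15 minutes for 28 days, nightly fulls.
now = datetime(2014, 9, 11, 8, 0)
history = [(now - timedelta(minutes=15 * k), "incremental") for k in range(1, 28 * 96)]
history += [(datetime(2014, 9, 11) - timedelta(days=d), "full") for d in range(28)]

weekly_full, weekly_cost = find_last_full(history, now)
daily_full, daily_cost = find_last_full(history, now, interval=timedelta(days=1))
print(weekly_full[0], weekly_cost, daily_cost)
```

In this model both searches land on the same nightly Full, but the daily window touches far fewer records than the weekly one. Note that in the sketch an interval shorter than the gap between Fulls can miss in every window and fall through to the whole-history scan; how the actual fix binary handles that case is not something I can speak to.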
Not sure which fix binary will make the final cut, but I was informed that EMC engineering is moving forward with official hotfix 200279 for v7.0 SP2, estimated availability time around the end of September 2014. Cite that hotfix in the event you need to pursue this fix.
ionthegeek
2 Intern
•
2K Posts
0
August 28th, 2014 06:00
I would recommend working with support on this issue.
Stogie_Maru
3 Posts
0
August 28th, 2014 08:00
Case is currently open with engineering and pending resolution.
With enhanced debug logging turned on, the wait for each database always seems to occur after:
yyyy/mm/dd-hh:mm:ss:xxxxx [stderr] avspawnpipe::body stderr waiting for I/O
On our performant systems it’s as few as 10 seconds, averaging about 20 seconds. Not great, but acceptable... for now. On this non-performant system, it’s frequently 3-5 minutes. What is going on here, and what is it really waiting for? Is it a function of database size, the amount of backup history (both in the SQL Server system tables and in Avamar), or network latency (though all other calls/connections to the Avamar appliance perform just as well as on any other system)? Totally baffling.
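For anyone wanting to put numbers on these waits, a small Python sketch along the following lines could measure the gap between each "waiting for I/O" line and the next timestamped log line. It assumes every log line starts with a stamp in the yyyy/mm/dd-hh:mm:ss form shown above and ignores the fractional part; the function and constant names are my own, not anything from Avamar.

```python
import re
from datetime import datetime

# Assumption: lines begin with yyyy/mm/dd-hh:mm:ss, as in the excerpt above.
STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2})")
MARKER = "avspawnpipe::body stderr waiting for I/O"

def wait_gaps(lines):
    """Yield seconds between each marker line and the next timestamped line."""
    pending = None  # timestamp of the last unmatched marker line, if any
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d-%H:%M:%S")
        if pending is not None:
            yield (stamp - pending).total_seconds()
            pending = None
        if MARKER in line:
            pending = stamp
```

Passing an open log file (or any iterable of lines) to `wait_gaps` and sorting the results would show which databases account for the longest stalls, which may help when comparing a fast instance against a slow one.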
Reading through various discussions in this Forum, I’m not seeing a lot of SQL DBAs present, which I find a little concerning. I’m starting to wonder if the use of the SQL plug-in for direct SQL database backups is more trouble than it’s worth. I’m certainly frustrated. If the general consensus is “you’re really better off continuing to dump your SQL DB backups to local disk on the server and then let Avamar do its de-duplication magic on those at the file system level”, so be it. I can abandon the SQL plug-in - though we were really hoping to get back the terabytes worth of designated SQL DB “backup” drives scattered across the numerous SQL Servers across our enterprise and add that space back to our ever-shrinking SAN pool.
Most of all, I’m still stumped by how anyone is able to maintain a standard 15 minute recovery point objective (which is a pretty common rule-of-thumb across the SQL community) considering the amount of time it takes from when the backup task kicks off until when the backups are actually taken. With the trend toward SQL consolidation, some instances run 100+ databases. At an average of 20 seconds per scan (on our performant Avamar SQL plug-in systems), even on an instance with 50 databases, that’s over 16 minutes burned … then at least another couple minutes for the transaction log (“incremental”) backups themselves. The nightly Full database backup process on the typical server runs 30-60 minutes, so with the limitation of only one backup task being able to run at a time, there can be a sizeable gap of time between the Full backup and the next incremental backup for any given database. Are there any plans for an enhancement to allow more than one task to run simultaneously?
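The scan-time arithmetic above is easy to check with a few lines of Python. This is just back-of-the-envelope math under the simplifying assumption that scans run strictly one after another and each takes the same time; the helper names are mine.

```python
def scan_minutes(databases, seconds_per_scan):
    """Total minutes spent scanning before any backup starts."""
    return databases * seconds_per_scan / 60

def max_databases_within(rpo_minutes, seconds_per_scan):
    """Largest number of databases whose scans alone fit inside the RPO."""
    return int(rpo_minutes * 60 // seconds_per_scan)

print(scan_minutes(50, 20))           # 50 databases at 20 s each
print(max_databases_within(15, 20))   # performant systems, 20 s per scan
print(max_databases_within(15, 180))  # slow cluster, 3 minutes per scan
```

At 20 seconds per scan, the scans for 50 databases already exceed a 15 minute RPO, and the budget is used up at 45 databases before a single log backup has run. At 3 minutes per scan, only 5 databases fit.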
ionthegeek
2 Intern
•
2K Posts
0
September 11th, 2014 09:00
Thank you for the thorough follow-up! I'm sure this will be useful if others encounter the same situation.