33 Posts
0
August 14th, 2012 10:00
vfoglight unbearably slow
Does anyone have any tips for improving vFoglight performance? We have a dedicated physical MS SQL server hosting the DB. The vfog server is a VM with lots of virtual resources. We've tried tweaking the Java settings and removing all the community bundles, but it's still so slow that at best we can use it for a few minutes before it stops responding. Even when it works, you have to wait at least 15 seconds between clicks to navigate the interface. Quest support has not been helpful at all with this and we are getting desperate.


DELL-Thomas B
171 Posts
0
August 14th, 2012 11:00
Can you give me a few details about the number of VMs and how long it has been running? Do you have all of the default alarms running? Also, the first thing to do is go into the VM settings and reserve the full amount of memory. We run Java and need a contiguous memory block for our heap.
DELL-Thomas B
171 Posts
0
August 14th, 2012 11:00
What are the settings you have in your server.config? Also run this script in the script console/editor and let me know the number of rows.
import groovy.sql.Sql;
import com.quest.nitro.service.util.JDBCHelper;

// Count the rows in the alarm table using the server's own data source.
msg = new StringBuilder();
try {
    sql = new Sql(JDBCHelper.getDataSource());
    msg.append("Alarms Table Count: \n");
    msg.append(sql.rows("select count(*) from alarm_alarm"));
} catch (Exception ex) {
    // Report the failure in the console output instead of aborting the script.
    msg.append("Exception: " + ex);
}
return msg.toString();
bmalone11
33 Posts
0
August 14th, 2012 11:00
We have about 1000 VMs and have been running for around a year. We have already increased the memory for Java and increased the cache size by editing the server.config file.
bmalone11
33 Posts
0
August 14th, 2012 12:00
We were told to add these lines to the server.config file:
server.vm.option0 = "-Xms12288m";
server.vm.option1 = "-Xmx12288m";
server.vm.option2 = "-XX:ReservedCodeCacheSize=128m";
The script returned: 2377055
john_s_main
132 Posts
0
August 14th, 2012 12:00
Have hope. We have 13000 VMs on a single server, with 1 - 2 second response time pulling up a table with all 13000 rows.
Thomas is awesome, and will certainly be able to assist you.
I'm surprised that support has not been as helpful as you would like; they are among the best support teams around.
The reason that vBundle, in particular, needs to be removed is that it has some derived metrics which are known to cause performance issues.
Other things which could be helpful:
bmalone11
33 Posts
0
August 14th, 2012 12:00
I'm hesitant to remove vBundle as probably 9 of the 10 dashboards we use rely on this add-on. I'm not sure vFoglight is worthwhile for us without it.
john_s_main
132 Posts
0
August 14th, 2012 12:00
Understood. It has some useful tools, no doubt.
It would simply be a diagnostic step, and you would only need to disable vBundle temporarily, not uninstall it completely.
(Once we have determined that vBundle is part of the issue, we can track down which part is the problem and remediate that piece.)
Did you get those screenshots?
bmalone11
33 Posts
0
August 14th, 2012 13:00
The vfog VM has 4 cores and 12 GB RAM.
The host it's running on has 128 GB RAM and 24 cores.
dlarsen1
15 Posts
0
August 14th, 2012 13:00
First, I want to state that vBundle was last fully supported on vFoglight 6.5.1.
vBundle 1.9b3 was the last TEST version, which was for vFoglight 6.6.0.
There are some definite issues with vBundle being used on 6.6 and later. The derived metrics cause slowness and may contribute to churn. I highly recommend you at least test disabling it.
4 vCPUs is not overkill, so I'd consider adding a couple more vCPUs.
Also: You can adjust the resource share allocation of the vFoglight VM in VMware:
Feel free to open a support case. I don't see any vFoglight cases from you.
Thanks,
-dave
vFoglight Support Engineer
DELL-Thomas B
171 Posts
0
August 14th, 2012 13:00
So the issue with this is basically your alarms table. The memory and everything else, IMO, is far too large as well. With 1000 VMs your heap shouldn't be larger than 6144m for -Xms/-Xmx, and I don't see a need for the code cache setting.
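Using the server.config syntax quoted earlier in the thread, the smaller heap would look roughly like this. This is only a sketch: it assumes the option indices stay as they were, and it simply leaves out the server.vm.option2 code cache line.

```
server.vm.option0 = "-Xms6144m";
server.vm.option1 = "-Xmx6144m";
```

For scale, 6144m is 6 GB, half of the 12288m (12 GB) heap currently configured, which is the same size as the VM's entire 12 GB of RAM.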
The first thing I would do is go and turn off all of the alarms in Rules Management that you aren't looking at or don't need. As a rule, I turn off all but the agent alarms and then turn on only what is needed for a given environment.
Secondly, the alarms table needs to be cleared out. The easiest way is via this support article (https://support.quest.com/SolutionDetail.aspx?id=SOL40651). The basic problem is that many screens look at alarm counts, and with roughly 2.4 million rows the SQL statements take time to run. Since there are alarm counts on almost all pages, this can become an issue in larger environments where rules are not tuned. Run the purge from that article to remove all old alarms, and your UI should improve almost instantly once it completes.
bmalone11
33 Posts
0
August 15th, 2012 10:00
I've run the script a few times without success :-(
I get this error in the script console after a long delay:
com.quest.nitro.service.sl.interfaces.scripting.ScriptingException: com.quest.nitro.service.sl.interfaces.scripting.ScriptAbortException: org.hibernate.exception.SQLGrammarException: could not execute update query
----script start------
long now = System.currentTimeMillis()
Calendar threshold = Calendar.getInstance();
threshold.add(Calendar.DATE, -60)
server.get("AlarmService").purgeAlarms(new Date(0), threshold.getTime());
"done in "+(System.currentTimeMillis()-now)/1000+"s"
---- script end ------
    com.quest.nitro.service.scripting.ScriptingService.invoke(ScriptingService.java:618)
    com.quest.nitro.service.sl.impl.scripting.ScriptingBean.invoke(ScriptingBean.java:171)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:310)
    org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
    org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
    com.quest.nitro.service.sl.aop.AuditingInterceptor.invoke(AuditingInterceptor.java:71)
    org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
    org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    $Proxy92.invoke(Unknown Source)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:310)
    org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
    $Proxy92.invoke(Unknown Source)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:88)
    groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
    groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1058)
    groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:886)
    org.codehaus.groovy.runtime.InvokerHelper.invokePojoMethod(InvokerHelper.java:781)
    org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:772)
    com.quest.nitro.service.scripting.FoglightServiceInterface.invokeMethod(FoglightServiceInterface.java:84)
    org.codehaus.groovy.runtime.callsite.PogoInterceptableSite.call(PogoInterceptableSite.java:45)
    org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
    system._core_commons.scripts.runScript.run(runScript:23)
    com.quest.nitro.service.scripting.groovy.GroovyScript.exec(GroovyScript.java:141)
    com.quest.nitro.service.scripting.Script.runInternal(Script.java:310)
    com.quest.nitro.service.scripting.Script.run(Script.java:255)
    com.quest.nitro.service.scripting.ScriptingService.invoke(ScriptingService.java:604)
    com.quest.nitro.webconsole.services.ScriptServiceImpl.eval(ScriptServiceImpl.java:86)
    sun.reflect.GeneratedMethodAccessor526.invoke(Unknown Source)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    com.quest.wcf.services.Registry$LazyInvocationHandler.invoke(Registry.java:127)
    $Proxy304.eval(Unknown Source)
    com.quest.wcf.core.module.function.FunctionManager.evaluateScript(FunctionManager.java:577)
    com.quest.wcf.core.module.function.FunctionManager.evaluateScriptFunction(FunctionManager.java:396)
    com.quest.wcf.core.module.function.FunctionManager.evaluate(FunctionManager.java:268)
    com.quest.wcf.core.module.function.FunctionManager.getResult(FunctionManager.java:184)
    com.quest.wcf.core.module.function.AsyncFunctionProgressTracker.execute(AsyncFunctionProgressTracker.java:56)
    com.quest.wcf.core.module.function.BaseAsyncProgressTracker.run(BaseAsyncProgressTracker.java:166)
    com.quest.nitro.service.taskmanager.TaskRunnable.run(TaskRunnable.java:294)
    java.lang.Thread.run(Thread.java:662)
Caused by: com.quest.nitro.service.sl.interfaces.scripting.ScriptAbortException: org.hibernate.exception.SQLGrammarException: could not execute update query
----script start------
long now = System.currentTimeMillis()
Calendar threshold = Calendar.getInstance();
threshold.add(Calendar.DATE, -60)
server.get("AlarmService").purgeAlarms(new Date(0), threshold.getTime());
"done in "+(System.currentTimeMillis()-now)/1000+"s"
---- script end ------
    com.quest.nitro.service.scripting.Script.runInternal(Script.java:334)
    com.quest.nitro.service.scripting.Script.run(Script.java:255)
    com.quest.nitro.service.scripting.ScriptingService.invoke(ScriptingService.java:604)
    com.quest.nitro.service.sl.impl.scripting.ScriptingBean.invoke(ScriptingBean.java:171)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:310)
    org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
    org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
    com.quest.nitro.service.sl.aop.AuditingInterceptor.invoke(AuditingInterceptor.java:71)
    org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
    org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    $Proxy92.invoke(Unknown Source)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:310)
    org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:198)
    $Proxy92.invoke(Unknown Source)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:88)
    groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:233)
    groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1058)
    groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:886)
    org.codehaus.groovy.runtime.InvokerHelper.invokePojoMethod(InvokerHelper.java:781)
    org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:772)
    com.quest.nitro.service.scripting.FoglightServiceInterface.invokeMethod(FoglightServiceInterface.java:84)
    org.codehaus.groovy.runtime.callsite.PogoInterceptableSite.call(PogoInterceptableSite.java:45)
    org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
    system._core_commons.scripts.runScript.run(runScript:23)
    com.quest.nitro.service.scripting.groovy.GroovyScript.exec(GroovyScript.java:141)
    com.quest.nitro.service.scripting.Script.runInternal(Script.java:310)
    com.quest.nitro.service.scripting.Script.run(Script.java:255)
    com.quest.nitro.service.scripting.ScriptingService.invoke(ScriptingService.java:604)
    com.quest.nitro.webconsole.services.ScriptServiceImpl.eval(ScriptServiceImpl.java:86)
    sun.reflect.GeneratedMethodAccessor526.invoke(Unknown Source)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:597)
    com.quest.wcf.services.Registry$LazyInvocationHandler.invoke(Registry.java:127)
    $Proxy304.eval(Unknown Source)
    com.quest.wcf.core.module.function.FunctionManager.evaluateScript(FunctionManager.java:577)
    com.quest.wcf.core.module.function.FunctionManager.evaluateScriptFunction(FunctionManager.java:396)
    com.quest.wcf.core.module.function.FunctionManager.evaluate(FunctionManager.java:268)
    com.quest.wcf.core.module.function.FunctionManager.getResult(FunctionManager.java:184)
    com.quest.wcf.core.module.function.AsyncFunctionProgressTracker.execute(AsyncFunctionProgressTracker.java:56)
    com.quest.wcf.core.module.function.BaseAsyncProgressTracker.run(BaseAsyncProgressTracker.java:166)
    com.quest.nitro.service.taskmanager.TaskRunnable.run(TaskRunnable.java:294)
    java.lang.Thread.run(Thread.java:662)
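The script quoted in the trace passes purgeAlarms a window running from the epoch to 60 days ago. That window can be checked on its own before retrying the purge. The following is a minimal sketch in plain Groovy with no Foglight calls; `purgeWindow` is a hypothetical helper name used only for illustration, and it does not explain or fix the SQLGrammarException.

```groovy
// Plain Groovy, no Foglight classes: rebuild the date window used by the quoted script.
// purgeWindow is a hypothetical helper, not part of any Foglight API.
def purgeWindow(int daysToKeep, Calendar now = Calendar.getInstance()) {
    Calendar threshold = (Calendar) now.clone()   // leave the caller's calendar untouched
    threshold.add(Calendar.DATE, -daysToKeep)     // same arithmetic as the quoted script
    return [from: new Date(0), to: threshold.getTime()]
}

def window = purgeWindow(60)
println "Purge window: ${window.from} .. ${window.to}"
```

Printing the window first makes it easy to confirm that the upper bound really is 60 days back in the server's time zone.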
DELL-Thomas B
171 Posts
1
August 15th, 2012 11:00
Can you please do it via the jmx-console? It's more reliable in my experience and can delete them all faster anyways.
bmalone11
33 Posts
0
August 17th, 2012 06:00
OK, the alarms have been purged and now there are only about 100,000.
I have also removed vBundle, but I'd like to enable it again. Obviously only time will tell which was having the greater impact on performance, but which do you expect is the most likely cause: the alarms or vBundle?
Thanks for your advice. The number of alarms wasn't something Quest support mentioned at all; they had us running support bundle after support bundle without any progress.
john_s_main
132 Posts
0
August 17th, 2012 10:00
100,000 is a good, normal level for alarms.
So, has performance improved back to normal levels?
You will probably want to go through and disable any alarms which you are not actually using - if, when the alarm goes off, no one is going in and acknowledging/clearing it, then you should turn off the alarm altogether.
To protect yourself from others who are not acknowledging/clearing alarms, you might want to enable the following rule: "Clear Old LogFilter Alarms", which will run every 24 hrs and clear any alarms older than 30 days, preventing them from building up and overwhelming your database. On FMS servers which are not actually being used for alerting, I copy and modify this rule to run every 8 hrs and clear anything older than 3 days.
Again, the best solution is to only alert on things which someone will actually take action upon.
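The age-based clean-up described above comes down to a cutoff comparison. The following is a rough sketch in plain Groovy, not the built-in rule: alarms are modelled here as simple maps with hypothetical `created` and `cleared` fields, and `staleAlarms` is an illustrative name.

```groovy
// Illustrative only: alarms are plain maps with a 'created' Date and a 'cleared' flag.
// This is not the Foglight alarm model or the "Clear Old LogFilter Alarms" rule itself.
def staleAlarms(List alarms, int maxAgeDays, Date now = new Date()) {
    long cutoff = now.time - maxAgeDays * 24L * 60 * 60 * 1000
    // Keep only alarms that are still open and were created before the cutoff.
    alarms.findAll { !it.cleared && it.created.time < cutoff }
}

def today = new Date()
def day = 24L * 60 * 60 * 1000
def alarms = [
    [id: 1, created: new Date(today.time - 45 * day), cleared: false],
    [id: 2, created: new Date(today.time - 2 * day), cleared: false],
]
// With a 30-day limit, only the 45-day-old alarm is selected for clearing.
println(staleAlarms(alarms, 30, today).collect { it.id })
```

Changing the 30 to 3 mirrors the shorter retention mentioned for servers that are not used for alerting.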
DELL-Thomas B
171 Posts
0
August 17th, 2012 12:00
I'd say you can re-enable the vBundle, but please delete his derived metrics. His calculations for datastore throughput and IO are not accurate. The reason is that I created those and shared them, but I later found that the way I was querying the data was not filtering correctly and that I forgot to divide the IOPS correctly, so the numbers are quite a bit off. I've created a community datastore cartridge that has the correct metrics, and better yet, these will all be moved into the next GA release.