June 23rd, 2015 18:00
Inactive metrics for SCOM
Hi,
Good day. I would like to ask for some help with our Watch4Net/EMC deployment.
Lately I have noticed that some of the recurring W4N tasks have started to run longer than their average duration.
This is most evident with one particular task, optimize-tables-SCOM.
I checked the SCOM database in W4N and found around 2 million inactive metrics and only around 95K active metrics. The W4N collecting log (SCOM) also shows error messages like the ones pasted below my signature.
These errors occur every now and then.
We are not using any load balancing in our current deployment; we have only split the DB and the backends.
Any comments or help would be highly appreciated.
Regards,
Dante
| SEVERE | -- [2015-06-22 11:31:43 EST] -- SocketConnector::sendBuffer(): Can't write to auynpz07.ori.orica.net/168.252.83.47:2500 |
java.net.SocketTimeoutException
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.write(SocketConnector.java:462) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.sendBuffer(SocketConnector.java:383) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.commit(SocketConnector.java:367) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.pushData(SocketConnector.java:352) | |
| at com.watch4net.apg.v2.collector.AbstractCollector.pushNext(AbstractCollector.java:56) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter.sendValue(FailOverFilter.java:324) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter.access$1700(FailOverFilter.java:53) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter$FlushQueueToNextTask.flushOutQueue(FailOverFilter.java:1076) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter$FlushQueueToNextTask.flushQueueToNext(FailOverFilter.java:986) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter$FlushQueueToNextTask.run(FailOverFilter.java:1119) | |
| WARNING | -- [2015-06-22 11:31:43 EST] -- AbstractCollector::pushNext(): Error pushing (r)1433089107: group::SCOM2012-Collector_AUYNPX85_DW_21124_MSSQL$SQLEXPRESS : Database_INVOICES56TST2_Microsoft.SQLServer.Database:SEYGTZ49.ori.orica.net;MSSQLSERVER;INVOICES56TST2_DB Allocated Size (MB)({insname=INVOICES56TST2, rdefdesc=Collect database allocated size, dbhost=AUYNPX85, partname=INVOICES56TST2, mpdname=SQL Server 2008 (Monitoring), mpname=Microsoft.SQLServer.2008.Monitoring, mpdesc=Microsoft SQL Server 2008 Management Pack. This management pack will discover and monitor Microsoft SQL Server 2008., rrid=1114, fpname=Microsoft.SQLServer.Database:SEYGTZ49.ori.orica.net;MSSQLSERVER;INVOICES56TST2, unit=DB Allocated Size (MB), source=SCOM2012-Collector, parttype=MSSQL$SQLEXPRESS : Database, mprid=102, name=DB Allocated Size (MB), dpname=INVOICES56TST2, device=seygtz49.ori.orica.net, path=SEYGTZ49.ori.orica.net;MSSQLSERVER, devtype=Microsoft.Windows.Computer, datetime=2015-05-31 16:18:27.653, part=INVOICES56TST2, inrowid=6978})=581.9375 to Backend: com.watch4net.apg.v2.collector.PipeError: Can't write to auynpz07.ori.orica.net/168.252.83.47:2500! |
| INFO | -- [2015-06-22 11:31:43 EST] -- FailOverFilter$FlushQueueToNextTask::flushQueueToNext(): Pipe error while flushing queue to next. |
| INFO | -- [2015-06-22 11:32:43 EST] -- FailOverFilter::flushPushNextBufferToNext(): Flushed buffer: pushNextBufferToNext. |
| SEVERE | -- [2015-06-22 11:32:48 EST] -- SocketConnector::sendBuffer(): Can't write to auynpz07.ori.orica.net/168.252.83.47:2500 |
java.net.SocketTimeoutException
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.write(SocketConnector.java:462) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.sendBuffer(SocketConnector.java:383) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.commit(SocketConnector.java:367) | |
| at com.watch4net.apg.v2.collector.plugins.SocketConnector.pushData(SocketConnector.java:352) | |
| at com.watch4net.apg.v2.collector.AbstractCollector.pushNext(AbstractCollector.java:56) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter.sendValue(FailOverFilter.java:324) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter.access$1700(FailOverFilter.java:53) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter$FlushQueueToNextTask.flushOutQueue(FailOverFilter.java:1076) | |
| at com.watch4net.apg.v2.collector.plugins.FailOverFilter$FlushQueueToNextTask.run(FailOverFilter.java:1101) | |
| WARNING | -- [2015-06-22 11:32:48 EST] -- AbstractCollector::pushNext(): Error pushing (r)1433089113: group::SCOM2012-Collector_AUYNPX85_DW_26415_Web Service__Total_Microsoft.Windows.InternetInformationServices.2003.WebServer:auylvzu5.ori.orica.net_Bytes Total/sec({insname=_Total, rdefdesc=Microsoft.Windows.InternetInformationServices.2003.WebServer.WebServiceBytesTotalSec.Monitor.Collection, dbhost=AUYNPX85, mpdname=Windows Server Internet Information Services 2003, mpname=Microsoft.Windows.InternetInformationServices.2003, mpdesc=Microsoft Windows Server Internet Information Services 2003 Management Pack: This management pack discovers and monitors Windows Server Internet Information Services 2003., rrid=1606, fpname=Microsoft.Windows.InternetInformationServices.2003.WebServer:auylvzu5.ori.orica.net, unit=Bytes Total/sec, source=SCOM2012-Collector, parttype=Web Service, mprid=108, name=Bytes Total/sec, dpname=IIS Web Server, device=auylvzu5.ori.orica.net, path=auylvzu5.ori.orica.net, devtype=Microsoft.Windows.Computer, datetime=2015-05-31 16:18:33.0, part=IIS Web Server, inrowid=245})=0.0 to Backend: com.watch4net.apg.v2.collector.PipeError: Can't write to auynpz07.ori.orica.net/168.252.83.47:2500! |
| INFO | -- [2015-06-22 11:33:48 EST] -- FailOverFilter::flushPushNextBufferToNext(): Flushed buffer: pushNextBufferToNext. |
| WARNING | -- [2015-06-22 11:32:39 EST] -- SocketConnector::sendBuffer(): Can't write to auynpz07.ori.orica.net/168.252.83.47:2500. Retrying 1 times... |
| WARNING | -- [2015-06-22 11:32:49 EST] -- SocketConnector::sendBuffer(): Can't write to auynpz07.ori.orica.net/168.252.83.47:2500. Retrying 1 times... |
| INFO | -- [2015-06-22 11:33:02 EST] -- FailOverFilter$FlushQueueToNextTask::flushQueueToNext(): Deleting file D:\APG\Collecting\FailOver-Filter\Default\.\tmp-backend\inQDumpFile1434936542285.qdf after pushing 28618 values from it to the next component. |
| INFO | -- [2015-06-22 12:28:58 EST] -- SQLCollector$b::run(): Reseting RawValue property cache |
| INFO | -- [2015-06-22 12:28:58 EST] -- o::compute(): 'SCOM2012' collect starting... |
| SEVERE | -- [2015-06-22 12:29:00 EST] -- e::compute(): Error during the execution of child query 'common_properties' for 'main_collect' (connection group: null) |
java.lang.NullPointerException
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.i.a(SourceFile:126) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.i.a(SourceFile:139) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.e.compute(SourceFile:109) | |
| at com.watch4net.async6.EvaluatorTask.computeWithMDC(EvaluatorTask.java:176) | |
| at com.watch4net.async6.EvaluatorTask.run(EvaluatorTask.java:158) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.j.rejectedExecution(SourceFile:18) | |
| at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) | |
| at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372) | |
| at com.watch4net.async6.CompoundEvaluatorTask.nestTask(CompoundEvaluatorTask.java:80) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.p.compute(SourceFile:91) | |
| at com.watch4net.async6.EvaluatorTask.computeWithMDC(EvaluatorTask.java:176) | |
| at com.watch4net.async6.EvaluatorTask.run(EvaluatorTask.java:158) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.j.rejectedExecution(SourceFile:18) | |
| at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) | |
| at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372) | |
| at com.watch4net.async6.CompoundEvaluatorTask.nestTask(CompoundEvaluatorTask.java:80) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.d.a(SourceFile:323) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.d.a(SourceFile:295) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.d.a(SourceFile:191) | |
| at com.watch4net.apg.v2.collector.plugins.sqlcollector.d.compute(SourceFile:129) | |
| at com.watch4net.async6.EvaluatorTask.computeWithMDC(EvaluatorTask.java:176) | |
| at com.watch4net.async6.EvaluatorTask.run(EvaluatorTask.java:158) | |
| at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) | |
| at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) | |
| at java.lang.Thread.run(Thread.java:744) | |
| WARNING | -- [2015-06-22 12:29:07 EST] -- SocketConnector::sendBuffer(): Can't write to auynpz07.ori.orica.net/168.252.83.47:2500. Retrying 1 times... |
| INFO | -- [2015-06-22 12:30:08 EST] -- o::completed(): 'SCOM2012' collect done: 222,394 new raw values collected in 1 minute 9 seconds 860 ms using 437 SQL queries. |
| INFO | -- [2015-06-22 13:28:58 EST] -- SQLCollector$b::run(): Reseting RawValue property cache |
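To see how often the write failures in the log above actually occur, it can help to tally them per hour. The following is a minimal Python sketch and not part of Watch4net: the regular expressions are assumptions derived only from the log lines quoted in this post (including the stray `|` characters), and `summarize_write_failures` and `SAMPLE` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the excerpt above:
#   | LEVEL | -- [YYYY-MM-DD HH:MM:SS TZ] -- Class::method(): message |
LINE_RE = re.compile(
    r"\W*(?P<level>SEVERE|WARNING|INFO)\W+"
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \w+\]\W+"
    r"(?P<source>[\w$]+::[\w$]+)\(\):\s*(?P<msg>.*)"
)

# Matches the "Can't write to host/ip:port" text and captures the target.
WRITE_RE = re.compile(r"Can't write to (?P<target>\S+:\d+)")


def summarize_write_failures(lines):
    """Count SocketConnector write failures per (hour, level, target)."""
    counts = Counter()
    for line in lines:
        header = LINE_RE.match(line)
        if header is None:
            continue  # stack-trace lines and anything else without a log header
        if header.group("source") != "SocketConnector::sendBuffer":
            continue
        failure = WRITE_RE.search(header.group("msg"))
        if failure is None:
            continue
        hour = header.group("ts")[:13]  # "YYYY-MM-DD HH"
        counts[(hour, header.group("level"), failure.group("target"))] += 1
    return counts


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "| SEVERE | -- [2015-06-22 11:31:43 EST] -- SocketConnector::sendBuffer(): "
    "Can't write to auynpz07.ori.orica.net/168.252.83.47:2500 |",
    "| at com.watch4net.apg.v2.collector.plugins.SocketConnector.write"
    "(SocketConnector.java:462) | |",
    "| WARNING | -- [2015-06-22 11:32:39 EST] -- SocketConnector::sendBuffer(): "
    "Can't write to auynpz07.ori.orica.net/168.252.83.47:2500. Retrying 1 times... |",
    "| WARNING | -- [2015-06-22 12:29:07 EST] -- SocketConnector::sendBuffer(): "
    "Can't write to auynpz07.ori.orica.net/168.252.83.47:2500. Retrying 1 times... |",
]
```

Calling `summarize_write_failures(SAMPLE)` groups the three matching lines by hour and severity; for a real log, pass the open file object instead of `SAMPLE`. If the failures cluster around the same times as the long-running tasks, that would be consistent with the backend being busy, though the sketch itself does not prove a cause.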



PaulORourke
July 2nd, 2015 08:00
Hi DanteR,
Based on the information above, it appears that the EMC M&R (Watch4net) Backend is undersized for the environment.
When this point is reached, the Backend has difficulty both writing the collected data to the database and accepting it from the Collector-Manager in a timely manner. Temp files then begin to build up at /opt/APG/Backends/APG-Backend/Default/tmp/, which results in the errors above and affects the rest of the environment.
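To check whether temp files really are building up, one option is to sample the directory a few times and compare the file counts. This is a minimal Python sketch under stated assumptions, not an EMC M&R tool: `tmp_usage` and `is_building_up` are hypothetical helper names, and the directory is passed in as a parameter because the path quoted above is the Linux default, while the log in the original post shows a Windows install under `D:\APG`.

```python
import os


def tmp_usage(path):
    """Return (file_count, total_bytes) for regular files directly under path."""
    file_count = 0
    total_bytes = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Subdirectories and symlinks are ignored; only plain files count.
            if entry.is_file(follow_symlinks=False):
                file_count += 1
                total_bytes += entry.stat(follow_symlinks=False).st_size
    return file_count, total_bytes


def is_building_up(file_counts):
    """True if every sample has strictly more files than the one before it."""
    if len(file_counts) < 2:
        return False
    return all(
        later > earlier
        for earlier, later in zip(file_counts, file_counts[1:])
    )
```

For example, calling `tmp_usage` on the Backend temp directory once a minute and passing the collected file counts to `is_building_up` gives a rough signal. A strictly rising count over several samples suggests the Backend is not keeping up; a count that rises and falls is more likely normal buffering.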
Please open an SR with EMC M&R support to help find a resolution to this issue.
Kind Regards,
Paul O'Rourke
DanteR1
July 2nd, 2015 15:00
Thanks, Paul. I will get our support/contract details (Orica) and log an SR for this. Have a great day.