This blog was also published on Service Management 360 on 09-Jul-2014.
A few weeks ago I read a blog entry written by Vinay Rajagopal on Service Management 360 with the headline “Still configuring thresholds to detect IT problems? Don’t just detect, predict!” I wondered what this new big data approach would imply and what it means for my profession, which focuses on IT monitoring. Is IT monitoring old style now?
The IT service management discipline today is really a big data business. We have to take a lot of data into consideration if we want to understand the health of IT services. In today’s modern application architectures, with their multitier processing layers and the requirement that everything be available all the time at an acceptable performance level, IT management becomes a challenge that often ends in critical situations.
The “old” approach of monitoring a single resource or the response time of a single transaction no longer seems to be the way to succeed. However, IT monitoring is still essential, for multiple reasons:
IT monitoring helps to gather performance and availability data as well as log data from all involved systems.
This data can be used to learn the “normal” behavior. Understanding this normal behavior is essential to predicting upcoming situations and sending out alerts earlier.
The more data we gather from different sources, the better our prediction accuracy becomes.
With this early detection mechanism in place, fed by the many different data sources that IT monitoring provides, operations teams gain enough time before the real outage takes place to avoid it altogether.
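As a minimal sketch of the idea of learning “normal” behavior from monitoring data (not any specific IBM product; the metric values are hypothetical), a baseline can be built from historical samples and new samples flagged when they deviate strongly from it:

```python
# Sketch: learn the "normal" range of a monitored metric from history
# and flag new samples that deviate strongly from that baseline.
from statistics import mean, stdev

def is_anomalous(history, value, z_limit=3.0):
    """Return True if `value` deviates from the learned baseline by
    more than `z_limit` standard deviations."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit

# Hypothetical response-time samples (ms) gathered by IT monitoring:
baseline = [102, 98, 105, 100, 97, 103, 99, 101]
print(is_anomalous(baseline, 104))  # within the normal range -> False
print(is_anomalous(baseline, 180))  # sudden spike -> True
```

Real systems learn far richer baselines (seasonality, correlations across metrics), but the principle is the same: the more representative history you have, the earlier a genuine deviation stands out.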
IT monitoring can help to identify very slow-growing misbehavior.
Gathering large amounts of data does not guarantee that all misbehavior can be identified. If the response time of a transaction server system increases over a long period of time and all other monitored metrics evolve accordingly, an anomaly detection system will fail, because there are no anomalies. Growing workload is nothing unexpected, and since the growth takes place over a long period of time, only distinct thresholds will help. This is classical IT monitoring.
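The slow-drift case can be illustrated with a small sketch (the numbers and the 150 ms threshold are purely illustrative): a z-score detector compares each sample to a sliding window of recent history, so gradual growth always looks “normal”, while a distinct fixed threshold still catches it:

```python
# Sketch: slowly growing response time defeats windowed anomaly
# detection, while a classical fixed threshold still fires.
from statistics import mean, stdev

THRESHOLD_MS = 150.0  # classical monitoring: a distinct threshold
WINDOW = 20           # recent history the anomaly detector "learns" from

def check(samples):
    anomaly_alerts, threshold_alerts = 0, 0
    for i in range(WINDOW, len(samples)):
        window = samples[i - WINDOW:i]
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and abs(samples[i] - mu) / sigma > 3.0:
            anomaly_alerts += 1
        if samples[i] > THRESHOLD_MS:
            threshold_alerts += 1
    return anomaly_alerts, threshold_alerts

# Response time creeping up by 0.5 ms per day over a year:
drift = [100 + 0.5 * day for day in range(365)]
print(check(drift))  # -> (0, 264): no anomaly alerts, many threshold alerts
```

Each day’s value sits well within the spread of the previous twenty days, so the detector never fires; the threshold, by contrast, reports every day the metric exceeds the agreed limit.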
IT monitoring helps subject matter experts to understand their silos.
Yes, we should no longer think in silos, but good system performance still requires a solid understanding of the key performance metrics in the different disciplines, such as operating systems, databases and middleware layers. IT monitoring gives the experts the required detailed insight and enables the teams to carry out performance tuning as required.
So the conclusion is simple: monitoring is a prerequisite for successful predictive analysis. Without monitoring you won’t have the data required to make the necessary decisions, whether manually or automatically, as described for IBM SmartCloud Analytics – Predictive Insights.
Prediction based on big data approaches is a great enhancement to IT monitoring and enables IT operations teams to identify system anomalies much earlier and thus to respond in time.
IBM SmartCloud Application Performance Management offers a suite of products to cover most monitoring requirements and gather the required data for predictive analysis.
So what is your impression? Is monitoring yesterday’s discipline?
Follow me on Twitter @DetlefWolf, or drop me a discussion point below to continue the conversation.