Is IT monitoring out of style?

This blog was also published on Service Management 360 on 09-Jul-2014.

A few weeks ago I read a blog entry written by Vinay Rajagopal on Service Management 360 with the headline “Still configuring thresholds to detect IT problems? Don’t just detect, predict!” I was wondering what that new big data approach will imply and what it means to my profession focusing on IT monitoring. Is IT monitoring old style now?

The IT service management discipline today is really a big data business. We have to take a lot of data into consideration if we want to understand the health of IT services. Modern application architectures have multitier processing layers and must be available all the time with acceptable performance, so IT management becomes a challenge that often ends in critical situations.

The “old” approach of monitoring a single resource or the response time of a single transaction no longer seems to be enough to succeed. However, IT monitoring is still essential, for multiple reasons:

  1. IT monitoring helps to gather performance and availability data as well as log data from all involved systems.

    This data may be used to understand and learn the “normal” behavior. Understanding this “normal behavior” is essential to predict upcoming situations and to send out alerts earlier.

    The more data we gather from different sources, the better our prediction accuracy gets.

    With this early detection mechanism in place, fed by IT monitoring from many different data sources, operations teams can gain enough time before the real outage takes place to avoid it.

     

  2. IT monitoring can help to identify very slow-growing misbehavior.

    Gathering large amounts of data does not guarantee that all misbehavior can be identified. If the response time of a transaction server increases over a long period of time and all other monitored metrics evolve accordingly, an anomaly detection system will fail, because there are no anomalies. A growing workload is nothing unexpected, and when the growth takes place over a long period of time, only distinct thresholds will help. This is classical IT monitoring.

  3. IT monitoring helps subject matter experts to understand their silos.

    Yes, we should no longer think in silos, but for good system performance it is essential to have a good understanding of key performance metrics in the different disciplines, like operating systems, databases and middleware layers. IT monitoring gives the experts the required detailed insight and enables the teams to adjust performance tasks as required.
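The second point can be illustrated with a minimal Python sketch. It assumes two made-up response-time series and a deliberately simple detector that compares each value with the mean and standard deviation of the preceding window. The function names, the window size, the factor of 3 and the 500 ms limit are illustrative assumptions and are not taken from any product.

```python
from statistics import mean, stdev

def rolling_anomalies(values, window=20, k=3.0):
    """Return indices whose value lies more than k standard deviations
    away from the mean of the preceding `window` values."""
    hits = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            hits.append(i)
    return hits

def threshold_breaches(values, limit):
    """Return indices whose value exceeds a fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]

# Hypothetical response times in ms: slow, steady growth of 1 ms per sample.
slow_drift = [200 + i for i in range(400)]

# Hypothetical response times in ms: stable pattern with one sudden spike.
sudden_spike = [200 + (i % 5) for i in range(60)]
sudden_spike[40] = 400

print(rolling_anomalies(sudden_spike))         # the spike at index 40 is flagged
print(rolling_anomalies(slow_drift))           # nothing is flagged for the drift
print(threshold_breaches(slow_drift, 500)[0])  # first sample above 500 ms
```

In this toy setup the rolling detector catches the sudden spike but never reacts to the slow drift, because each new value stays close to its recent history. Only the fixed threshold reports that the response time has eventually crossed 500 ms.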

So the conclusion is simple: monitoring is a kind of prerequisite for doing successful predictive analysis. Without monitoring you won’t have the required data to make the required decisions, whether manually or automatically, as described with IBM SmartCloud Analytics – Predictive Insights.

Prediction based on big data approaches is a great enhancement for IT monitoring. It enables IT operations teams to identify system anomalies much earlier and thus to respond in time.

IBM SmartCloud Application Performance Management offers a suite of products to cover most monitoring requirements and gather the required data for predictive analysis.

So what is your impression? Is monitoring yesterday’s discipline?

Follow me on Twitter @DetlefWolf, or drop me a discussion point below to continue the conversation.

IT Monitoring: Necessary, or just “nice to have”?

This blog was first published on Service Management 360 on 30-Oct-2013.

http://www.servicemanagement360.com/2013/10/30/it-monitoring-necessary-or-just-nice-to-have/

Why do I so often see poorly managed system environments in small and medium businesses?

Well, there are multiple reasons for that:

  • They are not aware that they deeply depend on reliable IT services.

  • IT is not their core business.

  • They don’t know what to take care of, and how to do it.

Monitoring is often seen as a nice-to-have discipline as long as IT service outages do not cause any business-relevant losses. This is often achieved by minimizing the dependencies on IT services even if this “backup scenario” is inefficient and cost intensive.

I’ve seen businesses that print every incoming email. The reason for that? Well, the email system might go down, which would prevent the company from getting any work done. Having every communication outside the IT systems protects them in the event of an outage, or so they believe. The vision of the paper-free office – forget about it!

Why don’t they introduce a monitoring solution to achieve more reliable IT services? Let’s do a quick review of the main reasons:

  • Complexity
    Monitoring requires a complex infrastructure, which has to be maintained. It requires deep knowledge of the monitoring mechanisms and the monitored applications, and the achieved results have to be frequently reviewed and amended.

  • Lack of knowledge
    What should be monitored? Which components should be under investigation? What thresholds should be set? All of these questions are absolutely valid and require time-consuming answers. And there are no “correct” answers. But there is experience in the market, and it is ready to be used.

  • Time consumption
    Small IT departments are not able to dedicate a team of people to perform professional IT monitoring. There are lots of things to do in such small operation centers, and watching a monitor wall all day is completely unrealistic. And to be honest, it is not a full-time job, because the number of systems is not large enough.

  • Monitoring is too expensive
    Well, cost is really the killer in every discussion. The ramp-up costs are too high. The products are too expensive. The implementation effort at the beginning is unaffordable.

But having said all that, is it now time to leave the room and tell my customers: “Yes, you are right, IT monitoring is for large businesses only”?

I think we have to address the above inhibitors very carefully and come up with alternatives for small and medium businesses. Our customers’ businesses are too valuable, and unreliable IT services should not be allowed to put them at risk.

Monitoring as a Service

As mentioned in the blog post linked above, I see the Monitoring as a Service delivery model as the best choice to cover all these challenges. But what should it look like?

First of all, we have to keep all the complexity out of sight of the service requester (that is, from you as the customer). The monitoring should just happen. The customer requests monitoring for its business services (emailing, customer portal and so on) and the service provider has to deal with it. The service provider has to have the knowledge, not the business owner.

Second, the service provider has to take care of the customer’s systems outside business hours, to prevent system outages during normal operations, when the systems are used heavily. This has often been seen in the past as one of the main inhibitors for small businesses. Even when the systems are not actively used, they are still doing background work, such as database reorganizations, data backups and other administration tasks.

And now the third aspect, the investment. Customers expect to have managed availability and performance with low ramp-up costs and a quick initial implementation phase. This is only possible with service providers who have real solutions available that focus on their customers’ needs and the services these customers use. That means that industry-specific solutions are required to cover the different markets.

Below I’ll present an example scenario to illustrate these ideas.

Sample from the car manufacturing industry

A supplier in the car manufacturing industry is tightly connected to its main customer and shares IT services with it. To connect to the principal customer’s systems, this supplier has to have its own IT systems. Additionally, other specialized IT systems are on premise to serve the daily business processes, including:

  • Accounting (for example, SAP)

  • Billing (for example, SAP)

  • Emailing (for example, Lotus Notes)

  • Phone systems (for example, Ericsson)

  • IT network (for example, Cisco)

The supplier’s IT systems are essential to deliver all required parts to its principal customer in the required just-in-time chain. Any failure in this process might lead to significant penalties for the supplier. The Monitoring as a Service provider should have artifacts available to quickly monitor the health and availability of these infrastructure components and the IT services implemented on top of them:

  • Infrastructure monitoring

    • SAP monitoring

    • Lotus Notes monitoring

    • Network monitoring

  • Service monitoring

    • Accounting

    • Billing

    • Phone line availability

    • Applications on the principal customer’s systems

All these monitors require specialized, industry-specific know-how but are similar for all suppliers in this industry. The solutions could be provided in monitoring packages, including technical resource monitoring, process monitoring and availability tracking. Additionally, these packages should include reporting features for reviewing the achieved results.
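As a rough illustration of what such a reusable package could look like, here is a minimal Python sketch that describes the sample above as a data structure. The class name, field names and helper function are hypothetical and only mirror the lists in this post. They do not represent the format of any real monitoring product.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MonitoringPackage:
    """Hypothetical description of an industry-specific monitoring package."""
    industry: str
    infrastructure_monitors: List[str] = field(default_factory=list)
    service_monitors: List[str] = field(default_factory=list)
    includes_reporting: bool = True  # reporting for reviewing achieved results

def all_monitors(package: MonitoringPackage) -> List[str]:
    """Return every monitor the package would deploy at a customer site."""
    return package.infrastructure_monitors + package.service_monitors

# The car manufacturing sample, reusable for every supplier in the industry.
car_supplier_package = MonitoringPackage(
    industry="car manufacturing supplier",
    infrastructure_monitors=["SAP", "Lotus Notes", "Network"],
    service_monitors=[
        "Accounting",
        "Billing",
        "Phone line availability",
        "Applications on the principal customer's systems",
    ],
)

for monitor in all_monitors(car_supplier_package):
    print(monitor)
```

The point of the sketch is that the package is defined once by the service provider and then applied unchanged to each new customer in the same industry.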

Conclusion

By ordering Monitoring as a Service, small and medium businesses might overcome today’s existing inhibitors for implementing a strong control of their IT systems. With SmartCloud Application Performance Management the required products are there. Within IBM’s business partner organization and service providers, the infrastructure to deliver these services is also available. It is now your turn to act. What stops you from doing so?

Is IT monitoring for large businesses only?

This blog was first published on Service Management 360 on 13-Aug-2013.

http://www.servicemanagement360.com/2013/08/13/is-it-monitoring-for-large-businesses-only/

Could you imagine any large company leaving their IT systems unmanaged? I can’t.

These major companies have dedicated departments in place, responsible for monitoring IT system availability and performance. They are in charge of detecting issues before the business processes are affected. They also have to measure the fulfillment of service level agreements and deliver input for capacity planning on IT resources. Often they have to provide metrics to bill delivered services to the different business units.

Repeatable processes are in place to deliver these services:

  1. Sense
    First you must sense the user experience or the service quality (like sending email). If problems are detected, go to step 2.

  2. Isolate
    In multitier application environments, isolating the source of the problem is the key factor for a quick problem resolution. By having all resources under control and having transaction tracking mechanisms in place, you can quickly identify the failing resource.

  3. Diagnose
    After you have identified the bottleneck, you need to perform a detailed diagnosis. You must investigate the system and its applications and draw the right conclusions.

  4. Take action
    With the results from the previous step, you can then take the required actions.

  5. Evaluate
    By doing a re-evaluation you go back to step 1 and make sure that the alerted situation no longer exists and the applied action was successful.
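The five steps above can be sketched as a simple control loop. This is only a minimal Python illustration under stated assumptions: the `run_cycle` function, its callback parameters, the round limit and the toy mail-queue example are all hypothetical and stand in for whatever tooling actually performs each step.

```python
from typing import Callable

def run_cycle(sense: Callable[[], bool],
              isolate: Callable[[], str],
              diagnose: Callable[[str], str],
              act: Callable[[str, str], None],
              max_rounds: int = 3) -> int:
    """Repeat isolate -> diagnose -> act until sense() reports a healthy
    service. Returns the number of corrective rounds that were needed."""
    rounds = 0
    while not sense():                  # step 1, and step 5 on later passes
        if rounds == max_rounds:
            raise RuntimeError("service still unhealthy after corrective actions")
        resource = isolate()            # step 2: find the failing resource
        cause = diagnose(resource)      # step 3: detailed diagnosis
        act(resource, cause)            # step 4: apply the required action
        rounds += 1
    return rounds

# Toy example: an email service with a hypothetical queue backlog.
state = {"mail_queue": 5000}
actions_taken = []

def mail_is_healthy() -> bool:
    return state["mail_queue"] < 1000

def find_failing_resource() -> str:
    return "mail-server"

def find_cause(resource: str) -> str:
    return "queue backlog"

def clear_backlog(resource: str, cause: str) -> None:
    actions_taken.append((resource, cause))
    state["mail_queue"] = 0

rounds_needed = run_cycle(mail_is_healthy, find_failing_resource,
                          find_cause, clear_backlog)
print(rounds_needed, actions_taken)
```

Re-running the sense check after every action is what turns the list into a closed loop: the cycle only ends when the alerted situation no longer exists.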

The frequent measurement of key metrics is also used to collect data for historical analysis. Service-level measurement and capacity planning become actionable with this data.

And what about small and medium businesses?

Here I often see very limited attention given to system monitoring. Small and medium businesses often perform only a kind of system availability monitoring with a very limited scope. In some cases, awareness of this IT discipline doesn’t exist at all.

But what does it mean if IT systems are not available? Today, most kinds of businesses rely on IT services somehow. Any production facility, medical service, office, modern garage and so on is almost incapacitated when the IT systems are down. That means that staff can’t perform their core business roles, can’t earn money, can’t provide the services to their customers. This leads to a massive loss of revenue and reputation.

Additionally, a lot of these small and medium businesses are suppliers in a just-in-time supply chain for the large businesses (for example, a car manufacturer) and penalties apply if the delivery and production process is interrupted.

So the need for enterprise class monitoring systems exists. But why don’t they do it?

  1. It is too expensive!

  2. It is too complex!

  3. It is too time consuming!

These are the three reasons I hear most often. And all of them are partially true. Monitoring is a highly specialized discipline for subject matter experts. It requires detailed understanding of the monitored systems, applications and services as well as knowledge of the monitoring product used.

The purchase of an enterprise-class monitoring system might require a huge amount of money and a considerable education effort. It also requires a kind of sustainability that small IT departments can’t dedicate to monitoring. Monitoring requires repeated reviews to enhance quality, but it is not possible to keep two or three people focused on monitoring questions, because the workload for this discipline is not high enough in a small IT department and they are responsible for a lot of other things. As a consequence, the skill level for this discipline declines and the results no longer justify the investment in enterprise-class monitoring.

So what now?

Is the answer to my initial question “yes”? No, it isn’t. A new delivery model is required. Enterprise-class monitoring is needed in all businesses relying on stable IT services.

The answer might be a Monitoring as a Service model. A trusted service provider could deliver such a monitoring service and overcome the above inhibitors. Because the provider delivers this service to multiple clients, it can lower the ramp-up costs for the software purchase, offer the required sustainability and bring in the expertise for monitoring systems, applications and services.

In my blog series “Monitoring as a Service” (see parts 1, 2, 3 and 4) I described a business model for using IBM monitoring solutions to set up such a service.

IBM SmartCloud Application Performance Management offers a suite of products to cover the above described five-step monitoring process, including reporting features for historical reviews and projections to the future.

So what is your impression? Are we covering the right markets? How could we improve? Please share your thoughts below.