Embedding Monitoring into Existing Organizations

To ensure the long-term success of an IT monitoring project, monitoring should be embedded into the existing organization, with clear roles and responsibilities for each stakeholder.

The slide below shows a sample organization with a typical matrix structure, as is often seen in enterprises.

There are four groups in focus regarding the monitoring deployment:

  • Users
    Most enterprises earn their money with this group of people. They expect a running application that performs well and helps them get their job done. They can be internal users, external users or both.
  • Business
    This group consists of the stakeholders of the business processes. They focus on getting things done in order to maximize their contribution to the company they work for.
  • IT Application
    These people design, code, test and deploy the required applications.
  • IT Infrastructure
    This group of people is responsible for providing and running a reliable IT infrastructure and a future-proof architecture, including internal and external cloud platforms.

These definitions are examples, seen at different customers over the last years. Other organizations may have other setups that work very well for them.

The point I want to highlight is how to embed monitoring into such an organization:

  • Support Level 1
    The business support organization answers users' questions about the applications and the processes they support. The focus is often on the product catalog, the order process itself, product configuration and so on. These questions are business driven rather than technical. Technical IT questions are rerouted to the IT support desk.
    The IT support desk is in charge of all questions about IT-related objects. This includes help with program usage, user login issues, performance issues and so on.
  • Support Level 2
    This function is covered by an operations center. This organization keeps IT services up and running and monitors the state of key resources, applications, the network and everything else that makes up a well-performing IT environment. It is in charge of initiating requests to level 3 when problems arise that it cannot handle by itself. With regard to monitoring, these people are the power users of the monitoring solution. All acquired data is used here to provide a comprehensive overview of the IT status and to support decisions on how to handle a given situation.
  • Support Level 3
    Level 3 consists of several teams. Developers and system programmers have to collaborate to solve serious problems that arise while the application is in operation.
    • System programmers are in charge of fixing problems with hardware and software packages and their configuration.
    • Developers have to deal with issues originating from self-developed application code.
    Today, these roles are often combined in so-called DevOps teams. These teams are responsible for performing deep-dive analysis in case of application errors, including log analysis, detailed performance measurement and threshold definition. They also have to refine monitoring thresholds continually to increase alarm accuracy. They install new monitoring tools and integrate them into the existing solution. DevOps teams always keep in mind that any new technology also requires a new monitoring review and potentially a new monitoring component.
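
The continual threshold tuning mentioned above could be sketched as follows. This is only a minimal illustration, not a recommended method: it assumes that a threshold is derived from a percentile of recent measurements plus some headroom, and the function names, the percentile, the headroom factor and the sample data are all hypothetical.

```python
def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(values)
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def tune_threshold(history, pct=95, headroom=1.2):
    """Derive an alarm threshold from historical measurements.

    Assumption: 'normal' behavior is everything up to the given percentile,
    and the headroom factor keeps small fluctuations from raising alarms.
    """
    return percentile(history, pct) * headroom


def alarms(samples, threshold):
    """Return the samples that would raise an alarm."""
    return [s for s in samples if s > threshold]


# Hypothetical response times in milliseconds, with one obvious outlier.
response_times_ms = [120, 130, 125, 140, 135, 128, 132, 138, 126, 400]
threshold = tune_threshold(response_times_ms, pct=90)
raised = alarms(response_times_ms, threshold)
```

Re-running such a calculation on fresh data at regular intervals is one simple way to keep thresholds aligned with how the application really behaves.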

Monitoring should be embedded into IT management processes, best described in ITIL. Incident management, problem management and change management are the disciplines in focus. For more details on monitoring and ITIL see the Process Symphony Knowledge Base.

As I described in my blog entry Finding the best Monitoring Solution, monitoring is a process rather than a status. It is a never-ending iteration of requirement review, software upgrade, solution design and execution. That is why it is so important to embed monitoring into the daily business of the IT organization.

Monitoring Focus Areas

A monitoring discussion without a product focus.

Discussions with customers show that the expectations of a monitoring system differ considerably across the departments involved. As mentioned in the blog entry Finding the best Monitoring Solution, it is essential to understand these requirements and to keep them in focus while moving on. The monitoring solution has to support a wide range of requirements.

So let us take a look at generalized requirements without thinking in products. The slide below shows four different work areas that often cause trouble in the daily business of IT departments.

  • Monitoring
    This is the core area of a monitoring system. It gathers performance and availability data about any resource.
  • Prediction
    Fed with time series data from the monitoring component and other sources (such as the event management system or log analysis), this system makes it possible to detect anomalies earlier and helps to differentiate between common behavior and unexpected processing.
  • Event Management
    Events are consolidated and correlated in one place. This reduces the complexity of SOS situations in which multiple components are involved and are monitored with different tools. This discipline is tremendously important, because most customers use specialized tools to monitor their IT infrastructure, their IT network, the user experience, the cloud components and resources, and so on.
  • Log Analysis
    Most IT issues are fixed by consulting the relevant logs. Logs scattered across many systems hamper this activity. It is very important to collect and consolidate these logs in one place so that these information sources can be searched, scanned, sorted and analyzed efficiently.
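
To make the event management idea from the list above more concrete, here is a small sketch of how events from different tools might be consolidated into incidents. It assumes that events affecting the same service within a short time window belong together. The Event fields, the tool names and the 60-second window are hypothetical and not taken from any specific product.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds, e.g. since the start of the observation
    source: str       # monitoring tool that raised the event (hypothetical names)
    service: str      # affected business service (assumed to be known per event)
    message: str


def correlate(events, window_seconds=60):
    """Group events per service into incidents.

    An event joins the latest incident of its service if it occurs within
    window_seconds of that incident's most recent event. Otherwise it
    opens a new incident.
    """
    incidents = []
    latest = {}  # service -> most recent incident (a list of events)
    for event in sorted(events, key=lambda e: e.timestamp):
        current = latest.get(event.service)
        if current and event.timestamp - current[-1].timestamp <= window_seconds:
            current.append(event)
        else:
            current = [event]
            incidents.append(current)
            latest[event.service] = current
    return incidents


events = [
    Event(0, "network-monitor", "webshop", "packet loss on uplink"),
    Event(30, "apm-tool", "webshop", "checkout response time high"),
    Event(45, "cloud-monitor", "billing", "queue depth rising"),
    Event(500, "ux-monitor", "webshop", "synthetic login failed"),
]
incidents = correlate(events)
```

In this example the two early webshop events from different tools end up in one incident, while the billing event and the much later webshop event each form their own. Real event management systems use far richer correlation rules, so this only illustrates the principle of one consolidated view.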

Monitoring is more than sensing the availability and performance of a single IT resource. It is essential to do so, but several other steps have to follow.
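
One of those follow-up steps is the prediction area described in the list above. As a rough sketch only, anomaly detection on a time series could look like the following. It uses a simple rolling z-score, which is a swapped-in textbook technique and not the method of any particular product. The window size, the limit and the sample data are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(series, window=5, z_limit=3.0):
    """Return the indexes of values that deviate strongly from recent history.

    Each value is compared with the mean and standard deviation of the
    preceding `window` values (a rolling z-score).
    """
    anomalies = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: no meaningful z-score, skip this point
        z_score = abs(series[i] - mean(baseline)) / sigma
        if z_score > z_limit:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization in percent, sampled at regular intervals.
cpu_percent = [50, 52, 51, 49, 50, 51, 95, 50, 52]
anomalous_indexes = find_anomalies(cpu_percent)
```

The design choice here is to judge each value only against its own recent history, so the notion of common behavior adapts as the workload changes.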

Most IT monitoring projects fail over time because their value for the business cannot be demonstrated in the long run. IT monitoring is not a one-shot project. It is more of a journey that never ends. It has to change as the IT landscape changes. That means it must be as flexible as possible to support the requirements of tomorrow. A dedicated responsibility should be defined that supports the monitoring requirements over time. Today, DevOps departments take over exactly this role, but it helps to have a common monitoring infrastructure on hand that is ready to be used and flexible enough to integrate new tools.