Incident Management vs. Problem Management – Why it’s Critical You Understand the Difference

Posted on June 29, 2016


In the following article, Jarod Greene, VP of Service Management Strategy at Cherwell Software and former Gartner IT service management (ITSM) industry analyst, discusses some of the key differences between Incident and Problem Management, and why distinguishing between the two ITIL processes is important. With more than 12 years of ITSM industry experience, Jarod understands the market from the vendor, end-user, customer, and analyst perspectives. His proficiency in IT service support management processes, organizational structures, and technology is sought after for speaking engagements, customer consultations, and product development.

Despite the establishment of ITIL as the de facto best practice IT management framework almost 10 years ago, there is still a good deal of confusion. What’s causing the uncertainty? It’s not the nomenclature (it’s pretty well documented and straightforward), but rather the application of the guidance. Nowhere is this confusion more apparent than in discussions about the key differences between Incident and Problem Management. In this article, I will break down these differences and help you walk away with a clear understanding of both processes.

What Is Incident Management?

First, let’s look at Incident Management. The goal of Incident Management is to restore service operations as quickly as possible and minimize the adverse impact of a service outage or degradation on the organization. The activities associated with Incident Management primarily deal with recording the details of the incident, classifying the incident, investigating the incident, and ultimately resolving the incident.

Whether an IT organization aligns to ITIL or not, there is almost always a role or function responsible for the management of incidents, whether it’s a group of two or a group of 200. The objectives and key performance indicators are relatively straightforward: resolve issues as quickly as possible while staying conscious of the priority and cost of the resolution and of the users’ level of satisfaction throughout the process. Typical metrics include First Contact Resolution, Cost Per Contact, and Customer Satisfaction.
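
As a rough illustration of how these three metrics might be computed from ticket data, here is a minimal Python sketch. The `Ticket` fields, the `incident_kpis` function, and the exact formulas are assumptions made for this example; organizations and ITSM tools define these metrics in different ways.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ticket:
    # Hypothetical fields; a real ITSM tool will name and store these differently.
    resolved_on_first_contact: bool
    contacts: int                 # contacts needed to resolve the incident
    satisfaction: Optional[int]   # survey score from 1 to 5, None if unanswered


def incident_kpis(tickets: List[Ticket], total_support_cost: float) -> dict:
    """Compute simple versions of three common Incident Management metrics."""
    if not tickets:
        raise ValueError("at least one ticket is required")
    total_contacts = sum(t.contacts for t in tickets)
    if total_contacts == 0:
        raise ValueError("tickets must record at least one contact")
    first_contact = sum(1 for t in tickets if t.resolved_on_first_contact)
    scores = [t.satisfaction for t in tickets if t.satisfaction is not None]
    return {
        # Share of incidents resolved during the first contact.
        "first_contact_resolution": first_contact / len(tickets),
        # Total support spend divided by the number of contacts handled.
        "cost_per_contact": total_support_cost / total_contacts,
        # Mean survey score, counting only users who answered the survey.
        "customer_satisfaction": sum(scores) / len(scores) if scores else None,
    }
```

Calling `incident_kpis` once per reporting period with that period's tickets and support cost gives a trend line for each metric, which is usually more informative than any single value.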

(For more complete guidance on ITIL Incident Management, see an Essential Guide to ITIL Incident Management.)

I recently had a bout with back pain. While frustrating, my experience helps illustrate how Incident Management, when performed well, functions more like a well-run doctor’s office than a “take two of these and call me in the morning” approach. On my first visit to the orthopedist, I was required to fill out forms to provide context about my overall health and to describe my symptoms clearly. My doctor used that information, in addition to an x-ray, to diagnose the issue and prescribe a treatment plan.

This is how IT organizations apply Incident Management.

In order for Incident Management to be effective, it's important to understand the following requirements:

  • Continuous development of problem and error control
  • A tiered support structure, where the team understands Tier 1 and 2 escalations
  • A Continual Service Improvement program that measures efficiency and effectiveness through KPIs aligned to organizational goals and objectives
  • Clear and documented roles and responsibilities within IT in terms of desired outcomes

Furthermore, IT must have robust Incident Management software at its disposal that includes: 

  • Integration of the IT service desk software and the IT asset management repository, which provides IT support with a user context of the assets the user has and services the user leverages, negating the need to fill out forms
  • A knowledge base provided within the ITSM tool that helps spread, scale, and standardize symptomology, which provides IT support the means to better classify, investigate, and diagnose issues
  • The view of an IT service map provided by the ITSM solution’s configuration management database (CMDB), which helps understand issues at the service level and better isolate troublesome configuration items that impact availability and performance
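
To make the service map idea concrete, the following Python sketch ranks the configuration items behind one service by how many incidents reference them. The `service_map` layout, the `ci` field on each incident, and the `troublesome_items` function are all hypothetical; a real CMDB exposes this information through its own data model.

```python
from collections import Counter


def troublesome_items(service_map, incidents, service, top=3):
    """Rank a service's configuration items by the number of linked incidents.

    service_map: dict mapping a service name to a list of CI names (assumed layout)
    incidents:   list of dicts, each with a "ci" key naming the affected CI
    """
    items = set(service_map.get(service, []))
    counts = Counter(i["ci"] for i in incidents if i["ci"] in items)
    return counts.most_common(top)


service_map = {
    "email": ["mail-server-01", "smtp-gw", "dns-01"],
    "crm": ["crm-app", "dns-01"],
}
incidents = [
    {"id": 1, "ci": "smtp-gw"},
    {"id": 2, "ci": "smtp-gw"},
    {"id": 3, "ci": "dns-01"},
    {"id": 4, "ci": "crm-app"},
]
print(troublesome_items(service_map, incidents, "email"))
```

A configuration item that keeps appearing at the top of this list is a natural candidate to hand over to Problem Management.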

It is clear that integrating change, asset, and knowledge management adds value to the Incident Management process, and therefore to the organization. So why then do we see such a major drop-off when it comes to the Problem Management process? According to the most recent HDI Practices and Salary Report (2015), only 44 percent of IT organizations had adopted the Problem Management process, and only 22 percent of those organizations had a dedicated problem manager!

I believe the low adoption rate of Problem Management can most often be attributed to a lack of understanding of why Problem Management is important to the organization, which affects the alignment of roles and responsibilities associated with the process. There also tends to be an over-reliance on technology, which does an outstanding job of creating problem records and assigning ownership, but can’t by itself encourage individuals to determine the root cause, identify workarounds, and recommend resolution approaches.

And therein lies the problem with Problem Management.

What Is Problem Management?

The goal of Problem Management is to minimize the adverse impact of incidents and problems on the organization caused by errors in the infrastructure, and to prevent the recurrence of incidents related to those errors. The activities associated with Problem Management primarily deal with identifying why the incident occurred in the first place, and identifying and documenting known errors. Unlike Incident Management, Problem Management often has no dedicated role or function responsible for it, and its objectives and key performance indicators are rarely well understood.

(For more complete guidance on ITIL Problem Management, see an Essential Guide to ITIL Problem Management.)

Let’s go back to my back issue to understand how Problem Management, when performed well, functions like treatment. While my doctor provided some immediate relief, he mentioned that if the treatment plan wasn’t working and I continued to experience issues and pain, we might be dealing with something more significant that an MRI and further analysis would be able to determine.

Note that this doesn’t negate the doctor’s initial work. He couldn’t provide an immediate resolution, but he did provide a workaround (medication and exercise, while limiting travel) that he had identified and documented, having seen issues like mine in the past. He didn’t recommend surgery on my first visit. Instead, we agreed on an initial treatment plan, understanding that surgery is neither cost-effective nor appropriate until the root cause is accurately determined.

To build a successful Problem Management process, IT must first determine why Problem Management is important to the organization, and assign the roles and resources accordingly. At a minimum, IT leaders must apply the same amount of rigor as they do with Incident Management.

A Problem Management leader must ensure that:

  • Problems and errors are regularly (and properly) classified and identified 
  • Workarounds are documented and communicated to the Incident Management function
  • The Problem Management process has well-defined and relevant KPIs
  • Roles and responsibilities are clear and documented in terms of desired outcomes

IT must also ensure it has the proper enabling ITSM solution (or more targeted Problem Management software) that performs the following functions:

  • Cross-reference the details of the incident against both the knowledge base and the known-error database, making it easier to link incident records to problem records
  • Make it easy to assign ownership of problem records to individuals or functional groups
  • Make it easy to quickly promote problems to requests for change (RFCs), complete with all the necessary context and documentation
  • Provide a full and rich dashboard that intuitively organizes critical problem management metrics into a single panel
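
The first function in the list above, cross-referencing an incident against the known-error database, can be sketched in a few lines of Python. This is a deliberately naive keyword-overlap match, not how any particular ITSM product works; the record layout, the `KE-` identifiers, and the `match_known_errors` function are assumptions made for illustration.

```python
import re


def match_known_errors(incident_text, known_errors, min_overlap=2):
    """Return (known_error_id, overlap) pairs, best match first.

    known_errors: list of dicts with an "id" and a set of lowercase
    "symptoms" keywords (hypothetical record layout).
    """
    # Split the free-text description into lowercase alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", incident_text.lower()))
    matches = []
    for known_error in known_errors:
        overlap = len(words & known_error["symptoms"])
        if overlap >= min_overlap:
            matches.append((known_error["id"], overlap))
    return sorted(matches, key=lambda match: match[1], reverse=True)


known_errors = [
    {"id": "KE-1", "symptoms": {"vpn", "timeout", "disconnect"}},
    {"id": "KE-2", "symptoms": {"printer", "offline", "queue"}},
]
print(match_known_errors("VPN timeout after laptop resume, repeated disconnect", known_errors))
```

Each match gives the service desk a documented workaround to try first and a problem record to link the incident to. A production tool would likely use richer matching (categories, affected configuration items, full-text search) than this keyword overlap.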

Understanding the difference between Incident and Problem Management is merely the first step. The doctor’s office analogy is one of many that can help you understand that Incident Management deals with resolving the issue as quickly as possible, while Problem Management deals with why the issue occurred and seeks to either eliminate the root cause or build an effective workaround.
