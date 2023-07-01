Today’s IT and software infrastructures are more complex than ever before. Many different cogs interlock, teams deliver iterative changes faster and more frequently than ever, and monolithic products have given way to microservice architectures. It is therefore inevitable that problems will arise more frequently, because complexity always brings surprises and unforeseen events.

Unfortunately, IT incidents such as disruptions or outages are therefore unavoidable in any organization. Companies have to face this reality: the question is not whether an incident will occur, but when it will happen and how serious it will be.

Of course, this doesn’t change the fact that IT incidents and the associated downtime are not only expensive for the company; they also cost reputation and customer trust. Modern ITSM teams have therefore established systematic methods for handling incidents and their underlying problems effectively and efficiently. One of the approaches in the toolbox provided by the ITIL framework is problem management.

The Purpose of Problem Management

Systematic problem management, as part of comprehensive IT service management, aims to establish standardized procedures for analyzing incidents and IT processes in order to prevent similar incidents in the future and to eliminate potential sources of risk. Specifically, it is about identifying the root cause of an incident, understanding it, and finding the best way to eliminate it once and for all.

Through this practice, ITSM teams seek to prevent recurring incidents and to minimize the impact of incidents that cannot be prevented. But isn’t that already the task of incident management?

The Difference Between Incident Management and Problem Management

What are the differences between incident and problem management? Both practices revolve around incidents and disruptions, and they overlap in various ways. But essentially they operate at different levels. Incident management is the fire brigade; it starts directly with the incident.

The first priority is to fix the incident quickly, to limit its scope as much as possible and to fully restore the affected service. This is followed by analysis and review, for example in the form of post-mortem documents. And that’s where problem management comes in, because here it’s all about the deeper causes of an incident, i.e. the analysis and solution of the actual problem.

The superficial cause of an incident is usually identified quickly: often a trivial setting, a configuration error or a faulty commit turns out to be the obvious culprit. But an incident is only rarely due to a single, strictly isolated reason, and the cause recognized first was often just the straw that broke the camel’s back. The incident may be resolved for now, but the problem persists.

It is the task of problem management to get to the bottom of these deeper causes and the factors that favor them, and to eliminate them. Can a similar incident occur again? What factors encourage it? These are the central questions that problem management should answer.

Is there a process?

In the current iteration (version 4), the ITIL framework no longer prescribes a strictly defined process. Rather, ITSM teams should adopt the practice in a form that suits their specific services, frameworks, systems and tools. However, experience has shown that teams do well when they combine reactive and proactive elements in their problem management.

The approach described above applies to reactive problem management: an incident has occurred or a potential challenge or weakness has been identified. An in-depth analysis is now necessary, which should lead to the implementation of a solution that is as permanent as possible. In this way, the team wants to ensure that similar incidents are avoided in the future or that no actual incident arises from the identified danger.

Proactive problem management, on the other hand, needs no external trigger: the team acts on its own initiative to find and eliminate potential risks before incidents can arise from them. This approach is to be understood as an ongoing measure. For example, it may involve periodic analysis of incident records, logs and data from other ITSM processes to identify patterns and anomalies that have the potential to develop into major challenges.
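To make the idea of periodically analyzing incident records more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Incident` record, its fields, the function `recurring_patterns`, the 90-day window and the threshold of three occurrences are all hypothetical and do not come from ITIL or from any particular tool. Real incident data would typically be exported from the ITSM system in use.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; the field names are illustrative only."""
    opened: date
    service: str
    category: str


def recurring_patterns(incidents, today, window_days=90, min_count=3):
    """Return ((service, category), count) pairs that occurred at least
    min_count times within the last window_days, most frequent first."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        (i.service, i.category) for i in incidents if i.opened >= cutoff
    )
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


# Made-up sample data, purely for demonstration.
sample = [
    Incident(date(2023, 6, 1), "checkout", "config-error"),
    Incident(date(2023, 6, 10), "checkout", "config-error"),
    Incident(date(2023, 6, 20), "checkout", "config-error"),
    Incident(date(2023, 6, 21), "search", "timeout"),
    Incident(date(2023, 1, 5), "search", "timeout"),  # outside the window
]

print(recurring_patterns(sample, today=date(2023, 7, 1)))
# prints [(('checkout', 'config-error'), 3)]
```

A simple count like this will not find root causes by itself, but it can point the team to the combinations of service and incident type that deserve a deeper problem analysis.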

Atlassian Tools in Incident and Problem Management

Of course, none of this is possible without supporting software that not only offers powerful features for incident processing, workflows, documentation and collaboration, but can also be tailored to the ITSM team’s specific processes.

The Atlassian tool suite meets these needs. Jira Service Management, in conjunction with Statuspage and Opsgenie, provides flexible tools for methodical incident management, while Confluence is the knowledge management tool for jointly creating and sharing post-mortem reports, documenting root cause analyses and making the team’s ongoing proactive analysis activities centrally available.

Would you like to learn more about how Jira Service Management and Confluence support ITSM processes and help teams become more effective and efficient? Then get in touch with us: our experts would be happy to discuss your company’s use cases and requirements for modern IT service management with you!

PS: You can also find lots of tips and information on the subject in our whitepaper “How does IT service management work?”, which you can download here.

