The Best Ultimate Essential Guide of ITIL Incident Management

ITIL Incident Management is part of IT Service Operations in IT Service Management (ITSM), and the Incident Manager is the process owner. Businesses regularly face technology-related issues and problems. The Incident Management process helps them resolve those issues quickly and restore service, so that business operations run smoothly and productivity recovers along with the IT services. This guide discusses the ITIL Incident Management process and how it can support the enterprise.

What is ITIL Incident Management?

In ITIL 3, an ‘incident’ is defined as an unplanned interruption to an IT Service or reduction in the quality of an IT Service. Failure of a configuration item (CI) that has not yet impacted service is also an incident, for example, failure of one disk from a mirror set.

Incident Management is the process for dealing with all incidents; this can include failures, questions, or queries reported by the users (usually via a telephone call to the Service Desk), by technical staff, or automatically detected and reported by event monitoring tools.

Table Of Contents

What is ITIL Incident Management?
What is the Purpose of ITIL Incident Management?
Incident Management Basic Concepts
What is the ITIL Major Incident Management Process?
What is the Objective of the Incident Management Process?
Scope of ITIL Incident Management
Incident Management Value to Business
Differences between Incident Management and Problem Management?
Why use Incident Management? / Benefits of ITIL Incident Management
What happens if Incident Management is not used?
Issues with deciding on an Incident Management Process
Who uses Incident Management?
How Incident Management works / Incident Management Process
Incident Management Process Flow & Life Cycle Diagram
ITIL Incident Management Roles and Functions in the Incident Process
Incident Management Post Implementation Review
How to manage Incident Management Statuses?
Measurements of Incident Management
ITIL 4 Incident Management Contribution to the Service Value Chain (SVC)
Incident Management Interfaces with Other ITIL Processes
ITIL Incident Management Metrics / KPIs
ITIL Incident Management Challenges, Critical Success Factors, and Risks
Conclusion

What is the Purpose of ITIL Incident Management?

ITIL 4 describes the purpose of the Incident Management practice as minimizing the negative impact of incidents by restoring normal service operation as quickly as possible. ITIL 4 defines an incident as an unplanned interruption to a service or a reduction in the quality of a service.

The process of dealing with all the incidents throughout the service lifecycle is known as Incident Management. The primary goal of the Incident Management process is to restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. ‘Normal service operation’ is defined here as service operation within Service Level Agreement (SLA) limits.

Incident Management restores service through workarounds or temporary fixes rather than by trying to find a permanent solution. It is a defined process for logging, recording, and resolving incidents.

Examples of Incidents

User Experienced Incidents

Application

Service not available (this could be due to either the network or the application, but at first the user will not be able to determine which)
Error message when trying to access the application
Application bug or query preventing the user from working
Disk space full
Technical Incident

Hardware

System down
Printer not printing
New hardware, such as scanner, printer, or digital camera, not working
Technical Incident

What are Technical Incidents?

Technical incidents can occur without the user being aware of them. There may be a slower response on the network or on individual workstations but, if this is a gradual decline, the user may not notice.

Technicians using diagnostics or proactive monitoring usually spot technical incidents. If a technical incident is not resolved, the impact can affect many users for a long time. In time, experienced users and the Service Desk will learn to spot these incidents before the impact affects most users.

Examples of Technical Incidents:

Disk space nearly full (this will affect users only when it is completely full)
Network card intermittent fault – sometimes it appears that the user cannot connect to the network, but on a second attempt the connection works. Replacing the card before it stops working completely provides more benefit to the users
Monitor flickering – it is more troublesome in some applications than others. Although the flicker may be easy to live with or ignore, the monitor will not usually last more than a few weeks in this state

Incident Management Basic Concepts

Timescales
- Depend on the priority defined.
- Should be documented in Operational Level Agreements and Underpinning Contracts
Incident Models
- Based on predefined steps to handle a particular incident
- A detailed description of the steps
Major Incident
- A break in service which threatens to cause or may cause a loss to the business
- A separate procedure including shorter timescales and greater urgency is used
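
The link between priority and timescales can be sketched in a few lines of Python. This is only an illustration: the impact and urgency levels, the priority matrix, and the target times below are assumed values, not figures prescribed by ITIL. In practice they would come from your own SLAs, OLAs, and underpinning contracts.

```python
from datetime import timedelta

# Hypothetical priority matrix: (impact, urgency) -> priority, where 1 is highest.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("medium", "high"): 2,
    ("high", "low"): 3,
    ("medium", "medium"): 3,
    ("low", "high"): 3,
    ("medium", "low"): 4,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}

# Hypothetical target resolution times per priority (assumed, not from ITIL).
TARGET_RESOLUTION = {
    1: timedelta(hours=1),
    2: timedelta(hours=8),
    3: timedelta(hours=24),
    4: timedelta(hours=48),
    5: timedelta(days=5),
}


def priority_for(impact: str, urgency: str) -> int:
    """Look up the priority for an impact/urgency pair (case-insensitive)."""
    return PRIORITY_MATRIX[(impact.lower(), urgency.lower())]


def target_resolution(impact: str, urgency: str) -> timedelta:
    """Return the agreed target resolution time for an incident."""
    return TARGET_RESOLUTION[priority_for(impact, urgency)]
```

Keeping the matrix and the targets in plain lookup tables makes it easy to review them with the business and to change them without touching the logic.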

What is the ITIL Major Incident Management Process?

A separate procedure, with shorter timescales and greater urgency, must be used for ‘major’ incidents. A definition of what constitutes a major incident must be agreed upon and ideally mapped onto the overall incident prioritization system, so that such incidents are dealt with through the major incident process. Define, in agreement with the business, the rule for which situations must follow the major incident process.

Handling Major Incidents within IT and Business
The starting trigger for the major incident process is setting the priority of an IT incident ticket to “CRITICAL”
Only key users defined by Management and IT are allowed to do this
Root Cause Analysis reports of all major incidents (MINCs) are reviewed and approved by the CIO / IT Management Team
Build Major Incident Process Teams
- IT Service Desk tool, such as ServiceNow: the incident management tool
- MINC Resolution Team (MRT) – Technical & Functional Teams to resolve the MINC
- MINC Manager (On-Call) – Leads the MINC and performs the Root Cause Analysis (RCA)
- MINC Management Team – Accepts or rejects the MINC and re-prioritizes it
- Management IT (MIT) – Accepts the RCA and drives continuous improvements
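
A minimal sketch of the trigger rule described above, assuming a simple in-memory ticket. The `Ticket` class, the `KEY_USERS` set, and the `set_priority` function are hypothetical names used for illustration. A real service desk tool would enforce this through its own roles and permissions.

```python
from dataclasses import dataclass

# Hypothetical list of key users allowed to raise a major incident.
KEY_USERS = {"alice", "bob"}


@dataclass
class Ticket:
    ticket_id: str
    priority: str = "NORMAL"
    major_incident: bool = False


def set_priority(ticket: Ticket, new_priority: str, requested_by: str) -> Ticket:
    """Change a ticket's priority; CRITICAL starts the major incident process."""
    if new_priority == "CRITICAL":
        # Only key users defined by Management and IT may set CRITICAL.
        if requested_by not in KEY_USERS:
            raise PermissionError(
                f"{requested_by} is not allowed to set priority CRITICAL"
            )
        ticket.major_incident = True
    ticket.priority = new_priority
    return ticket
```

Here a non-key user who tries to set CRITICAL gets a `PermissionError`, and the ticket is left unchanged.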

What is the Objective of the Incident Management Process?

The objectives of the ITIL Incident Management process are:

To ensure standardization of methods and procedures used for an efficient and prompt response.
To analyze, document, and report incidents during the management process.
To increase visibility and communication of incidents to business and IT support staff.
To align incident management activities and priorities with those of the business.

Scope of ITIL Incident Management

Managing any disruption or potential disruption to live IT services is the primary scope of incident management. It covers events that are:

Communicated directly by users through the Service Desk
Raised through an interface from Event Management to Incident Management tools
Reported or logged by technical staff

Incident Management Value to Business

The value of Incident Management includes:

The ability to detect and resolve incidents, which results in lower downtime to the business, which in turn means higher availability of the service. This means that the business is able to exploit the functionality of the service as designed.

The ability to align IT activity to real-time business priorities. This is because Incident Management includes the capability to identify business priorities and dynamically allocate resources as necessary.

The ability to identify potential improvements to services. This happens as a result of understanding what constitutes an incident and also from being in contact with the activities of business operational staff.

The Service Desk can, during its handling of incidents, identify additional service or training requirements found in IT or the business. Incident Management is highly visible to the business, and it is, therefore, easier to demonstrate its value than most areas in Service Operation. For this reason, Incident Management is often one of the first processes to be implemented in Service Management projects. The added benefit of doing this is that Incident Management can be used to highlight other areas that need attention – thereby providing a justification for expenditure on implementing other processes.

Differences between Incident Management and Problem Management?

Problem Management differs from Incident Management in that its main goal is the detection of the underlying causes of an incident and the best resolution and prevention. In many situations, the goals of problem management can be in direct conflict with the goals of incident management.

Deciding which approach to take requires careful consideration. A sensible approach is to restore the service as quickly as possible (Incident Management) while ensuring that all details are recorded. This will enable problem management to continue once a workaround has been implemented.

Discipline is required, as the idea that the incident is fixed is likely to prevail. However, the incident may well appear again if the resolution to the problem is not found.

Incident versus Problem

An incident is where an error occurs: something does not work the way it is expected.

This is often described as:

A fault
An error
It does not work!
A problem

But the ITIL term for this is an incident.

A problem is different and can be:

The occurrence of the same incident many times
An incident that affects many users
The result of network diagnostics revealing that some systems are not operating in the expected way

A problem can exist without having an immediate impact on the users, whereas incidents are usually more visible and the impact on the user is more immediate.

Why use Incident Management? / Benefits of ITIL Incident Management

There are major benefits to be gained by implementing an Incident Management Process:

Improved information to customers/users on aspects of service quality
Improved information on the reliability of equipment
Better staff confidence that a process exists to keep IT services working
The certainty that incidents logged will be addressed and not forgotten
Reduction of the impact of incidents on the business/organization
Resolving the Incident first rather than the problem, which will help in keeping the service available (but beware of too many quick fixes that problem management does not ultimately resolve)
Working with knowledge about the configuration and any changes made, which will enable you to identify the cause of incidents quickly
Improved monitoring and ability to interpret the reports, which will help to identify Incidents before they have an impact

What happens if Incident Management is not used?

Failing to implement Incident Management may result in:

No one managing and escalating incidents
Unnecessary severity of incidents and increased likelihood of impact on other areas (for instance, a full disk will prevent printing, saving work and copying files)
Technicians being asked to do routine tasks such as clear paper jams, repair a broken monitor that has merely had the power disconnected, or fix a disk error when a floppy disk was left in during reboot
Specialist support staff being subject to constant interruption, making them less effective
Your other staff being disrupted as people ask their colleagues for advice
Frequent reassessment of incidents from first principles rather than referring to existing solutions, such as the knowledge database
Lack of coordinated management information
Forgotten, incorrectly handled, or badly managed incidents

Issues with deciding on an Incident Management Process

Be prepared to overcome:

Absence of visible management or staff commitment, resulting in non-availability of resources for implementation
Lack of clarity about the business/organization’s needs
Out of date working practices
Poorly defined objectives, goals, and responsibilities
Absence of knowledge for resolving incidents
Inadequate staff training
Resistance to change

Who uses Incident Management?

Any organization that needs to understand its technical support requirements should start with implementing a service desk, closely followed by a defined Incident Management Process.

It will help to channel all incidents through a single point of contact (service desk) so that someone is responsible for following them through to a speedy resolution. Most organizations that rely on IT services need to know how their ICT systems/IT services are functioning, what is failing, and how long systems are unavailable.

The reports produced in the process of incident management focus on the performance of equipment, and not on the technical issues that created the incidents.

How Incident Management works / Incident Management Process

Incident management is about understanding the incident life cycle and the actions to take at each stage.

Incident process

Inputs to the incident process

Incident details logged at the service desk
Configuration details from the configuration management database
Output from problem management and known errors
Resolution details from other incidents
Responses to requests for change

Output from the incident process

Incident resolution and closure
Updated incident record and call log
Methods for workarounds
Communication with the user
Requests for change
Management information (reports)
Input to the problem management process

Activities of the Incident process

Incident detection and recording
Initial user support by the single point of contact (service desk)
Investigation and diagnosis
Resolution and recovery of service
Incident closure
Incident ownership, monitoring, and communication

Incident Management Implementation Requirements

Before identifying your needs, consider what you want to achieve
This is an opportunity to re-evaluate the way you have approached and fixed incidents to date. Rethink current processes and activities
Understand the difference between incident management and problem management
Technical staff will always try to solve the cause of a problem. Their way of thinking needs to change so that they approach it with incident management before problem management
Choose which areas to improve and which processes to remove
You need to sell the idea to the other staff, so make it appeal to yourself first

Incident Management Process Flow & Life Cycle Diagram

ITIL Incident Management Roles and Functions in the Incident Process

Service desk role in Incident Management

Service desk responsibilities include:

Logging the incident in the call log
Performing the initial Incident diagnostics
Requesting technician support when required
Owning, monitoring, and communicating
Updating records (call log, incident sheet) with the resolution
Closing incidents
Progressing any follow up action (for example, following through into problem management)

Technical support role in Incident Management

The technician’s role in incident management has the same focus – to restore the service as soon as possible. The technician will keep the service desk informed at all stages.

Other Incident Management Roles

Additional first-line support groups, such as configuration management or change management specialists, should be consulted.

Second- and third-line support groups, including specialist support groups and external suppliers, should be consulted as necessary.

Users should keep the service desk informed of any further changes to the state of the affected equipment (sometimes computers start working again when different incidents are resolved).

Prepare to Implement Incident Management

Implement the service desk first
Decide how incident management will interface with the service desk
Decide who will take on the responsibility of incident management
Make sure that management commitment, budget, and resources are made available before you consider setting up incident management
Ensure that the proposed solution aligns with your business/organization’s strategy and vision
Define clear objectives and deliverables
Involve and consult IT staff
Sell the benefits to the support staff – implementing incident management will need a change of behaviour from IT staff as well as users
Plan the incident management process training
Service desk training is the priority
Incident management training – who, when
Decide what to measure and report
Before making changes, you must understand the levels of service you are currently providing with the current resources available
Produce a report on the number of calls currently logged, the time taken to resolve them and the time the equipment is unavailable – this is your baseline
Set targets for a manageable number of objectives for the effectiveness of incident management
Decide what incident management reports are required
Ensure that the incident management process is regularly reviewed

Incident Management Post Implementation Review

It is the users’ perception rather than availability statistics or transaction rates that, in the end, defines whether the service is meeting their needs.

User satisfaction Analysis and Surveys

Satisfaction surveys are an excellent method of monitoring user perception and expectation and can be used as a powerful marketing tool. However, to ensure success you should address several key points:

Decide on the scope of the survey
Decide on the target audience
Clearly define the questions
Make the survey easy to complete
Conduct the survey regularly
Make sure that your users understand the benefits
Publish the results
Follow through on survey results
Translate survey results into actions

How to manage Incident Management Statuses?

Incident statuses mirror the incident process and are as follows:

New
Assigned
In Progress
On hold (pending or awaiting user information)
Resolved
Closed

The New status indicates that the service desk has received the incident but has not assigned it to a group or technician. Response time is an important metric for this status.

The Assigned status means that the incident has been assigned to an individual technician or a group on the service desk.

The In-Progress status indicates that an incident has been assigned to a group or technician but has not been resolved. The technician is actively working with the user to diagnose and resolve the incident.

The On-hold status indicates that the incident is waiting for information or a response from the user or a third party. The incident is placed “on hold” so that SLA deadlines are not exceeded while waiting for a response from the user, vendor, or others.

The Resolved status means that the service desk has confirmed that the incident is resolved and that the user’s service has been restored to SLA levels. At this stage, users can re-open the incident within a defined timeline, before it is closed, if the issue is not actually resolved.

The Closed status indicates that the incident is resolved and that no further actions can be taken.
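
The status descriptions above can be modelled as a small state machine. The sketch below is one possible reading, not a definitive workflow: the set of allowed transitions (for example, that On hold can only return to In Progress, or that a reopened incident goes back to In Progress) is an assumption, and the names `Status`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    IN_PROGRESS = "In Progress"
    ON_HOLD = "On hold"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transitions; adjust to match your own incident process.
ALLOWED_TRANSITIONS = {
    Status.NEW: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.ON_HOLD, Status.RESOLVED},
    Status.ON_HOLD: {Status.IN_PROGRESS},
    # A resolved incident can be reopened by the user or closed.
    Status.RESOLVED: {Status.IN_PROGRESS, Status.CLOSED},
    # No further action can be taken on a closed incident.
    Status.CLOSED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target
```

Making the transitions explicit means an invalid move, such as reopening a Closed incident, fails loudly instead of silently corrupting the record.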

Incident management follows incidents through the service desk to track trends in incident categories and time in each status. The final component of incident management is the evaluation of the data gathered. Incident data guides organizations to make decisions that improve the quality of service delivered and decrease the overall volume of incidents reported. Incident management is just one process in the service operation framework.
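
Tracking the time spent in each status can be sketched as follows, assuming each incident keeps an ordered history of status changes. The `time_in_status` function and the shape of the history (a list of timestamp and status pairs) are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_in_status(history, end_time):
    """Sum the time spent in each status.

    history: list of (timestamp, status) pairs, ordered by timestamp.
    end_time: when the last status ended (for example, the closure time).
    """
    totals = defaultdict(timedelta)
    # Pair each status change with the next one to get the interval length.
    next_changes = history[1:] + [(end_time, None)]
    for (start, status), (end, _) in zip(history, next_changes):
        totals[status] += end - start
    return dict(totals)
```

A usage example with made-up timestamps:

```python
history = [
    (datetime(2024, 1, 1, 9, 0), "New"),
    (datetime(2024, 1, 1, 9, 30), "In Progress"),
    (datetime(2024, 1, 1, 10, 0), "On hold"),
    (datetime(2024, 1, 1, 12, 0), "In Progress"),
]
durations = time_in_status(history, datetime(2024, 1, 1, 13, 0))
# durations["On hold"] is two hours; that time could be excluded from SLA reporting.
```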

Measurements of Incident Management

Do not set targets that cannot be measured
Ensure that users are aware of what you are doing, and why
Establish a baseline before discussing formal Service Level Agreements (SLAs) with customers
Maintain measurements of what is necessary and viable. For instance, if your staff think that they need feedback on response times, then measure them

ITIL 4 Incident Management Contribution to the Service Value Chain (SVC)

Incident management contributes to the service value chain mainly through the engage and the deliver and support value chain activities. Except for plan, the other activities may use information about incidents to help set priorities:

Improve: Incident records are a key input to improvement activities and are prioritized both in terms of incident frequency and severity.

Engage: Incidents are visible to users, and significant incidents are also visible to customers. Good incident management requires regular communication to understand the issues, set expectations, provide status updates, and agree that the issue has been resolved so the incident can be closed.

Design and transition: Incidents may occur in test environments, as well as during service release and deployment. The practice ensures these incidents are resolved in a timely and controlled manner.

Obtain/build: Incidents may occur in development environments. Incident management practice ensures these incidents are resolved in a timely and controlled manner.

Deliver and support: Incident management makes a significant contribution to support. This value chain activity includes resolving incidents and problems.

Incident Management Interfaces with Other ITIL Processes

The interfaces with Incident Management include:

Problem Management: Incident Management forms part of the overall process of dealing with problems in the organization. Incidents are often caused by underlying problems, which must be solved to prevent the incident from recurring. Incident Management provides a point where these are reported.

Configuration Management: This provides the data used to identify and progress incidents. One of the uses of the CMS is to identify faulty equipment and to assess the impact of an incident. It is also used to identify the users affected by potential problems. The CMS also contains information about which categories of incident should be assigned to which support group. In turn, Incident Management can maintain the status of faulty CIs. It can also assist Configuration Management to audit the infrastructure when working to resolve an incident.

Change Management: Where a change is required to implement a workaround or resolution, this will need to be logged as an RFC and progressed through Change Management. In turn, Incident Management is able to detect and resolve incidents that arise from failed changes.

Capacity Management: Incident Management provides a trigger for performance monitoring where there appears to be a performance problem. Capacity Management may develop workarounds for incidents.

Availability Management: This will use Incident Management data to determine the availability of IT services and look at where the incident lifecycle can be improved.

Service Level Management (SLM): The ability to resolve incidents within a specified time is a key part of delivering an agreed level of service. Incident Management enables SLM to define measurable responses to service disruptions. It also provides reports that enable SLM to review SLAs objectively and regularly. In particular, Incident Management is able to assist in defining where services are at their weakest so that SLM can define actions as part of the Service Improvement Plan (SIP); see the Continual Service Improvement publication for more details. SLM defines the acceptable levels of service within which Incident Management works, including:

Incident response times
Impact definitions
Target fix times
Service definitions, which are mapped to users
Rules for requesting services
Expectations for providing feedback to users.

ITIL Incident Management Metrics / KPIs

The metrics that should be monitored and reported upon to judge the efficiency and effectiveness of the Incident Management process, and its operation, will include:

Total numbers of Incidents (as a control measure)
Breakdown of incidents at each stage (e.g. logged, work in progress, closed, etc.)
Size of current incident backlog
Number and percentage of major incidents
Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code
Percentage of incidents handled within agreed response time (incident response-time targets may be specified in SLAs, for example, by impact and urgency codes)
The average cost per incident
Number of incidents reopened, and as a percentage of the total
Number and percentage of incidents incorrectly assigned
Number and percentage of incidents incorrectly categorized
Percentage of incidents closed by the Service Desk without reference to other levels of support (often referred to as ‘first point of contact’)
Number and percentage of incidents processed per Service Desk agent
Number and percentage of incidents resolved remotely, without the need for a visit
Number of incidents handled by each Incident Model
Breakdown of incidents by time of day, to help pinpoint peaks and ensure matching of resources.
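
A few of the metrics listed above can be computed from exported incident records with a short script. This is a sketch under stated assumptions: the `IncidentRecord` fields and the `kpis` function are hypothetical and would need to be mapped onto whatever your service desk tool actually exports.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class IncidentRecord:
    resolution_hours: float          # elapsed time to resolution
    responded_within_target: bool    # met the agreed response time
    reopened: bool                   # reopened after being resolved
    closed_by_service_desk: bool     # closed without other support levels


def percentage(part: int, total: int) -> float:
    """Return part as a percentage of total, or 0.0 when total is zero."""
    return 100.0 * part / total if total else 0.0


def kpis(records: list) -> dict:
    """Compute a small set of incident management KPIs."""
    total = len(records)
    return {
        "total_incidents": total,
        "mean_resolution_hours": (
            mean([r.resolution_hours for r in records]) if records else 0.0
        ),
        "pct_within_response_target": percentage(
            sum(r.responded_within_target for r in records), total
        ),
        "pct_reopened": percentage(sum(r.reopened for r in records), total),
        "pct_first_point_of_contact": percentage(
            sum(r.closed_by_service_desk for r in records), total
        ),
    }
```

The same function can be run on records from before and after a process change to compare the two periods on equal terms.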

Incident Management Reports

There should already be reports produced by the service desk on the number of incidents logged each week. Expand on the information in those reports to decide whether your new approach to incident management is effective. For example:

In addition to recording the number of incidents logged each week, compare the numbers to incidents logged prior to implementing incident management
Show the average length of time taken to resolve incidents before and after implementing incident management
Where possible, show the types of incident reported
Show the percentage of incidents handled within the agreed response time
Show the percentage of incidents closed by the service desk without the need for contacting technical support
Show the number and percentage of incidents resolved remotely, without the need for a visit

Use reports to summarize performance in non-technical language and to show where improvements could be made. Often the improvements require expenditure, so having reports to back up your suggestions can prove invaluable.

Reports should be produced under the authority of the Incident Manager, who should draw up a schedule and distribution list, in collaboration with the Service Desk and support groups handling incidents. Distribution lists should at least include IT Services Management and specialist support groups. Consider also making the data available to users and customers, for example via SLA reports.

ITIL Incident Management Challenges, Critical Success Factors, and Risks

Challenges of Incident management

The following challenges will exist for successful Incident Management:

The ability to detect incidents as early as possible. This will require education of the users reporting incidents, the use of Super Users, and the configuration of Event Management tools.
Convincing all staff (technical teams as well as users) that all incidents must be logged, and encouraging the use of self-help web-based capabilities (which can speed up assistance and reduce resource requirements).
Availability of information about problems and Known Errors. This will enable Incident Management staff to learn from previous incidents and also to track the status of resolutions.
Integration into the CMS to determine relationships between CIs and to refer to the history of CIs when performing first-line support.
Integration into the SLM process. This will help Incident Management to assess the impact and priority of incidents correctly and to define and execute escalation procedures. SLM will also benefit from the information learned during Incident Management, for example in determining whether service level performance targets are realistic and achievable.

Critical Success Factors (CSF) of Incident Management

The following factors will be critical for successful Incident Management:

A good Service Desk is key to successful Incident Management
Clearly defined targets to work to – as defined in SLAs
Adequate customer-oriented and technically trained support staff with the correct skill levels, at all stages of the process
Integrated support tools to drive and control the process
OLAs and UCs that are capable of influencing and shaping the correct behavior of all support staff.

Risks of Incident Management

The risks to successful Incident Management are actually similar to some of the challenges and the reverse of some of the Critical Success Factors mentioned above. They include:

Being inundated with incidents that cannot be handled within acceptable timescales due to a lack of available or properly trained resources
Incidents being bogged down and not progressed as intended because of inadequate support tools to raise alerts and prompt progress
Lack of adequate and/or timely information sources because of inadequate tools or lack of integration
Mismatches in objectives or actions because of poorly aligned or non-existent OLAs and/or UCs.

Conclusion

ITIL Incident Management plays a major role in IT Service Operations by delivering and supporting IT services for the business. The process helps the business resolve its ongoing issues and problems and so increase productivity and efficiency. The support and delivery models of IT Service Management (ITSM) depend on a high-quality Incident Management process.

Finally, I would like to ask you: what would you share from your own experience to improve this process and add more value for businesses and enterprises?

About The Author

Manoj Kumar
