3 Ways To Accelerate Incident Management

Incident management is one of the most widely adopted ITIL practices, with close to 100% of small-to-large organizations doing it in some form or another.

But the shape and maturity of incident management practices vary greatly from one firm to another. In most organizations there’s still a lot of room for doing incident management better, faster, and cheaper.

Today we’re going to look at some of the angles you need to think about to accelerate incident management and reduce those overflowing ticket queues. The good news is that there are many opportunities for improvement, so you can choose which one is the best starting point for you. Here we look at three things you can do:

Defined Processes
Good Incident Data
Self-Service IT Portal

1. Defined Processes

The way work happens matters. According to W. Edwards Deming, a pioneer of statistical quality control:

“95% of variation in the performance of a system is caused by the system itself; only 5% is caused by the people.”

What does that mean? A room full of well-trained experts will still struggle to deal with a flood of IT issues if work happens on an ad-hoc basis, without any structure. Results are haphazard and unpredictable. Variations in the way people work lead to variations in the quality of output. With no system of work in place to enable consistency, people tend to avoid the service desk and go direct to their favourite techy, someone they know will fix it for them. That works until the techy burns out, because the service desk’s work is no longer distributed evenly.

In ad-hoc environments, success depends on the individual agent. Each agent has their own approach. Some work better than others. Some are more organized and experienced than others. If you don’t have a defined incident management process, people will create their own way of working, or import a process they used at a previous organization. Either way, the lack of standardisation makes it very difficult to measure/evaluate how work gets done and thus improve it.

Sitting down with a single agent to review how they work and give pointers isn’t scalable—it has a limited effect (only on the incidents they work on). Applying a standard process is scalable because improvement is applied across the board. When you have standard processes in place, an improvement means performance on every incident improves.

However, it is important to note that change happens—and when change happens, rigid processes often break. Some flexibility is required to handle edge cases for which a rigid process won’t work. Agents need to be able to adapt processes on-the-fly to fit the current (possibly new) context.

The trick here is this: know the rules before you break them. That means starting out with rules, making sure everyone knows and follows the process(es), and then applying training and collaboration to your service desk so that agents can make decisions on how to adapt processes for exceptional cases. It’s also important to have an ITSM system of record that records what happened so that it is repeatable: today’s exception may become tomorrow’s normal.

Have a process. Put effort into achieving universal process adherence. That’s your foundation for making further improvements. To quote Deming once again:

“If you can’t describe what you are doing as a process, you don’t know what you’re doing.”

Service downtime means lost productivity and lost revenue, so the service desk is no place for headless chickens.

2. Good Incident Data

Data isn’t the most glamorous of topics, but its importance cannot be overstated. One of the most common failure modes for incident management is when an agent opens the next incident in the queue only to find that the record doesn’t contain all the information they need to find and fix the issue, or push it to the right domain expert. It’s stalled. They need to call the customer to fill in the blanks, which consumes time for both of them and delays resolution. Clean flow of data is critical to accelerated incident management.

Calling the customer for more information prevents clean flow—because the process has to loop back a step. Clean flow means the process only moves forward; there is no step-back to a previous stage (AKA the dreaded rework). This requires both a well-designed process, and the availability of the right supporting data.

So how can you solve this?

This is where service desk tools become critical. For each incident category, an agent will need a slightly different set of data points to solve it. Most large organizations have a complex infrastructure spanning a datacenter (or datacenters), cloud platforms, hundreds of applications and thousands of devices—distributed across the globe. The impact is that there may be hundreds of different categories of incident. That means potentially hundreds of different sets of information that must be collected.

Without a service desk/ITSM system of record in place which tells an agent what information they need to log an issue properly, collecting the right data relies on that agent knowing precisely what they need to ask for—every time. That’s too much to realistically expect in complex environments. Mistakes will inevitably be made—and each one of those mistakes will cost time later when the agent or a downstream technician has to stop what they’re doing and hunt for data.

The best way to solve this is to know what information you need to collect at the start. That means mapping incident types to custom logging forms which guide the agent through the information they need to collect—no more, no less. Over-collecting information puts a burden on the service consumer and increases call times. Under-collecting causes the process to stall for a lack of data “fuel” that is needed to drive it.

Using custom category forms to guide the collection of the correct data prevents process loopbacks. Processes don’t stall for a lack of data. Agents receive and hand over a complete set of data in the first instance, saving time for agents, technical support, and service users (who no longer need to be interrupted to help fill in the blanks).
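As a rough illustration of category-specific logging forms, the sketch below maps each incident category to the fields an agent must collect and reports which ones are still missing. The category names, field names, and the missing_fields helper are all hypothetical; a real ITSM tool would define its forms in its own configuration.

```python
# Hypothetical mapping of incident categories to the data points an agent
# must collect. Category and field names are illustrative assumptions.
REQUIRED_FIELDS = {
    "email": ["user_id", "mail_client", "error_message"],
    "vpn": ["user_id", "device_os", "location", "error_message"],
    "hcm_application": ["user_id", "module", "steps_to_reproduce"],
}


def missing_fields(category, record):
    """Return the required fields that are absent or blank for this category."""
    required = REQUIRED_FIELDS.get(category)
    if required is None:
        raise KeyError(f"No logging form defined for category: {category}")
    return [name for name in required if not str(record.get(name, "")).strip()]


# An incomplete VPN incident: location is blank and error_message was never asked for.
record = {"user_id": "jdoe", "device_os": "Windows 11", "location": ""}
print(missing_fields("vpn", record))  # ['location', 'error_message']
```

Running a check like this before the record is saved means the gap is caught while the customer is still on the line, rather than downstream when a technician opens the ticket.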

Data and process go together. Don’t limit your problem management practice to looking at infrastructure issues alone. Resolving process and data issues is important too. Look for the outliers that suggest process/data issues. If some types of incident are taking much longer than others, that could indicate a process flow issue where the information that agents need is missing and they have to manually collect information before they can proceed. If you can spot and resolve messy data situations, you can accelerate processes, cut the time it takes to get services back online, and create improvement momentum. Bad data creates waste.
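One simple way to look for those outliers is to compare each category’s typical resolution time with the typical time across all incidents. The sketch below is a minimal example under assumed inputs: the sample data, the slow_categories helper, and the factor of 2 are illustrative choices, not a recommended threshold.

```python
from collections import defaultdict
from statistics import median


def slow_categories(incidents, factor=2.0):
    """Flag categories whose median resolution time exceeds factor x the overall median.

    incidents: list of (category, hours_to_resolve) tuples.
    Returns a dict of {category: median_hours} for the flagged categories.
    """
    incidents = list(incidents)
    by_category = defaultdict(list)
    for category, hours in incidents:
        by_category[category].append(hours)

    overall = median([hours for _, hours in incidents])
    return {
        category: median(values)
        for category, values in by_category.items()
        if median(values) > factor * overall
    }


# Made-up sample data, in hours to resolution.
incidents = [
    ("password_reset", 0.5), ("password_reset", 1.0),
    ("email", 2.0), ("email", 3.0),
    ("vpn", 9.0), ("vpn", 11.0),
]
print(slow_categories(incidents))  # {'vpn': 10.0}
```

A flagged category is only a prompt to investigate. The delay may come from missing data at logging time, but it could equally reflect genuinely harder work.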

Think about what data you capture and how it flows from person to person and team to team as they go about getting the customer back online and solving underlying infrastructure issues. A single data point, missing at the beginning, can cause stress and delays downstream in the process. When data anomalies happen, trace them back to the root cause so you can pinpoint and eliminate what’s slowing down your incident management process.

3. Self-Service IT Portal

Self-service, offering service consumers a digital alternative to calling the service desk, cuts wait time to near-zero. It also takes the task of logging the call off the service desk’s plate. The first reduces the time the service consumer spends waiting in a call queue. The second creates bandwidth for the service desk so that agents can work on and close more issues, more quickly. One is transformative in terms of the customer experience. The other is transformative in terms of workload.

Picture this: an HR person calls the service desk to report an issue with their HCM system. After a couple of minutes they get through to an agent. They explain the situation. The agent records all the details. It takes five minutes—start to finish.

What if they could visit a portal to log the issue, or even find the answer themselves? What if they bypassed the service desk call queue altogether? With a self-service portal, service consumers can log and/or solve their own problems through a web portal. Digital channels like this are scalable and instant. There’s no waiting.

Now think about what our HR scenario looks like when you have an IT portal in place: the HR person logs in to your portal and is guided through the logging process in sixty seconds. That’s four minutes saved, and one fewer call to the service desk. Scale this up across all your end users and the service desk can expect a sizeable reduction in tickets, and more free time to pursue improvement activities. When compared to manual work and one-to-one human interaction, digital portals are truly transformative.
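To get a feel for how the four-minute saving scales, here is a back-of-the-envelope calculation. The five-minute call and sixty-second portal times come from the scenario above; the volume of 1,000 incidents a month moved to the portal is purely an assumed figure for illustration.

```python
def minutes_saved(incidents_via_portal, phone_minutes=5, portal_minutes=1):
    """Total logging time saved when incidents are logged via the portal instead of by phone."""
    return incidents_via_portal * (phone_minutes - portal_minutes)


# Assumption: 1,000 incidents per month are logged through the portal.
saved = minutes_saved(1000)
print(saved)       # 4000 minutes
print(saved / 60)  # roughly 66.7 hours per month
```

This counts only the logging time from the scenario. It ignores time spent waiting in the call queue and any incidents users resolve themselves, so the real effect could be larger.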

There’s one key point which can’t be stressed enough: search is critical to the success of a self-service portal. Findability is key. If users can’t find what they need, fast, they will revert to calling the service desk (meaning the portal has actually delayed the response and increased end user effort). Service customers are looking for the path of least resistance. If your self-service portal provides a frictionless, stress-free experience, people will use it. If it’s easier and faster to call your service desk, that’s what they will do.

Conclusion

Where you start on your improvement journey is up to you, depending on your organizational needs and constraints. For example, if you’re struggling to get budget right now, buying a new ITSM tool might not be an option. In that case, focus on the “free” improvements like improving processes, or opt for a cloud ITSM solution which is charged monthly out of your operational budget, so you don’t have to ask your CFO to authorise a large capital purchase.

Perhaps you have a self-service portal, but people just aren’t using it. If that’s the case, you need to work on driving adoption. That means two things: Firstly, telling people about it. Market your portal to your end user community as a way to log (and fix) issues without waiting in a call queue. Secondly, you need to tweak your portal’s user experience to remove friction and make it easier to use. Constantly. Both are essential to driving adoption…and driving adoption is essential to getting a return on investment.
