The Secret Life of Incidents

By | August 26, 2015 in Service Desk

A service desk incident's journey

Everyone knows that incident management is designed to manage the overall life of an incident. Start to finish. Cradle to grave.

But what does that really mean?

The standard, accepted practice is that (most) incidents are started at the service desk – aka "Level 1 support". In fact, the service desk is generally synonymous with incident management ownership. The role of service desk employees includes ensuring timely and effective resolution.

Issues that cannot be fixed at Level 1 are escalated to specialty teams, also known as Level 2 (or 3) “resolver” groups. Resolver groups have deep technical experience in a specific technology area (network, servers, applications, etc.).


We’ve all heard horror stories of incident tickets that spend days, weeks, or months getting kicked from queue to queue. It’s a concept I call “bus therapy”. Basically, when a resolver group either can’t resolve a ticket or doesn’t know what to do with it, they assign it to some other random queue (before the SLA expires!), making it someone else’s problem.

The reason I call it bus therapy is because it’s like buying a one-way ticket to some other town, wishing it well, and sending it down the road, hopefully never to be seen again.

This happens when there’s no clear ownership of incidents. And when that happens, guess who suffers? You got it – the customer.

Who’s On First?

Incidents that are created, worked on, and resolved by the service desk are pretty straightforward. Ownership is clear, tracking resolution time is pretty simple, and, in the end (hopefully), the customer is happy.

No problem here.

But incidents worked on by the service desk that require deeper support for resolution – now that’s a different story. When that happens, the ticket is reassigned to the appropriate resolver group (network, servers, etc.).

But what if the resolver group determines it’s not their issue? Do they route the ticket on to another resolver group? Do they route it back to the service desk for rerouting?

What if the issue requires multiple resolver groups’ efforts? Who manages the ticket, or are there now multiple tickets?

Every handoff potentially adds delay, because the ticket waits in a queue until the new resolver group responds and begins working on it. The more handoffs, the more potential for delay.

It’s not that anyone is trying to give bad service to any particular incident. Everyone is busy, and tickets have a way of getting forgotten or lost in queues. Unassigned tickets can be particularly troublesome.

This is the shocking, secret life of escalated incidents:

  • Potential for unclear or multiple owners
  • Convoluted paths between resolver groups
  • Unclear, or circular, paths to resolution

It really comes down to one fundamental principle: who owns an incident at any given time?

There are really only two options:

  • The person currently working it, or
  • The incident owner, regardless of who’s working it

Two Roads Diverged

In the first option, incident ownership transfers with the ticket as it gets assigned to different people. This is the simplest, and perhaps most common, approach. The service desk analyst owns the incident while working it. If it gets transferred to a colleague for any reason, the ticket owner is simply changed, and off you go.

Where this model struggles a bit is when ownership is transferred to a resolver queue, rather than a person, and the ticket sits unassigned for a period of time. At that point, it has no identified owner. The service desk analyst no longer owns it, and it’s sitting in a queue waiting for someone to notice and take ownership of it.

If there are multiple handoffs, each new group’s response time is added to the overall resolution time.
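To make the arithmetic concrete, here is a minimal sketch of a ticket's journey under the transfer model. The `Assignment` record, its field names, and the sample hours are all hypothetical, not taken from any particular ITSM tool. Each stop contributes both its queue wait and its working time to the total, and the queue wait is also time during which the ticket has no identified owner.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    """One stop on a ticket's journey (hypothetical record layout)."""
    queue: str            # group queue the ticket was sent to
    picked_up_by: str     # person who eventually took ownership
    hours_waiting: float  # time in the queue before anyone picked it up
    hours_worked: float   # time spent with a named owner


def resolution_hours(history: list[Assignment]) -> float:
    """Total elapsed time: every stop adds its waiting and working time."""
    return sum(a.hours_waiting + a.hours_worked for a in history)


def unowned_hours(history: list[Assignment]) -> float:
    """Time during which the ticket had no identified owner."""
    return sum(a.hours_waiting for a in history)


# Illustrative numbers only.
history = [
    Assignment("service-desk", "analyst-1", 0.0, 0.5),
    Assignment("network", "engineer-7", 6.0, 1.0),
    Assignment("servers", "engineer-3", 9.0, 2.0),
]

print(resolution_hours(history))  # 18.5
print(unowned_hours(history))     # 15.0
```

In this made-up example, only 3.5 of the 18.5 hours are spent actually working the ticket; the rest is queue time added by the two handoffs.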

In the second option, ownership of the incident remains constant. It never transfers. The owner is generally the service desk analyst who, as you recall, has ownership of all incidents.

With this approach, when issues go to resolver groups, a child ticket (sub-ticket, task, etc.) is created, related to the original (parent) ticket, and the child ticket is routed to the resolver group. If multiple resolver groups are required, additional child tickets are created and related to the single parent ticket.

The service desk maintains ownership of the parent ticket, and keeps watch over the child ticket(s). This gives the service desk the ability to monitor the overall coordinated response to an incident. The service desk calls attention to child tickets that don’t appear to be making progress, and takes action to ensure timely resolution.

The service desk is then positioned to facilitate the overall response to complex and escalating incidents. The service desk uses child tickets to effectively manage the incident, and rally additional resources as needed.
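The parent/child arrangement described above can be sketched roughly as follows. The class names, the `escalate` and `stalled` helpers, and the eight-hour threshold are illustrative assumptions; real ITSM tools model this in their own ways. The point is that the parent's owner never changes, while each resolver group gets its own child ticket that the service desk can watch.

```python
from dataclasses import dataclass, field


@dataclass
class ChildTicket:
    """Work item routed to one resolver group (hypothetical model)."""
    resolver_group: str
    hours_since_update: float = 0.0
    resolved: bool = False


@dataclass
class ParentTicket:
    """The incident itself; its owner stays with the service desk."""
    summary: str
    owner: str
    children: list[ChildTicket] = field(default_factory=list)

    def escalate(self, resolver_group: str) -> ChildTicket:
        """Route work to a resolver group without transferring ownership."""
        child = ChildTicket(resolver_group)
        self.children.append(child)
        return child

    def stalled(self, max_idle_hours: float) -> list[ChildTicket]:
        """Open child tickets with no update for longer than the threshold."""
        return [
            c for c in self.children
            if not c.resolved and c.hours_since_update > max_idle_hours
        ]


incident = ParentTicket("Email is down for the sales team", owner="analyst-1")
network = incident.escalate("network")
servers = incident.escalate("servers")

# Simulate some time passing: servers finished, network went quiet.
network.hours_since_update = 10.0
servers.resolved = True

print([c.resolver_group for c in incident.stalled(max_idle_hours=8)])  # ['network']
print(incident.owner)  # analyst-1
```

Here the service desk analyst still owns the incident after two escalations, and a simple check surfaces the child ticket that needs a nudge.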

Which Is Better?

Neither, really. Or both, perhaps. I don’t want to go the “it depends” route, but, well, it kind of does depend.

In smaller organizations, transferring ownership can be clean and effective. In tight-knit groups, tickets are much less likely to get lost or bumped multiple times without being noticed.

But in a larger shop, especially where there are multiple, geographically dispersed resolver groups, and follow-the-sun support, it’s easier for tickets to get lost in the shuffle.

It’s worth noting that the service desk is designed to be a customer-service contact center. Its primary purpose is to ensure all incidents are resolved within service level targets. Having a handle on all incidents facilitates high levels of service. It also puts the service desk in a powerful position to correlate, detect, and trigger major incident response.

What to Do?

The bottom line is, either approach can be made to work perfectly well, depending on the circumstances. Personally, I prefer a service desk with clear ownership and end-to-end accountability for incident management. To me, this is the fulfillment of the “managing the lifecycle” objective of incident management.

But that’s a decision that each organization must make for itself based on its unique circumstances and challenges.

As with all processes and tools efforts, the focus should be on the process first. Don’t rely on the tool to define the process. You have to understand what works best for your organization, and why, before you begin configuring your ITSM tool.

All modern ITSM tools can be configured to effectively manage either of these approaches.

What approach have you seen to be most successful?

If you are struggling with service level performance and incident resolution times, and if you have too many tickets receiving bus therapy, you may want to revisit your incident model. Service desks that started small and grew with the business may never have made an intentional decision on incident ownership, and it’s worth revisiting.
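One rough way to see how many tickets are receiving bus therapy is to count handoffs per ticket in an assignment log. The sketch below assumes a hypothetical export with one (ticket ID, queue) row per assignment event, and an arbitrary threshold of two handoffs; both are placeholders to adapt to your own data and service levels.

```python
from collections import Counter

# Hypothetical export: one (ticket_id, queue) row per assignment event.
assignments = [
    ("INC-1", "service-desk"),
    ("INC-1", "network"),
    ("INC-2", "service-desk"),
    ("INC-2", "network"),
    ("INC-2", "servers"),
    ("INC-2", "network"),
    ("INC-2", "applications"),
]


def bus_therapy_candidates(rows, max_handoffs=2):
    """Return ticket IDs handed off more than max_handoffs times."""
    stops = Counter(ticket_id for ticket_id, _queue in rows)
    # The first assignment is not a handoff, so handoffs = stops - 1.
    return sorted(t for t, n in stops.items() if n - 1 > max_handoffs)


print(bus_therapy_candidates(assignments))  # ['INC-2']
```

A list like this is only a starting point for a conversation about ownership; it does not say which model is right for your organization.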

Greg Sanker

About Greg Sanker

Greg is an IT Service Management blogger, speaker, and practitioner with decades of global IT experience ranging from a Fortune 10 tech giant to the public sector. He lives in the Pacific Northwest (USA), where stunning natural beauty and high tech form a unique lifestyle. In his spare time, Greg hikes, bikes, and plays a bit of blues guitar. He blogs about Excellence in IT Service Management at ITSMTransition.com.
 
