ITIL V3 Fundamentals, Processes and Service Delivery Concepts

INCIDENT MANAGEMENT- Part One

Here we are with one of the most important topics in IT Operations Management. Incident Management is the component most visible to the business.

Let us first define an Incident.

An Incident can be defined as an unplanned disruption to a Service or a reduction in the quality of a Service. The failure of a Configuration Item that has not yet impacted the service is also categorized as an Incident. The process of dealing with Incidents is Incident Management.

The Incident Management process aims to:

Restore normal service as fast as possible.
Minimize the negative impact on business operations.

The lines above use a few important terms that need to be understood.

What is a Configuration Item

A Configuration Item (CI) is a component of a system that is treated as a self-contained unit for the purposes of identification and change management. A CI may be a primitive system building block (e.g. a code module) or an aggregate of other CIs. For example, a PC may be designated as a single CI, but in a support environment that requires more control, individual parts of the PC such as the monitor and the HDD may each be designated as CIs.
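The primitive-or-aggregate idea can be sketched as a small data structure. This is only an illustration: the class name `ConfigurationItem`, its fields, and the identifiers such as `PC-001` are assumptions made for the example, not part of ITIL.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConfigurationItem:
    """A CI is either a primitive building block or an aggregate of other CIs."""
    name: str
    components: List["ConfigurationItem"] = field(default_factory=list)

    def is_primitive(self) -> bool:
        # A CI with no nested components is a primitive building block.
        return not self.components

    def flatten(self) -> List[str]:
        """Return the names of this CI and of every CI nested below it."""
        names = [self.name]
        for component in self.components:
            names.extend(component.flatten())
        return names


# Coarse-grained: the whole PC is tracked as one CI.
pc_simple = ConfigurationItem("PC-001")

# Fine-grained: the same PC, with its parts tracked as separate CIs
# (hypothetical identifiers).
pc_detailed = ConfigurationItem(
    "PC-001",
    components=[
        ConfigurationItem("PC-001/monitor"),
        ConfigurationItem("PC-001/HDD"),
    ],
)
```

How far to break a CI down is a design choice: the finer the breakdown, the more control you get, and the more records you have to maintain.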

What is Normal Service

Normal Service Operation may be defined as Service Operation that is within the limits of the Service Level Agreement (SLA).

Incident Management should be designed so that it offers the following value to the business:

Ability to detect and resolve Incidents, resulting in higher availability and lower downtime.
Ability to align IT activities with business priorities, so that those priorities are identified and resources are allocated accordingly.
Ability to identify potential service improvements.
Ability to identify other requirements, such as additional services or training, by analyzing Incidents.

Important points to consider while designing Incident Management:

Know Your Services

List all the services for which a disruption, or potential disruption, qualifies as an Incident. For example, the failure of a server that is not in production needs to be considered for Incident Management.

Know the Criticality & Impact of services

Every service has a different level of criticality and impact on the business, so both must be defined clearly. For example, the failure of a VIP's PC may have high criticality but impacts only one user, while the failure of a server may impact a large number of users. The latter incident definitely needs faster recovery than the former.

Define the Severity

The severity of an incident is defined on the basis of its criticality and impact. A sample matrix of criticality, impact and severity is shown below.

Criticality	Impact	Severity
High	High	Major Incident
High	Medium	High or Sev 1
High	Low	Medium or Sev 2
Medium	High	High or Sev 1
Medium	Medium	Medium or Sev 2
Medium	Low	Low or Sev 3
Low	High	Medium or Sev 2
Low	Medium	Low or Sev 3
Low	Low	Low or Sev 3
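The sample matrix above can be written as a simple lookup. This is a minimal sketch: the names `SEVERITY_MATRIX` and `severity` are made up for the example, and the labels are shortened to "Sev 1", "Sev 2" and "Sev 3".

```python
# Keys are (criticality, impact); values follow the sample matrix above.
SEVERITY_MATRIX = {
    ("High", "High"): "Major Incident",
    ("High", "Medium"): "Sev 1",
    ("High", "Low"): "Sev 2",
    ("Medium", "High"): "Sev 1",
    ("Medium", "Medium"): "Sev 2",
    ("Medium", "Low"): "Sev 3",
    ("Low", "High"): "Sev 2",
    ("Low", "Medium"): "Sev 3",
    ("Low", "Low"): "Sev 3",
}


def severity(criticality: str, impact: str) -> str:
    """Look up the severity for a given criticality and impact level."""
    # Normalize input so that "low", "LOW" and " Low " all match "Low".
    key = (criticality.strip().capitalize(), impact.strip().capitalize())
    if key not in SEVERITY_MATRIX:
        raise ValueError(f"Unknown criticality/impact combination: {key}")
    return SEVERITY_MATRIX[key]
```

For example, `severity("High", "Low")` gives "Sev 2", matching the third row of the matrix.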

Define Timelines

Now that we have categorized the different levels of incident, each severity category must have a target time for resolution. These targets must be realistic and aligned with business requirements. A shorter time frame means that more resources with higher capabilities (which generally means higher cost too) need to be assigned. If timelines are not defined properly, the result may be inefficient service.

For example, take a case where the resolution timeline for an end-user PC is 30 minutes. In almost every environment, low-severity incidents far outnumber high-severity ones. If the timelines for such cases are short, we will end up employing more resources. If the shorter time saves, say, $10 but meeting it costs $20, it is definitely an inefficient use of resources. The same is true for the reverse situation.
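A target time per severity category could be recorded and checked along these lines. The durations below are purely hypothetical placeholders, since real targets come from the SLA agreed with the business. The names `RESOLUTION_TARGETS`, `resolution_deadline` and `is_breached` are also assumptions made for this sketch.

```python
from datetime import datetime, timedelta

# Hypothetical targets for illustration only; use the values agreed in your SLA.
RESOLUTION_TARGETS = {
    "Major Incident": timedelta(hours=1),
    "Sev 1": timedelta(hours=4),
    "Sev 2": timedelta(hours=8),
    "Sev 3": timedelta(hours=24),
}


def resolution_deadline(severity: str, opened_at: datetime) -> datetime:
    """Return the time by which an incident of this severity should be resolved."""
    return opened_at + RESOLUTION_TARGETS[severity]


def is_breached(severity: str, opened_at: datetime, now: datetime) -> bool:
    """Return True if the incident is still open past its target resolution time."""
    return now > resolution_deadline(severity, opened_at)
```

This sketch counts elapsed clock time only. A real setup would usually also account for business hours and for time spent waiting on the user.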
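The arithmetic behind this example is a simple comparison of what a tighter timeline saves against what it costs to achieve. Below is a minimal sketch using the illustrative $10 and $20 figures from the text. The function names are hypothetical.

```python
def net_benefit(savings: float, cost: float) -> float:
    """Money saved by a faster resolution minus the cost of achieving it."""
    return savings - cost


def is_target_efficient(savings: float, cost: float) -> bool:
    """A timeline is worth its cost only if it does not lose money overall."""
    return net_benefit(savings, cost) >= 0


# Figures from the example: $10 saved, $20 spent to meet the timeline.
example_benefit = net_benefit(savings=10, cost=20)
example_efficient = is_target_efficient(savings=10, cost=20)
```

With these figures the net benefit is negative (a loss of $10), so the 30-minute target would be an inefficient use of resources.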

Keep reading for more information on Incident management.

Tuesday, 5 March 2013
