
Full Process Walkthrough

Service Level Management

A complete SLM case study: from reading the SLA contract to building a validated Power BI dashboard and producing governance reports that drove real management decisions. Every finding shows the contractual stakes — and why the team had to act.

5-phase SLM process · +12.1% SLA improvement · P2 gap uncovered · Management action secured
Phase 01: Contract Review

Tools: SLA Contract · Priority Matrix · Excel

The SLA contract was reviewed end-to-end to extract all measurable service obligations. Each obligation was mapped to a priority tier, response and resolution target, and a penalty clause if applicable. This formed the master reference for all downstream KPI design and dashboard logic.

Contracted SLA Obligations by Priority Tier

| Priority | Definition | Response Target | Resolution Target | Availability | Penalty if Breached |
|---|---|---|---|---|---|
| P1 — Critical | Full service outage | 15 min | 1 hr | 99.9% / month | 5% monthly fee credit |
| P2 — High | Major function impaired | 30 min | 4 hrs | 99.5% / month | 2% monthly fee credit |
| P3 — Medium | Partial degradation | 1 hr | 8 hrs | — | Warning issued |
| P4 — Low | Minor issue / request | 4 hrs | 24 hrs | — | No penalty |

🔍 Finding

P1 carries a 5% fee credit penalty per breach, which at the current contract value represents a significant financial exposure if multiple P1 incidents breach in a single month. The contract defines a 99.9% availability target — meaning no more than 44 minutes of unplanned downtime per month.
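The downtime cap follows directly from the availability target. A minimal Python sketch (illustrative arithmetic only, not part of the delivered Excel/DAX logic) shows how the 99.9% target converts into a monthly downtime allowance of roughly 43 to 45 minutes depending on month length:

```python
def downtime_cap_minutes(availability_pct: float, days_in_month: int) -> float:
    """Maximum unplanned downtime (minutes) allowed under a monthly availability target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_pct / 100)

# The cap shifts slightly with month length, which is why "44 minutes"
# is a fair round figure for the 99.9% target.
for days in (28, 30, 31):
    print(days, round(downtime_cap_minutes(99.9, days), 1))
```

This is why the 44-minute figure was embedded in tooltips as a round number: the exact cap ranges from 40.3 minutes in February to 44.6 minutes in a 31-day month.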

✅ Why the Team Must Act

The operations team must be made aware of the financial implications of P1 breaches and the exact 44-minute downtime cap. This context was embedded directly into dashboard tooltips and the monthly SLA report to ensure decision-makers understood the cost of inaction.

Phase 02: KPI Definition

Tools: KPI Framework · DAX Logic · Excel

Each contracted obligation was translated into a measurable KPI with a defined data source, calculation formula, breach threshold, and RAG (Red/Amber/Green) status logic. This KPI dictionary became the single source of truth for all dashboard metrics.

KPI Dictionary — SLA Dashboard Metrics

| KPI Name | Formula | Green | Amber | Red (Breach) |
|---|---|---|---|---|
| P1 Compliance Rate | P1 resolved on time / Total P1 × 100 | 100% | 95-99% | Below 95% |
| P2 Compliance Rate | P2 resolved on time / Total P2 × 100 | Above 90% | 85-90% | Below 85% |
| Avg P1 Resolution Time | Sum of P1 resolution hrs / P1 count | Below 0.8 hrs | 0.8-1.0 hrs | Above 1 hr |
| Monthly Availability % | Uptime mins / Total mins × 100 | Above 99.9% | 99.5-99.9% | Below 99.5% |
| Repeat Incident Rate | Repeat tickets / Total tickets × 100 | Below 5% | 5-10% | Above 10% |
| SLA Overall Score | Weighted avg of all tier compliance | Above 90% | 80-90% | Below 80% |
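The RAG bands in the dictionary reduce to simple threshold functions. The sketch below is an illustrative Python rendering of that logic; the production dashboard implements the same bands as DAX measures:

```python
# Threshold bands taken from the KPI dictionary above.

def p1_rag(compliance_pct: float) -> str:
    """P1 Compliance Rate: Green only at 100%, Amber 95-99%, Red below 95%."""
    if compliance_pct >= 100:
        return "Green"
    if compliance_pct >= 95:
        return "Amber"
    return "Red"

def p2_rag(compliance_pct: float) -> str:
    """P2 Compliance Rate: Green above 90%, Amber 85-90%, Red below 85%."""
    if compliance_pct > 90:
        return "Green"
    if compliance_pct >= 85:
        return "Amber"
    return "Red"

def repeat_rate_rag(rate_pct: float) -> str:
    """Repeat Incident Rate: Green below 5%, Amber 5-10%, Red above 10%."""
    if rate_pct < 5:
        return "Green"
    if rate_pct <= 10:
        return "Amber"
    return "Red"
```

Encoding each band as a pure function makes the breach thresholds directly unit-testable against the contract, which is exactly what the validation phase later relies on.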

🔍 Finding

The repeat incident rate KPI revealed a gap not visible in standard SLA reporting — some tickets were being closed and re-opened repeatedly, inflating the resolution volume without actually solving root causes.

✅ Why the Team Must Act

The Repeat Incident Rate KPI was added to the weekly report and given a dedicated section in the monthly deck. The operations team was asked to investigate any repeat rate above 8% for root-cause action — directly reducing noise in ticket volume and improving real resolution quality.

Phase 03: Dashboard Development

Tools: Power BI · DAX · Data Model Design

A 3-page SLA Compliance Dashboard was built in Power BI. Page 1: Executive Summary with KPI gauges and traffic-light indicators. Page 2: Team-level compliance trends by week. Page 3: Breach Detail drill-through table with root-cause tags. All visuals are cross-filtered and connected to a live data model that refreshes daily.

Dashboard Design Specification

| Page | Visual Type | Metrics Shown | Filter Options | Audience |
|---|---|---|---|---|
| Page 1 — Executive Summary | KPI cards, gauge charts, RAG table | Overall SLA score, P1/P2 compliance, availability % | Month, Quarter | Senior Management |
| Page 2 — Team Trends | Line chart, bar chart, heatmap | Weekly compliance by team, breach count trend | Team, Priority, Date range | Department Manager |
| Page 3 — Breach Detail | Drill-through table, matrix | Ticket ID, breach duration, assigned team, root cause | Priority, Category, Team | Ops Team Lead |

🔍 Finding

During dashboard development, the data model revealed that P2 breach counts were being under-reported by 14% in the legacy Excel tracker because the old tracker used a different timestamp logic than the contract definition.

✅ Why the Team Must Act

The breach calculation logic in the dashboard was corrected to align with the exact contractual definition. The corrected view was presented to the client — who initially challenged the higher breach counts — with full documentation of the difference between the old and new timestamp logic.
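The under-reporting mechanism can be illustrated with a small sketch. The field names and the specific cause (clock starting at assignment rather than creation) are assumptions for illustration; the case study states only that the legacy tracker used a different timestamp logic than the contract:

```python
from datetime import datetime, timedelta

# Contractual P2 resolution target (from the SLA obligations table).
P2_RESOLUTION_TARGET = timedelta(hours=4)

def breached_contract(created: datetime, resolved: datetime) -> bool:
    # Contract definition: the resolution clock starts at ticket creation.
    return resolved - created > P2_RESOLUTION_TARGET

def breached_legacy(assigned: datetime, resolved: datetime) -> bool:
    # Hypothetical legacy logic: clock starts later (e.g. at assignment),
    # so slow tickets can appear compliant.
    return resolved - assigned > P2_RESOLUTION_TARGET

# A ticket that sat unassigned for 50 minutes, then took 3h40m to resolve:
created = datetime(2024, 1, 8, 9, 0)
assigned = created + timedelta(minutes=50)
resolved = created + timedelta(hours=4, minutes=30)

print(breached_contract(created, resolved))  # True: 4h30m from creation
print(breached_legacy(assigned, resolved))   # False: 3h40m from assignment
```

Any ticket whose start-of-clock gap exceeds its remaining headroom flips from "compliant" to "breach" under the contractual definition, which is how a systematic under-count of the reported 14% can accumulate.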

Phase 04: Contract Validation

Tools: Dashboard vs Contract Matrix · Excel · Sign-off Log

Every KPI displayed on the dashboard was individually validated against its corresponding clause in the SLA contract. A validation matrix was produced documenting the contract clause reference, the KPI formula used, a sample test case, and the sign-off status. This was reviewed and approved by the service delivery manager and client representative.

Dashboard Validation Matrix — KPI vs Contract Clause

| KPI | Contract Clause Ref | Formula Validated | Sample Test Result | Status |
|---|---|---|---|---|
| P1 Compliance Rate | Section 4.1 — P1 Response & Resolution | Yes | 1 breach in Jan = 92.3% — compliant with calc | Compliant |
| P2 Compliance Rate | Section 4.2 — P2 Service Obligations | Yes | 169 breaches / 498 tickets = 66.1% — At Risk | At Risk |
| Avg P1 Resolution Time | Section 4.1.3 — Resolution SLA | Yes | Avg 1.4 hrs — BREACH (target 1 hr) | BREACH |
| Monthly Availability % | Section 5.1 — Availability SLA | Yes | 99.82% Jan — Met 99.9% after P1 fix | Met |
| Repeat Incident Rate | Section 6.3 — Quality Obligation | Yes | 8.8% Jan — Warning threshold | Warning |
| SLA Overall Score | Section 7 — Composite Score | Yes | 68% Jan — Below 80% Amber threshold | BREACH |
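The sample test results in the matrix can be re-checked with the compliance formula. The P1 ticket count (13 in January) is inferred from the stated result (1 breach yielding 92.3%) and should be treated as illustrative rather than sourced:

```python
def compliance_rate(breaches: int, total: int) -> float:
    """Compliance % = tickets resolved on time / total tickets x 100, to 1 dp."""
    return round((total - breaches) / total * 100, 1)

# P1 sample: 1 breach out of an inferred 13 January tickets.
print(compliance_rate(1, 13))    # 92.3

# P2 sample: 169 breaches across 498 tickets, i.e. 329 resolved on time.
print(compliance_rate(169, 498)) # 66.1 - the At Risk figure in the matrix
```

Note the convention: "169 breaches / 498 tickets = 66.1%" in the matrix expresses the *compliance* rate (329/498), not the breach rate, which is worth spelling out in the sign-off log to avoid misreading.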

🔍 Finding

Validation revealed that P2 compliance and Overall SLA Score were both in breach territory in January. The P2 breach rate (66.1%) was previously hidden because the legacy tracker had been averaging across all priorities, masking the severity within specific tiers.

✅ Why the Team Must Act

The client was formally notified of the P2 compliance gap with a remediation plan. The service delivery manager accepted accountability and committed to a 75% P2 compliance target by end of February and 85% by March — both tracked via weekly dashboard review meetings.

Phase 05: Reporting & Governance

Tools: Monthly SLA Report · PowerPoint · Weekly Review Meeting

Structured SLA reports were produced and distributed on weekly and monthly cadences. Monthly reports included a full scorecard, trend charts, root-cause summaries, remediation status, and a risk forecast for the next period. Weekly SLA review meetings were held with the operations team to track in-flight actions.

Monthly SLA Scorecard — Q1 2024 (Contract Baseline: 90%)

| SLA Metric | Jan Actual | Feb Actual | Mar Actual | Target | Trend |
|---|---|---|---|---|---|
| P1 Compliance % | 73.2% | 84.2% | 92.3% | 100% | Improving |
| P2 Compliance % | 66.1% | 74.8% | 82.1% | 90% | Improving |
| Availability % | 99.82% | 99.91% | 99.94% | 99.9% | Met from Feb |
| Avg P1 Res. Time (hrs) | 1.4 | 1.1 | 0.9 | 1.0 max | Met from Mar |
| Repeat Incident Rate | 8.8% | 7.1% | 4.9% | 5% max | Met from Mar |
| Overall SLA Score | 68.0% | 77.2% | 80.1% | 90% | Improving |
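The quarter-over-quarter movement behind the headline improvement can be verified directly from the scorecard (all figures copied from the table; this is a checking sketch, not part of the reporting pipeline):

```python
# January and March actuals from the Q1 scorecard, in percentage points.
jan = {"overall": 68.0, "p2_compliance": 66.1, "repeat_rate": 8.8}
mar = {"overall": 80.1, "p2_compliance": 82.1, "repeat_rate": 4.9}

# Point change over the quarter; negative is an improvement for repeat rate.
delta = {k: round(mar[k] - jan[k], 1) for k in jan}
print(delta)  # overall +12.1 pts is the headline SLA improvement
```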

🔍 Finding

By end of Q1, availability and average P1 resolution time both met contracted targets. The Overall SLA Score improved from 68% to 80.1% — still 10 points below the 90% threshold but with a clear upward trajectory driven directly by weekly reporting and management action.

✅ Why the Team Must Act

The monthly report was used to secure management approval for a Q2 staffing increase and a P1 auto-escalation rule. Both decisions were data-backed from the dashboard, proving that consistent SLA reporting directly translates into resource investment decisions.