Service Level Agreements
21st November 2020
A service-level agreement (SLA) is a contract between a service provider and its customers that documents what services the provider will furnish and defines the service standards the provider is obligated to meet.
In essence, an SLA is a commitment between a service provider and a client: particular aspects of the service, such as quality, availability and responsibilities, are agreed between the service provider and the service user. The most common component of an SLA is that the services should be provided to the customer as agreed upon in the contract. For example, Internet service providers and telcos commonly include service-level agreements in the terms of their customer contracts to define, in plain language, the level(s) of service being sold. In this case the SLA will typically give technical definitions in terms of mean time between failures (MTBF) and mean time to repair or mean time to recovery (MTTR), identify which party is responsible for reporting faults or paying fees, and assign responsibility for data rates, throughput, jitter or similar measurable details.
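MTBF and MTTR are often combined into a single availability figure using the standard steady-state relationship, availability = MTBF / (MTBF + MTTR). The following is a minimal sketch of that arithmetic; the function name and the sample figures are illustrative assumptions, not values taken from any particular SLA.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected fraction of time the service is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 999 hours, one hour to repair.
availability = steady_state_availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"{availability:.3%}")
```

With these sample numbers the service is expected to be available 99.9% of the time; a longer repair time or more frequent failures lowers the figure.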
A service-level commitment (SLC) is a broader and more generalized form of an SLA. The two differ because an SLA is bidirectional and involves two teams. In contrast, an SLC is a single-directional obligation that establishes what a team can guarantee its customers at any given time.
Service providers need SLAs to help them manage customer expectations and define the severity levels and circumstances under which they are not liable for outages or performance issues. Customers can also benefit from SLAs because the contract describes the performance characteristics of the service — which can be compared with other vendors’ SLAs — and sets forth the means for redressing service issues.
The SLA is typically one of two foundational agreements that service providers have with their customers. Many service providers use a master service agreement to set out the general terms and conditions under which they will work with customers. The SLA is often incorporated by reference in the service provider’s master service agreement. Of the two service contracts, the SLA adds greater specificity regarding the services provided and the metrics that will be used to measure their performance.
When IT outsourcing emerged in the late 1980s, SLAs evolved as a mechanism to govern such relationships. Service-level agreements set the expectations for a service provider’s performance and established penalties for missing the targets and, in some cases, bonuses for exceeding them. Since outsourcing projects were frequently customized for a particular customer, outsourcing SLAs were often drafted to govern a specific project.
Key components of an SLA
Key components of a service-level agreement include:
Agreement overview: This first section sets forth the basics of the agreement, including the parties involved, the start date and a general introduction of the services provided.
Description of services: The SLA needs detailed descriptions of every service offered, under all possible circumstances, with the turnaround times included. Service definitions should include how the services are delivered, whether maintenance service is offered, what the hours of operation are, where dependencies exist, an outline of the processes and a list of all technology and applications used.
Exclusions: Specific services that are not offered should also be clearly defined to avoid confusion and eliminate room for assumptions from other parties.
Service performance: Performance measurement metrics and performance levels are defined. The client and service provider should agree on a list of all the metrics they will use to measure the service levels of the provider.
Redressing: Compensation or payment should be defined in the event that a provider cannot properly fulfill its SLA.
Stakeholders: Clearly defines the parties involved in the agreement and establishes their responsibilities.
Security: All security measures that will be taken by the service provider are defined. Typically, this includes drafting and agreeing on anti-poaching, IT security and nondisclosure agreements.
Risk management and disaster recovery: Risk management processes and a disaster recovery plan are established and clearly communicated.
Service tracking and reporting: This section defines the reporting structure, tracking intervals and stakeholders involved in the agreement.
Periodic review and change processes: The SLA and all established key performance indicators (KPIs) should be regularly reviewed. This section defines the review process as well as the appropriate process for making changes.
Termination process: The SLA should define the circumstances under which the agreement can be terminated or will expire. The notice period from either side should also be established.
Signatures: Finally, all stakeholders and authorized participants from both parties must sign the document to show their approval of every detail and process.
SLAs establish customer expectations regarding the service provider’s performance and quality in several ways. Some metrics that SLAs may specify include:
- Availability and uptime percentage: The amount of time services are running and accessible to the customer. Uptime is generally tracked and reported per calendar month or billing cycle.
- Specific performance benchmarks: Actual performance will be periodically compared to these benchmarks.
- Service provider response time: The time it takes the service provider to respond to a customer’s issue or request. A larger service provider may operate a service desk to respond to customer inquiries.
- Resolution time: The time it takes for an issue to be resolved once logged by the service provider.
Other metrics include the schedule for notification in advance of network changes that may affect users and general service usage statistics.
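An uptime percentage becomes easier to reason about once it is converted into a downtime budget for the reporting period. The sketch below assumes a fixed-length period and counts all downtime equally (real agreements often exclude scheduled maintenance); the function names are hypothetical.

```python
from datetime import timedelta


def allowed_downtime(uptime_target_pct: float, period: timedelta) -> timedelta:
    """Downtime budget implied by an uptime target over a reporting period."""
    if not 0 <= uptime_target_pct <= 100:
        raise ValueError("uptime target must be between 0 and 100")
    return period * (1 - uptime_target_pct / 100)


def measured_uptime_pct(downtime: timedelta, period: timedelta) -> float:
    """Uptime percentage actually achieved, given the total recorded downtime."""
    return 100 * (1 - downtime / period)


# Assumed example: a 99.9% target over a 30-day billing cycle.
period = timedelta(days=30)
print(allowed_downtime(99.9, period))
print(round(measured_uptime_pct(timedelta(minutes=90), period), 3))
```

A 99.9% target over 30 days leaves a budget of roughly 43 minutes, so the 90 minutes of downtime in the second call would miss that target.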
An SLA may specify availability, performance and other parameters for different types of customer infrastructure, such as internal networks, servers and infrastructure components like uninterruptible power supplies.
Service level agreements are also defined at different levels:
- Customer-based SLA: An agreement with an individual customer group, covering all the services they use. For example, an SLA between a supplier (IT service provider) and the finance department of a large organization, covering services such as the finance system, payroll system, billing system and procurement/purchase system.
- Service-based SLA: An agreement for all customers using the services being delivered by the service provider. For example:
  - A mobile service provider offers a routine service to all customers and includes certain maintenance as part of an offer with universal charging.
  - An email system for the entire organization. Difficulties can arise with this type of SLA because the level of service offered may vary between customers (for example, head office staff may use high-speed LAN connections while local offices may have to use a lower-speed leased line).
- Multilevel SLA: The SLA is split into different levels, each addressing a different set of customers for the same services, in the same SLA:
  - Corporate-level SLA: Covers all the generic service level management (SLM) issues appropriate to every customer throughout the organization. These issues are likely to be less volatile, so updates (SLA reviews) are required less frequently.
  - Customer-level SLA: Covers all SLM issues relevant to the particular customer group, regardless of the services being used.
  - Service-level SLA: Covers all SLM issues relevant to the specific services, in relation to this specific customer group.
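One way to picture a multilevel SLA is as three layers of terms that are combined for a given customer group and service. The sketch below assumes that a more specific level overrides a more general one when both define the same term; that precedence rule, the term names and the sample values are all illustrative assumptions rather than part of any standard.

```python
from collections import ChainMap

# Hypothetical terms at each level of a multilevel SLA.
corporate_level = {"support_hours": "09:00-17:00", "review_interval_months": 12}
customer_level = {"finance": {"support_hours": "07:00-19:00"}}
service_level = {("finance", "payroll"): {"uptime_target_pct": 99.9}}


def effective_terms(customer_group: str, service: str) -> dict:
    """Merge the three levels; earlier maps in the chain take precedence."""
    return dict(
        ChainMap(
            service_level.get((customer_group, service), {}),
            customer_level.get(customer_group, {}),
            corporate_level,
        )
    )


print(effective_terms("finance", "payroll"))
print(effective_terms("sales", "email"))
```

Here the finance group's payroll service inherits the corporate review interval, picks up the group's extended support hours and adds its own uptime target, while a group with no specific terms falls back to the corporate level alone.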
A well-defined and typical SLA will contain the following components:
- Type of service to be provided: Specifies the type of service and any additional details about it. In the case of IP network connectivity, the type of service will describe functions such as operation and maintenance of networking equipment, connection bandwidth to be provided, etc.
- The service’s desired performance level, especially its reliability and responsiveness: A reliable service is one that suffers minimal disruption over a given period and is available almost all the time. A service with good responsiveness performs the desired action promptly after the customer requests it.
- Monitoring process and service level reporting: This component describes how the performance levels are supervised and monitored. It covers which statistics are gathered, how frequently they are collected and how customers can access them.
- The steps for reporting issues with the service: This component specifies whom to contact to report a problem and the order in which details about the issue have to be reported. The contract will also include the time within which the problem will be looked into and the time by which it will be resolved.
- Response and issue resolution time-frame: The response time-frame is the period within which the service provider will start investigating the issue. The issue resolution time-frame is the period within which the current service issue will be resolved and fixed.
- Repercussions for service provider not meeting its commitment: If the provider is not able to meet the requirements stated in the SLA, it will have to face the consequences. These may include the customer’s right to terminate the contract or to ask for a refund for losses incurred due to the failure of service.
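The response and resolution time-frames described above can be checked mechanically against ticket timestamps. The following sketch assumes simple wall-clock limits (no business-hours calendar) and treats a ticket that is still unanswered or unresolved as measured up to the current time; the `Ticket` class and function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Ticket:
    opened_at: datetime
    first_response_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def missed_time_frames(
    ticket: Ticket,
    response_limit: timedelta,
    resolution_limit: timedelta,
    now: datetime,
) -> List[str]:
    """Return the time-frames ("response", "resolution") missed as of `now`."""
    missed = []
    # Open-ended events are measured up to `now`.
    responded = ticket.first_response_at or now
    resolved = ticket.resolved_at or now
    if responded - ticket.opened_at > response_limit:
        missed.append("response")
    if resolved - ticket.opened_at > resolution_limit:
        missed.append("resolution")
    return missed


opened = datetime(2020, 11, 21, 9, 0)
ticket = Ticket(
    opened_at=opened,
    first_response_at=opened + timedelta(minutes=30),
    resolved_at=opened + timedelta(hours=10),
)
print(missed_time_frames(ticket, timedelta(hours=1), timedelta(hours=8), now=opened + timedelta(days=1)))
```

In this example the provider responded within the one-hour limit but took ten hours to resolve the issue against an eight-hour limit, so only the resolution time-frame is reported as missed.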
Service-level agreements can contain numerous service-performance metrics with corresponding service-level objectives. A common case in IT-service management is a call center or service desk. Metrics commonly agreed to in these cases include:
- Abandonment Rate: Percentage of calls abandoned while waiting to be answered.
- ASA (Average Speed to Answer): Average time (usually in seconds) it takes for a call to be answered by the service desk.
- TSF (Time Service Factor): Percentage of calls answered within a definite timeframe, e.g., 80% in 20 seconds.
- FCR (First-Call Resolution): Percentage of incoming calls that are resolved on the first call, without the service desk or the caller having to call back to finish resolving the case.
- TAT (Turn-Around Time): Time taken to complete a certain task.
- TRT (Total Resolution Time): Total time taken to complete a certain task.
- MTTR (Mean Time to Recover): Average time taken to recover after a service outage.
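Several of these service desk metrics can be derived from a simple call log. The sketch below is one possible calculation; the `Call` record is hypothetical, and it assumes TSF and FCR are measured against all incoming calls (some desks exclude abandoned calls from the denominator), with the 20-second TSF threshold borrowed from the example above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Call:
    wait_seconds: float           # time spent waiting before answer or hang-up
    answered: bool                # False means the caller abandoned the call
    resolved_first_call: bool = False


def service_desk_metrics(
    calls: List[Call], tsf_threshold_seconds: float = 20.0
) -> Dict[str, Optional[float]]:
    """Compute abandonment rate, ASA, TSF and FCR from a list of calls."""
    if not calls:
        raise ValueError("at least one call is required")
    total = len(calls)
    answered = [c for c in calls if c.answered]
    within = [c for c in answered if c.wait_seconds <= tsf_threshold_seconds]
    first_call = [c for c in answered if c.resolved_first_call]
    return {
        "abandonment_rate_pct": 100 * (total - len(answered)) / total,
        # ASA is undefined when no call was answered.
        "asa_seconds": (
            sum(c.wait_seconds for c in answered) / len(answered)
            if answered
            else None
        ),
        "tsf_pct": 100 * len(within) / total,
        "fcr_pct": 100 * len(first_call) / total,
    }


# Made-up sample log: three answered calls and one abandoned call.
sample = [
    Call(10, True, True),
    Call(30, True, False),
    Call(45, False),
    Call(15, True, True),
]
print(service_desk_metrics(sample))
```

Each resulting figure can then be compared with the corresponding service-level objective, for example a TSF objective of 80% of calls answered within 20 seconds.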
Uptime is also a common metric, often used for data services such as shared hosting, virtual private servers and dedicated servers. Common agreements include percentage of network uptime, power uptime, number of scheduled maintenance windows, etc.
Many SLAs follow the Information Technology Infrastructure Library (ITIL) specifications when applied to IT services.