BMC is introducing agentic artificial intelligence (AI) to transform enterprise IT work. The new BMC HelixGPT Knowledge Curator AI agent works with support and operations teams to enhance the quality of knowledge resources, keeping content fresh, concise, and high quality.
In the fast-paced world of IT service management (ITSM) and support, access to the right information at the right time is not just a matter of convenience—it is increasingly essential. Yet, many organizations struggle to maintain a cohesive knowledge management base, particularly in the complex enterprise service and technology landscape.
Duplicate management with BMC HelixGPT Knowledge Curator
Some commonly observed issues with enterprise knowledge management include:
The BMC HelixGPT Knowledge Curator agent boosts the knowledge management capabilities of a support organization and ensures the enduring high quality of its knowledge management base. Below are three primary benefits for support organizations:
The ability of BMC HelixGPT Knowledge Curator to detect specific patterns in large quantities of data, and recommend remediations, makes it an ideal assistant for a service organization. BMC HelixGPT Knowledge Curator augments and enhances your knowledge management practices by addressing difficult-to-solve issues of duplication, obsolescence, and consistency.
Delivering contextual knowledge to employees and customers can significantly improve response times and resolution rates. BMC HelixGPT Knowledge Curator already provides significant assistance in getting knowledge to users at the right time in the context of their activity.
User engagement with knowledge management requires sufficient organizational confidence in the resources available. Adopting BMC HelixGPT Knowledge Curator helps to ensure your knowledge base is current, complete, and consistently high quality, inspiring higher rates of usage and contribution, and a greater return on your knowledge management investment.
Learn more about transforming your service and operations management with agentic AI at https://www.bmc.com/it-solutions/ai-agents.html. To discuss agentic AI with BMC, please contact us.
BMC is introducing agentic artificial intelligence (AI) to transform enterprise IT work. The new BMC HelixGPT Insight Finder AI agent works with your teams to unlock the power of ITSM and ITOM data.
Modern IT service and operations management (ITSM/ITOM) solutions are smarter than ever, able to discover, monitor, understand, and even repair complicated distributed systems. However, the explosion of data that results from those solutions can be both a blessing and a challenge.
Many organizations become dependent on specialized teams to perform data analysis, causing delays, inefficiencies, and additional costs, particularly if the reporting teams need to learn the data structures and query mechanisms of the products providing the data. If ITSM, ITOps and DevOps practitioners are left to create their own reports and dashboards without assistance from reporting experts, data complexity can lead to errors. Such mistakes in data analysis may lead to bad decisions and erroneous communications, impacting the organization severely.
Alternatively, users may simply avoid data analysis altogether, dissuaded by the difficulty of dealing with the volume of information, resulting in opportunity cost and uninformed tactical and strategic decisions.
BMC HelixGPT Insight Finder provides proactive information and responds to natural language prompts
BMC HelixGPT Insight Finder allows support and operations professionals to use generative and conversational AI interactions to simplify the process of creating and interpreting dashboards and reports. As a result, data insights become more accessible and immediate than ever before.
BMC HelixGPT Insight Finder makes data analytics and reporting both easier and more powerful, enabling even novice users to get better insights from ITSM/ITOM data. The benefits are significant:
In the evolving landscape of enterprise software, staying ahead means equipping your team with the tools they need to act on data swiftly and effectively. BMC HelixGPT Insight Finder is not just a dashboard tool—it’s a game-changing assistant that transforms how your organization interacts with data. By simplifying dashboard creation, identifying actionable insights, and enriching reports with critical data, it empowers your team to make better, faster decisions, driving success in today’s competitive business environment. BMC HelixGPT Insight Finder brings data science superpowers to everyone.
Learn more about transforming your service and operations management with agentic AI at https://www.bmc.com/it-solutions/ai-agents.html, or contact BMC sales to learn more.
I am happy to announce the addition of agentic AI, powered by BMC HelixGPT, to our BMC Helix platform. We are beginning a journey to transform enterprise IT work, starting with agentic bots for service management and operations.
Agentic bots take generative artificial intelligence (Gen AI) beyond the “call and response” mechanism familiar to many users from their early interactions with GenAI. BMC HelixGPT now powers a set of bots that work autonomously, in a number of different roles, which include task automation, curation of knowledge content, proactive provision of insights from data, evaluation of change risks, and more.
Agentic bots in BMC Helix work alongside your human experts to improve service organization productivity and user experience (both for end users and the support and operations teams who support them), deliver better value from data, and improve service quality.
The new BMC Helix ITSM release introduces the first of many agentic bots, which will work to improve service organization productivity and the user experience (both for end users and the support and operations teams who support them), to deliver better insights from data and provide faster and more successful outcomes.
Effective support of complex service environments requires good knowledge sharing between support and operations professionals. Furthermore, with effective delivery of that knowledge to the users who need it, when they need it, people are more likely to find solutions to their issues and inquiries, improving operational outcomes and reducing the burden on the support organization.
But there is a challenge: While many organizations have sought to implement effective knowledge management practices, this is often very challenging in large and complex environments. In practice, many knowledge management initiatives have suffered from a lack of article quality and consistent coverage, and content may become duplicated or obsolete. When stakeholders lose confidence in the reliability and availability of knowledge, their engagement with the knowledge management system is also impacted, and people become less inclined to participate in using, creating, or improving knowledge.
BMC HelixGPT Knowledge Curator is a new agentic bot that uses Gen AI to solve these problems. The bot has a number of skills: It can identify whether new content needs to be written for any given issue, and either guide the user through doing so or match them to relevant existing content. It can locate issues that are difficult for knowledge managers to detect, particularly in a large knowledge base, such as duplicated content or obsolete information. Furthermore, it helps knowledge managers resolve these issues by drafting proposed new content and guiding the knowledge author through the process of editing, finalizing, and publishing it.
BMC HelixGPT Knowledge Curator transforms knowledge management in support organizations, empowering your teams to achieve greater self-service levels and higher resolution rates, while reducing unproductive and repetitive toil.
Duplicate management with BMC HelixGPT Knowledge Curator
The ongoing digital transformation of industries is creating ever more complexity in the technology landscape of enterprises. In turn, the tools used to discover, run, manage, and support those technologies are receiving and generating more data than ever.
However, maximizing the potential of complex data has traditionally been the domain of data scientists and reporting technology experts, which can leave stakeholders waiting in line for data insights. Additionally, data experts may not have significant experience with the domain-specific context of the requests coming into their team, leading to multiple iterations or missed opportunities to surface valuable insights.
BMC HelixGPT Insight Finder is an agentic bot which uses Gen AI to surface important, timely insights directly to stakeholders such as support team managers or operations team members. It can tabulate and chart data on the fly, create dashboard widgets and reports automatically, and converse with the user as they drill further into the data.
Furthermore, it acts as a cognitive interface between users and the often-complex data that is relevant to them, allowing them to use natural language to state their requirements without needing to be familiar with the table-and-graph structure of the underlying database, or knowing how to navigate the interface and features of a specialist reporting tool. Users simply tell BMC HelixGPT Insight Finder what they want to know, and BMC HelixGPT executes their request, collating the data and presenting the results conversationally and visually.
As a result, BMC HelixGPT Insight Finder enables support and operations team users to make truly data-driven decisions. Every user becomes a reporting expert. By proactively providing them with significant data, BMC HelixGPT Insight Finder prompts users to explore higher-priority subjects first, ensuring their efforts are directed toward the most important and effective topics.
Traditional ITSM tools have done a great job automating workflows and fulfilling service requests for supported end users. However, as business operations become increasingly complex, demand is growing for a more straightforward, intuitive way for employees to find information and services quickly, across departments, enabling them to stay focused on more productive tasks.
BMC HelixGPT Employee Navigator is an agentic AI bot focused on those supported users of services and systems. Consider this example: An employee encounters a problem with an application. Instead of submitting a ticket and waiting for IT to resolve the issue, they simply talk directly with the agentic bot in natural language—perhaps simply stating “I’m having trouble with this application.” BMC HelixGPT immediately responds with troubleshooting steps or, if needed, automatically creates a service request for the IT team.
Giving users a fully conversational experience, BMC HelixGPT can provide summarized answers to questions consolidated from multiple knowledge sources, removing the need for users to sift through a list of search results. It is designed to be used across the company, giving users a single point of dialogue for accessing services from different lines of business.
However, it is more than just a conversational tool. BMC HelixGPT Employee Navigator can execute tasks and workflows, automating not only the service request, but also its fulfilment.
These new capabilities work together to encourage greater use of self-service, delivering results faster to users while reducing the volume of requests to the support organization. This enables ITSM and IT operations teams to move away from simple request-handling to more productive activities with better enduring value.
The most effective way to manage change, and its impact, is to proactively analyze risks, and thus avoid making changes that are likely to cause an incident. However, the current change management process typically involves manual analysis of risk, and lacks real-time operational data for accurate risk predictions.
BMC HelixGPT Change Risk Advisor is an agentic bot that uses Gen AI to reduce change failures in complex systems by surfacing risky changes which might impact a service. This information is crucial for balancing risk and speed in DevOps. The agentic AI bot de-risks DevOps by catching unforeseen risks in real time using precise analysis of ITSM and IT operations management (ITOM) data. DevOps and SRE teams can follow up with questions and get responses for a given change request so they can continue to deploy quickly and with confidence, only delaying their push when a potential failure is detected.
BMC HelixGPT Insight Finder, BMC HelixGPT Knowledge Curator, and BMC HelixGPT Employee Navigator are available to all customers with advanced-tier-suite licensing for BMC Helix ITSM. To learn more about BMC HelixGPT Change Risk Advisor, contact us for a consultation.
They represent the first set of autonomous, intelligent assistants for service and operations management professionals, and the people supported by them. We will continue to deploy more agentic AI bots in BMC HelixGPT to empower knowledge workers with powerful personal assistants that interact with human users—and each other—to boost productivity, reduce toil, and enhance outcomes.
Learn more about BMC Agentic AI here, and please contact BMC if you would like to discuss this further.
Have you ever found yourself in a chat-based support conversation, only to be transferred to a new agent who suddenly falls silent for what feels like an eternity? This familiar and frustrating experience is a direct result of a common challenge. In this blog, we will learn how new BMC HelixGPT-powered capabilities, introduced in BMC Helix Virtual Agent 23.3.02, are eliminating this problem.
Ever since their introduction to the industry, virtual support agents have transformed service desk productivity. With virtual agents implemented to handle simple and common queries, human support agents are free to focus on more complex and unique issues. This approach enables human agents to provide a high level of service commensurate with their skills, while also reducing the overall average handling time (AHT) for each customer interaction with the service desk.
However, digital technology is complicated and evolves quickly, and issues will still arise that need the insight and abilities of a human support agent. Of course, this may not always be apparent at the start of a conversational support interaction. Whether the decision is made by the virtual agent or the customer themselves, the transfer to a human happens part-way through the conversation, creating a challenging handover.
If their support tool transfers the entire conversation to them, the human agent inevitably finds it challenging to maintain the flow of the conversation. Before they can continue, they must establish the customer’s identity, and then read the conversation in detail to understand their issue. Next, they must deduce what challenge led the virtual agent (or the customer themselves) to decide that the transfer was necessary.
This process could be more efficient both for the customer and the person supporting them. The transfer also adds cognitive load for the agent and increases the possibility of misinterpretation or error.
BMC Helix Virtual Agent is a powerful tool designed to eliminate the problem of transferring customers to human agents while also reducing the number of transfers needed. It utilizes large language model (LLM) technology to quickly provide high-quality summarized answers to customer queries, effectively reducing the need to transfer to human agents.
In cases where a transfer is still necessary, the flow of conversational support from virtual to human agent is smooth and effective. BMC HelixGPT further enhances the customer service experience by providing human agents with an instant, clear, plain-language summary of the interaction at the point of transfer, improving customer satisfaction, support agent productivity, and average handling time per customer for the agent and their service desk.
Click here to learn more about how BMC HelixGPT enhances intelligent chatbot support, improving the service experience for both your employees and the support professionals supporting them.
“What is the impact of this change?” That’s a very hard question to answer in today’s complex digital environments, particularly as cloud and DevOps adoption increases the speed of change. As applications and the supporting infrastructure grow and evolve continuously, how quickly IT operations (ITOps) teams can answer that question will determine how quickly they can diagnose and remediate the problem. Let’s take a step back and explore why it’s important to stay on top of change.
Whatever the observed behavior (degraded service performance, an unavailable service, or resource contention), the issue can eventually be traced to a change or changes in the system. It might be a system change such as turning feature flags on and off, a configuration change, a system upgrade push, or an autoscaling action. It could also be a workload change related to a business running a site-wide sales event. How well your digital service performs is related to changes in your code, infrastructure, workload, and network.
In DevOps environments, these changes happen multiple times a day, and for good reason. We must continuously improve our application functionality and performance to meet customer expectations. Unfortunately, new changes also bring new risks. Maintaining application delivery speed and quality among the complexities of modern IT environments can be an overwhelming challenge.
To balance the pace of change and risk, IT organizations need a new approach that bridges the gap between service management and operations practices. This is where ServiceOps comes in, breaking down barriers to accelerate change while predicting and managing risk. Through change impact analysis, service automation, observability, and AIOps, IT teams are empowered to reduce service outages caused by change.
BMC Helix ServiceOps combines the configuration management and change management capabilities of BMC Helix ITSM and third-party applications with the BMC Helix observability and AIOps solution to give IT teams more confidence in delivering resilience to the business.
The most effective practice to manage change and its impacts among the chaos of ever-evolving technologies is proactively analyzing risk to avoid making changes that will likely cause an incident. With BMC Helix ITSM’s change management capabilities, service management teams can create change requests, run impact analysis, and schedule change dates through a graphical calendar to reduce conflicts. Based on the discovery and modeling that’s already been done for the service by BMC Helix Discovery, service managers can see how the applications relate and talk to each other.
When the service management team creates a change request for an application, they can run an impact analysis by simply pushing a button in BMC Helix ITSM. Through the change analysis, service management teams can understand, for their change windows, what is being changed, what will be impacted, and whether any colliding changes need to be rescheduled so that related items are not changed at the same time. Addressing potential collisions leads to fewer incidents, a lower risk of a failed change, and increased stability in the environment. For example, if you’re making an application change and there’s an infrastructure change happening at the same time, the change manager can avoid collision issues by adjusting the timing of those changes so they do not overlap.
Figure 1: BMC Helix ITSM change impact analysis
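To make the idea of a scheduling collision concrete, here is a minimal Python sketch of an overlap check between change windows. It is an illustration only, not how BMC Helix ITSM performs its impact analysis; the `ChangeRequest` class, the `find_collisions` function, and the sample change IDs and targets are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical, simplified change request with a scheduled window."""
    change_id: str
    target: str
    start: datetime
    end: datetime


def find_collisions(changes):
    """Return pairs of change IDs whose scheduled windows overlap."""
    collisions = []
    for a, b in combinations(changes, 2):
        # Two windows overlap when each one starts before the other ends.
        if a.start < b.end and b.start < a.end:
            collisions.append((a.change_id, b.change_id))
    return collisions


# Made-up sample data: an application change and an infrastructure change
# share the 23:00-00:00 hour, while the third change runs later.
changes = [
    ChangeRequest("CRQ-1", "payments-app", datetime(2024, 6, 1, 22), datetime(2024, 6, 2, 0)),
    ChangeRequest("CRQ-2", "db-cluster", datetime(2024, 6, 1, 23), datetime(2024, 6, 2, 1)),
    ChangeRequest("CRQ-3", "load-balancer", datetime(2024, 6, 2, 2), datetime(2024, 6, 2, 3)),
]

print(find_collisions(changes))  # [('CRQ-1', 'CRQ-2')]
```

A change manager would resolve the reported pair by moving one of the two windows so that they no longer overlap.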
Monitoring the impact of changes before and after is critical to spotting imminent change-related incidents before they occur. BMC Helix Discovery helps by automatically discovering and tracking relationships between the configuration items (CIs), and visually presenting the topology maps so service management teams can feel more confident about the changes that they make. More importantly, this information can be shared with the development and infrastructure teams so they can see the potential impact their changes will have on the environment.
When the changes are implemented in production, service and operations teams can continue to monitor the impact using the service health timeline from the BMC Helix observability and AIOps solution, which also helps them quickly understand whether an incident might have been caused by recent changes to the environment, saving them the time and energy of blindly sifting through log files.
Figure 2: BMC Helix observability and AIOps showing the service health timeline
Documenting, tracking, reporting, and auditing all the information on a change helps to prevent unauthorized changes and meet service level agreements (SLAs). Service management teams can use out-of-the-box BMC Helix ITSM change management reports and dashboards to understand and monitor change activity within the organization. Reports listing details of changes like change type, urgency, status, incidents caused by changes, and upcoming changes provide visibility into changes and their impact on business SLAs.
When an incident emerges from a change, service operators or site reliability engineers (SREs) need a way to correlate it to the change. The BMC Helix observability and AIOps solution presents a view of the specific incident that contains all the data necessary to diagnose the issue. Clear visualizations of the service model and topology, and other rich information related to the incident such as the service health timeline, help to correlate the incident to the changes that caused it.
Figure 3: BMC Helix observability and AIOps showing the service health timeline, service topology, and root cause analysis
Using AI to correlate the events, changes, incidents, and service topology, BMC Helix observability and AIOps visually presents the relationship between a service and its nodes with node details such as events, incidents, and changes in a unified view. The contextual topology map also presents the complete picture of all CI elements, starting with the infrastructure nodes, network devices, hosts, clusters, namespaces, and all the application nodes of a service. In addition to the CI topology map, the solution provides a service hierarchy to help operators investigate the impacted service and/or child services more easily. For example, when looking at the impact of storage or something in the infrastructure going down, service management teams could identify the shared services or infrastructure and see what may be related and impacted when determining the scope.
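The scoping exercise described above, where a shared infrastructure component goes down and the team wants to know what sits on top of it, can be pictured as a walk over a dependency graph. The Python sketch below is a simplified illustration under that assumption and does not represent the product's topology model; the `DEPENDENTS` map, the CI names, and the `impacted_by` function are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each key is a CI, and each value lists
# the CIs that depend directly on it.
DEPENDENTS = {
    "storage-array-1": ["db-host-1", "db-host-2"],
    "db-host-1": ["orders-db"],
    "db-host-2": ["billing-db"],
    "orders-db": ["orders-service"],
    "billing-db": ["billing-service"],
    "orders-service": [],
    "billing-service": [],
}


def impacted_by(ci, dependents):
    """Return every CI that depends, directly or indirectly, on `ci`."""
    seen = set()
    queue = deque([ci])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:  # guard against revisiting shared nodes
                seen.add(child)
                queue.append(child)
    return seen


print(sorted(impacted_by("db-host-1", DEPENDENTS)))
# ['orders-db', 'orders-service']
```

Starting the walk from `storage-array-1` instead returns all six downstream CIs, which is the kind of shared-infrastructure scope the paragraph describes.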
Using the BMC Helix Integration Service, the BMC Helix platform aggregates change data from all change feeds and tools, including change management, configuration management, Azure DevOps, and Jira. The BMC Helix observability and AIOps solution then applies causal AI to identify the root cause of issues and the changes that likely caused them.
With this information in a unified view, the ITOps, network operations center (NOC), DevOps, and SRE teams can identify the impacted nodes and impact path and work on a resolution immediately.
Change impact analysis, observability, and AIOps play key roles in proactively managing risk and reducing outages caused by change. Using AI and automation in change management leads to change processes that are more predictive and efficient and less prone to human error. BMC Helix ServiceOps leverages the key capabilities of BMC Helix IT service management and IT operations to help both teams answer the question: “What is the impact of this change?”
BMC Helix ServiceOps strives to make IT service management and operations more efficient and more collaborative. When done right, ServiceOps will empower IT organizations to proactively manage risk and solve problems quickly with minimal impact on critical business services.
To learn more about how BMC Helix ServiceOps can transform your service and operations, schedule a customized demo today.
As the enterprise technology landscape becomes ever more diverse and expansive, pressure on IT service management (ITSM) departments to deliver more, faster has never been higher. However, artificial intelligence (AI) also provides ITSM professionals with even more new opportunities to deliver high-quality service at an enterprise scale. BMC Helix Service Management 23.3.02 leverages BMC HelixGPT to help ITSM practitioners achieve better outcomes for their supported users.
For IT asset managers, a new, automated ownership assignment function reduces toil. Release managers benefit from a significant user interface refresh, offering a more dynamic and productive user experience. Network operations teams gain new automation to ensure fast and accurate prioritization and geolocation of incidents.
The release is driven by market-leading BMC HelixGPT generative AI. BMC has collaborated closely with numerous IT professionals to align generative AI to their roles, and this release includes a number of BMC HelixGPT-powered enhancements:
This release also enhances automation and workflows, boosting the productivity of IT asset management and network operations teams:
This release represents the latest iteration of BMC’s significant and differentiating ongoing investment in AI-driven service management, BMC HelixGPT-driven insights and interaction, and enhanced automation. These enhancements will drive greater productivity, facilitate more efficient and successful support responses, and drive a more seamless flow of innovation through the wider digital organization.
View the BMC Helix Service Management 23.3.02 release notes for more details on these exciting enhancements and many others in the release.
As we advance through 2024, IT service management (ITSM) remains a pivotal cornerstone for organizations aiming to streamline their services and manage IT operations more effectively. With approximately 48 percent of organizations rating their ITSM capabilities as “great” or “good,” there’s a burgeoning confidence in current ITSM strategies. However, this sentiment is balanced by an equal number acknowledging the necessity for significant enhancements.
This post delves into the nuances of ITSM adoption, the implications of the remote work era, the transformative role of artificial intelligence (AI) and automation, the intensifying investment in cloud infrastructure, and the evolving focus on the employee experience.
1. ITSM adoption and perception:
2. Impact of remote work on ITSM:
3. AI and automation in ITSM:
4. Cloud infrastructure spend:
5. Employee experience:
The ITSM landscape continues to evolve. Organizations are adapting to the demands of remote work, exploring the advantages of AI and automation, investing heavily in cloud services, and placing a renewed emphasis on the employee experience. By embracing these trends, companies can stay resilient and agile in the face of ongoing digital transformation.
In a managed service provider (MSP)-centric environment, managing the diverse needs of multiple end users within a single tenant environment can be challenging. However, the benefits of personalized dashboards for these end users are significant. Each end user brings unique requirements and preferences for visualizing their information technology (IT) infrastructure within this domain. This is where the multi-tenancy dashboard becomes invaluable. Using tools like BMC Helix Access Controls, MSPs can now seamlessly create personalized dashboard views for individual end users, revolutionizing how data is accessed, analyzed, and utilized within a unified framework.
We’ve had inquiries regarding the feasibility and logistics of implementing personalized dashboards within a multi-tenant environment; one resounding question echoes: “Can it truly be done?” How do you navigate the process of setting up and managing dashboards tailored to multiple end users’ individual roles and preferences?
This blog is designed to address your concerns. We will guide you through the practical application of creating and managing dashboards with distinct users. The answer to the question is a resounding “Yes, it can be done!” We will equip you with the necessary steps and insights to make this process a reality in your multi-tenant environment.
Let’s start by creating a User group. For this blog, we will create a User group for the end customer, “XYZ Manufacturing.” In the next steps, you’ll discover how easy it is to create personalized dashboards based on your end user, empowering you to tailor the experience to their unique needs.
Here is a screenshot showing the group creation. As the BMC Helix tenant administrator, you will first go into the portal:
Figure 1. Main screen.
As the administrator for the XYZ Manufacturing company, you will then click on Add group and create the specific User group. The image below shows the administrator creating TestGroup1.
Figure 2. User groups.
Now that the administrator has created a new User group, they will need to add the permissions. The administrator simply clicks on Actions -> Assigned, and assigns the User to the Group:
Figure 3. User directory.
Next, the administrator will want to check the User’s role. This is easily done by going to the BMC Helix portal landing page, clicking the User access tab, going to the Users and keys page, and searching for the User.
Figure 4: Users and keys.
Now that the administrator sees the User, they can simply click on Actions -> User options to see which User groups and Roles are assigned to them. By selecting either the Groups or Roles assignment area, the administrator can quickly validate that they are assigned correctly.
Figure 5: User options.
One of the next things you will want to do as the administrator is to create an authorization profile for this User group (TestGroup1). By creating this authorization profile, you ensure that the Users in this Group only have access to the appropriate XYZ Manufacturing company information.
Here’s how you do this: Launch the BMC Helix Operations Management console from the BMC Helix portal landing page. Then, navigate to Administration -> Authorization profiles, and you will see the Authorization Profile Test_AutoProf1, which we have created for User group TestGroup1. In your case, this needs to be created using the “Create” button option.
Figure 6. Authorization profiles.
The next step is to add and associate the User group we created.
Figure 7. Profile details.
In this case, we have selected Microsoft Windows Servers as the PATROL Solutions and then assigned a specific Device and Group. The Device name and Group name will vary depending on customer requirements.
Figure 8. Administration.
Figure 9. Administration detail.
Here we are showing the selection of the Group Windows Servers.
Figure 9a. Windows Servers.
As you’ve seen so far, this has all been straightforward to implement. Let’s now log into the BMC Helix portal as the user and see how it shows up. Here’s the dashboard, with all the devices that the user has available to them displayed.
Figure 10. Main dashboard.
Based on the rules the administrator has put in place, this user can only see the two devices that were assigned.
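The relationship between users, User groups, authorization profiles, and visible devices can be summarized in a few lines of Python. This is a conceptual sketch only and not a BMC Helix API: the data structures, the `visible_devices` function, the user names, and the second device name are hypothetical, while TestGroup1, Test_AutoProf1, and vl-pun-dombl107 come from the walk-through.

```python
# Hypothetical model: which users belong to which User group.
USER_GROUPS = {
    "TestGroup1": {"xyz_user"},
}

# Hypothetical model: each authorization profile links User groups to devices.
AUTH_PROFILES = {
    "Test_AutoProf1": {
        "user_groups": {"TestGroup1"},
        # The second device name is invented for illustration.
        "devices": {"vl-pun-dombl107", "example-server-2"},
    },
}


def visible_devices(user, user_groups, auth_profiles):
    """Return the devices a user can see through their group memberships."""
    groups = {name for name, members in user_groups.items() if user in members}
    devices = set()
    for profile in auth_profiles.values():
        # A profile applies when it shares at least one group with the user.
        if profile["user_groups"] & groups:
            devices |= profile["devices"]
    return devices


print(sorted(visible_devices("xyz_user", USER_GROUPS, AUTH_PROFILES)))
# ['example-server-2', 'vl-pun-dombl107']
```

A user who belongs to no group linked to a profile gets an empty set, which mirrors the isolation between end customers that the authorization profile is meant to provide.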
The user can see deeper insights from the dashboard by clicking on one of the servers (for this example, we’ve clicked into the vl-pun-dombl107 server).
Figure 11. Deeper insights.
Once on the Device Details page, the user can click on the three dots beside the Device Name and then click on the “Launch Dashboard” pop-up to delve into all of the performance details:
Figure 12. Launching dashboard.
Once the user has clicked on Launch Dashboard, the system will default to the same device and show the CPU utilization, Memory usage, Disk usage, Network bandwidth utilization, and related events for this device.
Figure 13. Performance details.
As you can see, creating and managing multiple end users with personalized dashboards is quite simple. Using BMC Helix User group and Authorization profiles, the administrator can easily create the views needed to support personalized dashboards based on the user profiles. We hope this process walk-through will provide you with the guidance you have been asking for as you create your dashboards in your environment.
We are also here to answer any questions you might have; please feel free to reach out to us:
The service desk is a valuable ITSM function that ensures efficient and effective IT service delivery. A variety of metrics are available to help you better manage and achieve these goals. These metrics often identify business constraints and quantify the impact of IT incidents. Of course, the vast, complex nature of IT infrastructure and assets generates a deluge of information that describes system performance and issues at every network node. The challenge for the service desk? Identifying the metrics that best describe true system performance and guide you toward optimal issue resolution.
We’ve talked before about service desk metrics, such as the cost per ticket. Another service desk metric is mean time to resolve, which quantifies the time needed for a system to regain normal operating performance after a failure. In this article, we’ll explore mean time to resolve (sometimes abbreviated MTTR), including defining and calculating mean time to resolve and showing how it supports a DevOps environment.
Beyond the service desk, it’s a popular and easy-to-understand metric:
In each case, the popular discussion topic is the time spent between failure and issue resolution. So, let’s define mean time to resolve.
‘Mean time to resolve’ is the average time taken to fix a failed component and return it to an operational state. This metric includes the time spent on the alert and diagnostic processes before repair activities begin. (The average time spent solely on the repair process is called ‘mean time to repair’, also shortened to MTTR.) It can be mathematically defined in terms of maintenance or the downtime duration:
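One common way to write this definition is MTTR = total downtime / number of incidents. The Python sketch below applies that formula to a list of (failure time, resolution time) pairs. The function name `mean_time_to_resolve` and the sample incidents are assumptions made for illustration, not data from a real service desk.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average time from failure to resolution over (failed_at, resolved_at) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total_downtime = sum(
        (resolved_at - failed_at for failed_at, resolved_at in incidents),
        timedelta(),
    )
    return total_downtime / len(incidents)


# Invented sample data: three incidents lasting 30, 90, and 60 minutes.
incidents = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 30)),
    (datetime(2024, 1, 12, 14, 0), datetime(2024, 1, 12, 15, 30)),
    (datetime(2024, 1, 20, 22, 0), datetime(2024, 1, 20, 23, 0)),
]

print(mean_time_to_resolve(incidents))  # 1:00:00
```

Because each duration runs from the failure itself to resolution, alert and diagnostic time is included, which is what separates this figure from mean time to repair.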
In other words, MTTR describes both the reliability and availability of a system:
The shorter the mean time to resolve, the higher the reliability and availability of the system. From a practical service desk perspective, this is what makes the MTTR metric valuable: users of IT services expect services to perform optimally for significant durations as well as at specific instances. For example, Amazon Prime customers expect the website to remain fast and responsive for the entire duration of their purchase cycle, especially during the holiday season. If the website is down several times per day but only for a millisecond, a regular user may not experience the impact.
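The link between MTTR and availability can be sketched with the widely used steady-state relation availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures. This relation is not stated in the article itself, and the 500-hour MTBF and the MTTR values below are invented for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as the fraction of time the service is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hold MTBF at an assumed 500 hours and shorten MTTR step by step.
for mttr in (4.0, 1.0, 0.25):
    print(f"MTTR {mttr:>5} h -> availability {availability(500.0, mttr):.4%}")
```

With MTBF held constant, each reduction in MTTR raises the computed availability.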
It’s a valuable metric for service desks on its own, but it also encourages DevOps culture and practices in a variety of ways:
By following the DevOps philosophy, the service desk can achieve the wider ITSM objectives of efficiently and effectively delivering IT services. Mean time to resolve is one of many service desk metrics that companies can use to gain deeper insights into IT service management and operations activities. With any technology or metric, however, remember that there is no ‘one size fits all’: you’ll want to determine which metrics are useful for your organization’s unique needs, and build your ITSM practice to achieve real-world business goals.
Dive into more about a closely related concept: Mean time to repair (MTTR) >
Most service providers understand the need for service level agreements (SLAs) with their partners and customers. But creating one might feel daunting because you don’t know where to start or what to include. In this article, we share some SLA examples and templates to help you create your own.
An SLA is a documented agreement between a service provider and a customer that defines: (i) the level of service a customer should expect, while laying out the metrics by which service is measured, as well as (ii) remedies or penalties should agreed-upon service levels not be achieved. It is a critical component of any technology vendor contract.
Before you subscribe to an IT service, carefully evaluate the SLA and make sure it is designed to realize maximum service value from an end-user and business perspective. Service providers should pay attention to the differences between internal outputs and customer-facing outcomes, as these can help define the service expectations.
Let’s examine a sample SLA that you can use as a template for creating your own SLAs. Remember that these documents are flexible and unique. Make changes as necessary, and ensure that you correctly identify and include the relevant parties. Also, consider additional topics that you may want to add to your agreement(s) to enhance them, such as:
There are several ways to write an SLA. Below is a mock table of contents that you can leverage to start writing your own SLAs.
Now, I’ll break down each section with a few details and examples.
The first page of your document is simple, yet important. It should include:
Document details & change history
Version | Date | Description | Authorization |
… | … | … | … |
Document approvals
Name | Role | Signature | Date |
… | … | … | … |
Last Review: MM/DD/YYYY
Next Scheduled Review: MM/DD/YYYY
In the next section, the agreement overview should include four components:
Include a brief introduction of the agreement, relevant parties, service scope, and contract duration. For instance:
This is a Service Level Agreement (SLA) between [Customer] and [Service Provider]. This document identifies the services required and the expected level of services from MM/DD/YYYY to MM/DD/YYYY.
Subject to review and renewal scheduled by MM/DD/YYYY.
Signatories:
Include a definition and brief description of terms used to represent services, roles, metrics, scope, parameters, and other contractual details that may be interpreted subjectively in different contexts. This information may also be distributed across appropriate sections of this document instead of collated into a single section.
Term | Description |
SLA | Service Level Agreement |
Accuracy | Degree of conformance between a result specification and standard value. |
Timeliness | The characteristic of performing an action early enough that sufficient time remains to meet the SLA service expectation. |
IT Operations Department | A business unit of [Customer] responsible for internal IT operations. |
… | … |
This section defines the goals of this agreement, such as:
The purpose of this SLA is to specify the requirements of the software-as-a-service (SaaS) solution as defined herein with regard to:
In this section, you’ll want to define the policies and scope of this contract related to application, renewal, modification, exclusion, limitations, and termination of the agreement.
This section specifies the contractual parameters of this agreement:
This section can include a variety of components and subsections, including:
Key performance indicators (KPIs) and other related metrics can and should support your SLA, but the achievement of these alone does not necessarily result in the desired outcome for the customer.
Metric | Commitment | Measurement |
Availability | MTTR (mean time to repair) | |
Reliability | MTTF (mean time to failure) | |
Issue Recurrence | | |
… | … | … |
Severity Level | Description | Target Response |
1. Outage | SaaS server down | Immediate |
2. Critical | High risk of server downtime | Within 10 minutes |
3. Urgent | End-user impact initiated | Within 20 minutes |
4. Important | Potential for performance impact if not addressed | Within 30 minutes |
5. Monitor | Issue addressed but potentially impactful in the future | Within one business day |
6. Informational | Inquiry for information | Within 48 hours |
… | … | … |
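Targets like these can also be written in a form that a ticketing system could check automatically. The Python sketch below encodes the sample response targets and tests whether a response arrived in time. The names `RESPONSE_TARGETS` and `response_met` are hypothetical, and two values are assumptions because the table gives no number for them: “Immediate” is modeled as one minute, and “one business day” as 24 hours.

```python
from datetime import datetime, timedelta

# Response targets from the sample severity table.
# Assumptions: "Immediate" is modeled as 1 minute and
# "one business day" as 24 hours, because the table gives no numbers.
RESPONSE_TARGETS = {
    1: timedelta(minutes=1),   # Outage: immediate
    2: timedelta(minutes=10),  # Critical
    3: timedelta(minutes=20),  # Urgent
    4: timedelta(minutes=30),  # Important
    5: timedelta(days=1),      # Monitor
    6: timedelta(hours=48),    # Informational
}


def response_met(severity, reported_at, responded_at):
    """Return True if the first response arrived within the target window."""
    return responded_at - reported_at <= RESPONSE_TARGETS[severity]


reported = datetime(2024, 3, 1, 10, 0)
print(response_met(2, reported, reported + timedelta(minutes=8)))   # True
print(response_met(3, reported, reported + timedelta(minutes=25)))  # False
```

Keeping the targets in a single mapping means a renegotiated SLA only requires changing the data, not the checking logic.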
Include any exceptions to the SLA conditions, scope, and application, such as:
This SLA is subject to the following exceptions and special conditions:
Here, you’ll define the responsibilities of both the service provider and the customer.
[Customer] responsibilities:
[Service Provider] responsibilities:
Include service management and support details applicable to the service provider in this section.
Service coverage by the [Service Provider] as outlined in this agreement follows the schedule specified below:
Include reference agreements, policy documents, glossary, and relevant details in this section. This might include terms and conditions for both the service provider and the customer, and any additional reference material, such as third-party vendor contracts.
The appendix is a good place to include relevant information that doesn’t seem to fit elsewhere, such as pricing models and charges. The following section is an example of information that you may want to append to your SLA.
Include the pricing models for each service type with detailed specifications.
Service | Capacity | Type – Throughput | Price |
Cloud Storage A | |||
Option | |||
A | 500GB | HDD – 250 MB/s | $5.00/Mo |
B | 10TB | SSD – 500 MB/s | $10.00/Mo |
C | 50TB | SSD – 1000 MB/s | $15.00/Mo |
Additional Storage | |||
A.1 | 100GB | HDD – 250 MB/s | $1.00/Mo |
B.1 | 2TB | SSD – 500 MB/s | $2.00/Mo |
C.1 | 10TB | SSD – 1000 MB/s | $4.00/Mo |
… | … | … | … |
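A pricing table like this maps directly to a simple charge calculation. The Python sketch below computes a monthly charge from the sample prices. The function name `monthly_charge` is hypothetical, and the sketch assumes that add-on storage can be bought in multiples and combined with any base option, which the sample table does not specify.

```python
from decimal import Decimal

# Monthly prices copied from the sample pricing table.
BASE_PLANS = {"A": Decimal("5.00"), "B": Decimal("10.00"), "C": Decimal("15.00")}
ADD_ONS = {"A.1": Decimal("1.00"), "B.1": Decimal("2.00"), "C.1": Decimal("4.00")}


def monthly_charge(plan, add_ons=None):
    """Return the monthly charge for one base plan plus optional add-on units.

    Assumption: add-ons may be bought in multiples and paired with any plan.
    """
    total = BASE_PLANS[plan]
    for code, quantity in (add_ons or {}).items():
        total += ADD_ONS[code] * quantity
    return total


# Option B plus three units of add-on B.1: 10.00 + 3 * 2.00
print(monthly_charge("B", {"B.1": 3}))  # 16.00
```

`Decimal` is used instead of floating-point numbers so that currency amounts add up exactly.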
Though your SLA is intended to be a legally binding agreement, it doesn’t need to be incredibly lengthy or overly complicated. It can also be a malleable document that is improved over time, with the consent of all relevant parties. Our advice: Begin building an SLA using the template and examples above, and consult your customers about any perceived gaps. Because unforeseen circumstances are often inevitable, you can always revisit and adjust the SLA as needed.
Additional SLA templates and examples are available here: