How We Summarize rrweb Sessions Using LLMs to Determine a Bug's User Impact

We’ve all had the experience: you open your Slack and there are hundreds of Sentry alerts for obscure errors:

Error: Bad request
Error: Failed to create item
Error: Token not found

You’ve got two options:

Ignore the errors and wait for a customer to complain
Invest a ton of time investigating each to see if any are important

This dilemma is all too common for developers dealing with noisy error alerts. At Decipher, we’ve tackled this problem head-on by leveraging AI to generate session summaries that make sense of these alerts, highlighting what users were doing right before an error occurred. Our solution integrates rrweb, an open-source library designed to record and replay user interactions on web applications, and LLMs to create concise, plain English summaries.

A primer on rrweb

rrweb is an open-source library that records user interactions on a web application and replays them later. A recording is a series of timestamped events covering the whole session.

Recording Events

At the core of rrweb are its recording capabilities, which capture a wide range of user interactions and DOM mutations, each stored as an “event”.

There are many different types of recorded events:

DOM Changes: Node additions, deletions, and attribute modifications.
User Interactions: Mouse movements, clicks, keyboard inputs, and more.
Viewport Changes: Scrolling and window resizing.
Form Inputs: Changes in form field values.

Each event in rrweb is represented as an object with several key properties, including:

type: The kind of event, for example a full DOM snapshot versus an incremental snapshot such as a mouse movement.
timestamp: When the event occurred, in milliseconds since the Unix epoch.
data: Contains the detailed data related to the event, which varies based on the event type.
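
As a rough illustration of these three properties, the sketch below labels an event from its type and, for incremental snapshots, from the source field inside data. The numeric codes mirror rrweb's EventType and IncrementalSource enums as we understand them, so check them against the rrweb version you record with. describe_event is a hypothetical helper name, not part of rrweb, and the sample event is hand-written rather than taken from a real recording.

```python
from typing import Any, Dict

# Numeric codes from rrweb's EventType enum (verify against your rrweb version).
EVENT_TYPES = {
    0: "DomContentLoaded",
    1: "Load",
    2: "FullSnapshot",
    3: "IncrementalSnapshot",
    4: "Meta",
    5: "Custom",
    6: "Plugin",
}

# A subset of rrweb's IncrementalSource enum, enough for this illustration.
INCREMENTAL_SOURCES = {
    0: "Mutation",
    1: "MouseMove",
    2: "MouseInteraction",
    3: "Scroll",
    4: "ViewportResize",
    5: "Input",
}


def describe_event(event: Dict[str, Any]) -> str:
    """Return a short label such as 'IncrementalSnapshot/Scroll'."""
    label = EVENT_TYPES.get(event["type"], f"Unknown({event['type']})")
    if event["type"] == 3:
        # Incremental snapshots carry their sub-kind in data.source.
        source = event.get("data", {}).get("source")
        label += "/" + INCREMENTAL_SOURCES.get(source, f"Other({source})")
    return label


# Hand-written example event (assumed shape, not real recorded data).
scroll_event = {
    "type": 3,
    "timestamp": 1718000000000,  # milliseconds since the Unix epoch
    "data": {"source": 3, "id": 1, "x": 0, "y": 480},
}
print(describe_event(scroll_event))
```

Note that a mouse movement and a scroll share the same top-level type; it is the data payload that tells them apart.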

Incremental Snapshots

To efficiently manage the volume of data, rrweb uses incremental snapshots. After an initial full snapshot of the DOM, only changes (incremental snapshots) are recorded. This method reduces the amount of data needed to accurately replay a session.

Full Snapshot: Captures the entire state of the DOM at a specific point in time.
Incremental Snapshots: Record only the changes made to the DOM after the full snapshot. These include added, removed, or modified nodes and attributes.
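
To give a feel for what an incremental snapshot of DOM changes carries, here is a minimal sketch that counts the changes in one mutation event. It assumes the mutation payload uses the adds, removes, texts, and attributes fields found in rrweb recordings; summarize_mutation is a hypothetical helper, and the sample event is hand-written.

```python
from typing import Any, Dict


def summarize_mutation(event: Dict[str, Any]) -> Dict[str, int]:
    """Count the changes carried by one incremental DOM mutation event."""
    data = event.get("data", {})
    # type 3 = incremental snapshot, source 0 = DOM mutation (assumed codes).
    if event.get("type") != 3 or data.get("source") != 0:
        raise ValueError("not a DOM mutation event")
    return {
        "added": len(data.get("adds", [])),
        "removed": len(data.get("removes", [])),
        "text_changes": len(data.get("texts", [])),
        "attribute_changes": len(data.get("attributes", [])),
    }


# Hand-written example: a toast is added and a button becomes disabled.
mutation_event = {
    "type": 3,
    "timestamp": 1718000000500,
    "data": {
        "source": 0,
        "adds": [
            {
                "parentId": 12,
                "nextId": None,
                "node": {
                    "id": 40,
                    "type": 2,  # element node
                    "tagName": "div",
                    "attributes": {"class": "toast"},
                    "childNodes": [],
                },
            }
        ],
        "removes": [],
        "texts": [],
        "attributes": [{"id": 17, "attributes": {"disabled": ""}}],
    },
}
print(summarize_mutation(mutation_event))
```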

Replaying

Usually, when you capture rrweb sessions, you want to play them back in video form. A session is replayed by rebuilding the DOM from a full snapshot and then reapplying the incremental changes in timestamp order.
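
rrweb's own replayer handles this for you, but the ordering matters for summarization too. The sketch below picks out the events needed to rebuild the page as it looked at a given moment: the latest full snapshot at or before that moment, followed by the incremental snapshots after it. It is a simplified illustration, assuming type codes 2 and 3 for full and incremental snapshots; events_to_rebuild is a hypothetical helper and does not apply the changes to a DOM.

```python
from typing import Any, Dict, List

FULL_SNAPSHOT = 2          # assumed rrweb EventType code
INCREMENTAL_SNAPSHOT = 3   # assumed rrweb EventType code


def events_to_rebuild(events: List[Dict[str, Any]], at_ms: int) -> List[Dict[str, Any]]:
    """Return the latest full snapshot at or before at_ms, followed by the
    incremental snapshots between it and at_ms, in timestamp order."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    base_index = None
    for i, event in enumerate(ordered):
        if event["timestamp"] > at_ms:
            break
        if event["type"] == FULL_SNAPSHOT:
            base_index = i
    if base_index is None:
        return []  # nothing to rebuild from before at_ms
    tail = [
        e for e in ordered[base_index + 1:]
        if e["timestamp"] <= at_ms and e["type"] == INCREMENTAL_SNAPSHOT
    ]
    return [ordered[base_index]] + tail
```

Starting from the most recent full snapshot, rather than the first one, keeps the number of incremental changes to reapply small.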

Summarizing rrweb sessions with LLMs

Our summarization pipeline selects only the most relevant events for the LLM's context. Here’s an overview of the process:

Identify the Last Click Event Before the Error: We iterate through the recorded events to find the last user interaction (typically a click) that occurred before the error timestamp. This helps us understand the user's actions leading up to the error.
Traverse Nodes to Find the Clicked Element: Once we identify the last click event (which includes the node ID), we need to locate the specific node in the DOM that was clicked. This involves traversing the recorded DOM nodes and their child nodes to match the node ID with the one captured in the click event. By doing this, we can pinpoint the exact element that the user interacted with before the error occurred.
Filter Events Based on Timestamps: With the clicked node identified, we filter the recorded events down to those that fall within a time window around three moments: the loading of the clicked node, the click event, and the error. This captures what the user was doing right before and after the error, along with the context on what the clicked element is. A buffer size determines how far back and forward we look from each of these timestamps, and events that fall within the buffer are included in our analysis.
Include Relevant Events: The filtered events include various types of user interactions and DOM changes. These events provide a detailed view of the user's actions and the product's responses, helping us to reconstruct the sequence of events leading up to the error.
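
The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not our production code: the numeric codes follow rrweb's enums as we understand them, the function names and the 2000 ms default buffer are hypothetical, and the node lookup only searches full snapshots and nodes added by mutations.

```python
from typing import Any, Dict, List, Optional, Tuple

Event = Dict[str, Any]
Node = Dict[str, Any]

FULL_SNAPSHOT, INCREMENTAL_SNAPSHOT = 2, 3   # assumed EventType codes
MUTATION, MOUSE_INTERACTION = 0, 2           # assumed IncrementalSource codes
CLICK = 2                                    # assumed MouseInteractions code


def find_last_click(events: List[Event], error_ms: int) -> Optional[Event]:
    """Step 1: the latest click at or before the error timestamp."""
    clicks = [
        e for e in events
        if e["type"] == INCREMENTAL_SNAPSHOT
        and e["data"].get("source") == MOUSE_INTERACTION
        and e["data"].get("type") == CLICK
        and e["timestamp"] <= error_ms
    ]
    return max(clicks, key=lambda e: e["timestamp"], default=None)


def find_node(root: Node, node_id: int) -> Optional[Node]:
    """Depth-first search of a serialized node tree for node_id."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.get("id") == node_id:
            return node
        stack.extend(node.get("childNodes", []))
    return None


def locate_clicked_node(
    events: List[Event], node_id: int, click_ms: int
) -> Optional[Tuple[Node, int]]:
    """Step 2: the clicked node plus the timestamp of the event that most
    recently rendered it before the click."""
    found = None
    for e in sorted(events, key=lambda e: e["timestamp"]):
        if e["timestamp"] > click_ms:
            break
        roots: List[Node] = []
        if e["type"] == FULL_SNAPSHOT:
            roots = [e["data"]["node"]]
        elif e["type"] == INCREMENTAL_SNAPSHOT and e["data"].get("source") == MUTATION:
            roots = [add["node"] for add in e["data"].get("adds", [])]
        for root in roots:
            node = find_node(root, node_id)
            if node is not None:
                found = (node, e["timestamp"])
    return found


def filter_events(events: List[Event], anchors_ms: List[int], buffer_ms: int) -> List[Event]:
    """Step 3: keep events within buffer_ms of any anchor timestamp."""
    return [
        e for e in events
        if any(abs(e["timestamp"] - a) <= buffer_ms for a in anchors_ms)
    ]


def select_context(events: List[Event], error_ms: int, buffer_ms: int = 2000) -> List[Event]:
    """Steps 1-4: the events around the node load, the click, and the error."""
    click = find_last_click(events, error_ms)
    if click is None:
        return filter_events(events, [error_ms], buffer_ms)
    anchors = [click["timestamp"], error_ms]
    located = locate_clicked_node(events, click["data"]["id"], click["timestamp"])
    if located is not None:
        anchors.append(located[1])  # when the clicked node was rendered
    return filter_events(events, anchors, buffer_ms)
```

Calling select_context(events, error_ms) with a session's events and the error timestamp returns the subset that would go into the LLM context. Anchoring on three timestamps instead of one continuous window is what lets the button's initial render be included even when it happened long before the click.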

Choice of LLM

We experimented with a few different LLMs including GPT-4o, Claude 3 Haiku, Claude 3 Sonnet, Claude 3 Opus, and Gemini 1.5.

After manual evaluation, we determined that a mix of Haiku and Sonnet provides the sweet spot for context window, speed, and cost.

Limitations

The summarizer can struggle with very long sessions where relevant events (like the initial rendering of a clicked button and the click event itself) are widely dispersed.

The future

We are investigating further enhancements, including newer models such as Gemini 1.5 Flash and interspersing actual visuals from the session with the event data for multimodal interpretation.

Try it for yourself!

Book some time here and we'll walk you through how to instrument your application in under 2 minutes.

Written by:

Michael Rosenfield

Co-founder
