Percentile in Looker: Exploring System Activities
Introduction
Looker, a powerful business intelligence tool, is designed to unlock significant value for your organization. It can be utilized for a variety of tasks, from monitoring key performance metrics to validating specific queries. For users, one of the most important aspects to consider is the performance of the tool. Dashboards need to load swiftly and provide data in a timely manner. This is crucial as it not only improves user experience but also enhances productivity by providing necessary insights. Paraphrasing a quote from the book "Designing for Performance" [1], we are just one click away from moving from our dashboard to other things.
Think about how you search for things on the Web. How quick are you to close a tab and go to the next search engine result if a site takes too long to load? [2]
Google has also conducted research and published a blog post titled 'Speed Matters' [3]. They describe how slower searches can have an impact on the number of searches over time.
Figure 2. from “Speed Matters for Google Web Search” [4]
Background
To gain a deeper understanding of Looker's performance, we can utilize the System Activity Explorer [3]. This feature offers an extensive overview of Looker's internal operations, offering valuable insights into how the system is functioning. Our goal is to assess the overall performance of our existing dashboards. Initially, my plan was to simply find the queries with the longest execution times and use that as a basis for comparison. However, I wanted to have one simple metric that would provide an accurate representation of overall performance and work in tandem with the maximum longest execution times. The second problem with the longest execution times is that we have extreme times which might not represent the majority of experiences. Percentiles [4] have been chosen as a more effective alternative to gain deeper insight into performance. This statistical measure provides a more nuanced view of our data, allowing us to understand not just average performance, but also the variability and distribution of performance metrics.
Looker limitation
Looker doesn't support the direct use of percentiles from System Activities. By leveraging Table Calculations, we can create a workaround to extract the desired percentile information.
Internal looker informations
The Query Performance Metrics explorer is the most relevant component when it comes to assessing query execution times within Looker. By navigating to the History view, users can access the runtime field which holds valuable information about the duration of each query execution. However, when attempting to analyze this data further, the provided measurements are somewhat limited. Users can only obtain the minimum, maximum, and average execution times for the queries.
Looker table calculation
Looker offers a useful feature in the form of percentile measurement types, which can help users analyze data more effectively by focusing on specific segments of the distribution. However, this functionality is not available for use with System Activities explorers, limiting the options for evaluating query performance within that context.
To work around this limitation, users can employ table calculations as an alternative method. Table calculations [5] allow for the creation of custom metrics and data transformations based on the existing columns within a dataset. By leveraging this feature, users can still gain insights into query performance and identify potential areas for optimization, even without direct access to percentile measurements in the System Activities explorers.
Table calculations require all data to be present in the table. Let's say we are interested in queries from the last 12 hours.
Add table calculation using percentile expression
Results
Summary
Understanding Looker performance and optimizing its use is key to fully leveraging its capabilities. While Looker may not offer direct percentile measurements within its System Activities, we've demonstrated how Table Calculations can serve as an effective workaround. This approach allows us to dive deeper into our data, providing a more comprehensive view of our dashboards performance
References
Lara Callender Hogan, 2014. Designing for Performance. https://designingforperformance.com
Lara Callender Hogan, 2014. Designing for Performance. Chapter 1 Performance Is User Experience. https://designingforperformance.com/performance-is-ux/
Speed Matters. https://ai.googleblog.com/2009/06/speed-matters.html
Speed Matters for Google Web Search. https://services.google.com/fh/files/blogs/google_delayexp.pdf
Looker Documentation. System Activie explorer. https://cloud.google.com/looker/docs/system-activity-pages
Looker Documentation. Percentile. https://cloud.google.com/looker/docs/reference/param-measure-types#percentile
Looker Documentation. Table calculation. https://cloud.google.com/looker/docs/table-calculations