~/devreads

#analytics

17 posts

28 May

12 May

24 Mar

Prakhar Sapre 8 min read

Expedia Group Technology: Data. Workload-aware routing for Trino. Trino, a fork of PrestoSQL, is a powerful tool in modern data analytics, enabling organizations to query large datasets quickly and efficiently. As a distributed SQL query engine, Trino provides fast, scalable insights without requiring data relocation. While Trino is robust on its…

trino-gateway, sql, analytics, trinos, data-science

27 Jan

Alyssa White, PhD 7 min read

Expedia Group Technology: Data. Two roles, one goal: understanding users better. By Sophie Rabet and Alyssa White. Quantitative User Experience (UX) Research, as a discipline, is growing rapidly. Quant UX Con 2022, the first ever general industry conference for the discipline, was organized with the expectation of about 200 attendees. After…

career-advice, data-science, analytics, ux-research, quantitative-ux-research

26 Nov 2025

Sujit Singh 7 min read

Introduction: In an age where artificial intelligence (AI) and machine learning (ML) are integral to almost every aspect of our lives, ensuring the effectiveness, fairness, and reliability of ML models is paramount. Observability plays a crucial role in maintaining the performance of these models, allowing us to detect and resolve issues promptly. At Helpshift, we recognized the need for robust…

analytics, artificial-intelligence, machine-learning, observability

Gayatri Panganti 4 min read

Every accurate metric is backed by countless validations, event checks, and integrity tests in the background. Introduction: Quality assurance in data-driven systems extends beyond UI validation and backend verification. Such systems rely heavily on data precision and accuracy. A recent QA effort focused on validating a productivity analytics framework, ensuring that every event, metric, and data flow accurately represented real-world…

data-validation-testing, kafka-consumer, analytics

6 Oct 2025

Poorva Patil 9 min read

As a data engineer, I used to see metrics as just numbers on a dashboard — until I realized they’re the lens through which customers view and run their operations. In customer support, for example, agent productivity metrics aren’t just figures, they’re actionable insights that drive efficiency, shape staffing decisions, and directly impact customer satisfaction. These aren’t just charts —…

apache-spark, analytics, big-data, data-analysis

6 Jan 2025

7 Aug 2024

2 Jul 2024

Nilanjana Mukherjee 9 min read

Slack Data Engineering recently migrated its data workloads from AWS EMR 5 (Spark 2/Hive 2 processing engine) to EMR 6 (Spark 3 processing engine). In this blog, we will share our migration journey, the challenges we faced, and the performance gains we observed in the process. This blog aims to assist Data Engineers, Data Infrastructure Engineers, and Product…

uncategorized, analytics, aws, big-data, data-engineering

30 Apr 2024

Mithil Oswal 5 min read

A simple guide on how to connect Snowflake data in Power BI to create reports, publish them, and schedule refreshes. Prerequisites: Well, since you’ve already reached this page, I’m assuming that you know of, and have access to, both tools: Snowflake and Power BI. In case you do not have…

reporting, pipeline, snowflake, analytics, power-bi

28 Jan 2019

lukaseder 1 min read

In my previous article, I showed what the very useful percentile functions (also known as inverse distribution functions) can be used for. Unfortunately, these functions are not ubiquitously available in SQL dialects. As of jOOQ 3.11, they are known to work in these dialects (columns: dialect, as aggregate function, as window function): MariaDB 10.3.3: No, Yes …

sql, aggregate functions, analytics, inverse distribution function, ordered-set aggregate function

19 Apr 2018

Colin Schimmelfing 7 min read

For data engineers and analysts, it’s pretty common to get questions about missing or incorrect data. “Hey Data Engineer, there’s an issue with the data – I expect numbers at least 20% higher than what our reporting tools show. Can you take a look?” If you’ve ever been responsible for a Business Intelligence pipeline, you’ve […]

general, analytics, data

20 Apr 2017

lukaseder 1 min read

At a customer site, I’ve recently encountered a report where a programmer needed to count quite a bit of stuff from a single table. The counts all differed in the way they used specific predicates. The report looked roughly like this (as always, I’m using the Sakila database for illustration): And then, unsurprisingly, combinations of …

sql, aggregate function, analytics, count, cube

25 Apr 2016

lukaseder 1 min read

Listicles like these do work – not only do they attract attention, if the content is also valuable (and in this case it is, trust me), the article format can be extremely entertaining. This article will bring you 10 SQL tricks that many of you might not have thought were possible. The article is a …

sql, analytic functions, analytics, common table expressions, match recognize clause

20 Feb 2015

Trey Perry 4 min read

Every holiday season, the virtual doors of your favorite retailer are blown open by a torrent of shoppers who are eager to find the best deal, whether they’re looking for a Turbo Man action figure or a ludicrously discounted 4K flat screen. This series focuses on our Big Data analytics platform, which is used to learn more […]

big data, analytics, bigdata, reporting