~/devreads

#database

36 posts

Yesterday

4 Jun

3 Jun

Netflix Technology Blog 11 min read

By Rajiv Shringi, Kaidan Fullerton, Oleksii Tkachuk and Kartik Sathyanarayanan. Introduction: Netflix’s TimeSeries Abstraction is a scalable system for ingesting and querying petabytes of temporal event data with millisecond latency. We use Apache Cassandra 4.x as the underlying storage for these main reasons: Throughput, latency, and cost: Cassandra can handle millions of low‑latency reads and writes in…

scalability, cassandra, database, distributed-systems, timeseries

14 May

16 Apr

25 Feb

28 Jan

3 Nov 2025

22 Oct 2025

10 Sept 2025

28 Aug 2025

Raphael Montaud 14 min read

How we made our filtering 10x cheaper by removing our Bloom Filters. Bloom Filters are great tools for fast and cheap filtering. They also come with plenty of problems and can easily get expensive and cumbersome. We switched to user-based direct database queries, which made our filtering cheaper and easier to maintain. Here’s the full breakdown of that migration.…

database, software-engineering, recommendation-system, bloom-filter, dynamodb

25 Aug 2025

Raphael Montaud 6 min read

Cross-Digest diversification. In this part 4, we’ll see how we went from investigating a few complaints from digest power users to improving our digest recommendations across the board. Intro: This is a 4-part series breaking down improvements to the algorithm behind Medium’s Daily Digest over the past year. When we started this work, the Digest was suboptimal —…

programming, recommendation-system, software-engineering, database, machine-learning

Kovi 9 min read

Discover how Bazaarvoice migrated millions of UGC records from RDS MySQL to AWS Aurora – at scale and with minimal user impact. Learn about the technical challenges, strategies, and outcomes that enabled this ambitious transformation in reliability, performance, and cost efficiency. Bazaarvoice ingests and serves millions of user-generated content (UGC) items—reviews, ratings, questions, answers, and […]

database, devops

19 Jun 2025

somesh sharma 5 min read

Optimistic locking is a concurrency control mechanism where we assume that multiple transactions can safely access data without conflict, allowing them to proceed without locking the data upfront. Unlike pessimistic locking, where resources are locked to avoid conflicts, optimistic locking allows transactions to proceed without locks and checks for conflicts only when updating the data. If a conflict is detected…

optimistic-locking, database, concurrency-control, isolation-level, transaction-management

7 Nov 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to see the best way to determine the optimal connection pool size using the FlexyPool auto-incrementing pool strategy. If you are unfamiliar with the reason why database applications need a connection pool, then check out this article first. Now, according to the Universal Scalability Law, the maximum throughput of a database system is…

database, flexypool, connection pooling, database connection provisioning, hikaricp

10 Oct 2024

24 Jul 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to see how we can use symbolic links to move the DB data folder. I needed to move the data folder from the C to the D Windows partition because the C partition was running out of disk space. DB data folder: A relational database system requires to store…

database, oracle, symbolic links

10 Apr 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to investigate the difference between the PostgreSQL FOR UPDATE and FOR NO KEY UPDATE when locking a parent record and inserting a child row. Domain Model: To see the difference between the PostgreSQL FOR UPDATE and FOR NO KEY UPDATE locking clauses, consider the following one-to-many table relationship where the post table is…

database, postgresql, sql, explicit locking, for no key update

26 Mar 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to analyze how PostgreSQL Heap-Only-Tuple or HOT Update optimization works, and why you should avoid indexing columns that change very frequently. PostgreSQL Tables and Indexes: Unlike SQL Server or MySQL, which store table records in a Clustered Index, in Oracle and PostgreSQL, records are stored in Heap Tables that have unique row identifiers.…

database, postgresql, sql, heap-only-tuple, hot

11 Mar 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to analyze the PostgreSQL Index Types so that we can understand when to choose one index type over the other. When using a relational database system, indexing is a very important topic because it can help you speed up your SQL queries by reducing the number of pages that have to be scanned…

database, postgresql, sql, aiven, btree

14 Feb 2024

vladmihalcea 1 min read

Introduction: In this article, we are going to explore various PostgreSQL performance tuning settings that you might want to configure since the default values are not suitable for a QA or production environment. As explained in this PostgreSQL wiki page, the default PostgreSQL configuration settings were chosen to make it easier to install the database on a wide range of…

database, postgresql, aiven, configuration, performance tuning

11 Jan 2024

28 Sept 2023

Claire Adams 5 min read

Cron scripts are responsible for critical Slack functionality. They ensure reminders execute on time, email notifications are sent, and databases are cleaned up, among other things. Over the years, both the number of cron scripts and the amount of data these scripts process have increased. While generally these cron scripts executed as expected, over time…

database, golang, infrastructure, kubernetes, scalability

31 Aug 2023

vladmihalcea 1 min read

Introduction: In this article, we’re going to see how the PostgreSQL JDBC Driver implements Statement Caching and what settings we need to configure in order to optimize the performance of our data access layer. Prepared Statements: The JDBC API allows you to create a PreparedStatement by calling the prepareStatement(java.lang.String) method on a given Connection reference. For this reason, it’s very common…

database, sql, connection pooling, postgresql, prepared statement

19 Apr 2023

Rob 1 min read

I recently needed to count the number of rows in an SQL query that had a Group By clause. It looked something like this: SELECT account_name FROM events WHERE created_at >= CURDATE() - INTERVAL 3 MONTH GROUP BY account_id This provides a list of account names (28 in my case), but if you try to count them using: SELECT COUNT(account_name) as…

database

23 Feb 2023

vladmihalcea 1 min read

Introduction: In this article, we are going to explore the YugabyteDB architecture and see how it manages to provide automatic sharding and failover without compromising data integrity. YugabyteDB is a distributed SQL database, so its architecture is different than the ones employed by traditional relational database systems. Traditional relational database architecture: Most relational database systems use a Single-Primary replication architecture,…

database, docdb, lsm, software architecture, yugabytedb

8 Feb 2023

vladmihalcea 1 min read

Introduction: In this article, we are going to see how we can achieve fault tolerance in your Spring Data application with the help of YugabyteDB. As previously explained, YugabyteDB is an open-source distributed SQL database that combines the benefits of traditional relational databases with the advantages of globally-distributed auto-sharded database systems. Fault tolerance: First, let’s start with the definition of…

database, spring, fault tolerance, spring data, yugabytedb

11 Jan 2023

vladmihalcea 1 min read

Introduction: In this article, we are going to see the overhead of acquiring a new connection when using YugabyteDB and why connection pooling is mandatory for performance. Acquiring a database connection using JDBC: To interact with a database system, first, we need to acquire a database connection. And, when using Java, we need to obtain a Connection object from the…

database, sql, connection, connection pooling, hikaricp

30 Oct 2022

Srinivas Tamada 1 min read

Pocketbase is an open-source application and an alternative to Google Firebase. It offers a realtime database, authentication (including social), and file storage for your next web and mobile application. This article is about how to host the Pocketbase application server, which usually runs on port 8090, alongside your existing application server. If you are using a Linux and Apache-based server, the following…

apache, authentication, database, hosting, pocketbase

24 May 2017

Seth Hubbell 7 min read

(Always One More Thing…) Who Are We? The Ad Management team here at Bazaarvoice grew out of an incubator team. The goal of our incubator is to quickly iterate on ideas, producing prototypes and “proof of concept” projects to be iterated on if they validate a customer need. The project of interest here generates reports […]

testing, backend, database, emodb, migration

12 Mar 2015

lukaseder 1 min read

The past decade has been an extremely exciting one in all matters related to data. We have had: An ever increasing amount of data produced by social media (once called “Web 2.0”); An ever increasing amount of data produced by devices (a.k.a. the Internet of Things); An ever increasing amount of database vendors that explore … Continue reading 3 Reasons…

sql, database, dzone, persistence, rdbms

20 Oct 2014

lukaseder 1 min read

One of MongoDB’s arguments when evangelising MongoDB is the fact that MongoDB is a “schemaless” database: Why Schemaless? MongoDB is a JSON-style data store. The documents stored in the database can have varying sets of fields, with different types for each field. And that’s true. But it doesn’t mean that there is no schema. There … Continue reading Stop Claiming…

sql, database, database schema, dynamically typed, javascript

22 Jul 2014

7 Jan 2014

Schakko 1 min read

Recently I stumbled upon the same problem this guy described. Our Oracle database instance contains multiple schemas with almost the same structure. Every developer has its own schema for unit and integration tests. On application startup the Hibernate schema validator calls DatabaseMetaData.getTables() for every linked entity. The method returns the first […] The post Hibernate uses wrong schema during…

application server, databases, database, hibernate, java

7 Feb 2013

Schakko 1 min read

Play SQL is an Atlassian Confluence plug-in for querying database tables and displaying the results inside a Confluence page. The plug-in natively supports only PostgreSQL and HSQL, but other drivers can be used via a JNDI datasource. To use MySQL with Play SQL you have to download the […] The post Use Confluence Play SQL Plug-in with…

application server, atlassian, confluence, database, jira

13 Dec 2011

Junior Grossi 1 min read

Hi all. Here I am again. Today I have a quick tip for beginners using Zend Framework. Do not insert pre and post code (for database) in your Controller. The Zend_Db_Table_Row is for that. Let’s create our DatabaseTable class for Posts: /** * Located in .../models/DbTable/Posts.php */ class Posts extends Zend_Db_Table_Abstract { protected $_primary = … Continue reading Pre and…

database, php, zend framework, zend db, zend db table