Problematic Terminology in Open-Source

In Database, Development, Open Source

It remains common practice in database systems to describe a configuration in which one database is the source of truth and another is a replica that follows its state as a “master/slave” configuration.

Use of this term is problematic. It references slavery to convey meaning about the relationship between two entities. The term “slave” is used because one system is controlling the state of the other system.

Using the terms this way is cavalier. It downplays slavery and the massive human suffering it causes. An everyday use for the term “slave” normalizes the idea of having things called “slaves” and desensitizes us to the seriousness of slavery. More importantly, the casual use of the term may be an unwelcome daily presence in the life of a person of color, for whom slavery has great personal significance.

There will be SQL

In Database, golang

I recently worked on my first Go project. As a web developer, the applications I work with are often database-driven. If you are like me, you might be curious about what working with a database is like in Go. And if you're used to working with a web framework like Rails, you might be wondering about an ORM. As the title of this article implies, there aren't a lot of options. In this article we'll learn to relax and go back to working without an ORM.

Adventures in Searching with Postgres: Part 1

In Database, Rails, Ruby

For a recent project, we built a live search over more than 60,000 records that used both simple pattern matching and full-text search. The records we needed to search were diagnoses that are used to assign patients to a therapy. I had never done full-text search or anything real-time with that many records, so I ended up doing a lot of experimentation. These posts will cover my experience, and I hope they’ll be of value to anyone implementing their own Postgres search.


The Problem

The records we were searching were International Statistical Classification of Diseases (ICD) codes. Each record consists of a 3-7 character code, a short description averaging 11 words, and a few other data and association fields (for this post, I generated 100,000 records matching the real ICD format). We needed to be able to search by code and description, and users would be changing their search query quickly to find the right record, so it needed to be responsive. In part one, I’ll cover the code search where a user enters one or more codes (which may be partial).
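As a rough illustration of the code search described above, here is a minimal sketch of prefix matching on one or more partial codes. It is not the implementation from the post: it uses Python's built-in sqlite3 as a stand-in for Postgres, and the `diagnoses` table, the `search_codes` and `escape_like` helpers, and the sample rows are all made up for the example.

```python
import re
import sqlite3


def escape_like(term):
    """Escape LIKE wildcards so user input is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_codes(conn, query, limit=20):
    """Return codes that start with any of the (possibly partial) codes in query."""
    # Split on whitespace or commas; upper-case because the sample codes are upper-case.
    terms = [t.upper() for t in re.split(r"[\s,]+", query.strip()) if t]
    if not terms:
        return []
    # One "code LIKE 'prefix%'" clause per entered term, combined with OR.
    where = " OR ".join(["code LIKE ? ESCAPE '\\'"] * len(terms))
    params = [escape_like(t) + "%" for t in terms]
    sql = f"SELECT code FROM diagnoses WHERE {where} ORDER BY code LIMIT ?"
    return [row[0] for row in conn.execute(sql, params + [limit])]


# Hypothetical sample data in an ICD-like format (descriptions are placeholders).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnoses (code TEXT PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO diagnoses VALUES (?, ?)",
    [
        ("A01.0", "placeholder description 1"),
        ("A01.1", "placeholder description 2"),
        ("A02", "placeholder description 3"),
        ("B20", "placeholder description 4"),
    ],
)

print(search_codes(conn, "a01 b2"))
```

The same `LIKE 'prefix%'` shape carries over to Postgres, with one difference worth keeping in mind: `LIKE` is case-sensitive in Postgres but case-insensitive for ASCII in SQLite by default, which is why the sketch normalizes the input to upper case.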

RethinkDB: a Qualitative Review

In Database, Development, Ops

At Carbon Five we install and use many different database engines. Document-oriented databases are proving to be a good fit for more and more of our projects. MongoDB is the most popular of these and provides a powerful set of tools to store and query data, but it’s been plagued by performance problems when used with very large databases or large cluster sizes. Riak is another interesting option that is built from the ground up to perform at scale. But Riak is difficult to set up and has a minimal API that requires a lot more work to manage the data. RethinkDB is a relative newcomer that wants to fill the gap between these.

The trade-off between developer friendliness and high performance is unavoidable, but I’ve been looking for something in the middle. RethinkDB claims to solve the “1-15 problem”: providing a database that is reasonable to use as a single node but can scale up to around 15 nodes with minimal configuration and no changes to the application. Whether or not this claim holds up remains to be seen. In this post I take it for a test drive and provide a qualitative assessment (i.e. no benchmarks) of its ease of use and effectiveness for application development. The question is: what do developers have to give up for the peace of mind of knowing they won’t have to rip out the persistence layer when the app gains popularity? (tl;dr: not much.)

Using Redis sorted sets to build a scalable real-time web waiting list.

In Database, Development

As websocket communication makes large real-time web experiences more common, we are increasingly faced with the problem of how to build apps that work just as well with many concurrent users as they do with a few. The problem touches both the technical limits of the server infrastructure and the design of the user experience. This post argues that imposing an artificial constraint on concurrent users and then designing around that constraint will lead to better real-time apps. It then describes a scalable waiting list implementation (with a working demonstration project) built around Redis sorted sets.
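To make the idea concrete, here is a minimal sketch of a capped waiting list. It is not the demonstration project from the post: the `SortedSet` class is a tiny in-memory stand-in that only mimics the behavior of the Redis commands ZADD (with NX), ZRANK, ZREM and ZPOPMIN, and the `WaitingList` class, its method names, and the idea of scoring members by arrival time are assumptions made for illustration.

```python
import bisect


class SortedSet:
    """In-memory stand-in for a Redis sorted set (not a Redis client)."""

    def __init__(self):
        self._scores = {}  # member -> score
        self._order = []   # sorted list of (score, member)

    def zadd_nx(self, member, score):
        """Add member only if absent, like ZADD with NX."""
        if member in self._scores:
            return False
        self._scores[member] = score
        bisect.insort(self._order, (score, member))
        return True

    def zrank(self, member):
        """Zero-based rank by ascending score, or None if absent."""
        if member not in self._scores:
            return None
        return bisect.bisect_left(self._order, (self._scores[member], member))

    def zrem(self, member):
        score = self._scores.pop(member, None)
        if score is None:
            return False
        self._order.remove((score, member))
        return True

    def zpopmin(self):
        """Remove and return the lowest-scored (member, score), or None."""
        if not self._order:
            return None
        score, member = self._order.pop(0)
        del self._scores[member]
        return member, score


class WaitingList:
    """Caps concurrent users; everyone else queues in arrival order."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()
        self.queue = SortedSet()

    def join(self, user, now):
        if user in self.active:
            return "active"
        # Admit directly only when there is room and nobody is already waiting.
        if len(self.active) < self.max_active and self.queue.zpopmin_peek_empty():
            self.active.add(user)
            return "active"
        self.queue.zadd_nx(user, now)  # score = arrival time
        return "waiting"

    def position(self, user):
        """One-based place in line, or None if the user is not waiting."""
        rank = self.queue.zrank(user)
        return None if rank is None else rank + 1

    def leave(self, user):
        """Remove user, then admit waiting users while there is room."""
        self.active.discard(user)
        self.queue.zrem(user)
        admitted = []
        while len(self.active) < self.max_active:
            nxt = self.queue.zpopmin()
            if nxt is None:
                break
            self.active.add(nxt[0])
            admitted.append(nxt[0])
        return admitted


# Small helper so join() can check for an empty queue without popping.
SortedSet.zpopmin_peek_empty = lambda self: not self._order

wl = WaitingList(max_active=2)
for t, name in enumerate(["ann", "bob", "cat", "dan"], start=1):
    print(name, wl.join(name, now=t))
print(wl.position("dan"), wl.leave("ann"), wl.position("dan"))
```

Scoring each member by arrival time is what makes a sorted set a natural fit here: a user's rank is their place in line, and the lowest-scored member is always the next one to admit.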
