Discover the power of Redis

Redis: A Clear Breakdown

Paweł Dąbrowski

CTO / Software Engineer

When most developers hear the name Redis, they automatically think of caching. That's fair, but serving as a cache in front of another database like MySQL or PostgreSQL is not the only superpower of this popular key-value database server.

Redis is super useful in many scenarios that come up when you build modern web applications that handle a lot of data. This article covers many aspects of Redis, and you will benefit from reading it no matter what your level of experience is. By the end, you will understand what Redis is and when to use it.

We will start with some basic usage instructions and an explanation of the data structures, move through the deployment process and running Redis in production, and take a deeper look at how Redis works.

Getting started with Redis

I won't be using any specific programming language, but I can assure you that all popular programming languages have Redis clients, so you can go and start experimenting with Redis using your favorite tools.

Why - what problems does it solve

You should always start your learning process by understanding what problems the technology solves. Redis provides exceptional performance and simplicity when it comes to storing and manipulating high-level data types like strings, lists, sets, sorted sets, or hashes.

Redis works very well in situations where traditional relational databases perform poorly or make the solution difficult to implement:

  • Caching - you can significantly improve response times and reduce the load on your main database thanks to fast read and write operations provided by Redis.
  • Publish and subscribe messaging - you can design real-time communication between various components of the application using channels
  • Background job processing - use Redis as a message broker to provide scheduling and processing of tasks in an asynchronous mode
  • Counting things - thanks to the support for atomic increment and decrement operations, and sorted sets, you can easily and effectively implement statistics features, leaderboards, and ranking algorithms.

The examples mentioned above are not the only ones where Redis does a great job. Think about Redis whenever you need to manipulate high-level data types with exceptional performance, not as a replacement for a traditional database but as an enhancement.

How - the way Redis is working

How is it possible that Redis is so efficient? It's because of the nature of this technology, and the following are the most fundamental characteristics:

  • In-memory storage - Redis keeps its data in RAM, and that is the reason why the operations are so fast.
  • High-level data types - Redis works directly on structures such as lists, sets, and hashes; a good practice is to keep only simple data inside Redis and store larger, more expensive information on disk
  • Persistence - the data does not have to be lost when the server is restarted because Redis provides mechanisms allowing you to save data to disk and recover it

RAM is smaller and more expensive than other types of storage, but Redis has built-in mechanisms to set memory limits and to control what happens when you run out of memory.

Who - authors behind the solution

Salvatore Sanfilippo started the project back in 2009 as a solution to improve the scalability of his application. After successful tests, he released the code as open source.

Basic usage of Redis

As I mentioned before, Redis is a key-value store, so you reference the saved value by using a unique key:

redis 127.0.0.1:6379> SET name John
OK
redis 127.0.0.1:6379> GET name
"John"

With the default configuration, a value saved in such a way is persistent. You can shut down the Redis server, and when you start it again, you will be able to get the value for the name key.

Basic data types

In the above example, I used a value of the string type, but Redis also has built-in support for other high-level data types. Also, when it comes to strings, a popular approach is to save a serialized JSON string and deserialize it later in the application:

> SET user '{"name": "John", "email": "john@doe.com"}'

When dealing with bigger string values, you have to remember that Redis, by default, has a limit of 512 MB per single string value.
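As a minimal sketch of that round trip in Python (the `user` dictionary is an illustrative value, and the actual SET and GET calls are left out because they depend on the client library you choose), serialization and deserialization could look like this:

```python
import json

# Illustrative record; in a real application it would come from your domain model.
user = {"name": "John", "email": "john@doe.com"}

# Serialize before sending the value to Redis with SET.
payload = json.dumps(user)

# Deserialize after reading the value back with GET.
restored = json.loads(payload)
```

Note that valid JSON uses double quotes around keys and string values, which is what `json.dumps` produces.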

List

The list is a collection of string values. Those values can be duplicated:

# Add items to names list
> LPUSH names John
(integer) 1
> LPUSH names Tim
(integer) 2

# Check the size of the list
> LLEN names
(integer) 2

# First in, first out: RPOP takes items from the opposite end to LPUSH
> RPOP names
"John"

The maximum size of the list is 4,294,967,295 elements.
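Which end you pop from decides whether the list behaves like a queue or like a stack. The sketch below models this with a Python `deque` standing in for the Redis list; it is only an illustration of the ordering, not a Redis client:

```python
from collections import deque

# A deque stands in for the Redis list "names".
names = deque()
names.appendleft("John")   # LPUSH names John
names.appendleft("Tim")    # LPUSH names Tim

stack_item = names[0]      # what LPOP would return: the newest item (last in, first out)
queue_item = names[-1]     # what RPOP would return: the oldest item (first in, first out)
```

Pairing LPUSH with RPOP gives queue behavior, while pairing LPUSH with LPOP gives stack behavior.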

Set

If you need a collection of string values, but you need unique values, use a data structure called Set:

# Add item to emails set
> SADD emails john@gmail.com
(integer) 1

# Check if email is already in the set
> SISMEMBER emails john@gmail.com
(integer) 1
> SISMEMBER emails tim@gmail.com
(integer) 0

The maximum size of the set is the same as for the list, 4,294,967,295 elements.

Sorted set

If you need a collection of unique string values but sorted by an associated score, use a structure called a sorted set:

# Add value with score, in this case age and name
> ZADD userages 65 John
(integer) 1
> ZADD userages 25 Tim
(integer) 1
> ZADD userages 70 Andrew
(integer) 1

# Get the position of Andrew (he is the oldest)
> ZREVRANK userages Andrew
(integer) 0

Hash

A hash is an alternative to storing serialized JSON, as this structure allows you to have a collection of key-value pairs:

> HSET user:1 name John email john@doe.com
(integer) 2
> HGET user:1 name
"John"

Hash is limited to 4,294,967,295 key-value pairs.

Redis community

A lot is happening in the community around Redis. If you want to get in touch with experts, seek help, stay up to date with the development, and take part in interesting discussions, join the community in the following places:

  • LinkedIn - the biggest group has around 5k members, but smaller groups also exist; they are often focused on certain physical locations like the UK, France, or Italy
  • Reddit - the main subreddit around this technology has more than 7k members, and each day, there are new posts published by the community
  • Facebook - smaller groups of enthusiasts meet on Facebook, but you can still find an interesting discussion there among self-promotion posts
  • Redis official forum - if you prefer a more traditional way of communicating, check out the official forum where, on average, 2 - 3 new topics are created every week

You should also read the official community page on the Redis website, where various information about the community is posted.

Clients and integrations

The code examples presented in this article are executed directly in the Redis console, but during development, you will mostly be using a Redis client for your programming language of choice. The official Redis documentation lists the available clients for various technologies.

Building with Redis

It's time to discuss some real-world usage examples for Redis. Through various use cases, I will demonstrate the different features of this key-value database and explain more advanced data types.

Caching

A lot of programmers know Redis primarily as a solution for caching. The following features make this technology a perfect caching engine:

  • Fast write and read - Redis keeps the data in the in-memory storage; that's why the reading and writing operations are super fast.
  • Automatic expiration - if you don't want to expire the cache manually, you can leverage the TTL feature that will automatically remove the item from the memory after a specified amount of time.
  • Publish / subscribe messaging - when you deal with distributed systems, you need a good and fast cache invalidation mechanism. Redis implements the pub/sub pattern, so you can easily notify other systems each time there is a need to invalidate the data.
  • Memory management - if you hit the memory limit of your cache, you can choose a policy that decides which data is removed first.

It's also worth mentioning that thanks to the persistence options, you don't have to worry that you will lose your cached data in case of restart or failure. With atomic operations, you can easily maintain data integrity and consistency.

It's time to see some examples.

Automatic expiration

The abbreviation TTL stands for time to live. Such a value is usually set for cached data that is needed for a certain amount of time and should then be refreshed with the newest information.

Once you set a key, you can check its time to live:

> SET day Monday
OK
> GET day
"Monday"
> TTL day
(integer) -1

I used the TTL command to check how long the key will be kept. By default, we get -1, which means that the key has no expiration and will be present until we delete it. Let's set the TTL to 10 seconds for this key:

> EXPIRE day 10
(integer) 1
> TTL day
(integer) 9
> TTL day
(integer) 8

As you can see, the time to live decreases every second, as expected. After 10 seconds, TTL returns -2, which tells us that the given key no longer exists.
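The three possible answers of TTL (-1, a number of remaining seconds, and -2) can be modeled in a few lines. The sketch below is a toy Python model with an explicit clock passed in as `now`; the `ExpiringStore` class and its method names are invented for illustration and are not part of any Redis client:

```python
class ExpiringStore:
    """Toy model of SET / EXPIRE / TTL; time is passed in explicitly, in seconds."""

    def __init__(self):
        self.values = {}
        self.deadlines = {}

    def set(self, key, value):
        self.values[key] = value
        self.deadlines.pop(key, None)      # a plain SET clears any previous TTL

    def expire(self, key, seconds, now):
        if key not in self.values:
            return 0                       # nothing to expire
        self.deadlines[key] = now + seconds
        return 1

    def ttl(self, key, now):
        deadline = self.deadlines.get(key)
        if deadline is not None and now >= deadline:
            # The key has expired: drop it before answering.
            self.values.pop(key, None)
            self.deadlines.pop(key, None)
        if key not in self.values:
            return -2                      # key does not exist
        if key not in self.deadlines:
            return -1                      # key exists but has no expiration
        return self.deadlines[key] - now


store = ExpiringStore()
store.set("day", "Monday")
no_expiry = store.ttl("day", now=0)
store.expire("day", 10, now=0)
remaining = store.ttl("day", now=1)
gone = store.ttl("day", now=10)
```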

Pub / sub messaging

When you maintain one monolithic application, you can expire the cache either by setting a TTL for a given key or by deleting it directly. When it comes to more complex systems, you may need to notify other systems when a given item should be expired.

In such a situation, the pub / sub pattern comes in handy. You publish information to a channel, and one or more systems can subscribe to it and then handle the message in their own way.
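To make the idea concrete, here is a small in-process sketch in Python. The `Broker` class is a stand-in for Redis channels, and the channel name `cache:invalidate` and both caches are hypothetical; with a real deployment, the subscribers would be separate systems connected to Redis:

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for Redis channels (SUBSCRIBE / PUBLISH)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in self.subscribers[channel]:
            callback(message)
        # Like PUBLISH, report how many subscribers received the message.
        return len(self.subscribers[channel])


broker = Broker()

# Two systems, each with its own local cache.
cache_a = {"user:1": "John"}
cache_b = {"user:1": "John", "user:2": "Tim"}

broker.subscribe("cache:invalidate", lambda key: cache_a.pop(key, None))
broker.subscribe("cache:invalidate", lambda key: cache_b.pop(key, None))

# One publisher announces that user:1 is stale; every subscriber reacts on its own.
receivers = broker.publish("cache:invalidate", "user:1")
```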

Memory management

The principle that determines how the items are removed from the memory when you hit the memory limit is called the eviction policy in Redis. You have a few options to choose from, depending on your current use case.

The following policies are the most popular ones:

  • allkeys-lru - the abbreviation LRU stands for least recently used. According to this policy, Redis will remove items that were the least recently used to make space for new items.
  • allkeys-lfu - the abbreviation LFU stands for least frequently used. When you use this policy, Redis maintains a counter for each key, and when the memory is full, it removes the keys where the value for the counter is the lowest, as it means that they were the least frequently used.
  • volatile-lru - similar to allkeys-lru but applies only to keys where the time to live (TTL) value is set.
  • volatile-lfu - similar to allkeys-lfu but applies only to keys where the time to live (TTL) value is set.
  • volatile-ttl - this policy applies only to the keys where TTL is set and removes the items where the expiration time is the nearest.

If you don't want to use any of the policies mentioned above, you can use the noeviction policy, and then Redis will reject write requests that need more memory once the memory is full.
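The least-recently-used idea can be sketched in Python as follows. This is a simplified model under two stated assumptions: it limits the number of keys rather than bytes of memory, and it tracks recency exactly, whereas Redis approximates LRU by sampling keys. The `LRUCache` class is illustrative only:

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU with a key-count limit (a simplification of allkeys-lru)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()          # oldest access first, newest last

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # reading a key makes it "recently used"
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        while len(self.data) > self.max_keys:
            self.data.popitem(last=False)  # evict the least recently used key


cache = LRUCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now more recently used than "b"
cache.set("c", 3)    # over the limit: "b" is evicted
```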

Leaderboards

In every application with a competitive element, meaning that scores are stored, you might need to build a leaderboard so users can track their achievements and compare them with other people.

Redis is great for building a leaderboard for the following reasons:

  • Sorted sets - as you may remember, a sorted set is a basic data type where each member has a score assigned, and the collection is automatically kept sorted by Redis. Sounds like the core of a leaderboard, doesn't it?
  • Score manipulation - you need to display the score for each user, but you also have to increment or decrement the score when needed.
  • Score retrieval - besides being useful structures, sorted sets provide an easy way to pull the score for a given member, list the top scorers, or find the current ranking of a given member.
  • Automatic expiration - as with cache, you can set a TTL for a given key if your leaderboard presents statistics for a fixed period of time like a day, week, or month.
  • Publish / subscribe messaging - if you need to update the leaderboard in real time, use the pub / sub pattern with Redis.

Again, it's worth mentioning that with atomic operations, you can avoid data inconsistency.

Building a simple leaderboard

This will be a simple yet meaningful tutorial on building a leaderboard with the building blocks provided by Redis. We should start by creating a new sorted set and adding members with some scores:

> ZADD leaderboard 25 john
(integer) 1
> ZADD leaderboard 30 tim
(integer) 1
> ZADD leaderboard 27 rick
(integer) 1

We have a leaderboard with three members. Let's get the list sorted by the score to see who is the top performer:

> ZRANGE leaderboard 0 2 REV
1) "tim"
2) "rick"
3) "john"

Just like array indexes in most programming languages, 0 is the first position in the list, so if we want to get three items, we use 2 as the end of the range. The REV part is responsible for listing the users from the top score; without it, we would get the list starting from the smallest score.

Alternatively, you can add WITHSCORES to get the list together with the scores:

> ZRANGE leaderboard 0 2 REV WITHSCORES
1) "tim"
2) "30"
3) "rick"
4) "27"
5) "john"
6) "25"

It's also possible to quickly check which position a given member occupies in the general ranking:

> ZREVRANK leaderboard tim
(integer) 0
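The same leaderboard logic, including a score update, can be sketched in Python. The dictionary below only models the sorted set; the helper names `top`, `revrank`, and `incrby` are made up for this illustration and loosely mirror ZRANGE with REV, ZREVRANK, and ZINCRBY. One simplification to keep in mind: the model does not reproduce how Redis orders members with equal scores.

```python
# member -> score, standing in for the "leaderboard" sorted set
scores = {"john": 25.0, "tim": 30.0, "rick": 27.0}


def top(n):
    """Like ZRANGE leaderboard 0 n-1 REV WITHSCORES: best scores first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def revrank(member):
    """Like ZREVRANK: 0-based position counted from the highest score."""
    return [name for name, _ in top(len(scores))].index(member)


def incrby(member, amount):
    """Like ZINCRBY: add to the score, creating the member if needed."""
    scores[member] = scores.get(member, 0.0) + amount
    return scores[member]


before = revrank("john")   # john starts in last place
incrby("john", 10)         # john earns 10 points
after = revrank("john")    # and moves to the top
```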

Rate limiting

If you have ever worked with an external API, you are probably familiar with the rate-limiting mechanism. It's a technique used to control the number of requests allowed within a certain time period. If you need to implement a similar mechanism inside your API, you can consider using Redis.

For the sake of this article, I will assume that our API's limit is 100 calls per minute:

> SET rate_limit:u123 0 EX 60 NX
OK
> GET rate_limit:u123
"0"
> INCR rate_limit:u123
(integer) 1
> GET rate_limit:u123
"1"

In the first line, I set the key rate_limit:u123 to the value 0. I also tell Redis to expire the value after one minute (the EX option takes the number of seconds), and with the NX option, I ensure that the command will only be executed if the key does not already exist. Each API call then increments the counter with INCR, and the application rejects the call when the counter exceeds the limit.

The code is very simple, but it's fast and does its job. When you provide an API, you need to ensure that the response is as fast as possible and that the rate-limiting mechanism won't slow down the request in a way the user will notice.
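The decision logic around that counter can be sketched in Python. This is a fixed-window limiter under stated assumptions: a plain dictionary stands in for Redis keys with a TTL, time is passed in explicitly as `now` (in seconds), and the names `allow_request`, `LIMIT`, and `WINDOW_SECONDS` are invented for the example:

```python
LIMIT = 100            # calls allowed per window
WINDOW_SECONDS = 60    # window length, matching a one-minute expiration

# key -> (window_start, count); stands in for Redis keys that expire on their own
counters = {}


def allow_request(user_id, now):
    """Return True if this call still fits into the current window."""
    key = f"rate_limit:{user_id}"
    window_start, count = counters.get(key, (now, 0))
    if now - window_start >= WINDOW_SECONDS:
        # The key would have expired in Redis: start a fresh window.
        window_start, count = now, 0
    count += 1                               # the INCR step
    counters[key] = (window_start, count)
    return count <= LIMIT


# 101 calls within the same window: the last one is over the limit.
results = [allow_request("u123", now=0) for _ in range(101)]
```

A fixed window is the simplest variant; it allows short bursts around the window boundary, which may or may not matter for your API.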

Job queues

Background jobs are a must-have for every application that processes and transforms a lot of data, but also for smaller ones, as there are plenty of tasks you can perform in the background so that the user does not have to wait for the page to load.

Redis is a perfect backend for a background jobs engine. One of the best examples is Sidekiq, a background job framework for Ruby. Here are the features that are the most useful when it comes to job queues:

  • Lists - in a job queue, you add new items at one end of the list and process the oldest ones first. With the list structure, it's easy, as you can use LPUSH to push an item to the head of the list and RPOP to remove the last item from the list.
  • Sorted sets - if you need to prioritize jobs, you can use sorted sets, as each item has a score assigned. I have already shown what you can do with scores in the section about leaderboards.
  • Hashes - a background job usually contains more than just a name. To store attributes, you can use a hash structure or just serialized JSON.

Building a simple job background engine

First of all, a background job engine usually has a few queues defined:

> ZADD queues 0 default
(integer) 1
> ZADD queues 0 mails
(integer) 1
> ZADD queues 1 exports
(integer) 1

The exports queue has the highest priority, so when processing jobs, we look into it first:

> ZRANGE queues 0 0 REV
1) "exports"

Of course, in a script, you would have a loop and go through all queues. Once you have a queue, you can push items to it and take them off:

> LPUSH queues:exports "{ \"id\": \"1\", \"class\": \"BackgroundJob\" }"
(integer) 1
> LPUSH queues:exports "{ \"id\": \"2\", \"class\": \"BackgroundJob\" }"
(integer) 2
> RPOP queues:exports
"{ \"id\": \"1\", \"class\": \"BackgroundJob\" }"

A good idea is to have a sorted set for jobs where the score is the timestamp when the job should be executed. This way, you can easily pull all jobs that should be executed at the current moment.
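Here is a rough Python model of that idea. A sorted list of (timestamp, job id) pairs stands in for the sorted set, and the helper names `schedule` and `pop_due` as well as the job ids are hypothetical:

```python
import bisect

# Sorted list of (run_at_timestamp, job_id); stands in for a sorted set
# where the score is the time at which the job should run.
scheduled = []


def schedule(job_id, run_at):
    """Add a job, keeping the list ordered by timestamp (like ZADD)."""
    bisect.insort(scheduled, (run_at, job_id))


def pop_due(now):
    """Remove and return the ids of all jobs whose timestamp is <= now."""
    due = [job_id for run_at, job_id in scheduled if run_at <= now]
    scheduled[:] = [(run_at, job_id) for run_at, job_id in scheduled if run_at > now]
    return due


schedule("export-2", run_at=200)
schedule("export-1", run_at=100)
schedule("mail-7", run_at=300)

due_now = pop_due(now=250)
```

In a real worker, reading and removing the due jobs would have to happen atomically so that two workers do not pick up the same job.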

More advanced Redis

Once you know the basic data structures and features of Redis, it's worth discovering some more advanced concepts, such as transactions, the publish / subscribe mechanism, persistence mechanism details, and more advanced data types. In this part of the article, you will discover all of that in a simple and easy-to-understand form (hopefully!).

Transactions

When it comes to transactions, you intuitively think about database transactions. Redis transactions serve a similar purpose in some respects but are quite different in others. When you use a transaction, all wrapped commands are executed sequentially, and you have a guarantee that no other client's request will be served in the middle of the transaction.

The syntax for defining the transaction is simple. You start the transaction with the MULTI command, and every subsequent command is queued until you run the EXEC command:

> MULTI
OK
(TX)> SET counter 0
QUEUED
(TX)> INCR counter
QUEUED
(TX)> EXEC
1) OK
2) (integer) 1

If you use the DISCARD command instead of EXEC, all queued commands are dropped:

> MULTI
OK
(TX)> SET counter 0
QUEUED
(TX)> INCR counter
QUEUED
(TX)> DISCARD
OK

Rollbacks

While in database transactions, we have rollbacks that undo the changes we made inside the transaction, in Redis, there is no such feature. As it's stated in the official documentation, the maintainers didn't add support for rollbacks as it would have a significant impact on the simplicity and performance.

If you send an invalid command and it gets queued, the error is reported after EXEC, but the other commands are executed anyway:

> MULTI
OK
(TX)> SET counter 0 invalid
QUEUED
(TX)> SET name john
QUEUED
(TX)> EXEC
1) (error) ERR syntax error
2) OK
> GET name
"john"

The transaction is aborted only when Redis detects the error up front and refuses to queue the command.
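The lack of rollback can be illustrated with a small Python model. It is only a sketch of the behavior described above: queued commands are plain functions, the store is a dictionary, and the names `run_transaction`, `set_name`, and `broken_set` are invented for the example:

```python
def run_transaction(store, queued):
    """Run queued commands in order; report errors in place, never roll back."""
    results = []
    for command in queued:
        try:
            results.append(command(store))
        except Exception as error:
            results.append(error)   # the failure is reported in its own slot
    return results                  # the other commands were still applied


def broken_set(store):
    """Stands in for a queued command that fails when it is executed."""
    raise ValueError("ERR syntax error")


def set_name(store):
    """Stands in for SET name john."""
    store["name"] = "john"
    return "OK"


store = {}
results = run_transaction(store, [broken_set, set_name])
```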

Publish / subscribe mechanism

I already mentioned this mechanism at least twice, when discussing caching and leaderboards. The truth is that you will find this pattern useful each time you need real-time updates of the data. The pattern is also popular in other technologies besides Redis, although there the updates are not necessarily delivered in real time.

We can separate the following building blocks of the pub/sub pattern:

  • Channel - a place where messages are published
  • Publisher - an entity that is publishing messages to some channel
  • Subscriber - an entity that is receiving messages from some channel

The idea is that the publisher is not aware of who will receive the messages it pushes. Multiple subscribers can subscribe to a given channel without knowing who is pushing the messages. Thanks to this structure, you can implement a flexible architecture where the elements of a system can respond differently to the same message.

Building publisher and subscriber

You can observe how this pattern is working in practice by using two sessions. In one session, subscribe to the messages channel:

redis-cli SUBSCRIBE messages
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "messages"
3) (integer) 1

In the second terminal window, publish a message to the messages channel:

redis-cli PUBLISH messages "Good morning"
(integer) 1

At the same time, in the first terminal window, you should see an additional message:

1) "message"
2) "messages"
3) "Good morning"

The structure of a published message

As you probably noticed, once we subscribe to a channel, each published message arrives as a reply that contains three elements instead of just the message itself. Here is the explanation:

  • The first line - the kind of reply; for a published message, it's always message, which tells you that the reply carries a channel message.
  • The second line - it's the name of the channel
  • The third line - it's the message published to the channel

Thanks to this structure of the reply, we can write code that is very flexible and easily detects when the reply is a message from a channel.
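A handler that relies on this three-element structure might look like the Python sketch below. The replies are written out as plain lists copied from the terminal output above; how a real client library exposes them is an assumption left open here, and `handle_reply` is a hypothetical helper:

```python
def handle_reply(reply, on_message):
    """Dispatch a three-element pub/sub reply; only 'message' replies carry a payload."""
    kind, channel, payload = reply
    if kind == "message":
        on_message(channel, payload)
        return True
    # For example the 'subscribe' confirmation, whose third element is a count.
    return False


received = []


def collect(channel, payload):
    received.append((channel, payload))


handled_confirmation = handle_reply(["subscribe", "messages", 1], collect)
handled_message = handle_reply(["message", "messages", "Good morning"], collect)
```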

Persistence mechanism

Redis provides very fast write and read operations because it keeps the data in RAM. It does not mean you will lose your data in case of server failure or restart. If you have ever run redis-server locally, you may have noticed the dump.rdb file on your disk after stopping it. It's a snapshot of the data that you have in Redis.

You can disable the persistence mechanism, and Redis won't save the data in any way. It's sometimes useful in scenarios where Redis is only a caching engine. However, if you want to keep your data, you have a few options to consider.

Redis database

I already mentioned the dump.rdb file. It's a single-file, point-in-time representation of your data. You can treat this file as a backup, store it, for example, on AWS S3, and restore it when needed. You need to be aware that creating a snapshot of a large dataset can hurt performance, and in the worst case, Redis may stop serving clients for up to a second. In case of a power outage, you need to be prepared to lose the data written since the last snapshot.

Append-only file

With this persistence mechanism, Redis logs every write operation in the same format as the Redis protocol itself. How much you can lose in a power outage depends on how often the log is flushed to disk; with the default policy of flushing once per second, that is about one second of writes. If the last command was logged only in half, Redis is able to detect it and recover the rest of the file.

If you select this option, you need to be aware that the file will usually be bigger than the one created with the Redis database option.
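To get a feeling for what "the same format as the Redis protocol" means, the Python sketch below encodes a single command as a RESP array of bulk strings. It only illustrates the encoding of one command; the exact layout of a real append-only file (headers, rewrites, and so on) is not modeled, and `encode_command` is a name chosen for this example:

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]                        # array header: number of parts
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))   # length-prefixed bulk string
    return b"".join(out)


entry = encode_command("SET", "name", "John")
```

Because every part is prefixed with its length, a reader can tell when the last entry of a log was cut off in the middle.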

Redis database + append-only file

You can also combine both options. The Redis creators suggest using this option if you want to get a safety level comparable to the PostgreSQL database. However, if you can deal with losing a few minutes of the data, it's better to go for the Redis database option.

Custom data types

There are more data types than strings, lists, sets, sorted sets, and hashes. The ones you read about in this article are the most common ones. In this chapter, I will describe more advanced data types that are often used for more specialized use cases.

Geospatial

Just like in PostgreSQL, in Redis, you can also store coordinates and easily search for them (I think even easier than in the mentioned database engine). In order to do this, you have to first use GEOADD command, and provide the index name, coordinates, and location name:

> GEOADD capitals 13.381777 52.531677 Berlin
(integer) 1
> GEOADD capitals 21.017532 52.237049 Warsaw
(integer) 1
> GEOADD capitals 2.349014 48.864716 Paris
(integer) 1

The first of the coordinates is longitude, and the second is latitude. Having some information in the index, we can now check how far it is to each capital from a given location (I will select Rome), assuming that we want to search within a radius of 10,000 kilometers:

> GEOSEARCH capitals FROMLONLAT 12.496366 41.902782 BYRADIUS 10000 km WITHDIST
1) 1) "Paris"
   2) "1106.3562"
2) 1) "Berlin"
   2) "1184.0790"
3) 1) "Warsaw"
   2) "1316.2319"

We've got a list of pairs where the first item is the city name and the second is the distance from Rome in kilometers. You can modify the GEOSEARCH command further with optional flags; this demonstration was the simplest case to illustrate how easy it is to perform a geo search with Redis.
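The distances above can be cross-checked with the haversine (great-circle) formula. The Python sketch below is an independent calculation, not Redis code; the Earth radius of 6372.8 km is an assumption of this sketch, so the results agree with the output above only approximately:

```python
import math

EARTH_RADIUS_KM = 6372.8   # assumed mean Earth radius for this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


rome = (12.496366, 41.902782)
capitals = {
    "Berlin": (13.381777, 52.531677),
    "Warsaw": (21.017532, 52.237049),
    "Paris": (2.349014, 48.864716),
}

# Distance from Rome to each capital, nearest first.
distances = {name: haversine_km(*rome, *coords) for name, coords in capitals.items()}
nearest_first = sorted(distances, key=distances.get)
```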

Streams

Stream is the data structure usually used to record events in real time. It works like an append-only log, so you cannot modify the element that you have already added. Imagine reading data from the sensor; you collect the information to receive statistics data later.

Each entry in a stream consists of field-value pairs, just like a hash. It also has a unique ID, and it belongs to a named stream. Here is the basic command for adding a new entry to the stream named user:events:

> XADD user:events * action click position 11.53
1685464740235-0
> XADD user:events * action click position 20.33
1685464871221-0

I used * as the second argument for XADD command because I wanted Redis to automatically generate an ID for me. In most cases, you will go for this option. The result of this command is the ID. The identification consists of two parts:

  • milliseconds time - in this case, it's 1685464740235
  • sequence number - in this case, it's 0
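Splitting an ID into these two parts is straightforward. The Python sketch below parses the first ID from the example and converts the millisecond part into a UTC timestamp; `parse_stream_id` is a helper written for this illustration:

```python
from datetime import datetime, timedelta, timezone


def parse_stream_id(entry_id):
    """Split '<milliseconds>-<sequence>' and convert the time part to UTC."""
    ms_text, seq_text = entry_id.split("-")
    ms, seq = int(ms_text), int(seq_text)
    # Add the milliseconds to the Unix epoch; integer math avoids float rounding.
    recorded_at = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=ms)
    return ms, seq, recorded_at


ms, seq, recorded_at = parse_stream_id("1685464740235-0")
```

The sequence number distinguishes entries that were added within the same millisecond.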

Because time is part of the ID, we can easily get the entries that were recorded in a given time range with millisecond precision:

> XRANGE user:events 1685464740235 1685464871221
1) 1) "1685464740235-0"
   2) 1) "action"
      2) "click"
      3) "position"
      4) "11.53"
2) 1) "1685464871221-0"
   2) 1) "action"
      2) "click"
      3) "position"
      4) "20.33"

You can also listen for new entries in a given stream. In some respects, it is similar to the publish / subscribe pattern, but in others, it works quite differently.

Bitmaps

Bitmaps are another less popular data type in Redis. They are an extension of the string type. One of the most popular use cases for bitmaps is object permissions, where each bit represents a particular permission:

> SETBIT users:11 23 1
(integer) 0

In the above command, I specified that the user with id 11 has the permission represented by bit 23; if the user did not have this permission, I would use 0 as the last argument. Since 23 is not meaningful on its own, you need an additional mapping from permission names to integers. With the above setup, you can check the permission in the following way:

> GETBIT users:11 23
(integer) 1

Storing the permission name as part of the bitmap key limits the flexibility and extensibility of the system.
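The mapping from permission names to bit positions can live in the application. The Python sketch below models it with a `bytearray` standing in for the Redis string; the permission names in `PERMISSION_BITS` are hypothetical, and the helpers follow the bit numbering in which offset 0 is the most significant bit of the first byte:

```python
# Hypothetical mapping kept in the application, not in Redis.
PERMISSION_BITS = {"read": 0, "write": 1, "export_reports": 23}


def setbit(bitmap, offset, value):
    """Like SETBIT: set one bit, growing the bitmap if needed; return the old bit."""
    byte_index, bit_index = divmod(offset, 8)
    if len(bitmap) <= byte_index:
        bitmap.extend(b"\x00" * (byte_index + 1 - len(bitmap)))
    mask = 1 << (7 - bit_index)           # offset 0 is the most significant bit
    previous = 1 if bitmap[byte_index] & mask else 0
    if value:
        bitmap[byte_index] |= mask
    else:
        bitmap[byte_index] &= ~mask & 0xFF
    return previous


def getbit(bitmap, offset):
    """Like GETBIT: bits beyond the end of the bitmap read as 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(bitmap):
        return 0
    return (bitmap[byte_index] >> (7 - bit_index)) & 1


user_11 = bytearray()                     # stands in for the key users:11
previous = setbit(user_11, PERMISSION_BITS["export_reports"], 1)
can_export = getbit(user_11, PERMISSION_BITS["export_reports"])
can_write = getbit(user_11, PERMISSION_BITS["write"])
```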

Bitfields

According to the documentation, the bitfields allow you to set, increment, and get integer values of arbitrary bit length. This data type is good for managing counters, for example, in an online game:

> BITFIELD gamer:35:stats SET u32 #0 0
1) (integer) 0

In the above command, we used the following elements:

  • BITFIELD - this is the command we use to set bitfields
  • gamer:35:stats - it is just a key name
  • SET - this is the subcommand name to set the value
  • u32 - a type of the bitfield, in this example, it's an unsigned 32-bit integer
  • #0 - offset position, #0 represents the first 32-bit block
  • 0 - the value set for the bitfield

If you would like to increment the values of points (block #0) and lives (block #1), you have to use the following command:

> BITFIELD gamer:35:stats INCRBY u32 #0 50 INCRBY u32 #1 5
1) (integer) 50
2) (integer) 5

With one command, we updated the points and lives counters for the gamer with id 35. You can also get the stats for the given gamer:

> BITFIELD gamer:35:stats GET u32 #0 GET u32 #1
1) (integer) 50
2) (integer) 5
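Under the hood, the two counters are just consecutive 32-bit blocks in one string. The Python sketch below models that layout with a `bytearray`, assuming big-endian blocks and wrap-around on overflow; `get_u32` and `incrby_u32` are illustrative helpers for the u32 case only:

```python
def get_u32(buffer, index):
    """Like BITFIELD GET u32 #index: read one big-endian unsigned 32-bit block."""
    start = index * 4
    # Blocks beyond the end of the buffer read as zero.
    return int.from_bytes(buffer[start:start + 4].ljust(4, b"\x00"), "big")


def incrby_u32(buffer, index, amount):
    """Like BITFIELD INCRBY u32 #index amount, wrapping around on overflow."""
    start = index * 4
    if len(buffer) < start + 4:
        buffer.extend(b"\x00" * (start + 4 - len(buffer)))
    value = (get_u32(buffer, index) + amount) % (1 << 32)
    buffer[start:start + 4] = value.to_bytes(4, "big")
    return value


stats = bytearray()                  # stands in for the key gamer:35:stats
points = incrby_u32(stats, 0, 50)    # block #0: points
lives = incrby_u32(stats, 1, 5)      # block #1: lives
```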

HyperLogLog

This data type is like a special magic box. It's a good solution when you need to process many items, and all you want to get is the number of unique items in the given set. HyperLogLog does not store the items themselves; it uses a probabilistic algorithm to estimate the count quickly in a small, fixed amount of memory, at the cost of a small error (the Redis documentation quotes a standard error of 0.81%):

> PFADD colors yellow red green yellow pink
(integer) 1
> PFCOUNT colors
(integer) 4

Of course, HyperLogLog only pays off with a large number of items, but I hope this demonstration clarified the idea behind this data type.
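To show what kind of "special math" is involved, here is a deliberately simplified HyperLogLog in Python. It is a teaching sketch, not the Redis implementation: the `TinyHLL` class, the use of SHA-1 as the hash function, and the choice of 4096 registers are all assumptions made for this example.

```python
import hashlib
import math


class TinyHLL:
    """Simplified HyperLogLog: 2**p registers, each keeping one small number."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p                    # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a hash: the first p bits pick a register,
        # the remaining bits are scanned for the position of the first 1 bit.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        width = 64 - self.p
        index = h >> width
        rest = h & ((1 << width) - 1)
        rank = width - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small cardinalities: fall back to counting empty registers.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


colors = TinyHLL()
for color in ["yellow", "red", "green", "yellow", "pink"]:
    colors.add(color)               # adding "yellow" twice changes nothing
small_estimate = colors.count()

big = TinyHLL()
for i in range(10000):
    big.add(f"item-{i}")
big_estimate = big.count()
```

The memory used depends only on the number of registers, not on how many items were added, which is the whole point of the structure.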

Running Redis in production

For every developer, the moment comes to stop playing with Redis locally and go to production. There are a lot of factors you have to take into account to create a stable and efficient architecture for your production environment. I will try to cover the most important ones in this section.

Important parameters of the server

There are a few important parameters you have to take into account when selecting the server for your production setup. Of course, it depends on your use case, but the following ones will be more or less important in every case:

  • RAM - since Redis primarily stores the data in RAM, the more memory you have, the more data you can hold and the better performance you will get. More RAM will reduce the need for disk I/O operations. Also, you will be able to use more advanced data types like Redis Streams or HyperLogLog on larger datasets.
  • CPU - this parameter plays a role when it comes to executing commands, handling client requests, and processing the data. With a better CPU, you reduce latency, have better concurrency, and have faster command execution.
  • Persistence options - if you want to enable the persistence options, ensure that your server supports them; otherwise, you might get an unpleasant surprise after a restart or server failure. I will also discuss how to avoid such surprises later in the cloud section.
  • Network and I/O - fast and reliable network connectivity is important when it comes to high client concurrency and data replication.
  • Cluster and replication - you should review your hardware and network choices if your application requires high availability and scalability in terms of Redis. In such cases, go for a setup where there is support for Redis in a clustered or replicated configuration.

Before selecting the server, always start by analyzing the requirements for your application. Try to estimate how heavy the workload can be, and specify the nature, format, and size of the data you want to process. After making these assumptions, you can start looking for an appropriate setup for your Redis server(s).

Redis and cloud

There are plenty of solutions in the cloud in terms of Redis, but for the sake of this article, I will focus on the biggest players on the market as of now: Amazon Web Services, Microsoft Azure, and Google Cloud Platform.

Amazon Web Services (AWS)

Currently, AWS offers two fully managed services that support Redis:

Amazon ElastiCache

It's a caching service that comes in two flavors: Redis and Memcached. It provides high availability, cost optimization, and microsecond speed. As with other AWS services, you can control the access with Identity and Access Management and easily integrate it with other services.

If you want to achieve high availability, you can go for the automated backups or the version with Multi-AZ, as with the second option, when there is a problem with the primary instance, there is an automatic failover to the secondary instance.

You can improve the performance by using read replicas, just like in the case of a relational database. As always, you have to choose among many instance types, where you can select the amount of RAM and CPU you need.

Amazon MemoryDB

This service is Redis-compatible. Amazon MemoryDB is an in-memory database that acts like a cache service but also as a primary database. It's compatible with Redis because to manage it, you can use the same Redis data structures and APIs.

It provides fast failover, database recovery, and node restarts. In terms of other features, it's similar to ElastiCache as you can control permissions with IAM, configure Multi-AZ availability, and perform automatic snapshots.

Microsoft Azure

Currently, Azure offers one fully managed service that supports Redis:

Azure Cache for Redis

As usual with a fully managed service, you don't have to worry about patching, updates, scaling, and provisioning. Azure provides high throughput and performance to handle a large number of requests with very small latency.

Azure Cache service also supports module integration for Redis, including the RedisBloom, RediSearch, RedisJSON, and RedisTimeSeries modules. When it comes to pricing, it is as flexible as in the case of AWS. You can select different cache sizes, network performance, and the maximum number of client connections. You are also billed for hourly usage.

Google Cloud Platform

Currently, GCP offers one fully managed service that supports Redis:

Memorystore

Similar to Amazon ElastiCache, this service comes in two flavors: Redis and Memcached. You don't have to worry about provisioning, replication, failover, and patching. You can control permissions with IAM as well.

The pricing options are also flexible and structured similarly to Azure and AWS. You can have up to 5 read replicas, and Google states that even when scaling a standard tier instance, your applications will experience a downtime of less than a minute.

Deployments

Depending on the nature of your business and the use cases of your application, you can select among multiple types of deployments of your Redis infrastructure.

Standalone

With the standalone deployment type, you have only one server and one Redis instance. It's a good option if you don't have important information stored in Redis, or you run a small application.

Such a deployment does not provide any built-in scalability or high-availability mechanism. On the other hand, a single installation is very simple to set up and maintain. You can expect downtime caused by updates or maintenance rather than by Redis itself failing, as it's a very stable technology.

Replication

In the replication deployment type, you have multiple Redis instances where one instance is the master, and the other instances are called slaves (newer Redis versions call them replicas). The master handles both write and read requests. Slave instances hold a copy of the master's data and can handle read requests, which increases read performance and gives you redundant copies of the data.
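
To make the read/write split more concrete, here is a minimal Python sketch. The ReadWriteRouter class and its method names are made up for illustration and are not part of any Redis client; the sketch only assumes that the connection objects you pass in expose set and get methods, as redis-py clients do. Writes always go to the master, and reads rotate between the slaves.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical helper: send writes to the master, spread reads over slaves."""

    def __init__(self, master, replicas):
        self.master = master
        # Rotate over the slaves; fall back to the master when there are none.
        self._readers = itertools.cycle(replicas or [master])

    def set(self, key, value):
        # Only the master accepts writes.
        return self.master.set(key, value)

    def get(self, key):
        # Each read goes to the next instance in the rotation.
        return next(self._readers).get(key)


# Usage with redis-py (assumed to be installed, hosts are placeholders):
#   import redis
#   router = ReadWriteRouter(
#       redis.Redis(host="master.internal"),
#       [redis.Redis(host="replica-1.internal"), redis.Redis(host="replica-2.internal")],
#   )
#   router.set("greeting", "hello")
#   router.get("greeting")
```

Keep in mind that Redis replication is asynchronous, so a read from a slave can briefly return stale data right after a write.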

Replication with Sentinel

Sentinel is a Redis component that, in combination with replication, provides automatic failover from the master instance to the slave instance in case of failure.

Replication without Sentinel

Without the Sentinel component, you have to manage the failover manually, which can cause downtime.

Sentinel

As I mentioned in the previous section, Sentinel is a Redis component for providing automatic failover. The failover is not the only feature of this component. It also provides monitoring of the Redis instances and notifications to let the administrator know when something bad is happening.

Sentinel without replication

Although the Sentinel component is usually used with replication, you can also use it in a standalone mode. In such a situation, you can benefit from monitoring and notifications, but you have to perform the failover process manually.

Cluster

While in the replication deployment type you host multiple instances with the same set of data, in cluster mode you spread the data across multiple instances. This is called sharding. Each server holds a different part of the complete data set.

Thanks to clustering, you can reduce the load on a single server by scaling Redis horizontally (adding new instances). You can also combine clustering with replication to improve data availability (and performance with read replicas).
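
If you wonder how a key ends up on a particular server: Redis Cluster divides the key space into 16384 hash slots and assigns each key to the slot CRC16(key) mod 16384. When the key contains a non-empty {tag}, only the tag is hashed, so related keys can be kept together. The Python sketch below reproduces that calculation with the standard library. The function names and the three-node slot layout are assumptions made for illustration, since a real cluster decides its own slot ranges.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_hash_slot(key):
    """Return the hash slot for a key (CRC16/XMODEM modulo 16384)."""
    data = key.encode("utf-8") if isinstance(key, str) else key
    # Hash tag: hash only the text between the first "{" and the next "}",
    # provided there is at least one character between them.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 variant Redis uses.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


def node_for_slot(slot, slot_ranges):
    """Find the node that owns a slot in a list of (first, last, node) ranges."""
    for first, last, node in slot_ranges:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is not covered by any node")


# Hypothetical layout: three nodes with the slots split evenly.
EXAMPLE_RANGES = [
    (0, 5460, "node-a"),
    (5461, 10922, "node-b"),
    (10923, 16383, "node-c"),
]

if __name__ == "__main__":
    for key in ("user:1", "{user1000}.following", "{user1000}.followers"):
        slot = key_hash_slot(key)
        print(key, slot, node_for_slot(slot, EXAMPLE_RANGES))
```

The two {user1000} keys always land in the same slot, which is what lets you run multi-key operations on them in cluster mode.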

Managed services

I already explained this option in the section about cloud providers. With managed services, you delegate most of the responsibilities to the cloud provider, and you are responsible mostly for selecting the right architecture and server specification.

With managed services, you can also go for automatic backups, custom monitoring, and an easy deployment process that requires little technical expertise.

Backups and restores

If you plan to use Redis only as a caching engine and you can deal with losing all of the cached data, then you don't need to think about performing backups. However, in most cases, you can't afford to lose the data, so I consider this section an important part of the whole article.

Performing backups

There are a few ways you can make a copy of your data inside the Redis instance. Which method will be the most efficient and effective highly depends on the nature of the data and your application:

  • Snapshot - you can enable automatic periodic snapshots or manually trigger snapshots using the SAVE or BGSAVE commands. The SAVE command blocks all other Redis commands until the dump is completed, while the BGSAVE command is asynchronous and won't block Redis.
  • Append-only file - you can simply copy the file produced by this persistence mechanism. The copy will contain all of the data that was in Redis at the moment of copying.
  • Replication - you can perform the snapshot on a slave instance with either the SAVE or the BGSAVE command. That way, you get the latest dump without interrupting the master node, even when calling SAVE.
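
Because BGSAVE returns immediately, a backup script needs some way to know when the dump is actually finished. One simple approach is to compare the value of LASTSAVE before and after triggering the snapshot. Below is a minimal Python sketch of that idea. The background_snapshot function is a hypothetical helper, and it assumes a client object with bgsave() and lastsave() methods, which redis-py clients provide.

```python
import time


def background_snapshot(client, timeout=60.0, poll_interval=0.5):
    """Trigger BGSAVE and wait until LASTSAVE reports a newer snapshot.

    Returns True when a new snapshot was observed, False on timeout.
    """
    before = client.lastsave()  # time of the last successful save
    client.bgsave()             # starts the dump in the background
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if client.lastsave() != before:
            return True
        time.sleep(poll_interval)
    # Note: LASTSAVE has one-second resolution, so this sketch can miss a
    # snapshot that finishes within the same second as the previous one.
    return False


# Usage with redis-py (assumed to be installed, host is a placeholder):
#   import redis
#   ok = background_snapshot(redis.Redis(host="replica-1.internal"))
```

Pointing the helper at a slave instance, as described in the last bullet, keeps the extra work away from the master.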

If you use fully managed services in the cloud, the backup process can be fully automated and performed on a scheduled basis without any interruptions or extra actions from your side.

Restoring backups

There is a saying that we can divide people into two groups: those who do backups and those who will do backups. Now that you know how to perform Redis backups using different methods, it is good to know how to restore the data in case of emergency or migration:

  • Restoring an RDB Backup - if you have a dump.rdb file and a single instance, you have to stop the server, copy the file to the Redis data directory (you can check the path in redis.conf), and then start the server. Redis will automatically load the RDB file.
  • Restoring an AOF Backup - stop the server, copy the backed-up AOF file (in Redis 7 and later, the whole append-only directory) into the Redis data directory, make sure that appendonly is set to yes in redis.conf, and then start the server. Redis will replay the file to rebuild the dataset. If the file is truncated or corrupted, you can try to repair it first with the redis-check-aof tool.
  • Cluster - to avoid downtime, you can add a new node to the cluster with restored data, and the cluster will automatically redistribute the data.
  • Replication - you can restore the data on the slave node and then promote it to the master to avoid downtime.
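
The file-copy part of the RDB restore can be sketched in a few lines of Python. The restore_rdb function is a hypothetical helper, and its arguments are assumptions: data_dir should match the dir setting and dbfilename the dbfilename setting in your redis.conf. Stopping and starting the server is left out, because it depends on how Redis is run on your machine.

```python
import shutil
from pathlib import Path


def restore_rdb(backup_file, data_dir, dbfilename="dump.rdb"):
    """Copy an RDB backup into the Redis data directory.

    Run this only while the Redis server is stopped.
    """
    backup = Path(backup_file)
    target = Path(data_dir) / dbfilename
    if not backup.is_file():
        raise FileNotFoundError(f"backup not found: {backup}")
    # Keep the current dump next to the restored one, just in case.
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(backup, target)
    return target


# Example call (paths are placeholders):
#   restore_rdb("/backups/dump-latest.rdb", "/var/lib/redis")
```

If the append-only file is enabled, Redis loads the AOF instead of the RDB file on startup, so check the appendonly setting before relying on this procedure.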

Remember to store the backup in a safe place. Usually, cloud storage services like AWS S3 are a good solution that ensures high availability at a low cost.

Scaling

When one Redis instance is not enough, you have to think about the process of providing more computing power. There are two ways you can achieve this:

  • Horizontally - you can add more Redis instances. I already mentioned Redis Clustering, which can handle such situations by using sharding - splitting the data between multiple servers to reduce the workload on a single instance.
  • Vertically - you can increase the size of the current Redis instance, whether it is CPU or RAM. Generally, it's better to scale horizontally if the workload is unpredictable. If the workload is constant and you just need more power, consider scaling vertically.

Security

Last but not least, security. I have already said a lot about manipulating data inside Redis and the nature of the data you can store there, but I haven't mentioned anything about security yet. It's a very important aspect that you should always keep in mind.

Authentication

Currently, there are two types of authentication supported by Redis. Both are handled with the AUTH command:

  • Configured password - you can set a password in the Redis configuration and then require it when connecting - AUTH <password>
  • ACL system - in Redis 6 or greater, you can benefit from using an access control list where you can set permissions per user and authenticate with the AUTH <username> <password> command.

You should be aware that due to the high-performance nature of Redis, an attacker can try a lot of passwords in a short time, so ensure that you have a very strong password. You can always generate one by using the ACL GENPASS command.
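
If you prefer to create credentials outside of Redis, for example in a provisioning script, you can get a password of the same shape as the default ACL GENPASS output (256 random bits printed as 64 hexadecimal characters) with Python's secrets module. The sketch below also builds an ACL SETUSER command for a restricted user. Both helper functions, the user name, and the key pattern are made up for illustration.

```python
import secrets


def generate_password(bits=256):
    """Return a random hex password; 256 bits gives 64 hex characters."""
    return secrets.token_hex(bits // 8)


def acl_setuser_command(username, password, key_pattern, commands):
    """Build an ACL SETUSER command string for a restricted user.

    Rules used: "on" enables the user, ">" adds a password,
    "~" limits the key names, "+" allows a single command.
    """
    rules = ["on", f">{password}", f"~{key_pattern}"]
    rules += [f"+{command}" for command in commands]
    return " ".join(["ACL", "SETUSER", username] + rules)


if __name__ == "__main__":
    password = generate_password()
    # Hypothetical user that may only read and write keys starting with "cached:".
    print(acl_setuser_command("app", password, "cached:*", ["get", "set"]))
```

After running the printed command in Redis, the application would authenticate with AUTH app <password> and would be rejected when it tries any other command or key.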

Encryption

If you need encryption at rest, consider using Redis with one of the cloud providers mentioned before. They offer encryption at rest as an additional option that better protects your data when persistence is enabled.

Communication

Your architecture should be secure by design, which means that the Redis instance should not be accessible from the internet. In the perfect scenario, Redis sits in a private subnet, and only authorized components of your infrastructure can access it.

If you can't place Redis in a private subnet and untrusted clients can connect, consider implementing a layer that uses ACLs, validates user input, and decides what operations to perform against the instance.

Redis also supports TLS to establish secure encrypted connections between clients and the Redis server.
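
In practice, your Redis client library handles TLS for you, but it can help to see what happens underneath. The Python sketch below uses only the standard library: it opens a TLS connection, optionally authenticates with AUTH, and sends PING. The function names, host, port, and certificate path are assumptions made for illustration, and the server must have TLS enabled (for example with the tls-port setting in redis.conf). Treat it as a teaching aid rather than a replacement for a real client.

```python
import socket
import ssl


def encode_command(*parts):
    """Encode a command in the Redis protocol: an array of bulk strings."""
    chunks = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part if isinstance(part, bytes) else str(part).encode("utf-8")
        chunks.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(chunks)


def ping_over_tls(host, port, ca_file, username=None, password=None, timeout=5.0):
    """Connect with TLS, optionally authenticate, and return the PING reply."""
    # Verify the server certificate against the given CA file.
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as conn:
            if password is not None:
                auth = ("AUTH", username, password) if username else ("AUTH", password)
                conn.sendall(encode_command(*auth))
                reply = conn.recv(1024)  # short replies fit in a single read
                if not reply.startswith(b"+OK"):
                    raise PermissionError(reply.decode("utf-8", "replace").strip())
            conn.sendall(encode_command("PING"))
            return conn.recv(1024)  # a healthy server answers b"+PONG\r\n"


# Example call (all values are placeholders):
#   ping_over_tls("redis.internal.example", 6380, "/etc/ssl/redis-ca.crt",
#                 username="app", password="...")
```

Without TLS, the same AUTH command, including the password, travels over the network in plain text, which is another reason to keep Redis in a private subnet.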