In our previous post, we welcomed Valkey (an open source fork of Redis) to the Percona family. We also learned how to get Valkey up and running using Docker and looked at two of the basic data types, string and list. In this post, we will continue working with Valkey/Redis and learn about sets and sorted sets.
Sets
“In computer science, a set is an abstract data type that can store unique values, without any particular order.” –Wikipedia entry on Set
In Valkey/Redis, a set is just like what Wikipedia tells us: a data type that can hold unique string values without any order. Let’s create a few sets for our application, one per user, each holding that user’s favorite foods.
We will use SADD to not only create our sets initially but also to add one or more elements to the set.
```
127.0.0.1:6379> SADD user:42:favfoods "Pizza" "Tacos" "Chicken Nuggets"
(integer) 3
127.0.0.1:6379> SADD user:87:favfoods "Tacos" "Hamburgers" "Salads"
(integer) 3
127.0.0.1:6379> SADD user:91:favfoods "Salads" "Pizza" "Tacos" "Steak" "Brisket"
(integer) 5
```
Recall from our first post that Valkey/Redis has no concept of namespaces, or any nested structure. The use of the “:” separator in the key names is purely by choice; we could have used “user 91-favfoods” as the key, and it would have worked just the same. Be consistent in your code on how you want to create keys.
The number returned after each invocation of SADD indicates the number of elements added to the set.
```
127.0.0.1:6379> SADD user:42:favfoods "Tofu" "Pizza"
(integer) 1
```
Why was only one element added? Recall from the definition for set that it can “store unique values”. Since ‘Pizza’ was already a member of this set, it was not added again.
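The return value of SADD can be modeled with a plain Python set. This is only a sketch of the semantics described above, not a Valkey client call; the `sadd` helper and the `store` dictionary are hypothetical names introduced for illustration.

```python
# In-memory model of SADD semantics (hypothetical helper, not a client API).
store = {}

def sadd(key, *members):
    """Add members to the set stored at key; return how many were newly added."""
    target = store.setdefault(key, set())
    before = len(target)
    target.update(members)          # duplicates are silently ignored
    return len(target) - before

first = sadd("user:42:favfoods", "Pizza", "Tacos", "Chicken Nuggets")
second = sadd("user:42:favfoods", "Tofu", "Pizza")
print(first, second)  # 3 1
```

The second call reports 1 because only "Tofu" is new; "Pizza" is already a member.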
We can inspect the contents of each set using SMEMBERS <key> and count its elements with SCARD <key>:
```
# Get all members of the set
127.0.0.1:6379> SMEMBERS user:42:favfoods
1) "Pizza"
2) "Tacos"
3) "Chicken Nuggets"
4) "Tofu"

# Get the cardinality, or number of unique values in the set
127.0.0.1:6379> SCARD user:42:favfoods
(integer) 4
```
Let’s imagine our application has a cool feature that lets you see which foods you have in common with your friends. The foods in common would be those foods in your set and your friend’s set. This would be an intersection of the two sets:
```
127.0.0.1:6379> SINTER user:42:favfoods user:91:favfoods
1) "Pizza"
2) "Tacos"
```
Here’s a quick visualization of the sets and their intersection using Venn diagrams:
What about foods that all three users like? SINTER accepts any number of keys: SINTER <key1> <key2> [keyN …]
```
> SINTER user:42:favfoods user:91:favfoods user:87:favfoods
1) "Tacos"
```
Everybody loves Tacos!
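The two SINTER calls above behave like ordinary set intersection. The sketch below mirrors them with Python's built-in sets. It is a model of the semantics rather than a call to Valkey, and the `favfoods` dictionary and `sinter` helper are hypothetical names used only for this illustration.

```python
# In-memory copies of the three sets built earlier in the post.
favfoods = {
    "user:42:favfoods": {"Pizza", "Tacos", "Chicken Nuggets", "Tofu"},
    "user:87:favfoods": {"Tacos", "Hamburgers", "Salads"},
    "user:91:favfoods": {"Salads", "Pizza", "Tacos", "Steak", "Brisket"},
}

def sinter(first_key, *other_keys):
    """Return the members present in every named set (missing keys count as empty)."""
    result = set(favfoods.get(first_key, set()))
    for key in other_keys:
        result &= favfoods.get(key, set())
    return result

common_two = sinter("user:42:favfoods", "user:91:favfoods")
common_all = sinter("user:42:favfoods", "user:91:favfoods", "user:87:favfoods")
print(sorted(common_two))  # ['Pizza', 'Tacos']
print(sorted(common_all))  # ['Tacos']
```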
Which foods does user:91 like that user:87 does not? Use SDIFF <key1> <key2> [keyN …]:
```
> SDIFF user:91:favfoods user:87:favfoods
1) "Pizza"
2) "Steak"
3) "Brisket"
```
SDIFF returns the members of the first set that do not appear in any of the subsequent sets, so the order of the keys matters.
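Because the result depends on which key comes first, it can help to see both directions side by side. This is a small sketch using Python sets as a stand-in for the Valkey sets; the `sdiff` helper is a hypothetical name, not part of any client library.

```python
user_87 = {"Tacos", "Hamburgers", "Salads"}
user_91 = {"Salads", "Pizza", "Tacos", "Steak", "Brisket"}

def sdiff(first, *others):
    """Return members of the first set that appear in none of the other sets."""
    return first.difference(*others)

only_91 = sdiff(user_91, user_87)
only_87 = sdiff(user_87, user_91)
print(sorted(only_91))  # ['Brisket', 'Pizza', 'Steak']
print(sorted(only_87))  # ['Hamburgers']
```

Swapping the arguments answers the opposite question: which foods user:87 likes that user:91 does not.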
Hopefully, you now understand how sets can model various types of membership and how the set commands can reveal interesting relationships between them!
Sorted sets
A sorted set in Valkey/Redis is similar to a set in that it contains only unique members. Each member also carries an additional property, called the ‘score’, which determines the sort order. ‘Score’ is a generic term used by the sorted set. It could be the weight of packages, ages of students in class, or number of protons in an atom. You give the score an actual meaning based on the data you store.
Scores are stored as double-precision floating-point numbers, so they can be negative and can hold integers or fractions. If two members have the same score, they are ordered lexicographically by member value. Since no two members can be the same, the ordering is always unambiguous.
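The ordering rule can be sketched in a few lines of Python: sort by score first, then by the member's bytes to break ties. The data here is made up purely to show a tie, and the dictionary is a stand-in for a sorted set, not how Valkey stores it internally.

```python
# Hypothetical sorted set: member -> score. "apple" and "banana" tie on score.
scores = {"banana": 2.0, "apple": 2.0, "cherry": 1.0}

# Order by score, then by the member's raw bytes to break ties.
ordered = sorted(scores, key=lambda member: (scores[member], member.encode("utf-8")))
print(ordered)  # ['cherry', 'apple', 'banana']
```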
Let’s create a sorted set of a few cities in Argentina using their population as the score:
```
> ZADD city_pop 2890150 "Buenos Aires" 1308072 "Cordoba" 1159000 "Rosario" 763943 "La Plata" 541000 "Mar del Plata"
(integer) 5

> ZRANGE city_pop 0 -1
1) "Mar del Plata"
2) "La Plata"
3) "Rosario"
4) "Cordoba"
5) "Buenos Aires"
```
Hang on, why is Mar del Plata first on this list? It has the smallest population, so we might expect it to come last. However, sorted sets in Valkey/Redis are ordered from the lowest score to the highest (i.e., ASC by default). Index 0 is the member with the lowest score, and indexes increase from there.
Valkey/Redis provides a way to reverse the output of our sorted set so that we get the highest score first (i.e., DESC). We can also return the scores with the optional WITHSCORES modifier. (Since Redis 6.2, ZRANGE with the REV option returns the same result, and ZREVRANGE is marked as deprecated.)
```
> ZREVRANGE city_pop 0 2 WITHSCORES
1) "Buenos Aires"
2) "2890150"
3) "Cordoba"
4) "1308072"
5) "Rosario"
6) "1159000"
```
Why are we seeing three results? Like most things in programming, lists and sorted sets use 0-based indexes, and both the start and stop indexes are inclusive. We asked to start at element 0 and go through element 2, which gives three results.
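The index rules are easy to get wrong when coming from languages whose slices exclude the stop position, so here is a minimal Python model of them. The `zrange` function is a hypothetical helper written for this post, not a client call; it assumes inclusive start and stop indexes and negative indexes counting from the end, which matches the transcripts above.

```python
city_pop = {
    "Buenos Aires": 2890150,
    "Cordoba": 1308072,
    "Rosario": 1159000,
    "La Plata": 763943,
    "Mar del Plata": 541000,
}

def zrange(zset, start, stop, rev=False):
    """Return members between two inclusive indexes, lowest score first unless rev."""
    ordered = sorted(zset, key=lambda m: (zset[m], m.encode("utf-8")), reverse=rev)
    size = len(ordered)
    if start < 0:                      # negative indexes count from the end
        start = max(size + start, 0)
    if stop < 0:
        stop = size + stop
    if stop < 0 or start > stop:
        return []
    return ordered[start:stop + 1]     # +1 because the stop index is inclusive

print(zrange(city_pop, 0, -1))
# ['Mar del Plata', 'La Plata', 'Rosario', 'Cordoba', 'Buenos Aires']
print(zrange(city_pop, 0, 2, rev=True))
# ['Buenos Aires', 'Cordoba', 'Rosario']
```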
Conclusion
In this post, we introduced the concept of sets and sorted sets in Valkey/Redis. We learned how to find common elements and differences between sets. With sorted sets, we introduced the ‘score’ concept and showed how to get elements returned in reverse sorted order.