From 8e98058e69f5173c20073ba656d3f51ed4a61edd Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Mon, 22 May 2023 23:50:54 +0100 Subject: [PATCH 01/23] Breaking up the Data Types tutorial into smaller parts --- .gitignore | 1 + commands/scan.md | 182 +---- docs/data-types/bitmaps.md | 75 +- docs/data-types/hashes.md | 41 ++ docs/data-types/hyperloglogs.md | 40 +- docs/data-types/lists.md | 273 +++++++ docs/data-types/sets.md | 130 ++++ docs/data-types/sorted-sets.md | 208 ++++++ docs/data-types/strings.md | 104 ++- docs/data-types/tutorial.md | 997 -------------------------- docs/manual/client-side-caching.md | 2 +- docs/manual/keyspace-notifications.md | 2 +- docs/manual/pubsub.md | 2 +- docs/manual/the-redis-keyspace.md | 324 +++++++++ docs/manual/transactions.md | 2 +- 15 files changed, 1173 insertions(+), 1210 deletions(-) delete mode 100644 docs/data-types/tutorial.md create mode 100644 docs/manual/the-redis-keyspace.md diff --git a/.gitignore b/.gitignore index 17952c7fd0..4610ac14e8 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,3 @@ .idea tmp +.DS_Store diff --git a/commands/scan.md b/commands/scan.md index dcc94de9cb..84ec7634c1 100644 --- a/commands/scan.md +++ b/commands/scan.md @@ -11,188 +11,8 @@ However while blocking commands like `SMEMBERS` are able to provide all the elem Note that `SCAN`, `SSCAN`, `HSCAN` and `ZSCAN` all work very similarly, so this documentation covers all the four commands. However an obvious difference is that in the case of `SSCAN`, `HSCAN` and `ZSCAN` the first argument is the name of the key holding the Set, Hash or Sorted Set value. The `SCAN` command does not need any key name argument as it iterates keys in the current database, so the iterated object is the database itself. -## SCAN basic usage -SCAN is a cursor based iterator. This means that at every call of the command, the server returns an updated cursor that the user needs to use as the cursor argument in the next call. - -An iteration starts when the cursor is set to 0, and terminates when the cursor returned by the server is 0. The following is an example of SCAN iteration: - -``` -redis 127.0.0.1:6379> scan 0 -1) "17" -2) 1) "key:12" - 2) "key:8" - 3) "key:4" - 4) "key:14" - 5) "key:16" - 6) "key:17" - 7) "key:15" - 8) "key:10" - 9) "key:3" - 10) "key:7" - 11) "key:1" -redis 127.0.0.1:6379> scan 17 -1) "0" -2) 1) "key:5" - 2) "key:18" - 3) "key:0" - 4) "key:2" - 5) "key:19" - 6) "key:13" - 7) "key:6" - 8) "key:9" - 9) "key:11" -``` - -In the example above, the first call uses zero as a cursor, to start the iteration. The second call uses the cursor returned by the previous call as the first element of the reply, that is, 17. - -As you can see the **SCAN return value** is an array of two values: the first value is the new cursor to use in the next call, the second value is an array of elements. - -Since in the second call the returned cursor is 0, the server signaled to the caller that the iteration finished, and the collection was completely explored. Starting an iteration with a cursor value of 0, and calling `SCAN` until the returned cursor is 0 again is called a **full iteration**. - -## Scan guarantees - -The `SCAN` command, and the other commands in the `SCAN` family, are able to provide to the user a set of guarantees associated to full iterations. - -* A full iteration always retrieves all the elements that were present in the collection from the start to the end of a full iteration. 
This means that if a given element is inside the collection when an iteration is started, and is still there when an iteration terminates, then at some point `SCAN` returned it to the user. -* A full iteration never returns any element that was NOT present in the collection from the start to the end of a full iteration. So if an element was removed before the start of an iteration, and is never added back to the collection for all the time an iteration lasts, `SCAN` ensures that this element will never be returned. - -However because `SCAN` has very little state associated (just the cursor) it has the following drawbacks: - -* A given element may be returned multiple times. It is up to the application to handle the case of duplicated elements, for example only using the returned elements in order to perform operations that are safe when re-applied multiple times. -* Elements that were not constantly present in the collection during a full iteration, may be returned or not: it is undefined. - -## Number of elements returned at every SCAN call - -`SCAN` family functions do not guarantee that the number of elements returned per call are in a given range. The commands are also allowed to return zero elements, and the client should not consider the iteration complete as long as the returned cursor is not zero. - -However the number of returned elements is reasonable, that is, in practical terms SCAN may return a maximum number of elements in the order of a few tens of elements when iterating a large collection, or may return all the elements of the collection in a single call when the iterated collection is small enough to be internally represented as an encoded data structure (this happens for small sets, hashes and sorted sets). - -However there is a way for the user to tune the order of magnitude of the number of returned elements per call using the **COUNT** option. - -## The COUNT option - -While `SCAN` does not provide guarantees about the number of elements returned at every iteration, it is possible to empirically adjust the behavior of `SCAN` using the **COUNT** option. Basically with COUNT the user specified the *amount of work that should be done at every call in order to retrieve elements from the collection*. This is **just a hint** for the implementation, however generally speaking this is what you could expect most of the times from the implementation. - -* The default COUNT value is 10. -* When iterating the key space, or a Set, Hash or Sorted Set that is big enough to be represented by a hash table, assuming no **MATCH** option is used, the server will usually return *count* or a bit more than *count* elements per call. Please check the *why SCAN may return all the elements at once* section later in this document. -* When iterating Sets encoded as intsets (small sets composed of just integers), or Hashes and Sorted Sets encoded as ziplists (small hashes and sets composed of small individual values), usually all the elements are returned in the first `SCAN` call regardless of the COUNT value. - -Important: **there is no need to use the same COUNT value** for every iteration. The caller is free to change the count from one iteration to the other as required, as long as the cursor passed in the next call is the one obtained in the previous call to the command. - -## The MATCH option - -It is possible to only iterate elements matching a given glob-style pattern, similarly to the behavior of the `KEYS` command that takes a pattern as its only argument. 
-
-To do so, just append the `MATCH <pattern>` arguments at the end of the `SCAN` command (it works with all the SCAN family commands).
-
-This is an example of iteration using **MATCH**:
-
-```
-redis 127.0.0.1:6379> sadd myset 1 2 3 foo foobar feelsgood
-(integer) 6
-redis 127.0.0.1:6379> sscan myset 0 match f*
-1) "0"
-2) 1) "foo"
-   2) "feelsgood"
-   3) "foobar"
-redis 127.0.0.1:6379>
-```
-
-It is important to note that the **MATCH** filter is applied after elements are retrieved from the collection, just before returning data to the client. This means that if the pattern matches very little elements inside the collection, `SCAN` will likely return no elements in most iterations. An example is shown below:
-
-```
-redis 127.0.0.1:6379> scan 0 MATCH *11*
-1) "288"
-2) 1) "key:911"
-redis 127.0.0.1:6379> scan 288 MATCH *11*
-1) "224"
-2) (empty list or set)
-redis 127.0.0.1:6379> scan 224 MATCH *11*
-1) "80"
-2) (empty list or set)
-redis 127.0.0.1:6379> scan 80 MATCH *11*
-1) "176"
-2) (empty list or set)
-redis 127.0.0.1:6379> scan 176 MATCH *11* COUNT 1000
-1) "0"
-2) 1) "key:611"
-   2) "key:711"
-   3) "key:118"
-   4) "key:117"
-   5) "key:311"
-   6) "key:112"
-   7) "key:111"
-   8) "key:110"
-   9) "key:113"
-  10) "key:211"
-  11) "key:411"
-  12) "key:115"
-  13) "key:116"
-  14) "key:114"
-  15) "key:119"
-  16) "key:811"
-  17) "key:511"
-  18) "key:11"
-redis 127.0.0.1:6379>
-```
-
-As you can see most of the calls returned zero elements, but the last call where a COUNT of 1000 was used in order to force the command to do more scanning for that iteration.
-
-
-## The TYPE option
-
-You can use the `!TYPE` option to ask `SCAN` to only return objects that match a given `type`, allowing you to iterate through the database looking for keys of a specific type. The **TYPE** option is only available on the whole-database `SCAN`, not `HSCAN` or `ZSCAN` etc.
-
-The `type` argument is the same string name that the `TYPE` command returns. Note a quirk where some Redis types, such as GeoHashes, HyperLogLogs, Bitmaps, and Bitfields, may internally be implemented using other Redis types, such as a string or zset, so can't be distinguished from other keys of that same type by `SCAN`. For example, a ZSET and GEOHASH:
-
-```
-redis 127.0.0.1:6379> GEOADD geokey 0 0 value
-(integer) 1
-redis 127.0.0.1:6379> ZADD zkey 1000 value
-(integer) 1
-redis 127.0.0.1:6379> TYPE geokey
-zset
-redis 127.0.0.1:6379> TYPE zkey
-zset
-redis 127.0.0.1:6379> SCAN 0 TYPE zset
-1) "0"
-2) 1) "geokey"
-   2) "zkey"
-```
-
-It is important to note that the **TYPE** filter is also applied after elements are retrieved from the database, so the option does not reduce the amount of work the server has to do to complete a full iteration, and for rare types you may receive no elements in many iterations.
-
-## Multiple parallel iterations
-
-It is possible for an infinite number of clients to iterate the same collection at the same time, as the full state of the iterator is in the cursor, that is obtained and returned to the client at every call. No server side state is taken at all.
-
-## Terminating iterations in the middle
-
-Since there is no state server side, but the full state is captured by the cursor, the caller is free to terminate an iteration half-way without signaling this to the server in any way. An infinite number of iterations can be started and never terminated without any issue.
-
-## Calling SCAN with a corrupted cursor
-
-Calling `SCAN` with a broken, negative, out of range, or otherwise invalid cursor, will result in undefined behavior but never in a crash. What will be undefined is that the guarantees about the returned elements can no longer be ensured by the `SCAN` implementation.
-
-The only valid cursors to use are:
-
-* The cursor value of 0 when starting an iteration.
-* The cursor returned by the previous call to SCAN in order to continue the iteration.
-
-## Guarantee of termination
-
-The `SCAN` algorithm is guaranteed to terminate only if the size of the iterated collection remains bounded to a given maximum size, otherwise iterating a collection that always grows may result into `SCAN` to never terminate a full iteration.
-
-This is easy to see intuitively: if the collection grows there is more and more work to do in order to visit all the possible elements, and the ability to terminate the iteration depends on the number of calls to `SCAN` and its COUNT option value compared with the rate at which the collection grows.
-
-## Why SCAN may return all the items of an aggregate data type in a single call?
-
-In the `COUNT` option documentation, we state that sometimes this family of commands may return all the elements of a Set, Hash or Sorted Set at once in a single call, regardless of the `COUNT` option value. The reason why this happens is that the cursor-based iterator can be implemented, and is useful, only when the aggregate data type that we are scanning is represented as a hash table. However Redis uses a [memory optimization](/topics/memory-optimization) where small aggregate data types, until they reach a given amount of items or a given max size of single elements, are represented using a compact single-allocation packed encoding. When this is the case, `SCAN` has no meaningful cursor to return, and must iterate the whole data structure at once, so the only sane behavior it has is to return everything in a call.
-
-However once the data structures are bigger and are promoted to use real hash tables, the `SCAN` family of commands will resort to the normal behavior. Note that since this special behavior of returning all the elements is true only for small aggregates, it has no effects on the command complexity or latency. However the exact limits to get converted into real hash tables are [user configurable](/topics/memory-optimization), so the maximum number of elements you can see returned in a single call depends on how big an aggregate data type could be and still use the packed representation.
-
-Also note that this behavior is specific of `SSCAN`, `HSCAN` and `ZSCAN`. `SCAN` itself never shows this behavior because the key space is always represented by hash tables.
+For more information on `SCAN`, please refer to the [Redis Keyspace](/docs/manual/the-redis-keyspace.md) tutorial.

 ## Return value

diff --git a/docs/data-types/bitmaps.md b/docs/data-types/bitmaps.md
index d1d4440644..b829b93173 100644
--- a/docs/data-types/bitmaps.md
+++ b/docs/data-types/bitmaps.md
@@ -6,13 +6,84 @@ description: >
     Introduction to Redis bitmaps
 ---
 
-Redis bitmaps are an extension of the string data type that lets you treat a string like a bit vector.
-You can also perform bitwise operations on one or more strings.
+Bitmaps are not an actual data type, but a set of bit-oriented operations
+defined on the String type, which is treated like a bit vector.
+Since strings are binary safe blobs and their maximum length is 512 MB,
+they can be used to set up to 2^32 different bits.
+
+You can perform bitwise operations on one or more strings.

 Some examples of bitmap use cases include:

 * Efficient set representations for cases where the members of a set correspond to the integers 0-N.
 * Object permissions, where each bit represents a particular permission, similar to the way that file systems store permissions.
+
+Bit operations are divided into two groups: constant-time single bit
+operations, like setting a bit to 1 or 0, or getting its value, and
+operations on groups of bits, for example counting the number of set
+bits in a given range of bits (e.g., population counting).
+
+One of the biggest advantages of bitmaps is that they often provide
+extreme space savings when storing information. For example, in a system
+where different users are represented by incremental user IDs, it is possible
+to remember a single bit of information (for example, knowing whether
+a user wants to receive a newsletter) for 4 billion users using just 512 MB of memory.
+
+Bits are set and retrieved using the `SETBIT` and `GETBIT` commands:
+
+    > setbit key 10 1
+    (integer) 0
+    > getbit key 10
+    (integer) 1
+    > getbit key 11
+    (integer) 0
+
+The `SETBIT` command takes as its first argument the bit number, and as its second
+argument the value to set the bit to, which is 1 or 0. The command
+automatically enlarges the string if the addressed bit is outside the
+current string length.
+
+`GETBIT` just returns the value of the bit at the specified index.
+Out of range bits (addressing a bit that is outside the length of the string
+stored into the target key) are always considered to be zero.
+
+There are three commands operating on groups of bits:
+
+1. `BITOP` performs bit-wise operations between different strings. The provided operations are AND, OR, XOR and NOT.
+2. `BITCOUNT` performs population counting, reporting the number of bits set to 1.
+3. `BITPOS` finds the first bit having the specified value of 0 or 1.
+
+Both `BITPOS` and `BITCOUNT` are able to operate with byte ranges of the
+string, instead of running for the whole length of the string. The following
+is a trivial example of a `BITCOUNT` call:
+
+    > setbit key 0 1
+    (integer) 0
+    > setbit key 100 1
+    (integer) 0
+    > bitcount key
+    (integer) 2
+
+For example, imagine you want to know the longest streak of daily visits of
+your web site's users. You start counting days from zero, that is, the
+day you made your web site public, and set a bit with `SETBIT` every time
+a user visits the web site. As a bit index you simply take the current unix
+time, subtract the initial offset, and divide by the number of seconds in a day
+(normally, 3600\*24).
+
+This way for each user you have a small string containing the visit
+information for each day. With `BITCOUNT` it is possible to easily get
+the number of days a given user visited the web site, while with
+a few `BITPOS` calls, or simply fetching and analyzing the bitmap client-side,
+it is possible to easily compute the longest streak.
+
+Bitmaps are trivial to split into multiple keys, for example for
+the sake of sharding the data set and because in general it is better to
+avoid working with huge keys. To split a bitmap across different keys,
+instead of setting all the bits into a key, a trivial strategy is just
+to store M bits per key, obtain the key name with `bit-number/M`, and
+address the Nth bit inside the key with `bit-number MOD M`.
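+
+To see the daily visits idea above from application code, here is a minimal
+sketch in Python using the `redis-py` client (the client choice, the
+`visits:<user-id>` key naming and the launch timestamp are illustrative
+assumptions, not something Redis prescribes):
+
+    import time
+    import redis  # assumes the redis-py package is installed
+
+    r = redis.Redis()
+    LAUNCH = 1684708200        # hypothetical unix time the site went public
+    SECONDS_PER_DAY = 3600 * 24
+
+    def record_visit(user_id):
+        # Bit index = whole days elapsed since the site went public.
+        day = (int(time.time()) - LAUNCH) // SECONDS_PER_DAY
+        r.setbit(f"visits:{user_id}", day, 1)
+
+    def days_visited(user_id):
+        # Population count over the user's whole bitmap.
+        return r.bitcount(f"visits:{user_id}")
+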
 ## Examples

 Suppose you have 1000 sensors deployed in the field, labeled 0-999.
diff --git a/docs/data-types/hashes.md b/docs/data-types/hashes.md
index 2bafb4ab61..1ecb940e9b 100644
--- a/docs/data-types/hashes.md
+++ b/docs/data-types/hashes.md
@@ -9,6 +9,47 @@ description: >
 Redis hashes are record types structured as collections of field-value pairs.
 You can use hashes to represent basic objects and to store groupings of counters, among other things.
 
+    > hset user:1000 username antirez birthyear 1977 verified 1
+    (integer) 3
+    > hget user:1000 username
+    "antirez"
+    > hget user:1000 birthyear
+    "1977"
+    > hgetall user:1000
+    1) "username"
+    2) "antirez"
+    3) "birthyear"
+    4) "1977"
+    5) "verified"
+    6) "1"
+
+While hashes are handy to represent *objects*, the number of fields you can
+put inside a hash has no practical limits (other than available memory), so you can use
+hashes in many different ways inside your application.
+
+The command [`HSET`](/commands/hset) sets multiple fields of the hash, while [`HGET`](/commands/hget) retrieves
+a single field. [`HMGET`](/commands/hmget) is similar to [`HGET`](/commands/hget) but returns an array of values:
+
+    > hmget user:1000 username birthyear no-such-field
+    1) "antirez"
+    2) "1977"
+    3) (nil)
+
+There are commands that are able to perform operations on individual fields
+as well, like [`HINCRBY`](/commands/hincrby):
+
+    > hincrby user:1000 birthyear 10
+    (integer) 1987
+    > hincrby user:1000 birthyear 10
+    (integer) 1997
+
+You can find the [full list of hash commands in the documentation](https://redis.io/commands#hash).
+
+It is worth noting that small hashes (i.e., a few elements with small values) are
+encoded in a special way in memory that makes them very memory efficient.
+
+
 ## Examples

 * Represent a basic user profile as a hash:
diff --git a/docs/data-types/hyperloglogs.md b/docs/data-types/hyperloglogs.md
index 3c60947af6..5a22dcdc0f 100644
--- a/docs/data-types/hyperloglogs.md
+++ b/docs/data-types/hyperloglogs.md
@@ -6,10 +6,48 @@ description: >
     Introduction to the Redis HyperLogLog data type
 ---
 
-HyperLogLog is a data structure that estimates the cardinality of a set. As a probabilistic data structure, HyperLogLog trades perfect accuracy for efficient space utilization.
+HyperLogLog is a probabilistic data structure that estimates the cardinality of a set. As a probabilistic data structure, HyperLogLog trades perfect accuracy for efficient space utilization.
 
 The Redis HyperLogLog implementation uses up to 12 KB and provides a standard error of 0.81%.
 
+Counting unique items usually requires an amount of memory
+proportional to the number of items you want to count, because you need
+to remember the elements you have already seen in the past in order to avoid
+counting them multiple times. However, there is a set of algorithms that trade
+memory for precision: you end up with an estimated measure with a standard error,
+which in the case of the Redis implementation is less than 1%. The
+magic of this algorithm is that you no longer need to use an amount of memory
+proportional to the number of items counted, and instead can use a
+constant amount of memory! 12k bytes in the worst case, or a lot less if your
+HyperLogLog (we'll just call them HLLs from now on) has seen very few elements.
+
+HLLs in Redis, while technically a different data structure, are encoded
+as a Redis string, so you can call `GET` to serialize an HLL, and `SET`
+to deserialize it back to the server.
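+
+As a small illustration of that round trip, here is a sketch in Python using
+the `redis-py` client (the client and the key names are assumptions made for
+the example):
+
+    import redis  # assumes the redis-py package is installed
+
+    r = redis.Redis()
+    r.pfadd("hll:source", "a", "b", "c", "d")
+
+    # GET returns the HLL's internal representation as an ordinary string...
+    blob = r.get("hll:source")
+
+    # ...and SET stores that string under another key, restoring the state.
+    r.set("hll:copy", blob)
+    assert r.pfcount("hll:copy") == r.pfcount("hll:source")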
+
+Conceptually the HLL API is like using Sets to do the same task. You would
+`SADD` every observed element into a set, and would use `SCARD` to check the
+number of elements inside the set, which are unique since `SADD` will not
+re-add an existing element.
+
+While you don't really *add items* into an HLL, because the data structure
+only contains a state that does not include actual elements, the API is the
+same:
+
+* Every time you see a new element, you add it to the count with `PFADD`.
+* Every time you want to retrieve the current approximation of the unique elements *added* with `PFADD` so far, you use `PFCOUNT`.
+
+    > pfadd hll a b c d
+    (integer) 1
+    > pfcount hll
+    (integer) 4
+
+An example of a use case for this data structure is counting unique queries
+performed by users in a search form every day.
+
+Redis is also able to perform the union of HLLs; please check the
+[full documentation](/commands#hyperloglog) for more information.
+
 ## Examples

 * Add some items to the HyperLogLog:
diff --git a/docs/data-types/lists.md b/docs/data-types/lists.md
index b4769f5bfc..5594209344 100644
--- a/docs/data-types/lists.md
+++ b/docs/data-types/lists.md
@@ -12,6 +12,279 @@ Redis lists are frequently used to:
 * Implement stacks and queues.
 * Build queue management for background worker systems.
 
+### What are Lists?
+
+To explain the List data type it's better to start with a little bit of theory,
+as the term *List* is often used in an improper way by information technology
+folks. For instance "Python Lists" are not what the name may suggest (Linked
+Lists), but rather Arrays (the same data type is called Array in
+Ruby actually).
+
+From a very general point of view a List is just a sequence of ordered
+elements: 10,20,1,2,3 is a list. But the properties of a List implemented using
+an Array are very different from the properties of a List implemented using a
+*Linked List*.
+
+Redis lists are implemented via Linked Lists. This means that even if you have
+millions of elements inside a list, the operation of adding a new element at
+the head or at the tail of the list is performed *in constant time*. The speed of adding a
+new element with the [`LPUSH`](/commands/lpush) command to the head of a list with ten
+elements is the same as adding an element to the head of a list with 10
+million elements.
+
+What's the downside? Accessing an element *by index* is very fast in lists
+implemented with an Array (constant time indexed access) and not so fast in
+lists implemented by linked lists (where the operation requires an amount of
+work proportional to the index of the accessed element).
+
+Redis Lists are implemented with linked lists because for a database system it
+is crucial to be able to add elements to a very long list in a very fast way.
+Another strong advantage, as you'll see in a moment, is that Redis Lists can be
+capped at a constant length in constant time.
+
+When fast access to the middle of a large collection of elements is important,
+there is a different data structure that can be used, called sorted sets.
+Sorted sets are covered in the [Sorted sets](/docs/data-types/sorted-sets) tutorial page.
+
+### First steps with Redis Lists
+
+The [`LPUSH`](/commands/lpush) command adds a new element into a list, on the
+left (at the head), while the [`RPUSH`](/commands/rpush) command adds a new
+element into a list, on the right (at the tail). 
Finally the
+[`LRANGE`](/commands/lrange) command extracts ranges of elements from lists:
+
+    > rpush mylist A
+    (integer) 1
+    > rpush mylist B
+    (integer) 2
+    > lpush mylist first
+    (integer) 3
+    > lrange mylist 0 -1
+    1) "first"
+    2) "A"
+    3) "B"
+
+Note that [`LRANGE`](/commands/lrange) takes two indexes, the first and the last
+element of the range to return. Both indexes can be negative, telling Redis
+to start counting from the end: so -1 is the last element, -2 is the
+penultimate element of the list, and so forth.
+
+As you can see [`RPUSH`](/commands/rpush) appended the elements on the right of the list, while
+the final [`LPUSH`](/commands/lpush) appended the element on the left.
+
+Both commands are *variadic commands*, meaning that you are free to push
+multiple elements into a list in a single call:
+
+    > rpush mylist 1 2 3 4 5 "foo bar"
+    (integer) 9
+    > lrange mylist 0 -1
+    1) "first"
+    2) "A"
+    3) "B"
+    4) "1"
+    5) "2"
+    6) "3"
+    7) "4"
+    8) "5"
+    9) "foo bar"
+
+An important operation defined on Redis lists is the ability to *pop elements*.
+Popping elements is the operation of both retrieving the element from the list
+and eliminating it from the list, at the same time. You can pop elements
+from left and right, similarly to how you can push elements on both sides
+of the list:
+
+    > rpush mylist a b c
+    (integer) 3
+    > rpop mylist
+    "c"
+    > rpop mylist
+    "b"
+    > rpop mylist
+    "a"
+
+We added three elements and popped three elements, so at the end of this
+sequence of commands the list is empty and there are no more elements to
+pop. If we try to pop yet another element, this is the result we get:
+
+    > rpop mylist
+    (nil)
+
+Redis returned a NULL value to signal that there are no elements in the
+list.
+
+### Common use cases for lists
+
+Lists are useful for a number of tasks; two very representative use cases
+are the following:
+
+* Remember the latest updates posted by users into a social network.
+* Communication between processes, using a consumer-producer pattern where the producer pushes items into a list, and a consumer (usually a *worker*) consumes those items and executes actions. Redis has special list commands to make this use case both more reliable and efficient.
+
+For example both the popular Ruby libraries [resque](https://github.com/resque/resque) and
+[sidekiq](https://github.com/mperham/sidekiq) use Redis lists under the hood in order to
+implement background jobs.
+
+The popular Twitter social network [takes the latest tweets](http://www.infoq.com/presentations/Real-Time-Delivery-Twitter)
+posted by users into Redis lists.
+
+To describe a common use case step by step, imagine your home page shows the latest
+photos published in a photo sharing social network and you want to speed up access.
+
+* Every time a user posts a new photo, we add its ID into a list with [`LPUSH`](/commands/lpush).
+* When users visit the home page, we use `LRANGE 0 9` in order to get the latest 10 posted items.
+
+### Capped lists
+
+In many use cases we just want to use lists to store the *latest items*,
+whatever they are: social network updates, logs, or anything else.
+
+Redis allows us to use lists as a capped collection, only remembering the latest
+N items and discarding all the oldest items, using the [`LTRIM`](/commands/ltrim) command.
+
+The [`LTRIM`](/commands/ltrim) command is similar to [`LRANGE`](/commands/lrange), but **instead of displaying the
+specified range of elements** it sets this range as the new list value. 
All
+the elements outside the given range are removed.
+
+An example will make it clearer:
+
+    > rpush mylist 1 2 3 4 5
+    (integer) 5
+    > ltrim mylist 0 2
+    OK
+    > lrange mylist 0 -1
+    1) "1"
+    2) "2"
+    3) "3"
+
+The above [`LTRIM`](/commands/ltrim) command tells Redis to keep just the list elements from index
+0 to 2; everything else will be discarded. This allows for a very simple but
+useful pattern: doing a List push operation + a List trim operation together
+in order to add a new element and discard elements exceeding a limit:
+
+    LPUSH mylist <some element>
+    LTRIM mylist 0 999
+
+The above combination adds a new element and keeps only the 1000
+newest elements in the list. With [`LRANGE`](/commands/lrange) you can access the top items
+without any need to remember very old data.
+
+Note: while [`LRANGE`](/commands/lrange) is technically an O(N) command, accessing small ranges
+towards the head or the tail of the list is a constant time operation.
+
+### Blocking operations on lists
+
+Lists have a special feature that makes them suitable to implement queues,
+and in general as a building block for inter-process communication systems:
+blocking operations.
+
+Imagine you want to push items into a list with one process, and use
+a different process in order to actually do some kind of work with those
+items. This is the usual producer / consumer setup, and can be implemented
+in the following simple way:
+
+* To push items into the list, producers call [`LPUSH`](/commands/lpush).
+* To extract / process items from the list, consumers call [`RPOP`](/commands/rpop).
+
+However it is possible that sometimes the list is empty and there is nothing
+to process, so [`RPOP`](/commands/rpop) just returns NULL. In this case a consumer is forced to wait
+some time and retry with [`RPOP`](/commands/rpop). This is called *polling*, and is not
+a good idea in this context because it has several drawbacks:
+
+1. Forces Redis and clients to process useless commands (all the requests when the list is empty will get no actual work done, they'll just return NULL).
+2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to [`RPOP`](/commands/rpop), with the effect of amplifying problem number 1, i.e. more useless calls to Redis.
+
+So Redis implements commands called [`BRPOP`](/commands/brpop) and [`BLPOP`](/commands/blpop) which are versions
+of [`RPOP`](/commands/rpop) and [`LPOP`](/commands/lpop) able to block if the list is empty: they'll return to
+the caller only when a new element is added to the list, or when a user-specified
+timeout is reached.
+
+This is an example of a [`BRPOP`](/commands/brpop) call we could use in the worker:
+
+    > brpop tasks 5
+    1) "tasks"
+    2) "do_something"
+
+It means: "wait for elements in the list `tasks`, but return if after 5 seconds
+no element is available".
+
+Note that you can use 0 as timeout to wait for elements forever, and you can
+also specify multiple lists and not just one, in order to wait on multiple
+lists at the same time, and get notified when the first list receives an
+element.
+
+A few things to note about [`BRPOP`](/commands/brpop):
+
+1. Clients are served in an ordered way: the first client that blocked waiting for a list is served first when an element is pushed by some other client, and so forth.
+2. 
The return value is different compared to [`RPOP`](/commands/rpop): it is a two-element array since it also includes the name of the key, because [`BRPOP`](/commands/brpop) and [`BLPOP`](/commands/blpop) are able to block waiting for elements from multiple lists.
+3. If the timeout is reached, NULL is returned.
+
+There are more things you should know about lists and blocking ops. We
+suggest that you read more on the following:
+
+* It is possible to build safer queues or rotating queues using [`LMOVE`](/commands/lmove).
+* There is also a blocking variant of the command, called [`BLMOVE`](/commands/blmove).
+
+## Automatic creation and removal of keys
+
+So far in our examples we never had to create empty lists before pushing
+elements, or to remove empty lists when they no longer have elements inside.
+It is Redis' responsibility to delete keys when lists are left empty, or to create
+an empty list if the key does not exist and we are trying to add elements
+to it, for example, with [`LPUSH`](/commands/lpush).
+
+This is not specific to lists; it applies to all the Redis data types
+composed of multiple elements -- Streams, Sets, Sorted Sets and Hashes.
+
+Basically we can summarize the behavior with three rules:
+
+1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element.
+2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed. The Stream data type is the only exception to this rule.
+3. Calling a read-only command such as [`LLEN`](/commands/llen) (which returns the length of the list), or a write command removing elements, with an empty key always produces the same result as if the key held an empty aggregate type of the kind the command expects to find.
+
+Examples of rule 1:
+
+    > del mylist
+    (integer) 1
+    > lpush mylist 1 2 3
+    (integer) 3
+
+However we can't perform operations against the wrong type if the key exists:
+
+    > set foo bar
+    OK
+    > lpush foo 1 2 3
+    (error) WRONGTYPE Operation against a key holding the wrong kind of value
+    > type foo
+    string
+
+Example of rule 2:
+
+    > lpush mylist 1 2 3
+    (integer) 3
+    > exists mylist
+    (integer) 1
+    > lpop mylist
+    "3"
+    > lpop mylist
+    "2"
+    > lpop mylist
+    "1"
+    > exists mylist
+    (integer) 0
+
+The key no longer exists after all the elements are popped.
+
+Example of rule 3:
+
+    > del mylist
+    (integer) 0
+    > llen mylist
+    (integer) 0
+    > lpop mylist
+    (nil)
+
+
 ## Examples

 * Treat a list like a queue (first in, first out):
diff --git a/docs/data-types/sets.md b/docs/data-types/sets.md
index 76ce907a3a..00ee2cf45c 100644
--- a/docs/data-types/sets.md
+++ b/docs/data-types/sets.md
@@ -13,6 +13,136 @@ You can use Redis sets to efficiently:
 * Represent relations (e.g., the set of all users with a given role).
 * Perform common set operations such as intersection, unions, and differences.
 
+The [`SADD`](/commands/sadd) command adds new elements to a set. It's also possible
+to do a number of other operations against sets, like testing whether a given element
+already exists, performing the intersection, union or difference between
+multiple sets, and so forth.
+
+    > sadd myset 1 2 3
+    (integer) 3
+    > smembers myset
+    1. 3
+    2. 1
+    3. 2
+
+Here I've added three elements to my set and told Redis to return all the
+elements. 
As you can see they are not sorted -- Redis is free to return the
+elements in any order at every call, since there is no contract with the
+user about element ordering.
+
+Redis has commands to test for membership. For example, checking if an element exists:
+
+    > sismember myset 3
+    (integer) 1
+    > sismember myset 30
+    (integer) 0
+
+"3" is a member of the set, while "30" is not.
+
+Sets are good for expressing relations between objects.
+For instance we can easily use sets in order to implement tags.
+
+A simple way to model this problem is to have a set for every object we
+want to tag. The set contains the IDs of the tags associated with the object.
+
+One illustration is tagging news articles.
+If article ID 1000 is tagged with tags 1, 2, 5 and 77, a set
+can associate these tag IDs with the news item:
+
+    > sadd news:1000:tags 1 2 5 77
+    (integer) 4
+
+We may want to have the inverse relation as well: the list
+of all the news tagged with a given tag:
+
+    > sadd tag:1:news 1000
+    (integer) 1
+    > sadd tag:2:news 1000
+    (integer) 1
+    > sadd tag:5:news 1000
+    (integer) 1
+    > sadd tag:77:news 1000
+    (integer) 1
+
+Getting all the tags for a given object is trivial:
+
+    > smembers news:1000:tags
+    1. 5
+    2. 1
+    3. 77
+    4. 2
+
+Note: in the example we assume you have another data structure, for example
+a Redis hash, which maps tag IDs to tag names.
+
+There are other non-trivial operations that are still easy to implement
+using the right Redis commands. For instance we may want a list of all the
+objects with the tags 1, 2, 10, and 27 together. We can do this using
+the [`SINTER`](/commands/sinter) command, which performs the intersection between different
+sets. We can use:
+
+    > sinter tag:1:news tag:2:news tag:10:news tag:27:news
+    ... results here ...
+
+In addition to intersection you can also perform
+unions and differences, extract a random element, and so forth.
+
+The command to extract an element is called [`SPOP`](/commands/spop), and is handy to model
+certain problems. For example in order to implement a web-based poker game,
+you may want to represent your deck with a set. Imagine we use a one-char
+prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades:
+
+    > sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CK
+      D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3
+      H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6
+      S7 S8 S9 S10 SJ SQ SK
+    (integer) 52
+
+Now we want to provide each player with 5 cards. The [`SPOP`](/commands/spop) command
+removes a random element, returning it to the client, so it is the
+perfect operation in this case.
+
+However if we call it against our deck directly, in the next play of the
+game we'll need to populate the deck of cards again, which may not be
+ideal. So to start, we can make a copy of the set stored in the `deck` key
+into the `game:1:deck` key.
+
+This is accomplished using [`SUNIONSTORE`](/commands/sunionstore), which normally performs the
+union between multiple sets, and stores the result into another set.
+However, since the union of a single set is itself, I can copy my deck
+with:
+
+    > sunionstore game:1:deck deck
+    (integer) 52
+
+Now I'm ready to provide the first player with five cards:
+
+    > spop game:1:deck
+    "C6"
+    > spop game:1:deck
+    "CQ"
+    > spop game:1:deck
+    "D1"
+    > spop game:1:deck
+    "CJ"
+    > spop game:1:deck
+    "SJ"
+
+One pair of jacks, not great...
+
+This is a good time to introduce the set command that provides the number
+of elements inside a set. 
This is often called the *cardinality of a set*
+in the context of set theory, so the Redis command is called [`SCARD`](/commands/scard).
+
+    > scard game:1:deck
+    (integer) 47
+
+The math works: 52 - 5 = 47.
+
+When you need to just get random elements without removing them from the
+set, there is the [`SRANDMEMBER`](/commands/srandmember) command suitable for the task. It also features
+the ability to return both repeating and non-repeating elements.
+
 ## Examples

 * Store the set of favorited book IDs for users 123 and 456:
diff --git a/docs/data-types/sorted-sets.md b/docs/data-types/sorted-sets.md
index d7049f94b9..868231e591 100644
--- a/docs/data-types/sorted-sets.md
+++ b/docs/data-types/sorted-sets.md
@@ -13,6 +13,214 @@ Some use cases for sorted sets include:
 * Leaderboards. For example, you can use sorted sets to easily maintain ordered lists of the highest scores in a massive online game.
 * Rate limiters. In particular, you can use a sorted set to build a sliding-window rate limiter to prevent excessive API requests.
 
+You can think of sorted sets as a mix between a Set and
+a Hash. Like sets, sorted sets are composed of unique, non-repeating
+string elements, so in some sense a sorted set is a set as well.
+
+However while elements inside sets are not ordered, every element in
+a sorted set is associated with a floating point value, called *the score*
+(this is why the type is also similar to a hash, since every element
+is mapped to a value).
+
+Moreover, elements in a sorted set are *taken in order* (so they are not
+ordered on request; order is a peculiarity of the data structure used to
+represent sorted sets). They are ordered according to the following rule:
+
+* If B and A are two elements with a different score, then A > B if A.score is > B.score.
+* If B and A have exactly the same score, then A > B if the A string is lexicographically greater than the B string. B and A strings can't be equal since sorted sets only have unique elements.
+
+Let's start with a simple example, adding a few selected hackers' names as
+sorted set elements, with their year of birth as "score".
+
+    > zadd hackers 1940 "Alan Kay"
+    (integer) 1
+    > zadd hackers 1957 "Sophie Wilson"
+    (integer) 1
+    > zadd hackers 1953 "Richard Stallman"
+    (integer) 1
+    > zadd hackers 1949 "Anita Borg"
+    (integer) 1
+    > zadd hackers 1965 "Yukihiro Matsumoto"
+    (integer) 1
+    > zadd hackers 1914 "Hedy Lamarr"
+    (integer) 1
+    > zadd hackers 1916 "Claude Shannon"
+    (integer) 1
+    > zadd hackers 1969 "Linus Torvalds"
+    (integer) 1
+    > zadd hackers 1912 "Alan Turing"
+    (integer) 1
+
+
+As you can see `ZADD` is similar to `SADD`, but takes one additional argument
+(placed before the element to be added) which is the score.
+`ZADD` is also variadic, so you are free to specify multiple score-value
+pairs, even if this is not used in the example above.
+
+With sorted sets it is trivial to return a list of hackers sorted by their
+birth year because actually *they are already sorted*.
+
+Implementation note: Sorted sets are implemented via a
+dual-ported data structure containing both a skip list and a hash table, so
+every time we add an element Redis performs an O(log(N)) operation. 
That's
+good, but when we ask for sorted elements Redis does not have to do any work at
+all, because it's already all sorted:
+
+    > zrange hackers 0 -1
+    1) "Alan Turing"
+    2) "Hedy Lamarr"
+    3) "Claude Shannon"
+    4) "Alan Kay"
+    5) "Anita Borg"
+    6) "Richard Stallman"
+    7) "Sophie Wilson"
+    8) "Yukihiro Matsumoto"
+    9) "Linus Torvalds"
+
+Note: 0 and -1 mean from element index 0 to the last element (-1 works
+here just as it does in the case of the `LRANGE` command).
+
+What if I want to order them the opposite way, youngest to oldest?
+Use [`ZREVRANGE`](/commands/zrevrange) instead of [`ZRANGE`](/commands/zrange):
+
+    > zrevrange hackers 0 -1
+    1) "Linus Torvalds"
+    2) "Yukihiro Matsumoto"
+    3) "Sophie Wilson"
+    4) "Richard Stallman"
+    5) "Anita Borg"
+    6) "Alan Kay"
+    7) "Claude Shannon"
+    8) "Hedy Lamarr"
+    9) "Alan Turing"
+
+It is possible to return scores as well, using the `WITHSCORES` argument:
+
+    > zrange hackers 0 -1 withscores
+    1) "Alan Turing"
+    2) "1912"
+    3) "Hedy Lamarr"
+    4) "1914"
+    5) "Claude Shannon"
+    6) "1916"
+    7) "Alan Kay"
+    8) "1940"
+    9) "Anita Borg"
+    10) "1949"
+    11) "Richard Stallman"
+    12) "1953"
+    13) "Sophie Wilson"
+    14) "1957"
+    15) "Yukihiro Matsumoto"
+    16) "1965"
+    17) "Linus Torvalds"
+    18) "1969"
+
+### Operating on ranges
+
+Sorted sets are more powerful than this. They can operate on ranges.
+Let's get all the individuals that were born up to 1950 inclusive. We
+use the `ZRANGEBYSCORE` command to do it:
+
+    > zrangebyscore hackers -inf 1950
+    1) "Alan Turing"
+    2) "Hedy Lamarr"
+    3) "Claude Shannon"
+    4) "Alan Kay"
+    5) "Anita Borg"
+
+We asked Redis to return all the elements with a score between negative
+infinity and 1950 (both extremes are included).
+
+It's also possible to remove ranges of elements. Let's remove all
+the hackers born between 1940 and 1960 from the sorted set:
+
+    > zremrangebyscore hackers 1940 1960
+    (integer) 4
+
+`ZREMRANGEBYSCORE` is perhaps not the best command name,
+but it can be very useful, and returns the number of removed elements.
+
+Another extremely useful operation defined for sorted set elements
+is the get-rank operation. It is possible to ask what the
+position of an element is among the ordered elements.
+
+    > zrank hackers "Anita Borg"
+    (integer) 4
+
+The `ZREVRANK` command is also available in order to get the rank, considering
+the elements sorted in a descending way.
+
+### Lexicographical scores
+
+In Redis 2.8, a new feature was introduced that allows
+getting ranges lexicographically, assuming elements in a sorted set are all
+inserted with the same identical score (elements are compared with the C
+`memcmp` function, so it is guaranteed that there is no collation, and every
+Redis instance will reply with the same output).
+
+The main commands to operate with lexicographical ranges are `ZRANGEBYLEX`,
+`ZREVRANGEBYLEX`, `ZREMRANGEBYLEX` and `ZLEXCOUNT`.
+
+For example, let's add again our list of famous hackers, but this time
+use a score of zero for all the elements:
+
+    > zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0
+      "Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon"
+      0 "Linus Torvalds" 0 "Alan Turing"
+
+Because of the sorted set ordering rules, they are already sorted
+lexicographically:
+
+    > zrange hackers 0 -1
+    1) "Alan Kay"
+    2) "Alan Turing"
+    3) "Anita Borg"
+    4) "Claude Shannon"
+    5) "Hedy Lamarr"
+    6) "Linus Torvalds"
+    7) "Richard Stallman"
+    8) "Sophie Wilson"
+    9) "Yukihiro Matsumoto"
+
+Using `ZRANGEBYLEX` we can ask for lexicographical ranges:
+
+    > zrangebylex hackers [B [P
+    1) "Claude Shannon"
+    2) "Hedy Lamarr"
+    3) "Linus Torvalds"
+
+Ranges can be inclusive or exclusive (depending on the first character),
+and the positive and negative infinite strings are specified with
+the `+` and `-` strings respectively. See the documentation for more information.
+
+This feature is important because it allows us to use sorted sets as a generic
+index. For example, if you want to index elements by a 128-bit unsigned
+integer argument, all you need to do is to add elements into a sorted
+set with the same score (for example 0) but with a 16 byte prefix
+consisting of **the 128 bit number in big endian**. Since big endian numbers,
+when ordered lexicographically (in raw bytes order), are actually
+ordered numerically as well, you can ask for ranges in the 128 bit space,
+and get the element's value discarding the prefix.
+
+If you want to see the feature in the context of a more serious demo,
+check the [Redis autocomplete demo](http://autocomplete.redis.io).
+
+### Updating the score: leader boards
+
+Just a final note about sorted sets before switching to the next topic.
+Sorted sets' scores can be updated at any time. Just calling `ZADD` against
+an element already included in the sorted set will update its score
+(and position) with O(log(N)) time complexity. As such, sorted sets are suitable
+when there are tons of updates.
+
+Because of this characteristic a common use case is leader boards.
+The typical application is a Facebook game where you combine the ability to
+take users sorted by their high score, plus the get-rank operation, in order
+to show the top-N users, and the user rank in the leader board (e.g., "you are
+the #4932 best score here").
+
 ## Examples

 * Update a real-time leaderboard as players' scores change:
diff --git a/docs/data-types/strings.md b/docs/data-types/strings.md
index 77dbe16d73..00cf39d918 100644
--- a/docs/data-types/strings.md
+++ b/docs/data-types/strings.md
@@ -7,34 +7,88 @@ description: >
 ---
 
 Redis strings store sequences of bytes, including text, serialized objects, and binary arrays.
-As such, strings are the most basic Redis data type.
+As such, strings are the simplest type of value you can associate with
+a Redis key.
 They're often used for caching, but they support additional functionality that lets you implement counters and perform bitwise operations, too.
 
+Since Redis keys are strings, when we use the string type as a value too,
+we are mapping a string to another string. The string data type is useful
+for a number of use cases, like caching HTML fragments or pages.
+
+Let's play a bit with the string type, using `redis-cli` (all the examples
+will be performed via `redis-cli` in this tutorial).
+
+    > set mykey somevalue
+    OK
+    > get mykey
+    "somevalue"
+
+As you can see, using the [`SET`](/commands/set) and the [`GET`](/commands/get) commands is the way we set
+and retrieve a string value. Note that [`SET`](/commands/set) will replace any existing value
+already stored in the key, in the case that the key already exists, even if
+the key is associated with a non-string value. So [`SET`](/commands/set) performs an assignment.
+
+Values can be strings (including binary data) of every kind; for instance, you
+can store a jpeg image inside a value. A value can't be bigger than 512 MB.
+
+The [`SET`](/commands/set) command has interesting options that are provided as additional
+arguments. For example, I may ask [`SET`](/commands/set) to fail if the key already exists,
+or the opposite, that it only succeeds if the key already exists:
+
+    > set mykey newval nx
+    (nil)
+    > set mykey newval xx
+    OK
+
+There are a number of other commands for operating on strings. For example
+the [`GETSET`](/commands/getset) command sets a key to a new value, returning the old value as the
+result. You can use this command, for example, if you have a
+system that increments a Redis key using [`INCR`](/commands/incr)
+every time your web site receives a new visitor. You may want to collect this
+information once every hour, without losing a single increment.
+You can [`GETSET`](/commands/getset) the key, assigning it the new value of "0" and reading the
+old value back.
+
+The ability to set or retrieve the value of multiple keys in a single
+command is also useful for reduced latency. For this reason there are
+the [`MSET`](/commands/mset) and [`MGET`](/commands/mget) commands:
+
+    > mset a 10 b 20 c 30
+    OK
+    > mget a b c
+    1) "10"
+    2) "20"
+    3) "30"
+
+When [`MGET`](/commands/mget) is used, Redis returns an array of values.
+
+### Strings as counters
+
+Even if strings are the basic values of Redis, there are interesting operations
+you can perform with them. For instance, one is atomic increment:
+
+    > set counter 100
+    OK
+    > incr counter
+    (integer) 101
+    > incr counter
+    (integer) 102
+    > incrby counter 50
+    (integer) 152
+
+The [`INCR`](/commands/incr) command parses the string value as an integer,
+increments it by one, and finally sets the obtained value as the new value.
+There are other similar commands like [`INCRBY`](/commands/incrby),
+[`DECR`](/commands/decr) and [`DECRBY`](/commands/decrby). Internally it's
+always the same command, acting in a slightly different way.
+
+What does it mean that INCR is atomic?
+That even multiple clients issuing INCR against
+the same key will never enter into a race condition. For instance, it will never
+happen that client 1 reads "10", client 2 reads "10" at the same time, both
+increment to 11, and set the new value to 11. The final value will always be
+12, and the read-increment-set operation is performed while all the other
+clients are not executing a command at the same time.
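+
+To see this guarantee from application code, here is a minimal sketch of a
+page-view counter in Python using the `redis-py` client (the client choice and
+the `views:page:<id>` key naming are assumptions made for the example):
+
+    import redis  # assumes the redis-py package is installed
+
+    r = redis.Redis()
+
+    def record_page_view(page_id):
+        # INCR executes atomically on the server, so concurrent
+        # processes can never produce a lost update.
+        return r.incr(f"views:page:{page_id}")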
-* Store and then retrieve a string in Redis: - -``` -> SET user:1 salvatore -OK -> GET user:1 -"salvatore" -``` - -* Store a serialized JSON string and set it to expire 100 seconds from now: - -``` -> SET ticket:27 "\"{'username': 'priya', 'ticket_id': 321}\"" EX 100 -``` - -* Increment a counter: - -``` -> INCR views:page:2 -(integer) 1 -> INCRBY views:page:2 10 -(integer) 11 -``` ## Limits diff --git a/docs/data-types/tutorial.md b/docs/data-types/tutorial.md deleted file mode 100644 index 8681607964..0000000000 --- a/docs/data-types/tutorial.md +++ /dev/null @@ -1,997 +0,0 @@ ---- -title: "Redis data types tutorial" -linkTitle: "Tutorial" -description: Learning the basic Redis data types and how to use them -weight: 1 -aliases: - - /topics/data-types-intro - - /docs/manual/data-types/data-types-tutorial ---- - -The following is a hands-on tutorial that teaches the core Redis data types using the Redis CLI. For a general overview of the data types, see the [data types introduction](/docs/data-types/). - -## Keys - -Redis keys are binary safe, this means that you can use any binary sequence as a -key, from a string like "foo" to the content of a JPEG file. -The empty string is also a valid key. - -A few other rules about keys: - -* Very long keys are not a good idea. For instance a key of 1024 bytes is a bad - idea not only memory-wise, but also because the lookup of the key in the - dataset may require several costly key-comparisons. Even when the task at hand - is to match the existence of a large value, hashing it (for example - with SHA1) is a better idea, especially from the perspective of memory - and bandwidth. -* Very short keys are often not a good idea. There is little point in writing - "u1000flw" as a key if you can instead write "user:1000:followers". The latter - is more readable and the added space is minor compared to the space used by - the key object itself and the value object. While short keys will obviously - consume a bit less memory, your job is to find the right balance. -* Try to stick with a schema. For instance "object-type:id" is a good - idea, as in "user:1000". Dots or dashes are often used for multi-word - fields, as in "comment:4321:reply.to" or "comment:4321:reply-to". -* The maximum allowed key size is 512 MB. - - -## Strings - -The Redis String type is the simplest type of value you can associate with -a Redis key. It is the only data type in Memcached, so it is also very natural -for newcomers to use it in Redis. - -Since Redis keys are strings, when we use the string type as a value too, -we are mapping a string to another string. The string data type is useful -for a number of use cases, like caching HTML fragments or pages. - -Let's play a bit with the string type, using `redis-cli` (all the examples -will be performed via `redis-cli` in this tutorial). - - > set mykey somevalue - OK - > get mykey - "somevalue" - -As you can see using the `SET` and the `GET` commands are the way we set -and retrieve a string value. Note that `SET` will replace any existing value -already stored into the key, in the case that the key already exists, even if -the key is associated with a non-string value. So `SET` performs an assignment. - -Values can be strings (including binary data) of every kind, for instance you -can store a jpeg image inside a value. A value can't be bigger than 512 MB. - -The `SET` command has interesting options, that are provided as additional -arguments. 
For example, I may ask `SET` to fail if the key already exists, -or the opposite, that it only succeed if the key already exists: - - > set mykey newval nx - (nil) - > set mykey newval xx - OK - -Even if strings are the basic values of Redis, there are interesting operations -you can perform with them. For instance, one is atomic increment: - - > set counter 100 - OK - > incr counter - (integer) 101 - > incr counter - (integer) 102 - > incrby counter 50 - (integer) 152 - -The [INCR](/commands/incr) command parses the string value as an integer, -increments it by one, and finally sets the obtained value as the new value. -There are other similar commands like [INCRBY](/commands/incrby), -[DECR](/commands/decr) and [DECRBY](/commands/decrby). Internally it's -always the same command, acting in a slightly different way. - -What does it mean that INCR is atomic? -That even multiple clients issuing INCR against -the same key will never enter into a race condition. For instance, it will never -happen that client 1 reads "10", client 2 reads "10" at the same time, both -increment to 11, and set the new value to 11. The final value will always be -12 and the read-increment-set operation is performed while all the other -clients are not executing a command at the same time. - -There are a number of commands for operating on strings. For example -the `GETSET` command sets a key to a new value, returning the old value as the -result. You can use this command, for example, if you have a -system that increments a Redis key using `INCR` -every time your web site receives a new visitor. You may want to collect this -information once every hour, without losing a single increment. -You can `GETSET` the key, assigning it the new value of "0" and reading the -old value back. - -The ability to set or retrieve the value of multiple keys in a single -command is also useful for reduced latency. For this reason there are -the `MSET` and `MGET` commands: - - > mset a 10 b 20 c 30 - OK - > mget a b c - 1) "10" - 2) "20" - 3) "30" - -When `MGET` is used, Redis returns an array of values. - -## Altering and querying the key space - -There are commands that are not defined on particular types, but are useful -in order to interact with the space of keys, and thus, can be used with -keys of any type. - -For example the `EXISTS` command returns 1 or 0 to signal if a given key -exists or not in the database, while the `DEL` command deletes a key -and associated value, whatever the value is. - - > set mykey hello - OK - > exists mykey - (integer) 1 - > del mykey - (integer) 1 - > exists mykey - (integer) 0 - -From the examples you can also see how `DEL` itself returns 1 or 0 depending on whether -the key was removed (it existed) or not (there was no such key with that -name). - -There are many key space related commands, but the above two are the -essential ones together with the `TYPE` command, which returns the kind -of value stored at the specified key: - - > set mykey x - OK - > type mykey - string - > del mykey - (integer) 1 - > type mykey - none - -## Key expiration - -Before moving on, we should look at an important Redis feature that works regardless of the type of value you're storing: key expiration. Key expiration lets you set a timeout for a key, also known as a "time to live", or "TTL". When the time to live elapses, the key is automatically destroyed. - -A few important notes about key expiration: - -* They can be set both using seconds or milliseconds precision. 
-* However the expire time resolution is always 1 millisecond. -* Information about expires are replicated and persisted on disk, the time virtually passes when your Redis server remains stopped (this means that Redis saves the date at which a key will expire). - -Use the `EXPIRE` command to set a key's expiration: - - > set key some-value - OK - > expire key 5 - (integer) 1 - > get key (immediately) - "some-value" - > get key (after some time) - (nil) - -The key vanished between the two `GET` calls, since the second call was -delayed more than 5 seconds. In the example above we used `EXPIRE` in -order to set the expire (it can also be used in order to set a different -expire to a key already having one, like `PERSIST` can be used in order -to remove the expire and make the key persistent forever). However we -can also create keys with expires using other Redis commands. For example -using `SET` options: - - > set key 100 ex 10 - OK - > ttl key - (integer) 9 - -The example above sets a key with the string value `100`, having an expire -of ten seconds. Later the `TTL` command is called in order to check the -remaining time to live for the key. - -In order to set and check expires in milliseconds, check the `PEXPIRE` and -the `PTTL` commands, and the full list of `SET` options. - - -## Lists - -To explain the List data type it's better to start with a little bit of theory, -as the term *List* is often used in an improper way by information technology -folks. For instance "Python Lists" are not what the name may suggest (Linked -Lists), but rather Arrays (the same data type is called Array in -Ruby actually). - -From a very general point of view a List is just a sequence of ordered -elements: 10,20,1,2,3 is a list. But the properties of a List implemented using -an Array are very different from the properties of a List implemented using a -*Linked List*. - -Redis lists are implemented via Linked Lists. This means that even if you have -millions of elements inside a list, the operation of adding a new element in -the head or in the tail of the list is performed *in constant time*. The speed of adding a -new element with the `LPUSH` command to the head of a list with ten -elements is the same as adding an element to the head of list with 10 -million elements. - -What's the downside? Accessing an element *by index* is very fast in lists -implemented with an Array (constant time indexed access) and not so fast in -lists implemented by linked lists (where the operation requires an amount of -work proportional to the index of the accessed element). - -Redis Lists are implemented with linked lists because for a database system it -is crucial to be able to add elements to a very long list in a very fast way. -Another strong advantage, as you'll see in a moment, is that Redis Lists can be -taken at constant length in constant time. - -When fast access to the middle of a large collection of elements is important, -there is a different data structure that can be used, called sorted sets. -Sorted sets will be covered later in this tutorial. - -### First steps with Redis Lists - -The `LPUSH` command adds a new element into a list, on the -left (at the head), while the `RPUSH` command adds a new -element into a list, on the right (at the tail). 
Finally the -`LRANGE` command extracts ranges of elements from lists: - - > rpush mylist A - (integer) 1 - > rpush mylist B - (integer) 2 - > lpush mylist first - (integer) 3 - > lrange mylist 0 -1 - 1) "first" - 2) "A" - 3) "B" - -Note that [LRANGE](/commands/lrange) takes two indexes, the first and the last -element of the range to return. Both the indexes can be negative, telling Redis -to start counting from the end: so -1 is the last element, -2 is the -penultimate element of the list, and so forth. - -As you can see `RPUSH` appended the elements on the right of the list, while -the final `LPUSH` appended the element on the left. - -Both commands are *variadic commands*, meaning that you are free to push -multiple elements into a list in a single call: - - > rpush mylist 1 2 3 4 5 "foo bar" - (integer) 9 - > lrange mylist 0 -1 - 1) "first" - 2) "A" - 3) "B" - 4) "1" - 5) "2" - 6) "3" - 7) "4" - 8) "5" - 9) "foo bar" - -An important operation defined on Redis lists is the ability to *pop elements*. -Popping elements is the operation of both retrieving the element from the list, -and eliminating it from the list, at the same time. You can pop elements -from left and right, similarly to how you can push elements in both sides -of the list: - - > rpush mylist a b c - (integer) 3 - > rpop mylist - "c" - > rpop mylist - "b" - > rpop mylist - "a" - -We added three elements and popped three elements, so at the end of this -sequence of commands the list is empty and there are no more elements to -pop. If we try to pop yet another element, this is the result we get: - - > rpop mylist - (nil) - -Redis returned a NULL value to signal that there are no elements in the -list. - -### Common use cases for lists - -Lists are useful for a number of tasks, two very representative use cases -are the following: - -* Remember the latest updates posted by users into a social network. -* Communication between processes, using a consumer-producer pattern where the producer pushes items into a list, and a consumer (usually a *worker*) consumes those items and executes actions. Redis has special list commands to make this use case both more reliable and efficient. - -For example both the popular Ruby libraries [resque](https://github.com/resque/resque) and -[sidekiq](https://github.com/mperham/sidekiq) use Redis lists under the hood in order to -implement background jobs. - -The popular Twitter social network [takes the latest tweets](http://www.infoq.com/presentations/Real-Time-Delivery-Twitter) -posted by users into Redis lists. - -To describe a common use case step by step, imagine your home page shows the latest -photos published in a photo sharing social network and you want to speedup access. - -* Every time a user posts a new photo, we add its ID into a list with `LPUSH`. -* When users visit the home page, we use `LRANGE 0 9` in order to get the latest 10 posted items. - -### Capped lists - -In many use cases we just want to use lists to store the *latest items*, -whatever they are: social network updates, logs, or anything else. - -Redis allows us to use lists as a capped collection, only remembering the latest -N items and discarding all the oldest items using the `LTRIM` command. - -The `LTRIM` command is similar to `LRANGE`, but **instead of displaying the -specified range of elements** it sets this range as the new list value. All -the elements outside the given range are removed. 
- -An example will make it more clear: - - > rpush mylist 1 2 3 4 5 - (integer) 5 - > ltrim mylist 0 2 - OK - > lrange mylist 0 -1 - 1) "1" - 2) "2" - 3) "3" - -The above `LTRIM` command tells Redis to take just list elements from index -0 to 2, everything else will be discarded. This allows for a very simple but -useful pattern: doing a List push operation + a List trim operation together -in order to add a new element and discard elements exceeding a limit: - - LPUSH mylist - LTRIM mylist 0 999 - -The above combination adds a new element and takes only the 1000 -newest elements into the list. With `LRANGE` you can access the top items -without any need to remember very old data. - -Note: while `LRANGE` is technically an O(N) command, accessing small ranges -towards the head or the tail of the list is a constant time operation. - -Blocking operations on lists ---- - -Lists have a special feature that make them suitable to implement queues, -and in general as a building block for inter process communication systems: -blocking operations. - -Imagine you want to push items into a list with one process, and use -a different process in order to actually do some kind of work with those -items. This is the usual producer / consumer setup, and can be implemented -in the following simple way: - -* To push items into the list, producers call `LPUSH`. -* To extract / process items from the list, consumers call `RPOP`. - -However it is possible that sometimes the list is empty and there is nothing -to process, so `RPOP` just returns NULL. In this case a consumer is forced to wait -some time and retry again with `RPOP`. This is called *polling*, and is not -a good idea in this context because it has several drawbacks: - -1. Forces Redis and clients to process useless commands (all the requests when the list is empty will get no actual work done, they'll just return NULL). -2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to `RPOP`, with the effect of amplifying problem number 1, i.e. more useless calls to Redis. - -So Redis implements commands called `BRPOP` and `BLPOP` which are versions -of `RPOP` and `LPOP` able to block if the list is empty: they'll return to -the caller only when a new element is added to the list, or when a user-specified -timeout is reached. - -This is an example of a `BRPOP` call we could use in the worker: - - > brpop tasks 5 - 1) "tasks" - 2) "do_something" - -It means: "wait for elements in the list `tasks`, but return if after 5 seconds -no element is available". - -Note that you can use 0 as timeout to wait for elements forever, and you can -also specify multiple lists and not just one, in order to wait on multiple -lists at the same time, and get notified when the first list receives an -element. - -A few things to note about `BRPOP`: - -1. Clients are served in an ordered way: the first client that blocked waiting for a list, is served first when an element is pushed by some other client, and so forth. -2. The return value is different compared to `RPOP`: it is a two-element array since it also includes the name of the key, because `BRPOP` and `BLPOP` are able to block waiting for elements from multiple lists. -3. If the timeout is reached, NULL is returned. - -There are more things you should know about lists and blocking ops. We -suggest that you read more on the following: - -* It is possible to build safer queues or rotating queues using `LMOVE`. 
-* There is also a blocking variant of the command, called `BLMOVE`. - -## Automatic creation and removal of keys - -So far in our examples we never had to create empty lists before pushing -elements, or removing empty lists when they no longer have elements inside. -It is Redis' responsibility to delete keys when lists are left empty, or to create -an empty list if the key does not exist and we are trying to add elements -to it, for example, with `LPUSH`. - -This is not specific to lists, it applies to all the Redis data types -composed of multiple elements -- Streams, Sets, Sorted Sets and Hashes. - -Basically we can summarize the behavior with three rules: - -1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element. -2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed. The Stream data type is the only exception to this rule. -3. Calling a read-only command such as `LLEN` (which returns the length of the list), or a write command removing elements, with an empty key, always produces the same result as if the key is holding an empty aggregate type of the type the command expects to find. - -Examples of rule 1: - - > del mylist - (integer) 1 - > lpush mylist 1 2 3 - (integer) 3 - -However we can't perform operations against the wrong type if the key exists: - - > set foo bar - OK - > lpush foo 1 2 3 - (error) WRONGTYPE Operation against a key holding the wrong kind of value - > type foo - string - -Example of rule 2: - - > lpush mylist 1 2 3 - (integer) 3 - > exists mylist - (integer) 1 - > lpop mylist - "3" - > lpop mylist - "2" - > lpop mylist - "1" - > exists mylist - (integer) 0 - -The key no longer exists after all the elements are popped. - -Example of rule 3: - - > del mylist - (integer) 0 - > llen mylist - (integer) 0 - > lpop mylist - (nil) - - -## Hashes - -Redis hashes look exactly how one might expect a "hash" to look, with field-value pairs: - - > hset user:1000 username antirez birthyear 1977 verified 1 - (integer) 3 - > hget user:1000 username - "antirez" - > hget user:1000 birthyear - "1977" - > hgetall user:1000 - 1) "username" - 2) "antirez" - 3) "birthyear" - 4) "1977" - 5) "verified" - 6) "1" - -While hashes are handy to represent *objects*, actually the number of fields you can -put inside a hash has no practical limits (other than available memory), so you can use -hashes in many different ways inside your application. - -The command `HSET` sets multiple fields of the hash, while `HGET` retrieves -a single field. `HMGET` is similar to `HGET` but returns an array of values: - - > hmget user:1000 username birthyear no-such-field - 1) "antirez" - 2) "1977" - 3) (nil) - -There are commands that are able to perform operations on individual fields -as well, like `HINCRBY`: - - > hincrby user:1000 birthyear 10 - (integer) 1987 - > hincrby user:1000 birthyear 10 - (integer) 1997 - -You can find the [full list of hash commands in the documentation](https://redis.io/commands#hash). - -It is worth noting that small hashes (i.e., a few elements with small values) are -encoded in special way in memory that make them very memory efficient. - - -## Sets - -Redis Sets are unordered collections of strings. The -`SADD` command adds new elements to a set. 
It's also possible -to do a number of other operations against sets like testing if a given element -already exists, performing the intersection, union or difference between -multiple sets, and so forth. - - > sadd myset 1 2 3 - (integer) 3 - > smembers myset - 1. 3 - 2. 1 - 3. 2 - -Here I've added three elements to my set and told Redis to return all the -elements. As you can see they are not sorted -- Redis is free to return the -elements in any order at every call, since there is no contract with the -user about element ordering. - -Redis has commands to test for membership. For example, checking if an element exists: - - > sismember myset 3 - (integer) 1 - > sismember myset 30 - (integer) 0 - -"3" is a member of the set, while "30" is not. - -Sets are good for expressing relations between objects. -For instance we can easily use sets in order to implement tags. - -A simple way to model this problem is to have a set for every object we -want to tag. The set contains the IDs of the tags associated with the object. - -One illustration is tagging news articles. -If article ID 1000 is tagged with tags 1, 2, 5 and 77, a set -can associate these tag IDs with the news item: - - > sadd news:1000:tags 1 2 5 77 - (integer) 4 - -We may also want to have the inverse relation as well: the list -of all the news tagged with a given tag: - - > sadd tag:1:news 1000 - (integer) 1 - > sadd tag:2:news 1000 - (integer) 1 - > sadd tag:5:news 1000 - (integer) 1 - > sadd tag:77:news 1000 - (integer) 1 - -To get all the tags for a given object is trivial: - - > smembers news:1000:tags - 1. 5 - 2. 1 - 3. 77 - 4. 2 - -Note: in the example we assume you have another data structure, for example -a Redis hash, which maps tag IDs to tag names. - -There are other non trivial operations that are still easy to implement -using the right Redis commands. For instance we may want a list of all the -objects with the tags 1, 2, 10, and 27 together. We can do this using -the `SINTER` command, which performs the intersection between different -sets. We can use: - - > sinter tag:1:news tag:2:news tag:10:news tag:27:news - ... results here ... - -In addition to intersection you can also perform -unions, difference, extract a random element, and so forth. - -The command to extract an element is called `SPOP`, and is handy to model -certain problems. For example in order to implement a web-based poker game, -you may want to represent your deck with a set. Imagine we use a one-char -prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades: - - > sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CK - D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3 - H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6 - S7 S8 S9 S10 SJ SQ SK - (integer) 52 - -Now we want to provide each player with 5 cards. The `SPOP` command -removes a random element, returning it to the client, so it is the -perfect operation in this case. - -However if we call it against our deck directly, in the next play of the -game we'll need to populate the deck of cards again, which may not be -ideal. So to start, we can make a copy of the set stored in the `deck` key -into the `game:1:deck` key. - -This is accomplished using `SUNIONSTORE`, which normally performs the -union between multiple sets, and stores the result into another set. 
-However, since the union of a single set is itself, I can copy my deck -with: - - > sunionstore game:1:deck deck - (integer) 52 - -Now I'm ready to provide the first player with five cards: - - > spop game:1:deck - "C6" - > spop game:1:deck - "CQ" - > spop game:1:deck - "D1" - > spop game:1:deck - "CJ" - > spop game:1:deck - "SJ" - -One pair of jacks, not great... - -This is a good time to introduce the set command that provides the number -of elements inside a set. This is often called the *cardinality of a set* -in the context of set theory, so the Redis command is called `SCARD`. - - > scard game:1:deck - (integer) 47 - -The math works: 52 - 5 = 47. - -When you need to just get random elements without removing them from the -set, there is the `SRANDMEMBER` command suitable for the task. It also features -the ability to return both repeating and non-repeating elements. - - -## Sorted sets - -Sorted sets are a data type which is similar to a mix between a Set and -a Hash. Like sets, sorted sets are composed of unique, non-repeating -string elements, so in some sense a sorted set is a set as well. - -However while elements inside sets are not ordered, every element in -a sorted set is associated with a floating point value, called *the score* -(this is why the type is also similar to a hash, since every element -is mapped to a value). - -Moreover, elements in a sorted set are *taken in order* (so they are not -ordered on request, order is a peculiarity of the data structure used to -represent sorted sets). They are ordered according to the following rule: - -* If B and A are two elements with a different score, then A > B if A.score is > B.score. -* If B and A have exactly the same score, then A > B if the A string is lexicographically greater than the B string. B and A strings can't be equal since sorted sets only have unique elements. - -Let's start with a simple example, adding a few selected hackers names as -sorted set elements, with their year of birth as "score". - - > zadd hackers 1940 "Alan Kay" - (integer) 1 - > zadd hackers 1957 "Sophie Wilson" - (integer) 1 - > zadd hackers 1953 "Richard Stallman" - (integer) 1 - > zadd hackers 1949 "Anita Borg" - (integer) 1 - > zadd hackers 1965 "Yukihiro Matsumoto" - (integer) 1 - > zadd hackers 1914 "Hedy Lamarr" - (integer) 1 - > zadd hackers 1916 "Claude Shannon" - (integer) 1 - > zadd hackers 1969 "Linus Torvalds" - (integer) 1 - > zadd hackers 1912 "Alan Turing" - (integer) 1 - - -As you can see `ZADD` is similar to `SADD`, but takes one additional argument -(placed before the element to be added) which is the score. -`ZADD` is also variadic, so you are free to specify multiple score-value -pairs, even if this is not used in the example above. - -With sorted sets it is trivial to return a list of hackers sorted by their -birth year because actually *they are already sorted*. - -Implementation note: Sorted sets are implemented via a -dual-ported data structure containing both a skip list and a hash table, so -every time we add an element Redis performs an O(log(N)) operation. That's -good, but when we ask for sorted elements Redis does not have to do any work at -all, it's already all sorted: - - > zrange hackers 0 -1 - 1) "Alan Turing" - 2) "Hedy Lamarr" - 3) "Claude Shannon" - 4) "Alan Kay" - 5) "Anita Borg" - 6) "Richard Stallman" - 7) "Sophie Wilson" - 8) "Yukihiro Matsumoto" - 9) "Linus Torvalds" - -Note: 0 and -1 means from element index 0 to the last element (-1 works -here just as it does in the case of the `LRANGE` command). 
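
As a quick sketch of the variadic form mentioned above, several score-element pairs can be added in one call (the key name `hackers:sample` is made up for this illustration):

    > zadd hackers:sample 1912 "Alan Turing" 1940 "Alan Kay" 1957 "Sophie Wilson"
    (integer) 3
    > zrange hackers:sample 0 -1
    1) "Alan Turing"
    2) "Alan Kay"
    3) "Sophie Wilson"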
- -What if I want to order them the opposite way, youngest to oldest? -Use [ZREVRANGE](/commands/zrevrange) instead of [ZRANGE](/commands/zrange): - - > zrevrange hackers 0 -1 - 1) "Linus Torvalds" - 2) "Yukihiro Matsumoto" - 3) "Sophie Wilson" - 4) "Richard Stallman" - 5) "Anita Borg" - 6) "Alan Kay" - 7) "Claude Shannon" - 8) "Hedy Lamarr" - 9) "Alan Turing" - -It is possible to return scores as well, using the `WITHSCORES` argument: - - > zrange hackers 0 -1 withscores - 1) "Alan Turing" - 2) "1912" - 3) "Hedy Lamarr" - 4) "1914" - 5) "Claude Shannon" - 6) "1916" - 7) "Alan Kay" - 8) "1940" - 9) "Anita Borg" - 10) "1949" - 11) "Richard Stallman" - 12) "1953" - 13) "Sophie Wilson" - 14) "1957" - 15) "Yukihiro Matsumoto" - 16) "1965" - 17) "Linus Torvalds" - 18) "1969" - -### Operating on ranges - -Sorted sets are more powerful than this. They can operate on ranges. -Let's get all the individuals that were born up to 1950 inclusive. We -use the `ZRANGEBYSCORE` command to do it: - - > zrangebyscore hackers -inf 1950 - 1) "Alan Turing" - 2) "Hedy Lamarr" - 3) "Claude Shannon" - 4) "Alan Kay" - 5) "Anita Borg" - -We asked Redis to return all the elements with a score between negative -infinity and 1950 (both extremes are included). - -It's also possible to remove ranges of elements. Let's remove all -the hackers born between 1940 and 1960 from the sorted set: - - > zremrangebyscore hackers 1940 1960 - (integer) 4 - -`ZREMRANGEBYSCORE` is perhaps not the best command name, -but it can be very useful, and returns the number of removed elements. - -Another extremely useful operation defined for sorted set elements -is the get-rank operation. It is possible to ask what is the -position of an element in the set of the ordered elements. - - > zrank hackers "Anita Borg" - (integer) 4 - -The `ZREVRANK` command is also available in order to get the rank, considering -the elements sorted a descending way. - -### Lexicographical scores - -With recent versions of Redis 2.8, a new feature was introduced that allows -getting ranges lexicographically, assuming elements in a sorted set are all -inserted with the same identical score (elements are compared with the C -`memcmp` function, so it is guaranteed that there is no collation, and every -Redis instance will reply with the same output). - -The main commands to operate with lexicographical ranges are `ZRANGEBYLEX`, -`ZREVRANGEBYLEX`, `ZREMRANGEBYLEX` and `ZLEXCOUNT`. - -For example, let's add again our list of famous hackers, but this time -use a score of zero for all the elements: - - > zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0 - "Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon" - 0 "Linus Torvalds" 0 "Alan Turing" - -Because of the sorted sets ordering rules, they are already sorted -lexicographically: - - > zrange hackers 0 -1 - 1) "Alan Kay" - 2) "Alan Turing" - 3) "Anita Borg" - 4) "Claude Shannon" - 5) "Hedy Lamarr" - 6) "Linus Torvalds" - 7) "Richard Stallman" - 8) "Sophie Wilson" - 9) "Yukihiro Matsumoto" - -Using `ZRANGEBYLEX` we can ask for lexicographical ranges: - - > zrangebylex hackers [B [P - 1) "Claude Shannon" - 2) "Hedy Lamarr" - 3) "Linus Torvalds" - -Ranges can be inclusive or exclusive (depending on the first character), -also string infinite and minus infinite are specified respectively with -the `+` and `-` strings. See the documentation for more information. - -This feature is important because it allows us to use sorted sets as a generic -index. 
For example, if you want to index elements by a 128-bit unsigned -integer argument, all you need to do is to add elements into a sorted -set with the same score (for example 0) but with a 16 byte prefix -consisting of **the 128 bit number in big endian**. Since numbers in big -endian, when ordered lexicographically (in raw bytes order) are actually -ordered numerically as well, you can ask for ranges in the 128 bit space, -and get the element's value discarding the prefix. - -If you want to see the feature in the context of a more serious demo, -check the [Redis autocomplete demo](http://autocomplete.redis.io). - -Updating the score: leader boards ---- - -Just a final note about sorted sets before switching to the next topic. -Sorted sets' scores can be updated at any time. Just calling `ZADD` against -an element already included in the sorted set will update its score -(and position) with O(log(N)) time complexity. As such, sorted sets are suitable -when there are tons of updates. - -Because of this characteristic a common use case is leader boards. -The typical application is a Facebook game where you combine the ability to -take users sorted by their high score, plus the get-rank operation, in order -to show the top-N users, and the user rank in the leader board (e.g., "you are -the #4932 best score here"). - - -## Bitmaps - -Bitmaps are not an actual data type, but a set of bit-oriented operations -defined on the String type. Since strings are binary safe blobs and their -maximum length is 512 MB, they are suitable to set up to 2^32 different -bits. - -Bit operations are divided into two groups: constant-time single bit -operations, like setting a bit to 1 or 0, or getting its value, and -operations on groups of bits, for example counting the number of set -bits in a given range of bits (e.g., population counting). - -One of the biggest advantages of bitmaps is that they often provide -extreme space savings when storing information. For example in a system -where different users are represented by incremental user IDs, it is possible -to remember a single bit information (for example, knowing whether -a user wants to receive a newsletter) of 4 billion of users using just 512 MB of memory. - -Bits are set and retrieved using the `SETBIT` and `GETBIT` commands: - - > setbit key 10 1 - (integer) 0 - > getbit key 10 - (integer) 1 - > getbit key 11 - (integer) 0 - -The `SETBIT` command takes as its first argument the bit number, and as its second -argument the value to set the bit to, which is 1 or 0. The command -automatically enlarges the string if the addressed bit is outside the -current string length. - -`GETBIT` just returns the value of the bit at the specified index. -Out of range bits (addressing a bit that is outside the length of the string -stored into the target key) are always considered to be zero. - -There are three commands operating on group of bits: - -1. `BITOP` performs bit-wise operations between different strings. The provided operations are AND, OR, XOR and NOT. -2. `BITCOUNT` performs population counting, reporting the number of bits set to 1. -3. `BITPOS` finds the first bit having the specified value of 0 or 1. - -Both `BITPOS` and `BITCOUNT` are able to operate with byte ranges of the -string, instead of running for the whole length of the string. 
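
For instance, a minimal sketch of a range-restricted count, using a hypothetical key holding the string "foobar" (byte 1 is the character "o", which has six bits set):

    > set bitkey "foobar"
    OK
    > bitcount bitkey 1 1
    (integer) 6
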
The following -is a trivial example of `BITCOUNT` call: - - > setbit key 0 1 - (integer) 0 - > setbit key 100 1 - (integer) 0 - > bitcount key - (integer) 2 - -Common use cases for bitmaps are: - -* Real time analytics of all kinds. -* Storing space efficient but high performance boolean information associated with object IDs. - -For example imagine you want to know the longest streak of daily visits of -your web site users. You start counting days starting from zero, that is the -day you made your web site public, and set a bit with `SETBIT` every time -the user visits the web site. As a bit index you simply take the current unix -time, subtract the initial offset, and divide by the number of seconds in a day -(normally, 3600\*24). - -This way for each user you have a small string containing the visit -information for each day. With `BITCOUNT` it is possible to easily get -the number of days a given user visited the web site, while with -a few `BITPOS` calls, or simply fetching and analyzing the bitmap client-side, -it is possible to easily compute the longest streak. - -Bitmaps are trivial to split into multiple keys, for example for -the sake of sharding the data set and because in general it is better to -avoid working with huge keys. To split a bitmap across different keys -instead of setting all the bits into a key, a trivial strategy is just -to store M bits per key and obtain the key name with `bit-number/M` and -the Nth bit to address inside the key with `bit-number MOD M`. - - -## HyperLogLogs - -A HyperLogLog is a probabilistic data structure used in order to count -unique things (technically this is referred to estimating the cardinality -of a set). Usually counting unique items requires using an amount of memory -proportional to the number of items you want to count, because you need -to remember the elements you have already seen in the past in order to avoid -counting them multiple times. However there is a set of algorithms that trade -memory for precision: you end with an estimated measure with a standard error, -which in the case of the Redis implementation is less than 1%. The -magic of this algorithm is that you no longer need to use an amount of memory -proportional to the number of items counted, and instead can use a -constant amount of memory! 12k bytes in the worst case, or a lot less if your -HyperLogLog (We'll just call them HLL from now) has seen very few elements. - -HLLs in Redis, while technically a different data structure, are encoded -as a Redis string, so you can call `GET` to serialize a HLL, and `SET` -to deserialize it back to the server. - -Conceptually the HLL API is like using Sets to do the same task. You would -`SADD` every observed element into a set, and would use `SCARD` to check the -number of elements inside the set, which are unique since `SADD` will not -re-add an existing element. - -While you don't really *add items* into an HLL, because the data structure -only contains a state that does not include actual elements, the API is the -same: - -* Every time you see a new element, you add it to the count with `PFADD`. -* Every time you want to retrieve the current approximation of the unique elements *added* with `PFADD` so far, you use the `PFCOUNT`. - - > pfadd hll a b c d - (integer) 1 - > pfcount hll - (integer) 4 - -An example of use case for this data structure is counting unique queries -performed by users in a search form every day. 
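
A minimal sketch of that unique-queries pattern, with a hypothetical per-day key (the key name and query strings are illustrative only); note how the duplicate query does not inflate the count:

    > pfadd queries:2023-05-22 "redis scan" "redis ttl" "redis scan"
    (integer) 1
    > pfcount queries:2023-05-22
    (integer) 2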
- -Redis is also able to perform the union of HLLs, please check the -[full documentation](/commands#hyperloglog) for more information. - -## Other notable features - -There are other important things in the Redis API that can't be explored -in the context of this document, but are worth your attention: - -* It is possible to [iterate the key space of a large collection incrementally](/commands/scan). -* It is possible to run [Lua scripts server side](/commands/eval) to improve latency and bandwidth. -* Redis is also a [Pub-Sub server](/topics/pubsub). - -## Learn more - -This tutorial is in no way complete and has covered just the basics of the API. -Read the [command reference](/commands) to discover a lot more. - -Thanks for reading, and have fun hacking with Redis! diff --git a/docs/manual/client-side-caching.md b/docs/manual/client-side-caching.md index 61da7dd6c6..5961fe906f 100644 --- a/docs/manual/client-side-caching.md +++ b/docs/manual/client-side-caching.md @@ -1,7 +1,7 @@ --- title: "Client-side caching in Redis" linkTitle: "Client-side caching" -weight: 1 +weight: 2 description: > Server-assisted, client-side caching in Redis aliases: diff --git a/docs/manual/keyspace-notifications.md b/docs/manual/keyspace-notifications.md index 380180a771..d46a750999 100644 --- a/docs/manual/keyspace-notifications.md +++ b/docs/manual/keyspace-notifications.md @@ -1,7 +1,7 @@ --- title: "Redis keyspace notifications" linkTitle: "Keyspace notifications" -weight: 3 +weight: 4 description: > Monitor changes to Redis keys and values in real time aliases: diff --git a/docs/manual/pubsub.md b/docs/manual/pubsub.md index ada1dae322..f3c58299bb 100644 --- a/docs/manual/pubsub.md +++ b/docs/manual/pubsub.md @@ -1,7 +1,7 @@ --- title: Redis Pub/Sub linkTitle: "Pub/sub" -weight: 4 +weight: 5 description: How to use pub/sub channels in Redis aliases: - /topics/pubsub diff --git a/docs/manual/the-redis-keyspace.md b/docs/manual/the-redis-keyspace.md new file mode 100644 index 0000000000..2b80f9c464 --- /dev/null +++ b/docs/manual/the-redis-keyspace.md @@ -0,0 +1,324 @@ +--- +title: "The Redis keyspace" +linkTitle: "The Redis Keyspace" +weight: 1 +description: > + Managing keys in Redis: Key expiration, scanning, altering and querying the key space +--- + +Redis keys are binary safe; this means that you can use any binary sequence as a +key, from a string like "foo" to the content of a JPEG file. +The empty string is also a valid key. + +A few other rules about keys: + +* Very long keys are not a good idea. For instance a key of 1024 bytes is a bad + idea not only memory-wise, but also because the lookup of the key in the + dataset may require several costly key-comparisons. Even when the task at hand + is to match the existence of a large value, hashing it (for example + with SHA1) is a better idea, especially from the perspective of memory + and bandwidth. +* Very short keys are often not a good idea. There is little point in writing + "u1000flw" as a key if you can instead write "user:1000:followers". The latter + is more readable and the added space is minor compared to the space used by + the key object itself and the value object. While short keys will obviously + consume a bit less memory, your job is to find the right balance. +* Try to stick with a schema. For instance "object-type:id" is a good + idea, as in "user:1000". Dots or dashes are often used for multi-word + fields, as in "comment:4321:reply.to" or "comment:4321:reply-to". +* The maximum allowed key size is 512 MB. 
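
As a small illustration of the "object-type:id" schema described above (the key name here is invented for the example):

    > set user:1000:followers 52
    OK
    > incr user:1000:followers
    (integer) 53
    > type user:1000:followers
    string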
+ +## Altering and querying the key space + +There are commands that are not defined on particular types, but are useful +in order to interact with the space of keys, and thus, can be used with +keys of any type. + +For example the `EXISTS` command returns 1 or 0 to signal if a given key +exists or not in the database, while the `DEL` command deletes a key +and associated value, whatever the value is. + + > set mykey hello + OK + > exists mykey + (integer) 1 + > del mykey + (integer) 1 + > exists mykey + (integer) 0 + +From the examples you can also see how `DEL` itself returns 1 or 0 depending on whether +the key was removed (it existed) or not (there was no such key with that +name). + +There are many key space related commands, but the above two are the +essential ones together with the `TYPE` command, which returns the kind +of value stored at the specified key: + + > set mykey x + OK + > type mykey + string + > del mykey + (integer) 1 + > type mykey + none + +## Key expiration + +Before moving on, we should look at an important Redis feature that works regardless of the type of value you're storing: key expiration. Key expiration lets you set a timeout for a key, also known as a "time to live", or "TTL". When the time to live elapses, the key is automatically destroyed. + +A few important notes about key expiration: + +* They can be set both using seconds or milliseconds precision. +* However the expire time resolution is always 1 millisecond. +* Information about expires are replicated and persisted on disk, the time virtually passes when your Redis server remains stopped (this means that Redis saves the date at which a key will expire). + +Use the `EXPIRE` command to set a key's expiration: + + > set key some-value + OK + > expire key 5 + (integer) 1 + > get key (immediately) + "some-value" + > get key (after some time) + (nil) + +The key vanished between the two `GET` calls, since the second call was +delayed more than 5 seconds. In the example above we used `EXPIRE` in +order to set the expire (it can also be used in order to set a different +expire to a key already having one, like `PERSIST` can be used in order +to remove the expire and make the key persistent forever). However we +can also create keys with expires using other Redis commands. For example +using `SET` options: + + > set key 100 ex 10 + OK + > ttl key + (integer) 9 + +The example above sets a key with the string value `100`, having an expire +of ten seconds. Later the `TTL` command is called in order to check the +remaining time to live for the key. + +In order to set and check expires in milliseconds, check the `PEXPIRE` and +the `PTTL` commands, and the full list of `SET` options. + +## Navigating the keyspace + +### Scan +To incrementally iterate over the keys in a Redis database in an efficient manner, you can use the `SCAN` command. + +Since `SCAN` allows for incremental iteration, returning only a small number of elements per call, it can be used in production without the downside of commands like `KEYS` or `SMEMBERS` that may block the server for a long time (even several seconds) when called against big collections of keys or elements. + +However while blocking commands like `SMEMBERS` are able to provide all the elements that are part of a Set in a given moment, The SCAN family of commands only offer limited guarantees about the returned elements since the collection that we incrementally iterate can change during the iteration process. + +#### SCAN basic usage + +SCAN is a cursor based iterator. 
This means that at every call of the command, the server returns an updated cursor that the user needs to use as the cursor argument in the next call.

An iteration starts when the cursor is set to 0, and terminates when the cursor returned by the server is 0. The following is an example of SCAN iteration:

```
redis 127.0.0.1:6379> scan 0
1) "17"
2) 1) "key:12"
   2) "key:8"
   3) "key:4"
   4) "key:14"
   5) "key:16"
   6) "key:17"
   7) "key:15"
   8) "key:10"
   9) "key:3"
   10) "key:7"
   11) "key:1"
redis 127.0.0.1:6379> scan 17
1) "0"
2) 1) "key:5"
   2) "key:18"
   3) "key:0"
   4) "key:2"
   5) "key:19"
   6) "key:13"
   7) "key:6"
   8) "key:9"
   9) "key:11"
```

In the example above, the first call uses zero as a cursor to start the iteration. The second call uses the cursor returned by the previous call as the first element of the reply, that is, 17.

As you can see, the **SCAN return value** is an array of two values: the first value is the new cursor to use in the next call, and the second value is an array of elements.

Since in the second call the returned cursor is 0, the server signaled to the caller that the iteration finished and the collection was completely explored. Starting an iteration with a cursor value of 0, and calling `SCAN` until the returned cursor is 0 again, is called a **full iteration**.

#### Scan guarantees

The `SCAN` command, and the other commands in the `SCAN` family, are able to provide to the user a set of guarantees associated with full iterations.

* A full iteration always retrieves all the elements that were present in the collection from the start to the end of a full iteration. This means that if a given element is inside the collection when an iteration is started, and is still there when an iteration terminates, then at some point `SCAN` returned it to the user.
* A full iteration never returns any element that was NOT present in the collection from the start to the end of a full iteration. So if an element was removed before the start of an iteration, and is never added back to the collection for all the time an iteration lasts, `SCAN` ensures that this element will never be returned.

However, because `SCAN` has very little associated state (just the cursor), it has the following drawbacks:

* A given element may be returned multiple times. It is up to the application to handle the case of duplicated elements, for example only using the returned elements in order to perform operations that are safe when re-applied multiple times.
* Elements that were not constantly present in the collection during a full iteration may be returned or not: it is undefined.

#### Number of elements returned at every SCAN call

`SCAN` family functions do not guarantee that the number of elements returned per call is in a given range. The commands are also allowed to return zero elements, and the client should not consider the iteration complete as long as the returned cursor is not zero.

However, the number of returned elements is reasonable: in practical terms, SCAN may return a maximum number of elements in the order of a few tens of elements when iterating a large collection, or may return all the elements of the collection in a single call when the iterated collection is small enough to be internally represented as an encoded data structure (this happens for small sets, hashes and sorted sets).
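
To make the zero-elements case above concrete, here is a hypothetical mid-iteration reply (the cursor values are made up for the illustration). An empty batch together with a non-zero cursor simply means the iteration must continue:

```
redis 127.0.0.1:6379> scan 176
1) "80"
2) (empty list or set)
```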
However, there is a way for the user to tune the order of magnitude of the number of returned elements per call using the **COUNT** option.

#### The COUNT option

While `SCAN` does not provide guarantees about the number of elements returned at every iteration, it is possible to empirically adjust the behavior of `SCAN` using the **COUNT** option. Basically, with COUNT the user specifies the *amount of work that should be done at every call in order to retrieve elements from the collection*. This is **just a hint** for the implementation; however, generally speaking, this is what you can expect most of the time from the implementation.

* The default COUNT value is 10.
* When iterating the key space, or a Set, Hash or Sorted Set that is big enough to be represented by a hash table, assuming no **MATCH** option is used, the server will usually return *count* or a bit more than *count* elements per call. Please check the *why SCAN may return all the elements at once* section later in this document.
* When iterating Sets encoded as intsets (small sets composed of just integers), or Hashes and Sorted Sets encoded as ziplists (small hashes and sorted sets composed of small individual values), usually all the elements are returned in the first `SCAN` call regardless of the COUNT value.

Important: **there is no need to use the same COUNT value** for every iteration. The caller is free to change the count from one iteration to the other as required, as long as the cursor passed in the next call is the one obtained in the previous call to the command.

#### The MATCH option

It is possible to only iterate elements matching a given glob-style pattern, similarly to the behavior of the `KEYS` command that takes a pattern as its only argument.

To do so, just append the `MATCH <pattern>` arguments at the end of the `SCAN` command (it works with all the SCAN family commands).

This is an example of iteration using **MATCH**:

```
redis 127.0.0.1:6379> sadd myset 1 2 3 foo foobar feelsgood
(integer) 6
redis 127.0.0.1:6379> sscan myset 0 match f*
1) "0"
2) 1) "foo"
   2) "feelsgood"
   3) "foobar"
redis 127.0.0.1:6379>
```

It is important to note that the **MATCH** filter is applied after elements are retrieved from the collection, just before returning data to the client. This means that if the pattern matches very few elements inside the collection, `SCAN` will likely return no elements in most iterations. An example is shown below:

```
redis 127.0.0.1:6379> scan 0 MATCH *11*
1) "288"
2) 1) "key:911"
redis 127.0.0.1:6379> scan 288 MATCH *11*
1) "224"
2) (empty list or set)
redis 127.0.0.1:6379> scan 224 MATCH *11*
1) "80"
2) (empty list or set)
redis 127.0.0.1:6379> scan 80 MATCH *11*
1) "176"
2) (empty list or set)
redis 127.0.0.1:6379> scan 176 MATCH *11* COUNT 1000
1) "0"
2) 1) "key:611"
   2) "key:711"
   3) "key:118"
   4) "key:117"
   5) "key:311"
   6) "key:112"
   7) "key:111"
   8) "key:110"
   9) "key:113"
   10) "key:211"
   11) "key:411"
   12) "key:115"
   13) "key:116"
   14) "key:114"
   15) "key:119"
   16) "key:811"
   17) "key:511"
   18) "key:11"
redis 127.0.0.1:6379>
```

As you can see, most of the calls returned zero elements, but in the last call a COUNT of 1000 was used in order to force the command to do more scanning for that iteration.


#### The TYPE option

You can use the `!TYPE` option to ask `SCAN` to only return objects that match a given `type`, allowing you to iterate through the database looking for keys of a specific type. The **TYPE** option is only available on the whole-database `SCAN`, not `HSCAN`, `ZSCAN`, etc.

The `type` argument is the same string name that the `TYPE` command returns. Note a quirk: some Redis types, such as GeoHashes, HyperLogLogs, Bitmaps, and Bitfields, may internally be implemented using other Redis types, such as a string or a zset, so they can't be distinguished from other keys of that same type by `SCAN`. For example, a ZSET and a GEOHASH:

```
redis 127.0.0.1:6379> GEOADD geokey 0 0 value
(integer) 1
redis 127.0.0.1:6379> ZADD zkey 1000 value
(integer) 1
redis 127.0.0.1:6379> TYPE geokey
zset
redis 127.0.0.1:6379> TYPE zkey
zset
redis 127.0.0.1:6379> SCAN 0 TYPE zset
1) "0"
2) 1) "geokey"
   2) "zkey"
```

It is important to note that the **TYPE** filter is also applied after elements are retrieved from the database, so the option does not reduce the amount of work the server has to do to complete a full iteration, and for rare types you may receive no elements in many iterations.

#### Multiple parallel iterations

It is possible for an infinite number of clients to iterate the same collection at the same time, as the full state of the iterator is in the cursor, which is obtained and returned to the client at every call. No server-side state is kept at all.

#### Terminating iterations in the middle

Since there is no server-side state and the full state is captured by the cursor, the caller is free to terminate an iteration half-way without signaling this to the server in any way. An infinite number of iterations can be started and never terminated without any issue.

#### Calling SCAN with a corrupted cursor

Calling `SCAN` with a broken, negative, out-of-range, or otherwise invalid cursor will result in undefined behavior but never in a crash. What will be undefined is that the guarantees about the returned elements can no longer be ensured by the `SCAN` implementation.

The only valid cursors to use are:

* The cursor value of 0 when starting an iteration.
* The cursor returned by the previous call to SCAN in order to continue the iteration.

#### Guarantee of termination

The `SCAN` algorithm is guaranteed to terminate only if the size of the iterated collection remains bounded to a given maximum size; otherwise, iterating a collection that always grows may result in `SCAN` never terminating a full iteration.

This is easy to see intuitively: if the collection grows, there is more and more work to do in order to visit all the possible elements, and the ability to terminate the iteration depends on the number of calls to `SCAN` and its COUNT option value compared with the rate at which the collection grows.

#### Why SCAN may return all the items of an aggregate data type in a single call

In the `COUNT` option documentation, we state that sometimes this family of commands may return all the elements of a Set, Hash or Sorted Set at once in a single call, regardless of the `COUNT` option value. The reason why this happens is that the cursor-based iterator can be implemented, and is useful, only when the aggregate data type that we are scanning is represented as a hash table. However Redis uses a [memory optimization](/topics/memory-optimization) where small aggregate data types, until they reach a given amount of items or a given max size of single elements, are represented using a compact single-allocation packed encoding.
When this is the case, `SCAN` has no meaningful cursor to return, and must iterate the whole data structure at once, so the only sane behavior it has is to return everything in a call. + +However once the data structures are bigger and are promoted to use real hash tables, the `SCAN` family of commands will resort to the normal behavior. Note that since this special behavior of returning all the elements is true only for small aggregates, it has no effects on the command complexity or latency. However the exact limits to get converted into real hash tables are [user configurable](/topics/memory-optimization), so the maximum number of elements you can see returned in a single call depends on how big an aggregate data type could be and still use the packed representation. + +Also note that this behavior is specific of `SSCAN`, `HSCAN` and `ZSCAN`. `SCAN` itself never shows this behavior because the key space is always represented by hash tables. + +### Keys + +Another way to iterate over the keyspace is to use the `KEYS` command, but this approach should be used with care, since `KEYS` will block the Redis server until all keys are returned. + +**Warning**: consider `KEYS` as a command that should only be used in production +environments with extreme care. + +`KEYS` may ruin performance when it is executed against large databases. +This command is intended for debugging and special operations, such as changing +your keyspace layout. +Don't use `KEYS` in your regular application code. +If you're looking for a way to find keys in a subset of your keyspace, consider +using `SCAN` or [sets][tdts]. + +[tdts]: /topics/data-types#sets + +Supported glob-style patterns: + +* `h?llo` matches `hello`, `hallo` and `hxllo` +* `h*llo` matches `hllo` and `heeeello` +* `h[ae]llo` matches `hello` and `hallo,` but not `hillo` +* `h[^e]llo` matches `hallo`, `hbllo`, ... but not `hello` +* `h[a-b]llo` matches `hallo` and `hbllo` + +Use `\` to escape special characters if you want to match them verbatim. diff --git a/docs/manual/transactions.md b/docs/manual/transactions.md index b75a41d838..1c25331630 100644 --- a/docs/manual/transactions.md +++ b/docs/manual/transactions.md @@ -1,7 +1,7 @@ --- title: Transactions linkTitle: Transactions -weight: 5 +weight: 6 description: How transactions work in Redis aliases: - /topics/transactions From 76661e0e1314774b6774ce0c5e2f3bb8b81b3b65 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 26 May 2023 15:57:32 +0100 Subject: [PATCH 02/23] Adds a page for the JSON data type --- docs/data-types/json.md | 201 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 201 insertions(+) create mode 100644 docs/data-types/json.md diff --git a/docs/data-types/json.md b/docs/data-types/json.md new file mode 100644 index 0000000000..d4d3467f82 --- /dev/null +++ b/docs/data-types/json.md @@ -0,0 +1,201 @@ +--- +title: "JSON" +linkTitle: "JSON" +weight: 30 +description: > + Introduction to RedisJSON +stack: true +--- + +JSON (JavaScript Object Notation) is an open standard file and data interchange format that uses human-readable text to store and transmit data objects. + +```json +{ + "model": "Hillcraft", + "price": 1200, + "type": ["Mountain Bikes", "Kids"], + "specs": { + "material": "carbon", + "weight": "11" + } +} +``` + +In the example JSON above we can see that the data object consists of **attribute–value pairs** (ex. `"model": "Hillcraft"`), arrays (`"type": ["Mountain Bikes", "Kids"]`) or other serialisable values. 
The attributes (`model`, `price`, `type` and `specs`) are always strings, while the values can be strings, numbers, booleans, null, objects or arrays. String, number, null or boolean values are also called **JSON scalars**.


Redis Stack implements JSON as a native data type. It allows storing, updating and fetching JSON values from Redis keys (documents), which makes it a perfect fit for a document store.

Primary features include:

- Full support of the JSON standard
- JSONPath syntax for selecting elements inside documents
- Documents are stored as binary data in a tree structure, allowing fast access to sub-elements
- Typed atomic operations for all JSON value types

The two main benefits of native JSON over using strings or hashes for storing JSON are:
- **Access and retrieval of subvalues**: You can get nested values without having to pull the whole object out of memory, take it to the application layer, deserialize it, and serve the value you need. The overhead of this process is especially prominent for large JSON objects.
- **Atomic partial updates**: JSON allows you to atomically run operations like incrementing a value, adding or removing elements from an array, appending to strings, and so on. To do the same with a serialised object you would have to pull the value out and then write the new value back, which is not atomic.


A JSONPath expression begins with the dollar sign (`$`) character, which refers to the root element of a query. The dollar sign is followed by a sequence of child elements, which are separated using dot (`.`) notation.


Some important JSONPath syntax rules are:

|JSONPath|Description|
|---|---|
|`$` | the root object or element.|
|`@` | current object or element.|
|`.` | child operator, used to denote a child element of the current element.|
|`..` | recursive scan.|
|`*` | wildcard, returning all objects or elements regardless of their names.|
|`[]` | subscript operator / array operator.|
|`,` | union operator, returns the union of the children or indexes indicated.|
|`:` | array slice operator; you can slice arrays using the syntax `[start:end:step]`.|
|`()` | lets you pass a script expression in the underlying implementation’s script language. It’s not supported by every implementation of JSONPath, however.|
|`?()` | applies a filter/script expression to query all items that meet certain criteria.|

## Examples

In the rest of this tutorial we'll work with the following example JSON document:

```
{
  "id": "B085LVV8R7",
  "name": "Hillcraft",
  "price": 1200,
  "type": ["Mountain Bikes", "Kids"],
  "specs": {
    "material": "carbon",
    "weight": "11"
  },
  "stock": [
    {
      "available_items": 573,
      "location_id": "storeYUC89",
      "location": "-9.149229, 38.731795",
      "name": "Warehouse 1"
    },
    {
      "available_items": 110,
      "location_id": "storeBZP22",
      "location": "2.173404, 41.385063",
      "name": "Warehouse 2"
    },
    {
      "available_items": 71,
      "location_id": "storePWB554",
      "location": "12.496365, 41.902782",
      "name": "Warehouse 3"
    }
  ]
}
```


* Save a JSON document:

```
> JSON.SET bike:1 . 
'{"id":"B085LVV8R7","name":"Hillcraft","price":1200,"type":["Mountain Bikes","Kids"],"specs":{"material":"carbon","weight":"11"},"stock":[{"available_items":573,"location_id":"storeYUC89","location":"-9.149229, 38.731795","name":"Warehouse 1"},{"available_items":110,"location_id":"storeBZP22","location":"2.173404, 41.385063","name":"Warehouse 2"},{"available_items":71,"location_id":"storePWB554","location":"12.496365, 41.902782","name":"Warehouse 3"}]}' + +"OK" +``` + +* Read the whole document: +``` +> JSON.GET bike:1 $ + +"[{\"id\":\"B085LVV8R7\",\"name\":\"Hillcraft\",\"price\":1200,\"type\":[\"Mountain Bikes\",\"Kids\"],\"specs\":{\"material\":\"carbon\",\"weight\":\"11\"},\"stock\":[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]}]" +``` + +#### Get first-level elements (`$.attribute`) + +To get a first-level element, you use the `$.` operator: + +For example, to get the model of the bike: +``` +> JSON.GET bike:1 $.id + +"[\"B085LVV8R7\"]" +``` + +#### Get nested properties (tree traversal) (`$.parent.attribute`) + +Get nested properties by following the JSON nested structure: + +``` +> JSON.GET bike:1 $.specs.material + +"[\"carbon\"]" +``` + +#### Get all values for an element (`$..attribute`) + +You can get an array of all values for an element with a certain name with the `$..` notation. In our example JSON object the attribute `name` appears twice, once at top level and once in the `stock` array. With the `$..` operator we can get all of those properties in an array: + +``` +> JSON.GET bike:1 $..name + +"[\"Hillcraft\",\"Warehouse 1\",\"Warehouse 2\",\"Warehouse 3\"]" +``` + +#### Working with arrays + +###### Get the whole array +``` +> JSON.GET bike:1 $.stock + +"[[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]]" +``` + +###### Get the first element of an array +``` +> JSON.GET bike:1 $.stock[0] + +"[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"}]" +``` + +###### Get the last element of an array +``` +> JSON.GET bike:1 $.stock[-1] + +"[{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" +``` + +###### Get an element at a specific position +``` +> JSON.GET bike:1 $.stock[1] + +"[{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"}]" +``` + +###### Get multiple elements at specific positions +``` +> JSON.GET bike:1 $.stock[0,2] + +"[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" +``` + +###### Get elements in a range + +Get elements of the `stock` array, starting at position 1 and ending at position 3(exclusive): + +``` +> JSON.GET bike:1 $.stock[1:3] + 
+"[{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" +``` + + +See the [complete list of JSON commands](https://redis.io/commands/?group=json). + + +## Limits + +A JSON value passed to a command can have a depth of up to 128. If you pass to a command a JSON value that contains an object or an array with a nesting level of more than 128, the command returns an error. + + +## Learn more + +TODO \ No newline at end of file From 664863b8534fdb168de9fa28fbb8ccdbcf842ca6 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Mon, 29 May 2023 12:00:54 +0100 Subject: [PATCH 03/23] Apply suggestions from code review Co-authored-by: Anurag Bandyopadhyay --- docs/data-types/lists.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/data-types/lists.md b/docs/data-types/lists.md index 5594209344..43832b96c4 100644 --- a/docs/data-types/lists.md +++ b/docs/data-types/lists.md @@ -156,7 +156,7 @@ An example will make it more clear: 2) "2" 3) "3" -The above [`LTRIM`](/commands/ltrim) command tells Redis to take just list elements from index +The above [`LTRIM`](/commands/ltrim) command tells Redis to keep just list elements from index 0 to 2, everything else will be discarded. This allows for a very simple but useful pattern: doing a List push operation + a List trim operation together in order to add a new element and discard elements exceeding a limit: @@ -164,7 +164,7 @@ in order to add a new element and discard elements exceeding a limit: LPUSH mylist LTRIM mylist 0 999 -The above combination adds a new element and takes only the 1000 +The above combination adds a new element and keeps only the 1000 newest elements into the list. With [`LRANGE`](/commands/lrange) you can access the top items without any need to remember very old data. From bdd76970a01e0eac7ff7354c16002b65df7e5095 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Mon, 29 May 2023 12:43:09 +0100 Subject: [PATCH 04/23] Apply suggestions from code review --- docs/data-types/hyperloglogs.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/data-types/hyperloglogs.md b/docs/data-types/hyperloglogs.md index 5a22dcdc0f..79adc4b830 100644 --- a/docs/data-types/hyperloglogs.md +++ b/docs/data-types/hyperloglogs.md @@ -13,8 +13,8 @@ The Redis HyperLogLog implementation uses up to 12 KB and provides a standard er Counting unique items usually requires an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid -counting them multiple times. However there is a set of algorithms that trade -memory for precision: you end with an estimated measure with a standard error, +counting them multiple times. However, a set of algorithms exist that trade +memory for precision: they return an estimated measure with a standard error, which, in the case of the Redis implementation for HyperLogLog, is less than 1%. which in the case of the Redis implementation is less than 1%. 
The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a From 4eb3bddc3d86ddde2ae218b52c8f73dc9c526ac4 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Mon, 29 May 2023 12:44:07 +0100 Subject: [PATCH 05/23] Apply suggestions from code review --- docs/data-types/hyperloglogs.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/data-types/hyperloglogs.md b/docs/data-types/hyperloglogs.md index 79adc4b830..85a216c3aa 100644 --- a/docs/data-types/hyperloglogs.md +++ b/docs/data-types/hyperloglogs.md @@ -14,8 +14,8 @@ Counting unique items usually requires an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However, a set of algorithms exist that trade -memory for precision: they return an estimated measure with a standard error, which, in the case of the Redis implementation for HyperLogLog, is less than 1%. -which in the case of the Redis implementation is less than 1%. The +memory for precision: they return an estimated measure with a standard error, +which, in the case of the Redis implementation for HyperLogLog, is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your From 1ea7535259d034e2f0d503bc78c4315f1bd01ab7 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Tue, 30 May 2023 09:03:11 +0100 Subject: [PATCH 06/23] Remove the single json page --- docs/data-types/json.md | 201 ---------------------------------------- 1 file changed, 201 deletions(-) delete mode 100644 docs/data-types/json.md diff --git a/docs/data-types/json.md b/docs/data-types/json.md deleted file mode 100644 index d4d3467f82..0000000000 --- a/docs/data-types/json.md +++ /dev/null @@ -1,201 +0,0 @@ ---- -title: "JSON" -linkTitle: "JSON" -weight: 30 -description: > - Introduction to RedisJSON -stack: true ---- - -JSON (JavaScript Object Notation) is an open standard file and data interchange format that uses human-readable text to store and transmit data objects. - -```json -{ - "model": "Hillcraft", - "price": 1200, - "type": ["Mountain Bikes", "Kids"], - "specs": { - "material": "carbon", - "weight": "11" - } -} -``` - -In the example JSON above we can see that the data object consists of **attribute–value pairs** (ex. `"model": "Hillcraft"`), arrays (`"type": ["Mountain Bikes", "Kids"]`) or other serialisable values. The attributes (`model`, `price`, `type` and `specs`) are always strings, while the values can be strings, numbers, booleans, null, objects or arrays. String, number, null or boolean values are also called **JSON scalars**. - - -Redis Stack implements JSON as a native data type. It allows storing, updating and fetching JSON values from Redis keys (documents) which makes it a perfect fit for a document store. 
- -Primary features include: - -- Full support of the JSON standard -- JSONPath syntax for selecting elements inside documents -- Documents are stored as binary data in a tree structure, allowing fast access to sub-elements -- Typed atomic operations for all JSON values types - -The two main benefits of native JSON over using strings or hashes for storing JSON are: -- **Access and retrieval of subvalues**: You can get nested values without having to pull the whole object out of memory, take it to the application layer, deserialize it, and serve the value you need. The overhead of this process is especially prominent for large JSON objects. -- **Atomic partial updates**: JSON allows you to atomically run operations like incrementing a value, adding, or removing elements from an array, append strings and so on. To do the same with a serialised object you would have to pull the value out and then write the new value back, which is not atomic. - - -A JSONPath expression begins with the dollar sign (`$`) character, which refers to the root element of a query. The dollar sign is followed by a sequence of child elements, which are separated via dot (code) notation - - -Some important JSONPath syntax rules are: - -|JSONPath|Description| -|---|---| -|`$` | the root object or element.| -|`@` | current object or element.| -|`.` | child operator, used to denote a child element of the current element.| -|`..` | recursive scan.| -|`*` | wildcard, returning all objects or elements regardless of their names.| -|`[]` | subscript operator / array operator| -|`,` | union operator, returns the union of the children or indexes indicated.| -|`:` | array slice operator; you can slice arrays using the syntax `[start:end:step]`.| -|`()` | lets you pass a script expression in the underlying implementation’s script language. It’s not supported by every implementation of |JSONPath, however. -|`?()` | applies a filter/script expression to query all items that meet certain criteria.| - -## Examples - -In the rest of this tutorial we'll work with the following example JSON document: - -``` -{ - "id": "B085LVV8R7", - "name": "Hillcraft", - "price": 1200, - "type": ["Mountain Bikes", "Kids"], - "specs": { - "material": "carbon", - "weight": "11" - }, - "stock": [ - { - "available_items": 573, - "location_id": "storeYUC89", - "location": "-9.149229, 38.731795", - "name": "Warehouse 1" - }, - { - "available_items": 110, - "location_id": "storeBZP22", - "location": "2.173404, 41.385063", - "name": "Warehouse 2" - }, - { - "available_items": 71, - "location_id": "storePWB554", - "location": "12.496365, 41.902782", - "name": "Warehouse 3" - } - ] -} -``` - - -* Save a JSON document: - -``` -> JSON.SET bike:1 . 
'{"id":"B085LVV8R7","name":"Hillcraft","price":1200,"type":["Mountain Bikes","Kids"],"specs":{"material":"carbon","weight":"11"},"stock":[{"available_items":573,"location_id":"storeYUC89","location":"-9.149229, 38.731795","name":"Warehouse 1"},{"available_items":110,"location_id":"storeBZP22","location":"2.173404, 41.385063","name":"Warehouse 2"},{"available_items":71,"location_id":"storePWB554","location":"12.496365, 41.902782","name":"Warehouse 3"}]}' - -"OK" -``` - -* Read the whole document: -``` -> JSON.GET bike:1 $ - -"[{\"id\":\"B085LVV8R7\",\"name\":\"Hillcraft\",\"price\":1200,\"type\":[\"Mountain Bikes\",\"Kids\"],\"specs\":{\"material\":\"carbon\",\"weight\":\"11\"},\"stock\":[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]}]" -``` - -#### Get first-level elements (`$.attribute`) - -To get a first-level element, you use the `$.` operator: - -For example, to get the model of the bike: -``` -> JSON.GET bike:1 $.id - -"[\"B085LVV8R7\"]" -``` - -#### Get nested properties (tree traversal) (`$.parent.attribute`) - -Get nested properties by following the JSON nested structure: - -``` -> JSON.GET bike:1 $.specs.material - -"[\"carbon\"]" -``` - -#### Get all values for an element (`$..attribute`) - -You can get an array of all values for an element with a certain name with the `$..` notation. In our example JSON object the attribute `name` appears twice, once at top level and once in the `stock` array. With the `$..` operator we can get all of those properties in an array: - -``` -> JSON.GET bike:1 $..name - -"[\"Hillcraft\",\"Warehouse 1\",\"Warehouse 2\",\"Warehouse 3\"]" -``` - -#### Working with arrays - -###### Get the whole array -``` -> JSON.GET bike:1 $.stock - -"[[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]]" -``` - -###### Get the first element of an array -``` -> JSON.GET bike:1 $.stock[0] - -"[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"}]" -``` - -###### Get the last element of an array -``` -> JSON.GET bike:1 $.stock[-1] - -"[{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" -``` - -###### Get an element at a specific position -``` -> JSON.GET bike:1 $.stock[1] - -"[{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"}]" -``` - -###### Get multiple elements at specific positions -``` -> JSON.GET bike:1 $.stock[0,2] - -"[{\"available_items\":573,\"location_id\":\"storeYUC89\",\"location\":\"-9.149229, 38.731795\",\"name\":\"Warehouse 1\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" -``` - -###### Get elements in a range - -Get elements of the `stock` array, starting at position 1 and ending at position 3(exclusive): - -``` -> JSON.GET bike:1 $.stock[1:3] - 
-"[{\"available_items\":110,\"location_id\":\"storeBZP22\",\"location\":\"2.173404, 41.385063\",\"name\":\"Warehouse 2\"},{\"available_items\":71,\"location_id\":\"storePWB554\",\"location\":\"12.496365, 41.902782\",\"name\":\"Warehouse 3\"}]" -``` - - -See the [complete list of JSON commands](https://redis.io/commands/?group=json). - - -## Limits - -A JSON value passed to a command can have a depth of up to 128. If you pass to a command a JSON value that contains an object or an array with a nesting level of more than 128, the command returns an error. - - -## Learn more - -TODO \ No newline at end of file From 7a5b958889895364ac455ff101ae0cf1c6e12d51 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Wed, 31 May 2023 16:47:05 +0100 Subject: [PATCH 07/23] =?UTF-8?q?Cleans=20up=20the=20data=20types=20pages.?= =?UTF-8?q?=20Merges=20the=20two=20streams=20pages.=20Moves=20HLL=20under?= =?UTF-8?q?=20=E2=80=9CProbabilistic=E2=80=9D?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/data-types/bitfields.md | 11 +- docs/data-types/bitmaps.md | 66 +- docs/data-types/geospatial.md | 15 +- docs/data-types/hashes.md | 16 +- docs/data-types/lists.md | 155 ++-- .../{ => probabilistic}/hyperloglogs.md | 18 +- docs/data-types/sets.md | 90 +- docs/data-types/streams-tutorial.md | 794 ----------------- docs/data-types/streams.md | 804 +++++++++++++++++- 9 files changed, 987 insertions(+), 982 deletions(-) rename docs/data-types/{ => probabilistic}/hyperloglogs.md (85%) delete mode 100644 docs/data-types/streams-tutorial.md diff --git a/docs/data-types/bitfields.md b/docs/data-types/bitfields.md index 0643aeea71..fb74fc5d1f 100644 --- a/docs/data-types/bitfields.md +++ b/docs/data-types/bitfields.md @@ -12,6 +12,13 @@ For example, you can operate on anything from unsigned 1-bit integers to signed These values are stored using binary-encoded Redis strings. Bitfields support atomic read, write and increment operations, making them a good choice for managing counters and similar numerical values. + +## Basic commands + +* `BITFIELD` atomically sets, increments and reads one or more values. +* `BITFIELD_RO` is a read-only variant of `BITFIELD`. + + ## Examples Suppose you're keeping track of activity in an online game. @@ -46,10 +53,6 @@ You can represent these counters with one bitfield per player. 2) (integer) 1 ``` -## Basic commands - -* `BITFIELD` atomically sets, increments and reads one or more values. -* `BITFIELD_RO` is a read-only variant of `BITFIELD`. ## Performance diff --git a/docs/data-types/bitmaps.md b/docs/data-types/bitmaps.md index b829b93173..1f98bc9063 100644 --- a/docs/data-types/bitmaps.md +++ b/docs/data-types/bitmaps.md @@ -17,6 +17,41 @@ Some examples of bitmap use cases include: * Efficient set representations for cases where the members of a set correspond to the integers 0-N. * Object permissions, where each bit represents a particular permission, similar to the way that file systems store permissions. +## Basic commands + +* `SETBIT` sets a bit at the provided offset to 0 or 1. +* `GETBIT` returns the value of a bit at a given offset. +* `BITOP` lets you perform bitwise operations against one or more strings. + +See the [complete list of bitmap commands](https://redis.io/commands/?group=bitmap). + + +## Examples + +Suppose you have 1000 sensors deployed in the field, labeled 0-999. +You want to quickly determine whether a given sensor has pinged the server within the hour. 

You can represent this scenario using a bitmap whose key references the current hour.

* Sensor 123 pings the server on January 1, 2024 within the 00:00 hour.
```
> SETBIT pings:2024-01-01-00:00 123 1
(integer) 0
```

* Did sensor 123 ping the server on January 1, 2024 within the 00:00 hour?
```
> GETBIT pings:2024-01-01-00:00 123
(integer) 1
```

* What about sensor 456?
```
> GETBIT pings:2024-01-01-00:00 456
(integer) 0
```


 Bit operations are divided into two groups: constant-time single bit
 operations, like setting a bit to 1 or 0, or getting its value, and
@@ -84,38 +119,7 @@ instead of setting all the bits into a key, a trivial strategy is just to store
 M bits per key and obtain the key name with `bit-number/M` and the
 Nth bit to address inside the key with `bit-number MOD M`.
 
-## Examples
-
-Suppose you have 1000 sensors deployed in the field, labeled 0-999.
-You want to quickly determine whether a given sensor has pinged the server within the hour.
-
-You can represent this scenario using a bitmap whose key references the current hour.
-
-* Sensor 123 pings the server on January 1, 2024 within the 00:00 hour.
-```
-> SETBIT pings:2024-01-01-00:00 123 1
-(integer) 0
-```
-
-* Did sensor 123 ping the server on January 1, 2024 within the 00:00 hour?
-```
-> GETBIT pings:2024-01-01-00:00 123
-1
-```
-
-* What about server 456?
-```
-> GETBIT pings:2024-01-01-00:00 456
-0
-```
-
-## Basic commands
-
-* `SETBIT` sets a bit at the provided offset to 0 or 1.
-* `GETBIT` returns the value of a bit at a given offset.
-* `BITOP` lets you perform bitwise operations against one or more strings.
-
-See the [complete list of bitmap commands](https://redis.io/commands/?group=bitmap).
 
 ## Performance

diff --git a/docs/data-types/geospatial.md b/docs/data-types/geospatial.md
index 0f1e7ccdcb..6e8d6cd9dd 100644
--- a/docs/data-types/geospatial.md
+++ b/docs/data-types/geospatial.md
@@ -9,6 +9,14 @@ description: >
 Redis geospatial indexes let you store coordinates and search for them.
 This data structure is useful for finding nearby points within a given radius or bounding box.
 
+## Basic commands
+
+* `GEOADD` adds a location to a given geospatial index (note that longitude comes before latitude with this command).
+* `GEOSEARCH` returns locations with a given radius or a bounding box.
+
+See the [complete list of geospatial index commands](https://redis.io/commands/?group=geo).
+
+
 ## Examples
 
 Suppose you're building a mobile app that lets you find all of the electric car charging stations closest to your current location.
@@ -34,13 +42,6 @@ Find all locations within a 5 kilometer radius of a given location, and return t
 2) "2.2441"
 ```
 
-## Basic commands
-
-* `GEOADD` adds a location to a given geospatial index (note that longitude comes before latitude with this command).
-* `GEOSEARCH` returns locations with a given radius or a bounding box.
-
-See the [complete list of geospatial index commands](https://redis.io/commands/?group=geo).
-
 ## Learn more
 * [Redis Geospatial Explained](https://www.youtube.com/watch?v=qftiVQraxmI) introduces geospatial indexes by showing you how to build a map of local park attractions. 
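
The hunk above elides the full `GEOSEARCH` call, so here is a hedged sketch of the two query shapes the Basic commands list mentions — a radius search and a bounding-box search. The key `stations:ca`, the member `station:1`, and the coordinates are illustrative assumptions, not values recovered from the elided lines (note again that `GEOADD` takes longitude before latitude):

```
> GEOADD stations:ca -122.27652 37.80574 station:1
(integer) 1
> GEOSEARCH stations:ca FROMLONLAT -122.2612767 37.7936847 BYRADIUS 5 km ASC
1) "station:1"
> GEOSEARCH stations:ca FROMLONLAT -122.2612767 37.7936847 BYBOX 5 5 km ASC
1) "station:1"
```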
diff --git a/docs/data-types/hashes.md b/docs/data-types/hashes.md
index 1ecb940e9b..d6885d6d50 100644
--- a/docs/data-types/hashes.md
+++ b/docs/data-types/hashes.md
@@ -48,6 +48,14 @@ You can find the [full list of hash commands in the documentation](https://redis
 
 It is worth noting that small hashes (i.e., a few elements with small values) are
 encoded in special way in memory that make them very memory efficient.
 
+## Basic commands
+
+* `HSET` sets the value of one or more fields on a hash.
+* `HGET` returns the value at a given field.
+* `HMGET` returns the values at one or more given fields.
+* `HINCRBY` increments the value at a given field by the integer provided.
+
+See the [complete list of hash commands](https://redis.io/commands/?group=hash).
 
 ## Examples
 
@@ -88,14 +96,6 @@ encoded in special way in memory that make them very memory efficient.
 2) "1"
 ```
 
-## Basic commands
-
-* `HSET` sets the value of one or more fields on a hash.
-* `HGET` returns the value at a given field.
-* `HMGET` returns the values at one or more given fields.
-* `HINCRBY` increments the value at a given field by the integer provided.
-
-See the [complete list of hash commands](https://redis.io/commands/?group=hash).
 
 ## Performance

diff --git a/docs/data-types/lists.md b/docs/data-types/lists.md
index 5594209344..df5822b97d 100644
--- a/docs/data-types/lists.md
+++ b/docs/data-types/lists.md
@@ -12,6 +12,84 @@ Redis lists are frequently used to:
 * Implement stacks and queues.
 * Build queue management for background worker systems.
 
+## Basic commands
+
+* `LPUSH` adds a new element to the head of a list; `RPUSH` adds to the tail.
+* `LPOP` removes and returns an element from the head of a list; `RPOP` does the same but from the tail of a list.
+* `LLEN` returns the length of a list.
+* `LMOVE` atomically moves elements from one list to another.
+* `LTRIM` reduces a list to the specified range of elements.

### Blocking commands

Lists support several blocking commands.
For example:

* `BLPOP` removes and returns an element from the head of a list.
  If the list is empty, the command blocks until an element becomes available or until the specified timeout is reached.
* `BLMOVE` atomically moves elements from a source list to a target list.
  If the source list is empty, the command will block until a new element becomes available.

See the [complete series of list commands](https://redis.io/commands/?group=list).

## Examples

* Treat a list like a queue (first in, first out):
```
> LPUSH work:queue:ids 101
(integer) 1
> LPUSH work:queue:ids 237
(integer) 2
> RPOP work:queue:ids
"101"
> RPOP work:queue:ids
"237"
```

* Treat a list like a stack (first in, last out):
```
> LPUSH work:queue:ids 101
(integer) 1
> LPUSH work:queue:ids 237
(integer) 2
> LPOP work:queue:ids
"237"
> LPOP work:queue:ids
"101"
```

* Check the length of a list:
```
> LLEN work:queue:ids
(integer) 0
```

* Atomically pop an element from one list and push to another:
```
> LPUSH board:todo:ids 101
(integer) 1
> LPUSH board:todo:ids 273
(integer) 2
> LMOVE board:todo:ids board:in-progress:ids LEFT LEFT
"273"
> LRANGE board:todo:ids 0 -1
1) "101"
> LRANGE board:in-progress:ids 0 -1
1) "273"
```

* To create a capped list that never grows beyond 100 elements, you can call `LTRIM` after each call to `LPUSH`:
```
> LPUSH notifications:user:1 "You've got mail!" 
+(integer) 1 +> LTRIM notifications:user:1 0 99 +OK +> LPUSH notifications:user:1 "Your package will be delivered at 12:01 today." +(integer) 2 +> LTRIM notifications:user:1 0 99 +OK +``` + ### What are Lists? To explain the List data type it's better to start with a little bit of theory, as the term *List* is often used in an improper way by information technology @@ -285,87 +363,10 @@ Example of rule 3: (nil) -## Examples - -* Treat a list like a queue (first in, first out): -``` -> LPUSH work:queue:ids 101 -(integer) 1 -> LPUSH work:queue:ids 237 -(integer) 2 -> RPOP work:queue:ids -"101" -> RPOP work:queue:ids -"237" -``` - -* Treat a list like a stack (first in, last out): -``` -> LPUSH work:queue:ids 101 -(integer) 1 -> LPUSH work:queue:ids 237 -(integer) 2 -> LPOP work:queue:ids -"237" -> LPOP work:queue:ids -"101" -``` - -* Check the length of a list: -``` -> LLEN work:queue:ids -(integer) 0 -``` - -* Atomically pop an element from one list and push to another: -``` -> LPUSH board:todo:ids 101 -(integer) 1 -> LPUSH board:todo:ids 273 -(integer) 2 -> LMOVE board:todo:ids board:in-progress:ids LEFT LEFT -"273" -> LRANGE board:todo:ids 0 -1 -1) "101" -> LRANGE board:in-progress:ids 0 -1 -1) "273" -``` - -* To create a capped list that never grows beyond 100 elements, you can call `LTRIM` after each call to `LPUSH`: -``` -> LPUSH notifications:user:1 "You've got mail!" -(integer) 1 -> LTRIM notifications:user:1 0 99 -OK -> LPUSH notifications:user:1 "Your package will be delivered at 12:01 today." -(integer) 2 -> LTRIM notifications:user:1 0 99 -OK -``` - ## Limits The max length of a Redis list is 2^32 - 1 (4,294,967,295) elements. -## Basic commands - -* `LPUSH` adds a new element to the head of a list; `RPUSH` adds to the tail. -* `LPOP` removes and returns an element from the head of a list; `RPOP` does the same but from the tails of a list. -* `LLEN` returns the length of a list. -* `LMOVE` atomically moves elements from one list to another. -* `LTRIM` reduces a list to the specified range of elements. - -### Blocking commands - -Lists support several blocking commands. -For example: - -* `BLPOP` removes and returns an element from the head of a list. - If the list is empty, the command blocks until an element becomes available or until the specified timeout is reached. -* `BLMOVE` atomically moves elements from a source list to a target list. - If the source list is empty, the command will block until a new element becomes available. - -See the [complete series of list commands](https://redis.io/commands/?group=list). ## Performance diff --git a/docs/data-types/hyperloglogs.md b/docs/data-types/probabilistic/hyperloglogs.md similarity index 85% rename from docs/data-types/hyperloglogs.md rename to docs/data-types/probabilistic/hyperloglogs.md index 85a216c3aa..f9e3a1b730 100644 --- a/docs/data-types/hyperloglogs.md +++ b/docs/data-types/probabilistic/hyperloglogs.md @@ -1,9 +1,11 @@ --- -title: "Redis HyperLogLog" +title: "HyperLogLog" linkTitle: "HyperLogLog" -weight: 90 +weight: 1 description: > - Introduction to the Redis HyperLogLog data type + HyperLogLog is a probabilistic data structure that estimates the cardinality of a set. +aliases: + - /docs/data-types/hyperloglogs/ --- HyperLogLog is a probabilistic data structure that estimates the cardinality of a set. As a probabilistic data structure, HyperLogLog trades perfect accuracy for efficient space utilization. 
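
As a quick, hedged preview of the API covered below — the key name `visitors:today` and the members are hypothetical — note how re-adding an element leaves the estimate unchanged and returns 0, since no internal register is altered:

```
> PFADD visitors:today "alice" "bob" "carol"
(integer) 1
> PFADD visitors:today "alice"
(integer) 0
> PFCOUNT visitors:today
(integer) 3
```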
@@ -15,10 +17,10 @@ proportional to the number of items you want to count, because you need
 to remember the elements you have already seen in the past in order to avoid
 counting them multiple times. However, a set of algorithms exist that trade
 memory for precision: they return an estimated measure with a standard error,
 which, in the case of the Redis implementation for HyperLogLog, is less than 1%.
 The magic of this algorithm is that you no longer need to use an amount of memory
 proportional to the number of items counted, and instead can use a
-constant amount of memory! 12k bytes in the worst case, or a lot less if your
+constant amount of memory; 12k bytes in the worst case, or a lot less if your
 HyperLogLog (We'll just call them HLL from now) has seen very few elements.

HLLs in Redis, while technically a different data structure, are encoded
as a Redis string, so you can call `GET` to serialize a HLL, and `SET`
to deserialize it back to the server.

Conceptually the HLL API is like using Sets to do the same task. You would
`SADD` every observed element into a set, and would use `SCARD` to check the
number of elements inside the set, which are unique since `SADD` will not
re-add an existing element.

While you don't really add items into an HLL, because the data structure
only contains a state that does not include actual elements, the API is the
same:

> pfadd hll a b c d
(integer) 1
> pfcount hll
(integer) 4

-An example of use case for this data structure is counting unique queries
-performed by users in a search form every day.
+Some examples of use cases for this data structure are counting unique queries
+performed by users in a search form every day, the number of unique visitors to a web page, and other similar cases.

Redis is also able to perform the union of HLLs, please check the
[full documentation](/commands#hyperloglog) for more information.
diff --git a/docs/data-types/sets.md b/docs/data-types/sets.md
index 00ee2cf45c..fcc3564209 100644
--- a/docs/data-types/sets.md
+++ b/docs/data-types/sets.md
@@ -13,6 +13,52 @@ You can use Redis sets to efficiently:
 * Represent relations (e.g., the set of all users with a given role).
 * Perform common set operations such as intersection, unions, and differences.
 
+## Basic commands
+
+* `SADD` adds a new member to a set.
+* `SREM` removes the specified member from the set.
+* `SISMEMBER` tests a string for set membership.
+* `SINTER` returns the set of members that two or more sets have in common (i.e., the intersection).
+* `SCARD` returns the size (a.k.a. cardinality) of a set.
+
+See the [complete list of set commands](https://redis.io/commands/?group=set).

## Examples

* Store the set of favorited book IDs for users 123 and 456:
```
> SADD user:123:favorites 347
(integer) 1
> SADD user:123:favorites 561
(integer) 1
> SADD user:123:favorites 742
(integer) 1
> SADD user:456:favorites 561
(integer) 1
```

* Check whether user 123 likes books 742 and 299:
```
> SISMEMBER user:123:favorites 742
(integer) 1
> SISMEMBER user:123:favorites 299
(integer) 0
```

* Do user 123 and 456 have any favorite books in common?
```
> SINTER user:123:favorites user:456:favorites
1) "561"
```

* How many books has user 123 favorited?
```
> SCARD user:123:favorites
(integer) 3
```

## Tutorial

 The [`SADD`](/commands/sadd) command adds new elements to a set. It's also possible
 to do a number of other operations against sets like testing if a given element
 already exists, performing the intersection, union or difference between
@@ -143,54 +189,10 @@ When you need to just get random elements without removing them from the set,
 there is the [`SRANDMEMBER`](/commands/srandmember) command suitable for the
 task. It also features the ability to return both repeating and non-repeating
 elements. 
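
Here is a short, hedged illustration of that difference — the key and members are made up, and because both commands select elements at random, your output will differ. A positive `SRANDMEMBER` count returns distinct elements, a negative count may repeat them, and `SPOP` removes what it returns:

```
> SADD letters a b c d
(integer) 4
> SRANDMEMBER letters 2
1) "b"
2) "d"
> SRANDMEMBER letters -6
1) "a"
2) "c"
3) "a"
4) "d"
5) "a"
6) "b"
> SPOP letters 1
1) "c"
> SCARD letters
(integer) 3
```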
-## Examples - -* Store the set of favorited book IDs for users 123 and 456: -``` -> SADD user:123:favorites 347 -(integer) 1 -> SADD user:123:favorites 561 -(integer) 1 -> SADD user:123:favorites 742 -(integer) 1 -> SADD user:456:favorites 561 -(integer) 1 -``` - -* Check whether user 123 likes books 742 and 299 -``` -> SISMEMBER user:123:favorites 742 -(integer) 1 -> SISMEMBER user:123:favorites 299 -(integer) 0 -``` - -* Do user 123 and 456 have any favorite books in common? -``` -> SINTER user:123:favorites user:456:favorites -1) "561" -``` - -* How many books has user 123 favorited? -``` -> SCARD user:123:favorites -(integer) 3 -``` - ## Limits The max size of a Redis set is 2^32 - 1 (4,294,967,295) members. -## Basic commands - -* `SADD` adds a new member to a set. -* `SREM` removes the specified member from the set. -* `SISMEMBER` tests a string for set membership. -* `SINTER` returns the set of members that two or more sets have in common (i.e., the intersection). -* `SCARD` returns the size (a.k.a. cardinality) of a set. - -See the [complete list of set commands](https://redis.io/commands/?group=set). - ## Performance Most set operations, including adding, removing, and checking whether an item is a set member, are O(1). diff --git a/docs/data-types/streams-tutorial.md b/docs/data-types/streams-tutorial.md deleted file mode 100644 index 01f171dea3..0000000000 --- a/docs/data-types/streams-tutorial.md +++ /dev/null @@ -1,794 +0,0 @@ ---- -title: "Redis Streams tutorial" -linkTitle: "Streams tutorial" -weight: 61 -description: > - A comprehensive tutorial on Redis streams -aliases: - - /topics/streams-intro - - /docs/manual/data-types/streams ---- - -If you're new to streams, see the [Redis Streams introduction](/docs/data-types/streams/). For a more comprehensive tutorial, read on. - -## Introduction - -The Redis stream data type was introduced in Redis 5.0. Streams model a log data structure but also implement several operations to overcome some of the limits of a typical append-only log. These include random access in O(1) time and complex consumption strategies, such as consumer groups. - -## Streams basics - -Streams are an append-only data structure. The fundamental write command, called [XADD](/commands/xadd), appends a new entry to the specified stream. - -Each stream entry consists of one or more field-value pairs, somewhat like a record or a Redis hash: - -``` -> XADD mystream * sensor-id 1234 temperature 19.8 -1518951480106-0 -``` - -The above call to the `XADD` command adds an entry `sensor-id: 1234, temperature: 19.8` to the stream at key `mystream`, using an auto-generated entry ID, which is the one returned by the command, specifically `1518951480106-0`. It gets as its first argument the key name `mystream`, the second argument is the entry ID that identifies every entry inside a stream. However, in this case, we passed `*` because we want the server to generate a new ID for us. Every new ID will be monotonically increasing, so in more simple terms, every new entry added will have a higher ID compared to all the past entries. Auto-generation of IDs by the server is almost always what you want, and the reasons for specifying an ID explicitly are very rare. We'll talk more about this later. The fact that each Stream entry has an ID is another similarity with log files, where line numbers, or the byte offset inside the file, can be used in order to identify a given entry. 
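
To make the ID behavior concrete before returning to the example, here is a hedged sketch — the key `tempstream` and the returned IDs are illustrative, but the pattern is real: two entries added within the same millisecond share the time part and differ only in the sequence part:

```
> XADD tempstream * sensor-id 1234 temperature 19.8
1518951480106-0
> XADD tempstream * sensor-id 1234 temperature 19.9
1518951480106-1
```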
Returning back at our `XADD` example, after the key name and ID, the next arguments are the field-value pairs composing our stream entry. - -It is possible to get the number of items inside a Stream just using the `XLEN` command: - -``` -> XLEN mystream -(integer) 1 -``` - -### Entry IDs - -The entry ID returned by the `XADD` command, and identifying univocally each entry inside a given stream, is composed of two parts: - -``` -- -``` - -The milliseconds time part is actually the local time in the local Redis node generating the stream ID, however if the current milliseconds time happens to be smaller than the previous entry time, then the previous entry time is used instead, so if a clock jumps backward the monotonically incrementing ID property still holds. The sequence number is used for entries created in the same millisecond. Since the sequence number is 64 bit wide, in practical terms there is no limit to the number of entries that can be generated within the same millisecond. - -The format of such IDs may look strange at first, and the gentle reader may wonder why the time is part of the ID. The reason is that Redis streams support range queries by ID. Because the ID is related to the time the entry is generated, this gives the ability to query for time ranges basically for free. We will see this soon while covering the `XRANGE` command. - -If for some reason the user needs incremental IDs that are not related to time but are actually associated to another external system ID, as previously mentioned, the `XADD` command can take an explicit ID instead of the `*` wildcard ID that triggers auto-generation, like in the following examples: - -``` -> XADD somestream 0-1 field value -0-1 -> XADD somestream 0-2 foo bar -0-2 -``` - -Note that in this case, the minimum ID is 0-1 and that the command will not accept an ID equal or smaller than a previous one: - -``` -> XADD somestream 0-1 foo bar -(error) ERR The ID specified in XADD is equal or smaller than the target stream top item -``` - -If you're running Redis 7 or later, you can also provide an explicit ID consisting of the milliseconds part only. In this case, the sequence portion of the ID will be automatically generated. To do this, use the syntax below: - -``` -> XADD somestream 0-* baz qux -0-3 -``` - -## Getting data from Streams - -Now we are finally able to append entries in our stream via `XADD`. However, while appending data to a stream is quite obvious, the way streams can be queried in order to extract data is not so obvious. If we continue with the analogy of the log file, one obvious way is to mimic what we normally do with the Unix command `tail -f`, that is, we may start to listen in order to get the new messages that are appended to the stream. Note that unlike the blocking list operations of Redis, where a given element will reach a single client which is blocking in a *pop style* operation like `BLPOP`, with streams we want multiple consumers to see the new messages appended to the stream (the same way many `tail -f` processes can see what is added to a log). Using the traditional terminology we want the streams to be able to *fan out* messages to multiple clients. - -However, this is just one potential access mode. We could also see a stream in quite a different way: not as a messaging system, but as a *time series store*. 
In this case, maybe it's also useful to get the new messages appended, but another natural query mode is to get messages by ranges of time, or alternatively to iterate the messages using a cursor to incrementally check all the history. This is definitely another useful access mode. - -Finally, if we see a stream from the point of view of consumers, we may want to access the stream in yet another way, that is, as a stream of messages that can be partitioned to multiple consumers that are processing such messages, so that groups of consumers can only see a subset of the messages arriving in a single stream. In this way, it is possible to scale the message processing across different consumers, without single consumers having to process all the messages: each consumer will just get different messages to process. This is basically what Kafka (TM) does with consumer groups. Reading messages via consumer groups is yet another interesting mode of reading from a Redis Stream. - -Redis Streams support all three of the query modes described above via different commands. The next sections will show them all, starting from the simplest and most direct to use: range queries. - -### Querying by range: XRANGE and XREVRANGE - -To query the stream by range we are only required to specify two IDs, *start* and *end*. The range returned will include the elements having start or end as ID, so the range is inclusive. The two special IDs `-` and `+` respectively mean the smallest and the greatest ID possible. - -``` -> XRANGE mystream - + -1) 1) 1518951480106-0 - 2) 1) "sensor-id" - 2) "1234" - 3) "temperature" - 4) "19.8" -2) 1) 1518951482479-0 - 2) 1) "sensor-id" - 2) "9999" - 3) "temperature" - 4) "18.2" -``` - -Each entry returned is an array of two items: the ID and the list of field-value pairs. We already said that the entry IDs have a relation with the time, because the part at the left of the `-` character is the Unix time in milliseconds of the local node that created the stream entry, at the moment the entry was created (however note that streams are replicated with fully specified `XADD` commands, so the replicas will have identical IDs to the master). This means that I could query a range of time using `XRANGE`. In order to do so, however, I may want to omit the sequence part of the ID: if omitted, in the start of the range it will be assumed to be 0, while in the end part it will be assumed to be the maximum sequence number available. This way, querying using just two milliseconds Unix times, we get all the entries that were generated in that range of time, in an inclusive way. For instance, if I want to query a two milliseconds period I could use: - -``` -> XRANGE mystream 1518951480106 1518951480107 -1) 1) 1518951480106-0 - 2) 1) "sensor-id" - 2) "1234" - 3) "temperature" - 4) "19.8" -``` - -I have only a single entry in this range, however in real data sets, I could query for ranges of hours, or there could be many items in just two milliseconds, and the result returned could be huge. For this reason, `XRANGE` supports an optional **COUNT** option at the end. By specifying a count, I can just get the first *N* items. If I want more, I can get the last ID returned, increment the sequence part by one, and query again. Let's see this in the following example. We start adding 10 items with `XADD` (I won't show that, lets assume that the stream `mystream` was populated with 10 items). To start my iteration, getting 2 items per command, I start with the full range, but with a count of 2. 
- -``` -> XRANGE mystream - + COUNT 2 -1) 1) 1519073278252-0 - 2) 1) "foo" - 2) "value_1" -2) 1) 1519073279157-0 - 2) 1) "foo" - 2) "value_2" -``` - -In order to continue the iteration with the next two items, I have to pick the last ID returned, that is `1519073279157-0` and add the prefix `(` to it. The resulting exclusive range interval, that is `(1519073279157-0` in this case, can now be used as the new *start* argument for the next `XRANGE` call: - -``` -> XRANGE mystream (1519073279157-0 + COUNT 2 -1) 1) 1519073280281-0 - 2) 1) "foo" - 2) "value_3" -2) 1) 1519073281432-0 - 2) 1) "foo" - 2) "value_4" -``` - -And so forth. Since `XRANGE` complexity is *O(log(N))* to seek, and then *O(M)* to return M elements, with a small count the command has a logarithmic time complexity, which means that each step of the iteration is fast. So `XRANGE` is also the de facto *streams iterator* and does not require an **XSCAN** command. - -The command `XREVRANGE` is the equivalent of `XRANGE` but returning the elements in inverted order, so a practical use for `XREVRANGE` is to check what is the last item in a Stream: - -``` -> XREVRANGE mystream + - COUNT 1 -1) 1) 1519073287312-0 - 2) 1) "foo" - 2) "value_10" -``` - -Note that the `XREVRANGE` command takes the *start* and *stop* arguments in reverse order. - -## Listening for new items with XREAD - -When we do not want to access items by a range in a stream, usually what we want instead is to *subscribe* to new items arriving to the stream. This concept may appear related to Redis Pub/Sub, where you subscribe to a channel, or to Redis blocking lists, where you wait for a key to get new elements to fetch, but there are fundamental differences in the way you consume a stream: - -1. A stream can have multiple clients (consumers) waiting for data. Every new item, by default, will be delivered to *every consumer* that is waiting for data in a given stream. This behavior is different than blocking lists, where each consumer will get a different element. However, the ability to *fan out* to multiple consumers is similar to Pub/Sub. -2. While in Pub/Sub messages are *fire and forget* and are never stored anyway, and while when using blocking lists, when a message is received by the client it is *popped* (effectively removed) from the list, streams work in a fundamentally different way. All the messages are appended in the stream indefinitely (unless the user explicitly asks to delete entries): different consumers will know what is a new message from its point of view by remembering the ID of the last message received. -3. Streams Consumer Groups provide a level of control that Pub/Sub or blocking lists cannot achieve, with different groups for the same stream, explicit acknowledgment of processed items, ability to inspect the pending items, claiming of unprocessed messages, and coherent history visibility for each single client, that is only able to see its private past history of messages. - -The command that provides the ability to listen for new messages arriving into a stream is called `XREAD`. It's a bit more complex than `XRANGE`, so we'll start showing simple forms, and later the whole command layout will be provided. - -``` -> XREAD COUNT 2 STREAMS mystream 0 -1) 1) "mystream" - 2) 1) 1) 1519073278252-0 - 2) 1) "foo" - 2) "value_1" - 2) 1) 1519073279157-0 - 2) 1) "foo" - 2) "value_2" -``` - -The above is the non-blocking form of `XREAD`. 
Note that the **COUNT** option is not mandatory, in fact the only mandatory option of the command is the **STREAMS** option, that specifies a list of keys together with the corresponding maximum ID already seen for each stream by the calling consumer, so that the command will provide the client only with messages with an ID greater than the one we specified. - -In the above command we wrote `STREAMS mystream 0` so we want all the messages in the Stream `mystream` having an ID greater than `0-0`. As you can see in the example above, the command returns the key name, because actually it is possible to call this command with more than one key to read from different streams at the same time. I could write, for instance: `STREAMS mystream otherstream 0 0`. Note how after the **STREAMS** option we need to provide the key names, and later the IDs. For this reason, the **STREAMS** option must always be the last option. -Any other options must come before the **STREAMS** option. - -Apart from the fact that `XREAD` can access multiple streams at once, and that we are able to specify the last ID we own to just get newer messages, in this simple form the command is not doing something so different compared to `XRANGE`. However, the interesting part is that we can turn `XREAD` into a *blocking command* easily, by specifying the **BLOCK** argument: - -``` -> XREAD BLOCK 0 STREAMS mystream $ -``` - -Note that in the example above, other than removing **COUNT**, I specified the new **BLOCK** option with a timeout of 0 milliseconds (that means to never timeout). Moreover, instead of passing a normal ID for the stream `mystream` I passed the special ID `$`. This special ID means that `XREAD` should use as last ID the maximum ID already stored in the stream `mystream`, so that we will receive only *new* messages, starting from the time we started listening. This is similar to the `tail -f` Unix command in some way. - -Note that when the **BLOCK** option is used, we do not have to use the special ID `$`. We can use any valid ID. If the command is able to serve our request immediately without blocking, it will do so, otherwise it will block. Normally if we want to consume the stream starting from new entries, we start with the ID `$`, and after that we continue using the ID of the last message received to make the next call, and so forth. - -The blocking form of `XREAD` is also able to listen to multiple Streams, just by specifying multiple key names. If the request can be served synchronously because there is at least one stream with elements greater than the corresponding ID we specified, it returns with the results. Otherwise, the command will block and will return the items of the first stream which gets new data (according to the specified ID). - -Similarly to blocking list operations, blocking stream reads are *fair* from the point of view of clients waiting for data, since the semantics is FIFO style. The first client that blocked for a given stream will be the first to be unblocked when new items are available. - -`XREAD` has no other options than **COUNT** and **BLOCK**, so it's a pretty basic command with a specific purpose to attach consumers to one or multiple streams. More powerful features to consume streams are available using the consumer groups API, however reading via consumer groups is implemented by a different command called `XREADGROUP`, covered in the next section of this guide. 
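
Before moving on, here is a hedged sketch of the typical consumption loop this enables (the entry ID and payload are illustrative): start from `$` to skip history, then pass the ID of the last entry you received on each subsequent call:

```
> XREAD BLOCK 0 STREAMS mystream $
1) 1) "mystream"
   2) 1) 1) 1526999626221-0
         2) 1) "message"
            2) "hello"
> XREAD BLOCK 0 STREAMS mystream 1526999626221-0
```

The second call blocks until an entry with an ID greater than `1526999626221-0` is appended to the stream.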
- -## Consumer groups - -When the task at hand is to consume the same stream from different clients, then `XREAD` already offers a way to *fan-out* to N clients, potentially also using replicas in order to provide more read scalability. However in certain problems what we want to do is not to provide the same stream of messages to many clients, but to provide a *different subset* of messages from the same stream to many clients. An obvious case where this is useful is that of messages which are slow to process: the ability to have N different workers that will receive different parts of the stream allows us to scale message processing, by routing different messages to different workers that are ready to do more work. - -In practical terms, if we imagine having three consumers C1, C2, C3, and a stream that contains the messages 1, 2, 3, 4, 5, 6, 7 then what we want is to serve the messages according to the following diagram: - -``` -1 -> C1 -2 -> C2 -3 -> C3 -4 -> C1 -5 -> C2 -6 -> C3 -7 -> C1 -``` - -In order to achieve this, Redis uses a concept called *consumer groups*. It is very important to understand that Redis consumer groups have nothing to do, from an implementation standpoint, with Kafka (TM) consumer groups. Yet they are similar in functionality, so I decided to keep Kafka's (TM) terminology, as it originally popularized this idea. - -A consumer group is like a *pseudo consumer* that gets data from a stream, and actually serves multiple consumers, providing certain guarantees: - -1. Each message is served to a different consumer so that it is not possible that the same message will be delivered to multiple consumers. -2. Consumers are identified, within a consumer group, by a name, which is a case-sensitive string that the clients implementing consumers must choose. This means that even after a disconnect, the stream consumer group retains all the state, since the client will claim again to be the same consumer. However, this also means that it is up to the client to provide a unique identifier. -3. Each consumer group has the concept of the *first ID never consumed* so that, when a consumer asks for new messages, it can provide just messages that were not previously delivered. -4. Consuming a message, however, requires an explicit acknowledgment using a specific command. Redis interprets the acknowledgment as: this message was correctly processed so it can be evicted from the consumer group. -5. A consumer group tracks all the messages that are currently pending, that is, messages that were delivered to some consumer of the consumer group, but are yet to be acknowledged as processed. Thanks to this feature, when accessing the message history of a stream, each consumer *will only see messages that were delivered to it*. - -In a way, a consumer group can be imagined as some *amount of state* about a stream: - -``` -+----------------------------------------+ -| consumer_group_name: mygroup | -| consumer_group_stream: somekey | -| last_delivered_id: 1292309234234-92 | -| | -| consumers: | -| "consumer-1" with pending messages | -| 1292309234234-4 | -| 1292309234232-8 | -| "consumer-42" with pending messages | -| ... (and so forth) | -+----------------------------------------+ -``` - -If you see this from this point of view, it is very simple to understand what a consumer group can do, how it is able to just provide consumers with their history of pending messages, and how consumers asking for new messages will just be served with message IDs greater than `last_delivered_id`. 
At the same time, if you look at the consumer group as an auxiliary data structure for Redis streams, it is obvious that a single stream can have multiple consumer groups, that have a different set of consumers. Actually, it is even possible for the same stream to have clients reading without consumer groups via `XREAD`, and clients reading via `XREADGROUP` in different consumer groups.
-
-Now it's time to zoom in to see the fundamental consumer group commands. They are the following:
-
-* `XGROUP` is used in order to create, destroy and manage consumer groups.
-* `XREADGROUP` is used to read from a stream via a consumer group.
-* `XACK` is the command that allows a consumer to mark a pending message as correctly processed.
-
-## Creating a consumer group
-
-Assuming I have a key `mystream` of type stream already existing, in order to create a consumer group I just need to do the following:
-
-```
-> XGROUP CREATE mystream mygroup $
-OK
-```
-
-As you can see in the command above when creating the consumer group we have to specify an ID, which in the example is just `$`. This is needed because the consumer group, among the other states, must have an idea about what message to serve next at the first consumer connecting, that is, what was the *last message ID* when the group was just created. If we provide `$` as we did, then only new messages arriving in the stream from now on will be provided to the consumers in the group. If we specify `0` instead the consumer group will consume *all* the messages in the stream history to start with. Of course, you can specify any other valid ID. What you know is that the consumer group will start delivering messages that are greater than the ID you specify. Because `$` means the current greatest ID in the stream, specifying `$` will have the effect of consuming only new messages.
-
-`XGROUP CREATE` also supports creating the stream automatically, if it doesn't exist, using the optional `MKSTREAM` subcommand as the last argument:
-
-```
-> XGROUP CREATE newstream mygroup $ MKSTREAM
-OK
-```
-
-Now that the consumer group is created we can immediately try to read messages via the consumer group using the `XREADGROUP` command. We'll read from consumers, that we will call Alice and Bob, to see how the system will return different messages to Alice or Bob.
-
-`XREADGROUP` is very similar to `XREAD` and provides the same **BLOCK** option, otherwise it is a synchronous command. However there is a *mandatory* option that must be always specified, which is **GROUP** and has two arguments: the name of the consumer group, and the name of the consumer that is attempting to read. The option **COUNT** is also supported and is identical to the one in `XREAD`.
-
-Before reading from the stream, let's put some messages inside:
-
-```
-> XADD mystream * message apple
-1526569495631-0
-> XADD mystream * message orange
-1526569498055-0
-> XADD mystream * message strawberry
-1526569506935-0
-> XADD mystream * message apricot
-1526569535168-0
-> XADD mystream * message banana
-1526569544280-0
-```
-
-Note: *here message is the field name, and the fruit is the associated value, remember that stream items are small dictionaries.*
-
-It is time to try reading something using the consumer group:
-
-```
-> XREADGROUP GROUP mygroup Alice COUNT 1 STREAMS mystream >
-1) 1) "mystream"
-   2) 1) 1) 1526569495631-0
-         2) 1) "message"
-            2) "apple"
-```
-
-`XREADGROUP` replies are just like `XREAD` replies. Note however the `GROUP <group-name> <consumer-name>` provided above. 
It states that I want to read from the stream using the consumer group `mygroup` and I'm the consumer `Alice`. Every time a consumer performs an operation with a consumer group, it must specify its name, uniquely identifying this consumer inside the group. - -There is another very important detail in the command line above, after the mandatory **STREAMS** option the ID requested for the key `mystream` is the special ID `>`. This special ID is only valid in the context of consumer groups, and it means: **messages never delivered to other consumers so far**. - -This is almost always what you want, however it is also possible to specify a real ID, such as `0` or any other valid ID, in this case, however, what happens is that we request from `XREADGROUP` to just provide us with the **history of pending messages**, and in such case, will never see new messages in the group. So basically `XREADGROUP` has the following behavior based on the ID we specify: - -* If the ID is the special ID `>` then the command will return only new messages never delivered to other consumers so far, and as a side effect, will update the consumer group's *last ID*. -* If the ID is any other valid numerical ID, then the command will let us access our *history of pending messages*. That is, the set of messages that were delivered to this specified consumer (identified by the provided name), and never acknowledged so far with `XACK`. - -We can test this behavior immediately specifying an ID of 0, without any **COUNT** option: we'll just see the only pending message, that is, the one about apples: - -``` -> XREADGROUP GROUP mygroup Alice STREAMS mystream 0 -1) 1) "mystream" - 2) 1) 1) 1526569495631-0 - 2) 1) "message" - 2) "apple" -``` - -However, if we acknowledge the message as processed, it will no longer be part of the pending messages history, so the system will no longer report anything: - -``` -> XACK mystream mygroup 1526569495631-0 -(integer) 1 -> XREADGROUP GROUP mygroup Alice STREAMS mystream 0 -1) 1) "mystream" - 2) (empty list or set) -``` - -Don't worry if you yet don't know how `XACK` works, the idea is just that processed messages are no longer part of the history that we can access. - -Now it's Bob's turn to read something: - -``` -> XREADGROUP GROUP mygroup Bob COUNT 2 STREAMS mystream > -1) 1) "mystream" - 2) 1) 1) 1526569498055-0 - 2) 1) "message" - 2) "orange" - 2) 1) 1526569506935-0 - 2) 1) "message" - 2) "strawberry" -``` - -Bob asked for a maximum of two messages and is reading via the same group `mygroup`. So what happens is that Redis reports just *new* messages. As you can see the "apple" message is not delivered, since it was already delivered to Alice, so Bob gets orange and strawberry, and so forth. - -This way Alice, Bob, and any other consumer in the group, are able to read different messages from the same stream, to read their history of yet to process messages, or to mark messages as processed. This allows creating different topologies and semantics for consuming messages from a stream. - -There are a few things to keep in mind: - -* Consumers are auto-created the first time they are mentioned, no need for explicit creation. -* Even with `XREADGROUP` you can read from multiple keys at the same time, however for this to work, you need to create a consumer group with the same name in every stream. This is not a common need, but it is worth mentioning that the feature is technically available. 
* `XREADGROUP` is a *write command* because even if it reads from the stream, the consumer group is modified as a side effect of reading, so it can only be called on master instances.

An example of a consumer implementation, using consumer groups, written in the Ruby language could be the following. The Ruby code is aimed to be readable by virtually any experienced programmer, even if they do not know Ruby:

```ruby
require 'redis'

if ARGV.length == 0
    puts "Please specify a consumer name"
    exit 1
end

ConsumerName = ARGV[0]
GroupName = "mygroup"
r = Redis.new

def process_message(id,msg)
    puts "[#{ConsumerName}] #{id} = #{msg.inspect}"
end

$lastid = '0-0'

puts "Consumer #{ConsumerName} starting..."
check_backlog = true
while true
    # Pick the ID based on the iteration: the first time we want to
    # read our pending messages, in case we crashed and are recovering.
    # Once we consumed our history, we can start getting new messages.
    if check_backlog
        myid = $lastid
    else
        myid = '>'
    end

    items = r.xreadgroup('GROUP',GroupName,ConsumerName,'BLOCK','2000','COUNT','10','STREAMS',:my_stream_key,myid)

    if items == nil
        puts "Timeout!"
        next
    end

    # If we receive an empty reply, it means we were consuming our history
    # and that the history is now empty. Let's start to consume new messages.
    check_backlog = false if items[0][1].length == 0

    items[0][1].each{|i|
        id,fields = i

        # Process the message
        process_message(id,fields)

        # Acknowledge the message as processed
        r.xack(:my_stream_key,GroupName,id)

        $lastid = id
    }
end
```

As you can see the idea here is to start by consuming the history, that is, our list of pending messages. This is useful because the consumer may have crashed before, so in the event of a restart we want to re-read messages that were delivered to us without getting acknowledged. Note that we might process a message one or multiple times (at least in the case of consumer failures, but there are also the limits of Redis persistence and replication involved, see the specific section about this topic).

Once the history has been consumed, and we get an empty list of messages, we can switch to using the `>` special ID in order to consume new messages.

## Recovering from permanent failures

The example above allows us to write consumers that participate in the same consumer group, each taking a subset of messages to process, and when recovering from failures re-reading the pending messages that were delivered just to them. However in the real world consumers may permanently fail and never recover. What happens to the pending messages of the consumer that never recovers after stopping for any reason?

Redis consumer groups offer a feature that is used in these situations in order to *claim* the pending messages of a given consumer so that such messages will change ownership and will be re-assigned to a different consumer. The feature is very explicit. A consumer has to inspect the list of pending messages, and will have to claim specific messages using a special command, otherwise the server will leave the messages pending forever and assigned to the old consumer. In this way different applications can choose whether to use such a feature or not, and exactly how to use it.

The first step of this process is just a command that provides observability of pending entries in the consumer group and is called `XPENDING`.
This is a read-only command which is always safe to call and will not change ownership of any message.
In its simplest form, the command is called with two arguments, which are the name of the stream and the name of the consumer group.

```
> XPENDING mystream mygroup
1) (integer) 2
2) 1526569498055-0
3) 1526569506935-0
4) 1) 1) "Bob"
      2) "2"
```

When called in this way, the command outputs the total number of pending messages in the consumer group (two in this case), the lower and higher message ID among the pending messages, and finally a list of consumers and the number of pending messages they have.
We have only Bob with two pending messages because the single message that Alice requested was acknowledged using `XACK`.

We can ask for more information by giving more arguments to `XPENDING`, because the full command signature is the following:

```
XPENDING <key> <group-name> [[IDLE <min-idle-time>] <start-id> <end-id> <count> [<consumer-name>]]
```

By providing a start and end ID (that can be just `-` and `+` as in `XRANGE`) and a count to control the amount of information returned by the command, we are able to know more about the pending messages. The optional final argument, the consumer name, is used if we want to limit the output to just messages pending for a given consumer, but we won't use this feature in the following example.

```
> XPENDING mystream mygroup - + 10
1) 1) 1526569498055-0
   2) "Bob"
   3) (integer) 74170458
   4) (integer) 1
2) 1) 1526569506935-0
   2) "Bob"
   3) (integer) 74170458
   4) (integer) 1
```

Now we have the details for each message: the ID, the consumer name, the *idle time* in milliseconds, which is how many milliseconds have passed since the last time the message was delivered to some consumer, and finally the number of times that a given message was delivered.
We have two messages from Bob, and they are idle for 74170458 milliseconds, about 20 hours.

Note that nobody prevents us from checking what the first message content was by just using `XRANGE`.

```
> XRANGE mystream 1526569498055-0 1526569498055-0
1) 1) 1526569498055-0
   2) 1) "message"
      2) "orange"
```

We just have to repeat the same ID twice in the arguments. Now that we have some ideas, Alice may decide that after 20 hours of not processing messages, Bob will probably not recover in time, and it's time to *claim* such messages and resume the processing in place of Bob. To do so, we use the `XCLAIM` command.

This command is very complex and full of options in its full form, since it is used for replication of consumer group changes, but we'll use just the arguments that we need normally. In this case it is as simple as:

```
XCLAIM <key> <group> <consumer> <min-idle-time> <ID-1> <ID-2> ... <ID-N>
```

Basically we say, for this specific key and group, I want that the message IDs specified will change ownership, and will be assigned to the specified consumer name `<consumer>`. However, we also provide a minimum idle time, so that the operation will only work if the idle time of the mentioned messages is greater than the specified idle time. This is useful because maybe two clients are retrying to claim a message at the same time:

```
Client 1: XCLAIM mystream mygroup Alice 3600000 1526569498055-0
Client 2: XCLAIM mystream mygroup Lora 3600000 1526569498055-0
```

However, as a side effect, claiming a message will reset its idle time and will increment its number of deliveries counter, so the second client will fail claiming it. In this way we avoid trivial re-processing of messages (even if in the general case you cannot obtain exactly once processing).
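Since in the general case delivery is therefore *at least once*, a useful complementary technique is to make the processing step itself idempotent, so that an entry claimed and re-delivered twice does not produce its effects twice. The following is just a minimal sketch of one way to do it, in the same schematic Ruby style as the consumer example above; the `processed:ids` set used as a deduplication guard is an illustrative assumption, not part of the streams API:

```ruby
require 'redis'

r = Redis.new

# Hypothetical idempotent handler: SADD adds the entry ID to a set and
# reports whether it was already there, so a re-delivered entry is
# detected and skipped. (Clients return 1/0 or true/false here.)
def process_once(r, id, fields)
  added = r.sadd('processed:ids', id)    # illustrative dedup key
  return :duplicate if added == 0 || added == false
  puts "processing #{id} => #{fields.inspect}"
end
```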
Returning to Alice's `XCLAIM` call above, this is the result of the command execution:

```
> XCLAIM mystream mygroup Alice 3600000 1526569498055-0
1) 1) 1526569498055-0
   2) 1) "message"
      2) "orange"
```

The message was successfully claimed by Alice, who can now process the message and acknowledge it, and move things forward even if the original consumer is not recovering.

It is clear from the example above that as a side effect of successfully claiming a given message, the `XCLAIM` command also returns it. However this is not mandatory. The **JUSTID** option can be used in order to return just the IDs of the messages successfully claimed. This is useful if you want to reduce the bandwidth used between the client and the server (and improve the performance of the command) and you are not interested in the message because your consumer is implemented in a way that it will rescan the history of pending messages from time to time.

Claiming may also be implemented by a separate process: one that just checks the list of pending messages, and assigns idle messages to consumers that appear to be active. Active consumers can be obtained using one of the observability features of Redis streams. This is the topic of the next section.

## Automatic claiming

The `XAUTOCLAIM` command, added in Redis 6.2, implements the claiming process that we've described above.
`XPENDING` and `XCLAIM` provide the basic building blocks for different types of recovery mechanisms.
This command optimizes the generic process by having Redis manage it and offers a simple solution for most recovery needs.

`XAUTOCLAIM` identifies idle pending messages and transfers ownership of them to a consumer.
The command's signature looks like this:

```
XAUTOCLAIM <key> <group> <consumer> <min-idle-time> <start> [COUNT count] [JUSTID]
```

So, in the example above, I could have used automatic claiming to claim a single message like this:

```
> XAUTOCLAIM mystream mygroup Alice 3600000 0-0 COUNT 1
1) 1526569498055-0
2) 1) 1526569498055-0
   2) 1) "message"
      2) "orange"
```

Like `XCLAIM`, the command replies with an array of the claimed messages, but it also returns a stream ID that allows iterating the pending entries.
The stream ID is a cursor, and I can use it in my next call to continue claiming idle pending messages:

```
> XAUTOCLAIM mystream mygroup Lora 3600000 1526569498055-0 COUNT 1
1) 0-0
2) 1) 1526569506935-0
   2) 1) "message"
      2) "strawberry"
```

When `XAUTOCLAIM` returns the "0-0" stream ID as a cursor, that means that it reached the end of the consumer group pending entries list.
That doesn't mean that there are no new idle pending messages, so the process continues by calling `XAUTOCLAIM` from the beginning of the stream.

## Claiming and the delivery counter

The counter that you observe in the `XPENDING` output is the number of deliveries of each message. The counter is incremented in two ways: when a message is successfully claimed via `XCLAIM` or when an `XREADGROUP` call is used in order to access the history of pending messages.

When there are failures, it is normal that messages will be delivered multiple times, but eventually they usually get processed and acknowledged. However there might be a problem processing some specific message, because it is corrupted or crafted in a way that triggers a bug in the processing code. In such a case what happens is that consumers will continuously fail to process this particular message.
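Because we have the counter of the delivery attempts, we can use that counter to detect entries that are never going to be processable and divert them; this is the *dead letter* pattern described right after the sketch. The following is only a minimal illustration, in the same schematic Ruby style as the consumer example above: the threshold, the `mystream:dead` key name and the array shape of the `XPENDING` reply are illustrative assumptions, and real client libraries shape these replies differently.

```ruby
require 'redis'

r = Redis.new
MAX_DELIVERIES = 10              # illustrative threshold
DEAD_STREAM = 'mystream:dead'    # illustrative dead letter key

# Extended XPENDING reply, schematically one
# [id, consumer, idle-ms, deliveries] entry per pending message.
r.xpending('mystream', 'mygroup', '-', '+', 100).each do |id, _consumer, _idle, deliveries|
  next if deliveries.to_i < MAX_DELIVERIES
  # Copy the poison entry to the dead letter stream, then acknowledge
  # it so that it leaves the pending entries list of the group.
  fields = r.xrange('mystream', id, id)[0][1]
  r.xadd(DEAD_STREAM, fields)
  r.xack('mystream', 'mygroup', id)
end
```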
Once the deliveries counter of an entry reaches a large number of your choosing, it is probably wiser to put such messages in another stream and send a notification to the system administrator. This is basically the way that Redis Streams implements the *dead letter* concept.

## Streams observability

Messaging systems that lack observability are very hard to work with. Not knowing who is consuming messages, what messages are pending, the set of consumer groups active in a given stream, makes everything opaque. For this reason, Redis Streams and consumer groups have different ways to observe what is happening. We already covered `XPENDING`, which allows us to inspect the list of messages that are under processing at a given moment, together with their idle time and number of deliveries.

However we may want to do more than that, and the `XINFO` command is an observability interface that can be used with sub-commands in order to get information about streams or consumer groups.

This command uses subcommands in order to show different information about the status of the stream and its consumer groups. For instance `XINFO STREAM <key>` reports information about the stream itself.

```
> XINFO STREAM mystream
 1) "length"
 2) (integer) 2
 3) "radix-tree-keys"
 4) (integer) 1
 5) "radix-tree-nodes"
 6) (integer) 2
 7) "last-generated-id"
 8) "1638125141232-0"
 9) "max-deleted-entry-id"
10) "0-0"
11) "entries-added"
12) (integer) 2
13) "groups"
14) (integer) 1
15) "first-entry"
16) 1) "1638125133432-0"
    2) 1) "message"
       2) "apple"
17) "last-entry"
18) 1) "1638125141232-0"
    2) 1) "message"
       2) "banana"
```

The output shows information about how the stream is encoded internally, and also shows the first and last message in the stream. Another piece of information available is the number of consumer groups associated with this stream. We can dig further asking for more information about the consumer groups.

```
> XINFO GROUPS mystream
1) 1) "name"
   2) "mygroup"
   3) "consumers"
   4) (integer) 2
   5) "pending"
   6) (integer) 2
   7) "last-delivered-id"
   8) "1638126030001-0"
   9) "entries-read"
  10) (integer) 2
  11) "lag"
  12) (integer) 0
2) 1) "name"
   2) "some-other-group"
   3) "consumers"
   4) (integer) 1
   5) "pending"
   6) (integer) 0
   7) "last-delivered-id"
   8) "1638126028070-0"
   9) "entries-read"
  10) (integer) 1
  11) "lag"
  12) (integer) 1
```

As you can see in this and in the previous output, the `XINFO` command outputs a sequence of field-value items. Because it is an observability command this allows the human user to immediately understand what information is reported, and allows the command to report more information in the future by adding more fields without breaking compatibility with older clients. Other commands that must be more bandwidth efficient, like `XPENDING`, just report the information without the field names.

The output of the example above, where the **GROUPS** subcommand is used, should be clear from the field names. We can check in more detail the state of a specific consumer group by checking the consumers that are registered in the group.
- -``` -> XINFO CONSUMERS mystream mygroup -1) 1) name - 2) "Alice" - 3) pending - 4) (integer) 1 - 5) idle - 6) (integer) 9104628 -2) 1) name - 2) "Bob" - 3) pending - 4) (integer) 1 - 5) idle - 6) (integer) 83841983 -``` - -In case you do not remember the syntax of the command, just ask the command itself for help: - -``` -> XINFO HELP -1) XINFO [ [value] [opt] ...]. Subcommands are: -2) CONSUMERS -3) Show consumers of . -4) GROUPS -5) Show the stream consumer groups. -6) STREAM [FULL [COUNT ] -7) Show information about the stream. -8) HELP -9) Prints this help. -``` - -## Differences with Kafka (TM) partitions - -Consumer groups in Redis streams may resemble in some way Kafka (TM) partitioning-based consumer groups, however note that Redis streams are, in practical terms, very different. The partitions are only *logical* and the messages are just put into a single Redis key, so the way the different clients are served is based on who is ready to process new messages, and not from which partition clients are reading. For instance, if the consumer C3 at some point fails permanently, Redis will continue to serve C1 and C2 all the new messages arriving, as if now there are only two *logical* partitions. - -Similarly, if a given consumer is much faster at processing messages than the other consumers, this consumer will receive proportionally more messages in the same unit of time. This is possible since Redis tracks all the unacknowledged messages explicitly, and remembers who received which message and the ID of the first message never delivered to any consumer. - -However, this also means that in Redis if you really want to partition messages in the same stream into multiple Redis instances, you have to use multiple keys and some sharding system such as Redis Cluster or some other application-specific sharding system. A single Redis stream is not automatically partitioned to multiple instances. - -We could say that schematically the following is true: - -* If you use 1 stream -> 1 consumer, you are processing messages in order. -* If you use N streams with N consumers, so that only a given consumer hits a subset of the N streams, you can scale the above model of 1 stream -> 1 consumer. -* If you use 1 stream -> N consumers, you are load balancing to N consumers, however in that case, messages about the same logical item may be consumed out of order, because a given consumer may process message 3 faster than another consumer is processing message 4. - -So basically Kafka partitions are more similar to using N different Redis keys, while Redis consumer groups are a server-side load balancing system of messages from a given stream to N different consumers. - -## Capped Streams - -Many applications do not want to collect data into a stream forever. Sometimes it is useful to have at maximum a given number of items inside a stream, other times once a given size is reached, it is useful to move data from Redis to a storage which is not in memory and not as fast but suited to store the history for, potentially, decades to come. Redis streams have some support for this. One is the **MAXLEN** option of the `XADD` command. 
This option is very simple to use:

```
> XADD mystream MAXLEN 2 * value 1
1526654998691-0
> XADD mystream MAXLEN 2 * value 2
1526654999635-0
> XADD mystream MAXLEN 2 * value 3
1526655000369-0
> XLEN mystream
(integer) 2
> XRANGE mystream - +
1) 1) 1526654999635-0
   2) 1) "value"
      2) "2"
2) 1) 1526655000369-0
   2) 1) "value"
      2) "3"
```

Using **MAXLEN** the old entries are automatically evicted when the specified length is reached, so that the stream is left at a constant size. There is currently no option to tell the stream to just retain items that are not older than a given period, because such a command, in order to run consistently, would potentially block for a long time in order to evict items. Imagine for example what happens if there is an insertion spike, then a long pause, and another insertion, all with the same maximum time. The stream would block to evict the data that became too old during the pause. So it is up to the user to do some planning and understand what the desired maximum stream length is. Moreover, while the length of the stream is proportional to the memory used, trimming by time is less simple to control and anticipate: it depends on the insertion rate, which often changes over time (and when it does not change, then trimming by size is trivial).

However trimming with **MAXLEN** can be expensive: streams are represented by macro nodes in a radix tree, in order to be very memory efficient. Altering a single macro node, consisting of a few tens of elements, is not optimal. So it's possible to use the command in the following special form:

```
XADD mystream MAXLEN ~ 1000 * ... entry fields here ...
```

The `~` argument between the **MAXLEN** option and the actual count means: I don't really need this to be exactly 1000 items. It can be 1000 or 1010 or 1030, just make sure to save at least 1000 items. With this argument, the trimming is performed only when we can remove a whole node. This makes it much more efficient, and it is usually what you want.

There is also the `XTRIM` command, which performs something very similar to what the **MAXLEN** option does above, except that it can be run by itself:

```
> XTRIM mystream MAXLEN 10
```

Or, as for the `XADD` option:

```
> XTRIM mystream MAXLEN ~ 10
```

However, `XTRIM` is designed to accept different trimming strategies. Another trimming strategy is **MINID**, which evicts entries with IDs lower than the one specified.

As `XTRIM` is an explicit command, the user is expected to know about the possible shortcomings of different trimming strategies.

Another useful eviction strategy that may be added to `XTRIM` in the future, is to remove by a range of IDs to ease use of `XRANGE` and `XTRIM` to move data from Redis to other storage systems if needed.

## Special IDs in the streams API

You may have noticed that there are several special IDs that can be used in the Redis API. Here is a short recap, so that they can make more sense in the future.

The first two special IDs are `-` and `+`, and are used in range queries with the `XRANGE` command. Those two IDs respectively mean the smallest ID possible (that is basically `0-1`) and the greatest ID possible (that is `18446744073709551615-18446744073709551615`). As you can see it is a lot cleaner to write `-` and `+` instead of those numbers.

Then there are APIs where we want to say, the ID of the item with the greatest ID inside the stream. This is what `$` means.
So for instance if I want only new entries with `XREAD` I use this ID to signify I already have all the existing entries, but not the new ones that will be inserted in the future. Similarly when I create or set the ID of a consumer group, I can set the last delivered item to `$` in order to just deliver new entries to the consumers in the group.

As you can see `$` does not mean `+`, they are two different things, as `+` is the greatest ID possible in every possible stream, while `$` is the greatest ID in a given stream containing given entries. Moreover APIs will usually only understand `+` or `$`, yet it was useful to avoid loading a given symbol with multiple meanings.

Another special ID is `>`, which has a special meaning only in the context of consumer groups and only when the `XREADGROUP` command is used. This special ID means that we want only entries that were never delivered to other consumers so far. So basically the `>` ID is the *last delivered ID* of a consumer group.

Finally the special ID `*`, which can be used only with the `XADD` command, means to auto select an ID for us for the new entry.

So we have `-`, `+`, `$`, `>` and `*`, and all have a different meaning, and most of the time, can be used in different contexts.

## Persistence, replication and message safety

A Stream, like any other Redis data structure, is asynchronously replicated to replicas and persisted into AOF and RDB files. However what may not be so obvious is that the consumer groups' full state is also propagated to AOF, RDB and replicas, so if a message is pending in the master, the replica will also have the same information. Similarly, after a restart, the AOF will restore the consumer groups' state.

However note that Redis streams and consumer groups are persisted and replicated using the Redis default replication, so:

* AOF must be used with a strong fsync policy if persistence of messages is important in your application.
* By default the asynchronous replication will not guarantee that `XADD` commands or consumer groups state changes are replicated: after a failover something can be missing depending on the ability of replicas to receive the data from the master.
* The `WAIT` command may be used in order to force the propagation of the changes to a set of replicas. However note that while this makes it very unlikely that data is lost, the Redis failover process as operated by Sentinel or Redis Cluster performs only a *best effort* check to failover to the replica which is the most updated, and under certain specific failure conditions may promote a replica that lacks some data.

So when designing an application using Redis streams and consumer groups, make sure to understand the semantic properties your application should have during failures, and configure things accordingly, evaluating whether it is safe enough for your use case.

## Removing single items from a stream

Streams also have a special command for removing items from the middle of a stream, just by ID. Normally for an append-only data structure this may look like an odd feature, but it is actually useful for applications involving, for instance, privacy regulations.
The command is called `XDEL` and receives the name of the stream followed by the IDs to delete:

```
> XRANGE mystream - + COUNT 2
1) 1) 1526654999635-0
   2) 1) "value"
      2) "2"
2) 1) 1526655000369-0
   2) 1) "value"
      2) "3"
> XDEL mystream 1526654999635-0
(integer) 1
> XRANGE mystream - + COUNT 2
1) 1) 1526655000369-0
   2) 1) "value"
      2) "3"
```

However in the current implementation, memory is not really reclaimed until a macro node is completely empty, so you should not abuse this feature.

## Zero length streams

A difference between streams and other Redis data structures is that when the other data structures no longer have any elements, as a side effect of calling commands that remove elements, the key itself will be removed. So for instance, a sorted set will be completely removed when a call to `ZREM` removes the last element in the sorted set. Streams, on the other hand, are allowed to stay at zero elements, both as a result of using a **MAXLEN** option with a count of zero (`XADD` and `XTRIM` commands), or because `XDEL` was called.

The reason why such an asymmetry exists is because Streams may have associated consumer groups, and we do not want to lose the state that the consumer groups defined just because there are no longer any items in the stream. Currently the stream is not deleted even when it has no associated consumer groups.

## Total latency of consuming a message

Non blocking stream commands like `XRANGE` and `XREAD` or `XREADGROUP` without the BLOCK option are served synchronously like any other Redis command, so discussing the latency of such commands is meaningless: it is more interesting to check the time complexity of the commands in the Redis documentation. It should be enough to say that stream commands are at least as fast as sorted set commands when extracting ranges, and that `XADD` is very fast and can easily insert from half a million to one million items per second on an average machine if pipelining is used.

However latency becomes an interesting parameter if we want to understand the delay of processing a message, in the context of blocking consumers in a consumer group, from the moment the message is produced via `XADD`, to the moment the message is obtained by the consumer because `XREADGROUP` returned with the message.

## How serving blocked consumers works

Before providing the results of performed tests, it is interesting to understand what model Redis uses in order to route stream messages (and in general actually how any blocking operation waiting for data is managed).

* The blocked client is referenced in a hash table that maps keys for which there is at least one blocking consumer, to a list of consumers that are waiting for such key. This way, given a key that received data, we can resolve all the clients that are waiting for such data.
* When a write happens, in this case when the `XADD` command is called, it calls the `signalKeyAsReady()` function. This function will put the key into a list of keys that need to be processed, because such keys may have new data for blocked consumers. Note that such *ready keys* will be processed later, so in the course of the same event loop cycle, it is possible that the key will receive other writes.
* Finally, before returning into the event loop, the *ready keys* are finally processed. For each key the list of clients waiting for data is scanned, and if applicable, such clients will receive the new data that arrived.
In the case of streams the data is the messages in the applicable range requested by the consumer.

As you can see, basically, before returning to the event loop both the client calling `XADD` and the clients blocked to consume messages, will have their reply in the output buffers, so the caller of `XADD` should receive the reply from Redis at about the same time the consumers will receive the new messages.

This model is *push-based*, since adding data to the consumers buffers will be performed directly by the action of calling `XADD`, so the latency tends to be quite predictable.

## Latency tests results

In order to check these latency characteristics a test was performed using multiple instances of Ruby programs pushing messages having as an additional field the computer millisecond time, and Ruby programs reading the messages from the consumer group and processing them. The message processing step consisted of comparing the current computer time with the message timestamp, in order to understand the total latency.

Results obtained:

```
Processed between 0 and 1 ms -> 74.11%
Processed between 1 and 2 ms -> 25.80%
Processed between 2 and 3 ms -> 0.06%
Processed between 3 and 4 ms -> 0.01%
Processed between 4 and 5 ms -> 0.02%
```

So 99.9% of requests have a latency <= 2 milliseconds, with the outliers that remain still very close to the average.

Adding a few million unacknowledged messages to the stream does not change the gist of the benchmark, with most queries still processed with very short latency.

A few remarks:

* Here we processed up to 10k messages per iteration, this means that the `COUNT` parameter of `XREADGROUP` was set to 10000. This adds a lot of latency but is needed in order to allow the slow consumers to be able to keep up with the message flow. So you can expect a real world latency that is a lot smaller.
* The system used for this benchmark is very slow compared to today's standards.

diff --git a/docs/data-types/streams.md b/docs/data-types/streams.md
index f55469df06..5651ea2456 100644
--- a/docs/data-types/streams.md
+++ b/docs/data-types/streams.md
@@ -4,9 +4,12 @@ linkTitle: "Streams"
 weight: 60
 description: >
     Introduction to Redis streams
+aliases:
+  - /topics/streams-intro
+  - /docs/manual/data-types/streams
 ---
 
-A Redis stream is a data structure that acts like an append-only log.
+A Redis stream is a data structure that acts like an append-only log but also implements several operations to overcome some of the limits of a typical append-only log. These include random access in O(1) time and complex consumption strategies, such as consumer groups.
 
 You can use streams to record and simultaneously syndicate events in real time. Examples of Redis stream use cases include:
 
@@ -19,6 +22,15 @@ You can use these IDs to retrieve their associated entries later or to read and
 Redis streams support several trimming strategies (to prevent streams from growing unbounded) and more than one consumption strategy (see `XREAD`, `XREADGROUP`, and `XRANGE`).
 
+## Basic commands
+* `XADD` adds a new entry to a stream.
+* `XREAD` reads one or more entries, starting at a given position and moving forward in time.
+* `XRANGE` returns a range of entries between two supplied entry IDs.
+* `XLEN` returns the length of a stream.
+
+See the [complete list of stream commands](https://redis.io/commands/?group=stream).
+
+
 ## Examples
 
 * Add several temperature readings to a stream
 
@@ -56,14 +68,6 @@ Redis streams support several trimming strategies (to prevent streams from growi
 (nil)
 ```
 
-## Basic commands
-* `XADD` adds a new entry to a stream.
-* `XREAD` reads one or more entries, starting at a given position and moving forward in time.
-* `XRANGE` returns a range of entries between two supplied entry IDs.
-* `XLEN` returns the length of a stream.
-
-See the [complete list of stream commands](https://redis.io/commands/?group=stream).
-
 ## Performance
 
 Adding an entry to a stream is O(1).
 For details on why, note that streams are implemented as [radix trees](https://e
 Simply put, Redis streams provide highly efficient inserts and reads. See each command's time complexity for the details.
+
+## Streams basics
+
+Streams are an append-only data structure. The fundamental write command, called [XADD](/commands/xadd), appends a new entry to the specified stream.
+
+Each stream entry consists of one or more field-value pairs, somewhat like a record or a Redis hash:
+
+```
+> XADD mystream * sensor-id 1234 temperature 19.8
+1518951480106-0
+```
+
+The above call to the `XADD` command adds an entry `sensor-id: 1234, temperature: 19.8` to the stream at key `mystream`, using an auto-generated entry ID, which is the one returned by the command, specifically `1518951480106-0`. It gets as its first argument the key name `mystream`; the second argument is the entry ID that identifies every entry inside a stream. However, in this case, we passed `*` because we want the server to generate a new ID for us. Every new ID will be monotonically increasing, so in simpler terms, every new entry added will have a higher ID compared to all the past entries. Auto-generation of IDs by the server is almost always what you want, and the reasons for specifying an ID explicitly are very rare. We'll talk more about this later. The fact that each Stream entry has an ID is another similarity with log files, where line numbers, or the byte offset inside the file, can be used in order to identify a given entry. Returning to our `XADD` example, after the key name and ID, the next arguments are the field-value pairs composing our stream entry.
+
+It is possible to get the number of items inside a Stream just using the `XLEN` command:
+
+```
+> XLEN mystream
+(integer) 1
+```
+
+### Entry IDs
+
+The entry ID returned by the `XADD` command, which uniquely identifies each entry inside a given stream, is composed of two parts:
+
+```
+<millisecondsTime>-<sequenceNumber>
+```
+
+The milliseconds time part is actually the local time in the local Redis node generating the stream ID, however if the current milliseconds time happens to be smaller than the previous entry time, then the previous entry time is used instead, so if a clock jumps backward the monotonically incrementing ID property still holds. The sequence number is used for entries created in the same millisecond. Since the sequence number is 64 bit wide, in practical terms there is no limit to the number of entries that can be generated within the same millisecond.
+
+The format of such IDs may look strange at first, and the gentle reader may wonder why the time is part of the ID. The reason is that Redis streams support range queries by ID. Because the ID is related to the time the entry is generated, this gives the ability to query for time ranges basically for free. We will see this soon while covering the `XRANGE` command.
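+
+Because the milliseconds part of the ID is a plain Unix time, you can also recover the wall clock time of an entry from its ID alone. A minimal sketch in Ruby, using the ID generated by the `XADD` call above:
+
+```ruby
+id = "1518951480106-0"               # an entry ID returned by XADD
+ms, seq = id.split("-").map(&:to_i)  # milliseconds part, sequence part
+puts Time.at(ms / 1000.0).utc        # => 2018-02-18 10:58:00 UTC
+```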
+ +If for some reason the user needs incremental IDs that are not related to time but are actually associated to another external system ID, as previously mentioned, the `XADD` command can take an explicit ID instead of the `*` wildcard ID that triggers auto-generation, like in the following examples: + +``` +> XADD somestream 0-1 field value +0-1 +> XADD somestream 0-2 foo bar +0-2 +``` + +Note that in this case, the minimum ID is 0-1 and that the command will not accept an ID equal or smaller than a previous one: + +``` +> XADD somestream 0-1 foo bar +(error) ERR The ID specified in XADD is equal or smaller than the target stream top item +``` + +If you're running Redis 7 or later, you can also provide an explicit ID consisting of the milliseconds part only. In this case, the sequence portion of the ID will be automatically generated. To do this, use the syntax below: + +``` +> XADD somestream 0-* baz qux +0-3 +``` + +## Getting data from Streams + +Now we are finally able to append entries in our stream via `XADD`. However, while appending data to a stream is quite obvious, the way streams can be queried in order to extract data is not so obvious. If we continue with the analogy of the log file, one obvious way is to mimic what we normally do with the Unix command `tail -f`, that is, we may start to listen in order to get the new messages that are appended to the stream. Note that unlike the blocking list operations of Redis, where a given element will reach a single client which is blocking in a *pop style* operation like `BLPOP`, with streams we want multiple consumers to see the new messages appended to the stream (the same way many `tail -f` processes can see what is added to a log). Using the traditional terminology we want the streams to be able to *fan out* messages to multiple clients. + +However, this is just one potential access mode. We could also see a stream in quite a different way: not as a messaging system, but as a *time series store*. In this case, maybe it's also useful to get the new messages appended, but another natural query mode is to get messages by ranges of time, or alternatively to iterate the messages using a cursor to incrementally check all the history. This is definitely another useful access mode. + +Finally, if we see a stream from the point of view of consumers, we may want to access the stream in yet another way, that is, as a stream of messages that can be partitioned to multiple consumers that are processing such messages, so that groups of consumers can only see a subset of the messages arriving in a single stream. In this way, it is possible to scale the message processing across different consumers, without single consumers having to process all the messages: each consumer will just get different messages to process. This is basically what Kafka (TM) does with consumer groups. Reading messages via consumer groups is yet another interesting mode of reading from a Redis Stream. + +Redis Streams support all three of the query modes described above via different commands. The next sections will show them all, starting from the simplest and most direct to use: range queries. + +### Querying by range: XRANGE and XREVRANGE + +To query the stream by range we are only required to specify two IDs, *start* and *end*. The range returned will include the elements having start or end as ID, so the range is inclusive. The two special IDs `-` and `+` respectively mean the smallest and the greatest ID possible. 
+
+```
+> XRANGE mystream - +
+1) 1) 1518951480106-0
+   2) 1) "sensor-id"
+      2) "1234"
+      3) "temperature"
+      4) "19.8"
+2) 1) 1518951482479-0
+   2) 1) "sensor-id"
+      2) "9999"
+      3) "temperature"
+      4) "18.2"
+```
+
+Each entry returned is an array of two items: the ID and the list of field-value pairs. We already said that the entry IDs have a relation with the time, because the part at the left of the `-` character is the Unix time in milliseconds of the local node that created the stream entry, at the moment the entry was created (however note that streams are replicated with fully specified `XADD` commands, so the replicas will have identical IDs to the master). This means that I could query a range of time using `XRANGE`. In order to do so, however, I may want to omit the sequence part of the ID: if omitted, in the start of the range it will be assumed to be 0, while in the end part it will be assumed to be the maximum sequence number available. This way, querying using just two milliseconds Unix times, we get all the entries that were generated in that range of time, in an inclusive way. For instance, if I want to query a two milliseconds period I could use:
+
+```
+> XRANGE mystream 1518951480106 1518951480107
+1) 1) 1518951480106-0
+   2) 1) "sensor-id"
+      2) "1234"
+      3) "temperature"
+      4) "19.8"
+```
+
+I have only a single entry in this range, however in real data sets, I could query for ranges of hours, or there could be many items in just two milliseconds, and the result returned could be huge. For this reason, `XRANGE` supports an optional **COUNT** option at the end. By specifying a count, I can just get the first *N* items. If I want more, I can get the last ID returned, increment the sequence part by one, and query again. Let's see this in the following example. We start adding 10 items with `XADD` (I won't show that, let's assume that the stream `mystream` was populated with 10 items). To start my iteration, getting 2 items per command, I start with the full range, but with a count of 2.
+
+```
+> XRANGE mystream - + COUNT 2
+1) 1) 1519073278252-0
+   2) 1) "foo"
+      2) "value_1"
+2) 1) 1519073279157-0
+   2) 1) "foo"
+      2) "value_2"
+```
+
+In order to continue the iteration with the next two items, I have to pick the last ID returned, that is `1519073279157-0` and add the prefix `(` to it. The resulting exclusive range interval, that is `(1519073279157-0` in this case, can now be used as the new *start* argument for the next `XRANGE` call:
+
+```
+> XRANGE mystream (1519073279157-0 + COUNT 2
+1) 1) 1519073280281-0
+   2) 1) "foo"
+      2) "value_3"
+2) 1) 1519073281432-0
+   2) 1) "foo"
+      2) "value_4"
+```
+
+And so forth. Since `XRANGE` complexity is *O(log(N))* to seek, and then *O(M)* to return M elements, with a small count the command has a logarithmic time complexity, which means that each step of the iteration is fast. So `XRANGE` is also the de facto *streams iterator* and does not require an **XSCAN** command.
+
+The command `XREVRANGE` is the equivalent of `XRANGE` but returning the elements in inverted order, so a practical use for `XREVRANGE` is to check what is the last item in a Stream:
+
+```
+> XREVRANGE mystream + - COUNT 1
+1) 1) 1519073287312-0
+   2) 1) "foo"
+      2) "value_10"
+```
+
+Note that the `XREVRANGE` command takes the *start* and *stop* arguments in reverse order.
+
+## Listening for new items with XREAD
+
+When we do not want to access items by a range in a stream, usually what we want instead is to *subscribe* to new items arriving to the stream.
This concept may appear related to Redis Pub/Sub, where you subscribe to a channel, or to Redis blocking lists, where you wait for a key to get new elements to fetch, but there are fundamental differences in the way you consume a stream: + +1. A stream can have multiple clients (consumers) waiting for data. Every new item, by default, will be delivered to *every consumer* that is waiting for data in a given stream. This behavior is different than blocking lists, where each consumer will get a different element. However, the ability to *fan out* to multiple consumers is similar to Pub/Sub. +2. While in Pub/Sub messages are *fire and forget* and are never stored anyway, and while when using blocking lists, when a message is received by the client it is *popped* (effectively removed) from the list, streams work in a fundamentally different way. All the messages are appended in the stream indefinitely (unless the user explicitly asks to delete entries): different consumers will know what is a new message from its point of view by remembering the ID of the last message received. +3. Streams Consumer Groups provide a level of control that Pub/Sub or blocking lists cannot achieve, with different groups for the same stream, explicit acknowledgment of processed items, ability to inspect the pending items, claiming of unprocessed messages, and coherent history visibility for each single client, that is only able to see its private past history of messages. + +The command that provides the ability to listen for new messages arriving into a stream is called `XREAD`. It's a bit more complex than `XRANGE`, so we'll start showing simple forms, and later the whole command layout will be provided. + +``` +> XREAD COUNT 2 STREAMS mystream 0 +1) 1) "mystream" + 2) 1) 1) 1519073278252-0 + 2) 1) "foo" + 2) "value_1" + 2) 1) 1519073279157-0 + 2) 1) "foo" + 2) "value_2" +``` + +The above is the non-blocking form of `XREAD`. Note that the **COUNT** option is not mandatory, in fact the only mandatory option of the command is the **STREAMS** option, that specifies a list of keys together with the corresponding maximum ID already seen for each stream by the calling consumer, so that the command will provide the client only with messages with an ID greater than the one we specified. + +In the above command we wrote `STREAMS mystream 0` so we want all the messages in the Stream `mystream` having an ID greater than `0-0`. As you can see in the example above, the command returns the key name, because actually it is possible to call this command with more than one key to read from different streams at the same time. I could write, for instance: `STREAMS mystream otherstream 0 0`. Note how after the **STREAMS** option we need to provide the key names, and later the IDs. For this reason, the **STREAMS** option must always be the last option. +Any other options must come before the **STREAMS** option. + +Apart from the fact that `XREAD` can access multiple streams at once, and that we are able to specify the last ID we own to just get newer messages, in this simple form the command is not doing something so different compared to `XRANGE`. However, the interesting part is that we can turn `XREAD` into a *blocking command* easily, by specifying the **BLOCK** argument: + +``` +> XREAD BLOCK 0 STREAMS mystream $ +``` + +Note that in the example above, other than removing **COUNT**, I specified the new **BLOCK** option with a timeout of 0 milliseconds (that means to never timeout). 
Moreover, instead of passing a normal ID for the stream `mystream` I passed the special ID `$`. This special ID means that `XREAD` should use as last ID the maximum ID already stored in the stream `mystream`, so that we will receive only *new* messages, starting from the time we started listening. This is similar to the `tail -f` Unix command in some way. + +Note that when the **BLOCK** option is used, we do not have to use the special ID `$`. We can use any valid ID. If the command is able to serve our request immediately without blocking, it will do so, otherwise it will block. Normally if we want to consume the stream starting from new entries, we start with the ID `$`, and after that we continue using the ID of the last message received to make the next call, and so forth. + +The blocking form of `XREAD` is also able to listen to multiple Streams, just by specifying multiple key names. If the request can be served synchronously because there is at least one stream with elements greater than the corresponding ID we specified, it returns with the results. Otherwise, the command will block and will return the items of the first stream which gets new data (according to the specified ID). + +Similarly to blocking list operations, blocking stream reads are *fair* from the point of view of clients waiting for data, since the semantics is FIFO style. The first client that blocked for a given stream will be the first to be unblocked when new items are available. + +`XREAD` has no other options than **COUNT** and **BLOCK**, so it's a pretty basic command with a specific purpose to attach consumers to one or multiple streams. More powerful features to consume streams are available using the consumer groups API, however reading via consumer groups is implemented by a different command called `XREADGROUP`, covered in the next section of this guide. + +## Consumer groups + +When the task at hand is to consume the same stream from different clients, then `XREAD` already offers a way to *fan-out* to N clients, potentially also using replicas in order to provide more read scalability. However in certain problems what we want to do is not to provide the same stream of messages to many clients, but to provide a *different subset* of messages from the same stream to many clients. An obvious case where this is useful is that of messages which are slow to process: the ability to have N different workers that will receive different parts of the stream allows us to scale message processing, by routing different messages to different workers that are ready to do more work. + +In practical terms, if we imagine having three consumers C1, C2, C3, and a stream that contains the messages 1, 2, 3, 4, 5, 6, 7 then what we want is to serve the messages according to the following diagram: + +``` +1 -> C1 +2 -> C2 +3 -> C3 +4 -> C1 +5 -> C2 +6 -> C3 +7 -> C1 +``` + +In order to achieve this, Redis uses a concept called *consumer groups*. It is very important to understand that Redis consumer groups have nothing to do, from an implementation standpoint, with Kafka (TM) consumer groups. Yet they are similar in functionality, so I decided to keep Kafka's (TM) terminology, as it originally popularized this idea. + +A consumer group is like a *pseudo consumer* that gets data from a stream, and actually serves multiple consumers, providing certain guarantees: + +1. Each message is served to a different consumer so that it is not possible that the same message will be delivered to multiple consumers. +2. 
Consumers are identified, within a consumer group, by a name, which is a case-sensitive string that the clients implementing consumers must choose. This means that even after a disconnect, the stream consumer group retains all the state, since the client will claim again to be the same consumer. However, this also means that it is up to the client to provide a unique identifier.
+3. Each consumer group has the concept of the *first ID never consumed* so that, when a consumer asks for new messages, it can provide just messages that were not previously delivered.
+4. Consuming a message, however, requires an explicit acknowledgment using a specific command. Redis interprets the acknowledgment as: this message was correctly processed so it can be evicted from the consumer group.
+5. A consumer group tracks all the messages that are currently pending, that is, messages that were delivered to some consumer of the consumer group, but are yet to be acknowledged as processed. Thanks to this feature, when accessing the message history of a stream, each consumer *will only see messages that were delivered to it*.
+
+In a way, a consumer group can be imagined as some *amount of state* about a stream:
+
+```
++----------------------------------------+
+| consumer_group_name: mygroup           |
+| consumer_group_stream: somekey         |
+| last_delivered_id: 1292309234234-92    |
+|                                        |
+| consumers:                             |
+|    "consumer-1" with pending messages  |
+|       1292309234234-4                  |
+|       1292309234232-8                  |
+|    "consumer-42" with pending messages |
+|       ... (and so forth)               |
++----------------------------------------+
+```
+
+Seen from this point of view, it is very simple to understand what a consumer group can do, how it is able to just provide consumers with their history of pending messages, and how consumers asking for new messages will just be served with message IDs greater than `last_delivered_id`. At the same time, if you look at the consumer group as an auxiliary data structure for Redis streams, it is obvious that a single stream can have multiple consumer groups, each with a different set of consumers. Actually, it is even possible for the same stream to have clients reading without consumer groups via `XREAD`, and clients reading via `XREADGROUP` in different consumer groups.
+
+Now it's time to zoom in to see the fundamental consumer group commands. They are the following:
+
+* `XGROUP` is used in order to create, destroy and manage consumer groups.
+* `XREADGROUP` is used to read from a stream via a consumer group.
+* `XACK` is the command that allows a consumer to mark a pending message as correctly processed.
+
+## Creating a consumer group
+
+Assuming I have a key `mystream` of type stream already existing, in order to create a consumer group I just need to do the following:
+
+```
+> XGROUP CREATE mystream mygroup $
+OK
+```
+
+As you can see in the command above when creating the consumer group we have to specify an ID, which in the example is just `$`. This is needed because the consumer group, among its other state, must know what message to serve next when the first consumer connects, that is, what was the *last message ID* when the group was just created. If we provide `$` as we did, then only new messages arriving in the stream from now on will be provided to the consumers in the group. If we specify `0` instead the consumer group will consume *all* the messages in the stream history to start with. Of course, you can specify any other valid ID.
What you know is that the consumer group will start delivering messages that are greater than the ID you specify. Because `$` means the current greatest ID in the stream, specifying `$` will have the effect of consuming only new messages.
+
+`XGROUP CREATE` also supports creating the stream automatically, if it doesn't exist, using the optional `MKSTREAM` subcommand as the last argument:
+
+```
+> XGROUP CREATE newstream mygroup $ MKSTREAM
+OK
+```
+
+Now that the consumer group is created we can immediately try to read messages via the consumer group using the `XREADGROUP` command. We'll read from two consumers, which we will call Alice and Bob, to see how the system will return different messages to each of them.
+
+`XREADGROUP` is very similar to `XREAD` and provides the same **BLOCK** option, otherwise it is a synchronous command. However there is a *mandatory* option that must always be specified, which is **GROUP** and has two arguments: the name of the consumer group, and the name of the consumer that is attempting to read. The option **COUNT** is also supported and is identical to the one in `XREAD`.
+
+Before reading from the stream, let's put some messages inside:
+
+```
+> XADD mystream * message apple
+1526569495631-0
+> XADD mystream * message orange
+1526569498055-0
+> XADD mystream * message strawberry
+1526569506935-0
+> XADD mystream * message apricot
+1526569535168-0
+> XADD mystream * message banana
+1526569544280-0
+```
+
+Note: *here message is the field name, and the fruit is the associated value; remember that stream items are small dictionaries.*
+
+It is time to try reading something using the consumer group:
+
+```
+> XREADGROUP GROUP mygroup Alice COUNT 1 STREAMS mystream >
+1) 1) "mystream"
+   2) 1) 1) 1526569495631-0
+         2) 1) "message"
+            2) "apple"
+```
+
+`XREADGROUP` replies are just like `XREAD` replies. Note however the `GROUP <group-name> <consumer-name>` option provided above. It states that I want to read from the stream using the consumer group `mygroup` and I'm the consumer `Alice`. Every time a consumer performs an operation with a consumer group, it must specify its name, uniquely identifying this consumer inside the group.
+
+There is another very important detail in the command line above: after the mandatory **STREAMS** option, the ID requested for the key `mystream` is the special ID `>`. This special ID is only valid in the context of consumer groups, and it means: **messages never delivered to other consumers so far**.
+
+This is almost always what you want. However, it is also possible to specify a real ID, such as `0` or any other valid ID. In this case we are asking `XREADGROUP` to just provide us with the **history of pending messages**, and we will never see new messages in the group. So basically `XREADGROUP` has the following behavior based on the ID we specify:
+
+* If the ID is the special ID `>` then the command will return only new messages never delivered to other consumers so far, and as a side effect, will update the consumer group's *last ID*.
+* If the ID is any other valid numerical ID, then the command will let us access our *history of pending messages*. That is, the set of messages that were delivered to this specified consumer (identified by the provided name), and never acknowledged so far with `XACK`.
+
+We can test this behavior immediately specifying an ID of 0, without any **COUNT** option: we'll just see the only pending message, that is, the one about apples:
+
+```
+> XREADGROUP GROUP mygroup Alice STREAMS mystream 0
+1) 1) "mystream"
+   2) 1) 1) 1526569495631-0
+         2) 1) "message"
+            2) "apple"
+```
+
+However, if we acknowledge the message as processed, it will no longer be part of the pending messages history, so the system will no longer report anything:
+
+```
+> XACK mystream mygroup 1526569495631-0
+(integer) 1
+> XREADGROUP GROUP mygroup Alice STREAMS mystream 0
+1) 1) "mystream"
+   2) (empty list or set)
+```
+
+Don't worry if you don't know how `XACK` works yet; the idea is just that processed messages are no longer part of the history that we can access.
+
+Now it's Bob's turn to read something:
+
+```
+> XREADGROUP GROUP mygroup Bob COUNT 2 STREAMS mystream >
+1) 1) "mystream"
+   2) 1) 1) 1526569498055-0
+         2) 1) "message"
+            2) "orange"
+      2) 1) 1526569506935-0
+         2) 1) "message"
+            2) "strawberry"
+```
+
+Bob asked for a maximum of two messages and is reading via the same group `mygroup`. So what happens is that Redis reports just *new* messages. As you can see the "apple" message is not delivered, since it was already delivered to Alice, so Bob gets orange and strawberry, and so forth.
+
+This way Alice, Bob, and any other consumer in the group, are able to read different messages from the same stream, to read their history of messages yet to be processed, or to mark messages as processed. This allows creating different topologies and semantics for consuming messages from a stream.
+
+There are a few things to keep in mind:
+
+* Consumers are auto-created the first time they are mentioned, no need for explicit creation.
+* Even with `XREADGROUP` you can read from multiple keys at the same time, however for this to work, you need to create a consumer group with the same name in every stream. This is not a common need, but it is worth mentioning that the feature is technically available.
+* `XREADGROUP` is a *write command* because even if it reads from the stream, the consumer group is modified as a side effect of reading, so it can only be called on master instances.
+
+An example of a consumer implementation, using consumer groups, written in the Ruby language could be the following. The Ruby code is aimed to be readable by virtually any experienced programmer, even if they do not know Ruby:
+
+```ruby
+require 'redis'
+
+if ARGV.length == 0
+    puts "Please specify a consumer name"
+    exit 1
+end
+
+ConsumerName = ARGV[0]
+GroupName = "mygroup"
+r = Redis.new
+
+def process_message(id,msg)
+    puts "[#{ConsumerName}] #{id} = #{msg.inspect}"
+end
+
+$lastid = '0-0'
+
+puts "Consumer #{ConsumerName} starting..."
+check_backlog = true
+while true
+    # Pick the ID based on the iteration: the first time we want to
+    # read our pending messages, in case we crashed and are recovering.
+    # Once we consumed our history, we can start getting new messages.
+    if check_backlog
+        myid = $lastid
+    else
+        myid = '>'
+    end
+
+    items = r.xreadgroup('GROUP',GroupName,ConsumerName,'BLOCK','2000','COUNT','10','STREAMS',:my_stream_key,myid)
+
+    if items == nil
+        puts "Timeout!"
+        next
+    end
+
+    # If we receive an empty reply, it means we were consuming our history
+    # and that the history is now empty. Let's start to consume new messages.
+    check_backlog = false if items[0][1].length == 0
+
+    items[0][1].each{|i|
+        id,fields = i
+
+        # Process the message
+        process_message(id,fields)
+
+        # Acknowledge the message as processed
+        r.xack(:my_stream_key,GroupName,id)
+
+        $lastid = id
+    }
+end
+```
+
+As you can see the idea here is to start by consuming the history, that is, our list of pending messages. This is useful because the consumer may have crashed before, so in the event of a restart we want to re-read messages that were delivered to us without getting acknowledged. Note that we might process a message multiple times or one time (at least in the case of consumer failures, but there are also the limits of Redis persistence and replication involved, see the specific section about this topic).
+
+Once the history has been consumed, and we get an empty list of messages, we can switch to using the `>` special ID in order to consume new messages.
+
+## Recovering from permanent failures
+
+The example above allows us to write consumers that participate in the same consumer group, each taking a subset of messages to process, and when recovering from failures re-reading the pending messages that were delivered just to them. However in the real world consumers may permanently fail and never recover. What happens to the pending messages of the consumer that never recovers after stopping for any reason?
+
+Redis consumer groups offer a feature that is used in these situations in order to *claim* the pending messages of a given consumer so that such messages will change ownership and will be re-assigned to a different consumer. The feature is very explicit. A consumer has to inspect the list of pending messages, and will have to claim specific messages using a special command, otherwise the server will leave the messages pending forever and assigned to the old consumer. In this way different applications can choose whether to use such a feature or not, and exactly how to use it.
+
+The first step of this process is just a command that provides observability of pending entries in the consumer group and is called `XPENDING`.
+This is a read-only command which is always safe to call and will not change ownership of any message.
+In its simplest form, the command is called with two arguments, which are the name of the stream and the name of the consumer group.
+
+```
+> XPENDING mystream mygroup
+1) (integer) 2
+2) 1526569498055-0
+3) 1526569506935-0
+4) 1) 1) "Bob"
+      2) "2"
+```
+
+When called in this way, the command outputs the total number of pending messages in the consumer group (two in this case), the lowest and highest message IDs among the pending messages, and finally a list of consumers and the number of pending messages they have.
+We have only Bob with two pending messages because the single message that Alice requested was acknowledged using `XACK`.
+
+We can ask for more information by giving more arguments to `XPENDING`, because the full command signature is the following:
+
+```
+XPENDING <key> <groupname> [[IDLE <min-idle-time>] <start-id> <end-id> <count> [<consumer-name>]]
+```
+
+By providing a start and end ID (that can be just `-` and `+` as in `XRANGE`) and a count to control the amount of information returned by the command, we are able to know more about the pending messages. The optional final argument, the consumer name, is used if we want to limit the output to just messages pending for a given consumer, but we won't use this feature in the following example.
+
+```
+> XPENDING mystream mygroup - + 10
+1) 1) 1526569498055-0
+   2) "Bob"
+   3) (integer) 74170458
+   4) (integer) 1
+2) 1) 1526569506935-0
+   2) "Bob"
+   3) (integer) 74170458
+   4) (integer) 1
+```
+
+Now we have the details for each message: the ID, the consumer name, the *idle time* in milliseconds, which is how many milliseconds have passed since the last time the message was delivered to some consumer, and finally the number of times that a given message was delivered.
+We have two messages from Bob, and they are idle for 74170458 milliseconds, about 20 hours.
+
+Note that nobody prevents us from checking what the first message content was by just using `XRANGE`.
+
+```
+> XRANGE mystream 1526569498055-0 1526569498055-0
+1) 1) 1526569498055-0
+   2) 1) "message"
+      2) "orange"
+```
+
+We just have to repeat the same ID twice in the arguments. Now that we have some ideas, Alice may decide that after 20 hours of not processing messages, Bob will probably not recover in time, and it's time to *claim* such messages and resume the processing in place of Bob. To do so, we use the `XCLAIM` command.
+
+This command is very complex and full of options in its full form, since it is used for replication of consumer group changes, but we'll use just the arguments that we need normally. In this case it is as simple as:
+
+```
+XCLAIM <key> <group> <consumer> <min-idle-time> <ID-1> <ID-2> ... <ID-N>
+```
+
+Basically we say: for this specific key and group, I want that the message IDs specified will change ownership, and will be assigned to the specified consumer name `<consumer>`. However, we also provide a minimum idle time, so that the operation will only work if the idle time of the mentioned messages is greater than the specified idle time. This is useful because two clients may be trying to claim a message at the same time:
+
+```
+Client 1: XCLAIM mystream mygroup Alice 3600000 1526569498055-0
+Client 2: XCLAIM mystream mygroup Lora 3600000 1526569498055-0
+```
+
+However, as a side effect, claiming a message will reset its idle time and will increment its number of deliveries counter, so the second client will fail claiming it. In this way we avoid trivial re-processing of messages (even if in the general case you cannot obtain exactly-once processing).
+
+This is the result of the command execution:
+
+```
+> XCLAIM mystream mygroup Alice 3600000 1526569498055-0
+1) 1) 1526569498055-0
+   2) 1) "message"
+      2) "orange"
+```
+
+The message was successfully claimed by Alice, who can now process the message and acknowledge it, and move things forward even if the original consumer is not recovering.
+
+It is clear from the example above that as a side effect of successfully claiming a given message, the `XCLAIM` command also returns it. However this is not mandatory. The **JUSTID** option can be used in order to return just the IDs of the messages successfully claimed. This is useful if you want to reduce the bandwidth used between the client and the server (and also improve the performance of the command), and you are not interested in the message because your consumer is implemented in a way that it will rescan the history of pending messages from time to time.
+
+Claiming may also be implemented by a separate process: one that just checks the list of pending messages, and assigns idle messages to consumers that appear to be active. Active consumers can be obtained using one of the observability features of Redis streams. This is the topic of the next section.
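+
+Before moving on, here is a rough sketch of what such a separate claiming process could look like, in Ruby like the consumer example above. This is only an illustration, not reference code: the key, group, owner, and threshold names are made up, and the `"entry_id"` / `"elapsed"` field names assume the reply format of recent `redis-rb` versions, which may differ in other clients:
+
+```ruby
+require 'redis'
+
+StreamKey = "mystream"
+GroupName = "mygroup"
+NewOwner  = "Alice"     # the consumer that takes over idle messages
+MaxIdleMs = 3600000     # only claim messages idle for more than one hour
+
+r = Redis.new
+
+# XPENDING in its extended form: list up to ten pending entries of the group.
+pending = r.xpending(StreamKey, GroupName, '-', '+', 10)
+
+pending.each{|entry|
+    # Each entry reports the entry ID, the current consumer, the idle time
+    # in milliseconds, and the number of deliveries.
+    next if entry["elapsed"] < MaxIdleMs
+
+    # XCLAIM re-assigns the message only if it is still idle for at least
+    # MaxIdleMs, so two concurrent claimers cannot both succeed.
+    claimed = r.xclaim(StreamKey, GroupName, NewOwner, MaxIdleMs, entry["entry_id"])
+    puts "Claimed #{entry["entry_id"]}" if claimed.any?
+}
+```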
+
+## Automatic claiming
+
+The `XAUTOCLAIM` command, added in Redis 6.2, implements the claiming process that we've described above.
+`XPENDING` and `XCLAIM` provide the basic building blocks for different types of recovery mechanisms.
+This command optimizes the generic process by having Redis manage it and offers a simple solution for most recovery needs.
+
+`XAUTOCLAIM` identifies idle pending messages and transfers ownership of them to a consumer.
+The command's signature looks like this:
+
+```
+XAUTOCLAIM <key> <group> <consumer> <min-idle-time> <start> [COUNT count] [JUSTID]
+```
+
+So, in the example above, I could have used automatic claiming to claim a single message like this:
+
+```
+> XAUTOCLAIM mystream mygroup Alice 3600000 0-0 COUNT 1
+1) 1526569498055-0
+2) 1) 1526569498055-0
+   2) 1) "message"
+      2) "orange"
+```
+
+Like `XCLAIM`, the command replies with an array of the claimed messages, but it also returns a stream ID that allows iterating the pending entries.
+The stream ID is a cursor, and I can use it in my next call to continue claiming idle pending messages:
+
+```
+> XAUTOCLAIM mystream mygroup Lora 3600000 1526569498055-0 COUNT 1
+1) 0-0
+2) 1) 1526569506935-0
+   2) 1) "message"
+      2) "strawberry"
+```
+
+When `XAUTOCLAIM` returns the "0-0" stream ID as a cursor, that means that it reached the end of the consumer group pending entries list.
+That doesn't mean that there are no new idle pending messages, so the process continues by calling `XAUTOCLAIM` from the beginning of the stream.
+
+## Claiming and the delivery counter
+
+The counter that you observe in the `XPENDING` output is the number of deliveries of each message. The counter is incremented in two ways: when a message is successfully claimed via `XCLAIM` or when an `XREADGROUP` call is used in order to access the history of pending messages.
+
+When there are failures, it is normal that messages will be delivered multiple times, but eventually they usually get processed and acknowledged. However there might be a problem processing some specific message, because it is corrupted or crafted in a way that triggers a bug in the processing code. In such a case what happens is that consumers will continuously fail to process this particular message. Because we have the counter of the delivery attempts, we can use that counter to detect messages that for some reason are not processable. So once the delivery counter reaches a given large number that you choose, it is probably wiser to put such messages in another stream and send a notification to the system administrator. This is basically the way that Redis Streams implements the *dead letter* concept.
+
+## Streams observability
+
+Messaging systems that lack observability are very hard to work with. Not knowing who is consuming messages, what messages are pending, or the set of consumer groups active in a given stream makes everything opaque. For this reason, Redis Streams and consumer groups have different ways to observe what is happening. We already covered `XPENDING`, which allows us to inspect the list of messages that are under processing at a given moment, together with their idle time and number of deliveries.
+
+However we may want to do more than that, and the `XINFO` command is an observability interface that can be used with sub-commands in order to get information about streams or consumer groups.
+
+This command uses subcommands in order to show different information about the status of the stream and its consumer groups. For instance **XINFO STREAM <key>** reports information about the stream itself.
+
+```
+> XINFO STREAM mystream
+ 1) "length"
+ 2) (integer) 2
+ 3) "radix-tree-keys"
+ 4) (integer) 1
+ 5) "radix-tree-nodes"
+ 6) (integer) 2
+ 7) "last-generated-id"
+ 8) "1638125141232-0"
+ 9) "max-deleted-entry-id"
+10) "0-0"
+11) "entries-added"
+12) (integer) 2
+13) "groups"
+14) (integer) 1
+15) "first-entry"
+16) 1) "1638125133432-0"
+    2) 1) "message"
+       2) "apple"
+17) "last-entry"
+18) 1) "1638125141232-0"
+    2) 1) "message"
+       2) "banana"
+```
+
+The output shows information about how the stream is encoded internally, and also shows the first and last message in the stream. Another piece of information available is the number of consumer groups associated with this stream. We can dig further asking for more information about the consumer groups.
+
+```
+> XINFO GROUPS mystream
+1) 1) "name"
+   2) "mygroup"
+   3) "consumers"
+   4) (integer) 2
+   5) "pending"
+   6) (integer) 2
+   7) "last-delivered-id"
+   8) "1638126030001-0"
+   9) "entries-read"
+   10) (integer) 2
+   11) "lag"
+   12) (integer) 0
+2) 1) "name"
+   2) "some-other-group"
+   3) "consumers"
+   4) (integer) 1
+   5) "pending"
+   6) (integer) 0
+   7) "last-delivered-id"
+   8) "1638126028070-0"
+   9) "entries-read"
+   10) (integer) 1
+   11) "lag"
+   12) (integer) 1
+```
+
+As you can see in this and in the previous output, the `XINFO` command outputs a sequence of field-value items. Because it is an observability command this allows the human user to immediately understand what information is reported, and allows the command to report more information in the future by adding more fields without breaking compatibility with older clients. Other commands that must be more bandwidth efficient, like `XPENDING`, just report the information without the field names.
+
+The output of the example above, where the **GROUPS** subcommand is used, should be clear from the field names. We can check in more detail the state of a specific consumer group by checking the consumers that are registered in the group.
+
+```
+> XINFO CONSUMERS mystream mygroup
+1) 1) name
+   2) "Alice"
+   3) pending
+   4) (integer) 1
+   5) idle
+   6) (integer) 9104628
+2) 1) name
+   2) "Bob"
+   3) pending
+   4) (integer) 1
+   5) idle
+   6) (integer) 83841983
+```
+
+In case you do not remember the syntax of the command, just ask the command itself for help:
+
+```
+> XINFO HELP
+1) XINFO <subcommand> [<arg> [value] [opt] ...]. Subcommands are:
+2) CONSUMERS <key> <groupname>
+3)     Show consumers of <groupname>.
+4) GROUPS <key>
+5)     Show the stream consumer groups.
+6) STREAM <key> [FULL [COUNT <count>]
+7)     Show information about the stream.
+8) HELP
+9)     Prints this help.
+```
+
+## Differences with Kafka (TM) partitions
+
+Consumer groups in Redis streams may resemble in some way Kafka (TM) partitioning-based consumer groups, however note that Redis streams are, in practical terms, very different. The partitions are only *logical* and the messages are just put into a single Redis key, so the way the different clients are served is based on who is ready to process new messages, and not from which partition clients are reading. For instance, if the consumer C3 at some point fails permanently, Redis will continue to serve C1 and C2 all the new messages arriving, as if now there are only two *logical* partitions.
+
+Similarly, if a given consumer is much faster at processing messages than the other consumers, this consumer will receive proportionally more messages in the same unit of time.
This is possible since Redis tracks all the unacknowledged messages explicitly, and remembers who received which message and the ID of the first message never delivered to any consumer.
+
+However, this also means that in Redis, if you really want to partition messages in the same stream across multiple Redis instances, you have to use multiple keys and some sharding system such as Redis Cluster, or some other application-specific sharding system. A single Redis stream is not automatically partitioned to multiple instances.
+
+We could say that schematically the following is true:
+
+* If you use 1 stream -> 1 consumer, you are processing messages in order.
+* If you use N streams with N consumers, so that only a given consumer hits a subset of the N streams, you can scale the above model of 1 stream -> 1 consumer.
+* If you use 1 stream -> N consumers, you are load balancing to N consumers, however in that case, messages about the same logical item may be consumed out of order, because a given consumer may process message 3 faster than another consumer is processing message 4.
+
+So basically Kafka partitions are more similar to using N different Redis keys, while Redis consumer groups are a server-side load balancing system of messages from a given stream to N different consumers.
+
+## Capped Streams
+
+Many applications do not want to collect data into a stream forever. Sometimes it is useful to have at maximum a given number of items inside a stream; at other times, once a given size is reached, it is useful to move data from Redis to a storage which is not in memory and not as fast, but suited to store the history for, potentially, decades to come. Redis streams have some support for this. One is the **MAXLEN** option of the `XADD` command. This option is very simple to use:
+
+```
+> XADD mystream MAXLEN 2 * value 1
+1526654998691-0
+> XADD mystream MAXLEN 2 * value 2
+1526654999635-0
+> XADD mystream MAXLEN 2 * value 3
+1526655000369-0
+> XLEN mystream
+(integer) 2
+> XRANGE mystream - +
+1) 1) 1526654999635-0
+   2) 1) "value"
+      2) "2"
+2) 1) 1526655000369-0
+   2) 1) "value"
+      2) "3"
+```
+
+Using **MAXLEN** the old entries are automatically evicted when the specified length is reached, so that the stream is left at a constant size. There is currently no option to tell the stream to just retain items that are not older than a given period, because such a command, in order to run consistently, would potentially block for a long time in order to evict items. Imagine for example what happens if there is an insertion spike, then a long pause, and another insertion, all with the same maximum time. The stream would block to evict the data that became too old during the pause. So it is up to the user to do some planning and understand what is the maximum stream length desired. Moreover, while the length of the stream is proportional to the memory used, trimming by time is less simple to control and anticipate: it depends on the insertion rate, which often changes over time (and when it does not change, then trimming just by size is trivial).
+
+However trimming with **MAXLEN** can be expensive: streams are represented as macro nodes within a radix tree, in order to be very memory efficient. Altering the single macro node, consisting of a few tens of elements, is not optimal. So it's possible to use the command in the following special form:
+
+```
+XADD mystream MAXLEN ~ 1000 * ... entry fields here ...
+```
+
+The `~` argument between the **MAXLEN** option and the actual count means: I don't really need this to be exactly 1000 items. It can be 1000 or 1010 or 1030, just make sure to save at least 1000 items. With this argument, the trimming is performed only when we can remove a whole node. This makes it much more efficient, and it is usually what you want.
+
+There is also the `XTRIM` command, which performs something very similar to what the **MAXLEN** option does above, except that it can be run by itself:
+
+```
+> XTRIM mystream MAXLEN 10
+```
+
+Or, as for the `XADD` option:
+
+```
+> XTRIM mystream MAXLEN ~ 10
+```
+
+However, `XTRIM` is designed to accept different trimming strategies. Another trimming strategy is **MINID**, which evicts entries with IDs lower than the one specified.
+
+As `XTRIM` is an explicit command, the user is expected to know about the possible shortcomings of different trimming strategies.
+
+Another useful eviction strategy that may be added to `XTRIM` in the future is to remove by a range of IDs, to ease the use of `XRANGE` and `XTRIM` together to move data from Redis to other storage systems if needed.
+
+## Special IDs in the streams API
+
+You may have noticed that there are several special IDs that can be used in the Redis API. Here is a short recap, so that they can make more sense in the future.
+
+The first two special IDs are `-` and `+`, and are used in range queries with the `XRANGE` command. Those two IDs respectively mean the smallest ID possible (that is basically `0-1`) and the greatest ID possible (that is `18446744073709551615-18446744073709551615`). As you can see it is a lot cleaner to write `-` and `+` instead of those numbers.
+
+Then there are APIs where we want to say: the ID of the item with the greatest ID inside the stream. This is what `$` means. So for instance if I want only new entries with `XREADGROUP` I use this ID to signify I already have all the existing entries, but not the new ones that will be inserted in the future. Similarly when I create or set the ID of a consumer group, I can set the last delivered item to `$` in order to just deliver new entries to the consumers in the group.
+
+As you can see `$` does not mean `+`; they are two different things, as `+` is the greatest ID possible in every possible stream, while `$` is the greatest ID in a given stream containing given entries. Moreover APIs will usually only understand `+` or `$`, yet it was useful to avoid loading a given symbol with multiple meanings.
+
+Another special ID is `>`, which has a special meaning only related to consumer groups and only when the `XREADGROUP` command is used. This special ID means that we want only entries that were never delivered to other consumers so far: in other words, everything after the *last delivered ID* of the consumer group.
+
+Finally the special ID `*`, which can be used only with the `XADD` command, means to auto-select an ID for us for the new entry.
+
+So we have `-`, `+`, `$`, `>` and `*`, and all have a different meaning, and most of the time, can be used in different contexts.
+
+## Persistence, replication and message safety
+
+A Stream, like any other Redis data structure, is asynchronously replicated to replicas and persisted into AOF and RDB files. However what may not be so obvious is that the consumer groups' full state is also propagated to AOF, RDB and replicas, so if a message is pending in the master, the replica will have the same information. Similarly, after a restart, the AOF will restore the consumer groups' state.
+
+However note that Redis streams and consumer groups are persisted and replicated using the Redis default replication, so:
+
+* AOF must be used with a strong fsync policy if persistence of messages is important in your application.
+* By default the asynchronous replication will not guarantee that `XADD` commands or consumer groups state changes are replicated: after a failover something can be missing depending on the ability of replicas to receive the data from the master.
+* The `WAIT` command may be used in order to force the propagation of the changes to a set of replicas. However note that while this makes it very unlikely that data is lost, the Redis failover process as operated by Sentinel or Redis Cluster performs only a *best effort* check to failover to the replica which is the most updated, and under certain specific failure conditions may promote a replica that lacks some data.
+
+So when designing an application using Redis streams and consumer groups, make sure to understand the semantic properties your application should have during failures, and configure things accordingly, evaluating whether it is safe enough for your use case.
+
+## Removing single items from a stream
+
+Streams also have a special command for removing items from the middle of a stream, just by ID. Normally for an append-only data structure this may look like an odd feature, but it is actually useful for applications involving, for instance, privacy regulations. The command is called `XDEL` and receives the name of the stream followed by the IDs to delete:
+
+```
+> XRANGE mystream - + COUNT 2
+1) 1) 1526654999635-0
+   2) 1) "value"
+      2) "2"
+2) 1) 1526655000369-0
+   2) 1) "value"
+      2) "3"
+> XDEL mystream 1526654999635-0
+(integer) 1
+> XRANGE mystream - + COUNT 2
+1) 1) 1526655000369-0
+   2) 1) "value"
+      2) "3"
+```
+
+However in the current implementation, memory is not really reclaimed until a macro node is completely empty, so you should not abuse this feature.
+
+## Zero length streams
+
+A difference between streams and other Redis data structures is that when the other data structures no longer have any elements, as a side effect of calling commands that remove elements, the key itself will be removed. So for instance, a sorted set will be completely removed when a call to `ZREM` removes the last element in the sorted set. Streams, on the other hand, are allowed to stay at zero elements, both as a result of using a **MAXLEN** option with a count of zero (`XADD` and `XTRIM` commands), or because `XDEL` was called.
+
+The reason why such an asymmetry exists is that Streams may have associated consumer groups, and we do not want to lose the state that the consumer groups defined just because there are no longer any items in the stream. Currently the stream is not deleted even when it has no associated consumer groups.
+
+## Total latency of consuming a message
+
+Non blocking stream commands like `XRANGE` and `XREAD` or `XREADGROUP` without the BLOCK option are served synchronously like any other Redis command, so discussing the latency of such commands is meaningless: it is more interesting to check the time complexity of the commands in the Redis documentation. It should be enough to say that stream commands are at least as fast as sorted set commands when extracting ranges, and that `XADD` is very fast and can easily insert from half a million to one million items per second on an average machine if pipelining is used.
+
+However latency becomes an interesting parameter if we want to understand the delay of processing a message, in the context of blocking consumers in a consumer group, from the moment the message is produced via `XADD`, to the moment the message is obtained by the consumer because `XREADGROUP` returned with the message.
+
+## How serving blocked consumers works
+
+Before providing the results of the tests performed, it is interesting to understand what model Redis uses in order to route stream messages (and in general actually how any blocking operation waiting for data is managed).
+
+* The blocked client is referenced in a hash table that maps keys for which there is at least one blocking consumer, to a list of consumers that are waiting for such key. This way, given a key that received data, we can resolve all the clients that are waiting for such data.
+* When a write happens, in this case when the `XADD` command is called, it calls the `signalKeyAsReady()` function. This function will put the key into a list of keys that need to be processed, because such keys may have new data for blocked consumers. Note that such *ready keys* will be processed later, so in the course of the same event loop cycle, it is possible that the key will receive other writes.
+* Finally, before returning into the event loop, the *ready keys* are finally processed. For each key the list of clients waiting for data is scanned, and if applicable, such clients will receive the new data that arrived. In the case of streams the data is the messages in the applicable range requested by the consumer.
+
+As you can see, basically, before returning to the event loop both the client calling `XADD` and the clients blocked waiting to consume messages will have their reply in the output buffers, so the caller of `XADD` should receive the reply from Redis at about the same time the consumers will receive the new messages.
+
+This model is *push-based*, since adding data to the consumers' buffers is performed directly by the action of calling `XADD`, so the latency tends to be quite predictable.
+
+## Latency tests results
+
+In order to check these latency characteristics a test was performed using multiple instances of Ruby programs pushing messages that carried the current millisecond time as an additional field, and Ruby programs reading the messages from the consumer group and processing them. The message processing step consisted of comparing the current computer time with the message timestamp, in order to understand the total latency.
+
+Results obtained:
+
+```
+Processed between 0 and 1 ms -> 74.11%
+Processed between 1 and 2 ms -> 25.80%
+Processed between 2 and 3 ms -> 0.06%
+Processed between 3 and 4 ms -> 0.01%
+Processed between 4 and 5 ms -> 0.02%
+```
+
+So 99.9% of requests have a latency <= 2 milliseconds, with the remaining outliers still very close to the average.
+
+Adding a few million unacknowledged messages to the stream does not change the gist of the benchmark, with most queries still processed with very short latency.
+
+A few remarks:
+
+* Here we processed up to 10k messages per iteration, this means that the `COUNT` parameter of `XREADGROUP` was set to 10000. This adds a lot of latency but is needed in order to allow the slow consumers to be able to keep up with the message flow. So you can expect a real world latency that is a lot smaller.
+* The system used for this benchmark is very slow compared to today's standards.
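+
+For reference, a minimal sketch of the measurement loop described above could look like the following, again in Ruby. It is only an illustration under stated assumptions: the `mygroup` group already exists on `mystream`, and the keyword form of `xreadgroup` used here is the one offered by recent `redis-rb` versions:
+
+```ruby
+require 'redis'
+
+r = Redis.new
+now_ms = -> { (Time.now.to_f * 1000).to_i }
+
+# Producer side: attach the producer's clock, in milliseconds, to the entry.
+r.xadd('mystream', { 'time' => now_ms.call.to_s })
+
+# Consumer side: read via the consumer group and compare clocks on delivery.
+items = r.xreadgroup('mygroup', 'Alice', 'mystream', '>', count: 10)
+(items['mystream'] || []).each{|id, fields|
+    latency = now_ms.call - fields['time'].to_i
+    puts "#{id}: delivered after #{latency} ms"
+    r.xack('mystream', 'mygroup', id)
+}
+```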
+ + + + ## Learn more * The [Redis Streams Tutorial](/docs/data-types/streams-tutorial) explains Redis streams with many examples. From e500d4a50965ab494ba80e8ea82aec337b32093b Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 2 Jun 2023 10:54:20 +0100 Subject: [PATCH 08/23] Renames the Keyspace page --- docs/manual/{the-redis-keyspace.md => keyspace.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename docs/manual/{the-redis-keyspace.md => keyspace.md} (99%) diff --git a/docs/manual/the-redis-keyspace.md b/docs/manual/keyspace.md similarity index 99% rename from docs/manual/the-redis-keyspace.md rename to docs/manual/keyspace.md index 2b80f9c464..0bd903e655 100644 --- a/docs/manual/the-redis-keyspace.md +++ b/docs/manual/keyspace.md @@ -1,6 +1,6 @@ --- -title: "The Redis keyspace" -linkTitle: "The Redis Keyspace" +title: "Keyspace" +linkTitle: "Keyspace" weight: 1 description: > Managing keys in Redis: Key expiration, scanning, altering and querying the key space From 7b84d56066e6eda60931a5b5d7f2582b4d077d53 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Thu, 15 Jun 2023 10:53:46 +0100 Subject: [PATCH 09/23] Rename Install to Install Redis --- docs/getting-started/installation/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/getting-started/installation/_index.md b/docs/getting-started/installation/_index.md index aa07f2716a..4a8845d356 100644 --- a/docs/getting-started/installation/_index.md +++ b/docs/getting-started/installation/_index.md @@ -1,6 +1,6 @@ --- title: "Installing Redis" -linkTitle: "Install" +linkTitle: "Install Redis" weight: 1 description: > Install Redis on Linux, macOS, and Windows From 767250ea13f4ce55a7e28c3eede50384f200986a Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 23 Jun 2023 01:21:03 +0100 Subject: [PATCH 10/23] Adds info on RSAL license, OM clients and HLL use cases --- docs/about/license.md | 79 +++++++++++++++++-- docs/clients/_index.md | 11 +++ docs/data-types/probabilistic/hyperloglogs.md | 19 +++++ 3 files changed, 103 insertions(+), 6 deletions(-) diff --git a/docs/about/license.md b/docs/about/license.md index 5bcada4c4f..48580a3244 100644 --- a/docs/about/license.md +++ b/docs/about/license.md @@ -6,14 +6,23 @@ description: > Redis license and trademark information aliases: - /topics/license + - /docs/stack/license/ --- -Redis is **open source software** released under the terms of the **three clause BSD license**. Most of the Redis source code was written and is copyrighted by Salvatore Sanfilippo and Pieter Noordhuis. A list of other contributors can be found in the git history. -The Redis trademark and logo are owned by Redis Ltd. and can be +* Redis is **open source software** released under the terms of the **three clause BSD license**. Most of the Redis source code was written and is copyrighted by Salvatore Sanfilippo and Pieter Noordhuis. A list of other contributors can be found in the git history. + + The Redis trademark and logo are owned by Redis Ltd. and can be used in accordance with the [Redis Trademark Guidelines](https://redis.com/legal/trademark-guidelines/). -## Three clause BSD license +* RedisInsight is licensed under the Server Side Public License (SSPL). 
+ +* Redis Stack Server, which combines open source Redis with RediSearch, RedisJSON, RedisGraph, RedisTimeSeries, and RedisBloom, is dual-licensed under the Redis Source Available License (RSALv2), as described below, and the [Server Side Public License](https://en.wikipedia.org/wiki/Server_Side_Public_License) (SSPL). For information about licensing per version, see [Versions and licenses](/docs/stack/#versions-and-licenses). + + +## Licences: + +### Three clause BSD license Every file in the Redis distribution, with the exceptions of third party files specified in the list below, contain the following license: @@ -43,7 +52,66 @@ CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -## Third-party files and licenses +### REDIS SOURCE AVAILABLE LICENSE (RSAL) 2.0 + + +_Last updated: November 15, 2022_ + +#### Acceptance + +By using the software, you agree to all of the terms and conditions below. + +#### Copyright License + +The licensor grants you a non-exclusive, royalty-free, worldwide, non-sublicensable, non-transferable license to use, copy, distribute, make available, and prepare derivative works of the software, in each case subject to the limitations and conditions below. + +#### Limitations + +You may not make the functionality of the software or a modified version available to third parties as a service, or distribute the software or a modified version in a manner that makes the functionality of the software available to third parties. +Making the functionality of the software or modified version available to third parties includes, without limitation, enabling third parties to interact with the functionality of the software or modified version in distributed form or remotely through a computer network, offering a product or service the value of which entirely or primarily derives from the value of the software or modified version, or offering a product or service that accomplishes for users the primary purpose of the software or modified version. + +You may not alter, remove, or obscure any licensing, copyright, or other notices of the licensor in the software. Any use of the licensor’s trademarks is subject to applicable law. + +#### Patents + +The licensor grants you a license, under any patent claims the licensor can license, or becomes able to license, to make, have made, use, sell, offer for sale, import and have imported the software, in each case subject to the limitations and conditions in this license. This license does not cover any patent claims that you cause to be infringed by modifications or additions to the software. If you or your company make any written claim that the software infringes or contributes to infringement of any patent, your patent license for the software granted under these terms ends immediately. If your company makes such a claim, your patent license ends immediately for work on behalf of your company. + +#### Notices + +You must ensure that anyone who gets a copy of any part of the software from you also gets a copy of these terms. +If you modify the software, you must include in any modified copies of the software prominent notices stating that you have modified the software. + +#### No Other Rights + +These terms do not imply any licenses other than those expressly granted in these terms. 
+ +#### Termination + +If you use the software in violation of these terms, such use is not licensed, and your licenses will automatically terminate. If the licensor provides you with a notice of your violation, and you cease all violations of this license no later than 30 days after you receive that notice, your licenses will be reinstated retroactively. However, if you violate these terms after such reinstatement, any additional violation of these terms will cause your licenses to terminate automatically and permanently. + +#### No Liability + +_**As far as the law allows, the **software** comes as is, without any warranty or condition, and the licensor will not be liable to you for any damages arising out of these terms or the use or nature of the software, under any kind of legal claim.**_ + +#### Definitions + +The **licensor** is the entity offering these terms, and the software is the software the licensor makes available under these terms, including any portion of it. + +To **modify** a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission other than making an exact copy. The resulting work is called a **modified version** of the earlier work. + +**you** refers to the individual or entity agreeing to these terms. + +**your company** is any legal entity, sole proprietorship, or other kind of organization that you work for, plus all organizations that have control over, are under the control of, or are under common control with that organization. + +**control** means ownership of substantially all the assets of an entity, or the power to direct its management and policies by vote, contract, or otherwise. Control can be direct or indirect. + +**your licenses** are all the licenses granted to you for the software under these terms. +use means anything you do with the software requiring one of your licenses. + +**trademark** means trademarks, service marks, and similar rights. + + +### Third-party files and licenses Redis uses source code from third parties. All this code contains a BSD or BSD-compatible license. The following is a list of third-party files and information about their copyright. @@ -57,5 +125,4 @@ Redis uses source code from third parties. All this code contains a BSD or BSD-c * Inside Jemalloc the files `inttypes.h`, `stdbool.h`, `stdint.h`, `strings.h` under the `msvc_compat` directory are copyright Alexander Chemeris and released under the **three-clause BSD license**. -* The libraries **hiredis** and **linenoise** also included inside the Redis distribution are copyright Salvatore Sanfilippo and Pieter Noordhuis and released under the terms respectively of the **three-clause BSD license** and **two-clause BSD license**. - +* The libraries **hiredis** and **linenoise** also included inside the Redis distribution are copyright Salvatore Sanfilippo and Pieter Noordhuis and released under the terms respectively of the **three-clause BSD license** and **two-clause BSD license**. \ No newline at end of file diff --git a/docs/clients/_index.md b/docs/clients/_index.md index b8fc945a62..b83ac7203a 100644 --- a/docs/clients/_index.md +++ b/docs/clients/_index.md @@ -5,6 +5,7 @@ description: Get started using Redis clients. Select your library and connect yo weight: 45 aliases: - /docs/redis-clients + - /docs/stack/get-started/clients/ --- Here, you will learn how to connect your application to a Redis database. 
If you're new to Redis, you might first want to [install Redis with Redis Stack and RedisInsight](/docs/stack/get-started/install). @@ -12,3 +13,13 @@ Here, you will learn how to connect your application to a Redis database. If you For more Redis topics, see [Using](/docs/manual/) and [Managing](/docs/management/) Redis. If you're ready to get started, see the following guides for the official client libraries you can use with Redis. For a complete list of community-driven clients, see [Clients](/resources/clients/). + + +## High-level client libraries + +The Redis OM client libraries let you use the document modeling, indexing, and querying capabilities of Redis Stack much like the way you'd use an [ORM](https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping). The following Redis OM libraries support Redis Stack: + +* [Redis OM .NET](/docs/clients/stack-dotnet/) +* [Redis OM Node](/docs/clients/stack-node/) +* [Redis OM Python](/docs/clients/stack-python/) +* [Redis OM Spring](/docs/clients/stack-spring/) diff --git a/docs/data-types/probabilistic/hyperloglogs.md b/docs/data-types/probabilistic/hyperloglogs.md index f9e3a1b730..1f00455e74 100644 --- a/docs/data-types/probabilistic/hyperloglogs.md +++ b/docs/data-types/probabilistic/hyperloglogs.md @@ -50,6 +50,25 @@ performed by users in a search form every day, number of unique visitors to a we Redis is also able to perform the union of HLLs, please check the [full documentation](/commands#hyperloglog) for more information. +## Use cases + +**Anonymous unique visits of a web page (SaaS, analytics tools)** + +This application answers these questions: + +- How many unique visits has this page had on this day? +- How many unique users have played this song? +- How many unique users have viewed this video? + +{{% alert title="Note" color="warning" %}} + +Storing the IP address or any other kind of personal identifier is against the law in some countries, which makes it impossible to get unique visitor statistics on your website. + +{{% /alert %}} + +One HyperLogLog is created per page (video/song) per period, and every IP/identifier is added to it on every visit. 
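+
+As a minimal sketch of this pattern (the key name and identifiers below are purely illustrative), one HyperLogLog per page per day could be updated and queried like this:
+
+```
+> PFADD hll:page-home:2023-06-23 visitor-a visitor-b visitor-a
+(integer) 1
+> PFCOUNT hll:page-home:2023-06-23
+(integer) 2
+```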
+ + ## Examples * Add some items to the HyperLogLog: From e885af589fa88b1d79d5723f98cdb7e0accee72f Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 23 Jun 2023 10:25:09 +0100 Subject: [PATCH 11/23] =?UTF-8?q?Move=20pages=20under=20=E2=80=9CInteract?= =?UTF-8?q?=20with=20data=E2=80=9D?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/interact-with-data/_index.md | 9 +++++++++ .../programmability/_index.md | 3 ++- .../programmability/eval-intro.md | 1 + .../programmability/functions-intro.md | 1 + .../programmability/lua-api.md | 1 + .../programmability/lua-debugging.md | 1 + docs/{manual => interact-with-data}/pubsub.md | 2 +- docs/{manual => interact-with-data}/transactions.md | 3 ++- 8 files changed, 18 insertions(+), 3 deletions(-) create mode 100644 docs/interact-with-data/_index.md rename docs/{manual => interact-with-data}/programmability/_index.md (99%) rename docs/{manual => interact-with-data}/programmability/eval-intro.md (99%) rename docs/{manual => interact-with-data}/programmability/functions-intro.md (99%) rename docs/{manual => interact-with-data}/programmability/lua-api.md (99%) rename docs/{manual => interact-with-data}/programmability/lua-debugging.md (99%) rename docs/{manual => interact-with-data}/pubsub.md (99%) rename docs/{manual => interact-with-data}/transactions.md (99%) diff --git a/docs/interact-with-data/_index.md b/docs/interact-with-data/_index.md new file mode 100644 index 0000000000..97de85b4ae --- /dev/null +++ b/docs/interact-with-data/_index.md @@ -0,0 +1,9 @@ +--- +title: "Interact with data in Redis" +linkTitle: "Interact with data" + +weight: 35 + +description: > + How to interact with data in Redis, including searching, querying, triggered functions, transactions, and pub/sub. +--- \ No newline at end of file diff --git a/docs/manual/programmability/_index.md b/docs/interact-with-data/programmability/_index.md similarity index 99% rename from docs/manual/programmability/_index.md rename to docs/interact-with-data/programmability/_index.md index 9508d17e04..14ac7489c3 100644 --- a/docs/manual/programmability/_index.md +++ b/docs/interact-with-data/programmability/_index.md @@ -1,11 +1,12 @@ --- title: "Redis programmability" linkTitle: "Programmability" -weight: 7 +weight: 20 description: > Extending Redis with Lua and Redis Functions aliases: - /topics/programmability + - /docs/manual/programmability/ --- Redis provides a programming interface that lets you execute custom scripts on the server itself. In Redis 7 and beyond, you can use [Redis Functions](/docs/manual/programmability/functions-intro) to manage and run your scripts. In Redis 6.2 and below, you use [Lua scripting with the EVAL command](/docs/manual/programmability/eval-intro) to program the server. diff --git a/docs/manual/programmability/eval-intro.md b/docs/interact-with-data/programmability/eval-intro.md similarity index 99% rename from docs/manual/programmability/eval-intro.md rename to docs/interact-with-data/programmability/eval-intro.md index f1a4e9b97c..0a1b70982d 100644 --- a/docs/manual/programmability/eval-intro.md +++ b/docs/interact-with-data/programmability/eval-intro.md @@ -6,6 +6,7 @@ description: > Executing Lua in Redis aliases: - /topics/eval-intro + - /docs/manual/programmability/eval-intro/ --- Redis lets users upload and execute Lua scripts on the server. 
diff --git a/docs/manual/programmability/functions-intro.md b/docs/interact-with-data/programmability/functions-intro.md similarity index 99% rename from docs/manual/programmability/functions-intro.md rename to docs/interact-with-data/programmability/functions-intro.md index 5686d12852..b4dc77033c 100644 --- a/docs/manual/programmability/functions-intro.md +++ b/docs/interact-with-data/programmability/functions-intro.md @@ -6,6 +6,7 @@ description: > Scripting with Redis 7 and beyond aliases: - /topics/functions-intro + - /docs/manual/programmability/functions-intro/ --- Redis Functions is an API for managing code to be executed on the server. This feature, which became available in Redis 7, supersedes the use of [EVAL](/docs/manual/programmability/eval-intro) in prior versions of Redis. diff --git a/docs/manual/programmability/lua-api.md b/docs/interact-with-data/programmability/lua-api.md similarity index 99% rename from docs/manual/programmability/lua-api.md rename to docs/interact-with-data/programmability/lua-api.md index 4ff6006645..f5d6e3e505 100644 --- a/docs/manual/programmability/lua-api.md +++ b/docs/interact-with-data/programmability/lua-api.md @@ -6,6 +6,7 @@ description: > Executing Lua in Redis aliases: - /topics/lua-api + - /docs/manual/programmability/lua-api/ --- Redis includes an embedded [Lua 5.1](https://www.lua.org/) interpreter. diff --git a/docs/manual/programmability/lua-debugging.md b/docs/interact-with-data/programmability/lua-debugging.md similarity index 99% rename from docs/manual/programmability/lua-debugging.md rename to docs/interact-with-data/programmability/lua-debugging.md index 61719370a5..26b4b05e1d 100644 --- a/docs/manual/programmability/lua-debugging.md +++ b/docs/interact-with-data/programmability/lua-debugging.md @@ -5,6 +5,7 @@ description: How to use the built-in Lua debugger weight: 4 aliases: - /topics/ldb + - /docs/manual/programmability/lua-debugging/ --- Starting with version 3.2 Redis includes a complete Lua debugger, that can be diff --git a/docs/manual/pubsub.md b/docs/interact-with-data/pubsub.md similarity index 99% rename from docs/manual/pubsub.md rename to docs/interact-with-data/pubsub.md index f3c58299bb..c6a3bb3ae2 100644 --- a/docs/manual/pubsub.md +++ b/docs/interact-with-data/pubsub.md @@ -1,7 +1,7 @@ --- title: Redis Pub/Sub linkTitle: "Pub/sub" -weight: 5 +weight: 40 description: How to use pub/sub channels in Redis aliases: - /topics/pubsub diff --git a/docs/manual/transactions.md b/docs/interact-with-data/transactions.md similarity index 99% rename from docs/manual/transactions.md rename to docs/interact-with-data/transactions.md index 1c25331630..9641f986bb 100644 --- a/docs/manual/transactions.md +++ b/docs/interact-with-data/transactions.md @@ -1,10 +1,11 @@ --- title: Transactions linkTitle: Transactions -weight: 6 +weight: 30 description: How transactions work in Redis aliases: - /topics/transactions + - /docs/manual/transactions/ --- Redis Transactions allow the execution of a group of commands From 57727725a2e5e074b273eae7d00e2349f10f10c4 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 23 Jun 2023 12:05:14 +0100 Subject: [PATCH 12/23] Add
to index pages --- docs/about/_index.md | 2 ++ docs/clients/_index.md | 2 ++ docs/data-types/_index.md | 2 ++ docs/getting-started/_index.md | 2 ++ 4 files changed, 8 insertions(+) diff --git a/docs/about/_index.md b/docs/about/_index.md index ce2fb7769a..4b80f25497 100644 --- a/docs/about/_index.md +++ b/docs/about/_index.md @@ -39,3 +39,5 @@ You can use Redis from [most programming languages](/clients). Redis is written in **ANSI C** and works on most POSIX systems like Linux, \*BSD, and Mac OS X, without external dependencies. Linux and OS X are the two operating systems where Redis is developed and tested the most, and we **recommend using Linux for deployment**. Redis may work in Solaris-derived systems like SmartOS, but support is *best effort*. There is no official support for Windows builds. + +
\ No newline at end of file diff --git a/docs/clients/_index.md b/docs/clients/_index.md index b83ac7203a..3b1a4981a7 100644 --- a/docs/clients/_index.md +++ b/docs/clients/_index.md @@ -23,3 +23,5 @@ The Redis OM client libraries let you use the document modeling, indexing, and q * [Redis OM Node](/docs/clients/stack-node/) * [Redis OM Python](/docs/clients/stack-python/) * [Redis OM Spring](/docs/clients/stack-spring/) + +
\ No newline at end of file diff --git a/docs/data-types/_index.md b/docs/data-types/_index.md index 8deccd2271..e4e7dae709 100644 --- a/docs/data-types/_index.md +++ b/docs/data-types/_index.md @@ -107,3 +107,5 @@ To extend the features provided by the included data types, use one of these opt 1. Write your own custom [server-side functions in Lua](/docs/manual/programmability/). 1. Write your own Redis module using the [modules API](/docs/reference/modules/) or check out the [community-supported modules](/docs/modules/). 1. Use [JSON](/docs/stack/json/), [querying](/docs/stack/search/), [time series](/docs/stack/timeseries/), and other capabilities provided by [Redis Stack](/docs/stack/). + +
\ No newline at end of file diff --git a/docs/getting-started/_index.md b/docs/getting-started/_index.md index cc7f0c0ba0..688a129ecc 100644 --- a/docs/getting-started/_index.md +++ b/docs/getting-started/_index.md @@ -158,3 +158,5 @@ Make sure that everything is working as expected: Note: The above instructions don't include all of the Redis configuration parameters that you could change, for instance, to use AOF persistence instead of RDB persistence, or to set up replication, and so forth. Make sure to read the example [`redis.conf`](https://github.com/redis/redis/blob/6.2/redis.conf) file (that is heavily commented). + +
\ No newline at end of file From 97c0bba43d3e9189c0d6c5a1521d312d1c3dba8c Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Thu, 29 Jun 2023 23:47:28 +0100 Subject: [PATCH 13/23] Converts hashes tutorial to bikes story and adds python examples Co-authored by @sav-norem --- docs/data-types/hashes.md | 90 ++++++++++++++++++--------------------- 1 file changed, 41 insertions(+), 49 deletions(-) diff --git a/docs/data-types/hashes.md b/docs/data-types/hashes.md index d6885d6d50..e480d13b26 100644 --- a/docs/data-types/hashes.md +++ b/docs/data-types/hashes.md @@ -9,19 +9,24 @@ description: > Redis hashes are record types structured as collections of field-value pairs. You can use hashes to represent basic objects and to store groupings of counters, among other things. - > hset user:1000 username antirez birthyear 1977 verified 1 - (integer) 3 - > hget user:1000 username - "antirez" - > hget user:1000 birthyear - "1977" - > hgetall user:1000 - 1) "username" - 2) "antirez" - 3) "birthyear" - 4) "1977" - 5) "verified" - 6) "1" +{{< clients-example hash_tutorial set_get_all >}} +> hset bike:1 model Deimos brand Ergonom type 'Enduro bikes' price 4972 +(integer) 4 +> hget bike:1 model +"Deimos" +> hget bike:1 price +"4972" +> hgetall bike:1 +1) "model" +2) "Deimos" +3) "brand" +4) "Ergonom" +5) "type" +6) "Enduro bikes" +7) "price" +8) "4972" + +{{< /clients-example >}} While hashes are handy to represent *objects*, actually the number of fields you can put inside a hash has no practical limits (other than available memory), so you can use @@ -30,18 +35,22 @@ hashes in many different ways inside your application. The command [`HSET`](/commands/hset) sets multiple fields of the hash, while [`HGET`](/commands/hget) retrieves a single field. [`HMGET`](/commands/hmget) is similar to [`HGET`](/commands/hget) but returns an array of values: - > hmget user:1000 username birthyear no-such-field - 1) "antirez" - 2) "1977" - 3) (nil) +{{< clients-example hash_tutorial hmget >}} +> hmget user:1000 username birthyear no-such-field +1) "antirez" +2) "1977" +3) (nil) +{{< /clients-example >}} There are commands that are able to perform operations on individual fields as well, like [`HINCRBY`](/commands/hincrby): - > hincrby user:1000 birthyear 10 - (integer) 1987 - > hincrby user:1000 birthyear 10 - (integer) 1997 +{{< clients-example hash_tutorial hincrby >}} +> hincrby bike:1 price 100 +(integer) 5072 +> hincrby bike:1 price -100 +(integer) 4972 +{{< /clients-example >}} You can find the [full list of hash commands in the documentation](https://redis.io/commands#hash). @@ -60,41 +69,24 @@ See the [complete list of hash commands](https://redis.io/commands/?group=hash). 
## Examples -* Represent a basic user profile as a hash: -``` -> HSET user:123 username martina firstName Martina lastName Elisa country GB -(integer) 4 -> HGET user:123 username -"martina" -> HGETALL user:123 -1) "username" -2) "martina" -3) "firstName" -4) "Martina" -5) "lastName" -6) "Elisa" -7) "country" -8) "GB" -``` - -* Store counters for the number of times device 777 had pinged the server, issued a request, or sent an error: -``` -> HINCRBY device:777:stats pings 1 +* Store counters for the number of times bike:1 has been ridden, has crashed, or has changed owners: +{{< clients-example hash_tutorial incrby_get_mget >}} +> HINCRBY bike:1:stats rides 1 (integer) 1 -> HINCRBY device:777:stats pings 1 +> HINCRBY bike:1:stats rides 1 (integer) 2 -> HINCRBY device:777:stats pings 1 +> HINCRBY bike:1:stats rides 1 (integer) 3 -> HINCRBY device:777:stats errors 1 +> HINCRBY bike:1:stats crashes 1 (integer) 1 -> HINCRBY device:777:stats requests 1 +> HINCRBY bike:1:stats owners 1 (integer) 1 -> HGET device:777:stats pings +> HGET bike:1:stats rides "3" -> HMGET device:777:stats requests errors +> HMGET bike:1:stats owners crashes 1) "1" 2) "1" -``` +{{< /clients-example >}} ## Performance @@ -111,4 +103,4 @@ In practice, your hashes are limited only by the overall memory on the VMs hosti ## Learn more * [Redis Hashes Explained](https://www.youtube.com/watch?v=-KdITaRkQ-U) is a short, comprehensive video explainer covering Redis hashes. -* [Redis University's RU101](https://university.redis.com/courses/ru101/) covers Redis hashes in detail. +* [Redis University's RU101](https://university.redis.com/courses/ru101/) covers Redis hashes in detail. \ No newline at end of file From 2f8f174c553b3545177a7d79d3d159953218c723 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 30 Jun 2023 01:32:20 +0100 Subject: [PATCH 14/23] Debrands modules --- docs/about/license.md | 2 +- docs/data-types/sets.md | 2 +- docs/data-types/sorted-sets.md | 2 +- docs/data-types/strings.md | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/about/license.md b/docs/about/license.md index 48580a3244..39b4c77979 100644 --- a/docs/about/license.md +++ b/docs/about/license.md @@ -17,7 +17,7 @@ used in accordance with the [Redis Trademark Guidelines](https://redis.com/legal * RedisInsight is licensed under the Server Side Public License (SSPL). -* Redis Stack Server, which combines open source Redis with RediSearch, RedisJSON, RedisGraph, RedisTimeSeries, and RedisBloom, is dual-licensed under the Redis Source Available License (RSALv2), as described below, and the [Server Side Public License](https://en.wikipedia.org/wiki/Server_Side_Public_License) (SSPL). For information about licensing per version, see [Versions and licenses](/docs/stack/#versions-and-licenses). +* Redis Stack Server, which combines open source Redis with Search and Query, JSON, Time Series, and Probabilistic, is dual-licensed under the Redis Source Available License (RSALv2), as described below, and the [Server Side Public License](https://en.wikipedia.org/wiki/Server_Side_Public_License) (SSPL). For information about licensing per version, see [Versions and licenses](/docs/stack/#versions-and-licenses). 
## Licences: diff --git a/docs/data-types/sets.md b/docs/data-types/sets.md index fcc3564209..7e588c8eb2 100644 --- a/docs/data-types/sets.md +++ b/docs/data-types/sets.md @@ -207,7 +207,7 @@ Sets membership checks on large datasets (or on streaming data) can use a lot of If you're concerned about memory usage and don't need perfect precision, consider a [Bloom filter or Cuckoo filter](/docs/stack/bloom) as an alternative to a set. Redis sets are frequently used as a kind of index. -If you need to index and query your data, consider [RediSearch](/docs/stack/search) and [RedisJSON](/docs/stack/json). +If you need to index and query your data, consider the [JSON](/docs/stack/json) data type and the [Search and query](/docs/stack/search) features. ## Learn more diff --git a/docs/data-types/sorted-sets.md b/docs/data-types/sorted-sets.md index cd33c7af62..190150affb 100644 --- a/docs/data-types/sorted-sets.md +++ b/docs/data-types/sorted-sets.md @@ -275,7 +275,7 @@ This command's time complexity is O(log(n) + m), where _m_ is the number of resu ## Alternatives Redis sorted sets are sometimes used for indexing other Redis data structures. -If you need to index and query your data, consider [RediSearch](/docs/stack/search) and [RedisJSON](/docs/stack/json). +If you need to index and query your data, consider the [JSON](/docs/stack/json) data type and the [Search and query](/docs/stack/search) features. ## Learn more diff --git a/docs/data-types/strings.md b/docs/data-types/strings.md index 00cf39d918..f5047cbfeb 100644 --- a/docs/data-types/strings.md +++ b/docs/data-types/strings.md @@ -122,7 +122,7 @@ These random-access string commands may cause performance issues when dealing wi ## Alternatives -If you're storing structured data as a serialized string, you may also want to consider [Redis hashes](/docs/data-types/hashes) or [RedisJSON](/docs/stack/json). +If you're storing structured data as a serialized string, you may also want to consider Redis [hashes](/docs/data-types/hashes) or [JSON](/docs/stack/json). ## Learn more From 142d84efa880551fe1dc0f721d6553af6bdff4cb Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 30 Jun 2023 09:20:54 +0100 Subject: [PATCH 15/23] Add words to wordlist --- wordlist | 3 +++ 1 file changed, 3 insertions(+) diff --git a/wordlist b/wordlist index f9869c3ccb..aadc0f5632 100644 --- a/wordlist +++ b/wordlist @@ -124,6 +124,8 @@ EDOM EEXIST EFBIG EINVAL +Enduro +Ergonom ENOENT ENOTSUP EOF @@ -216,6 +218,7 @@ Leaderboards Leau Lehmann Levelgraph +licensor LibLZF Linode Liveness From 908307addf781a4a924167aefbd68975cf994fd0 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Fri, 30 Jun 2023 09:25:41 +0100 Subject: [PATCH 16/23] Adds more words to wordlist --- wordlist | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/wordlist b/wordlist index aadc0f5632..6a74dd2934 100644 --- a/wordlist +++ b/wordlist @@ -61,6 +61,7 @@ B2 B3 BCC's BDFL-style +birthyear BPF BPF's BPF-optimized @@ -171,13 +172,19 @@ HashMaps HashSets Haversine Hexastore +hget +hgetall +hincrby +hmget Hitmeister Homebrew Hotspot +hset HyperLogLog HyperLogLog. 
HyperLogLogs
Hyperloglogs
+hyperloglogs
IOPs
IPC
IPs
@@ -186,6 +193,7 @@ IPv4
IPv6
IS-MASTER-DOWN-BY-ADDR
Identinal
IoT
+incrby_get_mget
Itamar
Jedis
JedisCluster
@@ -219,11 +227,13 @@ Leau
Lehmann
Levelgraph
licensor
+licensor's
LibLZF
Linode
Liveness
Lua
Lua's
+lua-debugging
Lua-to-Redis
Lucraft
M1
@@ -261,6 +271,7 @@ Ok
OpenBSD
OpenSSL
Opteron
+ORM
PEL
PELs
PEM
@@ -355,6 +366,7 @@ S1
S2
S3
S4
+SaaS
SCP
SDOWN
SHA-256

From cd29a173ebf929b42eabc7feeb6004dd8fc5c533 Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Mon, 3 Jul 2023 11:27:10 +0100
Subject: [PATCH 17/23] Small reword

---
 docs/about/license.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/about/license.md b/docs/about/license.md
index 39b4c77979..ba95aad298 100644
--- a/docs/about/license.md
+++ b/docs/about/license.md
@@ -17,7 +17,7 @@ used in accordance with the [Redis Trademark Guidelines](https://redis.com/legal

 * RedisInsight is licensed under the Server Side Public License (SSPL).

-* Redis Stack Server, which combines open source Redis with Search and Query, JSON, Time Series, and Probabilistic, is dual-licensed under the Redis Source Available License (RSALv2), as described below, and the [Server Side Public License](https://en.wikipedia.org/wiki/Server_Side_Public_License) (SSPL). For information about licensing per version, see [Versions and licenses](/docs/stack/#versions-and-licenses).
+* Redis Stack Server, which combines open source Redis with Search and Query features, JSON, Time Series, and Probabilistic data structures, is dual-licensed under the Redis Source Available License (RSALv2), as described below, and the [Server Side Public License](https://en.wikipedia.org/wiki/Server_Side_Public_License) (SSPL). For information about licensing per version, see [Versions and licenses](/docs/stack/#versions-and-licenses).

 ## Licences:

From a3a2fdafec15907a82ea404156df48417d5c6e07 Mon Sep 17 00:00:00 2001
From: Elena Kolevska
Date: Mon, 3 Jul 2023 12:16:49 +0100
Subject: =?UTF-8?q?Swaps=20places=20in=20docs=20for=20?=
 =?UTF-8?q?=E2=80=9CData=20types=E2=80=9D=20and=20=E2=80=9CInteract=20with?=
 =?UTF-8?q?=20data=E2=80=9D?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/data-types/_index.md | 2 +-
 docs/interact-with-data/_index.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/data-types/_index.md b/docs/data-types/_index.md
index e4e7dae709..0defc77a43 100644
--- a/docs/data-types/_index.md
+++ b/docs/data-types/_index.md
@@ -2,7 +2,7 @@
 title: "Redis data types"
 linkTitle: "Data types"
 description: Overview of data types supported by Redis
-weight: 40
+weight: 35
 aliases:
 - /docs/manual/data-types
 - /topics/data-types
diff --git a/docs/interact-with-data/_index.md b/docs/interact-with-data/_index.md
index 97de85b4ae..a69055546b 100644
--- a/docs/interact-with-data/_index.md
+++ b/docs/interact-with-data/_index.md
@@ -2,7 +2,7 @@
 title: "Interact with data in Redis"
 linkTitle: "Interact with data"
-weight: 35
+weight: 40
 description: >
 How to interact with data in Redis, including searching, querying, triggered functions, transactions, and pub/sub.
From d66cf89eb1cac14f9eb42960bbb79bfaf3e1b154 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Tue, 4 Jul 2023 13:21:32 +0100 Subject: [PATCH 19/23] Remove double linking --- docs/data-types/hashes.md | 6 ++-- docs/data-types/lists.md | 54 +++++++++++++++++----------------- docs/data-types/sets.md | 14 ++++----- docs/data-types/sorted-sets.md | 2 +- docs/data-types/streams.md | 2 +- docs/data-types/strings.md | 28 +++++++++--------- docs/management/admin.md | 4 +-- docs/management/replication.md | 12 ++++---- 8 files changed, 61 insertions(+), 61 deletions(-) diff --git a/docs/data-types/hashes.md b/docs/data-types/hashes.md index e480d13b26..e3d60a8ef5 100644 --- a/docs/data-types/hashes.md +++ b/docs/data-types/hashes.md @@ -32,8 +32,8 @@ While hashes are handy to represent *objects*, actually the number of fields you put inside a hash has no practical limits (other than available memory), so you can use hashes in many different ways inside your application. -The command [`HSET`](/commands/hset) sets multiple fields of the hash, while [`HGET`](/commands/hget) retrieves -a single field. [`HMGET`](/commands/hmget) is similar to [`HGET`](/commands/hget) but returns an array of values: +The command `HSET` sets multiple fields of the hash, while `HGET` retrieves +a single field. `HMGET` is similar to `HGET` but returns an array of values: {{< clients-example hash_tutorial hmget >}} > hmget user:1000 username birthyear no-such-field @@ -43,7 +43,7 @@ a single field. [`HMGET`](/commands/hmget) is similar to [`HGET`](/commands/hget {{< /clients-example >}} There are commands that are able to perform operations on individual fields -as well, like [`HINCRBY`](/commands/hincrby): +as well, like `HINCRBY`: {{< clients-example hash_tutorial hincrby >}} > hincrby bike:1 price 100 diff --git a/docs/data-types/lists.md b/docs/data-types/lists.md index df5822b97d..a73f676d7a 100644 --- a/docs/data-types/lists.md +++ b/docs/data-types/lists.md @@ -105,7 +105,7 @@ an Array are very different from the properties of a List implemented using a Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element in the head or in the tail of the list is performed *in constant time*. The speed of adding a -new element with the [`LPUSH`](/commands/lpush) command to the head of a list with ten +new element with the `LPUSH` command to the head of a list with ten elements is the same as adding an element to the head of list with 10 million elements. @@ -125,10 +125,10 @@ Sorted sets are covered in the [Sorted sets](/docs/data-types/sorted-sets) tutor ### First steps with Redis Lists -The [`LPUSH`](/commands/lpush) command adds a new element into a list, on the -left (at the head), while the [`RPUSH`](/commands/rpush) command adds a new +The `LPUSH` command adds a new element into a list, on the +left (at the head), while the `RPUSH` command adds a new element into a list, on the right (at the tail). Finally the -[`LRANGE`](/commands/lrange) command extracts ranges of elements from lists: +`LRANGE` command extracts ranges of elements from lists: > rpush mylist A (integer) 1 @@ -141,13 +141,13 @@ element into a list, on the right (at the tail). Finally the 2) "A" 3) "B" -Note that [LRANGE](/commands/lrange) takes two indexes, the first and the last +Note that `LRANGE` takes two indexes, the first and the last element of the range to return. 
Both the indexes can be negative, telling Redis to start counting from the end: so -1 is the last element, -2 is the penultimate element of the list, and so forth. -As you can see [`RPUSH`](/commands/rpush) appended the elements on the right of the list, while -the final [`LPUSH`](/commands/lpush) appended the element on the left. +As you can see `RPUSH` appended the elements on the right of the list, while +the final `LPUSH` appended the element on the left. Both commands are *variadic commands*, meaning that you are free to push multiple elements into a list in a single call: @@ -208,7 +208,7 @@ posted by users into Redis lists. To describe a common use case step by step, imagine your home page shows the latest photos published in a photo sharing social network and you want to speedup access. -* Every time a user posts a new photo, we add its ID into a list with [`LPUSH`](/commands/lpush). +* Every time a user posts a new photo, we add its ID into a list with `LPUSH`. * When users visit the home page, we use `LRANGE 0 9` in order to get the latest 10 posted items. ### Capped lists @@ -217,9 +217,9 @@ In many use cases we just want to use lists to store the *latest items*, whatever they are: social network updates, logs, or anything else. Redis allows us to use lists as a capped collection, only remembering the latest -N items and discarding all the oldest items using the [`LTRIM`](/commands/ltrim) command. +N items and discarding all the oldest items using the `LTRIM` command. -The [`LTRIM`](/commands/ltrim) command is similar to [`LRANGE`](/commands/lrange), but **instead of displaying the +The `LTRIM` command is similar to `LRANGE`, but **instead of displaying the specified range of elements** it sets this range as the new list value. All the elements outside the given range are removed. @@ -234,7 +234,7 @@ An example will make it more clear: 2) "2" 3) "3" -The above [`LTRIM`](/commands/ltrim) command tells Redis to keep just list elements from index +The above `LTRIM` command tells Redis to keep just list elements from index 0 to 2, everything else will be discarded. This allows for a very simple but useful pattern: doing a List push operation + a List trim operation together in order to add a new element and discard elements exceeding a limit: @@ -243,10 +243,10 @@ in order to add a new element and discard elements exceeding a limit: LTRIM mylist 0 999 The above combination adds a new element and keeps only the 1000 -newest elements into the list. With [`LRANGE`](/commands/lrange) you can access the top items +newest elements into the list. With `LRANGE` you can access the top items without any need to remember very old data. -Note: while [`LRANGE`](/commands/lrange) is technically an O(N) command, accessing small ranges +Note: while `LRANGE` is technically an O(N) command, accessing small ranges towards the head or the tail of the list is a constant time operation. Blocking operations on lists @@ -261,23 +261,23 @@ a different process in order to actually do some kind of work with those items. This is the usual producer / consumer setup, and can be implemented in the following simple way: -* To push items into the list, producers call [`LPUSH`](/commands/lpush). -* To extract / process items from the list, consumers call [`RPOP`](/commands/rpop). +* To push items into the list, producers call `LPUSH`. +* To extract / process items from the list, consumers call `RPOP`. 
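+As a minimal sketch of this setup (the `tasks` key name and the payloads are
+only illustrative), the two roles could look like this across two `redis-cli`
+connections, labeled `producer>` and `consumer>` for clarity:
+
+    producer> lpush tasks "send-notification-1"
+    (integer) 1
+    producer> lpush tasks "send-notification-2"
+    (integer) 2
+    consumer> rpop tasks
+    "send-notification-1"
+    consumer> rpop tasks
+    "send-notification-2"
+
+Because `LPUSH` adds at the head and `RPOP` removes from the tail, items are
+processed in the same order they were produced.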
However it is possible that sometimes the list is empty and there is nothing -to process, so [`RPOP`](/commands/rpop) just returns NULL. In this case a consumer is forced to wait -some time and retry again with [`RPOP`](/commands/rpop). This is called *polling*, and is not +to process, so `RPOP` just returns NULL. In this case a consumer is forced to wait +some time and retry again with `RPOP`. This is called *polling*, and is not a good idea in this context because it has several drawbacks: 1. Forces Redis and clients to process useless commands (all the requests when the list is empty will get no actual work done, they'll just return NULL). -2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to [`RPOP`](/commands/rpop), with the effect of amplifying problem number 1, i.e. more useless calls to Redis. +2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to `RPOP`, with the effect of amplifying problem number 1, i.e. more useless calls to Redis. -So Redis implements commands called [`BRPOP`](/commands/brpop) and [`BLPOP`](/commands/blpop) which are versions -of [`RPOP`](/commands/rpop) and [`LPOP`](/commands/lpop) able to block if the list is empty: they'll return to +So Redis implements commands called `BRPOP` and `BLPOP` which are versions +of `RPOP` and `LPOP` able to block if the list is empty: they'll return to the caller only when a new element is added to the list, or when a user-specified timeout is reached. -This is an example of a [`BRPOP`](/commands/brpop) call we could use in the worker: +This is an example of a `BRPOP` call we could use in the worker: > brpop tasks 5 1) "tasks" @@ -291,17 +291,17 @@ also specify multiple lists and not just one, in order to wait on multiple lists at the same time, and get notified when the first list receives an element. -A few things to note about [`BRPOP`](/commands/brpop): +A few things to note about `BRPOP`: 1. Clients are served in an ordered way: the first client that blocked waiting for a list, is served first when an element is pushed by some other client, and so forth. -2. The return value is different compared to [`RPOP`](/commands/rpop): it is a two-element array since it also includes the name of the key, because [`BRPOP`](/commands/brpop) and [`BLPOP`](/commands/blpop) are able to block waiting for elements from multiple lists. +2. The return value is different compared to `RPOP`: it is a two-element array since it also includes the name of the key, because `BRPOP` and `BLPOP` are able to block waiting for elements from multiple lists. 3. If the timeout is reached, NULL is returned. There are more things you should know about lists and blocking ops. We suggest that you read more on the following: -* It is possible to build safer queues or rotating queues using [`LMOVE`](/commands/lmove). -* There is also a blocking variant of the command, called [`BLMOVE`](/commands/blmove). +* It is possible to build safer queues or rotating queues using `LMOVE`. +* There is also a blocking variant of the command, called `BLMOVE`. ## Automatic creation and removal of keys @@ -309,7 +309,7 @@ So far in our examples we never had to create empty lists before pushing elements, or removing empty lists when they no longer have elements inside. 
It is Redis' responsibility to delete keys when lists are left empty, or to create an empty list if the key does not exist and we are trying to add elements -to it, for example, with [`LPUSH`](/commands/lpush). +to it, for example, with `LPUSH`. This is not specific to lists, it applies to all the Redis data types composed of multiple elements -- Streams, Sets, Sorted Sets and Hashes. @@ -318,7 +318,7 @@ Basically we can summarize the behavior with three rules: 1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element. 2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed. The Stream data type is the only exception to this rule. -3. Calling a read-only command such as [`LLEN`](/commands/llen) (which returns the length of the list), or a write command removing elements, with an empty key, always produces the same result as if the key is holding an empty aggregate type of the type the command expects to find. +3. Calling a read-only command such as `LLEN` (which returns the length of the list), or a write command removing elements, with an empty key, always produces the same result as if the key is holding an empty aggregate type of the type the command expects to find. Examples of rule 1: diff --git a/docs/data-types/sets.md b/docs/data-types/sets.md index 7e588c8eb2..1ad2dcbdfd 100644 --- a/docs/data-types/sets.md +++ b/docs/data-types/sets.md @@ -59,7 +59,7 @@ See the [complete list of set commands](https://redis.io/commands/?group=set). ## Tutorial -The [`SADD`](/commands/sadd) command adds new elements to a set. It's also possible +The `SADD` command adds new elements to a set. It's also possible to do a number of other operations against sets like testing if a given element already exists, performing the intersection, union or difference between multiple sets, and so forth. @@ -124,7 +124,7 @@ a Redis hash, which maps tag IDs to tag names. There are other non trivial operations that are still easy to implement using the right Redis commands. For instance we may want a list of all the objects with the tags 1, 2, 10, and 27 together. We can do this using -the [`SINTER`](/commands/sinter) command, which performs the intersection between different +the `SINTER` command, which performs the intersection between different sets. We can use: > sinter tag:1:news tag:2:news tag:10:news tag:27:news @@ -133,7 +133,7 @@ sets. We can use: In addition to intersection you can also perform unions, difference, extract a random element, and so forth. -The command to extract an element is called [`SPOP`](/commands/spop), and is handy to model +The command to extract an element is called `SPOP`, and is handy to model certain problems. For example in order to implement a web-based poker game, you may want to represent your deck with a set. Imagine we use a one-char prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades: @@ -144,7 +144,7 @@ prefix for (C)lubs, (D)iamonds, (H)earts, (S)pades: S7 S8 S9 S10 SJ SQ SK (integer) 52 -Now we want to provide each player with 5 cards. The [`SPOP`](/commands/spop) command +Now we want to provide each player with 5 cards. The `SPOP` command removes a random element, returning it to the client, so it is the perfect operation in this case. @@ -153,7 +153,7 @@ game we'll need to populate the deck of cards again, which may not be ideal. 
 So to start, we can make a copy of the set stored in the `deck` key into the `game:1:deck` key.
-This is accomplished using [`SUNIONSTORE`](/commands/sunionstore), which normally performs the
+This is accomplished using `SUNIONSTORE`, which normally performs the
 union between multiple sets, and stores the result into another set.
 However, since the union of a single set is itself, I can copy my deck with:
@@ -178,7 +178,7 @@ One pair of jacks, not great...

 This is a good time to introduce the set command that provides the number
 of elements inside a set. This is often called the *cardinality of a set*
-in the context of set theory, so the Redis command is called [`SCARD`](/commands/scard).
+in the context of set theory, so the Redis command is called `SCARD`.

 > scard game:1:deck
 (integer) 47

 The math works: 52 - 5 = 47.

 When you need to just get random elements without removing them from the
-set, there is the [`SRANDMEMBER`](/commands/srandmember) command suitable for the task. It also features
+set, there is the `SRANDMEMBER` command suitable for the task. It also features
 the ability to return both repeating and non-repeating elements.

 ## Limits

diff --git a/docs/data-types/sorted-sets.md b/docs/data-types/sorted-sets.md
index 190150affb..a43fdbb0b8 100644
--- a/docs/data-types/sorted-sets.md
+++ b/docs/data-types/sorted-sets.md
@@ -81,7 +81,7 @@ Note: 0 and -1 means from element index 0 to the last element (-1 works here
 just as it does in the case of the `LRANGE` command).

 What if I want to order them the opposite way, youngest to oldest?
-Use [ZREVRANGE](/commands/zrevrange) instead of [ZRANGE](/commands/zrange):
+Use `ZREVRANGE` instead of `ZRANGE`:

 > zrevrange hackers 0 -1
 1) "Linus Torvalds"

diff --git a/docs/data-types/streams.md b/docs/data-types/streams.md
index 5651ea2456..6488c5519a 100644
--- a/docs/data-types/streams.md
+++ b/docs/data-types/streams.md
@@ -81,7 +81,7 @@ See each command's time complexity for the details.

 ## Streams basics

-Streams are an append-only data structure. The fundamental write command, called [XADD](/commands/xadd), appends a new entry to the specified stream.
+Streams are an append-only data structure. The fundamental write command, called `XADD`, appends a new entry to the specified stream.

 Each stream entry consists of one or more field-value pairs, somewhat like a record or a Redis hash:

diff --git a/docs/data-types/strings.md b/docs/data-types/strings.md
index f5047cbfeb..7a5f0dfdc2 100644
--- a/docs/data-types/strings.md
+++ b/docs/data-types/strings.md
@@ -23,16 +23,16 @@ will be performed via `redis-cli` in this tutorial).

 > get mykey
 "somevalue"

-As you can see using the [`SET`](/commands/set) and the [`GET`](/commands/get) commands are the way we set
-and retrieve a string value. Note that [`SET`](/commands/set) will replace any existing value
+As you can see, the `SET` and `GET` commands are the way we set
+and retrieve a string value. Note that `SET` will replace any existing value
 already stored into the key, in the case that the key already exists, even if
 the key is associated with a non-string value. So `SET` performs an assignment.

 Values can be strings (including binary data) of every kind, for instance you
 can store a jpeg image inside a value. A value can't be bigger than 512 MB.
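+For instance, one convenient way to load a binary file into a string value
+from the shell is the `-x` option of `redis-cli`, which reads the last
+argument from standard input (the file name and the reported length below are
+just an example):
+
+    $ redis-cli -x set mypicture < image.jpg
+    OK
+    $ redis-cli strlen mypicture
+    (integer) 102412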
-The [`SET`](/commands/set) command has interesting options, that are provided as additional
-arguments. For example, I may ask [`SET`](/commands/set) to fail if the key already exists,
+The `SET` command has interesting options that are provided as additional
+arguments. For example, I may ask `SET` to fail if the key already exists,
 or the opposite, that it only succeed if the key already exists:

 > set mykey newval nx
 (nil)
 > set mykey newval xx
 OK

 There are a number of other commands for operating on strings. For example
-the [`GETSET`](/commands/getset) command sets a key to a new value, returning the old value as the
+the `GETSET` command sets a key to a new value, returning the old value as the
 result. You can use this command, for example, if you have a
-system that increments a Redis key using [`INCR`](/commands/incr)
+system that increments a Redis key using `INCR`
 every time your web site receives a new visitor. You may want to collect this
 information once every hour, without losing a single increment.
-You can [`GETSET`](/commands/getset) the key, assigning it the new value of "0" and reading the
+You can `GETSET` the key, assigning it the new value of "0" and reading the
 old value back.

 The ability to set or retrieve the value of multiple keys in a single
 command is also useful for reduced latency. For this reason there are
-the [`MSET`](/commands/mset) and [`MGET`](/commands/mget) commands:
+the `MSET` and `MGET` commands:

 > mset a 10 b 20 c 30
 OK
 > mget a b c
 1) "10"
 2) "20"
 3) "30"

-When [`MGET`](/commands/mget) is used, Redis returns an array of values.
+When `MGET` is used, Redis returns an array of values.

 ### Strings as counters
 Even if strings are the basic values of Redis, there are interesting operations
@@ -75,10 +75,10 @@ you can perform with them. For instance, one is atomic increment:

 > incrby counter 50
 (integer) 152

-The [INCR](/commands/incr) command parses the string value as an integer,
+The `INCR` command parses the string value as an integer,
 increments it by one, and finally sets the obtained value as the new value.
-There are other similar commands like [INCRBY](/commands/incrby),
-[DECR](/commands/decr) and [DECRBY](/commands/decrby). Internally it's
+There are other similar commands like `INCRBY`,
+`DECR` and `DECRBY`. Internally it's
 always the same command, acting in a slightly different way.

 What does it mean that INCR is atomic?
@@ -106,7 +106,7 @@ By default, a single Redis string can be a maximum of 512 MB.

 ### Managing counters

 * `INCRBY` atomically increments (and decrements when passing a negative number) counters stored at a given key.
-* Another command exists for floating point counters: [INCRBYFLOAT](/commands/incrbyfloat).
+* Another command exists for floating point counters: `INCRBYFLOAT`.

 ### Bitwise operations

diff --git a/docs/management/admin.md b/docs/management/admin.md
index 79bfb70b5f..8d31b9a14d 100644
--- a/docs/management/admin.md
+++ b/docs/management/admin.md
@@ -56,7 +56,7 @@ aliases: [

 ## Upgrading or restarting a Redis instance without downtime

-Redis is designed to be a long-running process in your server. You can modify many configuration options without a restart using the [CONFIG SET command](/commands/config-set). You can also switch from AOF to RDB snapshots persistence, or the other way around, without restarting Redis. Check the output of the `CONFIG GET *` command for more information.
+Redis is designed to be a long-running process in your server. You can modify many configuration options without a restart using the `CONFIG SET` command. You can also switch from AOF to RDB snapshots persistence, or the other way around, without restarting Redis. Check the output of the `CONFIG GET *` command for more information.

 From time to time, a restart is required, for example, to upgrade the Redis process to a newer version, or when you need to modify a configuration parameter that is currently not supported by the `CONFIG` command.
@@ -74,7 +74,7 @@ Follow these steps to avoid downtime.

 * Configure all your clients to use the new instance (the replica). Note that you may want to use the `CLIENT PAUSE` command to ensure that no client can write to the old master during the switch.

-* Once you confirm that the master is no longer receiving any queries (you can check this using the [MONITOR command](/commands/monitor)), elect the replica to master using the `REPLICAOF NO ONE` command, and then shut down your master.
+* Once you confirm that the master is no longer receiving any queries (you can check this using the `MONITOR` command), elect the replica to master using the `REPLICAOF NO ONE` command, and then shut down your master.

 If you are using [Redis Sentinel](/topics/sentinel) or [Redis Cluster](/topics/cluster-tutorial), the simplest way to upgrade to newer versions is to upgrade one replica after the other. Then you can perform a manual failover to promote one of the upgraded replicas to master, and finally promote the last replica.

diff --git a/docs/management/replication.md b/docs/management/replication.md
index 801638a563..79e7e341fc 100644
--- a/docs/management/replication.md
+++ b/docs/management/replication.md
@@ -200,14 +200,14 @@ Historically, there were some use cases that were considered legitimate for writ

 As of version 7.0, these use cases are now all obsolete and the same can be achieved by other means. For example:

-* Computing slow Set or Sorted set operations and storing the result in temporary local keys using commands like [SUNIONSTORE](/commands/sunionstore) and [ZINTERSTORE](/commands/zinterstore).
-  Instead, use commands that return the result without storing it, such as [SUNION](/commands/sunion) and [ZINTER](/commands/zinter).
+* Computing slow Set or Sorted set operations and storing the result in temporary local keys using commands like `SUNIONSTORE` and `ZINTERSTORE`.
+  Instead, use commands that return the result without storing it, such as `SUNION` and `ZINTER`.

-* Using the [SORT](/commands/sort) command (which is not considered a read-only command because of the optional STORE option and therefore cannot be used on a read-only replica).
-  Instead, use [SORT_RO](/commands/sort_ro), which is a read-only command.
+* Using the `SORT` command (which is not considered a read-only command because of the optional STORE option and therefore cannot be used on a read-only replica).
+  Instead, use `SORT_RO`, which is a read-only command.

-* Using [EVAL](/commands/eval) and [EVALSHA](/commands/evalsha) are also not considered read-only commands, because the Lua script may call write commands.
-  Instead, use [EVAL_RO](/commands/eval_ro) and [EVALSHA_RO](/commands/evalsha_ro) where the Lua script can only call read-only commands.
+* `EVAL` and `EVALSHA` are also not considered read-only commands, because the Lua script may call write commands.
+  Instead, use `EVAL_RO` and `EVALSHA_RO`, where the Lua script can only call read-only commands, as shown in the sketch below.
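+As a quick sketch, the read-only variants can be used freely on a replica
+where their writing counterparts would be rejected (the `replica>` prompt
+marks a connection to a replica; key names and values are illustrative):
+
+    replica> SORT_RO mylist ALPHA
+    1) "a"
+    2) "b"
+    3) "c"
+    replica> EVAL_RO "return redis.call('GET', KEYS[1])" 1 mykey
+    "some-value"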
While writes to a replica will be discarded if the replica and the master resync or if the replica is restarted, there is no guarantee that they will sync automatically. From 65bb08a5219f2a0e0aa55952358447fc1e291a59 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Tue, 4 Jul 2023 13:23:29 +0100 Subject: [PATCH 20/23] =?UTF-8?q?Remove=20=E2=80=9C-ing=E2=80=9D=20form=20?= =?UTF-8?q?from=20menus?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/getting-started/_index.md | 4 ++-- docs/management/_index.md | 4 ++-- docs/manual/_index.md | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/getting-started/_index.md b/docs/getting-started/_index.md index 688a129ecc..b07a13648d 100644 --- a/docs/getting-started/_index.md +++ b/docs/getting-started/_index.md @@ -1,6 +1,6 @@ --- -title: "Getting started with Redis" -linkTitle: "Getting started" +title: "Get started with Redis" +linkTitle: "Get started" weight: 20 diff --git a/docs/management/_index.md b/docs/management/_index.md index df79a1b191..bc67c5ceae 100644 --- a/docs/management/_index.md +++ b/docs/management/_index.md @@ -1,6 +1,6 @@ --- -title: "Managing Redis" -linkTitle: "Managing Redis" +title: "Manage Redis" +linkTitle: "Manage Redis" description: An administrator's guide to Redis weight: 60 --- diff --git a/docs/manual/_index.md b/docs/manual/_index.md index 0b0eccb8e7..b7b72b871e 100644 --- a/docs/manual/_index.md +++ b/docs/manual/_index.md @@ -1,6 +1,6 @@ --- -title: "Using Redis" -linkTitle: "Using Redis" +title: "Use Redis" +linkTitle: "Use Redis" description: A developer's guide to Redis weight: 50 --- From 9628635ce5a0cd115e7de0f58b97f9007f084ac4 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Tue, 4 Jul 2023 13:25:45 +0100 Subject: [PATCH 21/23] Fix hash tutorial example --- docs/data-types/hashes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/data-types/hashes.md b/docs/data-types/hashes.md index e3d60a8ef5..2409955a13 100644 --- a/docs/data-types/hashes.md +++ b/docs/data-types/hashes.md @@ -36,9 +36,9 @@ The command `HSET` sets multiple fields of the hash, while `HGET` retrieves a single field. `HMGET` is similar to `HGET` but returns an array of values: {{< clients-example hash_tutorial hmget >}} -> hmget user:1000 username birthyear no-such-field -1) "antirez" -2) "1977" +> hmget bike:1 model price no-such-field +1) "Deimos" +2) "4972" 3) (nil) {{< /clients-example >}} From fe74fd85071fcf45f699c68d8514b60f7f164455 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Wed, 5 Jul 2023 13:04:02 +0100 Subject: [PATCH 22/23] Remove ruby example --- docs/getting-started/_index.md | 20 -------------------- 1 file changed, 20 deletions(-) diff --git a/docs/getting-started/_index.md b/docs/getting-started/_index.md index b07a13648d..12352e5220 100644 --- a/docs/getting-started/_index.md +++ b/docs/getting-started/_index.md @@ -68,26 +68,6 @@ the goal is to use it from your application. In order to do so you need to download and install a Redis client library for your programming language. You'll find a [full list of clients for different languages in this page](https://redis.io/clients). -For instance if you happen to use the Ruby programming language our best advice -is to use the [Redis-rb](https://github.com/redis/redis-rb) client. -You can install it using the command **gem install redis**. 
- -These instructions are Ruby specific but actually many library clients for -popular languages look quite similar: you create a Redis object and execute -commands calling methods. A short interactive example using Ruby: - - >> require 'rubygems' - => false - >> require 'redis' - => true - >> r = Redis.new - => # - >> r.ping - => "PONG" - >> r.set('foo','bar') - => "OK" - >> r.get('foo') - => "bar" ## Redis persistence From d93ddc9426c65060fbe671a668ddfb019698a805 Mon Sep 17 00:00:00 2001 From: Elena Kolevska Date: Wed, 5 Jul 2023 13:43:23 +0100 Subject: [PATCH 23/23] =?UTF-8?q?Change=20path=20of=20=E2=80=9Cinteract?= =?UTF-8?q?=E2=80=9D?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/{interact-with-data => interact}/_index.md | 0 docs/{interact-with-data => interact}/programmability/_index.md | 0 .../programmability/eval-intro.md | 0 .../programmability/functions-intro.md | 0 docs/{interact-with-data => interact}/programmability/lua-api.md | 0 .../programmability/lua-debugging.md | 0 docs/{interact-with-data => interact}/pubsub.md | 0 docs/{interact-with-data => interact}/transactions.md | 0 8 files changed, 0 insertions(+), 0 deletions(-) rename docs/{interact-with-data => interact}/_index.md (100%) rename docs/{interact-with-data => interact}/programmability/_index.md (100%) rename docs/{interact-with-data => interact}/programmability/eval-intro.md (100%) rename docs/{interact-with-data => interact}/programmability/functions-intro.md (100%) rename docs/{interact-with-data => interact}/programmability/lua-api.md (100%) rename docs/{interact-with-data => interact}/programmability/lua-debugging.md (100%) rename docs/{interact-with-data => interact}/pubsub.md (100%) rename docs/{interact-with-data => interact}/transactions.md (100%) diff --git a/docs/interact-with-data/_index.md b/docs/interact/_index.md similarity index 100% rename from docs/interact-with-data/_index.md rename to docs/interact/_index.md diff --git a/docs/interact-with-data/programmability/_index.md b/docs/interact/programmability/_index.md similarity index 100% rename from docs/interact-with-data/programmability/_index.md rename to docs/interact/programmability/_index.md diff --git a/docs/interact-with-data/programmability/eval-intro.md b/docs/interact/programmability/eval-intro.md similarity index 100% rename from docs/interact-with-data/programmability/eval-intro.md rename to docs/interact/programmability/eval-intro.md diff --git a/docs/interact-with-data/programmability/functions-intro.md b/docs/interact/programmability/functions-intro.md similarity index 100% rename from docs/interact-with-data/programmability/functions-intro.md rename to docs/interact/programmability/functions-intro.md diff --git a/docs/interact-with-data/programmability/lua-api.md b/docs/interact/programmability/lua-api.md similarity index 100% rename from docs/interact-with-data/programmability/lua-api.md rename to docs/interact/programmability/lua-api.md diff --git a/docs/interact-with-data/programmability/lua-debugging.md b/docs/interact/programmability/lua-debugging.md similarity index 100% rename from docs/interact-with-data/programmability/lua-debugging.md rename to docs/interact/programmability/lua-debugging.md diff --git a/docs/interact-with-data/pubsub.md b/docs/interact/pubsub.md similarity index 100% rename from docs/interact-with-data/pubsub.md rename to docs/interact/pubsub.md diff --git a/docs/interact-with-data/transactions.md b/docs/interact/transactions.md similarity index 100% 
rename from docs/interact-with-data/transactions.md rename to docs/interact/transactions.md