Redis Overview

Brief Summary

Understanding Redis
- In-memory database
- Commonly used for caching, message queues, distributed locks
- Atomic operations
Data structures of Redis values (five common types)
- String: caching objects, general counting, distributed locks, shared session information, etc.
- List (linked list): message queues, etc.
- Hash (unordered hash table): caching objects, shopping carts, etc.
- Set (unordered collection of strings): aggregation calculations (union, intersection, difference) scenarios, such as likes, mutual follows, lottery activities, etc.
- Zset (sorted set): sorting scenarios, such as leaderboards, sorting by phone and name, etc.
- … (there are 4 more, not written here)
Redis thread model
- Single-threaded means:
  - [Receive client requests > Parse > Data read/write > Send data to client] is completed by the main thread.
  - The Redis program is not single-threaded.
- Before and after 6.0
  - Before - single-threaded: maintainable, simple, no lock issues.
  - After - multi-threaded: uses multiple I/O threads to handle network requests (due to network I/O performance bottlenecks).
  - Summary: Command execution is still single-threaded; multi-threading is introduced to improve network I/O processing performance.
Redis persistence
- AOF log: After executing commands, write the commands to the AOF file, load this file upon restart.
  - Strategies:
    - Always: synchronously write after executing the command.
    - Everysec: first write the command to the kernel buffer, then write the buffer's content to disk every second.
    - No: Redis does not trigger the flush itself; the operating system controls when the buffer's content is written to disk.
  - When too large: triggers AOF rewrite mechanism.
  - Disadvantage: file too large, long loading time.
- RDB snapshot: writes memory data at a certain moment (frequency controllable) to disk in binary format.
  - Methods:
    - save command: the snapshot is generated by the main thread, which blocks it.
    - bgsave command: the snapshot is generated by a forked child process, so the main thread is not blocked.
  - Disadvantage: high risk of data loss.
- Hybrid: added in 4.0, solves slow AOF recovery and high risk of RDB data loss, combines the advantages of both.
Redis cluster
- Methods:
  - Master-slave replication: master synchronizes to slave (initial full, subsequent incremental), master writes, slave reads.
  - Sentinel mode: based on master-slave replication, adds sentinel monitoring of the health status of master, slave, and sentinel, automatic failover.
  - Sharded cluster: distributes data across different servers, reducing single-point dependency and improving read/write performance.
- Cluster split-brain: network partition (under different networks) leads to dual masters.
  - Solutions:
    - min-slaves-to-write x (renamed min-replicas-to-write in Redis 5.0): the master needs at least x slaves connected; below that, the master rejects writes.
    - min-slaves-max-lag x (renamed min-replicas-max-lag in Redis 5.0): the master-slave replication lag must stay under x seconds; once exceeded, the master rejects writes.
Redis expiration deletion and memory eviction
- Expiration deletion: compares expiration dictionary, returns data for non-expired keys, deletes expired ones.
- Deletion strategies:
  - Lazy deletion: checks if the key is expired when accessed; if so, deletes the key and returns null.
  - Periodic deletion: periodically randomly selects a certain number of keys to check and deletes the expired ones.
    - Process: sample keys from the expiration dictionary → check each for expiration and delete the expired ones → if more than 25% of the sampled keys were expired, repeat the previous steps → otherwise wait for the next check.
- Handling expired keys during persistence:
  - AOF:
    - While writing: when an expired key is deleted, a delete command for it is appended to the AOF file.
    - During AOF rewrite: expired keys are not written to the rewritten AOF file.
  - RDB:
    - Ignores expired keys when generating RDB.
    - During loading, the master does not load expired keys, while the slave loads all.
- Master-slave handling of expired keys: the slave relies on the master (does not scan for expired keys), the master checks and deletes expired keys before synchronizing to the slave.
- Memory full: triggers memory eviction mechanism upon reaching the threshold, configured with maxmemory.
- Memory eviction mechanism strategies: a total of eight types, categorized into [no eviction] and [eviction].
  - No eviction: exceeds threshold, directly returns an error.
  - Eviction: seven policies in two categories, [among keys with an expiration time] and [among all keys]; the preferred choice here is to evict the least-used keys among those with an expiration time.
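The lazy and periodic deletion strategies described in this section can be sketched with a toy in-memory store. This is only an illustration, not Redis's actual implementation: the class name `ExpiringStore`, the sample size of 20, and the `max_rounds` cap are assumptions chosen for the example.

```python
import random
import time


class ExpiringStore:
    """Toy key-value store illustrating lazy and periodic expiration."""

    SAMPLE_SIZE = 20          # keys sampled per round (assumed value)
    REPEAT_THRESHOLD = 0.25   # repeat while more than 25% of the sample was expired

    def __init__(self, clock=time.monotonic):
        self.data = {}      # key -> value
        self.expires = {}   # key -> absolute expiry time (the "expiration dictionary")
        self.clock = clock

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = self.clock() + ttl

    def _is_expired(self, key):
        deadline = self.expires.get(key)
        return deadline is not None and deadline <= self.clock()

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        # Lazy deletion: expiry is checked only when the key is accessed.
        if self._is_expired(key):
            self._delete(key)
            return None
        return self.data.get(key)

    def periodic_sweep(self, max_rounds=16):
        """Periodic deletion: sample keys that have a TTL, delete the expired
        ones, and repeat while more than 25% of the sample was expired.
        max_rounds is an assumed cap that keeps a single sweep bounded."""
        deleted = 0
        for _ in range(max_rounds):
            candidates = list(self.expires)
            if not candidates:
                break
            sample = random.sample(candidates, min(self.SAMPLE_SIZE, len(candidates)))
            expired = [k for k in sample if self._is_expired(k)]
            for k in expired:
                self._delete(k)
            deleted += len(expired)
            if len(expired) / len(sample) <= self.REPEAT_THRESHOLD:
                break
        return deleted
```

Passing a fake `clock` makes the behaviour easy to observe without sleeping: advance the clock, call `get` to see lazy deletion, or call `periodic_sweep` to see how many keys one sweep removes.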
Redis cache design
- Avalanche: to ensure data consistency, cached data is given an expiration time; after it expires, the next request reads from the database and refreshes the cache. If a large number of keys expire at the same time, or Redis itself goes down, requests go directly to the DB and can crash it.
  - Solutions:
    - Expiration:
      - Stagger expiration times.
      - Mutex lock: only one business thread updates the cache at the same time.
      - Double key (main sets expiration, backup is permanent).
    - Crash:
      - Service circuit breaker.
      - Request rate limiting.
      - High availability.
- Breakdown: a hot key expires and the concurrent requests for it go directly to the DB, which can crash it (a special case of avalanche).
  - Solutions:
    - Mutex lock: only one business thread updates the cache at the same time.
    - Do not set expiration time.
- Penetration: data is not in the cache and not in the DB, generally due to business misoperation and malicious attacks.
  - Solutions:
    - Limit illegal requests.
    - Cache default values.
    - Use a Bloom filter to check if data exists, avoiding DB queries.
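As a rough illustration of the Bloom filter approach to cache penetration, the sketch below builds a small filter from the standard library and consults it before touching the cache or the database. The names `BloomFilter`, `lookup`, and the sizing parameters are assumptions for this example; in practice a Redis-backed or library implementation would normally be used.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def lookup(key, bloom, cache, db):
    """Read path guarded by the filter; cache and db are plain dicts here."""
    if not bloom.might_contain(key):
        return None               # definitely absent: skip cache and DB
    if key in cache:
        return cache[key]
    value = db.get(key)           # may still miss on a false positive
    if value is not None:
        cache[key] = value
    return value
```

Every key written to the database must also be added to the filter, otherwise valid keys would be rejected. A false positive only costs one extra database query, so correctness is preserved.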
- Dynamically cache hot data: sort by the latest access time, keeping the top data.
- Common cache update strategies:
  - Cache Aside (bypass cache, commonly used in actual development, suitable for read-heavy and write-light scenarios): the application interacts directly with [database, cache], responsible for maintaining the cache.
    - Write strategy: update the DB first, then delete the cache (do not change the order, otherwise it leads to data inconsistency; deletion follows lazy loading).
    - Read strategy:
      - If cached, return data directly.
      - If not cached, read from DB, write to cache, return data.
  - Read/Write Through.
  - Write Back: has data loss risk.
- Ensuring cache and DB data consistency when using update database + update cache: take a distributed lock around the update, and give the cached entry a shorter expiration time.
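The Cache Aside read and write strategies, combined with the mutex lock and staggered expiration ideas from the avalanche and breakdown items, might look like the following in-process sketch. Plain dicts stand in for Redis and the database, and `CacheAside`, `BASE_TTL`, and `JITTER` are assumed names and values for illustration only.

```python
import random
import threading
import time

BASE_TTL = 300   # seconds; assumed base expiration time
JITTER = 60      # seconds of random spread so keys do not expire together


class CacheAside:
    def __init__(self, db):
        self.db = db                          # dict standing in for the database
        self.cache = {}                       # key -> (value, expires_at)
        self.rebuild_lock = threading.Lock()  # one global lock, for simplicity

    def _cache_get(self, key):
        entry = self.cache.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= time.monotonic():
            self.cache.pop(key, None)
            return None
        return value

    def read(self, key):
        value = self._cache_get(key)
        if value is not None:
            return value                      # cache hit
        with self.rebuild_lock:               # only one thread rebuilds at a time
            value = self._cache_get(key)      # re-check after waiting for the lock
            if value is not None:
                return value
            value = self.db.get(key)
            if value is not None:
                ttl = BASE_TTL + random.uniform(0, JITTER)
                self.cache[key] = (value, time.monotonic() + ttl)
            return value

    def write(self, key, value):
        self.db[key] = value                  # 1. update the DB first
        self.cache.pop(key, None)             # 2. then delete the cache entry
```

A single global lock keeps the sketch short; a per-key lock (or a distributed lock when several application instances share one cache) would avoid serializing rebuilds of unrelated keys.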
Redis practical application
- Delayed execution queue: uses a ZSet (sorted set) whose Score stores the time at which each task should run, then uses ZRANGEBYSCORE to query members whose score is no later than the current time, which are the pending tasks.
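A minimal sketch of the delayed queue idea, with a dict standing in for the sorted set so that it runs without a Redis server. The class `DelayQueue` and the task names are hypothetical; the comments show the Redis commands each step corresponds to.

```python
class DelayQueue:
    def __init__(self):
        self.zset = {}   # member -> score, standing in for a Redis sorted set

    def schedule(self, task_id, run_at):
        # Equivalent of: ZADD delay_queue <run_at> <task_id>
        self.zset[task_id] = run_at

    def pop_due(self, now):
        # Equivalent of: ZRANGEBYSCORE delay_queue -inf <now>
        due = sorted((score, member)
                     for member, score in self.zset.items() if score <= now)
        claimed = []
        for _, member in due:
            # Equivalent of: ZREM delay_queue <member>; only the consumer whose
            # ZREM actually removes the member should run the task.
            if self.zset.pop(member, None) is not None:
                claimed.append(member)
        return claimed


queue = DelayQueue()
queue.schedule("email:42", run_at=100.0)
queue.schedule("sms:7", run_at=50.0)
queue.schedule("push:9", run_at=200.0)
ready = queue.pop_due(now=120.0)   # tasks whose run time has passed
```

With a real server, the query and the removal are separate commands, so checking the result of the removal (or wrapping both in a Lua script) is what keeps two consumers from running the same task.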
- Redis large key: refers to keys with very large values.
  - Conditions:
    - String type: the value is larger than 10 KB.
    - Hash, List, Set, ZSet types: the number of elements exceeds 5,000.
  - Impact:
    - Client timeout blocking: Redis executes commands in a single thread, operations on large keys take time, blocking the thread.
    - Causes network congestion: value size * access volume adds up to heavy traffic.
    - Blocks worker threads: deleting large keys blocks worker threads.
    - Uneven memory distribution: in a cluster, even when slots are sharded evenly, data and query skew can occur, with the nodes that hold large keys using more memory and serving higher QPS.
  - Troubleshooting:
    1. redis-cli --bigkeys
      - Note:
        - Run it on a slave node, because the scan can block the main thread.
        - If there are no slaves, run it during off-peak hours; use -i to set a sleep interval between scans and reduce the performance impact.
    2. SCAN: first use SCAN to iterate over the keys, then use TYPE to get each key's type.
      - String type: use STRLEN to get the length.
      - Collection types:
        - Estimate the size as average element size * number of elements.
        - Or use MEMORY USAGE to query the key's memory usage (Redis 4.0+).
    3. https://github.com/sripathikrishnan/redis-rdb-tools: parse RDB files.
  - Deletion: deleting a large key directly frees a lot of memory at once, but the free-list operations this requires take time and block the Redis main thread. Requests then time out, and the backlog can eventually exhaust Redis connections and cause various exceptions.
    - Methods:
      - In batches:
        - Large Hash: use hscan to fetch 100 fields at a time, then use hdel to delete them one by one.
        - Large Set: use sscan to scan 100 elements at a time, then use srem to delete them one by one.
        - Large ZSet: use zremrangebyrank to delete the top 100 elements at a time.
      - Asynchronously (Redis 4.0+): use unlink instead of del.
        - Actively call the unlink command.
        - Or configure the lazy-free parameters so that deletion happens asynchronously when the conditions are met.
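The batch deletion of a large Hash could be sketched as below. The function expects a client object with `hscan`, `hdel`, and `delete` methods shaped like those of the redis-py client; `FakeHashClient` is a hypothetical in-memory stand-in so the example runs without a server, and it ignores the cursor value because the caller deletes each page before asking for the next one.

```python
def delete_large_hash(client, key, batch_size=100):
    """Remove a large hash in small batches instead of one blocking DEL."""
    cursor = 0
    removed = 0
    while True:
        # Fetch roughly batch_size fields per round trip.
        cursor, fields = client.hscan(key, cursor=cursor, count=batch_size)
        if fields:
            removed += client.hdel(key, *fields)
        if cursor == 0:          # a cursor of 0 means the scan is complete
            break
    client.delete(key)           # the key is small (or already gone) by now
    return removed


class FakeHashClient:
    """Hypothetical in-memory stand-in for a Redis client (demo only)."""

    def __init__(self):
        self.hashes = {}
        self.hdel_calls = 0

    def hscan(self, name, cursor=0, count=10):
        items = list(self.hashes.get(name, {}).items())
        page = dict(items[:count])
        next_cursor = 1 if len(items) > count else 0
        return next_cursor, page

    def hdel(self, name, *fields):
        self.hdel_calls += 1
        h = self.hashes.get(name, {})
        n = sum(1 for f in fields if h.pop(f, None) is not None)
        if not h:
            self.hashes.pop(name, None)   # Redis drops empty hashes
        return n

    def delete(self, *names):
        return sum(1 for n in names if self.hashes.pop(n, None) is not None)


client = FakeHashClient()
client.hashes["big"] = {f"field{i}": i for i in range(250)}
removed = delete_large_hash(client, "big")
```

This variant passes each batch of fields to a single `hdel` call rather than deleting them one at a time, which saves round trips while keeping every command small.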
- Redis Pipeline: a batch processing technique provided by the client rather than the server, processing multiple Redis commands at once, solving the network wait during multiple command executions.
- Redis transactions do not support rollback: a runtime error inside a transaction does not roll back the commands that already ran; DISCARD can only abandon a transaction that has been queued but not yet executed.
  - Reasons for lack of support:
    - The author believes that errors are rare in production environments, so there is no need to develop this feature.
    - Transaction rollback is complex and does not align with Redis's simple and efficient design.
- Distributed lock: a mechanism for concurrent control in a distributed environment, used to ensure that resources can only be used by one application at the same time.
  - Principle: the SET command's NX parameter can achieve [insert only if key does not exist].
    - Key does not exist: shows successful insertion, indicating lock acquisition successful.
    - Key exists: shows insertion failed, indicating lock acquisition failed.
  - Locking conditions:
    - Locking involves [reading, checking, and setting] three operations on the lock variable, requiring atomic operations, so SET + NX is used.
    - The lock variable needs to have an expiration time set, using the EX option (unit: seconds) or the PX option (unit: milliseconds).
    - The value of the lock variable uses the client's unique identifier to distinguish the source.
    # Command that meets the three conditions:
    SET lock_key unique_value NX PX 10000
    - lock_key: the key
    - unique_value: the client's unique identifier
    - NX: only set when lock_key does not exist
    - PX 10000: sets lock_key's expiration time to 10 seconds (10,000 ms)
  - Unlocking: involves two operations, requiring a Lua script to ensure atomicity, first comparing if the value is equal, then deleting the key.
    -- When releasing the lock, first compare unique_value to avoid releasing another client's lock.
    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end
  - Advantages:
    - High performance.
    - Easy to implement.
    - Avoids single point of failure (Redis cluster deployment).
  - Disadvantages:
    - Timeout duration is difficult to set: too long affects performance, too short loses lock protection.
    - In master-slave replication mode, data is asynchronously replicated, which may lead to unreliability of distributed locks.
  - Reliability under cluster: Redis officially proposes a distributed lock algorithm called Redlock. It is based on multiple Redis nodes, and at least 5 isolated master nodes are recommended.
    - Principle: the client requests to acquire locks from multiple independent Redis nodes in sequence; if more than half of the nodes successfully acquire locks, it is considered that the client has successfully acquired the lock.
    - Process (lock acquisition successful = more than half successfully acquired the lock & total time < lock validity time):
      1. The client gets the current time (t1).
      2. The client sequentially executes lock acquisition on N nodes.
      3. If the client successfully acquires locks from more than half (greater than or equal to N/2+1) of the nodes, it gets the current time again (t2), then calculates the time taken (t2-t1), and if t2-t1 < lock expiration time, it is considered that the lock acquisition is successful.
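The Redlock acquisition steps can be sketched with in-memory nodes. `FakeNode` is a hypothetical stand-in for one independent Redis master: `set_nx_px` mirrors SET key value NX PX ttl and `release_if_owner` mirrors the compare-and-delete script. Releasing the lock on all nodes after a failed attempt follows the published Redlock description and is not spelled out in the notes above.

```python
import time
import uuid


class FakeNode:
    """Hypothetical in-memory stand-in for one independent Redis master."""

    def __init__(self, available=True):
        self.available = available
        self.store = {}   # key -> (value, expires_at)

    def set_nx_px(self, key, value, ttl_ms):
        if not self.available:
            return False
        now = time.monotonic()
        entry = self.store.get(key)
        if entry is not None and entry[1] > now:
            return False                       # key exists: lock held
        self.store[key] = (value, now + ttl_ms / 1000)
        return True

    def release_if_owner(self, key, value):
        if not self.available:
            return False
        entry = self.store.get(key)
        if entry is not None and entry[0] == value:
            del self.store[key]                # only the owner may delete
            return True
        return False


def release_redlock(nodes, key, token):
    for node in nodes:
        node.release_if_owner(key, token)


def acquire_redlock(nodes, key, ttl_ms):
    """Return a token on success, or None if the lock was not acquired."""
    token = uuid.uuid4().hex                   # client's unique identifier
    t1 = time.monotonic()                      # step 1: record start time
    acquired = sum(1 for node in nodes         # step 2: try each node in turn
                   if node.set_nx_px(key, token, ttl_ms))
    elapsed_ms = (time.monotonic() - t1) * 1000
    # Step 3: need a majority, and the time spent must be under the lock TTL.
    if acquired >= len(nodes) // 2 + 1 and elapsed_ms < ttl_ms:
        return token
    release_redlock(nodes, key, token)         # undo any partial acquisition
    return None
```

With five nodes the client tolerates two unavailable nodes, since three successful acquisitions still form a majority; with only two nodes reachable the attempt fails and the partial locks are released.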
