The group communication engine for Group Replication (XCom, a Paxos variant) includes a cache for messages (and their metadata) exchanged between the group members as part of the consensus protocol. Among other functions, the message cache is used for recovery of missed messages by members that reconnect with the group after a period during which they were unable to communicate with the other group members.
A cache size limit can be set for XCom's message cache using the group_replication_message_cache_size system variable. If the cache size limit is reached, XCom removes the oldest entries that have been decided and delivered. The same cache size limit should be set on all group members, because an unreachable member that is attempting to reconnect selects any other member at random for recovery of missed messages. The same messages should therefore be available in each member's cache.
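The eviction rule described above can be illustrated with a toy model. This is a minimal sketch, not XCom's implementation: the class ToyMessageCache, its methods, and the single "delivered" flag (standing in for "decided and delivered") are all hypothetical, and the choice to let the cache stay over its limit when no entry is eligible for removal is an assumption made for the illustration.

```python
from collections import OrderedDict


class ToyMessageCache:
    """Toy model only: evicts the oldest delivered entries once over the limit."""

    def __init__(self, size_limit_bytes):
        self.size_limit_bytes = size_limit_bytes
        self.used_bytes = 0
        # Insertion order doubles as age: message_id -> [size_bytes, delivered]
        self._entries = OrderedDict()

    def add(self, message_id, size_bytes):
        if message_id in self._entries:
            raise ValueError("message already cached")
        self._entries[message_id] = [size_bytes, False]
        self.used_bytes += size_bytes
        self._evict()

    def mark_delivered(self, message_id):
        # Hypothetical stand-in for "decided and delivered".
        self._entries[message_id][1] = True
        self._evict()

    def _evict(self):
        # Walk from oldest to newest, removing only delivered entries,
        # and stop as soon as the cache is back within its limit.
        for message_id in list(self._entries):
            if self.used_bytes <= self.size_limit_bytes:
                break
            size_bytes, delivered = self._entries[message_id]
            if delivered:
                del self._entries[message_id]
                self.used_bytes -= size_bytes

    def ids(self):
        return list(self._entries)
```

In this model, a reconnecting member can recover a missed message only while the message is still present in the cache of the member it happens to select, which is why the same limit on every member matters.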
Ensure that sufficient memory is available on your system for your chosen cache size limit, considering the size of MySQL Server's other caches and object pools. Note that the limit set using group_replication_message_cache_size applies only to the data stored in the cache, and the cache structures require an additional 50 MB of memory.
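As a rough aid to the memory check described above, the following sketch adds the 50 MB needed for the cache structures to a chosen cache size limit and compares the total against a memory budget. The function names and the other_mysql_memory_bytes and available_bytes inputs are hypothetical, and interpreting "50 MB" as 50 * 1024 * 1024 bytes is an assumption.

```python
# "An additional 50 MB" for the cache structures, per the text above.
# Assumption: MB is interpreted here as 1024 * 1024 bytes.
CACHE_STRUCTURES_OVERHEAD_BYTES = 50 * 1024 * 1024


def xcom_cache_memory_bytes(cache_size_limit_bytes):
    """Approximate memory for the message cache: data limit plus structures."""
    if cache_size_limit_bytes < 0:
        raise ValueError("cache size limit must not be negative")
    return cache_size_limit_bytes + CACHE_STRUCTURES_OVERHEAD_BYTES


def fits_in_budget(cache_size_limit_bytes, other_mysql_memory_bytes, available_bytes):
    """Check whether the cache plus MySQL Server's other memory use fits."""
    total = xcom_cache_memory_bytes(cache_size_limit_bytes) + other_mysql_memory_bytes
    return total <= available_bytes
```

For example, with a cache size limit of 1 GiB, xcom_cache_memory_bytes(1024 ** 3) accounts for the limit itself plus the structures, so the figure to budget for is larger than the value assigned to the system variable.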
When choosing the value for group_replication_message_cache_size, consider the expected volume of messages in the period before a member is expelled. The length of this period is controlled by the group_replication_member_expel_timeout system variable, which determines the waiting period (up to an hour) that is allowed, in addition to the initial 5-second detection period, for members to return to the group rather than being expelled. The timeout defaults to 5 seconds, so by default a member is not expelled until it has been absent for at least 10 seconds.
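The relationship between the expel timeout and the cache size can be sketched as a back-of-the-envelope calculation. The following is an estimate under stated assumptions, not an official sizing formula: the function names, the messages_per_second and average_message_bytes inputs, and the headroom factor are hypothetical, and the result should be checked against the range of values the server accepts for group_replication_message_cache_size.

```python
import math

DETECTION_PERIOD_SECONDS = 5      # initial detection period, per the text above
MAX_EXPEL_TIMEOUT_SECONDS = 3600  # the waiting period can be up to an hour


def seconds_before_expulsion(member_expel_timeout_seconds=5):
    """Minimum absence before expulsion: detection period plus expel timeout."""
    if not 0 <= member_expel_timeout_seconds <= MAX_EXPEL_TIMEOUT_SECONDS:
        raise ValueError("expel timeout must be between 0 and 3600 seconds")
    return DETECTION_PERIOD_SECONDS + member_expel_timeout_seconds


def estimate_cache_size_bytes(messages_per_second, average_message_bytes,
                              member_expel_timeout_seconds=5, headroom=1.5):
    """Rough data volume to retain so an absent member can still catch up.

    headroom is a hypothetical safety multiplier for bursts in message volume.
    """
    window = seconds_before_expulsion(member_expel_timeout_seconds)
    return math.ceil(messages_per_second * average_message_bytes * window * headroom)
```

With the default timeout, seconds_before_expulsion() gives the 10 seconds mentioned above. Raising the timeout lengthens the window proportionally, so the estimated cache size grows with it.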