This week, I had an interesting case where one of our customers was facing the following issue:
```
WriteCommandError({
	"ok" : 0,
	"errmsg" : "cannot add session into the cache",
	"code" : 261,
	"codeName" : "TooManyLogicalSessions",
	"operationTime" : Timestamp(1620922589, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1620922589, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
})
```
Studying this issue and discussing it with my colleagues, I had the chance to explore in detail how logical sessions are handled in MongoDB. First, a brief explanation of how the entire process works and what logical sessions are:
What are Logical Sessions
Logical sessions allow operations to be tracked as they are consumed throughout the system. This enables simple, precise cancellation of operations and distributed garbage collection. For example, a find() operation will create cursors in all the relevant shards in a cluster. Each cursor will start acquiring results for its first batch to return. Before logical sessions existed, canceling an operation like this meant traversing all the shards with administration privileges, figuring out which activity was associated with the operation, and then killing it.
With logical sessions, it is now possible to kill such operations using the killSessions command. For example:
```
mongos> db.system.sessions.find();
{ "_id" : { "id" : UUID("14c31e1f-c245-46ea-a229-7c31a4b042db"), "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=") }, "lastUse" : ISODate("2021-05-13T19:17:23.232Z") }
mongos> db.runCommand( { killSessions: [ { id : UUID("14c31e1f-c245-46ea-a229-7c31a4b042db") }]})
{
	"ok" : 0,
	"errmsg" : "operation was interrupted",
	"code" : 11601,
	"codeName" : "Interrupted",
	"operationTime" : Timestamp(1620933480, 11),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1620933480, 11),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}
```
How Logical Sessions Work
Logical sessions have an in-memory cache plus physical data stored in the system.sessions collection. Each node (router, shard, config server) has its own in-memory cache. A cache entry contains:

- _id – The session’s logical session id
- user – The session’s logged-in username (if authentication is enabled)
- lastUse – The date and time that the session was last used
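As an illustration, a cache entry can be pictured as a small document like the one below. This is a hypothetical sketch: the field names come from the description above, but the values (and the plain-string encoding of the UUID and user digest) are invented for readability.

```javascript
// Hypothetical sketch of a logical session cache entry.
// In the real server, _id.id is a UUID and _id.uid is a BinData digest.
const cacheEntry = {
  _id: {
    id: "14c31e1f-c245-46ea-a229-7c31a4b042db",
    uid: "47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=",
  },
  user: "appUser",                                // present only with auth enabled
  lastUse: new Date("2021-05-13T19:17:23.232Z"),  // bumped on every request
};

// A request that reuses this session would simply refresh lastUse:
cacheEntry.lastUse = new Date();
```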
The in-memory cache periodically persists entries to the config.system.sessions collection, known as the “sessions collection.” The sessions collection has different placement behavior based on whether the user is running a standalone node, a replica set, or a sharded cluster.
| Cluster Type | Sessions Collection Durable Storage |
|---|---|
| Standalone Node | Sessions collection exists on the same node as the in-memory cache. |
| Replica Set | Sessions collection exists on the primary node and replicates to secondaries. |
| Sharded Cluster | Sessions collection is a regular sharded collection – it can exist on multiple shards and can have multiple chunks. |
How does the expiration process work?
When a node receives a request with attached session info, it will place that session into the logical session cache. If a request corresponds to a session that already exists in the cache, the cache will update the cache entry’s lastUse field to the current date and time.
At a regular interval of five (5) minutes (user-configurable), the logical session cache will sync with the sessions collection. Inside the class, this is known as the “refresh” function. There are four steps to this process:
1. All sessions that have been used on this node since the last refresh will be upserted to the sessions collection. This means that sessions that already exist in the sessions collection will just have their lastUse fields updated.
2. All sessions that have been ended in the cache on this node (via the endSessions command) will be removed from the sessions collection.
3. Sessions that have expired from the sessions collection will be removed from the logical session cache on this node.
4. All cursors registered on this node that match sessions that have been ended (step 2) or expired (step 3) will be killed.
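The four steps above can be sketched with simplified in-memory stand-ins. This is an illustrative model only, not the server's actual LogicalSessionCache implementation; the Maps, the cursor objects, and flags such as usedSinceLastRefresh and ended are invented for the sketch.

```javascript
// Illustrative sketch of the four refresh steps, using plain Maps as
// stand-ins for the in-memory cache and the sessions collection.
function refresh(cache, sessionsCollection, cursors, now) {
  // Step 1: upsert sessions used since the last refresh (updates lastUse).
  for (const [id, entry] of cache) {
    if (entry.usedSinceLastRefresh) {
      sessionsCollection.set(id, { lastUse: now });
      entry.usedSinceLastRefresh = false;
    }
  }
  // Step 2: sessions ended on this node are removed from the collection.
  for (const [id, entry] of cache) {
    if (entry.ended) {
      sessionsCollection.delete(id);
      cache.delete(id);
    }
  }
  // Step 3: sessions expired from the collection are dropped from the cache.
  for (const id of [...cache.keys()]) {
    if (!sessionsCollection.has(id)) cache.delete(id);
  }
  // Step 4: cursors whose sessions were ended (step 2) or expired (step 3)
  // are killed.
  for (const cursor of cursors) {
    if (!cache.has(cursor.sessionId)) cursor.killed = true;
  }
}
```

For example, a session ended via endSessions disappears from both structures, and any cursor still attached to it is marked killed on the next refresh.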
The following parameters are available to adjust logical session behavior:
| Parameter | Default Value | Description |
|---|---|---|
| disableLogicalSessionCacheRefresh | false (boolean) | Disables the logical session cache’s periodic “refresh” and “reap” functions on this node. Recommended for testing only. |
| logicalSessionRefreshMillis | 300000 ms (integer) | Changes how often the logical session cache runs its periodic “refresh” and “reap” functions on this node. |
| localLogicalSessionTimeoutMinutes | 30 minutes (integer) | Changes the TTL index timeout for the sessions collection. In sharded clusters, this parameter is supported only on the config server. |
This is a brief introduction to logical sessions. Most of the explanation in this post is based on the source code documentation, which goes into great detail:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/s/README.md#logical-sessions
Applying the Theory in Real Workloads
The Logical Session Cache Refresh process syncs the in-memory cache with the system.sessions collection at the frequency defined by logicalSessionRefreshMillis. However, syncing is not its only function; the other function is to reap unused sessions. Let me show a test:
```bash
#!/bin/bash
# Opening 500 sessions
for (( c=1; c<=500; c++ ))
do
   /opt/mongodb/4.0.6/bin/mongo localhost:37017/percona --eval "db.data.save({idt: 1})"
done
```
And mongoS configured with:
```
--setParameter logicalSessionRefreshMillis=1000000
```
Opening a shell in mongoS to verify the results, we can see 500 sessions created plus the one I’m currently using:
```
mongos> db.serverStatus().logicalSessionRecordCache;
{
	"activeSessionsCount" : 501,
	...
}
```
And if we count the number of sessions in the collection, we can see it increasing:
```
mongos> db.system.sessions.find().count()
241
```
This makes sense because the cache is not instantly synchronized with the collection (we can have more sessions in memory than in the collection).
What is the maxSessions value based on? If I start mongoS with --setParameter maxSessions=100 and re-run the test with 500 sessions:
```
mongos> db.serverStatus().logicalSessionRecordCache;
{
	"activeSessionsCount" : 100,
	...
}
```
I get the following error:
```
WriteCommandError({
	"ok" : 0,
	"errmsg" : "cannot add session into the cache",
	"code" : 261,
```
So, maxSessions is based on the cache value, not on the system.sessions collection. We can confirm this in the source code:
```cpp
Status LogicalSessionCacheImpl::_addToCacheIfNotFull(WithLock, LogicalSessionRecord record) {
    if (_activeSessions.size() >= size_t(maxSessions)) {
        Status status = {ErrorCodes::TooManyLogicalSessions,
                         str::stream() << "Unable to add session ID " << record.getId()
                                       << " into the cache because the number of active sessions is too "
                                          "high"};
```
Fixing the Issue
Sessions are created every time a connection is opened to the database or a startSession is executed. Apart from the Mongo Shell, it is not possible to disable sessions at the driver level. This is confirmed in this JIRA ticket:
https://jira.mongodb.org/browse/MONGOSH-458
Answer: No, the driver does not support that and is unlikely to do so. The original feature was implemented for debugging or for backwards compatibility with earlier server versions; however, we only support 4.0+, so we do not need either case.
Let’s say you have a throughput of 10k connections/second. If we have set logicalSessionRefreshMillis to 10 minutes (600000 milliseconds, 600 seconds), here is the maximum number of sessions you will be able to open before the Logical Session Refresh starts reaping unused sessions:

```
MaxSessions = 10000 connections/second * 600 seconds = 6000000
```
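The arithmetic can be sanity-checked in a few lines. This is just the model above expressed as code, under the simplifying assumption that every connection creates one session and none is reused within the refresh window:

```javascript
// Back-of-the-envelope check of the session ceiling within one refresh window.
const connectionsPerSecond = 10000;
const logicalSessionRefreshMillis = 600000; // 10 minutes
const refreshSeconds = logicalSessionRefreshMillis / 1000; // 600 seconds
const sessionsPerWindow = connectionsPerSecond * refreshSeconds;
console.log(sessionsPerWindow); // 6000000
```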
To stabilize the use of sessions, we have two alternatives: increase maxSessions or reduce logicalSessionRefreshMillis. I made a test with logicalSessionRefreshMillis=10000 (10 seconds), re-ran the tests, and below is the maximum number of activeSessions I got:
```
mongos> db.serverStatus().logicalSessionRecordCache;
{
	"activeSessionsCount" : 8,
	"sessionsCollectionJobCount" : 35,
	"lastSessionsCollectionJobDurationMillis" : 10,
	"lastSessionsCollectionJobTimestamp" : ISODate("2021-05-13T16:27:21.137Z"),
	"lastSessionsCollectionJobEntriesRefreshed" : 1,
	"lastSessionsCollectionJobEntriesEnded" : 6,
	"lastSessionsCollectionJobCursorsClosed" : 0,
	"transactionReaperJobCount" : 0,
	"lastTransactionReaperJobDurationMillis" : 0,
	"lastTransactionReaperJobTimestamp" : ISODate("2021-05-13T16:26:47.128Z"),
	"lastTransactionReaperJobEntriesCleanedUp" : 0
}
```
This is because the reaping process of the Logical Session Refresh is much more aggressive.
And what happens if I increase maxSessions to a huge number? Let’s check:
```
# this is my mongoS pid; observe the rss value (memory consumption)
$ ps --no-headers -p 47274 -o comm,pmem,rss,vsize
mongos 0.0 26648 640904
```
For 10k activeSessions:
```
mongos> db.serverStatus().logicalSessionRecordCache;
{
	"activeSessionsCount" : 10032,
```
```
$ ps --no-headers -p 14650 -o comm,pmem,rss,vsize
mongos 0.0 32308 673788
```
There was an increase of roughly 5 MB of memory usage. And adding 10k more sessions:
```
mongos> db.serverStatus().logicalSessionRecordCache;
{
	"activeSessionsCount" : 20237,
```
```
$ ps --no-headers -p 14650 -o comm,pmem,rss,vsize
mongos 0.0 36228 677884
```
So we can estimate memory usage of about 4 MB for every 10K sessions. For 3 million maxSessions, we can expect an increase in memory usage of about 1.17 GB.
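Extrapolating the numbers above (assuming the observed ~4 MB per 10k sessions scales linearly, which is only a rough assumption):

```javascript
// Rough linear extrapolation of mongoS memory overhead for session entries.
const mbPer10kSessions = 4;   // observed in the ps measurements above
const maxSessions = 3000000;
const estimatedMb = (maxSessions / 10000) * mbPer10kSessions; // 1200 MB
const estimatedGib = estimatedMb / 1024;
console.log(estimatedMb, estimatedGib.toFixed(2)); // 1200 1.17
```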
Conclusion
To conclude, based on the results above, there are two alternatives to circumvent this issue. I would first reduce logicalSessionRefreshMillis, which seems to create less impact. Also, it is important to estimate whether the current maxSessions supports the given throughput of the application (you can check the formula mentioned previously).
Useful Resources
Finally, you can reach us through the social networks, our forum, or access our material using the links presented below:
- Blog
- Solution Briefs
- White Papers
- Ebooks
- Technical Presentations archive
- Videos/Recorded Webinars
- Forum
Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.
Thanks Vinicius for these detailed and interesting insights!
I once came across the same error message, but the reason was a misconfiguration of the replica set: the replica set was once part of a sharded cluster. The cluster had been reduced to only one shard, and the remaining shard was finally run as a replica set (no routers and no config servers anymore), BUT the following config entry survived by mistake: “sharding.clusterRole: shardsvr”
The result was that the number of logical sessions constantly increased until it hit the default maximum of 1 million logical sessions. After removing this parameter from the config and restarting the mongod instance, the number of logical sessions stayed almost constant.
Just in case some of your readers encounter the same problem one day… 🙂
Hi Kay,
Thanks a lot for leaving a comment a here. I’m glad you enjoyed it 🙂
Indeed, I forgot to mention this issue. It happens because when you use --shardsvr, mongoD assumes the collection is sharded, causing problems since you are actually running a replica set.
Great finding!
```bash
#!/bin/bash
for (( c=1; c<=500; c++ ))
do
   /opt/mongodb/4.0.6/bin/mongo localhost:37017/percona --eval "db.data.save({idt: 1})"
done
```
I tested it on my env; it just opens one session, not 500.
Hi Leo,
Try adding & at the end of the mongo command. The way you are running it seems to execute serially:

```
/opt/mongodb/4.0.6/bin/mongo localhost:37017/percona --eval "db.data.save({idt: 1})" &
```