I was once helping one of our customers load MySQL audit logs into a MySQL database and analyze them. But I immediately thought: “Hey, this is not the most efficient solution! MySQL, or a typical RDBMS in general, was not really meant to store logs after all.”
So, I decided to explore an alternative – which seemed more sensible to me – and use MongoDB as the storage for logs, for three main reasons:
- its schema-less nature fits audit logs well, where different types of events may use different fields
- it speaks JSON natively, and the audit plugin can write logs in JSON format
- its capped collections feature avoids additional maintenance overhead
Just to mention, audit logging is available in MySQL Enterprise Edition, but a similar, yet free, solution is available in Percona Server for MySQL. In both cases, it works by installing the audit log plugin.
Ad Hoc Import
The simplest scenario is to just set the audit log format to JSON:
```
audit_log_format = JSON
```
And as soon as it collects some data, import the log file into a MongoDB collection via the mongoimport command, like this:
```
# mongoimport --username percona --password P3rc0n4 --host 10.95.83.225 --port 27017 --db auditlogs --collection audit1 --file /var/lib/mysql/audit.log
2020-12-31T16:24:43.782+0000 connected to: 10.95.83.225:27017
2020-12-31T16:24:44.316+0000 imported 25462 documents

mongo > db.audit1.countDocuments({})
25462
```
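Under the hood, mongoimport simply parses each line of the JSON-formatted audit file and inserts it as a document. A minimal Python sketch of the same idea (the sample record is a hypothetical, shortened audit entry; the actual insert assumes pymongo and a reachable server, so it is left commented out):

```python
import json

def parse_audit_log(lines):
    """Parse JSON-lines audit log entries into dicts ready for insertion."""
    return [json.loads(line) for line in lines if line.strip()]

# One (shortened) line in the format produced by audit_log_format = JSON:
sample = '{"audit_record":{"name":"Connect","connection_id":"9","status":0,"user":"root"}}'
docs = parse_audit_log([sample])

# With pymongo, the actual import would then be something like:
# from pymongo import MongoClient
# MongoClient("10.95.83.225", 27017).auditlogs.audit1.insert_many(docs)
```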
Of course, this works, but I prefer an automated solution, so I looked at available options for live-streaming the logs.
Syslog
The first thing that looked useful is the ability to send the audit log directly to syslog instead of a file. Knowing that both rsyslog and syslog-ng have MongoDB output modules, it felt like a very easy approach. So I installed the rsyslog-mongodb module package on my test Ubuntu VM running Percona Server for MySQL, and configured the audit log with:
```
[mysqld]
audit_log_handler = syslog
audit_log_format = JSON
```
Then I added an example rsyslog (version 8.2) configuration:
```
# cat /etc/rsyslog.d/49-ship-syslog.conf
action(type="ommongodb"
       db="auditlogs"
       collection="mysql_node1_log")
```
This worked; however, the inserted documents looked like this:
```
mongo > db.mysql_node1_log.findOne().pretty()
{
	"_id" : ObjectId("5fece941f17f487c7d1d158b"),
	"msg" : " {"audit_record":{"name":"Connect","record":"7_1970-01-01T00:00:00","timestamp":"2020-12-30T20:55:29Z","connection_id":"9","status":0,"user":"root","priv_user":"root","os_login":"root","proxy_user":"","host":"localhost","ip":"","db":""}}"
}
```
Basically, because syslog escapes the double quote symbols, the whole audit record appears as a single string inside the MongoDB collection instead of a JSON object. No matter what I tried, like custom templates and property values in rsyslog, I could not disable the escaping. Therefore, although feeding MongoDB with audit logs works this way, it becomes pretty useless when it comes to analyzing the logs later. The same issue applies to syslog-ng and the syslog-ng-mod-mongodb module. And since MongoDB does not offer before-insert triggers, I could not easily “fix” the inserted data on the fly.
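For completeness, such stringified records could in principle be repaired after the fact (for example by a scheduled job, since there is no before-insert trigger to do it on the fly). A small Python sketch of the unescaping step, using a hypothetical document shaped like the rsyslog output above:

```python
import json

def recover_audit_record(doc):
    """Parse the audit record that syslog stored as an escaped string
    in the 'msg' field back into a real JSON object."""
    return json.loads(doc["msg"])

# Hypothetical, shortened document shaped like the rsyslog output above:
doc = {"msg": ' {"audit_record":{"name":"Connect","user":"root","status":0}}'}
record = recover_audit_record(doc)
```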
Fluentd For The Rescue!
This forced me to look for alternative solutions. One of them would be to use a FIFO file: continuously tail the audit log into it, then read from it and insert the records into MongoDB. I wanted a more robust way, though, and decided to try Fluentd instead. It was created as a versatile log collector: highly flexible, prepared to work with many different applications out of the box, and, most importantly, it is an open source project and speaks JSON natively. Making it do the job I wanted turned out to be easier than I expected.
Here is what I did:
- Installed the Fluentd package (I chose td-agent variant here for an even easier user experience)
- Installed the MongoDB plugin for Fluentd (don’t use the usual ‘gem install’ here):
```
td-agent-gem install fluent-plugin-mongo
```
- Configured audit log as a source and output directive for MongoDB:
```
# cat /etc/td-agent/td-agent.conf
####
...
<source>
  @type tail
  path /var/lib/mysql/audit.log
  pos_file /var/log/td-agent/audit.access_log.pos
  <parse>
    @type json
  </parse>
  tag mongo.audit.log
</source>
<match mongo.audit.log>
  @type mongo
  database auditlogs        #(required)
  collection audit_log      #(optional; default="untagged")
  capped
  capped_size 100m
  host 10.95.83.225         #(optional; default="localhost")
  port 27017                #(optional; default=27017)
  user percona
  password P3rc0n4
  <buffer>
    flush_interval 1s
  </buffer>
</match>
```
- Added the user used by Fluentd to the mysql group to allow it to read the audit log:
```
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent)
# usermod -a -G mysql td-agent
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent),120(mysql)
```
- Configured the Audit log with JSON format and auto-rotation:
```
[mysqld]
audit_log_handler = file
audit_log_format = JSON
audit_log_file = audit.log
audit_log_rotate_on_size = 10M
audit_log_rotations = 3
```
- Restarted both services to apply changes:
```
# systemctl restart mysql
# systemctl restart td-agent
```
- Checked the Fluentd log to verify it reads the audit log as expected, including when Percona Server for MySQL rotates it:
```
# tail -f /var/log/td-agent/td-agent.log
2020-12-31 02:41:39 +0000 [info]: adding match pattern="mongo.audit.log" type="mongo"
...
2020-12-31 02:41:40 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
...
2020-12-31 02:52:14 +0000 [info]: #0 detected rotation of /var/lib/mysql/audit.log; waiting 5 seconds
2020-12-31 02:52:14 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
```
- Ran sysbench against the MySQL instance and verified that the new collection in MongoDB gets updated:
```
mongo > db.audit_log.countDocuments({})
281245
mongo > db.audit_log.stats()
{
	"ns" : "auditlogs.audit_log",
	"size" : 104857293,
	"count" : 281245,
	"avgObjSize" : 372,
	"storageSize" : 26357760,
	"capped" : true,
	"max" : -1,
	"maxSize" : 104857600,
(...)
```
Yay, it works like a charm! Not only are the audit logs rotated automatically on the Percona Server for MySQL side, but the destination collection size cap on MongoDB works as well, so I am safe when it comes to disk space on both hosts!
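Those disk-space bounds are easy to sanity-check. With audit_log_rotate_on_size = 10M and audit_log_rotations = 3 on the MySQL side, and capped_size 100m on the MongoDB side, a back-of-the-envelope Python sketch (this assumes the server keeps `rotations` rotated files next to the active one, so treat the MySQL-side number as a rough upper bound):

```python
def max_audit_disk_bytes(rotate_on_size, rotations):
    # Active audit file plus the rotated copies kept on disk
    # (assumption: `rotations` old files are retained alongside the active one).
    return rotate_on_size * (rotations + 1)

MiB = 1024 * 1024
mysql_side = max_audit_disk_bytes(10 * MiB, 3)  # audit_log_rotate_on_size = 10M, rotations = 3
mongo_side = 100 * MiB                          # capped_size 100m in the Fluentd config
```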
Here, there is a little caveat: if for some reason you drop the destination collection manually on MongoDB, incoming inserts will re-create it without the capped setting! Therefore, either let the collection be created by Fluentd on its service startup or create it manually with the capped setting, and don’t drop it later.
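If you do create the collection by hand, it has to be capped from the start. A minimal sketch with pymongo (the host is the one used above; the actual call requires a running mongod, hence it is commented out; the size matches Fluentd’s capped_size 100m):

```python
# Options equivalent to the `capped` / `capped_size 100m` settings
# in the Fluentd match directive above:
capped_opts = {"capped": True, "size": 100 * 1024 * 1024}

# With pymongo this would be applied as (requires a reachable mongod):
# from pymongo import MongoClient
# client = MongoClient("10.95.83.225", 27017)
# client.auditlogs.create_collection("audit_log", **capped_opts)
```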
Now, we can try some example aggregations to get some useful audit stats:
```
mongo > db.audit_log.aggregate([
  { $group: { _id: {name: "$audit_record.name", command: "$audit_record.command_class"}, count: {$sum:1}}},
  { $sort: {count:-1}}
])
{ "_id" : { "name" : "Execute", "command" : "error" }, "count" : 267086 }
{ "_id" : { "name" : "Query", "command" : "begin" }, "count" : 14054 }
{ "_id" : { "name" : "Close stmt", "command" : "error" }, "count" : 76 }
{ "_id" : { "name" : "Query", "command" : "show_variables" }, "count" : 7 }
{ "_id" : { "name" : "Query", "command" : "select" }, "count" : 6 }
{ "_id" : { "name" : "Quit" }, "count" : 5 }
{ "_id" : { "name" : "Query", "command" : "show_tables" }, "count" : 4 }
{ "_id" : { "name" : "Init DB", "command" : "error" }, "count" : 2 }
{ "_id" : { "name" : "Field List", "command" : "show_fields" }, "count" : 2 }
{ "_id" : { "name" : "Query", "command" : "show_databases" }, "count" : 2 }
{ "_id" : { "name" : "Connect" }, "count" : 1 }
```
```
mongo > db.audit_log.aggregate([
  { $match: { "audit_record.status": {$gt: 0} } },
  { $group: { _id: {command_class: "$audit_record.command_class", status: "$audit_record.status"}, count: {$sum:1}}},
  { $sort: {count:-1}}
])
{ "_id" : { "command_class" : "error", "status" : 1049 }, "count" : 2 }
{ "_id" : { "command_class" : "show_tables", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1050 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1051 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1046 }, "count" : 1 }
{ "_id" : { "command_class" : "create_table", "status" : 1113 }, "count" : 1 }
```
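The $match/$group/$sort logic of the second pipeline maps directly onto plain dictionary operations, which can help when reasoning about what the pipeline does. A Python sketch over a few hypothetical records shaped like the audit documents above:

```python
from collections import Counter

# Hypothetical audit records shaped like the documents above:
records = [
    {"audit_record": {"command_class": "show_tables", "status": 1046}},
    {"audit_record": {"command_class": "show_tables", "status": 1046}},
    {"audit_record": {"command_class": "select", "status": 0}},
]

# $match: keep only failed statements (status > 0);
# $group + $sort: count per (command_class, status), most frequent first.
counts = Counter(
    (r["audit_record"]["command_class"], r["audit_record"]["status"])
    for r in records
    if r["audit_record"]["status"] > 0
)
top = counts.most_common()
```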
References
https://dev.mysql.com/doc/refman/8.0/en/audit-log.html
https://www.rsyslog.com/doc/v8-stable/configuration/modules/ommongodb.html
https://docs.fluentd.org/output/mongo
Comments

Smart workaround using Fluentd, thanks for the tip! 🙂

However, be aware that the maximum size of a MongoDB capped collection is based on *uncompressed* data. So, if you have sized it at 80% of your available disk, you may still run out of space if your log data becomes less compressible. We encountered such an issue in production recently; I wrote a blog post on how we averted the imminent disaster without downtime: https://medium.com/idealo-tech-blog/mission-possible-resize-mongodb-capped-collections-without-downtime-ec8aada2223f

A very good point Kay, thanks for the feedback!