05 February 2020

Have you encountered the situation that, when reviewing a mongod log file, you'd like to focus on what operations a particular client (identified by its IP address) has done but the mongod was so busy that the log file was inundated with lots of messages/operations from other clients?

There is a one-liner to filter MongoDB log files to print the log messages related to the connection from the specified IP address(es). For example, the following command lists all operations recorded for connections from localhost (127.0.0.1):

$ cat mongod.log | gawk '/connection accepted from 127.0.0.1/{open[$9] = 1; print $0}$4~/\[conn/{conn=gensub(/\[conn([0-9]*)\]/, "#\\1", "g", $4); if(conn in open){print $0; if($5=="end" && $6=="connection"){delete open[conn]}}}'
2020-02-03T15:00:58.087+1100 I NETWORK  [listener] connection accepted from 127.0.0.1:63120 #1 (1 connection now open)
2020-02-03T15:00:58.087+1100 I NETWORK  [conn1] received client metadata from 127.0.0.1:63120 conn1: { driver: { name: "PyMongo", ...
2020-02-03T15:00:58.090+1100 I NETWORK  [listener] connection accepted from 127.0.0.1:63121 #2 (2 connections now open)
2020-02-03T15:00:58.090+1100 I NETWORK  [conn2] received client metadata from 127.0.0.1:63121 conn2: { driver: { name: "PyMongo", ...
2020-02-03T15:00:58.091+1100 I NETWORK  [conn2] end connection 127.0.0.1:63121 (1 connection now open)
...

If you don't like copying/pasting such a long command line, you can put the awk part into a file (say ops-per-ip.awk) and run cat mongod.log | gawk -f ops-per-ip.awk:

1: /connection accepted from 127.0.0.1/{open[$9] = 1; print $0}
2: $4~/\[conn/{
3:     conn=gensub(/\[conn([0-9]*)\]/, "#\\1", "g", $4)
4:     if(conn in open){
5:         print $0;
6:         if($5=="end" && $6=="connection"){delete open[conn]}
7:     }
8: }
  • Line 1: When a connection from 127.0.0.1 is accepted, insert the connection ID with prefix # (#1234 for example) into the array open. The value is set to 1 but is actually irrelevant.
    • Please note that the source IP does not have to be a single IP address, you can using any regexp gawk recognizes. For example: use 192.168.99.10[56] to track operations launched from 192.168.99.105 and 192.169.99.106. Or, use 192.168 for any IP within the subnet 192.168.0.0/16.
  • Line 2: For other log lines like 2020-02-03T15:00:58.087+1100 I NETWORK [conn1234] ..., do the following:
    • Line 3: Use regexp function gensub to get the connection ID 1234 and construct the array key, i.e. #1234.
    • Line 4: If the key exists in open, then this connection is from the IP address of concern. Therefore,
      • Line 5: print this log line
      • Line 6: remove the key from the array if this line denotes the end of the connection (2020-02-03T15:00:58.091+1100 I NETWORK [conn1234] end connection 127.0.0.1:63121...).

In the following example, with the help of this command I am able to easily confirm that the client on 192.168.99.100 tried to connect many times but it never tried to authenticate itself and did nothing (or, put it more accurately: no operations were recorded).

$ cat mongodb.log | gawk '/connection accepted from 192.168.99.100/{open[$9] = 1; print $0}$4~/\[conn/{conn=gensub(/\[conn([0-9]*)\]/, "#\\1", "g", $4); if(conn in open){print $0; if($5=="end" && $6=="connection"){delete open[conn]}}}'
2020-02-03T02:44:54.343+0000 I NETWORK  [listener] connection accepted from 192.168.99.100:57696 #589430 (108 connections now open)
2020-02-03T02:44:54.383+0000 I NETWORK  [conn589430] end connection 192.168.99.100:57696 (107 connections now open)
2020-02-03T02:45:28.393+0000 I NETWORK  [listener] connection accepted from 192.168.99.100:59738 #592519 (107 connections now open)
2020-02-03T02:45:28.506+0000 I NETWORK  [conn592519] end connection 192.168.99.100:59738 (106 connections now open)
2020-02-03T02:46:23.470+0000 I NETWORK  [listener] connection accepted from 192.168.99.100:34760 #597520 (107 connections now open)
2020-02-03T02:46:23.660+0000 I NETWORK  [conn597520] end connection 192.168.99.100:34760 (106 connections now open)
2020-02-03T02:47:18.420+0000 I NETWORK  [listener] connection accepted from 192.168.99.100:37734 #602492 (107 connections now open)
2020-02-03T02:47:18.524+0000 I NETWORK  [conn602492] end connection 192.168.99.100:37734 (106 connections now open)
...


blog comments powered by Disqus