Centrally anonymized IP addresses in nginx for all virtual hosts

January 2, 2015

In Germany, you need to anonymize IP addresses wherever you work with such data. This also applies to web server logs, such as those written by nginx.

We just need to rewrite the value of $remote_addr into something like $ip_anonymized.

But this is not as easy as it sounds at first glance (and the solutions already posted on Stack Overflow do not work properly).

Let’s have a look at what we have:

nginx has the log_format directive, whose context is http. This means log_format can only be set inside the http {} section of the config file, NOT inside the server sections!

On the other hand, we have the if directive, whose context is server and location.

So we can NOT combine “if” and “log_format” in one central place, since the if part would have to be repeated in each virtual server entry.

But I would like a centrally managed rewrite of $remote_addr. So if is not helpful here; besides, if is evil! We need something that works in the http context, because log_format can only be defined there, and it is the only place outside the server context, where our virtual hosts are defined…

Luckily, nginx has a map feature! map translates a source value into a new value, stored in a variable that can be used in a log_format directive. And the good news: this also works with regular expressions.
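As a minimal illustration of the syntax (not part of the final solution): a map block names a source variable and a target variable, and each line inside maps a pattern to a result. The variable name $is_ipv6 is made up for this example.

```nginx
# Sketch only; goes inside the http {} context.
# $is_ipv6 is a hypothetical variable name chosen for this example.
map $remote_addr $is_ipv6 {
    default 0;   # used when no pattern matches
    "~:"    1;   # "~" starts a case-sensitive regex; any address containing ":" counts as IPv6
}
```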

So let’s map our IPv4 and IPv6 addresses to anonymized addresses. This has to be done in 3 steps, since (at least in the nginx versions available at the time of writing) a map result cannot be concatenated: it can be either a literal string or a variable, not a combination of both.
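A note for readers on newer releases: according to the nginx documentation, map results may combine text and variables since version 1.11.0. Assuming such a version, the three steps could probably be collapsed into a single map. This is a sketch under that assumption, not something this post was tested with. On older versions, the three-step approach described next still applies.

```nginx
# Sketch, assuming nginx >= 1.11.0 (map results may mix text and variables).
# Goes inside the http {} context.
map $remote_addr $ip_anonymized {
    default 0.0.0.0;
    # IPv4: keep the first three octets, zero the last one
    "~(?P<ip>\d+\.\d+\.\d+)\.\d+" $ip.0;
    # IPv6: keep the first two groups, drop the rest
    "~(?P<ip>[^:]+:[^:]+):"       $ip::;
}
```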

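So the first map grabs the part of the IP I want to keep in my log files (the first three octets of an IPv4 address, or the first two groups of an IPv6 address), the second map returns the suffix that stands in for the anonymized part, and the third map joins the two together again.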

 

Here are the rules which go into the http {} context:

# Step 1: the part of the address that is kept
map $remote_addr $ip_anonym1 {
    default 0.0.0;
    # IPv4: first three octets
    "~(?P<ip>(\d+)\.(\d+)\.(\d+))\.\d+" $ip;
    # IPv6: first two groups
    "~(?P<ip>[^:]+:[^:]+):" $ip;
}

# Step 2: the suffix that replaces the removed part
map $remote_addr $ip_anonym2 {
    default .0;
    "~(?P<ip>(\d+)\.(\d+)\.(\d+))\.\d+" .0;
    "~(?P<ip>[^:]+:[^:]+):" ::;
}

# Step 3: join both parts into one variable
map $ip_anonym1$ip_anonym2 $ip_anonymized {
    default 0.0.0.0;
    "~(?P<ip>.*)" $ip;
}

log_format anonymized '$ip_anonymized - $remote_user [$time_local] '
    '"$request" $status $body_bytes_sent '
    '"$http_referer" "$http_user_agent"';

access_log /var/log/nginx/access.log anonymized;

After adding this to your nginx.conf config file, remember to reload nginx. Your log files should now contain anonymized IP addresses, provided you are using the “anonymized” log format (the format parameter of the access_log directive).
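One thing to keep in mind for the "all virtual hosts" goal: an access_log directive at the http level is only inherited by server blocks that do not define their own access_log. A virtual host with its own log file therefore has to name the format explicitly. A sketch, with a hypothetical host name and log path:

```nginx
http {
    # ... map blocks and log_format anonymized as defined above ...

    server {
        listen 80;
        server_name example.org;   # hypothetical virtual host

        # A per-host access_log replaces the inherited one,
        # so the anonymized format must be named again here.
        access_log /var/log/nginx/example.org.access.log anonymized;
    }
}
```

Running nginx -t before the reload checks the configuration for syntax errors.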

 
