HAProxy – sysadmin’s swiss army knife

HAProxy is a free, open-source, high-performance TCP/HTTP load balancer. It has been around since 2001, it’s written in the C programming language, and it uses a negligible amount of memory and CPU resources, even when performing very advanced manipulations on HTTP traffic.

It’s also very secure, with only fifteen security issues in the last seven years. Four of these were distribution-specific, six required a very high level of access (meaning the sysadmin maintaining the server would have a much bigger problem than HAProxy itself), and the remaining five were mainly related to denial-of-service attack vectors.

Use cases

We at Sysbee absolutely love HAProxy. Besides the load balancing feature, we use it for mitigation of DOS attacks, traffic filtering, advanced HTTP request routing and throttling – you name it. Heck, we use it on single-server setups as well!

Load balancing

HAProxy supports load balancing of TCP (layer 4) and HTTP (layer 7) traffic with various load balancing algorithms – round-robin, static, by weight, by cookie or by header, to name a few.
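
As a minimal sketch (the backend name, addresses and weights are purely illustrative), the algorithm is selected with the balance directive in the backend, and individual servers can be weighted:

# Weighted round-robin between two backend servers
# (other balance algorithms include static-rr, leastconn, source, uri, hdr(<name>))
backend bk_example
    balance roundrobin
    server web1 192.168.10.10:80 weight 2 check
    server web2 192.168.10.20:80 weight 1 check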

TCP mode is faster, and it’s ideal for load balancing various protocols that rely on TCP, e.g. MySQL, SMTP, Redis, and even HTTP if we’re not interested in inspecting HTTP traffic itself.
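
For instance, a TCP-mode proxy for MySQL could look like the sketch below (the section name and server addresses are hypothetical, and a production setup would also define suitable timeouts and health checks):

# TCP-mode load balancing for MySQL replicas
listen mysql_ro
    mode tcp
    bind *:3306
    balance leastconn
    server db1 192.168.10.31:3306 check
    server db2 192.168.10.32:3306 check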

HTTP mode is slower than TCP mode; however, the time HAProxy needs to analyse and manipulate HTTP traffic is measured in single-digit milliseconds, so the term “slow” is fairly relative.
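
A minimal HTTP-mode frontend might look like the following sketch; option httplog enables the detailed HTTP log format whose per-request timers make it easy to see how much time HAProxy itself adds (assuming a log target is configured elsewhere):

# HTTP-mode frontend with detailed request logging
frontend ft_http
    mode http
    bind *:80
    option httplog
    default_backend bk_http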

DOS mitigation and traffic filtering

In addition to load balancing, HAProxy has some interesting “party tricks” that can help mitigate some types of HTTP-based denial-of-service attacks and ensure server stability. Here are a few examples.

Slowloris mitigation

Slowloris attacks are a type of DOS attack that allows a single machine to take down a web server with minimal bandwidth and minimal side effects on unrelated services and ports. Slowloris tries to keep as many connections to the target web server open as possible, for as long as possible. It accomplishes this by opening connections to the target web server and sending a partial request, then periodically sending subsequent HTTP headers, adding to the request but never completing it. Affected servers keep these connections open, filling their maximum concurrent connection pool and eventually denying additional connection attempts from clients.

Threaded web servers are the most susceptible to such attacks (e.g. Apache, especially when using the prefork multi-processing module). However, web servers with an asynchronous, event-driven architecture (where requests are received in the background while other requests are being processed) can also be affected by such an attack.

Mitigations are possible with the Apache mod_reqtimeout module, by limiting connections per IP address in Nginx, and the like.

HAProxy is also event-driven, but it comes with an intelligent protection mechanism.
To mitigate slowloris attacks, HAProxy needs only one directive – timeout http-request – which defines the maximum accepted time to receive the complete HTTP request headers (without the body).

We found that 10 seconds is low enough to keep the bad guys at bay and high enough to avoid terminating connections from clients with slow internet access.

# Maximum time to receive complete HTTP request headers
timeout http-request 10s

IP address filtering

It’s safe to say that we’ve all been in a situation where we needed to filter a large number of requests based on the client’s IP address (e.g. when mitigating a DDOS attack coming from specific subnets or countries).

So your question is probably: “Why use HAProxy when you can use .htaccess or an Nginx vhost?”.

The answer is pretty straightforward: HAProxy is much lighter in terms of CPU and memory, especially when it comes to filtering a large number of concurrent requests.

In the example below, we have an ACL called “badguys”. For each HTTP request, HAProxy will try to match the visitor’s IP address against the list of IP ranges in the badguys.txt file. To keep the list as small as possible, IP ranges are listed in CIDR notation.

# Returns 403 error response if the request came from blacklisted IP
acl badguys src -f /etc/haproxy/badguys.txt
http-request deny if badguys
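
The pattern file itself is just a plain-text list with one address or subnet per line. For instance, /etc/haproxy/badguys.txt might contain something like the following (documentation prefixes used purely as placeholders):

203.0.113.0/24
198.51.100.0/25
192.0.2.42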

User-agent filtering

Similar to IP addresses, list files can be used to store other information. In this example, we want to block all requests with a specific string in the user-agent, except when the request was made to a specific domain.

Note the use of hdr_end, which matches the end of the value of the Host header. Depending on your use case, you might want to match the value differently, using hdr_beg, hdr_sub or hdr_reg instead.

The exclamation point before the ACL’s name indicates negation. Also, bear in mind that HAProxy short-circuits the evaluation of ACL conditions, which means evaluation stops as soon as one of the conditions is not matched.

# Returns 403 error if the request came with blacklisted user-agent header
acl badbot hdr_sub(User-Agent) -i -f /etc/haproxy/badbots.txt
acl excluded_domain hdr_end(Host) -i -f /etc/haproxy/exclude.txt
http-request deny if badbot !excluded_domain
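
Both list files use the same one-entry-per-line format. As an illustration, /etc/haproxy/badbots.txt could contain user-agent substrings such as the ones below (the -i flag makes the match case-insensitive), while exclude.txt would simply hold the exempted domain names, e.g. example.com:

python-requests
SemrushBot
AhrefsBot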

Login brute-force prevention

HAProxy is very useful when it comes to filtering automated login and contact form attacks. In this example, we will concentrate on login forms.

For automated login attempts, bots/scripts usually attempt to send a single POST request to a specific URL. Smarter bots will try to imitate a legitimate user by sending a GET request beforehand, but to save bandwidth and time, they don’t load the entire login page with all the static resources.

To better understand the configuration example below, we’ll first explain the specific configuration directives used in it:

  • The cookie insert directive in the backend instructs HAProxy to add an “SB_TRACK” cookie to the HTTP response headers.
  • indirect instructs HAProxy to insert the cookie only if the client does not already have one.
  • nocache means that HAProxy will also mark the response as non-cacheable, so that it is not accidentally cached between the client and HAProxy (e.g. if there is a caching server or a CDN node between them).
  • bk_nocookie is a backend which points to the same web server, but HAProxy won’t add the tracking cookie there. Legitimate users will pick up a cookie by requesting any static resource that’s loaded from the default bk_http backend.

The logic behind it is straightforward – the idea is to block bots and malicious users who aim to send as many requests as possible and who will not collect tracking cookies.

Before submitting login credentials, users usually need to access a login page. In our example, legitimate users will pick up the SB_TRACK cookie set by HAProxy when they access that login page, and the collected cookie will later allow them to submit login credentials using the POST HTTP method.

In most cases, bots and automated scripts don’t bother to accept cookies. They are easily blocked from submitting login requests by merely checking whether their requests came with the previously collected SB_TRACK cookie.

frontend ft_http
...
    acl cms_cookie hdr_sub(cookie) SB_TRACK=c1
    acl cms_admin url_sub /wp-login.php
    acl cms_admin url_beg /admin/

    http-request deny if cms_admin METH_POST !cms_cookie

    use_backend bk_nocookie if cms_admin
    default_backend bk_http


backend bk_http
    cookie SB_TRACK insert indirect nocache
    server web1 192.168.10.10:80 cookie c1

backend bk_nocookie
    server web1 192.168.10.10:80

Basic HTTP request throttling

HAProxy is the ideal tool for getting the most out of a server that will be under increased load for a short period. Excellent examples of such occurrences are Black Friday promotions, holiday sales, and similar. 

In certain situations, the server can be rescued with the HAProxy queueing mechanism.

In this example, the queueing policy is defined in the backend configuration section. Key directives are:

  • minconn – the number of concurrent connections a backend server handles under calm conditions. All requests above the minconn limit will be queued.
  • fullconn – specifies at what backend load the servers will reach their maxconn.
  • maxconn – the number of concurrent connections a backend server will handle once the fullconn limit is reached.

backend bk_http
    fullconn 100
    server web1 192.168.10.10:80 check inter 2s minconn 20 maxconn 30
    server web2 192.168.10.20:80 check inter 2s minconn 20 maxconn 30

In this example, we’ve set arbitrary “soft” and “hard” limits for a concurrent number of sessions which our backend servers can handle.

Each server will handle 20 concurrent sessions (defined with minconn). In case there’s a surge of requests, HAProxy will automatically queue the excess. As the total number of sessions on the backend approaches the fullconn limit (100 in our example), HAProxy gradually increases concurrency towards 30 sessions per server (defined with maxconn) in an effort to lower the number of queued requests.

Requests are queued until the timeout queue limit is reached (e.g., 60 seconds), during which time users simply wait for the page to appear in their browser.
If the queue timeout is reached, HAProxy drops the request and returns a 503 error to the client.
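
The queue timeout itself is a single directive, typically set in the defaults or backend section. A value matching the 60-second example above would look like this:

# Maximum time a request may wait in the queue
timeout queue 60s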

Use of the HAProxy queues makes sense only for short-term request bursts, because even if the minconn or maxconn limits are increased, the bottleneck will always be the processing speed of the backend servers.

Advanced HTTP request throttling

Traffic throttling, as described in the previous example, is applicable in certain situations (primarily if all requests are sent by legitimate users). But what if one or more malicious users send a large number of requests to the server and slow down the backend for all other users?

Problematic users can be stopped at an early stage with a few tricks that rely on stick-table functionality.

A stick-table is HAProxy’s in-memory key-value store for storing various counters.

Let’s take a closer look at the HAProxy configuration example below.

listen ft_http
    …
    acl static_file path_end .css .js .jpg .jpeg .gif .ico .png .bmp .woff .pdf

    stick-table type ip size 100k expire 10s store http_req_rate(10s),conn_cur
    http-request track-sc0 src if !static_file

    acl fast_client sc0_http_req_rate gt 10
    acl max_connections sc0_conn_cur gt 20

    use_backend bk_error_429 if max_connections
    use_backend bk_http_slow if fast_client

backend bk_error_429
    timeout tarpit 2s
    errorfile 429 /etc/haproxy/429.html

    http-request tarpit deny_status 429

The stick-table directive declares the stick-table with a couple of arguments:

  • type ip – declares a key whose type is IP (other types are integer, string, binary)
  • size – defines the maximum number of entries (one entry takes roughly 50 bytes, so around 5 MB for 100,000 entries)
  • expire – defines a TTL for the keys in memory
  • store – defines which values or metrics to store with the key (in this case, the number of HTTP requests within a sliding 10-second window and the current number of open connections for a specific IP)

The http-request track-sc0 line tracks the source (client’s) IP address. The counters are updated for each request coming from a specific IP address, provided the request was not for a static resource, since we excluded those requests with the !static_file ACL negation.

The acl fast_client directive checks whether the request rate within the sliding window is higher than 10, and acl max_connections checks whether the number of currently open connections from an IP address is greater than 20.

The errorfile in the backend is not necessary, but it is listed as an option if we want to return a custom error page to abusive clients (e.g. error 429 – too many requests).
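
One more note: the bk_http_slow backend referenced in the frontend above is not shown in the snippet. A minimal sketch (with arbitrary numbers) could simply point at the same web server with a small per-server connection limit, so that overly aggressive clients are queued instead of being served at full speed:

# Hypothetical “slow lane” backend for clients exceeding the request-rate limit
backend bk_http_slow
    timeout queue 30s
    server web1 192.168.10.10:80 maxconn 5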

Conclusion

So there you have it – a small glimpse into some of HAProxy’s functionalities that can help you mitigate basic DOS attacks and keep annoying and abusive bots at bay.

If you already use HAProxy and have a favourite feature of your own, let us know about it in the comments below! If you don’t, check out some of our managed services and we can help set it up for you.
