Hi all,

After one year of intense development and almost one month of debugging, polishing, and cross-review work trying to prevent our respective coworkers from winning the first bug award, I'm pleased to announce that haproxy 1.8.0 is now officially released!

Since -rc4, a few last user-visible changes were brought :

  - by default the master worker exits if any of its processes dies. This is done so that when certain processes are dedicated to certain tasks, we're not left with some features not working anymore. Imagine having 7 SSL offloaders chaining to 1 HTTP frontend, and the last one dying, you don't want to keep the 7 useless frontends. By quitting, we give a chance to a service manager to detect the problem and alert/restart the service. The behaviour is configurable though.

  - we were not happy with "thread-map" vs "cpu-map", which made these difficult to configure. Now "thread-map" has been removed and the feature merged into "cpu-map", which also supports process ranges and cpu ranges for easier configuration.

  - haproxy can now be built with native systemd support using USE_SYSTEMD=1 and started with -Ws (systemd-aware master-worker mode).

  - HTTP/2 will not schedule a graceful connection shutdown anymore when seeing a "Connection: close" header in a response. Instead a new HTTP action "reject" has been implemented to work like its TCP counterpart.

  - the HTTP/2 gateway code now properly reassembles split Cookie headers, as mandated by the specification. Not doing it was causing some issues with certain application servers, and absolutely needed to be addressed before claiming that it works.

And here is a high level overview of the new features contributed to 1.8 (warning, the list is huge) :

  - JSON stats (Simon Horman) : the stats socket's "show stat" and "show info" output can now be emitted in a structured JSON format which is more convenient than CSV for some modern data processing frameworks.

  - server templates (Frédéric Lécaille) : servers can be pre-provisioned in backends using a simple directive ("server-template"). It is then possible to configure them at runtime over the CLI or DNS, making it trivial to add/remove servers at run time without restarting.
    As a side effect of implementing this, all "server" keywords are now supported on the "default-server" line and it's possible to disable any of them using "no-<keyword>". All settings changed at runtime are present in the state file so that upon reload no information is lost.

  - dynamic cookies (Olivier Houchard) : a dynamic cookie can be generated on the fly based on the transport address of a newly added server. This is important to be able to use server templates in stateful environments.

  - per-certificate "bind" configuration (Emmanuel Hocdet) : all the SSL specific settings of the "bind" line may now be set per-certificate in the crtlist file. A common example involves requiring a client cert for certain domains only and not for others, all of them running on the same address:port.

  - pipelined and asynchronous SPOE (Christopher Faulet) : it's an important improvement to the Stream Processing Offload Engine that allows requests to be streamed over existing connections without having to wait for a previous response. It significantly increases the message rate and reduces the need for parallel connections. Two example WAFs were introduced as contributions to make use of this improvement (mod_security and mod_defender).

  - seamless reloads (Olivier Houchard) : in order to work around some issues faced on Linux causing a few RST to be emitted for incoming connections during a reload operation despite SO_REUSEPORT being used, it is now possible for the new haproxy process to connect to the previous one and to retrieve existing listening sockets so that they are never closed. No connection breakage will be observed during a reload operation anymore.

  - PCRE2 support (David Carlier) : this new version of PCRE seems to be making its way into some distros, so now we are compatible with it.

  - hard-stop-after (Cyril Bonté) : this new global setting forces old processes to quit after a given delay following a soft reload operation.
    This is mostly used to avoid an accumulation of old processes in some environments where idle connections are kept with large timeouts.

  - support for OpenSSL asynchronous crypto engines (Grant Zhang) : this allows haproxy to defer the expensive crypto operations to external hardware engines. Not only can it significantly improve the performance, but it can also reduce the latency impact of slow crypto operations on all other operations since haproxy switches to other tasks while the engine is busy. This was successfully tested with Intel's QAT and with a home-made software engine. This requires OpenSSL 1.1.x.

  - replacement of the systemd-wrapper with a new master-worker model (William Lallemand) : this new model allows a master process to stay in the foreground on top of the multiple worker processes. This process knows the list of worker processes, can watch them to detect failures, can broadcast some signals it receives, and has access to the file system to reload if needed (yes, it even supports seamless upgrades to newer versions since it reloads using an execve() call). While initially designed as a replacement for the systemd-wrapper, it also proves useful in other environments and during development.

  - DNS autonomous resolver (Baptiste Assmann) : the DNS resolution used to be triggered by health checks. While easy and convenient, it was a bit limited: it only allowed detecting address changes, not managing servers via the DNS. With this change the DNS resolvers are now totally autonomous and can distribute the addresses they've received to multiple servers at once, and if multiple A records are present in a response, the advertised addresses will be optimally distributed to all the servers relying on the same record.

  - DNS SRV records (Olivier Houchard) : in order to go a bit further with DNS resolution, SRV records were implemented. The address, port and weight attributes will be applied to servers.
    New servers are automatically added provided there are enough available templates, and servers which disappear are automatically removed from the farm. By combining server templates and SRV records, it is now trivial to perform service discovery.

  - configurable severity output on the CLI : external tools connecting to haproxy's CLI had to know a lot of details about the output of certain actions since these messages were initially aimed at humans, and it was not envisioned that the socket would become a runtime API. This change offers an option to emit the severity level on each action's output so that external APIs can classify the output between success, information, warnings, errors etc.

  - TLS 1.3 with support for Early-Data (AKA 0-RTT) on both sides (Olivier Houchard) : TLS 1.3 introduces the notion of "Early-Data", which are data emitted during the handshake. This feature reduces the TLS handshake time by one round trip. When compiled with a TLS-1.3 compatible TLS library (OpenSSL 1.1.1-dev for now), haproxy can receive such requests, process them safely, and even respond before the handshake completes. Furthermore, when the client opts for this, it is also possible to pass the request to the server following the same principle. This way it is technically possible to fully process a client request in a single round trip.

  - multi-thread support (Christopher Faulet, Emeric Brun) : no more need to choose between having multiple independent processes performing their own checks or cascading two layers of processes to scale SSL. With multi-threading we get the best of both : a unified process state and multi-core scalability. Even though this first implementation focuses on stability over performance, it still scales fairly well, being almost linear on asymmetric crypto, which is where there's the most demand. This feature is enabled by default on platforms where it could be tested, i.e. Linux >= 2.6.28, Solaris, FreeBSD, OpenBSD >= 5.7.
    It is considered EXPERIMENTAL, which means that if you face a problem with it, you may be asked to disable it for the time it takes to solve the problem. It is also possible that certain fixes to come will have some side effects.

  - HTTP/2 (Willy Tarreau) : HTTP/2 is automatically detected and processed in HTTP frontends negotiating the "h2" protocol name based on the ALPN or NPN TLS extensions. At the moment the HTTP/2 frames are converted to HTTP/1.1 requests before processing, so they will always appear as 1.1 in the logs (and in server logs). No HTTP/2 is supported for now on the backend, though this is scheduled for the next steps. HTTP/2 support is still considered EXPERIMENTAL, so just like for multi-threading, in case of problem you may end up having to disable it for the time it takes to solve the issue.

  - small objects cache (William Lallemand) : we've been talking about this so-called "favicon cache" for many years now, so I'm pretty sure it will be welcome. To give a bit of context, we've often been criticized for not caching trivial responses from the servers, especially some slow application servers occasionally returning a small object (favicon.ico, main.css etc). While the obvious response is that installing a cache there is the best idea, it is sometimes perceived as overkill for just a few files. So what we've done here was to fill exactly that hole : have a *safe*, maintenance-free, small objects cache. In practice, if there is any doubt about a response's cacheability, it will not be cached. Same if the response contains a Vary header or is larger than a buffer. However this can bring huge benefits for situations where there's no argument against trivial caching. The intent is to keep it as simple and fast as possible so that it can always be faster than retrieving the same object from the next layer (possibly a full-featured cache).
    Note that I purposely asked William *not* to implement the purge on the CLI so that it remains maintenance-free and we don't see it abused where it should not be installed.

This version brings a total of 1208 commits authored by 54 persons. That's almost double the number of commits of 1.7 (706) for slightly fewer people (62 back then), though most of them are the same.

A few known limitations still apply to this release, but they are minor enough to allow us to release and fix them later :

  - master-worker + daemon (-W -D) fails strangely on FreeBSD, and the workaround is even stranger. Since the master-worker was meant to replace systemd-wrapper, it's not needed on this platform so we can take the time to analyse the issue in depth. In the mean time, don't use -W on FreeBSD (nor on OpenBSD given that the issue involved the kqueue poller).

  - the CLI's "show sess" command is known for not being 100% thread-safe, so it's better to avoid using it if more than one thread is enabled. Note that it will not corrupt your system, it will most often work, but may either report occasional garbage or immediately crash. If it completes the dump you're safe. We'll work on it as well.

  - both the cache and HTTP compression use filters. It is not trivial to safely use them both, we still need to sort this out and either automatically deal with each corner case or document recommendations for safe use. For now, please do not enable compression with the cache (choose only one of them). Note that neither is enabled by default so if you don't know, you're safe.

  - device detection engines currently don't support multi-threading (but it's safe to build with it, there is a runtime check).

The outstanding amount of new features above proves that the new development model we've adopted last year works much better than what we had in the past.
However I also noticed that it added a lot more pressure on the shoulders of a few persons whose help has been invaluable in screening each and every report so that the developers could stay focused on their tasks. And for this reason, among the 466 persons who participated in discussions over the last year and those animating the Discourse forums, I'd like to address special thanks to the following ones who together responded to the vast majority of the threads on the list, saving many of us from having to leave our code :

  - Aleksandar Lazic (aka Aleks)
  - Cyril Bonté
  - Daniel Schneller
  - Emmanuel Hocdet (aka Manu)
  - Igor Cicimov
  - Jarno Huuskonen
  - Pavlos Parissis
  - Thierry Fournier
  - Vincent Bernat

and a very special one for Lukas Tribus who in addition to providing a lot of high quality answers on the mailing list has been tirelessly responding to almost every question on Discourse, which is truly amazing (I'm starting to suspect that there are several persons using the same name)!

I'm totally aware that saying "thank you" is not enough and that we'll definitely have to see how to make your life easier as well guys, so that we can continue to scale without adding more burden on you!

I also noticed that the average quality of problem reports has significantly increased over time, in part thanks to some long-time participants well used to the process like Conrad Hoffmann, Pieter Baauw (aka PiBa-NL), Patrick Hemmer, Dmitry Sivachenko or Jarno Huuskonen, and it's really great because there's nothing more annoying than having to respond to a problem by always starting to ask for the same information. So please keep up the good work guys!

In my opinion we haven't emitted enough versions to make it easy for more people to test, just like we haven't emitted enough stable releases, due to all the people involved in the process being busy with their development. This is something we'll have to address. I'll send a proposal of release schedule for 1.9 some time later.
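
For those who want to try a few of the features listed above together, here is a rough, untested configuration sketch. It only combines directives documented for 1.8 (master-worker, nbthread, cpu-map, hard-stop-after, expose-fd listeners, resolvers, cache, alpn, dynamic cookies and server-template). Every name, path, address and value in it (mydns, smallobj, be_app, 192.0.2.53, app.example.com, the certificate path, the cookie key, the sizes and delays) is a placeholder chosen for illustration and must be adapted.

```
global
    # master-worker mode; the master now exits if a worker dies.
    # "master-worker no-exit-on-failure" restores the former behaviour.
    master-worker

    # 4 threads in the process (threads are EXPERIMENTAL in 1.8)
    nbthread 4
    # bind threads 1-4 of process 1 to CPUs 0-3, one CPU per thread
    cpu-map auto:1/1-4 0-3

    # old processes must quit at most 30s after a soft reload
    hard-stop-after 30s

    # "expose-fd listeners" allows a new process started with
    # "-x /var/run/haproxy.sock" to retrieve the listening sockets
    # (seamless reload)
    stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

resolvers mydns
    # placeholder address of a DNS server answering SRV queries
    nameserver ns1 192.0.2.53:53
    hold valid 10s

cache smallobj
    total-max-size 4    # megabytes
    max-age 60          # seconds

frontend fe_main
    bind :80
    # HTTP/2 (EXPERIMENTAL) is negotiated through ALPN on the TLS listener
    bind :443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1
    default_backend be_app

backend be_app
    # small objects cache; do not enable compression together with it
    http-request cache-use smallobj
    http-response cache-store smallobj

    # dynamic cookies so that templated servers remain usable with
    # cookie-based persistence
    cookie SRVID insert indirect nocache dynamic
    dynamic-cookie-key placeholder-secret

    # 5 pre-provisioned servers, filled from the SRV records
    server-template app 5 _http._tcp.app.example.com resolvers mydns check
```

With such a setup, the JSON output mentioned above would be obtained by sending "show stat json" or "show info json" to the stats socket. The cache and the threads are both new, so it is probably wise to enable them one at a time when testing.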

Now 1.9 opens with 1.9-dev0 so that we can go break things as usual :-)

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Sources          : http://www.haproxy.org/download/1.8/src/
   Git repository   : http://git.haproxy.org/git/haproxy-1.8.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-1.8.git
   Changelog        : http://www.haproxy.org/download/1.8/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

By the way, I'm well known for messing up with each and every release, leaving broken links here and there. So if something appears wrong, please report it!

Willy
---
Complete changelog from -rc4 to -final:
Christopher Faulet (15):
      MINOR: sample: Add "thread" sample fetch
      BUG/MINOR: Use crt_base instead of ca_base when crt is parsed on a server line
      BUG/MINOR: listener: Allow multiple "process" options on "bind" lines
      MINOR: config: Support a range to specify processes in "cpu-map" parameter
      MINOR: config: Slightly change how parse_process_number works
      MINOR: config: Export parse_process_number and use it wherever it's applicable
      MINOR: standard: Add my_ffsl function to get the position of the bit set to one
      MINOR: config: Add auto-increment feature for cpu-map
      MINOR: config: Support partial ranges in cpu-map directive
      MINOR:: config: Remove thread-map directive
      MINOR: config: Add the threads support in cpu-map directive
      MINOR: config: Add threads support for "process" option on "bind" lines
      MEDIUM: listener: Bind listeners on a thread subset if specified
      CLEANUP: debug: Use DPRINTF instead of fprintf into #ifdef DEBUG_FULL/#endif
      CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning

Emeric Brun (1):
      DOC: add initial peers protovol v2.0 documentation.

Emmanuel Hocdet (1):
      MINOR: ssl: Handle early data with BoringSSL

Eric Salama (2):
      CONTRIB: spoa_example: allow to compile outside HAProxy.
      CONTRIB: spoa_example: remove SPOE enums that are useless for clients

Lukas Tribus (2):
      BUG/MINOR: systemd: ignore daemon mode
      DOC: explain HTTP2 timeout behavior

Olivier Houchard (5):
      BUG/MINOR: ssl: Always start the handshake if we can't send early data.
      MINOR: ssl: Don't disable early data handling if we could not write.
      MINOR: ssl: Handle reading early data after writing better.
      MINOR: mux: Make sure every string is woken up after the handshake.
      MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"

Tim Duesterhus (1):
      MEDIUM: mworker: Add systemd `Type=notify` support

William Lallemand (17):
      BUG/MEDIUM: cache: free callback to remove from tree
      CLEANUP: cache: remove unused struct
      MEDIUM: cache: enable the HTTP analysers
      CLEANUP: cache: remove wrong comment
      CLEANUP: cache: reorder includes
      MEDIUM: shctx: use unsigned int for len and block_count
      MEDIUM: cache: "show cache" on the cli
      BUG/MEDIUM: cache: use key=0 as a condition for freeing
      BUG/MEDIUM: cache: refcount forbids to free the objects
      BUG/MEDIUM: cache fix cli_kws structure
      MEDIUM: cache: store sha1 for hashing the cache key
      BUG/MEDIUM: cache: free ressources in chn_end_analyze
      MINOR: cache: move the refcount decrease in the applet release
      MINOR: cache: replace a fprint() by an abort()
      MEDIUM: cache: max-age configuration keyword
      DOC: cache: configuration and management
      MAJOR: mworker: exits the master on failure

Willy Tarreau (52):
      BUG/MEDIUM: stream: don't automatically forward connect nor close
      BUG/MAJOR: stream: ensure analysers are always called upon close
      BUG/MINOR: stream-int: don't try to read again when CF_READ_DONTWAIT is set
      MINOR: threads/atomic: rename local variables in macros to avoid conflicts
      MINOR: threads/plock: rename local variables in macros to avoid conflicts
      MINOR: threads/atomic: implement pl_mb() in asm on x86
      MINOR: threads/atomic: implement pl_bts() on non-x86
      MINOR: threads/build: atomic: replace the few inlines with macros
      BUILD: threads/plock: fix a build issue on Clang without optimization
      BUILD: ebtree: don't redefine types u32/s32 in scope-aware trees
      BUILD: compiler: add a new type modifier __maybe_unused
      BUILD: h2: mark some inlined functions "unused"
      BUILD: server: check->desc always exists
      BUG/MEDIUM: h2: properly report connection errors in headers and data handlers
      MEDIUM: h2: add a function to emit an HTTP/1 request from a headers list
      MEDIUM: h2: change hpack_decode_headers() to only provide a list of headers
      BUG/MEDIUM: h2: always reassemble the Cookie request header field
      CONTRIB: spoa_example: remove bref, wordlist, cond_wordlist
      CONTRIB: spoa_example: remove last dependencies on type "sample"
      BUG/MEDIUM: deinit: correctly deinitialize the proxy and global listener tasks
      MINOR: pools: prepare functions to override malloc/free in pools
      MINOR: pools: implement DEBUG_UAF to detect use after free
      BUG/MEDIUM: threads/time: fix time drift correction
      BUG/MEDIUM: threads/time: maintain a common time reference between all threads
      BUG/MINOR: stream: fix tv_request calculation for applets
      BUG/MAJOR: h2: always remove a stream from the send list before freeing it
      BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock
      MINOR: http: implement the "http-request reject" rule
      MINOR: h2: send RST_STREAM before GOAWAY on reject
      MEDIUM: h2: don't gracefully close the connection anymore on Connection: close
      MINOR: h2: make use of client-fin timeout after GOAWAY
      MEDIUM: config: ensure that tune.bufsize is at least 16384 when using HTTP/2
      BUG/MEDIUM: stream: always release the stream-interface on abort
      CLEANUP: pools: rename all pool functions and pointers to remove this "2"
      DOC: update the roadmap file with the latest changes merged in 1.8
      DOC: fix mangled version in peers protocol documentation
      DOC: mention William as maintainer of the cache and master-worker
      DOC: add Christopher and Emeric as maintainers of the threads
      BUG/MINOR: threads: don't drop "extern" on the lock in include files
      MINOR: task: keep a pointer to the currently running task
      MINOR: task: align the rq and wq locks
      MINOR: fd: cache-align fdtab and fdcache locks
      MINOR: buffers: cache-align buffer_wq_lock
      CLEANUP: server: reorder some fields in struct server to save 40 bytes
      CLEANUP: proxy: slightly reorder the struct proxy to reduce holes
      CLEANUP: checks: remove 16 bytes of holes in struct check
      CLEANUP: cache: more efficiently pack the struct cache
      CLEANUP: fd: place the lock at the beginning of struct fdtab
      CLEANUP: pools: align pools on a cache line
      BUG/MAJOR: threads/queue: avoid recursive locking in pendconn_get_next_strm()
      BUILD: Makefile: reorder object files by size
      [RELEASE] Released version 1.8.0

---