(thx helmut)
do not read-ahead past '\0' while url-decoding
lighttpd 1.4.60 could previously have read one byte of potentially
uninitialized data. lighttpd detects the '\0' so there is no exposure
of data. This also can not cause a crash in lighttpd 1.4.60 due to how
lighttpd 1.4.60 allocates memory for buffers in sizes (power-2 + 1),
and typical system malloc alignment of 4- or 8- bytes.
define __attribute_nonnull__(params) with params to match
recent changes in glibc development (targetting glibc 2.35 in Feb 2022)
x-ref:
new __attribute_nonnull__(params) conflicts with third-party
https://sourceware.org/bugzilla/show_bug.cgi?id=28425
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.
Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.
In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func. In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.
- check for empty strings at config time and set value to NULL if blank
string will be ignored at runtime; at runtime, simple pointer check
for NULL can be used to check for a value that has been set and is not
blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
and use buffer_is_unset() instead of buffer_is_empty(),
where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
truncate string, and use buffer_extend() to extend
Examples where buffer known not to be NULL:
- cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
(though we might set it to NULL if buffer_is_blank(cpv->v.b))
- address of buffer is arg (&foo)
(compiler optimizer detects this in most, but not all, cases)
- buffer is checked for NULL earlier in func
- buffer is accessed in same scope without a NULL check (e.g. b->ptr)
internal behavior change:
callers must not pass a NULL buffer to some funcs.
- buffer_init_buffer() requires non-null args
- buffer_copy_buffer() requires non-null args
- buffer_append_string_buffer() requires non-null args
- buffer_string_space() requires non-null arg
buffer_commit() is called by routines which preallocate for operations
like read(). The caller must properly manage the memory. The checks
removed from buffer_commit() are too late.
optimize buffer_* primitives
Other than buffer_string_set_length(), reallocate with one power-2 step
in size (or use the requested size, if larger). This replaces the fixed
BUFFER_PIECE_SIZE round-up of only 64 bytes extension each reallocation,
which could lead to excessive reallocations in some scenarios.
buffer_extend() convenience routine to prep for batch append
(combines buffer_string_prepare_append() and buffer_commit())
mod_fastcgi, mod_scgi, mod_proxy and others now leverage buffer_extend()
mod_scgi directly performs little-endian encoding of short ints
http_response_write_header() optimizes writing response header,
leveraging buffer_extend()
modify mod_proxy to append line ends
similar to how it is done in http_response_write_header()
(removes one call to buffer_append_string_len())
remove buffer_urldecode_query() (unused)
query string generally needs to be split on '&'
before decoding '+' and decoding %-encoding
remove int2hex() (unused, and not well-named for nibble-to-hex)
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)
NB: request read and write chunkqueues currently point to connection
chunkqueues; per-request and per-connection chunkqueues are
not distinct from one another
con->read_queue == r->read_queue
con->write_queue == r->write_queue
NB: in the future, a separate connection config may be needed for
connection-level module hooks. Similarly, might need to have
per-request chunkqueues separate from per-connection chunkqueues.
Should probably also have a request_reset() which is distinct from
connection_reset().
use global rather than passing around (server *) just for that
li_itostrn() and li_utostrn() return string length
(rather than requiring subsequent strlen() to find length)
validate UTF-8 in url-decoded paths obtained elsewhere than from request
(burl_normalize(), if enabled with server.http-parseopts, checks url for
overlong encodings of ASCII chars in the HTTP request-line)
buffer_simplify_path() no longer prepends '/' if '/' is missing.
Callers must check for leading '/' depending on use, such as in
concatenation with others paths, or direct use accessing filesystem
Note: lighttpd 1.4.50 provides the server.http-parseopts directive.
Recommended settings unless specific use requires looser settings:
server.http-parseopts = (
"header-strict" => "enable",
"host-strict" => "enable",
"host-normalize" => "enable",
"url-normalize" => "enable",
"url-normalize-unreserved" => "enable",
"url-normalize-required" => "enable",
"url-ctrls-reject" => "enable",
"url-path-2f-decode" => "enable",
"url-path-dotseg-remove" => "enable",
"url-query-20-plus" => "enable"
)
x-ref:
https://digi.ninja/blog/lighttpd_rewrite_bypass.php
As noted in the link above, mod_access should be preferred instead
of mod_rewrite for access controls to URLs.