parse $HTTP["remote-ip"] CIDR mask into structured data at startup
note: adds buffer_move() to configparser.y to reduce memory copying
for all config values, and is required for remote-ip to preserve the
structured data added after the config value string. (Alternatively,
could have normalized the remote-ip value after copying into dc->string)
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.
Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.
In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func. In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.
- check for empty strings at config time and set value to NULL if blank
string will be ignored at runtime; at runtime, simple pointer check
for NULL can be used to check for a value that has been set and is not
blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
and use buffer_is_unset() instead of buffer_is_empty(),
where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
truncate string, and use buffer_extend() to extend
Examples where buffer known not to be NULL:
- cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
(though we might set it to NULL if buffer_is_blank(cpv->v.b))
- address of buffer is arg (&foo)
(compiler optimizer detects this in most, but not all, cases)
- buffer is checked for NULL earlier in func
- buffer is accessed in same scope without a NULL check (e.g. b->ptr)
internal behavior change:
callers must not pass a NULL buffer to some funcs.
- buffer_init_buffer() requires non-null args
- buffer_copy_buffer() requires non-null args
- buffer_append_string_buffer() requires non-null args
- buffer_string_space() requires non-null arg
move native data_* types into array.c
(the types are already declared in array.h)
The array data structure remains extendable, as is done with data_config
(configfile) and data_auth (mod_auth), though array data structure
primary uses are at startup (config time) and header parsing. The
insertion logic into sorted list can be expensive for large lists,
so header parsing might choose a different data structure in the future.
tighten struct data_config and config_cond_info
create config key at startup and reuse for debug/trace
separate routine for configparser_parse_condition()
separate routine for configparser_parse_else_condition()
(optional addition to (data_string *), used by http_header.[ch])
extend (data_string *) instead of creating another data_* TYPE_*
(new data type would probably have (data_string *) as base class)
(might revisit choice in the future)
HTTP_HEADER_UNSPECIFIED has been removed. It was used in select
locations as an optimization to avoid looking up enum header_header_e
before checking the array, but the ordering in the array now relies
on having the id. Having the id allows for a quick check if a common
header is present or not in the htags bitmask, before checking the
array, and allows for integer comparison in the log(n) search of the
array, instead of strncasecmp().
With HTTP_HEADER_UNSPECIFIED removed, add optimization to set bit
in htags for HTTP_HEADER_OTHER when an "other" header is added,
but do not clear the bit, as there might be addtl "other" headers
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)
NB: request read and write chunkqueues currently point to connection
chunkqueues; per-request and per-connection chunkqueues are
not distinct from one another
con->read_queue == r->read_queue
con->write_queue == r->write_queue
NB: in the future, a separate connection config may be needed for
connection-level module hooks. Similarly, might need to have
per-request chunkqueues separate from per-connection chunkqueues.
Should probably also have a request_reset() which is distinct from
connection_reset().
array_get_element_klen() is now intended for read-only access
array_get_data_unset() is used by config processing for r/w access
array_get_buf_ptr() is used for r/w access to ds->value (string buffer)
mark array_get_index() as hot, rewrite to be pure and return sorted pos
mark routines as pure, as appropriate
mark routines as cold if used only at startup for config processing
mark params const, as appropriate
array_get_buf_ptr() for modifiable value buffer after insert into array
uint32_t used and size members instead of size_t
remove a->unique_ndx member; simply add to end of array for value lists
remove du->is_index_key member; simply check buffer_is_empty(du->key)
array_insert_key_value() used to be a hint that lookup could be skipped,
but the state from array_get_index() is now saved and reused internally,
so the distinction is no longer needed. Use array_set_key_value().
quickly clear buffer instead of buffer_string_set_length(b, 0) or
buffer_reset(b). Avoids free() of large buffers about to be reused,
or buffers that are module-scoped, persistent, and reused.
(buffer_reset() should still be used with buffers in connection *con
when the data in the buffers is supplied by external, untrusted source)
save 40 bytes (64-bit), or 16 bytes (32-bit) per data_* element
at the cost of going through indirect function pointer to execute
methods. At runtime, the reset() method is most used among them.
provide standard types in first.h instead of base.h
provide lighttpd types in base_decls.h instead of settings.h
reduce headers exposed by headers for core data structures
do not expose <pcre.h> or <stdlib.h> in headers
move stat_cache_entry to stat_cache.h
reduce use of "server.h" and "base.h" in headers
adjust config parser for valid variable expansion
Return only the value when a variable is expanded so that the
array element keeps its state as value-only or part of key-value
(thx nicorac)
x-ref:
"https://redmine.lighttpd.net/boards/2/topics/7600"
- lemon never calls the destructor for variables on the RHS, make sure
to manually clean up
- outside `if (ctx->ok) { }` always check for NULL pointers, i.e:
- if (x) x->free(x)
- buffer_free and array_free check for NULL on their own
- cleanup RHS variables below `if (ctx->ok) { }` at the bottom
- set variables to NULL before if ownership gets passed on
- move some buffers instead of copying them
x-ref:
"Memory corruption in yy_reduce (configparser.y), SIGSEGV"
https://redmine.lighttpd.net/issues/2809
[core] $HTTP["remoteip"] must handle IPv6 w/o [] (existing behavior)
This was inadvertently broken in lighttpd 1.4.40 when IP address
normalization was added.
In $HTTP["remoteip"], IPv6 is now accepted with or without '[]'.
http_request_host_normalize() expects IPv6 with '[]', and config
processing at runtime expects COMP_HTTP_REMOTE_IP compared without '[]',
so '[]' is stripped (internally) after normalization
Release op1 memory on failure; fixes some theoretical memory leaks (a
failure results in early exit anyway).
From: Stefan Bühler <stbuehler@web.de>
git-svn-id: svn://svn.lighttpd.net/lighttpd/branches/lighttpd-1.4.x@3101 152afb58-edef-0310-8abb-c4023f1b3aa9
- refactor insert into array_find_or_insert; if the element already
exists the caller must resolve the conflict manually:
- array_replace frees the old element
- array_insert_unique calls "insert_dup"
both have no return value anymore
- fix usages of array_replace; they now don't need to delete the old
entry anymore; usage in configparser was probably broken, as it
possibly deleted the old element before calling array_replace
This should fix a lot of the issues reported in "Fortify Open Review
Project - lighttpd 1.4.39" (usually hitting the array limit):
when the array size limit was hit "new" entries leaked instead of
getting added.
On 32-bit INT_MAX entries cannot actually be reached (each entry
requires at least 48 bytes, leading to a total of 96GB memory).
On 64-bit INT_MAX entries would require 224GB memory, so it would be
theoretically possible. But it would need 2^27 reallocations of two
C-arrays of up to 16GB size.
From: Stefan Bühler <stbuehler@web.de>
git-svn-id: svn://svn.lighttpd.net/lighttpd/branches/lighttpd-1.4.x@3098 152afb58-edef-0310-8abb-c4023f1b3aa9
- add new "skip" result to mark conditions that didn't actually get
evaluated to false but just skipped because the preconditions failed.
- add "local_result" for each cache entry to remember whether the
condition itself matched (not including the preconditions).
this can be reused after a cache reset if the condition itself was not
reset, but the preconditions were
- clear result of subtree (children and else-branches) when clearing a
condition cache
From: Stefan Bühler <stbuehler@web.de>
git-svn-id: svn://svn.lighttpd.net/lighttpd/branches/lighttpd-1.4.x@3082 152afb58-edef-0310-8abb-c4023f1b3aa9
only use values in reduce actions when the config is still valid
(ctx->ok).
From: Stefan Bühler <stbuehler@web.de>
git-svn-id: svn://svn.lighttpd.net/lighttpd/branches/lighttpd-1.4.x@3080 152afb58-edef-0310-8abb-c4023f1b3aa9