|
|
|
@@ -1540,6 +1540,44 @@ So when you encounter spurious, unexplained daemon exits, make sure you
ignore SIGPIPE (and maybe make sure you log the exit status of your daemon
somewhere, as that would have given you a big clue).
|
|
|
|
|
|
|
|
|
|
=head3 The special problem of accept()ing when you can't
|
|
|
|
|
|
|
|
|
|
Many implementations of the POSIX C<accept> function (for example, the one
found in post-2004 Linux) have the peculiar behaviour of leaving a
connection in the pending queue in some error cases.
|
|
|
|
|
|
|
|
|
|
For example, larger servers often run out of file descriptors (because
of resource limits), causing C<accept> to fail with C<EMFILE> (or C<ENFILE>
when the system-wide limit is hit) without rejecting the connection. As
the connection still exists, libev signals readiness again on the next
iteration, which typically causes the program to loop at 100% CPU usage.
|
|
|
|
|
|
|
|
|
|
Unfortunately, the set of errors that cause this issue differs between
operating systems, there is usually little the app can do to remedy the
situation, and no thread-safe method of removing the connection to
cope with overload is known (to me).
|
|
|
|
|
|
|
|
|
|
One of the easiest ways to handle this situation is to just ignore it:
when the program encounters an overload, it will just loop until the
situation is over. While this is a form of busy waiting, no OS offers an
event-based way to handle this situation, so some form of polling is
unavoidable.
|
|
|
|
|
|
|
|
|
|
A better way to handle the situation is to log any errors other than
C<EAGAIN> and C<EWOULDBLOCK>, making sure not to flood the log with such
messages, and continue as usual, which at least gives the user an idea of
what could be wrong ("raise the ulimit!"). For extra points one could stop
the C<ev_io> watcher on the listening fd "for a while", which reduces CPU
usage.
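The logging advice above can be sketched in plain C. This is only an illustration: C<struct log_limiter> and C<report_accept_error> are hypothetical names, not part of libev, and the caller is assumed to pass in the current time (for example a value derived from C<ev_now>).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical rate limiter: at most one log line per `interval` seconds. */
struct log_limiter {
    time_t interval;          /* minimum gap between two log lines */
    time_t last;              /* time of the last line actually written */
    int have_last;            /* 0 until the first line has been written */
    unsigned long suppressed; /* errors swallowed since the last line */
};

/* Report an errno value from a failed accept().
   Returns 1 if a line was written, 0 if the error was ignored or suppressed. */
static int report_accept_error(struct log_limiter *l, int err, time_t now)
{
    assert(l != NULL);

    if (err == EAGAIN || err == EWOULDBLOCK)
        return 0; /* the queue is merely empty, nothing to report */

    if (l->have_last && now - l->last < l->interval) {
        l->suppressed++; /* too soon after the previous line */
        return 0;
    }

    fprintf(stderr, "accept: %s (%lu similar errors suppressed)\n",
            strerror(err), l->suppressed);
    l->last = now;
    l->have_last = 1;
    l->suppressed = 0;
    return 1;
}
```

An C<ev_io> callback on the listening fd could call this helper whenever C<accept> fails. The return value is also a reasonable point at which to pause the watcher with C<ev_io_stop> and restart it later from an C<ev_timer> callback, although that part is left out here.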
|
|
|
|
|
|
|
|
|
|
If your program is single-threaded, then you could also keep a dummy file
descriptor for overload situations (e.g. by opening F</dev/null>), and
when you run into C<ENFILE> or C<EMFILE>, close it, run C<accept>,
close the fd that C<accept> returned, and create a new dummy fd. This will
gracefully refuse clients under typical overload conditions.
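A minimal sketch of the dummy descriptor trick, using only POSIX calls. The names C<spare_fd>, C<spare_fd_reserve>, C<accept_or_shed> and C<shed_demo> are hypothetical. The sketch assumes a single-threaded program and a system where C<accept> leaves the connection queued when it fails with C<EMFILE> or C<ENFILE>.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

static int spare_fd = -1; /* hypothetical name for the dummy descriptor */

/* Open the dummy descriptor if it is not open yet; 0 on success, -1 on error. */
static int spare_fd_reserve(void)
{
    if (spare_fd < 0)
        spare_fd = open("/dev/null", O_RDONLY);
    return spare_fd < 0 ? -1 : 0;
}

/* accept() wrapper: on EMFILE/ENFILE, give up the dummy fd, accept the
   pending connection, close it at once and re-create the dummy fd.
   Returns the new fd, or -1 with errno set to the original error. */
static int accept_or_shed(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    if (fd >= 0 || (errno != EMFILE && errno != ENFILE))
        return fd;

    int saved = errno;
    if (spare_fd >= 0) {
        close(spare_fd);
        spare_fd = -1;
        fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0)
            close(fd);      /* the client sees the connection being closed */
        spare_fd_reserve(); /* may fail; later calls then only report the error */
    }
    errno = saved;
    return -1;
}

/* Self-check on a loopback socket: 1 = connection was shed, 0 = it was not,
   -1 = the scenario could not be set up in this environment. */
static int shed_demo(void)
{
    enum { MAX_FILL = 256 };
    int filler[MAX_FILL], n = 0, result = -1, got;
    struct rlimit old, tight;
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* port 0: kernel picks one */

    if (lfd < 0 || cfd < 0
        || bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0
        || listen(lfd, 8) < 0
        || getsockname(lfd, (struct sockaddr *)&addr, &len) < 0
        || connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0
        || spare_fd_reserve() < 0 || spare_fd + 8 > MAX_FILL
        || getrlimit(RLIMIT_NOFILE, &old) < 0)
        goto done;

    /* Lower the soft limit, then fill every remaining slot below it. */
    tight = old;
    tight.rlim_cur = (rlim_t)spare_fd + 8;
    if (setrlimit(RLIMIT_NOFILE, &tight) < 0)
        goto done;
    while (n < MAX_FILL && (filler[n] = dup(lfd)) >= 0)
        n++;

    got = accept_or_shed(lfd);
    result = got == -1 && errno == EMFILE;
    if (got >= 0)
        close(got);

    setrlimit(RLIMIT_NOFILE, &old);
    while (n > 0)
        close(filler[--n]);

    /* If the connection was shed, the pending queue is empty again. */
    result = result && fcntl(lfd, F_SETFL, O_NONBLOCK) == 0
             && accept(lfd, NULL, NULL) == -1
             && (errno == EAGAIN || errno == EWOULDBLOCK);
done:
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

The single-threaded assumption matters because another thread could grab the freed descriptor between the C<close> and the second C<accept>, in which case the connection would stay in the queue. C<shed_demo> exists only to exercise the wrapper under an artificially low C<RLIMIT_NOFILE>; a real server would call C<accept_or_shed> from its C<ev_io> callback instead.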
|
|
|
|
|
|
|
|
|
|
The last way to handle it is to simply log the error and C<exit>, as
is often done with C<malloc> failures, but this results in an easy
opportunity for a DoS attack.
|
|
|
|
|
|
|
|
|
|
=head3 Watcher-Specific Functions
|
|
|
|
|
|
|
|
|
|