All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Eugene Crosser <crosser@average.org>
Cc: "netfilter-devel@vger.kernel.org" <netfilter-devel@vger.kernel.org>
Subject: Re: Suboptimal error handling in libnftables
Date: Mon, 6 Dec 2021 20:56:18 +0100	[thread overview]
Message-ID: <Ya5q4opTEInI9b28@salvia> (raw)
In-Reply-To: <0b606613-076e-9b05-5283-46ade61292b2@average.org>

[-- Attachment #1: Type: text/plain, Size: 3168 bytes --]

On Mon, Dec 06, 2021 at 05:58:25PM +0100, Eugene Crosser wrote:
> On 02/12/2021 14:54, Pablo Neira Ayuso wrote:
> > On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote:
> >> Hello,
> >>
> >> there is read-from-the-socket loop in src/iface.c line 90 (function
> >> iface_cache_update()), and it (and other places) call macro
> >> netlink_init_error() to report error. The function behind the macro is
> >> in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing
> >> a message to stderr.
> >>
> >> I see two problems with this:
> >>
> >> 1. All read-from-the-socket functions should be run in a loop, repeating
> >> if return code is -1 and errno is EINTR. I.e. EINTR should not be
> >> treated as an error, but as a condition that requires retry.
> >>
> >> 2. Library functions are not supposed to call exit() (or abort() for
> >> that matter). They are expected to return an error indication to the
> >> caller, who may have its own strategy for handling error conditions.
> >>
> >> Case in point, we have a daemon (in Python) that uses bindings to
> >> libnftables. It's a service responding to requests coming over a TCP
> >> connection, and it takes care to intercept any error situations and
> >> report them back. We discovered that under some conditions, it just
> >> closes the socket and goes away. This being a daemon, stderr was not
> >> immediately accessible; and even it it were, it is pretty hard to figure
> >> where did the message "iface.c:98: Unable to initialize Netlink socket:
> >> Interrupted system call" come from and why!
> > 
> > This missing EINTR handling for iface_cache_update() is a bug, would
> > you post a patch for this?
> 
> It looks like there is more than just a missing retry when
> socket-receive returns EINTR. In the code in src/iface.c between lines
> 87 and 98, EINTR may come from one of two functions:
> mnl_socket_recvfrom() and mnl_cb_run(). If it is returned by
> mnl_socket_recvfrom(), the correct course of action is to blindly call
> that function again. But when it is returned by mnl_cb_run(), the
> meaning is different. mnl_cb_run() retruns -1 and sets errno=EINTR when
> netlink message contained NLM_F_DUMP_INTR. I assume (though I am not
> sure) that NLM_F_DUMP_INTR means that the data that is being transferred
> has changed while transfer was only partially complete, and the user is
> advised to restart _the whole dump process_ by sending a new NLM_F_DUMP
> request message. (Arguably, libmnl ought to report such situation with
> some other indication, not EINTR.) In any case, I believe that the
> aforementioned code should handle both of these two "need to retry" cases.
> 
> I our tests it looks like we are hitting NLM_F_DUMP_INTR rather then
> interrupted socket recv(). I will report back after this is verified.

NLM_F_DUMP_INTR means that the existing netlink dump is stale (an
update happened while dumping the listing to userspace), so userspace
has to restart to get a consistent listing from the kernel.

Yes, in both cases, either signal interruped syscall or
NLM_F_DUMP_INTR, libnftables should retry.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

      reply	other threads:[~2021-12-06 19:56 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-02 13:16 Suboptimal error handling in libnftables Eugene Crosser
2021-12-02 13:54 ` Pablo Neira Ayuso
2021-12-02 14:03   ` Eugene Crosser
2021-12-02 15:50     ` Pablo Neira Ayuso
2021-12-06 16:58   ` Eugene Crosser
2021-12-06 19:56     ` Pablo Neira Ayuso [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ya5q4opTEInI9b28@salvia \
    --to=pablo@netfilter.org \
    --cc=crosser@average.org \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.