Re: nfnetlink: Busy-loop in nfnetlink_rcv_msg()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Phil Sutter <phil@nwl.cc>, Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org
Subject: Re: nfnetlink: Busy-loop in nfnetlink_rcv_msg()
Date: Thu, 27 Aug 2020 19:50:11 +0200	[thread overview]
Message-ID: <20200827175011.GA24542@salvia> (raw)
In-Reply-To: <20200827142319.GL23632@orbyte.nwl.cc>

Hi Phil,

On Thu, Aug 27, 2020 at 04:23:19PM +0200, Phil Sutter wrote:
> Hi Pablo,
> 
> On Wed, Aug 26, 2020 at 05:32:19PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > There several possibilities, just a few that can be explored:
[...]
> >   Note that userspace requires no changes to support batching mode:
> >   libmnl's mnl_cb_run() keeps iterating over the buffer that was
> >   received until all netlink messages are consumed.
> > 
> >   The quick patch is incomplete, I just want to prove the saving in
> >   terms of memory. I'll give it another spin and submit this for
> >   review.
> 
> Highly appreciated. Put me in Cc and I'll stress-test it a bit.

Let's give a try to this patch:

https://patchwork.ozlabs.org/project/netfilter-devel/patch/20200827172842.24478-1-pablo@netfilter.org/

> > * Probably recover the cookie idea: firewalld specifies a cookie
> >   that identifies the rule from userspace, so there is no need for
> >   NLM_F_ECHO. Eric mentioned he would like to delete rules by cookie,
> >   this should be possible by looking up for the handle from the
> >   cookie to delete rules. The cookie is stored in NFTA_RULE_USERDATA
> >   so this is only meaningful to userspace. No kernel changes are
> >   required (this is supported for a bit of time already).
> > 
> >   Note: The cookie could be duplicated. Since the cookie is allocated
> >   by userspace, it's up to userspace to ensure uniqueness. In case
> >   cookies are duplicated, if you delete a rule by cookie then
> > 
> >   Rule deletion by cookie requires dumping the whole ruleset though
> >   (slow). However, I think firewalld keeps a rule cache, so it uses
> >   the rule handle to delete the rule instead (no need to dump the
> >   cache then). Therefore, the cookie is only used to fetch the rule
> >   handle.
> > 
> >   With the rule cookie in place, firewalld can send the batch, then
> >   make a NLM_F_DUMP request after sending the batch to retrieve the
> >   handle from the cookie instead of using NLM_F_ECHO. The batch send +
> >   NLM_F_DUMP barely consumes 16KBytes per recvmsg() call. The send +
> >   dump is not atomic though.
> > 
> >   Because NLM_F_ECHO is only used to get back the rule handle, right?
> 
> The discussion that led to NLM_F_ECHO was to support atomic rule handle
> retrieval. A user-defined "pseudo-handle" obviously can't uphold that
> promise and therefore won't be a full replacement.
>
> Apart from that, JSON echo output in it's current form is useful for
> scripts to retrieve the handle. We've had that discussion already, I
> pointed out they could just do 'input = output' and know
> 'input["nftables"][5]["add"]["rule"]["expr"][0]["match"]["right"]' is
> still present and has the expected value.
>
> I recently found out that firewalld (e.g.) doesn't even do that, but
> instead manually iterates over the list of commands it got back and
> extracts handle values.
>
> Doing that without NLM_F_ECHO but with cookies instead means to add
> some rules, then list ruleset and search for one's cookies.
> 
> That aside, if nftables doesn't support cookies beyond keeping them in
> place, why not just use a custom comment instead? That's even
> backwards-compatible.

Something else to improve netlink socket receiver usage: Userspace
might also set on NLM_F_ECHO for rules only, so the kernel only
consumes the netlink receive socket buffer for these reports. Although
not sure how to expose this yet through the library in a nice way...

Probably it should be possible to extend netlink to filter out
attributes that do not need to be reported back to userspace via
NLM_F_ECHO...

The main gain is to amortize skbuff cost, ie. increasing the batching
factor, but going over NLMSG_GOODSIZE is tricky. We have to update
userspace (provide larger buffer) and revisit if this is possible
without breaking backward compatibility.

BTW, if NLM_F_ECHO results in ENOSPC, the ruleset has been applied,
it's just that the socket buffer got full, a fallback to NLM_F_DUMP is
probably an option in that case? But I understand the goal is to not
hit ENOSPC.

Let me know if my patch helps mitigate the problem, thanks.

next prev parent reply	other threads:[~2020-08-27 17:50 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-21 23:06 nfnetlink: Busy-loop in nfnetlink_rcv_msg() Phil Sutter
2020-08-22 18:46 ` Florian Westphal
2020-08-24 10:47   ` Pablo Neira Ayuso
2020-08-24 12:39     ` Florian Westphal
2020-08-24 13:11   ` Phil Sutter
2020-08-26 15:32     ` Pablo Neira Ayuso
2020-08-26 18:54       ` Eric Garver
2020-08-27 14:23       ` Phil Sutter
2020-08-27 17:50         ` Pablo Neira Ayuso [this message]
2020-08-23 12:04 ` Pablo Neira Ayuso
  -- strict thread matches above, loose matches on Subject: below --
2020-08-23 11:55 [PATCH nf] netfilter: nfnetlink: nfnetlink_unicast() reports EAGAIN instead of ENOBUFS Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200827175011.GA24542@salvia \
    --to=pablo@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=phil@nwl.cc \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.