From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Phil Sutter <phil@nwl.cc>, Florian Westphal <fw@strlen.de>,
netfilter-devel@vger.kernel.org
Subject: Re: nfnetlink: Busy-loop in nfnetlink_rcv_msg()
Date: Thu, 27 Aug 2020 19:50:11 +0200 [thread overview]
Message-ID: <20200827175011.GA24542@salvia> (raw)
In-Reply-To: <20200827142319.GL23632@orbyte.nwl.cc>
Hi Phil,
On Thu, Aug 27, 2020 at 04:23:19PM +0200, Phil Sutter wrote:
> Hi Pablo,
>
> On Wed, Aug 26, 2020 at 05:32:19PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > There several possibilities, just a few that can be explored:
[...]
> > Note that userspace requires no changes to support batching mode:
> > libmnl's mnl_cb_run() keeps iterating over the buffer that was
> > received until all netlink messages are consumed.
> >
> > The quick patch is incomplete, I just want to prove the saving in
> > terms of memory. I'll give it another spin and submit this for
> > review.
>
> Highly appreciated. Put me in Cc and I'll stress-test it a bit.
Let's give a try to this patch:
https://patchwork.ozlabs.org/project/netfilter-devel/patch/20200827172842.24478-1-pablo@netfilter.org/
> > * Probably recover the cookie idea: firewalld specifies a cookie
> > that identifies the rule from userspace, so there is no need for
> > NLM_F_ECHO. Eric mentioned he would like to delete rules by cookie,
> > this should be possible by looking up for the handle from the
> > cookie to delete rules. The cookie is stored in NFTA_RULE_USERDATA
> > so this is only meaningful to userspace. No kernel changes are
> > required (this is supported for a bit of time already).
> >
> > Note: The cookie could be duplicated. Since the cookie is allocated
> > by userspace, it's up to userspace to ensure uniqueness. In case
> > cookies are duplicated, if you delete a rule by cookie then
> >
> > Rule deletion by cookie requires dumping the whole ruleset though
> > (slow). However, I think firewalld keeps a rule cache, so it uses
> > the rule handle to delete the rule instead (no need to dump the
> > cache then). Therefore, the cookie is only used to fetch the rule
> > handle.
> >
> > With the rule cookie in place, firewalld can send the batch, then
> > make a NLM_F_DUMP request after sending the batch to retrieve the
> > handle from the cookie instead of using NLM_F_ECHO. The batch send +
> > NLM_F_DUMP barely consumes 16KBytes per recvmsg() call. The send +
> > dump is not atomic though.
> >
> > Because NLM_F_ECHO is only used to get back the rule handle, right?
>
> The discussion that led to NLM_F_ECHO was to support atomic rule handle
> retrieval. A user-defined "pseudo-handle" obviously can't uphold that
> promise and therefore won't be a full replacement.
>
> Apart from that, JSON echo output in it's current form is useful for
> scripts to retrieve the handle. We've had that discussion already, I
> pointed out they could just do 'input = output' and know
> 'input["nftables"][5]["add"]["rule"]["expr"][0]["match"]["right"]' is
> still present and has the expected value.
>
> I recently found out that firewalld (e.g.) doesn't even do that, but
> instead manually iterates over the list of commands it got back and
> extracts handle values.
>
> Doing that without NLM_F_ECHO but with cookies instead means to add
> some rules, then list ruleset and search for one's cookies.
>
> That aside, if nftables doesn't support cookies beyond keeping them in
> place, why not just use a custom comment instead? That's even
> backwards-compatible.
Something else to improve netlink socket receiver usage: Userspace
might also set on NLM_F_ECHO for rules only, so the kernel only
consumes the netlink receive socket buffer for these reports. Although
not sure how to expose this yet through the library in a nice way...
Probably it should be possible to extend netlink to filter out
attributes that do not need to be reported back to userspace via
NLM_F_ECHO...
The main gain is to amortize skbuff cost, ie. increasing the batching
factor, but going over NLMSG_GOODSIZE is tricky. We have to update
userspace (provide larger buffer) and revisit if this is possible
without breaking backward compatibility.
BTW, if NLM_F_ECHO results in ENOSPC, the ruleset has been applied,
it's just that the socket buffer got full, a fallback to NLM_F_DUMP is
probably an option in that case? But I understand the goal is to not
hit ENOSPC.
Let me know if my patch helps mitigate the problem, thanks.
next prev parent reply other threads:[~2020-08-27 17:50 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-21 23:06 nfnetlink: Busy-loop in nfnetlink_rcv_msg() Phil Sutter
2020-08-22 18:46 ` Florian Westphal
2020-08-24 10:47 ` Pablo Neira Ayuso
2020-08-24 12:39 ` Florian Westphal
2020-08-24 13:11 ` Phil Sutter
2020-08-26 15:32 ` Pablo Neira Ayuso
2020-08-26 18:54 ` Eric Garver
2020-08-27 14:23 ` Phil Sutter
2020-08-27 17:50 ` Pablo Neira Ayuso [this message]
2020-08-23 12:04 ` Pablo Neira Ayuso
-- strict thread matches above, loose matches on Subject: below --
2020-08-23 11:55 [PATCH nf] netfilter: nfnetlink: nfnetlink_unicast() reports EAGAIN instead of ENOBUFS Pablo Neira Ayuso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200827175011.GA24542@salvia \
--to=pablo@netfilter.org \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=phil@nwl.cc \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).