* [iptables PATCH RFC] nft: Set socket receive buffer
@ 2019-07-02 15:12 Phil Sutter
2019-07-02 17:26 ` Pablo Neira Ayuso
0 siblings, 1 reply; 5+ messages in thread
From: Phil Sutter @ 2019-07-02 15:12 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
When trying to delete user-defined chains in a large ruleset,
iptables-nft aborts with "No buffer space available". This can be
reproduced using the following script:
| #! /bin/bash
| iptables-nft-restore <(
|
| echo "*filter"
| for i in $(seq 0 200000);do
| printf ":chain_%06x - [0:0]\n" $i
| done
| for i in $(seq 0 200000);do
| printf -- "-A INPUT -j chain_%06x\n" $i
| printf -- "-A INPUT -j chain_%06x\n" $i
| done
| echo COMMIT
|
| )
| iptables-nft -X
Note that calling 'iptables-nft -F' before the last call avoids the
issue. Also, correct behaviour is indicated by a different error
message, namely:
| iptables v1.8.3 (nf_tables): CHAIN_USER_DEL failed (Device or resource busy): chain chain_000000
The multiplier value was found by trial and error: it is the first one
that eliminated the ENOBUFS condition.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
iptables/nft.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/iptables/nft.c b/iptables/nft.c
index 2c61521455de8..529d5fb1bfac8 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -192,6 +192,7 @@ static void mnl_set_sndbuffer(const struct mnl_socket *nl,
struct nftnl_batch *batch)
{
int newbuffsiz;
+ int mult = 7;
if (nftnl_batch_iovec_len(batch) * BATCH_PAGE_SIZE <= nlbuffsiz)
return;
@@ -203,6 +204,12 @@ static void mnl_set_sndbuffer(const struct mnl_socket *nl,
&newbuffsiz, sizeof(socklen_t)) < 0)
return;
+ newbuffsiz *= mult;
+ if (setsockopt(mnl_socket_get_fd(nl), SOL_SOCKET, SO_RCVBUFFORCE,
+ &newbuffsiz, sizeof(socklen_t)) < 0)
+ return;
+ newbuffsiz /= mult;
+
nlbuffsiz = newbuffsiz;
}
--
2.21.0
* Re: [iptables PATCH RFC] nft: Set socket receive buffer
2019-07-02 15:12 [iptables PATCH RFC] nft: Set socket receive buffer Phil Sutter
@ 2019-07-02 17:26 ` Pablo Neira Ayuso
2019-07-02 18:03 ` [iptables PATCH] " Phil Sutter
2019-07-02 18:04 ` [iptables PATCH RFC] " Phil Sutter
0 siblings, 2 replies; 5+ messages in thread
From: Pablo Neira Ayuso @ 2019-07-02 17:26 UTC (permalink / raw)
To: Phil Sutter; +Cc: netfilter-devel
On Tue, Jul 02, 2019 at 05:12:01PM +0200, Phil Sutter wrote:
> When trying to delete user-defined chains in a large ruleset,
> iptables-nft aborts with "No buffer space available". This can be
> reproduced using the following script:
>
> | #! /bin/bash
> | iptables-nft-restore <(
> |
> | echo "*filter"
> | for i in $(seq 0 200000);do
> | printf ":chain_%06x - [0:0]\n" $i
> | done
> | for i in $(seq 0 200000);do
> | printf -- "-A INPUT -j chain_%06x\n" $i
> | printf -- "-A INPUT -j chain_%06x\n" $i
> | done
> | echo COMMIT
> |
> | )
> | iptables-nft -X
>
> Note that calling 'iptables-nft -F' before the last call avoids the
> issue. Also, correct behaviour is indicated by a different error
> message, namely:
>
> | iptables v1.8.3 (nf_tables): CHAIN_USER_DEL failed (Device or resource busy): chain chain_000000
>
> The multiplier value was found by trial and error: it is the first one
> that eliminated the ENOBUFS condition.
This is triggering a lot of errors (ack messages) to userspace.
Could you estimate the buffer size based on the number of commands?
mnl_batch_talk() is called before iterating over the list of commands,
so this number is already in place. Then, pass it to
mnl_nft_socket_sendmsg().
I'd suggest you add a mnl_set_rcvbuffer() too. You could assume that
getpagesize() is the maximum size for an acknowledgment.
Thanks.
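
The sizing rule suggested above (one page per acknowledgment, multiplied by the number of commands) can be sketched as follows. This is only an illustration under stated assumptions: estimate_rcvbuf() is a hypothetical helper that is not part of iptables, and the 4096-byte fallback is an assumption for the case where the page size cannot be queried. The clamp is there because setsockopt() takes the size as an int.

```c
#include <limits.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of iptables: estimate the receive buffer
 * needed for num_cmds acknowledgments, assuming each one fits in a page.
 * The result is clamped to INT_MAX because setsockopt() expects an int.
 */
static int estimate_rcvbuf(unsigned int num_cmds)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned long long want;

	if (page <= 0)
		page = 4096;	/* assumed fallback if the page size is unknown */

	want = (unsigned long long)page * num_cmds;
	return want > INT_MAX ? INT_MAX : (int)want;
}
```

For the reproducer in this thread, about 200,000 commands at 4 KiB per page give a request of roughly 800 MB, which still fits in an int.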
* [iptables PATCH] nft: Set socket receive buffer
2019-07-02 17:26 ` Pablo Neira Ayuso
@ 2019-07-02 18:03 ` Phil Sutter
2019-07-02 18:10 ` Pablo Neira Ayuso
2019-07-02 18:04 ` [iptables PATCH RFC] " Phil Sutter
1 sibling, 1 reply; 5+ messages in thread
From: Phil Sutter @ 2019-07-02 18:03 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
When trying to delete user-defined chains in a large ruleset,
iptables-nft aborts with "No buffer space available". This can be
reproduced using the following script:
| #! /bin/bash
| iptables-nft-restore <(
|
| echo "*filter"
| for i in $(seq 0 200000);do
| printf ":chain_%06x - [0:0]\n" $i
| done
| for i in $(seq 0 200000);do
| printf -- "-A INPUT -j chain_%06x\n" $i
| printf -- "-A INPUT -j chain_%06x\n" $i
| done
| echo COMMIT
|
| )
| iptables-nft -X
The problem seems to be the sheer number of netlink error messages sent
back to user space (one EBUSY for each chain). To solve this, set the
receive buffer size depending on the number of commands sent to the kernel.
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
iptables/nft.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/iptables/nft.c b/iptables/nft.c
index 2c61521455de8..b5613cd8e26ca 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -206,8 +206,24 @@ static void mnl_set_sndbuffer(const struct mnl_socket *nl,
nlbuffsiz = newbuffsiz;
}
+static int nlrcvbuffsiz;
+
+static void mnl_set_rcvbuffer(const struct mnl_socket *nl, int numcmds)
+{
+ int newbuffsiz = getpagesize() * numcmds;
+
+ if (newbuffsiz <= nlrcvbuffsiz)
+ return;
+
+ if (setsockopt(mnl_socket_get_fd(nl), SOL_SOCKET, SO_RCVBUFFORCE,
+ &newbuffsiz, sizeof(socklen_t)) < 0)
+ return;
+
+ nlrcvbuffsiz = newbuffsiz;
+}
+
static ssize_t mnl_nft_socket_sendmsg(const struct mnl_socket *nf_sock,
- struct nftnl_batch *batch)
+ struct nftnl_batch *batch, int numcmds)
{
static const struct sockaddr_nl snl = {
.nl_family = AF_NETLINK
@@ -222,13 +238,15 @@ static ssize_t mnl_nft_socket_sendmsg(const struct mnl_socket *nf_sock,
};
mnl_set_sndbuffer(nf_sock, batch);
+ mnl_set_rcvbuffer(nf_sock, numcmds);
nftnl_batch_iovec(batch, iov, iov_len);
return sendmsg(mnl_socket_get_fd(nf_sock), &msg, 0);
}
static int mnl_batch_talk(const struct mnl_socket *nf_sock,
- struct nftnl_batch *batch, struct list_head *err_list)
+ struct nftnl_batch *batch, int numcmds,
+ struct list_head *err_list)
{
const struct mnl_socket *nl = nf_sock;
int ret, fd = mnl_socket_get_fd(nl), portid = mnl_socket_get_portid(nl);
@@ -240,7 +258,7 @@ static int mnl_batch_talk(const struct mnl_socket *nf_sock,
};
int err = 0;
- ret = mnl_nft_socket_sendmsg(nf_sock, batch);
+ ret = mnl_nft_socket_sendmsg(nf_sock, batch, numcmds);
if (ret == -1)
return -1;
@@ -2917,7 +2935,7 @@ retry:
}
errno = 0;
- ret = mnl_batch_talk(h->nl, h->batch, &h->err_list);
+ ret = mnl_batch_talk(h->nl, h->batch, seq, &h->err_list);
if (ret && errno == ERESTART) {
nft_rebuild_cache(h);
--
2.21.0
* Re: [iptables PATCH RFC] nft: Set socket receive buffer
2019-07-02 17:26 ` Pablo Neira Ayuso
2019-07-02 18:03 ` [iptables PATCH] " Phil Sutter
@ 2019-07-02 18:04 ` Phil Sutter
1 sibling, 0 replies; 5+ messages in thread
From: Phil Sutter @ 2019-07-02 18:04 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Tue, Jul 02, 2019 at 07:26:15PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Jul 02, 2019 at 05:12:01PM +0200, Phil Sutter wrote:
> > When trying to delete user-defined chains in a large ruleset,
> > iptables-nft aborts with "No buffer space available". This can be
> > reproduced using the following script:
> >
> > | #! /bin/bash
> > | iptables-nft-restore <(
> > |
> > | echo "*filter"
> > | for i in $(seq 0 200000);do
> > | printf ":chain_%06x - [0:0]\n" $i
> > | done
> > | for i in $(seq 0 200000);do
> > | printf -- "-A INPUT -j chain_%06x\n" $i
> > | printf -- "-A INPUT -j chain_%06x\n" $i
> > | done
> > | echo COMMIT
> > |
> > | )
> > | iptables-nft -X
> >
> > Note that calling 'iptables-nft -F' before the last call avoids the
> > issue. Also, correct behaviour is indicated by a different error
> > message, namely:
> >
> > | iptables v1.8.3 (nf_tables): CHAIN_USER_DEL failed (Device or resource busy): chain chain_000000
> >
> > The multiplier value was found by trial and error: it is the first one
> > that eliminated the ENOBUFS condition.
>
> This is triggering a lot of errors (ack messages) to userspace.
>
> Could you estimate the buffer size based on the number of commands?
>
> mnl_batch_talk() is called before iterating over the list of commands,
> so this number is already in place. Then, pass it to
> mnl_nft_socket_sendmsg().
>
> I'd suggest you add a mnl_set_rcvbuffer() too. You could assume that
> getpagesize() is the maximum size for an acknowledgment.
Ah, I didn't realize that the size of the kernel's reply depends on the
number of commands sent, not on the batch size. Thanks for the tip, this
seems to work fine!
Thanks, Phil
* Re: [iptables PATCH] nft: Set socket receive buffer
2019-07-02 18:03 ` [iptables PATCH] " Phil Sutter
@ 2019-07-02 18:10 ` Pablo Neira Ayuso
0 siblings, 0 replies; 5+ messages in thread
From: Pablo Neira Ayuso @ 2019-07-02 18:10 UTC (permalink / raw)
To: Phil Sutter; +Cc: netfilter-devel
On Tue, Jul 02, 2019 at 08:03:19PM +0200, Phil Sutter wrote:
> When trying to delete user-defined chains in a large ruleset,
> iptables-nft aborts with "No buffer space available". This can be
> reproduced using the following script:
>
> | #! /bin/bash
> | iptables-nft-restore <(
> |
> | echo "*filter"
> | for i in $(seq 0 200000);do
> | printf ":chain_%06x - [0:0]\n" $i
> | done
> | for i in $(seq 0 200000);do
> | printf -- "-A INPUT -j chain_%06x\n" $i
> | printf -- "-A INPUT -j chain_%06x\n" $i
> | done
> | echo COMMIT
> |
> | )
> | iptables-nft -X
>
> The problem seems to be the sheer number of netlink error messages sent
> back to user space (one EBUSY for each chain). To solve this, set the
> receive buffer size depending on the number of commands sent to the kernel.
LGTM. One more change: make sure you reset
nlbuffsiz = 0
in nft_restart().
Thanks.
P.S.: A good follow-up would be to move these global variables into
the nft_handle object at some point.
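
A rough sketch of why the reset matters and what the suggested follow-up could look like. Everything here is hypothetical: struct buf_state stands in for fields that might be added to struct nft_handle, and the sketch assumes nft_restart() replaces the netlink socket, so that a size cached for the old socket would make the "already large enough" check skip setsockopt() on the new one.

```c
/* Hypothetical stand-in for buffer-size fields kept per handle. */
struct buf_state {
	int sndbuf;	/* last send buffer size applied to the socket */
	int rcvbuf;	/* last receive buffer size applied to the socket */
};

/* Same check as in the patch: only grow, never shrink. */
static int rcvbuf_should_grow(const struct buf_state *s, int want)
{
	return want > s->rcvbuf;
}

/* To be called whenever the socket is replaced, e.g. on restart. */
static void buf_state_reset(struct buf_state *s)
{
	s->sndbuf = 0;
	s->rcvbuf = 0;
}
```

Keeping the cached sizes next to the socket they describe means a reset cannot be forgotten when the handle is reinitialized as a whole.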