netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH nft] Improve handling of errors from mnl* functions"
@ 2021-12-08 13:49 Eugene Crosser
  2021-12-08 13:49 ` [PATCH nft 1/2] Use abort() in case of netlink_abi_error Eugene Crosser
  2021-12-08 13:49 ` [PATCH nft 2/2] Handle retriable errors from mnl functions Eugene Crosser
  0 siblings, 2 replies; 6+ messages in thread
From: Eugene Crosser @ 2021-12-08 13:49 UTC (permalink / raw)
  To: netfilter-devel

Libnftables does not handle errors that libmnl functions may return
properly:

1. Retriable errors indicated by errno=EINTR are not retried, but
   rather treted as fatal.
2. Instead of reporting the error to the caller, functions call
   exit() on error, which terminates the caller process.

This patch set partly addresses the second point, by calling
abort() instead of exit() on ABI error, that will at least give
more information to the sysadmin than quiet termination of a
process.

It attempts to properly address the first point, by introducing
retry logic when mnl_socket_recvfrom() or mnl_cb_run() return
-1 with errno=EINTR.

It would be desirable to fully address the second point at some
future time, though it requires some redesign of the code structure.

 [PATCH nft 1/2] Use abort() in case of netlink_abi_error
 [PATCH nft 2/2] Handle retriable errors from mnl functions


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH nft 1/2] Use abort() in case of netlink_abi_error
  2021-12-08 13:49 [PATCH nft] Improve handling of errors from mnl* functions" Eugene Crosser
@ 2021-12-08 13:49 ` Eugene Crosser
  2021-12-08 13:49 ` [PATCH nft 2/2] Handle retriable errors from mnl functions Eugene Crosser
  1 sibling, 0 replies; 6+ messages in thread
From: Eugene Crosser @ 2021-12-08 13:49 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Eugene Crosser

Library functions should not use exit(), application that uses the
library may contain error handling path, that cannot be executed if
library functions calls exit(). For truly fatal errors, using abort() is
more acceptable than exit().

Signed-off-by: Eugene Crosser <crosser@average.org>
---
 src/netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/netlink.c b/src/netlink.c
index 359d801c..c25a7e79 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -59,7 +59,7 @@ void __noreturn __netlink_abi_error(const char *file, int line,
 {
 	fprintf(stderr, "E: Contact urgently your Linux kernel vendor. "
 		"Netlink ABI is broken: %s:%d %s\n", file, line, reason);
-	exit(NFT_EXIT_FAILURE);
+	abort();
 }
 
 int netlink_io_error(struct netlink_ctx *ctx, const struct location *loc,
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH nft 2/2] Handle retriable errors from mnl functions
  2021-12-08 13:49 [PATCH nft] Improve handling of errors from mnl* functions" Eugene Crosser
  2021-12-08 13:49 ` [PATCH nft 1/2] Use abort() in case of netlink_abi_error Eugene Crosser
@ 2021-12-08 13:49 ` Eugene Crosser
  2021-12-08 18:06   ` Pablo Neira Ayuso
  2021-12-08 18:11   ` Pablo Neira Ayuso
  1 sibling, 2 replies; 6+ messages in thread
From: Eugene Crosser @ 2021-12-08 13:49 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Eugene Crosser

rc == -1 and errno == EINTR mean:

mnl_socket_recvfrom() - blindly rerun the function
mnl_cb_run()          - restart dump request from scratch

This commit introduces handling of both these conditions

Signed-off-by: Eugene Crosser <crosser@average.org>
---
 src/iface.c | 73 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 44 insertions(+), 29 deletions(-)

diff --git a/src/iface.c b/src/iface.c
index d0e1834c..029f6476 100644
--- a/src/iface.c
+++ b/src/iface.c
@@ -66,39 +66,54 @@ void iface_cache_update(void)
 	struct nlmsghdr *nlh;
 	struct rtgenmsg *rt;
 	uint32_t seq, portid;
+	bool need_restart;
+	int retry_count = 5;
 	int ret;
 
-	nlh = mnl_nlmsg_put_header(buf);
-	nlh->nlmsg_type	= RTM_GETLINK;
-	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
-	nlh->nlmsg_seq = seq = time(NULL);
-	rt = mnl_nlmsg_put_extra_header(nlh, sizeof(struct rtgenmsg));
-	rt->rtgen_family = AF_PACKET;
-
-	nl = mnl_socket_open(NETLINK_ROUTE);
-	if (nl == NULL)
-		netlink_init_error();
-
-	if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0)
-		netlink_init_error();
-
-	portid = mnl_socket_get_portid(nl);
-
-	if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
-		netlink_init_error();
-
-	ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
-	while (ret > 0) {
-		ret = mnl_cb_run(buf, ret, seq, portid, data_cb, NULL);
-		if (ret <= MNL_CB_STOP)
-			break;
-		ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
-	}
-	if (ret == -1)
+	do {
+		nlh = mnl_nlmsg_put_header(buf);
+		nlh->nlmsg_type	= RTM_GETLINK;
+		nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
+		nlh->nlmsg_seq = seq = time(NULL);
+		rt = mnl_nlmsg_put_extra_header(nlh, sizeof(struct rtgenmsg));
+		rt->rtgen_family = AF_PACKET;
+
+		nl = mnl_socket_open(NETLINK_ROUTE);
+		if (nl == NULL)
+			netlink_init_error();
+
+		if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0)
+			netlink_init_error();
+
+		portid = mnl_socket_get_portid(nl);
+
+		if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
+			netlink_init_error();
+
+		need_restart = false;
+		while (true) {
+			ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
+			if (ret == -1) {
+				if (errno == EINTR)
+					continue;
+				else
+					netlink_init_error();
+			}
+			ret = mnl_cb_run(buf, ret, seq, portid, data_cb, NULL);
+			if (ret == MNL_CB_STOP)
+				break;
+			if (ret == -1) {
+				if (errno == EINTR)
+					need_restart = true;
+				else
+					netlink_init_error();
+			}
+		}
+		mnl_socket_close(nl);
+	} while (need_restart && retry_count--);
+	if (need_restart)
 		netlink_init_error();
 
-	mnl_socket_close(nl);
-
 	iface_cache_init = true;
 }
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH nft 2/2] Handle retriable errors from mnl functions
  2021-12-08 13:49 ` [PATCH nft 2/2] Handle retriable errors from mnl functions Eugene Crosser
@ 2021-12-08 18:06   ` Pablo Neira Ayuso
  2021-12-16 20:33     ` Eugene Crosser
  2021-12-08 18:11   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 6+ messages in thread
From: Pablo Neira Ayuso @ 2021-12-08 18:06 UTC (permalink / raw)
  To: Eugene Crosser; +Cc: netfilter-devel

On Wed, Dec 08, 2021 at 02:49:14PM +0100, Eugene Crosser wrote:
> rc == -1 and errno == EINTR mean:
> 
> mnl_socket_recvfrom() - blindly rerun the function
> mnl_cb_run()          - restart dump request from scratch
> 
> This commit introduces handling of both these conditions
> 
> Signed-off-by: Eugene Crosser <crosser@average.org>
> ---
>  src/iface.c | 73 ++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 44 insertions(+), 29 deletions(-)
> 
> diff --git a/src/iface.c b/src/iface.c
> index d0e1834c..029f6476 100644
> --- a/src/iface.c
> +++ b/src/iface.c
> @@ -66,39 +66,54 @@ void iface_cache_update(void)
>  	struct nlmsghdr *nlh;
>  	struct rtgenmsg *rt;
>  	uint32_t seq, portid;
> +	bool need_restart;
> +	int retry_count = 5;

Did you ever hit this retry count? What is you daemon going to do
after these retries?

Probably this can be made configurable for libraries in case you
prefer your daemon to give up after many retries, but, by default,
I'd prefer to to keep trying until you get a consistent cache from the
kernel via netlink dump.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nft 2/2] Handle retriable errors from mnl functions
  2021-12-08 13:49 ` [PATCH nft 2/2] Handle retriable errors from mnl functions Eugene Crosser
  2021-12-08 18:06   ` Pablo Neira Ayuso
@ 2021-12-08 18:11   ` Pablo Neira Ayuso
  1 sibling, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2021-12-08 18:11 UTC (permalink / raw)
  To: Eugene Crosser; +Cc: netfilter-devel

On Wed, Dec 08, 2021 at 02:49:14PM +0100, Eugene Crosser wrote:
> diff --git a/src/iface.c b/src/iface.c
> index d0e1834c..029f6476 100644
> --- a/src/iface.c
> +++ b/src/iface.c
> @@ -66,39 +66,54 @@ void iface_cache_update(void)
>  	struct nlmsghdr *nlh;
>  	struct rtgenmsg *rt;
>  	uint32_t seq, portid;
> +	bool need_restart;
> +	int retry_count = 5;
>  	int ret;
>  
> -	nlh = mnl_nlmsg_put_header(buf);
> -	nlh->nlmsg_type	= RTM_GETLINK;
> -	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> -	nlh->nlmsg_seq = seq = time(NULL);
> -	rt = mnl_nlmsg_put_extra_header(nlh, sizeof(struct rtgenmsg));
> -	rt->rtgen_family = AF_PACKET;
> -
> -	nl = mnl_socket_open(NETLINK_ROUTE);
> -	if (nl == NULL)
> -		netlink_init_error();
> -
> -	if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0)
> -		netlink_init_error();
> -
> -	portid = mnl_socket_get_portid(nl);
> -
> -	if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
> -		netlink_init_error();
> -
> -	ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
> -	while (ret > 0) {
> -		ret = mnl_cb_run(buf, ret, seq, portid, data_cb, NULL);
> -		if (ret <= MNL_CB_STOP)
> -			break;
> -		ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
> -	}
> -	if (ret == -1)
> +	do {
> +		nlh = mnl_nlmsg_put_header(buf);
> +		nlh->nlmsg_type	= RTM_GETLINK;
> +		nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> +		nlh->nlmsg_seq = seq = time(NULL);
> +		rt = mnl_nlmsg_put_extra_header(nlh, sizeof(struct rtgenmsg));
> +		rt->rtgen_family = AF_PACKET;
> +
> +		nl = mnl_socket_open(NETLINK_ROUTE);
> +		if (nl == NULL)
> +			netlink_init_error();
> +
> +		if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0)
> +			netlink_init_error();
> +
> +		portid = mnl_socket_get_portid(nl);
> +
> +		if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0)
> +			netlink_init_error();
> +
> +		need_restart = false;
> +		while (true) {
> +			ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
> +			if (ret == -1) {
> +				if (errno == EINTR)
> +					continue;
> +				else
> +					netlink_init_error();
> +			}
> +			ret = mnl_cb_run(buf, ret, seq, portid, data_cb, NULL);
> +			if (ret == MNL_CB_STOP)
> +				break;
> +			if (ret == -1) {
> +				if (errno == EINTR)
> +					need_restart = true;
> +				else
> +					netlink_init_error();
> +			}
> +		}
> +		mnl_socket_close(nl);

BTW, could you just rename iface_cache_update() to
__iface_cache_update() then add the loop to retry on EINTR? That would
skip this extra large indent in this patch.

Thanks.

> +	} while (need_restart && retry_count--);
> +	if (need_restart)
>  		netlink_init_error();
>  
> -	mnl_socket_close(nl);
> -
>  	iface_cache_init = true;
>  }
>  
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nft 2/2] Handle retriable errors from mnl functions
  2021-12-08 18:06   ` Pablo Neira Ayuso
@ 2021-12-16 20:33     ` Eugene Crosser
  0 siblings, 0 replies; 6+ messages in thread
From: Eugene Crosser @ 2021-12-16 20:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel


[-- Attachment #1.1: Type: text/plain, Size: 1331 bytes --]

Hello,

On 08/12/2021 19:06, Pablo Neira Ayuso wrote:

>> ---
>>  src/iface.c | 73 ++++++++++++++++++++++++++++++++---------------------
>>  1 file changed, 44 insertions(+), 29 deletions(-)
>>
>> diff --git a/src/iface.c b/src/iface.c
>> index d0e1834c..029f6476 100644
>> --- a/src/iface.c
>> +++ b/src/iface.c
>> @@ -66,39 +66,54 @@ void iface_cache_update(void)
>>  	struct nlmsghdr *nlh;
>>  	struct rtgenmsg *rt;
>>  	uint32_t seq, portid;
>> +	bool need_restart;
>> +	int retry_count = 5;
> 
> Did you ever hit this retry count? What is you daemon going to do
> after these retries?
> 
> Probably this can be made configurable for libraries in case you
> prefer your daemon to give up after many retries, but, by default,
> I'd prefer to to keep trying until you get a consistent cache from the
> kernel via netlink dump.
[...]
> BTW, could you just rename iface_cache_update() to
> __iface_cache_update() then add the loop to retry on EINTR? That would
> skip this extra large indent in this patch.


I have sent the new patches a week ago:

  [PATCH nft v2 0/2] Improve handling of errors from mnl* functions"
  [PATCH nft v2 1/2] Use abort() in case of netlink_abi_error
  [PATCH nft v2 2/2] Handle retriable errors from mnl functions

Do they look better now?

Thanks,

Eugene

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-16 20:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-12-08 13:49 [PATCH nft] Improve handling of errors from mnl* functions" Eugene Crosser
2021-12-08 13:49 ` [PATCH nft 1/2] Use abort() in case of netlink_abi_error Eugene Crosser
2021-12-08 13:49 ` [PATCH nft 2/2] Handle retriable errors from mnl functions Eugene Crosser
2021-12-08 18:06   ` Pablo Neira Ayuso
2021-12-16 20:33     ` Eugene Crosser
2021-12-08 18:11   ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).