* Suboptimal error handling in libnftables @ 2021-12-02 13:16 Eugene Crosser 2021-12-02 13:54 ` Pablo Neira Ayuso 0 siblings, 1 reply; 6+ messages in thread From: Eugene Crosser @ 2021-12-02 13:16 UTC (permalink / raw) To: netfilter-devel@vger.kernel.org [-- Attachment #1.1: Type: text/plain, Size: 1483 bytes --] Hello, there is read-from-the-socket loop in src/iface.c line 90 (function iface_cache_update()), and it (and other places) call macro netlink_init_error() to report error. The function behind the macro is in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing a message to stderr. I see two problems with this: 1. All read-from-the-socket functions should be run in a loop, repeating if return code is -1 and errno is EINTR. I.e. EINTR should not be treated as an error, but as a condition that requires retry. 2. Library functions are not supposed to call exit() (or abort() for that matter). They are expected to return an error indication to the caller, who may have its own strategy for handling error conditions. Case in point, we have a daemon (in Python) that uses bindings to libnftables. It's a service responding to requests coming over a TCP connection, and it takes care to intercept any error situations and report them back. We discovered that under some conditions, it just closes the socket and goes away. This being a daemon, stderr was not immediately accessible; and even it it were, it is pretty hard to figure where did the message "iface.c:98: Unable to initialize Netlink socket: Interrupted system call" come from and why! There is another function that calls exit(), __netlink_abi_error(). I believe that even in such a harsh situation, exit() is not the right way to handle it. Thank you, Eugene [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Suboptimal error handling in libnftables 2021-12-02 13:16 Suboptimal error handling in libnftables Eugene Crosser @ 2021-12-02 13:54 ` Pablo Neira Ayuso 2021-12-02 14:03 ` Eugene Crosser 2021-12-06 16:58 ` Eugene Crosser 0 siblings, 2 replies; 6+ messages in thread From: Pablo Neira Ayuso @ 2021-12-02 13:54 UTC (permalink / raw) To: Eugene Crosser; +Cc: netfilter-devel@vger.kernel.org On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote: > Hello, > > there is read-from-the-socket loop in src/iface.c line 90 (function > iface_cache_update()), and it (and other places) call macro > netlink_init_error() to report error. The function behind the macro is > in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing > a message to stderr. > > I see two problems with this: > > 1. All read-from-the-socket functions should be run in a loop, repeating > if return code is -1 and errno is EINTR. I.e. EINTR should not be > treated as an error, but as a condition that requires retry. > > 2. Library functions are not supposed to call exit() (or abort() for > that matter). They are expected to return an error indication to the > caller, who may have its own strategy for handling error conditions. > > Case in point, we have a daemon (in Python) that uses bindings to > libnftables. It's a service responding to requests coming over a TCP > connection, and it takes care to intercept any error situations and > report them back. We discovered that under some conditions, it just > closes the socket and goes away. This being a daemon, stderr was not > immediately accessible; and even it it were, it is pretty hard to figure > where did the message "iface.c:98: Unable to initialize Netlink socket: > Interrupted system call" come from and why! This missing EINTR handling for iface_cache_update() is a bug, would you post a patch for this? > There is another function that calls exit(), __netlink_abi_error(). I > believe that even in such a harsh situation, exit() is not the right way > to handle it. ABI breakage between kernel and userspace should not ever happen. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Suboptimal error handling in libnftables 2021-12-02 13:54 ` Pablo Neira Ayuso @ 2021-12-02 14:03 ` Eugene Crosser 2021-12-02 15:50 ` Pablo Neira Ayuso 2021-12-06 16:58 ` Eugene Crosser 1 sibling, 1 reply; 6+ messages in thread From: Eugene Crosser @ 2021-12-02 14:03 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel@vger.kernel.org [-- Attachment #1.1: Type: text/plain, Size: 1040 bytes --] Hello Pablo, On 02/12/2021 14:54, Pablo Neira Ayuso wrote: >> 1. All read-from-the-socket functions should be run in a loop, repeating >> if return code is -1 and errno is EINTR. I.e. EINTR should not be >> treated as an error, but as a condition that requires retry. [...]> This missing EINTR handling for iface_cache_update() is a bug, would > you post a patch for this? I have a patch that is currently under our internal testing. Will post it here once I get the results of testing. >> There is another function that calls exit(), __netlink_abi_error(). I >> believe that even in such a harsh situation, exit() is not the right way >> to handle it. > > ABI breakage between kernel and userspace should not ever happen. Well, maybe at least use abort() then? It's better to have a dump with a stack trace than have the process silently terminate. Libnftables may be deep down the stack of dependencies, it can be hard to find the source of the problem from just an stderr message. Best regards, Eugene [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Suboptimal error handling in libnftables 2021-12-02 14:03 ` Eugene Crosser @ 2021-12-02 15:50 ` Pablo Neira Ayuso 0 siblings, 0 replies; 6+ messages in thread From: Pablo Neira Ayuso @ 2021-12-02 15:50 UTC (permalink / raw) To: Eugene Crosser; +Cc: netfilter-devel@vger.kernel.org On Thu, Dec 02, 2021 at 03:03:04PM +0100, Eugene Crosser wrote: > Hello Pablo, > > On 02/12/2021 14:54, Pablo Neira Ayuso wrote: > > >> 1. All read-from-the-socket functions should be run in a loop, repeating > >> if return code is -1 and errno is EINTR. I.e. EINTR should not be > >> treated as an error, but as a condition that requires retry. > [...]> This missing EINTR handling for iface_cache_update() is a bug, would > > you post a patch for this? > > I have a patch that is currently under our internal testing. Will post > it here once I get the results of testing. > > >> There is another function that calls exit(), __netlink_abi_error(). I > >> believe that even in such a harsh situation, exit() is not the right way > >> to handle it. > > > > ABI breakage between kernel and userspace should not ever happen. > > Well, maybe at least use abort() then? It's better to have a dump with a > stack trace than have the process silently terminate. Libnftables may be > deep down the stack of dependencies, it can be hard to find the source > of the problem from just an stderr message. Please post a patch to use abort() in this ABI breakage case too. Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Suboptimal error handling in libnftables 2021-12-02 13:54 ` Pablo Neira Ayuso 2021-12-02 14:03 ` Eugene Crosser @ 2021-12-06 16:58 ` Eugene Crosser 2021-12-06 19:56 ` Pablo Neira Ayuso 1 sibling, 1 reply; 6+ messages in thread From: Eugene Crosser @ 2021-12-06 16:58 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel@vger.kernel.org [-- Attachment #1.1: Type: text/plain, Size: 2728 bytes --] On 02/12/2021 14:54, Pablo Neira Ayuso wrote: > On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote: >> Hello, >> >> there is read-from-the-socket loop in src/iface.c line 90 (function >> iface_cache_update()), and it (and other places) call macro >> netlink_init_error() to report error. The function behind the macro is >> in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing >> a message to stderr. >> >> I see two problems with this: >> >> 1. All read-from-the-socket functions should be run in a loop, repeating >> if return code is -1 and errno is EINTR. I.e. EINTR should not be >> treated as an error, but as a condition that requires retry. >> >> 2. Library functions are not supposed to call exit() (or abort() for >> that matter). They are expected to return an error indication to the >> caller, who may have its own strategy for handling error conditions. >> >> Case in point, we have a daemon (in Python) that uses bindings to >> libnftables. It's a service responding to requests coming over a TCP >> connection, and it takes care to intercept any error situations and >> report them back. We discovered that under some conditions, it just >> closes the socket and goes away. This being a daemon, stderr was not >> immediately accessible; and even it it were, it is pretty hard to figure >> where did the message "iface.c:98: Unable to initialize Netlink socket: >> Interrupted system call" come from and why! > > This missing EINTR handling for iface_cache_update() is a bug, would > you post a patch for this? It looks like there is more than just a missing retry when socket-receive returns EINTR. In the code in src/iface.c between lines 87 and 98, EINTR may come from one of two functions: mnl_socket_recvfrom() and mnl_cb_run(). If it is returned by mnl_socket_recvfrom(), the correct course of action is to blindly call that function again. But when it is returned by mnl_cb_run(), the meaning is different. mnl_cb_run() retruns -1 and sets errno=EINTR when netlink message contained NLM_F_DUMP_INTR. I assume (though I am not sure) that NLM_F_DUMP_INTR means that the data that is being transferred has changed while transfer was only partially complete, and the user is advised to restart _the whole dump process_ by sending a new NLM_F_DUMP request message. (Arguably, libmnl ought to report such situation with some other indication, not EINTR.) In any case, I believe that the aforementioned code should handle both of these two "need to retry" cases. I our tests it looks like we are hitting NLM_F_DUMP_INTR rather then interrupted socket recv(). I will report back after this is verified. Best regards, Eugene [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Suboptimal error handling in libnftables 2021-12-06 16:58 ` Eugene Crosser @ 2021-12-06 19:56 ` Pablo Neira Ayuso 0 siblings, 0 replies; 6+ messages in thread From: Pablo Neira Ayuso @ 2021-12-06 19:56 UTC (permalink / raw) To: Eugene Crosser; +Cc: netfilter-devel@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 3168 bytes --] On Mon, Dec 06, 2021 at 05:58:25PM +0100, Eugene Crosser wrote: > On 02/12/2021 14:54, Pablo Neira Ayuso wrote: > > On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote: > >> Hello, > >> > >> there is read-from-the-socket loop in src/iface.c line 90 (function > >> iface_cache_update()), and it (and other places) call macro > >> netlink_init_error() to report error. The function behind the macro is > >> in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing > >> a message to stderr. > >> > >> I see two problems with this: > >> > >> 1. All read-from-the-socket functions should be run in a loop, repeating > >> if return code is -1 and errno is EINTR. I.e. EINTR should not be > >> treated as an error, but as a condition that requires retry. > >> > >> 2. Library functions are not supposed to call exit() (or abort() for > >> that matter). They are expected to return an error indication to the > >> caller, who may have its own strategy for handling error conditions. > >> > >> Case in point, we have a daemon (in Python) that uses bindings to > >> libnftables. It's a service responding to requests coming over a TCP > >> connection, and it takes care to intercept any error situations and > >> report them back. We discovered that under some conditions, it just > >> closes the socket and goes away. This being a daemon, stderr was not > >> immediately accessible; and even it it were, it is pretty hard to figure > >> where did the message "iface.c:98: Unable to initialize Netlink socket: > >> Interrupted system call" come from and why! > > > > This missing EINTR handling for iface_cache_update() is a bug, would > > you post a patch for this? > > It looks like there is more than just a missing retry when > socket-receive returns EINTR. In the code in src/iface.c between lines > 87 and 98, EINTR may come from one of two functions: > mnl_socket_recvfrom() and mnl_cb_run(). If it is returned by > mnl_socket_recvfrom(), the correct course of action is to blindly call > that function again. But when it is returned by mnl_cb_run(), the > meaning is different. mnl_cb_run() retruns -1 and sets errno=EINTR when > netlink message contained NLM_F_DUMP_INTR. I assume (though I am not > sure) that NLM_F_DUMP_INTR means that the data that is being transferred > has changed while transfer was only partially complete, and the user is > advised to restart _the whole dump process_ by sending a new NLM_F_DUMP > request message. (Arguably, libmnl ought to report such situation with > some other indication, not EINTR.) In any case, I believe that the > aforementioned code should handle both of these two "need to retry" cases. > > I our tests it looks like we are hitting NLM_F_DUMP_INTR rather then > interrupted socket recv(). I will report back after this is verified. NLM_F_DUMP_INTR means that the existing netlink dump is stale (an update happened while dumping the listing to userspace), so userspace has to restart to get a consistent listing from the kernel. Yes, in both cases, either signal interruped syscall or NLM_F_DUMP_INTR, libnftables should retry. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-12-06 19:56 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-12-02 13:16 Suboptimal error handling in libnftables Eugene Crosser 2021-12-02 13:54 ` Pablo Neira Ayuso 2021-12-02 14:03 ` Eugene Crosser 2021-12-02 15:50 ` Pablo Neira Ayuso 2021-12-06 16:58 ` Eugene Crosser 2021-12-06 19:56 ` Pablo Neira Ayuso
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.