* [PATCH] net: tap: Drop the close of fds for child process
@ 2023-04-06 11:20 Bin Meng
2023-04-06 12:34 ` Philippe Mathieu-Daudé
2023-04-18 8:55 ` Daniel P. Berrangé
0 siblings, 2 replies; 4+ messages in thread
From: Bin Meng @ 2023-04-06 11:20 UTC (permalink / raw)
To: Jason Wang, qemu-devel; +Cc: Zhangjin Wu
Current codes using a brute-force traversal of all file descriptors
do not scale on a system where the maximum number of file descriptors
are set to a very large value (e.g.: in a Docker container of Manjaro
distribution it is set to 1073741816). QEMU just looks freezed during
start-up.
The close-on-exec flag was introduced since a faily old Linux kernel
(2.6.23). With recent newer kernels that QEMU supports, we don't need
to manually close the fds for child process as the proper O_CLOEXEC
flag should have been set properly on files that we don't want child
process to see.
Reported-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Bin Meng <bmeng@tinylab.org>
---
net/tap.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index 1bf085d422..49e1915484 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -446,13 +446,6 @@ static void launch_script(const char *setup_script, const char *ifname,
return;
}
if (pid == 0) {
- int open_max = sysconf(_SC_OPEN_MAX), i;
-
- for (i = 3; i < open_max; i++) {
- if (i != fd) {
- close(i);
- }
- }
parg = args;
*parg++ = (char *)setup_script;
*parg++ = (char *)ifname;
@@ -536,17 +529,10 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
return -1;
}
if (pid == 0) {
- int open_max = sysconf(_SC_OPEN_MAX), i;
char *fd_buf = NULL;
char *br_buf = NULL;
char *helper_cmd = NULL;
- for (i = 3; i < open_max; i++) {
- if (i != sv[1]) {
- close(i);
- }
- }
-
fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
if (strrchr(helper, ' ') || strrchr(helper, '\t')) {
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] net: tap: Drop the close of fds for child process
2023-04-06 11:20 [PATCH] net: tap: Drop the close of fds for child process Bin Meng
@ 2023-04-06 12:34 ` Philippe Mathieu-Daudé
2023-04-06 12:44 ` Bin Meng
2023-04-18 8:55 ` Daniel P. Berrangé
1 sibling, 1 reply; 4+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-06 12:34 UTC (permalink / raw)
To: Bin Meng, Jason Wang, qemu-devel; +Cc: Zhangjin Wu
On 6/4/23 13:20, Bin Meng wrote:
> Current codes using a brute-force traversal of all file descriptors
> do not scale on a system where the maximum number of file descriptors
> are set to a very large value (e.g.: in a Docker container of Manjaro
> distribution it is set to 1073741816). QEMU just looks freezed during
> start-up.
>
> The close-on-exec flag was introduced since a faily old Linux kernel
> (2.6.23). With recent newer kernels that QEMU supports, we don't need
> to manually close the fds for child process as the proper O_CLOEXEC
> flag should have been set properly on files that we don't want child
> process to see.
But this file is common to all POSIX implementations, not only Linux.
> Reported-by: Zhangjin Wu <falcon@tinylab.org>
> Signed-off-by: Bin Meng <bmeng@tinylab.org>
> ---
>
> net/tap.c | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/net/tap.c b/net/tap.c
> index 1bf085d422..49e1915484 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -446,13 +446,6 @@ static void launch_script(const char *setup_script, const char *ifname,
> return;
> }
> if (pid == 0) {
Maybe guard with #ifndef O_CLOEXEC
> - int open_max = sysconf(_SC_OPEN_MAX), i;
> -
> - for (i = 3; i < open_max; i++) {
> - if (i != fd) {
> - close(i);
> - }
> - }
or add qemu_close_cloexec() in util/osdep.c similar to qemu_open_cloexec()?
> parg = args;
> *parg++ = (char *)setup_script;
> *parg++ = (char *)ifname;
> @@ -536,17 +529,10 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
> return -1;
> }
> if (pid == 0) {
> - int open_max = sysconf(_SC_OPEN_MAX), i;
> char *fd_buf = NULL;
> char *br_buf = NULL;
> char *helper_cmd = NULL;
>
> - for (i = 3; i < open_max; i++) {
> - if (i != sv[1]) {
> - close(i);
> - }
> - }
> -
> fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
>
> if (strrchr(helper, ' ') || strrchr(helper, '\t')) {
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: tap: Drop the close of fds for child process
2023-04-06 12:34 ` Philippe Mathieu-Daudé
@ 2023-04-06 12:44 ` Bin Meng
0 siblings, 0 replies; 4+ messages in thread
From: Bin Meng @ 2023-04-06 12:44 UTC (permalink / raw)
To: Philippe Mathieu-Daudé; +Cc: Bin Meng, Jason Wang, qemu-devel, Zhangjin Wu
On Thu, Apr 6, 2023 at 8:34 PM Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> On 6/4/23 13:20, Bin Meng wrote:
> > Current codes using a brute-force traversal of all file descriptors
> > do not scale on a system where the maximum number of file descriptors
> > are set to a very large value (e.g.: in a Docker container of Manjaro
> > distribution it is set to 1073741816). QEMU just looks freezed during
> > start-up.
> >
> > The close-on-exec flag was introduced since a faily old Linux kernel
> > (2.6.23). With recent newer kernels that QEMU supports, we don't need
> > to manually close the fds for child process as the proper O_CLOEXEC
> > flag should have been set properly on files that we don't want child
> > process to see.
>
> But this file is common to all POSIX implementations, not only Linux.
Yes, this file is used for Linux, BSD and Solaris.
I checked that O_CLOEXEC is available on Linux (2.6.23), FreeBSD
(8.3), OpenBSD 5.0, Solaris 11. This flag is part of POSIX.1-2008.
Question is do we still need to support OSes that are older and do not
have this support?
>
> > Reported-by: Zhangjin Wu <falcon@tinylab.org>
> > Signed-off-by: Bin Meng <bmeng@tinylab.org>
> > ---
> >
> > net/tap.c | 14 --------------
> > 1 file changed, 14 deletions(-)
> >
> > diff --git a/net/tap.c b/net/tap.c
> > index 1bf085d422..49e1915484 100644
> > --- a/net/tap.c
> > +++ b/net/tap.c
> > @@ -446,13 +446,6 @@ static void launch_script(const char *setup_script, const char *ifname,
> > return;
> > }
> > if (pid == 0) {
>
> Maybe guard with #ifndef O_CLOEXEC
>
> > - int open_max = sysconf(_SC_OPEN_MAX), i;
> > -
> > - for (i = 3; i < open_max; i++) {
> > - if (i != fd) {
> > - close(i);
> > - }
> > - }
>
> or add qemu_close_cloexec() in util/osdep.c similar to qemu_open_cloexec()?
>
> > parg = args;
> > *parg++ = (char *)setup_script;
> > *parg++ = (char *)ifname;
> > @@ -536,17 +529,10 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
> > return -1;
> > }
> > if (pid == 0) {
> > - int open_max = sysconf(_SC_OPEN_MAX), i;
> > char *fd_buf = NULL;
> > char *br_buf = NULL;
> > char *helper_cmd = NULL;
> >
> > - for (i = 3; i < open_max; i++) {
> > - if (i != sv[1]) {
> > - close(i);
> > - }
> > - }
> > -
> > fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
> >
> > if (strrchr(helper, ' ') || strrchr(helper, '\t')) {
Regards,
Bin
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: tap: Drop the close of fds for child process
2023-04-06 11:20 [PATCH] net: tap: Drop the close of fds for child process Bin Meng
2023-04-06 12:34 ` Philippe Mathieu-Daudé
@ 2023-04-18 8:55 ` Daniel P. Berrangé
1 sibling, 0 replies; 4+ messages in thread
From: Daniel P. Berrangé @ 2023-04-18 8:55 UTC (permalink / raw)
To: Bin Meng; +Cc: Jason Wang, qemu-devel, Zhangjin Wu
On Thu, Apr 06, 2023 at 07:20:41PM +0800, Bin Meng wrote:
> Current codes using a brute-force traversal of all file descriptors
> do not scale on a system where the maximum number of file descriptors
> are set to a very large value (e.g.: in a Docker container of Manjaro
> distribution it is set to 1073741816). QEMU just looks freezed during
> start-up.
>
> The close-on-exec flag was introduced since a faily old Linux kernel
> (2.6.23). With recent newer kernels that QEMU supports, we don't need
> to manually close the fds for child process as the proper O_CLOEXEC
> flag should have been set properly on files that we don't want child
> process to see.
Even though O_CLOEXEC has existed for a long time, there is plenty
of code that doesn't use it reliably. While QEMU can control its
own code, we use a huge number of 3rd party libraries and we don't
trust them to reliably be using O_CLOEXEC on everything they open.
> Reported-by: Zhangjin Wu <falcon@tinylab.org>
> Signed-off-by: Bin Meng <bmeng@tinylab.org>
> ---
>
> net/tap.c | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/net/tap.c b/net/tap.c
> index 1bf085d422..49e1915484 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -446,13 +446,6 @@ static void launch_script(const char *setup_script, const char *ifname,
> return;
> }
> if (pid == 0) {
> - int open_max = sysconf(_SC_OPEN_MAX), i;
> -
> - for (i = 3; i < open_max; i++) {
> - if (i != fd) {
> - close(i);
> - }
> - }
> parg = args;
> *parg++ = (char *)setup_script;
> *parg++ = (char *)ifname;
> @@ -536,17 +529,10 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
> return -1;
> }
> if (pid == 0) {
> - int open_max = sysconf(_SC_OPEN_MAX), i;
> char *fd_buf = NULL;
> char *br_buf = NULL;
> char *helper_cmd = NULL;
>
> - for (i = 3; i < open_max; i++) {
> - if (i != sv[1]) {
> - close(i);
> - }
> - }
BSD has closefrom(3) we could use here, while modern Linux has
close_range(3, open_max)
We should probe for those two funtions and use them preferentially,
only falling back to the current manual loop where they don't exist.
> -
> fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
>
> if (strrchr(helper, ' ') || strrchr(helper, '\t')) {
> --
> 2.34.1
>
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-04-18 8:56 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-06 11:20 [PATCH] net: tap: Drop the close of fds for child process Bin Meng
2023-04-06 12:34 ` Philippe Mathieu-Daudé
2023-04-06 12:44 ` Bin Meng
2023-04-18 8:55 ` Daniel P. Berrangé
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).