From: Jason Wang <jasowang@redhat.com>
To: Bin Meng <bmeng@tinylab.org>
Cc: qemu-devel@nongnu.org,
	Richard Henderson <richard.henderson@linaro.org>,
	 Zhangjin Wu <falcon@tinylab.org>
Subject: Re: [PATCH v4 6/6] net: tap: Use qemu_close_range() to close fds
Date: Fri, 30 Jun 2023 12:52:14 +0800
Message-ID: <CACGkMEu_h4-DMMOY+=wmL0LeTWAELeQPiBjucEEG=ud4EtuLSw@mail.gmail.com>
In-Reply-To: <20230628152726.110295-7-bmeng@tinylab.org>

On Wed, Jun 28, 2023 at 11:29 PM Bin Meng <bmeng@tinylab.org> wrote:
>
> From: Zhangjin Wu <falcon@tinylab.org>
>
> The current code uses a brute-force traversal of all file descriptors,
> which does not scale on a system where the maximum number of file
> descriptors is set to a very large value (e.g., in a Docker container
> running the Manjaro distribution it is set to 1073741816). QEMU appears
> frozen during start-up.
>
> The close-on-exec flag (O_CLOEXEC) has been available since Linux
> kernel 2.6.23, FreeBSD 8.3, OpenBSD 5.0 and Solaris 11. While it's true
> that QEMU doesn't need to manually close the fds for the child process,
> because O_CLOEXEC should already be set on the files its own code
> opens, QEMU uses a huge number of third-party libraries and we don't
> trust them to reliably use O_CLOEXEC on everything they open.
>
> Modern Linux and BSDs provide the close_range() call, which does the
> job, and on Linux we can alternatively walk through /proc/self/fd to
> complete the task efficiently. This is what qemu_close_range() does.
>
> Reported-by: Zhangjin Wu <falcon@tinylab.org>
> Co-developed-by: Bin Meng <bmeng@tinylab.org>
> Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
> Signed-off-by: Bin Meng <bmeng@tinylab.org>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
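
For readers following along, here is a minimal sketch of the
/proc/self/fd approach described in the commit message above. It is not
the qemu_close_range() implementation from patch 4/6; the helper name
close_fds_except() and its keep_fd parameter are assumptions made for
illustration only. The point is that the cost depends on how many
descriptors are actually open, not on the value of _SC_OPEN_MAX.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU's qemu_close_range(): close every open
 * descriptor in [first, last] except keep_fd.
 */
static void close_fds_except(int first, int last, int keep_fd)
{
    DIR *dir = opendir("/proc/self/fd");

    if (dir == NULL) {
        /* No /proc available: fall back to trying every fd in the range. */
        for (long fd = first; fd <= last; fd++) {
            if (fd != keep_fd) {
                close((int)fd);
            }
        }
        return;
    }

    int dir_fd = dirfd(dir);    /* the directory listing uses an fd too */
    struct dirent *de;

    while ((de = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(de->d_name, &end, 10);

        if (end == de->d_name || *end != '\0') {
            continue;           /* skip "." and ".." */
        }
        if (fd < first || fd > last || fd == keep_fd || fd == dir_fd) {
            continue;
        }
        close((int)fd);
    }
    closedir(dir);
}
```

In a forked child this would be called as
close_fds_except(3, sysconf(_SC_OPEN_MAX) - 1, fd), which keeps stdin,
stdout, stderr and the one descriptor the helper needs. Unlike the two
range calls in the diff below, a single call with an explicit keep_fd
has no "fd - 1" or "fd + 1" arithmetic to get wrong when fd is small.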

The patch looks good, but I'm not sure that using helper scripts is a
good idea in a production environment, since it increases the attack
surface. Passing a TAP fd should be a better way.

Thanks

>
> ---
>
> Changes in v4:
> - put fd on its own line
>
> Changes in v2:
> - Change to use qemu_close_range() to close fds for child process efficiently
> - v1 link: https://lore.kernel.org/qemu-devel/20230406112041.798585-1-bmeng@tinylab.org/
>
>  net/tap.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/net/tap.c b/net/tap.c
> index 1bf085d422..9f080215f0 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -446,13 +446,13 @@ static void launch_script(const char *setup_script, const char *ifname,
>          return;
>      }
>      if (pid == 0) {
> -        int open_max = sysconf(_SC_OPEN_MAX), i;
> +        unsigned int last_fd = sysconf(_SC_OPEN_MAX) - 1;
> +
> +        /* skip stdin, stdout and stderr */
> +        qemu_close_range(3, fd - 1);
> +        /* skip the currently used fd */
> +        qemu_close_range(fd + 1, last_fd);
>
> -        for (i = 3; i < open_max; i++) {
> -            if (i != fd) {
> -                close(i);
> -            }
> -        }
>          parg = args;
>          *parg++ = (char *)setup_script;
>          *parg++ = (char *)ifname;
> @@ -536,16 +536,16 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
>          return -1;
>      }
>      if (pid == 0) {
> -        int open_max = sysconf(_SC_OPEN_MAX), i;
> +        unsigned int last_fd = sysconf(_SC_OPEN_MAX) - 1;
> +        unsigned int fd = sv[1];
>          char *fd_buf = NULL;
>          char *br_buf = NULL;
>          char *helper_cmd = NULL;
>
> -        for (i = 3; i < open_max; i++) {
> -            if (i != sv[1]) {
> -                close(i);
> -            }
> -        }
> +        /* skip stdin, stdout and stderr */
> +        qemu_close_range(3, fd - 1);
> +        /* skip the currently used fd */
> +        qemu_close_range(fd + 1, last_fd);
>
>          fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
>
> --
> 2.34.1
>




Thread overview: 16+ messages
2023-06-28 15:27 [PATCH v4 0/6] net/tap: Fix QEMU frozen issue when the maximum number of file descriptors is very large Bin Meng
2023-06-28 15:27 ` [PATCH v4 1/6] tests/tcg/cris: Fix the coding style Bin Meng
2023-06-28 15:27 ` [PATCH v4 2/6] tests/tcg/cris: Correct the off-by-one error Bin Meng
2023-06-28 15:27 ` [PATCH v4 3/6] util/async-teardown: Fall back to close fds one by one Bin Meng
2023-06-28 15:27 ` [PATCH v4 4/6] util/osdep: Introduce qemu_close_range() Bin Meng
2023-07-07 14:40   ` Markus Armbruster
2023-06-28 15:27 ` [PATCH v4 5/6] util/async-teardown: Use qemu_close_range() to close fds Bin Meng
2023-07-07 14:40   ` Markus Armbruster
2023-06-28 15:27 ` [PATCH v4 6/6] net: tap: " Bin Meng
2023-06-30  4:52   ` Jason Wang [this message]
2023-06-29  8:33 ` [PATCH v4 0/6] net/tap: Fix QEMU frozen issue when the maximum number of file descriptors is very large Michael Tokarev
2023-06-29  9:05   ` Daniel P. Berrangé
2023-07-09 15:47 ` Bin Meng
2023-07-10  3:05   ` Jason Wang
2023-07-10  6:07     ` Markus Armbruster
2023-07-11  2:40       ` Jason Wang
