From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: Laurent Vivier <laurent@vivier.eu>,
QEMU Trivial <qemu-trivial@nongnu.org>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [PATCH trivial 1/2] close_all_open_fd(): move to oslib-posix.c
Date: Fri, 26 Jan 2024 11:01:36 +0000
Message-ID: <ZbOREEOcbSCwex15@redhat.com>
In-Reply-To: <dbb90f22-d17d-4c40-8684-58ec976a014e@tls.msk.ru>

On Fri, Jan 26, 2024 at 01:45:39PM +0300, Michael Tokarev wrote:
> 26.01.2024 12:06, Daniel P. Berrangé wrote:
> > On Fri, Jan 26, 2024 at 08:44:13AM +0100, Laurent Vivier wrote:
> > > On 25/01/2024 at 23:29, Michael Tokarev wrote:
>
>
> > > I think the way using sysconf(_SC_OPEN_MAX) is more portable, simpler and
> > > cleaner than the one using /proc/self/fd.
> >
> > A fallback that uses _SC_OPEN_MAX is good for portability, but it
> > should not be considered a replacement for iterating over /proc/self/fd,
> > rather an additional fallback for non-Linux, or when /proc is not mounted.
> > It is not uncommon for _SC_OPEN_MAX to be *exceedingly* high:
> >
> > $ podman run -it quay.io/centos/centos:stream9
> > [root@4a440d62935c /]# ulimit -n
> > 524288
> >
> > Iterating over 1/2 a million FDs is a serious performance penalty that
> > we don't want to have, so _SC_OPEN_MAX should always be the last resort.
>
> From yesterday's conversation on IRC, which started this:
>
> <mmlb> open files (-n) 1073741816
>
> (it is a docker container)
> They weren't able to start qemu.. :)
>
> Sanity of such setting is questionable, but ok.
>
> Linux is not the only OS that implements the close_range(2) syscall;
> it is also available on some *BSDs.
>
> And the most important point is that we should aim at using O_CLOEXEC
> everywhere, without this need to close each FD at exec time. I think
> qemu is the only software with such paranoid closing when just running
> an interface setup script..

We should try to use O_CLOEXEC everywhere, but at the same time QEMU
links to a large number of libraries, and we can't assume that they've
reliably used O_CLOEXEC. Non-QEMU-owned code that is mapped into the
process likely outweighs QEMU-owned code by a factor of 10.
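To make the ordering discussed in this thread concrete, here is a rough
sketch of a tiered close-everything helper: try close_range(2) first, then
walk /proc/self/fd, and only loop up to sysconf(_SC_OPEN_MAX) as the last
resort. This is an illustration under my own assumptions, not the code from
the patch: the function names (sketch_close_all_fds_from and the three
static helpers) are hypothetical, the raw syscall is used so that the
sketch does not depend on a libc wrapper being present, and the 1024 guess
used when sysconf() fails is arbitrary.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <dirent.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/syscall.h>
#endif

/* Tier 1: ask the kernel to close every fd >= first in a single call.
 * Returns false if the syscall is unavailable or fails (e.g. ENOSYS). */
static bool close_fds_close_range(int first)
{
#if defined(__linux__) && defined(SYS_close_range)
    return syscall(SYS_close_range, (unsigned int)first, ~0U, 0U) == 0;
#else
    (void)first;
    return false;
#endif
}

/* Tier 2: close only the fds that are actually open, as listed in
 * /proc/self/fd. Returns false if /proc is not usable. */
static bool close_fds_procfs(int first)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *de;
    int dfd;

    if (!dir) {
        return false;
    }
    dfd = dirfd(dir);
    while ((de = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(de->d_name, &end, 10);

        if (end == de->d_name || *end != '\0') {
            continue;               /* skips "." and ".." */
        }
        if (fd >= first && fd != dfd) {
            close((int)fd);         /* keep the directory fd until the end */
        }
    }
    closedir(dir);
    return true;
}

/* Tier 3 (last resort): try every possible fd number up to the limit.
 * This is the loop that gets painfully slow when the limit is huge. */
static void close_fds_fallback(int first)
{
    long max = sysconf(_SC_OPEN_MAX);

    if (max < 0) {
        max = 1024;                 /* arbitrary guess, an assumption */
    }
    for (long fd = first; fd < max; fd++) {
        close((int)fd);
    }
}

/* Hypothetical entry point: close every fd >= first, cheapest way first. */
void sketch_close_all_fds_from(int first)
{
    if (close_fds_close_range(first)) {
        return;
    }
    if (close_fds_procfs(first)) {
        return;
    }
    close_fds_fallback(first);
}
```

In a fork/exec path this would be called in the child just before exec,
for example sketch_close_all_fds_from(3) to keep only stdin, stdout and
stderr. Real code also has to skip any fd it deliberately passes to the
child, which this sketch does not handle.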
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|