From: Alex Williamson <alex.williamson@redhat.com>
To: Grzegorz Jaszczyk <jaz@semihalf.com>
Cc: Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
linux-usb@vger.kernel.org,
Matthew Rosato <mjrosato@linux.ibm.com>,
Paul Durrant <paul@xen.org>, Tom Rix <trix@redhat.com>,
Jason Wang <jasowang@redhat.com>,
dri-devel@lists.freedesktop.org, Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org, Kirti Wankhede <kwankhede@nvidia.com>,
Paolo Bonzini <pbonzini@redhat.com>, Jens Axboe <axboe@kernel.dk>,
Vineeth Vijayan <vneethv@linux.ibm.com>,
Diana Craciun <diana.craciun@oss.nxp.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Shakeel Butt <shakeelb@google.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Leon Romanovsky <leon@kernel.org>,
Harald Freudenberger <freude@linux.ibm.com>,
Fei Li <fei1.li@intel.com>,
x86@kernel.org, Roman Gushchin <roman.gushchin@linux.dev>,
Halil Pasic <pasic@linux.ibm.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Ingo Molnar <mingo@redhat.com>,
intel-gfx@lists.freedesktop.org,
Christian Borntraeger <borntraeger@linux.ibm.com>,
linux-fpga@vger.kernel.org, Zhi Wang <zhi.a.wang@intel.com>,
Wu Hao <hao.wu@intel.com>, Jason Herne <jjherne@linux.ibm.com>,
Eric Farman <farman@linux.ibm.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Andrew Donnellan <ajd@linux.ibm.com>,
Arnd Bergmann <arnd@arndb.de>,
linux-s390@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linuxppc-dev@lists.ozlabs.org, Eric Auger <eric.auger@redhat.com>,
Borislav Petkov <bp@alien8.de>,
kvm@vger.kernel.org, Rodrigo Vivi <rodrigo.vivi@intel.com>,
cgroups@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
virtualization@lists.linux-foundation.org,
intel-gvt-dev@lists.freedesktop.org, io-uring@vger.kernel.org,
netdev@vger.kernel.org, Tony Krowiak <akrowiak@linux.ibm.com>,
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Pavel Begunkov <asml.silence@gmail.com>,
Sean Christopherson <seanjc@google.com>,
Oded Gabbay <ogabbay@kernel.org>,
Muchun Song <muchun.song@linux.dev>,
Peter Oberparleiter <oberpar@linux.ibm.com>,
linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
Benjamin LaHaise <bcrl@kvack.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
Sven Schnelle <svens@linux.ibm.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Frederic Barrat <fbarrat@linux.ibm.com>,
Moritz Fischer <mdf@kernel.org>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
David Woodhouse <dwmw2@infradead.org>,
Xu Yilun <yilun.xu@intel.com>, Dominik Behr <dbehr@chromium.org>,
Marcin Wojtas <mw@semihalf.com>
Subject: Re: [PATCH 0/2] eventfd: simplify signal helpers
Date: Mon, 17 Jul 2023 13:08:31 -0600
Message-ID: <20230717130831.0f18381a.alex.williamson@redhat.com>
In-Reply-To: <CAH76GKPF4BjJLrzLBW8k12ATaAGADeMYc2NQ9+j0KgRa0pomUw@mail.gmail.com>
On Mon, 17 Jul 2023 10:29:34 +0200
Grzegorz Jaszczyk <jaz@semihalf.com> wrote:
> On Fri, 14 Jul 2023 at 09:05, Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Thu, Jul 13, 2023 at 11:10:54AM -0600, Alex Williamson wrote:
> > > On Thu, 13 Jul 2023 12:05:36 +0200
> > > Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > > Hey everyone,
> > > >
> > > > This simplifies the eventfd_signal() and eventfd_signal_mask() helpers
> > > > by removing the count argument which is effectively unused.
> > >
> > > We have a patch under review which does in fact make use of the
> > > signaling value:
> > >
> > > https://lore.kernel.org/all/20230630155936.3015595-1-jaz@semihalf.com/
> >
> > Huh, thanks for the link.
> >
> > Quoting from
> > https://patchwork.kernel.org/project/kvm/patch/20230307220553.631069-1-jaz@semihalf.com/#25266856
> >
> > > Reading an eventfd returns an 8-byte value, we generally only use it
> > > as a counter, but it's been discussed previously and IIRC, it's possible
> > > to use that value as a notification value.
> >
> > So the goal is to pipe a specific value through eventfd? But it is
> > explicitly a counter. The whole thing is written around a counter and
> > each write and signal adds to the counter.
> >
> > The consequences are pretty well described in the cover letter of
> > v6 https://lore.kernel.org/all/20230630155936.3015595-1-jaz@semihalf.com/
> >
> > > Since the eventfd counter is used as ACPI notification value
> > > placeholder, the eventfd signaling needs to be serialized in order to
> > > not end up with notification values being coalesced. Therefore ACPI
> > > notification values are buffered and signaled one by one, when the
> > > previous notification value has been consumed.
> >
> > But isn't this a good indication that you really don't want an eventfd
> > but something that's explicitly designed to associate specific data with
> > a notification? Using eventfd in that manner requires serialization,
> > buffering, and enforces ordering.
What would that mechanism be? We've been iterating on getting the
serialization and buffering correct, but I don't know of another means
that combines the notification with a value, so we'd likely end up with
an eventfd only for notification and a separate ring buffer for
notification values.
As this series demonstrates, the current in-kernel users only increment
the counter and most userspace likely discards the counter value, which
makes the counter largely a waste. While perhaps unconventional,
there's no requirement that the counter may only be incremented by one,
nor any restriction that I see in how userspace must interpret the
counter value.
As I understand the ACPI notification proposal that Grzegorz links
below, a notification that carries an interpreted value allows for a
more direct userspace implementation when dealing with a series of
discrete events, each with its own notification value. Thanks,
Alex
> > I have no skin in the game aside from having to drop this conversion,
> > which I'm fine to do if there are actually users for this, but really,
> > that looks a lot like abusing an API that really wasn't designed for
> > this.
>
> https://patchwork.kernel.org/project/kvm/patch/20230307220553.631069-1-jaz@semihalf.com/
> was posted at the beginning of March and one of the main things we
> discussed was the mechanism for propagating the ACPI notification value.
> We ended up with eventfd as the best mechanism and have actually been
> using it since v2. I really do not want to waste this effort; I think
> we are quite advanced with v6 now. Additionally, we didn't actually
> modify any part of the existing eventfd support, we only used it
> in a specific (and previously discussed) way.