All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Christian Brauner <brauner-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Vitaly Kuznetsov
	<vkuznets-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Paolo Bonzini <pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Borislav Petkov <bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org>,
	Dave Hansen <dave.hansen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	David Woodhouse <dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Paul Durrant <paul-LM2mM/qkH7s@public.gmane.org>,
	Oded Gabbay <ogabbay-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Wu Hao <hao.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Tom Rix <trix-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Moritz Fischer <mdf-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Xu Yilun <yilun.xu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Zhenyu Wang <zhenyuw-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Zhi Wang <zhi.a.wang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jani Nikula <jani.nikula-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Joonas Lahtinen
	<joonas.lahtinen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Rodrigo Vivi
	<rodrigo.vivi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Tvrtko Ursulin
	<tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	David Airlie <airlied@gma>
Subject: Re: [PATCH 2/2] eventfd: simplify eventfd_signal_mask()
Date: Thu, 13 Jul 2023 07:33:05 -0700	[thread overview]
Message-ID: <ZLAK+FA3qgbHW0YK@google.com> (raw)
In-Reply-To: <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

On Thu, Jul 13, 2023, Christian Brauner wrote:
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index dc9e01053235..077be5da72bd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -43,9 +43,10 @@ struct eventfd_ctx {
>  	int id;
>  };
>  
> -__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
> +bool eventfd_signal_mask(struct eventfd_ctx *ctx, __poll_t mask)
>  {
>  	unsigned long flags;
> +	__u64 n = 1;
>  
>  	/*
>  	 * Deadlock or stack overflow issues can happen if we recurse here
> @@ -68,7 +69,7 @@ __u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
>  	current->in_eventfd = 0;
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>  
> -	return n;
> +	return n == 1;
>  }

...

> @@ -58,13 +58,12 @@ static inline struct eventfd_ctx *eventfd_ctx_fdget(int fd)
>  	return ERR_PTR(-ENOSYS);
>  }
>  
> -static inline int eventfd_signal(struct eventfd_ctx *ctx)
> +static inline bool eventfd_signal(struct eventfd_ctx *ctx)
>  {
>  	return -ENOSYS;
>  }
>  
> -static inline int eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n,
> -				      unsigned mask)
> +static inline bool eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
>  {
>  	return -ENOSYS;

This will morph to "true" for what should be an error case.  One option would be
to have eventfd_signal_mask() return 0/-errno instead of the count, but looking
at all the callers, nothing ever actually consumes the result.

KVMGT morphs failure into -EFAULT

	if (vgpu->msi_trigger && eventfd_signal(vgpu->msi_trigger, 1) != 1)
		return -EFAULT;

but the only caller of that user ignores the return value.

	if (vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ))
			& ~GEN8_MASTER_IRQ_CONTROL)
		inject_virtual_interrupt(vgpu);

The sample driver in samples/vfio-mdev/mtty.c uses a similar pattern: prints an
error but otherwise ignores the result.

So why not return nothing?  That will simplify eventfd_signal_mask() a wee bit
more, and eliminate that bizarre return value confusion for the ugly stubs, e.g.

void eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_allowed() before calling this function. If
	 * it returns false, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(current->in_eventfd))
		return;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	current->in_eventfd = 1;
	if (ctx->count < ULLONG_MAX)
		ctx->count++;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
	current->in_eventfd = 0;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

You could even go further and unify the real and stub versions of eventfd_signal().

WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-aio@kvack.org, linux-usb@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Paul Durrant <paul@xen.org>, Tom Rix <trix@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	dri-devel@lists.freedesktop.org, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, Kirti Wankhede <kwankhede@nvidia.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	David Airlie <airlied@gmail.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Shakeel Butt <shakeelb@google.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Leon Romanovsky <leon@kernel.org>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Fei Li <fei1.li@intel.com>,
	x86@kernel.org, Roman Gushchin <roman.gushchin@linux.dev>,
	Halil Pasic <pasic@linux.ibm.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Ingo Molnar <mingo@redhat.com>,
	intel-gfx@lists.freedesktop.org,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	linux-fpga@vger.kernel.org, Wu Hao <hao.wu@intel.com>,
	Jason Herne <jjherne@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-s390@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Frederic Barrat <fbarrat@linux.ibm.com>,
	Moritz Fischer <mdf@kernel.org>,
	kvm@vger.kernel.org, Rodrigo Vivi <rodrigo.vivi@intel.com>,
	cgroups@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	virtualization@lists.linux-foundation.org,
	intel-gvt-dev@lists.freedesktop.org, io-uring@vger.kernel.org,
	netdev@vger.kernel.org, Tony Krowiak <akrowiak@linux.ibm.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Eric Auger <eric.auger@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Oded Gabbay <ogabbay@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Benjamin LaHaise <bcrl@kvack.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Daniel Vetter <daniel@ffwll.ch>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org, Borislav Petkov <bp@alien8.de>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Xu Yilun <yilun.xu@intel.com>
Subject: Re: [Intel-gfx] [PATCH 2/2] eventfd: simplify eventfd_signal_mask()
Date: Thu, 13 Jul 2023 07:33:05 -0700	[thread overview]
Message-ID: <ZLAK+FA3qgbHW0YK@google.com> (raw)
In-Reply-To: <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b@kernel.org>

On Thu, Jul 13, 2023, Christian Brauner wrote:
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index dc9e01053235..077be5da72bd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -43,9 +43,10 @@ struct eventfd_ctx {
>  	int id;
>  };
>  
> -__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
> +bool eventfd_signal_mask(struct eventfd_ctx *ctx, __poll_t mask)
>  {
>  	unsigned long flags;
> +	__u64 n = 1;
>  
>  	/*
>  	 * Deadlock or stack overflow issues can happen if we recurse here
> @@ -68,7 +69,7 @@ __u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
>  	current->in_eventfd = 0;
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>  
> -	return n;
> +	return n == 1;
>  }

...

> @@ -58,13 +58,12 @@ static inline struct eventfd_ctx *eventfd_ctx_fdget(int fd)
>  	return ERR_PTR(-ENOSYS);
>  }
>  
> -static inline int eventfd_signal(struct eventfd_ctx *ctx)
> +static inline bool eventfd_signal(struct eventfd_ctx *ctx)
>  {
>  	return -ENOSYS;
>  }
>  
> -static inline int eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n,
> -				      unsigned mask)
> +static inline bool eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
>  {
>  	return -ENOSYS;

This will morph to "true" for what should be an error case.  One option would be
to have eventfd_signal_mask() return 0/-errno instead of the count, but looking
at all the callers, nothing ever actually consumes the result.

KVMGT morphs failure into -EFAULT

	if (vgpu->msi_trigger && eventfd_signal(vgpu->msi_trigger, 1) != 1)
		return -EFAULT;

but the only caller of that user ignores the return value.

	if (vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ))
			& ~GEN8_MASTER_IRQ_CONTROL)
		inject_virtual_interrupt(vgpu);

The sample driver in samples/vfio-mdev/mtty.c uses a similar pattern: prints an
error but otherwise ignores the result.

So why not return nothing?  That will simplify eventfd_signal_mask() a wee bit
more, and eliminate that bizarre return value confusion for the ugly stubs, e.g.

void eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_allowed() before calling this function. If
	 * it returns false, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(current->in_eventfd))
		return;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	current->in_eventfd = 1;
	if (ctx->count < ULLONG_MAX)
		ctx->count++;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
	current->in_eventfd = 0;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

You could even go further and unify the real and stub versions of eventfd_signal().

WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, David Woodhouse <dwmw2@infradead.org>,
	Paul Durrant <paul@xen.org>, Oded Gabbay <ogabbay@kernel.org>,
	Wu Hao <hao.wu@intel.com>, Tom Rix <trix@redhat.com>,
	Moritz Fischer <mdf@kernel.org>, Xu Yilun <yilun.xu@intel.com>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhi Wang <zhi.a.wang@intel.com>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
	Leon Romanovsky <leon@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Frederic Barrat <fbarrat@linux.ibm.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Eric Farman <farman@linux.ibm.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Jason Herne <jjherne@linux.ibm.com>,
	Harald Freudenberger <freude@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Eric Auger <eric.auger@redhat.com>, Fei Li <fei1.li@intel.com>,
	Benjamin LaHaise <bcrl@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <muchun.song@linux.dev>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-fpga@vger.kernel.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org, linux-rdma@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-usb@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, linux-aio@kvack.org,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	Jens Axboe <axboe@kernel.dk>,
	Pavel Begunkov <asml.silence@gmail.com>,
	io-uring@vger.kernel.org
Subject: Re: [PATCH 2/2] eventfd: simplify eventfd_signal_mask()
Date: Thu, 13 Jul 2023 07:33:05 -0700	[thread overview]
Message-ID: <ZLAK+FA3qgbHW0YK@google.com> (raw)
In-Reply-To: <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b@kernel.org>

On Thu, Jul 13, 2023, Christian Brauner wrote:
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index dc9e01053235..077be5da72bd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -43,9 +43,10 @@ struct eventfd_ctx {
>  	int id;
>  };
>  
> -__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
> +bool eventfd_signal_mask(struct eventfd_ctx *ctx, __poll_t mask)
>  {
>  	unsigned long flags;
> +	__u64 n = 1;
>  
>  	/*
>  	 * Deadlock or stack overflow issues can happen if we recurse here
> @@ -68,7 +69,7 @@ __u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
>  	current->in_eventfd = 0;
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>  
> -	return n;
> +	return n == 1;
>  }

...

> @@ -58,13 +58,12 @@ static inline struct eventfd_ctx *eventfd_ctx_fdget(int fd)
>  	return ERR_PTR(-ENOSYS);
>  }
>  
> -static inline int eventfd_signal(struct eventfd_ctx *ctx)
> +static inline bool eventfd_signal(struct eventfd_ctx *ctx)
>  {
>  	return -ENOSYS;
>  }
>  
> -static inline int eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n,
> -				      unsigned mask)
> +static inline bool eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
>  {
>  	return -ENOSYS;

This will morph to "true" for what should be an error case.  One option would be
to have eventfd_signal_mask() return 0/-errno instead of the count, but looking
at all the callers, nothing ever actually consumes the result.

KVMGT morphs failure into -EFAULT

	if (vgpu->msi_trigger && eventfd_signal(vgpu->msi_trigger, 1) != 1)
		return -EFAULT;

but the only caller of that user ignores the return value.

	if (vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ))
			& ~GEN8_MASTER_IRQ_CONTROL)
		inject_virtual_interrupt(vgpu);

The sample driver in samples/vfio-mdev/mtty.c uses a similar pattern: prints an
error but otherwise ignores the result.

So why not return nothing?  That will simplify eventfd_signal_mask() a wee bit
more, and eliminate that bizarre return value confusion for the ugly stubs, e.g.

void eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_allowed() before calling this function. If
	 * it returns false, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(current->in_eventfd))
		return;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	current->in_eventfd = 1;
	if (ctx->count < ULLONG_MAX)
		ctx->count++;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
	current->in_eventfd = 0;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

You could even go further and unify the real and stub versions of eventfd_signal().

WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-aio@kvack.org, linux-usb@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Paul Durrant <paul@xen.org>, Tom Rix <trix@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	dri-devel@lists.freedesktop.org, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, Kirti Wankhede <kwankhede@nvidia.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>,
	netdev@vger.kernel.org,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	David Airlie <airlied@gmail.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Shakeel Butt <shakeelb@google.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Leon Romanovsky <leon@kernel.org>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Fei Li <fei1.li@intel.com>,
	x86@kernel.org, Roman Gushchin <roman.gushchin@linux.dev>,
	Halil Pasic <pasic@linux.ibm.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Ingo Molnar <mingo@redhat.com>,
	intel-gf x@lists.freedesktop.org,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	linux-fpga@vger.kernel.org, Zhi Wang <zhi.a.wang@intel.com>,
	Wu Hao <hao.wu@intel.com>, Jason Herne <jjherne@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-s390@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Frederic Barrat <fbarrat@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Moritz Fischer <mdf@kernel.org>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	kvm@vger.kernel.org, Rodrigo Vivi <rodrigo.vivi@intel.com>,
	cgroups@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	virtualization@lists.linux-foundation.org,
	intel-gvt-dev@lists.freedesktop.org, io-uring@vger.kernel.org,
	Tony Krowiak <akrowiak@linux.ibm.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Pavel Begunkov <asml.silence@gmail .com>,
	Eric Auger <eric.auger@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Oded Gabbay <ogabbay@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Benjamin LaHaise <bcrl@kvack.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Daniel Vetter <daniel@ffwll.ch>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org, Borislav Petkov <bp@alien8.de>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Xu Yilun <yilun.xu@intel.com>
Subject: Re: [PATCH 2/2] eventfd: simplify eventfd_signal_mask()
Date: Thu, 13 Jul 2023 07:33:05 -0700	[thread overview]
Message-ID: <ZLAK+FA3qgbHW0YK@google.com> (raw)
In-Reply-To: <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b@kernel.org>

On Thu, Jul 13, 2023, Christian Brauner wrote:
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index dc9e01053235..077be5da72bd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -43,9 +43,10 @@ struct eventfd_ctx {
>  	int id;
>  };
>  
> -__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
> +bool eventfd_signal_mask(struct eventfd_ctx *ctx, __poll_t mask)
>  {
>  	unsigned long flags;
> +	__u64 n = 1;
>  
>  	/*
>  	 * Deadlock or stack overflow issues can happen if we recurse here
> @@ -68,7 +69,7 @@ __u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
>  	current->in_eventfd = 0;
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>  
> -	return n;
> +	return n == 1;
>  }

...

> @@ -58,13 +58,12 @@ static inline struct eventfd_ctx *eventfd_ctx_fdget(int fd)
>  	return ERR_PTR(-ENOSYS);
>  }
>  
> -static inline int eventfd_signal(struct eventfd_ctx *ctx)
> +static inline bool eventfd_signal(struct eventfd_ctx *ctx)
>  {
>  	return -ENOSYS;
>  }
>  
> -static inline int eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n,
> -				      unsigned mask)
> +static inline bool eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
>  {
>  	return -ENOSYS;

This will morph to "true" for what should be an error case.  One option would be
to have eventfd_signal_mask() return 0/-errno instead of the count, but looking
at all the callers, nothing ever actually consumes the result.

KVMGT morphs failure into -EFAULT

	if (vgpu->msi_trigger && eventfd_signal(vgpu->msi_trigger, 1) != 1)
		return -EFAULT;

but the only caller of that user ignores the return value.

	if (vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ))
			& ~GEN8_MASTER_IRQ_CONTROL)
		inject_virtual_interrupt(vgpu);

The sample driver in samples/vfio-mdev/mtty.c uses a similar pattern: prints an
error but otherwise ignores the result.

So why not return nothing?  That will simplify eventfd_signal_mask() a wee bit
more, and eliminate that bizarre return value confusion for the ugly stubs, e.g.

void eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_allowed() before calling this function. If
	 * it returns false, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(current->in_eventfd))
		return;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	current->in_eventfd = 1;
	if (ctx->count < ULLONG_MAX)
		ctx->count++;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
	current->in_eventfd = 0;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

You could even go further and unify the real and stub versions of eventfd_signal().

WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-aio@kvack.org, linux-usb@vger.kernel.org,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Paul Durrant <paul@xen.org>, Tom Rix <trix@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	dri-devel@lists.freedesktop.org, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, Kirti Wankhede <kwankhede@nvidia.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Vineeth Vijayan <vneethv@linux.ibm.com>,
	Diana Craciun <diana.craciun@oss.nxp.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Shakeel Butt <shakeelb@google.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Leon Romanovsky <leon@kernel.org>,
	Harald Freudenberger <freude@linux.ibm.com>,
	Fei Li <fei1.li@intel.com>,
	x86@kernel.org, Roman Gushchin <roman.gushchin@linux.dev>,
	Halil Pasic <pasic@linux.ibm.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Ingo Molnar <mingo@redhat.com>,
	intel-gfx@lists.freedesktop.org,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	linux-fpga@vger.kernel.org, Zhi Wang <zhi.a.wang@intel.com>,
	Wu Hao <hao.wu@intel.com>, Jason Herne <jjherne@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andrew Donnellan <ajd@linux.ibm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-s390@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Frederic Barrat <fbarrat@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Moritz Fischer <mdf@kernel.org>,
	kvm@vger.kernel.org, Rodrigo Vivi <rodrigo.vivi@intel.com>,
	cgroups@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	virtualization@lists.linux-foundation.org,
	intel-gvt-dev@lists.freedesktop.org, io-uring@vger.kernel.org,
	netdev@vger.kernel.org, Tony Krowiak <akrowiak@linux.ibm.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Eric Auger <eric.auger@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Oded Gabbay <ogabbay@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	Peter Oberparleiter <oberpar@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Benjamin LaHaise <bcrl@kvack.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org, Borislav Petkov <bp@alien8.de>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Xu Yilun <yilun.xu@intel.com>
Subject: Re: [PATCH 2/2] eventfd: simplify eventfd_signal_mask()
Date: Thu, 13 Jul 2023 07:33:05 -0700	[thread overview]
Message-ID: <ZLAK+FA3qgbHW0YK@google.com> (raw)
In-Reply-To: <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b@kernel.org>

On Thu, Jul 13, 2023, Christian Brauner wrote:
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index dc9e01053235..077be5da72bd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -43,9 +43,10 @@ struct eventfd_ctx {
>  	int id;
>  };
>  
> -__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
> +bool eventfd_signal_mask(struct eventfd_ctx *ctx, __poll_t mask)
>  {
>  	unsigned long flags;
> +	__u64 n = 1;
>  
>  	/*
>  	 * Deadlock or stack overflow issues can happen if we recurse here
> @@ -68,7 +69,7 @@ __u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, __poll_t mask)
>  	current->in_eventfd = 0;
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>  
> -	return n;
> +	return n == 1;
>  }

...

> @@ -58,13 +58,12 @@ static inline struct eventfd_ctx *eventfd_ctx_fdget(int fd)
>  	return ERR_PTR(-ENOSYS);
>  }
>  
> -static inline int eventfd_signal(struct eventfd_ctx *ctx)
> +static inline bool eventfd_signal(struct eventfd_ctx *ctx)
>  {
>  	return -ENOSYS;
>  }
>  
> -static inline int eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n,
> -				      unsigned mask)
> +static inline bool eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
>  {
>  	return -ENOSYS;

This will morph to "true" for what should be an error case.  One option would be
to have eventfd_signal_mask() return 0/-errno instead of the count, but looking
at all the callers, nothing ever actually consumes the result.

KVMGT morphs failure into -EFAULT

	if (vgpu->msi_trigger && eventfd_signal(vgpu->msi_trigger, 1) != 1)
		return -EFAULT;

but the only caller of that user ignores the return value.

	if (vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ))
			& ~GEN8_MASTER_IRQ_CONTROL)
		inject_virtual_interrupt(vgpu);

The sample driver in samples/vfio-mdev/mtty.c uses a similar pattern: prints an
error but otherwise ignores the result.

So why not return nothing?  That will simplify eventfd_signal_mask() a wee bit
more, and eliminate that bizarre return value confusion for the ugly stubs, e.g.

void eventfd_signal_mask(struct eventfd_ctx *ctx, unsigned mask)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_allowed() before calling this function. If
	 * it returns false, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(current->in_eventfd))
		return;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	current->in_eventfd = 1;
	if (ctx->count < ULLONG_MAX)
		ctx->count++;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
	current->in_eventfd = 0;
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
}

You could even go further and unify the real and stub versions of eventfd_signal().

  parent reply	other threads:[~2023-07-13 14:33 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-13 10:05 [PATCH 0/2] eventfd: simplify signal helpers Christian Brauner
2023-07-13 10:05 ` Christian Brauner
2023-07-13 10:05 ` Christian Brauner
2023-07-13 10:05 ` Christian Brauner
2023-07-13 10:05 ` [Intel-gfx] " Christian Brauner
2023-07-13 10:05 ` [PATCH 1/2] eventfd: simplify eventfd_signal() Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` [Intel-gfx] " Christian Brauner
2023-07-13 13:29   ` Anthony Krowiak
2023-07-13 13:29     ` Anthony Krowiak
2023-07-13 13:29     ` Anthony Krowiak
2023-07-13 13:29     ` Anthony Krowiak
2023-07-13 13:29     ` [Intel-gfx] " Anthony Krowiak
2023-07-13 13:34   ` Oded Gabbay
2023-07-13 13:34     ` Oded Gabbay
2023-07-13 13:34     ` Oded Gabbay
2023-07-13 13:34     ` Oded Gabbay
2023-07-13 13:34     ` [Intel-gfx] " Oded Gabbay
2023-07-13 10:05 ` [PATCH 2/2] eventfd: simplify eventfd_signal_mask() Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` Christian Brauner
2023-07-13 10:05   ` [Intel-gfx] " Christian Brauner
2023-07-13 11:59   ` Petr Machata
2023-07-13 14:00     ` Christian Brauner
     [not found]   ` <20230713-vfs-eventfd-signal-v1-2-7fda6c5d212b-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2023-07-13 14:33     ` Sean Christopherson [this message]
2023-07-13 14:33       ` Sean Christopherson
2023-07-13 14:33       ` Sean Christopherson
2023-07-13 14:33       ` Sean Christopherson
2023-07-13 14:33       ` [Intel-gfx] " Sean Christopherson
2023-07-13 14:52       ` Christian Brauner
2023-07-13 14:52         ` Christian Brauner
2023-07-13 14:52         ` Christian Brauner
2023-07-13 14:52         ` Christian Brauner
2023-07-13 14:52         ` [Intel-gfx] " Christian Brauner
2023-07-13 16:43 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for eventfd: simplify signal helpers Patchwork
2023-07-13 16:43 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-07-13 16:57 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-07-13 17:10 ` [PATCH 0/2] " Alex Williamson
2023-07-13 17:10   ` Alex Williamson
2023-07-13 17:10   ` Alex Williamson
2023-07-13 17:10   ` Alex Williamson
2023-07-13 17:10   ` Alex Williamson
2023-07-13 17:10   ` [Intel-gfx] " Alex Williamson
2023-07-13 21:43 ` [Intel-gfx] ✓ Fi.CI.IGT: success for " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZLAK+FA3qgbHW0YK@google.com \
    --to=seanjc-hpiqsd4aklfqt0dzr+alfa@public.gmane.org \
    --cc=airlied@gma \
    --cc=bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org \
    --cc=brauner-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=dave.hansen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=hao.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=jani.nikula-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=joonas.lahtinen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mdf-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=ogabbay-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=paul-LM2mM/qkH7s@public.gmane.org \
    --cc=pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=rodrigo.vivi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org \
    --cc=trix-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=tvrtko.ursulin-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=vkuznets-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=yilun.xu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=zhenyuw-VuQAYsv1563Yd54FQh9/CA@public.gmane.org \
    --cc=zhi.a.wang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.