public inbox for linux-kselftest@vger.kernel.org
 help / color / mirror / Atom feed
From: Nirmoy Das <nirmoyd@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>, <iommu@lists.linux.dev>,
	Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
	<linux-kselftest@vger.kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	"Shuah Khan" <shuah@kernel.org>, Will Deacon <will@kernel.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>, <patches@lists.linux.dev>,
	<syzbot+80620e2d0d0a33b09f93@syzkaller.appspotmail.com>
Subject: Re: [PATCH 1/3] iommufd: Fix race during abort for file descriptors
Date: Thu, 18 Sep 2025 14:37:38 +0200	[thread overview]
Message-ID: <12171a1f-d385-4f40-87f8-004f37c50142@nvidia.com> (raw)
In-Reply-To: <1-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia.com>


On 17.09.25 22:01, Jason Gunthorpe wrote:
> fput() doesn't actually call file_operations release() synchronously, it
> puts the file on a work queue and it will be released eventually.
>
> This is normally fine, except for iommufd the file and the iommufd_object
> are tied to gether. The file has the object as it's private_data and holds
> a users refcount, while the object is expected to remain alive as long as
> the file is.
>
> When the allocation of a new object aborts before installing the file it
> will fput() the file and then go on to immediately kfree() the obj. This
> causes a UAF once the workqueue completes the fput() and tries to
> decrement the users refcount.
>
> Fix this by putting the core code in charge of the file lifetime, and call
> __fput_sync() during abort to ensure that release() is called before
> kfree.

Thanks for the detailed description. I had looked into this but couldn't 
find a straightforward solution.

Reviewed-by: Nirmoy Das <nirmoyd@nvidia.com>


>   __fput_sync() is a bit too tricky to open code in all the object
> implementations. Instead the objects tell the core code where the file
> pointer is and the core will take care of the life cycle.
>
> If the object is successfully allocated then the file will hold a users
> refcount and the iommufd_object cannot be destroyed.
>
> It is worth noting that close(); ioctl(IOMMU_DESTROY); doesn't have an
> issue because close() is already using a synchronous version of fput().
>
> The UAF looks like this:
>
>      BUG: KASAN: slab-use-after-free in iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
>      Write of size 4 at addr ffff888059c97804 by task syz.0.46/6164
>
>      CPU: 0 UID: 0 PID: 6164 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT(full)
>      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
>      Call Trace:
>       <TASK>
>       __dump_stack lib/dump_stack.c:94 [inline]
>       dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
>       print_address_description mm/kasan/report.c:378 [inline]
>       print_report+0xcd/0x630 mm/kasan/report.c:482
>       kasan_report+0xe0/0x110 mm/kasan/report.c:595
>       check_region_inline mm/kasan/generic.c:183 [inline]
>       kasan_check_range+0x100/0x1b0 mm/kasan/generic.c:189
>       instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
>       atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:400 [inline]
>       __refcount_dec include/linux/refcount.h:455 [inline]
>       refcount_dec include/linux/refcount.h:476 [inline]
>       iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
>       __fput+0x402/0xb70 fs/file_table.c:468
>       task_work_run+0x14d/0x240 kernel/task_work.c:227
>       resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>       exit_to_user_mode_loop+0xeb/0x110 kernel/entry/common.c:43
>       exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
>       syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
>       syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
>       do_syscall_64+0x41c/0x4c0 arch/x86/entry/syscall_64.c:100
>       entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Cc: stable@vger.kernel.org
> Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
> Reported-by: syzbot+80620e2d0d0a33b09f93@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/r/68c8583d.050a0220.2ff435.03a2.GAE@google.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.coz>
> ---
>   drivers/iommu/iommufd/eventq.c |  9 ++-------
>   drivers/iommu/iommufd/main.c   | 35 +++++++++++++++++++++++++++++++---
>   2 files changed, 34 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/iommu/iommufd/eventq.c b/drivers/iommu/iommufd/eventq.c
> index fc4de63b0bce64..e23d9ee4fe3806 100644
> --- a/drivers/iommu/iommufd/eventq.c
> +++ b/drivers/iommu/iommufd/eventq.c
> @@ -393,12 +393,12 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
>   			       const struct file_operations *fops)
>   {
>   	struct file *filep;
> -	int fdno;
>   
>   	spin_lock_init(&eventq->lock);
>   	INIT_LIST_HEAD(&eventq->deliver);
>   	init_waitqueue_head(&eventq->wait_queue);
>   
> +	/* The filep is fput() by the core code during failure */
>   	filep = anon_inode_getfile(name, fops, eventq, O_RDWR);
>   	if (IS_ERR(filep))
>   		return PTR_ERR(filep);
> @@ -408,10 +408,7 @@ static int iommufd_eventq_init(struct iommufd_eventq *eventq, char *name,
>   	eventq->filep = filep;
>   	refcount_inc(&eventq->obj.users);
>   
> -	fdno = get_unused_fd_flags(O_CLOEXEC);
> -	if (fdno < 0)
> -		fput(filep);
> -	return fdno;
> +	return get_unused_fd_flags(O_CLOEXEC);
>   }
>   
>   static const struct file_operations iommufd_fault_fops =
> @@ -452,7 +449,6 @@ int iommufd_fault_alloc(struct iommufd_ucmd *ucmd)
>   	return 0;
>   out_put_fdno:
>   	put_unused_fd(fdno);
> -	fput(fault->common.filep);
>   	return rc;
>   }
>   
> @@ -536,7 +532,6 @@ int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
>   
>   out_put_fdno:
>   	put_unused_fd(fdno);
> -	fput(veventq->common.filep);
>   out_abort:
>   	iommufd_object_abort_and_destroy(ucmd->ictx, &veventq->common.obj);
>   out_unlock_veventqs:
> diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
> index 109de747e8b3ed..b8475194279a9a 100644
> --- a/drivers/iommu/iommufd/main.c
> +++ b/drivers/iommu/iommufd/main.c
> @@ -23,6 +23,7 @@
>   #include "iommufd_test.h"
>   
>   struct iommufd_object_ops {
> +	size_t file_offset;
>   	void (*pre_destroy)(struct iommufd_object *obj);
>   	void (*destroy)(struct iommufd_object *obj);
>   	void (*abort)(struct iommufd_object *obj);
> @@ -131,10 +132,30 @@ void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj)
>   void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
>   				      struct iommufd_object *obj)
>   {
> -	if (iommufd_object_ops[obj->type].abort)
> -		iommufd_object_ops[obj->type].abort(obj);
> +	const struct iommufd_object_ops *ops = &iommufd_object_ops[obj->type];
> +
> +	if (ops->file_offset) {
> +		struct file **filep = ((void *)obj) + ops->file_offset;
> +
> +		/*
> +		 * files should hold a users refcount while the file is open and
> +		 * put it back in their release. They should hold a pointer to
> +		 * obj in their private data. Normal fput() is deferred to a
> +		 * workqueue and can get out of order with the following
> +		 * kfree(obj). Using the sync version ensures the release
> +		 * happens immediately. During abort we require the file
> +		 * refcount is one at this point - meaning the object alloc
> +		 * function cannot do anything to allow another thread to take a
> +		 * refcount prior to a guaranteed success.
> +		 */
> +		if (*filep)
> +			__fput_sync(*filep);
> +	}
> +
> +	if (ops->abort)
> +		ops->abort(obj);
>   	else
> -		iommufd_object_ops[obj->type].destroy(obj);
> +		ops->destroy(obj);
>   	iommufd_object_abort(ictx, obj);
>   }
>   
> @@ -659,6 +680,12 @@ void iommufd_ctx_put(struct iommufd_ctx *ictx)
>   }
>   EXPORT_SYMBOL_NS_GPL(iommufd_ctx_put, "IOMMUFD");
>   
> +#define IOMMUFD_FILE_OFFSET(_struct, _filep, _obj)                           \
> +	.file_offset = (offsetof(_struct, _filep) +                          \
> +			BUILD_BUG_ON_ZERO(!__same_type(                      \
> +				struct file *, ((_struct *)NULL)->_filep)) + \
> +			BUILD_BUG_ON_ZERO(offsetof(_struct, _obj)))
> +
>   static const struct iommufd_object_ops iommufd_object_ops[] = {
>   	[IOMMUFD_OBJ_ACCESS] = {
>   		.destroy = iommufd_access_destroy_object,
> @@ -669,6 +696,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
>   	},
>   	[IOMMUFD_OBJ_FAULT] = {
>   		.destroy = iommufd_fault_destroy,
> +		IOMMUFD_FILE_OFFSET(struct iommufd_fault, common.filep, common.obj),
>   	},
>   	[IOMMUFD_OBJ_HW_QUEUE] = {
>   		.destroy = iommufd_hw_queue_destroy,
> @@ -691,6 +719,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
>   	[IOMMUFD_OBJ_VEVENTQ] = {
>   		.destroy = iommufd_veventq_destroy,
>   		.abort = iommufd_veventq_abort,
> +		IOMMUFD_FILE_OFFSET(struct iommufd_veventq, common.filep, common.obj),
>   	},
>   	[IOMMUFD_OBJ_VIOMMU] = {
>   		.destroy = iommufd_viommu_destroy,

  parent reply	other threads:[~2025-09-18 12:39 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-17 20:01 [PATCH 0/3] Fix a race with fput during eventq abort Jason Gunthorpe
2025-09-17 20:01 ` [PATCH 1/3] iommufd: Fix race during abort for file descriptors Jason Gunthorpe
2025-09-18  5:07   ` Nicolin Chen
2025-09-18 14:43     ` Jason Gunthorpe
2025-09-18 12:37   ` Nirmoy Das [this message]
2025-09-19  8:16   ` Tian, Kevin
2025-09-17 20:01 ` [PATCH 2/3] iommufd: WARN if an object is aborted with an elevated refcount Jason Gunthorpe
2025-09-18  6:10   ` Nicolin Chen
2025-09-18 14:47     ` Jason Gunthorpe
2025-09-18 20:49       ` Nicolin Chen
2025-09-18 20:50   ` Nicolin Chen
2025-09-19  8:16   ` Tian, Kevin
2025-09-17 20:01 ` [PATCH 3/3] iommufd/selftest: Update the fail_nth limit Jason Gunthorpe
2025-09-18  5:28   ` Nicolin Chen
2025-09-19  8:17   ` Tian, Kevin
2025-09-18 20:52 ` [PATCH 0/3] Fix a race with fput during eventq abort Nicolin Chen
2025-09-19 13:43 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12171a1f-d385-4f40-87f8-004f37c50142@nvidia.com \
    --to=nirmoyd@nvidia.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=robin.murphy@arm.com \
    --cc=shuah@kernel.org \
    --cc=syzbot+80620e2d0d0a33b09f93@syzkaller.appspotmail.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox