linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>,
	Alexey Kardashevskiy <aik@amd.com>,
	Paul Mackerras <paulus@ozlabs.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...))
Date: Mon, 10 Jun 2024 03:44:37 +0100	[thread overview]
Message-ID: <20240610024437.GA1464458@ZenIV> (raw)
In-Reply-To: <20240607210814.GC1629371@ZenIV>

On Fri, Jun 07, 2024 at 10:08:14PM +0100, Al Viro wrote:

> Hell knows - it feels like mixing __cleanup-based stuff with anything
> explicit leads to massive headache.  And I *really* hate to have
> e.g. inode_unlock() hidden in __cleanup in a random subset of places.
> Unlike dropping file references (if we do that a bit later, nothing
> would really care), the loss of explicit control over the places where
> inode lock is dropped is asking for serious trouble.
> 
> Any suggestions?  Linus, what's your opinion on the use of CLASS...
> stuff?

While looking through the converted fdget() users, some interesting
stuff got caught.  Example:

kvm_device_fops.unlocked_ioctl() is equal to kvm_device_ioctl() and it
gets called (by ioctl(2)) without any locks.

kvm_device_ioctl() calls kvm_device_ioctl_attr(), passing it dev->ops->set_attr.

kvm_device_ioctl_attr() calls the callback passed to it, still without any
locks.

->set_attr() can be kvm_vfio_set_attr(), which calls kvm_vfio_set_file(), which
calls kvm_vfio_file_set_spapr_tce(), which takes dev->private.lock and
calls kvm_spapr_tce_attach_iommu_group().  No kvm->lock held.


Now, in kvm_spapr_tce_attach_iommu_group() we have (in mainline)
        f = fdget(tablefd);
        if (!f.file)
                return -EBADF;

        rcu_read_lock();
        list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
                if (stt == f.file->private_data) {
                        found = true;
                        break;
                }
        }
        rcu_read_unlock();

        fdput(f);

        if (!found)
                return -EINVAL;

	....
        list_add_rcu(&stit->next, &stt->iommu_tables);

What happens if another thread closes the damn descriptor just as we'd
done fdput()?  This:
static int kvm_spapr_tce_release(struct inode *inode, struct file *filp)
{
        struct kvmppc_spapr_tce_table *stt = filp->private_data;
        struct kvmppc_spapr_tce_iommu_table *stit, *tmp;
        struct kvm *kvm = stt->kvm;

        mutex_lock(&kvm->lock);
        list_del_rcu(&stt->list);
        mutex_unlock(&kvm->lock);

        list_for_each_entry_safe(stit, tmp, &stt->iommu_tables, next) {
                WARN_ON(!kref_read(&stit->kref));
                while (1) {
                        if (kref_put(&stit->kref, kvm_spapr_tce_liobn_put))
                                break;
                }
        }

        account_locked_vm(kvm->mm,
                kvmppc_stt_pages(kvmppc_tce_pages(stt->size)), false);

        kvm_put_kvm(stt->kvm);

        call_rcu(&stt->rcu, release_spapr_tce_table);

        return 0;
}

Leaving aside the question of sanity of that while (!kfref_put()) loop,
that function will *NOT* block on kvm->lock (the only lock being held
by the caller of kvm_spapr_tce_attach_iommu_group() is struct kvm_vfio::lock,
not struct kvm::lock) and it will arrange for RCU-delayed call of
release_spapr_tce_table(), which will kfree stt.

Recall that in kvm_spapr_tce_attach_iommu_group() we are not holding
rcu_read_lock() between fdput() and list_add_rcu() (we couldn't - not
with the blocking allocations we have there), so call_rcu() might as
well have been a direct call.

What's there to protect stt from being freed right after fdput()?

Unless I'm misreading that code (entirely possible), this fdput() shouldn't
be done until we are done with stt.

  reply	other threads:[~2024-06-10  2:44 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-07  1:56 [PATCHES][RFC] rework of struct fd handling Al Viro
2024-06-07  1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
2024-06-07  1:59   ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
2024-06-07 15:17     ` Christian Brauner
2024-06-07  1:59   ` [PATCH 03/19] introduce fd_file(), convert all accessors to it Al Viro
2024-06-07  1:59   ` [PATCH 04/19] struct fd: representation change Al Viro
2024-06-07  5:55     ` Amir Goldstein
2024-06-07  1:59   ` [PATCH 05/19] add struct fd constructors, get rid of __to_fd() Al Viro
2024-06-07  1:59   ` [PATCH 06/19] net/socket.c: switch to CLASS(fd) Al Viro
2024-06-07  1:59   ` [PATCH 07/19] introduce struct fderr, convert overlayfs uses to that Al Viro
2024-06-07  1:59   ` [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...) Al Viro
2024-06-07 15:20     ` Christian Brauner
2024-06-07  1:59   ` [PATCH 09/19] css_set_fork(): " Al Viro
2024-06-07 15:21     ` Christian Brauner
2024-06-07  1:59   ` [PATCH 10/19] introduce "fd_pos" class Al Viro
2024-06-07 15:21     ` Christian Brauner
2024-06-07  1:59   ` [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...) Al Viro
2024-06-07 15:26     ` Christian Brauner
2024-06-07 16:10       ` Al Viro
2024-06-07 16:11         ` Al Viro
2024-06-07 21:08         ` Al Viro
2024-06-10  2:44           ` Al Viro [this message]
2024-06-12 16:36             ` [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)) Linus Torvalds
2024-06-13 10:56               ` Michael Ellerman
2024-06-10  5:12           ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
2024-06-10 17:03             ` Al Viro
2024-06-10 20:09             ` Alex Williamson
2024-06-10 20:53               ` Al Viro
2024-06-11 23:04                 ` Alex Williamson
2024-06-12  2:16                   ` Al Viro
2024-06-07  1:59   ` [PATCH 12/19] bpf: switch to CLASS(fd, ...) Al Viro
2024-06-07 15:27     ` Christian Brauner
2024-06-07  1:59   ` [PATCH 13/19] convert vmsplice() " Al Viro
2024-06-07  1:59   ` [PATCH 14/19] finit_module(): convert " Al Viro
2024-06-07  1:59   ` [PATCH 15/19] timerfd: switch " Al Viro
2024-06-07  1:59   ` [PATCH 16/19] do_mq_notify(): " Al Viro
2024-06-07  1:59   ` [PATCH 17/19] simplify xfs_find_handle() a bit Al Viro
2024-06-07  1:59   ` [PATCH 18/19] convert kernel/events/core.c Al Viro
2024-06-07  1:59   ` [PATCH 19/19] deal with the last remaing boolean uses of fd_file() Al Viro
2024-06-07 15:16   ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Christian Brauner
2024-06-07 15:30 ` [PATCHES][RFC] rework of struct fd handling Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240610024437.GA1464458@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=aik@amd.com \
    --cc=brauner@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@ozlabs.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).