From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "iommu@lists.linux.dev" <iommu@lists.linux.dev>,
Lu Baolu <baolu.lu@linux.intel.com>,
Eric Auger <eric.auger@redhat.com>,
Lixiao Yang <lixiao.yang@intel.com>,
Matthew Rosato <mjrosato@linux.ibm.com>,
Nicolin Chen <nicolinc@nvidia.com>,
"patches@lists.linux.dev" <patches@lists.linux.dev>,
"syzbot+7574ebfe589049630608@syzkaller.appspotmail.com"
<syzbot+7574ebfe589049630608@syzkaller.appspotmail.com>,
"syzbot+d31adfb277377ef8fcba@syzkaller.appspotmail.com"
<syzbot+d31adfb277377ef8fcba@syzkaller.appspotmail.com>,
"Liu, Yi L" <yi.l.liu@intel.com>
Subject: Re: [PATCH rc v2 2/2] iommufd: Do not UAF during iommufd_put_object()
Date: Fri, 24 Nov 2023 08:50:06 -0400 [thread overview]
Message-ID: <20231124125006.GC436702@nvidia.com> (raw)
In-Reply-To: <BN9PR11MB5276FE18B63A22DB64EBD8098CB8A@BN9PR11MB5276.namprd11.prod.outlook.com>
On Fri, Nov 24, 2023 at 06:48:59AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, November 22, 2023 9:13 PM
> >
> > static inline bool iommufd_lock_obj(struct iommufd_object *obj)
> > {
> > - if (!down_read_trylock(&obj->destroy_rwsem))
> > + if (!refcount_inc_not_zero(&obj->users))
> > return false;
> > - if (!refcount_inc_not_zero(&obj->users)) {
> > - up_read(&obj->destroy_rwsem);
> > + if (!refcount_inc_not_zero(&obj->shortterm_users)) {
> > + /*
> > + * If the caller doesn't already have a ref on obj this must be
> > + * called under the xa_lock. Otherwise the caller is holding a
> > + * ref on users so this cannot make it zero.
> > + */
> > + refcount_dec(&obj->users);
> > return false;
>
> I'm not sure what the comment is explaining. This path can be reached
> only if the earlier refcount_inc_not_zero(&obj->users) succeeds and
> in that case certainly refcount_dec(&obj->users) cannot make it
> zero.
It is a refcount so a parallel thread could have decremented it
leaving us with 1 and then refcount_dec(1) will WARN_ON.
That can't happen because of the reasoning in the comment.
/*
* If the caller doesn't already have a ref on obj this must be
* called under the xa_lock. Otherwise the caller is holding a
* ref on users. Thus it cannot be one before this decrement.
*/
> > +/*
> > + * The caller holds a users refcount and wants to destroy the object. In all
> > + * cases the caller no longer has a users reference on obj. At this point
> > + * the xarray will be holding a shortterm_users reference.
>
> iiuc shortterm_users will be held only if the destroy operation fails
> otherwise it's meaningless given the object is already freed. It's clearer
> to replace "at this point" with "if any error"
"At this point" was not intended to describe the error case but the
entry condition..
/*
* The caller holds a users refcount and wants to destroy the object. At this
* point the caller has no shortterm_users reference and at least the xarray
* will be holding one.
*/
> > + */
> > static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx,
> > struct iommufd_object *obj)
> > {
> > - __iommufd_object_destroy_user(ictx, obj, false);
> > + int ret;
> > +
> > + ret = iommufd_object_remove(ictx, obj, obj->id,
> > REMOVE_WAIT_SHORTTERM);
> > +
> > + /*
> > + * If there is a bug and we couldn't destroy the object then we did put
> > + * back the caller's refcount and will eventually try to free it again
> > + * during close.
> > + */
>
> this is still inconsistent with last comment which says "the caller no longer
> has a users reference" but here "we did put back the caller's refcount"
"put back" means to have effectively done iommufd_put_object()
> > + WARN_ON(ret);
> > }
> > +
> > +/*
> > + * Used by autodomains to clean up the hwpt object if it was not used
> > manually.
> > + */
> > static inline void iommufd_object_deref_user(struct iommufd_ctx *ictx,
> > struct iommufd_object *obj)
> > {
> > - __iommufd_object_destroy_user(ictx, obj, true);
> > + iommufd_object_remove(ictx, obj, obj->id, 0);
>
> I forgot why autodomains can tolerate failure to not wait here. Can you
> take this chance to add some comment to clarify?
/*
* The HWPT allocated by autodomains is used in possibly many devices and
* is automatically destroyed when its refcount reaches zero.
*
* If userspace uses the HWPT manually, even for a short term, then it will
* disrupt this refcounting and the auto-free in the kernel will not work.
* Userspace that tries to use the automatically allocated HWPT must be careful
* to ensure that it is consistently destroyed, eg by not racing accesses
* and by not attaching an automatic HWPT to a device manually.
*/
static inline void
iommufd_object_put_and_try_destroy(struct iommufd_ctx *ictx,
struct iommufd_object *obj)
> > +int iommufd_object_remove(struct iommufd_ctx *ictx,
> > + struct iommufd_object *to_destroy, u32 id,
> > + unsigned int flags)
> > {
> > struct iommufd_object *obj;
> > XA_STATE(xas, &ictx->objects, id);
> > + bool zerod_shortterm = false;
> > + int ret;
> > +
> > + /*
> > + * The purpose of the shortterm_users is to ensure deterministic
> > + * destruction of objects used by external drivers and destroyed by
> > this
> > + * function. Any temporary increment of the refcount must
> > increment
> > + * shortterm_users, such as during ioctl execution.
> > + */
> > + if (flags & REMOVE_WAIT_SHORTTERM) {
> > + ret = iommufd_object_dec_wait_shortterm(ictx, to_destroy);
>
> missed a check that to_destroy must be valid when this flag bit is set
It is an internal function I don't think we need to check the
parameters for consistency..
> > + if (ret) {
> > + /*
> > + * We have a bug. Put back the callers reference and
> > + * defer cleaning this object until close.
> > + */
> > + refcount_dec(&to_destroy->users);
>
> explain why refcount_dec means 'put back'? also 'put back' is
> inconsistent with the earlier comment "In all cases the caller no
> longer has a users reference "
As above, "put back" means to have effectively done iommufd_put_object()
Jason
next prev parent reply other threads:[~2023-11-24 12:50 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-22 13:13 [PATCH rc v2 0/2] Do not UAF during iommufd_put_object() Jason Gunthorpe
2023-11-22 13:13 ` [PATCH rc v2 1/2] iommufd: Add iommufd_ctx to iommufd_put_object() Jason Gunthorpe
2023-11-22 13:13 ` [PATCH rc v2 2/2] iommufd: Do not UAF during iommufd_put_object() Jason Gunthorpe
2023-11-24 6:48 ` Tian, Kevin
2023-11-24 12:50 ` Jason Gunthorpe [this message]
2023-11-28 7:59 ` Tian, Kevin
2023-11-29 1:03 ` Jason Gunthorpe
2023-11-30 0:37 ` [PATCH rc v2 0/2] " Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231124125006.GC436702@nvidia.com \
--to=jgg@nvidia.com \
--cc=baolu.lu@linux.intel.com \
--cc=eric.auger@redhat.com \
--cc=iommu@lists.linux.dev \
--cc=kevin.tian@intel.com \
--cc=lixiao.yang@intel.com \
--cc=mjrosato@linux.ibm.com \
--cc=nicolinc@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=syzbot+7574ebfe589049630608@syzkaller.appspotmail.com \
--cc=syzbot+d31adfb277377ef8fcba@syzkaller.appspotmail.com \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox