All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Luís Henriques" <lhenriques@suse.de>
To: Xiubo Li <xiubli@redhat.com>
Cc: idryomov@gmail.com, ceph-devel@vger.kernel.org,
	jlayton@kernel.org, vshankar@redhat.com, mchangir@redhat.com,
	stable@vger.kernel.org
Subject: Re: [PATCH v3] ceph: fix potential use-after-free bug when trimming caps
Date: Wed, 19 Apr 2023 14:22:50 +0100	[thread overview]
Message-ID: <87r0sgt39x.fsf@brahms.olymp> (raw)
In-Reply-To: <008bc17d-d5fb-107e-4405-ebacc1568890@redhat.com> (Xiubo Li's message of "Wed, 19 Apr 2023 10:28:11 +0800")

Xiubo Li <xiubli@redhat.com> writes:

> On 4/18/23 22:20, Luís Henriques wrote:
>> xiubli@redhat.com writes:
>>
>>> From: Xiubo Li <xiubli@redhat.com>
>>>
>>> When trimming the caps and just after the 'session->s_cap_lock' is
>>> released in ceph_iterate_session_caps() the cap maybe removed by
>>> another thread, and when using the stale cap memory in the callbacks
>>> it will trigger use-after-free crash.
>>>
>>> We need to check the existence of the cap just after the 'ci->i_ceph_lock'
>>> being acquired. And do nothing if it's already removed.
>> Your patch seems to be OK, but I'll be honest: the locking is *so* complex
>> that I can say for sure it really solves any problem :-(
>>
>> ceph_put_cap() uses mdsc->caps_list_lock to protect the list, but I can't
>> be sure that holding ci->i_ceph_lock will protect a race in the case
>> you're trying to solve.
>
> The 'mdsc->caps_list_lock' will protect the members in mdsc:
>
>         /*
>          * Cap reservations
>          *
>          * Maintain a global pool of preallocated struct ceph_caps, referenced
>          * by struct ceph_caps_reservations.  This ensures that we preallocate
>          * memory needed to successfully process an MDS response. (If an MDS
>          * sends us cap information and we fail to process it, we will have
>          * problems due to the client and MDS being out of sync.)
>          *
>          * Reservations are 'owned' by a ceph_cap_reservation context.
>          */
>         spinlock_t      caps_list_lock;
>         struct          list_head caps_list; /* unused (reserved or
>                                                 unreserved) */
>         struct          list_head cap_wait_list;
>         int             caps_total_count;    /* total caps allocated */
>         int             caps_use_count;      /* in use */
>         int             caps_use_max;        /* max used caps */
>         int             caps_reserve_count;  /* unused, reserved */
>         int             caps_avail_count;    /* unused, unreserved */
>         int             caps_min_count;      /* keep at least this many
>
> Not protecting the cap list in session or inode.
>
>
> And the racy is between the session's cap list and inode's cap rbtree and both
> are holding the same 'cap' reference.
>
> So in 'ceph_iterate_session_caps()' after getting the 'cap' and releasing the
> 'session->s_cap_lock', just before passing the 'cap' to _cb() another thread
> could continue and release the 'cap'. Then the 'cap' should be stale now and
> after being passed to _cb() the 'cap' when dereferencing it will crash the
> kernel.
>
> And if the 'cap' is stale, it shouldn't exist in the inode's cap rbtree. Please
> note the lock order will be:
>
> 1, spin_lock(&ci->i_ceph_lock)
>
> 2, spin_lock(&session->s_cap_lock)
>
>
> Before:
>
> ThreadA: ThreadB:
>
> __ceph_remove_caps() -->
>
>     spin_lock(&ci->i_ceph_lock)
>
>     ceph_remove_cap() --> ceph_iterate_session_caps() -->
>
>         __ceph_remove_cap() --> spin_lock(&session->s_cap_lock);
>
> cap = list_entry(p, struct ceph_cap, session_caps);
>
> spin_unlock(&session->s_cap_lock);
>
>             spin_lock(&session->s_cap_lock);
>
>             // remove it from the session's cap list
>
>             list_del_init(&cap->session_caps);
>
>             spin_unlock(&session->s_cap_lock);
>
>             ceph_put_cap()
>
> trim_caps_cb('cap') -->   // the _cb() could be deferred after ThreadA finished
> 'ceph_put_cap()'
>
> spin_unlock(&ci->i_ceph_lock) dreference cap->xxx will trigger crash
>
>
>
> With this patch:
>
> ThreadA: ThreadB:
>
> __ceph_remove_caps() -->
>
>     spin_lock(&ci->i_ceph_lock)
>
>     ceph_remove_cap() --> ceph_iterate_session_caps() -->
>
>         __ceph_remove_cap() --> spin_lock(&session->s_cap_lock);
>
> cap = list_entry(p, struct ceph_cap, session_caps);
>
> ci_node = &cap->ci_node;
>
> spin_unlock(&session->s_cap_lock);
>
>             spin_lock(&session->s_cap_lock);
>
>             // remove it from the session's cap list
>
>             list_del_init(&cap->session_caps);
>
>             spin_unlock(&session->s_cap_lock);
>
>             ceph_put_cap()
>
> trim_caps_cb('ci_node') -->
>
> spin_unlock(&ci->i_ceph_lock)
>
> spin_lock(&ci->i_ceph_lock)
>
> cap = rb_entry(ci_node, struct ceph_cap, ci_node);    // This is buggy in this
> version, we should use the 'mds' instead and I will fix it.
>
> if (!cap)  { release the spin lock and return directly }
>
> spin_unlock(&ci->i_ceph_lock)

Thanks a lot for taking the time to explain all of this.  Much
appreciated.  It all seems to make sense, and, again, I don't have any
real objection to your patch.  It's just that I still find the whole
locking to be too complex, and every change that is made to it looks like
walking on a mine field :-)

> While we should switch to use the 'mds' of the cap instead of the 'ci_node',
> which is buggy. I will fix it in next version.

Yeah, I've took a quick look at v4 and it looks like it fixes this.

>> Is the issue in that bugzilla reproducible, or was that a one-time thing?
>
> No, I don't think so. Locally I have tried by turning the mds options to trigger
> the cap reclaiming more frequently, but still couldn't reproduce it. It should
> be very corner case.

Yeah, too bad.  It would help to gain some extra confidence on the patch.

Cheers,
-- 
Luís

  reply	other threads:[~2023-04-19 13:23 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-18  1:45 [PATCH v3] ceph: fix potential use-after-free bug when trimming caps xiubli
2023-04-18 14:20 ` Luís Henriques
2023-04-19  2:28   ` Xiubo Li
2023-04-19 13:22     ` Luís Henriques [this message]
2023-04-20  1:14       ` Xiubo Li
2023-04-30  8:39       ` Ilya Dryomov
2023-05-01 10:37         ` Luís Henriques

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r0sgt39x.fsf@brahms.olymp \
    --to=lhenriques@suse.de \
    --cc=ceph-devel@vger.kernel.org \
    --cc=idryomov@gmail.com \
    --cc=jlayton@kernel.org \
    --cc=mchangir@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=vshankar@redhat.com \
    --cc=xiubli@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.