From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Hou Tao <houtao1@huawei.com>, Yufen Yu <yuyufen@huawei.com>,
axboe@kernel.dk, linux-block@vger.kernel.org,
ming.lei@redhat.com, hch@lst.de, zhengchuan@huawei.com,
yi.zhang@huawei.com, rcu@vger.kernel.org
Subject: Re: [PATCH] block: make sure last_lookup set as NULL after part deleted
Date: Fri, 3 Jan 2020 18:45:21 -0500 [thread overview]
Message-ID: <20200103234521.GG189259@google.com> (raw)
In-Reply-To: <20191231231158.GW13449@paulmck-ThinkPad-P72>
On Tue, Dec 31, 2019 at 03:11:58PM -0800, Paul E. McKenney wrote:
> On Tue, Dec 31, 2019 at 10:55:47PM +0800, Hou Tao wrote:
> > Hi,
> >
> > On 2019/12/31 19:09, Yufen Yu wrote:
> > > When delete partition executes concurrently with IOs issue,
> > > it may cause use-after-free on part in disk_map_sector_rcu()
> > > as following:
> > snip
> >
> > >
> > > diff --git a/block/genhd.c b/block/genhd.c
> > > index ff6268970ddc..39fa8999905f 100644
> > > --- a/block/genhd.c
> > > +++ b/block/genhd.c
> > > @@ -293,7 +293,23 @@ struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, sector_t sector)
> > > part = rcu_dereference(ptbl->part[i]);
> > >
> > > if (part && sector_in_part(part, sector)) {
> > snip
> >
> > > rcu_assign_pointer(ptbl->last_lookup, part);
> > > + part = rcu_dereference(ptbl->part[i]);
> > > + if (part == NULL) {
> > > + rcu_assign_pointer(ptbl->last_lookup, NULL);
> > > + break;
> > > + }
> > > return part;
> > > }
> > > }
> >
> > Not ensure whether the re-read can handle the following case or not:
> >
> > process A process B process C
> >
> > disk_map_sector_rcu(): delete_partition(): disk_map_sector_rcu():
> >
> > rcu_read_lock
> >
> > // need to iterate partition table
> > part[i] != NULL (1) part[i] = NULL (2)
> > smp_mb()
> > last_lookup = NULL (3)
> > call_rcu() (4)
> > last_lookup = part[i] (5)
> >
> >
> > rcu_read_lock()
> > read last_lookup return part[i] (6)
> > sector_in_part() is OK (7)
> > return part[i] (8)
>
> Just for the record...
>
> Use of RCU needs to ensure that readers cannot access the to-be-freed
> structure -before- invoking call_rcu(). Which does look to happen here
> with the "last_lookup = NULL". But in addition, the callback needs to
> get access to the to-be-freed structure via some sideband (usually the
> structure passed to call_rcu()), not from the reader-accessible structure.
>
> Or am I misinterpreting this sequence of events?
If I understand correctly, the issue described above is there are 2 threads
setting last_lookup pointer simultaneously, one of them is NULLing it and
waiting for a GP before freeing it (process B above), while the other is
assigning to it concurrently after it was just NULLed (process A). Meanwhile
process C starts a reader section *after* the GP by process B already started
and accesses the reassigned pointer causing use-after-free.
Did I miss something?
I believe the fix is what Tao already posted which is to use refcounts so
that the destructor does not free it while references are already held. Is
that what the final fix is going to be? That other thread is pretty long so I
lost track a bit..
thanks,
- Joel
> Thanx, Paul
>
> > part[i] == NULL (9)
> > last_lookup = NULL (10)
> > rcu_read_unlock() (11)
> > one RCU grace period completes
> > __delete_partition() (12)
> > free hd_partition (13)
> > // use-after-free
> > hd_struct_try_get(part[i]) (14)
> >
> > * the number in the parenthesis is the sequence of events.
> >
> > Maybe RCU experts can shed some light on this problem, so cc +paulmck@kernel.org, +joel@joelfernandes.org and +RCU maillist.
> >
> > If the above case is possible, maybe we can fix the problem by pinning last_lookup through increasing its ref-count
> > (the following patch is only compile tested):
> >
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 6e8543ca6912..179e0056fae1 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -279,7 +279,14 @@ struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, sector_t sector)
> > part = rcu_dereference(ptbl->part[i]);
> >
> > if (part && sector_in_part(part, sector)) {
> > - rcu_assign_pointer(ptbl->last_lookup, part);
> > + struct hd_struct *old;
> > +
> > + if (!hd_struct_try_get(part))
> > + break;
> > +
> > + old = xchg(&ptbl->last_lookup, part);
> > + if (old)
> > + hd_struct_put(old);
> > return part;
> > }
> > }
> > @@ -1231,7 +1238,11 @@ static void disk_replace_part_tbl(struct gendisk *disk,
> > rcu_assign_pointer(disk->part_tbl, new_ptbl);
> >
> > if (old_ptbl) {
> > - rcu_assign_pointer(old_ptbl->last_lookup, NULL);
> > + struct hd_struct *part;
> > +
> > + part = xchg(&old_ptbl->last_lookup, NULL);
> > + if (part)
> > + hd_struct_put(part);
> > kfree_rcu(old_ptbl, rcu_head);
> > }
> > }
> > diff --git a/block/partition-generic.c b/block/partition-generic.c
> > index 98d60a59b843..441c1c591c04 100644
> > --- a/block/partition-generic.c
> > +++ b/block/partition-generic.c
> > @@ -285,7 +285,8 @@ void delete_partition(struct gendisk *disk, int partno)
> > return;
> >
> > rcu_assign_pointer(ptbl->part[partno], NULL);
> > - rcu_assign_pointer(ptbl->last_lookup, NULL);
> > + if (cmpxchg(&ptbl->last_lookup, part, NULL) == part)
> > + hd_struct_put(part);
> > kobject_put(part->holder_dir);
> > device_del(part_to_dev(part));
> >
> > --
> > 2.22.0
> >
> > Regards,
> > Tao
> >
> >
> > > diff --git a/block/partition-generic.c b/block/partition-generic.c
> > > index 1d20c9cf213f..1e0065ed6f02 100644
> > > --- a/block/partition-generic.c
> > > +++ b/block/partition-generic.c
> > > @@ -284,6 +284,13 @@ void delete_partition(struct gendisk *disk, int partno)
> > > return;
> > >
> > > rcu_assign_pointer(ptbl->part[partno], NULL);
> > > + /*
> > > + * Without the memory barrier, disk_map_sector_rcu()
> > > + * may read the old value after overwriting the
> > > + * last_lookup. Then it can not clear last_lookup,
> > > + * which may cause use-after-free.
> > > + */
> > > + smp_mb();
> > > rcu_assign_pointer(ptbl->last_lookup, NULL);
> > > kobject_put(part->holder_dir);
> > > device_del(part_to_dev(part));
> > >
> >
next prev parent reply other threads:[~2020-01-03 23:45 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-31 11:09 [PATCH] block: make sure last_lookup set as NULL after part deleted Yufen Yu
2019-12-31 14:55 ` Hou Tao
2019-12-31 23:11 ` Paul E. McKenney
2020-01-01 2:33 ` htbegin
2020-01-01 3:39 ` htbegin
2020-01-03 23:45 ` Joel Fernandes [this message]
2020-01-04 9:16 ` Hou Tao
2020-01-02 1:23 ` Ming Lei
2020-01-03 3:06 ` Hou Tao
2020-01-03 4:18 ` Ming Lei
2020-01-03 7:35 ` Hou Tao
2020-01-03 8:17 ` Ming Lei
2020-01-03 12:03 ` Yufen Yu
2020-01-03 15:16 ` Ming Lei
2020-01-06 7:39 ` Yufen Yu
2020-01-06 8:11 ` Ming Lei
2020-01-06 9:41 ` Hou Tao
2020-01-06 10:05 ` Ming Lei
2020-01-07 11:40 ` Hou Tao
2020-01-08 3:19 ` Ming Lei
2020-01-03 12:43 ` Yufen Yu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200103234521.GG189259@google.com \
--to=joel@joelfernandes.org \
--cc=axboe@kernel.dk \
--cc=hch@lst.de \
--cc=houtao1@huawei.com \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
--cc=paulmck@kernel.org \
--cc=rcu@vger.kernel.org \
--cc=yi.zhang@huawei.com \
--cc=yuyufen@huawei.com \
--cc=zhengchuan@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).