From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.kernel.org ([198.145.29.99]:54332 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750911AbdE3R1i (ORCPT ); Tue, 30 May 2017 13:27:38 -0400
Date: Tue, 30 May 2017 10:27:35 -0700
From: Shaohua Li
To: Ben Hutchings
Cc: Dennis Yang, NeilBrown, Shaohua Li, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Greg Kroah-Hartman
Subject: Re: [PATCH 4.4 018/103] md: update slab_cache before releasing new stripes when stripes resizing
Message-ID: <20170530172735.phicnjvr6ruo7grr@kernel.org>
References: <20170523200856.903752266@linuxfoundation.org> <20170523200858.992214045@linuxfoundation.org> <1496150213.2083.55.camel@codethink.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1496150213.2083.55.camel@codethink.co.uk>
Sender: stable-owner@vger.kernel.org
List-ID:

On Tue, May 30, 2017 at 02:16:53PM +0100, Ben Hutchings wrote:
> On Tue, 2017-05-23 at 22:08 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Dennis Yang
> >
> > commit 583da48e388f472e8818d9bb60ef6a1d40ee9f9d upstream.
> >
> > When growing raid5 device on machine with small memory, there is chance that
> > mdadm will be killed and the following bug report can be observed. The same
> > bug could also be reproduced in linux-4.10.6.
> [...]
> > The problem is that resize_stripes() releases new stripe_heads before assigning new
> > slab cache to conf->slab_cache. If the shrinker function raid5_cache_scan() gets called
> > after resize_stripes() starting releasing new stripes but right before new slab cache
> > being assigned, it is possible that these new stripe_heads will be freed with the old
> > slab_cache which was already been destoryed and that triggers this bug.
> [...]
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -2232,6 +2232,10 @@ static int resize_stripes(struct r5conf
> >  		err = -ENOMEM;
> >
> >  	mutex_unlock(&conf->cache_size_mutex);
> > +
> > +	conf->slab_cache = sc;
> > +	conf->active_name = 1-conf->active_name;
> > +
> >  	/* Step 4, return new stripes to service */
> >  	while(!list_empty(&newstripes)) {
> >  		nsh = list_entry(newstripes.next, struct stripe_head, lru);
> [...]
>
> The assignments are still being done after conf->cache_size_mutex is
> unlocked, so there still seems to be a race with raid5_cache_scan().
> Shouldn't they be moved above the mutex_unlock()?

Unnecessary. raid5_cache_scan() can't free a stripe back to slab_cache
until that stripe has been released with raid5_release_stripe().

Thanks,
Shaohua
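
To illustrate the ordering argument above, here is a minimal user-space
sketch. It is not kernel code: struct cache, struct stripe, struct conf,
release_stripe(), cache_scan() and resize_with_scan() are hypothetical
stand-ins, and a "concurrent" shrinker run is modelled as a plain function
call at the interesting points. It assumes only what is stated in this
thread: the shrinker can free a stripe only after it has been released, and
it frees into whatever conf->slab_cache points at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real md/raid5 types. */
struct cache { int destroyed; };

struct stripe {
	struct stripe *next;
	struct cache *owner;		/* cache the stripe was allocated from */
};

struct conf {
	struct cache *slab_cache;	/* cache the shrinker frees into */
	struct stripe *inactive;	/* only list the shrinker can see */
};

/* Model of releasing a stripe: it becomes visible to the shrinker. */
static void release_stripe(struct conf *c, struct stripe *sh)
{
	sh->next = c->inactive;
	c->inactive = sh;
}

/*
 * Model of the shrinker: "frees" every visible stripe into c->slab_cache.
 * Returns how many frees went to a wrong or already destroyed cache.
 */
static int cache_scan(struct conf *c)
{
	int bad = 0;

	while (c->inactive) {
		struct stripe *sh = c->inactive;

		c->inactive = sh->next;
		if (sh->owner != c->slab_cache || c->slab_cache->destroyed)
			bad++;
	}
	return bad;
}

/* Resize with a shrinker run injected before, during and after release. */
static int resize_with_scan(int assign_before_release)
{
	struct cache old_cache = { 1 };	/* already destroyed at this point */
	struct cache new_cache = { 0 };
	struct stripe fresh[2] = { { NULL, &new_cache }, { NULL, &new_cache } };
	struct conf c = { &old_cache, NULL };
	int bad = 0;

	/* After the unlock, before anything is released: nothing to free. */
	bad += cache_scan(&c);

	if (assign_before_release)
		c.slab_cache = &new_cache;	/* ordering from the patch */

	release_stripe(&c, &fresh[0]);
	bad += cache_scan(&c);			/* shrinker runs mid-release */
	release_stripe(&c, &fresh[1]);

	if (!assign_before_release)
		c.slab_cache = &new_cache;	/* pre-patch ordering */

	bad += cache_scan(&c);
	return bad;
}
```

In this model the scan that runs before the first release has nothing to
free whichever way the assignment is ordered, so only the order of the
assignment relative to the release decides whether a stripe goes to the
wrong cache. A caller can check that with
assert(resize_with_scan(1) == 0) and assert(resize_with_scan(0) == 1).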