* [PATCH] md linear: fix a race between linear_add() and linear_congested()
From: colyli @ 2017-01-25 11:15 UTC
To: linux-raid; +Cc: Coly Li, Shaohua Li, Neil Brown, stable

Recently I received a report that on a Linux v3.0 based kernel, hot-adding
a disk to an md linear device causes a kernel crash in linear_congested().
From the crash image analysis, I find that in linear_congested(),
mddev->raid_disks contains the value N, but conf->disks[] only has N-1
entries available. A dereference of a NULL pointer then crashes the
kernel.

There is a race between linear_add() and linear_congested(); the RCU
constructs used in these two functions cannot avoid the race. Since Linux
v4.0 the RCU code has been replaced by the introduction of
mddev_suspend(). After checking the upstream code, it seems
linear_congested() is not called in the generic_make_request() code path,
so mddev_suspend() cannot prevent it from being called. The race still
exists.

Here I explain how the race still exists in the current code. On a
machine with many CPUs, linear_add() is called on one CPU to add a hard
disk to an md linear device; at the same time, linear_congested() is
called on another CPU to detect whether this md linear device is
congested before issuing an I/O request to it.

A possible code execution time sequence demonstrates how the race
happens,

 seq   linear_add()              linear_congested()
  0                              conf=mddev->private
  1    oldconf=mddev->private
  2    mddev->raid_disks++
  3                              for (i=0; i<mddev->raid_disks;i++)
  4                              bdev_get_queue(conf->disks[i].rdev->bdev)
  5    mddev->private=newconf

In linear_add(), mddev->raid_disks is increased at time seq 2, and on
another CPU the for-loop in linear_congested() iterates over
conf->disks[i] using the increased mddev->raid_disks at time seq 3,4. But
conf, which has one more struct dev_info element in conf->disks[], is not
published yet; accessing its structure member at time seq 4 causes a NULL
pointer dereference fault.

The fix is to update mddev->private with the new value before increasing
mddev->raid_disks, and to make sure other CPUs observe the two updates in
the same order as linear_add() performs them (otherwise the race may
still happen); a smp_mb() between them is necessary for this.

One remaining question: with this fix, if mddev->private is updated to
the new value in linear_add() but the for-loop in linear_congested()
still reads the old value of mddev->raid_disks, the iteration will miss
the last element of conf->disks[]. This is acceptable, for the following
reasons,
- When mddev->private is updated, the md linear device is suspended and
  no I/O may happen, so it is safe to miss the congestion status of the
  last newly added hard disk.
- In the worst case, linear_congested() returns 0 and I/O is sent to this
  md linear device while the newly added hard disk is congested; the I/O
  request will then be blocked for a while if it happens to hit the newly
  added hard disk. linear_congested() is in the code path of
  wb_congested(), which is quite hot in the writeback code path. Compared
  to adding locking code in linear_congested(), the cost of this worst
  case is acceptable.

The bug was reported on a Linux v3.0 based kernel; the fix can and should
be applied to all kernels since Linux v3.0. linear_add() was merged into
mainline in Linux v2.6.18, so stable kernel maintainers of versions after
that one may consider picking this fix as well.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Neil Brown <neilb@suse.com>
Cc: stable@vger.kernel.org
---
 drivers/md/linear.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index 5975c99..48ccfad 100644
--- a/drivers/md/linear.c
+++ b/drivers/md/linear.c
@@ -196,10 +196,22 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 	if (!newconf)
 		return -ENOMEM;
 
+	/* In linear_congested(), mddev->raid_disks and mddev->private
+	 * are accessed without protection by mddev_suspend(). If on
+	 * another CPU linear_congested() still sees the old value of
+	 * mddev->private but the increased value of mddev->raid_disks,
+	 * the last iteration over conf->disks[i].rdev will trigger a
+	 * NULL pointer dereference. To avoid this race,
+	 * mddev->private must be updated before increasing
+	 * mddev->raid_disks, and a smp_mb() is required between them.
+	 * Then in linear_congested() we are sure the updated
+	 * mddev->private is seen when iterating over conf->disks[i].
+	 */
 	mddev_suspend(mddev);
 	oldconf = mddev->private;
-	mddev->raid_disks++;
 	mddev->private = newconf;
+	smp_mb();
+	mddev->raid_disks++;
 	md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
 	set_capacity(mddev->gendisk, mddev->array_sectors);
 	mddev_resume(mddev);
-- 
2.6.6
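For context, the reader side being raced against looked roughly like the
sketch below at the time. This is a paraphrase rather than an exact copy
of any particular tree; in particular, the backing_dev_info access
details changed across kernel versions around this era.

	static int linear_congested(struct mddev *mddev, int bits)
	{
		struct linear_conf *conf;
		int i, ret = 0;

		conf = mddev->private;	/* may still be the old conf ... */

		/* ... while mddev->raid_disks may already be the new,
		 * larger count: the final iteration then indexes past
		 * the end of the old conf->disks[] and dereferences a
		 * NULL rdev.
		 */
		for (i = 0; i < mddev->raid_disks && !ret; i++) {
			struct request_queue *q =
				bdev_get_queue(conf->disks[i].rdev->bdev);
			ret |= bdi_congested(&q->backing_dev_info, bits);
		}
		return ret;
	}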
* Re: [PATCH] md linear: fix a race between linear_add() and linear_congested()
From: Shaohua Li @ 2017-01-25 18:02 UTC
To: colyli; +Cc: linux-raid, Shaohua Li, Neil Brown, stable

On Wed, Jan 25, 2017 at 07:15:43PM +0800, colyli@suse.de wrote:
> [...]
>
> A possible code execution time sequence demonstrates how the race
> happens,
>
>  seq   linear_add()              linear_congested()
>   0                              conf=mddev->private
>   1    oldconf=mddev->private
>   2    mddev->raid_disks++
>   3                              for (i=0; i<mddev->raid_disks;i++)
>   4                              bdev_get_queue(conf->disks[i].rdev->bdev)
>   5    mddev->private=newconf

Good catch, this makes a lot of sense. However, this looks like an
incomplete fix. Step 0 will get the old conf, and after step 5
linear_add() will free the old conf. So it's possible linear_congested()
will use the freed old conf; I think this is even more likely to happen.
The easiest fix may be to take the RCU read lock in linear_congested()
and free the old conf in an RCU callback.

Thanks,
Shaohua
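What Shaohua describes is the standard RCU publish/reclaim pattern. A
sketch of how it could look here, assuming struct linear_conf regains a
struct rcu_head member (the free_conf() callback name matches the v2
patch quoted later in this thread; the bdi access details are paraphrased
as above):

	static void free_conf(struct rcu_head *head)
	{
		struct linear_conf *conf =
			container_of(head, struct linear_conf, rcu);

		kfree(conf);
	}

	/* Reader: pin conf for the whole loop so the writer cannot
	 * free it underneath us.
	 */
	static int linear_congested(struct mddev *mddev, int bits)
	{
		struct linear_conf *conf;
		int i, ret = 0;

		rcu_read_lock();
		conf = rcu_dereference(mddev->private);
		for (i = 0; i < mddev->raid_disks && !ret; i++) {
			struct request_queue *q =
				bdev_get_queue(conf->disks[i].rdev->bdev);
			ret |= bdi_congested(&q->backing_dev_info, bits);
		}
		rcu_read_unlock();
		return ret;
	}

	/* Writer (in linear_add()): publish the new conf, then defer
	 * freeing the old one until every pre-existing reader has left
	 * its RCU read-side critical section.
	 */
	rcu_assign_pointer(mddev->private, newconf);
	...
	call_rcu(&oldconf->rcu, free_conf);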
* Re: [PATCH] md linear: fix a race between linear_add() and linear_congested()
From: NeilBrown @ 2017-01-26 0:04 UTC
To: Shaohua Li, colyli; +Cc: linux-raid, Shaohua Li, stable

On Wed, Jan 25 2017, Shaohua Li wrote:
> On Wed, Jan 25, 2017 at 07:15:43PM +0800, colyli@suse.de wrote:
>> [...]
>>
>>  seq   linear_add()              linear_congested()
>>   0                              conf=mddev->private
>>   1    oldconf=mddev->private
>>   2    mddev->raid_disks++
>>   3                              for (i=0; i<mddev->raid_disks;i++)
>>   4                              bdev_get_queue(conf->disks[i].rdev->bdev)
>>   5    mddev->private=newconf
>
> Good catch, this makes a lot of sense. However, this looks like an
> incomplete fix. Step 0 will get the old conf, and after step 5
> linear_add() will free the old conf. So it's possible
> linear_congested() will use the freed old conf; I think this is even
> more likely to happen. The easiest fix may be to take the RCU read
> lock in linear_congested() and free the old conf in an RCU callback.

We used to use kfree_rcu() but removed it in

Commit: 3be260cc18f8 ("md/linear: remove rcu protections in favour
of suspend/resume")

when we changed to suspend/resume the device. That stops all IO, but
doesn't stop the ->congested call.

So we probably should re-introduce kfree_rcu() to free oldconf.

It might also be good to store a copy of raid_disks in linear_conf, like
we do with r5conf, to ensure we never use inconsistent ->raid_disks and
->disks[].

Thanks,
NeilBrown
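Storing the disk count inside the conf, as Neil suggests, also sidesteps
the barrier question entirely: the count and the array then travel inside
the same RCU-published object, so a reader that obtains conf via
rcu_dereference() can never observe them out of sync. A sketch of the
idea follows; the exact field layout here is an assumption, not the final
upstream change:

	struct linear_conf {
		struct rcu_head		rcu;
		sector_t		array_sectors;
		int			raid_disks; /* copy of
						     * mddev->raid_disks,
						     * consistent with
						     * disks[] below */
		struct dev_info		disks[0];
	};

	static int linear_congested(struct mddev *mddev, int bits)
	{
		struct linear_conf *conf;
		int i, ret = 0;

		rcu_read_lock();
		conf = rcu_dereference(mddev->private);

		/* No explicit read barrier needed: the loop bound is
		 * loaded through the same pointer as the array, a data
		 * dependency that rcu_dereference() already orders.
		 */
		for (i = 0; i < conf->raid_disks && !ret; i++) {
			struct request_queue *q =
				bdev_get_queue(conf->disks[i].rdev->bdev);
			ret |= bdi_congested(&q->backing_dev_info, bits);
		}

		rcu_read_unlock();
		return ret;
	}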
* Re: [PATCH] md linear: fix a race between linear_add() and linear_congested()
From: Coly Li @ 2017-01-27 17:45 UTC
To: NeilBrown; +Cc: Shaohua Li, linux-raid, Shaohua Li, stable

On 2017/1/26 8:04 AM, NeilBrown wrote:
> [...]
>
> We used to use kfree_rcu() but removed it in
>
> Commit: 3be260cc18f8 ("md/linear: remove rcu protections in favour
> of suspend/resume")
>
> when we changed to suspend/resume the device. That stops all IO, but
> doesn't stop the ->congested call.
>
> So we probably should re-introduce kfree_rcu() to free oldconf.
>
> It might also be good to store a copy of raid_disks in linear_conf,
> like we do with r5conf, to ensure we never use inconsistent
> ->raid_disks and ->disks[].

Hi Neil,

I just sent out the v2 patch, which adds the RCU stuff back. I tested it
on my small server, and it survives.

One thing I want to confirm here is the memory barrier in linear_add():

219         mddev_suspend(mddev);
220         oldconf = rcu_dereference(mddev->private);
221         rcu_assign_pointer(mddev->private, newconf);
222         smp_mb();
223         mddev->raid_disks++;
224         md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
225         set_capacity(mddev->gendisk, mddev->array_sectors);
226         mddev_resume(mddev);
227         revalidate_disk(mddev->gendisk);
228         call_rcu(&oldconf->rcu, free_conf);

At LINE 222 I add a smp_mb(); from Documentation/memory-barriers.txt, my
understanding is that here I need a smp_wmb() or smp_mb(). I see other
places all use smp_mb(), so I chose the stronger one -- smp_mb().
But Documentation/whatisRCU.txt says about rcu_assign_pointer():
"This function returns the new value, and also executes any
memory-barrier instructions required for a given CPU architecture."

So it seems the smp_mb() at LINE 222 is unnecessary. In the v2 patch I
keep the smp_mb() although I think it is unnecessary; I will remove it if
you or Shaohua can confirm it is unnecessary, as I suspect.

Another question: I tried to look at the r5conf code, but I still have no
idea how to store a copy of raid_disks in linear_conf the way r5conf
does. Could you please give me more hints?

Thanks.

Coly
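One subtlety worth noting about the question above: the memory barrier
implied by rcu_assign_pointer() is a store-release, which orders the
writes *before* it (the initialization of newconf) against the pointer
publication. It says nothing about writes that come *after* it, such as
the mddev->raid_disks++ at LINE 223, so the release alone arguably cannot
replace the smp_mb() at LINE 222. A commented sketch of the distinction,
offered as a reasoning aid rather than a claim about the final patch:

	/* Store-release: all writes above this line (filling in
	 * newconf) are guaranteed visible before the new pointer
	 * value. A release is a one-way barrier, though: it does NOT
	 * stop the store below from being reordered ahead of the
	 * pointer publication.
	 */
	rcu_assign_pointer(mddev->private, newconf);

	smp_mb();	/* keeps ->private strictly before ->raid_disks */

	mddev->raid_disks++;

A writer-side barrier also only helps if the reader orders its two loads
in turn; since linear_congested() loads mddev->private and
mddev->raid_disks independently, Neil's suggestion of reading the count
out of conf itself is the cleaner resolution.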
* Re: [PATCH] md linear: fix a race between linear_add() and linear_congested()
From: Coly Li @ 2017-01-27 17:32 UTC
To: Shaohua Li; +Cc: linux-raid, Neil Brown, stable

On 2017/1/26 2:02 AM, Shaohua Li wrote:
> [...]
>
> Good catch, this makes a lot of sense. However, this looks like an
> incomplete fix. Step 0 will get the old conf, and after step 5
> linear_add() will free the old conf. So it's possible
> linear_congested() will use the freed old conf; I think this is even
> more likely to happen. The easiest fix may be to take the RCU read
> lock in linear_congested() and free the old conf in an RCU callback.

Yes, RCU is still necessary here; I have just composed and sent out the
second version. Thanks for pointing this out :-)

Coly
end of thread, other threads:[~2017-01-27 17:45 UTC | newest]

Thread overview: 5+ messages
2017-01-25 11:15 [PATCH] md linear: fix a race between linear_add() and linear_congested() colyli
2017-01-25 18:02 ` Shaohua Li
2017-01-26  0:04   ` NeilBrown
2017-01-27 17:45     ` Coly Li
2017-01-27 17:32   ` Coly Li