* linear : divide error: 0000
@ 2008-11-02 16:03 Wei Yongquan
  2008-11-03 10:32 ` Andre Noll
  0 siblings, 1 reply; 8+ messages in thread
From: Wei Yongquan @ 2008-11-02 16:03 UTC (permalink / raw)
  To: linux-raid

Hi,
    When I create a linear device from two small disks (less than 576K),
a segmentation fault occurs. The kernel prints the following messages:
md: bind<hda>
md: bind<hdb>
divide error: 0000 [#1]
Modules linked in:

Pid: 251, comm: mdadm.static Not tainted (2.6.27.4 #2)
EIP: 0060:[<c02ee9d0>] EFLAGS: 00000202 CPU: 0
EIP is at linear_conf+0x1e0/0x2f0
EAX: 00000381 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000001 EDI: c6d571e0 EBP: 00000001 ESP: c6d83c64
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process mdadm.static (pid: 251, ti=c6d82000 task=c781ca30 task.ti=c6d82000)
Stack: c02f1273 00000380 00000002 c6d5a400 c6d5a410 c032c7b0 c032c7b0 00000002
       00d5bc6c c6d5a400 ffffffff c6d5a484 c6d5a400 c02eebc2 c035d0c0 c02f670d
       c0336ed0 ffffffff c6d81cf0 c7482ec8 00000000 00000001 c6d5a410 c6d5a200
Call Trace:
 [<c02f1273>] mddev_put+0x13/0x70
 [<c02eebc2>] linear_run+0x22/0x80
 [<c02f670d>] do_md_run+0x3bd/0x960
 [<c0276ed9>] bd_claim_by_disk+0x1a9/0x280
 [<c02f31f9>] bind_rdev_to_array+0x1d9/0x270
 [<c023497c>] handle_level_irq+0x6c/0xa0
 [<c0204f79>] do_IRQ+0x39/0x70
 [<c02a3680>] cfq_merged_request+0x0/0x60
 [<c02f8f00>] md_ioctl+0x1530/0x1a10
 [<c0203193>] common_interrupt+0x23/0x28
 [<c02e2910>] task_no_data_intr+0x0/0x80
 [<c02de2aa>] ide_execute_command+0x4a/0x60
 [<c02e2485>] do_rw_taskfile+0x1a5/0x250
 [<c02deb72>] ide_wait_stat+0x42/0x90
 [<c02991f2>] elv_queue_empty+0x22/0x30
 [<c02dc53a>] ide_do_request+0x9a/0xa00
 [<c029ae58>] submit_bio+0x48/0xd0
 [<c0299447>] elv_drain_elevator+0x17/0x60
 [<c029ba51>] __generic_unplug_device+0x11/0x40
 [<c02f79d0>] md_ioctl+0x0/0x1a10
 [<c029edad>] blkdev_driver_ioctl+0x5d/0x60
 [<c029f051>] blkdev_ioctl+0x2a1/0x820
 [<c029af7b>] __freed_request+0x9b/0xa0
 [<c029afa4>] freed_request+0x24/0x50
 [<c02e1c58>] ide_raw_taskfile+0x78/0x90
 [<c02e1c88>] ide_no_data_taskfile+0x18/0x20
 [<c02e505f>] ide_cacheflush_p+0x5f/0x90
 [<c02e55de>] ide_disk_put+0x2e/0x50
 [<c02e5693>] idedisk_release+0x43/0xb0
 [<c0276ab5>] block_ioctl+0x15/0x20
 [<c0276aa0>] block_ioctl

I found this code at line 200 of linear.c:
            round = sector_div(sz, base);
Here base can be ZERO when the devices are too small.  So I think this is
either a bug or I did something wrong; please tell me which.  Thanks a lot.
I did the test with qemu-0.90, kernel 2.6.27.4 and a statically linked mdadm 2.6.7.


* Re: linear : divide error: 0000
  2008-11-02 16:03 linear : divide error: 0000 Wei Yongquan
@ 2008-11-03 10:32 ` Andre Noll
  2008-11-03 12:37   ` Wei Yongquan
  2008-11-03 18:28   ` Andre Noll
  0 siblings, 2 replies; 8+ messages in thread
From: Andre Noll @ 2008-11-03 10:32 UTC (permalink / raw)
  To: Wei Yongquan; +Cc: linux-raid


On 00:03, Wei Yongquan wrote:
> Hi
>     When I create a linear device with two small disk(less then 576K),
> a segmentation fault occurs. follow message:
> md: bind<hda>
> md: bind<hdb>
> divide error: 0000 [#1]
> Modules linked in:
> 
> Pid: 251, comm: mdadm.static Not tainted (2.6.27.4 #2)
> EIP: 0060:[<c02ee9d0>] EFLAGS: 00000202 CPU: 0
> EIP is at linear_conf+0x1e0/0x2f0
> EAX: 00000381 EBX: 00000000 ECX: 00000000 EDX: 00000000
> ESI: 00000001 EDI: c6d571e0 EBP: 00000001 ESP: c6d83c64
>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process mdadm.static (pid: 251, ti=c6d82000 task=c781ca30 task.ti=c6d82000)
> Stack: c02f1273 00000380 00000002 c6d5a400 c6d5a410 c032c7b0 c032c7b0 00000002
>        00d5bc6c c6d5a400 ffffffff c6d5a484 c6d5a400 c02eebc2 c035d0c0 c02f670d
>        c0336ed0 ffffffff c6d81cf0 c7482ec8 00000000 00000001 c6d5a410 c6d5a200
> Call Trace:
>  [<c02f1273>] mddev_put+0x13/0x70
>  [<c02eebc2>] linear_run+0x22/0x80
>  [<c02f670d>] do_md_run+0x3bd/0x960
>  [<c0276ed9>] bd_claim_by_disk+0x1a9/0x280
>  [<c02f31f9>] bind_rdev_to_array+0x1d9/0x270
>  [<c023497c>] handle_level_irq+0x6c/0xa0
>  [<c0204f79>] do_IRQ+0x39/0x70
>  [<c02a3680>] cfq_merged_request+0x0/0x60
>  [<c02f8f00>] md_ioctl+0x1530/0x1a10
>  [<c0203193>] common_interrupt+0x23/0x28
>  [<c02e2910>] task_no_data_intr+0x0/0x80
>  [<c02de2aa>] ide_execute_command+0x4a/0x60
>  [<c02e2485>] do_rw_taskfile+0x1a5/0x250
>  [<c02deb72>] ide_wait_stat+0x42/0x90
>  [<c02991f2>] elv_queue_empty+0x22/0x30
>  [<c02dc53a>] ide_do_request+0x9a/0xa00
>  [<c029ae58>] submit_bio+0x48/0xd0
>  [<c0299447>] elv_drain_elevator+0x17/0x60
>  [<c029ba51>] __generic_unplug_device+0x11/0x40
>  [<c02f79d0>] md_ioctl+0x0/0x1a10
>  [<c029edad>] blkdev_driver_ioctl+0x5d/0x60
>  [<c029f051>] blkdev_ioctl+0x2a1/0x820
>  [<c029af7b>] __freed_request+0x9b/0xa0
>  [<c029afa4>] freed_request+0x24/0x50
>  [<c02e1c58>] ide_raw_taskfile+0x78/0x90
>  [<c02e1c88>] ide_no_data_taskfile+0x18/0x20
>  [<c02e505f>] ide_cacheflush_p+0x5f/0x90
>  [<c02e55de>] ide_disk_put+0x2e/0x50
>  [<c02e5693>] idedisk_release+0x43/0xb0
>  [<c0276ab5>] block_ioctl+0x15/0x20
>  [<c0276aa0>] block_ioctl
> 
> I found the code in linear.c: +200
>             round = sector_div(sz, base);

This sector_div() is in linear_conf() while the call trace indicates
the bug happens in mddev_put(). I'll have a deeper look later today.

Is this a recent regression?

Thanks for the report
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: linear : divide error: 0000
  2008-11-03 10:32 ` Andre Noll
@ 2008-11-03 12:37   ` Wei Yongquan
  2008-11-03 14:15     ` Andre Noll
  2008-11-03 18:28   ` Andre Noll
  1 sibling, 1 reply; 8+ messages in thread
From: Wei Yongquan @ 2008-11-03 12:37 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-raid

> This sector_div() is in linear_conf() while the call trace indicates
> the bug happens in mddev_put(). I'll have a deeper look later today.
>
> Is this a recent regression?

Sorry, I don't know.  2.6.24.4 has the same problem.

The code in linear.c:

158              min_spacing = conf->array_sectors / 2;
159              sector_div(min_spacing, PAGE_SIZE/sizeof(struct dev_info *));

If the array size is less than 1024K and the system is 32-bit,
min_spacing will be ZERO.

174               if (sz >= min_spacing && sz < conf->hash_spacing)
175                              conf->hash_spacing = sz;

When min_spacing is ZERO, hash_spacing becomes ZERO, too.

199               base = conf->hash_spacing >> conf->preshift;
200               round = sector_div(sz, base);

Here base is ZERO, which causes the divide error.

A RAID array smaller than 1M is really unusual, so just refusing the request would be OK.
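
The chain described above can be sketched in user space. This is only a
rough model under stated assumptions: the loop around lines 174-175 and the
initial value of hash_spacing are reconstructed rather than quoted in this
thread, device sizes are taken in KiB, preshift is assumed to be 0 for
arrays this small, and the names sector_div_model() and
hash_spacing_model() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's sector_div(): divides *n in place
 * and returns the remainder. The caller must pass a non-zero base. */
static uint32_t sector_div_model(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/*
 * Hypothetical model of the hash spacing selection. dev_kib[] holds the
 * component sizes in KiB; slots stands for
 * PAGE_SIZE / sizeof(struct dev_info *) and must be non-zero.
 * Returns the value that would later be used as "base".
 */
static uint64_t hash_spacing_model(const uint64_t *dev_kib, size_t cnt,
				   uint32_t slots)
{
	uint64_t array_kib = 0, min_spacing, hash_spacing;
	size_t i, j;

	for (i = 0; i < cnt; i++)
		array_kib += dev_kib[i];

	min_spacing = array_kib;               /* line 158: array_sectors / 2 */
	sector_div_model(&min_spacing, slots); /* line 159 */

	hash_spacing = array_kib;              /* assumed initial value */
	for (i = 0; i + 1 < cnt; i++) {
		uint64_t sz = 0;

		/* With min_spacing == 0 this loop never runs, so sz stays 0. */
		for (j = i; j + 1 < cnt && sz < min_spacing; j++)
			sz += dev_kib[j];
		if (sz >= min_spacing && sz < hash_spacing) /* lines 174-175 */
			hash_spacing = sz;
	}
	return hash_spacing;
}
```

In this model, two 500 KiB devices with slots = 1024 give a hash spacing
of 0, so the later sector_div(sz, base) would divide by zero, while two
600 KiB devices give 600.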

Correct me if I got something wrong.  Thanks a lot.


* Re: linear : divide error: 0000
  2008-11-03 12:37   ` Wei Yongquan
@ 2008-11-03 14:15     ` Andre Noll
  2008-11-03 14:53       ` Wei Yongquan
  0 siblings, 1 reply; 8+ messages in thread
From: Andre Noll @ 2008-11-03 14:15 UTC (permalink / raw)
  To: Wei Yongquan; +Cc: linux-raid


On 20:37, Wei Yongquan wrote:
> > This sector_div() is in linear_conf() while the call trace indicates
> > the bug happens in mddev_put(). I'll have a deeper look later today.
> >
> > Is this a recent regression?
> 
> Sorry, I don't know.  2.6.24.4 has the some problem.

That's good to know because there has been a bit of churn in the
linear code recently. But this happened way after 2.6.24.4 and is
therefore not the source of the problem.

> the code linear.c :
> 
> 158              min_spacing = conf->array_sectors / 2;
> 159              sector_div(min_spacing, PAGE_SIZE/sizeof(struct dev_info *));
> 
> if the array size is less then 1024K and the system is 32 based,
> min_spacing will be ZERO.

nope, array_sectors in in 512 byte units. So the array size has to
be smaller than 1024 _bytes_ for min_spacing to become zero.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: linear : divide error: 0000
  2008-11-03 14:15     ` Andre Noll
@ 2008-11-03 14:53       ` Wei Yongquan
  2008-11-03 16:43         ` Andre Noll
  0 siblings, 1 reply; 8+ messages in thread
From: Wei Yongquan @ 2008-11-03 14:53 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-raid

>> the code linear.c :
>>
>> 158              min_spacing = conf->array_sectors / 2;
>> 159              sector_div(min_spacing, PAGE_SIZE/sizeof(struct dev_info *));
>>
>> if the array size is less then 1024K and the system is 32 based,
>> min_spacing will be ZERO.
>
> nope, array_sectors in in 512 byte units. So the array size has to
> be smaller than 1024 _bytes_ for min_spacing to become zero.
Thanks.
Er, "array_sectors in in 512 byte units" means that when array_sectors =
2048, the array size is actually 2048 * 512 bytes = 1024K. Is that right?
When min_spacing < 1024 at line 158, it becomes zero at line 159. Is that right?


* Re: linear : divide error: 0000
  2008-11-03 14:53       ` Wei Yongquan
@ 2008-11-03 16:43         ` Andre Noll
  0 siblings, 0 replies; 8+ messages in thread
From: Andre Noll @ 2008-11-03 16:43 UTC (permalink / raw)
  To: Wei Yongquan; +Cc: linux-raid


On 22:53, Wei Yongquan wrote:
> >> the code linear.c :
> >>
> >> 158              min_spacing = conf->array_sectors / 2;
> >> 159              sector_div(min_spacing, PAGE_SIZE/sizeof(struct dev_info *));
> >>
> >> if the array size is less then 1024K and the system is 32 based,
> >> min_spacing will be ZERO.
> >
> > nope, array_sectors in in 512 byte units. So the array size has to

s/in in/ is in

> > be smaller than 1024 _bytes_ for min_spacing to become zero.
> Thanks.
> er,  "array_sectors in in 512 byte units" means  when array_sectors =
> 2048, the array size is actually 2048 * 512 = 1024K. Is that right?

Exactly.

> when min_spacing < 1024 at line 158, it becomes zero at line 159. is that right?

Yes, that's true, provided that PAGE_SIZE is 4096 and sizeof(struct
dev_info *) is 4.
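
For reference, the arithmetic behind this can be written out as a small
user-space sketch. The helper names are made up for illustration, and the
8-byte pointer case is an extrapolation that was not discussed in this
thread.

```c
#include <stdint.h>

/* Hypothetical helper: min_spacing as computed by the quoted lines
 * 158-159, for an array_sectors value given in 512-byte units. */
static uint64_t min_spacing_for(uint64_t array_sectors, uint32_t page_size,
				uint32_t ptr_size)
{
	return (array_sectors / 2) / (page_size / ptr_size);
}

/* Smallest array size in bytes for which min_spacing is non-zero. */
static uint64_t smallest_nonzero_array_bytes(uint32_t page_size,
					     uint32_t ptr_size)
{
	uint64_t slots = page_size / ptr_size; /* pointers per page */

	/* min_spacing >= 1 needs array_sectors / 2 >= slots,
	 * i.e. at least 2 * slots sectors of 512 bytes each. */
	return slots * 2 * 512;
}
```

With PAGE_SIZE 4096 and 4-byte pointers the threshold is 1048576 bytes
(1024K): 2048 sectors give min_spacing 1, while 2047 sectors give 0. With
8-byte pointers the same formula gives 524288 bytes.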

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: linear : divide error: 0000
  2008-11-03 10:32 ` Andre Noll
  2008-11-03 12:37   ` Wei Yongquan
@ 2008-11-03 18:28   ` Andre Noll
  2008-11-06  6:44     ` Neil Brown
  1 sibling, 1 reply; 8+ messages in thread
From: Andre Noll @ 2008-11-03 18:28 UTC (permalink / raw)
  To: Wei Yongquan; +Cc: linux-raid


On 11:32, Andre Noll wrote:
> >     When I create a linear device with two small disk(less then 576K),
> > a segmentation fault occurs. follow message:

> > md: bind<hda>
> > md: bind<hdb>
> > divide error: 0000 [#1]
> > Modules linked in:
> > 
> > Pid: 251, comm: mdadm.static Not tainted (2.6.27.4 #2)
> > EIP: 0060:[<c02ee9d0>] EFLAGS: 00000202 CPU: 0
> > EIP is at linear_conf+0x1e0/0x2f0
> > EAX: 00000381 EBX: 00000000 ECX: 00000000 EDX: 00000000
> > ESI: 00000001 EDI: c6d571e0 EBP: 00000001 ESP: c6d83c64
> >  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> > Process mdadm.static (pid: 251, ti=c6d82000 task=c781ca30 task.ti=c6d82000)
> > Stack: c02f1273 00000380 00000002 c6d5a400 c6d5a410 c032c7b0 c032c7b0 00000002
> >        00d5bc6c c6d5a400 ffffffff c6d5a484 c6d5a400 c02eebc2 c035d0c0 c02f670d
> >        c0336ed0 ffffffff c6d81cf0 c7482ec8 00000000 00000001 c6d5a410 c6d5a200
> > Call Trace:
> >  [<c02f1273>] mddev_put+0x13/0x70
> >  [<c02eebc2>] linear_run+0x22/0x80
> >  [<c02f670d>] do_md_run+0x3bd/0x960
> >  [<c0276ed9>] bd_claim_by_disk+0x1a9/0x280
> >  [<c02f31f9>] bind_rdev_to_array+0x1d9/0x270
> >  [<c023497c>] handle_level_irq+0x6c/0xa0
> >  [<c0204f79>] do_IRQ+0x39/0x70
> >  [<c02a3680>] cfq_merged_request+0x0/0x60
> >  [<c02f8f00>] md_ioctl+0x1530/0x1a10
> >  [<c0203193>] common_interrupt+0x23/0x28
> >  [<c02e2910>] task_no_data_intr+0x0/0x80
> >  [<c02de2aa>] ide_execute_command+0x4a/0x60
> >  [<c02e2485>] do_rw_taskfile+0x1a5/0x250
> >  [<c02deb72>] ide_wait_stat+0x42/0x90
> >  [<c02991f2>] elv_queue_empty+0x22/0x30
> >  [<c02dc53a>] ide_do_request+0x9a/0xa00
> >  [<c029ae58>] submit_bio+0x48/0xd0
> >  [<c0299447>] elv_drain_elevator+0x17/0x60
> >  [<c029ba51>] __generic_unplug_device+0x11/0x40
> >  [<c02f79d0>] md_ioctl+0x0/0x1a10
> >  [<c029edad>] blkdev_driver_ioctl+0x5d/0x60
> >  [<c029f051>] blkdev_ioctl+0x2a1/0x820
> >  [<c029af7b>] __freed_request+0x9b/0xa0
> >  [<c029afa4>] freed_request+0x24/0x50
> >  [<c02e1c58>] ide_raw_taskfile+0x78/0x90
> >  [<c02e1c88>] ide_no_data_taskfile+0x18/0x20
> >  [<c02e505f>] ide_cacheflush_p+0x5f/0x90
> >  [<c02e55de>] ide_disk_put+0x2e/0x50
> >  [<c02e5693>] idedisk_release+0x43/0xb0
> >  [<c0276ab5>] block_ioctl+0x15/0x20
> >  [<c0276aa0>] block_ioctl
> > 
> > I found the code in linear.c: +200
> >             round = sector_div(sz, base);
> 
> This sector_div() is in linear_conf() while the call trace indicates
> the bug happens in mddev_put(). I'll have a deeper look later today.

Indeed, this sector_div() is responsible for the bug. The following
patch (against Neil's for-next tree) should fix this. Please give it a
try.

Andre

commit 284b3e2a37c8790d0e9315aed03d594165a85455
Author: Andre Noll <maan@systemlinux.org>
Date:   Mon Nov 3 19:39:22 2008 +0100

    md: linear: Fix a division by zero bug for very small arrays.
    
    We currently oops with a divide error on starting a linear software
    raid array consisting of at least two very small (< 500K) devices.
    
    The bug is caused by the calculation of the hash table size which
    tries to compute sector_div(sz, base) with "base" being zero due to
    the small size of the component devices of the array.
    
    Fix this by requiring the hash spacing to be at least one which
    implies that also "base" is non-zero.
    
    Signed-off-by: Andre Noll <maan@systemlinux.org>

diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index 190147c..3b90c5c 100644
--- a/drivers/md/linear.c
+++ b/drivers/md/linear.c
@@ -148,6 +148,8 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
 
 	min_sectors = conf->array_sectors;
 	sector_div(min_sectors, PAGE_SIZE/sizeof(struct dev_info *));
+	if (min_sectors == 0)
+		min_sectors = 1;
 
 	/* min_sectors is the minimum spacing that will fit the hash
 	 * table in one PAGE.  This may be much smaller than needed.
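
The effect of the two added lines can be illustrated outside the kernel
with a minimal sketch. min_sectors_fixed() is a hypothetical name and
plain 64-bit division stands in for sector_div(); per the commit message,
a minimum spacing of at least one is what keeps the hash spacing, and
therefore "base", non-zero.

```c
#include <stdint.h>

/* Hypothetical user-space model of the patched computation:
 * min_sectors = array_sectors / (PAGE_SIZE / sizeof(pointer)),
 * clamped to at least 1 as in the patch. */
static uint64_t min_sectors_fixed(uint64_t array_sectors, uint32_t page_size,
				  uint32_t ptr_size)
{
	uint64_t min_sectors = array_sectors / (page_size / ptr_size);

	if (min_sectors == 0)
		min_sectors = 1;
	return min_sectors;
}
```

For a 1000-sector array with PAGE_SIZE 4096 and 4-byte pointers the
unclamped result would be 0; the model returns 1. Larger arrays are
unaffected, e.g. 4096 sectors still give 4.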
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: linear : divide error: 0000
  2008-11-03 18:28   ` Andre Noll
@ 2008-11-06  6:44     ` Neil Brown
  0 siblings, 0 replies; 8+ messages in thread
From: Neil Brown @ 2008-11-06  6:44 UTC (permalink / raw)
  To: Andre Noll; +Cc: Wei Yongquan, linux-raid

On Monday November 3, maan@systemlinux.org wrote:
> > This sector_div() is in linear_conf() while the call trace indicates
> > the bug happens in mddev_put(). I'll have a deeper look later today.
> 
> Indeed, this sector_div() is responsible for the bug. The following
> patch (against Neil's for-next tree) should fix this. Please give it a
> try.

Thanks Andre.  This is definitely a bug, since 2.6.14!!

I'll queue that patch for mainline and -stable.

Thanks,
NeilBrown

