Flexible I/O Tester development
* Bug in fio: infinite loop when using two volumes crafted from one  MD?
@ 2009-09-11 15:59 Alan D. Brunelle
  2009-09-11 16:35 ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Alan D. Brunelle @ 2009-09-11 15:59 UTC (permalink / raw)
  To: fio; +Cc: Jens Axboe

I have a somewhat complex (but practical) situation I'm trying to
measure (looking at Goyal's io-controller patches).

o  24-disk MD RAID10 set (/dev/md0)

o  12 linear LV volumes crafted from /dev/md0

o  Ext3 FS created on each LV volume

o  32GiB test file created on each FS

# du -s -h /mnt/lv[01]/data.bin
33G     /mnt/lv0/data.bin
33G     /mnt/lv1/data.bin

When I execute the following job file (only using 2 of the 12 files/FS):

[global]
rw=rw
rwmixread=80
randrepeat=1
size=32g
direct=0
ioengine=libaio
iodepth=32
iodepth_low=32
iodepth_batch=32
iodepth_batch_complete=6
overwrite=0
bs=4k
runtime=30

[lv0]
filename=/mnt/lv0/data.bin

[test]
filename=/mnt/lv1/data.bin

the I/O portion of the run completes, but whilst attempting to display
the Disk stats it hangs whilst outputting:

...
Run status group 0 (all jobs):
   READ: io=3,602MB, aggrb=120MB/s, minb=61,754KB/s, maxb=64,135KB/s, mint=30002msec, maxt=30005msec
  WRITE: io=903MB, aggrb=30,813KB/s, minb=15,537KB/s, maxb=16,016KB/s, mint=30002msec, maxt=30005msec

Disk stats (read/write):

<<<hangs...>>>

Breaking in (via gdb) yields:
(gdb) where
#0  0x0000000000428bb4 in aggregate_slaves_stats (masterdu=0x7f21bbf511e8) at diskutil.c:458
#1  0x000000000042905c in show_disk_util () at diskutil.c:528
#2  0x00000000004121bb in show_run_stats () at stat.c:663
#3  0x000000000040acad in main (argc=2, argv=0x7fff6b83bba8) at fio.c:1654

setting a break at:

454                     ios[0] += dus->ios[0];

and using 'cont' & "print *slavedu" yields:

(gdb) print *slavedu
$3 = {list = {next = 0x28000156a7, prev = 0x1571500006e19}, slavelist =
{
    next = 0x7f21bbf513f8, prev = 0x7f21bbf513f8},
  name = 0x74b4 <Address 0x74b4 out of bounds>,
  sysfs_root = 0x4aaa6cce <Address 0x4aaa6cce out of bounds>,
  path = "\030�\r\000\000\000\000\000\000\020i�!\177\000\000\000\000\000
\000\000\000\000\000�Z�Z\000\000\000\000\024\000\000\000ᆳ�dm-1\000\000
\000\000�Z�Z", '\0' <repeats 12 times>, "�\001\000\000ᆳ��G�!\177\000
\000�G�!\177\000\000�G�!\177\000\000�G�!\177\000\000�I�!\177\000\000�W
\203k�\177\000\000/sys/block/dm-1/slaves/../../md0/stat", '\0' <repeats
98 times>, major = 0, minor = 0,
  dus = {ios = {0, 0}, merges = {0, 0}, sectors = {0, 0}, ticks = {0,
0},
    io_ticks = 0, time_in_queue = 0}, last_dus = {ios = {0, 0}, merges =
{0,
      0}, sectors = {0, 0}, ticks = {0, 0}, io_ticks = 0, time_in_queue
= 0},
  slaves = {next = 0x0, prev = 0x0}, msec = 9, time = {tv_sec = 0,
    tv_usec = 0}, lock = 0x0, users = 0}
(gdb) cont
Continuing.

Breakpoint 1, aggregate_slaves_stats (masterdu=0x7f21bbf511e8)
    at diskutil.c:454
454                     ios[0] += dus->ios[0];
(gdb) print *slavedu
$4 = {list = {next = 0x7f21bbf515e8, prev = 0x7f21bbf511e8}, slavelist =
{
    next = 0x7f21bbf54780, prev = 0x7f21bbf54780},
  name = 0x7f21bbf515c8 "md0",
  sysfs_root = 0x7fff6b8357f0 "/sys/block/dm-1/slaves/../../md0",
  path = "/sys/block/dm-0/slaves/../../md0/stat", '\0' <repeats 218
times>,
  major = 9, minor = 0, dus = {ios = {0, 0}, merges = {0, 0}, sectors =
{0,
      0}, ticks = {0, 0}, io_ticks = 0, time_in_queue = 0}, last_dus =
{ios = {
      567924, 118793}, merges = {0, 0}, sectors = {71929094, 950344},
ticks = {
      0, 0}, io_ticks = 0, time_in_queue = 0}, slaves = {
    next = 0x7f21bbf515f8, prev = 0x7f21bbf543f8}, msec = 0, time = {
    tv_sec = 1252682928, tv_usec = 935566}, lock = 0x7f21bc973000, users
= 0}
(gdb) cont
Continuing.

Breakpoint 1, aggregate_slaves_stats (masterdu=0x7f21bbf511e8)
    at diskutil.c:454
454                     ios[0] += dus->ios[0];
(gdb) print *slavedu
$5 = {list = {next = 0x28000156a7, prev = 0x1571500006e19}, slavelist =
{
    next = 0x7f21bbf513f8, prev = 0x7f21bbf513f8},
  name = 0x74b4 <Address 0x74b4 out of bounds>,
  sysfs_root = 0x4aaa6cce <Address 0x4aaa6cce out of bounds>,
  path = "\030�\r\000\000\000\000\000\000\020i�!\177\000\000\000\000\000
\000\000\000\000\000�Z�Z\000\000\000\000\024\000\000\000ᆳ�dm-1\000\000
\000\000�Z�Z", '\0' <repeats 12 times>, "�\001\000\000ᆳ��G�!\177\000
\000�G�!\177\000\000�G�!\177\000\000�G�!\177\000\000�I�!\177\000\000�W
\203k�\177\000\000/sys/block/dm-1/slaves/../../md0/stat", '\0' <repeats
98 times>, major = 0, minor = 0,
  dus = {ios = {0, 0}, merges = {0, 0}, sectors = {0, 0}, ticks = {0,
0},
    io_ticks = 0, time_in_queue = 0}, last_dus = {ios = {0, 0}, merges =
{0,
      0}, sectors = {0, 0}, ticks = {0, 0}, io_ticks = 0, time_in_queue
= 0},
  slaves = {next = 0x0, prev = 0x0}, msec = 9, time = {tv_sec = 0,
    tv_usec = 0}, lock = 0x0, users = 0}

and then it seems to be bouncing between these two "things".

Now using a totally separate disk & FS & data file:

# du -s -h /mnt/lv0/data.bin /mnt/test/data.bin
33G     /mnt/lv0/data.bin
33G     /mnt/test/data.bin

(/mnt/test is *not* constructed from the MD device)

and changing the job file to look like:

[test]
filename=/mnt/test/data.bin

(that is, dropping the file on /dev/vg/lv1 from the job)

It runs to completion correctly.

It seems to me that there may be some error in the logic dealing with
finding the underlying devices for different mount points/files that come
to the same underlying device (/dev/md0, in this case)?
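
For reference, the sysfs paths in the gdb output above
(/sys/block/dm-1/slaves/../../md0/stat) suggest the underlying devices
are discovered through each device's "slaves" directory. Below is a
minimal sketch of walking one such directory. It is not fio's code:
list_slaves(), SLAVE_NAME_MAX and the parameter names are hypothetical,
chosen only for illustration.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

#define SLAVE_NAME_MAX 64

/*
 * Hypothetical helper (not from fio): copy the names of the entries in a
 * slaves directory, e.g. "/sys/block/dm-0/slaves", into names[].
 * Returns the number of names stored, or -1 if the directory can't be
 * opened.
 */
static int list_slaves(const char *slaves_dir,
		       char names[][SLAVE_NAME_MAX], int max_names)
{
	DIR *dir;
	struct dirent *ent;
	int count = 0;

	assert(slaves_dir != NULL && names != NULL);

	dir = opendir(slaves_dir);
	if (!dir)
		return -1;

	while (count < max_names && (ent = readdir(dir)) != NULL) {
		size_t len;

		/* Skip the "." and ".." entries */
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;

		/* Copy the name, truncating if it doesn't fit */
		len = strlen(ent->d_name);
		if (len >= SLAVE_NAME_MAX)
			len = SLAVE_NAME_MAX - 1;
		memcpy(names[count], ent->d_name, len);
		names[count][len] = '\0';
		count++;
	}

	closedir(dir);
	return count;
}
```

With the setup described here, calling this on the slaves directory of
both dm-0 and dm-1 would presumably return "md0" each time, which is the
"same underlying device reached via two paths" case.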

Alan







* Re: Bug in fio: infinite loop when using two volumes crafted from  one  MD?
  2009-09-11 15:59 Bug in fio: infinite loop when using two volumes crafted from one MD? Alan D. Brunelle
@ 2009-09-11 16:35 ` Jens Axboe
  2009-09-11 19:03   ` Alan D. Brunelle
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2009-09-11 16:35 UTC (permalink / raw)
  To: Alan D. Brunelle; +Cc: fio

On Fri, Sep 11 2009, Alan D. Brunelle wrote:
> It seems to me that there may be some error in the logic dealing with
> finding the underlying devices for different mount points/files that come
> to the same underlying device (/dev/md0, in this case)?

Fio does indeed have code to find the underlying devices for stat
purposes, and it sure does sound like there's a bug in there. If you have
time to poke at it and find out why, that would be great :-)

If not, I'll try and take a look.

-- 
Jens Axboe



* Re: Bug in fio: infinite loop when using two volumes crafted from  one  MD?
  2009-09-11 16:35 ` Jens Axboe
@ 2009-09-11 19:03   ` Alan D. Brunelle
  2009-09-11 19:34     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Alan D. Brunelle @ 2009-09-11 19:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

[-- Attachment #1: Type: text/plain, Size: 323 bytes --]

Well, the attached patch appears to fix the problem: When looking to add
an underlying device we first check to see if it's in the list of current
devices. If so, we skip this add. 
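
The idea of "look it up by major/minor first, and skip the add if it's
already there" can be sketched in isolation as follows. This is a
simplified stand-in, not fio's data structures: struct dev_table,
dev_lookup() and dev_add_once() are hypothetical names, and a fixed-size
array replaces whatever list fio really keeps.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 64

/* Hypothetical stand-in for per-device bookkeeping */
struct dev_entry {
	int major;
	int minor;
};

struct dev_table {
	struct dev_entry devs[MAX_DEVS];
	size_t count;
};

/* Return the entry matching major:minor, or NULL if not yet tracked */
static struct dev_entry *dev_lookup(struct dev_table *t, int major, int minor)
{
	size_t i;

	assert(t != NULL);

	for (i = 0; i < t->count; i++)
		if (t->devs[i].major == major && t->devs[i].minor == minor)
			return &t->devs[i];
	return NULL;
}

/*
 * Add major:minor only if it has not been seen before. Returns 1 if
 * added, 0 if it was already present (skipped), -1 if the table is full.
 */
static int dev_add_once(struct dev_table *t, int major, int minor)
{
	if (dev_lookup(t, major, minor))
		return 0;
	if (t->count == MAX_DEVS)
		return -1;

	t->devs[t->count].major = major;
	t->devs[t->count].minor = minor;
	t->count++;
	return 1;
}
```

In this sketch, reaching md0 (major 9, minor 0) a second time through
another DM volume makes dev_add_once() return 0 and leaves the table
unchanged.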

As I know next to nothing about the innards of fio, I'd be a little
leery of taking this in, but at least it allows me to proceed! :-)

Alan

[-- Attachment #2: 0001-Bug-fix-handles-disk-device-used-multiple-times.patch --]
[-- Type: text/x-patch, Size: 1169 bytes --]

From 199a61fe53340c3ebd93567d22e16a95cca3aaeb Mon Sep 17 00:00:00 2001
From: Alan D. Brunelle <alan.brunelle@hp.com>
Date: Fri, 11 Sep 2009 14:57:10 -0400
Subject: [PATCH] Bug fix: handles disk device used multiple times

There were issues in having the same underlying device being referenced
multiple times (via different paths) when reporting storage I/O
statistics. As an example: having two (or more) LVM2/DM volumes crafted
out of the same MD array.

This patch simply skips over any devices previously seen.

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
---
 diskutil.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/diskutil.c b/diskutil.c
index cb15882..62149d9 100644
--- a/diskutil.c
+++ b/diskutil.c
@@ -221,6 +221,13 @@ static void find_add_disk_slaves(struct thread_data *td, char *path,
 			return;
 		}
 
+		/*
+		 * See if this maj,min already exists
+		 */
+		slavedu = disk_util_exists(majdev, mindev);
+		if (slavedu)
+			continue;
+
 		sprintf(temppath, "%s/%s", slavesdir, slavepath);
 		__init_per_file_disk_util(td, majdev, mindev, temppath);
 		slavedu = disk_util_exists(majdev, mindev);
-- 
1.6.0.4



* Re: Bug in fio: infinite loop when using two volumes crafted from  one  MD?
  2009-09-11 19:03   ` Alan D. Brunelle
@ 2009-09-11 19:34     ` Jens Axboe
  2009-09-11 20:16       ` Alan D. Brunelle
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2009-09-11 19:34 UTC (permalink / raw)
  To: Alan D. Brunelle; +Cc: fio

On Fri, Sep 11 2009, Alan D. Brunelle wrote:
> Well, the attached patch appears to fix the problem: When looking to add
> an underlying device we first check to see if it's in the list of current
> devices. If so, we skip this add. 
> 
> As I know next to nothing about the innards of fio, I'd be a little
> leery of taking this in, but at least it allows me to proceed! :-)

OK, so it's probably corrupting the list on a double insert. Your check
is likely fine, even if there are alternative ways we could fix this.
I'll add it for now, thanks for debugging this, Alan!
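
To illustrate how a double insert can turn into an endless walk, here is
a minimal sketch using a kernel-style circular doubly linked list. It is
an assumption about the failure mode, not fio's actual list code: the
names list_node, list_init(), list_add_tail() and list_walk_bounded() are
made up for this example. Adding the same node to a second list rewrites
its next/prev pointers, so a walk that starts from the first list's head
never gets back to that head.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the Linux kernel's */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

/* Insert node just before head, i.e. at the tail of head's list */
static void list_add_tail(struct list_node *node, struct list_node *head)
{
	assert(node != NULL && head != NULL);

	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/*
 * Walk from head until we come back to it. Returns the number of nodes
 * visited, or -1 if we give up after max_steps because the walk does
 * not return to head.
 */
static int list_walk_bounded(const struct list_node *head, int max_steps)
{
	const struct list_node *pos = head->next;
	int steps = 0;

	while (pos != head) {
		if (++steps > max_steps)
			return -1;
		pos = pos->next;
	}
	return steps;
}
```

If one node is added to list A and then to list B without being removed
from A, walking A alternates between that node and B's head forever.
That would be consistent with the two alternating slavedu values in the
gdb output, one of them being a list head misread as a disk_util entry.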

-- 
Jens Axboe



* Re: Bug in fio: infinite loop when using two volumes crafted from  one  MD?
  2009-09-11 19:34     ` Jens Axboe
@ 2009-09-11 20:16       ` Alan D. Brunelle
  2009-09-11 20:23         ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Alan D. Brunelle @ 2009-09-11 20:16 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Fri, 2009-09-11 at 21:34 +0200, Jens Axboe wrote:
> OK, so it's probably corrupting the list on a double insert. Your check
> is likely fine, even if there are alternative ways we could fix this.
> I'll add it for now, thanks for debugging this, Alan!
> 

It worked fine w/ my full test (all 12 LVM2/DM volumes being accessed on
the same underlying MD RAID10 array). So that's goodness...

Alan




* Re: Bug in fio: infinite loop when using two volumes crafted from  one  MD?
  2009-09-11 20:16       ` Alan D. Brunelle
@ 2009-09-11 20:23         ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-09-11 20:23 UTC (permalink / raw)
  To: Alan D. Brunelle; +Cc: fio

On Fri, Sep 11 2009, Alan D. Brunelle wrote:
> It worked fine w/ my full test (all 12 LVM2/DM volumes being accessed on
> the same underlying MD RAID10 array). So that's goodness...

Great, thanks for re-testing!

-- 
Jens Axboe



* Re: Bug in fio: infinite loop when using two volumes crafted from  one MD?
@ 2009-10-23  3:12 Glen Ogilvie
  2009-10-23  4:23 ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Glen Ogilvie @ 2009-10-23  3:12 UTC (permalink / raw)
  To: fio

Hi,

Which release is this bug fixed in?  We experienced this bug as well.

Also, we have a problem where running fio on a DRBD device seems to cause
a complete system lockup sometimes.   Any ideas?

Regards
-- 
Glen Ogilvie
Open Systems Specialists
Level 1, 162 Grafton Road
http://www.oss.co.nz/

Ph: +64 9 984 3000
Mobile: +64 21 684 146
GPG Key: ACED9C17


* Re: Bug in fio: infinite loop when using two volumes crafted from  one MD?
  2009-10-23  3:12 Glen Ogilvie
@ 2009-10-23  4:23 ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-10-23  4:23 UTC (permalink / raw)
  To: Glen Ogilvie; +Cc: fio

On Fri, Oct 23 2009, Glen Ogilvie wrote:
> Hi,
> 
> Which release is this bug fixed in?  We experienced this bug as well.

Should be fixed in the 1.34 release and later.

> Also, we have a problem where running fio on a DRBD device seems to cause
> a complete system lockup sometimes.   Any ideas?

That would be a kernel bug; fio should not be able to lock up the system
in any way. So you'd probably want to look into that separately.

-- 
Jens Axboe


