linux-btrace.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* lots of unplugged by timer normal or abnormal?
@ 2009-10-06 18:53 Simon Kirby
  2009-10-06 19:50 ` Alan D. Brunelle
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Simon Kirby @ 2009-10-06 18:53 UTC (permalink / raw)
  To: linux-btrace

Hello,

I was just testing out blktrace/blkparse/btrace on a box that is hitting
some performance walls, and noticed that there are a fair amonut of "UT"
events (unplug by timer).  Is this normal/expected?  For example:

[sroot@nas04:/root]# btrace /dev/etherd/e7.0 | grep T
152,32   4     1379     3.582513494     0 UT   N [swapper] 8
152,32   1     1289     4.329760067     0 UT   N [swapper] 1
152,32   3     1132     4.362257911     0 UT   N [swapper] 3
152,32   7     1070     4.375258374     0 UT   N [swapper] 1
152,32   3     1146     4.382260172     0 UT   N [swapper] 5
152,32   1     1318     4.417759727  2830 UT   N [nfsd] 3
152,32   5     1003     4.498757105     0 UT   N [swapper] 1
152,32   1     1368     4.517761127     0 UT   N [swapper] 1
152,32   0     1133     4.585522116     0 UT   N [swapper] 5

This is on 2.6.30.5 with the 2.6.30.6 XFS patches.

Cheers,

Simon-

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
@ 2009-10-06 19:50 ` Alan D. Brunelle
  2009-10-06 20:08 ` Simon Kirby
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Alan D. Brunelle @ 2009-10-06 19:50 UTC (permalink / raw)
  To: linux-btrace

On Tue, 2009-10-06 at 11:53 -0700, Simon Kirby wrote:
> Hello,
> 
> I was just testing out blktrace/blkparse/btrace on a box that is hitting
> some performance walls, and noticed that there are a fair amonut of "UT"
> events (unplug by timer).  Is this normal/expected?  For example:
> 
> [sroot@nas04:/root]# btrace /dev/etherd/e7.0 | grep T
> 152,32   4     1379     3.582513494     0 UT   N [swapper] 8
> 152,32   1     1289     4.329760067     0 UT   N [swapper] 1
> 152,32   3     1132     4.362257911     0 UT   N [swapper] 3
> 152,32   7     1070     4.375258374     0 UT   N [swapper] 1
> 152,32   3     1146     4.382260172     0 UT   N [swapper] 5
> 152,32   1     1318     4.417759727  2830 UT   N [nfsd] 3
> 152,32   5     1003     4.498757105     0 UT   N [swapper] 1
> 152,32   1     1368     4.517761127     0 UT   N [swapper] 1
> 152,32   0     1133     4.585522116     0 UT   N [swapper] 5
> 
> This is on 2.6.30.5 with the 2.6.30.6 XFS patches.
> 
> Cheers,
> 
> Simon-

Hello Simon- 

This looks pretty bad - could you tell me what distro you are running on
as a base? And did this happen before 2.6.30.5 (the poor performance)?
And could you provide a more complete snippet of blktrace output
(showing a handful of complete ops)?

Thanks,
Alan D. Brunelle
Hewlett-Packard


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
  2009-10-06 19:50 ` Alan D. Brunelle
@ 2009-10-06 20:08 ` Simon Kirby
  2009-10-06 20:20 ` Alan D. Brunelle
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Simon Kirby @ 2009-10-06 20:08 UTC (permalink / raw)
  To: linux-btrace

On Tue, Oct 06, 2009 at 03:50:45PM -0400, Alan D. Brunelle wrote:

> This looks pretty bad - could you tell me what distro you are running on
> as a base? And did this happen before 2.6.30.5 (the poor performance)?
> And could you provide a more complete snippet of blktrace output
> (showing a handful of complete ops)?

Hi, Alan!

The poor performance has been ongoing for some time (these boxes were
built circa 2.6.26, and we hit a number of issues along the way -- 2.6.30
was the first stable kernel for serving files via nfsd, even with EXT3).

I'm figuring the majority of the performance issue are actually with the
AOE driver or the Coraid queue scheduling or RAID implementation, but I
figured that unplugging by timer maybe shouldn't happen even regardless
of how slow the underlying "device" is, hence my email.

It's amd64 Debian lenny (not sure why that matters), 16 GB of RAM, about
and a whole bunch of AOE storage, chopped up via DM, mostly using XFS as
a file system, served via knfsd.  "iostat -x -k 1" shows 100% utilization
fairly often for this particular device.

It's possible that this is just aggravated by memory allocator latency or
something, causing the unplug timer to trigger.  I was just wondering if
this is normal behaviour.

blktrace snippets seem pretty massive:

[sroot@nas04:/root/trace]# blktrace -w 10 /dev/etherd/e7.0
Device: /dev/etherd/e7.0
  CPU  0:                    0 events,      843 KiB data
  CPU  1:                    0 events,      709 KiB data
  CPU  2:                    0 events,      712 KiB data
  CPU  3:                    0 events,      752 KiB data
  CPU  4:                    0 events,      714 KiB data
  CPU  5:                    0 events,      713 KiB data
  CPU  6:                    0 events,      619 KiB data
  CPU  7:                    0 events,      732 KiB data
  Total:                     0 events (dropped 0),     5791 KiB data

I've saved that here:

http://0x.ca/sim/ref/2.6.30.5-hw-fixedxfs/nas04_blktrace_snippet.tgz

...which can be extracted and read with "blkparse etherd_e7.0".

Cheers,

Simon-

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
  2009-10-06 19:50 ` Alan D. Brunelle
  2009-10-06 20:08 ` Simon Kirby
@ 2009-10-06 20:20 ` Alan D. Brunelle
  2009-10-11 16:44 ` Simon Kirby
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Alan D. Brunelle @ 2009-10-06 20:20 UTC (permalink / raw)
  To: linux-btrace

[-- Attachment #1: Type: text/plain, Size: 1545 bytes --]

On Tue, 2009-10-06 at 13:08 -0700, Simon Kirby wrote:
> On Tue, Oct 06, 2009 at 03:50:45PM -0400, Alan D. Brunelle wrote:
> 
> > This looks pretty bad - could you tell me what distro you are running on
> > as a base? And did this happen before 2.6.30.5 (the poor performance)?
> > And could you provide a more complete snippet of blktrace output
> > (showing a handful of complete ops)?
> 
> Hi, Alan!
> 
> The poor performance has been ongoing for some time (these boxes were
> built circa 2.6.26, and we hit a number of issues along the way -- 2.6.30
> was the first stable kernel for serving files via nfsd, even with EXT3).
> 
> I'm figuring the majority of the performance issue are actually with the
> AOE driver or the Coraid queue scheduling or RAID implementation, but I
> figured that unplugging by timer maybe shouldn't happen even regardless
> of how slow the underlying "device" is, hence my email.
> 
> It's amd64 Debian lenny (not sure why that matters), 16 GB of RAM, about
> and a whole bunch of AOE storage, chopped up via DM, mostly using XFS as
> a file system, served via knfsd.  "iostat -x -k 1" shows 100% utilization
> fairly often for this particular device.


Coincidentally I was investigating something similar w/ RHEL 5.4 at this
very moment - just wanted to know if that was a common denominator.

Could you run the attached SystemTap script and send out the output?
[You may have to get `systemtap' though...]

I'll try to get some time to look at your traces - might not be until
tomorrow though...

Thanks,
Alan

[-- Attachment #2: bpd.stp --]
[-- Type: text/plain, Size: 420 bytes --]

#! /usr/bin/env stap

global n, plug_stacks

probe kernel.function("blk_plug_device") {
        if ((n ++ % 0xff) == 0) {
                plug_stacks[backtrace()] ++
        }
}

probe begin {
        printf("Collecting traces\n")
}

probe timer.sec(20) {
        foreach (stack in plug_stacks- limit 5) {
                printf("%d:\n", plug_stacks[stack])
                print_stack(stack)
        }
        exit()
}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
                   ` (2 preceding siblings ...)
  2009-10-06 20:20 ` Alan D. Brunelle
@ 2009-10-11 16:44 ` Simon Kirby
  2009-10-11 18:35 ` Jens Axboe
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Simon Kirby @ 2009-10-11 16:44 UTC (permalink / raw)
  To: linux-btrace

On Tue, Oct 06, 2009 at 04:20:32PM -0400, Alan D. Brunelle wrote:

> Coincidentally I was investigating something similar w/ RHEL 5.4 at this
> very moment - just wanted to know if that was a common denominator.
> 
> Could you run the attached SystemTap script and send out the output?
> [You may have to get `systemtap' though...]

Thanks for the excuse to try SystemTap. :)  I have some traces, but I
don't think the UT case is abnormal.  On my desktop at home, I can see
btraces such as this, which seem to show it happening often:

# btrace /dev/md0
  9,0    2        1     0.000000000 11679  P   N [flush-9:0]
  9,0    2        2     0.000006301 11679  Q   W 103873856 + 8 [flush-9:0]
  9,0    2        3     0.001946762     0 UT   N [swapper] 0
  9,0    2        4     0.001964494   290  U   N [kblockd/2] 0
  9,0    2        5     4.625939573 13087  Q   W 103899840 + 8 [firefox-bin]
  9,0    2        6     4.635987726 13087  P   N [firefox-bin]
  9,0    2        7     4.637612451 13087 UT   N [firefox-bin] 0
  9,0    2        8     4.637670985   290  U   N [kblockd/2] 0
  9,0    0        1     4.996817219 11679  Q   W 85739464 + 8 [flush-9:0]
  9,0    0        2     5.005772973 11679  P   N [flush-9:0]
  9,0    0        3     5.005789934 11679  Q   W 92127240 + 8 [flush-9:0]
  9,0    0        4     5.012468560     4 UT   N [ksoftirqd/0] 0
  9,0    0        5     5.012482690   288  U   N [kblockd/0] 0
  9,0    2        9     6.829559401 13087  Q   W 103907504 + 8 [firefox-bin]
  9,0    2       10     6.838558733 13087  P   N [firefox-bin]
  9,0    2       11     6.838560239 13087  Q   W 103907512 + 8 [firefox-bin]
  9,0    2       12     6.838561798 13087  Q   W 103907528 + 8 [firefox-bin]
  9,0    2       13     6.838563499 13087  Q   W 103907624 + 8 [firefox-bin]
  9,0    2       14     6.838564916 13087  Q   W 103907664 + 8 [firefox-bin]
  9,0    2       15     6.838566357 13087  Q   W 103907672 + 8 [firefox-bin]
  9,0    2       16     6.841928170     0 UT   N [swapper] 0
  9,0    2       17     6.841934150   290  U   N [kblockd/2] 0
^CCPU0 (9,0):
 Reads Queued:           0,        0KiB  Writes Queued:           2,        8KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            0,        0KiB
 Read depth:             0               Write depth:             0
 IO unplugs:             1               Timer unplugs:           1
CPU2 (9,0):
 Reads Queued:           0,        0KiB  Writes Queued:           8,       32KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            0,        0KiB
 Read depth:             0               Write depth:             0
 IO unplugs:             3               Timer unplugs:           3

Total (9,0):
 Reads Queued:           0,        0KiB  Writes Queued:          10,       40KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:        0,        0KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:        0,        0KiB
 Read Merges:            0,        0KiB  Write Merges:            0,        0KiB
 IO unplugs:             4               Timer unplugs:           4

Throughput (R/W): 0KiB/s / 0KiB/s
Events (9,0): 22 entries
Skips: 0 forward (0 -   0.0%)

This is on ext3 via MD raid 1, 0eca52a92735f43462165efe00a7e394345fb38e
kernel version aka "2.6.32rc3".

Simon-

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
                   ` (3 preceding siblings ...)
  2009-10-11 16:44 ` Simon Kirby
@ 2009-10-11 18:35 ` Jens Axboe
  2009-10-11 20:58 ` Simon Kirby
  2009-10-12  7:18 ` Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-10-11 18:35 UTC (permalink / raw)
  To: linux-btrace

On Sun, Oct 11 2009, Simon Kirby wrote:
> On Tue, Oct 06, 2009 at 04:20:32PM -0400, Alan D. Brunelle wrote:
> 
> > Coincidentally I was investigating something similar w/ RHEL 5.4 at this
> > very moment - just wanted to know if that was a common denominator.
> > 
> > Could you run the attached SystemTap script and send out the output?
> > [You may have to get `systemtap' though...]
> 
> Thanks for the excuse to try SystemTap. :)  I have some traces, but I
> don't think the UT case is abnormal.  On my desktop at home, I can see
> btraces such as this, which seem to show it happening often:

It depends a lot on the workload. The UT will trigger if the amount of
writes isn't very high - if you go above 4 queued writes, a regular
unplug will trigger. So if you see if for write intensive workloads, it
is almost surely a bug. For background firefox activity on your desktop,
it's not unusual.

If you see if for reads, it's almost always a bug.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
                   ` (4 preceding siblings ...)
  2009-10-11 18:35 ` Jens Axboe
@ 2009-10-11 20:58 ` Simon Kirby
  2009-10-12  7:18 ` Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Simon Kirby @ 2009-10-11 20:58 UTC (permalink / raw)
  To: linux-btrace

On Sun, Oct 11, 2009 at 08:35:36PM +0200, Jens Axboe wrote:

> It depends a lot on the workload. The UT will trigger if the amount of
> writes isn't very high - if you go above 4 queued writes, a regular
> unplug will trigger. So if you see if for write intensive workloads, it
> is almost surely a bug. For background firefox activity on your desktop,
> it's not unusual.
> 
> If you see if for reads, it's almost always a bug.

Ok, so this came up because we're having performance issues on a NFS
head which serves files on XFS from a bunch of AOE devices.  iostat is
reporting near-100% utilization for the device I've traced here,
/dev/etherd/e7.0.

I noticed blkparse doesn't seem to run anywhere (it takes the local
number of CPUs even though the trace may have involved more?), so I've
uploaded blkparse output from the trace machine:

	http://0x.ca/sim/ref/2.6.30.5-hw-fixedxfs/blkparse_out.txt.gz
	http://0x.ca/sim/ref/2.6.30.5-hw-fixedxfs/blkparse-t_out.txt.gz

See near the beginning of that, CPU #7:

152,32   7        5     0.006609360  3228  C   R 7739498616 + 248 ( 4103734) [0]
152,32   7        6     0.007504725  2787  A   R 4393342696 + 248 <- (252,43) 98375016
152,32   7        7     0.007505633  2787  Q   R 4393342696 + 248 [nfsd]
152,32   7        8     0.007507379  2787  G   R 4393342696 + 248 [nfsd]
152,32   7        9     0.007509055  2787  P   N [nfsd]
152,32   7       10     0.007514922  2787  I   R 4393342696 + 248 (    7543) [nfsd]
152,32   7       11     0.007520998  2787  A   R 4393342944 + 8 <- (252,43) 98375264
152,32   7       12     0.007521696  2787  Q   R 4393342944 + 8 [nfsd]
152,32   7       13     0.007528750  2787  G   R 4393342944 + 8 [nfsd]
152,32   7       14     0.007529519  2787  I   R 4393342944 + 8 (     769) [nfsd]
152,32   7       15     0.011107417     0 UT   N [swapper] 3
152,32   7       16     0.011121386   207  U   N [kblockd/7] 3
152,32   7       17     0.011129347   207  D   R 4393342696 + 248 ( 3614425) [kblockd/7]

Is that a bug?

Simon-

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: lots of unplugged by timer normal or abnormal?
  2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
                   ` (5 preceding siblings ...)
  2009-10-11 20:58 ` Simon Kirby
@ 2009-10-12  7:18 ` Jens Axboe
  6 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-10-12  7:18 UTC (permalink / raw)
  To: linux-btrace

On Sun, Oct 11 2009, Simon Kirby wrote:
> On Sun, Oct 11, 2009 at 08:35:36PM +0200, Jens Axboe wrote:
> 
> > It depends a lot on the workload. The UT will trigger if the amount of
> > writes isn't very high - if you go above 4 queued writes, a regular
> > unplug will trigger. So if you see if for write intensive workloads, it
> > is almost surely a bug. For background firefox activity on your desktop,
> > it's not unusual.
> > 
> > If you see if for reads, it's almost always a bug.
> 
> Ok, so this came up because we're having performance issues on a NFS
> head which serves files on XFS from a bunch of AOE devices.  iostat is
> reporting near-100% utilization for the device I've traced here,
> /dev/etherd/e7.0.
> 
> I noticed blkparse doesn't seem to run anywhere (it takes the local
> number of CPUs even though the trace may have involved more?), so I've
> uploaded blkparse output from the trace machine:

Huh really, if so that's surely a bug. You should be able to parse the
traces anywhere, even on different architectures. The format is designed
to be machine independent. But if blkparse has such a dumb bug, that is
troublesome :-)

> 	http://0x.ca/sim/ref/2.6.30.5-hw-fixedxfs/blkparse_out.txt.gz
> 	http://0x.ca/sim/ref/2.6.30.5-hw-fixedxfs/blkparse-t_out.txt.gz
> 
> See near the beginning of that, CPU #7:
> 
> 152,32   7        5     0.006609360  3228  C   R 7739498616 + 248 ( 4103734) [0]
> 152,32   7        6     0.007504725  2787  A   R 4393342696 + 248 <- (252,43) 98375016
> 152,32   7        7     0.007505633  2787  Q   R 4393342696 + 248 [nfsd]
> 152,32   7        8     0.007507379  2787  G   R 4393342696 + 248 [nfsd]
> 152,32   7        9     0.007509055  2787  P   N [nfsd]
> 152,32   7       10     0.007514922  2787  I   R 4393342696 + 248 (    7543) [nfsd]
> 152,32   7       11     0.007520998  2787  A   R 4393342944 + 8 <- (252,43) 98375264
> 152,32   7       12     0.007521696  2787  Q   R 4393342944 + 8 [nfsd]
> 152,32   7       13     0.007528750  2787  G   R 4393342944 + 8 [nfsd]
> 152,32   7       14     0.007529519  2787  I   R 4393342944 + 8 (     769) [nfsd]
> 152,32   7       15     0.011107417     0 UT   N [swapper] 3
> 152,32   7       16     0.011121386   207  U   N [kblockd/7] 3
> 152,32   7       17     0.011129347   207  D   R 4393342696 + 248 ( 3614425) [kblockd/7]
> 
> Is that a bug?

It very much looks like a bug. After that last reas is inserted (seq
14), it's idling for 3.6ms before the unplug timer kicks things into
gear. I wonder if nfsd is missing a backing device or mapping run.

What kernel is this?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-10-12  7:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-06 18:53 lots of unplugged by timer normal or abnormal? Simon Kirby
2009-10-06 19:50 ` Alan D. Brunelle
2009-10-06 20:08 ` Simon Kirby
2009-10-06 20:20 ` Alan D. Brunelle
2009-10-11 16:44 ` Simon Kirby
2009-10-11 18:35 ` Jens Axboe
2009-10-11 20:58 ` Simon Kirby
2009-10-12  7:18 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).