public inbox for linux-ia64@vger.kernel.org
* should ia64_spinlock_contention do backoff?
@ 2004-03-25 19:41 David Mosberger
  2004-03-25 20:06 ` John Hawkes
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: David Mosberger @ 2004-03-25 19:41 UTC (permalink / raw)
  To: linux-ia64

Has anyone studied the impact of doing exponential backoff in
ia64_spinlock_contention?  My theory is that it wouldn't buy much _if_
spinlocks were always in their own cachelines, but since they're not,
not using backoff could cause extra cache-line bouncing.  To be
honest, I'd rather not spend time on this myself, since I don't have
convenient access to large machines, but methinks this is a question
that's long overdue for a proper answer.
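
[For readers who haven't seen the technique: a minimal userspace sketch of a
test-and-set spinlock whose contention path backs off exponentially, using C11
atomics.  All names are illustrative; this is not the kernel's implementation.]

```c
#include <stdatomic.h>

/* A spinlock whose waiters spin read-only and back off exponentially,
 * instead of hammering the contended cache line with atomic writes. */
typedef struct { atomic_int locked; } backoff_spinlock_t;

static void cpu_relax_n(unsigned n)
{
    /* stand-in for n iterations of a pause/hint instruction */
    for (volatile unsigned i = 0; i < n; i++)
        ;
}

static void backoff_spin_lock(backoff_spinlock_t *l)
{
    unsigned delay = 1;
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire)) {
        /* lock was held: spin with plain loads, backing off each round */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed)) {
            cpu_relax_n(delay);
            if (delay < 8192)   /* cap the delay so waiters stay responsive */
                delay <<= 1;
        }
    }
}

static void backoff_spin_unlock(backoff_spinlock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The trade-off in question: the backoff reduces coherence traffic on the
contended line, at the cost of extra latency for the waiter that would
otherwise have grabbed the lock first.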

	--david

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
@ 2004-03-25 20:06 ` John Hawkes
  2004-03-25 22:13 ` Keith Owens
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Hawkes @ 2004-03-25 20:06 UTC (permalink / raw)
  To: linux-ia64

From: "David Mosberger" <davidm@napali.hpl.hp.com>
> Has anyone studied the impact of doing exponential backoff in
> ia64_spinlock_contention?  My theory is that it wouldn't buy much _if_
> spinlocks were always in their own cachelines, but since they're not,
> not using backoff could cause extra cache-line bouncing.  To be
> honest, I'd rather not spend time on this myself, since I don't have
> convenient access to large machines, but methinks this is a question
> that's long overdue for a proper answer.

We (at SGI) don't have definitive measurements on this, but it's On The
List of things to do.

I believe that for a NUMA system, one big problem with highly contended
spinlocks is the relative unfairness to waiters on the more remote nodes
as they race to see the spinlock freed and to be the first to reacquire
it.  In extreme cases this unfairness becomes near-starvation.  A backoff
algorithm would likely produce more fairness in these high-contention
cases, though at the expense of potentially lengthening the wait times
for the shortest waiters.
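
[One standard way to impose the fairness John describes is a ticket lock,
where waiters acquire in FIFO ticket order rather than racing for the freed
cache line.  A minimal C11 sketch, illustrative only and not the ia64 code
of the period:]

```c
#include <stdatomic.h>

/* Ticket lock: each waiter takes a ticket; the lock serves tickets in
 * FIFO order, so a remote node's waiter cannot be starved by closer ones. */
typedef struct {
    atomic_uint next_ticket;   /* next ticket to hand out */
    atomic_uint now_serving;   /* ticket currently allowed to hold the lock */
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
    unsigned me = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load_explicit(&l->now_serving, memory_order_acquire) != me)
        ;  /* could also back off proportionally to (me - now_serving) */
}

static void ticket_unlock(ticket_lock_t *l)
{
    atomic_fetch_add_explicit(&l->now_serving, 1, memory_order_release);
}
```

The cost, matching John's caveat, is that the shortest waiter can no longer
jump the queue, so its latency goes up in exchange for bounded waiting.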

John Hawkes



* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
  2004-03-25 20:06 ` John Hawkes
@ 2004-03-25 22:13 ` Keith Owens
  2004-03-25 22:28 ` David Mosberger
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Keith Owens @ 2004-03-25 22:13 UTC (permalink / raw)
  To: linux-ia64

On Thu, 25 Mar 2004 11:41:53 -0800, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>Has anyone studied the impact of doing exponential backoff in
>ia64_spinlock_contention?  My theory is that it wouldn't buy much _if_
>spinlocks were always in their own cachelines, but since they're not,
>not using backoff could cause extra cache-line bouncing.  To be
>honest, I'd rather not spend time on this myself, since I don't have
>convenient access to large machines, but methinks this is a question
>that's long overdue for a proper answer.

Short answer: on one workload it gave a ~2.5% improvement on a highly
contended lock, but we got much better results by changing the lock
type.

Somebody had a workload that would result in most cpus spinning on the
same inode lock.  From our bug tracking system :- "The app is threaded.
The threads are synchronized using barriers or something equivalent.
After each barrier, each thread opens & reads the next file in a series
of files. At any point in time, all threads are reading the same file".

 8:   3.99 wall,   28.76 sys,    6.97 user
16:   8.82 wall,  124.99 sys,   24.64 user
32:  23.26 wall,  655.31 sys,  109.93 user
64:  71.78 wall, 3681.90 sys,  858.37 user

This was on a 2.4-based kernel with my patch for out-of-line spinlock
contention, using brl rather than call.  Changing the contention path to
do backoff[*] gave approximately a 2.5% improvement in wall time.

 8:   3.74 wall,   27.04 sys,    6.51 user
16:   8.41 wall,  118.92 sys,   23.50 user
32:  22.50 wall,  628.80 sys,  111.40 user
64:  69.84 wall, 3491.79 sys,  967.31 user

Jack Steiner added this comment to the bug :- "I have seen a number of
attempts to use backoff algorithms in kernel code.  Most have failed
because locks are usually lightly contended and backoff is not needed.
It unnecessarily delays the next process that tries to get the lock.

For example, if one process holds a lock & ONLY 1 other process is
trying to acquire it, backoff is not needed as long as the lock is not
shared with other locks or data. There are many cases to consider & in
general, code should be optimized for the lightly contended case".

Our kernel does not use backoff for spinlocks; the above was just a
test.  We decided that it was better to redesign the lock.  All places
where the inode lock was taken could sleep, so we changed it from a
spinlock to an rwsem.  The contending threads are then rescheduled
instead of spinning on the same cache line, which gave much better
cache traffic and an overall reduction in wall and system time for
this highly contended lock.


[*] Extract of the patched 2.4 out-of-line contention code.  The
exponential backoff code came from early 2.4 kernels.  It starts the
delay at now&0x3f, doubles it each time, clips it to the low 13 bits
(8192 cycles), then ORs in 0xf to avoid a zero delay.

        mov now=ar.itc
	...
	and delay=0x3f,now
        ...
.retry:
        // exponential backoff, kdb, lockmeter etc. go in here
        //

        add timeout=now,delay
        shl delay=delay,1
        ;;
        // FIXME: limit should depend on number of cpus
        dep delay=delay,r0,0,13                 // limit delay to 8192 cycles
        ;;
        // delay a little...
.wait:  sub now=now,timeout
        or delay=0xf,delay                      // make sure delay is non-zero (otherwise we get stuck with 0)
        ;;
        cmp.lt p15,p0=now,r0
        mov now=ar.itc
(p15)   br.cond.sptk .wait

        ;;
        ld4 r21=[r31]
        ;;
        cmp4.ne p15,p0=r21,r0
(p15)   br.cond.sptk.few .retry
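
[In C, the delay update performed by the assembly above amounts to the
following sketch, using the constants from the code:]

```c
/* One step of the backoff delay computation:
 * double the delay (shl), clip it to the low 13 bits so it never
 * exceeds 8191 cycles (dep), and OR in 0xf so it never reaches zero. */
static unsigned long next_backoff_delay(unsigned long delay)
{
    delay <<= 1;                /* shl delay=delay,1        */
    delay &= (1UL << 13) - 1;   /* dep delay=delay,r0,0,13  */
    delay |= 0xf;               /* or  delay=0xf,delay      */
    return delay;
}
```

Note the wrap-around behaviour: once the doubled delay overflows 13 bits
it clips back down, so the backoff is exponential only up to the cap.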


-
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
  2004-03-25 20:06 ` John Hawkes
  2004-03-25 22:13 ` Keith Owens
@ 2004-03-25 22:28 ` David Mosberger
  2004-03-26 17:16 ` markw
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Mosberger @ 2004-03-25 22:28 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 26 Mar 2004 09:13:14 +1100, Keith Owens <kaos@sgi.com> said:

  Keith> [*] Extract of patched 2.4 out of line contention code.  The
  Keith> exponential backoff code came from early 2.4 kernels.

Yeah, I thought it looked similar to something I wrote a long time
ago! ;-)

Thanks for sharing the info.  I agree that it's difficult to find the
ideal trade-off between low-latency in the lightly contended case and
low bus utilization in the highly contended case.

Until someone finds data to the contrary, I'll assume that exponential
backoff isn't worth the trouble for the kernel (user-level is a very
different story, of course).

Thanks,

	--david

* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (2 preceding siblings ...)
  2004-03-25 22:28 ` David Mosberger
@ 2004-03-26 17:16 ` markw
  2004-03-26 23:41 ` Chris Wedgwood
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: markw @ 2004-03-26 17:16 UTC (permalink / raw)
  To: linux-ia64

On 25 Mar, David Mosberger wrote:
> Has anyone studied the impact of doing exponential backoff in
> ia64_spinlock_contention?  My theory is that it wouldn't buy much _if_
> spinlocks were always in their own cachelines, but since they're not,
> not using backoff could cause extra cache-line bouncing.  To be
> honest, I'd rather not spend time on this myself, since I don't have
> convenient access to large machines, but methinks this is a question
> that's long overdue for a proper answer.

We have 4-way Itanium2 systems here.  Is that large enough for testing?

Mark

* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (3 preceding siblings ...)
  2004-03-26 17:16 ` markw
@ 2004-03-26 23:41 ` Chris Wedgwood
  2004-03-26 23:44 ` David Mosberger
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Chris Wedgwood @ 2004-03-26 23:41 UTC (permalink / raw)
  To: linux-ia64

On Fri, Mar 26, 2004 at 09:16:52AM -0800, markw@osdl.org wrote:

> We have 4-way Itanium2 systems here.  Is that large enough for
> testing?

It depends on the level of lock contention and what the remote memory
latency is (well, essentially how quickly you can bounce cachelines
about).  As other people have mentioned SGI has tested this on much
larger configurations without seeing much incentive to use it.

As a general principle, I think getting lock contention down is more
worthwhile (though usually very much harder).

Does anyone have any idea what locks are most heavily contended for
ia64 right now?  John?


  --cw

* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (4 preceding siblings ...)
  2004-03-26 23:41 ` Chris Wedgwood
@ 2004-03-26 23:44 ` David Mosberger
  2004-03-27  0:14 ` Chen, Kenneth W
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Mosberger @ 2004-03-26 23:44 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 26 Mar 2004 09:16:52 -0800 (PST), markw@osdl.org said:

  Mark> We have 4-way Itanium2 systems here.  Is that large enough for testing?

Bigger is better and of course NUMA is harder than UMA, but certainly
4-way can provide useful data-points.

	--david

* RE: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (5 preceding siblings ...)
  2004-03-26 23:44 ` David Mosberger
@ 2004-03-27  0:14 ` Chen, Kenneth W
  2004-03-28  2:37 ` John Hawkes
  2004-03-28 19:23 ` Matthew Wilcox
  8 siblings, 0 replies; 10+ messages in thread
From: Chen, Kenneth W @ 2004-03-27  0:14 UTC (permalink / raw)
  To: linux-ia64

>>>>> Chris Wedgwood wrote on Fri, March 26, 2004 3:41 PM
> Does anyone have any idea what locks are most heavily contended for
> ia64 right now?

This question is a bit too broad; lock contention depends heavily on workload.

For example, I'm doing direct I/O on a bunch of block devices, and the dev
nodes sit on reiserfs; this contention shows up:

 79.7% 91.2%   18us(1415us)  449us( 760ms)(57.9%)   2970234  8.8% 91.2%    0%  kernel_flag
 0.00% 92.3%  5.8us(  35us)  689us(  58ms)(0.01%)       298  7.7% 92.3%    0%    __break_lease+0x80
 0.01% 94.3%   23us( 120us)  555us(9106us)(0.01%)       298  5.7% 94.3%    0%    chrdev_open+0x1a0
 0.00% 83.3%  7.6us(  14us)   73us( 344us)(0.00%)         6 16.7% 83.3%    0%    de_put+0x60
 0.00% 92.7%  5.4us(  18us)  633us(  14ms)(0.00%)        41  7.3% 92.7%    0%    default_llseek+0x60
 0.00%  100%  2.9us( 4.2us)   44us(  86us)(0.00%)         2    0%  100%    0%    proc_file_lseek+0x60
 0.00%  100%   32us(  34us) 2523us(7279us)(0.00%)         3    0%  100%    0%    proc_lookup+0x60
 0.00%  100%   14us(  17us)   18us(  24us)(0.00%)         2    0%  100%    0%    reiserfs_commit_write+0xb0
 74.2% 91.2%   16us(1247us)  394us( 292ms)(50.6%)   2959013  8.8% 91.2%    0%    reiserfs_dirty_inode+0xa0
 0.00%  100%  214us( 214us)  106us( 106us)(0.00%)         1    0%  100%    0%    reiserfs_file_release+0x110
 0.00% 50.0%   21us(  34us)   29us(  29us)(0.00%)         2 50.0% 50.0%    0%    reiserfs_readdir+0xb0
 0.01% 92.3%  294us( 369us)  223us(1791us)(0.00%)        13  7.7% 92.3%    0%    reiserfs_write_super+0x60
  5.5% 92.9%  342us(1415us)   16ms( 760ms)( 7.3%)     10516  7.1% 92.9%    0%    schedule+0xe30
 0.00%  100%  4.1us( 7.3us)   27us(  45us)(0.00%)         2    0%  100%    0%    sys_fcntl+0x430
 0.00% 90.9%   14us(  48us)   80us( 254us)(0.00%)        11  9.1% 90.9%    0%    sys_ioctl+0xb0
 0.00% 93.8%   11us(  21us)  191us(2263us)(0.00%)        16  6.2% 93.8%    0%    tty_read+0x170
 0.00% 80.0%   17us(  28us)   21us(  38us)(0.00%)        10 20.0% 80.0%    0%    tty_write+0x3c0


And if my workload does lots of context switches, the load balancer also
shows up in some hot spots.

  1.2%  2.1%  1.9us(  22us)    0us                   426680 97.9%    0%  2.1%  load_balance+0x290
 0.02% 89.3%  2.5us(  44us)  3.3us( 635us)(0.00%)      4582 10.7% 89.3%    0%  load_balance+0x2d0
 0.02%  1.3%  2.9us(  15us)  2.9us( 8.3us)(0.00%)      4582 98.7%  1.3%    0%  load_balance+0x2e0
 0.01% 84.6%  2.1us(  17us)  2.9us(  21us)(0.00%)      4585 15.4% 84.6%    0%  load_balance+0xb60

- Ken



* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (6 preceding siblings ...)
  2004-03-27  0:14 ` Chen, Kenneth W
@ 2004-03-28  2:37 ` John Hawkes
  2004-03-28 19:23 ` Matthew Wilcox
  8 siblings, 0 replies; 10+ messages in thread
From: John Hawkes @ 2004-03-28  2:37 UTC (permalink / raw)
  To: linux-ia64



> >>>>> Chris Wedgwood wrote on Fri, March 26, 2004 3:41 PM
> > Does anyone have any idea what locks are most heavily contended for
> > ia64 right now?
>
> This question is a bit too broad; lock contention depends heavily on
> workload.

I totally agree.

On one AIM7-like workload that I'm using, at 64p the dominant contention is
doing pte_chain_lock() in page_remove_rmap() (11% of the cycles) and
page_add_rmap() (5%).  That's not spinlock contention -- it's cacheline
contention.  That same workload at 96p shows about 47% of the kernel cycles in
find_get_page() doing the spin_lock(mapping->page_lock), and the
pte_chain_lock() bottleneck has faded into insignificance.

John Hawkes


* Re: should ia64_spinlock_contention do backoff?
  2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
                   ` (7 preceding siblings ...)
  2004-03-28  2:37 ` John Hawkes
@ 2004-03-28 19:23 ` Matthew Wilcox
  8 siblings, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2004-03-28 19:23 UTC (permalink / raw)
  To: linux-ia64

On Fri, Mar 26, 2004 at 04:14:13PM -0800, Chen, Kenneth W wrote:
> This question is a bit too broad; lock contention depends heavily on workload.

Surely.

> For example, I'm doing direct I/O on a bunch of block devices, and the dev
> nodes sit on reiserfs; this contention shows up:
> 
>  79.7% 91.2%   18us(1415us)  449us( 760ms)(57.9%)   2970234  8.8% 91.2%    0%  kernel_flag
>  0.00% 92.3%  5.8us(  35us)  689us(  58ms)(0.01%)       298  7.7% 92.3%    0%    __break_lease+0x80

I'm not familiar with this output; I assume it means __break_lease()
is a major contributor to the time kernel_flag (i.e. the BKL)
is held?  That's interesting; I assume you're using Samba as part of
your workload, since it's the only major user of leases that I'm aware of.
Hmm... it seems to me we should move time_out_leases() down 5 lines to
minimise the time we spend with the BKL held when there's a non-lease
lock held on the file.
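
[The suggested reordering is an instance of a generic pattern: test the
cheap failure condition before doing expensive work under a big lock, so
the failing path holds the lock briefly.  A hedged sketch with hypothetical
stand-ins, not the real fs/locks.c code:]

```c
#include <pthread.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int expensive_calls;

/* stand-ins: a cheap check that can reject early, and an expensive pass */
static int has_conflicting_lock(int fd) { return fd < 0; }
static void time_out_leases(void)       { expensive_calls++; }

/* Check the cheap failure condition first; the expensive lease timeout
 * pass runs only when we are going to proceed anyway. */
static int break_lease_sketch(int fd)
{
    pthread_mutex_lock(&big_lock);
    if (has_conflicting_lock(fd)) {     /* moved above time_out_leases() */
        pthread_mutex_unlock(&big_lock);
        return -1;
    }
    time_out_leases();
    pthread_mutex_unlock(&big_lock);
    return 0;
}
```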

I wish we'd managed to get the file locking code BKL-free during 2.5,
but it just didn't happen ;-(

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain

end of thread, other threads:[~2004-03-28 19:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-03-25 19:41 should ia64_spinlock_contention do backoff? David Mosberger
2004-03-25 20:06 ` John Hawkes
2004-03-25 22:13 ` Keith Owens
2004-03-25 22:28 ` David Mosberger
2004-03-26 17:16 ` markw
2004-03-26 23:41 ` Chris Wedgwood
2004-03-26 23:44 ` David Mosberger
2004-03-27  0:14 ` Chen, Kenneth W
2004-03-28  2:37 ` John Hawkes
2004-03-28 19:23 ` Matthew Wilcox
