netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Miller <davem@davemloft.net>
To: herbert@gondor.apana.org.au
Cc: djohnson+linux-kernel@sw.starentnetworks.com,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] improved locking performance in rt_run_flush()
Date: Thu, 31 May 2007 16:26:26 -0700 (PDT)	[thread overview]
Message-ID: <20070531.162626.91312503.davem@davemloft.net> (raw)
In-Reply-To: <E1Hpdhs-00069a-00@gondolin.me.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 20 May 2007 15:11:48 +1000

> David Miller <davem@davemloft.net> wrote:
> > From: Dave Johnson <djohnson+linux-kernel@sw.starentnetworks.com>
> >> 
> >> The below patch changes rt_run_flush() to only take each spinlock
> >> protecting the rt_hash_table once instead of taking a spinlock for
> >> every hash table bucket (and ending up taking the same small set 
> >> of locks over and over).
> 
> ...
> 
> > I'm not ignoring it I'm just trying to brainstorm whether there
> > is a better way to resolve this inefficiency. :-)
> 
> The main problem I see with this is having to walk and free each
> chain with the lock held.  We could avoid this if we had a pointer
> in struct rtable to chain them up for freeing later.
> 
> I just checked and struct rtable is 236 bytes long on 32-bit but
> the slab cache pads it to 256 bytes so we've got some free space.
> I suspect 64-bit should be similar.

SLUB I believe packs more aggressively and won't pad things out like
that.  Therefore adding a member to rtable is much less attractive.

I've been considering various alternative ways to deal with this.

For 2.6.22 and -stable's sake we could allocate an array of pointers
of size N where N is the number of rtable hash slots per spinlock.
A big lock wraps around rt_run_flush() to protect these slots, and
then the loop is:

	grap_lock();
	for_each_hash_chain_for_lock(i) {
		rth = rt_hash_table[i].chain;
		if (rth) {
			rt_hash_table[i].chain = NULL;
			flush_chain[i % N] = rt;
		}
	}
	drop_lock();

	for (i = 0; i < N; i++) {
		struct rtable *rth = flush_chain[i];
		flush_chain[i] = NULL;
		while (rth) {
			struct rtable *next = rth->u.dst.rt_next;
			rt_free(rth);
			rth = next;
		}
	}

Holding a lock across the entire hash plucking has it's not nice
properties, but it's better than taking the same lock N times in
a row.

In the longer term, if I resurrect my dynamically sized rtable hash
patches (which I do intend to do), that code protected a lot of this
stuff with a seqlock and it might be possible to use that seqlock
solely to flush the lists in rt_run_flush().

Any better ideas?

      reply	other threads:[~2007-05-31 23:26 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-12 16:36 [PATCH] improved locking performance in rt_run_flush() Dave Johnson
2007-05-14 10:04 ` David Miller
2007-05-20  5:11   ` Herbert Xu
2007-05-31 23:26     ` David Miller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070531.162626.91312503.davem@davemloft.net \
    --to=davem@davemloft.net \
    --cc=djohnson+linux-kernel@sw.starentnetworks.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).