From: Eric Dumazet <dada1@cosmosbay.com>
To: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
berrange@redhat.com, et-mgmt-tools@redhat.com,
davem@davemloft.net, netdev@vger.kernel.org
Subject: Re: virt-manager broken by bind(0) in net-next.
Date: Sun, 01 Feb 2009 17:12:41 +0100
Message-ID: <4985C9F9.1020103@cosmosbay.com>
In-Reply-To: <20090201124220.GA2319@ioremap.net>
Evgeniy Polyakov wrote:
> Hi Eric.
>
> On Sat, Jan 31, 2009 at 11:17:15AM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
>> We only need to know whether the *fix* solves Stephen's problem.
>>
>> For the performance effects of careful variable placement and of a percpu
>> counter strategy, you might consult this as an example:
>>
>> http://lkml.indiana.edu/hypermail/linux/kernel/0812.1/01624.html
>
> Impressive, but to be 100% fair it is not only because of the cache line
> issues :)
>
>> Now, with these patches applied, try to see the effect of your new bsockets
>> field on a network workload doing a lot of socket bind()/unbind() calls...
>>
>> With current kernels, you probably won't notice because of the inode/dcache
>> hot cache lines, but that might change eventually...
>
> David applied the patch which fixed the problem, so we can return to the
> cache line issues. What do you think about the last version where
> bsockets field was placed at the very end of the structure and with
> cacheline_aligned_on_smp attribute?
>
Yes, at a minimum it should move out of the first cache line.
It should also become an atomic_t, so that we don't have to argue about
accumulated errors on this variable on SMP. We can see later whether a
percpu counter is wanted.
Thank you
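To illustrate the two points above, here is a minimal userspace sketch in C11. It is not the kernel code: struct hashinfo_sketch, CACHE_LINE (assumed to be 64 bytes) and the sketch_* helpers are hypothetical names made up for this example, and C11 atomics stand in for atomic_t. The counter is aligned directly here for brevity, whereas the patch places it after an already cache-line-aligned member.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size; the real value is arch-specific */

/* Hypothetical stand-in for struct inet_hashinfo, not the kernel definition. */
struct hashinfo_sketch {
	/* Read-mostly fields: written at init, then only read on hot paths. */
	void *ehash;
	unsigned int ehash_size;
	void *bhash;
	unsigned int bhash_size;

	/* Write-hot counter, pushed onto its own cache line. */
	alignas(CACHE_LINE) atomic_int bsockets;
};

/* The counter must not share a cache line with the read-mostly fields. */
static_assert(offsetof(struct hashinfo_sketch, bsockets) % CACHE_LINE == 0,
	      "bsockets must start on a cache line boundary");

static inline void sketch_init(struct hashinfo_sketch *h)
{
	h->ehash = NULL;
	h->ehash_size = 0;
	h->bhash = NULL;
	h->bhash_size = 0;
	atomic_init(&h->bsockets, 0);
}

/* Atomic read-modify-write: concurrent callers cannot lose an update. */
static inline void sketch_bind(struct hashinfo_sketch *h)
{
	atomic_fetch_add_explicit(&h->bsockets, 1, memory_order_relaxed);
}

static inline void sketch_unbind(struct hashinfo_sketch *h)
{
	atomic_fetch_sub_explicit(&h->bsockets, 1, memory_order_relaxed);
}

/* Plain atomic load, used only as a heuristic by the caller. */
static inline int sketch_bound(struct hashinfo_sketch *h)
{
	return atomic_load_explicit(&h->bsockets, memory_order_relaxed);
}
```

With a plain int, two CPUs doing bsockets++ at the same time can both read the old value and store the same result, so the count drifts over time. The atomic variant avoids that, at the price of one shared cache line bouncing between CPUs on every bind()/unbind(); that cost is what a percpu counter would address.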
[PATCH] net: move bsockets outside of read only beginning of struct inet_hashinfo
And switch bsockets to atomic_t since it might be changed in parallel.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
include/net/inet_hashtables.h | 3 ++-
net/ipv4/inet_connection_sock.c | 2 +-
net/ipv4/inet_hashtables.c | 5 +++--
3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 8d98dc7..a44e224 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -134,7 +134,7 @@ struct inet_hashinfo {
struct inet_bind_hashbucket *bhash;
unsigned int bhash_size;
- int bsockets;
+ /* 4 bytes hole on 64 bit */
struct kmem_cache *bind_bucket_cachep;
@@ -151,6 +151,7 @@ struct inet_hashinfo {
struct inet_listen_hashbucket listening_hash[INET_LHTABLE_SIZE]
____cacheline_aligned_in_smp;
+ atomic_t bsockets;
};
static inline struct inet_ehash_bucket *inet_ehash_bucket(
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 9bc6a18..22cd19e 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -119,7 +119,7 @@ again:
(tb->num_owners < smallest_size || smallest_size == -1)) {
smallest_size = tb->num_owners;
smallest_rover = rover;
- if (hashinfo->bsockets > (high - low) + 1) {
+ if (atomic_read(&hashinfo->bsockets) > (high - low) + 1) {
spin_unlock(&head->lock);
snum = smallest_rover;
goto have_snum;
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index d7b6178..625cc5f 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -62,7 +62,7 @@ void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb,
{
struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo;
- hashinfo->bsockets++;
+ atomic_inc(&hashinfo->bsockets);
inet_sk(sk)->num = snum;
sk_add_bind_node(sk, &tb->owners);
@@ -81,7 +81,7 @@ static void __inet_put_port(struct sock *sk)
struct inet_bind_hashbucket *head = &hashinfo->bhash[bhash];
struct inet_bind_bucket *tb;
- hashinfo->bsockets--;
+ atomic_dec(&hashinfo->bsockets);
spin_lock(&head->lock);
tb = inet_csk(sk)->icsk_bind_hash;
@@ -532,6 +532,7 @@ void inet_hashinfo_init(struct inet_hashinfo *h)
{
int i;
+ atomic_set(&h->bsockets, 0);
for (i = 0; i < INET_LHTABLE_SIZE; i++) {
spin_lock_init(&h->listening_hash[i].lock);
INIT_HLIST_NULLS_HEAD(&h->listening_hash[i].head,