From: Eric Dumazet <dada1@cosmosbay.com>
To: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	berrange@redhat.com, et-mgmt-tools@redhat.com,
	davem@davemloft.net, netdev@vger.kernel.org
Subject: Re: virt-manager broken by bind(0) in net-next.
Date: Fri, 30 Jan 2009 23:30:22 +0100
Message-ID: <49837F7E.90306@cosmosbay.com>
In-Reply-To: <20090130215008.GB12210@ioremap.net>

Evgeniy Polyakov wrote:
> On Fri, Jan 30, 2009 at 07:41:59PM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
>> Reviewing commit a9d8f9110d7e953c2f2b521087a4179677843c2a
>>
>> I see use of a hashinfo->bsockets field that :
>>
>> - lacks proper lock/synchronization
> 
> It should contain a rough number of sockets; there is no need to be very
> precise because of this heuristic.

Denying there is a bug is... well... I don't know what to say.

I wonder why we still use atomic_t all over the kernel.
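
To make the concern concrete, here is a minimal userspace sketch of how a
plain "counter++" on a shared field can lose an update. It replays, in a
single thread, the interleaving in which two CPUs both load the old value
before either stores. The struct and function names are hypothetical and
are not taken from the kernel; in the kernel an atomic_t updated with
atomic_inc() makes the read-modify-write indivisible, which rules this
interleaving out.

```c
/* Hypothetical stand-in for a shared counter; not the kernel structure. */
struct counter {
	int value;
};

/* A non-atomic increment is really two steps: a load, then a store. */
static int load_step(const struct counter *c)
{
	return c->value;
}

static void store_step(struct counter *c, int seen)
{
	c->value = seen + 1;
}

/*
 * Replay the bad interleaving: CPU A and CPU B both load before
 * either of them stores. Returns the final counter value.
 */
static int lost_update_demo(int start)
{
	struct counter c = { .value = start };
	int seen_a = load_step(&c);	/* CPU A reads the old value */
	int seen_b = load_step(&c);	/* CPU B reads the same old value */

	store_step(&c, seen_a);		/* CPU A writes start + 1 */
	store_step(&c, seen_b);		/* CPU B overwrites with start + 1 */
	return c.value;			/* two increments ran, one was lost */
}
```

Calling lost_update_demo(5) returns 6 rather than 7. Whether that drift is
acceptable for a heuristic is the point being argued here; the sketch only
shows that the drift is real.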

> 
>> - suffers from cache line ping pongs on SMP
> 
> I used a free alignment slot so that the socket structure would not be
> increased.

Are you kidding?

bsockets is not part of the socket structure; it is part of "struct inet_hashinfo",
which is shared by all CPUs and accessed several thousand times per second on many
machines.

Please read the comment three lines after the "free alignment slot"
you chose. You just introduced a write on a cache line
that is supposed to *not* be written.

        unsigned int                    bhash_size;
        int                             bsockets;

        struct kmem_cache               *bind_bucket_cachep;

        /* All the above members are written once at bootup and
         * never written again _or_ are predominantly read-access.
         *
         * Now align to a new cache line as all the following members
         * might be often dirty.
         */
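
As an illustration of the layout rule that comment describes, here is a
small C11 sketch that keeps read-mostly members on one cache line and
starts the frequently written member on the next one. The struct name, the
64-byte line size and the helper are assumptions made for the example;
this is neither the real struct inet_hashinfo nor the fix that was adopted.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; the real value is per-architecture */

struct hashinfo_sketch {
	/* Read-mostly: written once at setup, then only read. */
	unsigned int bhash_size;
	void *bind_bucket_cachep;

	/* Often dirty: forced to start on a fresh cache line. */
	_Alignas(CACHE_LINE) int bsockets;
};

/* True when two byte offsets within the struct share a cache line. */
static int same_cache_line(size_t a, size_t b)
{
	return a / CACHE_LINE == b / CACHE_LINE;
}

/* Compile-time check that the written member begins a cache line. */
static_assert(offsetof(struct hashinfo_sketch, bsockets) % CACHE_LINE == 0,
	      "bsockets must start on a cache line boundary");
```

With this layout, CPUs that only read bhash_size keep their copy of the
first line while another CPU updates bsockets. Placing the counter in the
padding right after bhash_size would instead invalidate that line on every
write.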



> 
>> Also there might be a problem at line 175
>>
>> if (sk->sk_reuse && sk->sk_state != TCP_LISTEN && --attempts >= 0) { 
>> 	spin_unlock(&head->lock);
>> 	goto again;
>>
>> If we entered inet_csk_get_port() with a non null snum, we can "goto again"
>> while it was not expected.
>>
>> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
>> index df8e72f..752c6b2 100644
>> --- a/net/ipv4/inet_connection_sock.c
>> +++ b/net/ipv4/inet_connection_sock.c
>> @@ -172,7 +172,8 @@ tb_found:
>>  		} else {
>>  			ret = 1;
>>  			if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb)) {
>> -				if (sk->sk_reuse && sk->sk_state != TCP_LISTEN && --attempts >= 0) {
>> +				if (sk->sk_reuse && sk->sk_state != TCP_LISTEN &&
>> +					smallest_size == -1 &&  --attempts >= 0) {
> 
> I think it should be smallest_size != -1, since we really want to go
> to the again label when the heuristic is used, which in turn changes
> smallest_size.
> 

Yep
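
For reference, the condition as agreed in this exchange can be sketched as
a standalone predicate. The helper and its parameters are hypothetical;
they mirror the variables in the quoted patch (sk->sk_reuse, the
TCP_LISTEN state test, smallest_size, attempts) and are not kernel code.
Following the discussion above, smallest_size == -1 is taken to mean that
the heuristic was not used, for example because the caller passed an
explicit port.

```c
#include <stdbool.h>

/*
 * Decide whether to jump back to the "again" label after a bind conflict.
 * Retry only for a reusable, non-listening socket, only when the
 * heuristic was used (smallest_size != -1), and only while attempts
 * remain. The attempt counter is consumed only when the earlier tests
 * pass, because && short-circuits.
 */
static bool should_retry(bool reuse, bool listening,
			 int smallest_size, int *attempts)
{
	return reuse && !listening &&
	       smallest_size != -1 && --(*attempts) >= 0;
}
```

Putting the smallest_size test before the decrement means a caller that
supplied an explicit port never consumes an attempt.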


