Re: [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder Cookie

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: William Allen Simpson <william.allen.simpson@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Linux Kernel Developers <linux-kernel@vger.kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder Cookie
Date: Thu, 5 Nov 2009 06:59:33 -0800	[thread overview]
Message-ID: <20091105145933.GA6770@linux.vnet.ibm.com> (raw)
In-Reply-To: <4AF2C266.1010603@gmail.com>

On Thu, Nov 05, 2009 at 07:17:42AM -0500, William Allen Simpson wrote:
> Paul E. McKenney wrote:
>> On Tue, Nov 03, 2009 at 05:38:10PM -0500, William Allen Simpson wrote:
>>> Documentation/RCU/checklist.txt #7 says:
>>>
>>>   One exception to this rule: rcu_read_lock() and rcu_read_unlock()
>>>   may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
>>>   in cases where local bottom halves are already known to be
>>>   disabled, for example, in irq or softirq context.  Commenting
>>>   such cases is a must, of course!  And the jury is still out on
>>>   whether the increased speed is worth it.
>> I strongly suggest using the matching primitives unless you have a
>> really strong reason not to.
> Eric gave contrary advice.  But he also suggested (in an earlier message)
> clearing the secrets with a timer, which could be a separate context --
> although much later in time.
>
> As you suggest, I'll use the _bh suffix everywhere until every i is dotted
> and t is crossed.  Then, check for efficiency later after thorough
> analysis by experts such as yourself.
>
> This code will be hit on every SYN and SYNACK that has a cookie option.
> But it's just prior to a CPU intensive sha_transform -- in comparison,
> it's trivial.

Had Eric said that this code were performance-critical, where every
nanosecond mattered, that would certainly be good enough for me.
Eric has excellent knowledge of the networking code, certainly much
better than mine.  And 10Gb Ethernet is certainly a performance
challenge, and I don't expect 40Gb Ethernet to be any easier.

Of course, I would still argue that the use of rcu_read_lock() rather
than rcu_read_unlock() needs to be commented.  And if this sort of
substitution happens a lot, maybe we need a way for it to happen
automatically.

							Thanx, Paul

>>> +			rcu_assign_pointer(tcp_secret_generating,
>>> +					   tcp_secret_secondary);
>>> +			rcu_assign_pointer(tcp_secret_retiring,
>>> +					   tcp_secret_primary);
>>> +			spin_unlock_bh(&tcp_secret_locker);
>>> +			/* call_rcu() or synchronize_rcu() not needed. */
>> Would you be willing to say why?  Are you relying on a time delay for a
>> given item to pass through tcp_secret_secondary and tcp_secret_retiring
>> or some such?  If so, how do you know that this time delay will always
>> be long enough?
>> Or are you just shuffling the data structures around, without ever
>> freeing them?  If so, is it really OK for a given reader to keep a
>> reference to a given item through the full range of shuffling, especially
>> given that it might be accesssing this concurrently with the ->expires
>> assignments above?
>> Either way, could you please expand the comment to give at least some
>> hint to the poor guy reading your code?  ;-)
> Yes.  Just shuffling the pointers without ever freeing anything.  So,
> there's nothing for call_rcu() to do, and nothing else to synchronize
> (only the pointers).  This assumes that after _unlock_ any CPU cache
> with an old pointer->expires will hit the _lock_ code, and that will
> update *both* ->expires and the other array elements concurrently?
>
> One of the advantages of this scheme is the new secret is initialized
> while the old secret is still used, and the old secret can continue to
> be verified as old packets arrive.  (I originally designed this for
> Photuris [RFC-2522] circa 1995.)
>
> As described in the long header given, each array element goes through
> four (4) states.  This is handling the first state transition.  It will
> hit at least 2 more locks, pointer updates, and unlocks before reuse.
>
> Also, a great deal of time passes.  After being retired (and expired), it
> will be unused for approximately 5 minutes.
>
> All that's a bit long for a comment.
>
> +			/*
> +			 * The retiring data is never freed.  Instead, it is
> +			 * replaced after later pointer updates and a quiet
> +			 * time of approximately 5 minutes.  There is nothing
> +			 * for call_rcu() or synchronize_rcu() to handle.
> +			 */
>
> Clear enough?

     prev parent reply	other threads:[~2009-11-05 14:59 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-30 11:00 [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder Cookie William Allen Simpson
2009-10-30 18:11 ` William Allen Simpson
2009-11-01 13:01 ` William Allen Simpson
2009-11-01 18:03   ` Eric Dumazet
2009-11-02 10:39     ` William Allen Simpson
2009-11-02 10:50       ` David Miller
2009-11-02 10:56       ` Eric Dumazet
2009-11-02 12:36         ` William Allen Simpson
2009-11-02 13:16           ` Eric Dumazet
2009-11-02 17:21             ` William Allen Simpson
2009-11-02 17:42               ` Eric Dumazet
2009-11-03 22:38     ` William Allen Simpson
2009-11-03 23:03       ` Eric Dumazet
2009-11-04 21:48       ` Paul E. McKenney
2009-11-05 12:17         ` William Allen Simpson
2009-11-05 12:45           ` William Allen Simpson
2009-11-05 13:34             ` Eric Dumazet
2009-11-05 13:19           ` Eric Dumazet
2009-11-05 19:44             ` William Allen Simpson
2009-11-05 14:59           ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091105145933.GA6770@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=william.allen.simpson@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).