Re: [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder Cookie

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: William Allen Simpson <william.allen.simpson@gmail.com>
To: paulmck@linux.vnet.ibm.com
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Linux Kernel Developers <linux-kernel@vger.kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder	Cookie
Date: Thu, 05 Nov 2009 07:17:42 -0500	[thread overview]
Message-ID: <4AF2C266.1010603@gmail.com> (raw)
In-Reply-To: <20091104214844.GA6714@linux.vnet.ibm.com>

Paul E. McKenney wrote:
> On Tue, Nov 03, 2009 at 05:38:10PM -0500, William Allen Simpson wrote:
>> Documentation/RCU/checklist.txt #7 says:
>>
>>   One exception to this rule: rcu_read_lock() and rcu_read_unlock()
>>   may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
>>   in cases where local bottom halves are already known to be
>>   disabled, for example, in irq or softirq context.  Commenting
>>   such cases is a must, of course!  And the jury is still out on
>>   whether the increased speed is worth it.
> 
> I strongly suggest using the matching primitives unless you have a
> really strong reason not to.
> 
Eric gave contrary advice.  But he also suggested (in an earlier message)
clearing the secrets with a timer, which could be a separate context --
although much later in time.

As you suggest, I'll use the _bh suffix everywhere until every i is dotted
and t is crossed.  Then, check for efficiency later after thorough
analysis by experts such as yourself.

This code will be hit on every SYN and SYNACK that has a cookie option.
But it's just prior to a CPU intensive sha_transform -- in comparison,
it's trivial.


>> +			rcu_assign_pointer(tcp_secret_generating,
>> +					   tcp_secret_secondary);
>> +			rcu_assign_pointer(tcp_secret_retiring,
>> +					   tcp_secret_primary);
>> +			spin_unlock_bh(&tcp_secret_locker);
>> +			/* call_rcu() or synchronize_rcu() not needed. */
> 
> Would you be willing to say why?  Are you relying on a time delay for a
> given item to pass through tcp_secret_secondary and tcp_secret_retiring
> or some such?  If so, how do you know that this time delay will always
> be long enough?
> 
> Or are you just shuffling the data structures around, without ever
> freeing them?  If so, is it really OK for a given reader to keep a
> reference to a given item through the full range of shuffling, especially
> given that it might be accesssing this concurrently with the ->expires
> assignments above?
> 
> Either way, could you please expand the comment to give at least some
> hint to the poor guy reading your code?  ;-)
> 
Yes.  Just shuffling the pointers without ever freeing anything.  So,
there's nothing for call_rcu() to do, and nothing else to synchronize
(only the pointers).  This assumes that after _unlock_ any CPU cache
with an old pointer->expires will hit the _lock_ code, and that will
update *both* ->expires and the other array elements concurrently?

One of the advantages of this scheme is the new secret is initialized
while the old secret is still used, and the old secret can continue to
be verified as old packets arrive.  (I originally designed this for
Photuris [RFC-2522] circa 1995.)

As described in the long header given, each array element goes through
four (4) states.  This is handling the first state transition.  It will
hit at least 2 more locks, pointer updates, and unlocks before reuse.

Also, a great deal of time passes.  After being retired (and expired), it
will be unused for approximately 5 minutes.

All that's a bit long for a comment.

+			/*
+			 * The retiring data is never freed.  Instead, it is
+			 * replaced after later pointer updates and a quiet
+			 * time of approximately 5 minutes.  There is nothing
+			 * for call_rcu() or synchronize_rcu() to handle.
+			 */

Clear enough?

next prev parent reply	other threads:[~2009-11-05 12:17 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-30 11:00 [net-next-2.6 PATCH RFC] TCPCT part 1d: generate Responder Cookie William Allen Simpson
2009-10-30 18:11 ` William Allen Simpson
2009-11-01 13:01 ` William Allen Simpson
2009-11-01 18:03   ` Eric Dumazet
2009-11-02 10:39     ` William Allen Simpson
2009-11-02 10:50       ` David Miller
2009-11-02 10:56       ` Eric Dumazet
2009-11-02 12:36         ` William Allen Simpson
2009-11-02 13:16           ` Eric Dumazet
2009-11-02 17:21             ` William Allen Simpson
2009-11-02 17:42               ` Eric Dumazet
2009-11-03 22:38     ` William Allen Simpson
2009-11-03 23:03       ` Eric Dumazet
2009-11-04 21:48       ` Paul E. McKenney
2009-11-05 12:17         ` William Allen Simpson [this message]
2009-11-05 12:45           ` William Allen Simpson
2009-11-05 13:34             ` Eric Dumazet
2009-11-05 13:19           ` Eric Dumazet
2009-11-05 19:44             ` William Allen Simpson
2009-11-05 14:59           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AF2C266.1010603@gmail.com \
    --to=william.allen.simpson@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).