From: Guillaume Nault <gnault@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
David Miller <davem@davemloft.net>,
Jakub Kicinski <jakub.kicinski@netronome.com>,
netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH net v2 2/2] tcp: tighten acceptance of ACKs not matching a child socket
Date: Thu, 5 Dec 2019 20:22:52 +0100 [thread overview]
Message-ID: <20191205192252.GA18203@linux.home> (raw)
In-Reply-To: <CANn89i+RHVmA2Mc8x0NdHZjWsw4UtgZ5ymbWBBxLgv_YczUjvg@mail.gmail.com>
On Thu, Dec 05, 2019 at 10:14:15AM -0800, Eric Dumazet wrote:
> On Thu, Dec 5, 2019 at 10:00 AM Guillaume Nault <gnault@redhat.com> wrote:
> >
> > On Wed, Dec 04, 2019 at 07:08:49PM -0800, Eric Dumazet wrote:
> > >
> > >
> > > On 12/4/19 4:59 PM, Guillaume Nault wrote:
> > > > When no synflood occurs, the synflood timestamp isn't updated.
> > > > Therefore it can be so old that time_after32() can consider it to be
> > > > in the future.
> > > >
> > > > That's a problem for tcp_synq_no_recent_overflow() as it may report
> > > > that a recent overflow occurred while, in fact, it's just that jiffies
> > > > has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31.
> > > >
> > > > Spurious detection of recent overflows lead to extra syncookie
> > > > verification in cookie_v[46]_check(). At that point, the verification
> > > > should fail and the packet dropped. But we should have dropped the
> > > > packet earlier as we didn't even send a syncookie.
> > > >
> > > > Let's refine tcp_synq_no_recent_overflow() to report a recent overflow
> > > > only if jiffies is within the
> > > > [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This
> > > > way, no spurious recent overflow is reported when jiffies wraps and
> > > > 'last_overflow' becomes in the future from the point of view of
> > > > time_after32().
> > > >
> > > > However, if jiffies wraps and enters the
> > > > [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with
> > > > 'last_overflow' being a stale synflood timestamp), then
> > > > tcp_synq_no_recent_overflow() still erroneously reports an
> > > > overflow. In such cases, we have to rely on syncookie verification
> > > > to drop the packet. We unfortunately have no way to differentiate
> > > > between a fresh and a stale syncookie timestamp.
> > > >
> > > > Signed-off-by: Guillaume Nault <gnault@redhat.com>
> > > > ---
> > > > include/net/tcp.h | 6 ++++--
> > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > > > index f0eae83ee555..005d4c691543 100644
> > > > --- a/include/net/tcp.h
> > > > +++ b/include/net/tcp.h
> > > > @@ -520,12 +520,14 @@ static inline bool tcp_synq_no_recent_overflow(const struct sock *sk)
> > > > if (likely(reuse)) {
> > > > last_overflow = READ_ONCE(reuse->synq_overflow_ts);
> > > > return time_after32(now, last_overflow +
> > > > - TCP_SYNCOOKIE_VALID);
> > > > + TCP_SYNCOOKIE_VALID) ||
> > > > + time_before32(now, last_overflow);
> > > > }
> > > > }
> > > >
> > > > last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
> > > > - return time_after32(now, last_overflow + TCP_SYNCOOKIE_VALID);
> > > > + return time_after32(now, last_overflow + TCP_SYNCOOKIE_VALID) ||
> > > > + time_before32(now, last_overflow);
> > > > }
> > >
> > >
> > > There is a race I believe here.
> > >
> > > CPU1 CPU2
> > >
> > > now = jiffies.
> > > ...
> > > jiffies++
> > > ...
> > > SYN received, last_overflow is updated to the new jiffies.
> > >
> > >
> > > CPU1
> > > timer_before32(now, last_overflow) is true, because last_overflow was set to now+1
> > >
> > >
> > > I suggest some cushion here.
> > >
> > Yes, we should wrap access to ->rx_opt.ts_recent_stamp into READ_ONCE(),
> > to ensure that last_overflow won't be reloaded between the
> > time_after32() and the time_before32() calls. Is that what you had in
> > mind?
> >
> > - last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
> > + last_overflow = READ_ONCE(tcp_sk(sk)->rx_opt.ts_recent_stamp);
> >
> > Patch 1 would need the same fix BTW.
> >
> > > Also we TCP uses between() macro, we might add a time_between32(a, b, c) macro
> > > to ease code review.
> > >
> > I didn't realise that. I'll define it in v3.
> >
> > > ->
> > > return !time_between32(last_overflow - HZ, now, last_overflow + TCP_SYNCOOKIE_VALID);
> > >
> > 'last_overflow - HZ'? I don't get why we'd take HZ into account here.
> >
>
> Please read carefuly my prior feedback.
>
> Even with READ_ONCE(), you still have a race.
>
>
> CPU1 CPU2
>
> now = jiffies.
> <some long interrupt>
> jiffies++ (or jiffies += 3 or 4
> if CPU1 has been interrupted by a long interrupt)
> ...
> SYN received, last_overflow is
> updated to the new jiffies.
>
>
> CPU1
>
> @now still has a stale values (an old jiffies value)
> timer_before32(now, last_overflow) is true, because last_overflow was
> set to now+1 (or now + 2 or now + 3)
>
Ok, I get it now. Thanks!
Will send v3 using 'last_overflow - HZ' as lower bound.
I think READ_ONCE()/WRITE_ONCE() are still necessary to prevent reloading
and imaginary write of last_overflow. At least that's my understanding
after reading memory-barriers.txt again.
next prev parent reply other threads:[~2019-12-05 19:23 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-05 0:58 [PATCH net v2 0/2] tcp: fix handling of stale syncookies timestamps Guillaume Nault
2019-12-05 0:59 ` [PATCH net v2 1/2] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
2019-12-05 0:59 ` [PATCH net v2 2/2] tcp: tighten acceptance of ACKs not matching a child socket Guillaume Nault
2019-12-05 3:08 ` Eric Dumazet
2019-12-05 18:00 ` Guillaume Nault
2019-12-05 18:14 ` Eric Dumazet
2019-12-05 19:22 ` Guillaume Nault [this message]
2019-12-05 19:30 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191205192252.GA18203@linux.home \
--to=gnault@redhat.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=jakub.kicinski@netronome.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.