netdev.vger.kernel.org archive mirror
* Treason uncloaked / Broken peer again
@ 2009-02-21 19:58 Greg Lindahl
  2009-02-21 21:34 ` Ilpo Järvinen
  0 siblings, 1 reply; 10+ messages in thread
From: Greg Lindahl @ 2009-02-21 19:58 UTC (permalink / raw)
  To: netdev

A recent set of fiddling to our web crawler has resulted in crawled
Linux hosts frequently getting "Treason uncloaked" messages in their
dmesg. This has resulted in a modest amount of hate mail, surprisingly
little given that we crawl millions of hosts per day.

We're running the RHEL 5.2 kernel, and I have a remote webserver of
my own running RHEL 5.2 that's getting the messages. If you look in
your own webserver dmesg and see treason emanating from
38.108.180.XXX/24, that's me.
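
For anyone checking their own logs, the following is a minimal Python sketch that filters kernel log lines mentioning "Treason uncloaked" and keeps those whose peer address falls in 38.108.180.0/24. The function name is hypothetical, and because the exact message layout differs between kernel versions, the sketch assumes only that the peer's IPv4 address appears somewhere in the line.

```python
import ipaddress
import re

# The crawler's network as given in the message above.
CRAWLER_NET = ipaddress.ip_network("38.108.180.0/24")
# Loose IPv4 pattern; candidates are validated by the ipaddress module below.
IPV4_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")

def treason_from_crawler(lines, network=CRAWLER_NET):
    """Return the lines that mention Treason uncloaked and an address in `network`."""
    hits = []
    for line in lines:
        if "Treason uncloaked" not in line:
            continue
        for candidate in IPV4_RE.findall(line):
            try:
                addr = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # looked like an address but is not a valid one
            if addr in network:
                hits.append(line)
                break
    return hits
```

It could be fed from the kernel log, for example by passing `sys.stdin` while piping `dmesg` into the script.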

Unfortunately, I haven't made the thing deterministic. My suspicion is
that there's another bug similar to:

http://github.com/github/linux-2.6/commit/2ad41065d9fe518759b695fc2640cf9c07261dd2

Any advice on how to narrow down the bug? I was hoping I could get a
tcpdump of a treasonous conversation and hand it over to you guys.

-- greg




* Re: Treason uncloaked / Broken peer again
  2009-02-21 19:58 Treason uncloaked / Broken peer again Greg Lindahl
@ 2009-02-21 21:34 ` Ilpo Järvinen
  2009-02-21 21:50   ` Greg Lindahl
  2009-02-26  6:30   ` Greg Lindahl
  0 siblings, 2 replies; 10+ messages in thread
From: Ilpo Järvinen @ 2009-02-21 21:34 UTC (permalink / raw)
  To: Greg Lindahl; +Cc: Netdev

On Sat, 21 Feb 2009, Greg Lindahl wrote:

> A recent set of fiddling to our web crawler has resulted in crawled
> Linux hosts frequently getting "Treason uncloaked" messages in their
> dmesg. This has resulted in a modest amount of hate mail, surprisingly
> little given that we crawl millions of hosts per day.
> 
> We're running the RHEL 5.2 kernel, and I have a remote webserver of
> my own running RHEL 5.2 that's getting the messages. If you look in
> your own webserver dmesg and see treason emanating from
> 38.108.180.XXX/24, that's me.
>
> Unfortunately, I haven't made the thing deterministic. My suspicion is
> that there's another bug similar to:
> 
> http://github.com/github/linux-2.6/commit/2ad41065d9fe518759b695fc2640cf9c07261dd2
> 
> Any advice on how to narrow down the bug? I was hoping I could get a
> tcpdump of a treasonous conversation and hand it over to you guys.

Off the top of my head, one case I remember involves the advertised
window shrinking because of the granularity steps that window scaling
imposes. It was fixed in the 2.6.22-26 timeframe, iirc.

Besides the peer actually shrinking its window, it has often turned out
that TCP sent past the advertised window (which is a bug in itself). If
the peer (here, us) then shrinks its window to zero at that point, the
message gets triggered at the remote end (and blames the wrong end :-)).
So, to avoid this, your application must never let its window reach
zero, which keeps the second condition from being met, since you don't
know whether the remote end has that bug or not.
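
The granularity effect mentioned above can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not kernel code: the receiver's free buffer space is rounded down to a multiple of 2**wscale before being advertised, the application reads nothing between the two ACKs, and the function names are made up for the illustration.

```python
def advertised_right_edge(ack, free_bytes, wscale):
    """Right edge the sender sees: ack plus free space rounded down to the scale step."""
    window_field = free_bytes >> wscale      # value carried in the TCP window field
    return ack + (window_field << wscale)    # what the sender reconstructs from it

def right_edge_shrinks(ack, free_bytes, delivered, wscale):
    """True if accepting `delivered` more bytes moves the advertised right edge left."""
    before = advertised_right_edge(ack, free_bytes, wscale)
    after = advertised_right_edge(ack + delivered, free_bytes - delivered, wscale)
    return after < before
```

With `wscale=7` (steps of 128 bytes), 1000 free bytes and 110 newly delivered bytes, the edge moves from 896 to 878, so the window appears to shrink even though the receiver did nothing unusual. With `wscale=0` the edge stays at 1000.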


-- 
 i.


* Re: Treason uncloaked / Broken peer again
  2009-02-21 21:34 ` Ilpo Järvinen
@ 2009-02-21 21:50   ` Greg Lindahl
  2009-02-26  6:30   ` Greg Lindahl
  1 sibling, 0 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-21 21:50 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev

On Sat, Feb 21, 2009 at 11:34:54PM +0200, Ilpo Järvinen wrote:

> Off the top of my head, one case I remember involves the advertised
> window shrinking because of the granularity steps that window scaling
> imposes. It was fixed in the 2.6.22-26 timeframe, iirc.

I have a dozen hosts running 2.6.24.7, and 1 has some treason
complaining about a RHEL 5.2 host (2.6.18+redhat). Of course, it could
be a different bug, or caused by the other side. I'm a bit weak
at understanding which side is guilty in your explanation.

-- greg




* Re: Treason uncloaked / Broken peer again
  2009-02-21 21:34 ` Ilpo Järvinen
  2009-02-21 21:50   ` Greg Lindahl
@ 2009-02-26  6:30   ` Greg Lindahl
  2009-02-26  8:58     ` Herbert Xu
  1 sibling, 1 reply; 10+ messages in thread
From: Greg Lindahl @ 2009-02-26  6:30 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Netdev

> > A recent set of fiddling to our web crawler has resulted in crawled
> > Linux hosts frequently getting "Treason uncloaked" messages in their
> > dmesg. This has resulted in a modest amount of hate mail, surprisingly
> > little given that we crawl millions of hosts per day.

I'm continuing to get hate mail from all over the planet. Can anyone
recommend a webpage which I could point to that explains how harmless
this message can be? Google returns lots of scary warnings. I would
write one myself but the complainers are already dubious of me.

It seems that most complainers are running < 2.6.14, which had a
header prediction bug.

-- greg



* Re: Treason uncloaked / Broken peer again
  2009-02-26  6:30   ` Greg Lindahl
@ 2009-02-26  8:58     ` Herbert Xu
  2009-02-26  9:30       ` Ilpo Järvinen
  0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2009-02-26  8:58 UTC (permalink / raw)
  To: Greg Lindahl; +Cc: ilpo.jarvinen, netdev

Greg Lindahl <greg@blekko.com> wrote:
> 
> I'm continuing to get hate mail from all over the planet. Can anyone
> recommend a webpage which I could point to that explains how harmless
> this message can be? Google returns lots of scary warnings. I would
> write one myself but the complainers are already dubious of me.
> 
> It seems that most complainers are running < 2.6.14, which had a
> header prediction bug.

Right, most of these instances are due to buggy receivers.  I
suppose you can just point them to this or one of the previous
threads and tell them to upgrade :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Treason uncloaked / Broken peer again
  2009-02-26  8:58     ` Herbert Xu
@ 2009-02-26  9:30       ` Ilpo Järvinen
  2009-02-27 21:22         ` Greg Lindahl
  0 siblings, 1 reply; 10+ messages in thread
From: Ilpo Järvinen @ 2009-02-26  9:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Greg Lindahl, Netdev

On Thu, 26 Feb 2009, Herbert Xu wrote:

> Greg Lindahl <greg@blekko.com> wrote:
> > 
> > I'm continuing to get hate mail from all over the planet. Can anyone
> > recommend a webpage which I could point to that explains how harmless
> > this message can be? Google returns lots of scary warnings. I would
> > write one myself but the complainers are already dubious of me.
> > 
> > It seems that most complainers are running < 2.6.14, which had a
> > header prediction bug.

Like I said, one possible way for you to avoid the situation (when the
buggy receiver is not under your control) is to prevent the window from
ever reaching zero. Either make sure your application is fast enough, or
increase tcp_rmem enough, or both. Alternatively, you could add an
OUTPUT firewall rule to drop ACKs that advertise a zero window
altogether (not that I recommend such a solution :-)).
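
As a rough illustration of the buffer-size suggestion, here is a Python sketch that requests a larger receive buffer on a single socket. Note the swap: the advice above is about the system-wide tcp_rmem sysctl, whereas this uses the per-socket SO_RCVBUF option, and the function names are hypothetical. On Linux an explicit SO_RCVBUF is capped by net.core.rmem_max and turns off receive-buffer autotuning for that socket, so whether this helps is an assumption to test rather than a given. A bigger buffer only makes a zero window less likely; a slow reader can still fill it.

```python
import socket

def make_socket_with_rcvbuf(rcvbuf_bytes):
    """Create a TCP socket and request a receive buffer of `rcvbuf_bytes`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set before connect() so the size can influence the window scale
    # negotiated during the handshake.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return sock

def effective_rcvbuf(sock):
    """Return the buffer size the kernel actually granted (may differ from the request)."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Comparing `effective_rcvbuf(sock)` with the requested value shows whether the system limit clipped the request.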

> Right, most of these instances are due to buggy receivers.  I
> suppose you can just point them to this or one of the previous
> threads and tell them to upgrade :)

Right, it's rather crude for a buggy kernel to send past the receiver's
advertised window and then, when it cannot cope with the results of its
own bug (and prints that message), to put the blame on others who behave
in a compliant way. Sadly, I'd think that such people might also refuse
to upgrade, which is beyond ridiculous if they still keep complaining
about that message. It is well known that such bugs exist in the old
kernels, but I guess nobody can convince everyone. This is, btw, why I
recently suggested (when the Treason message was revised) that the
notion of the peer shrinking its window should be removed, since it's
not always the case.

Perhaps one should start sending blame to all who send past the
receiver's advertised window... ;-) It's certainly very questionable
behavior. (In a quick browse through the RFCs I didn't find anything
that clearly forbids it, but a sender for sure at least SHOULD NOT do
it. RFC 793 also says what the send window is; however, it's just
positive wording, with no opposite case spelled out.)


-- 
 i.


* Re: Treason uncloaked / Broken peer again
  2009-02-26  9:30       ` Ilpo Järvinen
@ 2009-02-27 21:22         ` Greg Lindahl
  2009-02-27 21:27           ` Greg Lindahl
  2009-02-28  0:31           ` David Miller
  0 siblings, 2 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-27 21:22 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Herbert Xu, Netdev

On Thu, Feb 26, 2009 at 11:30:58AM +0200, Ilpo Järvinen wrote:

> Right, it's rather crude for a buggy kernel to send past the
> receiver's advertised window and then, when it cannot cope with the
> results of its own bug (and prints that message), to put the blame on
> others who behave in a compliant way.

Well, perhaps we shouldn't have the message be "Broken peer" when the
problem is often on the node printing the message? Maybe "Something's
broken, might be me"?

One of the people who complained to me is running 2.4.26, so the bug
in TSO that you fixed in 2.6.25 is not the last bug at issue. I was
also able to turn off TSO on all my 2.6.18+redhat systems and quickly
got Treason, so this bug is not the only one. But still, I can't cause
the bug often enough to get a tcpdump of it in action.

-- greg

Fixed in 2.6.25:

commit 5ea3a7480606cef06321cd85bc5113c72d2c7c68
Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Date:   Tue Mar 11 17:55:27 2008 -0700

    [TCP]: Prevent sending past receiver window with TSO (at last skb)

    With TSO it was possible to send past the receiver window when the skb
    to be sent was the last in the write queue while the receiver window
    is the limiting factor. One can notice that there's a loophole in the
    tcp_mss_split_point that lacked a receiver window check for the
    tcp_write_queue_tail() if also cwnd was smaller than the full skb.

    Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
    uncloaked! Peer ... shrinks window .... Repaired."  messages (the peer
    didn't actually shrink its window as the message suggests, we had just
    sent something past it without a permission to do so).
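
The loophole described in the commit message can be modelled in a few lines of Python. This is a simplified model written from the description above, not a transcription of tcp_mss_split_point; the function names, parameters, and numbers are made up for the illustration.

```python
def split_point_unchecked_tail(skb_len, mss, cwnd_segments, window_bytes, is_tail):
    """Model of the loophole: the tail skb is cut by cwnd without looking at the window."""
    cwnd_len = mss * cwnd_segments
    if cwnd_len <= window_bytes and not is_tail:
        return cwnd_len
    if is_tail and cwnd_len <= skb_len:
        return cwnd_len  # window_bytes is never consulted on this path
    needed = min(skb_len, window_bytes)
    return needed - needed % mss

def split_point_checked(skb_len, mss, cwnd_segments, window_bytes):
    """Model of the safe behaviour: never return more than the receiver window allows."""
    cwnd_len = mss * cwnd_segments
    needed = min(skb_len, window_bytes)
    if cwnd_len <= needed:
        return cwnd_len
    return needed - needed % mss  # round down to a whole number of segments
```

For an 8000-byte tail skb with an MSS of 1000, a cwnd of 4 segments, and 2500 bytes of receiver window left, the unchecked model sends 4000 bytes (1500 past the window), while the checked model sends 2000.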



* Re: Treason uncloaked / Broken peer again
  2009-02-27 21:22         ` Greg Lindahl
@ 2009-02-27 21:27           ` Greg Lindahl
  2009-02-28  0:31           ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-27 21:27 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: Herbert Xu, Netdev

On Fri, Feb 27, 2009 at 01:22:34PM -0800, Greg Lindahl wrote:

> One of the people who complained to me is running 2.4.26, so the bug
> in TSO that you fixed in 2.6.25 is not the last bug at issue.

Sorry, I was confused, he really did say 2.4.26, so it's really old.

But, my experiment turning off TSO on 2.6.18+redhat shows that the
TSO bug fixed in 2.6.25 isn't the only bug.

> I was
> also able to turn off TSO on all my 2.6.18+redhat systems and quickly
> got Treason, so this bug is not the only one. But still, I can't cause
> the bug often enough to get a tcpdump of it in action.

-- greg


* Re: Treason uncloaked / Broken peer again
  2009-02-27 21:22         ` Greg Lindahl
  2009-02-27 21:27           ` Greg Lindahl
@ 2009-02-28  0:31           ` David Miller
  2009-03-10 19:37             ` Greg Lindahl
  1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2009-02-28  0:31 UTC (permalink / raw)
  To: greg; +Cc: ilpo.jarvinen, herbert, netdev

From: Greg Lindahl <greg@blekko.com>
Date: Fri, 27 Feb 2009 13:22:34 -0800

> On Thu, Feb 26, 2009 at 11:30:58AM +0200, Ilpo Järvinen wrote:
> 
> > Right, it's rather crude for a buggy kernel to send past the
> > receiver's advertised window and then, when it cannot cope with the
> > results of its own bug (and prints that message), to put the blame on
> > others who behave in a compliant way.
> 
> Well, perhaps we shouldn't have the message be "Broken peer" when the
> problem is often on the node printing the message? Maybe "Something's
> broken, might be me"?

As has been mentioned several times in this thread and others, we have
tweaked the message to be more reasonable and in particular it doesn't
say "broken peer" any more.


* Re: Treason uncloaked / Broken peer again
  2009-02-28  0:31           ` David Miller
@ 2009-03-10 19:37             ` Greg Lindahl
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-03-10 19:37 UTC (permalink / raw)
  To: netdev

By experimentation we determined that turning off window scaling meant
we didn't trigger "Treason uncloaked" on remote systems.
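
The message does not say how window scaling was turned off. Assuming the Linux sysctl net.ipv4.tcp_window_scaling was used, a minimal Python sketch for checking the current setting on a host could look like this (the function names are hypothetical; changing the value needs root, for example via `sysctl -w net.ipv4.tcp_window_scaling=0`).

```python
from pathlib import Path

# procfs view of the net.ipv4.tcp_window_scaling sysctl on Linux.
WINDOW_SCALING_SYSCTL = Path("/proc/sys/net/ipv4/tcp_window_scaling")

def parse_sysctl_flag(text):
    """Interpret the file contents: "0" means disabled, anything else enabled."""
    return text.strip() != "0"

def window_scaling_enabled(path=WINDOW_SCALING_SYSCTL):
    """Return True or False for the current setting, or None if the file cannot be read."""
    try:
        return parse_sysctl_flag(path.read_text())
    except OSError:
        return None  # not Linux, or procfs is unavailable
```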

The recent posting about passive OS fingerprinting led me to

http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os?rev=1.21;content-type=text%2Fplain

which says that Googlebot has window scaling turned off, too.

Things that make me go, "Hmmm."

-- greg



