* Treason uncloaked / Broken peer again
@ 2009-02-21 19:58 Greg Lindahl
2009-02-21 21:34 ` Ilpo Järvinen
0 siblings, 1 reply; 10+ messages in thread
From: Greg Lindahl @ 2009-02-21 19:58 UTC (permalink / raw)
To: netdev
A recent round of changes to our web crawler has resulted in crawled
Linux hosts frequently getting "Treason uncloaked" messages in their
dmesg. This has resulted in a modest amount of hate mail, surprisingly
little given that we crawl millions of hosts per day.
We're running the RHEL 5.2 kernel, and I have a remote webserver of
my own running RHEL 5.2 that's getting the messages. If you look in
your own webserver's dmesg and see treason emanating from
38.108.180.XXX/24, that's me.
Unfortunately, I haven't made the thing deterministic. My suspicion is
that there's another bug similar to:
http://github.com/github/linux-2.6/commit/2ad41065d9fe518759b695fc2640cf9c07261dd2
Any advice on how to narrow down the bug? I was hoping I could get a
tcpdump of a treasonous conversation and hand it over to you guys.
-- greg
* Re: Treason uncloaked / Broken peer again
2009-02-21 19:58 Treason uncloaked / Broken peer again Greg Lindahl
@ 2009-02-21 21:34 ` Ilpo Järvinen
2009-02-21 21:50 ` Greg Lindahl
2009-02-26 6:30 ` Greg Lindahl
0 siblings, 2 replies; 10+ messages in thread
From: Ilpo Järvinen @ 2009-02-21 21:34 UTC (permalink / raw)
To: Greg Lindahl; +Cc: Netdev
On Sat, 21 Feb 2009, Greg Lindahl wrote:
> A recent round of changes to our web crawler has resulted in crawled
> Linux hosts frequently getting "Treason uncloaked" messages in their
> dmesg. This has resulted in a modest amount of hate mail, surprisingly
> little given that we crawl millions of hosts per day.
>
> We're running the RHEL 5.2 kernel, and I have a remote webserver of
> my own running RHEL 5.2 that's getting the messages. If you look in
> your own webserver's dmesg and see treason emanating from
> 38.108.180.XXX/24, that's me.
>
> Unfortunately, I haven't made the thing deterministic. My suspicion is
> that there's another bug similar to:
>
> http://github.com/github/linux-2.6/commit/2ad41065d9fe518759b695fc2640cf9c07261dd2
>
> Any advice on how to narrow down the bug? I was hoping I could get a
> tcpdump of a treasonous conversation and hand it over to you guys.
One case I remember off the top of my head is related to shrinking of the
advertised window due to granularity steps caused by window
scaling. It was fixed in the 2.6.22-26 timeframe, iirc.
Besides the window actually shrinking, it has often turned out that
TCP sent past the advertised window (which is a bug in itself). If the
peer (here, us) then shrinks its window to zero at that point, the
message gets triggered at the remote end (and will blame the wrong end
:-)). So to avoid this, your application must avoid causing a zero window,
which prevents the second requirement from being fulfilled, since you
don't know whether the remote end has that bug or not.
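To illustrate the granularity effect described above, here is a minimal Python sketch. It is a simplified model and not kernel code: the function names, the shift of 7, and the byte counts are assumptions chosen for illustration, and sequence-number wraparound is ignored. With window scaling, the 16-bit window field carries the free space shifted right by the scale factor, so the peer only ever sees a multiple of 2^wscale.

```python
def advertised_window(free_bytes: int, wscale: int) -> int:
    """Window size the peer reconstructs from the scaled 16-bit field."""
    field = min(free_bytes >> wscale, 0xFFFF)  # value carried in the TCP header
    return field << wscale                     # peer shifts it back up


def right_edge(rcv_nxt: int, free_bytes: int, wscale: int) -> int:
    """Highest sequence number the peer believes it may send up to."""
    return rcv_nxt + advertised_window(free_bytes, wscale)


# The receive buffer really ends at sequence 1000 in both cases.
before = right_edge(rcv_nxt=104, free_bytes=896, wscale=7)
after = right_edge(rcv_nxt=105, free_bytes=895, wscale=7)
print(before, after)  # the advertised edge moves backwards after one byte
```

In this model a single received byte moves the advertised right edge backwards even though the receiver never reduced its buffer, which is the kind of apparent shrink that rounding can produce.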
--
i.
* Re: Treason uncloaked / Broken peer again
2009-02-21 21:34 ` Ilpo Järvinen
@ 2009-02-21 21:50 ` Greg Lindahl
2009-02-26 6:30 ` Greg Lindahl
1 sibling, 0 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-21 21:50 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev
On Sat, Feb 21, 2009 at 11:34:54PM +0200, Ilpo Järvinen wrote:
> One case I remember off the top of my head is related to shrinking of the
> advertised window due to granularity steps caused by window
> scaling. It was fixed in the 2.6.22-26 timeframe, iirc.
I have a dozen hosts running 2.6.24.7, and one of them has logged some
treason messages complaining about a RHEL 5.2 host (2.6.18+redhat). Of
course, it could be a different bug, or caused by the other side. I'm not
sure I understand which side is guilty in your explanation.
-- greg
* Re: Treason uncloaked / Broken peer again
2009-02-21 21:34 ` Ilpo Järvinen
2009-02-21 21:50 ` Greg Lindahl
@ 2009-02-26 6:30 ` Greg Lindahl
2009-02-26 8:58 ` Herbert Xu
1 sibling, 1 reply; 10+ messages in thread
From: Greg Lindahl @ 2009-02-26 6:30 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev
> > A recent round of changes to our web crawler has resulted in crawled
> > Linux hosts frequently getting "Treason uncloaked" messages in their
> > dmesg. This has resulted in a modest amount of hate mail, surprisingly
> > little given that we crawl millions of hosts per day.
I'm continuing to get hate mail from all over the planet. Can anyone
recommend a webpage which I could point to that explains how harmless
this message can be? Google returns lots of scary warnings. I would
write one myself but the complainers are already dubious of me.
It seems that most complainers are running < 2.6.14, which had a
header prediction bug.
-- greg
* Re: Treason uncloaked / Broken peer again
2009-02-26 6:30 ` Greg Lindahl
@ 2009-02-26 8:58 ` Herbert Xu
2009-02-26 9:30 ` Ilpo Järvinen
0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2009-02-26 8:58 UTC (permalink / raw)
To: Greg Lindahl; +Cc: ilpo.jarvinen, netdev
Greg Lindahl <greg@blekko.com> wrote:
>
> I'm continuing to get hate mail from all over the planet. Can anyone
> recommend a webpage which I could point to that explains how harmless
> this message can be? Google returns lots of scary warnings. I would
> write one myself but the complainers are already dubious of me.
>
> It seems that most complainers are running < 2.6.14, which had a
> header prediction bug.
Right, most of these instances are due to buggy receivers. I
suppose you can just point them to this or one of the previous
threads and tell them to upgrade :)
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: Treason uncloaked / Broken peer again
2009-02-26 8:58 ` Herbert Xu
@ 2009-02-26 9:30 ` Ilpo Järvinen
2009-02-27 21:22 ` Greg Lindahl
0 siblings, 1 reply; 10+ messages in thread
From: Ilpo Järvinen @ 2009-02-26 9:30 UTC (permalink / raw)
To: Herbert Xu; +Cc: Greg Lindahl, Netdev
On Thu, 26 Feb 2009, Herbert Xu wrote:
> Greg Lindahl <greg@blekko.com> wrote:
> >
> > I'm continuing to get hate mail from all over the planet. Can anyone
> > recommend a webpage which I could point to that explains how harmless
> > this message can be? Google returns lots of scary warnings. I would
> > write one myself but the complainers are already dubious of me.
> >
> > It seems that most complainers are running < 2.6.14, which had a
> > header prediction bug.
Like I said, one possible way for you to try to avoid the situation (when
the buggy receiver is not in your control) is to prevent the window from
ever reaching zero. Make sure your application reads fast enough and/or
increase tcp_rmem enough. Alternatively, you could add an OUTPUT firewall
rule to drop ACKs that advertise a zero window altogether (not that I
recommend such a solution :-)).
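Before raising tcp_rmem it helps to know the current values. The following is a small Python sketch, assuming a Linux host where the setting is exposed at /proc/sys/net/ipv4/tcp_rmem as three byte counts (minimum, default, maximum). The helper names are made up for illustration.

```python
from pathlib import Path
from typing import Optional, Tuple

TCP_RMEM_PATH = Path("/proc/sys/net/ipv4/tcp_rmem")  # Linux-only sysctl file


def parse_tcp_rmem(text: str) -> Tuple[int, int, int]:
    """Parse the 'min default max' triple (in bytes) of net.ipv4.tcp_rmem."""
    parts = text.split()
    if len(parts) != 3:
        raise ValueError(f"expected three integers, got {text!r}")
    minimum, default, maximum = (int(p) for p in parts)
    return minimum, default, maximum


def current_tcp_rmem(path: Path = TCP_RMEM_PATH) -> Optional[Tuple[int, int, int]]:
    """Return the current triple, or None where the file is absent."""
    if not path.exists():
        return None
    return parse_tcp_rmem(path.read_text())


print(current_tcp_rmem())
```

A larger maximum gives the receive window more room before the application falls behind and the window closes; how large is enough depends on the crawler's read rate and is not something this sketch can determine.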
> Right, most of these instances are due to buggy receivers. I
> suppose you can just point them to this or one of the previous
> threads and tell them to upgrade :)
Right, it's rather crude for a buggy kernel to send past the
receiver's advertised window and then, when it cannot cope with the
results of its own bug (and prints that message), to put the blame on
others who behave in a compliant way. Sadly, I'd think that such people
might also refuse to upgrade, which is beyond ridiculous if they still keep
complaining about that message. It is well known that such bugs exist in
the old kernels, but I guess nobody can convince everyone. This is, btw,
why I recently suggested (when the Treason message was revised) that the
mention of the peer shrinking its window should be removed, since that is
not always the case.
Perhaps one should start sending blame to all who send past the
receiver's advertised window... ;-) It's certainly very questionable
behavior. (In a quick browse through the RFCs I didn't find anything that
clearly forbids it, but it is certainly at least a SHOULD NOT. RFC793
also says what the send window is; however, that is just positive
wording, with no opposite case spelled out.)
--
i.
* Re: Treason uncloaked / Broken peer again
2009-02-26 9:30 ` Ilpo Järvinen
@ 2009-02-27 21:22 ` Greg Lindahl
2009-02-27 21:27 ` Greg Lindahl
2009-02-28 0:31 ` David Miller
0 siblings, 2 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-27 21:22 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Herbert Xu, Netdev
On Thu, Feb 26, 2009 at 11:30:58AM +0200, Ilpo Järvinen wrote:
> Right, it's rather crude for a buggy kernel to send past the
> receiver's advertised window and then, when it cannot cope with the
> results of its own bug (and prints that message), to put the blame on
> others who behave in a compliant way.
Well, perhaps we shouldn't have the message be "Broken peer" when the
problem is often on the node printing the message? Maybe "Something's
broken, might be me"?
One of the people who complained to me is running 2.4.26, so the bug
in TSO that you fixed in 2.6.25 is not the last bug at issue. I was
also able to turn off TSO on all my 2.6.18+redhat systems and quickly
got Treason, so this bug is not the only one. But still, I can't cause
the bug often enough to get a tcpdump of it in action.
-- greg
Fixed in 2.6.25:
commit 5ea3a7480606cef06321cd85bc5113c72d2c7c68
Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Date: Tue Mar 11 17:55:27 2008 -0700
[TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.
Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired." messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).
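The invariant that the commit message describes can be sketched in Python as follows. This is an illustrative model only, not the kernel's tcp_mss_split_point(): the function name and parameters are assumptions, everything is counted in bytes, and sequence-number wraparound is ignored. The point is that the amount sent must be capped by the receiver window as well as by cwnd, including for the last skb in the write queue.

```python
def sendable_bytes(snd_una: int, snd_nxt: int, snd_wnd: int,
                   cwnd_bytes: int, skb_len: int) -> int:
    """Bytes of the next skb that may be sent without passing the peer's window."""
    in_flight = snd_nxt - snd_una
    # Space left before the right edge of the receiver's advertised window.
    window_left = max(snd_una + snd_wnd - snd_nxt, 0)
    # Space left under the congestion window.
    cwnd_left = max(cwnd_bytes - in_flight, 0)
    return min(skb_len, window_left, cwnd_left)


# A 4000-byte skb with only 1500 bytes of receiver window remaining.
print(sendable_bytes(snd_una=1000, snd_nxt=1500, snd_wnd=2000,
                     cwnd_bytes=8000, skb_len=4000))
```

Omitting window_left from the final min() in any code path reproduces the class of bug described: the sender transmits beyond the advertised edge and later reports that the peer shrank its window.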
* Re: Treason uncloaked / Broken peer again
2009-02-27 21:22 ` Greg Lindahl
@ 2009-02-27 21:27 ` Greg Lindahl
2009-02-28 0:31 ` David Miller
1 sibling, 0 replies; 10+ messages in thread
From: Greg Lindahl @ 2009-02-27 21:27 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Herbert Xu, Netdev
On Fri, Feb 27, 2009 at 01:22:34PM -0800, Greg Lindahl wrote:
> One of the people who complained to me is running 2.4.26, so the bug
> in TSO that you fixed in 2.6.25 is not the last bug at issue.
Sorry, I was confused, he really did say 2.4.26, so it's really old.
But, my experiment turning off TSO on 2.6.18+redhat shows that the
TSO bug fixed in 2.6.25 isn't the only bug.
> I was
> also able to turn off TSO on all my 2.6.18+redhat systems and quickly
> got Treason, so this bug is not the only one. But still, I can't cause
> the bug often enough to get a tcpdump of it in action.
-- greg
* Re: Treason uncloaked / Broken peer again
2009-02-27 21:22 ` Greg Lindahl
2009-02-27 21:27 ` Greg Lindahl
@ 2009-02-28 0:31 ` David Miller
2009-03-10 19:37 ` Greg Lindahl
1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2009-02-28 0:31 UTC (permalink / raw)
To: greg; +Cc: ilpo.jarvinen, herbert, netdev
From: Greg Lindahl <greg@blekko.com>
Date: Fri, 27 Feb 2009 13:22:34 -0800
> On Thu, Feb 26, 2009 at 11:30:58AM +0200, Ilpo Järvinen wrote:
>
> > Right, it's rather crude for a buggy kernel to send past the
> > receiver's advertised window and then, when it cannot cope with the
> > results of its own bug (and prints that message), to put the blame on
> > others who behave in a compliant way.
>
> Well, perhaps we shouldn't have the message be "Broken peer" when the
> problem is often on the node printing the message? Maybe "Something's
> broken, might be me"?
As has been mentioned several times in this thread and others, we have
tweaked the message to be more reasonable and in particular it doesn't
say "broken peer" any more.
Thread overview: 10+ messages
2009-02-21 19:58 Treason uncloaked / Broken peer again Greg Lindahl
2009-02-21 21:34 ` Ilpo Järvinen
2009-02-21 21:50 ` Greg Lindahl
2009-02-26 6:30 ` Greg Lindahl
2009-02-26 8:58 ` Herbert Xu
2009-02-26 9:30 ` Ilpo Järvinen
2009-02-27 21:22 ` Greg Lindahl
2009-02-27 21:27 ` Greg Lindahl
2009-02-28 0:31 ` David Miller
2009-03-10 19:37 ` Greg Lindahl