* Checksumming problem in pv_ops dom0 kernel / netback
@ 2010-03-17 12:34 S.H. Verbrugge
2010-03-17 12:48 ` Stefan Kuhne
2010-03-30 22:58 ` Scott Garron
0 siblings, 2 replies; 12+ messages in thread
From: S.H. Verbrugge @ 2010-03-17 12:34 UTC (permalink / raw)
To: xen-devel
Hello,
I'm having trouble with the latest 2.6.31.6 and 2.6.32.9 Xen dom0 pv_ops trees.
Our platform:
- Xen 3.4.3-rc3 (also tried 3.4.2 on 2.6.31.6 pv_ops dom0)
- 2.6.32.9 pv_ops dom0 kernel, a checkout from xen/stable git that is about a week old (I can provide the changeset on request).
- 100+ domUs, all PV.
Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26 xenkernel from Debian repo before, with Xen 3.2),
we started to have some problems when attempting to route packets on a domU.
The following message appears in dmesg on the dom0:
"Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"
We can sniff the virtual interface on the dom0 (nothing ever leaves the domU), and
we do see the ICMP echo requests inside the domU. The reply, however, never reaches its destination, or anywhere outside the domU for that matter.
It seems that for some reason, as soon as the packet leaves the domU, the dom0 kernel drops the packet, as shown in the dmesg log.
Some background info:
We're using normal virtual interfaces on a named bridge, br-internet. This bridge contains the hardware interface 'eth0'.
This is a tg3 interface; we already tried turning off both RX and TX checksumming for it.
Setting 'ethtool tx off' in the domU itself doesn't help either.
The different interfaces are vlan'ed through a Cisco 2927 switch.
Since this problem did not occur in xenkernels before, it is most likely related to a netback patch in pv_ops dom0.
Perhaps somebody could provide me with some more info, or insights.
--
/\/\ Hostingvereniging Soleus | Community-driven
< ** > http://soleus.nu | Virtual Private Servers
\/\/ Sen (IEF) Verbrugge (CT ProLead) | & more ...
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 12:34 Checksumming problem in pv_ops dom0 kernel / netback S.H. Verbrugge
@ 2010-03-17 12:48 ` Stefan Kuhne
2010-03-17 15:48 ` S.H. Verbrugge
2010-03-30 22:58 ` Scott Garron
1 sibling, 1 reply; 12+ messages in thread
From: Stefan Kuhne @ 2010-03-17 12:48 UTC (permalink / raw)
To: xen-devel
On 17.03.2010 13:34, S.H. Verbrugge wrote:
Hello,
> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26 xenkernel from Debian repo before, with Xen 3.2),
> we started to have some problems when attempting to route packets on a domU.
>
> The following message appears in dmesg on the dom0:
>
> "Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"
>
I know someone who has the same problem.
I have a similar problem.
I can ping the Internet from the routing DomU and from Dom0, but no one else can.
See also:
Post "2.6.31.6 pv_ops and routing DomU" from 10/03/09 on Xen-Devel
Regards,
Stefan Kuhne
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 12:48 ` Stefan Kuhne
@ 2010-03-17 15:48 ` S.H. Verbrugge
2010-03-17 23:25 ` James Harper
0 siblings, 1 reply; 12+ messages in thread
From: S.H. Verbrugge @ 2010-03-17 15:48 UTC (permalink / raw)
To: xen-devel
On Wed, Mar 17, 2010 at 01:48:33PM +0100, Stefan Kuhne wrote:
> On 17.03.2010 13:34, S.H. Verbrugge wrote:
>
> Hello,
>
> > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26 xenkernel from Debian repo before, with Xen 3.2),
> > we started to have some problems when attempting to route packets on a domU.
> >
> > The following message appears in dmesg on the dom0:
> >
> > "Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet"
> >
> I know someone who has the same problem.
> I have a similar problem.
>
> I can ping the Internet from the routing DomU and from Dom0, but no one else can.
>
> See also:
> Post "2.6.31.6 pv_ops and routing DomU" from 10/03/09 on Xen-Devel
>
> Regards,
> Stefan Kuhne
>
>
Yeah, I've seen that. It does not give much information, however.
More specifically, I've tested with two dom0 pv_ops kernels now, and since it's still
reproducible on the latest xen/stable platform, I'm guessing it has to do with either vlan'ing or
the tg3 driver, in combination with netback.
I was hoping some Xen developers could shed some light on this.
I already conversed with Jeremy about this, and he pointed me to this mailing list.
--
/\/\ Hostingvereniging Soleus | Community-driven
< ** > http://soleus.nu | Virtual Private Servers
\/\/ Sen (IEF) Verbrugge (CT ProLead) | & more ...
* RE: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 15:48 ` S.H. Verbrugge
@ 2010-03-17 23:25 ` James Harper
2010-03-17 23:37 ` Stefan Kuhne
2010-03-17 23:48 ` S.H. Verbrugge
0 siblings, 2 replies; 12+ messages in thread
From: James Harper @ 2010-03-17 23:25 UTC (permalink / raw)
To: S.H. Verbrugge, xen-devel
>
> Yeah, I've seen that. It does not give that much information however.
>
> More specifically, I've tested with two dom0 pv_ops kernels now, and since
> it's still reproducible on the latest xen/stable platform, I'm guessing it
> either has to do with vlan'ing or the tg3 driver, in combination with
> netback.
>
> I was hoping some Xen developers could shed some light on this.
> I already conversed with Jeremy about this, and he pointed me to this
> mailing list.
>
This may not be relevant, but I have seen problems with the following
combination:
br0:
  eth0
  <netback devices>
br1:
  eth0.2
  <netback devices>
Some (most?) network hardware cannot provide checksum/large send offload
functions for packets that use vlan tagging, but Linux doesn't quite
understand that and gets confused, so when such a packet comes off of
netback and is sent to eth0.2, the LSO/checksum function should be
performed in software but isn't.
I haven't yet figured out if the problem is that the driver is
incorrectly reporting that offload is supported on the vlan device or if
the rest of Linux isn't taking the appropriate action...
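One way to see what a device reports is to compare the output of `ethtool -k` on the physical interface and on the vlan device. The following is a minimal Python sketch under stated assumptions: the helper names are made up for illustration, the sample text is illustrative rather than captured from any system in this thread, and the exact set of feature names depends on the ethtool and kernel versions.

```python
import subprocess


def parse_offload_features(ethtool_output):
    """Parse the 'name: on|off' lines printed by `ethtool -k <dev>`."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        words = value.split()
        # Skip header lines such as "Features for eth0:", which carry no on/off value.
        if not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = words[0] == "on"
    return features


def read_offload_features(dev):
    """Run `ethtool -k` on a device; requires ethtool to be installed."""
    result = subprocess.run(
        ["ethtool", "-k", dev], capture_output=True, text=True, check=True
    )
    return parse_offload_features(result.stdout)


# Illustrative output only, not taken from the machines discussed here.
SAMPLE = """\
Features for eth0.2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp-segmentation-offload: off [fixed]
"""
```

Calling `read_offload_features("eth0")` and `read_offload_features("eth0.2")` (device names as in the layout above) and comparing the two dictionaries would show whether the vlan device advertises offloads that differ from those of the parent device.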
James
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 23:25 ` James Harper
@ 2010-03-17 23:37 ` Stefan Kuhne
2010-03-17 23:48 ` S.H. Verbrugge
1 sibling, 0 replies; 12+ messages in thread
From: Stefan Kuhne @ 2010-03-17 23:37 UTC (permalink / raw)
To: xen-devel
On 18.03.2010 00:25, James Harper wrote:
Hello,
> This may not be relevant, but I have seen problems with the following
> combination:
>
> br0:
> eth0
> <netback devices>
>
> br1:
> eth0.2
> <netback devices>
>
> Some (most?) network hardware cannot provide checksum/large send offload
> functions for packets that use vlan tagging, but Linux doesn't quite
> understand that and gets confused, so when such a packet comes off of
> netback and is sent to eth0.2, the LSO/checksum function should be
> performed in software but isn't.
>
I have no VLAN running.
But my problem is only similar, not identical.
Regards,
Stefan Kuhne
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 23:25 ` James Harper
2010-03-17 23:37 ` Stefan Kuhne
@ 2010-03-17 23:48 ` S.H. Verbrugge
2010-03-18 0:04 ` James Harper
1 sibling, 1 reply; 12+ messages in thread
From: S.H. Verbrugge @ 2010-03-17 23:48 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel
On Thu, Mar 18, 2010 at 10:25:34AM +1100, James Harper wrote:
> >
> > Yeah, I've seen that. It does not give that much information however.
> >
> > More specifically, I've tested with two dom0 pv_ops kernels now, and since
> > it's still reproducible on the latest xen/stable platform, I'm guessing it
> > either has to do with vlan'ing or the tg3 driver, in combination with
> > netback.
> >
> > I was hoping some Xen developers could shed some light on this.
> > I already conversed with Jeremy about this, and he pointed me to this
> > mailing list.
> >
>
> This may not be relevant, but I have seen problems with the following
> combination:
>
> br0:
> eth0
> <netback devices>
>
> br1:
> eth0.2
> <netback devices>
>
> Some (most?) network hardware cannot provide checksum/large send offload
> functions for packets that use vlan tagging, but Linux doesn't quite
> understand that and gets confused, so when such a packet comes off of
> netback and is sent to eth0.2, the LSO/checksum function should be
> performed in software but isn't.
>
> I haven't yet figured out if the problem is that the driver is
> incorrectly reporting that offload is supported on the vlan device or if
> the rest of Linux isn't taking the appropriate action...
Hmm, just to set the record straight: there's vlan'ing on the switch (several ports
grouped into a single physical network), but no tagging as far as I know.
However, I've read about some previous problems with checksum offloading in the tg3 driver.
Perhaps the two are related (netback / tg3 checksums), but I have no way of determining that.
Isn't there some way of patching the netback driver, so it does not support checksumming?
Maybe I'm way off base here..
--
/\/\ Hostingvereniging Soleus | Community-driven
< ** > http://soleus.nu | Virtual Private Servers
\/\/ Sen (IEF) Verbrugge (CT ProLead) | & more ...
* RE: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 23:48 ` S.H. Verbrugge
@ 2010-03-18 0:04 ` James Harper
2010-03-18 0:17 ` S.H. Verbrugge
0 siblings, 1 reply; 12+ messages in thread
From: James Harper @ 2010-03-18 0:04 UTC (permalink / raw)
To: S.H. Verbrugge; +Cc: xen-devel
>
> Isn't there some way of patching the netback driver, so it does not
> support checksumming?
> Maybe I'm way off base here..
>
Yes, use ethtool on the DomU interface, the bridge, and the Dom0
physical interface.
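For reference, a minimal Python sketch of what that could look like on the dom0 side. It assumes the standard `ethtool -K <dev> tx off rx off` invocation; the helper names and the device names `vif1.0`, `br-internet` and `eth0` are placeholders for illustration, and the interface inside the DomU has to be handled from within the DomU itself.

```python
import subprocess


def offload_off_commands(devices):
    """Build one `ethtool -K <dev> tx off rx off` argument list per device."""
    return [["ethtool", "-K", dev, "tx", "off", "rx", "off"] for dev in devices]


def disable_checksum_offload(devices, dry_run=True):
    """Print each command, and also run it unless dry_run is set (needs root)."""
    for argv in offload_off_commands(devices):
        print(" ".join(argv))
        if not dry_run:
            # check=False: some devices (a bridge, for example) may refuse a setting.
            subprocess.run(argv, check=False)


# Hypothetical dom0-side device names: the vif, the bridge, the physical NIC.
DOM0_DEVICES = ["vif1.0", "br-internet", "eth0"]

if __name__ == "__main__":
    disable_checksum_offload(DOM0_DEVICES)
```

By default this only prints the commands; passing `dry_run=False` would execute them. Whether every device accepts both settings depends on the driver.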
James
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-18 0:04 ` James Harper
@ 2010-03-18 0:17 ` S.H. Verbrugge
0 siblings, 0 replies; 12+ messages in thread
From: S.H. Verbrugge @ 2010-03-18 0:17 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel
On Thu, Mar 18, 2010 at 11:04:41AM +1100, James Harper wrote:
>
> >
> > Isn't there some way of patching the netback driver, so it does not
> > support checksumming?
> > Maybe I'm way off base here..
> >
>
> Yes, use ethtool on the DomU interface, the bridge, and the Dom0
> physical interface.
>
> James
Yeah, I already tried that.
I did that in the domU on eth0 and all other interfaces, as well as on
the dom0 physical interface, the bridge and the vif for the domU.
No luck, unfortunately.
--
/\/\ Hostingvereniging Soleus | Community-driven
< ** > http://soleus.nu | Virtual Private Servers
\/\/ Sen (IEF) Verbrugge (CT ProLead) | & more ...
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-17 12:34 Checksumming problem in pv_ops dom0 kernel / netback S.H. Verbrugge
2010-03-17 12:48 ` Stefan Kuhne
@ 2010-03-30 22:58 ` Scott Garron
2010-04-14 17:27 ` Pasi Kärkkäinen
2010-04-15 8:20 ` Ian Campbell
1 sibling, 2 replies; 12+ messages in thread
From: Scott Garron @ 2010-03-30 22:58 UTC (permalink / raw)
To: xen-devel
S.H. Verbrugge wrote:
> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> xenkernel from Debian repo before, with Xen 3.2), we started to have
> some problems when attempting to route packets on a domU.
I'm having the same problem. All TCP packets that are forwarded
through a domU are somehow getting a static checksum (0x9e85) just as
they're being put out on the wire. ICMP is dropped by the dom0, as you
describe, with the "Attempting to checksum a non UDP/TCP packet" message
in dmesg. More detail about my situation is in my post to xen-users, here:
http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
It doesn't include a solution, though.
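To check whether a checksum seen in a capture is actually valid, the Internet checksum can be recomputed by hand. Below is a minimal Python sketch of the RFC 1071 algorithm applied to an ICMP message; the function names are made up for illustration. Note that TCP and UDP checksums additionally cover a pseudo-header (addresses, protocol, length), which is not shown here.

```python
import struct


def internet_checksum(data):
    """RFC 1071: one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def icmp_echo_request(ident, seq, payload):
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def icmp_checksum_ok(icmp_bytes):
    """A message whose checksum field is correct sums to zero."""
    return internet_checksum(icmp_bytes) == 0
```

Feeding the ICMP portion of a captured packet to `icmp_checksum_ok` tells whether the checksum on the wire matches the contents; a field that stays constant across different packets would fail this check for almost all of them.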
> This is a tg3 interface
I'm also running the tg3 ethernet driver, which may be of
significance.
--
Scott Garron
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-30 22:58 ` Scott Garron
@ 2010-04-14 17:27 ` Pasi Kärkkäinen
2010-04-15 8:20 ` Ian Campbell
1 sibling, 0 replies; 12+ messages in thread
From: Pasi Kärkkäinen @ 2010-04-14 17:27 UTC (permalink / raw)
To: Scott Garron; +Cc: swente, xen-devel
On Tue, Mar 30, 2010 at 06:58:02PM -0400, Scott Garron wrote:
> S.H. Verbrugge wrote:
>> Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
>> xenkernel from Debian repo before, with Xen 3.2), we started to have
>> some problems when attempting to route packets on a domU.
>
> I'm having the same problem. All TCP packets that are forwarded
> through a domU are somehow getting a static checksum (0x9e85) just as
> they're being put out on the wire. ICMP is dropped by the dom0, as you
> describe, with the "Attempting to checksum a non UDP/TCP packet" message
> in dmesg. More detail about my situation is in my post to xen-users, here:
>
> http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
>
> It doesn't include a solution, though.
>
>> This is a tg3 interface
>
> I'm also running the tg3 ethernet driver, which may be of
> significance.
>
I CC'd someone else who is having the same problem. Did you guys ever resolve it?
For him the problem was solved when he replaced the pvops _domU_ kernel with 2.6.18.8
(still running pvops in dom0).
-- Pasi
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-03-30 22:58 ` Scott Garron
2010-04-14 17:27 ` Pasi Kärkkäinen
@ 2010-04-15 8:20 ` Ian Campbell
2010-05-22 11:51 ` Pasi Kärkkäinen
1 sibling, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2010-04-15 8:20 UTC (permalink / raw)
To: Scott Garron; +Cc: xen-devel@lists.xensource.com
On Tue, 2010-03-30 at 23:58 +0100, Scott Garron wrote:
> S.H. Verbrugge wrote:
> > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> > xenkernel from Debian repo before, with Xen 3.2), we started to have
> > some problems when attempting to route packets on a domU.
>
> I'm having the same problem. All TCP packets that are forwarded
> through a domU are somehow getting a static checksum (0x9e85) just as
> they're being put out on the wire. ICMP is dropped by the dom0, as you
> describe, with the "Attempting to checksum a non UDP/TCP packet" message
> in dmesg. More detail about my situation is in my post to xen-users, here:
>
> http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
>
> It doesn't include a solution, though.
>
> > This is a tg3 interface
>
> I'm also running the tg3 ethernet driver, which may be of
> significance.
According to the driver source some tg3 chipsets are known to have
broken checksumming hardware, in particular 5700 B0 silicon. The
workaround seems to have been present in the driver forever, though, so
that may be a red herring.
Ian.
* Re: Checksumming problem in pv_ops dom0 kernel / netback
2010-04-15 8:20 ` Ian Campbell
@ 2010-05-22 11:51 ` Pasi Kärkkäinen
0 siblings, 0 replies; 12+ messages in thread
From: Pasi Kärkkäinen @ 2010-05-22 11:51 UTC (permalink / raw)
To: Ian Campbell; +Cc: Scott Garron, xen-devel@lists.xensource.com
On Thu, Apr 15, 2010 at 09:20:39AM +0100, Ian Campbell wrote:
> On Tue, 2010-03-30 at 23:58 +0100, Scott Garron wrote:
> > S.H. Verbrugge wrote:
> > > Ever since we switched to a pv_ops dom0 kernel (we were using 2.6.26
> > > xenkernel from Debian repo before, with Xen 3.2), we started to have
> > > some problems when attempting to route packets on a domU.
> >
> > I'm having the same problem. All TCP packets that are forwarded
> > through a domU are somehow getting a static checksum (0x9e85) just as
> > they're being put out on the wire. ICMP is dropped by the dom0, as you
> > describe, with the "Attempting to checksum a non UDP/TCP packet" message
> > in dmesg. More detail about my situation is in my post to xen-users, here:
> >
> > http://lists.xensource.com/archives/html/xen-users/2010-03/msg00846.html
> >
> > It doesn't include a solution, though.
> >
> > > This is a tg3 interface
> >
> > I'm also running the tg3 ethernet driver, which may be of
> > significance.
>
> According to the driver source some tg3 chipsets are known to have
> broken checksumming hardware, in particular 5700 B0 silicon. The
> workaround seems to have been present in the driver forever though so
> that may be a red herring.
>
There was recently a fix for a bug in netback, so you might want to update
to the latest pvops dom0 kernel from the xen/stable-2.6.32.x branch and see if that
fixes the problem.
-- Pasi
end of thread, other threads:[~2010-05-22 11:51 UTC | newest]
Thread overview: 12+ messages
2010-03-17 12:34 Checksumming problem in pv_ops dom0 kernel / netback S.H. Verbrugge
2010-03-17 12:48 ` Stefan Kuhne
2010-03-17 15:48 ` S.H. Verbrugge
2010-03-17 23:25 ` James Harper
2010-03-17 23:37 ` Stefan Kuhne
2010-03-17 23:48 ` S.H. Verbrugge
2010-03-18 0:04 ` James Harper
2010-03-18 0:17 ` S.H. Verbrugge
2010-03-30 22:58 ` Scott Garron
2010-04-14 17:27 ` Pasi Kärkkäinen
2010-04-15 8:20 ` Ian Campbell
2010-05-22 11:51 ` Pasi Kärkkäinen