* Error with NFS and XEN (High network load)
@ 2006-08-15 22:58 Roberto Gonzalez Azevedo
2006-08-16 14:44 ` Emmanuel Ackaouy
2006-08-16 16:04 ` Herbert Xu
0 siblings, 2 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-15 22:58 UTC
To: xen-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Error with NFS and XEN (High network load):
I'm using UDP and 1500 MTU.
On server-side:
(LVM2)
/dev/mapper/vg-web on /web type xfs (rw,nosuid,noatime,quota,usrquota)
On client-side (dom0 or domU):
mount -t nfs -o rsize=8192,wsize=8192,intr,noexec,retry=10 \
server:/web /data/web
On domU machine:
root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
[ok]
Crash:
root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
[NOT OK]
My hardware is Dell PowerEdge 1850, network card module e1000.
Here is the crash:
root@dom0:# xm console machine
"
Unable to handle kernel NULL pointer dereference at virtual address 00000115
printing eip:
*pde = ma 00000000 pa fffff000
Oops: 0002 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0061:[<c02f9f99>] Not tainted VLI
EFLAGS: 00010086 (2.6.16.27-xenU_09-08-2006 #3)
EIP is at network_tx_buf_gc+0xdb/0x277
eax: 00000095 ebx: 00000087 ecx: c0508c54 edx: 00000000
esi: c0508380 edi: 00000085 ebp: c0499e98 esp: c0499e70
ds: 007b es: 007b ss: 0069
Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
00000000
c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
c0499ed4
c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
c048d780
Call Trace:
[<c0105580>] show_stack_log_lvl+0xb9/0x103
[<c0105766>] show_registers+0x19c/0x232
[<c0105a64>] die+0x116/0x233
[<c01112ca>] do_page_fault+0x4e8/0x8cc
[<c0104ff7>] error_code+0x2b/0x30
[<c02fa230>] netif_int+0x29/0x121
[<c013c911>] handle_IRQ_event+0x3c/0xac
[<c013ca0d>] __do_IRQ+0x8c/0xef
[<c0106a62>] do_IRQ+0x1d/0x29
[<c02ee27b>] evtchn_do_upcall+0x94/0xc8
[<c0105039>] hypervisor_callback+0x3d/0x48
[<c010382f>] xen_idle+0x2d/0x53
[<c01038c0>] cpu_idle+0x6b/0xb6
[<c0102035>] rest_init+0x35/0x37
[<c049a4ed>] start_kernel+0x2e7/0x392
[<c010006f>] 0xc010006f
Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
<0>Kernel panic - not syncing: Fatal exception in interrupt
"
And the domU dies.
Is this a bug?
I have reported it in Bugzilla as bug #735.
With Xen Stable (3.0.2.2), dom0 crashes too.
- --
- ----------------------------
Roberto Gonzalez Azevedo
Tested with:
Xen Stable 3.0.2.2 and Xen Unstable (both crash)
(dom0 = Ubuntu 6.06.1 Dapper and
dom0 = CentOS 4.3 with official Xen RPMs;
domU = Slackware 10.2)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
iD8DBQFE4lGHF+EMwkXLsEwRAuaAAJwKzO900w1LhFayq4hd6bNeqBWiPACeNlGK
dMrMcjZiLyiZ5ThcDC6711Q=
=rbnf
-----END PGP SIGNATURE-----
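The oops in the report above can be cross-checked by hand. The sketch below assumes the standard i386 instruction encoding and the usual x86 page-fault error-code bits; the helper names are made up for illustration and do not come from the Xen or Linux sources. The bytes at the `<ff>` marker in the Code: line, `ff 8f 90 00 00 00`, read as a decrement of the 32-bit word at 0x90(%edi). With edi = 00000085 from the register dump, that operand lands on 00000115, the faulting address in the first line of the oops, and error code 0002 describes a kernel-mode write to a page that is not present. In other words, edi most likely held a small integer rather than a valid kernel pointer.

```c
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error-code bits, as printed after "Oops:". */
#define PF_PROT  0x1u /* set: protection violation, clear: page not present */
#define PF_WRITE 0x2u /* set: write access,         clear: read access      */
#define PF_USER  0x4u /* set: fault in user mode,   clear: kernel mode      */

/* Fields of an x86 ModRM byte. */
static unsigned modrm_mod(uint8_t m) { return m >> 6; }
static unsigned modrm_reg(uint8_t m) { return (m >> 3) & 7u; }
static unsigned modrm_rm(uint8_t m)  { return m & 7u; }

/* Little-endian 32-bit displacement that follows the ModRM byte. */
static uint32_t disp32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Address touched by a base+disp32 operand such as 0x90(%edi). */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Print what an Oops error code says about the faulting access. */
static void describe_oops(unsigned code)
{
    printf("%s a %s page in %s mode\n",
           (code & PF_WRITE) ? "write to" : "read from",
           (code & PF_PROT) ? "protected" : "not-present",
           (code & PF_USER) ? "user" : "kernel");
}

/* Re-derive the fault address from the values in the report. */
static uint32_t check_report(void)
{
    /* Bytes starting at the <ff> marker in the Code: line. */
    static const uint8_t insn[] = { 0xff, 0x8f, 0x90, 0x00, 0x00, 0x00 };
    const uint32_t edi = 0x00000085; /* from the register dump */

    describe_oops(0x0002);
    /* Opcode ff with reg field 1 is DEC; mod 2 means disp32; rm 7 is edi. */
    printf("mod=%u reg=%u rm=%u\n",
           modrm_mod(insn[1]), modrm_reg(insn[1]), modrm_rm(insn[1]));
    return effective_address(edi, disp32(&insn[2]));
}
```

Calling check_report() from a small main() and printing the result in hex should give 0x115, matching the address in the oops.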
* Re: Error with NFS and XEN (High network load)
2006-08-15 22:58 Error with NFS and XEN (High network load) Roberto Gonzalez Azevedo
@ 2006-08-16 14:44 ` Emmanuel Ackaouy
2006-08-16 16:18 ` Roberto Gonzalez Azevedo
2006-08-16 16:04 ` Herbert Xu
1 sibling, 1 reply; 6+ messages in thread
From: Emmanuel Ackaouy @ 2006-08-16 14:44 UTC
To: Roberto Gonzalez Azevedo; +Cc: xen-devel
Roberto,
I can't reproduce this on tip of unstable. I tried both fragmented
UDP tx with ttcp and the same NFS dd recipe, with NFS over UDP and
an 8k r/w size. I even tried NFS over TCP for good measure (since
your recipe didn't explicitly mount with -o udp).
I notice you are running 2.6.16.27. Which xen linux patches
did you apply? Which changeset of xen-unstable?
I wonder what versions of netfront and netback you're using.
Have you tried tip of unstable with the latest gso patches?
Emmanuel.
On Tue, Aug 15, 2006 at 07:58:15PM -0300, Roberto Gonzalez Azevedo wrote:
> Here is the crash:
> root@dom0:# xm console machine
> "
> Unable to handle kernel NULL pointer dereference at virtual address 00000115
> printing eip:
> *pde = ma 00000000 pa fffff000
> Oops: 0002 [#1]
> SMP
> Modules linked in:
> CPU: 0
> EIP: 0061:[<c02f9f99>] Not tainted VLI
> EFLAGS: 00010086 (2.6.16.27-xenU_09-08-2006 #3)
> EIP is at network_tx_buf_gc+0xdb/0x277
> eax: 00000095 ebx: 00000087 ecx: c0508c54 edx: 00000000
> esi: c0508380 edi: 00000085 ebp: c0499e98 esp: c0499e70
> ds: 007b es: 007b ss: 0069
> Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
> Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
> 00000000
> c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
> c0499ed4
> c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
> c048d780
> Call Trace:
> [<c0105580>] show_stack_log_lvl+0xb9/0x103
> [<c0105766>] show_registers+0x19c/0x232
> [<c0105a64>] die+0x116/0x233
> [<c01112ca>] do_page_fault+0x4e8/0x8cc
> [<c0104ff7>] error_code+0x2b/0x30
> [<c02fa230>] netif_int+0x29/0x121
> [<c013c911>] handle_IRQ_event+0x3c/0xac
> [<c013ca0d>] __do_IRQ+0x8c/0xef
> [<c0106a62>] do_IRQ+0x1d/0x29
> [<c02ee27b>] evtchn_do_upcall+0x94/0xc8
> [<c0105039>] hypervisor_callback+0x3d/0x48
> [<c010382f>] xen_idle+0x2d/0x53
> [<c01038c0>] cpu_idle+0x6b/0xb6
> [<c0102035>] rest_init+0x35/0x37
> [<c049a4ed>] start_kernel+0x2e7/0x392
> [<c010006f>] 0xc010006f
> Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
> 00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
> 90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
> <0>Kernel panic - not syncing: Fatal exception in interrupt
> "
>
> And the domU dies ...
>
> Is it a bug ?
> I have reported it in bugzilla as Bug id # 735.
> Dom0 crashes using Xen Stable (3.0.2.2).
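The reply above mentions fragmented UDP transmission, which is what the original recipe produces: with wsize=8192 over UDP and a 1500-byte MTU, every NFS write request is larger than one packet. A minimal sketch of the arithmetic follows, assuming IPv4 with a 20-byte header and no options; the function name is hypothetical, and the RPC and NFS header sizes are not given in the thread, so they are left out of the payload.

```c
#include <stddef.h>

#define IPV4_HEADER_BYTES 20u /* assumption: no IP options */
#define UDP_HEADER_BYTES   8u

/*
 * Number of IPv4 fragments needed to carry one UDP datagram with the
 * given payload size over a link with the given MTU.
 * Returns 0 if the MTU is too small to carry any fragment data.
 */
static size_t ipv4_fragments_for_udp(size_t udp_payload, size_t mtu)
{
    size_t per_fragment;
    size_t datagram;

    if (mtu <= IPV4_HEADER_BYTES)
        return 0;

    /* Fragment offsets are counted in 8-byte units, so round down. */
    per_fragment = ((mtu - IPV4_HEADER_BYTES) / 8) * 8;
    if (per_fragment == 0)
        return 0;

    /* The UDP header travels inside the IP payload of the first fragment. */
    datagram = udp_payload + UDP_HEADER_BYTES;

    return (datagram + per_fragment - 1) / per_fragment; /* ceiling division */
}
```

For an 8192-byte payload and a 1500-byte MTU this gives 6 fragments (8200 bytes spread over fragments of at most 1480 bytes). The real write requests are slightly larger because of the RPC and NFS headers, but the count stays at 6 for any UDP payload up to 8872 bytes.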
* Re: Error with NFS and XEN (High network load)
2006-08-15 22:58 Error with NFS and XEN (High network load) Roberto Gonzalez Azevedo
2006-08-16 14:44 ` Emmanuel Ackaouy
@ 2006-08-16 16:04 ` Herbert Xu
2006-08-17 3:41 ` Roberto Gonzalez Azevedo
2006-08-17 14:09 ` Roberto Gonzalez Azevedo
1 sibling, 2 replies; 6+ messages in thread
From: Herbert Xu @ 2006-08-16 16:04 UTC
To: Keir Fraser, Roberto Gonzalez Azevedo; +Cc: Xen Development Mailing List
On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
>
> On domU machine:
> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
> [ok]
>
> Crash:
> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
> [NOT OK]
Thanks for the report. This is what I needed to reproduce the problem.
I forgot to initialise the first fragment to a proper value in the first
SG patch.
[NET] back: Initialise first fragment properly
The first fragment is used to store the pending_idx of the leading
txreq if it doesn't fit in the head area. When it does fit into
the head we need to ensure that the first fragment contains a value
that is not equal to pending_idx as that's what we use to distinguish
between the two cases in a number of places.
This patch sets the first fragment to ~0 which is not equal to any
valid pending_idx. Without this initialisation, we may double-free
a pending_idx if the first fragment happened to contain a value
equal to it (this usually happened with pending_idx 0).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r ec03b24a2d83 linux-2.6-xen-sparse/drivers/xen/netback/netback.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c	Tue Aug 15 19:53:55 2006 +0100
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c	Thu Aug 17 01:57:54 2006 +1000
@@ -1147,6 +1147,8 @@ static void net_tx_action(unsigned long
 		__skb_put(skb, data_len);
 		skb_shinfo(skb)->nr_frags = ret;
+		skb_shinfo(skb)->frags[0].page = (void *)~0UL;
+
 		if (data_len < txreq.size) {
 			skb_shinfo(skb)->nr_frags++;
 			skb_shinfo(skb)->frags[0].page =
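
The idea in the commit message can be sketched with a toy model. Everything below is hypothetical (the struct, the function names, and the release counter) and is not the netback code; it only mirrors the description above: the first fragment slot either holds the leading request's pending_idx or must hold a value that can never equal it, because a later comparison decides whether that index still has to be released.

```c
#include <stdbool.h>

#define TOY_MAX_PENDING 8
#define TOY_NO_IDX (~0UL) /* sentinel: never a valid pending index */

/* How many times each pending slot has been released. */
static int toy_release_count[TOY_MAX_PENDING];

struct toy_packet {
    unsigned long head_idx; /* pending index of the leading request */
    unsigned long frag0;    /* first fragment slot */
};

static void toy_release(unsigned long idx)
{
    if (idx < TOY_MAX_PENDING)
        toy_release_count[idx]++;
}

/* Build a packet; init_frag0 selects whether the sentinel is written. */
static void toy_build(struct toy_packet *p, unsigned long head_idx,
                      bool fits_in_head, bool init_frag0)
{
    p->head_idx = head_idx;
    if (init_frag0)
        p->frag0 = TOY_NO_IDX;  /* the fix: start from the sentinel */

    if (fits_in_head)
        toy_release(head_idx);  /* head consumed now, slot released once */
    else
        p->frag0 = head_idx;    /* head deferred to the first fragment */
}

/* Completion: a matching frag0 means the head slot is still outstanding. */
static void toy_complete(const struct toy_packet *p)
{
    if (p->frag0 == p->head_idx)
        toy_release(p->frag0);
}

/* Returns how many times slot 0 is released for a head that fits. */
static int toy_run(bool init_frag0)
{
    /* frag0 = 0 models stale memory that happens to equal pending index 0. */
    struct toy_packet p = { .head_idx = 0, .frag0 = 0 };

    toy_release_count[0] = 0;
    toy_build(&p, 0, true, init_frag0);
    toy_complete(&p);
    return toy_release_count[0];
}
```

In this model toy_run(false) releases slot 0 twice, while toy_run(true) releases it once, which is the difference the sentinel is meant to make.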
* Re: Error with NFS and XEN (High network load)
2006-08-16 14:44 ` Emmanuel Ackaouy
@ 2006-08-16 16:18 ` Roberto Gonzalez Azevedo
0 siblings, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-16 16:18 UTC
To: Roberto Gonzalez Azevedo, xen-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Emmanuel Ackaouy wrote:
> Roberto,
>
> I can't reproduce this on tip unstable. Tried both fragmented
> UDP tx with ttcp and same NFS dd recipe with NFS over UDP and
> 8k r/w size. I even tried NFSoverTCP for good measure (since
> your recipe didn't explicitly mount -o udp).
>
> I notice you are running 2.6.16.27. Which xen linux patches
> did you apply? Which changeset of xen-unstable?
>
I'm using 2.6.16.13 and the latest Xen Unstable.
(The 2.6.16.27 was a typing error, sorry :)
With unstable, only the domU crashes; xend dies, but dom0 does not crash.
With the stable version, all domains crash (dom0 and domU).
How do I apply the latest gso patches?
Thanks.
- ----------------------------
Roberto Gonzalez Azevedo
> I wonder what versions of netfront and netback you're using.
> Have you tried tip of unstable with the latest gso patches?
>
> Emmanuel.
>
> On Tue, Aug 15, 2006 at 07:58:15PM -0300, Roberto Gonzalez Azevedo wrote:
>> Here is the crash:
>> root@dom0:# xm console machine
>> "
>> Unable to handle kernel NULL pointer dereference at virtual address 00000115
>> printing eip:
>> *pde = ma 00000000 pa fffff000
>> Oops: 0002 [#1]
>> SMP
>> Modules linked in:
>> CPU: 0
>> EIP: 0061:[<c02f9f99>] Not tainted VLI
>> EFLAGS: 00010086 (2.6.16.27-xenU_09-08-2006 #3)
>> EIP is at network_tx_buf_gc+0xdb/0x277
>> eax: 00000095 ebx: 00000087 ecx: c0508c54 edx: 00000000
>> esi: c0508380 edi: 00000085 ebp: c0499e98 esp: c0499e70
>> ds: 007b es: 007b ss: 0069
>> Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
>> Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
>> 00000000
>> c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
>> c0499ed4
>> c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
>> c048d780
>> Call Trace:
>> [<c0105580>] show_stack_log_lvl+0xb9/0x103
>> [<c0105766>] show_registers+0x19c/0x232
>> [<c0105a64>] die+0x116/0x233
>> [<c01112ca>] do_page_fault+0x4e8/0x8cc
>> [<c0104ff7>] error_code+0x2b/0x30
>> [<c02fa230>] netif_int+0x29/0x121
>> [<c013c911>] handle_IRQ_event+0x3c/0xac
>> [<c013ca0d>] __do_IRQ+0x8c/0xef
>> [<c0106a62>] do_IRQ+0x1d/0x29
>> [<c02ee27b>] evtchn_do_upcall+0x94/0xc8
>> [<c0105039>] hypervisor_callback+0x3d/0x48
>> [<c010382f>] xen_idle+0x2d/0x53
>> [<c01038c0>] cpu_idle+0x6b/0xb6
>> [<c0102035>] rest_init+0x35/0x37
>> [<c049a4ed>] start_kernel+0x2e7/0x392
>> [<c010006f>] 0xc010006f
>> Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
>> 00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
>> 90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
>> <0>Kernel panic - not syncing: Fatal exception in interrupt
>> "
>>
>> And the domU dies ...
>>
>> Is it a bug ?
>> I have reported it in bugzilla as Bug id # 735.
>> Dom0 crashes using Xen Stable (3.0.2.2).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
iD8DBQFE40ViF+EMwkXLsEwRAg+HAJ4pqBYC1PDIJf+xzAPMFTzbN2d5lQCcCcID
kaKigAVdThbevwc3+KiQ9So=
=DsGl
-----END PGP SIGNATURE-----
* Re: Error with NFS and XEN (High network load)
2006-08-16 16:04 ` Herbert Xu
@ 2006-08-17 3:41 ` Roberto Gonzalez Azevedo
2006-08-17 14:09 ` Roberto Gonzalez Azevedo
1 sibling, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-17 3:41 UTC
To: Herbert Xu; +Cc: Xen Development Mailing List
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Thanks! Now it's working! Dom0 and domU don't crash anymore.
Please report it on Xen Bugzilla bug #735.
Thanks!
- ----------------------------
Roberto Gonzalez Azevedo
Herbert Xu wrote:
> On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
>> On domU machine:
>> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
>> [ok]
>>
>> Crash:
>> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
>> [NOT OK]
>
> Thanks for the report. This is what I needed to reproduce the problem.
> I forgot to initialise the first fragment to a proper value in the first
> SG patch.
>
> [NET] back: Initialise first fragment properly
>
> The first fragment is used to store the pending_idx of the leading
> txreq if it doesn't fit in the head area. When it does fit into
> the head we need to ensure that the first fragment contains a value
> that is not equal to pending_idx as that's what we use to distinguish
> between the two cases in a number of places.
>
> This patch sets the first fragment to ~0 which is not equal to any
> valid pending_idx. Without this initialisation, we may double-free
> a pending_idx if the first fragment happened to contain a value
> equal to it (this usually happened with pending_idx 0).
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Cheers,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
iD8DBQFE4+V4F+EMwkXLsEwRAkrbAKCdhygwo/Hh+oyDgaS3I0z0B6IzlwCggmok
vcxKwacM/yKnhdpChmlIQPI=
=SrQT
-----END PGP SIGNATURE-----
* Re: Error with NFS and XEN (High network load)
2006-08-16 16:04 ` Herbert Xu
2006-08-17 3:41 ` Roberto Gonzalez Azevedo
@ 2006-08-17 14:09 ` Roberto Gonzalez Azevedo
1 sibling, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-17 14:09 UTC
To: Herbert Xu, xen-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Thanks! Now it's working! Dom0 and domU don't crash anymore.
Please report it on Xen Bugzilla bug #735.
Thanks!
- --
- ----------------------------
Roberto Gonzalez Azevedo
Herbert Xu wrote:
> On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
>> On domU machine:
>> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
>> [ok]
>>
>> Crash:
>> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
>> [NOT OK]
>
> Thanks for the report. This is what I needed to reproduce the problem.
> I forgot to initialise the first fragment to a proper value in the first
> SG patch.
>
> [NET] back: Initialise first fragment properly
>
> The first fragment is used to store the pending_idx of the leading
> txreq if it doesn't fit in the head area. When it does fit into
> the head we need to ensure that the first fragment contains a value
> that is not equal to pending_idx as that's what we use to distinguish
> between the two cases in a number of places.
>
> This patch sets the first fragment to ~0 which is not equal to any
> valid pending_idx. Without this initialisation, we may double-free
> a pending_idx if the first fragment happened to contain a value
> equal to it (this usually happened with pending_idx 0).
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Cheers,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
iD8DBQFE5HiiF+EMwkXLsEwRAkGVAKCh4+qOF5ONalGWTSaKYFAQG0/MggCfcLIk
yqtPsCTMxJP7ueouUOCFwkM=
=fuLb
-----END PGP SIGNATURE-----