Error with NFS and XEN (High network load)

* Error with NFS and XEN (High network load)
@ 2006-08-15 22:58 Roberto Gonzalez Azevedo
  2006-08-16 14:44 ` Emmanuel Ackaouy
  2006-08-16 16:04 ` Herbert Xu
  0 siblings, 2 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-15 22:58 UTC (permalink / raw)
  To: xen-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Error with NFS and XEN (High network load):

I'm using UDP and 1500 MTU.

On server-side:
(LVM2)
/dev/mapper/vg-web on /web type xfs (rw,nosuid,noatime,quota,usrquota)

On client-side (dom0 or domU):
mount -t nfs -o rsize=8192,wsize=8192,intr,noexec,retry=10 \
server:/web /data/web


On domU machine:
root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
[ok]

Crash:
root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
[NOT OK]

My hardware is Dell PowerEdge 1850, network card module e1000.

Here is the crash:
root@dom0:# xm console machine
"
Unable to handle kernel NULL pointer dereference at virtual address 00000115
 printing eip:
*pde = ma 00000000 pa fffff000
Oops: 0002 [#1]
SMP
Modules linked in:
CPU:    0
EIP:    0061:[<c02f9f99>]    Not tainted VLI
EFLAGS: 00010086   (2.6.16.27-xenU_09-08-2006 #3)
EIP is at network_tx_buf_gc+0xdb/0x277
eax: 00000095   ebx: 00000087   ecx: c0508c54   edx: 00000000
esi: c0508380   edi: 00000085   ebp: c0499e98   esp: c0499e70
ds: 007b   es: 007b   ss: 0069
Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
00000000
       c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
c0499ed4
       c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
c048d780
Call Trace:
 [<c0105580>] show_stack_log_lvl+0xb9/0x103
 [<c0105766>] show_registers+0x19c/0x232
 [<c0105a64>] die+0x116/0x233
 [<c01112ca>] do_page_fault+0x4e8/0x8cc
 [<c0104ff7>] error_code+0x2b/0x30
 [<c02fa230>] netif_int+0x29/0x121
 [<c013c911>] handle_IRQ_event+0x3c/0xac
 [<c013ca0d>] __do_IRQ+0x8c/0xef
 [<c0106a62>] do_IRQ+0x1d/0x29
 [<c02ee27b>] evtchn_do_upcall+0x94/0xc8
 [<c0105039>] hypervisor_callback+0x3d/0x48
 [<c010382f>] xen_idle+0x2d/0x53
 [<c01038c0>] cpu_idle+0x6b/0xb6
 [<c0102035>] rest_init+0x35/0x37
 [<c049a4ed>] start_kernel+0x2e7/0x392
 [<c010006f>] 0xc010006f
Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
 <0>Kernel panic - not syncing: Fatal exception in interrupt
"

And the domU dies.

Is this a bug?

I have reported it in Bugzilla as bug #735.

Dom0 also crashes when using Xen Stable (3.0.2.2).
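
The register dump in the oops above is consistent with the reported fault
address. Reading the faulting bytes in the Code: line (<ff> 8f 90 00 00 00)
as a decrement of the 32-bit word at edi+0x90 is an interpretation of the
dump, not something stated in the report; under that reading, edi = 0x85
gives 0x85 + 0x90 = 0x115. Oops code 0002 on i386 decodes as a kernel-mode
write to a page that is not present. A minimal C sketch of that arithmetic
follows; all helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded i386 page-fault error code, as printed after "Oops:". */
struct fault_info {
    int present; /* bit 0: 0 = page not present, 1 = protection fault */
    int write;   /* bit 1: 0 = read access, 1 = write access */
    int user;    /* bit 2: 0 = kernel mode, 1 = user mode */
};

/* Hypothetical helper: split the error code into its three low bits. */
static struct fault_info decode_oops_code(uint32_t code)
{
    struct fault_info f;
    f.present = (code & 0x1u) != 0;
    f.write = (code & 0x2u) != 0;
    f.user = (code & 0x4u) != 0;
    return f;
}

/* Hypothetical helper: address touched by a base+disp32 memory operand. */
static uint32_t effective_address(uint32_t base_reg, uint32_t disp32)
{
    return base_reg + disp32;
}

/* Check the values taken from the report: edi = 0x85, disp32 = 0x90. */
static void check_reported_oops(void)
{
    struct fault_info f = decode_oops_code(0x0002u);
    assert(!f.present && f.write && !f.user);
    assert(effective_address(0x85u, 0x90u) == 0x115u);
}
```

A base register as small as 0x85 looks like an index or a corrupted value
rather than a real pointer, which is why the kernel reports the access as a
NULL pointer dereference.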

- --
- ----------------------------
Roberto Gonzalez Azevedo
Tested with:
Xen Stable 3.0.2.2 or Xen Unstable (both crash)
(dom0 = Ubuntu 6.06.1 Dapper AND
 dom0 = CentOS 4.3 with official Xen RPMs
 domU = Slackware 10.2)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFE4lGHF+EMwkXLsEwRAuaAAJwKzO900w1LhFayq4hd6bNeqBWiPACeNlGK
dMrMcjZiLyiZ5ThcDC6711Q=
=rbnf
-----END PGP SIGNATURE-----


* Re: Error with NFS and XEN (High network load)
  2006-08-15 22:58 Error with NFS and XEN (High network load) Roberto Gonzalez Azevedo
@ 2006-08-16 14:44 ` Emmanuel Ackaouy
  2006-08-16 16:18   ` Roberto Gonzalez Azevedo
  2006-08-16 16:04 ` Herbert Xu
  1 sibling, 1 reply; 6+ messages in thread
From: Emmanuel Ackaouy @ 2006-08-16 14:44 UTC (permalink / raw)
  To: Roberto Gonzalez Azevedo; +Cc: xen-devel

Roberto,

I can't reproduce this on tip of unstable. I tried both fragmented
UDP tx with ttcp and the same NFS dd recipe with NFS over UDP and
8k r/w size. I even tried NFS over TCP for good measure (since
your recipe didn't explicitly mount -o udp).
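
To see why an 8k NFS write over UDP means fragmented transmits on a
1500-byte MTU, here is a rough C sketch of the fragment count. It assumes a
20-byte IPv4 header without options and an 8-byte UDP header, and it ignores
the RPC and NFS header overhead; the function name is hypothetical.

```c
#include <assert.h>

#define IPV4_HEADER_LEN 20u /* assumption: IPv4 header without options */
#define UDP_HEADER_LEN 8u

/*
 * Number of IPv4 fragments needed to carry one UDP datagram.
 * Every fragment except the last must carry a multiple of 8 bytes,
 * so the per-fragment capacity is rounded down to a multiple of 8.
 */
static unsigned ipv4_fragments_for_udp(unsigned udp_payload_len, unsigned mtu)
{
    assert(mtu >= IPV4_HEADER_LEN + 8u);

    unsigned per_fragment = (mtu - IPV4_HEADER_LEN) & ~7u;
    unsigned ip_payload = UDP_HEADER_LEN + udp_payload_len;

    return (ip_payload + per_fragment - 1u) / per_fragment;
}
```

With these assumptions, ipv4_fragments_for_udp(8192, 1500) comes out as 6,
so each 8k write leaves the domU as a multi-fragment packet.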

I notice you are running 2.6.16.27. Which xen linux patches
did you apply? Which changeset of xen-unstable?

I wonder what versions of netfront and netback you're using.
Have you tried tip of unstable with the latest gso patches?

Emmanuel.

On Tue, Aug 15, 2006 at 07:58:15PM -0300, Roberto Gonzalez Azevedo wrote:
> Here is the crash:
> root@dom0:# xm console machine
> "
> Unable to handle kernel NULL pointer dereference at virtual address 00000115
>  printing eip:
> *pde = ma 00000000 pa fffff000
> Oops: 0002 [#1]
> SMP
> Modules linked in:
> CPU:    0
> EIP:    0061:[<c02f9f99>]    Not tainted VLI
> EFLAGS: 00010086   (2.6.16.27-xenU_09-08-2006 #3)
> EIP is at network_tx_buf_gc+0xdb/0x277
> eax: 00000095   ebx: 00000087   ecx: c0508c54   edx: 00000000
> esi: c0508380   edi: 00000085   ebp: c0499e98   esp: c0499e70
> ds: 007b   es: 007b   ss: 0069
> Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
> Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
> 00000000
>        c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
> c0499ed4
>        c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
> c048d780
> Call Trace:
>  [<c0105580>] show_stack_log_lvl+0xb9/0x103
>  [<c0105766>] show_registers+0x19c/0x232
>  [<c0105a64>] die+0x116/0x233
>  [<c01112ca>] do_page_fault+0x4e8/0x8cc
>  [<c0104ff7>] error_code+0x2b/0x30
>  [<c02fa230>] netif_int+0x29/0x121
>  [<c013c911>] handle_IRQ_event+0x3c/0xac
>  [<c013ca0d>] __do_IRQ+0x8c/0xef
>  [<c0106a62>] do_IRQ+0x1d/0x29
>  [<c02ee27b>] evtchn_do_upcall+0x94/0xc8
>  [<c0105039>] hypervisor_callback+0x3d/0x48
>  [<c010382f>] xen_idle+0x2d/0x53
>  [<c01038c0>] cpu_idle+0x6b/0xb6
>  [<c0102035>] rest_init+0x35/0x37
>  [<c049a4ed>] start_kernel+0x2e7/0x392
>  [<c010006f>] 0xc010006f
> Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
> 00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
> 90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
>  <0>Kernel panic - not syncing: Fatal exception in interrupt
> "
> 
> And the domU dies ...
> 
> Is it a bug ?
> I have reported it in bugzilla as Bug id # 735.
> Dom0 crashes using Xen Stable (3.0.2.2).


* Re: Error with NFS and XEN (High network load)
  2006-08-15 22:58 Error with NFS and XEN (High network load) Roberto Gonzalez Azevedo
  2006-08-16 14:44 ` Emmanuel Ackaouy
@ 2006-08-16 16:04 ` Herbert Xu
  2006-08-17  3:41   ` Roberto Gonzalez Azevedo
  2006-08-17 14:09   ` Roberto Gonzalez Azevedo
  1 sibling, 2 replies; 6+ messages in thread
From: Herbert Xu @ 2006-08-16 16:04 UTC (permalink / raw)
  To: Keir Fraser, Roberto Gonzalez Azevedo; +Cc: Xen Development Mailing List

On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
> 
> On domU machine:
> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
> [ok]
> 
> Crash:
> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
> [NOT OK]

Thanks for the report.  This is what I needed to reproduce the problem.
I forgot to initialise the first fragment to a proper value in the first
SG patch.

[NET] back: Initialise first fragment properly

The first fragment is used to store the pending_idx of the leading
txreq if it doesn't fit in the head area.  When it does fit into
the head we need to ensure that the first fragment contains a value
that is not equal to pending_idx as that's what we use to distinguish
between the two cases in a number of places.

This patch sets the first fragment to ~0 which is not equal to any
valid pending_idx.  Without this initialisation, we may double-free
a pending_idx if the first fragment happened to contain a value
equal to it (this usually happened with pending_idx 0).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff -r ec03b24a2d83 linux-2.6-xen-sparse/drivers/xen/netback/netback.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c	Tue Aug 15 19:53:55 2006 +0100
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c	Thu Aug 17 01:57:54 2006 +1000
@@ -1147,6 +1147,8 @@ static void net_tx_action(unsigned long 
 		__skb_put(skb, data_len);

 		skb_shinfo(skb)->nr_frags = ret;
+		skb_shinfo(skb)->frags[0].page = (void *)~0UL;
+
 		if (data_len < txreq.size) {
 			skb_shinfo(skb)->nr_frags++;
 			skb_shinfo(skb)->frags[0].page =
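
The sentinel idea in the patch description can be sketched in standalone C,
outside the kernel. This is a simplified model, not the netback code: the
struct, the function names and the 16-bit index type are all assumptions
made for illustration. It shows only the classification step, that is, how
a stale first-fragment slot can be mistaken for "the leading request
spilled out of the head" and how an explicit invalid value avoids that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: assumed never to be a valid pending index. */
#define INVALID_PENDING_IDX UINT16_MAX

/* Simplified stand-in for the per-packet bookkeeping. */
struct tx_packet {
    uint16_t head_idx;  /* pending index of the leading request */
    uint16_t frag0_idx; /* pending index recorded in the first fragment slot */
};

/* Always give the first fragment slot a defined value. */
static void tx_packet_init(struct tx_packet *p, uint16_t head_idx, int spills)
{
    assert(head_idx != INVALID_PENDING_IDX);
    p->head_idx = head_idx;
    p->frag0_idx = spills ? head_idx : INVALID_PENDING_IDX;
}

/* Later code tells the two cases apart by comparing the two indices. */
static int leading_request_spills(const struct tx_packet *p)
{
    return p->frag0_idx == p->head_idx;
}

/*
 * Model of the bug: the slot is left with stale contents (zero here) and
 * the leading request happens to use pending index 0. Returns 1 because
 * the packet is wrongly classified as spilling.
 */
static int stale_slot_is_misread(void)
{
    struct tx_packet p;
    memset(&p, 0, sizeof(p)); /* stale memory that happens to be zero */
    p.head_idx = 0;           /* frag0_idx never initialised */
    return leading_request_spills(&p);
}
```

In this model a packet that fits in the head and is set up through
tx_packet_init() is never classified as spilling, whatever its index, while
the uninitialised variant is misclassified whenever the stale value matches.
In the real driver that misclassification is what could lead to the same
pending_idx being released twice.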


* Re: Error with NFS and XEN (High network load)
  2006-08-16 14:44 ` Emmanuel Ackaouy
@ 2006-08-16 16:18   ` Roberto Gonzalez Azevedo
  0 siblings, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-16 16:18 UTC (permalink / raw)
  To: Roberto Gonzalez Azevedo, xen-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Emmanuel Ackaouy wrote:
> Roberto,
> 
> I can't reproduce this on tip unstable. Tried both fragmented
> UDP tx with ttcp and same NFS dd recipe with NFS over UDP and
> 8k r/w size. I even tried NFSoverTCP for good measure (since
> your recipe didn't explicitly mount -o udp).
> 
> I notice you are running 2.6.16.27. Which xen linux patches
> did you apply? Which changeset of xen-unstable?
>

I'm using 2.6.16.13 and the latest Xen Unstable.
That was a typing error, sorry. :)

With unstable, only the domU crashes; xend dies, but dom0 does not crash.
With the stable version, all domains crash (dom0 and domU).

How do I apply the latest GSO patches?

Thanks.

- ----------------------------
Roberto Gonzalez Azevedo


> I wonder what versions of netfront and netback you're using.
> Have you tried tip of unstable with the latest gso patches?
> 
> Emmanuel.
> 
> On Tue, Aug 15, 2006 at 07:58:15PM -0300, Roberto Gonzalez Azevedo wrote:
>> Here is the crash:
>> root@dom0:# xm console machine
>> "
>> Unable to handle kernel NULL pointer dereference at virtual address 00000115
>>  printing eip:
>> *pde = ma 00000000 pa fffff000
>> Oops: 0002 [#1]
>> SMP
>> Modules linked in:
>> CPU:    0
>> EIP:    0061:[<c02f9f99>]    Not tainted VLI
>> EFLAGS: 00010086   (2.6.16.27-xenU_09-08-2006 #3)
>> EIP is at network_tx_buf_gc+0xdb/0x277
>> eax: 00000095   ebx: 00000087   ecx: c0508c54   edx: 00000000
>> esi: c0508380   edi: 00000085   ebp: c0499e98   esp: c0499e70
>> ds: 007b   es: 007b   ss: 0069
>> Process swapper (pid: 0, threadinfo=c0498000 task=c0434500)
>> Stack: <0>c0508c54 00000000 c0498000 c0508000 00026c3f 00026c8b 00026c6e
>> 00000000
>>        c0508408 c0508380 c0499eac c02fa230 c7d98a40 00000000 00000000
>> c0499ed4
>>        c013c911 00000107 c0508000 c0499f3c c0499f3c 00000107 00008380
>> c048d780
>> Call Trace:
>>  [<c0105580>] show_stack_log_lvl+0xb9/0x103
>>  [<c0105766>] show_registers+0x19c/0x232
>>  [<c0105a64>] die+0x116/0x233
>>  [<c01112ca>] do_page_fault+0x4e8/0x8cc
>>  [<c0104ff7>] error_code+0x2b/0x30
>>  [<c02fa230>] netif_int+0x29/0x121
>>  [<c013c911>] handle_IRQ_event+0x3c/0xac
>>  [<c013ca0d>] __do_IRQ+0x8c/0xef
>>  [<c0106a62>] do_IRQ+0x1d/0x29
>>  [<c02ee27b>] evtchn_do_upcall+0x94/0xc8
>>  [<c0105039>] hypervisor_callback+0x3d/0x48
>>  [<c010382f>] xen_idle+0x2d/0x53
>>  [<c01038c0>] cpu_idle+0x6b/0xb6
>>  [<c0102035>] rest_init+0x35/0x37
>>  [<c049a4ed>] start_kernel+0x2e7/0x392
>>  [<c010006f>] 0xc010006f
>> Code: 89 44 24 04 89 14 24 e8 cf 54 ff ff c7 84 9e d8 08 00 00 00 00 00
>> 00 8b 86 d0 00 00 00 89 84 9e d0 00 00 00 89 9e d0 00 00 00 90 <ff> 8f
>> 90 00 00 00 0f 94 c0 84 c0 75 4e 8b 4e 74 83 45 f0 01 8b
>>  <0>Kernel panic - not syncing: Fatal exception in interrupt
>> "
>>
>> And the domU dies ...
>>
>> Is it a bug ?
>> I have reported it in bugzilla as Bug id # 735.
>> Dom0 crashes using Xen Stable (3.0.2.2).
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFE40ViF+EMwkXLsEwRAg+HAJ4pqBYC1PDIJf+xzAPMFTzbN2d5lQCcCcID
kaKigAVdThbevwc3+KiQ9So=
=DsGl
-----END PGP SIGNATURE-----


* Re: Error with NFS and XEN (High network load)
  2006-08-16 16:04 ` Herbert Xu
@ 2006-08-17  3:41   ` Roberto Gonzalez Azevedo
  2006-08-17 14:09   ` Roberto Gonzalez Azevedo
  1 sibling, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-17  3:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Xen Development Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks! It's working now: dom0 and domU don't crash anymore.
Please record the fix in Xen Bugzilla under bug #735.


Thanks !
- ----------------------------
Roberto Gonzalez Azevedo


Herbert Xu wrote:
> On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
>> On domU machine:
>> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
>> [ok]
>>
>> Crash:
>> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
>> [NOT OK]
> 
> Thanks for the report.  This is what I needed to reproduce the problem.
> I forgot to initialise the first fragment to a proper value in the first
> SG patch.
> 
> [NET] back: Initialise first fragment properly
> 
> The first fragment is used to store the pending_idx of the leading
> txreq if it doesn't fit in the head area.  When it does fit into
> the head we need to ensure that the first fragment contains a value
> that is not equal to pending_idx as that's what we use to distinguish
> between the two cases in a number of places.
> 
> This patch sets the first fragment to ~0 which is not equal to any
> valid pending_idx.  Without this initialisation, we may double-free
> a pending_idx if the first fragment happened to contain a value
> equal to it (this usually happened with pending_idx 0).
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> Cheers,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFE4+V4F+EMwkXLsEwRAkrbAKCdhygwo/Hh+oyDgaS3I0z0B6IzlwCggmok
vcxKwacM/yKnhdpChmlIQPI=
=SrQT
-----END PGP SIGNATURE-----


* Re: Error with NFS and XEN (High network load)
  2006-08-16 16:04 ` Herbert Xu
  2006-08-17  3:41   ` Roberto Gonzalez Azevedo
@ 2006-08-17 14:09   ` Roberto Gonzalez Azevedo
  1 sibling, 0 replies; 6+ messages in thread
From: Roberto Gonzalez Azevedo @ 2006-08-17 14:09 UTC (permalink / raw)
  To: Herbert Xu, xen-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks! It's working now: dom0 and domU don't crash anymore.
Please record the fix in Xen Bugzilla under bug #735.


Thanks !

- --
- ----------------------------
Roberto Gonzalez Azevedo


Herbert Xu wrote:
> On Tue, Aug 15, 2006 at 10:58:15PM +0000, Roberto Gonzalez Azevedo wrote:
>> On domU machine:
>> root@domU:# mount -t nfs real-machine-not-xen-or-any-virtual:/mnt /data
>> [ok]
>>
>> Crash:
>> root@domU:# time dd if=/dev/zero of=/data/test.txt bs=16k count=16384
>> [NOT OK]
> 
> Thanks for the report.  This is what I needed to reproduce the problem.
> I forgot to initialise the first fragment to a proper value in the first
> SG patch.
> 
> [NET] back: Initialise first fragment properly
> 
> The first fragment is used to store the pending_idx of the leading
> txreq if it doesn't fit in the head area.  When it does fit into
> the head we need to ensure that the first fragment contains a value
> that is not equal to pending_idx as that's what we use to distinguish
> between the two cases in a number of places.
> 
> This patch sets the first fragment to ~0 which is not equal to any
> valid pending_idx.  Without this initialisation, we may double-free
> a pending_idx if the first fragment happened to contain a value
> equal to it (this usually happened with pending_idx 0).
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> Cheers,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFE5HiiF+EMwkXLsEwRAkGVAKCh4+qOF5ONalGWTSaKYFAQG0/MggCfcLIk
yqtPsCTMxJP7ueouUOCFwkM=
=fuLb
-----END PGP SIGNATURE-----
