From: Konrad Rzeszutek Wilk <konrad@darnok.org>
To: Anton Samsonov <avscomputing@gmail.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: General protection fault in netback
Date: Thu, 9 Feb 2012 17:13:16 -0400
Message-ID: <20120209211316.GD14007@andromeda.dapyr.net>
In-Reply-To: <CALtE5pkBhJ_GzVBbuMuuH0vvaHSSCnHzR-vvazhygQAsWb=2aA@mail.gmail.com>

On Fri, Feb 03, 2012 at 07:32:40PM +0300, Anton Samsonov wrote:
> I was experimenting with DomU redundancy and load balancing,
> and I think this GPF started to show up after a couple of DomUs
> with CARP and HAProxy were added that constantly generate
> a strong flow of network traffic by pinging target machines
> and each other as well. Or maybe it is not related to CARP
> and pinging, but just depends on traffic volume: the more VMs
> added and running, the more chances that Dom0-DomU networking
> will collapse, the critical point being 8 guest domains, while I need 10.
> 
> I can't give exact steps to reproduce, as it happens randomly,
> usually without any correlated user activity, after several hours
> (or several minutes) of normal performance. But sometimes
> it happens not so long after a balancer's DomU startup or shutdown.
> After the GPF happens, all VMs lose their network connectivity.
> 
> Dom0 is openSUSE 12.1 on AMD64 (Linux 3.1.0-1.2-xen)

Do you get the same issue with a pv-ops dom0? That is, also 3.1, but
built from kernel.org sources?

> with Xen version 4.1.2_05-1.9, which is patched as described
> in openSUSE bug 727081 (bugzilla.novell.com/show_bug.cgi?id=727081).
> Supposedly "offending" DomU is paravirtualized NetBSD 5.1.1
> for AMD64 with recompiled kernel (CARP enabled, no more changes).

What is CARP?

> Other VMs are openSUSE 11.4 and 12.1 for AMD64.
> 
> 
> Trace log in /var/log/messages always looks similar (varying digits
> replaced with asterisks ***):
> 
> 
> general protection fault: 0000 [#1] SMP
> CPU {core-number}
> Modules linked in: 8250 8250_pnp af_packet asus_wmi ata_generic
> blkback_pagemap blkbk blktap bridge btrfs button cdrom dm_mod
> domctl drm drm_kms_helper edd eeepc_wmi ehci_hcd evtchn fuse
> gntdev hid hwmon i2c_algo_bit i2c_core i2c_i801 i915
> iTCO_vendor_support iTCO_wdt linear llc lzo_compress mei(C)
> microcode netbk parport parport_pc pata_via pci_hotplug pcspkr
> ppdev processor r8169 rfkill serial_core [serio_raw] sg
> snd snd_hda_codec snd_hda_codec_hdmi snd_hda_codec_realtek
> snd_hda_intel snd_hwdep snd_mixer_oss snd_page_alloc snd_pcm
> snd_pcm_oss snd_seq snd_seq_device snd_timer soundcore
> sparse_keymap sr_mod stp thermal_sys uas usbbk usbcore
> usbhid usb_storage video wmi xenblk xenbus_be xennet zlib_deflate
> 
> Pid: {process-id}, comm: netback/{0/1} Tainted: G          C  3.1.0-1.2-xen #1 System manufacturer System Product Name/P8H67-M
> RIP: e030:[<ffffffff803e7451>]  [<ffffffff803e7451>] skb_release_data.part.47+0x61/0xc0
> RSP: e02b:ffff880******d40  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff880********0 RCX: ffff880******000
> RDX: {..RCX.+.0e80..} RSI: 00000000000000** RDI: 00***c**00000000
> RBP: {.....RBX......} R08: {..RCX.-.cff0..} R09: 0000000*********
> R10: 000000000000000* R11: {.task.+.0470..} R12: ffff880026a51000
> R13: ffff880********0 R14: ffffc900048****0 R15: 0000000000000001
> FS:  00007f*******7*0(0000) GS:ffff880******000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000***********0 CR3: 0000000******000 CR4: 0000000000042660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process netback/{0/1} (pid: {process-id}, threadinfo ffff880******000, task ffff880********0)
> Stack:
>  0000000000000000 {.....RBX......} 0000000000000000 ffffffff803e7511
>  {.....RBX......} ffffffffa0***d2c {.....task.....} {thread.+.1e00.}
>  {thread.+.1db0.} {.R14.-.22a40..} ffffc9000000000* 0000000000000000

Hm, that is a pretty neat stack output. Wonder which patch of theirs
does that.

> Call Trace:
>  [<ffffffff803e7511>] __kfree_skb+0x11/0x20
>  [<ffffffffa0***d2c>] net_rx_action+0x66c/0x9c0 [netbk]
>  [<ffffffffa0***72a>] netbk_action_thread+0x5a/0x270 [netbk]
>  [<ffffffff8006438e>] kthread+0x7e/0x90
>  [<ffffffff8050f814>] kernel_thread_helper+0x4/0x10
> Code: 48 8b 7c 02 08 e8 90 69 cf ff 8b 95 d0 00 00 00
>   48 8b 8d d8 00 00 00 48 01 ca 0f b7 02 39 c3 7c
>   d1 f6 42 0c 10 74 1e 48 8b 7a 30
> RIP  [<ffffffff803e7451>] skb_release_data.part.47+0x61/0xc0
>  RSP <ffff880******d40>
> ---[ end trace **************** ]---
> 
> 
> Preceding and subsequent messages don't seem to be related to the GPF;
> the time gap ranges from minutes to half an hour or even more. But if they
> could give some insight, I will post them, too.
