From: Michael Vrable <mvrable@cs.ucsd.edu>
To: xen-devel@lists.xensource.com
Subject: Grant Table Network Issues
Date: Sat, 13 Aug 2005 11:59:45 -0700
Message-ID: <20050813185945.GA23341@vrable.net>

I've been working on getting networking functioning in translated shadow
mode, and have it to the point where it's almost working--some packets
get through some of the time before the machine crashes.

In an effort to narrow down the problem, I've found that the grant table
network interface code in xen-unstable.hg seems to have some stability
problems of its own.  Here's one of them.  This is with a mostly
unmodified checkout of xen-unstable.hg (from yesterday evening), patched
to produce more debugging output and with the IP checksumming
optimizations disabled (since I was seeing some trouble with those).
After a very short while, Domain-0 crashes.  This transcript is taken
from Domain-0, pinging an unprivileged domain (no shadow modes enabled):

    potemkin58:~# ping 192.168.1.2
    PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
    (XEN) gnttab_donate: i=0 mfn=000112aa domid=1 gref=000003c1
    (XEN) (file=grant_table.c, line=1146) gnttab_prepare_for_transfer rd(1) ld(0) ref(961).
    (XEN) (file=grant_table.c, line=1225) gnttab_notify_transfer rd(1) ld(0) ref(961).
    (XEN) (file=grant_table.c, line=423) Mapping grant ref (706) for domain (1) with flags (6)
    (XEN) (file=grant_table.c, line=119) activate_grant_ref: mapping=0 granting=1
    (XEN) (file=grant_table.c, line=315) activate_grant_ref: frame=78807
    (XEN) (file=grant_table.c, line=455) map_grant_ref: frame=78807 vaddr=df807000 handle=2
    (XEN) gnttab_donate: i=0 mfn=0007689f domid=1 gref=000003c2
    (XEN) (file=grant_table.c, line=1146) gnttab_prepare_for_transfer rd(1) ld(0) ref(962).
    (XEN) (file=grant_table.c, line=1225) gnttab_notify_transfer rd(1) ld(0) ref(962).
    (XEN) (file=grant_table.c, line=535) Unmapping grant ref (706) for domain (1) with handle (2)
    (XEN) (file=grant_table.c, line=652) unmap_grant_ref: frame=78807
    (XEN) (file=grant_table.c, line=423) Mapping grant ref (706) for domain (1) with flags (6)
    (XEN) (file=grant_table.c, line=119) activate_grant_ref: mapping=0 granting=1
    (XEN) (file=grant_table.c, line=315) activate_grant_ref: frame=7b6b0
    (XEN) (file=grant_table.c, line=455) map_grant_ref: frame=7b6b0 vaddr=df808000 handle=2
    kernel BUG at include/linux/skbuff.h:1148 (kmap_skb_frag)!
     [<c03c2a0b>] skb_checksum+0x27b/0x310
     [<c040228b>] icmp_rcv+0x16b/0x1a0
     [<c03dbebb>] ip_local_deliver+0xdb/0x220
     [<c03dc33e>] ip_rcv+0x33e/0x4b0
     [<c03dc630>] ip_rcv_finish+0x0/0x250
     [<c03c7d64>] netif_receive_skb+0x204/0x270
     [<c03c7e89>] process_backlog+0xb9/0x190
     [<c03c800d>] net_rx_action+0xad/0x1a0
     [<c0123f35>] __do_softirq+0xc5/0xf0
     [<c0123feb>] do_softirq+0x8b/0x90
     [<c01240b5>] irq_exit+0x35/0x40
     [<c010f082>] do_IRQ+0x22/0x30
     [<c0106530>] evtchn_do_upcall+0x70/0xa0
     [<c010a758>] hypervisor_callback+0x2c/0x34
     [<c01064ba>] force_evtchn_callback+0xa/0x10
     [<c014b72f>] __pagevec_lru_add+0x15f/0x1c0
     [<c013fb46>] add_to_page_cache+0x76/0xf0
     [<c018957d>] mpage_readpages+0x18d/0x190
     [<c01cd850>] ext3_get_block+0x0/0xc0
     [<c0147e74>] read_pages+0x124/0x170
     [<c01cd850>] ext3_get_block+0x0/0xc0
     [<c0145573>] __alloc_pages+0x2e3/0x430
     [<c0147fe0>] __do_page_cache_readahead+0x120/0x230
     [<c01415cf>] filemap_nopage+0x2ef/0x410
     [<c01528f8>] do_no_page+0xb8/0x3b0
     [<c01502a3>] pte_alloc_map+0x93/0x210
     [<c0152e36>] handle_mm_fault+0xf6/0x240
     [<c01178ec>] do_page_fault+0x19c/0x5f2
     [<c0114246>] old_mmap+0xd6/0x110
     [<c010a93a>] page_fault+0x2e/0x34
    Kernel panic - not syncing: BUG!
     (XEN) Domain 0 shutdown: rebooting machine.
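
For reading transcripts like the one above, a small script can pull out
which grant refs were transferred and which frames each map handle
pointed at.  This is only a sketch: the regular expressions are
assumptions based on the debugging lines shown here (which come from
locally added debug output, not a stable Xen format), and the function
names are hypothetical.

```python
import re

# Patterns match the debugging lines in the transcript above. They are
# assumptions based on this one log, not a stable Xen output format.
DONATE_RE = re.compile(
    r"gnttab_donate: i=(\d+) mfn=([0-9a-f]+) domid=(\d+) gref=([0-9a-f]+)")
# \b keeps this from matching the "unmap_grant_ref:" lines.
MAP_RE = re.compile(
    r"\bmap_grant_ref: frame=([0-9a-f]+) vaddr=([0-9a-f]+) handle=(\d+)")


def donated_refs(lines):
    """Return grant refs from gnttab_donate lines as decimal ints."""
    refs = []
    for line in lines:
        m = DONATE_RE.search(line)
        if m:
            # gref appears to be printed in hex (000003c1 == ref 961).
            refs.append(int(m.group(4), 16))
    return refs


def mapped_frames(lines):
    """Return {handle: [frame, ...]} in the order frames were mapped."""
    frames = {}
    for line in lines:
        m = MAP_RE.search(line)
        if m:
            frame, _vaddr, handle = m.groups()
            frames.setdefault(int(handle), []).append(int(frame, 16))
    return frames


# A few lines copied from the transcript above.
SAMPLE = [
    "(XEN) gnttab_donate: i=0 mfn=000112aa domid=1 gref=000003c1",
    "(XEN) (file=grant_table.c, line=455) map_grant_ref: frame=78807 vaddr=df807000 handle=2",
    "(XEN) gnttab_donate: i=0 mfn=0007689f domid=1 gref=000003c2",
    "(XEN) (file=grant_table.c, line=652) unmap_grant_ref: frame=78807",
    "(XEN) (file=grant_table.c, line=455) map_grant_ref: frame=7b6b0 vaddr=df808000 handle=2",
]

if __name__ == "__main__":
    print(donated_refs(SAMPLE))
    print({h: [hex(f) for f in fs] for h, fs in mapped_frames(SAMPLE).items()})
```

On the sample lines this reports refs 961 and 962 as transferred, and
shows handle 2 being mapped first to frame 78807 and then to frame 7b6b0,
which matches the ref(961)/ref(962) lines in the transcript.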

The line causing trouble is "BUG_ON(in_irq())" in kmap_skb_frag
(include/linux/skbuff.h:1148).  In this example, I had tcpdump running
in both domains; this seems to trigger the problem more reliably.  I've
also seen a similar crash with a TCP connection, but it takes a few
packets before it shows up: the handshake completes, and the crash
happens about the time data packets come back from domain-0.  If
checksumming optimizations are enabled, it seems the packets are dropped
instead, so I don't see a crash but I don't get any data either.
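
As background on what a "BUG_ON(in_irq())" check tests: in Linux,
in_irq() is true when the hard-interrupt portion of the per-CPU preempt
count is nonzero.  The toy model below illustrates only that idea.  The
field widths, the KernelBug exception, and checked_frag_map are
illustrative assumptions, not the kernel's code, and the model says
nothing about why the count would be nonzero in the trace above.

```python
# Toy model of a preempt count split into bit fields. The widths are
# illustrative assumptions, not taken from any particular kernel tree.
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 12

SOFTIRQ_SHIFT = PREEMPT_BITS
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS

SOFTIRQ_OFFSET = 1 << SOFTIRQ_SHIFT   # one level of softirq nesting
HARDIRQ_OFFSET = 1 << HARDIRQ_SHIFT   # one level of hardirq nesting
HARDIRQ_MASK = ((1 << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT


def in_irq(preempt_count):
    """True when the hardirq field of the count is nonzero."""
    return (preempt_count & HARDIRQ_MASK) != 0


class KernelBug(Exception):
    """Stands in for the kernel's BUG() in this model."""


def checked_frag_map(preempt_count):
    """Hypothetical helper guarded the way BUG_ON(in_irq()) guards code."""
    if in_irq(preempt_count):
        raise KernelBug("BUG_ON(in_irq())")
    return "mapped"


if __name__ == "__main__":
    # Softirq context only: the check passes.
    print(checked_frag_map(SOFTIRQ_OFFSET))
    # Softirq work running while a hardirq level is still counted: trips.
    try:
        checked_frag_map(HARDIRQ_OFFSET + SOFTIRQ_OFFSET)
    except KernelBug as exc:
        print("BUG:", exc)
```

In this model the check passes in pure softirq context and fails
whenever any hardirq nesting level is still recorded, regardless of what
the softirq field holds.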

I've been having a difficult time tracking this down, so any help is
appreciated.

--Michael Vrable
