* RE: DMA trouble with current xen-sparse
@ 2005-11-07 13:51 Ian Pratt
  2005-11-07 14:15 ` Daniel Veillard
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Pratt @ 2005-11-07 13:51 UTC
  To: veillard, Vincent Hanquez; +Cc: xen-devel


> Sure, it took a bit of time to recompile the kernel (I haven't done
> this for years) and it crashed as expected. Here is the info:
> 
>   ptr: f160ed8e 1514
> 
> The size looks like a full Ethernet frame, i.e. 1500 bytes of payload,
> 2 Ethernet addresses and the 2 bytes for the Ethernet type. That
> looks kosher to me, but clearly it is not aligned.

This allocation isn't aligned to the next power-of-2 boundary; usually
1514-byte allocations are 2KB aligned.

You're not enabling some experimental option in your config that changes
the alignment of slab allocations, are you?
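
The alignment argument above can be checked with a few lines of C. This
is only a sketch: the helper names are made up for illustration, and it
assumes the reported pointer (f160ed8e) is the start of the allocation,
which the thread does not confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two >= size. */
static size_t sketch_next_pow2(size_t size)
{
    size_t p = 1;

    /* Guard against zero and against overflow of the shift below. */
    assert(size > 0 && size <= SIZE_MAX / 2 + 1);
    while (p < size)
        p <<= 1;
    return p;
}

/* Hypothetical helper: true when addr is a multiple of align. */
static bool sketch_is_aligned(uint64_t addr, size_t align)
{
    /* align must be a nonzero power of two for the mask trick to work. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (uint64_t)(align - 1)) == 0;
}
```

For a 1514-byte request, sketch_next_pow2() gives 2048, and
sketch_is_aligned(0xf160ed8e, 2048) is false because the low 11 bits
of the address are 0x58e. An object of at most 2048 bytes that starts
on a 2KB boundary cannot cross a 4KB page boundary, which is why the
missing alignment matters here.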

Ian

* RE: DMA trouble with current xen-sparse
@ 2005-11-07 23:03 Ian Pratt
  0 siblings, 0 replies; 18+ messages in thread
From: Ian Pratt @ 2005-11-07 23:03 UTC
  To: Stephen C. Tweedie; +Cc: xen-devel, Vincent Hanquez, veillard

> from 1 to 0 in mm/slab.c.  But that's just going to waste more slab
> cache space for many caches.   Without that change, the fact is that
> an important debugging option is creating cross-page objects
> routinely, and that the slab allocator can create such objects quite
> normally even without that option; so it may end up being something
> that Xen just has to deal with.

The best xen fix for this is for us to hook alloc_skb (rather than just
dev_alloc_skb). This will enable us to solve the jumbo frames issue too.

Ian 

* RE: DMA trouble with current xen-sparse
@ 2005-11-04 14:50 Ian Pratt
  2005-11-07 21:28 ` Stephen C. Tweedie
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Pratt @ 2005-11-04 14:50 UTC
  To: veillard, Vincent Hanquez; +Cc: xen-devel

> Sure, it took a bit of time to recompile the kernel (I haven't done
> this for years) and it crashed as expected. Here is the info:
> 
>   ptr: f160ed8e 1514
> 
> The size looks like a full Ethernet frame, i.e. 1500 bytes of payload,
> 2 Ethernet addresses and the 2 bytes for the Ethernet type. That
> looks kosher to me, but clearly it is not aligned.

Please can you try using either our -xen or -xen0 kernel config? I
strongly suspect there's something in your config that is breaking this
for you; I'm just not sure what.

(NB: make sure you 'rm dist/install/boot/config*' to stop 'make world'
from grabbing your old config)

Best,
Ian

* RE: DMA trouble with current xen-sparse
@ 2005-11-02 15:32 Ian Pratt
  2005-11-02 15:36 ` Stephen Tweedie
  0 siblings, 1 reply; 18+ messages in thread
From: Ian Pratt @ 2005-11-02 15:32 UTC
  To: Keir Fraser, Stephen C. Tweedie; +Cc: xen-devel


> On 28 Oct 2005, at 20:21, Stephen C. Tweedie wrote:
> 
> > The trouble is that this is a 1G box, so its memory is not large
> > enough to automatically enable the swiotlb.
> > (arch/xen/i386/kernel/swiotlb.c enables swiotlb automatically for
> > dom0 only if there's at least 2G of memory.)  And the first time we
> > get a pci_map_single() request for a dom0-contiguous region which
> > crosses a page boundary, we hit the BUG_ON at
> > arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:

Does your card support TSO? What revision e1000 is it?

Please can you try turning it off with: 
  ethtool -K eth0 tso off

If TSO is the problem we'll come up with a better fix than using
swiotlb.


Thanks,
Ian

* DMA trouble with current xen-sparse
@ 2005-10-28 19:21 Stephen C. Tweedie
  2005-10-29  8:55 ` Keir Fraser
  2005-10-30  9:52 ` Muli Ben-Yehuda
  0 siblings, 2 replies; 18+ messages in thread
From: Stephen C. Tweedie @ 2005-10-28 19:21 UTC
  To: xen-devel

Hi,

I've been trying to get current xen-sparse up and running on a 2-cpu box
and have had a number of problems.  One has been that networking is
completely unstable: I get kernel panics under the slightest network
load.

The trouble is that this is a 1G box, so its memory is not large enough
to automatically enable the swiotlb.  (arch/xen/i386/kernel/swiotlb.c
enables swiotlb automatically for dom0 only if there's at least 2G of
memory.)  And the first time we get a pci_map_single() request for a
dom0-contiguous region which crosses a page boundary, we hit the BUG_ON
at arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:

		IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size));
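
For readers unfamiliar with the check, here is a rough userspace sketch
of what a "straddles a page boundary" test amounts to. It is not the
Xen macro itself: the name sketch_straddles_page and the 4KB page size
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for i386: 4KB. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* True when the byte range [addr, addr + size) touches more than one page. */
static bool sketch_straddles_page(uint64_t addr, size_t size)
{
    /* Offset of the first byte within its page. */
    uint64_t offset = addr & (SKETCH_PAGE_SIZE - 1);

    /* A zero-length range is not meaningful in this sketch. */
    assert(size > 0);
    return offset + size > SKETCH_PAGE_SIZE;
}
```

With the values reported later in this thread (ptr f160ed8e, size
1514), the in-page offset is 0xd8e = 3470, and 3470 + 1514 = 4984
exceeds 4096, so sketch_straddles_page(0xf160ed8e, 1514) returns true.
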

And this happens *instantly* on any loaded tcp connection on my e1000
NIC.  All I need to do to kill the box is to ssh in and type "find\n".
Instant dom0 death after the ssh client receives about a dozen lines of
output.  The stack trace is appended below.

The PCI mapping documentation certainly says that pci_map_single() needs
to be able to map a single region, not just a single page.  If it can't,
then I suspect we really need to enable swiotlb by default, because
we'll just be unstable without it.

The kernel panics after this with "Fatal DMA error! Please use
'swiotlb=force'".  But of course the default for Xen is to instantly
reboot at this point before the error is visible.  And even after
catching the message with serial console, I found that "swiotlb=force"
*also* dies on this box, with

(XEN) (file=memory.c, line=57) Could not allocate order=14 extent: id=0 flags=0 (0 of 1)
kernel BUG at arch/xen/i386/mm/hypervisor.c:354 (xen_create_contiguous_region)!
 [<c011a77d>] xen_create_contiguous_region+0x26d/0x2b0
 [<c0112596>] swiotlb_init_with_default_size+0x86/0x1c0
 [<c0112735>] swiotlb_init+0x65/0xa0

because we don't have a large enough zone at boot time to create the
64MB swiotlb.  
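
The link between "order=14" in the Xen message and the 64MB figure is
just page arithmetic. A small sketch, assuming 4KB pages (the helper
name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for i386: 4KB. */
#define SKETCH_PAGE_SIZE 4096u

/* Size in bytes of a contiguous extent of 2^order pages. */
static uint64_t sketch_extent_bytes(unsigned int order)
{
    /* Keep the shift well inside 64 bits. */
    assert(order < 52);
    return (uint64_t)SKETCH_PAGE_SIZE << order;
}
```

sketch_extent_bytes(14) is 2^14 pages of 4096 bytes, i.e. 67108864
bytes or 64MB, which matches the default swiotlb size mentioned above.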

Booting with "swiotlb=force swiotlb=8m" works around both of these bugs
and allows me to boot; fortunately things are much more stable after I
get this far.

Cheers,
 Stephen

---

kernel BUG at arch/xen/i386/kernel/pci-dma.c:270 (dma_map_single)!
 [<c010ecd6>] dma_map_single+0xf6/0x160
 [<f49cd40b>] e1000_xmit_frame+0x40b/0xd30 [e1000]
 [<c0313510>] qdisc_restart+0x100/0x2f0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03010ff>] dev_queue_xmit+0x9f/0x340
 [<c032404c>] ip_finish_output+0x15c/0x2e0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c0324947>] ip_queue_xmit+0x2b7/0x560
 [<c0323ec0>] dst_output+0x0/0x30
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0155e06>] check_poison_obj+0x26/0x1c0
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0157dbc>] cache_alloc_debugcheck_after+0x4c/0x1b0
 [<c0336e24>] tcp_transmit_skb+0x3d4/0x810
 [<c02fab10>] skb_clone+0x20/0x1d0
 [<c0337efd>] tcp_write_xmit+0x10d/0x330
 [<c0334943>] __tcp_data_snd_check+0xa3/0xe0
 [<c02fa961>] kfree_skbmem+0x21/0x30
 [<c0335069>] tcp_rcv_established+0x2a9/0x910
 [<f4b3f036>] ipt_hook+0x36/0x40 [iptable_filter]
 [<c033ef5a>] tcp_v4_do_rcv+0xfa/0x150
 [<c033f8d5>] tcp_v4_rcv+0x925/0x980
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c03206bc>] ip_local_deliver+0xdc/0x2f0
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c0320f0e>] ip_rcv+0x3ce/0x5b0
 [<c03210f0>] ip_rcv_finish+0x0/0x320
 [<c0301be0>] netif_receive_skb+0x250/0x310
 [<f49cf3ae>] e1000_clean_rx_irq+0x13e/0x5d0 [e1000]
 [<f49ce8a2>] e1000_clean+0x52/0x1c0 [e1000]
 [<c0301f2c>] net_rx_action+0xdc/0x220
 [<c0128f4a>] __do_softirq+0x8a/0x120
 [<c012905d>] do_softirq+0x7d/0x80
 [<c010ee22>] do_IRQ+0x22/0x30
 [<c01049be>] evtchn_do_upcall+0x9e/0xe0
 [<c010a2f0>] hypervisor_callback+0x2c/0x34
 [<c0107b30>] xen_idle+0x40/0x80
 [<c0107bd4>] cpu_idle+0x64/0xb0
 [<c0436a4f>] start_kernel+0x1af/0x210
 [<c0436380>] unknown_bootoption+0x0/0x220


Thread overview: 18+ messages
2005-11-07 13:51 DMA trouble with current xen-sparse Ian Pratt
2005-11-07 14:15 ` Daniel Veillard
  -- strict thread matches above, loose matches on Subject: below --
2005-11-07 23:03 Ian Pratt
2005-11-04 14:50 Ian Pratt
2005-11-07 21:28 ` Stephen C. Tweedie
2005-11-08  6:41   ` Daniel Veillard
2005-11-08 15:55     ` Keir Fraser
2005-11-08 15:25       ` Stephen C. Tweedie
2005-11-02 15:32 Ian Pratt
2005-11-02 15:36 ` Stephen Tweedie
2005-11-02 15:59   ` Daniel Veillard
2005-11-02 17:12     ` Keir Fraser
2005-11-02 23:04       ` Daniel Veillard
2005-11-03  2:45         ` Vincent Hanquez
2005-11-03 14:51           ` Daniel Veillard
2005-10-28 19:21 Stephen C. Tweedie
2005-10-29  8:55 ` Keir Fraser
2005-10-30  9:52 ` Muli Ben-Yehuda
