public inbox for kvm@vger.kernel.org
* virtio net regression
@ 2009-04-15 20:04 Antoine Martin
  2009-04-15 23:38 ` Antoine Martin
  0 siblings, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-04-15 20:04 UTC (permalink / raw)
  To: kvm@vger.kernel.org

Hi,

I've got some hosts that were happily running the 2.6.25.x host kernel,
kvm-84, kernel.org kvm modules.
The guests were running 2.6.25 to 2.6.29.x quite happily.
Network was using virtio.
Since I upgraded one of the hosts (Intel dual core) to 2.6.29.x
yesterday, the virtio network performance of the guests on it dropped
dramatically. (for some reason another AMD host did not seem to be
affected...)
Here are the tests I performed using wget and scp:
* guest to guest: fast
* guest to host: fast
* host to internet: fast
* guest to internet: slow!!!
I was normally getting ~5MB/s to the host (speed to the internet was
limited by the capacity of the DSL line), but since the upgrade the
performance has dropped to around 20KB/s!
Strangely enough, I could open many new connections to the guest, and
each one ran at the same 20KB/s!
I switched the guests to using ne2k_pci and the performance has been
restored...

And this is where it gets even weirder...
UDP packets get corrupted using ne2k_pci and rtl8139cp but not with
virtio...
So I can get performance or UDP, but not both...

Let me know if there is anything more I can provide to help fix this
regression.
I can reproduce the problem quite easily without causing problems on the
host.

Cheers
Antoine


* Re: virtio net regression
  2009-04-15 20:04 virtio net regression Antoine Martin
@ 2009-04-15 23:38 ` Antoine Martin
  2009-04-19 11:48   ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-04-15 23:38 UTC (permalink / raw)
  To: Antoine Martin; +Cc: kvm@vger.kernel.org

Wireshark was showing a huge amount of invalid packets (wrong checksum)
- that was the cause of the slowdown.
Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
of whether the guests use virtio or ne2k_pci/etc.
The guests are still running 2.6.29.1, but I am not likely to try that
release again on the host anytime soon! Ouch!

Antoine
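
For context on the "wrong checksum" flag: the IPv4 header, TCP and UDP all
use the 16-bit one's-complement Internet checksum described in RFC 1071.
The Python sketch below is an illustration only, not code from this thread;
the function names internet_checksum and verify are made up here, and the
sample bytes are a commonly used textbook IPv4 header, not a packet from the
captures. One caveat worth keeping in mind: when checksum offload is active,
a capture taken on the sending machine can show bad checksums simply because
the field has not been filled in yet, so a flagged packet is a hint rather
than proof of corruption.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verify(data: bytes) -> bool:
    """A block that already carries a correct checksum field sums to zero."""
    return internet_checksum(data) == 0


# Sample IPv4 header (illustrative bytes, checksum field set to zero).
HEADER_ZEROED = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
# Same header with the checksum field filled in.
HEADER_FILLED = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

if __name__ == "__main__":
    print(hex(internet_checksum(HEADER_ZEROED)))
    print(verify(HEADER_FILLED))
```

A receiver that runs the equivalent of verify() and gets False will drop the
packet, which forces TCP to retransmit; that would be one plausible reason
for throughput collapsing when many packets carry bad checksums.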


* Re: virtio net regression
  2009-04-15 23:38 ` Antoine Martin
@ 2009-04-19 11:48   ` Avi Kivity
  2009-04-20 11:12     ` Mark McLoughlin
  0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2009-04-19 11:48 UTC (permalink / raw)
  To: Antoine Martin; +Cc: Antoine Martin, kvm@vger.kernel.org, Rusty Russell

Antoine Martin wrote:
> Wireshark was showing a huge amount of invalid packets (wrong checksum)
> - that was the cause of the slowdown.
> Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
> of whether the guests use virtio or ne2k_pci/etc.
> The guests are still running 2.6.29.1, but I am not likely to try that
> release again on the host anytime soon! Ouch!
>   


Strange, no significant tun changes between .28 and .29.

Rusty, any idea?



-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: virtio net regression
  2009-04-19 11:48   ` Avi Kivity
@ 2009-04-20 11:12     ` Mark McLoughlin
  2009-04-20 15:09       ` Antoine Martin
       [not found]       ` <49EC8F1D.7000109@nagafix.co.uk>
  0 siblings, 2 replies; 14+ messages in thread
From: Mark McLoughlin @ 2009-04-20 11:12 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Antoine Martin, Antoine Martin, kvm@vger.kernel.org,
	Rusty Russell

On Sun, 2009-04-19 at 14:48 +0300, Avi Kivity wrote:
> Antoine Martin wrote:
> > Wireshark was showing a huge amount of invalid packets (wrong checksum)
> > - that was the cause of the slowdown.
> > Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
> > of whether the guests use virtio or ne2k_pci/etc.
> > The guests are still running 2.6.29.1, but I am not likely to try that
> > release again on the host anytime soon! Ouch!
> >   
> 
> 
> Strange, no significant tun changes between .28 and .29.

Sounds to me like it's this:

  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0

davem said he was queueing up for stable, but it's not in yet:

  http://kerneltrap.org/mailarchive/linux-netdev/2009/3/30/5337934

I'll check that it's in the queue.

Cheers,
Mark.



* Re: virtio net regression
  2009-04-20 11:12     ` Mark McLoughlin
@ 2009-04-20 15:09       ` Antoine Martin
       [not found]       ` <49EC8F1D.7000109@nagafix.co.uk>
  1 sibling, 0 replies; 14+ messages in thread
From: Antoine Martin @ 2009-04-20 15:09 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: Avi Kivity, kvm@vger.kernel.org, Rusty Russell, davem

Hi,

The bug report below does indeed match everything I have experienced.
Upon further inspection, 2.6.28.9 is also affected, just less so.

Unfortunately, I applied this patch to 2.6.29.1:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
and, if anything, it made things worse... (speed was down to just 6KB/s
because of the number of broken packets)
Any ideas?

Cheers
Antoine



Mark McLoughlin wrote:
> On Sun, 2009-04-19 at 14:48 +0300, Avi Kivity wrote:
>> Antoine Martin wrote:
>>> Wireshark was showing a huge amount of invalid packets (wrong checksum)
>>> - that was the cause of the slowdown.
>>> Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
>>> of whether the guests use virtio or ne2k_pci/etc.
>>> The guests are still running 2.6.29.1, but I am not likely to try that
>>> release again on the host anytime soon! Ouch!
>>>   
>>
>> Strange, no significant tun changes between .28 and .29.
> 
> Sounds to me like it's this:
> 
>   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
> 
> davem said he was queueing up for stable, but it's not in yet:
> 
>   http://kerneltrap.org/mailarchive/linux-netdev/2009/3/30/5337934
> 
> I'll check that it's in the queue.
> 
> Cheers,
> Mark.
> 


* Re: virtio net regression
       [not found]       ` <49EC8F1D.7000109@nagafix.co.uk>
@ 2009-04-28 18:57         ` Antoine Martin
  2009-05-09 13:19           ` Antoine Martin
  0 siblings, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-04-28 18:57 UTC (permalink / raw)
  To: Mark McLoughlin
  Cc: Avi Kivity, Antoine Martin, kvm@vger.kernel.org, Rusty Russell,
	davem

Hi

Still getting some network issues (though fewer) with a 2.6.28.9 host.

I found quite a few call traces like the one below in the 2.6.29.1 guests.
The guest has 512MB of memory and was not all that busy (just network
traffic), so I don't understand why it would fail to allocate a page...


[701453.834571] kjournald: page allocation failure. order:0, mode:0x4020
[701453.834574] Pid: 4806, comm: kjournald Not tainted 2.6.29.1 #4
[701453.834576] Call Trace:
[701453.834578]  <IRQ>  [<ffffffff8027fa48>] __alloc_pages_internal+0x3e1/0x401
[701453.834586]  [<ffffffff802a1ad4>] __slab_alloc+0x17f/0x4ca
[701453.834590]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834592]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834595]  [<ffffffff802a2e66>] __kmalloc_track_caller+0xac/0xe1
[701453.834598]  [<ffffffff8062f97e>] __alloc_skb+0x61/0x11e
[701453.834600]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834603]  [<ffffffff8067c374>] tcp_rcv_established+0x6c7/0x9e6
[701453.834605]  [<ffffffff80683515>] tcp_v4_do_rcv+0x19e/0x324
[701453.834608]  [<ffffffff80683b23>] tcp_v4_rcv+0x488/0x73b
[701453.834611]  [<ffffffff806499c4>] nf_hook_slow+0x62/0xc3
[701453.834615]  [<ffffffff8066925c>] ip_local_deliver_finish+0x0/0x1ee
[701453.834617]  [<ffffffff80669378>] ip_local_deliver_finish+0x11c/0x1ee
[701453.834620]  [<ffffffff80668fcb>] ip_rcv_finish+0x2cf/0x2e9
[701453.834622]  [<ffffffff80669218>] ip_rcv+0x233/0x277
[701453.834626]  [<ffffffff8055d1e7>] virtnet_poll+0x4ca/0x5ab
[701453.834628]  [<ffffffff80633952>] net_rx_action+0x70/0x143
[701453.834631]  [<ffffffff8024030a>] __do_softirq+0x83/0x145
[701453.834634]  [<ffffffff8020eb7a>] timer_interrupt+0x1a/0x21
[701453.834637]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
[701453.834639]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
[701453.834641]  [<ffffffff80240021>] irq_exit+0x3f/0x7a
[701453.834643]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
[701453.834646]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
[701453.834647]  <EOI>  [<ffffffff80621b29>] vp_notify+0x0/0x1c
[701453.834653]  [<ffffffff804b099e>] __make_request+0x3e2/0x425
[701453.834656]  [<ffffffff804af1ff>] generic_make_request+0x338/0x389
[701453.834660]  [<ffffffff802986ce>] end_swap_bio_write+0x0/0x66
[701453.834664]  [<ffffffff802c6643>] bio_alloc_bioset+0x73/0xff
[701453.834666]  [<ffffffff804af30d>] submit_bio+0xbd/0xc4
[701453.834669]  [<ffffffff8072a52a>] _spin_lock+0x5/0x7
[701453.834672]  [<ffffffff802986c4>] swap_writepage+0x9b/0xa5
[701453.834675]  [<ffffffff80283dc1>] shrink_page_list+0x358/0x5ff
[701453.834677]  [<ffffffff80284319>] shrink_list+0x2b1/0x5d8
[701453.834680]  [<ffffffff802806e8>] determine_dirtyable_memory+0xd/0x1d
[701453.834682]  [<ffffffff8028075e>] get_dirty_limits+0x1d/0x24f
[701453.834685]  [<ffffffff802237c4>] pvclock_clocksource_read+0x3a/0x70
[701453.834688]  [<ffffffff802848bd>] shrink_zone+0x27d/0x325
[701453.834692]  [<ffffffff80231733>] resched_task+0x2a/0x75
[701453.834694]  [<ffffffff80280990>] background_writeout+0x0/0xce
[701453.834696]  [<ffffffff802855c7>] try_to_free_pages+0x1fa/0x32d
[701453.834699]  [<ffffffff80282bea>] isolate_pages_global+0x0/0x231
[701453.834701]  [<ffffffff8027f8c0>] __alloc_pages_internal+0x259/0x401
[701453.834705]  [<ffffffff8027aefb>] find_or_create_page+0x48/0x88
[701453.834707]  [<ffffffff802c2c31>] __getblk+0x117/0x29d
[701453.834711]  [<ffffffff80357f00>] journal_get_descriptor_buffer+0x30/0x76
[701453.834713]  [<ffffffff8035478e>] journal_commit_transaction+0x6da/0xdf0
[701453.834716]  [<ffffffff80244218>] lock_timer_base+0x26/0x4b
[701453.834719]  [<ffffffff8024428f>] try_to_del_timer_sync+0x52/0x5b
[701453.834721]  [<ffffffff8072a484>] _spin_lock_irqsave+0x24/0x2c
[701453.834723]  [<ffffffff803578a0>] kjournald+0xe5/0x214
[701453.834726]  [<ffffffff8024d628>] autoremove_wake_function+0x0/0x2e
[701453.834729]  [<ffffffff803577bb>] kjournald+0x0/0x214
[701453.834731]  [<ffffffff8024d2c7>] kthread+0x47/0x73
[701453.834748]  [<ffffffff8020d25a>] child_rip+0xa/0x20
[701453.834751]  [<ffffffff8024d280>] kthread+0x0/0x73
[701453.834753]  [<ffffffff8020d250>] child_rip+0x0/0x20
[701453.834754] Mem-Info:
[701453.834755] DMA per-cpu:
[701453.834757] CPU    0: hi:    0, btch:   1 usd:   0
[701453.834758] DMA32 per-cpu:
[701453.834760] CPU    0: hi:  186, btch:  31 usd: 165
[701453.834763] Active_anon:674 active_file:43401 inactive_anon:11269
[701453.834764]  inactive_file:53885 unevictable:0 dirty:5182 writeback:70 unstable:0
[701453.834765]  free:749 slab:12132 mapped:9094 pagetables:840 bounce:0
[701453.834768] DMA free:1968kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:1952kB inactive_file:2380kB unevictable:0kB present:5440kB pages_scanned:0 all_unreclaimable? no
[701453.834770] lowmem_reserve[]: 0 489 489 489
[701453.834774] DMA32 free:1028kB min:2812kB low:3512kB high:4216kB active_anon:2696kB inactive_anon:45076kB active_file:171652kB inactive_file:213160kB unevictable:0kB present:500896kB pages_scanned:0 all_unreclaimable? no
[701453.834777] lowmem_reserve[]: 0 0 0 0
[701453.834779] DMA: 78*4kB 7*8kB 12*16kB 8*32kB 10*64kB 4*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1968kB
[701453.834785] DMA32: 1*4kB 0*8kB 0*16kB 0*32kB 10*64kB 3*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1028kB
[701453.834791] 99417 total pagecache pages
[701453.834793] 2101 pages in swap cache
[701453.834794] Swap cache stats: add 8718, delete 6617, find 110037/110217
[701453.834796] Free swap  = 1020652kB
[701453.834797] Total swap = 1048568kB
[701453.836985] 131056 pages RAM
[701453.836987] 4801 pages reserved
[701453.836988] 98664 pages shared
[701453.836990] 34608 pages non-shared
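
A quick way to read the Mem-Info dump above: each zone line reports free
memory next to the min/low/high watermarks, and the per-order line lists the
free blocks by size. The Python sketch below (hypothetical helper names
parse_zone and buddy_total_kb, written for illustration and assuming wrapped
log lines have been joined back into one string) extracts those numbers. On
the values in this trace, DMA32 has 1028kB free against a min watermark of
2812kB, which would be consistent with an allocation from interrupt context
failing even though most of the 512MB is sitting in page cache; that reading
is an interpretation, not something confirmed in the thread.

```python
import re

# Matches e.g. "DMA32 free:1028kB min:2812kB low:3512kB high:4216kB"
ZONE_RE = re.compile(
    r"(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)
# Matches each "count*sizekB" entry of a per-order free-list line.
BUDDY_RE = re.compile(r"(\d+)\*(\d+)kB")


def parse_zone(line: str) -> dict:
    """Extract free memory and watermarks (in kB) from a zone summary line."""
    match = ZONE_RE.search(line)
    if match is None:
        raise ValueError("not a zone summary line")
    info = {key: int(match.group(key)) for key in ("free", "min", "low", "high")}
    info["zone"] = match.group("zone")
    info["below_min"] = info["free"] < info["min"]
    return info


def buddy_total_kb(line: str) -> int:
    """Sum count * block size over a per-order free-list line."""
    return sum(int(count) * int(size) for count, size in BUDDY_RE.findall(line))


if __name__ == "__main__":
    zone_line = ("[701453.834774] DMA32 free:1028kB min:2812kB "
                 "low:3512kB high:4216kB active_anon:2696kB")
    buddy_line = ("[701453.834785] DMA32: 1*4kB 0*8kB 0*16kB 0*32kB 10*64kB "
                  "3*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1028kB")
    print(parse_zone(zone_line))
    print(buddy_total_kb(buddy_line))
```

The per-order total matching the zone's free figure is a useful sanity check
that the wrapped lines were rejoined correctly.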






* Re: virtio net regression
  2009-04-28 18:57         ` Antoine Martin
@ 2009-05-09 13:19           ` Antoine Martin
  2009-05-13 12:58             ` Antoine Martin
  2009-05-18  9:39             ` Avi Kivity
  0 siblings, 2 replies; 14+ messages in thread
From: Antoine Martin @ 2009-05-09 13:19 UTC (permalink / raw)
  To: Mark McLoughlin
  Cc: Avi Kivity, Antoine Martin, kvm@vger.kernel.org, Rusty Russell,
	davem

Hi,

Here is another one, any ideas?
These oopses do look quite deep. Is it normal to end up in tcp_send_ack
from pdflush??

Cheers
Antoine

[929492.154634] pdflush: page allocation failure. order:0, mode:0x20
[929492.154637] Pid: 291, comm: pdflush Not tainted 2.6.29.2 #5
[929492.154639] Call Trace:
[929492.154641]  <IRQ>  [<ffffffff8027e8bc>] __alloc_pages_internal+0x3e1/0x401
[929492.154649]  [<ffffffff8055b5ea>] try_fill_recv+0xa1/0x182
[929492.154652]  [<ffffffff8055c1fc>] virtnet_poll+0x533/0x5ab
[929492.154655]  [<ffffffff80632bba>] net_rx_action+0x70/0x143
[929492.154658]  [<ffffffff8023f18c>] __do_softirq+0x83/0x123
[929492.154661]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
[929492.154664]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
[929492.154666]  [<ffffffff8023eea3>] irq_exit+0x3f/0x7a
[929492.154668]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
[929492.154670]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
[929492.154672]  <EOI>  [<ffffffff802c22b1>] __set_page_dirty_buffers+0x0/0x8f
[929492.154677]  [<ffffffff8031702b>] bget_one+0x0/0xb
[929492.154680]  [<ffffffff80316fa2>] walk_page_buffers+0x2/0x8b
[929492.154682]  [<ffffffff803185bc>] ext3_ordered_writepage+0xae/0x134
[929492.154685]  [<ffffffff8027ea46>] __writepage+0xa/0x25
[929492.154687]  [<ffffffff8027f19f>] write_cache_pages+0x206/0x322
[929492.154689]  [<ffffffff8027ea3c>] __writepage+0x0/0x25
[929492.154691]  [<ffffffff8027f2fe>] do_writepages+0x27/0x2d
[929492.154694]  [<ffffffff802bd3f6>] __writeback_single_inode+0x1a7/0x3b5
[929492.154696]  [<ffffffff8020a68c>] __switch_to+0xb4/0x38c
[929492.154698]  [<ffffffff802bda76>] generic_sync_sb_inodes+0x2a7/0x458
[929492.154701]  [<ffffffff802bde00>] writeback_inodes+0x8d/0xe6
[929492.154704]  [<ffffffff807296e2>] _spin_lock+0x5/0x7
[929492.155056]  [<ffffffff8027f432>] wb_kupdate+0x9f/0x116
[929492.155058]  [<ffffffff80280095>] pdflush+0x14b/0x202
[929492.155061]  [<ffffffff8027f393>] wb_kupdate+0x0/0x116
[929492.155063]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
[929492.155065]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
[929492.155068]  [<ffffffff8024c127>] kthread+0x47/0x73
[929492.155070]  [<ffffffff8020d25a>] child_rip+0xa/0x20
[929492.155072]  [<ffffffff8024c0e0>] kthread+0x0/0x73
[929492.183142]  [<ffffffff8020d250>] child_rip+0x0/0x20
[929492.183145] Mem-Info:
[929492.183147] DMA per-cpu:
[929492.183149] CPU    0: hi:    0, btch:   1 usd:   0
[929492.183151] DMA32 per-cpu:
[929492.183154] CPU    0: hi:  186, btch:  31 usd: 184
[929492.183158] Active_anon:2755 active_file:39849 inactive_anon:2972
[929492.183159]  inactive_file:70353 unevictable:0 dirty:4172 writeback:1580 unstable:0
[929492.183161]  free:734 slab:5619 mapped:15047 pagetables:927 bounce:0
[929492.183166] DMA free:1968kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:40kB active_file:2116kB inactive_file:1880kB unevictable:0kB present:5448kB pages_scanned:0 all_unreclaimable? no
[929492.183169] lowmem_reserve[]: 0 489 489 489
[929492.183176] DMA32 free:968kB min:2812kB low:3512kB high:4216kB active_anon:11020kB inactive_anon:11848kB active_file:157280kB inactive_file:279532kB unevictable:0kB present:500896kB pages_scanned:0 all_unreclaimable? no
[929492.183180] lowmem_reserve[]: 0 0 0 0
[929492.183183] DMA: 6*4kB 2*8kB 3*16kB 1*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
[929492.183235] DMA32: 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 3*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 968kB
[929492.183244] 110992 total pagecache pages
[929492.183246] 739 pages in swap cache
[929492.183248] Swap cache stats: add 8996, delete 8257, find 92604/93191
[929492.183250] Free swap  = 1040016kB
[929492.183252] Total swap = 1048568kB
[929492.186003] 131056 pages RAM
[929492.186006] 4799 pages reserved
[929492.186007] 44697 pages shared
[929492.186008] 90516 pages non-shared
[930274.380075] eth0: no IPv6 routers present









* Re: virtio net regression
  2009-05-09 13:19           ` Antoine Martin
@ 2009-05-13 12:58             ` Antoine Martin
  2009-05-14  3:52               ` David Miller
  2009-05-18  9:39             ` Avi Kivity
  1 sibling, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-05-13 12:58 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: Avi Kivity, kvm@vger.kernel.org, Rusty Russell, davem

Re-sending as this does not seem to have made it to the list.

Antoine Martin wrote:
> Hi,
> 
> Here is another one, any ideas?
> These oopses do look quite deep. Is it normal to end up in tcp_send_ack
> from pdflush??
> 
> Cheers
> Antoine
> 
> [929492.154634] pdflush: page allocation failure. order:0, mode:0x20
> [929492.154637] Pid: 291, comm: pdflush Not tainted 2.6.29.2 #5
> [929492.154639] Call Trace:
> [929492.154641]  <IRQ>  [<ffffffff8027e8bc>] __alloc_pages_internal+0x3e1/0x401
> [929492.154649]  [<ffffffff8055b5ea>] try_fill_recv+0xa1/0x182
> [929492.154652]  [<ffffffff8055c1fc>] virtnet_poll+0x533/0x5ab
> [929492.154655]  [<ffffffff80632bba>] net_rx_action+0x70/0x143
> [929492.154658]  [<ffffffff8023f18c>] __do_softirq+0x83/0x123
> [929492.154661]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
> [929492.154664]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
> [929492.154666]  [<ffffffff8023eea3>] irq_exit+0x3f/0x7a
> [929492.154668]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
> [929492.154670]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
> [929492.154672]  <EOI>  [<ffffffff802c22b1>] __set_page_dirty_buffers+0x0/0x8f
> [929492.154677]  [<ffffffff8031702b>] bget_one+0x0/0xb
> [929492.154680]  [<ffffffff80316fa2>] walk_page_buffers+0x2/0x8b
> [929492.154682]  [<ffffffff803185bc>] ext3_ordered_writepage+0xae/0x134
> [929492.154685]  [<ffffffff8027ea46>] __writepage+0xa/0x25
> [929492.154687]  [<ffffffff8027f19f>] write_cache_pages+0x206/0x322
> [929492.154689]  [<ffffffff8027ea3c>] __writepage+0x0/0x25
> [929492.154691]  [<ffffffff8027f2fe>] do_writepages+0x27/0x2d
> [929492.154694]  [<ffffffff802bd3f6>] __writeback_single_inode+0x1a7/0x3b5
> [929492.154696]  [<ffffffff8020a68c>] __switch_to+0xb4/0x38c
> [929492.154698]  [<ffffffff802bda76>] generic_sync_sb_inodes+0x2a7/0x458
> [929492.154701]  [<ffffffff802bde00>] writeback_inodes+0x8d/0xe6
> [929492.154704]  [<ffffffff807296e2>] _spin_lock+0x5/0x7
> [929492.155056]  [<ffffffff8027f432>] wb_kupdate+0x9f/0x116
> [929492.155058]  [<ffffffff80280095>] pdflush+0x14b/0x202
> [929492.155061]  [<ffffffff8027f393>] wb_kupdate+0x0/0x116
> [929492.155063]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
> [929492.155065]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
> [929492.155068]  [<ffffffff8024c127>] kthread+0x47/0x73
> [929492.155070]  [<ffffffff8020d25a>] child_rip+0xa/0x20
> [929492.155072]  [<ffffffff8024c0e0>] kthread+0x0/0x73
> [929492.183142]  [<ffffffff8020d250>] child_rip+0x0/0x20
> [929492.183145] Mem-Info:
> [929492.183147] DMA per-cpu:
> [929492.183149] CPU    0: hi:    0, btch:   1 usd:   0
> [929492.183151] DMA32 per-cpu:
> [929492.183154] CPU    0: hi:  186, btch:  31 usd: 184
> [929492.183158] Active_anon:2755 active_file:39849 inactive_anon:2972
> [929492.183159]  inactive_file:70353 unevictable:0 dirty:4172
> writeback:1580 unstable:0
> [929492.183161]  free:734 slab:5619 mapped:15047 pagetables:927 bounce:0
> [929492.183166] DMA free:1968kB min:28kB low:32kB high:40kB
> active_anon:0kB inactive_anon:40kB active_file:2116kB
> inactive_file:1880kB unevictable:0kB present:5448kB pages_scanned:0
> all_unreclaimable? no
> [929492.183169] lowmem_reserve[]: 0 489 489 489
> [929492.183176] DMA32 free:968kB min:2812kB low:3512kB high:4216kB
> active_anon:11020kB inactive_anon:11848kB active_file:157280kB
> inactive_file:279532kB unevictable:0kB present:500896kB pages_scanned:0
> all_unreclaimable? no
> [929492.183180] lowmem_reserve[]: 0 0 0 0
> [929492.183183] DMA: 6*4kB 2*8kB 3*16kB 1*32kB 1*64kB 2*128kB 0*256kB
> 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
> [929492.183235] DMA32: 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 3*128kB 2*256kB
> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 968kB
> [929492.183244] 110992 total pagecache pages
> [929492.183246] 739 pages in swap cache
> [929492.183248] Swap cache stats: add 8996, delete 8257, find 92604/93191
> [929492.183250] Free swap  = 1040016kB
> [929492.183252] Total swap = 1048568kB
> [929492.186003] 131056 pages RAM
> [929492.186006] 4799 pages reserved
> [929492.186007] 44697 pages shared
> [929492.186008] 90516 pages non-shared
> [930274.380075] eth0: no IPv6 routers present
> 
> 
> 
> 
> 
> 
> 
> Antoine Martin wrote:
>> Hi
>>
>> Still getting (some but less) network issues with a 2.6.28.9 host.
>>
>> Found quite a few of these call traces in the 2.6.29.1 guests:
>> Guest has 512MB of memory and was not all that busy (just network
>> traffic), so I don't understand why it would fail to allocate a page...
>>
>>
>> [701453.834571] kjournald: page allocation failure. order:0, mode:0x4020
>> [701453.834574] Pid: 4806, comm: kjournald Not tainted 2.6.29.1 #4
>> [701453.834576] Call Trace:
>> [701453.834578]  <IRQ>  [<ffffffff8027fa48>]
>> __alloc_pages_internal+0x3e1/0x401
>> [701453.834586]  [<ffffffff802a1ad4>] __slab_alloc+0x17f/0x4ca
>> [701453.834590]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
>> [701453.834592]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
>> [701453.834595]  [<ffffffff802a2e66>] __kmalloc_track_caller+0xac/0xe1
>> [701453.834598]  [<ffffffff8062f97e>] __alloc_skb+0x61/0x11e
>> [701453.834600]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
>> [701453.834603]  [<ffffffff8067c374>] tcp_rcv_established+0x6c7/0x9e6
>> [701453.834605]  [<ffffffff80683515>] tcp_v4_do_rcv+0x19e/0x324
>> [701453.834608]  [<ffffffff80683b23>] tcp_v4_rcv+0x488/0x73b
>> [701453.834611]  [<ffffffff806499c4>] nf_hook_slow+0x62/0xc3
>> [701453.834615]  [<ffffffff8066925c>] ip_local_deliver_finish+0x0/0x1ee
>> [701453.834617]  [<ffffffff80669378>] ip_local_deliver_finish+0x11c/0x1ee
>> [701453.834620]  [<ffffffff80668fcb>] ip_rcv_finish+0x2cf/0x2e9
>> [701453.834622]  [<ffffffff80669218>] ip_rcv+0x233/0x277
>> [701453.834626]  [<ffffffff8055d1e7>] virtnet_poll+0x4ca/0x5ab
>> [701453.834628]  [<ffffffff80633952>] net_rx_action+0x70/0x143
>> [701453.834631]  [<ffffffff8024030a>] __do_softirq+0x83/0x145
>> [701453.834634]  [<ffffffff8020eb7a>] timer_interrupt+0x1a/0x21
>> [701453.834637]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
>> [701453.834639]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
>> [701453.834641]  [<ffffffff80240021>] irq_exit+0x3f/0x7a
>> [701453.834643]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
>> [701453.834646]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
>> [701453.834647]  <EOI>  [<ffffffff80621b29>] vp_notify+0x0/0x1c
>> [701453.834653]  [<ffffffff804b099e>] __make_request+0x3e2/0x425
>> [701453.834656]  [<ffffffff804af1ff>] generic_make_request+0x338/0x389
>> [701453.834660]  [<ffffffff802986ce>] end_swap_bio_write+0x0/0x66
>> [701453.834664]  [<ffffffff802c6643>] bio_alloc_bioset+0x73/0xff
>> [701453.834666]  [<ffffffff804af30d>] submit_bio+0xbd/0xc4
>> [701453.834669]  [<ffffffff8072a52a>] _spin_lock+0x5/0x7
>> [701453.834672]  [<ffffffff802986c4>] swap_writepage+0x9b/0xa5
>> [701453.834675]  [<ffffffff80283dc1>] shrink_page_list+0x358/0x5ff
>> [701453.834677]  [<ffffffff80284319>] shrink_list+0x2b1/0x5d8
>> [701453.834680]  [<ffffffff802806e8>] determine_dirtyable_memory+0xd/0x1d
>> [701453.834682]  [<ffffffff8028075e>] get_dirty_limits+0x1d/0x24f
>> [701453.834685]  [<ffffffff802237c4>] pvclock_clocksource_read+0x3a/0x70
>> [701453.834688]  [<ffffffff802848bd>] shrink_zone+0x27d/0x325
>> [701453.834692]  [<ffffffff80231733>] resched_task+0x2a/0x75
>> [701453.834694]  [<ffffffff80280990>] background_writeout+0x0/0xce
>> [701453.834696]  [<ffffffff802855c7>] try_to_free_pages+0x1fa/0x32d
>> [701453.834699]  [<ffffffff80282bea>] isolate_pages_global+0x0/0x231
>> [701453.834701]  [<ffffffff8027f8c0>] __alloc_pages_internal+0x259/0x401
>> [701453.834705]  [<ffffffff8027aefb>] find_or_create_page+0x48/0x88
>> [701453.834707]  [<ffffffff802c2c31>] __getblk+0x117/0x29d
>> [701453.834711]  [<ffffffff80357f00>]
>> journal_get_descriptor_buffer+0x30/0x76
>> [701453.834713]  [<ffffffff8035478e>] journal_commit_transaction+0x6da/0xdf0
>> [701453.834716]  [<ffffffff80244218>] lock_timer_base+0x26/0x4b
>> [701453.834719]  [<ffffffff8024428f>] try_to_del_timer_sync+0x52/0x5b
>> [701453.834721]  [<ffffffff8072a484>] _spin_lock_irqsave+0x24/0x2c
>> [701453.834723]  [<ffffffff803578a0>] kjournald+0xe5/0x214
>> [701453.834726]  [<ffffffff8024d628>] autoremove_wake_function+0x0/0x2e
>> [701453.834729]  [<ffffffff803577bb>] kjournald+0x0/0x214
>> [701453.834731]  [<ffffffff8024d2c7>] kthread+0x47/0x73
>> [701453.834748]  [<ffffffff8020d25a>] child_rip+0xa/0x20
>> [701453.834751]  [<ffffffff8024d280>] kthread+0x0/0x73
>> [701453.834753]  [<ffffffff8020d250>] child_rip+0x0/0x20
>> [701453.834754] Mem-Info:
>> [701453.834755] DMA per-cpu:
>> [701453.834757] CPU    0: hi:    0, btch:   1 usd:   0
>> [701453.834758] DMA32 per-cpu:
>> [701453.834760] CPU    0: hi:  186, btch:  31 usd: 165
>> [701453.834763] Active_anon:674 active_file:43401 inactive_anon:11269
>> [701453.834764]  inactive_file:53885 unevictable:0 dirty:5182
>> writeback:70 unstable:0
>> [701453.834765]  free:749 slab:12132 mapped:9094 pagetables:840 bounce:0
>> [701453.834768] DMA free:1968kB min:28kB low:32kB high:40kB
>> active_anon:0kB inactive_anon:0kB active_file:1952kB
>> inactive_file:2380kB unevictable:0kB present:5440kB pages_scanned:0
>> all_unreclaimable? no
>> [701453.834770] lowmem_reserve[]: 0 489 489 489
>> [701453.834774] DMA32 free:1028kB min:2812kB low:3512kB high:4216kB
>> active_anon:2696kB inactive_anon:45076kB active_file:171652kB
>> inactive_file:213160kB unevictable:0kB present:500896kB pages_scanned:0
>> all_unreclaimable? no
>> [701453.834777] lowmem_reserve[]: 0 0 0 0
>> [701453.834779] DMA: 78*4kB 7*8kB 12*16kB 8*32kB 10*64kB 4*128kB 0*256kB
>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1968kB
>> [701453.834785] DMA32: 1*4kB 0*8kB 0*16kB 0*32kB 10*64kB 3*128kB 0*256kB
>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1028kB
>> [701453.834791] 99417 total pagecache pages
>> [701453.834793] 2101 pages in swap cache
>> [701453.834794] Swap cache stats: add 8718, delete 6617, find 110037/110217
>> [701453.834796] Free swap  = 1020652kB
>> [701453.834797] Total swap = 1048568kB
>> [701453.836985] 131056 pages RAM
>> [701453.836987] 4801 pages reserved
>> [701453.836988] 98664 pages shared
>> [701453.836990] 34608 pages non-shared
>>
>>
>>
>>
>> Antoine Martin wrote:
>>> Hi,
>>>
>>> The bug report below does indeed match everything I have experienced.
>>> Upon further inspection, 2.6.28.9 is also affected, just less so.
>>>
>>> Unfortunately I have applied the patch:
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
>>> And if anything, it made things worse.
>>>
>>> Cheers
>>> Antoine
>>>
>>>
>>>
>>> Mark McLoughlin wrote:
>>>> On Sun, 2009-04-19 at 14:48 +0300, Avi Kivity wrote:
>>>>> Antoine Martin wrote:
>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>> Hash: SHA512
>>>>>>
>>>>>> Wireshark was showing a huge amount of invalid packets (wrong checksum)
>>>>>> - - that was the cause of the slowdown.
>>>>>> Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
>>>>>> of whether the guests use virtio or ne2k_pci/etc.
>>>>>> The guests are still running 2.6.29.1, but I am not likely to try that
>>>>>> release again on the host anytime soon! Ouch!
>>>>>>   
>>>>> Strange, no significant tun changes between .28 and .29.
>>>> Sounds to me like it's this:
>>>>
>>>>   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
>>>>
>>>> davem said he was queueing up for stable, but it's not in yet:
>>>>
>>>>   http://kerneltrap.org/mailarchive/linux-netdev/2009/3/30/5337934
>>>>
>>>> I'll check that it's in the queue.
>>>>
>>>> Cheers,
>>>> Mark.
>>>>
> 
> 


* Re: virtio net regression
  2009-05-13 12:58             ` Antoine Martin
@ 2009-05-14  3:52               ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2009-05-14  3:52 UTC (permalink / raw)
  To: antoine; +Cc: markmc, avi, kvm, rusty

From: Antoine Martin <antoine@devloop.org.uk>
Date: Wed, 13 May 2009 19:58:45 +0700

> Re-sending as this does not seem to have made it to the list.

It made it, it's just that nobody has had a chance to look into
this.


* Re: virtio net regression
  2009-05-09 13:19           ` Antoine Martin
  2009-05-13 12:58             ` Antoine Martin
@ 2009-05-18  9:39             ` Avi Kivity
  2009-05-19 10:16               ` Antoine Martin
  1 sibling, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2009-05-18  9:39 UTC (permalink / raw)
  To: Antoine Martin
  Cc: Mark McLoughlin, Antoine Martin, kvm@vger.kernel.org,
	Rusty Russell, davem

Antoine Martin wrote:
> Hi,
>
> Here is another one, any ideas?
> These oopses do look quite deep. Is it normal to end up in tcp_send_ack
> from pdflush??
>
>   

I think it can happen anywhere, part of the net softirq.

> Cheers
> Antoine
>
> [929492.154634] pdflush: page allocation failure. order:0, mode:0x20
>   

You're out of memory.  How much memory did you allocate to the guest?
Did you balloon it?

> [929492.154637] Pid: 291, comm: pdflush Not tainted 2.6.29.2 #5
> [929492.154639] Call Trace:
> [929492.154641]  <IRQ>  [<ffffffff8027e8bc>]
> __alloc_pages_internal+0x3e1/0x401
> [929492.154649]  [<ffffffff8055b5ea>] try_fill_recv+0xa1/0x182
> [929492.154652]  [<ffffffff8055c1fc>] virtnet_poll+0x533/0x5ab
> [929492.154655]  [<ffffffff80632bba>] net_rx_action+0x70/0x143
> [929492.154658]  [<ffffffff8023f18c>] __do_softirq+0x83/0x123
> [929492.154661]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
> [929492.154664]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
> [929492.154666]  [<ffffffff8023eea3>] irq_exit+0x3f/0x7a
> [929492.154668]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
> [929492.154670]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
> [929492.154672]  <EOI>  [<ffffffff802c22b1>]
> __set_page_dirty_buffers+0x0/0x8f
> [929492.154677]  [<ffffffff8031702b>] bget_one+0x0/0xb
> [929492.154680]  [<ffffffff80316fa2>] walk_page_buffers+0x2/0x8b
> [929492.154682]  [<ffffffff803185bc>] ext3_ordered_writepage+0xae/0x134
> [929492.154685]  [<ffffffff8027ea46>] __writepage+0xa/0x25
> [929492.154687]  [<ffffffff8027f19f>] write_cache_pages+0x206/0x322
> [929492.154689]  [<ffffffff8027ea3c>] __writepage+0x0/0x25
> [929492.154691]  [<ffffffff8027f2fe>] do_writepages+0x27/0x2d
> [929492.154694]  [<ffffffff802bd3f6>] __writeback_single_inode+0x1a7/0x3b5
> [929492.154696]  [<ffffffff8020a68c>] __switch_to+0xb4/0x38c
> [929492.154698]  [<ffffffff802bda76>] generic_sync_sb_inodes+0x2a7/0x458
> [929492.154701]  [<ffffffff802bde00>] writeback_inodes+0x8d/0xe6
> [929492.154704]  [<ffffffff807296e2>] _spin_lock+0x5/0x7
> [929492.155056]  [<ffffffff8027f432>] wb_kupdate+0x9f/0x116
> [929492.155058]  [<ffffffff80280095>] pdflush+0x14b/0x202
> [929492.155061]  [<ffffffff8027f393>] wb_kupdate+0x0/0x116
> [929492.155063]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
> [929492.155065]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
> [929492.155068]  [<ffffffff8024c127>] kthread+0x47/0x73
> [929492.155070]  [<ffffffff8020d25a>] child_rip+0xa/0x20
> [929492.155072]  [<ffffffff8024c0e0>] kthread+0x0/0x73
> [929492.183142]  [<ffffffff8020d250>] child_rip+0x0/0x20
> [929492.183145] Mem-Info:
> [929492.183147] DMA per-cpu:
> [929492.183149] CPU    0: hi:    0, btch:   1 usd:   0
> [929492.183151] DMA32 per-cpu:
> [929492.183154] CPU    0: hi:  186, btch:  31 usd: 184
> [929492.183158] Active_anon:2755 active_file:39849 inactive_anon:2972
> [929492.183159]  inactive_file:70353 unevictable:0 dirty:4172
> writeback:1580 unstable:0
> [929492.183161]  free:734 slab:5619 mapped:15047 pagetables:927 bounce:0
> [929492.183166] DMA free:1968kB min:28kB low:32kB high:40kB
> active_anon:0kB inactive_anon:40kB active_file:2116kB
> inactive_file:1880kB unevictable:0kB present:5448kB pages_scanned:0
> all_unreclaimable? no
> [929492.183169] lowmem_reserve[]: 0 489 489 489
> [929492.183176] DMA32 free:968kB min:2812kB low:3512kB high:4216kB
> active_anon:11020kB inactive_anon:11848kB active_file:157280kB
> inactive_file:279532kB unevictable:0kB present:500896kB pages_scanned:0
> all_unreclaimable? no
> [929492.183180] lowmem_reserve[]: 0 0 0 0
> [929492.183183] DMA: 6*4kB 2*8kB 3*16kB 1*32kB 1*64kB 2*128kB 0*256kB
> 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
> [929492.183235] DMA32: 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 3*128kB 2*256kB
> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 968kB
> [929492.183244] 110992 total pagecache pages
> [929492.183246] 739 pages in swap cache
> [929492.183248] Swap cache stats: add 8996, delete 8257, find 92604/93191
> [929492.183250] Free swap  = 1040016kB
> [929492.183252] Total swap = 1048568kB
> [929492.186003] 131056 pages RAM
> [929492.186006] 4799 pages reserved
> [929492.186007] 44697 pages shared
> [929492.186008] 90516 pages non-shared
> [930274.380075] eth0: no IPv6 routers present
>
>   
Strange, seems to be a bit of free memory here.
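
One way to read the numbers above: the failing request is order:0 with
mode:0x20, which on kernels of this era should correspond to an atomic
(high-priority, non-sleeping) allocation. Such a request cannot wait for
reclaim, so it only succeeds if a zone's free pages stay above a reduced
"min" watermark plus that zone's lowmem_reserve. The sketch below redoes
that arithmetic for the two zones in the report. It is an approximation
written from memory of the 2.6.x watermark check (min halved for
high-priority requests, then cut by a further quarter for non-sleeping
ones), not a copy of the kernel code, and the function names are made up
for illustration.

```python
PAGE_KB = 4  # x86-64 page size in kB


def atomic_min_pages(min_kb, high=True, harder=True):
    """Approximate the reduced 'min' watermark (in pages) for an atomic request."""
    pages = min_kb // PAGE_KB
    if high:      # assumed: high-priority requests may use half of the reserve
        pages -= pages // 2
    if harder:    # assumed: non-sleeping requests may dig a quarter deeper
        pages -= pages // 4
    return pages


def watermark_ok(free_kb, min_kb, lowmem_reserve_pages, order=0):
    """Return True if the zone could satisfy the request under these assumptions."""
    free_pages = free_kb // PAGE_KB - (1 << order) + 1
    return free_pages > atomic_min_pages(min_kb) + lowmem_reserve_pages


# Values taken from the pdflush report: (free kB, min kB, lowmem_reserve pages)
zones = {
    "DMA": (1968, 28, 489),
    "DMA32": (968, 2812, 0),
}

for name, (free_kb, min_kb, reserve) in zones.items():
    needed = atomic_min_pages(min_kb) + reserve
    verdict = "ok" if watermark_ok(free_kb, min_kb, reserve) else "fail"
    print(f"{name}: free={free_kb // PAGE_KB} pages, needs more than {needed}: {verdict}")
# DMA: free=492 pages, needs more than 492: fail
# DMA32: free=242 pages, needs more than 264: fail
```

Under these assumptions both zones come out at or just below the
threshold, which would be consistent with an allocation failure even
though about 2.9MB shows as free.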


-- 
error compiling committee.c: too many arguments to function



* Re: virtio net regression
  2009-05-18  9:39             ` Avi Kivity
@ 2009-05-19 10:16               ` Antoine Martin
  2009-05-19 10:21                 ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-05-19 10:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Antoine Martin, Mark McLoughlin, kvm@vger.kernel.org,
	Rusty Russell, davem

Avi Kivity wrote:
> Antoine Martin wrote:
>> Hi,
>>
>> Here is another one, any ideas?
>> These oopses do look quite deep. Is it normal to end up in tcp_send_ack
>> from pdflush??
>>
>>   
>
> I think it can happen anywhere, part of the net softirq.
Hah, gotcha.
>
>> Cheers
>> Antoine
>>
>> [929492.154634] pdflush: page allocation failure. order:0, mode:0x20
>>   
>
> You're out of memory.
That's quite odd, the guest wasn't even hitting the swap at the time.
>   How much memory did you allocate to the guest?  Did you balloon it?
512MB, no ballooning.

>
>> [929492.154637] Pid: 291, comm: pdflush Not tainted 2.6.29.2 #5
>> [929492.154639] Call Trace:
>> [929492.154641]  <IRQ>  [<ffffffff8027e8bc>]
>> __alloc_pages_internal+0x3e1/0x401
>> [929492.154649]  [<ffffffff8055b5ea>] try_fill_recv+0xa1/0x182
>> [929492.154652]  [<ffffffff8055c1fc>] virtnet_poll+0x533/0x5ab
>> [929492.154655]  [<ffffffff80632bba>] net_rx_action+0x70/0x143
>> [929492.154658]  [<ffffffff8023f18c>] __do_softirq+0x83/0x123
>> [929492.154661]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
>> [929492.154664]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
>> [929492.154666]  [<ffffffff8023eea3>] irq_exit+0x3f/0x7a
>> [929492.154668]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
>> [929492.154670]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
>> [929492.154672]  <EOI>  [<ffffffff802c22b1>]
>> __set_page_dirty_buffers+0x0/0x8f
>> [929492.154677]  [<ffffffff8031702b>] bget_one+0x0/0xb
>> [929492.154680]  [<ffffffff80316fa2>] walk_page_buffers+0x2/0x8b
>> [929492.154682]  [<ffffffff803185bc>] ext3_ordered_writepage+0xae/0x134
>> [929492.154685]  [<ffffffff8027ea46>] __writepage+0xa/0x25
>> [929492.154687]  [<ffffffff8027f19f>] write_cache_pages+0x206/0x322
>> [929492.154689]  [<ffffffff8027ea3c>] __writepage+0x0/0x25
>> [929492.154691]  [<ffffffff8027f2fe>] do_writepages+0x27/0x2d
>> [929492.154694]  [<ffffffff802bd3f6>]
>> __writeback_single_inode+0x1a7/0x3b5
>> [929492.154696]  [<ffffffff8020a68c>] __switch_to+0xb4/0x38c
>> [929492.154698]  [<ffffffff802bda76>] generic_sync_sb_inodes+0x2a7/0x458
>> [929492.154701]  [<ffffffff802bde00>] writeback_inodes+0x8d/0xe6
>> [929492.154704]  [<ffffffff807296e2>] _spin_lock+0x5/0x7
>> [929492.155056]  [<ffffffff8027f432>] wb_kupdate+0x9f/0x116
>> [929492.155058]  [<ffffffff80280095>] pdflush+0x14b/0x202
>> [929492.155061]  [<ffffffff8027f393>] wb_kupdate+0x0/0x116
>> [929492.155063]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
>> [929492.155065]  [<ffffffff8027ff4a>] pdflush+0x0/0x202
>> [929492.155068]  [<ffffffff8024c127>] kthread+0x47/0x73
>> [929492.155070]  [<ffffffff8020d25a>] child_rip+0xa/0x20
>> [929492.155072]  [<ffffffff8024c0e0>] kthread+0x0/0x73
>> [929492.183142]  [<ffffffff8020d250>] child_rip+0x0/0x20
>> [929492.183145] Mem-Info:
>> [929492.183147] DMA per-cpu:
>> [929492.183149] CPU    0: hi:    0, btch:   1 usd:   0
>> [929492.183151] DMA32 per-cpu:
>> [929492.183154] CPU    0: hi:  186, btch:  31 usd: 184
>> [929492.183158] Active_anon:2755 active_file:39849 inactive_anon:2972
>> [929492.183159]  inactive_file:70353 unevictable:0 dirty:4172
>> writeback:1580 unstable:0
>> [929492.183161]  free:734 slab:5619 mapped:15047 pagetables:927 bounce:0
>> [929492.183166] DMA free:1968kB min:28kB low:32kB high:40kB
>> active_anon:0kB inactive_anon:40kB active_file:2116kB
>> inactive_file:1880kB unevictable:0kB present:5448kB pages_scanned:0
>> all_unreclaimable? no
>> [929492.183169] lowmem_reserve[]: 0 489 489 489
>> [929492.183176] DMA32 free:968kB min:2812kB low:3512kB high:4216kB
>> active_anon:11020kB inactive_anon:11848kB active_file:157280kB
>> inactive_file:279532kB unevictable:0kB present:500896kB pages_scanned:0
>> all_unreclaimable? no
>> [929492.183180] lowmem_reserve[]: 0 0 0 0
>> [929492.183183] DMA: 6*4kB 2*8kB 3*16kB 1*32kB 1*64kB 2*128kB 0*256kB
>> 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
>> [929492.183235] DMA32: 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 3*128kB 2*256kB
>> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 968kB
>> [929492.183244] 110992 total pagecache pages
>> [929492.183246] 739 pages in swap cache
>> [929492.183248] Swap cache stats: add 8996, delete 8257, find
>> 92604/93191
>> [929492.183250] Free swap  = 1040016kB
>> [929492.183252] Total swap = 1048568kB
>> [929492.186003] 131056 pages RAM
>> [929492.186006] 4799 pages reserved
>> [929492.186007] 44697 pages shared
>> [929492.186008] 90516 pages non-shared
>> [930274.380075] eth0: no IPv6 routers present
>>
>>   
> Strange, seems to be a bit of free memory here.
There should be lots, all this host is doing is apache+sftp...

Assuming I can make it re-occur (stress testing it?), how would I dig
further to find the cause of this memory exhaustion? /proc/meminfo and
friends?

Cheers
Antoine



* Re: virtio net regression
  2009-05-19 10:16               ` Antoine Martin
@ 2009-05-19 10:21                 ` Avi Kivity
  2009-05-19 10:47                   ` Antoine Martin
  0 siblings, 1 reply; 14+ messages in thread
From: Avi Kivity @ 2009-05-19 10:21 UTC (permalink / raw)
  To: Antoine Martin
  Cc: Antoine Martin, Mark McLoughlin, kvm@vger.kernel.org,
	Rusty Russell, davem

Antoine Martin wrote:
>> You're out of memory.
>>     
> That's quite odd, the guest wasn't even hitting the swap at the time.
>   

But you do have swap enabled?

  

>> Strange, seems to be a bit of free memory here.
>>     
> There should be lots, all this host is doing is apache+sftp...
>
> Assuming I can make it re-occur (stress testing it?), how would I dig
> further to find the cause of this memory exhaustion? /proc/meminfo and
> friends?
>   

Yes please.  Maybe virtio is leaking memory.
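
A minimal sketch of what such a check could look like, assuming the
usual "Name:   value kB" layout of /proc/meminfo: take two snapshots
some time apart and report which fields moved. A steadily growing Slab
figure while free memory shrinks would be one hint of a kernel-side
leak. The helper names and the sample figures are illustrative only;
nothing here is specific to virtio.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def growth(before, after):
    """Return the fields whose value changed between two snapshots, with the delta."""
    return {
        key: after[key] - before[key]
        for key in after
        if key in before and after[key] != before[key]
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Made-up snapshots, only to show the output format.
earlier = parse_meminfo("MemTotal: 505024 kB\nMemFree: 2936 kB\nSlab: 22476 kB\n")
later = parse_meminfo("MemTotal: 505024 kB\nMemFree: 2100 kB\nSlab: 48528 kB\n")
print(growth(earlier, later))
# {'MemFree': -836, 'Slab': 26052}
```

On the guest itself, calling read_meminfo() before and after a stress
run and passing both results to growth() gives the same kind of report.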

-- 
error compiling committee.c: too many arguments to function



* Re: virtio net regression
  2009-05-19 10:21                 ` Avi Kivity
@ 2009-05-19 10:47                   ` Antoine Martin
  2009-05-19 13:03                     ` Avi Kivity
  0 siblings, 1 reply; 14+ messages in thread
From: Antoine Martin @ 2009-05-19 10:47 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Antoine Martin, Mark McLoughlin, kvm@vger.kernel.org,
	Rusty Russell, davem

Avi Kivity wrote:
> Antoine Martin wrote:
>>> You're out of memory.
>>>     
>> That's quite odd, the guest wasn't even hitting the swap at the time.
>
> But you do have swap enabled?
Yes.

I always do this on the guests as it seems fairer to let the guests use
swap when they need the extra memory rather than over-committing too
much memory on the host. Although it would probably be more efficient
overall to let the host manage all swapping.
It consumes more I/O bandwidth, but most guests' memory stays "warm" no
matter what other guests are doing.
Does that sound reasonable?
>>> Strange, seems to be a bit of free memory here.
>>>     
>> There should be lots, all this host is doing is apache+sftp...
>>
>> Assuming I can make it re-occur (stress testing it?), how would I dig
>> further to find the cause of this memory exhaustion? /proc/meminfo and
>> friends?
>>   
>
> Yes please.  Maybe virtio is leaking memory.
Will report if I find anything.


* Re: virtio net regression
  2009-05-19 10:47                   ` Antoine Martin
@ 2009-05-19 13:03                     ` Avi Kivity
  0 siblings, 0 replies; 14+ messages in thread
From: Avi Kivity @ 2009-05-19 13:03 UTC (permalink / raw)
  To: Antoine Martin
  Cc: Antoine Martin, Mark McLoughlin, kvm@vger.kernel.org,
	Rusty Russell, davem

Antoine Martin wrote:
>>
>> But you do have swap enabled?
>>     
> Yes.
>
> I always do this on the guests as it seems fairer to let the guests use
> swap when they need the extra memory rather than over-committing too
> much memory on the host. Although it would probably be more efficient
> overall to let the host manage all swapping.
> It consumes more I/O bandwidth, but most guests' memory stays "warm" no
> matter what other guests are doing.
> Does that sound reasonable?
>   

Yes, it also provides better isolation.

-- 
error compiling committee.c: too many arguments to function



end of thread, other threads:[~2009-05-19 13:03 UTC | newest]

Thread overview: 14+ messages
2009-04-15 20:04 virtio net regression Antoine Martin
2009-04-15 23:38 ` Antoine Martin
2009-04-19 11:48   ` Avi Kivity
2009-04-20 11:12     ` Mark McLoughlin
2009-04-20 15:09       ` Antoine Martin
     [not found]       ` <49EC8F1D.7000109@nagafix.co.uk>
2009-04-28 18:57         ` Antoine Martin
2009-05-09 13:19           ` Antoine Martin
2009-05-13 12:58             ` Antoine Martin
2009-05-14  3:52               ` David Miller
2009-05-18  9:39             ` Avi Kivity
2009-05-19 10:16               ` Antoine Martin
2009-05-19 10:21                 ` Avi Kivity
2009-05-19 10:47                   ` Antoine Martin
2009-05-19 13:03                     ` Avi Kivity
