From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eddie Chapman <eddie@ehuk.net>
Cc: kvm@vger.kernel.org, Romain Francoise <romain@orebokech.com>,
Michael Mueller <mimu@linux.vnet.ibm.com>,
mityapetuhov@gmail.com
Subject: Re: Possible to backport this vhost-net fix to 3.10?
Date: Sun, 5 Oct 2014 18:44:45 +0300
Message-ID: <20141005154445.GA14840@redhat.com>
In-Reply-To: <543164EF.4070007@ehuk.net>
On Sun, Oct 05, 2014 at 04:34:07PM +0100, Eddie Chapman wrote:
>
> On 04/10/14 19:35, Michael S. Tsirkin wrote:
> >On Sat, Oct 04, 2014 at 12:38:24AM +0100, Eddie Chapman wrote:
> >>Hi,
> >>
> >>I've been regularly seeing on the 3.10 stable kernels the same problem as
> >>reported by Romain Francoise here:
> >>https://lkml.org/lkml/2013/1/23/492
> >>
> >>An example from my setup is at the bottom of this mail. It's a problem as
> >>qemu fails to run when it hits this, only solution is to do all qemu
> >>launches with vhost=off after it happens. It starts happening after the
> >>machine has been running for a while and after a few VMs have been started.
> >>I guess that is the fragmentation issue as the machine is never under any
> >>serious memory pressure when it happens.
> >>
> >>I see this set of changes for 3.16 has a couple of fixes which appear to
> >>address the problem:
> >>https://lkml.org/lkml/2014/6/11/302
> >>
> >>I was just wondering if there are any plans to backport these to 3.10, or
> >>even if it is actually possible (I'm not a kernel dev so wouldn't know)?
> >>
> >>If not, are there any other workarounds other than vhost=off?
> >>
> >>thanks,
> >>Eddie
> >
> >Yes, these patches aren't hard to backport.
> >Go ahead and post the backport, I'll review and ack.
>
> Thanks Michael,
>
> Actually I just discovered that Dmitry Petuhov backported
> 23cc5a991c7a9fb7e6d6550e65cee4f4173111c5 ("vhost-net: extend device
> allocation to vmalloc") last month to the Proxmox 3.10 kernel
> https://www.mail-archive.com/pve-devel@pve.proxmox.com/msg08873.html
>
> He appears to have tested it quite thoroughly himself with a heavy workload,
> with no problems, though it hasn't gone into a Proxmox release yet.
>
> His patch applies to vanilla kernel.org 3.10.55 with only slight fuzziness,
> so I've done some slight white space cleanup so it applies cleanly. Vanilla
> 3.10.55 compiles fine on my machine without any errors or warnings with it.
> Is it OK (below)? Not sure it will meet stable submission rules?
OK, but pls clean up the indentation, it's all scrambled. You'll also need to
add proper attribution (using >From: header), your signature etc.
>
> Dmitry also says that d04257b07f2362d4eb550952d5bf5f4241a8046d ("vhost-net:
> don't open-code kvfree") is not applicable in 3.10 because there's no
> open-coded kvfree() function (this appears in v3.15-rc5).
Yes that's just a cleanup, we don't do these in stable.
> Have added Dmitry to CC.
>
> thanks,
> Eddie
> --- a/drivers/vhost/net.c 2014-10-05 15:34:12.282126999 +0100
> +++ b/drivers/vhost/net.c 2014-10-05 15:34:15.862140883 +0100
> @@ -18,6 +18,7 @@
> #include <linux/rcupdate.h>
> #include <linux/file.h>
> #include <linux/slab.h>
> +#include <linux/vmalloc.h>
>
> #include <linux/net.h>
> #include <linux/if_packet.h>
> @@ -707,18 +708,30 @@
> handle_rx(net);
> }
>
> +static void vhost_net_free(void *addr)
> +{
> + if (is_vmalloc_addr(addr))
> + vfree(addr);
> + else
> + kfree(addr);
> +}
> +
> static int vhost_net_open(struct inode *inode, struct file *f)
> {
> - struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
> + struct vhost_net *n;
> struct vhost_dev *dev;
> struct vhost_virtqueue **vqs;
> int r, i;
>
> - if (!n)
> - return -ENOMEM;
> + n = kmalloc(sizeof *n, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
> + if (!n) {
> + n = vmalloc(sizeof *n);
> + if (!n)
> + return -ENOMEM;
> + }
> vqs = kmalloc(VHOST_NET_VQ_MAX * sizeof(*vqs), GFP_KERNEL);
> if (!vqs) {
> - kfree(n);
> + vhost_net_free(n);
> return -ENOMEM;
> }
>
> @@ -737,7 +750,7 @@
> }
> r = vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
> if (r < 0) {
> - kfree(n);
> + vhost_net_free(n);
> kfree(vqs);
> return r;
> }
> @@ -840,7 +853,7 @@
> * since jobs can re-queue themselves. */
> vhost_net_flush(n);
> kfree(n->dev.vqs);
> - kfree(n);
> + vhost_net_free(n);
> return 0;
> }
>
>
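The pattern in the quoted patch (try the physically contiguous allocator first, fall back to a second allocator on failure, and release through whichever one supplied the memory) can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: `contig_alloc`, `fallback_alloc`, `alloc_with_fallback`, `free_either` and the `contig_should_fail` switch are all hypothetical names, and a small tag header stands in for the kernel's is_vmalloc_addr() address-range check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum alloc_kind { KIND_CONTIG, KIND_FALLBACK };

/* Tag stored in front of each payload. The union with max_align_t keeps
 * the payload that follows it aligned for any object type. */
union alloc_hdr {
	enum alloc_kind kind;
	max_align_t align;
};

bool contig_should_fail;        /* set to simulate a fragmented allocator */
int contig_frees, fallback_frees;

static void *tagged_alloc(size_t size, enum alloc_kind kind)
{
	union alloc_hdr *h = malloc(sizeof(*h) + size);

	if (!h)
		return NULL;
	h->kind = kind;
	return h + 1;           /* payload starts right after the tag */
}

/* Plays the role of kmalloc(): may fail when contiguous space is scarce. */
static void *contig_alloc(size_t size)
{
	if (contig_should_fail)
		return NULL;
	return tagged_alloc(size, KIND_CONTIG);
}

/* Plays the role of vmalloc(): does not need contiguous space. */
static void *fallback_alloc(size_t size)
{
	return tagged_alloc(size, KIND_FALLBACK);
}

/* Same shape as the allocation in the patched vhost_net_open(). */
void *alloc_with_fallback(size_t size)
{
	void *p = contig_alloc(size);

	if (!p)
		p = fallback_alloc(size);
	return p;
}

/* Same shape as vhost_net_free(): pick the release path by origin. */
void free_either(void *p)
{
	union alloc_hdr *h;

	if (!p)
		return;
	h = (union alloc_hdr *)p - 1;
	if (h->kind == KIND_FALLBACK)
		fallback_frees++;
	else
		contig_frees++;
	free(h);
}
```

The point of routing every release through one helper is that callers never need to remember which allocator succeeded, which is why the patch replaces each kfree(n) with vhost_net_free(n).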
> >
> >
> >>[1948751.794040] qemu-system-x86: page allocation failure: order:4,
> >>mode:0x1040d0
> >>[1948751.810341] CPU: 4 PID: 41198 Comm: qemu-system-x86 Not tainted
> >>3.10.53-rc1 #3
> >>[1948751.826846] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS
> >>S1200BT.86B.02.00.0041.120520121743 12/05/2012
> >>[1948751.847285] 0000000000000004 ffff8802eaf3b9d8 ffffffff8162ff4d
> >>ffff8802eaf3ba68
> >>[1948751.864257] ffffffff810ab771 0000000000000001 ffff8802eaf3bb48
> >>ffff8802eaf3ba68
> >>[1948751.881209] ffffffff810abe68 ffffffff81ca2f40 ffffffff00000000
> >>0000000200000040
> >>[1948751.898276] Call Trace:
> >>[1948751.909628] [<ffffffff8162ff4d>] dump_stack+0x19/0x1c
> >>[1948751.924284] [<ffffffff810ab771>] warn_alloc_failed+0x111/0x126
> >>[1948751.939774] [<ffffffff810abe68>] ?
> >>__alloc_pages_direct_compact+0x181/0x198
> >>[1948751.956650] [<ffffffff810ac5ae>] __alloc_pages_nodemask+0x72f/0x77c
> >>[1948751.972853] [<ffffffff810ac676>] __get_free_pages+0x12/0x41
> >>[1948751.988297] [<ffffffffa04ac71b>] vhost_net_open+0x23/0x171 [vhost_net]
> >>[1948752.004938] [<ffffffff8130d6c3>] misc_open+0x119/0x17d
> >>[1948752.020111] [<ffffffff810e99b4>] chrdev_open+0x134/0x155
> >>[1948752.035604] [<ffffffff81053193>] ? lg_local_unlock+0x1e/0x31
> >>[1948752.051436] [<ffffffff810e9880>] ? cdev_put+0x24/0x24
> >>[1948752.066540] [<ffffffff810e46b8>] do_dentry_open+0x15c/0x20f
> >>[1948752.082214] [<ffffffff810e484b>] finish_open+0x34/0x3f
> >>[1948752.097234] [<ffffffff810f2737>] do_last+0x996/0xbcb
> >>[1948752.111983] [<ffffffff810ef98e>] ? link_path_walk+0x5e/0x791
> >>[1948752.127447] [<ffffffff810f0296>] ? path_init+0x11d/0x403
> >>[1948752.142517] [<ffffffff810f2a32>] path_openat+0xc6/0x43b
> >>[1948752.157207] [<ffffffff81070f08>] ? __lock_acquire+0x9ae/0xa4a
> >>[1948752.172369] [<ffffffff815ac2ef>] ? rtnl_unlock+0x9/0xb
> >>[1948752.186893] [<ffffffff810f2eac>] do_filp_open+0x38/0x84
> >>[1948752.201503] [<ffffffff81633673>] ? _raw_spin_unlock+0x26/0x2a
> >>[1948752.216719] [<ffffffff810fdfef>] ? __alloc_fd+0xf6/0x10a
> >>[1948752.231521] [<ffffffff810e437c>] do_sys_open+0x114/0x1a6
> >>[1948752.246396] [<ffffffff810e4438>] SyS_open+0x19/0x1b
> >>[1948752.260709] [<ffffffff816341d2>] system_call_fastpath+0x16/0x1b
> >
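For reference, "order:4" in the quoted log means the page allocator was asked for 2^4 physically contiguous pages. The small sketch below works through that arithmetic. It assumes 4 KiB pages (the usual x86-64 base page size, not something the log states), and `order_for_size` and `bytes_for_order` are hypothetical helpers, similar in spirit to the kernel's get_order() but not taken from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest order such that
 * (page_size << order) >= size. */
unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Bytes covered by a single allocation of the given order. */
size_t bytes_for_order(unsigned int order, size_t page_size)
{
	return page_size << order;
}
```

Under the 4 KiB assumption, bytes_for_order(4, 4096) is 65536, so the failing request needed 64 KiB of contiguous memory, and any size above 32 KiB and up to 64 KiB maps to order 4. Finding that many adjacent free pages gets harder as memory fragments, which is consistent with the failures appearing only after long uptime.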