From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yuanhan Liu Subject: Re: Memory leak when adding/removing vhost_user ports Date: Tue, 19 Apr 2016 22:04:45 -0700 Message-ID: <20160420050445.GB5872@yliu-dev.sh.intel.com> References: <20160418174650.GD2576@yliu-dev.sh.intel.com> <20160418181422.GE2576@yliu-dev.sh.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Cc: dev , Daniele Di Proietto To: Christian Ehrhardt Return-path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id BBB05298F for ; Wed, 20 Apr 2016 07:03:34 +0200 (CEST) Content-Disposition: inline In-Reply-To: List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Tue, Apr 19, 2016 at 06:33:50PM +0200, Christian Ehrhardt wrote: > Hi, > thanks for the patch. > I backported it this way to my DPDK 2.2 based environment for now (see = below): >=20 > With that applied one (and only one) of my two guests looses connectivi= ty after > removing the ports the first time. Yeah, that's should be because I invoked the "->destroy_device()" callback. BTW, I'm curious how do you do the test? I saw you added 256 ports, but with 2 guests only? So, 254 of them are idle, just for testing the memory leak bug? And then you remove all of them, without stopping the guest. How that's gonna work? I mean, the vhost-user connection would be broken, and data flow would not work. --yliu > No traffic seems to pass, setting the device in the guest down/up doesn= 't get > anything. > But It isn't totally broken - stopping/starting the guest gets it worki= ng > again. > So openvswitch/dpdk is still somewhat working - it just seems the guest= lost > something, after tapping on that vhost_user interface again it works. >=20 > I will check tomorrow and let you know: > - if I'm more lucky with that patch on top of 16.04 > - if it looses connectivity after the first or a certain amount of port= removes >=20 > If you find issues with my backport adaption let me know. >=20 >=20 > --- >=20 > Backport and reasoning: >=20 > new fix relies on a lot of new code, vhost_destroy_device looks totally > different from the former destroy_device. > History on todays function content: > =A0 4796ad63 - original code moved from examples to lib > =A0 a90ca1a1 - this replaces ops->destroy_device with vhost_destroy_dev= ice > =A0 71dc571e - simple check against null pointers > =A0 45ca9c6f - this changed the code from linked list to arrays >=20 > =A0 New code cleans with: > =A0 =A0 =A0 notify_ops->destroy_device (callback into the parent) > =A0 =A0 =A0 cleanup_device was existing before even in 2.2 code > =A0 =A0 =A0 free_device as well existing before even in 2.2 code > =A0 Old code cleans with: > =A0 =A0 =A0 notify_ops->destroy_device - still there > =A0 =A0 =A0 rm_config_ll_entry -> eventually calls cleanup_device and f= ree_device > =A0 =A0 =A0 =A0 (just in the more complex linked list way) >=20 > So the "only" adaption for backporting needed is to replace > vhost_destroy_device > with ops->destroy_device(ctx) >=20 > Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -310,6 +310,7 @@ vserver_new_vq_conn(int fd, void *dat, _ > =A0 } > =A0 > =A0 vdev_ctx.fh =3D fh; > + vserver->fh =3D fh; > =A0 size =3D strnlen(vserver->path, PATH_MAX); > =A0 ops->set_ifname(vdev_ctx, vserver->path, > =A0 size); > @@ -516,6 +517,11 @@ rte_vhost_driver_unregister(const char * > =A0 > =A0 for (i =3D 0; i < g_vhost_server.vserver_cnt; i++) { > =A0 if (!strcmp(g_vhost_server.server[i]->path, path)) { > + struct vhost_device_ctx ctx; > + > + ctx.fh =3D g_vhost_server.server[i]->fh; > + ops->destroy_device(ctx); > + > =A0 fdset_del(&g_vhost_server.fdset, > =A0 g_vhost_server.server[i]->listenfd); > =A0 > Index: dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- dpdk.orig/lib/librte_vhost/vhost_user/vhost-net-user.h > +++ dpdk/lib/librte_vhost/vhost_user/vhost-net-user.h > @@ -43,6 +43,7 @@ > =A0struct vhost_server { > =A0 char *path; /**< The path the uds is bind to. */ > =A0 int listenfd; =A0 =A0 /**< The listener sockfd. */ > + uint32_t fh; > =A0}; > =A0 > =A0/* refer to hw/virtio/vhost-user.c */ >=20 >=20 >=20 >=20 > Christian Ehrhardt > Software Engineer, Ubuntu Server > Canonical Ltd >=20 > On Mon, Apr 18, 2016 at 8:14 PM, Yuanhan Liu > wrote: >=20 > On Mon, Apr 18, 2016 at 10:46:50AM -0700, Yuanhan Liu wrote: > > On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrot= e: > > > I assume there is a leak somewhere on adding/removing vhost_use= r ports. > > > Although it could also be "only" a fragmentation issue. > > > > > > Reproduction is easy: > > > I set up a pair of nicely working OVS-DPDK connected KVM Guests= . > > > Then in a loop I > > >=A0 =A0 - add up to more 512 ports > > >=A0 =A0 - test connectivity between the two guests > > >=A0 =A0 - remove up to 512 ports > > > > > > Depending on memory and the amount of multiqueue/rxq I use it s= eems to > > > slightly change when exactly it breaks. But for my default setu= p of 4 > > > queues and 5G Hugepages initialized by DPDK it always breaks at= the > sixth > > > iteration. > > > Here a link to the stack trace indicating a memory shortage (TB= C): > > > https://launchpadlibrarian.net/253916410/apport-retrace.log > > > > > > Known Todos: > > > - I want to track it down more, and will try to come up with a = non > > > openvswitch based looping testcase that might show it as well t= o > simplify > > > debugging. > > > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DP= DK 16.04 > and > > > Openvswitch master is planned. > > > > > > I will go on debugging this and let you know, but I wanted to g= ive a > heads > > > up to everyone. > > > > Thanks for the report. > > > > > In case this is a known issue for some of you please let me kno= w. > > > > Yeah, it might be. I'm wondering that virtio_net struct is not fr= eed. > > It will be freed only (if I'm not mistaken) when guest quits, by = far. >=20 > Would you try following diff and to see if it fix your issue? >=20 > =A0 =A0 =A0 =A0 --yliu >=20 > --- > =A0lib/librte_vhost/vhost_user/vhost-net-user.c | 6 ++++++ > =A0lib/librte_vhost/vhost_user/vhost-net-user.h | 1 + > =A02 files changed, 7 insertions(+) >=20 > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/ > librte_vhost/vhost_user/vhost-net-user.c > index df2bd64..8f7ebd7 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c > @@ -309,6 +309,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_un= used int > *remove) > =A0 =A0 =A0 =A0 } >=20 > =A0 =A0 =A0 =A0 vdev_ctx.fh =3D fh; > +=A0 =A0 =A0 =A0vserver->fh =3D fh; > =A0 =A0 =A0 =A0 size =3D strnlen(vserver->path, PATH_MAX); > =A0 =A0 =A0 =A0 vhost_set_ifname(vdev_ctx, vserver->path, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 size); > @@ -501,6 +502,11 @@ rte_vhost_driver_unregister(const char *path) >=20 > =A0 =A0 =A0 =A0 for (i =3D 0; i < g_vhost_server.vserver_cnt; i++) = { > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (!strcmp(g_vhost_server.server[i= ]->path, path)) { > +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct vhost_device= _ctx ctx; > + > +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0ctx.fh =3D g_vhost_= server.server[i]->fh; > +=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0vhost_destroy_devic= e(ctx); > + > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 fdset_del(&g_vhost_= server.fdset, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 g_v= host_server.server[i]->listenfd); >=20 > diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h b/lib/ > librte_vhost/vhost_user/vhost-net-user.h > index e3bb413..7cf21db 100644 > --- a/lib/librte_vhost/vhost_user/vhost-net-user.h > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.h > @@ -43,6 +43,7 @@ > =A0struct vhost_server { > =A0 =A0 =A0 =A0 char *path; /**< The path the uds is bind to. */ > =A0 =A0 =A0 =A0 int listenfd;=A0 =A0 =A0/**< The listener sockfd. *= / > +=A0 =A0 =A0 =A0uint32_t fh; > =A0}; >=20 > =A0/* refer to hw/virtio/vhost-user.c */ > -- > 1.9.3 >=20 >=20 >=20 >=20