From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: xc_gntshr_unmap problems (BUG(s) in xen-gntalloc?)
Date: Tue, 2 Sep 2014 14:55:55 +0100
Message-ID: <5405CC6B.4070400@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1XOoiv-0006d7-HH for xen-devel@lists.xenproject.org; Tue, 02 Sep 2014 14:06:17 +0000
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dave Scott , "xen-devel@lists.xenproject.org"
Cc: John Else , Anil Madhavapeddy
List-Id: xen-devel@lists.xenproject.org

On 27/08/14 22:33, Dave Scott wrote:
> Hi,
>
> I've been playing with gntshr (as used by libvchan) and have noticed a
> few problems. Firstly if I use xc_gntshr_share_pages to share > 1 page
> then it seems to leak after xc_gntshr_munmap:
[...]
> ... subsequent runs fail earlier and earlier. I added some printf debugging
> and noticed that the address returned by xc_gntshr_share_pages was decreasing
> by 0x1000 per iteration, suggesting that the xc_gntshr_munmap was unmapping
> the first page but missing the second.

FYI, there's a kernel fix 243082e0d59f (xen/gntalloc: fix reference
counts on multi-page mappings) for multi-page mappings, which is
included in 3.3 and later.

> [ 148.564340] Pid: 897, comm: test-gnt Not tainted 3.2.0-67-generic #101-Ubuntu
> [ 148.564348] RIP: e030:[] [] gnttab_query_foreign_access+0x13/0x20
[...]
> [ 148.564452] Call Trace:
> [ 148.564459] [] ? __del_gref+0x105/0x150 [xen_gntalloc]
> [ 148.564465] [] ? gnttab_grant_foreign_access+0x2b/0x80
> [ 148.564471] [] add_grefs+0x1c8/0x2b0 [xen_gntalloc]
> [ 148.564478] [] gntalloc_ioctl_alloc+0xf8/0x160 [xen_gntalloc]
> [ 148.564485] [] gntalloc_ioctl+0x50/0x64 [xen_gntalloc]
> [ 148.564492] [] do_vfs_ioctl+0x8a/0x340
> [ 148.564498] [] ? do_munmap+0x1f3/0x2f0
> [ 148.564504] [] sys_ioctl+0x91/0xa0
> [ 148.564510] [] system_call_fastpath+0x16/0x1b

This happens if the allocate ioctl fails because there are no more grant
references. This happened in your case because of the bug in 3.2 (see
above). But it is triggerable in 3.17-rc3 if you increase the gntalloc
limit (via the module parameter) or if you plug in a lot of VIFs/VBDs
etc.

I'll send patches to the list shortly.

David
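
The exhaustion described in this thread can be illustrated with a toy model. This is only a sketch and not the xen-gntalloc code: `GrantPool`, `share_pages`, `munmap`, `iterations_until_exhausted`, and the pool size of 1024 are hypothetical names and values chosen for illustration. It assumes, following the observation quoted above, that unmapping a multi-page mapping on the buggy kernel gives back the reference for the first page only.

```python
class GrantPool:
    """Toy model of a fixed-size pool of grant references.

    Hypothetical illustration only; this is not how xen-gntalloc is written.
    """

    def __init__(self, limit, leaky):
        self.limit = limit    # assumed maximum number of grant references
        self.in_use = 0       # references currently held
        self.leaky = leaky    # True models the pre-3.3 multi-page behaviour

    def share_pages(self, count):
        """Take `count` references; return False when the pool is exhausted."""
        if self.in_use + count > self.limit:
            return False
        self.in_use += count
        return True

    def munmap(self, count):
        """Release a mapping of `count` pages.

        In the leaky model only the first page's reference is returned,
        so every multi-page unmap leaks `count - 1` references.
        """
        released = 1 if self.leaky else count
        self.in_use -= released


def iterations_until_exhausted(limit, pages, leaky, max_iterations=10_000):
    """Share and unmap `pages` pages repeatedly.

    Returns the index of the first iteration whose allocation fails,
    or None if no allocation fails within `max_iterations`.
    """
    pool = GrantPool(limit, leaky)
    for i in range(max_iterations):
        if not pool.share_pages(pages):
            return i
        pool.munmap(pages)
    return None


if __name__ == "__main__":
    print("leaky, 2 pages:", iterations_until_exhausted(1024, 2, leaky=True))
    print("fixed, 2 pages:", iterations_until_exhausted(1024, 2, leaky=False))
    print("leaky, 1 page: ", iterations_until_exhausted(1024, 1, leaky=True))
```

In this model a single-page mapping never leaks, which would be consistent with the problem only showing up when sharing more than one page. Once the pool is exhausted, the next allocation fails, and that is the situation in which the allocate ioctl's error path is reached.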