From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752295AbcCARUa (ORCPT ); Tue, 1 Mar 2016 12:20:30 -0500
Received: from mx1.redhat.com ([209.132.183.28]:58791 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750862AbcCARU3 (ORCPT ); Tue, 1 Mar 2016 12:20:29 -0500
Date: Tue, 1 Mar 2016 19:20:24 +0200
From: "Michael S. Tsirkin"
To: Michal Hocko
Cc: Vladimir Davydov , Andrew Morton , Tetsuo Handa , David Rientjes ,
        linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] exit: clear TIF_MEMDIE after exit_task_work
Message-ID: <20160301191906-mutt-send-email-mst@redhat.com>
References: <1456765329-14890-1-git-send-email-vdavydov@virtuozzo.com>
        <20160301155212.GJ9461@dhcp22.suse.cz>
        <20160301175431-mutt-send-email-mst@redhat.com>
        <20160301160813.GM9461@dhcp22.suse.cz>
        <20160301182027-mutt-send-email-mst@redhat.com>
        <20160301163537.GO9461@dhcp22.suse.cz>
        <20160301184046-mutt-send-email-mst@redhat.com>
        <20160301171758.GP9461@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160301171758.GP9461@dhcp22.suse.cz>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 01, 2016 at 06:17:58PM +0100, Michal Hocko wrote:
> On Tue 01-03-16 18:46:38, Michael S. Tsirkin wrote:
> > On Tue, Mar 01, 2016 at 05:35:37PM +0100, Michal Hocko wrote:
> > > On Tue 01-03-16 18:22:32, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 01, 2016 at 05:08:13PM +0100, Michal Hocko wrote:
> > > > > On Tue 01-03-16 17:57:04, Michael S. Tsirkin wrote:
> > > > > > On Tue, Mar 01, 2016 at 04:52:12PM +0100, Michal Hocko wrote:
> > > > > > > [CCing vhost-net maintainer]
> > > > > > >
> > > > > > > On Mon 29-02-16 20:02:09, Vladimir Davydov wrote:
> > > > > > > > An mm_struct may be pinned by a file. An example is vhost-net device
> > > > > > > > created by a qemu/kvm (see vhost_net_ioctl -> vhost_net_set_owner ->
> > > > > > > > vhost_dev_set_owner).
> > > > > > >
> > > > > > > The more I think about that the more I am wondering whether this is
> > > > > > > actually OK and correct. Why does the driver have to pin the address
> > > > > > > space? Nothing really prevents from parallel tearing down of the address
> > > > > > > space anyway so the code cannot expect all the vmas to stay. Would it be
> > > > > > > enough to pin the mm_struct only?
> > > > > >
> > > > > > I'll need to research this. It's a fact that as long as the
> > > > > > device is not stopped, vhost can attempt to access
> > > > > > the address space.
> > > > >
> > > > > But does it expect any specific parts of the address space to be mapped?
> > > > > E.g. proc needs to keep the mm allocated as well for some files but it
> > > > > doesn't pin the address space (mm_users) but rather mm_count (see
> > > > > proc_mem_open).
> > > >
> > > > At a quick glance, it seems that it's needed: it calls
> > > > get_user_pages(mm) and that looks like it will not DTRT (or even fail
> > > > gracefully) if mm->mm_users == 0 and exit_mmap/etc was already called
> > > > (or is in progress).
> > >
> > > yes it will fail gracefully
> >
> >
> > What makes get_user_pages fail gracefully in this case,
> > if it races with task exiting?
>
> Sorry, I could have been more verbose... The code would have to make sure
> that the mm is still alive before calling g-u-p by
> atomic_inc_not_zero(&mm->mm_users) and fail if the user count dropped to
> 0 in the mean time. See how fs/proc/task_mmu.c does that (proc_mem_open
> + m_start + m_stop).
>
> The biggest advantage would be that the mm address space pin would be
> only for the particular operation. Not sure whether that is possible in
> the driver though. Anyway pinning the mm for a potentially unbounded
> amount of time doesn't sound too nice.
> --
> Michal Hocko
> SUSE Labs

Hmm that would be another atomic on data path ...
I'd have to explore that.

-- 
MST
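
For reference, a rough userspace sketch of the per-operation pin discussed
above: take a reference only if the user count is still non-zero, do the
access, then drop the reference. C11 atomics stand in for the kernel's
atomic_inc_not_zero(); struct fake_mm, fake_inc_not_zero(), fake_put() and
access_mm_once() are made-up names for illustration, not kernel or vhost
APIs, and this is not how vhost is actually implemented.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct; only the user count matters here. */
struct fake_mm {
	atomic_int mm_users;
};

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the count has not already dropped to zero.
 */
static bool fake_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last user. */
static bool fake_put(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/*
 * Pin the address space for a single operation only.
 * Returns 0 on success, -1 if the mm was already dead.
 */
static int access_mm_once(struct fake_mm *mm)
{
	if (!fake_inc_not_zero(&mm->mm_users))
		return -1;	/* address space is gone: fail gracefully */

	/* ... the g-u-p style access would go here ... */

	fake_put(&mm->mm_users);
	return 0;
}
```

The compare-and-exchange in fake_inc_not_zero() and the decrement in
fake_put() are the extra atomics that would land on the data path for
every operation.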