From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.4 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F05E2CA9EB5 for ; Mon, 4 Nov 2019 20:32:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C82C7214D8 for ; Mon, 4 Nov 2019 20:32:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="FVZW2lpb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729521AbfKDUcM (ORCPT ); Mon, 4 Nov 2019 15:32:12 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:54658 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728741AbfKDUcL (ORCPT ); Mon, 4 Nov 2019 15:32:11 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1572899529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0t18qmWA+5595eIOsjZ92rtvApF9YJpqwJyzz983FX4=; b=FVZW2lpbqvJiFchSK8aK7TQ9NwyyxRFD1Ign9yZL1JLBHWgbSbWYQ4imG4817IKAuO9etR /nl6nKfrBIiWIQbR79GpgQAvTmulbRXrRjjGz9NCRiFgJFzMCFLzoJXRzJ14AfAU1gsA7m Y73CzaOudHWDz0takN06dgWR9UrEn80= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-167-Z6X9SXkoPpydpMlPq3lbmg-1; Mon, 04 Nov 2019 15:32:05 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 39197477; Mon, 4 Nov 2019 20:32:01 +0000 (UTC) Received: from redhat.com (unknown [10.20.6.178]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BE494600C4; Mon, 4 Nov 2019 20:31:54 +0000 (UTC) Date: Mon, 4 Nov 2019 15:31:53 -0500 From: Jerome Glisse To: John Hubbard Cc: Andrew Morton , Al Viro , Alex Williamson , Benjamin Herrenschmidt , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , Christoph Hellwig , Dan Williams , Daniel Vetter , Dave Chinner , David Airlie , "David S . Miller" , Ira Weiny , Jan Kara , Jason Gunthorpe , Jens Axboe , Jonathan Corbet , Magnus Karlsson , Mauro Carvalho Chehab , Michael Ellerman , Michal Hocko , Mike Kravetz , Paul Mackerras , Shuah Khan , Vlastimil Babka , bpf@vger.kernel.org, dri-devel@lists.freedesktop.org, kvm@vger.kernel.org, linux-block@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org, linux-rdma@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, netdev@vger.kernel.org, linux-mm@kvack.org, LKML Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN Message-ID: <20191104203153.GB7731@redhat.com> References: <20191103211813.213227-1-jhubbard@nvidia.com> <20191103211813.213227-6-jhubbard@nvidia.com> <20191104173325.GD5134@redhat.com> <20191104191811.GI5134@redhat.com> <20191104195248.GA7731@redhat.com> <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> MIME-Version: 1.0 In-Reply-To: <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> User-Agent: Mutt/1.12.1 (2019-06-15) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-MC-Unique: Z6X9SXkoPpydpMlPq3lbmg-1 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Mon, Nov 04, 2019 at 12:09:05PM -0800, John Hubbard wrote: > Jason, a question for you at the bottom. >=20 > On 11/4/19 11:52 AM, Jerome Glisse wrote: > ... > >> CASE 3: ODP > >> ----------- > >> RDMA hardware with page faulting support. Here, a well-written driver = doesn't > >=20 > > CASE3: Hardware with page fault support > > --------------------------------------- > >=20 > > Here, a well-written .... > >=20 >=20 > Ah, OK. So just drop the first sentence, yes. >=20 > ... > >>>>>> +=09 */ > >>>>>> +=09gup_flags |=3D FOLL_REMOTE | FOLL_PIN; > >>>>> > >>>>> Wouldn't it be better to not add pin_longterm_pages_remote() until > >>>>> it can be properly implemented ? > >>>>> > >>>> > >>>> Well, the problem is that I need each call site that requires FOLL_P= IN > >>>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, b= ecause > >>>> there is a hard, bright rule, which is: if and only if a caller sets > >>>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() mu= st > >>>> be called. > >>>> > >>>> So this leaves me with only two reasonable choices: > >>>> > >>>> a) Convert the call site as above: pin_longterm_pages_remote(), whic= h sets > >>>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation ex= actly > >>>> as it has been so far. When the FOLL_LONGTERM situation is fixed, th= e call > >>>> site *might* not need any changes to adopt the working gup.c code. > >>>> > >>>> b) Convert the call site to pin_user_pages_remote(), which also sets > >>>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as bef= ore. > >>>> There would also be a comment at the call site, to the effect of, "t= his > >>>> is the wrong call to make: it really requires FOLL_LONGTERM behavior= ". > >>>> > >>>> When the FOLL_LONGTERM situation is fixed, the call site will need t= o be > >>>> changed to pin_longterm_pages_remote(). > >>>> > >>>> So you can probably see why I picked (a). > >>> > >>> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should > >>> never have the need for pin_longterm_pages_remote(). My fear is that > >>> longterm has implication and it would be better to not drop this impl= ication > >>> by adding a wrapper that does not do what the name says. > >>> > >>> So do not introduce pin_longterm_pages_remote() until its first user > >>> happens. This is option c) > >>> > >> > >> Almost forgot, though: there is already another user: Infiniband: > >> > >> drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longter= m_pages_remote(owning_process, owning_mm, > >=20 > > odp do not need that, i thought the HMM convertion was already upstream > > but seems not, in any case odp do not need the longterm case it only > > so best is to revert that user to gup_fast or something until it get > > converted to HMM. > >=20 >=20 > Note for Jason: the (a) or (b) items are talking about the vfio case, whi= ch is > one of the two call sites that now use pin_longterm_pages_remote(), and t= he > other one is infiniband: >=20 > drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longterm_p= ages_remote(owning_process, owning_mm, > drivers/vfio/vfio_iommu_type1.c:353: ret =3D pin_longterm_page= s_remote(NULL, mm, vaddr, 1, vfio should be reverted until it can be properly implemented. The issue is that when you fix the implementation you might break vfio existing user and thus regress the kernel from user point of view. So i rather have the change to vfio reverted, i believe it was not well understood when it got upstream, between in my 5.4 tree it is still gup_remote not longterm. > Jerome, Jason: I really don't want to revert the put_page() to put_user_p= age()=20 > conversions that are already throughout the IB driver--pointless churn, r= ight? > I'd rather either delete them in Jason's tree, or go with what I have her= e > while waiting for the deletion. >=20 > Maybe we should just settle on (a) or (b), so that the IB driver ends up = with > the wrapper functions? In fact, if it's getting deleted, then I'd prefer = leaving > it at (a), since that's simple... >=20 > Jason should weigh in on how he wants this to go, with respect to branchi= ng > and merging, since it sounds like that will conflict with the hmm branch= =20 > (ha, I'm overdue in reviewing his mmu notifier series, that's what I get = for > being late). >=20 > thanks, >=20 > John Hubbard > NVIDIA From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.0 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 054C7CA9EB5 for ; Mon, 4 Nov 2019 20:35:25 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A6890206BA for ; Mon, 4 Nov 2019 20:35:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="FVZW2lpb" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A6890206BA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 476PgZ4BBMzF46R for ; Tue, 5 Nov 2019 07:35:22 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=redhat.com (client-ip=207.211.31.81; helo=us-smtp-delivery-1.mimecast.com; envelope-from=jglisse@redhat.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: lists.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.b="FVZW2lpb"; dkim-atps=neutral Received: from us-smtp-delivery-1.mimecast.com (us-smtp-2.mimecast.com [207.211.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 476Pbx1KjNzF3Kn for ; Tue, 5 Nov 2019 07:32:12 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1572899529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0t18qmWA+5595eIOsjZ92rtvApF9YJpqwJyzz983FX4=; b=FVZW2lpbqvJiFchSK8aK7TQ9NwyyxRFD1Ign9yZL1JLBHWgbSbWYQ4imG4817IKAuO9etR /nl6nKfrBIiWIQbR79GpgQAvTmulbRXrRjjGz9NCRiFgJFzMCFLzoJXRzJ14AfAU1gsA7m Y73CzaOudHWDz0takN06dgWR9UrEn80= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-167-Z6X9SXkoPpydpMlPq3lbmg-1; Mon, 04 Nov 2019 15:32:05 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 39197477; Mon, 4 Nov 2019 20:32:01 +0000 (UTC) Received: from redhat.com (unknown [10.20.6.178]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BE494600C4; Mon, 4 Nov 2019 20:31:54 +0000 (UTC) Date: Mon, 4 Nov 2019 15:31:53 -0500 From: Jerome Glisse To: John Hubbard Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN Message-ID: <20191104203153.GB7731@redhat.com> References: <20191103211813.213227-1-jhubbard@nvidia.com> <20191103211813.213227-6-jhubbard@nvidia.com> <20191104173325.GD5134@redhat.com> <20191104191811.GI5134@redhat.com> <20191104195248.GA7731@redhat.com> <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> MIME-Version: 1.0 In-Reply-To: <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> User-Agent: Mutt/1.12.1 (2019-06-15) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-MC-Unique: Z6X9SXkoPpydpMlPq3lbmg-1 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Hocko , Jan Kara , kvm@vger.kernel.org, linux-doc@vger.kernel.org, David Airlie , Dave Chinner , dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Paul Mackerras , linux-kselftest@vger.kernel.org, Ira Weiny , Jonathan Corbet , linux-rdma@vger.kernel.org, Christoph Hellwig , Jason Gunthorpe , Vlastimil Babka , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , linux-media@vger.kernel.org, Shuah Khan , linux-block@vger.kernel.org, Alex Williamson , Al Viro , Dan Williams , Mauro Carvalho Chehab , bpf@vger.kernel.org, Magnus Karlsson , Jens Axboe , netdev@vger.kernel.org, LKML , Daniel Vetter , linux-fsdevel@vger.kernel.org, Andrew Morton , linuxppc-dev@lists.ozlabs.org, "David S . Miller" , Mike Kravetz Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Mon, Nov 04, 2019 at 12:09:05PM -0800, John Hubbard wrote: > Jason, a question for you at the bottom. >=20 > On 11/4/19 11:52 AM, Jerome Glisse wrote: > ... > >> CASE 3: ODP > >> ----------- > >> RDMA hardware with page faulting support. Here, a well-written driver = doesn't > >=20 > > CASE3: Hardware with page fault support > > --------------------------------------- > >=20 > > Here, a well-written .... > >=20 >=20 > Ah, OK. So just drop the first sentence, yes. >=20 > ... > >>>>>> +=09 */ > >>>>>> +=09gup_flags |=3D FOLL_REMOTE | FOLL_PIN; > >>>>> > >>>>> Wouldn't it be better to not add pin_longterm_pages_remote() until > >>>>> it can be properly implemented ? > >>>>> > >>>> > >>>> Well, the problem is that I need each call site that requires FOLL_P= IN > >>>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, b= ecause > >>>> there is a hard, bright rule, which is: if and only if a caller sets > >>>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() mu= st > >>>> be called. > >>>> > >>>> So this leaves me with only two reasonable choices: > >>>> > >>>> a) Convert the call site as above: pin_longterm_pages_remote(), whic= h sets > >>>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation ex= actly > >>>> as it has been so far. When the FOLL_LONGTERM situation is fixed, th= e call > >>>> site *might* not need any changes to adopt the working gup.c code. > >>>> > >>>> b) Convert the call site to pin_user_pages_remote(), which also sets > >>>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as bef= ore. > >>>> There would also be a comment at the call site, to the effect of, "t= his > >>>> is the wrong call to make: it really requires FOLL_LONGTERM behavior= ". > >>>> > >>>> When the FOLL_LONGTERM situation is fixed, the call site will need t= o be > >>>> changed to pin_longterm_pages_remote(). > >>>> > >>>> So you can probably see why I picked (a). > >>> > >>> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should > >>> never have the need for pin_longterm_pages_remote(). My fear is that > >>> longterm has implication and it would be better to not drop this impl= ication > >>> by adding a wrapper that does not do what the name says. > >>> > >>> So do not introduce pin_longterm_pages_remote() until its first user > >>> happens. This is option c) > >>> > >> > >> Almost forgot, though: there is already another user: Infiniband: > >> > >> drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longter= m_pages_remote(owning_process, owning_mm, > >=20 > > odp do not need that, i thought the HMM convertion was already upstream > > but seems not, in any case odp do not need the longterm case it only > > so best is to revert that user to gup_fast or something until it get > > converted to HMM. > >=20 >=20 > Note for Jason: the (a) or (b) items are talking about the vfio case, whi= ch is > one of the two call sites that now use pin_longterm_pages_remote(), and t= he > other one is infiniband: >=20 > drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longterm_p= ages_remote(owning_process, owning_mm, > drivers/vfio/vfio_iommu_type1.c:353: ret =3D pin_longterm_page= s_remote(NULL, mm, vaddr, 1, vfio should be reverted until it can be properly implemented. The issue is that when you fix the implementation you might break vfio existing user and thus regress the kernel from user point of view. So i rather have the change to vfio reverted, i believe it was not well understood when it got upstream, between in my 5.4 tree it is still gup_remote not longterm. > Jerome, Jason: I really don't want to revert the put_page() to put_user_p= age()=20 > conversions that are already throughout the IB driver--pointless churn, r= ight? > I'd rather either delete them in Jason's tree, or go with what I have her= e > while waiting for the deletion. >=20 > Maybe we should just settle on (a) or (b), so that the IB driver ends up = with > the wrapper functions? In fact, if it's getting deleted, then I'd prefer = leaving > it at (a), since that's simple... >=20 > Jason should weigh in on how he wants this to go, with respect to branchi= ng > and merging, since it sounds like that will conflict with the hmm branch= =20 > (ha, I'm overdue in reviewing his mmu notifier series, that's what I get = for > being late). >=20 > thanks, >=20 > John Hubbard > NVIDIA From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jerome Glisse Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN Date: Mon, 4 Nov 2019 15:31:53 -0500 Message-ID: <20191104203153.GB7731@redhat.com> References: <20191103211813.213227-1-jhubbard@nvidia.com> <20191103211813.213227-6-jhubbard@nvidia.com> <20191104173325.GD5134@redhat.com> <20191104191811.GI5134@redhat.com> <20191104195248.GA7731@redhat.com> <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org To: John Hubbard Cc: Andrew Morton , Al Viro , Alex Williamson , Benjamin Herrenschmidt , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , Christoph Hellwig , Dan Williams , Daniel Vetter , Dave Chinner , David Airlie , "David S . Miller" , Ira Weiny , Jan Kara , Jason Gunthorpe , Jens Axboe , Jonathan Corbet , Magnus Karlsson , Mauro Carvalho Chehab , Michael Ellerman , Michal Hocko List-Id: dri-devel@lists.freedesktop.org On Mon, Nov 04, 2019 at 12:09:05PM -0800, John Hubbard wrote: > Jason, a question for you at the bottom. >=20 > On 11/4/19 11:52 AM, Jerome Glisse wrote: > ... > >> CASE 3: ODP > >> ----------- > >> RDMA hardware with page faulting support. Here, a well-written driver = doesn't > >=20 > > CASE3: Hardware with page fault support > > --------------------------------------- > >=20 > > Here, a well-written .... > >=20 >=20 > Ah, OK. So just drop the first sentence, yes. >=20 > ... > >>>>>> +=09 */ > >>>>>> +=09gup_flags |=3D FOLL_REMOTE | FOLL_PIN; > >>>>> > >>>>> Wouldn't it be better to not add pin_longterm_pages_remote() until > >>>>> it can be properly implemented ? > >>>>> > >>>> > >>>> Well, the problem is that I need each call site that requires FOLL_P= IN > >>>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, b= ecause > >>>> there is a hard, bright rule, which is: if and only if a caller sets > >>>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() mu= st > >>>> be called. > >>>> > >>>> So this leaves me with only two reasonable choices: > >>>> > >>>> a) Convert the call site as above: pin_longterm_pages_remote(), whic= h sets > >>>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation ex= actly > >>>> as it has been so far. When the FOLL_LONGTERM situation is fixed, th= e call > >>>> site *might* not need any changes to adopt the working gup.c code. > >>>> > >>>> b) Convert the call site to pin_user_pages_remote(), which also sets > >>>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as bef= ore. > >>>> There would also be a comment at the call site, to the effect of, "t= his > >>>> is the wrong call to make: it really requires FOLL_LONGTERM behavior= ". > >>>> > >>>> When the FOLL_LONGTERM situation is fixed, the call site will need t= o be > >>>> changed to pin_longterm_pages_remote(). > >>>> > >>>> So you can probably see why I picked (a). > >>> > >>> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should > >>> never have the need for pin_longterm_pages_remote(). My fear is that > >>> longterm has implication and it would be better to not drop this impl= ication > >>> by adding a wrapper that does not do what the name says. > >>> > >>> So do not introduce pin_longterm_pages_remote() until its first user > >>> happens. This is option c) > >>> > >> > >> Almost forgot, though: there is already another user: Infiniband: > >> > >> drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longter= m_pages_remote(owning_process, owning_mm, > >=20 > > odp do not need that, i thought the HMM convertion was already upstream > > but seems not, in any case odp do not need the longterm case it only > > so best is to revert that user to gup_fast or something until it get > > converted to HMM. > >=20 >=20 > Note for Jason: the (a) or (b) items are talking about the vfio case, whi= ch is > one of the two call sites that now use pin_longterm_pages_remote(), and t= he > other one is infiniband: >=20 > drivers/infiniband/core/umem_odp.c:646: npages =3D pin_longterm_p= ages_remote(owning_process, owning_mm, > drivers/vfio/vfio_iommu_type1.c:353: ret =3D pin_longterm_page= s_remote(NULL, mm, vaddr, 1, vfio should be reverted until it can be properly implemented. The issue is that when you fix the implementation you might break vfio existing user and thus regress the kernel from user point of view. So i rather have the change to vfio reverted, i believe it was not well understood when it got upstream, between in my 5.4 tree it is still gup_remote not longterm. > Jerome, Jason: I really don't want to revert the put_page() to put_user_p= age()=20 > conversions that are already throughout the IB driver--pointless churn, r= ight? > I'd rather either delete them in Jason's tree, or go with what I have her= e > while waiting for the deletion. >=20 > Maybe we should just settle on (a) or (b), so that the IB driver ends up = with > the wrapper functions? In fact, if it's getting deleted, then I'd prefer = leaving > it at (a), since that's simple... >=20 > Jason should weigh in on how he wants this to go, with respect to branchi= ng > and merging, since it sounds like that will conflict with the hmm branch= =20 > (ha, I'm overdue in reviewing his mmu notifier series, that's what I get = for > being late). >=20 > thanks, >=20 > John Hubbard > NVIDIA From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 849E9C47E49 for ; Mon, 4 Nov 2019 20:32:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5E4EC21655 for ; Mon, 4 Nov 2019 20:32:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5E4EC21655 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D292A6E85D; Mon, 4 Nov 2019 20:32:11 +0000 (UTC) Received: from us-smtp-delivery-1.mimecast.com (us-smtp-1.mimecast.com [207.211.31.81]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8190A6E85D for ; Mon, 4 Nov 2019 20:32:10 +0000 (UTC) Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-167-Z6X9SXkoPpydpMlPq3lbmg-1; Mon, 04 Nov 2019 15:32:05 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 39197477; Mon, 4 Nov 2019 20:32:01 +0000 (UTC) Received: from redhat.com (unknown [10.20.6.178]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BE494600C4; Mon, 4 Nov 2019 20:31:54 +0000 (UTC) Date: Mon, 4 Nov 2019 15:31:53 -0500 From: Jerome Glisse To: John Hubbard Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN Message-ID: <20191104203153.GB7731@redhat.com> References: <20191103211813.213227-1-jhubbard@nvidia.com> <20191103211813.213227-6-jhubbard@nvidia.com> <20191104173325.GD5134@redhat.com> <20191104191811.GI5134@redhat.com> <20191104195248.GA7731@redhat.com> <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> MIME-Version: 1.0 In-Reply-To: <25ec4bc0-caaa-2a01-2ae7-2d79663a40e1@nvidia.com> User-Agent: Mutt/1.12.1 (2019-06-15) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-MC-Unique: Z6X9SXkoPpydpMlPq3lbmg-1 X-Mimecast-Spam-Score: 0 Content-Disposition: inline X-Mailman-Original-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1572899529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0t18qmWA+5595eIOsjZ92rtvApF9YJpqwJyzz983FX4=; b=FVZW2lpbqvJiFchSK8aK7TQ9NwyyxRFD1Ign9yZL1JLBHWgbSbWYQ4imG4817IKAuO9etR /nl6nKfrBIiWIQbR79GpgQAvTmulbRXrRjjGz9NCRiFgJFzMCFLzoJXRzJ14AfAU1gsA7m Y73CzaOudHWDz0takN06dgWR9UrEn80= X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Hocko , Jan Kara , kvm@vger.kernel.org, linux-doc@vger.kernel.org, David Airlie , Dave Chinner , dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Paul Mackerras , linux-kselftest@vger.kernel.org, Ira Weiny , Jonathan Corbet , linux-rdma@vger.kernel.org, Michael Ellerman , Christoph Hellwig , Jason Gunthorpe , Vlastimil Babka , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , linux-media@vger.kernel.org, Shuah Khan , linux-block@vger.kernel.org, Alex Williamson , Al Viro , Dan Williams , Mauro Carvalho Chehab , bpf@vger.kernel.org, Magnus Karlsson , Jens Axboe , netdev@vger.kernel.org, LKML , linux-fsdevel@vger.kernel.org, Andrew Morton , linuxppc-dev@lists.ozlabs.org, "David S . Miller" , Mike Kravetz Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Message-ID: <20191104203153.HUt-Q_f5WkSZ11e0TYaLlNWyZV1bODY_kEHOMIYNwYM@z> T24gTW9uLCBOb3YgMDQsIDIwMTkgYXQgMTI6MDk6MDVQTSAtMDgwMCwgSm9obiBIdWJiYXJkIHdy b3RlOgo+IEphc29uLCBhIHF1ZXN0aW9uIGZvciB5b3UgYXQgdGhlIGJvdHRvbS4KPiAKPiBPbiAx MS80LzE5IDExOjUyIEFNLCBKZXJvbWUgR2xpc3NlIHdyb3RlOgo+IC4uLgo+ID4+IENBU0UgMzog T0RQCj4gPj4gLS0tLS0tLS0tLS0KPiA+PiBSRE1BIGhhcmR3YXJlIHdpdGggcGFnZSBmYXVsdGlu ZyBzdXBwb3J0LiBIZXJlLCBhIHdlbGwtd3JpdHRlbiBkcml2ZXIgZG9lc24ndAo+ID4gCj4gPiBD QVNFMzogSGFyZHdhcmUgd2l0aCBwYWdlIGZhdWx0IHN1cHBvcnQKPiA+IC0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQo+ID4gCj4gPiBIZXJlLCBhIHdlbGwtd3JpdHRlbiAu Li4uCj4gPiAKPiAKPiBBaCwgT0suIFNvIGp1c3QgZHJvcCB0aGUgZmlyc3Qgc2VudGVuY2UsIHll cy4KPiAKPiAuLi4KPiA+Pj4+Pj4gKwkgKi8KPiA+Pj4+Pj4gKwlndXBfZmxhZ3MgfD0gRk9MTF9S RU1PVEUgfCBGT0xMX1BJTjsKPiA+Pj4+Pgo+ID4+Pj4+IFdvdWxkbid0IGl0IGJlIGJldHRlciB0 byBub3QgYWRkIHBpbl9sb25ndGVybV9wYWdlc19yZW1vdGUoKSB1bnRpbAo+ID4+Pj4+IGl0IGNh biBiZSBwcm9wZXJseSBpbXBsZW1lbnRlZCA/Cj4gPj4+Pj4KPiA+Pj4+Cj4gPj4+PiBXZWxsLCB0 aGUgcHJvYmxlbSBpcyB0aGF0IEkgbmVlZCBlYWNoIGNhbGwgc2l0ZSB0aGF0IHJlcXVpcmVzIEZP TExfUElOCj4gPj4+PiB0byB1c2UgYSBwcm9wZXIgd3JhcHBlci4gSXQncyB0aGUgRk9MTF9QSU4g dGhhdCBpcyB0aGUgZm9jdXMgaGVyZSwgYmVjYXVzZQo+ID4+Pj4gdGhlcmUgaXMgYSBoYXJkLCBi cmlnaHQgcnVsZSwgd2hpY2ggaXM6IGlmIGFuZCBvbmx5IGlmIGEgY2FsbGVyIHNldHMKPiA+Pj4+ IEZPTExfUElOLCB0aGVuIHRoZSBkbWEtcGFnZSB0cmFja2luZyBoYXBwZW5zLCBhbmQgcHV0X3Vz ZXJfcGFnZSgpIG11c3QKPiA+Pj4+IGJlIGNhbGxlZC4KPiA+Pj4+Cj4gPj4+PiBTbyB0aGlzIGxl YXZlcyBtZSB3aXRoIG9ubHkgdHdvIHJlYXNvbmFibGUgY2hvaWNlczoKPiA+Pj4+Cj4gPj4+PiBh KSBDb252ZXJ0IHRoZSBjYWxsIHNpdGUgYXMgYWJvdmU6IHBpbl9sb25ndGVybV9wYWdlc19yZW1v dGUoKSwgd2hpY2ggc2V0cwo+ID4+Pj4gRk9MTF9QSU4gKHRoZSBrZXkgcG9pbnQhKSwgYW5kIGxl YXZlcyB0aGUgRk9MTF9MT05HVEVSTSBzaXR1YXRpb24gZXhhY3RseQo+ID4+Pj4gYXMgaXQgaGFz IGJlZW4gc28gZmFyLiBXaGVuIHRoZSBGT0xMX0xPTkdURVJNIHNpdHVhdGlvbiBpcyBmaXhlZCwg dGhlIGNhbGwKPiA+Pj4+IHNpdGUgKm1pZ2h0KiBub3QgbmVlZCBhbnkgY2hhbmdlcyB0byBhZG9w dCB0aGUgd29ya2luZyBndXAuYyBjb2RlLgo+ID4+Pj4KPiA+Pj4+IGIpIENvbnZlcnQgdGhlIGNh bGwgc2l0ZSB0byBwaW5fdXNlcl9wYWdlc19yZW1vdGUoKSwgd2hpY2ggYWxzbyBzZXRzCj4gPj4+ PiBGT0xMX1BJTiwgYW5kIGFsc28gbGVhdmVzIHRoZSBGT0xMX0xPTkdURVJNIHNpdHVhdGlvbiBl eGFjdGx5IGFzIGJlZm9yZS4KPiA+Pj4+IFRoZXJlIHdvdWxkIGFsc28gYmUgYSBjb21tZW50IGF0 IHRoZSBjYWxsIHNpdGUsIHRvIHRoZSBlZmZlY3Qgb2YsICJ0aGlzCj4gPj4+PiBpcyB0aGUgd3Jv bmcgY2FsbCB0byBtYWtlOiBpdCByZWFsbHkgcmVxdWlyZXMgRk9MTF9MT05HVEVSTSBiZWhhdmlv ciIuCj4gPj4+Pgo+ID4+Pj4gV2hlbiB0aGUgRk9MTF9MT05HVEVSTSBzaXR1YXRpb24gaXMgZml4 ZWQsIHRoZSBjYWxsIHNpdGUgd2lsbCBuZWVkIHRvIGJlCj4gPj4+PiBjaGFuZ2VkIHRvIHBpbl9s b25ndGVybV9wYWdlc19yZW1vdGUoKS4KPiA+Pj4+Cj4gPj4+PiBTbyB5b3UgY2FuIHByb2JhYmx5 IHNlZSB3aHkgSSBwaWNrZWQgKGEpLgo+ID4+Pgo+ID4+PiBCdXQgcmlnaHQgbm93IG5vYm9keSBo YXMgRk9MTF9MT05HVEVSTSBhbmQgRk9MTF9SRU1PVEUuIFNvIHlvdSBzaG91bGQKPiA+Pj4gbmV2 ZXIgaGF2ZSB0aGUgbmVlZCBmb3IgcGluX2xvbmd0ZXJtX3BhZ2VzX3JlbW90ZSgpLiBNeSBmZWFy IGlzIHRoYXQKPiA+Pj4gbG9uZ3Rlcm0gaGFzIGltcGxpY2F0aW9uIGFuZCBpdCB3b3VsZCBiZSBi ZXR0ZXIgdG8gbm90IGRyb3AgdGhpcyBpbXBsaWNhdGlvbgo+ID4+PiBieSBhZGRpbmcgYSB3cmFw cGVyIHRoYXQgZG9lcyBub3QgZG8gd2hhdCB0aGUgbmFtZSBzYXlzLgo+ID4+Pgo+ID4+PiBTbyBk byBub3QgaW50cm9kdWNlIHBpbl9sb25ndGVybV9wYWdlc19yZW1vdGUoKSB1bnRpbCBpdHMgZmly c3QgdXNlcgo+ID4+PiBoYXBwZW5zLiBUaGlzIGlzIG9wdGlvbiBjKQo+ID4+Pgo+ID4+Cj4gPj4g QWxtb3N0IGZvcmdvdCwgdGhvdWdoOiB0aGVyZSBpcyBhbHJlYWR5IGFub3RoZXIgdXNlcjogSW5m aW5pYmFuZDoKPiA+Pgo+ID4+IGRyaXZlcnMvaW5maW5pYmFuZC9jb3JlL3VtZW1fb2RwLmM6NjQ2 OiAgICAgICAgIG5wYWdlcyA9IHBpbl9sb25ndGVybV9wYWdlc19yZW1vdGUob3duaW5nX3Byb2Nl c3MsIG93bmluZ19tbSwKPiA+IAo+ID4gb2RwIGRvIG5vdCBuZWVkIHRoYXQsIGkgdGhvdWdodCB0 aGUgSE1NIGNvbnZlcnRpb24gd2FzIGFscmVhZHkgdXBzdHJlYW0KPiA+IGJ1dCBzZWVtcyBub3Qs IGluIGFueSBjYXNlIG9kcCBkbyBub3QgbmVlZCB0aGUgbG9uZ3Rlcm0gY2FzZSBpdCBvbmx5Cj4g PiBzbyBiZXN0IGlzIHRvIHJldmVydCB0aGF0IHVzZXIgdG8gZ3VwX2Zhc3Qgb3Igc29tZXRoaW5n IHVudGlsIGl0IGdldAo+ID4gY29udmVydGVkIHRvIEhNTS4KPiA+IAo+IAo+IE5vdGUgZm9yIEph c29uOiB0aGUgKGEpIG9yIChiKSBpdGVtcyBhcmUgdGFsa2luZyBhYm91dCB0aGUgdmZpbyBjYXNl LCB3aGljaCBpcwo+IG9uZSBvZiB0aGUgdHdvIGNhbGwgc2l0ZXMgdGhhdCBub3cgdXNlIHBpbl9s b25ndGVybV9wYWdlc19yZW1vdGUoKSwgYW5kIHRoZQo+IG90aGVyIG9uZSBpcyBpbmZpbmliYW5k Ogo+IAo+IGRyaXZlcnMvaW5maW5pYmFuZC9jb3JlL3VtZW1fb2RwLmM6NjQ2OiAgICAgICAgIG5w YWdlcyA9IHBpbl9sb25ndGVybV9wYWdlc19yZW1vdGUob3duaW5nX3Byb2Nlc3MsIG93bmluZ19t bSwKPiBkcml2ZXJzL3ZmaW8vdmZpb19pb21tdV90eXBlMS5jOjM1MzogICAgICAgICAgICByZXQg PSBwaW5fbG9uZ3Rlcm1fcGFnZXNfcmVtb3RlKE5VTEwsIG1tLCB2YWRkciwgMSwKCnZmaW8gc2hv dWxkIGJlIHJldmVydGVkIHVudGlsIGl0IGNhbiBiZSBwcm9wZXJseSBpbXBsZW1lbnRlZC4KVGhl IGlzc3VlIGlzIHRoYXQgd2hlbiB5b3UgZml4IHRoZSBpbXBsZW1lbnRhdGlvbiB5b3UgbWlnaHQK YnJlYWsgdmZpbyBleGlzdGluZyB1c2VyIGFuZCB0aHVzIHJlZ3Jlc3MgdGhlIGtlcm5lbCBmcm9t IHVzZXIKcG9pbnQgb2Ygdmlldy4gU28gaSByYXRoZXIgaGF2ZSB0aGUgY2hhbmdlIHRvIHZmaW8g cmV2ZXJ0ZWQsCmkgYmVsaWV2ZSBpdCB3YXMgbm90IHdlbGwgdW5kZXJzdG9vZCB3aGVuIGl0IGdv dCB1cHN0cmVhbSwKYmV0d2VlbiBpbiBteSA1LjQgdHJlZSBpdCBpcyBzdGlsbCBndXBfcmVtb3Rl IG5vdCBsb25ndGVybS4KCgo+IEplcm9tZSwgSmFzb246IEkgcmVhbGx5IGRvbid0IHdhbnQgdG8g cmV2ZXJ0IHRoZSBwdXRfcGFnZSgpIHRvIHB1dF91c2VyX3BhZ2UoKSAKPiBjb252ZXJzaW9ucyB0 aGF0IGFyZSBhbHJlYWR5IHRocm91Z2hvdXQgdGhlIElCIGRyaXZlci0tcG9pbnRsZXNzIGNodXJu LCByaWdodD8KPiBJJ2QgcmF0aGVyIGVpdGhlciBkZWxldGUgdGhlbSBpbiBKYXNvbidzIHRyZWUs IG9yIGdvIHdpdGggd2hhdCBJIGhhdmUgaGVyZQo+IHdoaWxlIHdhaXRpbmcgZm9yIHRoZSBkZWxl dGlvbi4KPiAKPiBNYXliZSB3ZSBzaG91bGQganVzdCBzZXR0bGUgb24gKGEpIG9yIChiKSwgc28g dGhhdCB0aGUgSUIgZHJpdmVyIGVuZHMgdXAgd2l0aAo+IHRoZSB3cmFwcGVyIGZ1bmN0aW9ucz8g SW4gZmFjdCwgaWYgaXQncyBnZXR0aW5nIGRlbGV0ZWQsIHRoZW4gSSdkIHByZWZlciBsZWF2aW5n Cj4gaXQgYXQgKGEpLCBzaW5jZSB0aGF0J3Mgc2ltcGxlLi4uCj4gCj4gSmFzb24gc2hvdWxkIHdl aWdoIGluIG9uIGhvdyBoZSB3YW50cyB0aGlzIHRvIGdvLCB3aXRoIHJlc3BlY3QgdG8gYnJhbmNo aW5nCj4gYW5kIG1lcmdpbmcsIHNpbmNlIGl0IHNvdW5kcyBsaWtlIHRoYXQgd2lsbCBjb25mbGlj dCB3aXRoIHRoZSBobW0gYnJhbmNoIAo+IChoYSwgSSdtIG92ZXJkdWUgaW4gcmV2aWV3aW5nIGhp cyBtbXUgbm90aWZpZXIgc2VyaWVzLCB0aGF0J3Mgd2hhdCBJIGdldCBmb3IKPiBiZWluZyBsYXRl KS4KPiAKPiB0aGFua3MsCj4gCj4gSm9obiBIdWJiYXJkCj4gTlZJRElBCgpfX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwpkcmktZGV2ZWwgbWFpbGluZyBsaXN0 CmRyaS1kZXZlbEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5mcmVlZGVza3Rv cC5vcmcvbWFpbG1hbi9saXN0aW5mby9kcmktZGV2ZWw=