From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
From: John Hubbard
To: Jerome Glisse
CC: Andrew Morton, Al Viro, Alex Williamson, Benjamin Herrenschmidt,
 Björn Töpel, Christoph Hellwig, Dan Williams, Daniel Vetter, Dave Chinner,
 David Airlie, "David S . Miller", Ira Weiny, Jan Kara, Jason Gunthorpe,
 Jens Axboe, Jonathan Corbet, Magnus Karlsson, Mauro Carvalho Chehab,
 Michael Ellerman, Michal Hocko, Mike Kravetz, Paul Mackerras, Shuah Khan,
 Vlastimil Babka, LKML
Date: Mon, 4 Nov 2019 11:30:32 -0800
References: <20191103211813.213227-1-jhubbard@nvidia.com>
 <20191103211813.213227-6-jhubbard@nvidia.com>
 <20191104173325.GD5134@redhat.com>
 <20191104191811.GI5134@redhat.com>
In-Reply-To: <20191104191811.GI5134@redhat.com>
Sender: bpf-owner@vger.kernel.org
X-Mailing-List: bpf@vger.kernel.org

On 11/4/19 11:18 AM, Jerome Glisse wrote:
> On Mon, Nov 04, 2019 at 11:04:38AM -0800, John Hubbard wrote:
>> On 11/4/19 9:33 AM, Jerome Glisse wrote:
>> ...
>>>
>>> Few nitpick belows, nonetheless:
>>>
>>> Reviewed-by: Jérôme Glisse
>>> [...]
>>>> +
>>>> +CASE 3: ODP
>>>> +-----------
>>>> +(Mellanox/Infiniband On Demand Paging: the hardware supports
>>>> +replayable page faulting). There are GUP references to pages serving as DMA
>>>> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
>>>> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
>>>> +needs to be set.
>>>
>>> I would not include ODP or anything like it here, they do not use
>>> GUP anymore and i believe it is more confusing here. I would however
>>> include some text in this documentation explaining that hardware
>>> that support page fault is superior as it does not incur any
>>> of the issues described here.
>>
>> OK, agreed, here's a new write up that I'll put in v3:
>>
>>
>> CASE 3: ODP
>> -----------
>
> ODP is RDMA, maybe Hardware with page fault support instead
>
>> Advanced, but non-CPU (DMA) hardware that supports replayable page faults.

OK, so:

    "RDMA hardware with page faulting support."

for the first sentence.

>> Here, a well-written driver doesn't normally need to pin pages at all. However,
>> if the driver does choose to do so, it can register MMU notifiers for the range,
>> and will be called back upon invalidation. Either way (avoiding page pinning, or
>> using MMU notifiers to unpin upon request), there is proper synchronization with
>> both filesystem and mm (page_mkclean(), munmap(), etc).
>>
>> Therefore, neither flag needs to be set.
>
> In fact GUP should never be use with those.

Yes. The next paragraph says that, but maybe not strong enough.

>>
>> It's worth mentioning here that pinning pages should not be the first design
>> choice. If page fault capable hardware is available, then the software should
>> be written so that it does not pin pages. This allows mm and filesystems to
>> operate more efficiently and reliably.

Here's what we have after the above changes:

CASE 3: ODP
-----------
RDMA hardware with page faulting support. Here, a well-written driver doesn't
normally need to pin pages at all. However, if the driver does choose to do so,
it can register MMU notifiers for the range, and will be called back upon
invalidation. Either way (avoiding page pinning, or using MMU notifiers to unpin
upon request), there is proper synchronization with both filesystem and mm
(page_mkclean(), munmap(), etc).

Therefore, neither flag needs to be set.

In this case, ideally, neither get_user_pages() nor pin_user_pages() should be
called. Instead, the software should be written so that it does not pin pages.
This allows mm and filesystems to operate more efficiently and reliably.

>>> [...]
>>>
>>>> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
>>>>  		BUG_ON(*locked != 1);
>>>>  	}
>>>>
>>>> -	if (pages)
>>>> +	/*
>>>> +	 * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
>>>> +	 * is to set FOLL_GET if the caller wants pages[] filled in (but has
>>>> +	 * carelessly failed to specify FOLL_GET), so keep doing that, but only
>>>> +	 * for FOLL_GET, not for the newer FOLL_PIN.
>>>> +	 *
>>>> +	 * FOLL_PIN always expects pages to be non-null, but no need to assert
>>>> +	 * that here, as any failures will be obvious enough.
>>>> +	 */
>>>> +	if (pages && !(flags & FOLL_PIN))
>>>>  		flags |= FOLL_GET;
>>>
>>> Did you look at user that have pages and not FOLL_GET set ?
>>> I believe it would be better to first fix them to end up
>>> with FOLL_GET set and then error out if pages is != NULL but
>>> nor FOLL_GET or FOLL_PIN is set.
>>>
>>
>> I was perhaps overly cautious, and didn't go there. However, it's probably
>> doable, given that there was already the following in __get_user_pages():
>>
>>     VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
>>
>> ...which will have conditioned people and code to set FOLL_GET together with
>> pages. So I agree that the time is right.
>>
>> In order to make bisecting future failures simpler, I can insert a patch right
>> before this one, that changes the FOLL_GET setting into an assert, like this:
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index 8f236a335ae9..be338961e80d 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -1014,8 +1014,8 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
>>  		BUG_ON(*locked != 1);
>>  	}
>>
>> -	if (pages)
>> -		flags |= FOLL_GET;
>> +	if (pages && WARN_ON_ONCE(!(gup_flags & FOLL_GET)))
>> +		return -EINVAL;
>>
>>  	pages_done = 0;
>>  	lock_dropped = false;
>>
>>
>> ...and then add in FOLL_PIN, with this patch.
>
> looks good but double check that it should not happens, i will try
> to check on my side too.

Yes, I'll look.

...
>>>> +	 */
>>>> +	gup_flags |= FOLL_REMOTE | FOLL_PIN;
>>>
>>> Wouldn't it be better to not add pin_longterm_pages_remote() until
>>> it can be properly implemented ?
>>>
>>
>> Well, the problem is that I need each call site that requires FOLL_PIN
>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, because
>> there is a hard, bright rule, which is: if and only if a caller sets
>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() must
>> be called.
>>
>> So this leaves me with only two reasonable choices:
>>
>> a) Convert the call site as above: pin_longterm_pages_remote(), which sets
>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation exactly
>> as it has been so far. When the FOLL_LONGTERM situation is fixed, the call
>> site *might* not need any changes to adopt the working gup.c code.
>>
>> b) Convert the call site to pin_user_pages_remote(), which also sets
>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as before.
>> There would also be a comment at the call site, to the effect of, "this
>> is the wrong call to make: it really requires FOLL_LONGTERM behavior".
>>
>> When the FOLL_LONGTERM situation is fixed, the call site will need to be
>> changed to pin_longterm_pages_remote().
>>
>> So you can probably see why I picked (a).
>
> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should
> never have the need for pin_longterm_pages_remote(). My fear is that
> longterm has implication and it would be better to not drop this implication
> by adding a wrapper that does not do what the name says.
>
> So do not introduce pin_longterm_pages_remote() until its first user
> happens. This is option c)
>

Almost forgot, though: there is already another user: Infiniband:

drivers/infiniband/core/umem_odp.c:646: npages = pin_longterm_pages_remote(owning_process, owning_mm,


thanks,

John Hubbard
NVIDIA
Miller" , Mike Kravetz Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Message-ID: <20191104193032.wvdrfqV5LnpsU3soucmB5vqrU_KpIKyry9okvBqLPD8@z>

On 11/4/19 11:18 AM, Jerome Glisse wrote:
> On Mon, Nov 04, 2019 at 11:04:38AM -0800, John Hubbard wrote:
>> On 11/4/19 9:33 AM, Jerome Glisse wrote:
>> ...
>>>
>>> Few nitpick belows, nonetheless:
>>>
>>> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
>>> [...]
>>>> +
>>>> +CASE 3: ODP
>>>> +-----------
>>>> +(Mellanox/Infiniband On Demand Paging: the hardware supports
>>>> +replayable page faulting). There are GUP references to pages serving as DMA
>>>> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
>>>> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
>>>> +needs to be set.
>>>
>>> I would not include ODP or anything like it here, they do not use
>>> GUP anymore and i believe it is more confusing here. I would however
>>> include some text in this documentation explaining that hardware
>>> that support page fault is superior as it does not incur any
>>> of the issues described here.
>>
>> OK, agreed, here's a new write up that I'll put in v3:
>>
>>
>> CASE 3: ODP
>> -----------
>
> ODP is RDMA, maybe Hardware with page fault support instead
>
>> Advanced, but non-CPU (DMA) hardware that supports replayable page faults.

OK, so:

    "RDMA hardware with page faulting support."

for the first sentence.


>> Here, a well-written driver doesn't normally need to pin pages at all. However,
>> if the driver does choose to do so, it can register MMU notifiers for the range,
>> and will be called back upon invalidation. Either way (avoiding page pinning, or
>> using MMU notifiers to unpin upon request), there is proper synchronization with
>> both filesystem and mm (page_mkclean(), munmap(), etc).
>>
>> Therefore, neither flag needs to be set.
>
> In fact GUP should never be use with those.


Yes. The next paragraph says that, but maybe not strong enough.


>>
>> It's worth mentioning here that pinning pages should not be the first design
>> choice. If page fault capable hardware is available, then the software should
>> be written so that it does not pin pages. This allows mm and filesystems to
>> operate more efficiently and reliably.

Here's what we have after the above changes:

CASE 3: ODP
-----------
RDMA hardware with page faulting support. Here, a well-written driver doesn't
normally need to pin pages at all. However, if the driver does choose to do so,
it can register MMU notifiers for the range, and will be called back upon
invalidation. Either way (avoiding page pinning, or using MMU notifiers to unpin
upon request), there is proper synchronization with both filesystem and mm
(page_mkclean(), munmap(), etc).

Therefore, neither flag needs to be set.

In this case, ideally, neither get_user_pages() nor pin_user_pages() should be
called. Instead, the software should be written so that it does not pin pages.
This allows mm and filesystems to operate more efficiently and reliably.

>>> [...]
>>>
>>>> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
>>>>  		BUG_ON(*locked != 1);
>>>>  	}
>>>>  
>>>> -	if (pages)
>>>> +	/*
>>>> +	 * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
>>>> +	 * is to set FOLL_GET if the caller wants pages[] filled in (but has
>>>> +	 * carelessly failed to specify FOLL_GET), so keep doing that, but only
>>>> +	 * for FOLL_GET, not for the newer FOLL_PIN.
>>>> +	 *
>>>> +	 * FOLL_PIN always expects pages to be non-null, but no need to assert
>>>> +	 * that here, as any failures will be obvious enough.
>>>> +	 */
>>>> +	if (pages && !(flags & FOLL_PIN))
>>>>  		flags |= FOLL_GET;
>>>
>>> Did you look at user that have pages and not FOLL_GET set ?
>>> I believe it would be better to first fix them to end up
>>> with FOLL_GET set and then error out if pages is != NULL but
>>> nor FOLL_GET or FOLL_PIN is set.
>>>
>>
>> I was perhaps overly cautious, and didn't go there. However, it's probably
>> doable, given that there was already the following in __get_user_pages():
>>
>>     VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
>>
>> ...which will have conditioned people and code to set FOLL_GET together with
>> pages. So I agree that the time is right.
>>
>> In order to make bisecting future failures simpler, I can insert a patch right
>> before this one, that changes the FOLL_GET setting into an assert, like this:
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index 8f236a335ae9..be338961e80d 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -1014,8 +1014,8 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
>>                 BUG_ON(*locked != 1);
>>         }
>>  
>> -       if (pages)
>> -               flags |= FOLL_GET;
>> +       if (pages && WARN_ON_ONCE(!(gup_flags & FOLL_GET)))
>> +               return -EINVAL;
>>  
>>         pages_done = 0;
>>         lock_dropped = false;
>>
>>
>> ...and then add in FOLL_PIN, with this patch.
>
> looks good but double check that it should not happens, i will try
> to check on my side too.

Yes, I'll look.

...
>>>> +	 */
>>>> +	gup_flags |= FOLL_REMOTE | FOLL_PIN;
>>>
>>> Wouldn't it be better to not add pin_longterm_pages_remote() until
>>> it can be properly implemented ?
>>>
>>
>> Well, the problem is that I need each call site that requires FOLL_PIN
>> to use a proper wrapper. It's the FOLL_PIN that is the focus here, because
>> there is a hard, bright rule, which is: if and only if a caller sets
>> FOLL_PIN, then the dma-page tracking happens, and put_user_page() must
>> be called.
>>
>> So this leaves me with only two reasonable choices:
>>
>> a) Convert the call site as above: pin_longterm_pages_remote(), which sets
>> FOLL_PIN (the key point!), and leaves the FOLL_LONGTERM situation exactly
>> as it has been so far. When the FOLL_LONGTERM situation is fixed, the call
>> site *might* not need any changes to adopt the working gup.c code.
>>
>> b) Convert the call site to pin_user_pages_remote(), which also sets
>> FOLL_PIN, and also leaves the FOLL_LONGTERM situation exactly as before.
>> There would also be a comment at the call site, to the effect of, "this
>> is the wrong call to make: it really requires FOLL_LONGTERM behavior".
>>
>> When the FOLL_LONGTERM situation is fixed, the call site will need to be
>> changed to pin_longterm_pages_remote().
>>
>> So you can probably see why I picked (a).
>
> But right now nobody has FOLL_LONGTERM and FOLL_REMOTE. So you should
> never have the need for pin_longterm_pages_remote(). My fear is that
> longterm has implication and it would be better to not drop this implication
> by adding a wrapper that does not do what the name says.
>
> So do not introduce pin_longterm_pages_remote() until its first user
> happens. This is option c)
>

Almost forgot, though: there is already another user: Infiniband:

drivers/infiniband/core/umem_odp.c:646:         npages = pin_longterm_pages_remote(owning_process, owning_mm,

thanks,

John Hubbard
NVIDIA
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
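
The two ways of handling a non-NULL pages argument that come up in this
thread can be compared in a small user-space sketch. This is not kernel
code and not part of the patch: the flag values and the function names
resolve_flags_default(), resolve_flags_strict() and demo_policies() are
hypothetical, chosen only to illustrate the difference between silently
defaulting to FOLL_GET and rejecting a caller that set neither flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define FOLL_GET 0x1u
#define FOLL_PIN 0x2u

/*
 * Defaulting policy: if the caller wants pages[] filled in and did not
 * ask for FOLL_PIN, set FOLL_GET on its behalf. FOLL_PIN is left alone,
 * so the two flags never end up set together through this path.
 */
static int resolve_flags_default(const void *pages, unsigned int *flags)
{
	if (pages && !(*flags & FOLL_PIN))
		*flags |= FOLL_GET;
	return 0;
}

/*
 * Strict policy: a caller that passes pages[] must already have set
 * FOLL_GET or FOLL_PIN; otherwise the call fails with -EINVAL and the
 * flags are left untouched.
 */
static int resolve_flags_strict(const void *pages, unsigned int *flags)
{
	if (pages && !(*flags & (FOLL_GET | FOLL_PIN)))
		return -EINVAL;
	return 0;
}

/* Exercise both policies with a stand-in for a pages[] array. */
void demo_policies(void)
{
	int page_slot = 0;	/* any non-NULL address stands in for pages[] */
	unsigned int flags;

	flags = 0;
	assert(resolve_flags_default(&page_slot, &flags) == 0);
	assert(flags == FOLL_GET);	/* FOLL_GET was added implicitly */

	flags = FOLL_PIN;
	assert(resolve_flags_default(&page_slot, &flags) == 0);
	assert(flags == FOLL_PIN);	/* FOLL_PIN callers are not given FOLL_GET */

	flags = 0;
	assert(resolve_flags_strict(&page_slot, &flags) == -EINVAL);
	assert(flags == 0);		/* rejected, nothing was changed */
}
```

Calling demo_policies() from a main() runs the checks. Under the strict
policy a careless caller fails immediately, which is what would make such
a change easy to bisect.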