From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 12 Nov 2019 07:51:05 +0100
From: Mike Rapoport
To: John Hubbard
Cc: Andrew Morton, Al Viro, Alex Williamson, Benjamin Herrenschmidt,
	Björn Töpel, Christoph Hellwig, Dan Williams, Daniel Vetter,
	Dave Chinner, David Airlie, "David S . Miller", Ira Weiny, Jan Kara,
	Jason Gunthorpe, Jens Axboe, Jonathan Corbet, Jérôme Glisse,
	Magnus Karlsson, Mauro Carvalho Chehab, Michael Ellerman, Michal Hocko,
	Mike Kravetz, Paul Mackerras, Shuah Khan, Vlastimil Babka,
	bpf@vger.kernel.org, dri-devel@lists.freedesktop.org,
	kvm@vger.kernel.org, linux-block@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
	linux-rdma@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	netdev@vger.kernel.org, linux-mm@kvack.org, LKML
Subject: Re: [PATCH v3 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN
Message-ID: <20191112065103.GA1209@rapoport-lnx>
References: <20191112000700.3455038-1-jhubbard@nvidia.com>
	<20191112000700.3455038-10-jhubbard@nvidia.com>
In-Reply-To: <20191112000700.3455038-10-jhubbard@nvidia.com>
X-Mailing-List: bpf@vger.kernel.org

On Mon, Nov 11, 2019 at 04:06:46PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
>
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
>
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
>
>     pin_user_pages()
>     pin_user_pages_remote()
>     pin_user_pages_fast()
>
>     pin_longterm_pages()
>     pin_longterm_pages_remote()
>     pin_longterm_pages_fast()
>
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
>
> The underlying rules are:
>
> * These are gup-internal flags, so the call sites should not directly
>   set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
>   assertions, for the new FOLL_PIN flag. However, for the pre-existing
>   FOLL_LONGTERM flag, which has some call sites that still directly
>   set FOLL_LONGTERM, there is no assertion yet.
>
> * Call sites that want to indicate that they are going to do DirectIO
>   ("DIO") or something with similar characteristics, should call a
>   get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
>   will:
>         * Start with "pin_user_pages" instead of "get_user_pages". That
>           makes it easy to find and audit the call sites.
>         * Set FOLL_PIN
>
> * For pages that are received via FOLL_PIN, those pages must be returned
>   via put_user_page().
>
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded upon it.)
>
> Reviewed-by: Jérôme Glisse
> Cc: Mike Rapoport
> Cc: Jonathan Corbet
> Cc: Ira Weiny
> Signed-off-by: John Hubbard
> ---

Reviewed-by: Mike Rapoport # Documentation

>  Documentation/core-api/index.rst          |   1 +
>  Documentation/core-api/pin_user_pages.rst | 218 ++++++++++++++++++
>  include/linux/mm.h                        |  62 +++++-
>  mm/gup.c                                  | 260 ++++++++++++++++++++--
>  4 files changed, 514 insertions(+), 27 deletions(-)
>  create mode 100644 Documentation/core-api/pin_user_pages.rst
>
> diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
> index ab0eae1c153a..413f7d7c8642 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -31,6 +31,7 @@ Core utilities
>     generic-radix-tree
>     memory-allocation
>     mm-api
> +   pin_user_pages
>     gfp_mask-from-fs-io
>     timekeeping
>     boot-time-mm
> diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
> new file mode 100644
> index 000000000000..ce819e709435
> --- /dev/null
> +++ b/Documentation/core-api/pin_user_pages.rst
> @@ -0,0 +1,218 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================================================
> +pin_user_pages() and related calls
> +====================================================
> +
> +.. contents:: :local:
> +
> +Overview
> +========
> +
> +This document describes the following functions: ::
> +
> + pin_user_pages
> + pin_user_pages_fast
> + pin_user_pages_remote
> +
> + pin_longterm_pages
> + pin_longterm_pages_fast
> + pin_longterm_pages_remote
> +
> +Basic description of FOLL_PIN
> +=============================
> +
> +FOLL_PIN and FOLL_LONGTERM are flags that can be passed to the get_user_pages*()
> +("gup") family of functions. FOLL_PIN has significant interactions and
> +interdependencies with FOLL_LONGTERM, so both are covered here.
> +
> +Both FOLL_PIN and FOLL_LONGTERM are internal to gup, meaning that neither
> +FOLL_PIN nor FOLL_LONGTERM should appear at the gup call sites. This allows
> +the associated wrapper functions (pin_user_pages() and others) to set the
> +correct combination of these flags, and to check for problems as well.
> +
> +FOLL_PIN and FOLL_GET are mutually exclusive for a given gup call. However,
> +multiple threads and call sites are free to pin the same struct pages, via both
> +FOLL_PIN and FOLL_GET. It's just the call site that needs to choose one or the
> +other, not the struct page(s).
> +
> +The FOLL_PIN implementation is nearly the same as FOLL_GET, except that FOLL_PIN
> +uses a different reference counting technique.
> +
> +FOLL_PIN is a prerequisite to FOLL_LONGTERM. Another way of saying that is,
> +FOLL_LONGTERM is a specific, more restrictive case of FOLL_PIN.
> +
> +Which flags are set by each wrapper
> +===================================
> +
> +Only FOLL_PIN and FOLL_LONGTERM are covered here. These flags are added to
> +whatever flags the caller provides::
> +
> + Function                    gup flags (FOLL_PIN or FOLL_LONGTERM only)
> + --------                    ------------------------------------------
> + pin_user_pages              FOLL_PIN
> + pin_user_pages_fast         FOLL_PIN
> + pin_user_pages_remote       FOLL_PIN
> +
> + pin_longterm_pages          FOLL_PIN | FOLL_LONGTERM
> + pin_longterm_pages_fast     FOLL_PIN | FOLL_LONGTERM
> + pin_longterm_pages_remote   FOLL_PIN | FOLL_LONGTERM
> +
> +Tracking dma-pinned pages
> +=========================
> +
> +Some of the key design constraints, and solutions, for tracking dma-pinned
> +pages:
> +
> +* An actual reference count, per struct page, is required. This is because
> +  multiple processes may pin and unpin a page.
> +
> +* False positives (reporting that a page is dma-pinned, when in fact it is not)
> +  are acceptable, but false negatives are not.
> +
> +* struct page may not be increased in size for this, and all fields are already
> +  used.
> +
> +* Given the above, we can overload the page->_refcount field by using, sort of,
> +  the upper bits in that field for a dma-pinned count. "Sort of", means that,
> +  rather than dividing page->_refcount into bit fields, we simply add a
> +  medium-large value (GUP_PIN_COUNTING_BIAS, initially chosen to be 1024: 10
> +  bits) to page->_refcount. This provides fuzzy behavior: if a page has
> +  get_page() called on it 1024 times, then it will appear to have a single
> +  dma-pinned count. And again, that's acceptable.
> +
> +This also leads to limitations: there are only 31-10==21 bits available for a
> +counter that increments 10 bits at a time.
> +
> +TODO: for 1GB and larger huge pages, this is cutting it close. That's because
> +when pin_user_pages() follows such pages, it increments the head page by "1"
> +(where "1" used to mean "+1" for get_user_pages(), but now means "+1024" for
> +pin_user_pages()) for each tail page. So if you have a 1GB huge page:
> +
> +* There are 256K (18 bits) worth of 4 KB tail pages.
> +* There are 21 bits available to count up via GUP_PIN_COUNTING_BIAS (that is,
> +  10 bits at a time)
> +* There are 21 - 18 == 3 bits available to count. Except that there aren't,
> +  because you need to allow for a few normal get_page() calls on the head page,
> +  as well. Fortunately, the approach of using addition, rather than "hard"
> +  bitfields, within page->_refcount, allows for sharing these bits gracefully.
> +  But we're still looking at about 8 references.
> +
> +This, however, is a missing feature more than anything else, because it's easily
> +solved by addressing an obvious inefficiency in the original get_user_pages()
> +approach of retrieving pages: stop treating all the pages as if they were
> +PAGE_SIZE. Retrieve huge pages as huge pages. The callers need to be aware of
> +this, so some work is required. Once that's in place, this limitation mostly
> +disappears from view, because there will be ample refcounting range available.
> +
> +* Callers must specifically request "dma-pinned tracking of pages". In other
> +  words, just calling get_user_pages() will not suffice; a new set of functions,
> +  pin_user_pages() and related, must be used.
> +
> +FOLL_PIN, FOLL_GET, FOLL_LONGTERM: when to use which flags
> +==========================================================
> +
> +Thanks to Jan Kara, Vlastimil Babka and several other -mm people, for describing
> +these categories:
> +
> +CASE 1: Direct IO (DIO)
> +-----------------------
> +There are GUP references to pages that are serving
> +as DIO buffers. These buffers are needed for a relatively short time (so they
> +are not "long term"). No special synchronization with page_mkclean() or
> +munmap() is provided. Therefore, flags to set at the call site are: ::
> +
> +    FOLL_PIN
> +
> +...but rather than setting FOLL_PIN directly, call sites should use one of
> +the pin_user_pages*() routines that set FOLL_PIN.
> +
> +CASE 2: RDMA
> +------------
> +There are GUP references to pages that are serving as DMA
> +buffers. These buffers are needed for a long time ("long term"). No special
> +synchronization with page_mkclean() or munmap() is provided. Therefore, flags
> +to set at the call site are: ::
> +
> +    FOLL_PIN | FOLL_LONGTERM
> +
> +NOTE: Some pages, such as DAX pages, cannot be pinned with longterm pins. That's
> +because DAX pages do not have a separate page cache, and so "pinning" implies
> +locking down file system blocks, which is not (yet) supported in that way.
> +
> +CASE 3: Hardware with page faulting support
> +-------------------------------------------
> +Here, a well-written driver doesn't normally need to pin pages at all. However,
> +if the driver does choose to do so, it can register MMU notifiers for the range,
> +and will be called back upon invalidation. Either way (avoiding page pinning, or
> +using MMU notifiers to unpin upon request), there is proper synchronization with
> +both filesystem and mm (page_mkclean(), munmap(), etc).
> +
> +Therefore, neither flag needs to be set.
> +
> +In this case, ideally, neither get_user_pages() nor pin_user_pages() should be
> +called. Instead, the software should be written so that it does not pin pages.
> +This allows mm and filesystems to operate more efficiently and reliably.
> +
> +CASE 4: Pinning for struct page manipulation only
> +-------------------------------------------------
> +Here, normal GUP calls are sufficient, so neither flag needs to be set.
> +
> +page_dma_pinned(): the whole point of pinning
> +=============================================
> +
> +The whole point of marking pages as "DMA-pinned" or "gup-pinned" is to be able
> +to query, "is this page DMA-pinned?" That allows code such as page_mkclean()
> +(and file system writeback code in general) to make informed decisions about
> +what to do when a page cannot be unmapped due to such pins.
> +
> +What to do in those cases is the subject of a years-long series of discussions
> +and debates (see the References at the end of this document). It's a TODO item
> +here: fill in the details once that's worked out. Meanwhile, it's safe to say
> +that having this available: ::
> +
> +        static inline bool page_dma_pinned(struct page *page)
> +
> +...is a prerequisite to solving the long-running gup+DMA problem.
> +
> +Another way of thinking about FOLL_GET, FOLL_PIN, and FOLL_LONGTERM
> +===================================================================
> +
> +Another way of thinking about these flags is as a progression of restrictions:
> +FOLL_GET is for struct page manipulation, without affecting the data that the
> +struct page refers to. FOLL_PIN is a *replacement* for FOLL_GET, and is for
> +short term pins on pages whose data *will* get accessed. As such, FOLL_PIN is
> +a "more severe" form of pinning. And finally, FOLL_LONGTERM is an even more
> +restrictive case that has FOLL_PIN as a prerequisite: this is for pages that
> +will be pinned longterm, and whose data will be accessed.
> +
> +Unit testing
> +============
> +This file::
> +
> + tools/testing/selftests/vm/gup_benchmark.c
> +
> +has the following new calls to exercise the new pin*() wrapper functions:
> +
> +* PIN_FAST_BENCHMARK (./gup_benchmark -a)
> +* PIN_LONGTERM_BENCHMARK (./gup_benchmark -a)
> +* PIN_BENCHMARK (./gup_benchmark -a)
> +
> +You can monitor how many total dma-pinned pages have been acquired and released
> +since the system was booted, via two new /proc/vmstat entries: ::
> +
> +    /proc/vmstat/nr_foll_pin_requested
> +    /proc/vmstat/nr_foll_pin_requested
> +
> +Those are both going to show zero, unless CONFIG_DEBUG_VM is set. This is
> +because there is a noticeable performance drop in put_user_page(), when they
> +are activated.
> +
> +References
> +==========
> +
> +* `Some slow progress on get_user_pages() (Apr 2, 2019) `_
> +* `DMA and get_user_pages() (LPC: Dec 12, 2018) `_
> +* `The trouble with get_user_pages() (Apr 30, 2018) `_
> +
> +John Hubbard, October, 2019
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 96228376139c..11e0086d64a4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1542,9 +1542,23 @@ long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
>  			    unsigned long start, unsigned long nr_pages,
>  			    unsigned int gup_flags, struct page **pages,
>  			    struct vm_area_struct **vmas, int *locked);
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +			   unsigned long start, unsigned long nr_pages,
> +			   unsigned int gup_flags, struct page **pages,
> +			   struct vm_area_struct **vmas, int *locked);
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +			       unsigned long start, unsigned long nr_pages,
> +			       unsigned int gup_flags, struct page **pages,
> +			       struct vm_area_struct **vmas, int *locked);
>  long get_user_pages(unsigned long start, unsigned long nr_pages,
>  			    unsigned int gup_flags, struct page **pages,
>  			    struct vm_area_struct **vmas);
> +long pin_user_pages(unsigned long start, unsigned long nr_pages,
> +		    unsigned int gup_flags, struct page **pages,
> +		    struct vm_area_struct **vmas);
> +long pin_longterm_pages(unsigned long start, unsigned long nr_pages,
> +			unsigned int gup_flags, struct page **pages,
> +			struct vm_area_struct **vmas);
>  long get_user_pages_locked(unsigned long start, unsigned long nr_pages,
>  		    unsigned int gup_flags, struct page **pages, int *locked);
>  long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
> @@ -1552,6 +1566,10 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>
>  int get_user_pages_fast(unsigned long start, int nr_pages,
>  			unsigned int gup_flags, struct page **pages);
> +int pin_user_pages_fast(unsigned long start, int nr_pages,
> +			unsigned int gup_flags, struct page **pages);
> +int pin_longterm_pages_fast(unsigned long start, int nr_pages,
> +			    unsigned int gup_flags, struct page **pages);
>
>  int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc);
>  int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
> @@ -2610,13 +2628,15 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>  #define FOLL_ANON	0x8000	/* don't do file mappings */
>  #define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
>  #define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
> +#define FOLL_PIN	0x40000	/* pages must be released via put_user_page() */
>
>  /*
> - * NOTE on FOLL_LONGTERM:
> + * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
> + * other. Here is what they mean, and how to use them:
>   *
>   * FOLL_LONGTERM indicates that the page will be held for an indefinite time
> - * period _often_ under userspace control.  This is contrasted with
> - * iov_iter_get_pages() where usages which are transient.
> + * period _often_ under userspace control. This is in contrast to
> + * iov_iter_get_pages(), whose usages are transient.
>   *
>   * FIXME: For pages which are part of a filesystem, mappings are subject to the
>   * lifetime enforced by the filesystem and we need guarantees that longterm
> @@ -2631,11 +2651,41 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>   * Currently only get_user_pages() and get_user_pages_fast() support this flag
>   * and calls to get_user_pages_[un]locked are specifically not allowed.  This
>   * is due to an incompatibility with the FS DAX check and
> - * FAULT_FLAG_ALLOW_RETRY
> + * FAULT_FLAG_ALLOW_RETRY.
>   *
> - * In the CMA case: longterm pins in a CMA region would unnecessarily fragment
> - * that region.  And so CMA attempts to migrate the page before pinning when
> + * In the CMA case: long term pins in a CMA region would unnecessarily fragment
> + * that region.  And so, CMA attempts to migrate the page before pinning, when
>   * FOLL_LONGTERM is specified.
> + *
> + * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount,
> + * but an additional pin counting system) will be invoked. This is intended for
> + * anything that gets a page reference and then touches page data (for example,
> + * Direct IO). This lets the filesystem know that some non-file-system entity is
> + * potentially changing the pages' data. In contrast to FOLL_GET (whose pages
> + * are released via put_page()), FOLL_PIN pages must be released, ultimately, by
> + * a call to put_user_page().
> + *
> + * FOLL_PIN is similar to FOLL_GET: both of these pin pages. They use different
> + * and separate refcounting mechanisms, however, and that means that each has
> + * its own acquire and release mechanisms:
> + *
> + *     FOLL_GET: get_user_pages*() to acquire, and put_page() to release.
> + *
> + *     FOLL_PIN: pin_user_pages*() or pin_longterm_pages*() to acquire, and
> + *               put_user_pages to release.
> + *
> + * FOLL_PIN and FOLL_GET are mutually exclusive for a given function call.
> + * (The underlying pages may experience both FOLL_GET-based and FOLL_PIN-based
> + * calls applied to them, and that's perfectly OK. This is a constraint on the
> + * callers, not on the pages.)
> + *
> + * FOLL_PIN and FOLL_LONGTERM should be set internally by the pin_user_page*()
> + * and pin_longterm_*() APIs, never directly by the caller. That's in order to
> + * help avoid mismatches when releasing pages: get_user_pages*() pages must be
> + * released via put_page(), while pin_user_pages*() pages must be released via
> + * put_user_page().
> + *
> + * Please see Documentation/core-api/pin_user_pages.rst for more information.
>   */
>
>  static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
> diff --git a/mm/gup.c b/mm/gup.c
> index cfe6dc5fc343..ea31810da828 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -194,6 +194,10 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  	spinlock_t *ptl;
>  	pte_t *ptep, pte;
>
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> +			 (FOLL_PIN | FOLL_GET)))
> +		return ERR_PTR(-EINVAL);
>  retry:
>  	if (unlikely(pmd_bad(*pmd)))
>  		return no_page_table(vma, flags);
> @@ -805,7 +809,7 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
>
>  	start = untagged_addr(start);
>
> -	VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
> +	VM_BUG_ON(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN)));
>
>  	/*
>  	 * If FOLL_FORCE is set then do not force a full fault as the hinting
> @@ -1029,7 +1033,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
>  		BUG_ON(*locked != 1);
>  	}
>
> -	if (pages)
> +	/*
> +	 * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> +	 * is to set FOLL_GET if the caller wants pages[] filled in (but has
> +	 * carelessly failed to specify FOLL_GET), so keep doing that, but only
> +	 * for FOLL_GET, not for the newer FOLL_PIN.
> +	 *
> +	 * FOLL_PIN always expects pages to be non-null, but no need to assert
> +	 * that here, as any failures will be obvious enough.
> +	 */
> +	if (pages && !(flags & FOLL_PIN))
>  		flags |= FOLL_GET;
>
>  	pages_done = 0;
> @@ -1166,6 +1179,14 @@ long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
>  		unsigned int gup_flags, struct page **pages,
>  		struct vm_area_struct **vmas, int *locked)
>  {
> +	/*
> +	 * FOLL_PIN must only be set internally by the pin_user_page*() and
> +	 * pin_longterm_*() APIs, never directly by the caller, so enforce that
> +	 * with an assertion:
> +	 */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
> +		return -EINVAL;
> +
>  	/*
>  	 * Current FOLL_LONGTERM behavior is incompatible with
>  	 * FAULT_FLAG_ALLOW_RETRY because of the FS DAX check requirement on
> @@ -1626,6 +1647,14 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
>  		unsigned int gup_flags, struct page **pages,
>  		struct vm_area_struct **vmas)
>  {
> +	/*
> +	 * FOLL_PIN must only be set internally by the pin_user_page*() and
> +	 * pin_longterm_*() APIs, never directly by the caller, so enforce that
> +	 * with an assertion:
> +	 */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
> +		return -EINVAL;
> +
>  	return __gup_longterm_locked(current, current->mm, start, nr_pages,
>  				     pages, vmas, gup_flags | FOLL_TOUCH);
>  }
> @@ -2377,29 +2406,14 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
>  	return ret;
>  }
>
> -/**
> - * get_user_pages_fast() - pin user pages in memory
> - * @start:	starting user address
> - * @nr_pages:	number of pages from start to pin
> - * @gup_flags:	flags modifying pin behaviour
> - * @pages:	array that receives pointers to the pages pinned.
> - *		Should be at least nr_pages long.
> - *
> - * Attempt to pin user pages in memory without taking mm->mmap_sem.
> - * If not successful, it will fall back to taking the lock and
> - * calling get_user_pages().
> - *
> - * Returns number of pages pinned. This may be fewer than the number
> - * requested. If nr_pages is 0 or negative, returns 0. If no pages
> - * were pinned, returns -errno.
> - */
> -int get_user_pages_fast(unsigned long start, int nr_pages,
> -			unsigned int gup_flags, struct page **pages)
> +static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> +					unsigned int gup_flags,
> +					struct page **pages)
>  {
>  	unsigned long addr, len, end;
>  	int nr = 0, ret = 0;
>
> -	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
> +	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM | FOLL_PIN)))
>  		return -EINVAL;
>
>  	start = untagged_addr(start) & PAGE_MASK;
> @@ -2439,4 +2453,208 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
>
>  	return ret;
>  }
> +
> +/**
> + * get_user_pages_fast() - pin user pages in memory
> + * @start:	starting user address
> + * @nr_pages:	number of pages from start to pin
> + * @gup_flags:	flags modifying pin behaviour
> + * @pages:	array that receives pointers to the pages pinned.
> + *		Should be at least nr_pages long.
> + *
> + * Attempt to pin user pages in memory without taking mm->mmap_sem.
> + * If not successful, it will fall back to taking the lock and
> + * calling get_user_pages().
> + *
> + * Returns number of pages pinned. This may be fewer than the number requested.
> + * If nr_pages is 0 or negative, returns 0. If no pages were pinned, returns
> + * -errno.
> + */
> +int get_user_pages_fast(unsigned long start, int nr_pages,
> +			unsigned int gup_flags, struct page **pages)
> +{
> +	/*
> +	 * FOLL_PIN must only be set internally by the pin_user_page*() and
> +	 * pin_longterm_*() APIs, never directly by the caller, so enforce that:
> +	 */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_PIN))
> +		return -EINVAL;
> +
> +	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> +}
>  EXPORT_SYMBOL_GPL(get_user_pages_fast);
> +
> +/**
> + * pin_user_pages_fast() - pin user pages in memory without taking locks
> + *
> + * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
> + * get_user_pages_fast() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for further details.
> + *
> + * This is intended for Case 1 (DIO) in
> + * Documentation/core-api/pin_user_pages.rst. It is NOT intended for Case 2
> + * (RDMA: long-term pins).
> + */
> +int pin_user_pages_fast(unsigned long start, int nr_pages,
> +			unsigned int gup_flags, struct page **pages)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_PIN;
> +	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> +}
> +EXPORT_SYMBOL_GPL(pin_user_pages_fast);
> +
> +/**
> + * pin_longterm_pages_fast() - pin user pages in memory without taking locks
> + *
> + * Nearly the same as get_user_pages_fast(), except that FOLL_PIN and
> + * FOLL_LONGTERM are set. See get_user_pages_fast() for documentation on the
> + * function arguments, because the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpinned will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) of the FOLL_PIN
> + * documentation.
> + */
> +int pin_longterm_pages_fast(unsigned long start, int nr_pages,
> +			    unsigned int gup_flags, struct page **pages)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= (FOLL_PIN | FOLL_LONGTERM);
> +	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> +}
> +EXPORT_SYMBOL_GPL(pin_longterm_pages_fast);
> +
> +/**
> + * pin_user_pages_remote() - pin pages of a remote process (task != current)
> + *
> + * Nearly the same as get_user_pages_remote(), except that FOLL_PIN is set. See
> + * get_user_pages_remote() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in
> + * Documentation/core-api/pin_user_pages.rst. It is NOT intended for Case 2
> + * (RDMA: long-term pins).
> + */
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +			   unsigned long start, unsigned long nr_pages,
> +			   unsigned int gup_flags, struct page **pages,
> +			   struct vm_area_struct **vmas, int *locked)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> +	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> +				       locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages_remote);
> +
> +/**
> + * pin_longterm_pages_remote() - pin pages of a remote process (task != current)
> + *
> + * Nearly the same as get_user_pages_remote(), but note that FOLL_TOUCH is not
> + * set, and FOLL_PIN and FOLL_LONGTERM are set. See get_user_pages_remote() for
> + * documentation on the function arguments, because the arguments here are
> + * identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpinned will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/core-api/pin_user_pages.rst.
> + */
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> +			       unsigned long start, unsigned long nr_pages,
> +			       unsigned int gup_flags, struct page **pages,
> +			       struct vm_area_struct **vmas, int *locked)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_LONGTERM | FOLL_REMOTE | FOLL_PIN;
> +
> +	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> +				       locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_longterm_pages_remote);
> +
> +/**
> + * pin_user_pages() - pin user pages in memory for use by other devices
> + *
> + * Nearly the same as get_user_pages(), except that FOLL_TOUCH is not set, and
> + * FOLL_PIN is set.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in
> + * Documentation/core-api/pin_user_pages.rst. It is NOT intended for Case 2
> + * (RDMA: long-term pins).
> + */
> +long pin_user_pages(unsigned long start, unsigned long nr_pages,
> +		    unsigned int gup_flags, struct page **pages,
> +		    struct vm_area_struct **vmas)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_PIN;
> +	return __gup_longterm_locked(current, current->mm, start, nr_pages,
> +				     pages, vmas, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages);
> +
> +/**
> + * pin_longterm_pages() - pin user pages in memory for long-term use (RDMA,
> + * typically)
> + *
> + * Nearly the same as get_user_pages(), except that FOLL_PIN and FOLL_LONGTERM
> + * are set. See get_user_pages_fast() for documentation on the function
> + * arguments, because the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/core-api/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpinned will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/core-api/pin_user_pages.rst.
> + */
> +long pin_longterm_pages(unsigned long start, unsigned long nr_pages,
> +			unsigned int gup_flags, struct page **pages,
> +			struct vm_area_struct **vmas)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_PIN | FOLL_LONGTERM;
> +	return __gup_longterm_locked(current, current->mm, start, nr_pages,
> +				     pages, vmas, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_longterm_pages);
> --
> 2.24.0
>

--
Sincerely yours,
Mike.
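
For readers following the "Tracking dma-pinned pages" section quoted above,
here is a minimal user-space sketch of the biased-refcount arithmetic it
describes. It is only a model, not the kernel implementation: struct
model_page and every model_*() helper are hypothetical names invented for
illustration, and the only value taken from the patch text is
GUP_PIN_COUNTING_BIAS == 1024.

```c
#include <assert.h>
#include <stdbool.h>

/* Value quoted in the patch documentation: 1024, i.e. 10 bits. */
#define GUP_PIN_COUNTING_BIAS 1024

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct model_page {
	int refcount;
};

/* Ordinary reference, as taken by get_page(): +1. */
static inline void model_get_page(struct model_page *p)
{
	p->refcount += 1;
}

/* FOLL_PIN-style reference: add the whole bias instead of 1. */
static inline void model_pin_page(struct model_page *p)
{
	p->refcount += GUP_PIN_COUNTING_BIAS;
}

/* Release of a FOLL_PIN-style reference: subtract the bias again. */
static inline void model_unpin_page(struct model_page *p)
{
	assert(p->refcount >= GUP_PIN_COUNTING_BIAS);
	p->refcount -= GUP_PIN_COUNTING_BIAS;
}

/*
 * "Is this page dma-pinned?" in this model: true once the count reaches
 * the bias. False positives are possible (1024 plain references look like
 * one pin); false negatives are not.
 */
static inline bool model_page_dma_pinned(const struct model_page *p)
{
	return p->refcount >= GUP_PIN_COUNTING_BIAS;
}

/*
 * Spare counter bits for a compound page with 2^tail_bits tail pages:
 * 31 - 10 == 21 bits are available, minus the bits used up by the tail
 * pages.
 */
static inline int model_spare_pin_bits(int tail_bits)
{
	return (31 - 10) - tail_bits;
}
```

With a page that starts at refcount 1, one model_pin_page() call gives 1025
and the page reports as pinned; after model_unpin_page() it no longer does.
Calling model_get_page() 1024 times on the same page makes it report as
pinned again, which is the acceptable false positive the document mentions.
For a 1GB huge page made of 4 KB pages, model_spare_pin_bits(18) is 3, which
matches the "about 8 references" estimate in the quoted text.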
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.5 required=3.0 tests=INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 620E3C17448 for ; Tue, 12 Nov 2019 06:51:21 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 41C8521925 for ; Tue, 12 Nov 2019 06:51:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 41C8521925 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C32376E79A; Tue, 12 Nov 2019 06:51:19 +0000 (UTC) Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by gabe.freedesktop.org (Postfix) with ESMTPS id C9A956E79A for ; Tue, 12 Nov 2019 06:51:17 +0000 (UTC) Received: from rapoport-lnx (unknown [195.57.117.247]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 123CD2084F; Tue, 12 Nov 2019 06:51:08 +0000 (UTC) Date: Tue, 12 Nov 2019 07:51:05 +0100 From: Mike Rapoport To: John Hubbard Subject: Re: [PATCH v3 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN Message-ID: <20191112065103.GA1209@rapoport-lnx> References: <20191112000700.3455038-1-jhubbard@nvidia.com> <20191112000700.3455038-10-jhubbard@nvidia.com> MIME-Version: 1.0 Content-Disposition: 
inline In-Reply-To: <20191112000700.3455038-10-jhubbard@nvidia.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Mailman-Original-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1573541477; bh=Xrd0YnKaYbVVFQLPrYOT47qBaf64xGATzmWkzzpLmiA=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=B8jDkQZ6Yaop7bTCojs7RpRcs7qzAYkt7UQtAShffH1VroOthzb0lxXZ+Uqi9E2C9 L+fU7mWBQog4EI8R3urhYTrKWVhcrv1YybqFVDmiAVLKq35FK2BQV3cS+FpmDp9EJt Prf9mTDXl8/QHBhyteLCU6UJyRMnvTcnUn7JMsuk= X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Hocko , Jan Kara , kvm@vger.kernel.org, linux-doc@vger.kernel.org, David Airlie , Dave Chinner , dri-devel@lists.freedesktop.org, LKML , linux-mm@kvack.org, Paul Mackerras , linux-kselftest@vger.kernel.org, Ira Weiny , Jonathan Corbet , linux-rdma@vger.kernel.org, Michael Ellerman , Christoph Hellwig , Jason Gunthorpe , Vlastimil Babka , =?iso-8859-1?Q?Bj=F6rn_T=F6pel?= , linux-media@vger.kernel.org, Shuah Khan , linux-block@vger.kernel.org, =?iso-8859-1?B?Suly9G1l?= Glisse , Al Viro , Dan Williams , Mauro Carvalho Chehab , bpf@vger.kernel.org, Magnus Karlsson , Jens Axboe , netdev@vger.kernel.org, Alex Williamson , linux-fsdevel@vger.kernel.org, Andrew Morton , linuxppc-dev@lists.ozlabs.org, "David S . 
Miller" , Mike Kravetz Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Message-ID: <20191112065105.2K0LhnM9juP0RR63dGGWrRRMFFGw8AM05fMjscXocMs@z> T24gTW9uLCBOb3YgMTEsIDIwMTkgYXQgMDQ6MDY6NDZQTSAtMDgwMCwgSm9obiBIdWJiYXJkIHdy b3RlOgo+IEludHJvZHVjZSBwaW5fdXNlcl9wYWdlcyooKSB2YXJpYXRpb25zIG9mIGdldF91c2Vy X3BhZ2VzKigpIGNhbGxzLAo+IGFuZCBhbHNvIHBpbl9sb25ndGVybV9wYWdlcyooKSB2YXJpYXRp b25zLgo+IAo+IFRoZXNlIHZhcmlhbnRzIGFsbCBzZXQgRk9MTF9QSU4sIHdoaWNoIGlzIGFsc28g aW50cm9kdWNlZCwgYW5kCj4gdGhvcm91Z2hseSBkb2N1bWVudGVkLgo+IAo+IFRoZSBwaW5fbG9u Z3Rlcm0qKCkgdmFyaWFudHMgYWxzbyBzZXQgRk9MTF9MT05HVEVSTSwgaW4gYWRkaXRpb24KPiB0 byBGT0xMX1BJTjoKPiAKPiAgICAgcGluX3VzZXJfcGFnZXMoKQo+ICAgICBwaW5fdXNlcl9wYWdl c19yZW1vdGUoKQo+ICAgICBwaW5fdXNlcl9wYWdlc19mYXN0KCkKPiAKPiAgICAgcGluX2xvbmd0 ZXJtX3BhZ2VzKCkKPiAgICAgcGluX2xvbmd0ZXJtX3BhZ2VzX3JlbW90ZSgpCj4gICAgIHBpbl9s b25ndGVybV9wYWdlc19mYXN0KCkKPiAKPiBBbGwgcGFnZXMgdGhhdCBhcmUgcGlubmVkIHZpYSB0 aGUgYWJvdmUgY2FsbHMsIG11c3QgYmUgdW5waW5uZWQgdmlhCj4gcHV0X3VzZXJfcGFnZSgpLgo+ IAo+IFRoZSB1bmRlcmx5aW5nIHJ1bGVzIGFyZToKPiAKPiAqIFRoZXNlIGFyZSBndXAtaW50ZXJu YWwgZmxhZ3MsIHNvIHRoZSBjYWxsIHNpdGVzIHNob3VsZCBub3QgZGlyZWN0bHkKPiBzZXQgRk9M TF9QSU4gbm9yIEZPTExfTE9OR1RFUk0uIFRoYXQgYmVoYXZpb3IgaXMgZW5mb3JjZWQgd2l0aAo+ IGFzc2VydGlvbnMsIGZvciB0aGUgbmV3IEZPTExfUElOIGZsYWcuIEhvd2V2ZXIsIGZvciB0aGUg cHJlLWV4aXN0aW5nCj4gRk9MTF9MT05HVEVSTSBmbGFnLCB3aGljaCBoYXMgc29tZSBjYWxsIHNp dGVzIHRoYXQgc3RpbGwgZGlyZWN0bHkKPiBzZXQgRk9MTF9MT05HVEVSTSwgdGhlcmUgaXMgbm8g YXNzZXJ0aW9uIHlldC4KPiAKPiAqIENhbGwgc2l0ZXMgdGhhdCB3YW50IHRvIGluZGljYXRlIHRo YXQgdGhleSBhcmUgZ29pbmcgdG8gZG8gRGlyZWN0SU8KPiAgICgiRElPIikgb3Igc29tZXRoaW5n IHdpdGggc2ltaWxhciBjaGFyYWN0ZXJpc3RpY3MsIHNob3VsZCBjYWxsIGEKPiAgIGdldF91c2Vy X3BhZ2VzKCktbGlrZSB3cmFwcGVyIGNhbGwgdGhhdCBzZXRzIEZPTExfUElOLiBUaGVzZSB3cmFw cGVycwo+ICAgd2lsbDoKPiAgICAgICAgICogU3RhcnQgd2l0aCAicGluX3VzZXJfcGFnZXMiIGlu 
c3RlYWQgb2YgImdldF91c2VyX3BhZ2VzIi4gVGhhdAo+ICAgICAgICAgICBtYWtlcyBpdCBlYXN5 IHRvIGZpbmQgYW5kIGF1ZGl0IHRoZSBjYWxsIHNpdGVzLgo+ICAgICAgICAgKiBTZXQgRk9MTF9Q SU4KPiAKPiAqIEZvciBwYWdlcyB0aGF0IGFyZSByZWNlaXZlZCB2aWEgRk9MTF9QSU4sIHRob3Nl IHBhZ2VzIG11c3QgYmUgcmV0dXJuZWQKPiAgIHZpYSBwdXRfdXNlcl9wYWdlKCkuCj4gCj4gVGhh bmtzIHRvIEphbiBLYXJhIGFuZCBWbGFzdGltaWwgQmFia2EgZm9yIGV4cGxhaW5pbmcgdGhlIDQg Y2FzZXMKPiBpbiB0aGlzIGRvY3VtZW50YXRpb24uIChJJ3ZlIHJld29yZGVkIGl0IGFuZCBleHBh bmRlZCB1cG9uIGl0LikKPiAKPiBSZXZpZXdlZC1ieTogSsOpcsO0bWUgR2xpc3NlIDxqZ2xpc3Nl QHJlZGhhdC5jb20+Cj4gQ2M6IE1pa2UgUmFwb3BvcnQgPHJwcHRAa2VybmVsLm9yZz4KPiBDYzog Sm9uYXRoYW4gQ29yYmV0IDxjb3JiZXRAbHduLm5ldD4KPiBDYzogSXJhIFdlaW55IDxpcmEud2Vp bnlAaW50ZWwuY29tPgo+IFNpZ25lZC1vZmYtYnk6IEpvaG4gSHViYmFyZCA8amh1YmJhcmRAbnZp ZGlhLmNvbT4KPiAtLS0KClJldmlld2VkLWJ5OiBNaWtlIFJhcG9wb3J0IDxycHB0QGxpbnV4Lmli bS5jb20+ICAjIERvY3VtZW50YXRpb24KCj4gIERvY3VtZW50YXRpb24vY29yZS1hcGkvaW5kZXgu cnN0ICAgICAgICAgIHwgICAxICsKPiAgRG9jdW1lbnRhdGlvbi9jb3JlLWFwaS9waW5fdXNlcl9w YWdlcy5yc3QgfCAyMTggKysrKysrKysrKysrKysrKysrCj4gIGluY2x1ZGUvbGludXgvbW0uaCAg ICAgICAgICAgICAgICAgICAgICAgIHwgIDYyICsrKysrLQo+ICBtbS9ndXAuYyAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICB8IDI2MCArKysrKysrKysrKysrKysrKysrKy0tCj4gIDQg ZmlsZXMgY2hhbmdlZCwgNTE0IGluc2VydGlvbnMoKyksIDI3IGRlbGV0aW9ucygtKQo+ICBjcmVh dGUgbW9kZSAxMDA2NDQgRG9jdW1lbnRhdGlvbi9jb3JlLWFwaS9waW5fdXNlcl9wYWdlcy5yc3QK PiAKPiBkaWZmIC0tZ2l0IGEvRG9jdW1lbnRhdGlvbi9jb3JlLWFwaS9pbmRleC5yc3QgYi9Eb2N1 bWVudGF0aW9uL2NvcmUtYXBpL2luZGV4LnJzdAo+IGluZGV4IGFiMGVhZTFjMTUzYS4uNDEzZjdk N2M4NjQyIDEwMDY0NAo+IC0tLSBhL0RvY3VtZW50YXRpb24vY29yZS1hcGkvaW5kZXgucnN0Cj4g KysrIGIvRG9jdW1lbnRhdGlvbi9jb3JlLWFwaS9pbmRleC5yc3QKPiBAQCAtMzEsNiArMzEsNyBA QCBDb3JlIHV0aWxpdGllcwo+ICAgICBnZW5lcmljLXJhZGl4LXRyZWUKPiAgICAgbWVtb3J5LWFs bG9jYXRpb24KPiAgICAgbW0tYXBpCj4gKyAgIHBpbl91c2VyX3BhZ2VzCj4gICAgIGdmcF9tYXNr LWZyb20tZnMtaW8KPiAgICAgdGltZWtlZXBpbmcKPiAgICAgYm9vdC10aW1lLW1tCj4gZGlmZiAt 
LWdpdCBhL0RvY3VtZW50YXRpb24vY29yZS1hcGkvcGluX3VzZXJfcGFnZXMucnN0IGIvRG9jdW1l bnRhdGlvbi9jb3JlLWFwaS9waW5fdXNlcl9wYWdlcy5yc3QKPiBuZXcgZmlsZSBtb2RlIDEwMDY0 NAo+IGluZGV4IDAwMDAwMDAwMDAwMC4uY2U4MTllNzA5NDM1Cj4gLS0tIC9kZXYvbnVsbAo+ICsr KyBiL0RvY3VtZW50YXRpb24vY29yZS1hcGkvcGluX3VzZXJfcGFnZXMucnN0Cj4gQEAgLTAsMCAr MSwyMTggQEAKPiArLi4gU1BEWC1MaWNlbnNlLUlkZW50aWZpZXI6IEdQTC0yLjAKPiArCj4gKz09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KPiArcGlu X3VzZXJfcGFnZXMoKSBhbmQgcmVsYXRlZCBjYWxscwo+ICs9PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Cj4gKwo+ICsuLiBjb250ZW50czo6IDpsb2Nh bDoKPiArCj4gK092ZXJ2aWV3Cj4gKz09PT09PT09Cj4gKwo+ICtUaGlzIGRvY3VtZW50IGRlc2Ny aWJlcyB0aGUgZm9sbG93aW5nIGZ1bmN0aW9uczogOjoKPiArCj4gKyBwaW5fdXNlcl9wYWdlcwo+ ICsgcGluX3VzZXJfcGFnZXNfZmFzdAo+ICsgcGluX3VzZXJfcGFnZXNfcmVtb3RlCj4gKwo+ICsg cGluX2xvbmd0ZXJtX3BhZ2VzCj4gKyBwaW5fbG9uZ3Rlcm1fcGFnZXNfZmFzdAo+ICsgcGluX2xv bmd0ZXJtX3BhZ2VzX3JlbW90ZQo+ICsKPiArQmFzaWMgZGVzY3JpcHRpb24gb2YgRk9MTF9QSU4K PiArPT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KPiArCj4gK0ZPTExfUElOIGFuZCBGT0xM X0xPTkdURVJNIGFyZSBmbGFncyB0aGF0IGNhbiBiZSBwYXNzZWQgdG8gdGhlIGdldF91c2VyX3Bh Z2VzKigpCj4gKygiZ3VwIikgZmFtaWx5IG9mIGZ1bmN0aW9ucy4gRk9MTF9QSU4gaGFzIHNpZ25p ZmljYW50IGludGVyYWN0aW9ucyBhbmQKPiAraW50ZXJkZXBlbmRlbmNpZXMgd2l0aCBGT0xMX0xP TkdURVJNLCBzbyBib3RoIGFyZSBjb3ZlcmVkIGhlcmUuCj4gKwo+ICtCb3RoIEZPTExfUElOIGFu ZCBGT0xMX0xPTkdURVJNIGFyZSBpbnRlcm5hbCB0byBndXAsIG1lYW5pbmcgdGhhdCBuZWl0aGVy Cj4gK0ZPTExfUElOIG5vciBGT0xMX0xPTkdURVJNIHNob3VsZCBub3QgYXBwZWFyIGF0IHRoZSBn dXAgY2FsbCBzaXRlcy4gVGhpcyBhbGxvd3MKPiArdGhlIGFzc29jaWF0ZWQgd3JhcHBlciBmdW5j dGlvbnMgIChwaW5fdXNlcl9wYWdlcygpIGFuZCBvdGhlcnMpIHRvIHNldCB0aGUKPiArY29ycmVj dCBjb21iaW5hdGlvbiBvZiB0aGVzZSBmbGFncywgYW5kIHRvIGNoZWNrIGZvciBwcm9ibGVtcyBh cyB3ZWxsLgo+ICsKPiArRk9MTF9QSU4gYW5kIEZPTExfR0VUIGFyZSBtdXR1YWxseSBleGNsdXNp dmUgZm9yIGEgZ2l2ZW4gZ3VwIGNhbGwuIEhvd2V2ZXIsCj4gK211bHRpcGxlIHRocmVhZHMgYW5k 
IGNhbGwgc2l0ZXMgYXJlIGZyZWUgdG8gcGluIHRoZSBzYW1lIHN0cnVjdCBwYWdlcywgdmlhIGJv dGgKPiArRk9MTF9QSU4gYW5kIEZPTExfR0VULiBJdCdzIGp1c3QgdGhlIGNhbGwgc2l0ZSB0aGF0 IG5lZWRzIHRvIGNob29zZSBvbmUgb3IgdGhlCj4gK290aGVyLCBub3QgdGhlIHN0cnVjdCBwYWdl KHMpLgo+ICsKPiArVGhlIEZPTExfUElOIGltcGxlbWVudGF0aW9uIGlzIG5lYXJseSB0aGUgc2Ft ZSBhcyBGT0xMX0dFVCwgZXhjZXB0IHRoYXQgRk9MTF9QSU4KPiArdXNlcyBhIGRpZmZlcmVudCBy ZWZlcmVuY2UgY291bnRpbmcgdGVjaG5pcXVlLgo+ICsKPiArRk9MTF9QSU4gaXMgYSBwcmVyZXF1 aXNpdGUgdG8gRk9MTF9MT05HVEdFUk0uIEFub3RoZXIgd2F5IG9mIHNheWluZyB0aGF0IGlzLAo+ ICtGT0xMX0xPTkdURVJNIGlzIGEgc3BlY2lmaWMgY2FzZSwgbW9yZSByZXN0cmljdGl2ZSBjYXNl IG9mIEZPTExfUElOLgo+ICsKPiArV2hpY2ggZmxhZ3MgYXJlIHNldCBieSBlYWNoIHdyYXBwZXIK PiArPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KPiArCj4gK09ubHkgRk9MTF9Q SU4gYW5kIEZPTExfTE9OR1RFUk0gYXJlIGNvdmVyZWQgaGVyZS4gVGhlc2UgZmxhZ3MgYXJlIGFk ZGVkIHRvCj4gK3doYXRldmVyIGZsYWdzIHRoZSBjYWxsZXIgcHJvdmlkZXM6Ogo+ICsKPiArIEZ1 bmN0aW9uICAgICAgICAgICAgICAgICAgICBndXAgZmxhZ3MgKEZPTExfUElOIG9yIEZPTExfTE9O R1RFUk0gb25seSkKPiArIC0tLS0tLS0tICAgICAgICAgICAgICAgICAgICAtLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0KPiArIHBpbl91c2VyX3BhZ2VzICAgICAgICAg ICAgICBGT0xMX1BJTgo+ICsgcGluX3VzZXJfcGFnZXNfZmFzdCAgICAgICAgIEZPTExfUElOCj4g KyBwaW5fdXNlcl9wYWdlc19yZW1vdGUgICAgICAgRk9MTF9QSU4KPiArCj4gKyBwaW5fbG9uZ3Rl cm1fcGFnZXMgICAgICAgICAgRk9MTF9QSU4gfCBGT0xMX0xPTkdURVJNCj4gKyBwaW5fbG9uZ3Rl cm1fcGFnZXNfZmFzdCAgICAgRk9MTF9QSU4gfCBGT0xMX0xPTkdURVJNCj4gKyBwaW5fbG9uZ3Rl cm1fcGFnZXNfcmVtb3RlICAgRk9MTF9QSU4gfCBGT0xMX0xPTkdURVJNCj4gKwo+ICtUcmFja2lu ZyBkbWEtcGlubmVkIHBhZ2VzCj4gKz09PT09PT09PT09PT09PT09PT09PT09PT0KPiArCj4gK1Nv bWUgb2YgdGhlIGtleSBkZXNpZ24gY29uc3RyYWludHMsIGFuZCBzb2x1dGlvbnMsIGZvciB0cmFj a2luZyBkbWEtcGlubmVkCj4gK3BhZ2VzOgo+ICsKPiArKiBBbiBhY3R1YWwgcmVmZXJlbmNlIGNv dW50LCBwZXIgc3RydWN0IHBhZ2UsIGlzIHJlcXVpcmVkLiBUaGlzIGlzIGJlY2F1c2UKPiArICBt dWx0aXBsZSBwcm9jZXNzZXMgbWF5IHBpbiBhbmQgdW5waW4gYSBwYWdlLgo+ICsKPiArKiBGYWxz 
ZSBwb3NpdGl2ZXMgKHJlcG9ydGluZyB0aGF0IGEgcGFnZSBpcyBkbWEtcGlubmVkLCB3aGVuIGlu IGZhY3QgaXQgaXMgbm90KQo+ICsgIGFyZSBhY2NlcHRhYmxlLCBidXQgZmFsc2UgbmVnYXRpdmVz IGFyZSBub3QuCj4gKwo+ICsqIHN0cnVjdCBwYWdlIG1heSBub3QgYmUgaW5jcmVhc2VkIGluIHNp emUgZm9yIHRoaXMsIGFuZCBhbGwgZmllbGRzIGFyZSBhbHJlYWR5Cj4gKyAgdXNlZC4KPiArCj4g KyogR2l2ZW4gdGhlIGFib3ZlLCB3ZSBjYW4gb3ZlcmxvYWQgdGhlIHBhZ2UtPl9yZWZjb3VudCBm aWVsZCBieSB1c2luZywgc29ydCBvZiwKPiArICB0aGUgdXBwZXIgYml0cyBpbiB0aGF0IGZpZWxk IGZvciBhIGRtYS1waW5uZWQgY291bnQuICJTb3J0IG9mIiwgbWVhbnMgdGhhdCwKPiArICByYXRo ZXIgdGhhbiBkaXZpZGluZyBwYWdlLT5fcmVmY291bnQgaW50byBiaXQgZmllbGRzLCB3ZSBzaW1w bGUgYWRkIGEgbWVkaXVtLQo+ICsgIGxhcmdlIHZhbHVlIChHVVBfUElOX0NPVU5USU5HX0JJQVMs IGluaXRpYWxseSBjaG9zZW4gdG8gYmUgMTAyNDogMTAgYml0cykgdG8KPiArICBwYWdlLT5fcmVm Y291bnQuIFRoaXMgcHJvdmlkZXMgZnV6enkgYmVoYXZpb3I6IGlmIGEgcGFnZSBoYXMgZ2V0X3Bh Z2UoKSBjYWxsZWQKPiArICBvbiBpdCAxMDI0IHRpbWVzLCB0aGVuIGl0IHdpbGwgYXBwZWFyIHRv IGhhdmUgYSBzaW5nbGUgZG1hLXBpbm5lZCBjb3VudC4KPiArICBBbmQgYWdhaW4sIHRoYXQncyBh Y2NlcHRhYmxlLgo+ICsKPiArVGhpcyBhbHNvIGxlYWRzIHRvIGxpbWl0YXRpb25zOiB0aGVyZSBh cmUgb25seSAzMS0xMD09MjEgYml0cyBhdmFpbGFibGUgZm9yIGEKPiArY291bnRlciB0aGF0IGlu Y3JlbWVudHMgMTAgYml0cyBhdCBhIHRpbWUuCj4gKwo+ICtUT0RPOiBmb3IgMUdCIGFuZCBsYXJn ZXIgaHVnZSBwYWdlcywgdGhpcyBpcyBjdXR0aW5nIGl0IGNsb3NlLiBUaGF0J3MgYmVjYXVzZQo+ ICt3aGVuIHBpbl91c2VyX3BhZ2VzKCkgZm9sbG93cyBzdWNoIHBhZ2VzLCBpdCBpbmNyZW1lbnRz IHRoZSBoZWFkIHBhZ2UgYnkgIjEiCj4gKyh3aGVyZSAiMSIgdXNlZCB0byBtZWFuICIrMSIgZm9y IGdldF91c2VyX3BhZ2VzKCksIGJ1dCBub3cgbWVhbnMgIisxMDI0IiBmb3IKPiArcGluX3VzZXJf cGFnZXMoKSkgZm9yIGVhY2ggdGFpbCBwYWdlLiBTbyBpZiB5b3UgaGF2ZSBhIDFHQiBodWdlIHBh Z2U6Cj4gKwo+ICsqIFRoZXJlIGFyZSAyNTZLICgxOCBiaXRzKSB3b3J0aCBvZiA0IEtCIHRhaWwg cGFnZXMuCj4gKyogVGhlcmUgYXJlIDIxIGJpdHMgYXZhaWxhYmxlIHRvIGNvdW50IHVwIHZpYSBH VVBfUElOX0NPVU5USU5HX0JJQVMgKHRoYXQgaXMsCj4gKyAgMTAgYml0cyBhdCBhIHRpbWUpCj4g KyogVGhlcmUgYXJlIDIxIC0gMTggPT0gMyBiaXRzIGF2YWlsYWJsZSB0byBjb3VudC4gRXhjZXB0 
IHRoYXQgdGhlcmUgYXJlbid0LAo+ICsgIGJlY2F1c2UgeW91IG5lZWQgdG8gYWxsb3cgZm9yIGEg ZmV3IG5vcm1hbCBnZXRfcGFnZSgpIGNhbGxzIG9uIHRoZSBoZWFkIHBhZ2UsCj4gKyAgYXMgd2Vs bC4gRm9ydHVuYXRlbHksIHRoZSBhcHByb2FjaCBvZiB1c2luZyBhZGRpdGlvbiwgcmF0aGVyIHRo YW4gImhhcmQiCj4gKyAgYml0ZmllbGRzLCB3aXRoaW4gcGFnZS0+X3JlZmNvdW50LCBhbGxvd3Mg Zm9yIHNoYXJpbmcgdGhlc2UgYml0cyBncmFjZWZ1bGx5Lgo+ICsgIEJ1dCB3ZSdyZSBzdGlsbCBs b29raW5nIGF0IGFib3V0IDggcmVmZXJlbmNlcy4KPiArCj4gK1RoaXMsIGhvd2V2ZXIsIGlzIGEg bWlzc2luZyBmZWF0dXJlIG1vcmUgdGhhbiBhbnl0aGluZyBlbHNlLCBiZWNhdXNlIGl0J3MgZWFz aWx5Cj4gK3NvbHZlZCBieSBhZGRyZXNzaW5nIGFuIG9idmlvdXMgaW5lZmZpY2llbmN5IGluIHRo ZSBvcmlnaW5hbCBnZXRfdXNlcl9wYWdlcygpCj4gK2FwcHJvYWNoIG9mIHJldHJpZXZpbmcgcGFn ZXM6IHN0b3AgdHJlYXRpbmcgYWxsIHRoZSBwYWdlcyBhcyBpZiB0aGV5IHdlcmUKPiArUEFHRV9T SVpFLiBSZXRyaWV2ZSBodWdlIHBhZ2VzIGFzIGh1Z2UgcGFnZXMuIFRoZSBjYWxsZXJzIG5lZWQg dG8gYmUgYXdhcmUgb2YKPiArdGhpcywgc28gc29tZSB3b3JrIGlzIHJlcXVpcmVkLiBPbmNlIHRo YXQncyBpbiBwbGFjZSwgdGhpcyBsaW1pdGF0aW9uIG1vc3RseQo+ICtkaXNhcHBlYXJzIGZyb20g dmlldywgYmVjYXVzZSB0aGVyZSB3aWxsIGJlIGFtcGxlIHJlZmNvdW50aW5nIHJhbmdlIGF2YWls YWJsZS4KPiArCj4gKyogQ2FsbGVycyBtdXN0IHNwZWNpZmljYWxseSByZXF1ZXN0ICJkbWEtcGlu bmVkIHRyYWNraW5nIG9mIHBhZ2VzIi4gSW4gb3RoZXIKPiArICB3b3JkcywganVzdCBjYWxsaW5n IGdldF91c2VyX3BhZ2VzKCkgd2lsbCBub3Qgc3VmZmljZTsgYSBuZXcgc2V0IG9mIGZ1bmN0aW9u cywKPiArICBwaW5fdXNlcl9wYWdlKCkgYW5kIHJlbGF0ZWQsIG11c3QgYmUgdXNlZC4KPiArCj4g K0ZPTExfUElOLCBGT0xMX0dFVCwgRk9MTF9MT05HVEVSTTogd2hlbiB0byB1c2Ugd2hpY2ggZmxh Z3MKPiArPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PQo+ICsKPiArVGhhbmtzIHRvIEphbiBLYXJhLCBWbGFzdGltaWwgQmFia2EgYW5kIHNl dmVyYWwgb3RoZXIgLW1tIHBlb3BsZSwgZm9yIGRlc2NyaWJpbmcKPiArdGhlc2UgY2F0ZWdvcmll czoKPiArCj4gK0NBU0UgMTogRGlyZWN0IElPIChESU8pCj4gKy0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tCj4gK1RoZXJlIGFyZSBHVVAgcmVmZXJlbmNlcyB0byBwYWdlcyB0aGF0IGFyZSBzZXJ2aW5n Cj4gK2FzIERJTyBidWZmZXJzLiBUaGVzZSBidWZmZXJzIGFyZSBuZWVkZWQgZm9yIGEgcmVsYXRp 
dmVseSBzaG9ydCB0aW1lIChzbyB0aGV5Cj4gK2FyZSBub3QgImxvbmcgdGVybSIpLiBObyBzcGVj aWFsIHN5bmNocm9uaXphdGlvbiB3aXRoIHBhZ2VfbWtjbGVhbigpIG9yCj4gK211bm1hcCgpIGlz IHByb3ZpZGVkLiBUaGVyZWZvcmUsIGZsYWdzIHRvIHNldCBhdCB0aGUgY2FsbCBzaXRlIGFyZTog OjoKPiArCj4gKyAgICBGT0xMX1BJTgo+ICsKPiArLi4uYnV0IHJhdGhlciB0aGFuIHNldHRpbmcg Rk9MTF9QSU4gZGlyZWN0bHksIGNhbGwgc2l0ZXMgc2hvdWxkIHVzZSBvbmUgb2YKPiArdGhlIHBp bl91c2VyX3BhZ2VzKigpIHJvdXRpbmVzIHRoYXQgc2V0IEZPTExfUElOLgo+ICsKPiArQ0FTRSAy OiBSRE1BCj4gKy0tLS0tLS0tLS0tLQo+ICtUaGVyZSBhcmUgR1VQIHJlZmVyZW5jZXMgdG8gcGFn ZXMgdGhhdCBhcmUgc2VydmluZyBhcyBETUEKPiArYnVmZmVycy4gVGhlc2UgYnVmZmVycyBhcmUg bmVlZGVkIGZvciBhIGxvbmcgdGltZSAoImxvbmcgdGVybSIpLiBObyBzcGVjaWFsCj4gK3N5bmNo cm9uaXphdGlvbiB3aXRoIHBhZ2VfbWtjbGVhbigpIG9yIG11bm1hcCgpIGlzIHByb3ZpZGVkLiBU aGVyZWZvcmUsIGZsYWdzCj4gK3RvIHNldCBhdCB0aGUgY2FsbCBzaXRlIGFyZTogOjoKPiArCj4g KyAgICBGT0xMX1BJTiB8IEZPTExfTE9OR1RFUk0KPiArCj4gK05PVEU6IFNvbWUgcGFnZXMsIHN1 Y2ggYXMgREFYIHBhZ2VzLCBjYW5ub3QgYmUgcGlubmVkIHdpdGggbG9uZ3Rlcm0gcGlucy4gVGhh dCdzCj4gK2JlY2F1c2UgREFYIHBhZ2VzIGRvIG5vdCBoYXZlIGEgc2VwYXJhdGUgcGFnZSBjYWNo ZSwgYW5kIHNvICJwaW5uaW5nIiBpbXBsaWVzCj4gK2xvY2tpbmcgZG93biBmaWxlIHN5c3RlbSBi bG9ja3MsIHdoaWNoIGlzIG5vdCAoeWV0KSBzdXBwb3J0ZWQgaW4gdGhhdCB3YXkuCj4gKwo+ICtD QVNFIDM6IEhhcmR3YXJlIHdpdGggcGFnZSBmYXVsdGluZyBzdXBwb3J0Cj4gKy0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0KPiArSGVyZSwgYSB3ZWxsLXdyaXR0ZW4g ZHJpdmVyIGRvZXNuJ3Qgbm9ybWFsbHkgbmVlZCB0byBwaW4gcGFnZXMgYXQgYWxsLiBIb3dldmVy LAo+ICtpZiB0aGUgZHJpdmVyIGRvZXMgY2hvb3NlIHRvIGRvIHNvLCBpdCBjYW4gcmVnaXN0ZXIg TU1VIG5vdGlmaWVycyBmb3IgdGhlIHJhbmdlLAo+ICthbmQgd2lsbCBiZSBjYWxsZWQgYmFjayB1 cG9uIGludmFsaWRhdGlvbi4gRWl0aGVyIHdheSAoYXZvaWRpbmcgcGFnZSBwaW5uaW5nLCBvcgo+ ICt1c2luZyBNTVUgbm90aWZpZXJzIHRvIHVucGluIHVwb24gcmVxdWVzdCksIHRoZXJlIGlzIHBy b3BlciBzeW5jaHJvbml6YXRpb24gd2l0aAo+ICtib3RoIGZpbGVzeXN0ZW0gYW5kIG1tIChwYWdl X21rY2xlYW4oKSwgbXVubWFwKCksIGV0YykuCj4gKwo+ICtUaGVyZWZvcmUsIG5laXRoZXIgZmxh 
ZyBuZWVkcyB0byBiZSBzZXQuCj4gKwo+ICtJbiB0aGlzIGNhc2UsIGlkZWFsbHksIG5laXRoZXIg Z2V0X3VzZXJfcGFnZXMoKSBub3IgcGluX3VzZXJfcGFnZXMoKSBzaG91bGQgYmUKPiArY2FsbGVk LiBJbnN0ZWFkLCB0aGUgc29mdHdhcmUgc2hvdWxkIGJlIHdyaXR0ZW4gc28gdGhhdCBpdCBkb2Vz IG5vdCBwaW4gcGFnZXMuCj4gK1RoaXMgYWxsb3dzIG1tIGFuZCBmaWxlc3lzdGVtcyB0byBvcGVy YXRlIG1vcmUgZWZmaWNpZW50bHkgYW5kIHJlbGlhYmx5Lgo+ICsKPiArQ0FTRSA0OiBQaW5uaW5n IGZvciBzdHJ1Y3QgcGFnZSBtYW5pcHVsYXRpb24gb25seQo+ICstLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tCj4gK0hlcmUsIG5vcm1hbCBHVVAgY2FsbHMg YXJlIHN1ZmZpY2llbnQsIHNvIG5laXRoZXIgZmxhZyBuZWVkcyB0byBiZSBzZXQuCj4gKwo+ICtw YWdlX2RtYV9waW5uZWQoKTogdGhlIHdob2xlIHBvaW50IG9mIHBpbm5pbmcKPiArPT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Cj4gKwo+ICtUaGUgd2hvbGUgcG9p bnQgb2YgbWFya2luZyBwYWdlcyBhcyAiRE1BLXBpbm5lZCIgb3IgImd1cC1waW5uZWQiIGlzIHRv IGJlIGFibGUKPiArdG8gcXVlcnksICJpcyB0aGlzIHBhZ2UgRE1BLXBpbm5lZD8iIFRoYXQgYWxs b3dzIGNvZGUgc3VjaCBhcyBwYWdlX21rY2xlYW4oKQo+ICsoYW5kIGZpbGUgc3lzdGVtIHdyaXRl YmFjayBjb2RlIGluIGdlbmVyYWwpIHRvIG1ha2UgaW5mb3JtZWQgZGVjaXNpb25zIGFib3V0Cj4g K3doYXQgdG8gZG8gd2hlbiBhIHBhZ2UgY2Fubm90IGJlIHVubWFwcGVkIGR1ZSB0byBzdWNoIHBp bnMuCj4gKwo+ICtXaGF0IHRvIGRvIGluIHRob3NlIGNhc2VzIGlzIHRoZSBzdWJqZWN0IG9mIGEg eWVhcnMtbG9uZyBzZXJpZXMgb2YgZGlzY3Vzc2lvbnMKPiArYW5kIGRlYmF0ZXMgKHNlZSB0aGUg UmVmZXJlbmNlcyBhdCB0aGUgZW5kIG9mIHRoaXMgZG9jdW1lbnQpLiBJdCdzIGEgVE9ETyBpdGVt Cj4gK2hlcmU6IGZpbGwgaW4gdGhlIGRldGFpbHMgb25jZSB0aGF0J3Mgd29ya2VkIG91dC4gTWVh bndoaWxlLCBpdCdzIHNhZmUgdG8gc2F5Cj4gK3RoYXQgaGF2aW5nIHRoaXMgYXZhaWxhYmxlOiA6 Ogo+ICsKPiArICAgICAgICBzdGF0aWMgaW5saW5lIGJvb2wgcGFnZV9kbWFfcGlubmVkKHN0cnVj dCBwYWdlICpwYWdlKQo+ICsKPiArLi4uaXMgYSBwcmVyZXF1aXNpdGUgdG8gc29sdmluZyB0aGUg bG9uZy1ydW5uaW5nIGd1cCtETUEgcHJvYmxlbS4KPiArCj4gK0Fub3RoZXIgd2F5IG9mIHRoaW5r aW5nIGFib3V0IEZPTExfR0VULCBGT0xMX1BJTiwgYW5kIEZPTExfTE9OR1RFUk0KPiArPT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 
PT09PQo+ICsKPiArQW5vdGhlciB3YXkgb2YgdGhpbmtpbmcgYWJvdXQgdGhlc2UgZmxhZ3MgaXMg YXMgYSBwcm9ncmVzc2lvbiBvZiByZXN0cmljdGlvbnM6Cj4gK0ZPTExfR0VUIGlzIGZvciBzdHJ1 Y3QgcGFnZSBtYW5pcHVsYXRpb24sIHdpdGhvdXQgYWZmZWN0aW5nIHRoZSBkYXRhIHRoYXQgdGhl Cj4gK3N0cnVjdCBwYWdlIHJlZmVycyB0by4gRk9MTF9QSU4gaXMgYSAqcmVwbGFjZW1lbnQqIGZv ciBGT0xMX0dFVCwgYW5kIGlzIGZvcgo+ICtzaG9ydCB0ZXJtIHBpbnMgb24gcGFnZXMgd2hvc2Ug ZGF0YSAqd2lsbCogZ2V0IGFjY2Vzc2VkLiBBcyBzdWNoLCBGT0xMX1BJTiBpcwo+ICthICJtb3Jl IHNldmVyZSIgZm9ybSBvZiBwaW5uaW5nLiBBbmQgZmluYWxseSwgRk9MTF9MT05HVEVSTSBpcyBh biBldmVuIG1vcmUKPiArcmVzdHJpY3RpdmUgY2FzZSB0aGF0IGhhcyBGT0xMX1BJTiBhcyBhIHBy ZXJlcXVpc2l0ZTogdGhpcyBpcyBmb3IgcGFnZXMgdGhhdAo+ICt3aWxsIGJlIHBpbm5lZCBsb25n dGVybSwgYW5kIHdob3NlIGRhdGEgd2lsbCBiZSBhY2Nlc3NlZC4KPiArCj4gK1VuaXQgdGVzdGlu Zwo+ICs9PT09PT09PT09PT0KPiArVGhpcyBmaWxlOjoKPiArCj4gKyB0b29scy90ZXN0aW5nL3Nl bGZ0ZXN0cy92bS9ndXBfYmVuY2htYXJrLmMKPiArCj4gK2hhcyB0aGUgZm9sbG93aW5nIG5ldyBj YWxscyB0byBleGVyY2lzZSB0aGUgbmV3IHBpbiooKSB3cmFwcGVyIGZ1bmN0aW9uczoKPiArCj4g KyogUElOX0ZBU1RfQkVOQ0hNQVJLICguL2d1cF9iZW5jaG1hcmsgLWEpCj4gKyogUElOX0xPTkdU RVJNX0JFTkNITUFSSyAoLi9ndXBfYmVuY2htYXJrIC1hKQo+ICsqIFBJTl9CRU5DSE1BUksgKC4v Z3VwX2JlbmNobWFyayAtYSkKPiArCj4gK1lvdSBjYW4gbW9uaXRvciBob3cgbWFueSB0b3RhbCBk bWEtcGlubmVkIHBhZ2VzIGhhdmUgYmVlbiBhY3F1aXJlZCBhbmQgcmVsZWFzZWQKPiArc2luY2Ug dGhlIHN5c3RlbSB3YXMgYm9vdGVkLCB2aWEgdHdvIG5ldyAvcHJvYy92bXN0YXQgZW50cmllczog OjoKPiArCj4gKyAgICAvcHJvYy92bXN0YXQvbnJfZm9sbF9waW5fcmVxdWVzdGVkCj4gKyAgICAv cHJvYy92bXN0YXQvbnJfZm9sbF9waW5fcmVxdWVzdGVkCj4gKwo+ICtUaG9zZSBhcmUgYm90aCBn b2luZyB0byBzaG93IHplcm8sIHVubGVzcyBDT05GSUdfREVCVUdfVk0gaXMgc2V0LiBUaGlzIGlz Cj4gK2JlY2F1c2UgdGhlcmUgaXMgYSBub3RpY2VhYmxlIHBlcmZvcm1hbmNlIGRyb3AgaW4gcHV0 X3VzZXJfcGFnZSgpLCB3aGVuIHRoZXkKPiArYXJlIGFjdGl2YXRlZC4KPiArCj4gK1JlZmVyZW5j ZXMKPiArPT09PT09PT09PQo+ICsKPiArKiBgU29tZSBzbG93IHByb2dyZXNzIG9uIGdldF91c2Vy X3BhZ2VzKCkgKEFwciAyLCAyMDE5KSA8aHR0cHM6Ly9sd24ubmV0L0FydGljbGVzLzc4NDU3NC8+ 
YF8KPiArKiBgRE1BIGFuZCBnZXRfdXNlcl9wYWdlcygpIChMUEM6IERlYyAxMiwgMjAxOCkgPGh0 dHBzOi8vbHduLm5ldC9BcnRpY2xlcy83NzQ0MTEvPmBfCj4gKyogYFRoZSB0cm91YmxlIHdpdGgg Z2V0X3VzZXJfcGFnZXMoKSAoQXByIDMwLCAyMDE4KSA8aHR0cHM6Ly9sd24ubmV0L0FydGljbGVz Lzc1MzAyNy8+YF8KPiArCj4gK0pvaG4gSHViYmFyZCwgT2N0b2JlciwgMjAxOQo+IGRpZmYgLS1n aXQgYS9pbmNsdWRlL2xpbnV4L21tLmggYi9pbmNsdWRlL2xpbnV4L21tLmgKPiBpbmRleCA5NjIy ODM3NjEzOWMuLjExZTAwODZkNjRhNCAxMDA2NDQKPiAtLS0gYS9pbmNsdWRlL2xpbnV4L21tLmgK PiArKysgYi9pbmNsdWRlL2xpbnV4L21tLmgKPiBAQCAtMTU0Miw5ICsxNTQyLDIzIEBAIGxvbmcg Z2V0X3VzZXJfcGFnZXNfcmVtb3RlKHN0cnVjdCB0YXNrX3N0cnVjdCAqdHNrLCBzdHJ1Y3QgbW1f c3RydWN0ICptbSwKPiAgCQkJICAgIHVuc2lnbmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxvbmcg bnJfcGFnZXMsCj4gIAkJCSAgICB1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3QgcGFnZSAq KnBhZ2VzLAo+ICAJCQkgICAgc3RydWN0IHZtX2FyZWFfc3RydWN0ICoqdm1hcywgaW50ICpsb2Nr ZWQpOwo+ICtsb25nIHBpbl91c2VyX3BhZ2VzX3JlbW90ZShzdHJ1Y3QgdGFza19zdHJ1Y3QgKnRz aywgc3RydWN0IG1tX3N0cnVjdCAqbW0sCj4gKwkJCSAgIHVuc2lnbmVkIGxvbmcgc3RhcnQsIHVu c2lnbmVkIGxvbmcgbnJfcGFnZXMsCj4gKwkJCSAgIHVuc2lnbmVkIGludCBndXBfZmxhZ3MsIHN0 cnVjdCBwYWdlICoqcGFnZXMsCj4gKwkJCSAgIHN0cnVjdCB2bV9hcmVhX3N0cnVjdCAqKnZtYXMs IGludCAqbG9ja2VkKTsKPiArbG9uZyBwaW5fbG9uZ3Rlcm1fcGFnZXNfcmVtb3RlKHN0cnVjdCB0 YXNrX3N0cnVjdCAqdHNrLCBzdHJ1Y3QgbW1fc3RydWN0ICptbSwKPiArCQkJICAgICAgIHVuc2ln bmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxvbmcgbnJfcGFnZXMsCj4gKwkJCSAgICAgICB1bnNp Z25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3QgcGFnZSAqKnBhZ2VzLAo+ICsJCQkgICAgICAgc3Ry dWN0IHZtX2FyZWFfc3RydWN0ICoqdm1hcywgaW50ICpsb2NrZWQpOwo+ICBsb25nIGdldF91c2Vy X3BhZ2VzKHVuc2lnbmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxvbmcgbnJfcGFnZXMsCj4gIAkJ CSAgICB1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3QgcGFnZSAqKnBhZ2VzLAo+ICAJCQkg ICAgc3RydWN0IHZtX2FyZWFfc3RydWN0ICoqdm1hcyk7Cj4gK2xvbmcgcGluX3VzZXJfcGFnZXMo dW5zaWduZWQgbG9uZyBzdGFydCwgdW5zaWduZWQgbG9uZyBucl9wYWdlcywKPiArCQkgICAgdW5z aWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcywKPiArCQkgICAgc3RydWN0 
IHZtX2FyZWFfc3RydWN0ICoqdm1hcyk7Cj4gK2xvbmcgcGluX2xvbmd0ZXJtX3BhZ2VzKHVuc2ln bmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxvbmcgbnJfcGFnZXMsCj4gKwkJCXVuc2lnbmVkIGlu dCBndXBfZmxhZ3MsIHN0cnVjdCBwYWdlICoqcGFnZXMsCj4gKwkJCXN0cnVjdCB2bV9hcmVhX3N0 cnVjdCAqKnZtYXMpOwo+ICBsb25nIGdldF91c2VyX3BhZ2VzX2xvY2tlZCh1bnNpZ25lZCBsb25n IHN0YXJ0LCB1bnNpZ25lZCBsb25nIG5yX3BhZ2VzLAo+ICAJCSAgICB1bnNpZ25lZCBpbnQgZ3Vw X2ZsYWdzLCBzdHJ1Y3QgcGFnZSAqKnBhZ2VzLCBpbnQgKmxvY2tlZCk7Cj4gIGxvbmcgZ2V0X3Vz ZXJfcGFnZXNfdW5sb2NrZWQodW5zaWduZWQgbG9uZyBzdGFydCwgdW5zaWduZWQgbG9uZyBucl9w YWdlcywKPiBAQCAtMTU1Miw2ICsxNTY2LDEwIEBAIGxvbmcgZ2V0X3VzZXJfcGFnZXNfdW5sb2Nr ZWQodW5zaWduZWQgbG9uZyBzdGFydCwgdW5zaWduZWQgbG9uZyBucl9wYWdlcywKPiAgCj4gIGlu dCBnZXRfdXNlcl9wYWdlc19mYXN0KHVuc2lnbmVkIGxvbmcgc3RhcnQsIGludCBucl9wYWdlcywK PiAgCQkJdW5zaWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcyk7Cj4gK2lu dCBwaW5fdXNlcl9wYWdlc19mYXN0KHVuc2lnbmVkIGxvbmcgc3RhcnQsIGludCBucl9wYWdlcywK PiArCQkJdW5zaWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcyk7Cj4gK2lu dCBwaW5fbG9uZ3Rlcm1fcGFnZXNfZmFzdCh1bnNpZ25lZCBsb25nIHN0YXJ0LCBpbnQgbnJfcGFn ZXMsCj4gKwkJCSAgICB1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3QgcGFnZSAqKnBhZ2Vz KTsKPiAgCj4gIGludCBhY2NvdW50X2xvY2tlZF92bShzdHJ1Y3QgbW1fc3RydWN0ICptbSwgdW5z aWduZWQgbG9uZyBwYWdlcywgYm9vbCBpbmMpOwo+ICBpbnQgX19hY2NvdW50X2xvY2tlZF92bShz dHJ1Y3QgbW1fc3RydWN0ICptbSwgdW5zaWduZWQgbG9uZyBwYWdlcywgYm9vbCBpbmMsCj4gQEAg LTI2MTAsMTMgKzI2MjgsMTUgQEAgc3RydWN0IHBhZ2UgKmZvbGxvd19wYWdlKHN0cnVjdCB2bV9h cmVhX3N0cnVjdCAqdm1hLCB1bnNpZ25lZCBsb25nIGFkZHJlc3MsCj4gICNkZWZpbmUgRk9MTF9B Tk9OCTB4ODAwMAkvKiBkb24ndCBkbyBmaWxlIG1hcHBpbmdzICovCj4gICNkZWZpbmUgRk9MTF9M T05HVEVSTQkweDEwMDAwCS8qIG1hcHBpbmcgbGlmZXRpbWUgaXMgaW5kZWZpbml0ZTogc2VlIGJl bG93ICovCj4gICNkZWZpbmUgRk9MTF9TUExJVF9QTUQJMHgyMDAwMAkvKiBzcGxpdCBodWdlIHBt ZCBiZWZvcmUgcmV0dXJuaW5nICovCj4gKyNkZWZpbmUgRk9MTF9QSU4JMHg0MDAwMAkvKiBwYWdl cyBtdXN0IGJlIHJlbGVhc2VkIHZpYSBwdXRfdXNlcl9wYWdlKCkgKi8KPiAgCj4gIC8qCj4gLSAq 
IE5PVEUgb24gRk9MTF9MT05HVEVSTToKPiArICogRk9MTF9QSU4gYW5kIEZPTExfTE9OR1RFUk0g bWF5IGJlIHVzZWQgaW4gdmFyaW91cyBjb21iaW5hdGlvbnMgd2l0aCBlYWNoCj4gKyAqIG90aGVy LiBIZXJlIGlzIHdoYXQgdGhleSBtZWFuLCBhbmQgaG93IHRvIHVzZSB0aGVtOgo+ICAgKgo+ICAg KiBGT0xMX0xPTkdURVJNIGluZGljYXRlcyB0aGF0IHRoZSBwYWdlIHdpbGwgYmUgaGVsZCBmb3Ig YW4gaW5kZWZpbml0ZSB0aW1lCj4gLSAqIHBlcmlvZCBfb2Z0ZW5fIHVuZGVyIHVzZXJzcGFjZSBj b250cm9sLiAgVGhpcyBpcyBjb250cmFzdGVkIHdpdGgKPiAtICogaW92X2l0ZXJfZ2V0X3BhZ2Vz KCkgd2hlcmUgdXNhZ2VzIHdoaWNoIGFyZSB0cmFuc2llbnQuCj4gKyAqIHBlcmlvZCBfb2Z0ZW5f IHVuZGVyIHVzZXJzcGFjZSBjb250cm9sLiAgVGhpcyBpcyBpbiBjb250cmFzdCB0bwo+ICsgKiBp b3ZfaXRlcl9nZXRfcGFnZXMoKSwgd2hlcmUgdXNhZ2VzIHdoaWNoIGFyZSB0cmFuc2llbnQuCj4g ICAqCj4gICAqIEZJWE1FOiBGb3IgcGFnZXMgd2hpY2ggYXJlIHBhcnQgb2YgYSBmaWxlc3lzdGVt LCBtYXBwaW5ncyBhcmUgc3ViamVjdCB0byB0aGUKPiAgICogbGlmZXRpbWUgZW5mb3JjZWQgYnkg dGhlIGZpbGVzeXN0ZW0gYW5kIHdlIG5lZWQgZ3VhcmFudGVlcyB0aGF0IGxvbmd0ZXJtCj4gQEAg LTI2MzEsMTEgKzI2NTEsNDEgQEAgc3RydWN0IHBhZ2UgKmZvbGxvd19wYWdlKHN0cnVjdCB2bV9h cmVhX3N0cnVjdCAqdm1hLCB1bnNpZ25lZCBsb25nIGFkZHJlc3MsCj4gICAqIEN1cnJlbnRseSBv bmx5IGdldF91c2VyX3BhZ2VzKCkgYW5kIGdldF91c2VyX3BhZ2VzX2Zhc3QoKSBzdXBwb3J0IHRo aXMgZmxhZwo+ICAgKiBhbmQgY2FsbHMgdG8gZ2V0X3VzZXJfcGFnZXNfW3VuXWxvY2tlZCBhcmUg c3BlY2lmaWNhbGx5IG5vdCBhbGxvd2VkLiAgVGhpcwo+ICAgKiBpcyBkdWUgdG8gYW4gaW5jb21w YXRpYmlsaXR5IHdpdGggdGhlIEZTIERBWCBjaGVjayBhbmQKPiAtICogRkFVTFRfRkxBR19BTExP V19SRVRSWQo+ICsgKiBGQVVMVF9GTEFHX0FMTE9XX1JFVFJZLgo+ICAgKgo+IC0gKiBJbiB0aGUg Q01BIGNhc2U6IGxvbmd0ZXJtIHBpbnMgaW4gYSBDTUEgcmVnaW9uIHdvdWxkIHVubmVjZXNzYXJp bHkgZnJhZ21lbnQKPiAtICogdGhhdCByZWdpb24uICBBbmQgc28gQ01BIGF0dGVtcHRzIHRvIG1p Z3JhdGUgdGhlIHBhZ2UgYmVmb3JlIHBpbm5pbmcgd2hlbgo+ICsgKiBJbiB0aGUgQ01BIGNhc2U6 IGxvbmcgdGVybSBwaW5zIGluIGEgQ01BIHJlZ2lvbiB3b3VsZCB1bm5lY2Vzc2FyaWx5IGZyYWdt ZW50Cj4gKyAqIHRoYXQgcmVnaW9uLiAgQW5kIHNvLCBDTUEgYXR0ZW1wdHMgdG8gbWlncmF0ZSB0 aGUgcGFnZSBiZWZvcmUgcGlubmluZywgd2hlbgo+ICAgKiBGT0xMX0xPTkdURVJNIGlzIHNwZWNp 
ZmllZC4KPiArICoKPiArICogRk9MTF9QSU4gaW5kaWNhdGVzIHRoYXQgYSBzcGVjaWFsIGtpbmQg b2YgdHJhY2tpbmcgKG5vdCBqdXN0IHBhZ2UtPl9yZWZjb3VudCwKPiArICogYnV0IGFuIGFkZGl0 aW9uYWwgcGluIGNvdW50aW5nIHN5c3RlbSkgd2lsbCBiZSBpbnZva2VkLiBUaGlzIGlzIGludGVu ZGVkIGZvcgo+ICsgKiBhbnl0aGluZyB0aGF0IGdldHMgYSBwYWdlIHJlZmVyZW5jZSBhbmQgdGhl biB0b3VjaGVzIHBhZ2UgZGF0YSAoZm9yIGV4YW1wbGUsCj4gKyAqIERpcmVjdCBJTykuIFRoaXMg bGV0cyB0aGUgZmlsZXN5c3RlbSBrbm93IHRoYXQgc29tZSBub24tZmlsZS1zeXN0ZW0gZW50aXR5 IGlzCj4gKyAqIHBvdGVudGlhbGx5IGNoYW5naW5nIHRoZSBwYWdlcycgZGF0YS4gSW4gY29udHJh c3QgdG8gRk9MTF9HRVQgKHdob3NlIHBhZ2VzCj4gKyAqIGFyZSByZWxlYXNlZCB2aWEgcHV0X3Bh Z2UoKSksIEZPTExfUElOIHBhZ2VzIG11c3QgYmUgcmVsZWFzZWQsIHVsdGltYXRlbHksIGJ5Cj4g KyAqIGEgY2FsbCB0byBwdXRfdXNlcl9wYWdlKCkuCj4gKyAqCj4gKyAqIEZPTExfUElOIGlzIHNp bWlsYXIgdG8gRk9MTF9HRVQ6IGJvdGggb2YgdGhlc2UgcGluIHBhZ2VzLiBUaGV5IHVzZSBkaWZm ZXJlbnQKPiArICogYW5kIHNlcGFyYXRlIHJlZmNvdW50aW5nIG1lY2hhbmlzbXMsIGhvd2V2ZXIs IGFuZCB0aGF0IG1lYW5zIHRoYXQgZWFjaCBoYXMKPiArICogaXRzIG93biBhY3F1aXJlIGFuZCBy ZWxlYXNlIG1lY2hhbmlzbXM6Cj4gKyAqCj4gKyAqICAgICBGT0xMX0dFVDogZ2V0X3VzZXJfcGFn ZXMqKCkgdG8gYWNxdWlyZSwgYW5kIHB1dF9wYWdlKCkgdG8gcmVsZWFzZS4KPiArICoKPiArICog ICAgIEZPTExfUElOOiBwaW5fdXNlcl9wYWdlcyooKSBvciBwaW5fbG9uZ3Rlcm1fcGFnZXMqKCkg dG8gYWNxdWlyZSwgYW5kCj4gKyAqICAgICAgICAgICAgICAgcHV0X3VzZXJfcGFnZXMgdG8gcmVs ZWFzZS4KPiArICoKPiArICogRk9MTF9QSU4gYW5kIEZPTExfR0VUIGFyZSBtdXR1YWxseSBleGNs dXNpdmUgZm9yIGEgZ2l2ZW4gZnVuY3Rpb24gY2FsbC4KPiArICogKFRoZSB1bmRlcmx5aW5nIHBh Z2VzIG1heSBleHBlcmllbmNlIGJvdGggRk9MTF9HRVQtYmFzZWQgYW5kIEZPTExfUElOLWJhc2Vk Cj4gKyAqIGNhbGxzIGFwcGxpZWQgdG8gdGhlbSwgYW5kIHRoYXQncyBwZXJmZWN0bHkgT0suIFRo aXMgaXMgYSBjb25zdHJhaW50IG9uIHRoZQo+ICsgKiBjYWxsZXJzLCBub3Qgb24gdGhlIHBhZ2Vz LikKPiArICoKPiArICogRk9MTF9QSU4gYW5kIEZPTExfTE9OR1RFUk0gc2hvdWxkIGJlIHNldCBp bnRlcm5hbGx5IGJ5IHRoZSBwaW5fdXNlcl9wYWdlKigpCj4gKyAqIGFuZCBwaW5fbG9uZ3Rlcm1f KigpIEFQSXMsIG5ldmVyIGRpcmVjdGx5IGJ5IHRoZSBjYWxsZXIuIFRoYXQncyBpbiBvcmRlciB0 
bwo+ICsgKiBoZWxwIGF2b2lkIG1pc21hdGNoZXMgd2hlbiByZWxlYXNpbmcgcGFnZXM6IGdldF91 c2VyX3BhZ2VzKigpIHBhZ2VzIG11c3QgYmUKPiArICogcmVsZWFzZWQgdmlhIHB1dF9wYWdlKCks IHdoaWxlIHBpbl91c2VyX3BhZ2VzKigpIHBhZ2VzIG11c3QgYmUgcmVsZWFzZWQgdmlhCj4gKyAq IHB1dF91c2VyX3BhZ2UoKS4KPiArICoKPiArICogUGxlYXNlIHNlZSBEb2N1bWVudGF0aW9uL3Zt L3Bpbl91c2VyX3BhZ2VzLnJzdCBmb3IgbW9yZSBpbmZvcm1hdGlvbi4KPiAgICovCj4gIAo+ICBz dGF0aWMgaW5saW5lIGludCB2bV9mYXVsdF90b19lcnJubyh2bV9mYXVsdF90IHZtX2ZhdWx0LCBp bnQgZm9sbF9mbGFncykKPiBkaWZmIC0tZ2l0IGEvbW0vZ3VwLmMgYi9tbS9ndXAuYwo+IGluZGV4 IGNmZTZkYzVmYzM0My4uZWEzMTgxMGRhODI4IDEwMDY0NAo+IC0tLSBhL21tL2d1cC5jCj4gKysr IGIvbW0vZ3VwLmMKPiBAQCAtMTk0LDYgKzE5NCwxMCBAQCBzdGF0aWMgc3RydWN0IHBhZ2UgKmZv bGxvd19wYWdlX3B0ZShzdHJ1Y3Qgdm1fYXJlYV9zdHJ1Y3QgKnZtYSwKPiAgCXNwaW5sb2NrX3Qg KnB0bDsKPiAgCXB0ZV90ICpwdGVwLCBwdGU7Cj4gIAo+ICsJLyogRk9MTF9HRVQgYW5kIEZPTExf UElOIGFyZSBtdXR1YWxseSBleGNsdXNpdmUuICovCj4gKwlpZiAoV0FSTl9PTl9PTkNFKChmbGFn cyAmIChGT0xMX1BJTiB8IEZPTExfR0VUKSkgPT0KPiArCQkJIChGT0xMX1BJTiB8IEZPTExfR0VU KSkpCj4gKwkJcmV0dXJuIEVSUl9QVFIoLUVJTlZBTCk7Cj4gIHJldHJ5Ogo+ICAJaWYgKHVubGlr ZWx5KHBtZF9iYWQoKnBtZCkpKQo+ICAJCXJldHVybiBub19wYWdlX3RhYmxlKHZtYSwgZmxhZ3Mp Owo+IEBAIC04MDUsNyArODA5LDcgQEAgc3RhdGljIGxvbmcgX19nZXRfdXNlcl9wYWdlcyhzdHJ1 Y3QgdGFza19zdHJ1Y3QgKnRzaywgc3RydWN0IG1tX3N0cnVjdCAqbW0sCj4gIAo+ICAJc3RhcnQg PSB1bnRhZ2dlZF9hZGRyKHN0YXJ0KTsKPiAgCj4gLQlWTV9CVUdfT04oISFwYWdlcyAhPSAhIShn dXBfZmxhZ3MgJiBGT0xMX0dFVCkpOwo+ICsJVk1fQlVHX09OKCEhcGFnZXMgIT0gISEoZ3VwX2Zs YWdzICYgKEZPTExfR0VUIHwgRk9MTF9QSU4pKSk7Cj4gIAo+ICAJLyoKPiAgCSAqIElmIEZPTExf Rk9SQ0UgaXMgc2V0IHRoZW4gZG8gbm90IGZvcmNlIGEgZnVsbCBmYXVsdCBhcyB0aGUgaGludGlu Zwo+IEBAIC0xMDI5LDcgKzEwMzMsMTYgQEAgc3RhdGljIF9fYWx3YXlzX2lubGluZSBsb25nIF9f Z2V0X3VzZXJfcGFnZXNfbG9ja2VkKHN0cnVjdCB0YXNrX3N0cnVjdCAqdHNrLAo+ICAJCUJVR19P TigqbG9ja2VkICE9IDEpOwo+ICAJfQo+ICAKPiAtCWlmIChwYWdlcykKPiArCS8qCj4gKwkgKiBG T0xMX1BJTiBhbmQgRk9MTF9HRVQgYXJlIG11dHVhbGx5IGV4Y2x1c2l2ZS4gVHJhZGl0aW9uYWwg 
YmVoYXZpb3IKPiArCSAqIGlzIHRvIHNldCBGT0xMX0dFVCBpZiB0aGUgY2FsbGVyIHdhbnRzIHBh Z2VzW10gZmlsbGVkIGluIChidXQgaGFzCj4gKwkgKiBjYXJlbGVzc2x5IGZhaWxlZCB0byBzcGVj aWZ5IEZPTExfR0VUKSwgc28ga2VlcCBkb2luZyB0aGF0LCBidXQgb25seQo+ICsJICogZm9yIEZP TExfR0VULCBub3QgZm9yIHRoZSBuZXdlciBGT0xMX1BJTi4KPiArCSAqCj4gKwkgKiBGT0xMX1BJ TiBhbHdheXMgZXhwZWN0cyBwYWdlcyB0byBiZSBub24tbnVsbCwgYnV0IG5vIG5lZWQgdG8gYXNz ZXJ0Cj4gKwkgKiB0aGF0IGhlcmUsIGFzIGFueSBmYWlsdXJlcyB3aWxsIGJlIG9idmlvdXMgZW5v dWdoLgo+ICsJICovCj4gKwlpZiAocGFnZXMgJiYgIShmbGFncyAmIEZPTExfUElOKSkKPiAgCQlm bGFncyB8PSBGT0xMX0dFVDsKPiAgCj4gIAlwYWdlc19kb25lID0gMDsKPiBAQCAtMTE2Niw2ICsx MTc5LDE0IEBAIGxvbmcgZ2V0X3VzZXJfcGFnZXNfcmVtb3RlKHN0cnVjdCB0YXNrX3N0cnVjdCAq dHNrLCBzdHJ1Y3QgbW1fc3RydWN0ICptbSwKPiAgCQl1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBz dHJ1Y3QgcGFnZSAqKnBhZ2VzLAo+ICAJCXN0cnVjdCB2bV9hcmVhX3N0cnVjdCAqKnZtYXMsIGlu dCAqbG9ja2VkKQo+ICB7Cj4gKwkvKgo+ICsJICogRk9MTF9QSU4gbXVzdCBvbmx5IGJlIHNldCBp bnRlcm5hbGx5IGJ5IHRoZSBwaW5fdXNlcl9wYWdlKigpIGFuZAo+ICsJICogcGluX2xvbmd0ZXJt XyooKSBBUElzLCBuZXZlciBkaXJlY3RseSBieSB0aGUgY2FsbGVyLCBzbyBlbmZvcmNlIHRoYXQK PiArCSAqIHdpdGggYW4gYXNzZXJ0aW9uOgo+ICsJICovCj4gKwlpZiAoV0FSTl9PTl9PTkNFKGd1 cF9mbGFncyAmIEZPTExfUElOKSkKPiArCQlyZXR1cm4gLUVJTlZBTDsKPiArCj4gIAkvKgo+ICAJ ICogQ3VycmVudCBGT0xMX0xPTkdURVJNIGJlaGF2aW9yIGlzIGluY29tcGF0aWJsZSB3aXRoCj4g IAkgKiBGQVVMVF9GTEFHX0FMTE9XX1JFVFJZIGJlY2F1c2Ugb2YgdGhlIEZTIERBWCBjaGVjayBy ZXF1aXJlbWVudCBvbgo+IEBAIC0xNjI2LDYgKzE2NDcsMTQgQEAgbG9uZyBnZXRfdXNlcl9wYWdl cyh1bnNpZ25lZCBsb25nIHN0YXJ0LCB1bnNpZ25lZCBsb25nIG5yX3BhZ2VzLAo+ICAJCXVuc2ln bmVkIGludCBndXBfZmxhZ3MsIHN0cnVjdCBwYWdlICoqcGFnZXMsCj4gIAkJc3RydWN0IHZtX2Fy ZWFfc3RydWN0ICoqdm1hcykKPiAgewo+ICsJLyoKPiArCSAqIEZPTExfUElOIG11c3Qgb25seSBi ZSBzZXQgaW50ZXJuYWxseSBieSB0aGUgcGluX3VzZXJfcGFnZSooKSBhbmQKPiArCSAqIHBpbl9s b25ndGVybV8qKCkgQVBJcywgbmV2ZXIgZGlyZWN0bHkgYnkgdGhlIGNhbGxlciwgc28gZW5mb3Jj ZSB0aGF0Cj4gKwkgKiB3aXRoIGFuIGFzc2VydGlvbjoKPiArCSAqLwo+ICsJaWYgKFdBUk5fT05f 
T05DRShndXBfZmxhZ3MgJiBGT0xMX1BJTikpCj4gKwkJcmV0dXJuIC1FSU5WQUw7Cj4gKwo+ICAJ cmV0dXJuIF9fZ3VwX2xvbmd0ZXJtX2xvY2tlZChjdXJyZW50LCBjdXJyZW50LT5tbSwgc3RhcnQs IG5yX3BhZ2VzLAo+ICAJCQkJICAgICBwYWdlcywgdm1hcywgZ3VwX2ZsYWdzIHwgRk9MTF9UT1VD SCk7Cj4gIH0KPiBAQCAtMjM3NywyOSArMjQwNiwxNCBAQCBzdGF0aWMgaW50IF9fZ3VwX2xvbmd0 ZXJtX3VubG9ja2VkKHVuc2lnbmVkIGxvbmcgc3RhcnQsIGludCBucl9wYWdlcywKPiAgCXJldHVy biByZXQ7Cj4gIH0KPiAgCj4gLS8qKgo+IC0gKiBnZXRfdXNlcl9wYWdlc19mYXN0KCkgLSBwaW4g dXNlciBwYWdlcyBpbiBtZW1vcnkKPiAtICogQHN0YXJ0OglzdGFydGluZyB1c2VyIGFkZHJlc3MK PiAtICogQG5yX3BhZ2VzOgludW1iZXIgb2YgcGFnZXMgZnJvbSBzdGFydCB0byBwaW4KPiAtICog QGd1cF9mbGFnczoJZmxhZ3MgbW9kaWZ5aW5nIHBpbiBiZWhhdmlvdXIKPiAtICogQHBhZ2VzOglh cnJheSB0aGF0IHJlY2VpdmVzIHBvaW50ZXJzIHRvIHRoZSBwYWdlcyBwaW5uZWQuCj4gLSAqCQlT aG91bGQgYmUgYXQgbGVhc3QgbnJfcGFnZXMgbG9uZy4KPiAtICoKPiAtICogQXR0ZW1wdCB0byBw aW4gdXNlciBwYWdlcyBpbiBtZW1vcnkgd2l0aG91dCB0YWtpbmcgbW0tPm1tYXBfc2VtLgo+IC0g KiBJZiBub3Qgc3VjY2Vzc2Z1bCwgaXQgd2lsbCBmYWxsIGJhY2sgdG8gdGFraW5nIHRoZSBsb2Nr IGFuZAo+IC0gKiBjYWxsaW5nIGdldF91c2VyX3BhZ2VzKCkuCj4gLSAqCj4gLSAqIFJldHVybnMg bnVtYmVyIG9mIHBhZ2VzIHBpbm5lZC4gVGhpcyBtYXkgYmUgZmV3ZXIgdGhhbiB0aGUgbnVtYmVy Cj4gLSAqIHJlcXVlc3RlZC4gSWYgbnJfcGFnZXMgaXMgMCBvciBuZWdhdGl2ZSwgcmV0dXJucyAw LiBJZiBubyBwYWdlcwo+IC0gKiB3ZXJlIHBpbm5lZCwgcmV0dXJucyAtZXJybm8uCj4gLSAqLwo+ IC1pbnQgZ2V0X3VzZXJfcGFnZXNfZmFzdCh1bnNpZ25lZCBsb25nIHN0YXJ0LCBpbnQgbnJfcGFn ZXMsCj4gLQkJCXVuc2lnbmVkIGludCBndXBfZmxhZ3MsIHN0cnVjdCBwYWdlICoqcGFnZXMpCj4g K3N0YXRpYyBpbnQgaW50ZXJuYWxfZ2V0X3VzZXJfcGFnZXNfZmFzdCh1bnNpZ25lZCBsb25nIHN0 YXJ0LCBpbnQgbnJfcGFnZXMsCj4gKwkJCQkJdW5zaWduZWQgaW50IGd1cF9mbGFncywKPiArCQkJ CQlzdHJ1Y3QgcGFnZSAqKnBhZ2VzKQo+ICB7Cj4gIAl1bnNpZ25lZCBsb25nIGFkZHIsIGxlbiwg ZW5kOwo+ICAJaW50IG5yID0gMCwgcmV0ID0gMDsKPiAgCj4gLQlpZiAoV0FSTl9PTl9PTkNFKGd1 cF9mbGFncyAmIH4oRk9MTF9XUklURSB8IEZPTExfTE9OR1RFUk0pKSkKPiArCWlmIChXQVJOX09O X09OQ0UoZ3VwX2ZsYWdzICYgfihGT0xMX1dSSVRFIHwgRk9MTF9MT05HVEVSTSB8IEZPTExfUElO 
KSkpCj4gIAkJcmV0dXJuIC1FSU5WQUw7Cj4gIAo+ICAJc3RhcnQgPSB1bnRhZ2dlZF9hZGRyKHN0 YXJ0KSAmIFBBR0VfTUFTSzsKPiBAQCAtMjQzOSw0ICsyNDUzLDIwOCBAQCBpbnQgZ2V0X3VzZXJf cGFnZXNfZmFzdCh1bnNpZ25lZCBsb25nIHN0YXJ0LCBpbnQgbnJfcGFnZXMsCj4gIAo+ICAJcmV0 dXJuIHJldDsKPiAgfQo+ICsKPiArLyoqCj4gKyAqIGdldF91c2VyX3BhZ2VzX2Zhc3QoKSAtIHBp biB1c2VyIHBhZ2VzIGluIG1lbW9yeQo+ICsgKiBAc3RhcnQ6CXN0YXJ0aW5nIHVzZXIgYWRkcmVz cwo+ICsgKiBAbnJfcGFnZXM6CW51bWJlciBvZiBwYWdlcyBmcm9tIHN0YXJ0IHRvIHBpbgo+ICsg KiBAZ3VwX2ZsYWdzOglmbGFncyBtb2RpZnlpbmcgcGluIGJlaGF2aW91cgo+ICsgKiBAcGFnZXM6 CWFycmF5IHRoYXQgcmVjZWl2ZXMgcG9pbnRlcnMgdG8gdGhlIHBhZ2VzIHBpbm5lZC4KPiArICoJ CVNob3VsZCBiZSBhdCBsZWFzdCBucl9wYWdlcyBsb25nLgo+ICsgKgo+ICsgKiBBdHRlbXB0IHRv IHBpbiB1c2VyIHBhZ2VzIGluIG1lbW9yeSB3aXRob3V0IHRha2luZyBtbS0+bW1hcF9zZW0uCj4g KyAqIElmIG5vdCBzdWNjZXNzZnVsLCBpdCB3aWxsIGZhbGwgYmFjayB0byB0YWtpbmcgdGhlIGxv Y2sgYW5kCj4gKyAqIGNhbGxpbmcgZ2V0X3VzZXJfcGFnZXMoKS4KPiArICoKPiArICogUmV0dXJu cyBudW1iZXIgb2YgcGFnZXMgcGlubmVkLiBUaGlzIG1heSBiZSBmZXdlciB0aGFuIHRoZSBudW1i ZXIgcmVxdWVzdGVkLgo+ICsgKiBJZiBucl9wYWdlcyBpcyAwIG9yIG5lZ2F0aXZlLCByZXR1cm5z IDAuIElmIG5vIHBhZ2VzIHdlcmUgcGlubmVkLCByZXR1cm5zCj4gKyAqIC1lcnJuby4KPiArICov Cj4gK2ludCBnZXRfdXNlcl9wYWdlc19mYXN0KHVuc2lnbmVkIGxvbmcgc3RhcnQsIGludCBucl9w YWdlcywKPiArCQkJdW5zaWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcykK PiArewo+ICsJLyoKPiArCSAqIEZPTExfUElOIG11c3Qgb25seSBiZSBzZXQgaW50ZXJuYWxseSBi eSB0aGUgcGluX3VzZXJfcGFnZSooKSBhbmQKPiArCSAqIHBpbl9sb25ndGVybV8qKCkgQVBJcywg bmV2ZXIgZGlyZWN0bHkgYnkgdGhlIGNhbGxlciwgc28gZW5mb3JjZSB0aGF0Ogo+ICsJICovCj4g KwlpZiAoV0FSTl9PTl9PTkNFKGd1cF9mbGFncyAmIEZPTExfUElOKSkKPiArCQlyZXR1cm4gLUVJ TlZBTDsKPiArCj4gKwlyZXR1cm4gaW50ZXJuYWxfZ2V0X3VzZXJfcGFnZXNfZmFzdChzdGFydCwg bnJfcGFnZXMsIGd1cF9mbGFncywgcGFnZXMpOwo+ICt9Cj4gIEVYUE9SVF9TWU1CT0xfR1BMKGdl dF91c2VyX3BhZ2VzX2Zhc3QpOwo+ICsKPiArLyoqCj4gKyAqIHBpbl91c2VyX3BhZ2VzX2Zhc3Qo KSAtIHBpbiB1c2VyIHBhZ2VzIGluIG1lbW9yeSB3aXRob3V0IHRha2luZyBsb2Nrcwo+ICsgKgo+ 
ICsgKiBOZWFybHkgdGhlIHNhbWUgYXMgZ2V0X3VzZXJfcGFnZXNfZmFzdCgpLCBleGNlcHQgdGhh dCBGT0xMX1BJTiBpcyBzZXQuIFNlZQo+ICsgKiBnZXRfdXNlcl9wYWdlc19mYXN0KCkgZm9yIGRv Y3VtZW50YXRpb24gb24gdGhlIGZ1bmN0aW9uIGFyZ3VtZW50cywgYmVjYXVzZQo+ICsgKiB0aGUg YXJndW1lbnRzIGhlcmUgYXJlIGlkZW50aWNhbC4KPiArICoKPiArICogRk9MTF9QSU4gbWVhbnMg dGhhdCB0aGUgcGFnZXMgbXVzdCBiZSByZWxlYXNlZCB2aWEgcHV0X3VzZXJfcGFnZSgpLiBQbGVh c2UKPiArICogc2VlIERvY3VtZW50YXRpb24vdm0vcGluX3VzZXJfcGFnZXMucnN0IGZvciBmdXJ0 aGVyIGRldGFpbHMuCj4gKyAqCj4gKyAqIFRoaXMgaXMgaW50ZW5kZWQgZm9yIENhc2UgMSAoRElP KSBpbiBEb2N1bWVudGF0aW9uL3ZtL3Bpbl91c2VyX3BhZ2VzLnJzdC4gSXQKPiArICogaXMgTk9U IGludGVuZGVkIGZvciBDYXNlIDIgKFJETUE6IGxvbmctdGVybSBwaW5zKS4KPiArICovCj4gK2lu dCBwaW5fdXNlcl9wYWdlc19mYXN0KHVuc2lnbmVkIGxvbmcgc3RhcnQsIGludCBucl9wYWdlcywK PiArCQkJdW5zaWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcykKPiArewo+ ICsJLyogRk9MTF9HRVQgYW5kIEZPTExfUElOIGFyZSBtdXR1YWxseSBleGNsdXNpdmUuICovCj4g KwlpZiAoV0FSTl9PTl9PTkNFKGd1cF9mbGFncyAmIEZPTExfR0VUKSkKPiArCQlyZXR1cm4gLUVJ TlZBTDsKPiArCj4gKwlndXBfZmxhZ3MgfD0gRk9MTF9QSU47Cj4gKwlyZXR1cm4gaW50ZXJuYWxf Z2V0X3VzZXJfcGFnZXNfZmFzdChzdGFydCwgbnJfcGFnZXMsIGd1cF9mbGFncywgcGFnZXMpOwo+ ICt9Cj4gK0VYUE9SVF9TWU1CT0xfR1BMKHBpbl91c2VyX3BhZ2VzX2Zhc3QpOwo+ICsKPiArLyoq Cj4gKyAqIHBpbl9sb25ndGVybV9wYWdlc19mYXN0KCkgLSBwaW4gdXNlciBwYWdlcyBpbiBtZW1v cnkgd2l0aG91dCB0YWtpbmcgbG9ja3MKPiArICoKPiArICogTmVhcmx5IHRoZSBzYW1lIGFzIGdl dF91c2VyX3BhZ2VzX2Zhc3QoKSwgZXhjZXB0IHRoYXQgRk9MTF9QSU4gYW5kCj4gKyAqIEZPTExf TE9OR1RFUk0gYXJlIHNldC4gU2VlIGdldF91c2VyX3BhZ2VzX2Zhc3QoKSBmb3IgZG9jdW1lbnRh dGlvbiBvbiB0aGUKPiArICogZnVuY3Rpb24gYXJndW1lbnRzLCBiZWNhdXNlIHRoZSBhcmd1bWVu dHMgaGVyZSBhcmUgaWRlbnRpY2FsLgo+ICsgKgo+ICsgKiBGT0xMX1BJTiBtZWFucyB0aGF0IHRo ZSBwYWdlcyBtdXN0IGJlIHJlbGVhc2VkIHZpYSBwdXRfdXNlcl9wYWdlKCkuIFBsZWFzZQo+ICsg KiBzZWUgRG9jdW1lbnRhdGlvbi92bS9waW5fdXNlcl9wYWdlcy5yc3QgZm9yIGZ1cnRoZXIgZGV0 YWlscy4KPiArICoKPiArICogRk9MTF9MT05HVEVSTSBtZWFucyB0aGF0IHRoZSBwYWdlcyBhcmUg 
YmVpbmcgcGlubmVkIGZvciAibG9uZyB0ZXJtIiB1c2UsCj4gKyAqIHR5cGljYWxseSBieSBhIG5v bi1DUFUgZGV2aWNlLCBhbmQgd2UgY2Fubm90IGJlIHN1cmUgdGhhdCB3YWl0aW5nIGZvciBhCj4g KyAqIHBpbm5lZCBwYWdlIHRvIGJlY29tZSB1bnBpbiB3aWxsIGJlIGVmZmVjdGl2ZS4KPiArICoK PiArICogVGhpcyBpcyBpbnRlbmRlZCBmb3IgQ2FzZSAyIChSRE1BOiBsb25nLXRlcm0gcGlucykg b2YgdGhlIEZPTExfUElOCj4gKyAqIGRvY3VtZW50YXRpb24uCj4gKyAqLwo+ICtpbnQgcGluX2xv bmd0ZXJtX3BhZ2VzX2Zhc3QodW5zaWduZWQgbG9uZyBzdGFydCwgaW50IG5yX3BhZ2VzLAo+ICsJ CQkgICAgdW5zaWduZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcykKPiArewo+ ICsJLyogRk9MTF9HRVQgYW5kIEZPTExfUElOIGFyZSBtdXR1YWxseSBleGNsdXNpdmUuICovCj4g KwlpZiAoV0FSTl9PTl9PTkNFKGd1cF9mbGFncyAmIEZPTExfR0VUKSkKPiArCQlyZXR1cm4gLUVJ TlZBTDsKPiArCj4gKwlndXBfZmxhZ3MgfD0gKEZPTExfUElOIHwgRk9MTF9MT05HVEVSTSk7Cj4g KwlyZXR1cm4gaW50ZXJuYWxfZ2V0X3VzZXJfcGFnZXNfZmFzdChzdGFydCwgbnJfcGFnZXMsIGd1 cF9mbGFncywgcGFnZXMpOwo+ICt9Cj4gK0VYUE9SVF9TWU1CT0xfR1BMKHBpbl9sb25ndGVybV9w YWdlc19mYXN0KTsKPiArCj4gKy8qKgo+ICsgKiBwaW5fdXNlcl9wYWdlc19yZW1vdGUoKSAtIHBp biBwYWdlcyBvZiBhIHJlbW90ZSBwcm9jZXNzICh0YXNrICE9IGN1cnJlbnQpCj4gKyAqCj4gKyAq IE5lYXJseSB0aGUgc2FtZSBhcyBnZXRfdXNlcl9wYWdlc19yZW1vdGUoKSwgZXhjZXB0IHRoYXQg Rk9MTF9QSU4gaXMgc2V0LiBTZWUKPiArICogZ2V0X3VzZXJfcGFnZXNfcmVtb3RlKCkgZm9yIGRv Y3VtZW50YXRpb24gb24gdGhlIGZ1bmN0aW9uIGFyZ3VtZW50cywgYmVjYXVzZQo+ICsgKiB0aGUg YXJndW1lbnRzIGhlcmUgYXJlIGlkZW50aWNhbC4KPiArICoKPiArICogRk9MTF9QSU4gbWVhbnMg dGhhdCB0aGUgcGFnZXMgbXVzdCBiZSByZWxlYXNlZCB2aWEgcHV0X3VzZXJfcGFnZSgpLiBQbGVh c2UKPiArICogc2VlIERvY3VtZW50YXRpb24vdm0vcGluX3VzZXJfcGFnZXMucnN0IGZvciBkZXRh aWxzLgo+ICsgKgo+ICsgKiBUaGlzIGlzIGludGVuZGVkIGZvciBDYXNlIDEgKERJTykgaW4gRG9j dW1lbnRhdGlvbi92bS9waW5fdXNlcl9wYWdlcy5yc3QuIEl0Cj4gKyAqIGlzIE5PVCBpbnRlbmRl ZCBmb3IgQ2FzZSAyIChSRE1BOiBsb25nLXRlcm0gcGlucykuCj4gKyAqLwo+ICtsb25nIHBpbl91 c2VyX3BhZ2VzX3JlbW90ZShzdHJ1Y3QgdGFza19zdHJ1Y3QgKnRzaywgc3RydWN0IG1tX3N0cnVj dCAqbW0sCj4gKwkJCSAgIHVuc2lnbmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxvbmcgbnJfcGFn 
ZXMsCj4gKwkJCSAgIHVuc2lnbmVkIGludCBndXBfZmxhZ3MsIHN0cnVjdCBwYWdlICoqcGFnZXMs Cj4gKwkJCSAgIHN0cnVjdCB2bV9hcmVhX3N0cnVjdCAqKnZtYXMsIGludCAqbG9ja2VkKQo+ICt7 Cj4gKwkvKiBGT0xMX0dFVCBhbmQgRk9MTF9QSU4gYXJlIG11dHVhbGx5IGV4Y2x1c2l2ZS4gKi8K PiArCWlmIChXQVJOX09OX09OQ0UoZ3VwX2ZsYWdzICYgRk9MTF9HRVQpKQo+ICsJCXJldHVybiAt RUlOVkFMOwo+ICsKPiArCWd1cF9mbGFncyB8PSBGT0xMX1RPVUNIIHwgRk9MTF9SRU1PVEUgfCBG T0xMX1BJTjsKPiArCj4gKwlyZXR1cm4gX19nZXRfdXNlcl9wYWdlc19sb2NrZWQodHNrLCBtbSwg c3RhcnQsIG5yX3BhZ2VzLCBwYWdlcywgdm1hcywKPiArCQkJCSAgICAgICBsb2NrZWQsIGd1cF9m bGFncyk7Cj4gK30KPiArRVhQT1JUX1NZTUJPTChwaW5fdXNlcl9wYWdlc19yZW1vdGUpOwo+ICsK PiArLyoqCj4gKyAqIHBpbl9sb25ndGVybV9wYWdlc19yZW1vdGUoKSAtIHBpbiBwYWdlcyBvZiBh IHJlbW90ZSBwcm9jZXNzICh0YXNrICE9IGN1cnJlbnQpCj4gKyAqCj4gKyAqIE5lYXJseSB0aGUg c2FtZSBhcyBnZXRfdXNlcl9wYWdlc19yZW1vdGUoKSwgYnV0IG5vdGUgdGhhdCBGT0xMX1RPVUNI IGlzIG5vdAo+ICsgKiBzZXQsIGFuZCBGT0xMX1BJTiBhbmQgRk9MTF9MT05HVEVSTSBhcmUgc2V0 LiBTZWUgZ2V0X3VzZXJfcGFnZXNfcmVtb3RlKCkgZm9yCj4gKyAqIGRvY3VtZW50YXRpb24gb24g dGhlIGZ1bmN0aW9uIGFyZ3VtZW50cywgYmVjYXVzZSB0aGUgYXJndW1lbnRzIGhlcmUgYXJlCj4g KyAqIGlkZW50aWNhbC4KPiArICoKPiArICogRk9MTF9QSU4gbWVhbnMgdGhhdCB0aGUgcGFnZXMg bXVzdCBiZSByZWxlYXNlZCB2aWEgcHV0X3VzZXJfcGFnZSgpLiBQbGVhc2UKPiArICogc2VlIERv Y3VtZW50YXRpb24vdm0vcGluX3VzZXJfcGFnZXMucnN0IGZvciBmdXJ0aGVyIGRldGFpbHMuCj4g KyAqCj4gKyAqIEZPTExfTE9OR1RFUk0gbWVhbnMgdGhhdCB0aGUgcGFnZXMgYXJlIGJlaW5nIHBp bm5lZCBmb3IgImxvbmcgdGVybSIgdXNlLAo+ICsgKiB0eXBpY2FsbHkgYnkgYSBub24tQ1BVIGRl dmljZSwgYW5kIHdlIGNhbm5vdCBiZSBzdXJlIHRoYXQgd2FpdGluZyBmb3IgYQo+ICsgKiBwaW5u ZWQgcGFnZSB0byBiZWNvbWUgdW5waW4gd2lsbCBiZSBlZmZlY3RpdmUuCj4gKyAqCj4gKyAqIFRo aXMgaXMgaW50ZW5kZWQgZm9yIENhc2UgMiAoUkRNQTogbG9uZy10ZXJtIHBpbnMpIGluCj4gKyAq IERvY3VtZW50YXRpb24vdm0vcGluX3VzZXJfcGFnZXMucnN0Lgo+ICsgKi8KPiArbG9uZyBwaW5f bG9uZ3Rlcm1fcGFnZXNfcmVtb3RlKHN0cnVjdCB0YXNrX3N0cnVjdCAqdHNrLCBzdHJ1Y3QgbW1f c3RydWN0ICptbSwKPiArCQkJICAgICAgIHVuc2lnbmVkIGxvbmcgc3RhcnQsIHVuc2lnbmVkIGxv 
bmcgbnJfcGFnZXMsCj4gKwkJCSAgICAgICB1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3Qg cGFnZSAqKnBhZ2VzLAo+ICsJCQkgICAgICAgc3RydWN0IHZtX2FyZWFfc3RydWN0ICoqdm1hcywg aW50ICpsb2NrZWQpCj4gK3sKPiArCS8qIEZPTExfR0VUIGFuZCBGT0xMX1BJTiBhcmUgbXV0dWFs bHkgZXhjbHVzaXZlLiAqLwo+ICsJaWYgKFdBUk5fT05fT05DRShndXBfZmxhZ3MgJiBGT0xMX0dF VCkpCj4gKwkJcmV0dXJuIC1FSU5WQUw7Cj4gKwo+ICsJZ3VwX2ZsYWdzIHw9IEZPTExfTE9OR1RF Uk0gfCBGT0xMX1JFTU9URSB8IEZPTExfUElOOwo+ICsKPiArCXJldHVybiBfX2dldF91c2VyX3Bh Z2VzX2xvY2tlZCh0c2ssIG1tLCBzdGFydCwgbnJfcGFnZXMsIHBhZ2VzLCB2bWFzLAo+ICsJCQkJ ICAgICAgIGxvY2tlZCwgZ3VwX2ZsYWdzKTsKPiArfQo+ICtFWFBPUlRfU1lNQk9MKHBpbl9sb25n dGVybV9wYWdlc19yZW1vdGUpOwo+ICsKPiArLyoqCj4gKyAqIHBpbl91c2VyX3BhZ2VzKCkgLSBw aW4gdXNlciBwYWdlcyBpbiBtZW1vcnkgZm9yIHVzZSBieSBvdGhlciBkZXZpY2VzCj4gKyAqCj4g KyAqIE5lYXJseSB0aGUgc2FtZSBhcyBnZXRfdXNlcl9wYWdlcygpLCBleGNlcHQgdGhhdCBGT0xM X1RPVUNIIGlzIG5vdCBzZXQsIGFuZAo+ICsgKiBGT0xMX1BJTiBpcyBzZXQuCj4gKyAqCj4gKyAq IEZPTExfUElOIG1lYW5zIHRoYXQgdGhlIHBhZ2VzIG11c3QgYmUgcmVsZWFzZWQgdmlhIHB1dF91 c2VyX3BhZ2UoKS4gUGxlYXNlCj4gKyAqIHNlZSBEb2N1bWVudGF0aW9uL3ZtL3Bpbl91c2VyX3Bh Z2VzLnJzdCBmb3IgZGV0YWlscy4KPiArICoKPiArICogVGhpcyBpcyBpbnRlbmRlZCBmb3IgQ2Fz ZSAxIChESU8pIGluIERvY3VtZW50YXRpb24vdm0vcGluX3VzZXJfcGFnZXMucnN0LiBJdAo+ICsg KiBpcyBOT1QgaW50ZW5kZWQgZm9yIENhc2UgMiAoUkRNQTogbG9uZy10ZXJtIHBpbnMpLgo+ICsg Ki8KPiArbG9uZyBwaW5fdXNlcl9wYWdlcyh1bnNpZ25lZCBsb25nIHN0YXJ0LCB1bnNpZ25lZCBs b25nIG5yX3BhZ2VzLAo+ICsJCSAgICB1bnNpZ25lZCBpbnQgZ3VwX2ZsYWdzLCBzdHJ1Y3QgcGFn ZSAqKnBhZ2VzLAo+ICsJCSAgICBzdHJ1Y3Qgdm1fYXJlYV9zdHJ1Y3QgKip2bWFzKQo+ICt7Cj4g KwkvKiBGT0xMX0dFVCBhbmQgRk9MTF9QSU4gYXJlIG11dHVhbGx5IGV4Y2x1c2l2ZS4gKi8KPiAr CWlmIChXQVJOX09OX09OQ0UoZ3VwX2ZsYWdzICYgRk9MTF9HRVQpKQo+ICsJCXJldHVybiAtRUlO VkFMOwo+ICsKPiArCWd1cF9mbGFncyB8PSBGT0xMX1BJTjsKPiArCXJldHVybiBfX2d1cF9sb25n dGVybV9sb2NrZWQoY3VycmVudCwgY3VycmVudC0+bW0sIHN0YXJ0LCBucl9wYWdlcywKPiArCQkJ CSAgICAgcGFnZXMsIHZtYXMsIGd1cF9mbGFncyk7Cj4gK30KPiArRVhQT1JUX1NZTUJPTChwaW5f 
dXNlcl9wYWdlcyk7Cj4gKwo+ICsvKioKPiArICogcGluX2xvbmd0ZXJtX3BhZ2VzKCkgLSBwaW4g dXNlciBwYWdlcyBpbiBtZW1vcnkgZm9yIGxvbmctdGVybSB1c2UgKFJETUEsCj4gKyAqIHR5cGlj YWxseSkKPiArICoKPiArICogTmVhcmx5IHRoZSBzYW1lIGFzIGdldF91c2VyX3BhZ2VzKCksIGV4 Y2VwdCB0aGF0IEZPTExfUElOIGFuZCBGT0xMX0xPTkdURVJNCj4gKyAqIGFyZSBzZXQuIFNlZSBn ZXRfdXNlcl9wYWdlc19mYXN0KCkgZm9yIGRvY3VtZW50YXRpb24gb24gdGhlIGZ1bmN0aW9uCj4g KyAqIGFyZ3VtZW50cywgYmVjYXVzZSB0aGUgYXJndW1lbnRzIGhlcmUgYXJlIGlkZW50aWNhbC4K PiArICoKPiArICogRk9MTF9QSU4gbWVhbnMgdGhhdCB0aGUgcGFnZXMgbXVzdCBiZSByZWxlYXNl ZCB2aWEgcHV0X3VzZXJfcGFnZSgpLiBQbGVhc2UKPiArICogc2VlIERvY3VtZW50YXRpb24vdm0v cGluX3VzZXJfcGFnZXMucnN0IGZvciBmdXJ0aGVyIGRldGFpbHMuCj4gKyAqCj4gKyAqIEZPTExf TE9OR1RFUk0gbWVhbnMgdGhhdCB0aGUgcGFnZXMgYXJlIGJlaW5nIHBpbm5lZCBmb3IgImxvbmcg dGVybSIgdXNlLAo+ICsgKiB0eXBpY2FsbHkgYnkgYSBub24tQ1BVIGRldmljZSwgYW5kIHdlIGNh bm5vdCBiZSBzdXJlIHRoYXQgd2FpdGluZyBmb3IgYQo+ICsgKiBwaW5uZWQgcGFnZSB0byBiZWNv bWUgdW5waW4gd2lsbCBiZSBlZmZlY3RpdmUuCj4gKyAqCj4gKyAqIFRoaXMgaXMgaW50ZW5kZWQg Zm9yIENhc2UgMiAoUkRNQTogbG9uZy10ZXJtIHBpbnMpIGluCj4gKyAqIERvY3VtZW50YXRpb24v dm0vcGluX3VzZXJfcGFnZXMucnN0Lgo+ICsgKi8KPiArbG9uZyBwaW5fbG9uZ3Rlcm1fcGFnZXMo dW5zaWduZWQgbG9uZyBzdGFydCwgdW5zaWduZWQgbG9uZyBucl9wYWdlcywKPiArCQkJdW5zaWdu ZWQgaW50IGd1cF9mbGFncywgc3RydWN0IHBhZ2UgKipwYWdlcywKPiArCQkJc3RydWN0IHZtX2Fy ZWFfc3RydWN0ICoqdm1hcykKPiArewo+ICsJLyogRk9MTF9HRVQgYW5kIEZPTExfUElOIGFyZSBt dXR1YWxseSBleGNsdXNpdmUuICovCj4gKwlpZiAoV0FSTl9PTl9PTkNFKGd1cF9mbGFncyAmIEZP TExfR0VUKSkKPiArCQlyZXR1cm4gLUVJTlZBTDsKPiArCj4gKwlndXBfZmxhZ3MgfD0gRk9MTF9Q SU4gfCBGT0xMX0xPTkdURVJNOwo+ICsJcmV0dXJuIF9fZ3VwX2xvbmd0ZXJtX2xvY2tlZChjdXJy ZW50LCBjdXJyZW50LT5tbSwgc3RhcnQsIG5yX3BhZ2VzLAo+ICsJCQkJICAgICBwYWdlcywgdm1h cywgZ3VwX2ZsYWdzKTsKPiArfQo+ICtFWFBPUlRfU1lNQk9MKHBpbl9sb25ndGVybV9wYWdlcyk7 Cj4gLS0gCj4gMi4yNC4wCj4gCgotLSAKU2luY2VyZWx5IHlvdXJzLApNaWtlLgpfX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwpkcmktZGV2ZWwgbWFpbGluZyBs 
aXN0CmRyaS1kZXZlbEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5mcmVlZGVz a3RvcC5vcmcvbWFpbG1hbi9saXN0aW5mby9kcmktZGV2ZWw=