From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 16 Jun 2026 13:39:54 +0000
In-Reply-To: <20260616134000.2733403-1-praan@google.com>
References: <20260616134000.2733403-1-praan@google.com>
Message-ID: <20260616134000.2733403-2-praan@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.54.0.1136.gdb2ca164c4-goog
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Subject: [PATCH v2 1/7] nfs: make nfs_page pin-aware
From: Pranjal Shrivastava
To: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Christoph Hellwig,
 Shivaji Kant, Pranjal Shrivastava

Modernizing the NFS Direct I/O path to use iov_iter_extract_pages()
replaces ordinary page references with page pins (GUP). To release the
pages correctly, an nfs_page must track whether it holds a pin or an
ordinary reference.

Introduce a new flag, PG_PINNED, for struct nfs_page. Update the
creation paths (nfs_page_create_from_page() and
nfs_page_create_from_folio()) to take a 'pinned' bool and set the flag
accordingly. If the page is pinned, skip the existing reference
increment (get_page()/folio_get()), since the pin itself acts as a
reference.

Update nfs_clear_request() to call unpin_user_page() or
unpin_user_folio() instead of put_page()/folio_put() when PG_PINNED is
set, and update nfs_direct_release_pages() to call unpin_user_pages()
when its new 'pinned' argument is true. All existing callers pass false.

Finally, ensure subrequests inherit the pinning status of their parent
request.

Signed-off-by: Pranjal Shrivastava
---
 fs/nfs/direct.c          | 22 +++++++++++++++-------
 fs/nfs/pagelist.c        | 38 ++++++++++++++++++++++++++++----------
 fs/nfs/read.c            |  2 +-
 fs/nfs/write.c           |  2 +-
 include/linux/nfs_page.h |  3 +++
 5 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index e626c72495e6..19792a38c924 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -165,11 +165,17 @@ int nfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
 	return 0;
 }
 
-static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
+static void nfs_direct_release_pages(struct page **pages, unsigned int npages,
+				     bool pinned)
 {
 	unsigned int i;
-	for (i = 0; i < npages; i++)
-		put_page(pages[i]);
+
+	if (pinned) {
+		unpin_user_pages(pages, npages);
+	} else {
+		for (i = 0; i < npages; i++)
+			put_page(pages[i]);
+	}
 }
 
 void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
@@ -371,7 +377,8 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
 			unsigned int req_len = min_t(size_t, bytes, PAGE_SIZE - pgbase);
 			/* XXX do we need to do the eof zeroing found in async_filler? */
 			req = nfs_page_create_from_page(dreq->ctx, pagevec[i],
-							pgbase, pos, req_len);
+							false, pgbase, pos,
+							req_len);
 			if (IS_ERR(req)) {
 				result = PTR_ERR(req);
 				break;
@@ -386,7 +393,7 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
 			requested_bytes += req_len;
 			pos += req_len;
 		}
-		nfs_direct_release_pages(pagevec, npages);
+		nfs_direct_release_pages(pagevec, npages, false);
 		kvfree(pagevec);
 		if (result < 0)
 			break;
@@ -907,7 +914,8 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
 			unsigned int req_len = min_t(size_t, bytes, PAGE_SIZE - pgbase);
 
 			req = nfs_page_create_from_page(dreq->ctx, pagevec[i],
-							pgbase, pos, req_len);
+							false, pgbase, pos,
+							req_len);
 			if (IS_ERR(req)) {
 				result = PTR_ERR(req);
 				break;
@@ -950,7 +958,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
 			desc.pg_error = 0;
 			defer = true;
 		}
-		nfs_direct_release_pages(pagevec, npages);
+		nfs_direct_release_pages(pagevec, npages, false);
 		kvfree(pagevec);
 		if (result < 0)
 			break;
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 7dd478ffc2fa..faa8bc1c6526 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -404,20 +404,26 @@ static struct nfs_page *nfs_page_create(struct nfs_lock_context *l_ctx,
 	return req;
 }
 
-static void nfs_page_assign_folio(struct nfs_page *req, struct folio *folio)
+static void nfs_page_assign_folio(struct nfs_page *req, struct folio *folio, bool pinned)
 {
 	if (folio != NULL) {
 		req->wb_folio = folio;
-		folio_get(folio);
+		if (pinned)
+			set_bit(PG_PINNED, &req->wb_flags);
+		else
+			folio_get(folio);
 		set_bit(PG_FOLIO, &req->wb_flags);
 	}
 }
 
-static void nfs_page_assign_page(struct nfs_page *req, struct page *page)
+static void nfs_page_assign_page(struct nfs_page *req, struct page *page, bool pinned)
 {
 	if (page != NULL) {
 		req->wb_page = page;
-		get_page(page);
+		if (pinned)
+			set_bit(PG_PINNED, &req->wb_flags);
+		else
+			get_page(page);
 	}
 }
 
@@ -425,6 +431,7 @@ static void nfs_page_assign_page(struct nfs_page *req, struct page *page)
  * nfs_page_create_from_page - Create an NFS read/write request.
  * @ctx: open context to use
  * @page: page to write
+ * @pinned: true if page is pinned
  * @pgbase: starting offset within the page for the write
  * @offset: file offset for the write
  * @count: number of bytes to read/write
@@ -435,6 +442,7 @@ static void nfs_page_assign_page(struct nfs_page *req, struct page *page)
  */
 struct nfs_page *nfs_page_create_from_page(struct nfs_open_context *ctx,
 					   struct page *page,
+					   bool pinned,
 					   unsigned int pgbase, loff_t offset,
 					   unsigned int count)
 {
@@ -446,7 +454,7 @@ struct nfs_page *nfs_page_create_from_page(struct nfs_open_context *ctx,
 	ret = nfs_page_create(l_ctx, pgbase, offset >> PAGE_SHIFT,
 			      offset_in_page(offset), count);
 	if (!IS_ERR(ret)) {
-		nfs_page_assign_page(ret, page);
+		nfs_page_assign_page(ret, page, pinned);
 		nfs_page_group_init(ret, NULL);
 	}
 	nfs_put_lock_context(l_ctx);
@@ -457,6 +465,7 @@ struct nfs_page *nfs_page_create_from_page(struct nfs_open_context *ctx,
  * nfs_page_create_from_folio - Create an NFS read/write request.
  * @ctx: open context to use
  * @folio: folio to write
+ * @pinned: true if folio is pinned
  * @offset: starting offset within the folio for the write
  * @count: number of bytes to read/write
  *
@@ -466,6 +475,7 @@ struct nfs_page *nfs_page_create_from_page(struct nfs_open_context *ctx,
  */
 struct nfs_page *nfs_page_create_from_folio(struct nfs_open_context *ctx,
 					    struct folio *folio,
+					    bool pinned,
 					    unsigned int offset,
 					    unsigned int count)
 {
@@ -476,7 +486,7 @@ struct nfs_page *nfs_page_create_from_folio(struct nfs_open_context *ctx,
 		return ERR_CAST(l_ctx);
 	ret = nfs_page_create(l_ctx, offset, folio->index, offset, count);
 	if (!IS_ERR(ret)) {
-		nfs_page_assign_folio(ret, folio);
+		nfs_page_assign_folio(ret, folio, pinned);
 		nfs_page_group_init(ret, NULL);
 	}
 	nfs_put_lock_context(l_ctx);
@@ -498,9 +508,11 @@ nfs_create_subreq(struct nfs_page *req,
 			      offset, count);
 	if (!IS_ERR(ret)) {
 		if (folio)
-			nfs_page_assign_folio(ret, folio);
+			nfs_page_assign_folio(ret, folio,
+					      test_bit(PG_PINNED, &req->wb_flags));
 		else
-			nfs_page_assign_page(ret, page);
+			nfs_page_assign_page(ret, page,
+					     test_bit(PG_PINNED, &req->wb_flags));
 		/* find the last request */
 		for (last = req->wb_head;
 		     last->wb_this_page != req->wb_head;
@@ -552,11 +564,17 @@ static void nfs_clear_request(struct nfs_page *req)
 	struct nfs_open_context *ctx;
 
 	if (folio != NULL) {
-		folio_put(folio);
+		if (test_and_clear_bit(PG_PINNED, &req->wb_flags))
+			unpin_user_folio(folio, 1);
+		else
+			folio_put(folio);
 		req->wb_folio = NULL;
 		clear_bit(PG_FOLIO, &req->wb_flags);
 	} else if (page != NULL) {
-		put_page(page);
+		if (test_and_clear_bit(PG_PINNED, &req->wb_flags))
+			unpin_user_page(page);
+		else
+			put_page(page);
 		req->wb_page = NULL;
 	}
 	if (l_ctx != NULL) {
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 2b70bd2b934b..e7497b029d6c 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -324,7 +324,7 @@ int nfs_read_add_folio(struct nfs_pageio_descriptor *pgio,
 
 	aligned_len = min_t(unsigned int, ALIGN(len, rsize), fsize);
 
-	new = nfs_page_create_from_folio(ctx, folio, 0, aligned_len);
+	new = nfs_page_create_from_folio(ctx, folio, false, 0, aligned_len);
 	if (IS_ERR(new)) {
 		error = PTR_ERR(new);
 		if (nfs_netfs_folio_unlock(folio))
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index fcffb8c9e9df..e39e62b65ce2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1086,7 +1086,7 @@ static struct nfs_page *nfs_setup_write_request(struct nfs_open_context *ctx,
 	req = nfs_try_to_update_request(folio, offset, bytes);
 	if (req != NULL)
 		goto out;
-	req = nfs_page_create_from_folio(ctx, folio, offset, bytes);
+	req = nfs_page_create_from_folio(ctx, folio, false, offset, bytes);
 	if (IS_ERR(req))
 		goto out;
 	nfs_inode_add_request(req);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 4b9a35dbc062..fd7aafe7cb54 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -38,6 +38,7 @@ enum {
 	PG_REMOVE,		/* page group sync bit in write path */
 	PG_CONTENDED1,		/* Is someone waiting for a lock? */
 	PG_CONTENDED2,		/* Is someone waiting for a lock? */
+	PG_PINNED,		/* page is pinned by GUP */
 };
 
 struct nfs_inode;
@@ -125,11 +126,13 @@ struct nfs_pageio_descriptor {
 
 extern struct nfs_page *nfs_page_create_from_page(struct nfs_open_context *ctx,
 						  struct page *page,
+						  bool pinned,
 						  unsigned int pgbase,
 						  loff_t offset,
 						  unsigned int count);
 extern struct nfs_page *nfs_page_create_from_folio(struct nfs_open_context *ctx,
 						   struct folio *folio,
+						   bool pinned,
 						   unsigned int offset,
 						   unsigned int count);
 extern void nfs_release_request(struct nfs_page *);
-- 
2.54.0.1136.gdb2ca164c4-goog
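
The ownership rule described in the commit message (a request created with
pinned == true adopts the caller's pin and must later unpin; otherwise it
takes and drops its own reference) can be illustrated with a small
userspace model. This is only a sketch and not kernel code: struct
model_page, struct model_req, MODEL_PG_PINNED and both helper functions
are hypothetical stand-ins, and the separate refs/pins counters are a
simplification of how the kernel accounts for pins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: separate counters for clarity. */
struct model_page {
	int refs;	/* ordinary references (get_page/put_page analogue) */
	int pins;	/* pins (pin/unpin analogue) */
};

/* Hypothetical stand-in for the PG_PINNED bit in wb_flags. */
#define MODEL_PG_PINNED 0x1u

/* Hypothetical stand-in for struct nfs_page. */
struct model_req {
	struct model_page *page;
	unsigned int flags;
};

/*
 * Mirrors the shape of nfs_page_assign_page() in the patch: when the
 * caller already holds a pin, record that fact and take no extra
 * reference; otherwise take an ordinary reference.
 */
static void model_req_assign_page(struct model_req *req,
				  struct model_page *page, bool pinned)
{
	if (page == NULL)
		return;
	req->page = page;
	if (pinned)
		req->flags |= MODEL_PG_PINNED;
	else
		page->refs++;
}

/*
 * Mirrors the shape of nfs_clear_request() in the patch: release with
 * the primitive that matches how the page was acquired, and clear the
 * flag so a second call cannot release twice.
 */
static void model_req_clear(struct model_req *req)
{
	struct model_page *page = req->page;

	if (page == NULL)
		return;
	if (req->flags & MODEL_PG_PINNED) {
		req->flags &= ~MODEL_PG_PINNED;
		page->pins--;
	} else {
		page->refs--;
	}
	req->page = NULL;
}
```

In this model, a caller that pinned a page (pins == 1) and passes
pinned == true ends with pins == 0 after model_req_clear(), with refs
untouched; a caller passing false sees refs go up by one on assignment and
back down on clear.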