* [RFC PATCH 0/2] NFSD: Rate-limiting unstable WRITEs
@ 2025-12-19 14:11 Chuck Lever
  2025-12-19 14:11 ` [RFC PATCH 1/2] NFSD: Add aggressive write throttling control Chuck Lever
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
  0 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever @ 2025-12-19 14:11 UTC
  To: Christoph Hellwig, Mike Snitzer; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Following up on

  https://lore.kernel.org/linux-nfs/99dd427d-a16e-4494-a4b1-ff65488181ee@oracle.com/

Client workloads that are write-intensive can sometimes trigger an
NFSD meltdown (thrashing, livelocking, or becoming unresponsive).
This can happen when clients present NFSD with more UNSTABLE WRITEs
than can fit in the server's physical memory, and the system simply
can't get those dirty pages onto persistent storage fast enough.

In those cases, it makes sense to slow those clients down until the
backlog can be cleared out. NFSD might do this by delaying the
responses to UNSTABLE WRITEs, which in turn leaves unprocessed
ingress WRITEs on the transport queue longer, and thus closes down
the ingress congestion window on the network connection. This
applies direct backpressure on the noisy clients.

NFSD might already be doing this to some extent, but it can be
argued that it is not going far enough.

These two patches fall squarely in the "crazy ideas" category, but
I hope they serve as conversation starters.

Chuck Lever (2):
  NFSD: Add aggressive write throttling control
  NFSD: Add asynchronous write throttling support

 fs/nfsd/debugfs.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/nfsd.h    | 10 +++++++
 fs/nfsd/vfs.c     | 34 ++++++++++++++++++++++++
 3 files changed, 111 insertions(+)

-- 
2.52.0



* [RFC PATCH 1/2] NFSD: Add aggressive write throttling control
  2025-12-19 14:11 [RFC PATCH 0/2] NFSD: Rate-limiting unstable WRITEs Chuck Lever
@ 2025-12-19 14:11 ` Chuck Lever
  2026-01-07  7:55   ` Christoph Hellwig
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
  1 sibling, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2025-12-19 14:11 UTC
  To: Christoph Hellwig, Mike Snitzer; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

On NFS servers with fast network links but slow storage, clients can
generate WRITE requests faster than the server can flush payloads to
durable storage. This can push the server into memory exhaustion as
dirty pages accumulate across hundreds of concurrent NFSD threads.

The existing dirty page throttling (balance_dirty_pages()) uses
per-task accounting with default ratelimits that allow each thread
to dirty ~32 pages before throttling occurs. With many NFSD threads,
this allows significant dirty page accumulation before any
throttling kicks in.

Add a debugfs control to enable aggressive write throttling for
NFSD:

  /sys/kernel/debug/nfsd/write_throttle

When set to 1, NFSD write operations reduce nr_dirtied_pause to
force balance_dirty_pages() to be called more frequently. This uses
the same page-size-adjusted limit that
balance_dirty_pages_ratelimited_flags() applies when
wb->dirty_exceeded is true, providing 4x more frequent throttling on
systems with 4KB pages.

The setting defaults to 0 (normal throttling) and can be changed at
runtime without restarting NFSD.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/debugfs.c | 33 +++++++++++++++++++++++++++++++++
 fs/nfsd/nfsd.h    |  9 +++++++++
 fs/nfsd/vfs.c     | 17 +++++++++++++++++
 3 files changed, 59 insertions(+)

diff --git a/fs/nfsd/debugfs.c b/fs/nfsd/debugfs.c
index 7f44689e0a53..f3d9e957cc5c 100644
--- a/fs/nfsd/debugfs.c
+++ b/fs/nfsd/debugfs.c
@@ -122,6 +122,36 @@ static int nfsd_io_cache_write_set(void *data, u64 val)
 DEFINE_DEBUGFS_ATTRIBUTE(nfsd_io_cache_write_fops, nfsd_io_cache_write_get,
 			 nfsd_io_cache_write_set, "%llu\n");
 
+/*
+ * /sys/kernel/debug/nfsd/write_throttle
+ *
+ * Contents:
+ *   %0: Normal throttling (default)
+ *   %1: Aggressive throttling for NFSD writes
+ *
+ * When set to 1, NFSD write operations are throttled more aggressively
+ * to prevent memory exhaustion when fast network clients overwhelm slow
+ * storage. This is useful when the server has limited memory or slow disks.
+ *
+ * This setting takes immediate effect for all NFS versions, all exports,
+ * and in all NFSD net namespaces.
+ */
+
+static int nfsd_write_throttle_get(void *data, u64 *val)
+{
+	*val = nfsd_aggressive_write_throttle ? 1 : 0;
+	return 0;
+}
+
+static int nfsd_write_throttle_set(void *data, u64 val)
+{
+	nfsd_aggressive_write_throttle = (val > 0);
+	return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(nfsd_write_throttle_fops, nfsd_write_throttle_get,
+			 nfsd_write_throttle_set, "%llu\n");
+
 void nfsd_debugfs_exit(void)
 {
 	debugfs_remove_recursive(nfsd_top_dir);
@@ -140,4 +170,7 @@ void nfsd_debugfs_init(void)
 
 	debugfs_create_file("io_cache_write", 0644, nfsd_top_dir, NULL,
 			    &nfsd_io_cache_write_fops);
+
+	debugfs_create_file("write_throttle", 0644, nfsd_top_dir, NULL,
+			    &nfsd_write_throttle_fops);
 }
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index b0283213a8f5..16a259839768 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -165,6 +165,15 @@ enum {
 
 extern u64 nfsd_io_cache_read __read_mostly;
 extern u64 nfsd_io_cache_write __read_mostly;
+extern bool nfsd_aggressive_write_throttle __read_mostly;
+
+/*
+ * Aggressive write throttling reduces nr_dirtied_pause to force more
+ * frequent calls to balance_dirty_pages(). This uses the same page-size
+ * adjusted formula as balance_dirty_pages_ratelimited_flags() when
+ * wb->dirty_exceeded is true (see mm/page-writeback.c:2066).
+ */
+#define NFSD_AGGRESSIVE_DIRTY_LIMIT	(32 >> (PAGE_SHIFT - 10))
 
 extern int nfsd_max_blksize;
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 168d3ccc8155..33805b9ac7e4 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -51,6 +51,7 @@
 bool nfsd_disable_splice_read __read_mostly;
 u64 nfsd_io_cache_read __read_mostly = NFSD_IO_BUFFERED;
 u64 nfsd_io_cache_write __read_mostly = NFSD_IO_BUFFERED;
+bool nfsd_aggressive_write_throttle __read_mostly;
 
 /**
  * nfserrno - Map Linux errnos to NFS errnos
@@ -1420,6 +1421,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	unsigned int		pflags = current->flags;
 	bool			restore_flags = false;
 	unsigned int		nvecs;
+	int			saved_nr_dirtied_pause = 0;
+	bool			throttle_adjusted = false;
 
 	trace_nfsd_write_opened(rqstp, fhp, offset, *cnt);
 
@@ -1441,6 +1444,18 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	exp = fhp->fh_export;
 
+	/*
+	 * If aggressive write throttling is enabled, reduce the per-task
+	 * dirty page limit to throttle NFSD writes more aggressively.
+	 * This helps prevent memory exhaustion when fast network clients
+	 * overwhelm slow storage.
+	 */
+	if (nfsd_aggressive_write_throttle) {
+		saved_nr_dirtied_pause = current->nr_dirtied_pause;
+		current->nr_dirtied_pause = NFSD_AGGRESSIVE_DIRTY_LIMIT;
+		throttle_adjusted = true;
+	}
+
 	if (!EX_ISSYNC(exp))
 		stable = NFS_UNSTABLE;
 	init_sync_kiocb(&kiocb, file);
@@ -1505,6 +1520,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		trace_nfsd_write_err(rqstp, fhp, offset, host_err);
 		nfserr = nfserrno(host_err);
 	}
+	if (throttle_adjusted)
+		current->nr_dirtied_pause = saved_nr_dirtied_pause;
 	if (restore_flags)
 		current_restore_flags(pflags, PF_LOCAL_THROTTLE);
 	return nfserr;
-- 
2.52.0



* [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 [RFC PATCH 0/2] NFSD: Rate-limiting unstable WRITEs Chuck Lever
  2025-12-19 14:11 ` [RFC PATCH 1/2] NFSD: Add aggressive write throttling control Chuck Lever
@ 2025-12-19 14:11 ` Chuck Lever
  2025-12-20 15:34   ` kernel test robot
                     ` (4 more replies)
  1 sibling, 5 replies; 15+ messages in thread
From: Chuck Lever @ 2025-12-19 14:11 UTC
  To: Christoph Hellwig, Mike Snitzer; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

When memory pressure occurs during buffered writes, the traditional
approach is for balance_dirty_pages() to put the writing thread to
sleep until dirty pages are flushed. For NFSD, this means server
threads block waiting for I/O, reducing overall server throughput.

Add support for asynchronous write throttling using the BDP_ASYNC
flag to balance_dirty_pages_ratelimited_flags(). When enabled via:

  /sys/kernel/debug/nfsd/write_async_throttle

NFSD checks memory pressure before attempting buffered writes. If
balance_dirty_pages_ratelimited_flags() returns -EAGAIN (indicating
memory exhaustion), NFSD returns NFS4ERR_DELAY (or NFSERR_JUKEBOX for
NFSv3) to the client instead of blocking.

This allows clients to back off and retry rather than having server
threads tied up waiting for writeback. The setting defaults to 0
(synchronous throttling) and can be combined with write_throttle for
layered throttling strategies.

Note: NFSv2 does not support NFSERR_JUKEBOX, so async throttling is
automatically disabled for NFSv2 requests regardless of the setting.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/debugfs.c | 34 ++++++++++++++++++++++++++++++++++
 fs/nfsd/nfsd.h    |  1 +
 fs/nfsd/vfs.c     | 17 +++++++++++++++++
 3 files changed, 52 insertions(+)

diff --git a/fs/nfsd/debugfs.c b/fs/nfsd/debugfs.c
index f3d9e957cc5c..f2cce37589ce 100644
--- a/fs/nfsd/debugfs.c
+++ b/fs/nfsd/debugfs.c
@@ -152,6 +152,37 @@ static int nfsd_write_throttle_set(void *data, u64 val)
 DEFINE_DEBUGFS_ATTRIBUTE(nfsd_write_throttle_fops, nfsd_write_throttle_get,
 			 nfsd_write_throttle_set, "%llu\n");
 
+/*
+ * /sys/kernel/debug/nfsd/write_async_throttle
+ *
+ * Contents:
+ *   %0: Synchronous throttling (default) - writes sleep in balance_dirty_pages()
+ *   %1: Asynchronous throttling - return NFS4ERR_DELAY when memory is tight
+ *
+ * When set to 1, NFSD uses BDP_ASYNC mode which returns -EAGAIN from
+ * balance_dirty_pages_ratelimited_flags() instead of sleeping. This allows
+ * NFSD to return NFS4ERR_DELAY (or NFSERR_JUKEBOX for NFSv3), letting
+ * clients back off and retry rather than having NFSD threads blocked.
+ *
+ * This setting takes immediate effect for all NFS versions, all exports,
+ * and in all NFSD net namespaces.
+ */
+
+static int nfsd_async_throttle_get(void *data, u64 *val)
+{
+	*val = nfsd_async_write_throttle ? 1 : 0;
+	return 0;
+}
+
+static int nfsd_async_throttle_set(void *data, u64 val)
+{
+	nfsd_async_write_throttle = (val > 0);
+	return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(nfsd_async_throttle_fops, nfsd_async_throttle_get,
+			 nfsd_async_throttle_set, "%llu\n");
+
 void nfsd_debugfs_exit(void)
 {
 	debugfs_remove_recursive(nfsd_top_dir);
@@ -173,4 +204,7 @@ void nfsd_debugfs_init(void)
 
 	debugfs_create_file("write_throttle", 0644, nfsd_top_dir, NULL,
 			    &nfsd_write_throttle_fops);
+
+	debugfs_create_file("write_async_throttle", 0644, nfsd_top_dir, NULL,
+			    &nfsd_async_throttle_fops);
 }
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 16a259839768..ea61db58ef95 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -166,6 +166,7 @@ enum {
 extern u64 nfsd_io_cache_read __read_mostly;
 extern u64 nfsd_io_cache_write __read_mostly;
 extern bool nfsd_aggressive_write_throttle __read_mostly;
+extern bool nfsd_async_write_throttle __read_mostly;
 
 /*
  * Aggressive write throttling reduces nr_dirtied_pause to force more
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 33805b9ac7e4..0fcfd29e843d 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -52,6 +52,7 @@ bool nfsd_disable_splice_read __read_mostly;
 u64 nfsd_io_cache_read __read_mostly = NFSD_IO_BUFFERED;
 u64 nfsd_io_cache_write __read_mostly = NFSD_IO_BUFFERED;
 bool nfsd_aggressive_write_throttle __read_mostly;
+bool nfsd_async_write_throttle __read_mostly;
 
 /**
  * nfserrno - Map Linux errnos to NFS errnos
@@ -1473,6 +1474,22 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		}
 	}
 
+	/*
+	 * If async throttling is enabled, check memory pressure
+	 * before attempting buffered writes. Return -EAGAIN if
+	 * the system is low on memory, allowing NFSD to return
+	 * an NFS error code asking the client to retry later.
+	 *
+	 * Skip this for NFSv2 since it lacks NFSERR_JUKEBOX.
+	 */
+	if (nfsd_async_write_throttle && rqstp->rq_vers >= 3) {
+		host_err =
+			balance_dirty_pages_ratelimited_flags(file->f_mapping,
+							      BDP_ASYNC);
+		if (host_err == -EAGAIN)
+			break;
+	}
+
 	nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
 
 	since = READ_ONCE(file->f_wb_err);
-- 
2.52.0



* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
@ 2025-12-20 15:34   ` kernel test robot
  2025-12-21  5:41   ` kernel test robot
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-12-20 15:34 UTC
  To: Chuck Lever; +Cc: llvm, oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on next-20251219]
[cannot apply to linus/master v6.16-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/NFSD-Add-aggressive-write-throttling-control/20251219-221859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20251219141105.1247093-3-cel%40kernel.org
patch subject: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20251220/202512201657.c3KKm6Kh-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251220/202512201657.c3KKm6Kh-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512201657.c3KKm6Kh-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/nfsd/vfs.c:1490:4: error: 'break' statement not in loop or switch statement
    1490 |                         break;
         |                         ^
   1 error generated.


vim +/break +1490 fs/nfsd/vfs.c

  1389	
  1390	/**
  1391	 * nfsd_vfs_write - write data to an already-open file
  1392	 * @rqstp: RPC execution context
  1393	 * @fhp: File handle of file to write into
  1394	 * @nf: An open file matching @fhp
  1395	 * @offset: Byte offset of start
  1396	 * @payload: xdr_buf containing the write payload
  1397	 * @cnt: IN: number of bytes to write, OUT: number of bytes actually written
  1398	 * @stable: An NFS stable_how value
  1399	 * @verf: NFS WRITE verifier
  1400	 *
  1401	 * Upon return, caller must invoke fh_put on @fhp.
  1402	 *
  1403	 * Return values:
  1404	 *   An nfsstat value in network byte order.
  1405	 */
  1406	__be32
  1407	nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
  1408		       struct nfsd_file *nf, loff_t offset,
  1409		       const struct xdr_buf *payload, unsigned long *cnt,
  1410		       int stable, __be32 *verf)
  1411	{
  1412		struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
  1413		struct file		*file = nf->nf_file;
  1414		struct super_block	*sb = file_inode(file)->i_sb;
  1415		struct kiocb		kiocb;
  1416		struct svc_export	*exp;
  1417		struct iov_iter		iter;
  1418		errseq_t		since;
  1419		__be32			nfserr;
  1420		int			host_err;
  1421		unsigned long		exp_op_flags = 0;
  1422		unsigned int		pflags = current->flags;
  1423		bool			restore_flags = false;
  1424		unsigned int		nvecs;
  1425		int			saved_nr_dirtied_pause = 0;
  1426		bool			throttle_adjusted = false;
  1427	
  1428		trace_nfsd_write_opened(rqstp, fhp, offset, *cnt);
  1429	
  1430		if (sb->s_export_op)
  1431			exp_op_flags = sb->s_export_op->flags;
  1432	
  1433		if (test_bit(RQ_LOCAL, &rqstp->rq_flags) &&
  1434		    !(exp_op_flags & EXPORT_OP_REMOTE_FS)) {
  1435			/*
  1436			 * We want throttling in balance_dirty_pages()
  1437			 * and shrink_inactive_list() to only consider
  1438			 * the backingdev we are writing to, so that nfs to
  1439			 * localhost doesn't cause nfsd to lock up due to all
  1440			 * the client's dirty pages or its congested queue.
  1441			 */
  1442			current->flags |= PF_LOCAL_THROTTLE;
  1443			restore_flags = true;
  1444		}
  1445	
  1446		exp = fhp->fh_export;
  1447	
  1448		/*
  1449		 * If aggressive write throttling is enabled, reduce the per-task
  1450		 * dirty page limit to throttle NFSD writes more aggressively.
  1451		 * This helps prevent memory exhaustion when fast network clients
  1452		 * overwhelm slow storage.
  1453		 */
  1454		if (nfsd_aggressive_write_throttle) {
  1455			saved_nr_dirtied_pause = current->nr_dirtied_pause;
  1456			current->nr_dirtied_pause = NFSD_AGGRESSIVE_DIRTY_LIMIT;
  1457			throttle_adjusted = true;
  1458		}
  1459	
  1460		if (!EX_ISSYNC(exp))
  1461			stable = NFS_UNSTABLE;
  1462		init_sync_kiocb(&kiocb, file);
  1463		kiocb.ki_pos = offset;
  1464		if (likely(!fhp->fh_use_wgather)) {
  1465			switch (stable) {
  1466			case NFS_FILE_SYNC:
  1467				/* persist data and timestamps */
  1468				kiocb.ki_flags |= IOCB_DSYNC | IOCB_SYNC;
  1469				break;
  1470			case NFS_DATA_SYNC:
  1471				/* persist data only */
  1472				kiocb.ki_flags |= IOCB_DSYNC;
  1473				break;
  1474			}
  1475		}
  1476	
  1477		/*
  1478		 * If async throttling is enabled, check memory pressure
  1479		 * before attempting buffered writes. Return -EAGAIN if
  1480		 * the system is low on memory, allowing NFSD to return
  1481		 * an NFS error code asking the client to retry later.
  1482		 *
  1483		 * Skip this for NFSv2 since it lacks NFSERR_JUKEBOX.
  1484		 */
  1485		if (nfsd_async_write_throttle && rqstp->rq_vers >= 3) {
  1486			host_err =
  1487				balance_dirty_pages_ratelimited_flags(file->f_mapping,
  1488								      BDP_ASYNC);
  1489			if (host_err == -EAGAIN)
> 1490				break;
  1491		}
  1492	
  1493		nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
  1494	
  1495		since = READ_ONCE(file->f_wb_err);
  1496		if (verf)
  1497			nfsd_copy_write_verifier(verf, nn);
  1498	
  1499		switch (nfsd_io_cache_write) {
  1500		case NFSD_IO_DIRECT:
  1501			host_err = nfsd_direct_write(rqstp, fhp, nf, nvecs,
  1502						     cnt, &kiocb);
  1503			break;
  1504		case NFSD_IO_DONTCACHE:
  1505			if (file->f_op->fop_flags & FOP_DONTCACHE)
  1506				kiocb.ki_flags |= IOCB_DONTCACHE;
  1507			fallthrough;
  1508		case NFSD_IO_BUFFERED:
  1509			iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
  1510			host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
  1511			if (host_err < 0)
  1512				break;
  1513			*cnt = host_err;
  1514			break;
  1515		}
  1516		if (host_err < 0) {
  1517			commit_reset_write_verifier(nn, rqstp, host_err);
  1518			goto out_nfserr;
  1519		}
  1520		nfsd_stats_io_write_add(nn, exp, *cnt);
  1521		fsnotify_modify(file);
  1522		host_err = filemap_check_wb_err(file->f_mapping, since);
  1523		if (host_err < 0)
  1524			goto out_nfserr;
  1525	
  1526		if (stable && fhp->fh_use_wgather) {
  1527			host_err = wait_for_concurrent_writes(file);
  1528			if (host_err < 0)
  1529				commit_reset_write_verifier(nn, rqstp, host_err);
  1530		}
  1531	
  1532	out_nfserr:
  1533		if (host_err >= 0) {
  1534			trace_nfsd_write_io_done(rqstp, fhp, offset, *cnt);
  1535			nfserr = nfs_ok;
  1536		} else {
  1537			trace_nfsd_write_err(rqstp, fhp, offset, host_err);
  1538			nfserr = nfserrno(host_err);
  1539		}
  1540		if (throttle_adjusted)
  1541			current->nr_dirtied_pause = saved_nr_dirtied_pause;
  1542		if (restore_flags)
  1543			current_restore_flags(pflags, PF_LOCAL_THROTTLE);
  1544		return nfserr;
  1545	}
  1546	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
  2025-12-20 15:34   ` kernel test robot
@ 2025-12-21  5:41   ` kernel test robot
  2025-12-22 18:06   ` kernel test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-12-21  5:41 UTC
  To: Chuck Lever; +Cc: oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on linus/master v6.19-rc1 next-20251219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/NFSD-Add-aggressive-write-throttling-control/20251219-221859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20251219141105.1247093-3-cel%40kernel.org
patch subject: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20251221/202512210637.Fz6bpRxI-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251221/202512210637.Fz6bpRxI-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512210637.Fz6bpRxI-lkp@intel.com/

All errors (new ones prefixed by >>):

   fs/nfsd/vfs.c: In function 'nfsd_vfs_write':
>> fs/nfsd/vfs.c:1490:25: error: break statement not within loop or switch
    1490 |                         break;
         |                         ^~~~~


vim +1490 fs/nfsd/vfs.c


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
  2025-12-20 15:34   ` kernel test robot
  2025-12-21  5:41   ` kernel test robot
@ 2025-12-22 18:06   ` kernel test robot
  2025-12-22 23:47   ` kernel test robot
  2026-01-07  8:00   ` Christoph Hellwig
  4 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-12-22 18:06 UTC
  To: Chuck Lever; +Cc: oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on linus/master v6.19-rc2 next-20251219]
[cannot apply to hch-configfs/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/NFSD-Add-aggressive-write-throttling-control/20251219-221859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20251219141105.1247093-3-cel%40kernel.org
patch subject: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
config: parisc-randconfig-001-20251223 (https://download.01.org/0day-ci/archive/20251223/202512230126.gowu7NIP-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251223/202512230126.gowu7NIP-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512230126.gowu7NIP-lkp@intel.com/

All errors (new ones prefixed by >>):

   fs/nfsd/vfs.c: In function 'nfsd_vfs_write':
>> fs/nfsd/vfs.c:1490:4: error: break statement not within loop or switch
       break;
       ^~~~~


vim +1490 fs/nfsd/vfs.c

  1389	
  1390	/**
  1391	 * nfsd_vfs_write - write data to an already-open file
  1392	 * @rqstp: RPC execution context
  1393	 * @fhp: File handle of file to write into
  1394	 * @nf: An open file matching @fhp
  1395	 * @offset: Byte offset of start
  1396	 * @payload: xdr_buf containing the write payload
  1397	 * @cnt: IN: number of bytes to write, OUT: number of bytes actually written
  1398	 * @stable: An NFS stable_how value
  1399	 * @verf: NFS WRITE verifier
  1400	 *
  1401	 * Upon return, caller must invoke fh_put on @fhp.
  1402	 *
  1403	 * Return values:
  1404	 *   An nfsstat value in network byte order.
  1405	 */
  1406	__be32
  1407	nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
  1408		       struct nfsd_file *nf, loff_t offset,
  1409		       const struct xdr_buf *payload, unsigned long *cnt,
  1410		       int stable, __be32 *verf)
  1411	{
  1412		struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
  1413		struct file		*file = nf->nf_file;
  1414		struct super_block	*sb = file_inode(file)->i_sb;
  1415		struct kiocb		kiocb;
  1416		struct svc_export	*exp;
  1417		struct iov_iter		iter;
  1418		errseq_t		since;
  1419		__be32			nfserr;
  1420		int			host_err;
  1421		unsigned long		exp_op_flags = 0;
  1422		unsigned int		pflags = current->flags;
  1423		bool			restore_flags = false;
  1424		unsigned int		nvecs;
  1425		int			saved_nr_dirtied_pause = 0;
  1426		bool			throttle_adjusted = false;
  1427	
  1428		trace_nfsd_write_opened(rqstp, fhp, offset, *cnt);
  1429	
  1430		if (sb->s_export_op)
  1431			exp_op_flags = sb->s_export_op->flags;
  1432	
  1433		if (test_bit(RQ_LOCAL, &rqstp->rq_flags) &&
  1434		    !(exp_op_flags & EXPORT_OP_REMOTE_FS)) {
  1435			/*
  1436			 * We want throttling in balance_dirty_pages()
  1437			 * and shrink_inactive_list() to only consider
  1438			 * the backingdev we are writing to, so that nfs to
  1439			 * localhost doesn't cause nfsd to lock up due to all
  1440			 * the client's dirty pages or its congested queue.
  1441			 */
  1442			current->flags |= PF_LOCAL_THROTTLE;
  1443			restore_flags = true;
  1444		}
  1445	
  1446		exp = fhp->fh_export;
  1447	
  1448		/*
  1449		 * If aggressive write throttling is enabled, reduce the per-task
  1450		 * dirty page limit to throttle NFSD writes more aggressively.
  1451		 * This helps prevent memory exhaustion when fast network clients
  1452		 * overwhelm slow storage.
  1453		 */
  1454		if (nfsd_aggressive_write_throttle) {
  1455			saved_nr_dirtied_pause = current->nr_dirtied_pause;
  1456			current->nr_dirtied_pause = NFSD_AGGRESSIVE_DIRTY_LIMIT;
  1457			throttle_adjusted = true;
  1458		}
  1459	
  1460		if (!EX_ISSYNC(exp))
  1461			stable = NFS_UNSTABLE;
  1462		init_sync_kiocb(&kiocb, file);
  1463		kiocb.ki_pos = offset;
  1464		if (likely(!fhp->fh_use_wgather)) {
  1465			switch (stable) {
  1466			case NFS_FILE_SYNC:
  1467				/* persist data and timestamps */
  1468				kiocb.ki_flags |= IOCB_DSYNC | IOCB_SYNC;
  1469				break;
  1470			case NFS_DATA_SYNC:
  1471				/* persist data only */
  1472				kiocb.ki_flags |= IOCB_DSYNC;
  1473				break;
  1474			}
  1475		}
  1476	
  1477		/*
  1478		 * If async throttling is enabled, check memory pressure
  1479		 * before attempting buffered writes. Return -EAGAIN if
  1480		 * the system is low on memory, allowing NFSD to return
  1481		 * an NFS error code asking the client to retry later.
  1482		 *
  1483		 * Skip this for NFSv2 since it lacks NFSERR_JUKEBOX.
  1484		 */
  1485		if (nfsd_async_write_throttle && rqstp->rq_vers >= 3) {
  1486			host_err =
  1487				balance_dirty_pages_ratelimited_flags(file->f_mapping,
  1488								      BDP_ASYNC);
  1489			if (host_err == -EAGAIN)
> 1490				break;
  1491		}
  1492	
  1493		nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
  1494	
  1495		since = READ_ONCE(file->f_wb_err);
  1496		if (verf)
  1497			nfsd_copy_write_verifier(verf, nn);
  1498	
  1499		switch (nfsd_io_cache_write) {
  1500		case NFSD_IO_DIRECT:
  1501			host_err = nfsd_direct_write(rqstp, fhp, nf, nvecs,
  1502						     cnt, &kiocb);
  1503			break;
  1504		case NFSD_IO_DONTCACHE:
  1505			if (file->f_op->fop_flags & FOP_DONTCACHE)
  1506				kiocb.ki_flags |= IOCB_DONTCACHE;
  1507			fallthrough;
  1508		case NFSD_IO_BUFFERED:
  1509			iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
  1510			host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
  1511			if (host_err < 0)
  1512				break;
  1513			*cnt = host_err;
  1514			break;
  1515		}
  1516		if (host_err < 0) {
  1517			commit_reset_write_verifier(nn, rqstp, host_err);
  1518			goto out_nfserr;
  1519		}
  1520		nfsd_stats_io_write_add(nn, exp, *cnt);
  1521		fsnotify_modify(file);
  1522		host_err = filemap_check_wb_err(file->f_mapping, since);
  1523		if (host_err < 0)
  1524			goto out_nfserr;
  1525	
  1526		if (stable && fhp->fh_use_wgather) {
  1527			host_err = wait_for_concurrent_writes(file);
  1528			if (host_err < 0)
  1529				commit_reset_write_verifier(nn, rqstp, host_err);
  1530		}
  1531	
  1532	out_nfserr:
  1533		if (host_err >= 0) {
  1534			trace_nfsd_write_io_done(rqstp, fhp, offset, *cnt);
  1535			nfserr = nfs_ok;
  1536		} else {
  1537			trace_nfsd_write_err(rqstp, fhp, offset, host_err);
  1538			nfserr = nfserrno(host_err);
  1539		}
  1540		if (throttle_adjusted)
  1541			current->nr_dirtied_pause = saved_nr_dirtied_pause;
  1542		if (restore_flags)
  1543			current_restore_flags(pflags, PF_LOCAL_THROTTLE);
  1544		return nfserr;
  1545	}
  1546	
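
The compiler rejects line 1490 because the `break` sits inside a plain `if` block, with no enclosing loop or switch. One plausible repair, sketched here as a small userspace model and not taken from any follow-up patch, is to jump to the function's existing cleanup label so that the saved task state is still restored. The names model_dirty_check(), model_vfs_write() and struct model_result are hypothetical; only the control flow mirrors the listing above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result record: what the modeled write path did. */
struct model_result {
	int  host_err;	/* 0 or a negative errno */
	bool wrote;	/* true if the write step ran */
	bool restored;	/* true if the cleanup at the label ran */
};

/* Hypothetical stand-in for the BDP_ASYNC dirty-page check. */
static int model_dirty_check(bool under_pressure)
{
	return under_pressure ? -EAGAIN : 0;
}

static struct model_result model_vfs_write(bool throttle_enabled, int vers,
					   bool under_pressure)
{
	struct model_result res = { 0, false, false };

	if (throttle_enabled && vers >= 3) {
		res.host_err = model_dirty_check(under_pressure);
		if (res.host_err == -EAGAIN)
			goto out;	/* "break" is invalid here: no loop or switch */
	}

	res.wrote = true;	/* the write itself would happen here */

out:
	res.restored = true;	/* models restoring nr_dirtied_pause and flags */
	return res;
}
```

In the real function the corresponding label would presumably be out_nfserr, which already converts host_err with nfserrno() and restores the task state. Whether the write verifier should also be reset on that path is a question for the patch author.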

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
                     ` (2 preceding siblings ...)
  2025-12-22 18:06   ` kernel test robot
@ 2025-12-22 23:47   ` kernel test robot
  2026-01-07  8:00   ` Christoph Hellwig
  4 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-12-22 23:47 UTC (permalink / raw)
  To: Chuck Lever; +Cc: llvm, oe-kbuild-all

Hi Chuck,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on brauner-vfs/vfs.all]
[also build test ERROR on linus/master v6.19-rc2 next-20251219]
[cannot apply to hch-configfs/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/NFSD-Add-aggressive-write-throttling-control/20251219-221859
base:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link:    https://lore.kernel.org/r/20251219141105.1247093-3-cel%40kernel.org
patch subject: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
config: loongarch-defconfig (https://download.01.org/0day-ci/archive/20251223/202512230750.htK4fXlz-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251223/202512230750.htK4fXlz-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512230750.htK4fXlz-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/nfsd/vfs.c:1490:4: error: 'break' statement not in loop or switch statement
    1490 |                         break;
         |                         ^
   1 error generated.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/2] NFSD: Add aggressive write throttling control
  2025-12-19 14:11 ` [RFC PATCH 1/2] NFSD: Add aggressive write throttling control Chuck Lever
@ 2026-01-07  7:55   ` Christoph Hellwig
  2026-01-07 14:36     ` Chuck Lever
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2026-01-07  7:55 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Christoph Hellwig, Mike Snitzer, linux-nfs, Chuck Lever, linux-mm,
	linux-fsdevel

On Fri, Dec 19, 2025 at 09:11:04AM -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> On NFS servers with fast network links but slow storage, clients can
> generate WRITE requests faster than the server can flush payloads to
> durable storage. This can push the server into memory exhaustion as
> dirty pages accumulate across hundreds of concurrent NFSD threads.
> 
> The existing dirty page throttling (balance_dirty_pages()) uses
> per-task accounting with default ratelimits that allow each thread
> to dirty ~32 pages before throttling occurs. With many NFSD threads,
> this allows significant dirty page accumulation before any
> throttling kicks in.

What makes NFSD so special here vs say a userspace process with a bunch
of threads?  Also what is the actual problem we're trying to solve?

I kinda hate having this stuff in NFSD when there's nothing specific
about nfs serving here.
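
For a sense of scale, the per-task accounting described in the quoted commit message can be worked through as simple arithmetic. This is only a sketch: the ~32 page figure comes from the quoted text, while the 4096-byte page size and the thread count used below are assumptions chosen for illustration, and model_unthrottled_dirty_bytes() is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>

/*
 * Upper bound on memory dirtied before any per-task throttling,
 * assuming every thread independently dirties pages_per_task pages.
 */
static size_t model_unthrottled_dirty_bytes(size_t nr_threads,
					     size_t pages_per_task,
					     size_t page_size)
{
	return nr_threads * pages_per_task * page_size;
}
```

With an assumed 256 threads, 32 pages per task and 4096-byte pages, this gives 256 * 32 * 4096 = 33554432 bytes, or 32 MiB, dirtied before the first per-task throttling check.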



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2025-12-19 14:11 ` [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support Chuck Lever
                     ` (3 preceding siblings ...)
  2025-12-22 23:47   ` kernel test robot
@ 2026-01-07  8:00   ` Christoph Hellwig
  2026-01-07 14:42     ` Chuck Lever
  4 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2026-01-07  8:00 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Christoph Hellwig, Mike Snitzer, linux-nfs, Chuck Lever

On Fri, Dec 19, 2025 at 09:11:05AM -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> When memory pressure occurs during buffered writes, the traditional
> approach is for balance_dirty_pages() to put the writing thread to
> sleep until dirty pages are flushed. For NFSD, this means server
> threads block waiting for I/O, reducing overall server throughput.
> 
> Add support for asynchronous write throttling using the BDP_ASYNC
> flag to balance_dirty_pages_ratelimited_flags(). When enabled via:
> 
>   /sys/kernel/debug/nfsd/write_async_throttle

Let me reiterate that I really, really hate all these magic debugfs-enabled
features.  Either they are genuinely useful (I think this would
be such a thing) and they should be enabled unconditionally, or they
are tradeoffs and should have a proper tunable not hidden in debugfs.

> NFSD checks memory pressure before attempting buffered writes. If
> balance_dirty_pages_ratelimited_flags() returns -EAGAIN (indicating
> memory exhaustion), NFSD returns NFS4ERR_DELAY (or NFSERR_JUKEBOX for
> NFSv3) to the client instead of blocking.
> 
> This allows clients to back off and retry rather than having server
> threads tied up waiting for writeback. The setting defaults to 0
> (synchronous throttling) and can be combined with write_throttle for
> layered throttling strategies.
> 
> Note: NFSv2 does not support NFSERR_JUKEBOX, so async throttling is
> automatically disabled for NFSv2 requests regardless of the setting.

This all seems very useful to me.  But it really needs to show numbers
on how it helps.
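
The behaviour described in the quoted text (return a retry status on -EAGAIN, skip the check for NFSv2) can be sketched as a small userspace model. The function and constant names are hypothetical and this is not the patch's code. The value 10008 is the protocol number shared by NFS3ERR_JUKEBOX and NFS4ERR_DELAY.

```c
#include <errno.h>
#include <stdbool.h>

enum {
	MODEL_NFS_OK      = 0,
	MODEL_NFS_JUKEBOX = 10008,	/* NFS3ERR_JUKEBOX and NFS4ERR_DELAY */
};

/* Async throttling only applies when enabled and the client speaks v3+. */
static bool model_async_throttle_applies(bool enabled, unsigned int vers)
{
	return enabled && vers >= 3;
}

/*
 * dirty_check_result models the return value of the BDP_ASYNC check:
 * 0 when the write may proceed, -EAGAIN under memory pressure.
 */
static unsigned int model_write_status(bool enabled, unsigned int vers,
				       int dirty_check_result)
{
	if (model_async_throttle_applies(enabled, vers) &&
	    dirty_check_result == -EAGAIN)
		return MODEL_NFS_JUKEBOX;	/* ask the client to retry later */
	return MODEL_NFS_OK;			/* fall through to the normal write */
}
```

An NFSv2 request takes the normal, blocking path even with the knob enabled, because the model never consults the dirty-page check for it.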

> + * Contents:
> + *   %0: Synchronous throttling (default) - writes sleep in balance_dirty_pages()

Overly long line.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/2] NFSD: Add aggressive write throttling control
  2026-01-07  7:55   ` Christoph Hellwig
@ 2026-01-07 14:36     ` Chuck Lever
  2026-01-07 14:42       ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2026-01-07 14:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mike Snitzer, linux-nfs, Chuck Lever, linux-mm, linux-fsdevel

On 1/7/26 2:55 AM, Christoph Hellwig wrote:
> On Fri, Dec 19, 2025 at 09:11:04AM -0500, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> On NFS servers with fast network links but slow storage, clients can
>> generate WRITE requests faster than the server can flush payloads to
>> durable storage. This can push the server into memory exhaustion as
>> dirty pages accumulate across hundreds of concurrent NFSD threads.
>>
>> The existing dirty page throttling (balance_dirty_pages()) uses
>> per-task accounting with default ratelimits that allow each thread
>> to dirty ~32 pages before throttling occurs. With many NFSD threads,
>> this allows significant dirty page accumulation before any
>> throttling kicks in.
> 
> What makes NFSD so special here vs say a userspace process with a bunch
> of threads?  Also what is the actual problem we're trying to solve?

The problem, as I see it, is that the system is not providing enough
backpressure to slow down noisy clients, allowing them to overwhelm
the server's memory with UNSTABLE WRITE traffic.

This is the same issue, IMO, that Mike's direct I/O is attempting to
address. Our implementation of UNSTABLE WRITE is a denial-of-service
vector.


> I kinda hate having this stuff in NFSD when there's nothing specific
> about nfs serving here.
Don't worry too much about that, these patches are obviously not in any
kind of merge-able shape yet. We do need to understand the metabolism of
UNSTABLE WRITEs, in particular, to get a clear picture of what needs to
be controlled to make the server autonomously stable.


-- 
Chuck Lever


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/2] NFSD: Add aggressive write throttling control
  2026-01-07 14:36     ` Chuck Lever
@ 2026-01-07 14:42       ` Christoph Hellwig
  2026-01-07 14:49         ` Chuck Lever
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2026-01-07 14:42 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Christoph Hellwig, Mike Snitzer, linux-nfs, Chuck Lever, linux-mm,
	linux-fsdevel

On Wed, Jan 07, 2026 at 09:36:39AM -0500, Chuck Lever wrote:
> > What makes NFSD so special here vs say a userspace process with a bunch
> > of threads?  Also what is the actual problem we're trying to solve?
> 
> The problem, as I see it, is that the system is not providing enough
> backpressure to slow down noisy clients, allowing them to overwhelm
> the server's memory with UNSTABLE WRITE traffic.
> 
> This is the same issue, IMO, that Mike's direct I/O is attempting to
> address. Our implementation of UNSTABLE WRITE is a denial-of-service
> vector.

But how is this different from Samba or a userspace NFS server?



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2026-01-07  8:00   ` Christoph Hellwig
@ 2026-01-07 14:42     ` Chuck Lever
  2026-01-07 16:25       ` Christoph Hellwig
  2026-01-07 19:40       ` Mike Snitzer
  0 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever @ 2026-01-07 14:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Mike Snitzer, linux-nfs, Chuck Lever

On 1/7/26 3:00 AM, Christoph Hellwig wrote:
> On Fri, Dec 19, 2025 at 09:11:05AM -0500, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> When memory pressure occurs during buffered writes, the traditional
>> approach is for balance_dirty_pages() to put the writing thread to
>> sleep until dirty pages are flushed. For NFSD, this means server
>> threads block waiting for I/O, reducing overall server throughput.
>>
>> Add support for asynchronous write throttling using the BDP_ASYNC
>> flag to balance_dirty_pages_ratelimited_flags(). When enabled via:
>>
>>   /sys/kernel/debug/nfsd/write_async_throttle
> 
> Let me reiterate that I really, really hate all these magic debugfs-enabled
> features.  Either they are genuinely useful (I think this would
> be such a thing) and they should be enabled unconditionally, or they
> are tradeoffs and should have a proper tunable not hidden in debugfs.

The use of debugfs here is because we don't yet have a coherent design
in mind -- this new facility is entirely experimental, and we need a
way to enable and disable it to make good comparisons, without making
immutable changes to the actual NFSD administrative interface.

"The RFC sign out front should have told ya."

But I agree, in the long term I most prefer no new administrative
controls -- it should just work if at all possible.


>> NFSD checks memory pressure before attempting buffered writes. If
>> balance_dirty_pages_ratelimited_flags() returns -EAGAIN (indicating
>> memory exhaustion), NFSD returns NFS4ERR_DELAY (or NFSERR_JUKEBOX for
>> NFSv3) to the client instead of blocking.
>>
>> This allows clients to back off and retry rather than having server
>> threads tied up waiting for writeback. The setting defaults to 0
>> (synchronous throttling) and can be combined with write_throttle for
>> layered throttling strategies.
>>
>> Note: NFSv2 does not support NFSERR_JUKEBOX, so async throttling is
>> automatically disabled for NFSv2 requests regardless of the setting.
> 
> This all seems very useful to me.  But it really needs to show numbers
> on how it helps.

Well if I can get this into operational shape, perhaps J. Flynn would
be interested in trying it out for us.

I'm happy to run with this one and drop (or postpone) 1/2, if that is
your assessment.


>> + * Contents:
>> + *   %0: Synchronous throttling (default) - writes sleep in balance_dirty_pages()
> 
> Overly long line.
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/2] NFSD: Add aggressive write throttling control
  2026-01-07 14:42       ` Christoph Hellwig
@ 2026-01-07 14:49         ` Chuck Lever
  0 siblings, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2026-01-07 14:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mike Snitzer, linux-nfs, Chuck Lever, linux-mm, linux-fsdevel

On 1/7/26 9:42 AM, Christoph Hellwig wrote:
> On Wed, Jan 07, 2026 at 09:36:39AM -0500, Chuck Lever wrote:
>>> What makes NFSD so special here vs say a userspace process with a bunch
>>> of threads?  Also what is the actual problem we're trying to solve?
>>
>> The problem, as I see it, is that the system is not providing enough
>> backpressure to slow down noisy clients, allowing them to overwhelm
>> the server's memory with UNSTABLE WRITE traffic.
>>
>> This is the same issue, IMO, that Mike's direct I/O is attempting to
>> address. Our implementation of UNSTABLE WRITE is a denial-of-service
>> vector.
> 
> But how is this different from Samba or a userspace NFS server?
Well it might not be different. But at this point I don't think we know
enough about the problem to say one way or another. I'm just trying to
gather more experimental evidence about what is happening.


-- 
Chuck Lever


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2026-01-07 14:42     ` Chuck Lever
@ 2026-01-07 16:25       ` Christoph Hellwig
  2026-01-07 19:40       ` Mike Snitzer
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2026-01-07 16:25 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Christoph Hellwig, Mike Snitzer, linux-nfs, Chuck Lever

On Wed, Jan 07, 2026 at 09:42:58AM -0500, Chuck Lever wrote:
> I'm happy to run with this one and drop (or postpone) 1/2, if that is
> your assessment.

I don't really understand what exactly patch 1 is aiming for.  Not
stalling nfsd threads when congested makes total sense on the other
hand.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 2/2] NFSD: Add asynchronous write throttling support
  2026-01-07 14:42     ` Chuck Lever
  2026-01-07 16:25       ` Christoph Hellwig
@ 2026-01-07 19:40       ` Mike Snitzer
  1 sibling, 0 replies; 15+ messages in thread
From: Mike Snitzer @ 2026-01-07 19:40 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Christoph Hellwig, linux-nfs, Chuck Lever, jonathan.flynn

On Wed, Jan 07, 2026 at 09:42:58AM -0500, Chuck Lever wrote:
> On 1/7/26 3:00 AM, Christoph Hellwig wrote:
> > On Fri, Dec 19, 2025 at 09:11:05AM -0500, Chuck Lever wrote:
> >> From: Chuck Lever <chuck.lever@oracle.com>
> >>
> >> When memory pressure occurs during buffered writes, the traditional
> >> approach is for balance_dirty_pages() to put the writing thread to
> >> sleep until dirty pages are flushed. For NFSD, this means server
> >> threads block waiting for I/O, reducing overall server throughput.
> >>
> >> Add support for asynchronous write throttling using the BDP_ASYNC
> >> flag to balance_dirty_pages_ratelimited_flags(). When enabled via:
> >>
> >>   /sys/kernel/debug/nfsd/write_async_throttle
> > 
> > Let me reiterate that I really, really hate all these magic debugfs-enabled
> > features.  Either they are genuinely useful (I think this would
> > be such a thing) and they should be enabled unconditionally, or they
> > are tradeoffs and should have a proper tunable not hidden in debugfs.
> 
> The use of debugfs here is because we don't yet have a coherent design
> in mind -- this new facility is entirely experimental, and we need a
> way to enable and disable it to make good comparisons, without making
> immutable changes to the actual NFSD administrative interface.
> 
> "The RFC sign out front should have told ya."
> 
> But I agree, in the long term I most prefer no new administrative
> controls -- it should just work if at all possible.
> 
> 
> >> NFSD checks memory pressure before attempting buffered writes. If
> >> balance_dirty_pages_ratelimited_flags() returns -EAGAIN (indicating
> >> memory exhaustion), NFSD returns NFS4ERR_DELAY (or NFSERR_JUKEBOX for
> >> NFSv3) to the client instead of blocking.
> >>
> >> This allows clients to back off and retry rather than having server
> >> threads tied up waiting for writeback. The setting defaults to 0
> >> (synchronous throttling) and can be combined with write_throttle for
> >> layered throttling strategies.
> >>
> >> Note: NFSv2 does not support NFSERR_JUKEBOX, so async throttling is
> >> automatically disabled for NFSv2 requests regardless of the setting.
> > 
> > This all seems very useful to me.  But it really needs to show numbers
> > on how it helps.
> 
> Well if I can get this into operational shape, perhaps J. Flynn would
> be interested in trying it out for us.
> 
> I'm happy to run with this one and drop (or postpone) 1/2, if that is
> your assessment.

Probably a good start.  Definitely looks useful and worth measuring to
see if buffered IO improves.

I can include it in a test kernel for Jon Flynn once you're happy with
the patch and would like further testing (fyi I've rebased to latest
6.18-stable but Jon hasn't done baseline testing of it yet, so we
could kill 2 birds once ready).

Thanks,
Mike

ps. Jon, for further context see Chuck's original 2/2 patch:
https://lore.kernel.org/linux-nfs/20251219141105.1247093-3-cel@kernel.org/

And his cover letter:
https://lore.kernel.org/linux-nfs/20251219141105.1247093-1-cel@kernel.org/
Also patch 1/2, but consensus seems to be "focus on 2/2 first":
https://lore.kernel.org/linux-nfs/20251219141105.1247093-2-cel@kernel.org/

^ permalink raw reply	[flat|nested] 15+ messages in thread
