* [PATCH v3 0/6] fuse: {io-uring} Allow to reduce the number of queues and request distribution
@ 2025-10-13 17:09 Bernd Schubert
  2025-10-13 17:09 ` [PATCH v3 1/6] fuse: {io-uring} Add queue length counters Bernd Schubert
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Bernd Schubert @ 2025-10-13 17:09 UTC
  To: Miklos Szeredi
  Cc: Joanne Koong, linux-fsdevel, Luis Henriques, Gang He,
	Bernd Schubert

This series adds bitmaps that track which queues are registered and which
queues have no queued requests. These bitmaps are then used to map the
core that issues a request to a queue and to distribute load across
queues. NUMA affinity is taken into account. The fuse client/server
protocol does not need changes; everything is handled internally in the
fuse client.
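
As a rough illustration of the mapping described above, here is a minimal
userspace sketch, not the kernel code. A single 64-bit word stands in for
the registered-queue bitmap, and the fallback policy (take the next
registered queue, wrapping around) is an assumption made for the example.
The names `registered_mask`, `queue_register()` and `core_to_queue()` are
hypothetical; the series itself works on cpumasks.

```c
#include <stdint.h>

#define MAX_QUEUES 64

/* Hypothetical stand-in for the per-ring bitmap of registered queues. */
static uint64_t registered_mask;

/* Mark queue 'qid' as registered. */
static void queue_register(unsigned int qid)
{
	if (qid < MAX_QUEUES)
		registered_mask |= UINT64_C(1) << qid;
}

/*
 * Map a core to a queue id: use the core's own queue if it is
 * registered, otherwise the next registered queue (wrapping around).
 * Returns -1 if no queue is registered at all.
 */
static int core_to_queue(unsigned int core)
{
	unsigned int i;

	for (i = 0; i < MAX_QUEUES; i++) {
		unsigned int qid = (core + i) % MAX_QUEUES;

		if (registered_mask & (UINT64_C(1) << qid))
			return (int)qid;
	}
	return -1;
}
```

With only queues 2 and 5 registered, core 2 keeps its own queue, core 3
is mapped to queue 5, and core 6 wraps around to queue 2.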

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
---
Changes in v3:
- removed FUSE_URING_QUEUE_THRESHOLD (Luis)
- Fixed accidental early return of queue1 in fuse_uring_best_queue()
- Fixed a similar early return of 'best_global'
- Added sanity checks for cpu_to_node()
- Removed retry loops in fuse_uring_best_queue() for code simplicity
- Reduced local NUMA retries in fuse_uring_get_queue()
- Added 'FUSE_URING_REDUCED_Q' FUSE_INIT flag to inform userspace
  that the number of queues can be reduced
- Link to v2: https://lore.kernel.org/r/20251003-reduced-nr-ring-queues_3-v2-0-742ff1a8fc58@ddn.com
- Removed wake-on-same cpu patch from this series;
  it will be sent out independently
- Used READ_ONCE(queue->nr_reqs) as the value is updated (with a lock being
  held) by other threads and possibly other CPUs.
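
For readers less familiar with READ_ONCE(), the following is a userspace
analogy only, assuming C11 atomics as a stand-in: a relaxed atomic load
gives a single, untorn read of a counter that another thread updates. The
struct and function names are hypothetical, and the kernel writer side
relies on the queue lock rather than an atomic increment.

```c
#include <stdatomic.h>

struct queue {
	atomic_uint nr_reqs;	/* updated by the owner, read by anyone */
};

/* Writer side; in the kernel this update happens under the queue lock. */
static void queue_add_req(struct queue *q)
{
	atomic_fetch_add_explicit(&q->nr_reqs, 1, memory_order_relaxed);
}

/* Lockless reader: one untorn load, which may be slightly stale. */
static unsigned int queue_len(struct queue *q)
{
	return atomic_load_explicit(&q->nr_reqs, memory_order_relaxed);
}
```

A slightly stale value is acceptable here because the counter only serves
as a load-balancing hint.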

---
Bernd Schubert (6):
      fuse: {io-uring} Add queue length counters
      fuse: {io-uring} Rename ring->nr_queues to max_nr_queues
      fuse: {io-uring} Use bitmaps to track registered queues
      fuse: {io-uring} Distribute load among queues
      fuse: {io-uring} Allow reduced number of ring queues
      fuse: {io-uring} Queue background requests on a different core

 fs/fuse/dev_uring.c       | 260 ++++++++++++++++++++++++++++++++++++----------
 fs/fuse/dev_uring_i.h     |  14 ++-
 fs/fuse/inode.c           |   2 +-
 include/uapi/linux/fuse.h |   3 +
 4 files changed, 224 insertions(+), 55 deletions(-)
---
base-commit: ec714e371f22f716a04e6ecb2a24988c92b26911
change-id: 20250722-reduced-nr-ring-queues_3-6acb79dad978

Best regards,
-- 
Bernd Schubert <bschubert@ddn.com>


* Re: [PATCH v3 6/6] fuse: {io-uring} Queue background requests on a different core
@ 2025-10-20  7:15 ` Dan Carpenter
  0 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2025-10-20  7:09 UTC
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20251013-reduced-nr-ring-queues_3-v3-6-6d87c8aa31ae@ddn.com>
References: <20251013-reduced-nr-ring-queues_3-v3-6-6d87c8aa31ae@ddn.com>
TO: Bernd Schubert <bschubert@ddn.com>
TO: Miklos Szeredi <miklos@szeredi.hu>
CC: Joanne Koong <joannelkoong@gmail.com>
CC: linux-fsdevel@vger.kernel.org
CC: Luis Henriques <luis@igalia.com>
CC: Gang He <dchg2000@gmail.com>
CC: Bernd Schubert <bschubert@ddn.com>

Hi Bernd,

kernel test robot noticed the following build warnings:

[auto build test WARNING on ec714e371f22f716a04e6ecb2a24988c92b26911]

url:    https://github.com/intel-lab-lkp/linux/commits/Bernd-Schubert/fuse-io-uring-Add-queue-length-counters/20251014-024703
base:   ec714e371f22f716a04e6ecb2a24988c92b26911
patch link:    https://lore.kernel.org/r/20251013-reduced-nr-ring-queues_3-v3-6-6d87c8aa31ae%40ddn.com
patch subject: [PATCH v3 6/6] fuse: {io-uring} Queue background requests on a different core
:::::: branch date: 7 days ago
:::::: commit date: 7 days ago
config: loongarch-randconfig-r072-20251019 (https://download.01.org/0day-ci/archive/20251020/202510201259.MevZAfl5-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 15.1.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202510201259.MevZAfl5-lkp@intel.com/

smatch warnings:
fs/fuse/dev_uring.c:1389 fuse_uring_get_queue() error: uninitialized symbol 'best_numa'.

vim +/best_numa +1389 fs/fuse/dev_uring.c

aca09b212467554 Bernd Schubert 2025-10-13  1302  
aca09b212467554 Bernd Schubert 2025-10-13  1303  /*
aca09b212467554 Bernd Schubert 2025-10-13  1304   * Get the best queue for the current CPU
aca09b212467554 Bernd Schubert 2025-10-13  1305   */
2482ae85881b957 Bernd Schubert 2025-10-13  1306  static struct fuse_ring_queue *fuse_uring_get_queue(struct fuse_ring *ring,
2482ae85881b957 Bernd Schubert 2025-10-13  1307  						    bool background)
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1308  {
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1309  	unsigned int qid;
aca09b212467554 Bernd Schubert 2025-10-13  1310  	struct fuse_ring_queue *local_queue, *best_numa, *best_global;
aca09b212467554 Bernd Schubert 2025-10-13  1311  	int local_node;
aca09b212467554 Bernd Schubert 2025-10-13  1312  	const struct cpumask *numa_mask, *global_mask;
2482ae85881b957 Bernd Schubert 2025-10-13  1313  	int retries = 0;
2482ae85881b957 Bernd Schubert 2025-10-13  1314  	int weight = -1;
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1315  
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1316  	qid = task_cpu(current);
868e7728394dbc8 Bernd Schubert 2025-10-13  1317  	if (WARN_ONCE(qid >= ring->max_nr_queues,
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1318  		      "Core number (%u) exceeds nr queues (%zu)\n", qid,
868e7728394dbc8 Bernd Schubert 2025-10-13  1319  		      ring->max_nr_queues))
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1320  		qid = 0;
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1321  
aca09b212467554 Bernd Schubert 2025-10-13  1322  	local_node = cpu_to_node(qid);
aca09b212467554 Bernd Schubert 2025-10-13  1323  	if (WARN_ON_ONCE(local_node > ring->nr_numa_nodes))
aca09b212467554 Bernd Schubert 2025-10-13  1324  		local_node = 0;
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1325  
2482ae85881b957 Bernd Schubert 2025-10-13  1326  	local_queue = READ_ONCE(ring->queues[qid]);
2482ae85881b957 Bernd Schubert 2025-10-13  1327  
2482ae85881b957 Bernd Schubert 2025-10-13  1328  retry:
2482ae85881b957 Bernd Schubert 2025-10-13  1329  	/*
2482ae85881b957 Bernd Schubert 2025-10-13  1330  	 * For background requests, try next CPU in same NUMA domain.
2482ae85881b957 Bernd Schubert 2025-10-13  1331  	 * I.e. cpu-0 creates async requests, cpu-1 io processes.
2482ae85881b957 Bernd Schubert 2025-10-13  1332  	 * Similar for foreground requests, when the local queue does not
2482ae85881b957 Bernd Schubert 2025-10-13  1333  	 * exist - still better to always wake the same cpu id.
2482ae85881b957 Bernd Schubert 2025-10-13  1334  	 */
2482ae85881b957 Bernd Schubert 2025-10-13  1335  	if (background || !local_queue) {
2482ae85881b957 Bernd Schubert 2025-10-13  1336  		numa_mask = ring->numa_registered_q_mask[local_node];
2482ae85881b957 Bernd Schubert 2025-10-13  1337  
2482ae85881b957 Bernd Schubert 2025-10-13  1338  		if (weight == -1)
2482ae85881b957 Bernd Schubert 2025-10-13  1339  			weight = cpumask_weight(numa_mask);
2482ae85881b957 Bernd Schubert 2025-10-13  1340  
2482ae85881b957 Bernd Schubert 2025-10-13  1341  		if (weight == 0)
2482ae85881b957 Bernd Schubert 2025-10-13  1342  			goto global;
2482ae85881b957 Bernd Schubert 2025-10-13  1343  
2482ae85881b957 Bernd Schubert 2025-10-13  1344  		if (weight > 1) {
2482ae85881b957 Bernd Schubert 2025-10-13  1345  			int idx = (qid + 1) % weight;
2482ae85881b957 Bernd Schubert 2025-10-13  1346  
2482ae85881b957 Bernd Schubert 2025-10-13  1347  			qid = cpumask_nth(idx, numa_mask);
2482ae85881b957 Bernd Schubert 2025-10-13  1348  		} else {
2482ae85881b957 Bernd Schubert 2025-10-13  1349  			qid = cpumask_first(numa_mask);
2482ae85881b957 Bernd Schubert 2025-10-13  1350  		}
2482ae85881b957 Bernd Schubert 2025-10-13  1351  
2482ae85881b957 Bernd Schubert 2025-10-13  1352  		local_queue = READ_ONCE(ring->queues[qid]);
2482ae85881b957 Bernd Schubert 2025-10-13  1353  		if (WARN_ON_ONCE(!local_queue))
2482ae85881b957 Bernd Schubert 2025-10-13  1354  			return NULL;
2482ae85881b957 Bernd Schubert 2025-10-13  1355  	}
2482ae85881b957 Bernd Schubert 2025-10-13  1356  
2482ae85881b957 Bernd Schubert 2025-10-13  1357  	if (READ_ONCE(local_queue->nr_reqs) <= FURING_Q_NUMA_THRESHOLD)
aca09b212467554 Bernd Schubert 2025-10-13  1358  		return local_queue;
aca09b212467554 Bernd Schubert 2025-10-13  1359  
2482ae85881b957 Bernd Schubert 2025-10-13  1360  	if (retries < FURING_NEXT_QUEUE_RETRIES && weight > retries + 1) {
2482ae85881b957 Bernd Schubert 2025-10-13  1361  		retries++;
2482ae85881b957 Bernd Schubert 2025-10-13  1362  		local_queue = NULL;
2482ae85881b957 Bernd Schubert 2025-10-13  1363  		goto retry;
2482ae85881b957 Bernd Schubert 2025-10-13  1364  	}
2482ae85881b957 Bernd Schubert 2025-10-13  1365  
aca09b212467554 Bernd Schubert 2025-10-13  1366  	/* Find best NUMA-local queue */
aca09b212467554 Bernd Schubert 2025-10-13  1367  	numa_mask = ring->numa_registered_q_mask[local_node];
aca09b212467554 Bernd Schubert 2025-10-13  1368  	best_numa = fuse_uring_best_queue(numa_mask, ring);
aca09b212467554 Bernd Schubert 2025-10-13  1369  
aca09b212467554 Bernd Schubert 2025-10-13  1370  	/* If NUMA queue is under threshold, use it */
aca09b212467554 Bernd Schubert 2025-10-13  1371  	if (best_numa &&
aca09b212467554 Bernd Schubert 2025-10-13  1372  	    READ_ONCE(best_numa->nr_reqs) <= FURING_Q_NUMA_THRESHOLD)
aca09b212467554 Bernd Schubert 2025-10-13  1373  		return best_numa;
aca09b212467554 Bernd Schubert 2025-10-13  1374  
2482ae85881b957 Bernd Schubert 2025-10-13  1375  global:
aca09b212467554 Bernd Schubert 2025-10-13  1376  	/* NUMA queues above threshold, try global queues */
aca09b212467554 Bernd Schubert 2025-10-13  1377  	global_mask = ring->registered_q_mask;
aca09b212467554 Bernd Schubert 2025-10-13  1378  	best_global = fuse_uring_best_queue(global_mask, ring);
aca09b212467554 Bernd Schubert 2025-10-13  1379  
aca09b212467554 Bernd Schubert 2025-10-13  1380  	/* Might happen during tear down */
aca09b212467554 Bernd Schubert 2025-10-13  1381  	if (!best_global)
aca09b212467554 Bernd Schubert 2025-10-13  1382  		return NULL;
aca09b212467554 Bernd Schubert 2025-10-13  1383  
aca09b212467554 Bernd Schubert 2025-10-13  1384  	/* If global queue is under double threshold, use it */
aca09b212467554 Bernd Schubert 2025-10-13  1385  	if (READ_ONCE(best_global->nr_reqs) <= FURING_Q_GLOBAL_THRESHOLD)
aca09b212467554 Bernd Schubert 2025-10-13  1386  		return best_global;
aca09b212467554 Bernd Schubert 2025-10-13  1387  
aca09b212467554 Bernd Schubert 2025-10-13  1388  	/* There is no ideal queue, stay numa_local if possible */
aca09b212467554 Bernd Schubert 2025-10-13 @1389  	return best_numa ? best_numa : best_global;
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1390  }
c2c9af9a0b13261 Bernd Schubert 2025-01-20  1391  
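
Reading the listing above, the warning appears to come from the
`goto global` at line 1342: when `weight == 0`, control skips the
assignment at line 1368, yet line 1389 still reads `best_numa`. The
standalone sketch below reproduces that shape in simplified form and shows
one possible fix, initializing the pointer to NULL. It is not necessarily
the fix the author will choose. `struct queue`, `pick_queue()` and the
`2 * threshold` global limit are hypothetical stand-ins.

```c
#include <stddef.h>

struct queue {
	int nr_reqs;
};

/*
 * Simplified shape of the flagged control flow: 'numa_weight == 0'
 * jumps to the global label and skips the only assignment to
 * best_numa. The NULL initializer makes the final read well defined.
 */
static struct queue *pick_queue(int numa_weight, struct queue *numa_q,
				struct queue *global_q, int threshold)
{
	struct queue *best_numa = NULL;	/* the initializer is the fix */
	struct queue *best_global;

	if (numa_weight == 0)
		goto global;

	best_numa = numa_q;
	if (best_numa && best_numa->nr_reqs <= threshold)
		return best_numa;

global:
	best_global = global_q;
	if (!best_global)
		return NULL;

	/* Assumption: the global limit is twice the NUMA limit. */
	if (best_global->nr_reqs <= 2 * threshold)
		return best_global;

	/* No ideal queue: stay NUMA-local if possible. */
	return best_numa ? best_numa : best_global;
}
```

With the initializer in place, the `weight == 0` path falls through to
`best_global` on the last line instead of reading an indeterminate
pointer.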

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


end of thread, other threads:[~2025-10-24 17:58 UTC | newest]

Thread overview: 27+ messages
2025-10-13 17:09 [PATCH v3 0/6] fuse: {io-uring} Allow to reduce the number of queues and request distribution Bernd Schubert
2025-10-13 17:09 ` [PATCH v3 1/6] fuse: {io-uring} Add queue length counters Bernd Schubert
2025-10-15  9:19   ` Luis Henriques
2025-10-13 17:09 ` [PATCH v3 2/6] fuse: {io-uring} Rename ring->nr_queues to max_nr_queues Bernd Schubert
2025-10-13 17:09 ` [PATCH v3 3/6] fuse: {io-uring} Use bitmaps to track registered queues Bernd Schubert
2025-10-15 23:49   ` Joanne Koong
2025-10-16 11:33     ` Bernd Schubert
2025-10-13 17:10 ` [PATCH v3 4/6] fuse: {io-uring} Distribute load among queues Bernd Schubert
2025-10-18  0:12   ` Joanne Koong
2025-10-20 19:00     ` Bernd Schubert
2025-10-20 22:59       ` Joanne Koong
2025-10-20 23:28         ` Bernd Schubert
2025-10-24 17:05         ` Joanne Koong
2025-10-24 17:52           ` Bernd Schubert
2025-10-24 17:58             ` Bernd Schubert
2025-10-13 17:10 ` [PATCH v3 5/6] fuse: {io-uring} Allow reduced number of ring queues Bernd Schubert
2025-10-15  9:25   ` Luis Henriques
2025-10-15  9:31     ` Bernd Schubert
2025-10-13 17:10 ` [PATCH v3 6/6] fuse: {io-uring} Queue background requests on a different core Bernd Schubert
2025-10-15  9:50   ` Luis Henriques
2025-10-15 10:27     ` Bernd Schubert
2025-10-15 11:05       ` Luis Henriques
2025-10-14  8:43 ` [PATCH v3 0/6] fuse: {io-uring} Allow to reduce the number of queues and request distribution Gang He
2025-10-14  9:14   ` Bernd Schubert
2025-10-16  6:15     ` Gang He
  -- strict thread matches above, loose matches on Subject: below --
2025-10-20  7:09 [PATCH v3 6/6] fuse: {io-uring} Queue background requests on a different core kernel test robot
2025-10-20  7:15 ` Dan Carpenter
