From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 17 Mar 2026 22:15:29 +0800
From: Kuan-Wei Chiu
To: Zhihao Cheng
Cc: richard@nod.at, akpm@linux-foundation.org, jserv@ccns.ncku.edu.tw,
	eleanor15x@gmail.com, marscheng@google.com,
	linux-mtd@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] lib/list_sort: introduce list_sort_nonatomic() and
 remove dummy cmp() calls
Message-ID:
References: <20260315193900.218737-1-visitorckw@gmail.com>
 <7bb23ce1-a0f3-b576-ad79-c9fa746f11ed@huawei.com>
 <3fec3dbc-2835-e056-4394-d2dcaae3b80a@huawei.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <3fec3dbc-2835-e056-4394-d2dcaae3b80a@huawei.com>

Hi Zhihao,

On Tue, Mar 17, 2026 at 09:22:26PM +0800, Zhihao Cheng wrote:
> On 2026/3/17 20:32, Kuan-Wei Chiu wrote:
> > Hi Zhihao,
> > 
> > On Tue, Mar 17, 2026 at 12:05:54PM +0800, Zhihao Cheng wrote:
> > > On 2026/3/16 3:39, Kuan-Wei Chiu wrote:
> > > > Historically, list_sort() implemented a hack in merge_final():
> > > > 
> > > > 	if (unlikely(!++count))
> > > > 		cmp(priv, b, b);
> > > > 
> > > > This was designed specifically so that callers could periodically
> > > > invoke cond_resched() within their comparison functions when
> > > > merging highly unbalanced lists.
> > > > 
> > > > However, an audit of the kernel tree reveals that only fs/ubifs/
> > > > relies on this mechanism. For the vast majority of list_sort()
> > > > users (such as block layer IO schedulers and file systems), this
> > > > results in completely wasted function calls. In the worst-case
> > > > scenario (merging an already sorted list where 'a' is exhausted
> > > > quickly), this results in approximately (N/2)/256 unnecessary
> > > > cmp() calls.
> > > > 
> > > > To clean up this API while ensuring behavior compatibility:
> > > > 1. Introduce list_sort_nonatomic(), which explicitly calls
> > > >    cond_resched() internally when count overflows.
> > > > 2. Remove the dummy cmp(priv, b, b) fallback for standard
> > > >    list_sort(), saving unnecessary function calls and improving
> > > >    determinism.
> > > > 3. Convert the sole user (fs/ubifs/) to the new API.
> > > > 
> > > > Note that ubifs still maintains cond_resched() inside its own
> > > > comparison functions. This patch does not alter the frequency or
> > > > timing of those scheduling points, guaranteeing no regressions
> > > > for ubifs, while benefiting all other kernel users.
> > > > 
> > > > Signed-off-by: Kuan-Wei Chiu
> > > > ---
> > > >  fs/ubifs/gc.c             |   4 +-
> > > >  fs/ubifs/replay.c         |   2 +-
> > > >  include/linux/list_sort.h |   3 +
> > > >  lib/list_sort.c           | 166 +++++++++++++++++++++-----------------
> > > >  4 files changed, 100 insertions(+), 75 deletions(-)
> > > 
> > > lgtm for UBIFS.
> > > 
> > > Reviewed-by: Zhihao Cheng
> > 
> > Thanks for your review!
> > 
> > > one small nit below.
> > > 
> > > [...]
> > > > diff --git a/include/linux/list_sort.h b/include/linux/list_sort.h
> > > > index 453105f74e05..f7af29073d48 100644
> > > > --- a/include/linux/list_sort.h
> > > > +++ b/include/linux/list_sort.h
> > > > @@ -11,4 +11,7 @@ typedef int __attribute__((nonnull(2,3))) (*list_cmp_func_t)(void *,
> > > >  
> > > >  __attribute__((nonnull(2,3)))
> > > >  void list_sort(void *priv, struct list_head *head, list_cmp_func_t cmp);
> > > > +
> > > > +__attribute__((nonnull(2, 3)))
> > > > +void list_sort_nonatomic(void *priv, struct list_head *head, list_cmp_func_t cmp);
> > > >  #endif
> > > > diff --git a/lib/list_sort.c b/lib/list_sort.c
> > > > index a310ecb7ccc0..788bfc26cf7b 100644
> > > > --- a/lib/list_sort.c
> > > > +++ b/lib/list_sort.c
> > > > @@ -3,6 +3,7 @@
> > > >  #include
> > > >  #include
> > > >  #include
> > > > +#include
> > > >  
> > > >  /*
> > > >   * Returns a list organized in an intermediate format suited
> > > > @@ -47,7 +48,7 @@ static struct list_head *merge(void *priv, list_cmp_func_t cmp,
> > > >   */
> > > >  __attribute__((nonnull(2,3,4,5)))
> > > >  static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
> > > > -			struct list_head *a, struct list_head *b)
> > > > +			struct list_head *a, struct list_head *b, bool may_schedule)
> > > >  {
> > > >  	struct list_head *tail = head;
> > > >  	u8 count = 0;
> > > > @@ -79,12 +80,11 @@ static void merge_final(void *priv, list_cmp_func_t cmp, struct list_head *head,
> > > >  		/*
> > > >  		 * If the merge is highly unbalanced (e.g. the input is
> > > >  		 * already sorted), this loop may run many iterations.
> > > > -		 * Continue callbacks to the client even though no
> > > > -		 * element comparison is needed, so the client's cmp()
> > > > -		 * routine can invoke cond_resched() periodically.
> > > > +		 * If may_schedule is true, periodically invoke
> > > > +		 * cond_resched() to avoid soft lockups.
> > > >  		 */
> > > > -		if (unlikely(!++count))
> > > > -			cmp(priv, b, b);
> > > > +		if (may_schedule && unlikely(!++count))
> > > > +			cond_resched();
> > > 
> > > The cond_resched() already has a judgment on whether to schedule
> > > out, so the 'count' could be removed?
> > 
> > However, I think keeping the u8 count rate-limiter makes more sense
> > here due to the overhead difference.
> > 
> > Evaluating unlikely(!++count) is essentially a single ALU instruction
> > (a register increment) and a zero-flag check, which has virtually
> > zero cost. On the other hand, cond_resched() is a macro that does
> > much more than a simple flag check. Depending on the kernel config,
> > it often invokes __might_resched() (which reads current to check
> > task_struct state, irq flags, etc.) and makes a call to
> > __cond_resched(). Evaluating all of this heavy machinery on every
> > single iteration of such a tight loop would probably introduce
> > noticeable overhead.
> > 
> > Actually, your comment brings up another thought I wanted to discuss.
> > 
> > Since we are introducing list_sort_nonatomic(), I wonder if we should
> > eventually move the cond_resched() out of UBIFS's cmp() functions
> > entirely and handle it inside list_sort_nonatomic().
> > 
> > Right now, because the cmp() callback is inherently invoked at every
> > step of the merge process, UBIFS ends up evaluating the
> > cond_resched() macro every 3 or 4 pointer assignments during the
> > main sort. While UBIFS needs to prevent soft lockups on huge lists,
> > checking for resched at such micro-granularity still feels excessive
> > and likely leaves performance on the table.
> 
> In my humble opinion, I don't think frequent cond_resched() calls will
> have an observable performance impact, and there are many examples in
> kernel hotspot paths (e.g.
> blk_mq_prealloc_tag_set_tags/blk_rq_poll_completion/__blk_mq_alloc_rq_maps
> ...). For list_sort(), I prefer that the aim of the code cleanup be to
> make the code more readable.
> I am neutral on code cleanup for the current implementation of
> list_sort.

OK. Since this only affects UBIFS, if performance isn't a concern for
UBIFS, I'll drop the !++count check in v2.

Additionally, I plan to remove cond_resched() from UBIFS's cmp()
functions in v2 and move it directly into merge() and merge_final()
inside list_sort. I think this will make the code much more readable
compared to hiding the scheduling points inside the comparison
callbacks.

Regards,
Kuan-Wei