From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E030C282DC for ; Tue, 4 Mar 2025 19:56:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=YM8Ez/q06miNqJtCBk3k8n3HOLzBJgbDbaMVZIGyvCs=; b=E+cnAPGILlEKLXb7emd8E+kWf8 JagwU/jEEkwLuRRDTZvA5SORMADBW7HAUBsh+Hv9RvfYdh688MN1HQ4lw7t5+CAY6TZMal8VOjRs+ nOY3wi6iBtf6zBoKkr2HeNbuuS61J6nMiTec9IhU3lKxaOfn0F5oipiXNLzw4ysdZs2lT013pcCq9 hZaD9iHH9Rv/TmLU28KLVfjNKvrbjUW7QMZZua+PK+/KOzpt1TmLU6M0sU45O9oHNS72MtVXGec+5 2S/eilu5Gw5lt+XCKwlIniuGTPYjiKEJ5TH11Y/7YGmnW95p02gp8uluq5LqNWmJC07xfmvtqWp54 9Gg/Vj3w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1tpYNB-000000067ct-3yXe; Tue, 04 Mar 2025 19:56:13 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1tpXkH-00000005z6U-0R9X for linux-arm-kernel@lists.infradead.org; Tue, 04 Mar 2025 19:16:02 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 9C86D5C6368; Tue, 4 Mar 2025 19:13:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2C4CBC4CEE5; Tue, 4 Mar 2025 19:15:57 +0000 (UTC) Date: Tue, 4 Mar 2025 19:15:54 +0000 From: Catalin Marinas To: Ankur Arora Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, arnd@arndb.de, will@kernel.org, peterz@infradead.org, mark.rutland@arm.com, harisokn@amazon.com, cl@gentwo.org, memxor@gmail.com, zhenglifeng1@huawei.com, joao.m.martins@oracle.com, boris.ostrovsky@oracle.com, konrad.wilk@oracle.com Subject: Re: [PATCH 1/4] asm-generic: barrier: Add smp_cond_load_relaxed_timewait() Message-ID: References: <20250203214911.898276-1-ankur.a.arora@oracle.com> <20250203214911.898276-2-ankur.a.arora@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250203214911.898276-2-ankur.a.arora@oracle.com> X-TUID: lLrdoPZx9Q90 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250304_111601_245455_E8039CA4 X-CRM114-Status: GOOD ( 29.37 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Mon, Feb 03, 2025 at 01:49:08PM -0800, Ankur Arora wrote: > Add smp_cond_load_relaxed_timewait(), a timed variant of > smp_cond_load_relaxed(). This is useful for cases where we want to > wait on a conditional variable but don't want to wait indefinitely. Bikeshedding: why not "timeout" rather than "timewait"? > diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h > index d4f581c1e21d..31de8ed2a05e 100644 > --- a/include/asm-generic/barrier.h > +++ b/include/asm-generic/barrier.h > @@ -273,6 +273,54 @@ do { \ > }) > #endif > > +#ifndef smp_cond_time_check_count > +/* > + * Limit how often smp_cond_load_relaxed_timewait() evaluates time_expr_ns. > + * This helps reduce the number of instructions executed while spin-waiting. > + */ > +#define smp_cond_time_check_count 200 > +#endif While this was indeed added to the poll_idle() loop, it feels completely random in a generic implementation. It's highly dependent on the time_expr_ns passed. Can the caller not move the loop in time_expr_ns before invoking this macro? > + > +/** > + * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering > + * guarantees until a timeout expires. > + * @ptr: pointer to the variable to wait on > + * @cond: boolean expression to wait for > + * @time_expr_ns: evaluates to the current time > + * @time_limit_ns: compared against time_expr_ns > + * > + * Equivalent to using READ_ONCE() on the condition variable. > + * > + * Note that the time check in time_expr_ns can be synchronous or > + * asynchronous. > + * In the generic version the check is synchronous but kept coarse > + * to minimize instructions executed while spin-waiting. > + */ Not sure exactly what synchronous vs asynchronous here mean. I see the latter more like an interrupt. I guess what you have in mind is the WFE wakeup events on arm64, though they don't interrupt the instruction flow. I'd not bother specifying this at all. > +#ifndef __smp_cond_load_relaxed_spinwait > +#define __smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, \ > + time_limit_ns) ({ \ > + typeof(ptr) __PTR = (ptr); \ > + __unqual_scalar_typeof(*ptr) VAL; \ > + unsigned int __count = 0; \ > + for (;;) { \ > + VAL = READ_ONCE(*__PTR); \ > + if (cond_expr) \ > + break; \ > + cpu_relax(); \ > + if (__count++ < smp_cond_time_check_count) \ > + continue; \ > + if ((time_expr_ns) >= (time_limit_ns)) \ > + break; \ > + __count = 0; \ > + } \ > + (typeof(*ptr))VAL; \ > +}) > +#endif > + > +#ifndef smp_cond_load_relaxed_timewait > +#define smp_cond_load_relaxed_timewait __smp_cond_load_relaxed_spinwait > +#endif What I don't particularly like about this interface is (1) no idea of what time granularity it offers, how much it can slip past the deadline, even though there's some nanoseconds implied and (2) time_expr_ns leaves the caller to figure out why time function to use for tracking the time. Well, I can be ok with (2) if we make it a bit more generic. The way it is written, I guess the type of the time expression and limit no longer matters as long as you can compare them. The naming implies nanoseconds but we don't have such precision, especially with the WFE implementation for arm64. We could add a slack range argument like the delta_ns for some of the hrtimer API and let the arch code decide whether to honour it. What about we drop the time_limit_ns and build it into the time_expr_ns as a 'time_cond' argument? The latter would return the result of some comparison and the loop bails out if true. An additional argument would be the minimum granularity for checking the time_cond and the arch code may decide to fall back to busy loops if the granularity is larger than what the caller required. -- Catalin