From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 7 Jan 2025 20:53:47 +0100
From: Peter Zijlstra
To: Tejun Heo
Cc: Changwoo Min, void@manifault.com, arighi@nvidia.com, mingo@redhat.com, changwoo@igalia.com, kernel-dev@igalia.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 0/6] sched_ext: Support high-performance monotonically non-decreasing clock
Message-ID: <20250107195347.GD28303@noisy.programming.kicks-ass.net>
References: <20241230095625.114363-1-changwoo@igalia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 07, 2025 at 09:46:48AM -1000, Tejun Heo wrote:
> Hello,
>
> On Fri, Jan 03, 2025 at 12:16:41PM -1000, Tejun Heo wrote:
> > On Mon, Dec 30, 2024 at 06:56:19PM +0900, Changwoo Min wrote:
> > > Many BPF schedulers (such as scx_central, scx_lavd, scx_rusty, scx_bpfland,
> > > and scx_flash) frequently call bpf_ktime_get_ns() for tracking tasks' runtime
> > > properties. If supported, bpf_ktime_get_ns() eventually reads a hardware
> > > timestamp counter (TSC). However, reading a hardware TSC is not
> > > performant in some hardware platforms, degrading IPC.
> > >
> > > This patchset addresses the performance problem of reading hardware TSC
> > > by leveraging the rq clock in the scheduler core, introducing a
> > > scx_bpf_now() function for BPF schedulers. Whenever the rq clock
> > > is fresh and valid, scx_bpf_now() provides the rq clock, which is
> > > already updated by the scheduler core (update_rq_clock), so it can reduce
> > > reading the hardware TSC.
> > >
> > > When the rq lock is released (rq_unpin_lock), the rq clock is invalidated,
> > > so a subsequent scx_bpf_now() call gets the fresh sched_clock for the caller.
> > >
> > > In addition, scx_bpf_now() guarantees the clock is monotonically
> > > non-decreasing for the same CPU, so the clock cannot go backward
> > > in the same CPU.
> > >
> > > Using scx_bpf_now() reduces the number of reading hardware TSC
> > > by 50-80% (76% for scx_lavd, 82% for scx_bpfland, and 51% for scx_rusty)
> > > for the following benchmark:
> >
> > The patch series generally look good to me. Peter, if things look okay to
> > you, I'll apply the series to sched_ext/for-6.14.
>
> Applying to sched_ext/for-6.14. Please holler if there are concerns.

Urgh, I'll try and have a look, but have a hate for everybody who's been
working through x-mas :/