Date: Tue, 30 Jan 2024 12:27:43 +0100
From: Andrea Righi
To: Uladzislau Rezki
Cc: "Paul E. McKenney", Joel Fernandes, Joel Fernandes, rcu@vger.kernel.org, "Cc: Frederic Weisbecker"
Subject: Re: Observation on NOHZ_FULL
References: <6a2b6857-57e7-4321-adae-132ba69a4fff@paulmck-laptop> <0e15e91e-da47-45dd-b7de-7f89b7b6002b@joelfernandes.org>
X-Mailing-List: rcu@vger.kernel.org

On Tue, Jan 30, 2024 at 12:06:49PM +0100, Uladzislau Rezki wrote:
> On Tue, Jan 30, 2024 at 02:17:22AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 30, 2024 at 07:58:18AM +0100, Andrea Righi wrote:
> > > Hi Joel and Paul,
> > >
> > > comments below.
> > >
> > > On Mon, Jan 29, 2024 at 05:16:38PM -0500, Joel Fernandes wrote:
> > > > Hi Paul,
> > > >
> > > > On 1/29/2024 3:41 PM, Paul E. McKenney wrote:
> > > > > On Mon, Jan 29, 2024 at 05:47:39PM +0000, Joel Fernandes wrote:
> > > > >> Hi Guys,
> > > > >> Something caught my eye in [1] which a colleague pointed me to
> > > > >> - CONFIG_HZ=1000 : 14866.05 bogo ops/s
> > > > >> - CONFIG_HZ=1000+nohz_full : 18505.52 bogo ops/s
> > > > >>
> > > > >> The test in concern is:
> > > > >> stress-ng --matrix $(getconf _NPROCESSORS_ONLN) --timeout 5m --metrics-brief
> > > > >>
> > > > >> which is a CPU intensive test.
> > > > >>
> > > > >> Any thoughts on what else can attribute a 30% performance increase
> > > > >> versus non-nohz_full ? (Confession: No idea if the baseline is
> > > > >> nohz_idle or no nohz at all). If it is 30%, I may want to evaluate
> > > > >> nohz_full on some of our limited-CPU devices :)
> > > > >
> > > > > The usual questions. ;-)
> > > > >
> > > > > Is this repeatable? Is it under the same conditions of temperature,
> > > > > load, and so on? Was it running on bare metal or on a guest OS? If on a
> > > > > guest OS, what was the load from other guest OSes on the same hypervisor
> > > > > or on the hypervisor itself?
> > >
> > > That was the result of a quick test, so I expect it has some fuzziness
> > > in there.
> > >
> > > It's an average of 10 runs, it was bare metal (my laptop, 8 cores 11th
> > > Gen Intel(R) Core(TM) i7-1195G7 @ 2.90GHz), *but* I wanted to run the
> > > test with the default Ubuntu settings, that means having "power mode:
> > > balanced" enabled. I don't know exactly what it's doing (I'll check how
> > > it works in detail), I think it's using intel p-states IIRC.
> > >
> > > Also, the system was not completely isolated (my email client was
> > > running) but the system was mostly idle in general.
> > >
> > > I was already planning to repeat the tests in a more "isolated"
> > > environment and add details to the bug tracker.
> > >
> > > > >
> > > > > The bug report had "CONFIG_HZ=250 : 17415.60 bogo ops/s", which makes
> > > > > me wonder if someone enabled some heavy debug that is greatly
> > > > > increasing the overhead of the scheduling-clock interrupt.
> > > > >
> > > > > Now, if that was the case, I would expect the 250HZ number to have
> > > > > three-quarters of the improvement of the nohz_full number compared
> > > > > to the 1000HZ number:
> > > > > 17415.60-14866.05=2549.55
> > > > > 18505.52-14866.05=3639.47
> > > > >
> > > > > 2549.55/3639.47=0.70
> > > >
> > > > I wonder if the difference here could possibly also be because of CPU idle
> > > > governor. It may behave differently at different clock rates so perhaps has
> > > > different overhead.
> > >
> > > Could be, but, again, the balanced power mode could play a major role
> > > here.
> > >
> > > >
> > > > I have added trying nohz full to my list as well to evaluate. FWIW, when we
> > > > moved from 250HZ to 1000HZ, it actually improved power because the CPUidle
> > > > governor could put the CPUs in deeper idle states more quickly!
> > >
> > > Interesting, another benefit to add to my proposal. :)
> > >
> > > > >
> > > > > OK, 0.70 is not *that* far off of 0.75. So what debugging does that
> > > > > test have enabled? Also, if you use tracing (or whatever) to measure
> > > > > the typical duration of the scheduling-clock interrupt and related things
> > > > > like softirq handlers, does it fit with these numbers? Such a measurement
> > > > > would look at how long it took to get back into userspace.
> >
> > Just to emphasize...
> >
> > The above calculations show that your measurements are close to what you
> > would expect if scheduling-clock interrupts took longer than one would
> > expect. Here "scheduling-clock interrupts" includes softirq processing
> > (timers, networking, RCU, ...) that piggybacks on each such interrupt.
> >
> > Although softirq makes the most sense given the amount of time that must
> > be consumed, for the most part softirq work is conserved, which suggests
> > that you should also look at the rest of the system to check whether the
> > reported speedup is instead due to this work simply being moved to some
> > other CPU.
> >
> > But maybe the fat softirqs are due to some debugging option that Ubuntu
> > enabled. In which case checking up on the actual duration (perhaps
> > using some form of tracing) would provide useful information. ;-)
> >
> As a first step I would have a look at perf figures what is going on
> during a test run. For such purpose the "perf" tool can be used. As a
> basic step it can be run in a "top" mode:
>
>     perf top -a -g -e cycles:k
>
> Sorry for the noise :)

Yep, I'm planning to do better tests and collect more info (perf,
bpftrace). Also making sure that we don't have some crazy debugging
config enabled in the Ubuntu kernel, as correctly pointed out by Paul.

But first of all I need to repeat the tests in a more isolated
environment, just to make sure we're looking at reasonable numbers here.

Thanks,
-Andrea
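
For reference, the arithmetic quoted above (0.70 observed against 0.75
expected) can be reproduced with a minimal sketch like the one below. It
uses only the three bogo ops/s figures from the thread. The function name
tick_overhead_check and the model behind it are assumptions made for
illustration, not something stated by anyone in the thread: throughput
loss is taken to be proportional to the tick rate, and nohz_full is
taken to remove essentially all ticks. Under those assumptions it also
gives a rough implied cost per scheduling-clock tick, which is the
quantity tracing would need to confirm or refute.

```python
# Figures quoted in the thread (bogo ops/s from stress-ng --matrix).
HZ1000 = 14866.05
HZ250 = 17415.60
NOHZ_FULL = 18505.52


def tick_overhead_check(hz_high, ops_high, hz_low, ops_low, ops_tickless):
    """Compare the observed gain with what a pure per-tick cost predicts.

    Assumption (for illustration only): the throughput loss scales
    linearly with the tick rate, and the tickless run has no tick cost.
    """
    # Share of the nohz_full improvement already obtained at the lower HZ.
    observed = (ops_low - ops_high) / (ops_tickless - ops_high)
    # Share expected if only the number of ticks mattered (250/1000 -> 0.75).
    expected = 1.0 - hz_low / hz_high
    # Fraction of CPU time lost to ticks at hz_high, relative to tickless.
    lost_fraction = 1.0 - ops_high / ops_tickless
    # Implied cost of a single tick (including piggybacked work), in usecs.
    per_tick_us = lost_fraction / hz_high * 1e6
    return observed, expected, lost_fraction, per_tick_us


if __name__ == "__main__":
    obs, exp, lost, cost = tick_overhead_check(
        1000, HZ1000, 250, HZ250, NOHZ_FULL)
    print(f"observed share of improvement: {obs:.2f}")
    print(f"expected share of improvement: {exp:.2f}")
    print(f"CPU time lost at 1000 HZ:      {lost:.1%}")
    print(f"implied cost per tick:         {cost:.0f} us")
```

If the implied per-tick cost comes out far larger than a typical
scheduling-clock interrupt, that would be consistent with the suspicion
of heavy debugging or fat softirqs, but it could equally reflect
frequency scaling or work being moved to another CPU, so it is only a
sanity check and not a measurement.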