From: Thomas Gleixner
To: Shrikanth Hegde, LKML
Cc: x86@kernel.org, Michael Kelley, Dmitry Ilvokhin, Radu Rendec,
    Jan Kiszka, Kieran Bingham, Florian Fainelli, Marc Zyngier
Subject: Re: [patch V6 00/16] Improve /proc/interrupts further
In-Reply-To: <0ef61565-dc6c-4281-ad85-ddfff87078a7@linux.ibm.com>
References: <20260517194421.705253664@kernel.org> <87wlwyw188.ffs@tglx>
    <0ef61565-dc6c-4281-ad85-ddfff87078a7@linux.ibm.com>
Date: Thu, 21 May 2026 09:53:00 +0200
Message-ID: <87jysxw65f.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

Shrikanth!

On Thu, May 21 2026 at 10:04, Shrikanth Hegde wrote:
> On 5/20/26 8:57 PM, Thomas Gleixner wrote:
>> Can you redirect it to /dev/null instead to take the file operations out
>> of the picture?
>
> Yes. Did "perf stat -r 1000 cat /proc/interrupts > /dev/null".
> It shows better improvement with the series compared to file write.

Unsurprisingly :)

>>> 0.000490211 +- 0.000000992 seconds time elapsed ( +- 0.20% ) <<< 3-4% improvements.
>>
>> Again IPC drops ....
>
> Yes. IPC dropping is consistent. I see the same trend in (PATCH 1/16) in the series.
> Copying that snippet below.
>
> Before:
>      8,932,242 instructions # 1.66 insn per cycle ( +- 0.34% )
> After:
>      7,020,982 instructions # 1.30 insn per cycle ( +- 0.52% )
>
> So it might be common pattern across archs. Maybe perf stat subsystem is slow
> enough it doesn't shows the aboslute benefit.

The problem is that the overhead of starting and tearing down 'cat' is
accounted as well. That's constant, obviously.

But for use cases like irqbalance or similar things, there is no
startup/teardown cost involved. The process is up and running and they
care about the actual read performance.

It's clearly observable by comparing the perf data with the read loop
timing data:

           Baseline      v6
    Perf   3072.21 us    1564.40 us
    Loop   1310.36 us     209.90 us

It doesn't add up completely, but the trend is there.

And you can trick perf into revealing the startup/teardown overhead by
comparing:

    perf stat -r 1000 head -q -c -0 /proc/interrupts >/dev/null
    perf stat -r 1000 head -q -c 0 /proc/interrupts >/dev/null

> In addition, I ran "perf stat -a -r 1000 cat /proc/interrupts > /dev/null"
> It is now 10x slower. IPC is same with series And improvement vanishes.
> So heavier the infra testing it, gains are getting minimal i guess.

As often :)

> But i don't see any regression.
>
> As you said in the cover-letter, the micro loops you ran maybe the best way to evaluate it.
> If you have the code in shareable form, I can give it a try.

See below. I thought I would get around to using perf directly in the
test program some day, but that never happened due to -ENOTIME.

Thanks,

        tglx
---
#include <fcntl.h>
#include <math.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static char buf[1024*1024];

#define NSECS_PER_SEC	(1000L * 1000L * 1000L)
#define LOOPS		1000

static float td[LOOPS];

int main(int argc, char *argv[])
{
	int fd = open("/proc/interrupts", O_RDONLY);
	long tsum = 0, rs = 0;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Warm-up: read the whole file LOOPS times without timing */
	for (int i = 0; i < LOOPS; i++) {
		long r;

		do {
			r = read(fd, buf, sizeof(buf));
		} while (r > 0);
		lseek(fd, 0, SEEK_SET);
	}

	/* Timed runs: one full read of the file per iteration */
	for (int i = 0; i < LOOPS; i++) {
		struct timespec t0, t1;
		unsigned long delta;
		long r;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		do {
			r = read(fd, buf, sizeof(buf));
			if (r > 0)
				rs += r;
		} while (r > 0);
		clock_gettime(CLOCK_MONOTONIC, &t1);
		delta = t1.tv_nsec + t1.tv_sec * NSECS_PER_SEC;
		delta -= t0.tv_nsec + t0.tv_sec * NSECS_PER_SEC;
		tsum += delta;
		td[i] = delta * 1.0;
		lseek(fd, 0, SEEK_SET);
	}

	/* Standard deviation of the per-iteration times */
	float mean = tsum / LOOPS;
	float calc = 0;

	for (int i = 0; i < LOOPS; i++) {
		float tmp = td[i] - mean;

		calc += tmp * tmp;
	}
	calc /= LOOPS;
	float std = sqrt(calc * 1.0);

	/* Mean time per run in ns, mean bytes read, relative stddev in % */
	printf("%ld %ld %5.3f\n", tsum / LOOPS, rs / LOOPS, (std / mean) * 100.0);
	return 0;
}
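
As a rough cross-check of the Perf/Loop table above, the following sketch subtracts the read loop time from the perf wall time and computes the speedup on each row. The helper names (fixed_overhead_us, speedup, print_breakdown) are hypothetical and not part of the thread. Treating "Perf minus Loop" as startup/teardown plus perf accounting cost is an assumption made here for illustration, and the two rows come from separate measurements, so the residual is only indicative.

```c
#include <stdio.h>

/*
 * Illustrative helpers only; the names are hypothetical and do not
 * come from the thread. All times are in microseconds.
 */

/* Time per run that is not spent in the read loop itself (assumed to be
 * process startup/teardown plus perf accounting). */
double fixed_overhead_us(double perf_us, double loop_us)
{
	return perf_us - loop_us;
}

/* Speedup factor of the patched kernel relative to the baseline. */
double speedup(double base_us, double new_us)
{
	return base_us / new_us;
}

/* Print the breakdown for the numbers quoted in the mail. */
void print_breakdown(void)
{
	const double perf_base = 3072.21, perf_v6 = 1564.40;
	const double loop_base = 1310.36, loop_v6 = 209.90;

	printf("overhead: base %.2f us, v6 %.2f us\n",
	       fixed_overhead_us(perf_base, loop_base),
	       fixed_overhead_us(perf_v6, loop_v6));
	printf("speedup:  perf %.2fx, loop %.2fx\n",
	       speedup(perf_base, perf_v6),
	       speedup(loop_base, loop_v6));
}
```

With the quoted numbers the residual is 1761.85 us for the baseline and 1354.50 us for v6, so the "constant" part differs by roughly 400 us between the two rows, which matches the remark that the data does not add up completely. The speedup is about 1.96x as seen through perf stat and about 6.24x in the read loop, which shows how much of the gain the startup/teardown cost hides.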