From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3B381A6806
	for <linux-kernel@vger.kernel.org>; Thu, 21 May 2026 07:53:03 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779349985; cv=none; b=YZgAa98vrsaij84tgx4WrosNXthvmL39V+y/w9W1UwA0YzgwPStKj8+BOKVMW4/TD3yYHwA/bT7quIVQJO7vt+35+evMv/dHPlmXnWqVKz5sYy7A+494s1E7YO7DdT3aDSRF6FqkYS97s+RlhPLJ7Hd/PvSCCQKXVAppXFyR6As=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779349985; c=relaxed/simple;
	bh=PfJEs2u64VsGXvQUThZcpQCrOfbKP50a2FfwEEhtnbo=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID:
	 MIME-Version:Content-Type; b=kjlqi4ei6lJQXPA6cuDPujn27oGe5ug6WfJrz8E97hNgCstxwHRimuHo3e2LTuZabS1PKqH61E1a3vwf9r6DJQGKgdVz89HxguHTyIpncpKrJzWEv4aKfHAxw7O6lkwKZ6AXBm2i+5EY4Rt/2qw/pREpKmydqvdbUENnCWAlieA=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gCcE37Mt; arc=none smtp.client-ip=100.103.45.18
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gCcE37Mt"
Received: by smtp.kernel.org (Postfix) with UTF8SMTPSA id E5D5B1F00A3B;
	Thu, 21 May 2026 07:53:02 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org;
	s=k20260515; t=1779349983;
	bh=ZGvLYvQVB01z4RFeMTSVwN3GNzSz5Z4ZvpgBlnf7LWo=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date;
	b=gCcE37MtaIzx0suIZeilWy6ZHJaJNepb6hBlIv7nFFdligri1nxE8XwfMmuT25g4u
	 hrW5lRlgCjq+Gm9GeHVIFO5/+rRQWxJoewgmVeCmiZKmaf4tPD6m9UBXfR4KeYqMoc
	 zg7uRBkv1Bg3fJlh+RCeNB8eYfSDB1WfmE9nGs5UOK2ThOitZTQfMzPVHksl00WjP3
	 OU6yaBmPgTMtklmMVnqrk1gFIXLUub5MkxLg7gxyYy/1dmeSvBTbfcg7gCd9ckB+VP
	 fmrLhECgAImBFgwtXj3Rs+VDhY+88pfEOH6CD1Cw+n4JTjD2MMpNsycKf5/lx9F0pX
	 NeZzKvUdEoO8A==
From: Thomas Gleixner <tglx@kernel.org>
To: Shrikanth Hegde <sshegde@linux.ibm.com>, LKML
 <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Michael Kelley <mhklinux@outlook.com>, Dmitry Ilvokhin
 <d@ilvokhin.com>, Radu Rendec <radu@rendec.net>, Jan Kiszka
 <jan.kiszka@siemens.com>, Kieran Bingham <kbingham@kernel.org>, Florian
 Fainelli <florian.fainelli@broadcom.com>, Marc Zyngier <maz@kernel.org>
Subject: Re: [patch V6 00/16] Improve /proc/interrupts further
In-Reply-To: <0ef61565-dc6c-4281-ad85-ddfff87078a7@linux.ibm.com>
References: <20260517194421.705253664@kernel.org>
 <e123b8b9-a179-48fa-9e48-2bfcbcc24b86@linux.ibm.com> <87wlwyw188.ffs@tglx>
 <0ef61565-dc6c-4281-ad85-ddfff87078a7@linux.ibm.com>
Date: Thu, 21 May 2026 09:53:00 +0200
Message-ID: <87jysxw65f.ffs@tglx>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain

Shrikanth!

On Thu, May 21 2026 at 10:04, Shrikanth Hegde wrote:
> On 5/20/26 8:57 PM, Thomas Gleixner wrote:
>> Can you redirect it to /dev/null instead to take the file operations out
>> of the picture?
>
> Yes. Did "perf stat -r 1000 cat /proc/interrupts > /dev/null".
> It shows better improvement with the series compared to file write.

Unsurprisingly :)

>>>          0.000490211 +- 0.000000992 seconds time elapsed  ( +-  0.20% )   <<< 3-4% improvements.
>> 
>> Again IPC drops ....
>
> Yes. IPC dropping is consistent. I see the same trend in (PATCH 1/16) in the series.
> Copying that snippet below.
>
> Before:
>   8,932,242      instructions      #    1.66  insn per cycle  ( +-  0.34% )
> After:
>   7,020,982      instructions      #    1.30  insn per cycle  ( +-  0.52% )
>
> So it might be a common pattern across archs. Maybe the perf stat subsystem is slow
> enough that it doesn't show the absolute benefit.

The problem is that the overhead of starting and tearing down 'cat' is
accounted as well. That's constant, obviously.

But for use cases like irqbalance or similar tools, there is no
startup/teardown cost involved. The process is up and running and they
care about the actual read performance.

That's clearly visible when comparing the perf data with the read loop
timing data:

        Baseline        v6
Perf    3072.21 us      1564.40 us
Loop    1310.36 us       209.90 us

It doesn't add up completely, but the trend is there. And you can trick
perf into revealing the startup/teardown overhead by comparing:

  perf stat -r 1000 head -q -c -0 /proc/interrupts >/dev/null
  perf stat -r 1000 head -q -c 0 /proc/interrupts >/dev/null

> In addition, I ran "perf stat -a -r 1000 cat /proc/interrupts > /dev/null"
> It is now 10x slower. IPC is same with series And improvement vanishes.
> So heavier the infra testing it, gains are getting minimal i guess.

As often :)

> But i don't see any regression.
>
> As you said in the cover-letter, the micro loops you ran maybe the best way to evaluate it.
> If you have the code in shareable form, I can give it a try.

See below. I thought I would get around some day to actually using perf
directly in the test program, but that never happened due to
-ENOTIME.

Thanks,

        tglx
---
#include <fcntl.h>
#include <math.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static char buf[1024*1024];

#define NSECS_PER_SEC	(1000L * 1000L * 1000L)

#define LOOPS	1000

static float td[LOOPS];

int main(int argc, char *argv[])
{
	int fd = open("/proc/interrupts", O_RDONLY);
	long tsum = 0, rs = 0;

	if (fd < 0) {
		perror("open");
		return 1;
	}

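	/* Warm-up pass: read the file LOOPS times without timing it */
	for (int i = 0; i < LOOPS; i++) {
		long r;

		/* Stop on EOF (0) and on error (-1) */
		do {
			r = read(fd, buf, sizeof(buf));
		} while (r > 0);
		lseek(fd, 0, SEEK_SET);
	}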

	/* Timed pass: measure each full read of the file in nanoseconds */
	for (int i = 0; i < LOOPS; i++) {
		struct timespec t0, t1;
		unsigned long delta;
		long r;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		/* Stop on EOF (0) and on error (-1); only count real data */
		do {
			r = read(fd, buf, sizeof(buf));
			if (r > 0)
				rs += r;
		} while (r > 0);
		clock_gettime(CLOCK_MONOTONIC, &t1);

		delta = t1.tv_nsec + t1.tv_sec * NSECS_PER_SEC;
		delta -= t0.tv_nsec + t0.tv_sec * NSECS_PER_SEC;
		tsum += delta;
		td[i] = delta * 1.0;

		lseek(fd, 0, SEEK_SET);
	}

	float mean = tsum / LOOPS;
	float calc = 0;

	for (int i = 0; i < LOOPS; i++) {
		float tmp = td[i] - mean;

		calc += tmp * tmp;
	}

	calc /= LOOPS;

	float std = sqrt(calc * 1.0);

	/* Mean time per pass [ns], mean bytes per pass, relative stddev [%] */
	printf("%ld %ld %5.3f\n", tsum / LOOPS, rs / LOOPS, (std / mean) * 100.0);
	return 0;
}