From: Dmitry Ilvokhin <d@ilvokhin.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND] sched/stats: Optimize /proc/schedstat printing
Date: Wed, 29 Oct 2025 14:46:33 +0000
Message-ID: <aQIoySXrIVcKXXGS@shell.ilvokhin.com>
In-Reply-To: <20251029140755.GF4067720@noisy.programming.kicks-ass.net>

On Wed, Oct 29, 2025 at 03:07:55PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 29, 2025 at 01:07:15PM +0000, Dmitry Ilvokhin wrote:
> > seq_printf() supports rich format strings for printing decimals, but
> > /proc/schedstat does not need them, since the majority of the data is
> > space-separated decimals. Use seq_put_decimal_ull() instead as a
> > faster alternative.
> > 
> > Performance counter stats (truncated) for sh -c 'cat /proc/schedstat >
> > /dev/null', before and after applying the patch, from a machine with
> > 72 CPUs are below.
> > 
> > Before:
> > 
> >       2.94 msec task-clock               #    0.820 CPUs utilized
> >          1      context-switches         #  340.551 /sec
> >          0      cpu-migrations           #    0.000 /sec
> >        340      page-faults              #  115.787 K/sec
> > 10,327,200      instructions             #    1.89  insn per cycle
> >                                          #    0.10  stalled cycles per insn
> >  5,458,307      cycles                   #    1.859 GHz
> >  1,052,733      stalled-cycles-frontend  #   19.29% frontend cycles idle
> >  2,066,321      branches                 #  703.687 M/sec
> >     25,621      branch-misses            #    1.24% of all branches
> > 
> > 0.00357974 +- 0.00000209 seconds time elapsed  ( +-  0.06% )
> > 
> > After:
> > 
> >       2.50 msec task-clock              #    0.785 CPUs utilized
> >          1      context-switches        #  399.780 /sec
> >          0      cpu-migrations          #    0.000 /sec
> >        340      page-faults             #  135.925 K/sec
> >  7,371,867      instructions            #    1.59  insn per cycle
> >                                         #    0.13  stalled cycles per insn
> >  4,647,053      cycles                  #    1.858 GHz
> >    986,487      stalled-cycles-frontend #   21.23% frontend cycles idle
> >  1,591,374      branches                #  636.199 M/sec
> >     28,973      branch-misses           #    1.82% of all branches
> > 
> > 0.00318461 +- 0.00000295 seconds time elapsed  ( +-  0.09% )
> > 
> > This is a ~11% (relative) improvement in elapsed time.
> 
> Yeah, but who cares? Why do we want less obvious code for a silly stats
> file?

Thanks for the feedback, Peter.

Fair point that /proc/schedstat isn't a hot path in the kernel itself,
but it is a hot path for monitoring software (Prometheus, for example).
In large fleets, these files are polled periodically (often every few
seconds) on every machine. The cumulative overhead adds up quickly
across thousands of nodes, so reducing the cost of generating these
stats does have a measurable operational impact. With the ongoing trend
toward higher core counts per machine, this cost becomes even more
noticeable over time.
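
To illustrate where the saving comes from, here is a minimal userspace
sketch. It is not the patch and not the kernel implementation: struct
outbuf, put_with_printf() and put_decimal() are hypothetical names made
up for this example, and the real seq_put_decimal_ull() operates on a
struct seq_file and differs in detail. The only point is that the
specialized path writes a delimiter and the digits directly, while the
generic path has to interpret a format string on every call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for a seq_file output buffer. */
struct outbuf {
	char data[256];
	size_t len;
};

/* Generic path: snprintf() parses the format string on every call. */
static void put_with_printf(struct outbuf *b, unsigned long long v)
{
	size_t room = sizeof(b->data) - b->len;
	int n = snprintf(b->data + b->len, room, " %llu", v);

	if (n > 0 && (size_t)n < room)
		b->len += (size_t)n;
	else
		b->data[b->len] = '\0';	/* sketch: drop output that does not fit */
}

/* Specialized path: copy the delimiter, then emit the digits directly. */
static void put_decimal(struct outbuf *b, const char *delim,
			unsigned long long v)
{
	char tmp[20];	/* a 64-bit value has at most 20 decimal digits */
	size_t n = 0, dlen = strlen(delim);

	/* Produce digits least-significant first. */
	do {
		assert(n < sizeof(tmp));
		tmp[n++] = (char)('0' + v % 10);
		v /= 10;
	} while (v);

	if (b->len + dlen + n >= sizeof(b->data))
		return;	/* sketch: drop output that does not fit */

	memcpy(b->data + b->len, delim, dlen);
	b->len += dlen;

	/* Copy the digits back in most-significant-first order. */
	while (n)
		b->data[b->len++] = tmp[--n];
	b->data[b->len] = '\0';
}
```

Calling put_with_printf(&a, v) and put_decimal(&b, " ", v) for the same
sequence of values on two zero-initialized buffers produces identical
strings; only the amount of work per value differs.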

I've tried to keep the code as readable as possible, but I understand if
you think an ~11% improvement isn't worth the added complexity. If you
have suggestions for making the code cleaner or the intent clearer, I'd
be happy to rework it.
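
For reference, the relative figures can be rechecked from the perf stat
output quoted above. The helper names below are hypothetical and exist
only for this small check; the inputs are the before/after values from
the quoted counters.

```c
#include <assert.h>
#include <stdio.h>

/* Relative reduction from `before` to `after`, in percent. */
static double relative_reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Print one "name: before -> after (x% lower)" row. */
static void print_row(const char *name, double before, double after)
{
	printf("%-14s %.8g -> %.8g (%.1f%% lower)\n", name, before, after,
	       relative_reduction_pct(before, after));
}
```

With the quoted numbers, print_row("time elapsed", 0.00357974,
0.00318461) reports roughly 11.0%, print_row("instructions", 10327200,
7371867) roughly 28.6%, and print_row("cycles", 5458307, 4647053)
roughly 14.9%.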
