netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: Jason Xing <kerneljasonxing@gmail.com>, Jakub Kicinski <kuba@kernel.org>
Cc: brouer@redhat.com, jbrouer@redhat.com, davem@davemloft.net,
	edumazet@google.com, pabeni@redhat.com, ast@kernel.org,
	daniel@iogearbox.net, hawk@kernel.org, john.fastabend@gmail.com,
	stephen@networkplumber.org, simon.horman@corigine.com,
	sinquersw@gmail.com, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH v4 net-next 2/2] net: introduce budget_squeeze to help us tune rx behavior
Date: Mon, 20 Mar 2023 14:30:27 +0100
Message-ID: <870ba7b7-c38b-f4af-2087-688e9ae5a15d@redhat.com>
In-Reply-To: <CAL+tcoD+BoXsEBS5T_kvuUzDTuF3N7kO1eLqwNP3Wy6hps+BBA@mail.gmail.com>


On 17/03/2023 05.11, Jason Xing wrote:
> On Fri, Mar 17, 2023 at 11:26 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Fri, 17 Mar 2023 10:27:11 +0800 Jason Xing wrote:
>>>> That is the common case, and can be understood from the napi trace
>>>
>>> Thanks for your reply. It is commonly happening every day on many servers.
>>
>> Right but the common issue is the time squeeze, not budget squeeze,
> 
> Most of them are about time, so yes.
> 
>> and either way the budget squeeze doesn't really matter because
>> the softirq loop will call us again soon, if softirq itself is
>> not scheduled out.
>>

I agree, the budget squeeze count doesn't provide much value as it
doesn't indicate something critical (softirq loop will call us again
soon).  The time squeeze event is more critical and something that is
worth monitoring.

I see value in this patch, because it makes it possible to monitor the
time squeeze events.  Currently the counter is "polluted" by the budget
squeeze, making it impossible to get a proper time squeeze signal.
Thus, I see this patch as a fix to an old problem.
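
For context, here is a minimal sketch of how the existing counter can be
sampled today. It assumes the usual /proc/net/softnet_stat layout, where
each line is one CPU and the third hexadecimal field is time_squeeze. The
helper names are hypothetical and not part of the patch, and the value
read this way still mixes budget and time squeeze events.

```python
from typing import List

# Assumption: the third hex field on each per-CPU line of
# /proc/net/softnet_stat is the time_squeeze counter.
TIME_SQUEEZE_COL = 2


def parse_time_squeeze(text: str) -> List[int]:
    """Return the time_squeeze counter for each CPU line of softnet_stat text."""
    counters = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) <= TIME_SQUEEZE_COL:
            continue  # skip blank or truncated lines
        counters.append(int(fields[TIME_SQUEEZE_COL], 16))
    return counters


def squeeze_deltas(before: List[int], after: List[int]) -> List[int]:
    """Per-CPU increase of the counter between two samples (after - before)."""
    return [new - old for old, new in zip(before, after)]
```

Reading the file twice a few seconds apart and passing both results to
squeeze_deltas() gives a per-CPU rate instead of a lifetime total.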

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

That said (see below), besides monitoring the time squeeze counter, I
recommend adding some BPF monitoring to capture latency issues...

>> So if you want to monitor a meaningful event in your fleet, I think
>> a better event to monitor is the number of times ksoftirqd was woken
>> up and latency of it getting onto the CPU.
> 
> It's a good point. Thanks for your advice.

I'm willing to help you out writing a BPF-based tool that can help you
identify the issue Jakub describes above: high latency from when the
softIRQ is raised until softIRQ processing runs on the CPU.

I have this bpftrace script[1] available that does just that:

  [1] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/softirq_net_latency.bt
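
To illustrate the measurement itself (this is not the script in [1]),
here is a rough sketch that post-processes already-captured events. It
assumes the events come from the irq:softirq_raise and irq:softirq_entry
tracepoints and have been exported as (timestamp_ns, cpu, kind, vec)
tuples. That tuple format and the function names are hypothetical.
Latency is counted from the first unserviced raise on a CPU to the next
entry on that same CPU.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

NET_RX_VEC = 3  # NET_RX_SOFTIRQ vector number in the kernel's softirq enum

# Hypothetical export format: (timestamp_ns, cpu, "raise" | "entry", vec)
Event = Tuple[int, int, str, int]


def raise_to_entry_latencies(events: Iterable[Event],
                             vec: int = NET_RX_VEC) -> List[int]:
    """Latency in ns from the first pending raise to the next entry, per CPU."""
    pending: Dict[int, int] = {}  # cpu -> timestamp of earliest unserviced raise
    latencies = []
    for ts, cpu, kind, ev_vec in sorted(events):
        if ev_vec != vec:
            continue
        if kind == "raise":
            # Keep the earliest raise; later raises before entry are ignored.
            pending.setdefault(cpu, ts)
        elif kind == "entry" and cpu in pending:
            latencies.append(ts - pending.pop(cpu))
    return latencies


def log2_histogram(latencies_ns: Iterable[int]) -> Dict[int, int]:
    """Power-of-two histogram in microseconds.

    Bucket 0 holds latencies below 1 usec; bucket k > 0 holds
    latencies in the range [2**(k-1), 2**k) usec.
    """
    hist: Counter = Counter()
    for lat in latencies_ns:
        usec = lat // 1000
        hist[usec.bit_length()] += 1
    return dict(hist)
```

Doing the pairing in-kernel with BPF avoids exporting every event, so
treat this only as a description of what is being measured.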

Perhaps you can take the latency histograms and then plot a heatmap[2]
in your monitoring platform.

  [2] https://www.brendangregg.com/heatmaps.html
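
As a sketch of the heatmap step, assume one latency histogram is
collected per sampling interval, each a dict mapping bucket index to
count (a hypothetical format, not what any particular tool emits). The
function below arranges them into a matrix with one row per latency
bucket and one column per interval, which is the shape a heatmap plot
typically wants.

```python
from typing import Dict, List


def heatmap_matrix(histograms: List[Dict[int, int]]) -> List[List[int]]:
    """Rows are latency buckets (0..max), columns are sampling intervals."""
    # Highest bucket index seen in any interval; -1 if all are empty.
    max_bucket = max((max(h) for h in histograms if h), default=-1)
    return [
        [h.get(bucket, 0) for h in histograms]
        for bucket in range(max_bucket + 1)
    ]
```

Each cell is the number of events in that latency bucket during that
interval; the plotting side maps the count to a color.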

--Jesper

