linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-usb@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: NOHZ tick-stop error: local softirq work is pending, handler #08!!! on Dell XPS 13 9360
Date: Thu, 24 Oct 2024 14:48:09 +0200	[thread overview]
Message-ID: <598d4118-3f63-44dd-b0da-e19efbb73933@molgen.mpg.de> (raw)
In-Reply-To: <9de8ebd1-53fc-48d7-af11-b5c1ed828b3d@molgen.mpg.de>

Dear Anna-Maria,


Am 21.08.24 um 23:01 schrieb Paul Menzel:
> [Added URLs for files.]
> 
> Am 21.08.24 um 10:20 schrieb Paul Menzel:
>> Dear Anna-Maria,
>>
>>
>> Thank you very much for the support. I was finally able to collect the 
>> data you asked for.
>>
>> Am 09.04.24 um 09:57 schrieb Anna-Maria Behnsen:
>>> Paul Menzel writes:
>>
>> […]
>>
>>>> Am 08.04.24 um 12:10 schrieb Anna-Maria Behnsen:
>>>>
>>>>> Paul Menzel writes:
>>>>
>>>>>> On Dell XPS 13 9360/0596KF, BIOS 2.21.0 06/02/2022, with Linux 
>>>>>> 6.9- rc2+
>>>>>> built from commit b1e6ec0a0fd0 (Merge tag 'docs-6.9-fixes' of
>>>>>> git://git.lwn.net/linux) the external USB-C adapter Dell DA300 
>>>>>> stopped
>>>>>> working (only the Ethernet port was used). Linux logged:
>>>>>
>>>>> thanks for the report. Can you please provide a trace beside the dmesg
>>>>> output? The following trace events should be enabled (via kernel 
>>>>> command
>>>>> line):
>>>>>
>>>>> trace_event=timer:*,timer_migration:*,sched:sched_switch,sched:sched_wakeup,sched:sched_process_hang,irq:softirq_entry,irq:softirq_raise,irq:softirq_exit
>>>> Unfortunately I haven’t been able to reproduce it until now. Should it
>>>> happen again, I am going to try your suggestion.
>>>
>>> Thanks for letting me know.
>>
>> I wanted to configure that in the running system, but wasn’t able to 
>> set all of these at once with `set_event`:
>>
>>      echo 'timer:*,timer_migration:*,sched:sched_switch,sched:sched_wakeup,sched:sched_process_hang,irq:softirq_entry,irq:softirq_raise,irq:softirq_exit' | sudo tee /sys/kernel/tracing/set_event
>>
>> For some reason setting them individually also did *not* work:
>>
>>      for e in timer:* timer_migration:* sched:sched_switch sched:sched_wakeup sched:sched_process_hang irq:softirq_entry irq:softirq_raise irq:softirq_exit'; do echo "$e" | sudo tee -a /sys/ kernel/tracing/set_event; done
>>
>> I then used
>>
>>      echo 1 | sudo tee /sys/kernel/tracing/events/timer/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/timer_migration/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/sched/sched_switch/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/sched/sched_wakeup/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/sched/sched_process_hang/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/irq/softirq_entry/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/irq/softirq_raise/enable
>>      echo 1 | sudo tee /sys/kernel/tracing/events/irq/softirq_exit/enable
>>
>> and also had to increase the buffer to bridge the gap between the 
>> event and me noticing it:
>>
>>      echo 96000 | sudo tee /sys/kernel/tracing/buffer_size_kb
>>
>> Then, with Linux v6.11-rc4-11-g521b1e7f4cf0b, I was able to get the 
>> trace for the event below:
>>
>>      [ 7542.706299] NOHZ tick-stop error: local softirq work is pending, handler #08!!!
>>
>> $ sudo cat /sys/kernel/tracing/trace
>> […]
>>   MediaPD~der #28-14000   [000] d..1.  7542.703768: hrtimer_cancel: hrtimer=000000008d2c9f3f
>>   MediaPD~der #28-14000   [000] .....  7542.703810: hrtimer_init: hrtimer=00000000c6f259e7 clockid=CLOCK_MONOTONIC mode=ABS
>>   MediaPD~der #28-14000   [000] d..1.  7542.703812: hrtimer_start:hrtimer=00000000c6f259e7 function=hrtimer_wakeup expires=7602581538204 softexpires=7602581488204 mode=ABS
>>   MediaPD~der #28-14000   [000] d..2.  7542.703821: sched_switch: prev_comm=MediaPD~der #28 prev_pid=14000 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dN.2.  7542.703931: sched_wakeup: comm=ImageBridgeChld pid=6041 prio=120 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.703937: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ImageBridgeChld next_pid=6041 next_prio=120
>>   ImageBridgeChld-6041    [000] d..2.  7542.704041: sched_switch: prev_comm=ImageBridgeChld prev_pid=6041 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dN.2.  7542.704174: sched_wakeup: comm=Renderer pid=4113 prio=120 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.704179: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=Renderer next_pid=4113 next_prio=120
>>          Renderer-4113    [000] d..2.  7542.704245: sched_switch: prev_comm=Renderer prev_pid=4113 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dNh2.  7542.704260: sched_wakeup: comm=IPC I/O Child pid=6029 prio=120 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.704267: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==>  next_comm=IPC I/O Child next_pid=6029 next_prio=120
>>     IPC I/O Child-6029    [000] d..2.  7542.704340: sched_switch: prev_comm=IPC I/O Child prev_pid=6029 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dN.2.  7542.704786: sched_wakeup: comm=Compositor pid=4123 prio=120 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.704791: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=Compositor next_pid=4123 next_prio=120
>>        Compositor-4123    [000] d..2.  7542.704944: sched_switch: prev_comm=Compositor prev_pid=4123 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dN.2.  7542.705943: sched_wakeup: comm=Compositor pid=4123 prio=120 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.705950: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=Compositor next_pid=4123 next_prio=120
>>        Compositor-4123    [000] d..2.  7542.706105: sched_switch: prev_comm=Compositor prev_pid=4123 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] d.h2.  7542.706328: hrtimer_cancel: hrtimer=000000009bbda66a
>>            <idle>-0       [000] d.h1.  7542.706329: hrtimer_expire_entry: hrtimer=000000009bbda66a function=tick_nohz_handler now=7542584007490
>>            <idle>-0       [000] d.h1.  7542.706333: softirq_raise: vec=9 [action=RCU]
>>            <idle>-0       [000] d.h1.  7542.706338: softirq_raise: vec=7 [action=SCHED]
>>            <idle>-0       [000] d.h1.  7542.706339: hrtimer_expire_exit: hrtimer=000000009bbda66a
>>            <idle>-0       [000] d.h2.  7542.706340: hrtimer_start: hrtimer=000000009bbda66a function=tick_nohz_handler expires=7542588000000 softexpires=7542588000000 mode=ABS
>>            <idle>-0       [000] ..s1.  7542.706345: softirq_entry: vec=7 [action=SCHED]
>>            <idle>-0       [000] ..s1.  7542.706359: softirq_exit: vec=7 [action=SCHED]
>>            <idle>-0       [000] ..s1.  7542.706360: softirq_entry: vec=9 [action=RCU]
>>            <idle>-0       [000] ..s1.  7542.706362: softirq_exit: vec=9 [action=RCU]
>>            <idle>-0       [000] dNh4.  7542.707672: sched_wakeup:  comm=irq/51-DLL075B: pid=194 prio=49 target_cpu=000
>>            <idle>-0       [000] d..2.  7542.707685: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=irq/51-DLL075B: next_pid=194 next_prio=49
>>   irq/51-DLL075B:-194     [000] .....  7542.707708: timer_init: timer=00000000630ae178
>>   irq/51-DLL075B:-194     [000] d..1.  7542.707710: timer_start: timer=00000000630ae178 function=process_timeout expires=4296778179 [timeout=250] bucket_expiry=4296778184 cpu=0 idx=121 flags=
>>   irq/51-DLL075B:-194     [000] d..2.  7542.707718: sched_switch: prev_comm=irq/51-DLL075B: prev_pid=194 prev_prio=49 prev_state=D ==> next_comm=swapper/0 next_pid=0 next_prio=120
>>            <idle>-0       [000] dN.2.  7542.709072: sched_wakeup: comm=AudioIP~ent RPC pid=6671 prio=120 target_cpu=000
>> […]
>>
>> The trace file is 320 MB big. If you need the full trace and log, 
>> please tell me, and I’ll upload it somewhere.
> 
> https://owww.molgen.mpg.de/~pmenzel/20240821--linux-6.10-rc4+.txt
> https://owww.molgen.mpg.de/~pmenzel/20240821--soft-irq--trace.7z

Just for the record, I am still seeing this with 
6.12.0-rc4-00047-gc2ee9f594da8 (KVM: selftests: Fix build on on non-x86 
architectures).


Kind regards,

Paul

  reply	other threads:[~2024-10-24 12:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <354a2690-9bbf-4ccb-8769-fa94707a9340@molgen.mpg.de>
     [not found] ` <87o7ak411y.fsf@somnus>
     [not found]   ` <256fdb2e-9b83-4837-bd31-0c34e4267c31@molgen.mpg.de>
     [not found]     ` <87sezv7ytw.fsf@somnus>
2024-08-21  8:20       ` NOHZ tick-stop error: local softirq work is pending, handler #08!!! on Dell XPS 13 9360 Paul Menzel
2024-08-21 21:01         ` Paul Menzel
2024-10-24 12:48           ` Paul Menzel [this message]
2025-02-10  7:55             ` Paul Menzel
2025-02-10 11:45               ` Michał Pecio
2025-02-10 11:59                 ` Paul Menzel
2025-02-10 13:26                   ` Frederic Weisbecker
2025-02-11 11:57                     ` Paul Menzel
2025-02-11 15:02                       ` Frederic Weisbecker
2025-02-13 17:07                         ` Paul Menzel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=598d4118-3f63-44dd-b0da-e19efbb73933@molgen.mpg.de \
    --to=pmenzel@molgen.mpg.de \
    --cc=anna-maria@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).