public inbox for linux-next@vger.kernel.org
From: Johannes Berg <johannes@sipsolutions.net>
To: Bert Karwatzki <spasswolf@web.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "linux-next@vger.kernel.org" <linux-next@vger.kernel.org>,
	 "llvm@lists.linux.dev" <llvm@lists.linux.dev>,
	Thomas Gleixner <tglx@linutronix.de>,
	 linux-wireless@vger.kernel.org
Subject: Re: lockup and kernel panic in linux-next-202505{09,12} when compiled with clang
Date: Thu, 15 May 2025 08:30:50 +0200
Message-ID: <8684a2b4bf367e2e2a97e2b52356ffe5436a8270.camel@sipsolutions.net>
In-Reply-To: <f8552d41fb7eae286803b78302390614179b33b0.camel@web.de>

On Thu, 2025-05-15 at 00:27 +0200, Bert Karwatzki wrote:
> On Wednesday, 14.05.2025 at 20:56 +0200, Johannes Berg wrote:
> > > 
> > > I've split off the problematic piece of code into a noinline function to simplify the disassembly:
> > > 
> > 
> > Oh and also, does it even still crash with that? :)
> 
> Yes, it still crashes when compiled with clang.

OK, just checking. :)

FWIW, I'm not convinced at all that the code you were looking at is
really the problem. The crash (see below) is happening on the status
side. Of course, it cannot crash on the status side if the TX side never
enters anything into the IDR data structure and never tags the SKB for
the IDR lookup, because then the status side never tries to create the
status report at all.

Basically what happens is this:

- on TX, if we have a socket requesting status, create a copy of the
  SKB, put it into the IDR, and put the IDR index into the original
  skb->cb
- then transmit the original skb, of course
- on TX status report from the driver, see if the skb->cb is tagged with
  the IDR value; if so, report the copy of the SKB back to the socket
  with the status information

(The reason we need to make a copy is that the SKB could be encrypted or
otherwise modified in flight, and we don't want to undo that, rather
keeping a copy for the report.)
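
To make the flow above concrete, here is a minimal userspace sketch of
it. This is not the mac80211 code: struct pkt, the pending[] table,
tx_prepare() and tx_status() are all made-up names, the array merely
stands in for the IDR, and the ack_id field stands in for the tag kept
in skb->cb.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_PENDING 16 /* hypothetical table size; stands in for the IDR */

/* Hypothetical stand-in for an SKB. */
struct pkt {
	uint16_t ack_id;        /* 0 = no status requested (the "tag in skb->cb") */
	size_t len;
	unsigned char data[64];
};

/* Slot 0 stays unused so that id 0 can mean "not tagged". */
static struct pkt *pending[MAX_PENDING + 1];

/* TX side: if status was requested, stash a copy and tag the original.
 * Returns the id used, or 0 if nothing was stored. */
static uint16_t tx_prepare(struct pkt *p, int wants_status)
{
	assert(p != NULL);
	p->ack_id = 0;
	if (!wants_status)
		return 0;

	for (uint16_t id = 1; id <= MAX_PENDING; id++) {
		if (pending[id])
			continue;
		struct pkt *copy = malloc(sizeof(*copy));
		if (!copy)
			return 0;
		*copy = *p;         /* snapshot before any in-flight changes */
		pending[id] = copy;
		p->ack_id = id;     /* tag only the original */
		return id;
	}
	return 0;                   /* table full: send without status */
}

/* Status side: if the packet is tagged, hand back the stored copy and
 * release the slot. The caller owns (and frees) the returned copy. */
static struct pkt *tx_status(const struct pkt *p)
{
	uint16_t id = p->ack_id;

	if (id == 0 || id > MAX_PENDING)
		return NULL;

	struct pkt *copy = pending[id];
	pending[id] = NULL;
	return copy;
}
```

The point of the sketch is only that the status side does nothing at
all for an untagged packet, and that the copy it reports is unaffected
by whatever happened to the original after tx_prepare().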

>  [  267.339591][  T575] BUG: unable to handle page fault for address: ffffffff51e080b0
>  [  267.339598][  T575] #PF: supervisor write access in kernel mode
>  [  267.339602][  T575] #PF: error_code(0x0002) - not-present page
>  [  267.339606][  T575] PGD f1cc3c067 P4D f1cc3c067 PUD 0 
>  [  267.339613][  T575] Oops: Oops: 0002 [#1] SMP NOPTI
>  [  267.339622][  T575] CPU: 0 UID: 0 PID: 575 Comm: napi/phy0-0 Not tainted
> 6.15.0-rc6-next-20250513-llvm-00009-gec34cd07a425 #968 PREEMPT_{RT,(full)} 
>  [  267.339629][  T575] Hardware name: Micro-Star International Co., Ltd. Alpha
> 15 B5EEK/MS-158L, BIOS E158LAMS.10F 11/11/2024
>  [  267.339632][  T575] RIP: 0010:queued_spin_lock_slowpath+0x120/0x1c0
...
> [  267.339692][  T575] Call Trace:
>  [  267.339701][  T575]  <TASK>
>  [  267.339705][  T575]  _raw_spin_lock_irqsave+0x57/0x60
>  [  267.339714][  T575]  rt_spin_lock+0x73/0xa0
>  [  267.339720][  T575]  sock_queue_err_skb+0xdc/0x140
>  [  267.339727][  T575]  skb_complete_wifi_ack+0xa9/0x120
>  [  267.339737][  T575]  ieee80211_report_used_skb+0x541/0x6e0 [mac80211]
>  [  267.339799][  T575]  ? srso_alias_return_thunk+0x5/0xfbef5
>  [  267.339804][  T575]  ? start_dl_timer+0xcf/0x110
>  [  267.339814][  T575]  ieee80211_tx_status_ext+0x3b3/0x870 [mac80211]
>  [  267.339851][  T575]  ? raw_spin_rq_lock_nested+0x15/0x80
>  [  267.339862][  T575]  ? srso_alias_return_thunk+0x5/0xfbef5
>  [  267.339866][  T575]  ? rt_spin_lock+0x3d/0xa0
>  [  267.339873][  T575]  ? mt76_tx_status_unlock+0x38/0x230 [mt76]
>  [  267.339886][  T575]  mt76_tx_status_unlock+0x1e0/0x230 [mt76]

Yeah, so that's the crash on the status report as explained above. It
almost looks like the skb->sk was freed and is somehow invalid now?
But I don't see a general issue here (will keep digging), and how come
it only shows up with clang?

Since it reproduces pretty reliably, maybe you could try it with KASAN
enabled?

Also could be interesting - what userspace are you running with wifi?
What tool is even setting up the wifi status? If you don't really know,
maybe just put WARN_ON(1) into net/core/sock.c where SO_WIFI_STATUS is
written (sk_setsockopt).
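
For reference, the userspace side that would end up in that code path
looks roughly like the sketch below. enable_wifi_status() is a made-up
helper name, and the fallback value 41 is an assumption that only holds
for architectures using the asm-generic socket numbers; normally
<sys/socket.h> already provides SO_WIFI_STATUS on Linux.

```c
#include <sys/socket.h>

#ifndef SO_WIFI_STATUS
#define SO_WIFI_STATUS 41 /* assumption: asm-generic value, some arches differ */
#endif

/* Ask the kernel to report wifi TX status for packets sent on fd.
 * Returns 0 on success, -1 on failure with errno set (as setsockopt does). */
int enable_wifi_status(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_SOCKET, SO_WIFI_STATUS, &one, sizeof(one));
}
```

Any process that does this on a socket and then sends over wifi would
get the copy queued to its socket error queue, which matches the
sock_queue_err_skb() call in the trace.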

johannes
