From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
To: Edmund Raile <edmund.raile@proton.me>
Cc: linux1394-devel@lists.sourceforge.net,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
stable@vger.kernel.org
Subject: Re: firewire-ohci: device rediscovery hardlock regression
Date: Fri, 25 Oct 2024 12:49:34 +0900 [thread overview]
Message-ID: <20241025034934.GA95457@workstation.local> (raw)
In-Reply-To: <8a9902a4ece9329af1e1e42f5fea76861f0bf0e8.camel@proton.me>
Hi,
Thanks for the bug report.
Coincidentally, I face the same problem with my TC Electronic Desktop
Konnekt 6, which reports one available port and two invalidated ports, and
investigate its cause. I think the problem occurs just for the devices
which have three or more ports.
I sent a fix[1] just now by referring to your suggestions. Would you please
evaluate the fix with your device?
I'm sorry for your inconvenience.
[1] [PATCH] firewire: core: fix invalid port index for parent device
https://lore.kernel.org/lkml/20241025034137.99317-1-o-takashi@sakamocchi.jp/
Thanks
Takashi Sakamoto
On Thu, Oct 24, 2024 at 01:56:31PM +0000, Edmund Raile wrote:
> Hello,
>
> I'd like to report a regression in firewire-ohci that results
> in the kernel hardlocking when re-discovering a FireWire device.
>
> TI XIO2213B
> RME FireFace 800
>
> It will occur under three conditions:
> * power-cycling the FireWire device
> * un- and re-plugging the FireWire device
> * suspending and then waking the PC
>
> Often it would also occur directly on boot in QEMU but I have not
> yet observed this specific behavior on bare metal.
>
> Here is an excerpt from the stack trace (don't know whether it is
> acceptable to send in full):
>
> kernel: ------------[ cut here ]------------
> kernel: refcount_t: addition on 0; use-after-free.
> kernel: WARNING: CPU: 3 PID: 116 at lib/refcount.c:25
> refcount_warn_saturate (/build/linux/lib/refcount.c:25 (discriminator
> 1))
> kernel: Workqueue: firewire_ohci bus_reset_work
> kernel: RIP: 0010:refcount_warn_saturate
> (/build/linux/lib/refcount.c:25 (discriminator 1))
> kernel: Call Trace:
> kernel: <TASK>
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: ? __warn.cold (/build/linux/kernel/panic.c:693)
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: ? report_bug (/build/linux/lib/bug.c:180
> /build/linux/lib/bug.c:219)
> kernel: ? handle_bug (/build/linux/arch/x86/kernel/traps.c:218)
> kernel: ? exc_invalid_op (/build/linux/arch/x86/kernel/traps.c:260
> (discriminator 1))
> kernel: ? asm_exc_invalid_op
> (/build/linux/./arch/x86/include/asm/idtentry.h:621)
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: for_each_fw_node (/build/linux/./include/linux/refcount.h:190
> /build/linux/./include/linux/refcount.h:241
> /build/linux/./include/linux/refcount.h:258
> /build/linux/drivers/firewire/core.h:199
> /build/linux/drivers/firewire/core-topology.c:275)
> kernel: ? __pfx_report_found_node (/build/linux/drivers/firewire/core-
> topology.c:312)
> kernel: fw_core_handle_bus_reset (/build/linux/drivers/firewire/core-
> topology.c:399 (discriminator 1) /build/linux/drivers/firewire/core-
> topology.c:504 (discriminator 1))
> kernel: bus_reset_work (/build/linux/drivers/firewire/ohci.c:2121)
> kernel: process_one_work
> (/build/linux/./arch/x86/include/asm/jump_label.h:27
> /build/linux/./include/linux/jump_label.h:207
> /build/linux/./include/trace/events/workqueue.h:110
> /build/linux/kernel/workqueue.c:3236)
> kernel: worker_thread (/build/linux/kernel/workqueue.c:3306
> (discriminator 2) /build/linux/kernel/workqueue.c:3393 (discriminator
> 2))
> kernel: ? __pfx_worker_thread (/build/linux/kernel/workqueue.c:3339)
> kernel: kthread (/build/linux/kernel/kthread.c:389)
> kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
> kernel: ret_from_fork (/build/linux/arch/x86/kernel/process.c:153)
> kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
> kernel: ret_from_fork_asm (/build/linux/arch/x86/entry/entry_64.S:254)
> kernel: </TASK>
>
> I have identified the commit via bisection:
> 24b7f8e5cd656196a13077e160aec45ad89b58d9
> firewire: core: use helper functions for self ID sequence
>
> It was part of the following patch series:
> firewire: add tracepoints events for self ID sequence
> https://lore.kernel.org/all/20240605235155.116468-6-o-takashi@sakamocchi.jp/
>
> #regzbot introduced: 24b7f8e5cd65
>
> Since this was before v6.10-rc5 and stable 6.10.14 is EOL,
> stable v6.11.5 and mainline are affected.
>
> Reversion appears to be non-trivial as it is part of a patch
> series, other files have been altered as well and other commits
> build on top of it.
>
> Call chain:
> core-topology.c fw_core_handle_bus_reset()
> -> core-topology.c for_each_fw_node(card, local_node,
> report_found_node)
> -> core.h fw_node_get(root)
> -> refcount.h __refcount_inc(&node)
> -> refcount.h __refcount_add(1, r, oldp);
> -> refcount.h refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
> -> refcount.h REFCOUNT_WARN("addition on 0; use-after-free")
>
> Since local_node of fw_core_handle_bus_reset() is retrieved by
> local_node = build_tree(card, self_ids, self_id_count);
> build_tree() needs to be looked at, it was indeed altered by
> 24b7f8e5cd65.
>
> After a hard 3 hour look traversing all used functions and comparing
> against the original function (as of e404cacfc5ed), this caught my eye:
> for (port_index = 0; port_index < total_port_count;
> ++port_index) {
> switch (port_status) {
> case PHY_PACKET_SELF_ID_PORT_STATUS_PARENT:
> node->color = i;
>
> In both for loops, "port_index" was replaced by "i"
> "i" remains in use above:
> for (i = 0, h = &stack; i < child_port_count; i++)
> h = h->prev;
>
> While the original also used the less descriptive i in the loop
> for (i = 0; i < port_count; i++) {
> switch (get_port_type(sid, i)) {
> case SELFID_PORT_PARENT:
> node->color = i;
> but reset it to 0 at the beginning of the loop.
>
> So the stray "i" in the for loop should be replaced with the loop
> iterator "port_index" as it is meant to be synchronous with the
> loop iterator (i.e. the port_index), no?
>
> diff --git a/drivers/firewire/core-topology.c b/drivers/firewire/core-
> topology.c
> index 8c10f47cc8fc..7fd91ba9c9c4 100644
> --- a/drivers/firewire/core-topology.c
> +++ b/drivers/firewire/core-topology.c
> @@ -207,7 +207,7 @@ static struct fw_node *build_tree(struct fw_card
> *card, const u32 *sid, int self
> // the node->ports array where the
> parent node should be. Later,
> // when we handle the parent node, we
> fix up the reference.
> ++parent_count;
> - node->color = i;
> + node->color = port_index;
> break;
>
> What threw me off was discaridng node->color as it would be replaced
> later anyways (can't be important!), or so I thought.
>
> Please tell me, is this line of reasoning correct or am I missing
> something?
>
> Compiling 24b7f8e5cd65 and later mainline with the patch above
> resulted in a kernel that didn't crash!
>
> In case my solution should turn out to be correct, I will gladly
> submit the patch.
>
> Kind regards,
> Edmund Raile.
prev parent reply other threads:[~2024-10-25 3:49 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-24 13:56 firewire-ohci: device rediscovery hardlock regression Edmund Raile
2024-10-25 3:49 ` Takashi Sakamoto [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241025034934.GA95457@workstation.local \
--to=o-takashi@sakamocchi.jp \
--cc=edmund.raile@proton.me \
--cc=linux-kernel@vger.kernel.org \
--cc=linux1394-devel@lists.sourceforge.net \
--cc=regressions@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox