From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
To: Edmund Raile <edmund.raile@proton.me>
Cc: linux1394-devel@lists.sourceforge.net,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
stable@vger.kernel.org
Subject: Re: firewire-ohci: device rediscovery hardlock regression
Date: Fri, 25 Oct 2024 12:49:34 +0900 [thread overview]
Message-ID: <20241025034934.GA95457@workstation.local> (raw)
In-Reply-To: <8a9902a4ece9329af1e1e42f5fea76861f0bf0e8.camel@proton.me>
Hi,
Thanks for the bug report.
Coincidentally, I face the same problem with my TC Electronic Desktop
Konnekt 6, which reports one available port and two invalidated ports, and
investigate its cause. I think the problem occurs just for the devices
which have three or more ports.
I sent a fix[1] just now by referring to your suggestions. Would you please
evaluate the fix with your device?
I'm sorry for your inconvenience.
[1] [PATCH] firewire: core: fix invalid port index for parent device
https://lore.kernel.org/lkml/20241025034137.99317-1-o-takashi@sakamocchi.jp/
Thanks
Takashi Sakamoto
On Thu, Oct 24, 2024 at 01:56:31PM +0000, Edmund Raile wrote:
> Hello,
>
> I'd like to report a regression in firewire-ohci that results
> in the kernel hardlocking when re-discovering a FireWire device.
>
> TI XIO2213B
> RME FireFace 800
>
> It will occur under three conditions:
> * power-cycling the FireWire device
> * un- and re-plugging the FireWire device
> * suspending and then waking the PC
>
> Often it would also occur directly on boot in QEMU but I have not
> yet observed this specific behavior on bare metal.
>
> Here is an excerpt from the stack trace (don't know whether it is
> acceptable to send in full):
>
> kernel: ------------[ cut here ]------------
> kernel: refcount_t: addition on 0; use-after-free.
> kernel: WARNING: CPU: 3 PID: 116 at lib/refcount.c:25
> refcount_warn_saturate (/build/linux/lib/refcount.c:25 (discriminator
> 1))
> kernel: Workqueue: firewire_ohci bus_reset_work
> kernel: RIP: 0010:refcount_warn_saturate
> (/build/linux/lib/refcount.c:25 (discriminator 1))
> kernel: Call Trace:
> kernel: <TASK>
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: ? __warn.cold (/build/linux/kernel/panic.c:693)
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: ? report_bug (/build/linux/lib/bug.c:180
> /build/linux/lib/bug.c:219)
> kernel: ? handle_bug (/build/linux/arch/x86/kernel/traps.c:218)
> kernel: ? exc_invalid_op (/build/linux/arch/x86/kernel/traps.c:260
> (discriminator 1))
> kernel: ? asm_exc_invalid_op
> (/build/linux/./arch/x86/include/asm/idtentry.h:621)
> kernel: ? refcount_warn_saturate (/build/linux/lib/refcount.c:25
> (discriminator 1))
> kernel: for_each_fw_node (/build/linux/./include/linux/refcount.h:190
> /build/linux/./include/linux/refcount.h:241
> /build/linux/./include/linux/refcount.h:258
> /build/linux/drivers/firewire/core.h:199
> /build/linux/drivers/firewire/core-topology.c:275)
> kernel: ? __pfx_report_found_node (/build/linux/drivers/firewire/core-
> topology.c:312)
> kernel: fw_core_handle_bus_reset (/build/linux/drivers/firewire/core-
> topology.c:399 (discriminator 1) /build/linux/drivers/firewire/core-
> topology.c:504 (discriminator 1))
> kernel: bus_reset_work (/build/linux/drivers/firewire/ohci.c:2121)
> kernel: process_one_work
> (/build/linux/./arch/x86/include/asm/jump_label.h:27
> /build/linux/./include/linux/jump_label.h:207
> /build/linux/./include/trace/events/workqueue.h:110
> /build/linux/kernel/workqueue.c:3236)
> kernel: worker_thread (/build/linux/kernel/workqueue.c:3306
> (discriminator 2) /build/linux/kernel/workqueue.c:3393 (discriminator
> 2))
> kernel: ? __pfx_worker_thread (/build/linux/kernel/workqueue.c:3339)
> kernel: kthread (/build/linux/kernel/kthread.c:389)
> kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
> kernel: ret_from_fork (/build/linux/arch/x86/kernel/process.c:153)
> kernel: ? __pfx_kthread (/build/linux/kernel/kthread.c:342)
> kernel: ret_from_fork_asm (/build/linux/arch/x86/entry/entry_64.S:254)
> kernel: </TASK>
>
> I have identified the commit via bisection:
> 24b7f8e5cd656196a13077e160aec45ad89b58d9
> firewire: core: use helper functions for self ID sequence
>
> It was part of the following patch series:
> firewire: add tracepoints events for self ID sequence
> https://lore.kernel.org/all/20240605235155.116468-6-o-takashi@sakamocchi.jp/
>
> #regzbot introduced: 24b7f8e5cd65
>
> Since this was before v6.10-rc5 and stable 6.10.14 is EOL,
> stable v6.11.5 and mainline are affected.
>
> Reversion appears to be non-trivial as it is part of a patch
> series, other files have been altered as well and other commits
> build on top of it.
>
> Call chain:
> core-topology.c fw_core_handle_bus_reset()
> -> core-topology.c for_each_fw_node(card, local_node,
> report_found_node)
> -> core.h fw_node_get(root)
> -> refcount.h __refcount_inc(&node)
> -> refcount.h __refcount_add(1, r, oldp);
> -> refcount.h refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
> -> refcount.h REFCOUNT_WARN("addition on 0; use-after-free")
>
> Since local_node of fw_core_handle_bus_reset() is retrieved by
> local_node = build_tree(card, self_ids, self_id_count);
> build_tree() needs to be looked at, it was indeed altered by
> 24b7f8e5cd65.
>
> After a hard 3 hour look traversing all used functions and comparing
> against the original function (as of e404cacfc5ed), this caught my eye:
> for (port_index = 0; port_index < total_port_count;
> ++port_index) {
> switch (port_status) {
> case PHY_PACKET_SELF_ID_PORT_STATUS_PARENT:
> node->color = i;
>
> In both for loops, "port_index" was replaced by "i"
> "i" remains in use above:
> for (i = 0, h = &stack; i < child_port_count; i++)
> h = h->prev;
>
> While the original also used the less descriptive i in the loop
> for (i = 0; i < port_count; i++) {
> switch (get_port_type(sid, i)) {
> case SELFID_PORT_PARENT:
> node->color = i;
> but reset it to 0 at the beginning of the loop.
>
> So the stray "i" in the for loop should be replaced with the loop
> iterator "port_index" as it is meant to be synchronous with the
> loop iterator (i.e. the port_index), no?
>
> diff --git a/drivers/firewire/core-topology.c b/drivers/firewire/core-
> topology.c
> index 8c10f47cc8fc..7fd91ba9c9c4 100644
> --- a/drivers/firewire/core-topology.c
> +++ b/drivers/firewire/core-topology.c
> @@ -207,7 +207,7 @@ static struct fw_node *build_tree(struct fw_card
> *card, const u32 *sid, int self
> // the node->ports array where the
> parent node should be. Later,
> // when we handle the parent node, we
> fix up the reference.
> ++parent_count;
> - node->color = i;
> + node->color = port_index;
> break;
>
> What threw me off was discaridng node->color as it would be replaced
> later anyways (can't be important!), or so I thought.
>
> Please tell me, is this line of reasoning correct or am I missing
> something?
>
> Compiling 24b7f8e5cd65 and later mainline with the patch above
> resulted in a kernel that didn't crash!
>
> In case my solution should turn out to be correct, I will gladly
> submit the patch.
>
> Kind regards,
> Edmund Raile.
prev parent reply other threads:[~2024-10-25 3:49 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-24 13:56 firewire-ohci: device rediscovery hardlock regression Edmund Raile
2024-10-25 3:49 ` Takashi Sakamoto [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241025034934.GA95457@workstation.local \
--to=o-takashi@sakamocchi.jp \
--cc=edmund.raile@proton.me \
--cc=linux-kernel@vger.kernel.org \
--cc=linux1394-devel@lists.sourceforge.net \
--cc=regressions@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.