* [PATCH net-next] FDDI: defza: Sanitise the reset safety timer
@ 2026-05-09 21:04 Maciej W. Rozycki
2026-05-13 1:06 ` Jakub Kicinski
0 siblings, 1 reply; 2+ messages in thread
From: Maciej W. Rozycki @ 2026-05-09 21:04 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, linux-kernel
The reset actions of the DEFZA adapters are exceedingly slow, taking up
to 30 seconds to complete by the device spec and typically in the range
of 10 seconds in reality, as required for the device RTOS to boot, still
quite a lot. Therefore a state machine is used that's interrupt driven,
however a safety mechanism is required in case of adapter malfunction,
so that if no state change interrupt has arrived in time, then the
situation is taken care of.
The safety mechanism depends on the origin of the reset. For regular
adapter initialisation at the device probe time a sleep is requested.
However a reset is also required by the device spec when the adapter has
transitioned into the halted state, such as in response to a PC Trace
event in the course of ring fault recovery, possibly a common network
event. In that case no sleep is possible as a device halt is reported
at the hardirq level.
A timer is therefore set up to ensure progress in case no adapter state
change interrupt has arrived in time, but as from commit 168f6b6ffbee
("timers: Use del_timer_sync() even on UP") a warning is issued as the
timer is deleted in the hardirq handler upon an expected state change:
defza: v.1.1.4 Oct 6 2018 Maciej W. Rozycki
tc2: DEC FDDIcontroller 700 or 700-C at 0x18000000, irq 4
tc2: resetting the board...
------------[ cut here ]------------
WARNING: kernel/time/timer.c:1611 at __timer_delete_sync+0x104/0x120, CPU#0: swapper/0/0
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 7.0.0-dirty #2 VOLUNTARY
Stack : 9800000002027d08 00000000140120e0 0000000000000000 ffffffff8089d468
0000000000000000 0000000000000000 ffffffff807ed6b8 ffffffff80897458
ffffffff80897400 9800000002027b88 0000000000000000 7070617773203a6d
0000000000000000 9800000002027ba4 0000000000001000 6465746e69617420
0000000000000000 ffffffff807ed6b8 00000000140120e0 0000000000000009
000000000000064b ffffffff800dd14c 0000000000000036 9800000002184000
0000000000000000 0000000000000020 0000000000000000 ffffffff80910000
ffffffff8085c000 9800000002027c70 0000000000000001 ffffffff80045fa0
0000000000000000 0000000000000000 0000000000000000 0000000000000009
000000000000064b ffffffff800502b8 ffffffff807ed6b8 ffffffff80045fa0
...
Call Trace:
[<ffffffff800502b8>] show_stack+0x28/0xf0
[<ffffffff80045fa0>] dump_stack_lvl+0x48/0x7c
[<ffffffff80068c98>] __warn+0xa0/0x128
[<ffffffff8004120c>] warn_slowpath_fmt+0x64/0xa4
[<ffffffff800dd14c>] __timer_delete_sync+0x104/0x120
[<ffffffff804934ac>] fza_interrupt+0xc74/0xeb8
[<ffffffff800c6390>] __handle_irq_event_percpu+0x70/0x228
[<ffffffff800c6560>] handle_irq_event_percpu+0x18/0x78
[<ffffffff800cc320>] handle_percpu_irq+0x50/0x80
[<ffffffff800c5970>] generic_handle_irq+0x90/0xd0
[<ffffffff806e956c>] do_IRQ+0x1c/0x30
[<ffffffff8004ad4c>] handle_int+0x148/0x154
[<ffffffff800ab7c0>] do_idle+0x40/0x108
[<ffffffff800abb0c>] cpu_startup_entry+0x2c/0x38
[<ffffffff806dfec8>] kernel_init+0x0/0x108
---[ end trace 0000000000000000 ]---
tc2: OK
tc2: model 700 (DEFZA-AA), MMF PMD, address 08-00-2b-xx-xx-xx
tc2: ROM rev. 1.0, firmware rev. 1.2, RMC rev. A, SMT ver. 1
tc2: link unavailable
------------[ cut here ]------------
WARNING: kernel/time/timer.c:1611 at __timer_delete_sync+0x104/0x120, CPU#0: swapper/0/0
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 7.0.0-dirty #2 VOLUNTARY
Tainted: [W]=WARN
Stack : 9800000002027d08 00000000140120e0 0000000000000000 ffffffff8089d468
0000000000000000 0000000000000000 ffffffff807ed6b8 ffffffff80897458
ffffffff80897400 9800000002027b88 0000000000000000 0000000000000000
0000000000000000 9800000002027ba4 0000000000001000 0000000000000000
0000000000000000 ffffffff807ed6b8 00000000140120e0 0000000000000009
000000000000064b ffffffff800dd14c 0000000000000036 9800000002184000
0000000000000000 0000000000000020 0000000000000000 ffffffff80910000
ffffffff8085c000 9800000002027c70 0000000000000001 ffffffff80045fa0
0000000000000000 0000000000000000 0000000000000000 0000000000000009
000000000000064b ffffffff800502b8 ffffffff807ed6b8 ffffffff80045fa0
...
Call Trace:
[<ffffffff800502b8>] show_stack+0x28/0xf0
[<ffffffff80045fa0>] dump_stack_lvl+0x48/0x7c
[<ffffffff80068c98>] __warn+0xa0/0x128
[<ffffffff8004120c>] warn_slowpath_fmt+0x64/0xa4
[<ffffffff800dd14c>] __timer_delete_sync+0x104/0x120
[<ffffffff804934ac>] fza_interrupt+0xc74/0xeb8
[<ffffffff800c6390>] __handle_irq_event_percpu+0x70/0x228
[<ffffffff800c6560>] handle_irq_event_percpu+0x18/0x78
[<ffffffff800cc320>] handle_percpu_irq+0x50/0x80
[<ffffffff800c5970>] generic_handle_irq+0x90/0xd0
[<ffffffff806e956c>] do_IRQ+0x1c/0x30
[<ffffffff8004ad4c>] handle_int+0x148/0x154
[<ffffffff806de8a4>] arch_local_irq_disable+0x4/0x28
[<ffffffff800ab7d0>] do_idle+0x50/0x108
[<ffffffff800abb0c>] cpu_startup_entry+0x2c/0x38
[<ffffffff806dfec8>] kernel_init+0x0/0x108
---[ end trace 0000000000000000 ]---
tc2: registered as fddi0
The immediate origin of the new warning is the switch away from aliasing
del_timer_sync() to del_timer() (timer_delete_sync() to timer_delete()
in terms of current function names) for UP configurations, which however
is the only choice for this driver anyway as no SMP hardware supports
the TURBOchannel bus this device interfaces to. Therefore there is a
very remote issue only this is a sign of.
Specifically if an adapter reset issued upon a transition to the halted
state times out and first triggers fza_reset_timer() for another reset
assertion, which then schedules fza_reset_timer() for reset deassertion
and then that second call is pre-empted after poking at the hardware,
but before the timer has been rearmed and owing to high system load
causing exceedingly high scheduling latency control is not handed back
before a transition to the uninitialised state has caused the timer to
be deleted even before it has been started, then fza_reset_timer() will
be called yet again and issue another reset even though by then the
adapter has already recovered.
Prevent this situation from happening by switching to timer_delete() for
the transition to the halted state and protect the code region affected
with a spinlock, also to make sure add_timer() has not been called twice
in a row due to an execution race between the interrupt handler and the
timer handler (though it could only happen on SMP, but let's keep the
driver clean). It's a very unlikely sequence of events to happen and
therefore there's no point in trying to be overly clever about it, such
as by placing printk() calls outside the protection. For the transition
to the uninitialised state switch to timer_delete_sync_try() instead, so
that a timer isn't deleted that's just been rearmed by the timer handler
and needs to watch for the device to come out of reset again (again, an
SMP scenario only).
Retain timer_delete_sync() invocations outside the hardirq context for a
stray timer not to fire once device structures have been released.
Fixes: 61414f5ec9834 ("FDDI: defza: Add support for DEC FDDIcontroller 700 TURBOchannel adapter")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
---
drivers/net/fddi/defza.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
linux-defza-timer-lock.diff
Index: linux-macro/drivers/net/fddi/defza.c
===================================================================
--- linux-macro.orig/drivers/net/fddi/defza.c
+++ linux-macro/drivers/net/fddi/defza.c
@@ -984,7 +984,7 @@ static irqreturn_t fza_interrupt(int irq
case FZA_STATE_UNINITIALIZED:
netif_carrier_off(dev);
- timer_delete_sync(&fp->reset_timer);
+ timer_delete_sync_try(&fp->reset_timer);
fp->ring_cmd_index = 0;
fp->ring_uns_index = 0;
fp->ring_rmc_tx_index = 0;
@@ -1018,7 +1018,9 @@ static irqreturn_t fza_interrupt(int irq
fp->queue_active = 0;
netif_stop_queue(dev);
pr_debug("%s: queue stopped\n", fp->name);
- timer_delete_sync(&fp->reset_timer);
+
+ spin_lock(&fp->lock);
+ timer_delete(&fp->reset_timer);
pr_warn("%s: halted, reason: %x\n", fp->name,
FZA_STATUS_GET_HALT(status));
fza_regs_dump(fp);
@@ -1027,6 +1029,8 @@ static irqreturn_t fza_interrupt(int irq
fp->timer_state = 0;
fp->reset_timer.expires = jiffies + 45 * HZ;
add_timer(&fp->reset_timer);
+ spin_unlock(&fp->lock);
+
break;
default:
@@ -1046,7 +1050,9 @@ static irqreturn_t fza_interrupt(int irq
static void fza_reset_timer(struct timer_list *t)
{
struct fza_private *fp = timer_container_of(fp, t, reset_timer);
+ unsigned long flags;
+ spin_lock_irqsave(&fp->lock, flags);
if (!fp->timer_state) {
pr_err("%s: RESET timed out!\n", fp->name);
pr_info("%s: trying harder...\n", fp->name);
@@ -1069,6 +1075,7 @@ static void fza_reset_timer(struct timer
fp->reset_timer.expires = jiffies + 45 * HZ;
}
add_timer(&fp->reset_timer);
+ spin_unlock_irqrestore(&fp->lock, flags);
}
static int fza_set_mac_address(struct net_device *dev, void *addr)
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [PATCH net-next] FDDI: defza: Sanitise the reset safety timer
2026-05-09 21:04 [PATCH net-next] FDDI: defza: Sanitise the reset safety timer Maciej W. Rozycki
@ 2026-05-13 1:06 ` Jakub Kicinski
0 siblings, 0 replies; 2+ messages in thread
From: Jakub Kicinski @ 2026-05-13 1:06 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni, netdev,
linux-kernel
On Sat, 9 May 2026 22:04:50 +0100 (BST) Maciej W. Rozycki wrote:
> case FZA_STATE_UNINITIALIZED:
> netif_carrier_off(dev);
> - timer_delete_sync(&fp->reset_timer);
> + timer_delete_sync_try(&fp->reset_timer);
> fp->ring_cmd_index = 0;
> fp->ring_uns_index = 0;
> fp->ring_rmc_tx_index = 0;
> @@ -1018,7 +1018,9 @@ static irqreturn_t fza_interrupt(int irq
> fp->queue_active = 0;
> netif_stop_queue(dev);
> pr_debug("%s: queue stopped\n", fp->name);
> - timer_delete_sync(&fp->reset_timer);
> +
> + spin_lock(&fp->lock);
> + timer_delete(&fp->reset_timer);
Noob q, why use timer_delete_sync_try() above ?
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-05-13 1:06 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-09 21:04 [PATCH net-next] FDDI: defza: Sanitise the reset safety timer Maciej W. Rozycki
2026-05-13 1:06 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox