From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Adam Goldman <adamg@pobox.com>,
Takashi Sakamoto <o-takashi@sakamocchi.jp>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 23/52] firewire: core: send bus reset promptly on gap count error
Date: Tue, 27 Feb 2024 14:26:10 +0100 [thread overview]
Message-ID: <20240227131549.293475652@linuxfoundation.org> (raw)
In-Reply-To: <20240227131548.514622258@linuxfoundation.org>
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Sakamoto <o-takashi@sakamocchi.jp>
[ Upstream commit 7ed4380009e96d9e9c605e12822e987b35b05648 ]
If we are bus manager and the bus has inconsistent gap counts, send a
bus reset immediately instead of trying to read the root node's config
ROM first. Otherwise, we could spend a lot of time trying to read the
config ROM but never succeeding.
This eliminates a 50+ second delay before the FireWire bus is usable after
a newly connected device is powered on in certain circumstances.
The delay occurs if a gap count inconsistency occurs, we are not the root
node, and we become bus manager. One scenario that causes this is with a TI
XIO2213B OHCI, the first time a Sony DSR-25 is powered on after being
connected to the FireWire cable. In this configuration, the Linux box will
not receive the initial PHY configuration packet sent by the DSR-25 as IRM,
resulting in the DSR-25 having a gap count of 44 while the Linux box has a
gap count of 63.
FireWire devices have a gap count parameter, which is set to 63 on power-up
and can be changed with a PHY configuration packet. This determines the
duration of the subaction and arbitration gaps. For reliable communication,
all nodes on a FireWire bus must have the same gap count.
A node may have zero or more of the following roles: root node, bus manager
(BM), isochronous resource manager (IRM), and cycle master. Unless a root
node was forced with a PHY configuration packet, any node might become root
node after a bus reset. Only the root node can become cycle master. If the
root node is not cycle master capable, the BM or IRM should force a change
of root node.
After a bus reset, each node sends a self-ID packet, which contains its
current gap count. A single bus reset does not change the gap count, but
two bus resets in a row will set the gap count to 63. Because a consistent
gap count is required for reliable communication, IEEE 1394a-2000 requires
that the bus manager generate a bus reset if it detects that the gap count
is inconsistent.
When the gap count is inconsistent, build_tree() will notice this after the
self identification process. It will set card->gap_count to the invalid
value 0. If we become bus master, this will force bm_work() to send a bus
reset when it performs gap count optimization.
After a bus reset, there is no bus manager. We will almost always try to
become bus manager. Once we become bus manager, we will first determine
whether the root node is cycle master capable. Then, we will determine if
the gap count should be changed. If either the root node or the gap count
should be changed, we will generate a bus reset.
To determine if the root node is cycle master capable, we read its
configuration ROM. bm_work() will wait until we have finished trying to
read the configuration ROM.
However, an inconsistent gap count can make this take a long time.
read_config_rom() will read the first few quadlets from the config ROM. Due
to the gap count inconsistency, eventually one of the reads will time out.
When read_config_rom() fails, fw_device_init() calls it again until
MAX_RETRIES is reached. This takes 50+ seconds.
Once we give up trying to read the configuration ROM, bm_work() will wake
up, assume that the root node is not cycle master capable, and do a bus
reset. Hopefully, this will resolve the gap count inconsistency.
This change makes bm_work() check for an inconsistent gap count before
waiting for the root node's configuration ROM. If the gap count is
inconsistent, bm_work() will immediately do a bus reset. This eliminates
the 50+ second delay and rapidly brings the bus to a working state.
I considered that if the gap count is inconsistent, a PHY configuration
packet might not be successful, so it could be desirable to skip the PHY
configuration packet before the bus reset in this case. However, IEEE
1394a-2000 and IEEE 1394-2008 say that the bus manager may transmit a PHY
configuration packet before a bus reset when correcting a gap count error.
Since the standard endorses this, I decided it's safe to retain the PHY
configuration packet transmission.
Normally, after a topology change, we will reset the bus a maximum of 5
times to change the root node and perform gap count optimization. However,
if there is a gap count inconsistency, we must always generate a bus reset.
Otherwise the gap count inconsistency will persist and communication will
be unreliable. For that reason, if there is a gap count inconstency, we
generate a bus reset even if we already reached the 5 reset limit.
Signed-off-by: Adam Goldman <adamg@pobox.com>
Reference: https://sourceforge.net/p/linux1394/mailman/message/58727806/
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/firewire/core-card.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/firewire/core-card.c b/drivers/firewire/core-card.c
index 11c634125c7df..0e6f96c0e3957 100644
--- a/drivers/firewire/core-card.c
+++ b/drivers/firewire/core-card.c
@@ -442,7 +442,23 @@ static void bm_work(struct work_struct *work)
*/
card->bm_generation = generation;
- if (root_device == NULL) {
+ if (card->gap_count == 0) {
+ /*
+ * If self IDs have inconsistent gap counts, do a
+ * bus reset ASAP. The config rom read might never
+ * complete, so don't wait for it. However, still
+ * send a PHY configuration packet prior to the
+ * bus reset. The PHY configuration packet might
+ * fail, but 1394-2008 8.4.5.2 explicitly permits
+ * it in this case, so it should be safe to try.
+ */
+ new_root_id = local_id;
+ /*
+ * We must always send a bus reset if the gap count
+ * is inconsistent, so bypass the 5-reset limit.
+ */
+ card->bm_retries = 0;
+ } else if (root_device == NULL) {
/*
* Either link_on is false, or we failed to read the
* config rom. In either case, pick another root.
--
2.43.0
next prev parent reply other threads:[~2024-02-27 13:44 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-27 13:25 [PATCH 4.19 00/52] 4.19.308-rc1 review Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 01/52] net/sched: Retire CBQ qdisc Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 02/52] net/sched: Retire ATM qdisc Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 03/52] net/sched: Retire dsmark qdisc Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 04/52] stmmac: no need to check return value of debugfs_create functions Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 05/52] net: stmmac: fix notifier registration Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 06/52] memcg: add refcnt for pcpu stock to avoid UAF problem in drain_all_stock() Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 07/52] nilfs2: replace WARN_ONs for invalid DAT metadata block requests Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 08/52] userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 09/52] sched/rt: Fix sysctl_sched_rr_timeslice intial value Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 10/52] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 11/52] sched/rt: Disallow writing invalid values to sched_rt_period_us Greg Kroah-Hartman
2024-02-27 13:25 ` [PATCH 4.19 12/52] scsi: target: core: Add TMF to tmr_list handling Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 13/52] dmaengine: shdma: increase size of dev_id Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 14/52] wifi: cfg80211: fix missing interfaces when dumping Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 15/52] wifi: mac80211: fix race condition on enabling fast-xmit Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 16/52] fbdev: savage: Error out if pixclock equals zero Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 17/52] fbdev: sis: " Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 18/52] ahci: asm1166: correct count of reported ports Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 19/52] ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 20/52] ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 21/52] regulator: pwm-regulator: Add validity checks in continuous .get_voltage Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 22/52] hwmon: (coretemp) Enlarge per package core count limit Greg Kroah-Hartman
2024-02-27 13:26 ` Greg Kroah-Hartman [this message]
2024-02-27 13:26 ` [PATCH 4.19 24/52] virtio-blk: Ensure no requests in virtqueues before deleting vqs Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 25/52] s390/qeth: Fix potential loss of L3-IP@ in case of network issues Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 26/52] pmdomain: renesas: r8a77980-sysc: CR7 must be always on Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 27/52] IB/hfi1: Fix sdma.h tx->num_descs off-by-one error Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 28/52] mm: memcontrol: switch to rcu protection in drain_all_stock() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 29/52] dm-crypt: dont modify the data when using authenticated encryption Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 30/52] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 31/52] l2tp: pass correct message length to ip6_append_data Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 32/52] ARM: ep93xx: Add terminator to gpiod_lookup_table Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 33/52] usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 34/52] usb: roles: dont get/set_role() when usb_role_switch is unregistered Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 35/52] IB/hfi1: Fix a memleak in init_credit_return Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 36/52] RDMA/bnxt_re: Return error for SRQ resize Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 37/52] RDMA/srpt: Support specifying the srpt_service_guid parameter Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 38/52] RDMA/ulp: Use dev_name instead of ibdev->name Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 39/52] RDMA/srpt: Make debug output more detailed Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 40/52] RDMA/srpt: fix function pointer cast warnings Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 41/52] scripts/bpf: teach bpf_helpers_doc.py to dump BPF helper definitions Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 42/52] bpf, scripts: Correct GPL license name Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 43/52] scsi: jazz_esp: Only build if SCSI core is builtin Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 44/52] nouveau: fix function cast warnings Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 45/52] ipv6: sr: fix possible use-after-free and null-ptr-deref Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 46/52] packet: move from strlcpy with unused retval to strscpy Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 47/52] s390: use the correct count for __iowrite64_copy() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 48/52] PCI/MSI: Prevent MSI hardware interrupt number truncation Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 49/52] KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 50/52] KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 51/52] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 4.19 52/52] scripts/bpf: Fix xdp_md forward declaration typo Greg Kroah-Hartman
2024-02-27 18:31 ` [PATCH 4.19 00/52] 4.19.308-rc1 review Pavel Machek
2024-02-28 8:49 ` Naresh Kamboju
2024-02-28 13:39 ` Jon Hunter
2024-02-28 16:58 ` Shuah Khan
2024-02-28 18:16 ` Harshit Mogalapalli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240227131549.293475652@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=adamg@pobox.com \
--cc=o-takashi@sakamocchi.jp \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox