* [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation
@ 2007-11-01 1:49 Stefan Richter
2007-11-01 1:50 ` [PATCH] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
2007-11-01 3:53 ` [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Nick Piggin
0 siblings, 2 replies; 23+ messages in thread
From: Stefan Richter @ 2007-11-01 1:49 UTC (permalink / raw)
To: linux1394-devel; +Cc: linux-kernel, Kristian Høgsberg
fw_device.node_id and fw_device.generation are accessed without mutexes.
We have to ensure that all readers will get to see node_id updates
before generation updates.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
drivers/firewire/fw-device.c | 6 ++++++
drivers/firewire/fw-topology.c | 1 +
2 files changed, 7 insertions(+)
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -808,6 +813,7 @@ void fw_node_event(struct fw_card *card,
device = node->data;
device->node_id = node->node_id;
+ wmb();
device->generation = card->generation;
if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
PREPARE_DELAYED_WORK(&device->work, fw_device_update);
Index: linux/drivers/firewire/fw-topology.c
===================================================================
--- linux.orig/drivers/firewire/fw-topology.c
+++ linux/drivers/firewire/fw-topology.c
@@ -518,6 +518,7 @@ fw_core_handle_bus_reset(struct fw_card
card->bm_retries = 0;
card->node_id = node_id;
+ wmb();
card->generation = generation;
card->reset_jiffies = jiffies;
schedule_delayed_work(&card->work, 0);
--
Stefan Richter
-=====-=-=== =-== ----=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] firewire: fw-core: react on bus resets while the config ROM is being fetched
2007-11-01 1:49 [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Stefan Richter
@ 2007-11-01 1:50 ` Stefan Richter
2007-11-01 1:51 ` [PATCH] firewire: fw-sbp2: enforce read order of device generation and node ID Stefan Richter
2007-11-01 3:53 ` [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Nick Piggin
1 sibling, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2007-11-01 1:50 UTC (permalink / raw)
To: linux1394-devel; +Cc: linux-kernel, Kristian Høgsberg
read_rom() obtained a fresh new fw_device.generation for each read
transaction (unless the compiler performed very aggressive inlining in
read_bus_info_block). It's unlikely but not impossible that we could
end up with a corrupt fetched ROM image if there was a generation change
during the ROM reading.
We now restart reading the ROM if the bus generation changed.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
drivers/firewire/fw-device.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -391,7 +391,8 @@ complete_transaction(struct fw_card *car
complete(&callback_data->done);
}
-static int read_rom(struct fw_device *device, int index, u32 * data)
+static int
+read_rom(struct fw_device *device, int generation, int index, u32 *data)
{
struct read_quadlet_callback_data callback_data;
struct fw_transaction t;
@@ -401,7 +402,7 @@ static int read_rom(struct fw_device *de
offset = 0xfffff0000400ULL + index * 4;
fw_send_request(device->card, &t, TCODE_READ_QUADLET_REQUEST,
- device->node_id, device->generation, device->max_speed,
+ device->node_id, generation, device->max_speed,
offset, NULL, 4, complete_transaction, &callback_data);
wait_for_completion(&callback_data.done);
@@ -411,7 +412,14 @@ static int read_rom(struct fw_device *de
return callback_data.rcode;
}
-static int read_bus_info_block(struct fw_device *device)
+/*
+ * Read the bus info block, perform a speed probe, and read all of the rest of
+ * the config ROM. We do all this with a cached bus generation. If the bus
+ * generation changes under us, read_bus_info_block will fail and get retried.
+ * It's better to start all over in this case because the node from which we
+ * are reading the ROM may have changed the ROM during the reset.
+ */
+static int read_bus_info_block(struct fw_device *device, int generation)
{
static u32 rom[256];
u32 stack[16], sp, key;
@@ -421,7 +429,7 @@ static int read_bus_info_block(struct fw
/* First read the bus info block. */
for (i = 0; i < 5; i++) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
/*
* As per IEEE1212 7.2, during power-up, devices can
@@ -456,7 +464,8 @@ static int read_bus_info_block(struct fw
device->max_speed = device->card->link_speed;
while (device->max_speed > SCODE_100) {
- if (read_rom(device, 0, &dummy) == RCODE_COMPLETE)
+ if (read_rom(device, generation, 0, &dummy) ==
+ RCODE_COMPLETE)
break;
device->max_speed--;
}
@@ -489,7 +498,7 @@ static int read_bus_info_block(struct fw
return -1;
/* Read header quadlet for the block to get the length. */
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
end = i + (rom[i] >> 16) + 1;
i++;
@@ -508,7 +517,8 @@ static int read_bus_info_block(struct fw
* it references another block, and push it in that case.
*/
while (i < end) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) !=
+ RCODE_COMPLETE)
return -1;
if ((key >> 30) == 3 && (rom[i] >> 30) > 1 &&
sp < ARRAY_SIZE(stack))
@@ -655,7 +665,7 @@ static void fw_device_init(struct work_s
* device.
*/
- if (read_bus_info_block(device) < 0) {
+ if (read_bus_info_block(device, device->generation) < 0) {
if (device->config_rom_retries < MAX_RETRIES) {
device->config_rom_retries++;
schedule_delayed_work(&device->work, RETRY_DELAY);
--
Stefan Richter
-=====-=-=== =-== ----=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] firewire: fw-sbp2: enforce read order of device generation and node ID
2007-11-01 1:50 ` [PATCH] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
@ 2007-11-01 1:51 ` Stefan Richter
0 siblings, 0 replies; 23+ messages in thread
From: Stefan Richter @ 2007-11-01 1:51 UTC (permalink / raw)
To: linux1394-devel; +Cc: linux-kernel, Kristian Høgsberg
Two fixes:
- There was a small window where a login or reconnect job could use an
already updated card generation with an outdated node ID. We better
use the fw_device.generation here, not the fw_card.generation.
- Insert a memory barrier to ensure that the device generation is read
before the node ID. This is to guarantee that the generation is not
newer than the node ID.
A small optimization:
- The target's and initiator's node IDs can be obtained from fw_device
and fw_card. Dereferencing their underlying topology objects is not
necessary.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
drivers/firewire/fw-sbp2.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -643,9 +643,10 @@ static void sbp2_login(struct work_struc
struct sbp2_login_response response;
int generation, node_id, local_node_id;
- generation = device->card->generation;
- node_id = device->node->node_id;
- local_node_id = device->card->local_node->node_id;
+ generation = device->generation;
+ rmb();
+ node_id = device->node_id;
+ local_node_id = device->card->node_id;
if (sbp2_send_management_orb(lu, node_id, generation,
SBP2_LOGIN_REQUEST, lu->lun, &response) < 0) {
@@ -900,9 +901,10 @@ static void sbp2_reconnect(struct work_s
struct fw_device *device = fw_device(unit->device.parent);
int generation, node_id, local_node_id;
- generation = device->card->generation;
- node_id = device->node->node_id;
- local_node_id = device->card->local_node->node_id;
+ generation = device->generation;
+ rmb();
+ node_id = device->node_id;
+ local_node_id = device->card->node_id;
if (sbp2_send_management_orb(lu, node_id, generation,
SBP2_RECONNECT_REQUEST,
--
Stefan Richter
-=====-=-=== =-== ----=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation
2007-11-01 1:49 [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Stefan Richter
2007-11-01 1:50 ` [PATCH] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
@ 2007-11-01 3:53 ` Nick Piggin
2007-11-01 9:51 ` dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation) Stefan Richter
1 sibling, 1 reply; 23+ messages in thread
From: Nick Piggin @ 2007-11-01 3:53 UTC (permalink / raw)
To: Stefan Richter; +Cc: linux1394-devel, linux-kernel, Kristian Høgsberg
On Thursday 01 November 2007 12:49, Stefan Richter wrote:
> fw_device.node_id and fw_device.generation are accessed without mutexes.
> We have to ensure that all readers will get to see node_id updates
> before generation updates.
>
Hi, a few points:
- can change it to use spinlocks instead? This would be most
preferable.
- if not, you need comments.
- you also seem to be missing rmb()s now. I see a couple in the
firewire directory, but nothing that seems to be ordering loads
of these particular fields.
- use smp_*mb() if you are just ordering regular cacheable RAM
accesses.
Also, diffstat is a bit wrong... maybe you posted the wrong version?
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
> ---
> drivers/firewire/fw-device.c | 6 ++++++
> drivers/firewire/fw-topology.c | 1 +
> 2 files changed, 7 insertions(+)
>
> Index: linux/drivers/firewire/fw-device.c
> ===================================================================
> --- linux.orig/drivers/firewire/fw-device.c
> +++ linux/drivers/firewire/fw-device.c
> @@ -808,6 +813,7 @@ void fw_node_event(struct fw_card *card,
>
> device = node->data;
> device->node_id = node->node_id;
> + wmb();
> device->generation = card->generation;
> if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
> PREPARE_DELAYED_WORK(&device->work, fw_device_update);
> Index: linux/drivers/firewire/fw-topology.c
> ===================================================================
> --- linux.orig/drivers/firewire/fw-topology.c
> +++ linux/drivers/firewire/fw-topology.c
> @@ -518,6 +518,7 @@ fw_core_handle_bus_reset(struct fw_card
> card->bm_retries = 0;
>
> card->node_id = node_id;
> + wmb();
> card->generation = generation;
> card->reset_jiffies = jiffies;
> schedule_delayed_work(&card->work, 0);
^ permalink raw reply [flat|nested] 23+ messages in thread
* dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation)
2007-11-01 3:53 ` [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Nick Piggin
@ 2007-11-01 9:51 ` Stefan Richter
2007-11-01 11:32 ` Nick Piggin
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
0 siblings, 2 replies; 23+ messages in thread
From: Stefan Richter @ 2007-11-01 9:51 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux1394-devel, linux-kernel, Kristian Høgsberg
Nick Piggin wrote:
> On Thursday 01 November 2007 12:49, Stefan Richter wrote:
>> fw_device.node_id and fw_device.generation are accessed without mutexes.
>> We have to ensure that all readers will get to see node_id updates
>> before generation updates.
>>
>
> Hi, a few points:
Thanks, this is appreciated.
> - can change it to use spinlocks instead? This would be most
> preferable.
It could be done, but I don't find it preferable for this purpose.
Although the struct members in question are basically part of fw-core's
API (kernel-internal API), there won't be a lot of places which access
these members.
(Besides, lockless algorithms are essential to FireWire driver code
everywhere where CPU and controller interact. So IMO these lockless
accesses don't constitute a whole new level of complexity in these drivers.)
> - if not, you need comments.
So far I tended to disagree every time when checkpatch.pl bugged me
about barriers without comments. But maybe that was because I had a
wrong idea of what kind of comment should go there. There would
certainly be no value in a comment which effectively says "trust me, I
know we need a barrier here". However, thinking about it again, it
occurs to me that I should add the following comments:
1.) At respective struct type definitions: A comment on how struct
members are to be accessed and why.
2.) At the occurrences of rmb() and wmb(): A comment on which
accesses the particular barrier divides. This is in order to get
the barriers properly updated (i.e. moved or removed) when
surrounding code is reordered, APIs reworked, or whatever.
Of course this division of the Why and the What only applies to accesses
like those in the patch --- i.e. API-like data types which aren't
abstracted by accessor functions --- while there may be other
requirements on comments for other usage types of barriers.
Or am I still wrong in my judgment of how barriers should be documented?
> - you also seem to be missing rmb()s now. I see a couple in the
> firewire directory, but nothing that seems to be ordering loads
> of these particular fields.
That's right. Of course the wmb()s don't make sense if the reader side
isn't fixed up accordingly. I did this for all places which I spotted
yesterday in the patches
"firewire: fw-core: react on bus resets while the config ROM is being
fetched", http://lkml.org/lkml/2007/10/31/464 (Hmm, this has an unclear
changelog.)
"firewire: fw-sbp2: enforce read order of device generation and node
ID", http://lkml.org/lkml/2007/10/31/465
I didn't fold all these patches into one because these two other patches
include also other changes to the respective read sides. I should have
pointed this out in this first patch.
Also, it could be that I have overlooked one more reader last night; I
will reexamine it.
> - use smp_*mb() if you are just ordering regular cacheable RAM
> accesses.
Hmm, right. Looks like the smp_ variants are indeed the right thing to
do here.
> Also, diffstat is a bit wrong... maybe you posted the wrong version?
Oops, this happened when I worked on what became the "fw-core: react on
bus resets..." patch. So the diffstat is wrong but...
>> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
>> ---
>> drivers/firewire/fw-device.c | 6 ++++++
>> drivers/firewire/fw-topology.c | 1 +
>> 2 files changed, 7 insertions(+)
>>
>> Index: linux/drivers/firewire/fw-device.c
>> ===================================================================
>> --- linux.orig/drivers/firewire/fw-device.c
>> +++ linux/drivers/firewire/fw-device.c
>> @@ -808,6 +813,7 @@ void fw_node_event(struct fw_card *card,
>>
>> device = node->data;
>> device->node_id = node->node_id;
>> + wmb();
>> device->generation = card->generation;
>> if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
>> PREPARE_DELAYED_WORK(&device->work, fw_device_update);
>> Index: linux/drivers/firewire/fw-topology.c
>> ===================================================================
>> --- linux.orig/drivers/firewire/fw-topology.c
>> +++ linux/drivers/firewire/fw-topology.c
>> @@ -518,6 +518,7 @@ fw_core_handle_bus_reset(struct fw_card
>> card->bm_retries = 0;
>>
>> card->node_id = node_id;
>> + wmb();
>> card->generation = generation;
>> card->reset_jiffies = jiffies;
>> schedule_delayed_work(&card->work, 0);
...the diff is what I intended. Anyway, maybe I should make it a habit
to send patches only between 10 AM and 10 PM local time. :-)
--
Stefan Richter
-=====-=-=== =-== ----=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation)
2007-11-01 9:51 ` dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation) Stefan Richter
@ 2007-11-01 11:32 ` Nick Piggin
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
1 sibling, 0 replies; 23+ messages in thread
From: Nick Piggin @ 2007-11-01 11:32 UTC (permalink / raw)
To: Stefan Richter; +Cc: linux1394-devel, linux-kernel, Kristian Høgsberg
On Thursday 01 November 2007 20:51, Stefan Richter wrote:
> Nick Piggin wrote:
> > On Thursday 01 November 2007 12:49, Stefan Richter wrote:
> >> fw_device.node_id and fw_device.generation are accessed without mutexes.
> >> We have to ensure that all readers will get to see node_id updates
> >> before generation updates.
> >
> > Hi, a few points:
>
> Thanks, this is appreciated.
>
> > - can change it to use spinlocks instead? This would be most
> > preferable.
>
> It could be done, but I don't find it preferable for this purpose.
> Although the struct members in question are basically part of fw-core's
> API (kernel-internal API), there won't be a lot of places which access
> these members.
>
> (Besides, lockless algorithms are essential to FireWire driver code
> everywhere where CPU and controller interact. So IMO these lockless
> accesses don't constitute a whole new level of complexity in these
> drivers.)
Fair enough.
> > - if not, you need comments.
>
> So far I tended to disagree every time when checkpatch.pl bugged me
> about barriers without comments. But maybe that was because I had a
> wrong idea of what kind of comment should go there. There would
> certainly be no value in a comment which effectively says "trust me, I
> know we need a barrier here". However, thinking about it again, it
> occurs to me that I should add the following comments:
>
> 1.) At respective struct type definitions: A comment on how struct
> members are to be accessed and why.
That's something that can be really helpful I think, yes.
> 2.) At the occurrences of rmb() and wmb(): A comment on which
> accesses the particular barrier divides. This is in order to get
> the barriers properly updated (i.e. moved or removed) when
> surrounding code is reordered, APIs reworked, or whatever.
Yes, this is the main thing to comment at the actual site of the barirer.
Although it isn't always 100% clear with locks either (because you might
do some non-critical operatoins inside a lock section), it just tends to
be clearer what it is being locked, and what other paths are being locked
against.
So describing what accesses are being ordered is good. Also a brief
description of how the lockless read-side works, if it's not obvious
(or you've already described that in your #1).
> Of course this division of the Why and the What only applies to accesses
> like those in the patch --- i.e. API-like data types which aren't
> abstracted by accessor functions --- while there may be other
> requirements on comments for other usage types of barriers.
>
> Or am I still wrong in my judgment of how barriers should be documented?
I think what you've said sounds good. I think that even memory barriers
within some subsystem can use commenting, even if only a line or two.
> > - you also seem to be missing rmb()s now. I see a couple in the
> > firewire directory, but nothing that seems to be ordering loads
> > of these particular fields.
>
> That's right. Of course the wmb()s don't make sense if the reader side
> isn't fixed up accordingly. I did this for all places which I spotted
> yesterday in the patches
> "firewire: fw-core: react on bus resets while the config ROM is being
> fetched", http://lkml.org/lkml/2007/10/31/464 (Hmm, this has an unclear
> changelog.)
> "firewire: fw-sbp2: enforce read order of device generation and node
> ID", http://lkml.org/lkml/2007/10/31/465
Ah, right, didn't see them. Thanks, that looks more useful now ;)
> I didn't fold all these patches into one because these two other patches
> include also other changes to the respective read sides. I should have
> pointed this out in this first patch.
Sure. I might personally still separate out the memory ordering fix
and put all the rmb and wmb in a single patch, but that's no big
deal.
> Also, it could be that I have overlooked one more reader last night; I
> will reexamine it.
>
> > - use smp_*mb() if you are just ordering regular cacheable RAM
> > accesses.
>
> Hmm, right. Looks like the smp_ variants are indeed the right thing to
> do here.
OK.
> > Also, diffstat is a bit wrong... maybe you posted the wrong version?
>
> Oops, this happened when I worked on what became the "fw-core: react on
> bus resets..." patch. So the diffstat is wrong but...
Ah, that's no problem. I manage to screw up patches in much worse
ways than this, don't worry ;)
Thanks,
Nick
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID)
2007-11-01 9:51 ` dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation) Stefan Richter
2007-11-01 11:32 ` Nick Piggin
@ 2008-01-24 0:52 ` Stefan Richter
2008-01-24 0:53 ` [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation Stefan Richter
` (3 more replies)
1 sibling, 4 replies; 23+ messages in thread
From: Stefan Richter @ 2008-01-24 0:52 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Kristian Høgsberg, Jarod Wilson, Nick Piggin
I now updated some patches from November according to comments I
received from Nick back then. As Jarod found out now while testing the
November versions, parts of these patches actually fix multiply reported
bugs.
While updating this stuff, I also found two more read sites where the
read order was not properly enforced.
Incoming:
1/4 firewire: fw-sbp2: use device generation, not card generation
2/4 firewire: fw-cdev: use device generation, not card generation
3/4 firewire: enforce access order between generation and node ID
4/4 firewire: fw-core: react on bus resets while the config ROM is being fetched
drivers/firewire/fw-cdev.c | 3 ++-
drivers/firewire/fw-device.c | 33 ++++++++++++++++++++++++---------
drivers/firewire/fw-device.h | 12 ++++++++++++
drivers/firewire/fw-sbp2.c | 14 ++++++++------
drivers/firewire/fw-topology.c | 5 +++++
5 files changed, 51 insertions(+), 16 deletions(-)
--
Stefan Richter
-=====-==--- ---= ==---
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
@ 2008-01-24 0:53 ` Stefan Richter
2008-01-24 16:04 ` Jarod Wilson
2008-01-24 0:53 ` [PATCH 2/4] firewire: fw-cdev: " Stefan Richter
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-24 0:53 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Kristian Høgsberg, Jarod Wilson, Nick Piggin
There was a small window where a login or reconnect job could use an
already updated card generation with an outdated node ID. We have to
use the fw_device.generation here, not the fw_card.generation, because
the generation must never be newer than the node ID when we emit a
transaction. This cannot be guaranteed with fw_card.generation.
Furthermore, the target's and initiator's node IDs can be obtained from
fw_device and fw_card. Dereferencing their underlying topology objects
is not necessary.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Rework of patch "firewire: fw-sbp2: enforce read order of device
generation and node ID" from November 1 2007.
This code also needs barriers to work precisely as intended. They will
be added by a subsequent patch which consistently updates readers and
writers of .generation and .node_id.
drivers/firewire/fw-sbp2.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -671,9 +671,9 @@ static void sbp2_login(struct work_struc
struct sbp2_login_response response;
int generation, node_id, local_node_id;
- generation = device->card->generation;
- node_id = device->node->node_id;
- local_node_id = device->card->local_node->node_id;
+ generation = device->generation;
+ node_id = device->node_id;
+ local_node_id = device->card->node_id;
if (sbp2_send_management_orb(lu, node_id, generation,
SBP2_LOGIN_REQUEST, lu->lun, &response) < 0) {
@@ -921,9 +921,9 @@ static void sbp2_reconnect(struct work_s
struct fw_device *device = fw_device(unit->device.parent);
int generation, node_id, local_node_id;
- generation = device->card->generation;
- node_id = device->node->node_id;
- local_node_id = device->card->local_node->node_id;
+ generation = device->generation;
+ node_id = device->node_id;
+ local_node_id = device->card->node_id;
if (sbp2_send_management_orb(lu, node_id, generation,
SBP2_RECONNECT_REQUEST,
--
Stefan Richter
-=====-==--- ---= ==---
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/4] firewire: fw-cdev: use device generation, not card generation
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
2008-01-24 0:53 ` [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation Stefan Richter
@ 2008-01-24 0:53 ` Stefan Richter
2008-01-24 16:05 ` Jarod Wilson
2008-01-24 0:54 ` [PATCH 3/4] firewire: enforce access order between generation and node ID Stefan Richter
2008-01-24 0:55 ` [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
3 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-24 0:53 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Kristian Høgsberg, Jarod Wilson, Nick Piggin
We have to use the fw_device.generation here, not the fw_card.generation,
because the generation must never be newer than the node ID when we emit
a transaction. This cannot be guaranteed with fw_card.generation.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
This code also needs barriers to work precisely as intended. They will
be added by a subsequent patch which consistently updates readers and
writers of .generation and .node_id.
drivers/firewire/fw-cdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/drivers/firewire/fw-cdev.c
===================================================================
--- linux.orig/drivers/firewire/fw-cdev.c
+++ linux/drivers/firewire/fw-cdev.c
@@ -206,12 +206,12 @@ fill_bus_reset_event(struct fw_cdev_even
event->closure = client->bus_reset_closure;
event->type = FW_CDEV_EVENT_BUS_RESET;
+ event->generation = client->device->generation;
event->node_id = client->device->node_id;
event->local_node_id = card->local_node->node_id;
event->bm_node_id = 0; /* FIXME: We don't track the BM. */
event->irm_node_id = card->irm_node->node_id;
event->root_node_id = card->root_node->node_id;
- event->generation = card->generation;
}
static void
--
Stefan Richter
-=====-==--- ---= ==---
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 3/4] firewire: enforce access order between generation and node ID
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
2008-01-24 0:53 ` [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation Stefan Richter
2008-01-24 0:53 ` [PATCH 2/4] firewire: fw-cdev: " Stefan Richter
@ 2008-01-24 0:54 ` Stefan Richter
2008-01-24 4:55 ` Nick Piggin
2008-01-24 16:06 ` Jarod Wilson
2008-01-24 0:55 ` [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
3 siblings, 2 replies; 23+ messages in thread
From: Stefan Richter @ 2008-01-24 0:54 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Kristian Høgsberg, Jarod Wilson, Nick Piggin
fw_device.node_id and fw_device.generation are accessed without mutexes.
We have to ensure that all readers will get to see node_id updates
before generation updates.
An earlier incarnation of this patch fixes an inability to recognize
devices after "giving up on config rom",
https://bugzilla.redhat.com/show_bug.cgi?id=429950
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Rework of patches
firewire: fw-core: enforce write order when updating fw_device.generation
and parts of
firewire: fw-core: react on bus resets while the config ROM is being fetched
firewire: fw-sbp2: enforce read order of device generation and node ID
from November 1 2007.
Update:
- write site and read sites folded into one patch
- added fix to fw_device_enable_phys_dma() and fill_bus_reset_event()
- smp_ barriers are sufficient
- comments, changelog
drivers/firewire/fw-cdev.c | 1 +
drivers/firewire/fw-device.c | 13 +++++++++++--
drivers/firewire/fw-device.h | 12 ++++++++++++
drivers/firewire/fw-sbp2.c | 2 ++
drivers/firewire/fw-topology.c | 5 +++++
5 files changed, 31 insertions(+), 2 deletions(-)
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -182,9 +182,13 @@ static void fw_device_release(struct dev
int fw_device_enable_phys_dma(struct fw_device *device)
{
+ int generation = device->generation;
+
+ /* device->node_id, accessed below, must not be older than generation */
+ smp_rmb();
return device->card->driver->enable_phys_dma(device->card,
device->node_id,
- device->generation);
+ generation);
}
EXPORT_SYMBOL(fw_device_enable_phys_dma);
@@ -389,12 +393,16 @@ static int read_rom(struct fw_device *de
struct read_quadlet_callback_data callback_data;
struct fw_transaction t;
u64 offset;
+ int generation = device->generation;
+
+ /* device->node_id, accessed below, must not be older than generation */
+ smp_rmb();
init_completion(&callback_data.done);
offset = 0xfffff0000400ULL + index * 4;
fw_send_request(device->card, &t, TCODE_READ_QUADLET_REQUEST,
- device->node_id, device->generation, device->max_speed,
+ device->node_id, generation, device->max_speed,
offset, NULL, 4, complete_transaction, &callback_data);
wait_for_completion(&callback_data.done);
@@ -801,6 +809,7 @@ void fw_node_event(struct fw_card *card,
device = node->data;
device->node_id = node->node_id;
+ smp_wmb(); /* update node_id before generation */
device->generation = card->generation;
if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
PREPARE_DELAYED_WORK(&device->work, fw_device_update);
Index: linux/drivers/firewire/fw-topology.c
===================================================================
--- linux.orig/drivers/firewire/fw-topology.c
+++ linux/drivers/firewire/fw-topology.c
@@ -518,6 +518,11 @@ fw_core_handle_bus_reset(struct fw_card
card->bm_retries = 0;
card->node_id = node_id;
+ /*
+ * Update node_id before generation to prevent anybody from using
+ * a stale node_id togeher with a current generation.
+ */
+ smp_wmb();
card->generation = generation;
card->reset_jiffies = jiffies;
schedule_delayed_work(&card->work, 0);
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -672,6 +672,7 @@ static void sbp2_login(struct work_struc
int generation, node_id, local_node_id;
generation = device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
node_id = device->node_id;
local_node_id = device->card->node_id;
@@ -922,6 +923,7 @@ static void sbp2_reconnect(struct work_s
int generation, node_id, local_node_id;
generation = device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
node_id = device->node_id;
local_node_id = device->card->node_id;
Index: linux/drivers/firewire/fw-device.h
===================================================================
--- linux.orig/drivers/firewire/fw-device.h
+++ linux/drivers/firewire/fw-device.h
@@ -35,6 +35,18 @@ struct fw_attribute_group {
struct attribute *attrs[11];
};
+/*
+ * Note, fw_device.generation always has to be read before fw_device.node_id.
+ * Use SMP memory barriers to ensure this. Otherwise requests will be sent
+ * to an outdated node_id if the generation was updated in the meantime due
+ * to a bus reset.
+ *
+ * Likewise, fw-core will take care to update .node_id before .generation so
+ * that whenever fw_device.generation is current WRT the actual bus generation,
+ * fw_device.node_id is guaranteed to be current too.
+ *
+ * The same applies to fw_device.card->node_id vs. fw_device.generation.
+ */
struct fw_device {
atomic_t state;
struct fw_node *node;
Index: linux/drivers/firewire/fw-cdev.c
===================================================================
--- linux.orig/drivers/firewire/fw-cdev.c
+++ linux/drivers/firewire/fw-cdev.c
@@ -207,6 +207,7 @@ fill_bus_reset_event(struct fw_cdev_even
event->closure = client->bus_reset_closure;
event->type = FW_CDEV_EVENT_BUS_RESET;
event->generation = client->device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
event->node_id = client->device->node_id;
event->local_node_id = card->local_node->node_id;
event->bm_node_id = 0; /* FIXME: We don't track the BM. */
--
Stefan Richter
-=====-==--- ---= ==---
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
` (2 preceding siblings ...)
2008-01-24 0:54 ` [PATCH 3/4] firewire: enforce access order between generation and node ID Stefan Richter
@ 2008-01-24 0:55 ` Stefan Richter
2008-01-24 16:11 ` Jarod Wilson
3 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-24 0:55 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Kristian Høgsberg, Jarod Wilson, Nick Piggin
read_rom() obtained a fresh new fw_device.generation for each read
transaction. Hence it was able to continue reading in the middle of the
ROM even if a bus reset happened. However the device may have modified
the ROM during the reset. We would end up with a corrupt fetched ROM
image then.
Although all of this is quite unlikely, it is not impossible.
Therefore we now restart reading the ROM if the bus generation changed.
Side note: The barrier in read_rom(), inserted by patch "firewire:
enforce access order between generation and node ID" is not necessary
anymore because the sequence of calls
fw_device_init() ->
read_bus_info_block() ->
read_rom()
read_rom()
read_rom()
...
will take care that generation is read before node_id, won't it?
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Refreshed version of the patch from November 1 2007.
drivers/firewire/fw-device.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -388,15 +388,12 @@ complete_transaction(struct fw_card *car
complete(&callback_data->done);
}
-static int read_rom(struct fw_device *device, int index, u32 * data)
+static int
+read_rom(struct fw_device *device, int generation, int index, u32 *data)
{
struct read_quadlet_callback_data callback_data;
struct fw_transaction t;
u64 offset;
- int generation = device->generation;
-
- /* device->node_id, accessed below, must not be older than generation */
- smp_rmb();
init_completion(&callback_data.done);
@@ -412,7 +409,14 @@ static int read_rom(struct fw_device *de
return callback_data.rcode;
}
-static int read_bus_info_block(struct fw_device *device)
+/*
+ * Read the bus info block, perform a speed probe, and read all of the rest of
+ * the config ROM. We do all this with a cached bus generation. If the bus
+ * generation changes under us, read_bus_info_block will fail and get retried.
+ * It's better to start all over in this case because the node from which we
+ * are reading the ROM may have changed the ROM during the reset.
+ */
+static int read_bus_info_block(struct fw_device *device, int generation)
{
static u32 rom[256];
u32 stack[16], sp, key;
@@ -422,7 +426,7 @@ static int read_bus_info_block(struct fw
/* First read the bus info block. */
for (i = 0; i < 5; i++) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
/*
* As per IEEE1212 7.2, during power-up, devices can
@@ -457,7 +461,8 @@ static int read_bus_info_block(struct fw
device->max_speed = device->card->link_speed;
while (device->max_speed > SCODE_100) {
- if (read_rom(device, 0, &dummy) == RCODE_COMPLETE)
+ if (read_rom(device, generation, 0, &dummy) ==
+ RCODE_COMPLETE)
break;
device->max_speed--;
}
@@ -490,7 +495,7 @@ static int read_bus_info_block(struct fw
return -1;
/* Read header quadlet for the block to get the length. */
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
end = i + (rom[i] >> 16) + 1;
i++;
@@ -509,7 +514,8 @@ static int read_bus_info_block(struct fw
* it references another block, and push it in that case.
*/
while (i < end) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) !=
+ RCODE_COMPLETE)
return -1;
if ((key >> 30) == 3 && (rom[i] >> 30) > 1 &&
sp < ARRAY_SIZE(stack))
@@ -656,7 +662,7 @@ static void fw_device_init(struct work_s
* device.
*/
- if (read_bus_info_block(device) < 0) {
+ if (read_bus_info_block(device, device->generation) < 0) {
if (device->config_rom_retries < MAX_RETRIES) {
device->config_rom_retries++;
schedule_delayed_work(&device->work, RETRY_DELAY);
--
Stefan Richter
-=====-==--- ---= ==---
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] firewire: enforce access order between generation and node ID
2008-01-24 0:54 ` [PATCH 3/4] firewire: enforce access order between generation and node ID Stefan Richter
@ 2008-01-24 4:55 ` Nick Piggin
2008-01-24 16:06 ` Jarod Wilson
1 sibling, 0 replies; 23+ messages in thread
From: Nick Piggin @ 2008-01-24 4:55 UTC (permalink / raw)
To: Stefan Richter
Cc: linux1394-devel, linux-kernel, Kristian Høgsberg,
Jarod Wilson
On Thursday 24 January 2008 11:54, Stefan Richter wrote:
> fw_device.node_id and fw_device.generation are accessed without mutexes.
> We have to ensure that all readers will get to see node_id updates
> before generation updates.
>
> An earlier incarnation of this patch fixes an inability to recognize
> devices after "giving up on config rom",
> https://bugzilla.redhat.com/show_bug.cgi?id=429950
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
> ---
>
> Rework of patches
> firewire: fw-core: enforce write order when updating
> fw_device.generation and parts of
> firewire: fw-core: react on bus resets while the config ROM is being
> fetched firewire: fw-sbp2: enforce read order of device generation and node
> ID from November 1 2007.
>
> Update:
> - write site and read sites folded into one patch
> - added fix to fw_device_enable_phys_dma() and fill_bus_reset_event()
> - smp_ barriers are sufficient
> - comments, changelog
I don't know the firewire subsystem at all, but the barriers seem
right (in that they match your description of the problem), and
comments for them are really good.
Thanks,
Nick
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation
2008-01-24 0:53 ` [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation Stefan Richter
@ 2008-01-24 16:04 ` Jarod Wilson
0 siblings, 0 replies; 23+ messages in thread
From: Jarod Wilson @ 2008-01-24 16:04 UTC (permalink / raw)
To: Stefan Richter
Cc: linux1394-devel, linux-kernel, Kristian Høgsberg,
Nick Piggin
Stefan Richter wrote:
> There was a small window where a login or reconnect job could use an
> already updated card generation with an outdated node ID. We have to
> use the fw_device.generation here, not the fw_card.generation, because
> the generation must never be newer than the node ID when we emit a
> transaction. This cannot be guaranteed with fw_card.generation.
>
> Furthermore, the target's and initiator's node IDs can be obtained from
> fw_device and fw_card. Dereferencing their underlying topology objects
> is not necessary.
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Verified in concert with parts 2 and 3 as well as with 2, 3 and 4, to fix
'giving up on config rom' issues on multiple system and drive combinations
that were previously affected.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/4] firewire: fw-cdev: use device generation, not card generation
2008-01-24 0:53 ` [PATCH 2/4] firewire: fw-cdev: " Stefan Richter
@ 2008-01-24 16:05 ` Jarod Wilson
0 siblings, 0 replies; 23+ messages in thread
From: Jarod Wilson @ 2008-01-24 16:05 UTC (permalink / raw)
To: Stefan Richter
Cc: linux1394-devel, linux-kernel, Kristian Høgsberg,
Nick Piggin
Stefan Richter wrote:
> We have to use the fw_device.generation here, not the fw_card.generation,
> because the generation must never be newer than the node ID when we emit
> a transaction. This cannot be guaranteed with fw_card.generation.
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Verified in concert with parts 1 and 3 as well as with 1, 3 and 4, to fix
'giving up on config rom' issues on multiple system and drive combinations
that were previously affected.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] firewire: enforce access order between generation and node ID
2008-01-24 0:54 ` [PATCH 3/4] firewire: enforce access order between generation and node ID Stefan Richter
2008-01-24 4:55 ` Nick Piggin
@ 2008-01-24 16:06 ` Jarod Wilson
2008-01-25 16:35 ` Stefan Richter
1 sibling, 1 reply; 23+ messages in thread
From: Jarod Wilson @ 2008-01-24 16:06 UTC (permalink / raw)
To: Stefan Richter
Cc: linux1394-devel, linux-kernel, Kristian Høgsberg,
Nick Piggin
Stefan Richter wrote:
> fw_device.node_id and fw_device.generation are accessed without mutexes.
> We have to ensure that all readers will get to see node_id updates
> before generation updates.
>
> An earlier incarnation of this patch fixes an inability to recognize
> devices after "giving up on config rom",
> https://bugzilla.redhat.com/show_bug.cgi?id=429950
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Verified in concert with parts 1 and 2 as well as with 1, 2 and 4, to fix
'giving up on config rom' issues on multiple system and drive combinations
that were previously affected.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-24 0:55 ` [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
@ 2008-01-24 16:11 ` Jarod Wilson
2008-01-24 19:26 ` Jarod Wilson
0 siblings, 1 reply; 23+ messages in thread
From: Jarod Wilson @ 2008-01-24 16:11 UTC (permalink / raw)
To: Stefan Richter
Cc: linux1394-devel, linux-kernel, Kristian Høgsberg,
Nick Piggin
Stefan Richter wrote:
> read_rom() obtained a fresh new fw_device.generation for each read
> transaction. Hence it was able to continue reading in the middle of the
> ROM even if a bus reset happened. However the device may have modified
> the ROM during the reset. We would end up with a corrupt fetched ROM
> image then.
>
> Although all of this is quite unlikely, it is not impossible.
> Therefore we now restart reading the ROM if the bus generation changed.
>
> Side note: The barrier in read_rom(), inserted by patch "firewire:
> enforce access order between generation and node ID" is not necessary
> anymore because the sequence of calls
> fw_device_init() ->
> read_bus_info_block() ->
> read_rom()
> read_rom()
> read_rom()
> ...
> will take care that generation is read before node_id, won't it?
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Based on a quick read through the code path, coupled with empirical evidence,
yes, it appears safe to remove the barrier in read_rom().
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-24 16:11 ` Jarod Wilson
@ 2008-01-24 19:26 ` Jarod Wilson
2008-01-25 16:53 ` [PATCH 4/4 update] " Stefan Richter
0 siblings, 1 reply; 23+ messages in thread
From: Jarod Wilson @ 2008-01-24 19:26 UTC (permalink / raw)
To: Stefan Richter
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
Jarod Wilson wrote:
> Stefan Richter wrote:
>> read_rom() obtained a fresh new fw_device.generation for each read
>> transaction. Hence it was able to continue reading in the middle of the
>> ROM even if a bus reset happened. However the device may have modified
>> the ROM during the reset. We would end up with a corrupt fetched ROM
>> image then.
>>
>> Although all of this is quite unlikely, it is not impossible.
>> Therefore we now restart reading the ROM if the bus generation changed.
>>
>> Side note: The barrier in read_rom(), inserted by patch "firewire:
>> enforce access order between generation and node ID" is not necessary
>> anymore because the sequence of calls
>> fw_device_init() ->
>> read_bus_info_block() ->
>> read_rom()
>> read_rom()
>> read_rom()
>> ...
>> will take care that generation is read before node_id, won't it?
>>
>> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
>
> Based on a quick read through the code path, coupled with empirical evidence,
> yes, it appears safe to remove the barrier in read_rom().
Crap. My earlier 'empirical evidence' seems to have been happy coincidence.
After a fresh boot, I'm consistently hitting 'failed to read config rom'
failures with this patch applied. Simply putting the barrier back in gets
things working again though. Interestingly, subsequent reloading of firewire-*
modules less the barrier also tend to work until the system is rebooted.
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] firewire: enforce access order between generation and node ID
2008-01-24 16:06 ` Jarod Wilson
@ 2008-01-25 16:35 ` Stefan Richter
2008-01-25 17:57 ` Stefan Richter
0 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-25 16:35 UTC (permalink / raw)
To: Jarod Wilson
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
I'm going to commit it with a whitespace adjustment and a comment typo
fixed. Jarod, thanks for the rigorous testing; Nick, thanks for
helping in making this a solid improvement over the initial submission.
--
Stefan Richter
-=====-==--- ---= ==--=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 4/4 update] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-24 19:26 ` Jarod Wilson
@ 2008-01-25 16:53 ` Stefan Richter
2008-01-25 17:16 ` Jarod Wilson
0 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-25 16:53 UTC (permalink / raw)
To: Jarod Wilson
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
read_rom() obtained a fresh new fw_device.generation for each read
transaction. Hence it was able to continue reading in the middle of the
ROM even if a bus reset happened. However the device may have modified
the ROM during the reset. We would end up with a corrupt fetched ROM
image then.
Although all of this is quite unlikely, it is not impossible.
Therefore we now restart reading the ROM if the bus generation changed.
Note, the memory barrier in read_rom() is still necessary according to
tests by Jarod Wilson, despite of the ->generation access being moved up
in the call chain.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
First posted on 2007-11-01. Update: Barrier stays.
drivers/firewire/fw-device.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
Index: linux-2.6.24/drivers/firewire/fw-device.c
===================================================================
--- linux-2.6.24.orig/drivers/firewire/fw-device.c
+++ linux-2.6.24/drivers/firewire/fw-device.c
@@ -389,12 +389,12 @@ complete_transaction(struct fw_card *car
complete(&callback_data->done);
}
-static int read_rom(struct fw_device *device, int index, u32 * data)
+static int
+read_rom(struct fw_device *device, int generation, int index, u32 *data)
{
struct read_quadlet_callback_data callback_data;
struct fw_transaction t;
u64 offset;
- int generation = device->generation;
/* device->node_id, accessed below, must not be older than generation */
smp_rmb();
@@ -413,7 +413,14 @@ static int read_rom(struct fw_device *de
return callback_data.rcode;
}
-static int read_bus_info_block(struct fw_device *device)
+/*
+ * Read the bus info block, perform a speed probe, and read all of the rest of
+ * the config ROM. We do all this with a cached bus generation. If the bus
+ * generation changes under us, read_bus_info_block will fail and get retried.
+ * It's better to start all over in this case because the node from which we
+ * are reading the ROM may have changed the ROM during the reset.
+ */
+static int read_bus_info_block(struct fw_device *device, int generation)
{
static u32 rom[256];
u32 stack[16], sp, key;
@@ -423,7 +430,7 @@ static int read_bus_info_block(struct fw
/* First read the bus info block. */
for (i = 0; i < 5; i++) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
/*
* As per IEEE1212 7.2, during power-up, devices can
@@ -458,7 +465,8 @@ static int read_bus_info_block(struct fw
device->max_speed = device->card->link_speed;
while (device->max_speed > SCODE_100) {
- if (read_rom(device, 0, &dummy) == RCODE_COMPLETE)
+ if (read_rom(device, generation, 0, &dummy) ==
+ RCODE_COMPLETE)
break;
device->max_speed--;
}
@@ -491,7 +499,7 @@ static int read_bus_info_block(struct fw
return -1;
/* Read header quadlet for the block to get the length. */
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) != RCODE_COMPLETE)
return -1;
end = i + (rom[i] >> 16) + 1;
i++;
@@ -510,7 +518,8 @@ static int read_bus_info_block(struct fw
* it references another block, and push it in that case.
*/
while (i < end) {
- if (read_rom(device, i, &rom[i]) != RCODE_COMPLETE)
+ if (read_rom(device, generation, i, &rom[i]) !=
+ RCODE_COMPLETE)
return -1;
if ((key >> 30) == 3 && (rom[i] >> 30) > 1 &&
sp < ARRAY_SIZE(stack))
@@ -657,7 +666,7 @@ static void fw_device_init(struct work_s
* device.
*/
- if (read_bus_info_block(device) < 0) {
+ if (read_bus_info_block(device, device->generation) < 0) {
if (device->config_rom_retries < MAX_RETRIES) {
device->config_rom_retries++;
schedule_delayed_work(&device->work, RETRY_DELAY);
--
Stefan Richter
-=====-==--- ---= ==--=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4 update] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-25 16:53 ` [PATCH 4/4 update] " Stefan Richter
@ 2008-01-25 17:16 ` Jarod Wilson
2008-01-25 17:21 ` Stefan Richter
0 siblings, 1 reply; 23+ messages in thread
From: Jarod Wilson @ 2008-01-25 17:16 UTC (permalink / raw)
To: Stefan Richter
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
On Friday 25 January 2008 11:53:49 am Stefan Richter wrote:
> read_rom() obtained a fresh new fw_device.generation for each read
> transaction. Hence it was able to continue reading in the middle of the
> ROM even if a bus reset happened. However the device may have modified
> the ROM during the reset. We would end up with a corrupt fetched ROM
> image then.
>
> Although all of this is quite unlikely, it is not impossible.
> Therefore we now restart reading the ROM if the bus generation changed.
>
> Note, the memory barrier in read_rom() is still necessary according to
> tests by Jarod Wilson, despite of the ->generation access being moved up
> in the call chain.
>
> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
> ---
>
> First posted on 2007-11-01. Update: Barrier stays.
This is essentially what I've been beating on locally, and I've yet to hit
another config rom read failure with it.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
--
Jarod Wilson
jwilson@redhat.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/4 update] firewire: fw-core: react on bus resets while the config ROM is being fetched
2008-01-25 17:16 ` Jarod Wilson
@ 2008-01-25 17:21 ` Stefan Richter
0 siblings, 0 replies; 23+ messages in thread
From: Stefan Richter @ 2008-01-25 17:21 UTC (permalink / raw)
To: Jarod Wilson
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
Jarod Wilson wrote:
> On Friday 25 January 2008 11:53:49 am Stefan Richter wrote:
>> Update: Barrier stays.
>
> This is essentially what I've been beating on locally, and I've yet to hit
> another config rom read failure with it.
>
> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Thanks. I already pushed to linux1394-2.6.git but I will add your
sign-off before I ask Linus to pull.
--
Stefan Richter
-=====-==--- ---= ==--=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] firewire: enforce access order between generation and node ID
2008-01-25 16:35 ` Stefan Richter
@ 2008-01-25 17:57 ` Stefan Richter
[not found] ` <59ad55d30801251024j6ff43953tb86aaa52fdea5ec9@mail.gmail.com>
0 siblings, 1 reply; 23+ messages in thread
From: Stefan Richter @ 2008-01-25 17:57 UTC (permalink / raw)
To: Jarod Wilson
Cc: Kristian Høgsberg, Nick Piggin, linux1394-devel,
linux-kernel
Stefan Richter wrote:
> I'm going to commit it with a whitespace adjustment and a comment typo
> fixed.
Also forgot to #include <asm/system.h> in some of the files. Here is
the final product, for the record.
From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: firewire: enforce access order between generation and node ID, fix "giving up on config rom"
fw_device.node_id and fw_device.generation are accessed without mutexes.
We have to ensure that all readers will get to see node_id updates
before generation updates.
Fixes an inability to recognize devices after "giving up on config rom",
https://bugzilla.redhat.com/show_bug.cgi?id=429950
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reviewed by Nick Piggin <nickpiggin@yahoo.com.au>.
Verified to fix 'giving up on config rom' issues on multiple system and
drive combinations that were previously affected.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
---
drivers/firewire/fw-cdev.c | 1 +
drivers/firewire/fw-device.c | 15 +++++++++++++--
drivers/firewire/fw-device.h | 12 ++++++++++++
drivers/firewire/fw-sbp2.c | 3 +++
drivers/firewire/fw-topology.c | 6 ++++++
5 files changed, 35 insertions(+), 2 deletions(-)
Index: linux/drivers/firewire/fw-device.c
===================================================================
--- linux.orig/drivers/firewire/fw-device.c
+++ linux/drivers/firewire/fw-device.c
@@ -27,6 +27,7 @@
#include <linux/idr.h>
#include <linux/rwsem.h>
#include <asm/semaphore.h>
+#include <asm/system.h>
#include <linux/ctype.h>
#include "fw-transaction.h"
#include "fw-topology.h"
@@ -182,9 +183,14 @@ static void fw_device_release(struct dev
int fw_device_enable_phys_dma(struct fw_device *device)
{
+ int generation = device->generation;
+
+ /* device->node_id, accessed below, must not be older than generation */
+ smp_rmb();
+
return device->card->driver->enable_phys_dma(device->card,
device->node_id,
- device->generation);
+ generation);
}
EXPORT_SYMBOL(fw_device_enable_phys_dma);
@@ -389,12 +395,16 @@ static int read_rom(struct fw_device *de
struct read_quadlet_callback_data callback_data;
struct fw_transaction t;
u64 offset;
+ int generation = device->generation;
+
+ /* device->node_id, accessed below, must not be older than generation */
+ smp_rmb();
init_completion(&callback_data.done);
offset = 0xfffff0000400ULL + index * 4;
fw_send_request(device->card, &t, TCODE_READ_QUADLET_REQUEST,
- device->node_id, device->generation, device->max_speed,
+ device->node_id, generation, device->max_speed,
offset, NULL, 4, complete_transaction, &callback_data);
wait_for_completion(&callback_data.done);
@@ -801,6 +811,7 @@ void fw_node_event(struct fw_card *card,
device = node->data;
device->node_id = node->node_id;
+ smp_wmb(); /* update node_id before generation */
device->generation = card->generation;
if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
PREPARE_DELAYED_WORK(&device->work, fw_device_update);
Index: linux/drivers/firewire/fw-topology.c
===================================================================
--- linux.orig/drivers/firewire/fw-topology.c
+++ linux/drivers/firewire/fw-topology.c
@@ -21,6 +21,7 @@
#include <linux/module.h>
#include <linux/wait.h>
#include <linux/errno.h>
+#include <asm/system.h>
#include "fw-transaction.h"
#include "fw-topology.h"
@@ -518,6 +519,11 @@ fw_core_handle_bus_reset(struct fw_card
card->bm_retries = 0;
card->node_id = node_id;
+ /*
+ * Update node_id before generation to prevent anybody from using
+ * a stale node_id together with a current generation.
+ */
+ smp_wmb();
card->generation = generation;
card->reset_jiffies = jiffies;
schedule_delayed_work(&card->work, 0);
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -40,6 +40,7 @@
#include <linux/stringify.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
+#include <asm/system.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
@@ -662,6 +663,7 @@ static void sbp2_login(struct work_struc
int generation, node_id, local_node_id;
generation = device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
node_id = device->node_id;
local_node_id = device->card->node_id;
@@ -912,6 +914,7 @@ static void sbp2_reconnect(struct work_s
int generation, node_id, local_node_id;
generation = device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
node_id = device->node_id;
local_node_id = device->card->node_id;
Index: linux/drivers/firewire/fw-device.h
===================================================================
--- linux.orig/drivers/firewire/fw-device.h
+++ linux/drivers/firewire/fw-device.h
@@ -35,6 +35,18 @@ struct fw_attribute_group {
struct attribute *attrs[11];
};
+/*
+ * Note, fw_device.generation always has to be read before fw_device.node_id.
+ * Use SMP memory barriers to ensure this. Otherwise requests will be sent
+ * to an outdated node_id if the generation was updated in the meantime due
+ * to a bus reset.
+ *
+ * Likewise, fw-core will take care to update .node_id before .generation so
+ * that whenever fw_device.generation is current WRT the actual bus generation,
+ * fw_device.node_id is guaranteed to be current too.
+ *
+ * The same applies to fw_device.card->node_id vs. fw_device.generation.
+ */
struct fw_device {
atomic_t state;
struct fw_node *node;
Index: linux/drivers/firewire/fw-cdev.c
===================================================================
--- linux.orig/drivers/firewire/fw-cdev.c
+++ linux/drivers/firewire/fw-cdev.c
@@ -207,6 +207,7 @@ fill_bus_reset_event(struct fw_cdev_even
event->closure = client->bus_reset_closure;
event->type = FW_CDEV_EVENT_BUS_RESET;
event->generation = client->device->generation;
+ smp_rmb(); /* node_id must not be older than generation */
event->node_id = client->device->node_id;
event->local_node_id = card->local_node->node_id;
event->bm_node_id = 0; /* FIXME: We don't track the BM. */
--
Stefan Richter
-=====-==--- ---= ==--=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/4] firewire: enforce access order between generation and node ID
[not found] ` <59ad55d30801251024j6ff43953tb86aaa52fdea5ec9@mail.gmail.com>
@ 2008-01-25 22:25 ` Stefan Richter
0 siblings, 0 replies; 23+ messages in thread
From: Stefan Richter @ 2008-01-25 22:25 UTC (permalink / raw)
To: Kristian Høgsberg
Cc: Jarod Wilson, Kristian Høgsberg, Nick Piggin,
linux1394-devel, linux-kernel
Kristian Høgsberg wrote:
> I'm down with the flu here, but I read through the threads and the
> patches and this work definitely looks good.
Thanks for keeping your eyes on us. Your sign-off will go in when I
build the for-linus branch. Get well,
--
Stefan Richter
-=====-==--- ---= ==--=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2008-01-25 22:26 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-01 1:49 [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Stefan Richter
2007-11-01 1:50 ` [PATCH] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
2007-11-01 1:51 ` [PATCH] firewire: fw-sbp2: enforce read order of device generation and node ID Stefan Richter
2007-11-01 3:53 ` [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation Nick Piggin
2007-11-01 9:51 ` dealing with barriers (was Re: [PATCH] firewire: fw-core: enforce write order when updating fw_device.generation) Stefan Richter
2007-11-01 11:32 ` Nick Piggin
2008-01-24 0:52 ` [PATCH 0/4] firewire: order of memory accesses (bus generation vs. node ID) Stefan Richter
2008-01-24 0:53 ` [PATCH 1/4] firewire: fw-sbp2: use device generation, not card generation Stefan Richter
2008-01-24 16:04 ` Jarod Wilson
2008-01-24 0:53 ` [PATCH 2/4] firewire: fw-cdev: " Stefan Richter
2008-01-24 16:05 ` Jarod Wilson
2008-01-24 0:54 ` [PATCH 3/4] firewire: enforce access order between generation and node ID Stefan Richter
2008-01-24 4:55 ` Nick Piggin
2008-01-24 16:06 ` Jarod Wilson
2008-01-25 16:35 ` Stefan Richter
2008-01-25 17:57 ` Stefan Richter
[not found] ` <59ad55d30801251024j6ff43953tb86aaa52fdea5ec9@mail.gmail.com>
2008-01-25 22:25 ` Stefan Richter
2008-01-24 0:55 ` [PATCH 4/4] firewire: fw-core: react on bus resets while the config ROM is being fetched Stefan Richter
2008-01-24 16:11 ` Jarod Wilson
2008-01-24 19:26 ` Jarod Wilson
2008-01-25 16:53 ` [PATCH 4/4 update] " Stefan Richter
2008-01-25 17:16 ` Jarod Wilson
2008-01-25 17:21 ` Stefan Richter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).