* [PATCH] 2.4.21 fix race condition in sg.c
@ 2003-06-26 15:18 Tony Battersby
2003-06-26 22:15 ` Douglas Gilbert
0 siblings, 1 reply; 10+ messages in thread
From: Tony Battersby @ 2003-06-26 15:18 UTC (permalink / raw)
To: dgilbert, linux-scsi
The function sg_cmd_done_bh() sets srp->done = 1 before setting other fields
to be returned to userspace (such as srp->header.resid). This is a race
condition since other code assumes that done == 1 means that all fields to
be returned to userspace (via read()) have already been set. I am seeing
this race condition manifest in a program that uses poll() to wait for any
one of several outstanding commands to complete. The symptom is that every
once in a while the resid value returned for the command is 0 rather than
the correct value, since poll() showed that the command was complete and
read() returned the completion status for it before sg_cmd_done_bh() had
gotten to the line "srp->header.resid = SCpnt->resid".
This patch against 2.4.21 fixes the problem.
--- drivers/scsi/sg.c.orig Fri Jun 13 10:51:36 2003
+++ drivers/scsi/sg.c Thu Jun 26 11:07:28 2003
@@ -1241,7 +1241,6 @@
SRpnt->sr_request.rq_dev = MKDEV(0, 0); /* "sg" _disowns_ request blk
*/
srp->my_cmdp = NULL;
- srp->done = 1;
read_unlock(&sg_dev_arr_lock);
SCSI_LOG_TIMEOUT(4, printk("sg...bh: dev=%d, pack_id=%d, res=0x%x\n",
@@ -1274,6 +1273,8 @@
}
/* Rely on write phase to clean out srp status values, so no "else" */
+ srp->done = 1;
+
scsi_release_request(SRpnt);
SRpnt = NULL;
if (sfp->closed) { /* whoops this fd already released, cleanup */
---------------
Anthony J. Battersby
Cybernetics
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-26 15:18 [PATCH] 2.4.21 fix race condition in sg.c Tony Battersby
@ 2003-06-26 22:15 ` Douglas Gilbert
0 siblings, 0 replies; 10+ messages in thread
From: Douglas Gilbert @ 2003-06-26 22:15 UTC (permalink / raw)
To: tonyb; +Cc: linux-scsi
Tony Battersby wrote:
> The function sg_cmd_done_bh() sets srp->done = 1 before setting other fields
> to be returned to userspace (such as srp->header.resid). This is a race
> condition since other code assumes that done == 1 means that all fields to
> be returned to userspace (via read()) have already been set. I am seeing
> this race condition manifest in a program that uses poll() to wait for any
> one of several outstanding commands to complete. The symptom is that every
> once in a while the resid value returned for the command is 0 rather than
> the correct value, since poll() showed that the command was complete and
> read() returned the completion status for it before sg_cmd_done_bh() had
> gotten to the line "srp->header.resid = SCpnt->resid".
>
> This patch against 2.4.21 fixes the problem.
>
> --- drivers/scsi/sg.c.orig Fri Jun 13 10:51:36 2003
> +++ drivers/scsi/sg.c Thu Jun 26 11:07:28 2003
> @@ -1241,7 +1241,6 @@
> SRpnt->sr_request.rq_dev = MKDEV(0, 0); /* "sg" _disowns_ request blk
> */
>
> srp->my_cmdp = NULL;
> - srp->done = 1;
> read_unlock(&sg_dev_arr_lock);
>
> SCSI_LOG_TIMEOUT(4, printk("sg...bh: dev=%d, pack_id=%d, res=0x%x\n",
> @@ -1274,6 +1273,8 @@
> }
> /* Rely on write phase to clean out srp status values, so no "else" */
>
> + srp->done = 1;
> +
> scsi_release_request(SRpnt);
> SRpnt = NULL;
> if (sfp->closed) { /* whoops this fd already released, cleanup */
>
> ---------------
Tony,
Yes, that is a problem and the same fix is needed in
the lk 2.5 series. Could you forward this match onto
Marcelo and Alan for inclusion.
Doug Gilbert
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] 2.4.21 fix race condition in sg.c
@ 2003-06-27 13:52 Tony Battersby
2003-06-27 14:10 ` James Bottomley
0 siblings, 1 reply; 10+ messages in thread
From: Tony Battersby @ 2003-06-27 13:52 UTC (permalink / raw)
To: marcelo, alan; +Cc: linux-scsi, dougg
Marcelo & Alan:
This patch against 2.4.21 fixes a race condition in the SCSI generic driver.
It has been approved by Doug Gilbert for inclusion. A similar fix is also
needed for 2.5, although I don't have the 2.5 source tree available to make
a patch.
Note: I have added smp_wmb() since the last time I sent this patch. I think
it is necessary for 100% correctness.
Anthony J. Battersby
Cybernetics
--- drivers/scsi/sg.c.orig Fri Jun 13 10:51:36 2003
+++ drivers/scsi/sg.c Fri Jun 27 09:27:40 2003
@@ -1241,7 +1241,6 @@
SRpnt->sr_request.rq_dev = MKDEV(0, 0); /* "sg" _disowns_ request blk */
srp->my_cmdp = NULL;
- srp->done = 1;
read_unlock(&sg_dev_arr_lock);
SCSI_LOG_TIMEOUT(4, printk("sg...bh: dev=%d, pack_id=%d, res=0x%x\n",
@@ -1274,6 +1273,9 @@
}
/* Rely on write phase to clean out srp status values, so no "else" */
+ smp_wmb();
+ srp->done = 1;
+
scsi_release_request(SRpnt);
SRpnt = NULL;
if (sfp->closed) { /* whoops this fd already released, cleanup */
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 13:52 Tony Battersby
@ 2003-06-27 14:10 ` James Bottomley
2003-06-27 14:31 ` Alan Cox
2003-06-27 14:57 ` Tony Battersby
0 siblings, 2 replies; 10+ messages in thread
From: James Bottomley @ 2003-06-27 14:10 UTC (permalink / raw)
To: tonyb; +Cc: marcelo, Alan Cox, SCSI Mailing List, dougg
On Fri, 2003-06-27 at 08:52, Tony Battersby wrote:
> Note: I have added smp_wmb() since the last time I sent this patch. I think
> it is necessary for 100% correctness.
Actually, I don't believe it is. memory barriers are only needed if the
code cares about the ordering of the write. Either for externally
visible ordering (like MMIO to a device, or other CPU visibility in SMP)
or for internal consistency to prevent gcc reordering the writes to give
states that could be temporarily bogus.
I don't believe any of the above applies to the sg request pointer done
flag.
Memory barriers can be expensive on some architectures, so it's best to
use them only where necessary.
James
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 14:10 ` James Bottomley
@ 2003-06-27 14:31 ` Alan Cox
2003-06-27 14:47 ` James Bottomley
2003-06-27 14:57 ` Tony Battersby
1 sibling, 1 reply; 10+ messages in thread
From: Alan Cox @ 2003-06-27 14:31 UTC (permalink / raw)
To: James Bottomley; +Cc: tonyb, Marcelo Tosatti, SCSI Mailing List, dougg
On Gwe, 2003-06-27 at 15:10, James Bottomley wrote:
> Memory barriers can be expensive on some architectures, so it's best to
> use them only where necessary.
For cases you just have to force compiler ordering you can use barrier, which
resolves to
__asm__ __volatile__("": : :"memory")
ie an empty space that affects memory so causes write backs
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 14:31 ` Alan Cox
@ 2003-06-27 14:47 ` James Bottomley
0 siblings, 0 replies; 10+ messages in thread
From: James Bottomley @ 2003-06-27 14:47 UTC (permalink / raw)
To: Alan Cox; +Cc: tonyb, Marcelo Tosatti, SCSI Mailing List, dougg
On Fri, 2003-06-27 at 09:31, Alan Cox wrote:
> For cases you just have to force compiler ordering you can use barrier, which
> resolves to
>
> __asm__ __volatile__("": : :"memory")
>
> ie an empty space that affects memory so causes write backs
Right. The code in question is the only statement between an if block
and a function call, thus I believe it's safe from gcc reordering, so I
was assuming the intent was externally visible ordering (which can be
the expensive one).
James
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 14:10 ` James Bottomley
2003-06-27 14:31 ` Alan Cox
@ 2003-06-27 14:57 ` Tony Battersby
2003-06-27 15:00 ` Jeff Garzik
2003-06-27 15:11 ` James Bottomley
1 sibling, 2 replies; 10+ messages in thread
From: Tony Battersby @ 2003-06-27 14:57 UTC (permalink / raw)
To: 'James Bottomley'
Cc: 'Alan Cox', 'SCSI Mailing List', marcelo, dougg
> I don't believe any of the above applies to the sg request
> pointer done flag.
I see a situation analogous to this:
CPU #0:
foo = result;
smp_wmb();
done = 1;
CPU #1:
if (done)
return foo;
Isn't smp_wmb() necessary to prevent the CPU from re-ordering the stores?
In the actual code, "foo" corresponds to fields like the SCSI return status,
sense data, and the data transfer residual.
Sorry if I am wrong about this. I couldn't find much documentation on the
proper use of barriers.
Anthony J. Battersby
Cybernetics
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 14:57 ` Tony Battersby
@ 2003-06-27 15:00 ` Jeff Garzik
2003-06-27 15:11 ` James Bottomley
1 sibling, 0 replies; 10+ messages in thread
From: Jeff Garzik @ 2003-06-27 15:00 UTC (permalink / raw)
To: Tony Battersby
Cc: 'James Bottomley', 'Alan Cox',
'SCSI Mailing List', marcelo, dougg
Personally, I advocate the KISS approach:
if it's an SMP race where you find yourself thinking about memory
barriers, you should probably just go ahead and use a spinlock :)
Jeff
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 14:57 ` Tony Battersby
2003-06-27 15:00 ` Jeff Garzik
@ 2003-06-27 15:11 ` James Bottomley
2003-06-27 15:38 ` Tony Battersby
1 sibling, 1 reply; 10+ messages in thread
From: James Bottomley @ 2003-06-27 15:11 UTC (permalink / raw)
To: tonyb; +Cc: 'Alan Cox', 'SCSI Mailing List', marcelo, dougg
On Fri, 2003-06-27 at 09:57, Tony Battersby wrote:
> > I don't believe any of the above applies to the sg request
> > pointer done flag.
>
> I see a situation analogous to this:
>
> CPU #0:
> foo = result;
> smp_wmb();
> done = 1;
>
> CPU #1:
> if (done)
> return foo;
> Isn't smp_wmb() necessary to prevent the CPU from re-ordering the stores?
Ah, OK. The answer's yes and no. Your example is true for lockless
access, but horribly racy (which is why, in practice, it's almost never
done). The original code had locked access to done which prevents the
race. If we've lost the locked access, I don't think the fix can be
correct.
James
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH] 2.4.21 fix race condition in sg.c
2003-06-27 15:11 ` James Bottomley
@ 2003-06-27 15:38 ` Tony Battersby
0 siblings, 0 replies; 10+ messages in thread
From: Tony Battersby @ 2003-06-27 15:38 UTC (permalink / raw)
To: 'James Bottomley'
Cc: 'Alan Cox', 'SCSI Mailing List', marcelo, dougg
> > Isn't smp_wmb() necessary to prevent the CPU from
> > re-ordering the stores?
>
> Ah, OK. The answer's yes and no. Your example is true for lockless
> access, but horribly racy (which is why, in practice, it's
> almost never
> done). The original code had locked access to done which prevents the
> race. If we've lost the locked access, I don't think the fix can be
> correct.
I looked at the locking before attempting the patch, but I didn't see
anything protecting the done flag or status return fields. It is true
that the done flag was set (without the patch) holding sg_dev_arr_lock,
but the code that looks at the done flag doesn't touch that lock.
If you think it needs a spinlock to be correct, then I will wait for
Doug's input, since he knows how it is supposed to work.
Anthony J. Battersby
Cybernetics
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2003-06-27 15:24 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-06-26 15:18 [PATCH] 2.4.21 fix race condition in sg.c Tony Battersby
2003-06-26 22:15 ` Douglas Gilbert
-- strict thread matches above, loose matches on Subject: below --
2003-06-27 13:52 Tony Battersby
2003-06-27 14:10 ` James Bottomley
2003-06-27 14:31 ` Alan Cox
2003-06-27 14:47 ` James Bottomley
2003-06-27 14:57 ` Tony Battersby
2003-06-27 15:00 ` Jeff Garzik
2003-06-27 15:11 ` James Bottomley
2003-06-27 15:38 ` Tony Battersby
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox