From: vaughan <vaughan.cao@oracle.com>
To: dgilbert@interlog.com
Cc: "Jörn Engel" <joern@logfs.org>,
JBottomley@parallels.com, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open
Date: Wed, 28 Aug 2013 09:50:43 +0800 [thread overview]
Message-ID: <521D5773.8040006@oracle.com> (raw)
In-Reply-To: <521CA60E.2040802@interlog.com>
On 08/27/2013 09:13 PM, Douglas Gilbert wrote:
> On 13-08-27 10:16 AM, vaughan wrote:
>> On 08/13/2013 11:16 AM, Douglas Gilbert wrote:
>>> On 13-08-12 10:46 PM, vaughan wrote:
>>>> On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
>>>>> On 13-08-04 10:19 PM, vaughan wrote:
>>>>>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>>>>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>>>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>>>>>
>>>>>>>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>>>>>>>> happen between
>>>>>>>>>> sg_open and sg_remove.
>>>>>>>>>>
>>>>>>>>>> Changes from v4:
>>>>>>>>>> * [3/4] use ERR_PTR series instead of adding another
>>>>>>>>>> parameter in
>>>>>>>>>> sg_add_sfp
>>>>>>>>>> * [4/4] fix conflict for cherry-pick from v3.
>>>>>>>>>>
>>>>>>>>>> Changes from v3:
>>>>>>>>>> * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>>>>> * not set exclude with sfd_lock held.
>>>>>>>>>>
>>>>>>>>>> Vaughan Cao (4):
>>>>>>>>>> [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>>>>>> [SCSI] sg: no need sg_open_exclusive_lock
>>>>>>>>>> [SCSI] sg: checking sdp->detached isn't protected when
>>>>>>>>>> open
>>>>>>>>>> [SCSI] sg: push file descriptor list locking down to
>>>>>>>>>> per-device
>>>>>>>>>> locking
>>>>>>>>>>
>>>>>>>>>> drivers/scsi/sg.c | 178
>>>>>>>>>> +++++++++++++++++++++++++-----------------------------
>>>>>>>>>> 1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>>>>>
>>>>>>>>> Patchset looks good to me, although I didn't test it on hardware
>>>>>>>>> yet.
>>>>>>>>> Signed-off-by: Joern Engel <joern@logfs.org>
>>>>>>>>>
>>>>>>>>> James, care to pick this up?
>>>>>>>>
>>>>>>>> Acked-by: Douglas Gilbert <dgilbert@interlog.com>
>>>>>>>>
>>>>>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>>>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>>>>>> according to the same test. Block device passed.
>>>>>>>>
>>>>>>>> James, could you clean this up:
>>>>>>>> drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>>>>>>>> [-Wunused-variable]
>>>>>>>
>>>>>>> Further testing suggests this patch on the sg driver is
>>>>>>> broken, so I'll rescind my ack.
>>>>>>>
>>>>>>> The case it is broken for is when a device is opened
>>>>>>> without O_EXCL. Now if, while it is open, a second
>>>>>>> thread/process tries to open the same device O_EXCL
>>>>>>> then IMO the second open should fail with EBUSY.
>>>>>>>
>>>>>>> My testing shows that O_EXCL opens properly deflect
>>>>>>> other O_EXCL opens.
>>>>>> Hi Doug,
>>>>>>
>>>>>> My test don't have this issue. The routine is something as below:
>>>>>>
>>>>>> I start three opens without O_EXCL, wait 30s each, and open with
>>>>>> O_EXCL|O_NONBLOCK, it failed with EBUSY.
>>>>>> And I also call myopen with/without O_EXCL many times in
>>>>>> background at
>>>>>> the same time, and the test is passed. I don't know why it failed in
>>>>>> your test.
>>>>>>
>>>>>> Usage: myopen [-e][-n][-d delay] -f file
>>>>>> -e: exclude
>>>>>> -n: nonblock
>>>>>> -d: delay N seconds and then close.
>>>>>>
>>>>>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>>>>>> [1] 3417
>>>>>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>>>>>> [2] 3418
>>>>>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>>>>>> [3] 3419
>>>>>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>>>>>> max_active_device=6(origin 1)
>>>>>> def_reserved_size=32768
>>>>>> >>> device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55
>>>>>> excl=0
>>>>>> FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>> FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>> FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>>
>>>>>> [root@vacaowol5 16835013]# ./myopen -e -n -f /dev/sg5 -d 30 &
>>>>>> [4] 3422
>>>>>> [3422:3351] /dev/sg5:exclude: Device or resource busy
>>>>>>
>>>>>> [4]+ Exit 1 ./myopen -e -n -f /dev/sg5 -d 30
>>>>>>
>>>>>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>>>>>> max_active_device=6(origin 1)
>>>>>> def_reserved_size=32768
>>>>>> >>> device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55
>>>>>> excl=0
>>>>>> FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>> FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>> FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>>>>> cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>>>>> No requests active
>>>>>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>>>>>> [1] Done ./myopen -f /dev/sg5 -d 30
>>>>>> [2]- Done ./myopen -f /dev/sg5 -d 30
>>>>>> [3]+ Done ./myopen -f /dev/sg5 -d 30
>>>>>>
>>>>>
>>>>> Hi,
>>>>> After the initial failures about 36 hours ago, retesting
>>>>> yesterday and today has not produced any unexpected
>>>>> failures. And I have been trying hard on lk 3.10.4 and
>>>>> lk 3.10.5 .
>>>>>
>>>>> My test program is a bit more intense than yours and can
>>>>> be found in the sg3_utils beta in the News section of this
>>>>> page:
>>>>> http://sg.danny.cz/sg/
>>>>>
>>>>> It is in the examples directory, two variants called
>>>>> sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
>>>>> compiler, IOW something that can compile c++11 . gcc 4.7.3
>>>>> in Ubuntu 13.04 only just manages, fedora 19 should do
>>>>> better with gcc 4.8.1 . The threading is implemented using
>>>>> pthreads so it should be reliable.
>>>>>
>>>>> Typically I run multiple instances (processes) and each has
>>>>> multiple threads. One instance can run '-x' which will cause
>>>>> its first thread not to use O_EXCL **. All my tests currently
>>>>> use O_NONBLOCK and that leads to lots of EBUSYs (sometimes
>>>>> in the billions).
>>>>>
>>>>> Doug Gilbert
>>>>>
>>>>>
>>>>> ** Using '-x' on two instances will cause an expected failure
>>>>> so can be used as a control.
>>>>>
>>>> Hi Doug,
>>>>
>>>> Can I regard this as you ACK it again?
>>>
>>> Hi,
>>> I'd like you to test your setup with sg_tst_excl or sg_tst_excl2 .
>>> Since my last email, I have not seen any more failures with those
>>> tests on the patched sg driver but I did see a couple on
>>> /dev/sd* . With sg_tst_excl2, bsg devices can be used and since bsg
>>> accepts and ignores O_EXCL, it fails reliably.
>>>
>>> BTW I use scsi_debug with 'delay=0' for a pseudo device.
>>>
>>> Doug Gilbert
>> Hi Doug,
>>
>> I run test for sg and sd drivers with both sg_tst_excl and sg_tst_excl2
>> on kernel vanilla 3.10.9 with mypatches included on Fedora19 x86_64
>> baremachine.
>> * I've tried several times with different -w and -n setting, no failure
>> for sg driver found.
>> * It's easy to find failure on both patched and non-patched kernel for
>> sd driver with the following test command:
>> ./sg_tst_excl -w 3 -n 2000 -t 16 -x $1 &
>> ./sg_tst_excl -w 3 -n 2000 -t 16 $1 &
>> ./sg_tst_excl -w 0 -n 2000 -t 16 $1 &
>> ./sg_tst_excl -w -1 -n 2000 -t 16 $1 &
>> ./sg_tst_excl -w -2 -n 2000 -t 16 $1 &
>> I think option '-w 0/-1/-2' is significant to trigger the failure, since
>> when I only use '-w >0', test usually passed.
>
> Hi,
> Thanks for testing that. For others trying to follow this,
> the usage message for that test utility is:
>
> Usage: sg_tst_excl [-b] [-l <lba>] [-n <n_per_thr>] [-t <num_thrs>]
> [-V] [-w <wait_ms>] [-x] <disk_device>
> where
> -b block on open (def: O_NONBLOCK)
> -l <lba> logical block to increment (def: 1000)
> -n <n_per_thr> number of loops per thread (def: 200)
> -t <num_thrs> number of threads (def: 4)
> -V print version number then exit
> -w <wait_ms> >0: sleep_for(<wait_ms>); =0: yield(); -1: no
> wait; -2: sleep(0) (def: 0)
> -x don't use O_EXCL on first thread (def: use
> O_EXCL on all threads)
>
> Test O_EXCL open flag with sg driver. Each open/close cycle with the
> O_EXCL flag does a double increment on lba (using its first 4 bytes).
>
> ----------------------
>
> The test is using O_EXCL as a lock so that if a 4 byte integer in
> a block starts out as even (and for a scsi_debug pseudo, blocks are
> zero filled) then a open,double_increment,close sequence should
> always see an even number after the open. Notice that it should be
> safe for one process to run with '-x' so that its first thread
> opens the device without the O_EXCL flag.
>
>
> Acked-by: Douglas Gilbert <dgilbert@interlog.com>
>
> And I let you run with the sd driver quirk :-)
Thanks, I appreciate testing the quirk for sd.
Vaughan
next prev parent reply other threads:[~2013-08-28 1:50 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-05 9:18 [PATCH] sg: atomize check and set sdp->exclude in sg_open vaughan
2013-06-05 13:27 ` Jörn Engel
2013-06-05 16:16 ` vaughan
2013-06-05 15:41 ` Jörn Engel
2013-06-06 7:19 ` vaughan
2013-06-06 7:29 ` vaughan
2013-06-17 13:10 ` [PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open vaughan
2013-06-26 1:37 ` vaughan
2013-07-05 1:59 ` vaughan
2013-07-05 17:39 ` Jörn Engel
2013-07-06 17:24 ` vaughan
2013-07-07 19:53 ` [PATCH v3 " vaughan
2013-07-15 20:37 ` Jörn Engel
2013-07-17 15:34 ` [PATCH v4 0/4] [SCSI] sg: fix race condition in sg_open Vaughan Cao
2013-07-17 15:34 ` [PATCH v4 1/4] [SCSI] sg: use rwsem to solve race during exclusive open Vaughan Cao
2013-07-19 21:19 ` Jörn Engel
2013-07-17 15:34 ` [PATCH v4 2/4] [SCSI] sg: no need sg_open_exclusive_lock Vaughan Cao
2013-07-19 21:19 ` Jörn Engel
2013-07-17 15:34 ` [PATCH v4 3/4] [SCSI] sg: checking sdp->detached isn't protected when open Vaughan Cao
2013-07-19 21:24 ` Jörn Engel
2013-07-22 3:39 ` [PATCH v5 " Vaughan Cao
[not found] ` <CAMvaAQnFy0WiXHaNtAB1KPLK-7yj1AHh=_Pw4MBm0=_ecpoAoQ@mail.gmail.com>
2013-07-22 16:52 ` [PATCH v4 " Jörn Engel
2013-07-17 15:34 ` [PATCH v4 4/4] [SCSI] sg: push file descriptor list locking down to per-device locking Vaughan Cao
2013-07-19 21:26 ` Jörn Engel
2013-07-22 3:41 ` [PATCH v5 " Vaughan Cao
2013-07-22 4:40 ` [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open Vaughan Cao
2013-07-22 4:40 ` [PATCH v5 1/4] [SCSI] sg: use rwsem to solve race during exclusive open Vaughan Cao
2013-08-28 4:00 ` James Bottomley
2013-08-28 10:07 ` [PATCH v6 0/4][SCSI] sg: fix race condition in sg_open Vaughan Cao
2013-08-28 10:07 ` [PATCH v6 1/4] sg: use rwsem to solve race during exclusive open Vaughan Cao
2013-08-28 10:26 ` James Bottomley
2013-08-29 2:00 ` [PATCH v7 0/4][SCSI] sg: fix race condition in sg_open Vaughan Cao
2013-08-29 2:00 ` [PATCH v7 1/4] sg: use rwsem to solve race during exclusive open Vaughan Cao
2013-08-29 2:00 ` [PATCH v7 2/4] sg: no need sg_open_exclusive_lock Vaughan Cao
2013-08-29 2:00 ` [PATCH v7 3/4] sg: checking sdp->detached isn't protected when open Vaughan Cao
2013-08-29 2:00 ` [PATCH v7 4/4] sg: push file descriptor list locking down to per-device locking Vaughan Cao
2013-08-28 10:07 ` [PATCH v6 2/4] sg: no need sg_open_exclusive_lock Vaughan Cao
2013-08-28 10:07 ` [PATCH v6 3/4] sg: checking sdp->detached isn't protected when open Vaughan Cao
2013-08-28 10:07 ` [PATCH v6 4/4] sg: push file descriptor list locking down to per-device locking Vaughan Cao
2013-07-22 4:40 ` [PATCH v5 2/4] [SCSI] sg: no need sg_open_exclusive_lock Vaughan Cao
2013-07-22 4:40 ` [PATCH v5 3/4] [SCSI] sg: checking sdp->detached isn't protected when open Vaughan Cao
2013-07-22 4:40 ` [PATCH v5 4/4] [SCSI] sg: push file descriptor list locking down to per-device locking Vaughan Cao
2013-07-22 17:03 ` [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open Jörn Engel
2013-07-25 15:32 ` vaughan
2013-07-25 20:33 ` Douglas Gilbert
2013-07-31 4:40 ` vaughan
2013-08-01 5:01 ` Douglas Gilbert
2013-08-03 5:25 ` Douglas Gilbert
2013-08-05 2:19 ` vaughan
2013-08-05 20:52 ` Douglas Gilbert
2013-08-13 2:46 ` vaughan
2013-08-13 3:16 ` Douglas Gilbert
2013-08-27 8:16 ` vaughan
2013-08-27 13:13 ` Douglas Gilbert
2013-08-28 1:50 ` vaughan [this message]
2013-07-15 17:25 ` [PATCH v2 1/1] [SCSI] sg: fix race condition when do exclusive open Jörn Engel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=521D5773.8040006@oracle.com \
--to=vaughan.cao@oracle.com \
--cc=JBottomley@parallels.com \
--cc=dgilbert@interlog.com \
--cc=joern@logfs.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).