From: Tony Battersby <tonyb@cybernetics.com>
To: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: dgilbert@interlog.com, James.Bottomley@HansenPartnership.com,
hch@infradead.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH 0/2] sg: fix races during device removal (v2)
Date: Mon, 12 Jan 2009 16:09:25 -0500 [thread overview]
Message-ID: <496BB185.7080008@cybernetics.com> (raw)
In-Reply-To: <20090111022525I.fujita.tomonori@lab.ntt.co.jp>
FUJITA Tomonori wrote:
>
>> test #1
>>
>> open /dev/sgX
>> send a command that takes a long time (e.g. any tape drive seek
>> command)
>> before command completes, echo 1 >
>> /sys/class/scsi_generic/sgX/device/delete
>>
>> without patch:
>> oops
>>
>> with patch:
>> test program gets ENODEV immediately
>> keventd sleeps until the cmd is complete
>> this is suboptimal since it starves other users of keventd while
>> waiting for the command to complete, but it is better than an oops
>>
>
> As you said, we can do better. It's not correct to make
> class_interface's remove_dev hook, sg_remove wait for the completion
> of all the commands.
>
> The main problem is about the lifetime of sg_dev and sg_fd. I think
> that we can fix it in a simpler and better way.
>
> You use kref for sg_dev. We can use kref for sg_fd too. I think that
> that's all we need to do.
>
>
Thanks for the review. Your patch behaves very similarly to my patch
with the wait_event() removed from sg_remove(). With my patch, it was
not strictly necessary for sg_remove() to wait until all commands had
gone through sg_rq_end_io() - the refcounting, locking, and other checks
in my patch would have taken care of freeing sdp, sfp, and related
resources at the right time, even if sg_remove() didn't wait. The
reason I added the wait is because without it keventd would still sleep
(outside of sg_remove()) waiting for the command to complete anyway,
except that when the command did complete, the kernel would spew some
scary-looking error messages:
mptscsih: ioc1: ERROR - Received a mf that was already freed
mptscsih: ioc1: ERROR - req_idx=beaf req_idx_MR=beaf mf=ce504b00
mr=00000000 sc=00000000
So I figured it would be better to wait in sg_remove() and avoid the
error above, instead of waiting somewhere else for the same amount of
time and getting an error. With your patch, I get the above error
again, and keventd blocks for the same amount of time, so initially it
did not look like an improvement. However, since this issue has
surfaced again, I went ahead and analyzed why keventd is waiting outside
of sg_remove(), and discovered that it is a bug in mptscsih (*). When I
retested with sym53c8xx or open-iscsi, keventd did not block waiting for
the command to complete. I will send a patch for mptscsih later.
So we can accomplish the same effect with my patch just by removing the
wait_event() in sg_remove() and the other code associated with it. Now
the question is - do we still want to use krefs for the sg_fd? I had
considered it when coming up with my patch, but I decided not to add the
overhead of two extra atomic ops per command. Atomic ops for opening
and closing fds are OK, but I didn't want to slow down the command
processing critical paths. I haven't measured the overhead, so maybe it
isn't too bad, but it is something to consider.
Also, I spent a lot of time analyzing locks and potential races with my
patch, and your patch reverts some of the fixes that I incorporated in
my patch. Just to point out one example that I spotted quickly:
sg_rq_end_io() needs to check srp->orphan and set srp->done = 1 while
holding sfp->rq_list_lock to avoid races with sg_ioctl(SG_IO) (see my
[PATCH 2/2] sg: fix races with ioctl(SG_IO)). But if you really want to
go with a kref on sg_fd and get/put on every command, let me know and I
will check and test everything again in detail. If not, I will resubmit
my patch with the wait_event and associated code removed.
As a side note, scsi_remove_device() is called from keventd when using
the /sys interface to delete a device, but may be called from a
different process when using a different interface to delete a device.
So under different circumstances the sleep may block rmmod, iscsi_wq,
etc., instead of keventd.
(*) Here is why mptscsih causes keventd to sleep:
scsi_remove_device() -> __scsi_remove_device() -> mptscsih_slave_destroy()
First, mptscsih_search_running_cmds() calls sc->scsi_done()
(sg_rq_end_io()) on all outstanding commands.
Then, mptscsih_synchronize_cache() is called. It has the following check:
if (vdevice->vtarget->type != TYPE_DISK || vdevice->vtarget->deleted ||
!vdevice->configured_lun)
return;
However, this check is buggy. My test is using a tape drive, so the
type != TYPE_DISK check should make the function return without doing
anything. However, vtarget->type appears not to be set. Also, you
might hope that the vtarget->deleted check might make the function
return, but it does not. So it ends up sending a synchronize cache disk
command to a tape drive being deleted, and waits for it to complete.
Since the tape drive does not do command queueing, this forces it to
wait for the tape drive seek command previously sent by my test program
to finish. But when the seek command does finish, the message frame was
already freed by mptscsih_search_running_cmds(), so it gives the error
message above.
Tony
next prev parent reply other threads:[~2009-01-12 21:09 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-05 19:07 [PATCH 0/2] sg: fix races during device removal (v2) Tony Battersby
2009-01-08 23:21 ` Douglas Gilbert
2009-01-10 17:26 ` FUJITA Tomonori
2009-01-12 21:09 ` Tony Battersby [this message]
2009-01-13 16:24 ` FUJITA Tomonori
2009-01-14 20:31 ` Tony Battersby
2009-01-14 21:39 ` Greg KH
2009-01-14 21:59 ` Tony Battersby
2009-01-14 22:33 ` Stefan Richter
2009-01-14 22:53 ` Tony Battersby
2009-01-14 23:47 ` Stefan Richter
2009-01-15 14:47 ` Tony Battersby
2009-01-15 16:22 ` Stefan Richter
2009-01-15 16:44 ` Stefan Richter
2009-01-15 18:17 ` Tony Battersby
2009-01-15 18:47 ` Stefan Richter
2009-01-15 19:14 ` Stefan Richter
2009-01-15 19:20 ` Tony Battersby
2009-01-15 20:43 ` Stefan Richter
2009-01-15 21:43 ` Tony Battersby
2009-01-15 21:58 ` Stefan Richter
2009-01-15 22:23 ` Tony Battersby
2009-01-15 23:24 ` Stefan Richter
2009-01-16 14:16 ` Tony Battersby
2009-01-16 0:53 ` Stefan Richter
2009-01-16 8:09 ` Stefan Richter
2009-01-19 6:57 ` FUJITA Tomonori
2009-01-19 15:02 ` Tony Battersby
2009-01-19 23:03 ` [PATCH 1/2] sg: fix races during device removal (v4) Tony Battersby
2009-01-20 1:06 ` FUJITA Tomonori
2009-01-20 21:58 ` [PATCH 1/2] sg: fix races during device removal (v5) Tony Battersby
2009-01-21 18:25 ` Stefan Richter
2009-01-21 19:23 ` Tony Battersby
2009-01-21 19:45 ` [PATCH 1/2] sg: fix races during device removal (v6) Tony Battersby
2009-01-25 12:46 ` FUJITA Tomonori
2009-01-26 13:57 ` Douglas Gilbert
2009-01-28 1:51 ` FUJITA Tomonori
2009-01-28 15:06 ` James Bottomley
2009-01-20 22:00 ` [PATCH 2/2] sg: fix races with ioctl(SG_IO) (v2) Tony Battersby
2009-01-25 12:46 ` FUJITA Tomonori
2009-01-19 23:06 ` [PATCH 2/2] sg: fix races with ioctl(SG_IO) Tony Battersby
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=496BB185.7080008@cybernetics.com \
--to=tonyb@cybernetics.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=dgilbert@interlog.com \
--cc=fujita.tomonori@lab.ntt.co.jp \
--cc=hch@infradead.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox