* [PATCH v4] sg: O_EXCL and other lock handling
@ 2014-06-13 0:26 Douglas Gilbert
2014-06-13 9:03 ` Christoph Hellwig
2014-06-21 11:01 ` Hannes Reinecke
0 siblings, 2 replies; 6+ messages in thread
From: Douglas Gilbert @ 2014-06-13 0:26 UTC (permalink / raw)
To: SCSI development list, Christoph Hellwig
Cc: James Bottomley, Hannes Reinecke, vaughan
[-- Attachment #1: Type: text/plain, Size: 1740 bytes --]
This is a re-presentation of a patch to the sg driver
whose v3 was sent in November 2013:
http://www.spinics.net/lists/linux-scsi/msg69957.html
It addresses a problem reported by Vaughan Cao concerning
the correctness of the O_EXCL logic in the sg driver. POSIX
doesn't defined O_EXCL semantics on devices but "allow only
one open file descriptor at a time per sg device" is a rough
definition. The sg driver's semantics have been to wait
on an open() when O_NONBLOCK is not given and there are
O_EXCL headwinds. Nasty things can happen during that wait
such as the device being detached (removed). So multiple
locks are reworked in this patch making it large and hard
to break down into digestible bits.
This patch is against Linus's current git repository which
doesn't include any sg patches sent in the last few weeks.
Hence this patch touches as little as possible that it
doesn't need to and strips out most SCSI_LOG_TIMEOUT()
changes in v3 because Hannes said he was going to rework all
that stuff.
The sg3_utils package has several test programs written to
test this patch. See examples/sg_tst_excl*.cpp .
Not all the locks and flags in sg have been re-worked in
this patch, notably sg_request::done . That can wait for
a follow-up patch if this one meets with approval.
ChangeLog v4:
- based on the current kernel tree: pre 3.16-rc1
- strip out clean-ups in v3 that others are better
placed to do (e.g. debug/logging)
- simplify open_wait_event() logic and add comment
ChangeLog v3 and earlier: see link in the first paragraph.
Could anyone confirm whether v3 of this patch has found its
way into any distro and/or been tested or used more widely?
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
[-- Attachment #2: sg.c_race_excl_dg4.patch --]
[-- Type: text/x-patch, Size: 28774 bytes --]
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 53268aa..fc10ed5 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -51,6 +51,7 @@ static int sg_version_num = 30534; /* 2 digits for each component */
#include <linux/delay.h>
#include <linux/blktrace_api.h>
#include <linux/mutex.h>
+#include <linux/atomic.h>
#include <linux/ratelimit.h>
#include "scsi.h"
@@ -102,18 +103,16 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
#define SG_SECTOR_SZ 512
-static int sg_add(struct device *, struct class_interface *);
-static void sg_remove(struct device *, struct class_interface *);
-
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
+static int sg_add_device(struct device *, struct class_interface *);
+static void sg_remove_device(struct device *, struct class_interface *);
static DEFINE_IDR(sg_index_idr);
static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock
file descriptor list for device */
static struct class_interface sg_interface = {
- .add_dev = sg_add,
- .remove_dev = sg_remove,
+ .add_dev = sg_add_device,
+ .remove_dev = sg_remove_device,
};
typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */
@@ -146,8 +145,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests outstanding per file */
} Sg_request;
typedef struct sg_fd { /* holds the state of a file descriptor */
- /* sfd_siblings is protected by sg_index_lock */
- struct list_head sfd_siblings;
+ struct list_head sfd_siblings; /* protected by device's sfd_lock */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait; /* queue read until command done */
rwlock_t rq_list_lock; /* protect access to list in req_arr */
@@ -170,14 +168,15 @@ typedef struct sg_fd { /* holds the state of a file descriptor */
typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
- wait_queue_head_t o_excl_wait; /* queue open() when O_EXCL in use */
+ wait_queue_head_t open_wait; /* queue open() when O_EXCL present */
+ struct mutex open_rel_lock; /* held when in open() or release() */
int sg_tablesize; /* adapter's max scatter-gather table size */
u32 index; /* device index number */
- /* sfds is protected by sg_index_lock */
struct list_head sfds;
- volatile char detached; /* 0->attached, 1->detached pending removal */
- /* exclude protected by sg_open_exclusive_lock */
- char exclude; /* opened for exclusive access */
+ rwlock_t sfd_lock; /* protect access to sfd list */
+ atomic_t detaching; /* 0->device usable, 1->device detaching */
+ atomic_t exclude; /* 1->open(O_EXCL) succeeded and is active */
+ atomic_t open_cnt; /* count of opens (perhaps < num(sfds) ) */
char sgdebug; /* 0->off, 1->sense, 9->dump dev, 10-> all devs */
struct gendisk *disk;
struct cdev * cdev; /* char_dev [sysfs: /sys/cdev/major/sg<n>] */
@@ -208,7 +207,7 @@ static Sg_request *sg_add_request(Sg_fd * sfp);
static int sg_remove_request(Sg_fd * sfp, Sg_request * srp);
static int sg_res_in_use(Sg_fd * sfp);
static Sg_device *sg_get_dev(int dev);
-static void sg_put_dev(Sg_device *sdp);
+static void sg_device_destroy(struct kref *kref);
#define SZ_SG_HEADER sizeof(struct sg_header)
#define SZ_SG_IO_HDR sizeof(sg_io_hdr_t)
@@ -225,38 +224,23 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd)
return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
}
-static int get_exclude(Sg_device *sdp)
-{
- unsigned long flags;
- int ret;
-
- spin_lock_irqsave(&sg_open_exclusive_lock, flags);
- ret = sdp->exclude;
- spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
- return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
+static int
+open_wait_event(Sg_device *sdp, int other_has_exclude)
{
- unsigned long flags;
+ int retval;
- spin_lock_irqsave(&sg_open_exclusive_lock, flags);
- sdp->exclude = val;
- spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
- return val;
-}
-
-static int sfds_list_empty(Sg_device *sdp)
-{
- unsigned long flags;
- int ret;
-
- read_lock_irqsave(&sg_index_lock, flags);
- ret = list_empty(&sdp->sfds);
- read_unlock_irqrestore(&sg_index_lock, flags);
- return ret;
+ if (other_has_exclude)
+ retval = wait_event_interruptible(sdp->open_wait,
+ (atomic_read(&sdp->detaching)) ||
+ !atomic_read(&sdp->exclude));
+ else /* my caller wants O_EXCL */
+ retval = wait_event_interruptible(sdp->open_wait,
+ (atomic_read(&sdp->detaching)) ||
+ (atomic_read(&sdp->open_cnt) < 1));
+ return atomic_read(&sdp->detaching) ? -ENODEV : retval;
}
+/* Returns 0 on success, else a negated errno value */
static int
sg_open(struct inode *inode, struct file *filp)
{
@@ -265,11 +249,12 @@ sg_open(struct inode *inode, struct file *filp)
struct request_queue *q;
Sg_device *sdp;
Sg_fd *sfp;
- int res;
int retval;
nonseekable_open(inode, filp);
SCSI_LOG_TIMEOUT(3, printk("sg_open: dev=%d, flags=0x%x\n", dev, flags));
+ if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE)))
+ return -EPERM; /* Can't lock it with read only access */
sdp = sg_get_dev(dev);
if (IS_ERR(sdp)) {
retval = PTR_ERR(sdp);
@@ -287,6 +272,9 @@ sg_open(struct inode *inode, struct file *filp)
if (retval)
goto sdp_put;
+ /* scsi_block_when_processing_errors() may block so bypass
+ * check if O_NONBLOCK. Permits SCSI commands to be issued
+ * during error recovery. Tread carefully. */
if (!((flags & O_NONBLOCK) ||
scsi_block_when_processing_errors(sdp->device))) {
retval = -ENXIO;
@@ -294,68 +282,78 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
- if (flags & O_EXCL) {
- if (O_RDONLY == (flags & O_ACCMODE)) {
- retval = -EPERM; /* Can't lock it with read only access */
- goto error_out;
- }
- if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
- retval = -EBUSY;
- goto error_out;
- }
- res = wait_event_interruptible(sdp->o_excl_wait,
- ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
- if (res) {
- retval = res; /* -ERESTARTSYS because signal hit process */
- goto error_out;
- }
- } else if (get_exclude(sdp)) { /* some other fd has an exclusive lock on dev */
- if (flags & O_NONBLOCK) {
- retval = -EBUSY;
- goto error_out;
+ mutex_lock(&sdp->open_rel_lock);
+ if (flags & O_NONBLOCK) {
+ if (flags & O_EXCL) {
+ if (atomic_read(&sdp->open_cnt) > 0) {
+ retval = -EBUSY;
+ goto error_mutex_locked;
+ }
+ } else {
+ if (atomic_read(&sdp->exclude)) {
+ retval = -EBUSY;
+ goto error_mutex_locked;
+ }
}
- res = wait_event_interruptible(sdp->o_excl_wait, !get_exclude(sdp));
- if (res) {
- retval = res; /* -ERESTARTSYS because signal hit process */
- goto error_out;
+ } else {
+ if (flags & O_EXCL) {
+ while (atomic_read(&sdp->open_cnt) > 0) {
+ mutex_unlock(&sdp->open_rel_lock);
+ retval = open_wait_event(sdp, 0);
+ if (retval) /* -ERESTARTSYS or -ENODEV */
+ goto error_out;
+ mutex_lock(&sdp->open_rel_lock);
+ }
+ } else {
+ while (atomic_read(&sdp->exclude)) {
+ mutex_unlock(&sdp->open_rel_lock);
+ retval = open_wait_event(sdp, 1);
+ if (retval) /* -ERESTARTSYS or -ENODEV */
+ goto error_out;
+ mutex_lock(&sdp->open_rel_lock);
+ }
}
}
- if (sdp->detached) {
- retval = -ENODEV;
- goto error_out;
- }
- if (sfds_list_empty(sdp)) { /* no existing opens on this device */
+ /* N.B. at this point we are holding the open_rel_lock */
+ if (flags & O_EXCL)
+ atomic_set(&sdp->exclude, 1);
+
+ if (atomic_read(&sdp->open_cnt) < 1) { /* no existing opens */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
- if ((sfp = sg_add_sfp(sdp, dev)))
+ sfp = sg_add_sfp(sdp, dev);
+ if (!IS_ERR(sfp)) {
filp->private_data = sfp;
- else {
+ retval = 0;
+ atomic_inc(&sdp->open_cnt);
+ mutex_unlock(&sdp->open_rel_lock);
+ } else {
+ retval = PTR_ERR(sfp);
if (flags & O_EXCL) {
- set_exclude(sdp, 0); /* undo if error */
- wake_up_interruptible(&sdp->o_excl_wait);
+ atomic_set(&sdp->exclude, 0); /* undo if error */
+ wake_up_interruptible(&sdp->open_wait);
}
- retval = -ENOMEM;
- goto error_out;
- }
- retval = 0;
+error_mutex_locked:
+ mutex_unlock(&sdp->open_rel_lock);
error_out:
- if (retval) {
scsi_autopm_put_device(sdp->device);
sdp_put:
scsi_device_put(sdp->device);
}
sg_put:
if (sdp)
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
return retval;
}
-/* Following function was formerly called 'sg_close' */
+/* Release resources associated with a successful sg_open()
+ * Returns 0 on success, else a negated errno value */
static int
sg_release(struct inode *inode, struct file *filp)
{
+ int rem_open_cnt;
Sg_device *sdp;
Sg_fd *sfp;
@@ -363,11 +361,22 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
- set_exclude(sdp, 0);
- wake_up_interruptible(&sdp->o_excl_wait);
-
+ mutex_lock(&sdp->open_rel_lock);
scsi_autopm_put_device(sdp->device);
kref_put(&sfp->f_ref, sg_remove_sfp);
+ rem_open_cnt = atomic_dec_return(&sdp->open_cnt);
+ if (unlikely(rem_open_cnt < 0)) {
+ atomic_set(&sdp->open_cnt, 0);
+ rem_open_cnt = 0;
+ pr_err("%s: open_cnt went negative, fix\n", __func__);
+ }
+ /* possibly many open()s waiting on exlude clearing, start many;
+ * only open(O_EXCL)s wait on 0==rem_open_cnt so only start one */
+ if (atomic_add_unless(&sdp->exclude, -1, 0))
+ wake_up_interruptible_all(&sdp->open_wait);
+ else if (0 == rem_open_cnt)
+ wake_up_interruptible(&sdp->open_wait);
+ mutex_unlock(&sdp->open_rel_lock);
return 0;
}
@@ -419,7 +428,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
}
srp = sg_get_rq_mark(sfp, req_pack_id);
if (!srp) { /* now wait on packet to arrive */
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
retval = -ENODEV;
goto free_old_hdr;
}
@@ -428,9 +437,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
goto free_old_hdr;
}
retval = wait_event_interruptible(sfp->read_wait,
- (sdp->detached ||
+ (atomic_read(&sdp->detaching) ||
(srp = sg_get_rq_mark(sfp, req_pack_id))));
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
retval = -ENODEV;
goto free_old_hdr;
}
@@ -572,7 +581,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_write: %s, count=%d\n",
sdp->disk->disk_name, (int) count));
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (!((filp->f_flags & O_NONBLOCK) ||
scsi_block_when_processing_errors(sdp->device)))
@@ -764,7 +773,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
sg_finish_rem_req(srp);
return k; /* probably out of space --> ENOMEM */
}
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
if (srp->bio)
blk_end_request_all(srp->rq, -EIO);
sg_finish_rem_req(srp);
@@ -826,7 +835,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
switch (cmd_in) {
case SG_IO:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (!scsi_block_when_processing_errors(sdp->device))
return -ENXIO;
@@ -837,8 +846,8 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
- (srp_done(sfp, srp) || sdp->detached));
- if (sdp->detached)
+ (srp_done(sfp, srp) || atomic_read(&sdp->detaching)));
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
write_lock_irq(&sfp->rq_list_lock);
if (srp->done) {
@@ -877,7 +886,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
sg_build_reserve(sfp, val);
}
} else {
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
sfp->low_dma = sdp->device->host->unchecked_isa_dma;
}
@@ -890,7 +899,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
else {
sg_scsi_id_t __user *sg_idp = p;
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
__put_user((int) sdp->device->host->host_no,
&sg_idp->host_no);
@@ -1032,11 +1041,11 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
return result;
}
case SG_EMULATED_HOST:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
return put_user(sdp->device->host->hostt->emulated, ip);
case SG_SCSI_RESET:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (filp->f_flags & O_NONBLOCK) {
if (scsi_host_in_recovery(sdp->device->host))
@@ -1069,7 +1078,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
return (scsi_reset_provider(sdp->device, val) ==
SUCCESS) ? 0 : -EIO;
case SCSI_IOCTL_SEND_COMMAND:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (read_only) {
unsigned char opcode = WRITE_6;
@@ -1091,7 +1100,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
case SCSI_IOCTL_GET_BUS_NUMBER:
case SCSI_IOCTL_PROBE_HOST:
case SG_GET_TRANSFORM:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
return scsi_ioctl(sdp->device, cmd_in, p);
case BLKSECTGET:
@@ -1165,7 +1174,7 @@ sg_poll(struct file *filp, poll_table * wait)
}
read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
res |= POLLHUP;
else if (!sfp->cmd_q) {
if (0 == count)
@@ -1264,7 +1273,8 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
return 0;
}
-static void sg_rq_end_io_usercontext(struct work_struct *work)
+static void
+sg_rq_end_io_usercontext(struct work_struct *work)
{
struct sg_request *srp = container_of(work, struct sg_request, ew.work);
struct sg_fd *sfp = srp->parentfp;
@@ -1277,7 +1287,8 @@ static void sg_rq_end_io_usercontext(struct work_struct *work)
* This function is a "bottom half" handler that is called by the mid
* level when a command is completed (or has failed).
*/
-static void sg_rq_end_io(struct request *rq, int uptodate)
+static void
+sg_rq_end_io(struct request *rq, int uptodate)
{
struct sg_request *srp = rq->end_io_data;
Sg_device *sdp;
@@ -1295,8 +1306,8 @@ static void sg_rq_end_io(struct request *rq, int uptodate)
return;
sdp = sfp->parentdp;
- if (unlikely(sdp->detached))
- printk(KERN_INFO "sg_rq_end_io: device detached\n");
+ if (unlikely(atomic_read(&sdp->detaching)))
+ pr_info("%s: device detaching\n", __func__);
sense = rq->sense;
result = rq->errors;
@@ -1319,7 +1330,7 @@ static void sg_rq_end_io(struct request *rq, int uptodate)
if ((sdp->sgdebug > 0) &&
((CHECK_CONDITION == srp->header.masked_status) ||
(COMMAND_TERMINATED == srp->header.masked_status)))
- __scsi_print_sense("sg_cmd_done", sense,
+ __scsi_print_sense(__func__, sense,
SCSI_SENSE_BUFFERSIZE);
/* Following if statement is a patch supplied by Eric Youngdale */
@@ -1378,7 +1389,8 @@ static struct class *sg_sysfs_class;
static int sg_sysfs_valid = 0;
-static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
+static Sg_device *
+sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
{
struct request_queue *q = scsidp->request_queue;
Sg_device *sdp;
@@ -1388,7 +1400,8 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
sdp = kzalloc(sizeof(Sg_device), GFP_KERNEL);
if (!sdp) {
- printk(KERN_WARNING "kmalloc Sg_device failure\n");
+ sdev_printk(KERN_WARNING, scsidp, "%s: kmalloc Sg_device "
+ "failure\n", __func__);
return ERR_PTR(-ENOMEM);
}
@@ -1403,8 +1416,9 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
scsidp->type, SG_MAX_DEVS - 1);
error = -ENODEV;
} else {
- printk(KERN_WARNING
- "idr allocation Sg_device failure: %d\n", error);
+ sdev_printk(KERN_WARNING, scsidp, "%s: idr "
+ "allocation Sg_device failure: %d\n",
+ __func__, error);
}
goto out_unlock;
}
@@ -1415,8 +1429,13 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+ mutex_init(&sdp->open_rel_lock);
INIT_LIST_HEAD(&sdp->sfds);
- init_waitqueue_head(&sdp->o_excl_wait);
+ init_waitqueue_head(&sdp->open_wait);
+ atomic_set(&sdp->detaching, 0);
+ atomic_set(&sdp->exclude, 0);
+ atomic_set(&sdp->open_cnt, 0);
+ rwlock_init(&sdp->sfd_lock);
sdp->sg_tablesize = queue_max_segments(q);
sdp->index = k;
kref_init(&sdp->d_ref);
@@ -1434,7 +1453,7 @@ out_unlock:
}
static int
-sg_add(struct device *cl_dev, struct class_interface *cl_intf)
+sg_add_device(struct device *cl_dev, struct class_interface *cl_intf)
{
struct scsi_device *scsidp = to_scsi_device(cl_dev->parent);
struct gendisk *disk;
@@ -1445,7 +1464,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
disk = alloc_disk(1);
if (!disk) {
- printk(KERN_WARNING "alloc_disk failed\n");
+ pr_warn("%s: alloc_disk failed\n", __func__);
return -ENOMEM;
}
disk->major = SCSI_GENERIC_MAJOR;
@@ -1453,7 +1472,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
error = -ENOMEM;
cdev = cdev_alloc();
if (!cdev) {
- printk(KERN_WARNING "cdev_alloc failed\n");
+ pr_warn("%s: cdev_alloc failed\n", __func__);
goto out;
}
cdev->owner = THIS_MODULE;
@@ -1461,7 +1480,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
sdp = sg_alloc(disk, scsidp);
if (IS_ERR(sdp)) {
- printk(KERN_WARNING "sg_alloc failed\n");
+ pr_warn("%s: sg_alloc failed\n", __func__);
error = PTR_ERR(sdp);
goto out;
}
@@ -1479,22 +1498,20 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
sdp->index),
sdp, "%s", disk->disk_name);
if (IS_ERR(sg_class_member)) {
- printk(KERN_ERR "sg_add: "
- "device_create failed\n");
+ pr_err("%s: device_create failed\n", __func__);
error = PTR_ERR(sg_class_member);
goto cdev_add_err;
}
error = sysfs_create_link(&scsidp->sdev_gendev.kobj,
&sg_class_member->kobj, "generic");
if (error)
- printk(KERN_ERR "sg_add: unable to make symlink "
- "'generic' back to sg%d\n", sdp->index);
+ pr_err("%s: unable to make symlink 'generic' back "
+ "to sg%d\n", __func__, sdp->index);
} else
- printk(KERN_WARNING "sg_add: sg_sys Invalid\n");
+ pr_warn("%s: sg_sys Invalid\n", __func__);
- sdev_printk(KERN_NOTICE, scsidp,
- "Attached scsi generic sg%d type %d\n", sdp->index,
- scsidp->type);
+ sdev_printk(KERN_NOTICE, scsidp, "Attached scsi generic sg%d "
+ "type %d\n", sdp->index, scsidp->type);
dev_set_drvdata(cl_dev, sdp);
@@ -1513,7 +1530,8 @@ out:
return error;
}
-static void sg_device_destroy(struct kref *kref)
+static void
+sg_device_destroy(struct kref *kref)
{
struct sg_device *sdp = container_of(kref, struct sg_device, d_ref);
unsigned long flags;
@@ -1535,33 +1553,39 @@ static void sg_device_destroy(struct kref *kref)
kfree(sdp);
}
-static void sg_remove(struct device *cl_dev, struct class_interface *cl_intf)
+static void
+sg_remove_device(struct device *cl_dev, struct class_interface *cl_intf)
{
struct scsi_device *scsidp = to_scsi_device(cl_dev->parent);
Sg_device *sdp = dev_get_drvdata(cl_dev);
unsigned long iflags;
Sg_fd *sfp;
+ int val;
- if (!sdp || sdp->detached)
+ if (!sdp)
return;
+ /* want sdp->detaching non-zero as soon as possible */
+ val = atomic_inc_return(&sdp->detaching);
+ if (val > 1)
+ return; /* only want to do following once per device */
- SCSI_LOG_TIMEOUT(3, printk("sg_remove: %s\n", sdp->disk->disk_name));
+ SCSI_LOG_TIMEOUT(3, printk("%s: %s\n", __func__,
+ sdp->disk->disk_name));
- /* Need a write lock to set sdp->detached. */
- write_lock_irqsave(&sg_index_lock, iflags);
- sdp->detached = 1;
+ read_lock_irqsave(&sdp->sfd_lock, iflags);
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
- wake_up_interruptible(&sfp->read_wait);
+ wake_up_interruptible_all(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
- write_unlock_irqrestore(&sg_index_lock, iflags);
+ wake_up_interruptible_all(&sdp->open_wait);
+ read_unlock_irqrestore(&sdp->sfd_lock, iflags);
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, sdp->index));
cdev_del(sdp->cdev);
sdp->cdev = NULL;
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
}
module_param_named(scatter_elem_sz, scatter_elem_sz, int, S_IRUGO | S_IWUSR);
@@ -1631,7 +1655,8 @@ exit_sg(void)
idr_destroy(&sg_index_idr);
}
-static int sg_start_req(Sg_request *srp, unsigned char *cmd)
+static int
+sg_start_req(Sg_request *srp, unsigned char *cmd)
{
int res;
struct request *rq;
@@ -1726,7 +1751,8 @@ static int sg_start_req(Sg_request *srp, unsigned char *cmd)
return res;
}
-static int sg_finish_rem_req(Sg_request * srp)
+static int
+sg_finish_rem_req(Sg_request *srp)
{
int ret = 0;
@@ -2063,7 +2089,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
- return NULL;
+ return ERR_PTR(-ENOMEM);
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2077,9 +2103,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
- write_lock_irqsave(&sg_index_lock, iflags);
+ write_lock_irqsave(&sdp->sfd_lock, iflags);
+ if (atomic_read(&sdp->detaching)) {
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
+ return ERR_PTR(-ENODEV);
+ }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
- write_unlock_irqrestore(&sg_index_lock, iflags);
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2095,7 +2125,8 @@ sg_add_sfp(Sg_device * sdp, int dev)
return sfp;
}
-static void sg_remove_sfp_usercontext(struct work_struct *work)
+static void
+sg_remove_sfp_usercontext(struct work_struct *work)
{
struct sg_fd *sfp = container_of(work, struct sg_fd, ew.work);
struct sg_device *sdp = sfp->parentdp;
@@ -2119,20 +2150,20 @@ static void sg_remove_sfp_usercontext(struct work_struct *work)
kfree(sfp);
scsi_device_put(sdp->device);
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
module_put(THIS_MODULE);
}
-static void sg_remove_sfp(struct kref *kref)
+static void
+sg_remove_sfp(struct kref *kref)
{
struct sg_fd *sfp = container_of(kref, struct sg_fd, f_ref);
struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
- write_lock_irqsave(&sg_index_lock, iflags);
+ write_lock_irqsave(&sdp->sfd_lock, iflags);
list_del(&sfp->sfd_siblings);
- write_unlock_irqrestore(&sg_index_lock, iflags);
- wake_up_interruptible(&sdp->o_excl_wait);
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
INIT_WORK(&sfp->ew.work, sg_remove_sfp_usercontext);
schedule_work(&sfp->ew.work);
@@ -2183,7 +2214,8 @@ static Sg_device *sg_lookup_dev(int dev)
return idr_find(&sg_index_idr, dev);
}
-static Sg_device *sg_get_dev(int dev)
+static Sg_device *
+sg_get_dev(int dev)
{
struct sg_device *sdp;
unsigned long flags;
@@ -2192,8 +2224,8 @@ static Sg_device *sg_get_dev(int dev)
sdp = sg_lookup_dev(dev);
if (!sdp)
sdp = ERR_PTR(-ENXIO);
- else if (sdp->detached) {
- /* If sdp->detached, then the refcount may already be 0, in
+ else if (atomic_read(&sdp->detaching)) {
+ /* If sdp->detaching, then the refcount may already be 0, in
* which case it would be a bug to do kref_get().
*/
sdp = ERR_PTR(-ENODEV);
@@ -2204,11 +2236,6 @@ static Sg_device *sg_get_dev(int dev)
return sdp;
}
-static void sg_put_dev(struct sg_device *sdp)
-{
- kref_put(&sdp->d_ref, sg_device_destroy);
-}
-
#ifdef CONFIG_SCSI_PROC_FS
static struct proc_dir_entry *sg_proc_sgp = NULL;
@@ -2425,8 +2452,7 @@ static int sg_proc_single_open_version(struct inode *inode, struct file *file)
static int sg_proc_seq_show_devhdr(struct seq_file *s, void *v)
{
- seq_printf(s, "host\tchan\tid\tlun\ttype\topens\tqdepth\tbusy\t"
- "online\n");
+ seq_puts(s, "host\tchan\tid\tlun\ttype\topens\tqdepth\tbusy\tonline\n");
return 0;
}
@@ -2482,7 +2508,11 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v)
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && (scsidp = sdp->device) && (!sdp->detached))
+ if ((NULL == sdp) || (NULL == sdp->device) ||
+ (atomic_read(&sdp->detaching)))
+ seq_puts(s, "-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\n");
+ else {
+ scsidp = sdp->device;
seq_printf(s, "%d\t%d\t%d\t%d\t%d\t%d\t%d\t%d\t%d\n",
scsidp->host->host_no, scsidp->channel,
scsidp->id, scsidp->lun, (int) scsidp->type,
@@ -2490,8 +2520,7 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v)
(int) scsidp->queue_depth,
(int) scsidp->device_busy,
(int) scsi_device_online(scsidp));
- else
- seq_printf(s, "-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\n");
+ }
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
@@ -2510,11 +2539,12 @@ static int sg_proc_seq_show_devstrs(struct seq_file *s, void *v)
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && (scsidp = sdp->device) && (!sdp->detached))
+ scsidp = sdp ? sdp->device : NULL;
+ if (sdp && scsidp && (!atomic_read(&sdp->detaching)))
seq_printf(s, "%8.8s\t%16.16s\t%4.4s\n",
scsidp->vendor, scsidp->model, scsidp->rev);
else
- seq_printf(s, "<no active device>\n");
+ seq_puts(s, "<no active device>\n");
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
@@ -2559,12 +2589,12 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp)
else
cp = " ";
}
- seq_printf(s, cp);
+ seq_puts(s, cp);
blen = srp->data.bufflen;
usg = srp->data.k_use_sg;
- seq_printf(s, srp->done ?
- ((1 == srp->done) ? "rcv:" : "fin:")
- : "act:");
+ seq_puts(s, srp->done ?
+ ((1 == srp->done) ? "rcv:" : "fin:")
+ : "act:");
seq_printf(s, " id=%d blen=%d",
srp->header.pack_id, blen);
if (srp->done)
@@ -2580,7 +2610,7 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp)
(int) srp->data.cmd_opcode);
}
if (0 == m)
- seq_printf(s, " No requests active\n");
+ seq_puts(s, " No requests active\n");
read_unlock(&fp->rq_list_lock);
}
}
@@ -2596,31 +2626,35 @@ static int sg_proc_seq_show_debug(struct seq_file *s, void *v)
Sg_device *sdp;
unsigned long iflags;
- if (it && (0 == it->index)) {
- seq_printf(s, "max_active_device=%d(origin 1)\n",
- (int)it->max);
- seq_printf(s, " def_reserved_size=%d\n", sg_big_buff);
- }
+ if (it && (0 == it->index))
+ seq_printf(s, "max_active_device=%d def_reserved_size=%d\n",
+ (int)it->max, sg_big_buff);
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && !list_empty(&sdp->sfds)) {
- struct scsi_device *scsidp = sdp->device;
-
+ if (NULL == sdp)
+ goto skip;
+ read_lock(&sdp->sfd_lock);
+ if (!list_empty(&sdp->sfds)) {
seq_printf(s, " >>> device=%s ", sdp->disk->disk_name);
- if (sdp->detached)
- seq_printf(s, "detached pending close ");
- else
- seq_printf
- (s, "scsi%d chan=%d id=%d lun=%d em=%d",
- scsidp->host->host_no,
- scsidp->channel, scsidp->id,
- scsidp->lun,
- scsidp->host->hostt->emulated);
- seq_printf(s, " sg_tablesize=%d excl=%d\n",
- sdp->sg_tablesize, get_exclude(sdp));
+ if (atomic_read(&sdp->detaching))
+ seq_puts(s, "detaching pending close ");
+ else if (sdp->device) {
+ struct scsi_device *scsidp = sdp->device;
+
+ seq_printf(s, "%d:%d:%d:%d em=%d",
+ scsidp->host->host_no,
+ scsidp->channel, scsidp->id,
+ scsidp->lun,
+ scsidp->host->hostt->emulated);
+ }
+ seq_printf(s, " sg_tablesize=%d excl=%d open_cnt=%d\n",
+ sdp->sg_tablesize, atomic_read(&sdp->exclude),
+ atomic_read(&sdp->open_cnt));
sg_proc_debug_helper(s, sdp);
}
+ read_unlock(&sdp->sfd_lock);
+skip:
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v4] sg: O_EXCL and other lock handling
2014-06-13 0:26 [PATCH v4] sg: O_EXCL and other lock handling Douglas Gilbert
@ 2014-06-13 9:03 ` Christoph Hellwig
2014-06-13 14:33 ` Douglas Gilbert
2014-06-13 20:18 ` Douglas Gilbert
2014-06-21 11:01 ` Hannes Reinecke
1 sibling, 2 replies; 6+ messages in thread
From: Christoph Hellwig @ 2014-06-13 9:03 UTC (permalink / raw)
To: Douglas Gilbert
Cc: SCSI development list, Christoph Hellwig, James Bottomley,
Hannes Reinecke, vaughan
Hi Doug,
this looks generally good to me, but I don't think open_cnt and exclude
have to use atomic_t, as they are only ever modified under open_rel_lock.
Can you take a look at the version below? This changes open_cnt to an
int, exclude to a bool, removes the open_cnt underflow check that
the VFS takes care for, and streamlines the open path a little bit:
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index e75f71e..0cd77a3 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -51,6 +51,7 @@ static int sg_version_num = 30534; /* 2 digits for each component */
#include <linux/delay.h>
#include <linux/blktrace_api.h>
#include <linux/mutex.h>
+#include <linux/atomic.h>
#include <linux/ratelimit.h>
#include "scsi.h"
@@ -102,18 +103,16 @@ static int scatter_elem_sz_prev = SG_SCATTER_SZ;
#define SG_SECTOR_SZ 512
-static int sg_add(struct device *, struct class_interface *);
-static void sg_remove(struct device *, struct class_interface *);
-
-static DEFINE_SPINLOCK(sg_open_exclusive_lock);
+static int sg_add_device(struct device *, struct class_interface *);
+static void sg_remove_device(struct device *, struct class_interface *);
static DEFINE_IDR(sg_index_idr);
static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock
file descriptor list for device */
static struct class_interface sg_interface = {
- .add_dev = sg_add,
- .remove_dev = sg_remove,
+ .add_dev = sg_add_device,
+ .remove_dev = sg_remove_device,
};
typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */
@@ -146,8 +145,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests outstanding per file */
} Sg_request;
typedef struct sg_fd { /* holds the state of a file descriptor */
- /* sfd_siblings is protected by sg_index_lock */
- struct list_head sfd_siblings;
+ struct list_head sfd_siblings; /* protected by device's sfd_lock */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait; /* queue read until command done */
rwlock_t rq_list_lock; /* protect access to list in req_arr */
@@ -170,14 +168,15 @@ typedef struct sg_fd { /* holds the state of a file descriptor */
typedef struct sg_device { /* holds the state of each scsi generic device */
struct scsi_device *device;
- wait_queue_head_t o_excl_wait; /* queue open() when O_EXCL in use */
+ wait_queue_head_t open_wait; /* queue open() when O_EXCL present */
+ struct mutex open_rel_lock; /* held when in open() or release() */
int sg_tablesize; /* adapter's max scatter-gather table size */
u32 index; /* device index number */
- /* sfds is protected by sg_index_lock */
struct list_head sfds;
- volatile char detached; /* 0->attached, 1->detached pending removal */
- /* exclude protected by sg_open_exclusive_lock */
- char exclude; /* opened for exclusive access */
+ rwlock_t sfd_lock; /* protect access to sfd list */
+ atomic_t detaching; /* 0->device usable, 1->device detaching */
+ bool exclude; /* 1->open(O_EXCL) succeeded and is active */
+ int open_cnt; /* count of opens (perhaps < num(sfds) ) */
char sgdebug; /* 0->off, 1->sense, 9->dump dev, 10-> all devs */
struct gendisk *disk;
struct cdev * cdev; /* char_dev [sysfs: /sys/cdev/major/sg<n>] */
@@ -208,7 +207,7 @@ static Sg_request *sg_add_request(Sg_fd * sfp);
static int sg_remove_request(Sg_fd * sfp, Sg_request * srp);
static int sg_res_in_use(Sg_fd * sfp);
static Sg_device *sg_get_dev(int dev);
-static void sg_put_dev(Sg_device *sdp);
+static void sg_device_destroy(struct kref *kref);
#define SZ_SG_HEADER sizeof(struct sg_header)
#define SZ_SG_IO_HDR sizeof(sg_io_hdr_t)
@@ -225,38 +224,43 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd)
return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
}
-static int get_exclude(Sg_device *sdp)
-{
- unsigned long flags;
- int ret;
-
- spin_lock_irqsave(&sg_open_exclusive_lock, flags);
- ret = sdp->exclude;
- spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
- return ret;
-}
-
-static int set_exclude(Sg_device *sdp, char val)
+static int
+open_wait(Sg_device *sdp, int flags)
{
- unsigned long flags;
-
- spin_lock_irqsave(&sg_open_exclusive_lock, flags);
- sdp->exclude = val;
- spin_unlock_irqrestore(&sg_open_exclusive_lock, flags);
- return val;
-}
+ int retval = 0;
-static int sfds_list_empty(Sg_device *sdp)
-{
- unsigned long flags;
- int ret;
+ if (flags & O_EXCL) {
+ while (sdp->open_cnt > 0) {
+ mutex_unlock(&sdp->open_rel_lock);
+ retval = wait_event_interruptible(sdp->open_wait,
+ (atomic_read(&sdp->detaching) ||
+ !sdp->open_cnt));
+ mutex_lock(&sdp->open_rel_lock);
+
+ if (retval) /* -ERESTARTSYS */
+ return retval;
+ if (atomic_read(&sdp->detaching))
+ return -ENODEV;
+ }
+ } else {
+ while (sdp->exclude) {
+ mutex_unlock(&sdp->open_rel_lock);
+ retval = wait_event_interruptible(sdp->open_wait,
+ (atomic_read(&sdp->detaching) ||
+ !sdp->exclude));
+ mutex_lock(&sdp->open_rel_lock);
+
+ if (retval) /* -ERESTARTSYS */
+ return retval;
+ if (atomic_read(&sdp->detaching))
+ return -ENODEV;
+ }
+ }
- read_lock_irqsave(&sg_index_lock, flags);
- ret = list_empty(&sdp->sfds);
- read_unlock_irqrestore(&sg_index_lock, flags);
- return ret;
+ return retval;
}
+/* Returns 0 on success, else a negated errno value */
static int
sg_open(struct inode *inode, struct file *filp)
{
@@ -265,17 +269,15 @@ sg_open(struct inode *inode, struct file *filp)
struct request_queue *q;
Sg_device *sdp;
Sg_fd *sfp;
- int res;
int retval;
nonseekable_open(inode, filp);
SCSI_LOG_TIMEOUT(3, printk("sg_open: dev=%d, flags=0x%x\n", dev, flags));
+ if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE)))
+ return -EPERM; /* Can't lock it with read only access */
sdp = sg_get_dev(dev);
- if (IS_ERR(sdp)) {
- retval = PTR_ERR(sdp);
- sdp = NULL;
- goto sg_put;
- }
+ if (IS_ERR(sdp))
+ return PTR_ERR(sdp);
/* This driver's module count bumped by fops_get in <linux/fs.h> */
/* Prevent the device driver from vanishing while we sleep */
@@ -287,6 +289,9 @@ sg_open(struct inode *inode, struct file *filp)
if (retval)
goto sdp_put;
+ /* scsi_block_when_processing_errors() may block so bypass
+ * check if O_NONBLOCK. Permits SCSI commands to be issued
+ * during error recovery. Tread carefully. */
if (!((flags & O_NONBLOCK) ||
scsi_block_when_processing_errors(sdp->device))) {
retval = -ENXIO;
@@ -294,65 +299,65 @@ sg_open(struct inode *inode, struct file *filp)
goto error_out;
}
- if (flags & O_EXCL) {
- if (O_RDONLY == (flags & O_ACCMODE)) {
- retval = -EPERM; /* Can't lock it with read only access */
- goto error_out;
- }
- if (!sfds_list_empty(sdp) && (flags & O_NONBLOCK)) {
- retval = -EBUSY;
- goto error_out;
- }
- res = wait_event_interruptible(sdp->o_excl_wait,
- ((!sfds_list_empty(sdp) || get_exclude(sdp)) ? 0 : set_exclude(sdp, 1)));
- if (res) {
- retval = res; /* -ERESTARTSYS because signal hit process */
- goto error_out;
- }
- } else if (get_exclude(sdp)) { /* some other fd has an exclusive lock on dev */
- if (flags & O_NONBLOCK) {
- retval = -EBUSY;
- goto error_out;
- }
- res = wait_event_interruptible(sdp->o_excl_wait, !get_exclude(sdp));
- if (res) {
- retval = res; /* -ERESTARTSYS because signal hit process */
- goto error_out;
+ mutex_lock(&sdp->open_rel_lock);
+ if (flags & O_NONBLOCK) {
+ if (flags & O_EXCL) {
+ if (sdp->open_cnt > 0) {
+ retval = -EBUSY;
+ goto error_mutex_locked;
+ }
+ } else {
+ if (sdp->exclude) {
+ retval = -EBUSY;
+ goto error_mutex_locked;
+ }
}
+ } else {
+ retval = open_wait(sdp, flags);
+ if (retval) /* -ERESTARTSYS or -ENODEV */
+ goto error_mutex_locked;
}
- if (sdp->detached) {
- retval = -ENODEV;
- goto error_out;
- }
- if (sfds_list_empty(sdp)) { /* no existing opens on this device */
+
+ /* N.B. at this point we are holding the open_rel_lock */
+ if (flags & O_EXCL)
+ sdp->exclude = true;
+
+ if (sdp->open_cnt < 1) { /* no existing opens */
sdp->sgdebug = 0;
q = sdp->device->request_queue;
sdp->sg_tablesize = queue_max_segments(q);
}
- if ((sfp = sg_add_sfp(sdp, dev)))
- filp->private_data = sfp;
- else {
- if (flags & O_EXCL) {
- set_exclude(sdp, 0); /* undo if error */
- wake_up_interruptible(&sdp->o_excl_wait);
- }
- retval = -ENOMEM;
- goto error_out;
+ sfp = sg_add_sfp(sdp, dev);
+ if (IS_ERR(sfp)) {
+ retval = PTR_ERR(sfp);
+ goto out_undo;
}
+
+ filp->private_data = sfp;
+ sdp->open_cnt++;
+ mutex_unlock(&sdp->open_rel_lock);
+
retval = 0;
-error_out:
- if (retval) {
- scsi_autopm_put_device(sdp->device);
-sdp_put:
- scsi_device_put(sdp->device);
- }
sg_put:
- if (sdp)
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
return retval;
+
+out_undo:
+ if (flags & O_EXCL) {
+ sdp->exclude = false; /* undo if error */
+ wake_up_interruptible(&sdp->open_wait);
+ }
+error_mutex_locked:
+ mutex_unlock(&sdp->open_rel_lock);
+error_out:
+ scsi_autopm_put_device(sdp->device);
+sdp_put:
+ scsi_device_put(sdp->device);
+ goto sg_put;
}
-/* Following function was formerly called 'sg_close' */
+/* Release resources associated with a successful sg_open()
+ * Returns 0 on success, else a negated errno value */
static int
sg_release(struct inode *inode, struct file *filp)
{
@@ -363,11 +368,20 @@ sg_release(struct inode *inode, struct file *filp)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
- set_exclude(sdp, 0);
- wake_up_interruptible(&sdp->o_excl_wait);
-
+ mutex_lock(&sdp->open_rel_lock);
scsi_autopm_put_device(sdp->device);
kref_put(&sfp->f_ref, sg_remove_sfp);
+ sdp->open_cnt--;
+
+ /* possibly many open()s waiting on exlude clearing, start many;
+ * only open(O_EXCL)s wait on 0==open_cnt so only start one */
+ if (sdp->exclude) {
+ sdp->exclude = false;
+ wake_up_interruptible_all(&sdp->open_wait);
+ } else if (0 == sdp->open_cnt) {
+ wake_up_interruptible(&sdp->open_wait);
+ }
+ mutex_unlock(&sdp->open_rel_lock);
return 0;
}
@@ -419,7 +433,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
}
srp = sg_get_rq_mark(sfp, req_pack_id);
if (!srp) { /* now wait on packet to arrive */
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
retval = -ENODEV;
goto free_old_hdr;
}
@@ -428,9 +442,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
goto free_old_hdr;
}
retval = wait_event_interruptible(sfp->read_wait,
- (sdp->detached ||
+ (atomic_read(&sdp->detaching) ||
(srp = sg_get_rq_mark(sfp, req_pack_id))));
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
retval = -ENODEV;
goto free_old_hdr;
}
@@ -572,7 +586,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
return -ENXIO;
SCSI_LOG_TIMEOUT(3, printk("sg_write: %s, count=%d\n",
sdp->disk->disk_name, (int) count));
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (!((filp->f_flags & O_NONBLOCK) ||
scsi_block_when_processing_errors(sdp->device)))
@@ -764,7 +778,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
sg_finish_rem_req(srp);
return k; /* probably out of space --> ENOMEM */
}
- if (sdp->detached) {
+ if (atomic_read(&sdp->detaching)) {
if (srp->bio)
blk_end_request_all(srp->rq, -EIO);
sg_finish_rem_req(srp);
@@ -826,7 +840,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
switch (cmd_in) {
case SG_IO:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (!scsi_block_when_processing_errors(sdp->device))
return -ENXIO;
@@ -837,8 +851,8 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
- (srp_done(sfp, srp) || sdp->detached));
- if (sdp->detached)
+ (srp_done(sfp, srp) || atomic_read(&sdp->detaching)));
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
write_lock_irq(&sfp->rq_list_lock);
if (srp->done) {
@@ -877,7 +891,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
sg_build_reserve(sfp, val);
}
} else {
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
sfp->low_dma = sdp->device->host->unchecked_isa_dma;
}
@@ -890,7 +904,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
else {
sg_scsi_id_t __user *sg_idp = p;
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
__put_user((int) sdp->device->host->host_no,
&sg_idp->host_no);
@@ -1032,11 +1046,11 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
return result;
}
case SG_EMULATED_HOST:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
return put_user(sdp->device->host->hostt->emulated, ip);
case SG_SCSI_RESET:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (filp->f_flags & O_NONBLOCK) {
if (scsi_host_in_recovery(sdp->device->host))
@@ -1069,7 +1083,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
return (scsi_reset_provider(sdp->device, val) ==
SUCCESS) ? 0 : -EIO;
case SCSI_IOCTL_SEND_COMMAND:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
if (read_only) {
unsigned char opcode = WRITE_6;
@@ -1091,7 +1105,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
case SCSI_IOCTL_GET_BUS_NUMBER:
case SCSI_IOCTL_PROBE_HOST:
case SG_GET_TRANSFORM:
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
return -ENODEV;
return scsi_ioctl(sdp->device, cmd_in, p);
case BLKSECTGET:
@@ -1165,7 +1179,7 @@ sg_poll(struct file *filp, poll_table * wait)
}
read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
- if (sdp->detached)
+ if (atomic_read(&sdp->detaching))
res |= POLLHUP;
else if (!sfp->cmd_q) {
if (0 == count)
@@ -1264,7 +1278,8 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
return 0;
}
-static void sg_rq_end_io_usercontext(struct work_struct *work)
+static void
+sg_rq_end_io_usercontext(struct work_struct *work)
{
struct sg_request *srp = container_of(work, struct sg_request, ew.work);
struct sg_fd *sfp = srp->parentfp;
@@ -1277,7 +1292,8 @@ static void sg_rq_end_io_usercontext(struct work_struct *work)
* This function is a "bottom half" handler that is called by the mid
* level when a command is completed (or has failed).
*/
-static void sg_rq_end_io(struct request *rq, int uptodate)
+static void
+sg_rq_end_io(struct request *rq, int uptodate)
{
struct sg_request *srp = rq->end_io_data;
Sg_device *sdp;
@@ -1295,8 +1311,8 @@ static void sg_rq_end_io(struct request *rq, int uptodate)
return;
sdp = sfp->parentdp;
- if (unlikely(sdp->detached))
- printk(KERN_INFO "sg_rq_end_io: device detached\n");
+ if (unlikely(atomic_read(&sdp->detaching)))
+ pr_info("%s: device detaching\n", __func__);
sense = rq->sense;
result = rq->errors;
@@ -1319,7 +1335,7 @@ static void sg_rq_end_io(struct request *rq, int uptodate)
if ((sdp->sgdebug > 0) &&
((CHECK_CONDITION == srp->header.masked_status) ||
(COMMAND_TERMINATED == srp->header.masked_status)))
- __scsi_print_sense("sg_cmd_done", sense,
+ __scsi_print_sense(__func__, sense,
SCSI_SENSE_BUFFERSIZE);
/* Following if statement is a patch supplied by Eric Youngdale */
@@ -1378,7 +1394,8 @@ static struct class *sg_sysfs_class;
static int sg_sysfs_valid = 0;
-static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
+static Sg_device *
+sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
{
struct request_queue *q = scsidp->request_queue;
Sg_device *sdp;
@@ -1388,7 +1405,8 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
sdp = kzalloc(sizeof(Sg_device), GFP_KERNEL);
if (!sdp) {
- printk(KERN_WARNING "kmalloc Sg_device failure\n");
+ sdev_printk(KERN_WARNING, scsidp, "%s: kmalloc Sg_device "
+ "failure\n", __func__);
return ERR_PTR(-ENOMEM);
}
@@ -1403,8 +1421,9 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
scsidp->type, SG_MAX_DEVS - 1);
error = -ENODEV;
} else {
- printk(KERN_WARNING
- "idr allocation Sg_device failure: %d\n", error);
+ sdev_printk(KERN_WARNING, scsidp, "%s: idr "
+ "allocation Sg_device failure: %d\n",
+ __func__, error);
}
goto out_unlock;
}
@@ -1415,8 +1434,11 @@ static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
disk->first_minor = k;
sdp->disk = disk;
sdp->device = scsidp;
+ mutex_init(&sdp->open_rel_lock);
INIT_LIST_HEAD(&sdp->sfds);
- init_waitqueue_head(&sdp->o_excl_wait);
+ init_waitqueue_head(&sdp->open_wait);
+ atomic_set(&sdp->detaching, 0);
+ rwlock_init(&sdp->sfd_lock);
sdp->sg_tablesize = queue_max_segments(q);
sdp->index = k;
kref_init(&sdp->d_ref);
@@ -1434,7 +1456,7 @@ out_unlock:
}
static int
-sg_add(struct device *cl_dev, struct class_interface *cl_intf)
+sg_add_device(struct device *cl_dev, struct class_interface *cl_intf)
{
struct scsi_device *scsidp = to_scsi_device(cl_dev->parent);
struct gendisk *disk;
@@ -1445,7 +1467,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
disk = alloc_disk(1);
if (!disk) {
- printk(KERN_WARNING "alloc_disk failed\n");
+ pr_warn("%s: alloc_disk failed\n", __func__);
return -ENOMEM;
}
disk->major = SCSI_GENERIC_MAJOR;
@@ -1453,7 +1475,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
error = -ENOMEM;
cdev = cdev_alloc();
if (!cdev) {
- printk(KERN_WARNING "cdev_alloc failed\n");
+ pr_warn("%s: cdev_alloc failed\n", __func__);
goto out;
}
cdev->owner = THIS_MODULE;
@@ -1461,7 +1483,7 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
sdp = sg_alloc(disk, scsidp);
if (IS_ERR(sdp)) {
- printk(KERN_WARNING "sg_alloc failed\n");
+ pr_warn("%s: sg_alloc failed\n", __func__);
error = PTR_ERR(sdp);
goto out;
}
@@ -1479,22 +1501,20 @@ sg_add(struct device *cl_dev, struct class_interface *cl_intf)
sdp->index),
sdp, "%s", disk->disk_name);
if (IS_ERR(sg_class_member)) {
- printk(KERN_ERR "sg_add: "
- "device_create failed\n");
+ pr_err("%s: device_create failed\n", __func__);
error = PTR_ERR(sg_class_member);
goto cdev_add_err;
}
error = sysfs_create_link(&scsidp->sdev_gendev.kobj,
&sg_class_member->kobj, "generic");
if (error)
- printk(KERN_ERR "sg_add: unable to make symlink "
- "'generic' back to sg%d\n", sdp->index);
+ pr_err("%s: unable to make symlink 'generic' back "
+ "to sg%d\n", __func__, sdp->index);
} else
- printk(KERN_WARNING "sg_add: sg_sys Invalid\n");
+ pr_warn("%s: sg_sys Invalid\n", __func__);
- sdev_printk(KERN_NOTICE, scsidp,
- "Attached scsi generic sg%d type %d\n", sdp->index,
- scsidp->type);
+ sdev_printk(KERN_NOTICE, scsidp, "Attached scsi generic sg%d "
+ "type %d\n", sdp->index, scsidp->type);
dev_set_drvdata(cl_dev, sdp);
@@ -1513,7 +1533,8 @@ out:
return error;
}
-static void sg_device_destroy(struct kref *kref)
+static void
+sg_device_destroy(struct kref *kref)
{
struct sg_device *sdp = container_of(kref, struct sg_device, d_ref);
unsigned long flags;
@@ -1535,33 +1556,39 @@ static void sg_device_destroy(struct kref *kref)
kfree(sdp);
}
-static void sg_remove(struct device *cl_dev, struct class_interface *cl_intf)
+static void
+sg_remove_device(struct device *cl_dev, struct class_interface *cl_intf)
{
struct scsi_device *scsidp = to_scsi_device(cl_dev->parent);
Sg_device *sdp = dev_get_drvdata(cl_dev);
unsigned long iflags;
Sg_fd *sfp;
+ int val;
- if (!sdp || sdp->detached)
+ if (!sdp)
return;
+ /* want sdp->detaching non-zero as soon as possible */
+ val = atomic_inc_return(&sdp->detaching);
+ if (val > 1)
+ return; /* only want to do following once per device */
- SCSI_LOG_TIMEOUT(3, printk("sg_remove: %s\n", sdp->disk->disk_name));
+ SCSI_LOG_TIMEOUT(3, printk("%s: %s\n", __func__,
+ sdp->disk->disk_name));
- /* Need a write lock to set sdp->detached. */
- write_lock_irqsave(&sg_index_lock, iflags);
- sdp->detached = 1;
+ read_lock_irqsave(&sdp->sfd_lock, iflags);
list_for_each_entry(sfp, &sdp->sfds, sfd_siblings) {
- wake_up_interruptible(&sfp->read_wait);
+ wake_up_interruptible_all(&sfp->read_wait);
kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);
}
- write_unlock_irqrestore(&sg_index_lock, iflags);
+ wake_up_interruptible_all(&sdp->open_wait);
+ read_unlock_irqrestore(&sdp->sfd_lock, iflags);
sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, sdp->index));
cdev_del(sdp->cdev);
sdp->cdev = NULL;
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
}
module_param_named(scatter_elem_sz, scatter_elem_sz, int, S_IRUGO | S_IWUSR);
@@ -1631,7 +1658,8 @@ exit_sg(void)
idr_destroy(&sg_index_idr);
}
-static int sg_start_req(Sg_request *srp, unsigned char *cmd)
+static int
+sg_start_req(Sg_request *srp, unsigned char *cmd)
{
int res;
struct request *rq;
@@ -1726,7 +1754,8 @@ static int sg_start_req(Sg_request *srp, unsigned char *cmd)
return res;
}
-static int sg_finish_rem_req(Sg_request * srp)
+static int
+sg_finish_rem_req(Sg_request *srp)
{
int ret = 0;
@@ -2063,7 +2092,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
if (!sfp)
- return NULL;
+ return ERR_PTR(-ENOMEM);
init_waitqueue_head(&sfp->read_wait);
rwlock_init(&sfp->rq_list_lock);
@@ -2077,9 +2106,13 @@ sg_add_sfp(Sg_device * sdp, int dev)
sfp->cmd_q = SG_DEF_COMMAND_Q;
sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
sfp->parentdp = sdp;
- write_lock_irqsave(&sg_index_lock, iflags);
+ write_lock_irqsave(&sdp->sfd_lock, iflags);
+ if (atomic_read(&sdp->detaching)) {
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
+ return ERR_PTR(-ENODEV);
+ }
list_add_tail(&sfp->sfd_siblings, &sdp->sfds);
- write_unlock_irqrestore(&sg_index_lock, iflags);
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: sfp=0x%p\n", sfp));
if (unlikely(sg_big_buff != def_reserved_size))
sg_big_buff = def_reserved_size;
@@ -2095,7 +2128,8 @@ sg_add_sfp(Sg_device * sdp, int dev)
return sfp;
}
-static void sg_remove_sfp_usercontext(struct work_struct *work)
+static void
+sg_remove_sfp_usercontext(struct work_struct *work)
{
struct sg_fd *sfp = container_of(work, struct sg_fd, ew.work);
struct sg_device *sdp = sfp->parentdp;
@@ -2119,20 +2153,20 @@ static void sg_remove_sfp_usercontext(struct work_struct *work)
kfree(sfp);
scsi_device_put(sdp->device);
- sg_put_dev(sdp);
+ kref_put(&sdp->d_ref, sg_device_destroy);
module_put(THIS_MODULE);
}
-static void sg_remove_sfp(struct kref *kref)
+static void
+sg_remove_sfp(struct kref *kref)
{
struct sg_fd *sfp = container_of(kref, struct sg_fd, f_ref);
struct sg_device *sdp = sfp->parentdp;
unsigned long iflags;
- write_lock_irqsave(&sg_index_lock, iflags);
+ write_lock_irqsave(&sdp->sfd_lock, iflags);
list_del(&sfp->sfd_siblings);
- write_unlock_irqrestore(&sg_index_lock, iflags);
- wake_up_interruptible(&sdp->o_excl_wait);
+ write_unlock_irqrestore(&sdp->sfd_lock, iflags);
INIT_WORK(&sfp->ew.work, sg_remove_sfp_usercontext);
schedule_work(&sfp->ew.work);
@@ -2183,7 +2217,8 @@ static Sg_device *sg_lookup_dev(int dev)
return idr_find(&sg_index_idr, dev);
}
-static Sg_device *sg_get_dev(int dev)
+static Sg_device *
+sg_get_dev(int dev)
{
struct sg_device *sdp;
unsigned long flags;
@@ -2192,8 +2227,8 @@ static Sg_device *sg_get_dev(int dev)
sdp = sg_lookup_dev(dev);
if (!sdp)
sdp = ERR_PTR(-ENXIO);
- else if (sdp->detached) {
- /* If sdp->detached, then the refcount may already be 0, in
+ else if (atomic_read(&sdp->detaching)) {
+ /* If sdp->detaching, then the refcount may already be 0, in
* which case it would be a bug to do kref_get().
*/
sdp = ERR_PTR(-ENODEV);
@@ -2204,11 +2239,6 @@ static Sg_device *sg_get_dev(int dev)
return sdp;
}
-static void sg_put_dev(struct sg_device *sdp)
-{
- kref_put(&sdp->d_ref, sg_device_destroy);
-}
-
#ifdef CONFIG_SCSI_PROC_FS
static struct proc_dir_entry *sg_proc_sgp = NULL;
@@ -2425,8 +2455,7 @@ static int sg_proc_single_open_version(struct inode *inode, struct file *file)
static int sg_proc_seq_show_devhdr(struct seq_file *s, void *v)
{
- seq_printf(s, "host\tchan\tid\tlun\ttype\topens\tqdepth\tbusy\t"
- "online\n");
+ seq_puts(s, "host\tchan\tid\tlun\ttype\topens\tqdepth\tbusy\tonline\n");
return 0;
}
@@ -2482,7 +2511,11 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v)
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && (scsidp = sdp->device) && (!sdp->detached))
+ if ((NULL == sdp) || (NULL == sdp->device) ||
+ (atomic_read(&sdp->detaching)))
+ seq_puts(s, "-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\n");
+ else {
+ scsidp = sdp->device;
seq_printf(s, "%d\t%d\t%d\t%d\t%d\t%d\t%d\t%d\t%d\n",
scsidp->host->host_no, scsidp->channel,
scsidp->id, scsidp->lun, (int) scsidp->type,
@@ -2490,8 +2523,7 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v)
(int) scsidp->queue_depth,
(int) atomic_read(&scsidp->device_busy),
(int) scsi_device_online(scsidp));
- else
- seq_printf(s, "-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\t-1\n");
+ }
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
@@ -2510,11 +2542,12 @@ static int sg_proc_seq_show_devstrs(struct seq_file *s, void *v)
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && (scsidp = sdp->device) && (!sdp->detached))
+ scsidp = sdp ? sdp->device : NULL;
+ if (sdp && scsidp && (!atomic_read(&sdp->detaching)))
seq_printf(s, "%8.8s\t%16.16s\t%4.4s\n",
scsidp->vendor, scsidp->model, scsidp->rev);
else
- seq_printf(s, "<no active device>\n");
+ seq_puts(s, "<no active device>\n");
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
@@ -2559,12 +2592,12 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp)
else
cp = " ";
}
- seq_printf(s, cp);
+ seq_puts(s, cp);
blen = srp->data.bufflen;
usg = srp->data.k_use_sg;
- seq_printf(s, srp->done ?
- ((1 == srp->done) ? "rcv:" : "fin:")
- : "act:");
+ seq_puts(s, srp->done ?
+ ((1 == srp->done) ? "rcv:" : "fin:")
+ : "act:");
seq_printf(s, " id=%d blen=%d",
srp->header.pack_id, blen);
if (srp->done)
@@ -2580,7 +2613,7 @@ static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp)
(int) srp->data.cmd_opcode);
}
if (0 == m)
- seq_printf(s, " No requests active\n");
+ seq_puts(s, " No requests active\n");
read_unlock(&fp->rq_list_lock);
}
}
@@ -2596,31 +2629,34 @@ static int sg_proc_seq_show_debug(struct seq_file *s, void *v)
Sg_device *sdp;
unsigned long iflags;
- if (it && (0 == it->index)) {
- seq_printf(s, "max_active_device=%d(origin 1)\n",
- (int)it->max);
- seq_printf(s, " def_reserved_size=%d\n", sg_big_buff);
- }
+ if (it && (0 == it->index))
+ seq_printf(s, "max_active_device=%d def_reserved_size=%d\n",
+ (int)it->max, sg_big_buff);
read_lock_irqsave(&sg_index_lock, iflags);
sdp = it ? sg_lookup_dev(it->index) : NULL;
- if (sdp && !list_empty(&sdp->sfds)) {
- struct scsi_device *scsidp = sdp->device;
-
+ if (NULL == sdp)
+ goto skip;
+ read_lock(&sdp->sfd_lock);
+ if (!list_empty(&sdp->sfds)) {
seq_printf(s, " >>> device=%s ", sdp->disk->disk_name);
- if (sdp->detached)
- seq_printf(s, "detached pending close ");
- else
- seq_printf
- (s, "scsi%d chan=%d id=%d lun=%d em=%d",
- scsidp->host->host_no,
- scsidp->channel, scsidp->id,
- scsidp->lun,
- scsidp->host->hostt->emulated);
- seq_printf(s, " sg_tablesize=%d excl=%d\n",
- sdp->sg_tablesize, get_exclude(sdp));
+ if (atomic_read(&sdp->detaching))
+ seq_puts(s, "detaching pending close ");
+ else if (sdp->device) {
+ struct scsi_device *scsidp = sdp->device;
+
+ seq_printf(s, "%d:%d:%d:%d em=%d",
+ scsidp->host->host_no,
+ scsidp->channel, scsidp->id,
+ scsidp->lun,
+ scsidp->host->hostt->emulated);
+ }
+ seq_printf(s, " sg_tablesize=%d excl=%d open_cnt=%d\n",
+ sdp->sg_tablesize, sdp->exclude, sdp->open_cnt);
sg_proc_debug_helper(s, sdp);
}
+ read_unlock(&sdp->sfd_lock);
+skip:
read_unlock_irqrestore(&sg_index_lock, iflags);
return 0;
}
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v4] sg: O_EXCL and other lock handling
2014-06-13 9:03 ` Christoph Hellwig
@ 2014-06-13 14:33 ` Douglas Gilbert
2014-06-16 13:45 ` Christoph Hellwig
2014-06-13 20:18 ` Douglas Gilbert
1 sibling, 1 reply; 6+ messages in thread
From: Douglas Gilbert @ 2014-06-13 14:33 UTC (permalink / raw)
To: Christoph Hellwig
Cc: SCSI development list, James Bottomley, Hannes Reinecke, vaughan
On 14-06-13 05:03 AM, Christoph Hellwig wrote:
> Hi Doug,
>
> this looks generally good to me, but I don't think open_cnt and exclude
> have to use atomic_t, as they are only ever modified under open_rel_lock.
They are read by 'cat /proc/scsi/sg/debug' [in
sg_proc_seq_show_debug()] but transient off-by-1 reports
are not so important. [Prior to locking that block with
read_lock(sfd_lock) that routine sometimes printed wild
results, so some care is required.]
> Can you take a look at the version below? This changes open_cnt to an
> int, exclude to a bool, removes the open_cnt underflow check that
> the VFS takes care for, and streamlines the open path a little bit:
sg_open() and sg_release() clean-up looks good. The
back-up goto at the end of sg_open() reminds me of
Fortran style :-) More importantly my sg_tst_excl*
tests give this version the thumbs up.
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] sg: O_EXCL and other lock handling
2014-06-13 14:33 ` Douglas Gilbert
@ 2014-06-16 13:45 ` Christoph Hellwig
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2014-06-16 13:45 UTC (permalink / raw)
To: Douglas Gilbert
Cc: SCSI development list, James Bottomley, Hannes Reinecke, vaughan
On Fri, Jun 13, 2014 at 10:33:37AM -0400, Douglas Gilbert wrote:
>> this looks generally good to me, but I don't think open_cnt and exclude
>> have to use atomic_t, as they are only ever modified under open_rel_lock.
>
> They are read by 'cat /proc/scsi/sg/debug' [in
> sg_proc_seq_show_debug()] but transient off-by-1 reports
> are not so important. [Prior to locking that block with
> read_lock(sfd_lock) that routine sometimes printed wild
> results, so some care is required.]
In general 32-bit variables that can be read independly only need the lock
for updates. A lock on the read side is important if variables are of
a type that can't be atomically read (e.g. 64-bit on 32-bit architectures,
or small integers on some old alpha CPUs), or if the values of multiple
variables need to be consistent for the reader.
>> Can you take a look at the version below? This changes open_cnt to an
>> int, exclude to a bool, removes the open_cnt underflow check that
>> the VFS takes care for, and streamlines the open path a little bit:
>
> sg_open() and sg_release() clean-up looks good. The
> back-up goto at the end of sg_open() reminds me of
> Fortran style :-) More importantly my sg_tst_excl*
> tests give this version the thumbs up.
>
> Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Given that this was your patch with minor changes I was planning to put
this in under your name if that's fine with you.
I'm also looking for another review.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] sg: O_EXCL and other lock handling
2014-06-13 9:03 ` Christoph Hellwig
2014-06-13 14:33 ` Douglas Gilbert
@ 2014-06-13 20:18 ` Douglas Gilbert
1 sibling, 0 replies; 6+ messages in thread
From: Douglas Gilbert @ 2014-06-13 20:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: SCSI development list, James Bottomley, Hannes Reinecke, vaughan
[-- Attachment #1: Type: text/plain, Size: 716 bytes --]
On 14-06-13 05:03 AM, Christoph Hellwig wrote:
> Hi Doug,
>
> this looks generally good to me, but I don't think open_cnt and exclude
> have to use atomic_t, as they are only ever modified under open_rel_lock.
>
> Can you take a look at the version below? This changes open_cnt to an
> int, exclude to a bool, removes the open_cnt underflow check that
> the VFS takes care for, and streamlines the open path a little bit:
As a follow-up attached is a first cut of replacing the locks
around sg_request::done with an atomic. It assumes the patch
from Christoph earlier in this thread. The interesting part is
the removal of the write lock in the completion part of the
SG_IO ioctl.
Testing ongoing.
Doug Gilbert
[-- Attachment #2: sg_done_atomic1.patch --]
[-- Type: text/x-patch, Size: 6249 bytes --]
--- a/drivers/scsi/sg.c 2014-06-13 09:39:54.962003585 -0400
+++ b/drivers/scsi/sg.c 2014-06-13 13:11:49.796632281 -0400
@@ -138,7 +138,7 @@ typedef struct sg_request { /* SG_MAX_QU
char orphan; /* 1 -> drop on sight, 0 -> normal */
char sg_io_owned; /* 1 -> packet belongs to SG_IO */
/* done protected by rq_list_lock */
- char done; /* 0->before bh, 1->before read, 2->read */
+ atomic_t done; /* 0->before bh, 1->before read, 2->read */
struct request *rq;
struct bio *bio;
struct execute_work ew;
@@ -809,17 +809,6 @@ sg_common_write(Sg_fd * sfp, Sg_request
return 0;
}
-static int srp_done(Sg_fd *sfp, Sg_request *srp)
-{
- unsigned long flags;
- int ret;
-
- read_lock_irqsave(&sfp->rq_list_lock, flags);
- ret = srp->done;
- read_unlock_irqrestore(&sfp->rq_list_lock, flags);
- return ret;
-}
-
static long
sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
{
@@ -851,16 +840,17 @@ sg_ioctl(struct file *filp, unsigned int
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
- (srp_done(sfp, srp) || atomic_read(&sdp->detaching)));
+ (atomic_read(&srp->done) ||
+ atomic_read(&sdp->detaching)));
if (atomic_read(&sdp->detaching))
return -ENODEV;
- write_lock_irq(&sfp->rq_list_lock);
- if (srp->done) {
- srp->done = 2;
- write_unlock_irq(&sfp->rq_list_lock);
+ if (likely(atomic_add_unless(&srp->done, 1, 0))) {
+ /* srp->done bumped from 1 to 2 which is expected */
result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp);
return (result < 0) ? result : 0;
}
+ /* here if srp->done was 0, abnormal */
+ write_lock_irq(&sfp->rq_list_lock);
srp->orphan = 1;
write_unlock_irq(&sfp->rq_list_lock);
return result; /* -ERESTARTSYS because signal hit process */
@@ -932,7 +922,8 @@ sg_ioctl(struct file *filp, unsigned int
return -EFAULT;
read_lock_irqsave(&sfp->rq_list_lock, iflags);
for (srp = sfp->headrp; srp; srp = srp->nextrp) {
- if ((1 == srp->done) && (!srp->sg_io_owned)) {
+ if ((1 == atomic_read(&srp->done)) &&
+ (!srp->sg_io_owned)) {
read_unlock_irqrestore(&sfp->rq_list_lock,
iflags);
__put_user(srp->header.pack_id, ip);
@@ -945,7 +936,8 @@ sg_ioctl(struct file *filp, unsigned int
case SG_GET_NUM_WAITING:
read_lock_irqsave(&sfp->rq_list_lock, iflags);
for (val = 0, srp = sfp->headrp; srp; srp = srp->nextrp) {
- if ((1 == srp->done) && (!srp->sg_io_owned))
+ if ((1 == atomic_read(&srp->done)) &&
+ (!srp->sg_io_owned))
++val;
}
read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
@@ -1015,12 +1007,13 @@ sg_ioctl(struct file *filp, unsigned int
++val, srp = srp ? srp->nextrp : srp) {
memset(&rinfo[val], 0, SZ_SG_REQ_INFO);
if (srp) {
- rinfo[val].req_state = srp->done + 1;
+ rinfo[val].req_state =
+ atomic_read(&srp->done) + 1;
rinfo[val].problem =
- srp->header.masked_status &
- srp->header.host_status &
- srp->header.driver_status;
- if (srp->done)
+ srp->header.masked_status &
+ srp->header.host_status &
+ srp->header.driver_status;
+ if (atomic_read(&srp->done))
rinfo[val].duration =
srp->header.duration;
else {
@@ -1173,7 +1166,8 @@ sg_poll(struct file *filp, poll_table *
read_lock_irqsave(&sfp->rq_list_lock, iflags);
for (srp = sfp->headrp; srp; srp = srp->nextrp) {
/* if any read waiting, flag it */
- if ((0 == res) && (1 == srp->done) && (!srp->sg_io_owned))
+ if ((0 == res) && (1 == atomic_read(&srp->done)) &&
+ (!srp->sg_io_owned))
res = POLLIN | POLLRDNORM;
++count;
}
@@ -1303,7 +1297,7 @@ sg_rq_end_io(struct request *rq, int upt
char *sense;
int result, resid, done = 1;
- if (WARN_ON(srp->done != 0))
+ if (WARN_ON(atomic_read(&srp->done) != 0))
return;
sfp = srp->parentfp;
@@ -1318,8 +1312,8 @@ sg_rq_end_io(struct request *rq, int upt
result = rq->errors;
resid = rq->resid_len;
- SCSI_LOG_TIMEOUT(4, printk("sg_cmd_done: %s, pack_id=%d, res=0x%x\n",
- sdp->disk->disk_name, srp->header.pack_id, result));
+ SCSI_LOG_TIMEOUT(4, printk("%s: %s, pack_id=%d, res=0x%x\n",
+ __func__, sdp->disk->disk_name, srp->header.pack_id, result));
srp->header.resid = resid;
ms = jiffies_to_msecs(jiffies);
srp->header.duration = (ms > srp->header.duration) ?
@@ -1358,7 +1352,7 @@ sg_rq_end_io(struct request *rq, int upt
else
done = 0;
}
- srp->done = done;
+ atomic_set(&srp->done, done);
write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
if (likely(done)) {
@@ -1999,9 +1993,10 @@ sg_get_rq_mark(Sg_fd * sfp, int pack_id)
write_lock_irqsave(&sfp->rq_list_lock, iflags);
for (resp = sfp->headrp; resp; resp = resp->nextrp) {
/* look for requests that are ready + not SG_IO owned */
- if ((1 == resp->done) && (!resp->sg_io_owned) &&
+ if ((1 == atomic_read(&resp->done)) && (!resp->sg_io_owned) &&
((-1 == pack_id) || (resp->header.pack_id == pack_id))) {
- resp->done = 2; /* guard against other readers */
+ /* guard against other readers */
+ atomic_set(&resp->done, 2);
break;
}
}
@@ -2047,6 +2042,7 @@ sg_add_request(Sg_fd * sfp)
if (resp) {
resp->nextrp = NULL;
resp->header.duration = jiffies_to_msecs(jiffies);
+ atomic_set(&resp->done, 0);
}
write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
return resp;
@@ -2556,7 +2552,7 @@ static int sg_proc_seq_show_devstrs(stru
/* must be called while holding sg_index_lock */
static void sg_proc_debug_helper(struct seq_file *s, Sg_device * sdp)
{
- int k, m, new_interface, blen, usg;
+ int k, m, new_interface, blen, usg, done;
Sg_request *srp;
Sg_fd *fp;
const sg_io_hdr_t *hp;
@@ -2596,12 +2592,12 @@ static void sg_proc_debug_helper(struct
seq_puts(s, cp);
blen = srp->data.bufflen;
usg = srp->data.k_use_sg;
- seq_puts(s, srp->done ?
- ((1 == srp->done) ? "rcv:" : "fin:")
- : "act:");
+ done = atomic_read(&srp->done);
+ seq_puts(s, done ? ((1 == done) ? "rcv:" : "fin:")
+ : "act:");
seq_printf(s, " id=%d blen=%d",
srp->header.pack_id, blen);
- if (srp->done)
+ if (done)
seq_printf(s, " dur=%d", hp->duration);
else {
ms = jiffies_to_msecs(jiffies);
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4] sg: O_EXCL and other lock handling
2014-06-13 0:26 [PATCH v4] sg: O_EXCL and other lock handling Douglas Gilbert
2014-06-13 9:03 ` Christoph Hellwig
@ 2014-06-21 11:01 ` Hannes Reinecke
1 sibling, 0 replies; 6+ messages in thread
From: Hannes Reinecke @ 2014-06-21 11:01 UTC (permalink / raw)
To: dgilbert, SCSI development list, Christoph Hellwig
Cc: James Bottomley, vaughan
On 06/13/2014 02:26 AM, Douglas Gilbert wrote:
> This is a re-presentation of a patch to the sg driver
> whose v3 was sent in November 2013:
> http://www.spinics.net/lists/linux-scsi/msg69957.html
>
> It addresses a problem reported by Vaughan Cao concerning
> the correctness of the O_EXCL logic in the sg driver. POSIX
> doesn't defined O_EXCL semantics on devices but "allow only
> one open file descriptor at a time per sg device" is a rough
> definition. The sg driver's semantics have been to wait
> on an open() when O_NONBLOCK is not given and there are
> O_EXCL headwinds. Nasty things can happen during that wait
> such as the device being detached (removed). So multiple
> locks are reworked in this patch making it large and hard
> to break down into digestible bits.
>
> This patch is against Linus's current git repository which
> doesn't include any sg patches sent in the last few weeks.
> Hence this patch touches as little as possible that it
> doesn't need to and strips out most SCSI_LOG_TIMEOUT()
> changes in v3 because Hannes said he was going to rework all
> that stuff.
>
> The sg3_utils package has several test programs written to
> test this patch. See examples/sg_tst_excl*.cpp .
>
> Not all the locks and flags in sg have been re-worked in
> this patch, notably sg_request::done . That can wait for
> a follow-up patch if this one meets with approval.
>
> ChangeLog v4:
> - based on the current kernel tree: pre 3.16-rc1
> - strip out clean-ups in v3 that others are better
> placed to do (e.g. debug/logging)
> - simplify open_wait_event() logic and add comment
>
> ChangeLog v3 and earlier: see link in the first paragraph.
>
>
> Could anyone confirm whether v3 of this patch has found its
> way into any distro and/or been tested or used more widely?
>
> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
>
>
We've tested a similar version here.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-06-21 11:01 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-13 0:26 [PATCH v4] sg: O_EXCL and other lock handling Douglas Gilbert
2014-06-13 9:03 ` Christoph Hellwig
2014-06-13 14:33 ` Douglas Gilbert
2014-06-16 13:45 ` Christoph Hellwig
2014-06-13 20:18 ` Douglas Gilbert
2014-06-21 11:01 ` Hannes Reinecke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox