linux-scsi.vger.kernel.org archive mirror
* [PATCH 00/19] target/target_core_user: changes for 4.16
@ 2017-10-30  3:44 Mike Christie
  2017-10-30  3:44 ` [PATCH 01/19] tcmu: fix crash when removing the tcmu device v4 Mike Christie
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab

The following patches, made against Linus's tree, also apply to Nick's
target-pending master branch and to Martin's and James's for-next
branches. They fix several bugs and add features to the
target_core_user module for LIO.

- Patches 1-3 are target_core_user fixes/changes that have been sitting
on the list for a while and have been reviewed.

- Patches 4-18 are changes to target_core_user that fix bugs in
the ring buffer (the buffer used to pass commands and responses between
user and kernel space) and allow the user to configure the ring size
and how long we wait for memory to free up.

- Patch 19 is a change to the core target code that allows backend modules
like target_core_user to return SAM_STAT_TASK_SET_FULL when a temporary
condition occurs, like the ring buffer being full.

I know it is already rc7 and I am not pushing these for 4.15, which is
why the subject mentions the kernel after next.

Note:

These patches conflict with Nick's for-next branch. Nick's
for-next has 3 patches which are not needed with my patchset.

This patch:

commit b8501553476c2eabbd15da79e152a84a9a67dc5a
Author: Xiubo Li <lixiubo@cmss.chinamobile.com>
Date:   Wed Jul 12 15:16:07 2017 +0800

    tcmu: Add fifo type waiter list support to avoid starvation

is bad and can just be dropped/reverted. My patch fixes the same
issue and also fixes the regressions/bugs that patch introduced.

This patch:

commit 04229774f69250e0f7e63b4d1b50ba8371c9d165
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Aug 1 23:09:17 2017 +0300

    tcmu: Oops in unmap_thread_fn()

fixes a bug with the above patch and can just be dropped if the above
patch is also dropped.

And I have included this patch in my set:

commit 1b0a5fda1cd7e9001d4c16cbb6dfc5e963f08bd7
Author: Xiubo Li <lixiubo@cmss.chinamobile.com>
Date:   Thu Jul 13 14:33:50 2017 +0800

    tcmu: clean up the scatter helper

It appears in this series as patch 13.

Nick, I am assuming that if you come back soonish you can just drop
those patches or revert them like you have done in the past. Or,
if someone else picks up these patches, then they do not need to worry
about what was sitting in your branches.

* [PATCH 01/19] tcmu: fix crash when removing the tcmu device v4
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 02/19] tcmu: Add netlink command reply supported option for each device Mike Christie
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab
  Cc: Xiubo Li, Zhang Zhuoyu, Mike Christie

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

Currently the ring and other resources are released before the netlink
REMOVE message has been sent to userspace, but userspace may still be
using them. This can result in crashes like:

ring broken, not handling completions
BUG: unable to handle kernel paging request at ffffffffffffffd0
IP: tcmu_handle_completions+0x134/0x2f0 [target_core_user]
PGD 11bdc0c067
P4D 11bdc0c067
PUD 11bdc0e067
PMD 0

Oops: 0000 [#1] SMP
cmd_id not found, ring is broken
RIP: 0010:tcmu_handle_completions+0x134/0x2f0 [target_core_user]
RSP: 0018:ffffb8a2d8983d88 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffb8a2aaa4e000 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000220
R10: 0000000076c71401 R11: ffff8d2e76c713f0 R12: ffffb8a2aad56bc0
R13: 000000000000001c R14: ffff8d2e32c90000 R15: ffff8d2e76c713f0
FS:  00007f411ffff700(0000) GS:ffff8d1e7fdc0000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd0 CR3: 0000001027070000 CR4:
00000000001406e0
Call Trace:
? tcmu_irqcontrol+0x2a/0x40 [target_core_user]
? uio_write+0x7b/0xc0 [uio]
? __vfs_write+0x37/0x150
? __getnstimeofday64+0x3b/0xd0
? vfs_write+0xb2/0x1b0
? syscall_trace_enter+0x1d0/0x2b0
? SyS_write+0x55/0xc0
? do_syscall_64+0x67/0x150
? entry_SYSCALL64_slow_path+0x25/0x25
Code: 41 5d 41 5e 41 5f 5d c3 83 f8 01 0f 85 cf 01 00
00 48 8b 7d d0 e8 dd 5c 1d f3 41 0f b7 74 24 04 48 8b
7d c8 31 d2 e8 5c c7 1b f3 <48> 8b 7d d0 49 89 c7 c6 07
00 0f 1f 40 00 4d 85 ff 0f 84 82 01  RIP:
tcmu_handle_completions+0x134/0x2f0 [target_core_user]
RSP: ffffb8a2d8983d88
CR2: ffffffffffffffd0

The crash could also happen in tcmu_page_fault and other places.
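
To illustrate the resulting ordering (a simplified sketch, not the
literal diff below): the freeing moves out of tcmu_destroy_device and
into the kref release, which cannot run until the last reference,
including userspace's, has been dropped.

static void tcmu_destroy_device(struct se_device *dev)
{
        struct tcmu_dev *udev = TCMU_DEV(dev);

        del_timer_sync(&udev->timeout);
        tcmu_netlink_event(udev, TCMU_CMD_REMOVED_DEVICE, 0, NULL);
        uio_unregister_device(&udev->uio_info);
        /* the ring and data pages are no longer freed here */
}

static void tcmu_dev_kref_release(struct kref *kref)
{
        struct tcmu_dev *udev = container_of(kref, struct tcmu_dev, kref);

        /* safe now: no ring user can remain at this point */
        vfree(udev->mb_addr);
        tcmu_blocks_release(udev);
        call_rcu(&udev->se_dev.rcu_head, tcmu_dev_call_rcu);
}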

Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
[ported to scsi branch]
Signed-off-by: Mike Christie <mchristi@redhat.com>

---
 drivers/target/target_core_user.c | 92 ++++++++++++++++++++-------------------
 1 file changed, 47 insertions(+), 45 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 942d0942..8b9ec0f 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1112,6 +1112,8 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	init_waitqueue_head(&udev->nl_cmd_wq);
 	spin_lock_init(&udev->nl_cmd_lock);
 
+	INIT_RADIX_TREE(&udev->data_blocks, GFP_KERNEL);
+
 	return &udev->se_dev;
 }
 
@@ -1280,10 +1282,54 @@ static void tcmu_dev_call_rcu(struct rcu_head *p)
 	kfree(udev);
 }
 
+static int tcmu_check_and_free_pending_cmd(struct tcmu_cmd *cmd)
+{
+	if (test_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags)) {
+		kmem_cache_free(tcmu_cmd_cache, cmd);
+		return 0;
+	}
+	return -EINVAL;
+}
+
+static void tcmu_blocks_release(struct tcmu_dev *udev)
+{
+	int i;
+	struct page *page;
+
+	/* Try to release all block pages */
+	mutex_lock(&udev->cmdr_lock);
+	for (i = 0; i <= udev->dbi_max; i++) {
+		page = radix_tree_delete(&udev->data_blocks, i);
+		if (page) {
+			__free_page(page);
+			atomic_dec(&global_db_count);
+		}
+	}
+	mutex_unlock(&udev->cmdr_lock);
+}
+
 static void tcmu_dev_kref_release(struct kref *kref)
 {
 	struct tcmu_dev *udev = container_of(kref, struct tcmu_dev, kref);
 	struct se_device *dev = &udev->se_dev;
+	struct tcmu_cmd *cmd;
+	bool all_expired = true;
+	int i;
+
+	vfree(udev->mb_addr);
+	udev->mb_addr = NULL;
+
+	/* Upper layer should drain all requests before calling this */
+	spin_lock_irq(&udev->commands_lock);
+	idr_for_each_entry(&udev->commands, cmd, i) {
+		if (tcmu_check_and_free_pending_cmd(cmd) != 0)
+			all_expired = false;
+	}
+	idr_destroy(&udev->commands);
+	spin_unlock_irq(&udev->commands_lock);
+	WARN_ON(!all_expired);
+
+	tcmu_blocks_release(udev);
 
 	call_rcu(&dev->rcu_head, tcmu_dev_call_rcu);
 }
@@ -1476,8 +1522,6 @@ static int tcmu_configure_device(struct se_device *dev)
 	WARN_ON(udev->data_size % PAGE_SIZE);
 	WARN_ON(udev->data_size % DATA_BLOCK_SIZE);
 
-	INIT_RADIX_TREE(&udev->data_blocks, GFP_KERNEL);
-
 	info->version = __stringify(TCMU_MAILBOX_VERSION);
 
 	info->mem[0].name = "tcm-user command & data buffer";
@@ -1527,6 +1571,7 @@ static int tcmu_configure_device(struct se_device *dev)
 	uio_unregister_device(&udev->uio_info);
 err_register:
 	vfree(udev->mb_addr);
+	udev->mb_addr = NULL;
 err_vzalloc:
 	kfree(info->name);
 	info->name = NULL;
@@ -1534,37 +1579,11 @@ static int tcmu_configure_device(struct se_device *dev)
 	return ret;
 }
 
-static int tcmu_check_and_free_pending_cmd(struct tcmu_cmd *cmd)
-{
-	if (test_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags)) {
-		kmem_cache_free(tcmu_cmd_cache, cmd);
-		return 0;
-	}
-	return -EINVAL;
-}
-
 static bool tcmu_dev_configured(struct tcmu_dev *udev)
 {
 	return udev->uio_info.uio_dev ? true : false;
 }
 
-static void tcmu_blocks_release(struct tcmu_dev *udev)
-{
-	int i;
-	struct page *page;
-
-	/* Try to release all block pages */
-	mutex_lock(&udev->cmdr_lock);
-	for (i = 0; i <= udev->dbi_max; i++) {
-		page = radix_tree_delete(&udev->data_blocks, i);
-		if (page) {
-			__free_page(page);
-			atomic_dec(&global_db_count);
-		}
-	}
-	mutex_unlock(&udev->cmdr_lock);
-}
-
 static void tcmu_free_device(struct se_device *dev)
 {
 	struct tcmu_dev *udev = TCMU_DEV(dev);
@@ -1576,9 +1595,6 @@ static void tcmu_free_device(struct se_device *dev)
 static void tcmu_destroy_device(struct se_device *dev)
 {
 	struct tcmu_dev *udev = TCMU_DEV(dev);
-	struct tcmu_cmd *cmd;
-	bool all_expired = true;
-	int i;
 
 	del_timer_sync(&udev->timeout);
 
@@ -1586,20 +1602,6 @@ static void tcmu_destroy_device(struct se_device *dev)
 	list_del(&udev->node);
 	mutex_unlock(&root_udev_mutex);
 
-	vfree(udev->mb_addr);
-
-	/* Upper layer should drain all requests before calling this */
-	spin_lock_irq(&udev->commands_lock);
-	idr_for_each_entry(&udev->commands, cmd, i) {
-		if (tcmu_check_and_free_pending_cmd(cmd) != 0)
-			all_expired = false;
-	}
-	idr_destroy(&udev->commands);
-	spin_unlock_irq(&udev->commands_lock);
-	WARN_ON(!all_expired);
-
-	tcmu_blocks_release(udev);
-
 	tcmu_netlink_event(udev, TCMU_CMD_REMOVED_DEVICE, 0, NULL);
 
 	uio_unregister_device(&udev->uio_info);
-- 
1.8.3.1

* [PATCH 02/19] tcmu: Add netlink command reply supported option for each device
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
  2017-10-30  3:44 ` [PATCH 01/19] tcmu: fix crash when removing the tcmu device v4 Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 03/19] tcmu: Use macro to call container_of in tcmu_cmd_time_out_show Mike Christie
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab
  Cc: Kenjiro Nakayama, Mike Christie

From: Kenjiro Nakayama <nakayamakenjiro@gmail.com>

Currently the netlink command reply support option
(TCMU_ATTR_SUPP_KERN_CMD_REPLY) can only be enabled at module
scope. Because of that, once one application enables netlink
command reply support, all applications using target_core_user.ko
are expected to support netlink replies. To make matters worse,
users are then not able to add a device via configfs manually.

To fix these issues, this patch adds an option to disable netlink
command replies per device through configfs. The original module-scope
TCMU_ATTR_SUPP_KERN_CMD_REPLY setting is kept for backward
compatibility and is used by default; however, once a user sets
nl_reply_supported=<NEGATIVE_VALUE> via configfs for a particular
device, that device disables netlink command reply support.
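
A minimal sketch of how the two scopes combine (nl_reply_enabled is a
hypothetical helper, not part of the patch; the patch open-codes these
checks in tcmu_init_genl_cmd_reply and tcmu_wait_genl_cmd_reply):

static bool nl_reply_enabled(struct tcmu_dev *udev)
{
        /* module-scope master switch (TCMU_ATTR_SUPP_KERN_CMD_REPLY) */
        if (!tcmu_kern_cmd_reply_supported)
                return false;

        /*
         * Per-device override: a negative value set via configfs
         * disables replies; a non-negative value was replaced with the
         * module-scope setting when the device was configured.
         */
        if (udev->nl_reply_supported <= 0)
                return false;

        return true;
}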

Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
[use kstrtoint]
Signed-off-by: Mike Christie <mchristi@redhat.com>

---
 drivers/target/target_core_user.c | 58 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 8b9ec0f..fa251d3 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -150,6 +150,8 @@ struct tcmu_dev {
 	wait_queue_head_t nl_cmd_wq;
 
 	char dev_config[TCMU_CONFIG_LEN];
+
+	int nl_reply_supported;
 };
 
 #define TCMU_DEV(_se_dev) container_of(_se_dev, struct tcmu_dev, se_dev)
@@ -1352,6 +1354,10 @@ static void tcmu_init_genl_cmd_reply(struct tcmu_dev *udev, int cmd)
 
 	if (!tcmu_kern_cmd_reply_supported)
 		return;
+
+	if (udev->nl_reply_supported <= 0)
+		return;
+
 relock:
 	spin_lock(&udev->nl_cmd_lock);
 
@@ -1378,6 +1384,9 @@ static int tcmu_wait_genl_cmd_reply(struct tcmu_dev *udev)
 	if (!tcmu_kern_cmd_reply_supported)
 		return 0;
 
+	if (udev->nl_reply_supported <= 0)
+		return 0;
+
 	pr_debug("sleeping for nl reply\n");
 	wait_for_completion(&nl_cmd->complete);
 
@@ -1550,6 +1559,12 @@ static int tcmu_configure_device(struct se_device *dev)
 		dev->dev_attrib.emulate_write_cache = 0;
 	dev->dev_attrib.hw_queue_depth = 128;
 
+	/* If user didn't explicitly disable netlink reply support, use
+	 * module scope setting.
+	 */
+	if (udev->nl_reply_supported >= 0)
+		udev->nl_reply_supported = tcmu_kern_cmd_reply_supported;
+
 	/*
 	 * Get a ref incase userspace does a close on the uio device before
 	 * LIO has initiated tcmu_free_device.
@@ -1612,7 +1627,7 @@ static void tcmu_destroy_device(struct se_device *dev)
 
 enum {
 	Opt_dev_config, Opt_dev_size, Opt_hw_block_size, Opt_hw_max_sectors,
-	Opt_err,
+	Opt_nl_reply_supported, Opt_err,
 };
 
 static match_table_t tokens = {
@@ -1620,6 +1635,7 @@ enum {
 	{Opt_dev_size, "dev_size=%u"},
 	{Opt_hw_block_size, "hw_block_size=%u"},
 	{Opt_hw_max_sectors, "hw_max_sectors=%u"},
+	{Opt_nl_reply_supported, "nl_reply_supported=%d"},
 	{Opt_err, NULL}
 };
 
@@ -1694,6 +1710,17 @@ static ssize_t tcmu_set_configfs_dev_params(struct se_device *dev,
 			ret = tcmu_set_dev_attrib(&args[0],
 					&(dev->dev_attrib.hw_max_sectors));
 			break;
+		case Opt_nl_reply_supported:
+			arg_p = match_strdup(&args[0]);
+			if (!arg_p) {
+				ret = -ENOMEM;
+				break;
+			}
+			ret = kstrtoint(arg_p, 0, &udev->nl_reply_supported);
+			kfree(arg_p);
+			if (ret < 0)
+				pr_err("kstrtoint() failed for nl_reply_supported=\n");
+			break;
 		default:
 			break;
 		}
@@ -1844,6 +1871,34 @@ static ssize_t tcmu_dev_size_store(struct config_item *item, const char *page,
 }
 CONFIGFS_ATTR(tcmu_, dev_size);
 
+static ssize_t tcmu_nl_reply_supported_show(struct config_item *item,
+		char *page)
+{
+	struct se_dev_attrib *da = container_of(to_config_group(item),
+						struct se_dev_attrib, da_group);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
+
+	return snprintf(page, PAGE_SIZE, "%d\n", udev->nl_reply_supported);
+}
+
+static ssize_t tcmu_nl_reply_supported_store(struct config_item *item,
+		const char *page, size_t count)
+{
+	struct se_dev_attrib *da = container_of(to_config_group(item),
+						struct se_dev_attrib, da_group);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
+	s8 val;
+	int ret;
+
+	ret = kstrtos8(page, 0, &val);
+	if (ret < 0)
+		return ret;
+
+	udev->nl_reply_supported = val;
+	return count;
+}
+CONFIGFS_ATTR(tcmu_, nl_reply_supported);
+
 static ssize_t tcmu_emulate_write_cache_show(struct config_item *item,
 					     char *page)
 {
@@ -1886,6 +1941,7 @@ static ssize_t tcmu_emulate_write_cache_store(struct config_item *item,
 	&tcmu_attr_dev_config,
 	&tcmu_attr_dev_size,
 	&tcmu_attr_emulate_write_cache,
+	&tcmu_attr_nl_reply_supported,
 	NULL,
 };
 
-- 
1.8.3.1

* [PATCH 03/19] tcmu: Use macro to call container_of in tcmu_cmd_time_out_show
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
  2017-10-30  3:44 ` [PATCH 01/19] tcmu: fix crash when removing the tcmu device v4 Mike Christie
  2017-10-30  3:44 ` [PATCH 02/19] tcmu: Add netlink command reply supported option for each device Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 04/19] tcmu: fix double se_cmd completion Mike Christie
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab
  Cc: Kenjiro Nakayama, Mike Christie

From: Kenjiro Nakayama <nakayamakenjiro@gmail.com>

This patch makes a tiny change to use the TCMU_DEV macro in
tcmu_cmd_time_out_show so it is consistent with other functions.

Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
[added missing tcmu_cmd_time_out_store conversion]
Signed-off-by: Mike Christie <mchristi@redhat.com>

---
 drivers/target/target_core_user.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index fa251d3..1207b5b 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1763,8 +1763,7 @@ static ssize_t tcmu_cmd_time_out_show(struct config_item *item, char *page)
 {
 	struct se_dev_attrib *da = container_of(to_config_group(item),
 					struct se_dev_attrib, da_group);
-	struct tcmu_dev *udev = container_of(da->da_dev,
-					struct tcmu_dev, se_dev);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
 
 	return snprintf(page, PAGE_SIZE, "%lu\n", udev->cmd_time_out / MSEC_PER_SEC);
 }
@@ -1774,8 +1773,7 @@ static ssize_t tcmu_cmd_time_out_store(struct config_item *item, const char *pag
 {
 	struct se_dev_attrib *da = container_of(to_config_group(item),
 					struct se_dev_attrib, da_group);
-	struct tcmu_dev *udev = container_of(da->da_dev,
-					struct tcmu_dev, se_dev);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
 	u32 val;
 	int ret;
 
-- 
1.8.3.1

* [PATCH 04/19] tcmu: fix double se_cmd completion
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (2 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 03/19] tcmu: Use macro to call container_of in tcmu_cmd_time_out_show Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 05/19] tcmu: merge common block release code Mike Christie
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

If cmd_time_out != 0, then tcmu_queue_cmd_ring could end up
sleeping while waiting for ring space, timing out and then returning
failure to LIO, and tcmu_check_expired_cmd could also detect
the timeout and call target_complete_cmd on the cmd, completing it
twice.

This patch just delays setting up the deadline value and adding
the cmd to the udev->commands idr until we have allocated ring
space and are about to send the cmd to userspace.
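
The race, as a timeline (illustrative only, assuming
cmd_time_out != 0):

  submitting context                      timeout timer
  ------------------                      -------------
  tcmu_alloc_cmd(): deadline set,
    cmd added to udev->commands idr
  tcmu_queue_cmd_ring(): sleeps
    waiting for ring space
  wait times out, failure returned        tcmu_check_expired_cmd():
    to LIO (first completion)               sees the expired deadline
                                            and calls
                                            target_complete_cmd()
                                            (second completion)

With the deadline and idr entry set up only after ring space has been
allocated, a cmd that was never sent to userspace can no longer be
expired by the timer.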

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 54 ++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 1207b5b..4f666a5 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -432,7 +432,6 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	struct se_device *se_dev = se_cmd->se_dev;
 	struct tcmu_dev *udev = TCMU_DEV(se_dev);
 	struct tcmu_cmd *tcmu_cmd;
-	int cmd_id;
 
 	tcmu_cmd = kmem_cache_zalloc(tcmu_cmd_cache, GFP_KERNEL);
 	if (!tcmu_cmd)
@@ -440,9 +439,6 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 
 	tcmu_cmd->se_cmd = se_cmd;
 	tcmu_cmd->tcmu_dev = udev;
-	if (udev->cmd_time_out)
-		tcmu_cmd->deadline = jiffies +
-					msecs_to_jiffies(udev->cmd_time_out);
 
 	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
 	tcmu_cmd->dbi_cnt = tcmu_cmd_get_block_cnt(tcmu_cmd);
@@ -453,19 +449,6 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 		return NULL;
 	}
 
-	idr_preload(GFP_KERNEL);
-	spin_lock_irq(&udev->commands_lock);
-	cmd_id = idr_alloc(&udev->commands, tcmu_cmd, 0,
-		USHRT_MAX, GFP_NOWAIT);
-	spin_unlock_irq(&udev->commands_lock);
-	idr_preload_end();
-
-	if (cmd_id < 0) {
-		tcmu_free_cmd(tcmu_cmd);
-		return NULL;
-	}
-	tcmu_cmd->cmd_id = cmd_id;
-
 	return tcmu_cmd;
 }
 
@@ -748,6 +731,30 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 	return command_size;
 }
 
+static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
+{
+	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
+	unsigned long tmo = udev->cmd_time_out;
+	int cmd_id;
+
+	if (tcmu_cmd->cmd_id)
+		return 0;
+
+	cmd_id = idr_alloc(&udev->commands, tcmu_cmd, 1, USHRT_MAX, GFP_NOWAIT);
+	if (cmd_id < 0) {
+		pr_err("tcmu: Could not allocate cmd id.\n");
+		return cmd_id;
+	}
+	tcmu_cmd->cmd_id = cmd_id;
+
+	if (!tmo)
+		return 0;
+
+	tcmu_cmd->deadline = round_jiffies_up(jiffies + msecs_to_jiffies(tmo));
+	mod_timer(&udev->timeout, tcmu_cmd->deadline);
+	return 0;
+}
+
 static sense_reason_t
 tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
 {
@@ -841,7 +848,6 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 	entry = (void *) mb + CMDR_OFF + cmd_head;
 	memset(entry, 0, command_size);
 	tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
-	entry->hdr.cmd_id = tcmu_cmd->cmd_id;
 
 	/* Handle allocating space from the data area */
 	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
@@ -879,6 +885,13 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 	}
 	entry->req.iov_bidi_cnt = iov_cnt;
 
+	ret = tcmu_setup_cmd_timer(tcmu_cmd);
+	if (ret) {
+		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cnt);
+		return TCM_OUT_OF_RESOURCES;
+	}
+	entry->hdr.cmd_id = tcmu_cmd->cmd_id;
+
 	/*
 	 * Recalaulate the command's base size and size according
 	 * to the actual needs
@@ -912,8 +925,6 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 static sense_reason_t
 tcmu_queue_cmd(struct se_cmd *se_cmd)
 {
-	struct se_device *se_dev = se_cmd->se_dev;
-	struct tcmu_dev *udev = TCMU_DEV(se_dev);
 	struct tcmu_cmd *tcmu_cmd;
 	sense_reason_t ret;
 
@@ -924,9 +935,6 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 	ret = tcmu_queue_cmd_ring(tcmu_cmd);
 	if (ret != TCM_NO_SENSE) {
 		pr_err("TCMU: Could not queue command\n");
-		spin_lock_irq(&udev->commands_lock);
-		idr_remove(&udev->commands, tcmu_cmd->cmd_id);
-		spin_unlock_irq(&udev->commands_lock);
 
 		tcmu_free_cmd(tcmu_cmd);
 	}
-- 
1.8.3.1

* [PATCH 05/19] tcmu: merge common block release code
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (3 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 04/19] tcmu: fix double se_cmd completion Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 06/19] tcmu: split unmap_thread_fn Mike Christie
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

Have unmap_thread_fn use tcmu_blocks_release.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 4f666a5..a6205e4 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1301,21 +1301,19 @@ static int tcmu_check_and_free_pending_cmd(struct tcmu_cmd *cmd)
 	return -EINVAL;
 }
 
-static void tcmu_blocks_release(struct tcmu_dev *udev)
+static void tcmu_blocks_release(struct radix_tree_root *blocks,
+				int start, int end)
 {
 	int i;
 	struct page *page;
 
-	/* Try to release all block pages */
-	mutex_lock(&udev->cmdr_lock);
-	for (i = 0; i <= udev->dbi_max; i++) {
-		page = radix_tree_delete(&udev->data_blocks, i);
+	for (i = start; i < end; i++) {
+		page = radix_tree_delete(blocks, i);
 		if (page) {
 			__free_page(page);
 			atomic_dec(&global_db_count);
 		}
 	}
-	mutex_unlock(&udev->cmdr_lock);
 }
 
 static void tcmu_dev_kref_release(struct kref *kref)
@@ -1339,7 +1337,9 @@ static void tcmu_dev_kref_release(struct kref *kref)
 	spin_unlock_irq(&udev->commands_lock);
 	WARN_ON(!all_expired);
 
-	tcmu_blocks_release(udev);
+	mutex_lock(&udev->cmdr_lock);
+	tcmu_blocks_release(&udev->data_blocks, 0, udev->dbi_max + 1);
+	mutex_unlock(&udev->cmdr_lock);
 
 	call_rcu(&dev->rcu_head, tcmu_dev_call_rcu);
 }
@@ -1976,8 +1976,6 @@ static int unmap_thread_fn(void *data)
 	struct tcmu_dev *udev;
 	loff_t off;
 	uint32_t start, end, block;
-	struct page *page;
-	int i;
 
 	while (!kthread_should_stop()) {
 		DEFINE_WAIT(__wait);
@@ -2025,13 +2023,7 @@ static int unmap_thread_fn(void *data)
 			unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
 
 			/* Release the block pages */
-			for (i = start; i < end; i++) {
-				page = radix_tree_delete(&udev->data_blocks, i);
-				if (page) {
-					__free_page(page);
-					atomic_dec(&global_db_count);
-				}
-			}
+			tcmu_blocks_release(&udev->data_blocks, start, end);
 			mutex_unlock(&udev->cmdr_lock);
 		}
 
-- 
1.8.3.1

* [PATCH 06/19] tcmu: split unmap_thread_fn
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (4 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 05/19] tcmu: merge common block release code Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 07/19] tcmu: fix unmap thread race Mike Christie
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

Separate unmap_thread_fn to make it easier to read.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 120 ++++++++++++++++++++++----------------
 1 file changed, 70 insertions(+), 50 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index a6205e4..15b54fd 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -1971,71 +1971,91 @@ static ssize_t tcmu_emulate_write_cache_store(struct config_item *item,
 	.tb_dev_attrib_attrs	= NULL,
 };
 
-static int unmap_thread_fn(void *data)
+
+static void find_free_blocks(void)
 {
 	struct tcmu_dev *udev;
 	loff_t off;
 	uint32_t start, end, block;
 
-	while (!kthread_should_stop()) {
-		DEFINE_WAIT(__wait);
-
-		prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE);
-		schedule();
-		finish_wait(&unmap_wait, &__wait);
+	mutex_lock(&root_udev_mutex);
+	list_for_each_entry(udev, &root_udev, node) {
+		mutex_lock(&udev->cmdr_lock);
 
-		if (kthread_should_stop())
-			break;
+		/* Try to complete the finished commands first */
+		tcmu_handle_completions(udev);
 
-		mutex_lock(&root_udev_mutex);
-		list_for_each_entry(udev, &root_udev, node) {
-			mutex_lock(&udev->cmdr_lock);
+		/* Skip the udevs waiting the global pool or in idle */
+		if (udev->waiting_global || !udev->dbi_thresh) {
+			mutex_unlock(&udev->cmdr_lock);
+			continue;
+		}
 
-			/* Try to complete the finished commands first */
-			tcmu_handle_completions(udev);
+		end = udev->dbi_max + 1;
+		block = find_last_bit(udev->data_bitmap, end);
+		if (block == udev->dbi_max) {
+			/*
+			 * The last bit is dbi_max, so there is
+			 * no need to shrink any blocks.
+			 */
+			mutex_unlock(&udev->cmdr_lock);
+			continue;
+		} else if (block == end) {
+			/* The current udev will goto idle state */
+			udev->dbi_thresh = start = 0;
+			udev->dbi_max = 0;
+		} else {
+			udev->dbi_thresh = start = block + 1;
+			udev->dbi_max = block;
+		}
 
-			/* Skip the udevs waiting the global pool or in idle */
-			if (udev->waiting_global || !udev->dbi_thresh) {
-				mutex_unlock(&udev->cmdr_lock);
-				continue;
-			}
+		/* Here will truncate the data area from off */
+		off = udev->data_off + start * DATA_BLOCK_SIZE;
+		unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
 
-			end = udev->dbi_max + 1;
-			block = find_last_bit(udev->data_bitmap, end);
-			if (block == udev->dbi_max) {
-				/*
-				 * The last bit is dbi_max, so there is
-				 * no need to shrink any blocks.
-				 */
-				mutex_unlock(&udev->cmdr_lock);
-				continue;
-			} else if (block == end) {
-				/* The current udev will goto idle state */
-				udev->dbi_thresh = start = 0;
-				udev->dbi_max = 0;
-			} else {
-				udev->dbi_thresh = start = block + 1;
-				udev->dbi_max = block;
-			}
+		/* Release the block pages */
+		tcmu_blocks_release(&udev->data_blocks, start, end);
+		mutex_unlock(&udev->cmdr_lock);
+	}
+	mutex_unlock(&root_udev_mutex);
+}
 
-			/* Here will truncate the data area from off */
-			off = udev->data_off + start * DATA_BLOCK_SIZE;
-			unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
+static void run_cmdr_queues(void)
+{
+	struct tcmu_dev *udev;
 
-			/* Release the block pages */
-			tcmu_blocks_release(&udev->data_blocks, start, end);
+	/*
+	 * Try to wake up the udevs who are waiting
+	 * for the global data block pool.
+	 */
+	mutex_lock(&root_udev_mutex);
+	list_for_each_entry(udev, &root_udev, node) {
+		mutex_lock(&udev->cmdr_lock);
+		if (!udev->waiting_global) {
 			mutex_unlock(&udev->cmdr_lock);
+			break;
 		}
+		mutex_unlock(&udev->cmdr_lock);
 
-		/*
-		 * Try to wake up the udevs who are waiting
-		 * for the global data pool.
-		 */
-		list_for_each_entry(udev, &root_udev, node) {
-			if (udev->waiting_global)
-				wake_up(&udev->wait_cmdr);
-		}
-		mutex_unlock(&root_udev_mutex);
+		wake_up(&udev->wait_cmdr);
+	}
+	mutex_unlock(&root_udev_mutex);
+}
+
+static int unmap_thread_fn(void *data)
+{
+	while (!kthread_should_stop()) {
+		DEFINE_WAIT(__wait);
+
+		prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE);
+		schedule();
+		finish_wait(&unmap_wait, &__wait);
+
+		if (kthread_should_stop())
+			break;
+
+		find_free_blocks();
+		run_cmdr_queues();
 	}
 
 	return 0;
-- 
1.8.3.1

* [PATCH 07/19] tcmu: fix unmap thread race
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (5 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 06/19] tcmu: split unmap_thread_fn Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 08/19] tcmu: move expired command completion to unmap thread Mike Christie
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

If the unmap thread has already run find_free_blocks
but has not yet run prepare_to_wait when a wake_up(&unmap_wait)
call is made, the unmap thread is going to miss the wakeup.
Instead of adding checks for whether new waiters were added,
this just has us use a workqueue, which will run us again
in this type of case.
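
The lost wakeup, as a timeline (illustrative only):

  unmap thread                            waker
  ------------                            -----
  find_free_blocks()
  run_cmdr_queues()
                                          wake_up(&unmap_wait);
                                          (no waiter yet, so the
                                           wakeup is lost)
  prepare_to_wait(&unmap_wait, ...);
  schedule();  <- sleeps until the next wakeup, which may never come

schedule_work() does not have this problem: it marks the work item
pending, so a request made while the work function is running just
causes it to be run again once it finishes.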

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 43 +++++++++------------------------------
 1 file changed, 10 insertions(+), 33 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 15b54fd..14d9b79 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -32,7 +32,7 @@
 #include <linux/highmem.h>
 #include <linux/configfs.h>
 #include <linux/mutex.h>
-#include <linux/kthread.h>
+#include <linux/workqueue.h>
 #include <net/genetlink.h>
 #include <scsi/scsi_common.h>
 #include <scsi/scsi_proto.h>
@@ -176,12 +176,11 @@ struct tcmu_cmd {
 	unsigned long flags;
 };
 
-static struct task_struct *unmap_thread;
-static wait_queue_head_t unmap_wait;
 static DEFINE_MUTEX(root_udev_mutex);
 static LIST_HEAD(root_udev);
 
 static atomic_t global_db_count = ATOMIC_INIT(0);
+static struct work_struct tcmu_unmap_work;
 
 static struct kmem_cache *tcmu_cmd_cache;
 
@@ -389,8 +388,7 @@ static bool tcmu_get_empty_blocks(struct tcmu_dev *udev,
 
 err:
 	udev->waiting_global = true;
-	/* Try to wake up the unmap thread */
-	wake_up(&unmap_wait);
+	schedule_work(&tcmu_unmap_work);
 	return false;
 }
 
@@ -1063,8 +1061,7 @@ static void tcmu_device_timedout(unsigned long data)
 	idr_for_each(&udev->commands, tcmu_check_expired_cmd, NULL);
 	spin_unlock_irqrestore(&udev->commands_lock, flags);
 
-	/* Try to wake up the ummap thread */
-	wake_up(&unmap_wait);
+	schedule_work(&tcmu_unmap_work);
 
 	/*
 	 * We don't need to wakeup threads on wait_cmdr since they have their
@@ -2042,23 +2039,10 @@ static void run_cmdr_queues(void)
 	mutex_unlock(&root_udev_mutex);
 }
 
-static int unmap_thread_fn(void *data)
+static void tcmu_unmap_work_fn(struct work_struct *work)
 {
-	while (!kthread_should_stop()) {
-		DEFINE_WAIT(__wait);
-
-		prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE);
-		schedule();
-		finish_wait(&unmap_wait, &__wait);
-
-		if (kthread_should_stop())
-			break;
-
-		find_free_blocks();
-		run_cmdr_queues();
-	}
-
-	return 0;
+	find_free_blocks();
+	run_cmdr_queues();
 }
 
 static int __init tcmu_module_init(void)
@@ -2067,6 +2051,8 @@ static int __init tcmu_module_init(void)
 
 	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
 
+	INIT_WORK(&tcmu_unmap_work, tcmu_unmap_work_fn);
+
 	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
 				sizeof(struct tcmu_cmd),
 				__alignof__(struct tcmu_cmd),
@@ -2112,17 +2098,8 @@ static int __init tcmu_module_init(void)
 	if (ret)
 		goto out_attrs;
 
-	init_waitqueue_head(&unmap_wait);
-	unmap_thread = kthread_run(unmap_thread_fn, NULL, "tcmu_unmap");
-	if (IS_ERR(unmap_thread)) {
-		ret = PTR_ERR(unmap_thread);
-		goto out_unreg_transport;
-	}
-
 	return 0;
 
-out_unreg_transport:
-	target_backend_unregister(&tcmu_ops);
 out_attrs:
 	kfree(tcmu_attrs);
 out_unreg_genl:
@@ -2137,7 +2114,7 @@ static int __init tcmu_module_init(void)
 
 static void __exit tcmu_module_exit(void)
 {
-	kthread_stop(unmap_thread);
+	cancel_work_sync(&tcmu_unmap_work);
 	target_backend_unregister(&tcmu_ops);
 	kfree(tcmu_attrs);
 	genl_unregister_family(&tcmu_genl_family);
-- 
1.8.3.1

* [PATCH 08/19] tcmu: move expired command completion to unmap thread
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (6 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 07/19] tcmu: fix unmap thread race Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 09/19] tcmu: remove commands_lock Mike Christie
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

This moves the expired command completion handling to
the unmap wq, so the next patch can use a mutex
in tcmu_check_expired_cmd.

Notes:
tcmu_device_timedout's use of spin_lock_irq was not needed.
The commands_lock is used between thread context (tcmu_queue_cmd_ring
and tcmu_irqcontrol (even though it is named irqcontrol, it is not
run in irq context)) and timer/bh context. In the timer/bh context
bhs are already disabled, so the thread context callers only need to
use the _bh lock calls.
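
The same rule applies to the timed_out_udevs_lock added below
(simplified from the diff):

/* Timer callback: bhs are already disabled here, a plain lock is
 * enough. */
static void tcmu_device_timedout(unsigned long data)
{
        struct tcmu_dev *udev = (struct tcmu_dev *)data;

        spin_lock(&timed_out_udevs_lock);
        if (list_empty(&udev->timedout_entry))
                list_add_tail(&udev->timedout_entry, &timed_out_udevs);
        spin_unlock(&timed_out_udevs_lock);

        schedule_work(&tcmu_unmap_work);
}

/*
 * Thread context (e.g. the new check_timedout_devices), simplified:
 * _bh prevents the timer from interrupting the lock holder on the
 * same CPU and deadlocking.
 */
        spin_lock_bh(&timed_out_udevs_lock);
        list_splice_init(&timed_out_udevs, &devs);
        spin_unlock_bh(&timed_out_udevs_lock);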

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 48 +++++++++++++++++++++++++++++++--------
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 14d9b79..7271ec2 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -143,6 +143,7 @@ struct tcmu_dev {
 
 	struct timer_list timeout;
 	unsigned int cmd_time_out;
+	struct list_head timedout_entry;
 
 	spinlock_t nl_cmd_lock;
 	struct tcmu_nl_cmd curr_nl_cmd;
@@ -179,6 +180,9 @@ struct tcmu_cmd {
 static DEFINE_MUTEX(root_udev_mutex);
 static LIST_HEAD(root_udev);
 
+static DEFINE_SPINLOCK(timed_out_udevs_lock);
+static LIST_HEAD(timed_out_udevs);
+
 static atomic_t global_db_count = ATOMIC_INIT(0);
 static struct work_struct tcmu_unmap_work;
 
@@ -1055,18 +1059,15 @@ static int tcmu_check_expired_cmd(int id, void *p, void *data)
 static void tcmu_device_timedout(unsigned long data)
 {
 	struct tcmu_dev *udev = (struct tcmu_dev *)data;
-	unsigned long flags;
 
-	spin_lock_irqsave(&udev->commands_lock, flags);
-	idr_for_each(&udev->commands, tcmu_check_expired_cmd, NULL);
-	spin_unlock_irqrestore(&udev->commands_lock, flags);
+	pr_debug("%s cmd timeout has expired\n", udev->name);
 
-	schedule_work(&tcmu_unmap_work);
+	spin_lock(&timed_out_udevs_lock);
+	if (list_empty(&udev->timedout_entry))
+		list_add_tail(&udev->timedout_entry, &timed_out_udevs);
+	spin_unlock(&timed_out_udevs_lock);
 
-	/*
-	 * We don't need to wakeup threads on wait_cmdr since they have their
-	 * own timeout.
-	 */
+	schedule_work(&tcmu_unmap_work);
 }
 
 static int tcmu_attach_hba(struct se_hba *hba, u32 host_id)
@@ -1110,6 +1111,7 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	init_waitqueue_head(&udev->wait_cmdr);
 	mutex_init(&udev->cmdr_lock);
 
+	INIT_LIST_HEAD(&udev->timedout_entry);
 	idr_init(&udev->commands);
 	spin_lock_init(&udev->commands_lock);
 
@@ -1324,6 +1326,11 @@ static void tcmu_dev_kref_release(struct kref *kref)
 	vfree(udev->mb_addr);
 	udev->mb_addr = NULL;
 
+	spin_lock_bh(&timed_out_udevs_lock);
+	if (!list_empty(&udev->timedout_entry))
+		list_del(&udev->timedout_entry);
+	spin_unlock_bh(&timed_out_udevs_lock);
+
 	/* Upper layer should drain all requests before calling this */
 	spin_lock_irq(&udev->commands_lock);
 	idr_for_each_entry(&udev->commands, cmd, i) {
@@ -2039,8 +2046,31 @@ static void run_cmdr_queues(void)
 	mutex_unlock(&root_udev_mutex);
 }
 
+static void check_timedout_devices(void)
+{
+	struct tcmu_dev *udev, *tmp_dev;
+	LIST_HEAD(devs);
+
+	spin_lock_bh(&timed_out_udevs_lock);
+	list_splice_init(&timed_out_udevs, &devs);
+
+	list_for_each_entry_safe(udev, tmp_dev, &devs, timedout_entry) {
+		list_del_init(&udev->timedout_entry);
+		spin_unlock_bh(&timed_out_udevs_lock);
+
+		spin_lock(&udev->commands_lock);
+		idr_for_each(&udev->commands, tcmu_check_expired_cmd, NULL);
+		spin_unlock(&udev->commands_lock);
+
+		spin_lock_bh(&timed_out_udevs_lock);
+	}
+
+	spin_unlock_bh(&timed_out_udevs_lock);
+}
+
 static void tcmu_unmap_work_fn(struct work_struct *work)
 {
+	check_timedout_devices();
 	find_free_blocks();
 	run_cmdr_queues();
 }
-- 
1.8.3.1

* [PATCH 09/19] tcmu: remove commands_lock
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (7 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 08/19] tcmu: move expired command completion to unmap thread Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 10/19] tcmu: release blocks for partially setup cmds Mike Christie
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

No need for the commands_lock. The cmdr_lock is already held during
idr addition and deletion, so just grab it during traversal.

Note: This also fixes an issue where we should have been using at
least _bh locking in tcmu_handle_completions when taking the commands
lock to prevent the case where tcmu_handle_completions could be
interrupted by a timer softirq while the commands_lock is held.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 7271ec2..5226b82 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -139,7 +139,6 @@ struct tcmu_dev {
 	struct radix_tree_root data_blocks;
 
 	struct idr commands;
-	spinlock_t commands_lock;
 
 	struct timer_list timeout;
 	unsigned int cmd_time_out;
@@ -1012,10 +1011,7 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
 		}
 		WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD);
 
-		spin_lock(&udev->commands_lock);
 		cmd = idr_remove(&udev->commands, entry->hdr.cmd_id);
-		spin_unlock(&udev->commands_lock);
-
 		if (!cmd) {
 			pr_err("cmd_id not found, ring is broken\n");
 			set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags);
@@ -1113,7 +1109,6 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 
 	INIT_LIST_HEAD(&udev->timedout_entry);
 	idr_init(&udev->commands);
-	spin_lock_init(&udev->commands_lock);
 
 	setup_timer(&udev->timeout, tcmu_device_timedout,
 		(unsigned long)udev);
@@ -1332,16 +1327,14 @@ static void tcmu_dev_kref_release(struct kref *kref)
 	spin_unlock_bh(&timed_out_udevs_lock);
 
 	/* Upper layer should drain all requests before calling this */
-	spin_lock_irq(&udev->commands_lock);
+	mutex_lock(&udev->cmdr_lock);
 	idr_for_each_entry(&udev->commands, cmd, i) {
 		if (tcmu_check_and_free_pending_cmd(cmd) != 0)
 			all_expired = false;
 	}
 	idr_destroy(&udev->commands);
-	spin_unlock_irq(&udev->commands_lock);
 	WARN_ON(!all_expired);
 
-	mutex_lock(&udev->cmdr_lock);
 	tcmu_blocks_release(&udev->data_blocks, 0, udev->dbi_max + 1);
 	mutex_unlock(&udev->cmdr_lock);
 
@@ -2058,9 +2051,9 @@ static void check_timedout_devices(void)
 		list_del_init(&udev->timedout_entry);
 		spin_unlock_bh(&timed_out_udevs_lock);
 
-		spin_lock(&udev->commands_lock);
+		mutex_lock(&udev->cmdr_lock);
 		idr_for_each(&udev->commands, tcmu_check_expired_cmd, NULL);
-		spin_unlock(&udev->commands_lock);
+		mutex_unlock(&udev->cmdr_lock);
 
 		spin_lock_bh(&timed_out_udevs_lock);
 	}
-- 
1.8.3.1

* [PATCH 10/19] tcmu: release blocks for partially setup cmds
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (8 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 09/19] tcmu: remove commands_lock Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 11/19] tcmu: simplify scatter_data_area error handling Mike Christie
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

If we cannot set up a cmd because we run out of ring space
or global pages, release the blocks before sleeping. This
prevents a deadlock where dev0 has waiting_blocks set and
needs N blocks, but dev1 to devX have each allocated N / X blocks
and also hit the global block limit, so they went to sleep.

find_free_blocks is not able to take the sleeping devs'
blocks because their waiting_blocks is set, and even
if it were not, the block returned by find_last_bit could equal
dbi_max. The latter will probably never happen because
DATA_BLOCK_BITS is so high, but in the next patches
DATA_BLOCK_BITS and TCMU_GLOBAL_MAX_BLOCKS will be settable, so
it might be lower and could happen.
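
To make the deadlock concrete with hypothetical numbers: say the
global limit is 12 blocks, dev0 needs 4 blocks to finish setting up a
cmd, and dev1-dev3 have each grabbed 4 blocks for partially set up
cmds. The pool is exhausted, all four devices sleep with
waiting_blocks set, and find_free_blocks cannot reclaim from any of
them, so nothing makes forward progress. Releasing the partial
allocations before sleeping lets one device's blocks satisfy another
device's request.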

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 5226b82..612ac65 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -806,6 +806,13 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 		int ret;
 		DEFINE_WAIT(__wait);
 
+		/*
+		 * Don't leave commands partially setup because the unmap
+		 * thread might need the blocks to make forward progress.
+		 */
+		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cur);
+		tcmu_cmd_reset_dbi_cur(tcmu_cmd);
+
 		prepare_to_wait(&udev->wait_cmdr, &__wait, TASK_INTERRUPTIBLE);
 
 		pr_debug("sleeping for ring space\n");
-- 
1.8.3.1

* [PATCH 11/19] tcmu: simplify scatter_data_area error handling
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (9 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 10/19] tcmu: release blocks for partially setup cmds Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 12/19] tcmu: fix free block calculation Mike Christie
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

scatter_data_area always returns 0, so stop checking
for errors.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 31 +++++++------------------------
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 612ac65..76f0e5a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -519,7 +519,7 @@ static inline size_t iov_tail(struct iovec *iov)
 	return (size_t)iov->iov_base + iov->iov_len;
 }
 
-static int scatter_data_area(struct tcmu_dev *udev,
+static void scatter_data_area(struct tcmu_dev *udev,
 	struct tcmu_cmd *tcmu_cmd, struct scatterlist *data_sg,
 	unsigned int data_nents, struct iovec **iov,
 	int *iov_cnt, bool copy_data)
@@ -572,8 +572,6 @@ static int scatter_data_area(struct tcmu_dev *udev,
 	}
 	if (to)
 		kunmap_atomic(to);
-
-	return 0;
 }
 
 static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
@@ -863,33 +861,18 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	iov_cnt = 0;
 	copy_to_data_area = (se_cmd->data_direction == DMA_TO_DEVICE
 		|| se_cmd->se_cmd_flags & SCF_BIDI);
-	ret = scatter_data_area(udev, tcmu_cmd, se_cmd->t_data_sg,
-				se_cmd->t_data_nents, &iov, &iov_cnt,
-				copy_to_data_area);
-	if (ret) {
-		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cnt);
-		mutex_unlock(&udev->cmdr_lock);
-
-		pr_err("tcmu: alloc and scatter data failed\n");
-		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
-	}
+	scatter_data_area(udev, tcmu_cmd, se_cmd->t_data_sg,
+			  se_cmd->t_data_nents, &iov, &iov_cnt,
+			  copy_to_data_area);
 	entry->req.iov_cnt = iov_cnt;
 
 	/* Handle BIDI commands */
 	iov_cnt = 0;
 	if (se_cmd->se_cmd_flags & SCF_BIDI) {
 		iov++;
-		ret = scatter_data_area(udev, tcmu_cmd,
-					se_cmd->t_bidi_data_sg,
-					se_cmd->t_bidi_data_nents,
-					&iov, &iov_cnt, false);
-		if (ret) {
-			tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cnt);
-			mutex_unlock(&udev->cmdr_lock);
-
-			pr_err("tcmu: alloc and scatter bidi data failed\n");
-			return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
-		}
+		scatter_data_area(udev, tcmu_cmd, se_cmd->t_bidi_data_sg,
+				  se_cmd->t_bidi_data_nents, &iov, &iov_cnt,
+				  false);
 	}
 	entry->req.iov_bidi_cnt = iov_cnt;
 
-- 
1.8.3.1

* [PATCH 12/19] tcmu: fix free block calculation
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (10 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 11/19] tcmu: simplify scatter_data_area error handling Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 13/19] tcmu: clean up the scatter helper Mike Christie
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

The blocks_left calculation does not account for free blocks
between 0 and thresh, so we could be queueing/waiting even when
there are enough blocks free.

This has us add in the free blocks between 0 and thresh as well as
those at the end, from thresh to DATA_BLOCK_BITS.
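
A worked example with small, hypothetical numbers (the real
DATA_BLOCK_BITS is 256K):

  DATA_BLOCK_BITS = 16, dbi_thresh = 8, and 3 of the 8 blocks below
  thresh are free, so spc_bitmap_free() now returns 3.

  old: blocks_left = 16 - 8     = 8
  new: blocks_left = 16 - 8 + 3 = 11

A command needing 10 blocks would have waited under the old math even
though 11 blocks are actually available.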

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 76f0e5a..b7eeb25 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -636,7 +636,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 
 static inline size_t spc_bitmap_free(unsigned long *bitmap, uint32_t thresh)
 {
-	return DATA_BLOCK_SIZE * (thresh - bitmap_weight(bitmap, thresh));
+	return thresh - bitmap_weight(bitmap, thresh);
 }
 
 /*
@@ -676,8 +676,9 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 
 	/* try to check and get the data blocks as needed */
 	space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
-	if (space < data_needed) {
-		unsigned long blocks_left = DATA_BLOCK_BITS - udev->dbi_thresh;
+	if ((space * DATA_BLOCK_SIZE) < data_needed) {
+		unsigned long blocks_left = DATA_BLOCK_BITS - udev->dbi_thresh +
+						space;
 		unsigned long grow;
 
 		if (blocks_left < blocks_needed) {
-- 
1.8.3.1

* [PATCH 13/19] tcmu: clean up the scatter helper
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (11 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 12/19] tcmu: fix free block calculation Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 14/19] tcmu: prep queue_cmd_ring to be used by unmap wq Mike Christie
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab
  Cc: Xiubo Li, Mike Christie

From: Xiubo Li <lixiubo@cmss.chinamobile.com>

Add some comments to make the scatter code more readable,
and drop the unused arg to new_iov.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index b7eeb25..91e0601 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -491,8 +491,7 @@ static inline size_t head_to_end(size_t head, size_t size)
 	return size - head;
 }
 
-static inline void new_iov(struct iovec **iov, int *iov_cnt,
-			   struct tcmu_dev *udev)
+static inline void new_iov(struct iovec **iov, int *iov_cnt)
 {
 	struct iovec *iovec;
 
@@ -545,19 +544,38 @@ static void scatter_data_area(struct tcmu_dev *udev,
 				to = kmap_atomic(page);
 			}
 
-			copy_bytes = min_t(size_t, sg_remaining,
-					block_remaining);
+			/*
+			 * Convert to virtual offset of the ring data area.
+			 */
 			to_offset = get_block_offset_user(udev, dbi,
 					block_remaining);
 
+			/*
+			 * The following code will gather and map the blocks
+			 * to the same iovec when the blocks are all next to
+			 * each other.
+			 */
+			copy_bytes = min_t(size_t, sg_remaining,
+					block_remaining);
 			if (*iov_cnt != 0 &&
 			    to_offset == iov_tail(*iov)) {
+				/*
+				 * Will append to the current iovec, because
+				 * the current block page is next to the
+				 * previous one.
+				 */
 				(*iov)->iov_len += copy_bytes;
 			} else {
-				new_iov(iov, iov_cnt, udev);
+				/*
+				 * Will allocate a new iovec because we are
+				 * first time here or the current block page
+				 * is not next to the previous one.
+				 */
+				new_iov(iov, iov_cnt);
 				(*iov)->iov_base = (void __user *)to_offset;
 				(*iov)->iov_len = copy_bytes;
 			}
+
 			if (copy_data) {
 				offset = DATA_BLOCK_SIZE - block_remaining;
 				memcpy(to + offset,
@@ -565,11 +583,13 @@ static void scatter_data_area(struct tcmu_dev *udev,
 				       copy_bytes);
 				tcmu_flush_dcache_range(to, copy_bytes);
 			}
+
 			sg_remaining -= copy_bytes;
 			block_remaining -= copy_bytes;
 		}
 		kunmap_atomic(from - sg->offset);
 	}
+
 	if (to)
 		kunmap_atomic(to);
 }
-- 
1.8.3.1

* [PATCH 14/19] tcmu: prep queue_cmd_ring to be used by unmap wq
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (12 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 13/19] tcmu: clean up the scatter helper Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 15/19] tcmu: simplify dbi thresh handling Mike Christie
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

In the next patch we will call queue_cmd_ring from the submitting
context and also from the unmap wq when blocks are freed. This changes
the queue_cmd_ring return code so that in the next patch we can
return a sense_reason_t and also tell the caller whether the cmd
was internally queued.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 41 +++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 91e0601..68668a8 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -775,8 +775,16 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	return 0;
 }
 
-static sense_reason_t
-tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
+/**
+ * queue_cmd_ring - queue cmd to ring or internally
+ * @tcmu_cmd: cmd to queue
+ * @scsi_err: TCM error code, set when -1 is returned.
+ *
+ * Returns:
+ * -1 we cannot queue internally or to the ring.
+ *  0 success
+ */
+static int queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, sense_reason_t *scsi_err)
 {
 	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
 	struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
@@ -790,8 +798,12 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	bool copy_to_data_area;
 	size_t data_length = tcmu_cmd_get_data_length(tcmu_cmd);
 
-	if (test_bit(TCMU_DEV_BIT_BROKEN, &udev->flags))
-		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	*scsi_err = TCM_NO_SENSE;
+
+	if (test_bit(TCMU_DEV_BIT_BROKEN, &udev->flags)) {
+		*scsi_err = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+		return -1;
+	}
 
 	/*
 	 * Must be a certain minimum size for response sense info, but
@@ -818,7 +830,8 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 			"cmd ring/data area\n", command_size, data_length,
 			udev->cmdr_size, udev->data_size);
 		mutex_unlock(&udev->cmdr_lock);
-		return TCM_INVALID_CDB_FIELD;
+		*scsi_err = TCM_INVALID_CDB_FIELD;
+		return -1;
 	}
 
 	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
@@ -844,7 +857,8 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 		finish_wait(&udev->wait_cmdr, &__wait);
 		if (!ret) {
 			pr_warn("tcmu: command timed out\n");
-			return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+			*scsi_err = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+			return -1;
 		}
 
 		mutex_lock(&udev->cmdr_lock);
@@ -900,7 +914,8 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	ret = tcmu_setup_cmd_timer(tcmu_cmd);
 	if (ret) {
 		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cnt);
-		return TCM_OUT_OF_RESOURCES;
+		*scsi_err = TCM_OUT_OF_RESOURCES;
+		return -1;
 	}
 	entry->hdr.cmd_id = tcmu_cmd->cmd_id;
 
@@ -931,27 +946,23 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 		mod_timer(&udev->timeout, round_jiffies_up(jiffies +
 			  msecs_to_jiffies(udev->cmd_time_out)));
 
-	return TCM_NO_SENSE;
+	return 0;
 }
 
 static sense_reason_t
 tcmu_queue_cmd(struct se_cmd *se_cmd)
 {
 	struct tcmu_cmd *tcmu_cmd;
-	sense_reason_t ret;
+	sense_reason_t scsi_ret;
 
 	tcmu_cmd = tcmu_alloc_cmd(se_cmd);
 	if (!tcmu_cmd)
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 
-	ret = tcmu_queue_cmd_ring(tcmu_cmd);
-	if (ret != TCM_NO_SENSE) {
-		pr_err("TCMU: Could not queue command\n");
-
+	if (queue_cmd_ring(tcmu_cmd, &scsi_ret) < 0)
 		tcmu_free_cmd(tcmu_cmd);
-	}
 
-	return ret;
+	return scsi_ret;
 }
 
 static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *entry)
-- 
1.8.3.1

* [PATCH 15/19] tcmu: simplify dbi thresh handling
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (13 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 14/19] tcmu: prep queue_cmd_ring to be used by unmap wq Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 16/19] tcmu: don't block submitting context for block waits Mike Christie
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

We do not really save a lot by trying to increase the thresh
by a multiple of the existing value. Just simplify the code
by increasing it to whatever is needed for the command being
executed.
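
As a standalone illustration of the difference (plain userspace C,
not kernel code; the numbers are made up):

/* Standalone comparison of the old and new dbi_thresh growth,
 * clamped to the bitmap size. */
#include <stdio.h>

static unsigned int grow_old(unsigned int thresh, unsigned int needed,
			     unsigned int max_bits)
{
	unsigned int grow = needed > thresh / 2 ? needed : thresh / 2;

	thresh += grow;
	return thresh > max_bits ? max_bits : thresh;
}

static unsigned int grow_new(unsigned int thresh, unsigned int needed,
			     unsigned int max_bits)
{
	thresh += needed;
	return thresh > max_bits ? max_bits : thresh;
}

int main(void)
{
	/* thresh at 128 blocks, command needs 4 more, 256K block limit */
	printf("old: %u\n", grow_old(128, 4, 256 * 1024));	/* 192 */
	printf("new: %u\n", grow_new(128, 4, 256 * 1024));	/* 132 */
	return 0;
}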

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 68668a8..0630657 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -79,7 +79,6 @@
 #define DATA_BLOCK_SIZE PAGE_SIZE
 #define DATA_BLOCK_BITS (256 * 1024)
 #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
-#define DATA_BLOCK_INIT_BITS 128
 
 /* The total size of the ring is 8M + 256K * PAGE_SIZE */
 #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
@@ -699,7 +698,6 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	if ((space * DATA_BLOCK_SIZE) < data_needed) {
 		unsigned long blocks_left = DATA_BLOCK_BITS - udev->dbi_thresh +
 						space;
-		unsigned long grow;
 
 		if (blocks_left < blocks_needed) {
 			pr_debug("no data space: only %lu available, but ask for %zu\n",
@@ -708,23 +706,9 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 			return false;
 		}
 
-		/* Try to expand the thresh */
-		if (!udev->dbi_thresh) {
-			/* From idle state */
-			uint32_t init_thresh = DATA_BLOCK_INIT_BITS;
-
-			udev->dbi_thresh = max(blocks_needed, init_thresh);
-		} else {
-			/*
-			 * Grow the data area by max(blocks needed,
-			 * dbi_thresh / 2), but limited to the max
-			 * DATA_BLOCK_BITS size.
-			 */
-			grow = max(blocks_needed, udev->dbi_thresh / 2);
-			udev->dbi_thresh += grow;
-			if (udev->dbi_thresh > DATA_BLOCK_BITS)
-				udev->dbi_thresh = DATA_BLOCK_BITS;
-		}
+		udev->dbi_thresh += blocks_needed;
+		if (udev->dbi_thresh > DATA_BLOCK_BITS)
+			udev->dbi_thresh = DATA_BLOCK_BITS;
 	}
 
 	return tcmu_get_empty_blocks(udev, cmd);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 16/19] tcmu: don't block submitting context for block waits
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (14 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 15/19] tcmu: simplify dbi thresh handling Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 17/19] tcmu: make ring buffer timer configurable Mike Christie
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

This patch has tcmu internally queue cmds if its ring buffer
is full. It also makes the TCMU_GLOBAL_MAX_BLOCKS limit a
hint instead of a hard limit, so we do not have to add any
new locks/atomics in the main IO path except when IO is not
running. A standalone model of the new queueing behavior is
sketched after the bug list below.

This fixes the following bugs:

1. We cannot sleep from the submitting context because it might be
called from a target recv context. This results in transport level
commands timing out. For example, if the ring is full we would
sleep, and an iscsi initiator would send an iscsi ping/nop which
times out because the target's recv thread is sleeping here.

2. Devices were not fairly scheduled to run when they hit the global
limit, so they could time out waiting for ring space while others
got to run.
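
A standalone model of the new behavior, in plain userspace C (not the
kernel code; ring and queue sizes are made up):

/* Standalone model: park commands on an internal FIFO when the ring
 * is full instead of sleeping, and drain the FIFO as completions
 * free ring space. */
#include <stdio.h>

#define RING_SLOTS 2
#define QLEN 8

static int ring_used;
static int fifo[QLEN], head, tail;

static int submit(int cmd_id)
{
	/* queue behind earlier waiters to keep ordering, or when full */
	if (head != tail || ring_used == RING_SLOTS) {
		fifo[tail++ % QLEN] = cmd_id;
		return 1;		/* internally queued, no sleeping */
	}
	ring_used++;			/* placed directly on the ring */
	return 0;
}

static void complete_one(void)
{
	ring_used--;			/* a completion freed ring space */
	while (head != tail && ring_used < RING_SLOTS) {
		printf("requeued cmd %d\n", fifo[head++ % QLEN]);
		ring_used++;
	}
}

int main(void)
{
	for (int i = 0; i < 4; i++)
		printf("cmd %d -> %s\n", i, submit(i) ? "queued" : "ring");
	complete_one();
	return 0;
}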

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 264 ++++++++++++++++++++++++--------------
 1 file changed, 169 insertions(+), 95 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 0630657..80dd17a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -83,7 +83,10 @@
 /* The total size of the ring is 8M + 256K * PAGE_SIZE */
 #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
 
-/* Default maximum of the global data blocks(512K * PAGE_SIZE) */
+/*
+ * Default number of global data blocks (512K * PAGE_SIZE)
+ * above which the unmap thread will be started.
+ */
 #define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
 
 static u8 tcmu_kern_cmd_reply_supported;
@@ -106,6 +109,7 @@ struct tcmu_nl_cmd {
 struct tcmu_dev {
 	struct list_head node;
 	struct kref kref;
+
 	struct se_device se_dev;
 
 	char *name;
@@ -128,10 +132,9 @@ struct tcmu_dev {
 	size_t data_off;
 	size_t data_size;
 
-	wait_queue_head_t wait_cmdr;
 	struct mutex cmdr_lock;
+	struct list_head cmdr_queue;
 
-	bool waiting_global;
 	uint32_t dbi_max;
 	uint32_t dbi_thresh;
 	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
@@ -160,6 +163,7 @@ struct tcmu_dev {
 struct tcmu_cmd {
 	struct se_cmd *se_cmd;
 	struct tcmu_dev *tcmu_dev;
+	struct list_head cmdr_queue_entry;
 
 	uint16_t cmd_id;
 
@@ -174,7 +178,16 @@ struct tcmu_cmd {
 #define TCMU_CMD_BIT_EXPIRED 0
 	unsigned long flags;
 };
-
+/*
+ * To avoid deadlock, the mutex lock order should always be:
+ *
+ * mutex_lock(&root_udev_mutex);
+ * ...
+ * mutex_lock(&tcmu_dev->cmdr_lock);
+ * mutex_unlock(&tcmu_dev->cmdr_lock);
+ * ...
+ * mutex_unlock(&root_udev_mutex);
+ */
 static DEFINE_MUTEX(root_udev_mutex);
 static LIST_HEAD(root_udev);
 
@@ -182,7 +195,7 @@ struct tcmu_cmd {
 static LIST_HEAD(timed_out_udevs);
 
 static atomic_t global_db_count = ATOMIC_INIT(0);
-static struct work_struct tcmu_unmap_work;
+static struct delayed_work tcmu_unmap_work;
 
 static struct kmem_cache *tcmu_cmd_cache;
 
@@ -346,10 +359,8 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev *udev,
 	page = radix_tree_lookup(&udev->data_blocks, dbi);
 	if (!page) {
 		if (atomic_add_return(1, &global_db_count) >
-					TCMU_GLOBAL_MAX_BLOCKS) {
-			atomic_dec(&global_db_count);
-			return false;
-		}
+					TCMU_GLOBAL_MAX_BLOCKS)
+			schedule_delayed_work(&tcmu_unmap_work, 0);
 
 		/* try to get new page from the mm */
 		page = alloc_page(GFP_KERNEL);
@@ -380,18 +391,11 @@ static bool tcmu_get_empty_blocks(struct tcmu_dev *udev,
 {
 	int i;
 
-	udev->waiting_global = false;
-
 	for (i = tcmu_cmd->dbi_cur; i < tcmu_cmd->dbi_cnt; i++) {
 		if (!tcmu_get_empty_block(udev, tcmu_cmd))
-			goto err;
+			return false;
 	}
 	return true;
-
-err:
-	udev->waiting_global = true;
-	schedule_work(&tcmu_unmap_work);
-	return false;
 }
 
 static inline struct page *
@@ -437,6 +441,7 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	if (!tcmu_cmd)
 		return NULL;
 
+	INIT_LIST_HEAD(&tcmu_cmd->cmdr_queue_entry);
 	tcmu_cmd->se_cmd = se_cmd;
 	tcmu_cmd->tcmu_dev = udev;
 
@@ -741,6 +746,10 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	unsigned long tmo = udev->cmd_time_out;
 	int cmd_id;
 
+	/*
+	 * If it was waiting on the cmdr queue, we do not reset the timer
+	 * for requeues or when it is finally sent to userspace.
+	 */
 	if (tcmu_cmd->cmd_id)
 		return 0;
 
@@ -752,13 +761,31 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	tcmu_cmd->cmd_id = cmd_id;
 
 	if (!tmo)
-		return 0;
+		tmo = TCMU_TIME_OUT;
+
+	pr_debug("allocated cmd %u for dev %s tmo %lu\n", tcmu_cmd->cmd_id,
+		 udev->name, tmo / MSEC_PER_SEC);
 
 	tcmu_cmd->deadline = round_jiffies_up(jiffies + msecs_to_jiffies(tmo));
 	mod_timer(&udev->timeout, tcmu_cmd->deadline);
 	return 0;
 }
 
+static int add_to_cmdr_queue(struct tcmu_cmd *tcmu_cmd)
+{
+	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
+	int ret;
+
+	ret = tcmu_setup_cmd_timer(tcmu_cmd);
+	if (ret)
+		return ret;
+
+	list_add_tail(&tcmu_cmd->cmdr_queue_entry, &udev->cmdr_queue);
+	pr_debug("adding cmd %u on dev %s to ring space wait queue\n",
+		 tcmu_cmd->cmd_id, udev->name);
+	return 0;
+}
+
 /**
  * queue_cmd_ring - queue cmd to ring or internally
  * @tcmu_cmd: cmd to queue
@@ -767,6 +794,7 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
  * Returns:
  * -1 we cannot queue internally or to the ring.
  *  0 success
+ *  1 internally queued to wait for ring memory to free.
  */
 static sense_reason_t queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, int *scsi_err)
 {
@@ -804,7 +832,8 @@ static sense_reason_t queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, int *scsi_err)
 	base_command_size = tcmu_cmd_get_base_cmd_size(tcmu_cmd->dbi_cnt);
 	command_size = tcmu_cmd_get_cmd_size(tcmu_cmd, base_command_size);
 
-	mutex_lock(&udev->cmdr_lock);
+	if (!list_empty(&udev->cmdr_queue))
+		goto queue;
 
 	mb = udev->mb_addr;
 	cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
@@ -813,42 +842,18 @@ static sense_reason_t queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, int *scsi_err)
 		pr_warn("TCMU: Request of size %zu/%zu is too big for %u/%zu "
 			"cmd ring/data area\n", command_size, data_length,
 			udev->cmdr_size, udev->data_size);
-		mutex_unlock(&udev->cmdr_lock);
 		*scsi_err = TCM_INVALID_CDB_FIELD;
 		return -1;
 	}
 
-	while (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
-		int ret;
-		DEFINE_WAIT(__wait);
-
+	if (!is_ring_space_avail(udev, tcmu_cmd, command_size, data_length)) {
 		/*
 		 * Don't leave commands partially setup because the unmap
 		 * thread might need the blocks to make forward progress.
 		 */
 		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cur);
 		tcmu_cmd_reset_dbi_cur(tcmu_cmd);
-
-		prepare_to_wait(&udev->wait_cmdr, &__wait, TASK_INTERRUPTIBLE);
-
-		pr_debug("sleeping for ring space\n");
-		mutex_unlock(&udev->cmdr_lock);
-		if (udev->cmd_time_out)
-			ret = schedule_timeout(
-					msecs_to_jiffies(udev->cmd_time_out));
-		else
-			ret = schedule_timeout(msecs_to_jiffies(TCMU_TIME_OUT));
-		finish_wait(&udev->wait_cmdr, &__wait);
-		if (!ret) {
-			pr_warn("tcmu: command timed out\n");
-			*scsi_err = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
-			return -1;
-		}
-
-		mutex_lock(&udev->cmdr_lock);
-
-		/* We dropped cmdr_lock, cmd_head is stale */
-		cmd_head = mb->cmd_head % udev->cmdr_size; /* UAM */
+		goto queue;
 	}
 
 	/* Insert a PAD if end-of-ring space is too small */
@@ -921,31 +926,39 @@ static sense_reason_t queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, int *scsi_err)
 
 	UPDATE_HEAD(mb->cmd_head, command_size, udev->cmdr_size);
 	tcmu_flush_dcache_range(mb, sizeof(*mb));
-	mutex_unlock(&udev->cmdr_lock);
 
 	/* TODO: only if FLUSH and FUA? */
 	uio_event_notify(&udev->uio_info);
 
-	if (udev->cmd_time_out)
-		mod_timer(&udev->timeout, round_jiffies_up(jiffies +
-			  msecs_to_jiffies(udev->cmd_time_out)));
-
 	return 0;
+
+queue:
+	if (add_to_cmdr_queue(tcmu_cmd)) {
+		*scsi_err = TCM_OUT_OF_RESOURCES;
+		return -1;
+	}
+
+	return 1;
 }
 
 static sense_reason_t
 tcmu_queue_cmd(struct se_cmd *se_cmd)
 {
+	struct se_device *se_dev = se_cmd->se_dev;
+	struct tcmu_dev *udev = TCMU_DEV(se_dev);
 	struct tcmu_cmd *tcmu_cmd;
 	sense_reason_t scsi_ret;
+	int ret;
 
 	tcmu_cmd = tcmu_alloc_cmd(se_cmd);
 	if (!tcmu_cmd)
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 
-	if (queue_cmd_ring(tcmu_cmd, &scsi_ret) < 0)
+	mutex_lock(&udev->cmdr_lock);
+	ret = queue_cmd_ring(tcmu_cmd, &scsi_ret);
+	mutex_unlock(&udev->cmdr_lock);
+	if (ret < 0)
 		tcmu_free_cmd(tcmu_cmd);
-
 	return scsi_ret;
 }
 
@@ -1033,10 +1046,15 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
 		handled++;
 	}
 
-	if (mb->cmd_tail == mb->cmd_head)
-		del_timer(&udev->timeout); /* no more pending cmds */
-
-	wake_up(&udev->wait_cmdr);
+	if (mb->cmd_tail == mb->cmd_head && list_empty(&udev->cmdr_queue)) {
+		del_timer(&udev->timeout);
+		/*
+		 * no more pending or waiting commands, so try to reclaim
+		 * blocks if needed.
+		 */
+		if (atomic_read(&global_db_count) > TCMU_GLOBAL_MAX_BLOCKS)
+			schedule_delayed_work(&tcmu_unmap_work, 0);
+	}
 
 	return handled;
 }
@@ -1044,6 +1062,10 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
 static int tcmu_check_expired_cmd(int id, void *p, void *data)
 {
 	struct tcmu_cmd *cmd = p;
+	struct tcmu_dev *udev = cmd->tcmu_dev;
+	u8 scsi_status;
+	struct se_cmd *se_cmd;
+	bool is_running;
 
 	if (test_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags))
 		return 0;
@@ -1051,10 +1073,27 @@ static int tcmu_check_expired_cmd(int id, void *p, void *data)
 	if (!time_after(jiffies, cmd->deadline))
 		return 0;
 
-	set_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags);
-	target_complete_cmd(cmd->se_cmd, SAM_STAT_CHECK_CONDITION);
+	is_running = list_empty(&cmd->cmdr_queue_entry);
+	pr_debug("Timing out cmd %u on dev %s that is %s.\n",
+		 id, udev->name, is_running ? "inflight" : "queued");
+
+	se_cmd = cmd->se_cmd;
 	cmd->se_cmd = NULL;
 
+	if (is_running) {
+		set_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags);
+		/*
+		 * target_complete_cmd will translate this to LUN COMM FAILURE
+		 */
+		scsi_status = SAM_STAT_CHECK_CONDITION;
+	} else {
+		list_del_init(&cmd->cmdr_queue_entry);
+
+		idr_remove(&udev->commands, id);
+		tcmu_free_cmd(cmd);
+		scsi_status = SAM_STAT_TASK_SET_FULL;
+	}
+	target_complete_cmd(se_cmd, scsi_status);
 	return 0;
 }
 
@@ -1069,7 +1108,7 @@ static void tcmu_device_timedout(unsigned long data)
 		list_add_tail(&udev->timedout_entry, &timed_out_udevs);
 	spin_unlock(&timed_out_udevs_lock);
 
-	schedule_work(&tcmu_unmap_work);
+	schedule_delayed_work(&tcmu_unmap_work, 0);
 }
 
 static int tcmu_attach_hba(struct se_hba *hba, u32 host_id)
@@ -1110,10 +1149,10 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	udev->hba = hba;
 	udev->cmd_time_out = TCMU_TIME_OUT;
 
-	init_waitqueue_head(&udev->wait_cmdr);
 	mutex_init(&udev->cmdr_lock);
 
 	INIT_LIST_HEAD(&udev->timedout_entry);
+	INIT_LIST_HEAD(&udev->cmdr_queue);
 	idr_init(&udev->commands);
 
 	setup_timer(&udev->timeout, tcmu_device_timedout,
@@ -1127,13 +1166,63 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	return &udev->se_dev;
 }
 
+static bool run_cmdr_queue(struct tcmu_dev *udev)
+{
+	struct tcmu_cmd *tcmu_cmd, *tmp_cmd;
+	LIST_HEAD(cmds);
+	bool drained = true;
+	sense_reason_t scsi_ret;
+	int ret;
+
+	if (list_empty(&udev->cmdr_queue))
+		return true;
+
+	pr_debug("running %s's cmdr queue\n", udev->name);
+
+	list_splice_init(&udev->cmdr_queue, &cmds);
+
+	list_for_each_entry_safe(tcmu_cmd, tmp_cmd, &cmds, cmdr_queue_entry) {
+		list_del_init(&tcmu_cmd->cmdr_queue_entry);
+
+		pr_debug("removing cmd %u on dev %s from queue\n",
+			 tcmu_cmd->cmd_id, udev->name);
+
+		ret = queue_cmd_ring(tcmu_cmd, &scsi_ret);
+		if (ret < 0) {
+			pr_debug("cmd %u on dev %s failed with %u\n",
+				 tcmu_cmd->cmd_id, udev->name, scsi_ret);
+
+			idr_remove(&udev->commands, tcmu_cmd->cmd_id);
+			/*
+			 * Ignore scsi_ret for now. target_complete_cmd
+			 * drops it.
+			 */
+			target_complete_cmd(tcmu_cmd->se_cmd,
+					    SAM_STAT_CHECK_CONDITION);
+			tcmu_free_cmd(tcmu_cmd);
+		} else if (ret > 0) {
+			pr_debug("ran out of space during cmdr queue run\n");
+			/*
+			 * cmd was requeued, so just put all cmds back in
+			 * the queue
+			 */
+			list_splice_tail(&cmds, &udev->cmdr_queue);
+			drained = false;
+			goto done;
+		}
+	}
+done:
+	return drained;
+}
+
 static int tcmu_irqcontrol(struct uio_info *info, s32 irq_on)
 {
-	struct tcmu_dev *tcmu_dev = container_of(info, struct tcmu_dev, uio_info);
+	struct tcmu_dev *udev = container_of(info, struct tcmu_dev, uio_info);
 
-	mutex_lock(&tcmu_dev->cmdr_lock);
-	tcmu_handle_completions(tcmu_dev);
-	mutex_unlock(&tcmu_dev->cmdr_lock);
+	mutex_lock(&udev->cmdr_lock);
+	tcmu_handle_completions(udev);
+	run_cmdr_queue(udev);
+	mutex_unlock(&udev->cmdr_lock);
 
 	return 0;
 }
@@ -1529,7 +1618,6 @@ static int tcmu_configure_device(struct se_device *dev)
 	udev->data_off = CMDR_SIZE;
 	udev->data_size = DATA_SIZE;
 	udev->dbi_thresh = 0; /* Default in Idle state */
-	udev->waiting_global = false;
 
 	/* Initialise the mailbox of the ring buffer */
 	mb = udev->mb_addr;
@@ -1974,12 +2062,14 @@ static ssize_t tcmu_emulate_write_cache_store(struct config_item *item,
 	.tb_dev_attrib_attrs	= NULL,
 };
 
-
 static void find_free_blocks(void)
 {
 	struct tcmu_dev *udev;
 	loff_t off;
-	uint32_t start, end, block;
+	u32 start, end, block, total_freed = 0;
+
+	if (atomic_read(&global_db_count) <= TCMU_GLOBAL_MAX_BLOCKS)
+		return;
 
 	mutex_lock(&root_udev_mutex);
 	list_for_each_entry(udev, &root_udev, node) {
@@ -1988,8 +2078,8 @@ static void find_free_blocks(void)
 		/* Try to complete the finished commands first */
 		tcmu_handle_completions(udev);
 
-		/* Skip the udevs waiting the global pool or in idle */
-		if (udev->waiting_global || !udev->dbi_thresh) {
+		/* Skip the udevs in idle */
+		if (!udev->dbi_thresh) {
 			mutex_unlock(&udev->cmdr_lock);
 			continue;
 		}
@@ -1998,8 +2088,8 @@ static void find_free_blocks(void)
 		block = find_last_bit(udev->data_bitmap, end);
 		if (block == udev->dbi_max) {
 			/*
-			 * The last bit is dbi_max, so there is
-			 * no need to shrink any blocks.
+			 * The last bit is dbi_max, so it is not possible
+			 * to reclaim any blocks.
 			 */
 			mutex_unlock(&udev->cmdr_lock);
 			continue;
@@ -2019,30 +2109,15 @@ static void find_free_blocks(void)
 		/* Release the block pages */
 		tcmu_blocks_release(&udev->data_blocks, start, end);
 		mutex_unlock(&udev->cmdr_lock);
-	}
-	mutex_unlock(&root_udev_mutex);
-}
 
-static void run_cmdr_queues(void)
-{
-	struct tcmu_dev *udev;
-
-	/*
-	 * Try to wake up the udevs who are waiting
-	 * for the global data block pool.
-	 */
-	mutex_lock(&root_udev_mutex);
-	list_for_each_entry(udev, &root_udev, node) {
-		mutex_lock(&udev->cmdr_lock);
-		if (!udev->waiting_global) {
-			mutex_unlock(&udev->cmdr_lock);
-			break;
-		}
-		mutex_unlock(&udev->cmdr_lock);
-
-		wake_up(&udev->wait_cmdr);
+		total_freed += end - start;
+		pr_debug("Freed %u blocks (total %u) from %s.\n", end - start,
+			 total_freed, udev->name);
 	}
 	mutex_unlock(&root_udev_mutex);
+
+	if (atomic_read(&global_db_count) > TCMU_GLOBAL_MAX_BLOCKS)
+		schedule_delayed_work(&tcmu_unmap_work, msecs_to_jiffies(5000));
 }
 
 static void check_timedout_devices(void)
@@ -2071,7 +2146,6 @@ static void tcmu_unmap_work_fn(struct work_struct *work)
 {
 	check_timedout_devices();
 	find_free_blocks();
-	run_cmdr_queues();
 }
 
 static int __init tcmu_module_init(void)
@@ -2080,7 +2154,7 @@ static int __init tcmu_module_init(void)
 
 	BUILD_BUG_ON((sizeof(struct tcmu_cmd_entry) % TCMU_OP_ALIGN_SIZE) != 0);
 
-	INIT_WORK(&tcmu_unmap_work, tcmu_unmap_work_fn);
+	INIT_DELAYED_WORK(&tcmu_unmap_work, tcmu_unmap_work_fn);
 
 	tcmu_cmd_cache = kmem_cache_create("tcmu_cmd_cache",
 				sizeof(struct tcmu_cmd),
@@ -2143,7 +2217,7 @@ static int __init tcmu_module_init(void)
 
 static void __exit tcmu_module_exit(void)
 {
-	cancel_work_sync(&tcmu_unmap_work);
+	cancel_delayed_work_sync(&tcmu_unmap_work);
 	target_backend_unregister(&tcmu_ops);
 	kfree(tcmu_attrs);
 	genl_unregister_family(&tcmu_genl_family);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 17/19] tcmu: make ring buffer timer configurable
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (15 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 16/19] tcmu: don't block submitting context for block waits Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 18/19] tcmu: allow max block and global max blocks to be settable Mike Christie
  2017-10-30  3:44 ` [PATCH 19/19] target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES Mike Christie
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

This adds a timer, qfull_time_out, that controls how long a
device will wait for ring buffer space to open up before
failing the commands in the queue. It is useful to separate
this timer from cmd_time_out and its default of 30 seconds,
because for HA setups cmd_time_out may be disabled, and 30
seconds is too long to wait when some OSs like ESX will
time out commands after as little as 8 - 15 seconds.
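
A standalone sketch of the timeout selection, mirroring the fallback
logic added below (plain userspace C; the millisecond values are
illustrative):

/* Standalone sketch of the qfull timeout fallback: qfull_time_out if
 * set, else cmd_time_out, else the 30 second default; 0 means fail
 * queued commands immediately. */
#include <stdio.h>

#define DEFAULT_TMO_MS (30 * 1000)

/* returns the queue-full timeout in ms, or -1 to fail right away */
static int qfull_tmo_ms(int qfull_time_out, unsigned int cmd_time_out)
{
	if (!qfull_time_out)
		return -1;		/* internal queueing disabled */
	if (qfull_time_out > 0)
		return qfull_time_out;	/* explicitly configured */
	if (cmd_time_out)
		return (int)cmd_time_out; /* backwards compat */
	return DEFAULT_TMO_MS;
}

int main(void)
{
	printf("%d\n", qfull_tmo_ms(-1, 0));		/* 30000 */
	printf("%d\n", qfull_tmo_ms(8000, 30000));	/* 8000 */
	printf("%d\n", qfull_tmo_ms(0, 30000));		/* -1 */
	return 0;
}

Setting qfull_time_out to 0 makes a full ring fail commands
immediately, which HA setups can use to fail over quickly.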

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 131 +++++++++++++++++++++++++++++---------
 1 file changed, 101 insertions(+), 30 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 80dd17a..92eb918 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -142,8 +142,12 @@ struct tcmu_dev {
 
 	struct idr commands;
 
-	struct timer_list timeout;
+	struct timer_list cmd_timer;
 	unsigned int cmd_time_out;
+
+	struct timer_list qfull_timer;
+	int qfull_time_out;
+
 	struct list_head timedout_entry;
 
 	spinlock_t nl_cmd_lock;
@@ -740,18 +744,14 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
 	return command_size;
 }
 
-static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
+static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd, unsigned int tmo,
+				struct timer_list *timer)
 {
 	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
-	unsigned long tmo = udev->cmd_time_out;
 	int cmd_id;
 
-	/*
-	 * If it was waiting on the cmdr queue, we do not reset the timer
-	 * for requeues or when it is finally sent to userspace.
-	 */
 	if (tcmu_cmd->cmd_id)
-		return 0;
+		goto setup_timer;
 
 	cmd_id = idr_alloc(&udev->commands, tcmu_cmd, 1, USHRT_MAX, GFP_NOWAIT);
 	if (cmd_id < 0) {
@@ -760,23 +760,38 @@ static int tcmu_setup_cmd_timer(struct tcmu_cmd *tcmu_cmd)
 	}
 	tcmu_cmd->cmd_id = cmd_id;
 
-	if (!tmo)
-		tmo = TCMU_TIME_OUT;
-
 	pr_debug("allocated cmd %u for dev %s tmo %lu\n", tcmu_cmd->cmd_id,
 		 udev->name, tmo / MSEC_PER_SEC);
 
+setup_timer:
+	if (!tmo)
+		return 0;
+
 	tcmu_cmd->deadline = round_jiffies_up(jiffies + msecs_to_jiffies(tmo));
-	mod_timer(&udev->timeout, tcmu_cmd->deadline);
+	mod_timer(timer, tcmu_cmd->deadline);
 	return 0;
 }
 
 static int add_to_cmdr_queue(struct tcmu_cmd *tcmu_cmd)
 {
 	struct tcmu_dev *udev = tcmu_cmd->tcmu_dev;
+	unsigned int tmo;
 	int ret;
 
-	ret = tcmu_setup_cmd_timer(tcmu_cmd);
+	/*
+	 * For backwards compat, if qfull_time_out is not set use
+	 * cmd_time_out, and if that's not set use the default timeout.
+	 */
+	if (!udev->qfull_time_out)
+		return -ETIMEDOUT;
+	else if (udev->qfull_time_out > 0)
+		tmo = udev->qfull_time_out;
+	else if (udev->cmd_time_out)
+		tmo = udev->cmd_time_out;
+	else
+		tmo = TCMU_TIME_OUT;
+
+	ret = tcmu_setup_cmd_timer(tcmu_cmd, tmo, &udev->qfull_timer);
 	if (ret)
 		return ret;
 
@@ -900,7 +915,8 @@ static sense_reason_t queue_cmd_ring(struct tcmu_cmd *tcmu_cmd, int *scsi_err)
 	}
 	entry->req.iov_bidi_cnt = iov_cnt;
 
-	ret = tcmu_setup_cmd_timer(tcmu_cmd);
+	ret = tcmu_setup_cmd_timer(tcmu_cmd, udev->cmd_time_out,
+				   &udev->cmd_timer);
 	if (ret) {
 		tcmu_cmd_free_data(tcmu_cmd, tcmu_cmd->dbi_cnt);
 		*scsi_err = TCM_OUT_OF_RESOURCES;
@@ -1046,14 +1062,19 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
 		handled++;
 	}
 
-	if (mb->cmd_tail == mb->cmd_head && list_empty(&udev->cmdr_queue)) {
-		del_timer(&udev->timeout);
-		/*
-		 * no more pending or waiting commands, so try to reclaim
-		 * blocks if needed.
-		 */
-		if (atomic_read(&global_db_count) > TCMU_GLOBAL_MAX_BLOCKS)
-			schedule_delayed_work(&tcmu_unmap_work, 0);
+	if (mb->cmd_tail == mb->cmd_head) {
+		/* no more pending commands */
+		del_timer(&udev->cmd_timer);
+
+		if (list_empty(&udev->cmdr_queue)) {
+			/*
+			 * no more pending or waiting commands, so try to
+			 * reclaim blocks if needed.
+			 */
+			if (atomic_read(&global_db_count) >
+			    TCMU_GLOBAL_MAX_BLOCKS)
+				schedule_delayed_work(&tcmu_unmap_work, 0);
+		}
 	}
 
 	return handled;
@@ -1074,13 +1095,15 @@ static int tcmu_check_expired_cmd(int id, void *p, void *data)
 		return 0;
 
 	is_running = list_empty(&cmd->cmdr_queue_entry);
-	pr_debug("Timing out cmd %u on dev %s that is %s.\n",
-		 id, udev->name, is_running ? "inflight" : "queued");
-
-	se_cmd = cmd->se_cmd;
-	cmd->se_cmd = NULL;
 
 	if (is_running) {
+		/*
+		 * If cmd_time_out is disabled but qfull is set deadline
+		 * will only reflect the qfull timeout. Ignore it.
+		 */
+		if (!udev->cmd_time_out)
+			return 0;
+
 		set_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags);
 		/*
 		 * target_complete_cmd will translate this to LUN COMM FAILURE
@@ -1093,6 +1116,12 @@ static int tcmu_check_expired_cmd(int id, void *p, void *data)
 		tcmu_free_cmd(cmd);
 		scsi_status = SAM_STAT_TASK_SET_FULL;
 	}
+
+	pr_debug("Timing out cmd %u on dev %s that is %s.\n",
+		 id, udev->name, is_running ? "inflight" : "queued");
+
+	se_cmd = cmd->se_cmd;
+	cmd->se_cmd = NULL;
 	target_complete_cmd(se_cmd, scsi_status);
 	return 0;
 }
@@ -1101,7 +1130,7 @@ static void tcmu_device_timedout(unsigned long data)
 {
 	struct tcmu_dev *udev = (struct tcmu_dev *)data;
 
-	pr_debug("%s cmd timeout has expired\n", udev->name);
+	pr_debug("%s cmd/qfull timeout has expired\n", udev->name);
 
 	spin_lock(&timed_out_udevs_lock);
 	if (list_empty(&udev->timedout_entry))
@@ -1148,6 +1177,7 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 
 	udev->hba = hba;
 	udev->cmd_time_out = TCMU_TIME_OUT;
+	udev->qfull_time_out = -1;
 
 	mutex_init(&udev->cmdr_lock);
 
@@ -1155,7 +1185,10 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	INIT_LIST_HEAD(&udev->cmdr_queue);
 	idr_init(&udev->commands);
 
-	setup_timer(&udev->timeout, tcmu_device_timedout,
+	setup_timer(&udev->qfull_timer, tcmu_device_timedout,
+		(unsigned long)udev);
+
+	setup_timer(&udev->cmd_timer, tcmu_device_timedout,
 		(unsigned long)udev);
 
 	init_waitqueue_head(&udev->nl_cmd_wq);
@@ -1211,6 +1244,8 @@ static bool run_cmdr_queue(struct tcmu_dev *udev)
 			goto done;
 		}
 	}
+	if (list_empty(&udev->cmdr_queue))
+		del_timer(&udev->qfull_timer);
 done:
 	return drained;
 }
@@ -1710,7 +1745,8 @@ static void tcmu_destroy_device(struct se_device *dev)
 {
 	struct tcmu_dev *udev = TCMU_DEV(dev);
 
-	del_timer_sync(&udev->timeout);
+	del_timer_sync(&udev->cmd_timer);
+	del_timer_sync(&udev->qfull_timer);
 
 	mutex_lock(&root_udev_mutex);
 	list_del(&udev->node);
@@ -1890,6 +1926,40 @@ static ssize_t tcmu_cmd_time_out_store(struct config_item *item, const char *pag
 }
 CONFIGFS_ATTR(tcmu_, cmd_time_out);
 
+static ssize_t tcmu_qfull_time_out_show(struct config_item *item, char *page)
+{
+	struct se_dev_attrib *da = container_of(to_config_group(item),
+						struct se_dev_attrib, da_group);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
+
+	return snprintf(page, PAGE_SIZE, "%ld\n", udev->qfull_time_out <= 0 ?
+			udev->qfull_time_out :
+			udev->qfull_time_out / MSEC_PER_SEC);
+}
+
+static ssize_t tcmu_qfull_time_out_store(struct config_item *item,
+					 const char *page, size_t count)
+{
+	struct se_dev_attrib *da = container_of(to_config_group(item),
+					struct se_dev_attrib, da_group);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
+	s32 val;
+	int ret;
+
+	ret = kstrtos32(page, 0, &val);
+	if (ret < 0)
+		return ret;
+
+	if (val >= 0) {
+		udev->qfull_time_out = val * MSEC_PER_SEC;
+	} else {
+		printk(KERN_ERR "Invalid qfull timeout value %d\n", val);
+		return -EINVAL;
+	}
+	return count;
+}
+CONFIGFS_ATTR(tcmu_, qfull_time_out);
+
 static ssize_t tcmu_dev_config_show(struct config_item *item, char *page)
 {
 	struct se_dev_attrib *da = container_of(to_config_group(item),
@@ -2035,6 +2105,7 @@ static ssize_t tcmu_emulate_write_cache_store(struct config_item *item,
 
 static struct configfs_attribute *tcmu_attrib_attrs[] = {
 	&tcmu_attr_cmd_time_out,
+	&tcmu_attr_qfull_time_out,
 	&tcmu_attr_dev_config,
 	&tcmu_attr_dev_size,
 	&tcmu_attr_emulate_write_cache,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 18/19] tcmu: allow max block and global max blocks to be settable
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (16 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 17/19] tcmu: make ring buffer timer configurable Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  2017-10-30  3:44 ` [PATCH 19/19] target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES Mike Christie
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

Users might be running a target on a physical system, so they
could have a lot more than 2 gigs of memory they want to
devote to tcmu. OTOH, we could be running in a VM, where a
2 gig global and 1 gig per-device limit might be too high.
This patch allows the user to specify the limits.
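
A standalone sketch of the MB <-> block conversion this patch
introduces (plain userspace C; assumes 4K pages, i.e.
DATA_BLOCK_SHIFT == 12):

/* Standalone sketch of the MB <-> block conversions, assuming
 * DATA_BLOCK_SIZE == PAGE_SIZE == 4K (DATA_BLOCK_SHIFT == 12). */
#include <stdio.h>

#define DATA_BLOCK_SHIFT 12
#define MBS_TO_BLOCKS(mbs) ((mbs) << (20 - DATA_BLOCK_SHIFT))
#define BLOCKS_TO_MBS(blocks) ((blocks) >> (20 - DATA_BLOCK_SHIFT))

int main(void)
{
	/* the old hardcoded limits, expressed in MB */
	printf("per-dev default: %d MB\n", BLOCKS_TO_MBS(256 * 1024));
	printf("global default:  %d MB\n", BLOCKS_TO_MBS(512 * 1024));
	/* a user-chosen 64 MB data area, expressed in blocks */
	printf("64 MB = %d blocks\n", MBS_TO_BLOCKS(64));
	return 0;
}

With 4K pages the old hardcoded limits work out to the 1 gig per
device and 2 gig global defaults mentioned above.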

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_user.c | 143 +++++++++++++++++++++++++++++++++-----
 1 file changed, 124 insertions(+), 19 deletions(-)

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 92eb918..f49f765 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -77,9 +77,13 @@
  * the total size is 256K * PAGE_SIZE.
  */
 #define DATA_BLOCK_SIZE PAGE_SIZE
-#define DATA_BLOCK_BITS (256 * 1024)
+#define DATA_BLOCK_SHIFT PAGE_SHIFT
+#define DATA_BLOCK_BITS_DEF (256 * 1024)
 #define DATA_SIZE (DATA_BLOCK_BITS * DATA_BLOCK_SIZE)
 
+#define TCMU_MBS_TO_BLOCKS(_mbs) (_mbs << (20 - DATA_BLOCK_SHIFT))
+#define TCMU_BLOCKS_TO_MBS(_blocks) (_blocks >> (20 - DATA_BLOCK_SHIFT))
+
 /* The total size of the ring is 8M + 256K * PAGE_SIZE */
 #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)
 
@@ -87,7 +91,7 @@
  * Default number of global data blocks (512K * PAGE_SIZE)
  * above which the unmap thread will be started.
  */
-#define TCMU_GLOBAL_MAX_BLOCKS (512 * 1024)
+#define TCMU_GLOBAL_MAX_BLOCKS_DEF (512 * 1024)
 
 static u8 tcmu_kern_cmd_reply_supported;
 
@@ -131,13 +135,15 @@ struct tcmu_dev {
 	/* Must add data_off and mb_addr to get the address */
 	size_t data_off;
 	size_t data_size;
+	uint32_t max_blocks;
+	size_t ring_size;
 
 	struct mutex cmdr_lock;
 	struct list_head cmdr_queue;
 
 	uint32_t dbi_max;
 	uint32_t dbi_thresh;
-	DECLARE_BITMAP(data_bitmap, DATA_BLOCK_BITS);
+	unsigned long *data_bitmap;
 	struct radix_tree_root data_blocks;
 
 	struct idr commands;
@@ -198,10 +204,51 @@ struct tcmu_cmd {
 static DEFINE_SPINLOCK(timed_out_udevs_lock);
 static LIST_HEAD(timed_out_udevs);
 
+static struct kmem_cache *tcmu_cmd_cache;
+
 static atomic_t global_db_count = ATOMIC_INIT(0);
 static struct delayed_work tcmu_unmap_work;
+static int tcmu_global_max_blocks = TCMU_GLOBAL_MAX_BLOCKS_DEF;
 
-static struct kmem_cache *tcmu_cmd_cache;
+static int tcmu_set_global_max_data_area(const char *str,
+					 const struct kernel_param *kp)
+{
+	int ret, max_area_mb;
+
+	ret = kstrtoint(str, 10, &max_area_mb);
+	if (ret)
+		return -EINVAL;
+
+	if (max_area_mb <= 0) {
+		pr_err("global_max_data_area must be larger than 0.\n");
+		return -EINVAL;
+	}
+
+	tcmu_global_max_blocks = TCMU_MBS_TO_BLOCKS(max_area_mb);
+	if (atomic_read(&global_db_count) > tcmu_global_max_blocks)
+		schedule_delayed_work(&tcmu_unmap_work, 0);
+	else
+		cancel_delayed_work_sync(&tcmu_unmap_work);
+
+	return 0;
+}
+
+static int tcmu_get_global_max_data_area(char *buffer,
+					 const struct kernel_param *kp)
+{
+	return sprintf(buffer, "%d", TCMU_BLOCKS_TO_MBS(tcmu_global_max_blocks));
+}
+
+static const struct kernel_param_ops tcmu_global_max_data_area_op = {
+	.set = tcmu_set_global_max_data_area,
+	.get = tcmu_get_global_max_data_area,
+};
+
+module_param_cb(global_max_data_area_mb, &tcmu_global_max_data_area_op, NULL,
+		S_IWUSR | S_IRUGO);
+MODULE_PARM_DESC(global_max_data_area_mb,
+		 "Max MBs allowed to be allocated to all the tcmu device's "
+		 "data areas.");
 
 /* multicast group */
 enum tcmu_multicast_groups {
@@ -363,7 +410,7 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev *udev,
 	page = radix_tree_lookup(&udev->data_blocks, dbi);
 	if (!page) {
 		if (atomic_add_return(1, &global_db_count) >
-					TCMU_GLOBAL_MAX_BLOCKS)
+				      tcmu_global_max_blocks)
 			schedule_delayed_work(&tcmu_unmap_work, 0);
 
 		/* try to get new page from the mm */
@@ -705,8 +752,8 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 	/* try to check and get the data blocks as needed */
 	space = spc_bitmap_free(udev->data_bitmap, udev->dbi_thresh);
 	if ((space * DATA_BLOCK_SIZE) < data_needed) {
-		unsigned long blocks_left = DATA_BLOCK_BITS - udev->dbi_thresh +
-						space;
+		unsigned long blocks_left =
+				(udev->max_blocks - udev->dbi_thresh) + space;
 
 		if (blocks_left < blocks_needed) {
 			pr_debug("no data space: only %lu available, but ask for %zu\n",
@@ -716,8 +763,8 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
 		}
 
 		udev->dbi_thresh += blocks_needed;
-		if (udev->dbi_thresh > DATA_BLOCK_BITS)
-			udev->dbi_thresh = DATA_BLOCK_BITS;
+		if (udev->dbi_thresh > udev->max_blocks)
+			udev->dbi_thresh = udev->max_blocks;
 	}
 
 	return tcmu_get_empty_blocks(udev, cmd);
@@ -1072,7 +1119,7 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
 			 * reclaim blocks if needed.
 			 */
 			if (atomic_read(&global_db_count) >
-			    TCMU_GLOBAL_MAX_BLOCKS)
+			    tcmu_global_max_blocks)
 				schedule_delayed_work(&tcmu_unmap_work, 0);
 		}
 	}
@@ -1179,6 +1226,7 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name)
 	udev->cmd_time_out = TCMU_TIME_OUT;
 	udev->qfull_time_out = -1;
 
+	udev->max_blocks = DATA_BLOCK_BITS_DEF;
 	mutex_init(&udev->cmdr_lock);
 
 	INIT_LIST_HEAD(&udev->timedout_entry);
@@ -1323,7 +1371,7 @@ static struct page *tcmu_try_get_block_page(struct tcmu_dev *udev, uint32_t dbi)
 
 		/*
 		 * Since this case is rare in page fault routine, here we
-		 * will allow the global_db_count >= TCMU_GLOBAL_MAX_BLOCKS
+		 * will allow the global_db_count >= tcmu_global_max_blocks
 		 * to reduce possible page fault call trace.
 		 */
 		atomic_inc(&global_db_count);
@@ -1384,7 +1432,7 @@ static int tcmu_mmap(struct uio_info *info, struct vm_area_struct *vma)
 	vma->vm_private_data = udev;
 
 	/* Ensure the mmap is exactly the right size */
-	if (vma_pages(vma) != (TCMU_RING_SIZE >> PAGE_SHIFT))
+	if (vma_pages(vma) != (udev->ring_size >> PAGE_SHIFT))
 		return -EINVAL;
 
 	return 0;
@@ -1466,6 +1514,7 @@ static void tcmu_dev_kref_release(struct kref *kref)
 	WARN_ON(!all_expired);
 
 	tcmu_blocks_release(&udev->data_blocks, 0, udev->dbi_max + 1);
+	kfree(udev->data_bitmap);
 	mutex_unlock(&udev->cmdr_lock);
 
 	call_rcu(&dev->rcu_head, tcmu_dev_call_rcu);
@@ -1642,6 +1691,11 @@ static int tcmu_configure_device(struct se_device *dev)
 
 	info = &udev->uio_info;
 
+	udev->data_bitmap = kzalloc(BITS_TO_LONGS(udev->max_blocks) *
+				    sizeof(unsigned long), GFP_KERNEL);
+	if (!udev->data_bitmap)
+		goto err_bitmap_alloc;
+
 	udev->mb_addr = vzalloc(CMDR_SIZE);
 	if (!udev->mb_addr) {
 		ret = -ENOMEM;
@@ -1651,7 +1705,7 @@ static int tcmu_configure_device(struct se_device *dev)
 	/* mailbox fits in first part of CMDR space */
 	udev->cmdr_size = CMDR_SIZE - CMDR_OFF;
 	udev->data_off = CMDR_SIZE;
-	udev->data_size = DATA_SIZE;
+	udev->data_size = udev->max_blocks * DATA_BLOCK_SIZE;
 	udev->dbi_thresh = 0; /* Default in Idle state */
 
 	/* Initialise the mailbox of the ring buffer */
@@ -1669,7 +1723,7 @@ static int tcmu_configure_device(struct se_device *dev)
 
 	info->mem[0].name = "tcm-user command & data buffer";
 	info->mem[0].addr = (phys_addr_t)(uintptr_t)udev->mb_addr;
-	info->mem[0].size = TCMU_RING_SIZE;
+	info->mem[0].size = udev->ring_size = udev->data_size + CMDR_SIZE;
 	info->mem[0].memtype = UIO_MEM_NONE;
 
 	info->irqcontrol = tcmu_irqcontrol;
@@ -1722,6 +1776,9 @@ static int tcmu_configure_device(struct se_device *dev)
 	vfree(udev->mb_addr);
 	udev->mb_addr = NULL;
 err_vzalloc:
+	kfree(udev->data_bitmap);
+	udev->data_bitmap = NULL;
+err_bitmap_alloc:
 	kfree(info->name);
 	info->name = NULL;
 
@@ -1762,7 +1819,7 @@ static void tcmu_destroy_device(struct se_device *dev)
 
 enum {
 	Opt_dev_config, Opt_dev_size, Opt_hw_block_size, Opt_hw_max_sectors,
-	Opt_nl_reply_supported, Opt_err,
+	Opt_nl_reply_supported, Opt_max_data_area_mb, Opt_err,
 };
 
 static match_table_t tokens = {
@@ -1771,6 +1828,7 @@ enum {
 	{Opt_hw_block_size, "hw_block_size=%u"},
 	{Opt_hw_max_sectors, "hw_max_sectors=%u"},
 	{Opt_nl_reply_supported, "nl_reply_supported=%d"},
+	{Opt_max_data_area_mb, "max_data_area_mb=%u"},
 	{Opt_err, NULL}
 };
 
@@ -1804,7 +1862,7 @@ static ssize_t tcmu_set_configfs_dev_params(struct se_device *dev,
 	struct tcmu_dev *udev = TCMU_DEV(dev);
 	char *orig, *ptr, *opts, *arg_p;
 	substring_t args[MAX_OPT_ARGS];
-	int ret = 0, token;
+	int ret = 0, token, tmpval;
 
 	opts = kstrdup(page, GFP_KERNEL);
 	if (!opts)
@@ -1856,6 +1914,39 @@ static ssize_t tcmu_set_configfs_dev_params(struct se_device *dev,
 			if (ret < 0)
 				pr_err("kstrtoint() failed for nl_reply_supported=\n");
 			break;
+		case Opt_max_data_area_mb:
+			if (dev->export_count) {
+				pr_err("Unable to set max_data_area_mb while exports exist\n");
+				ret = -EINVAL;
+				break;
+			}
+
+			arg_p = match_strdup(&args[0]);
+			if (!arg_p) {
+				ret = -ENOMEM;
+				break;
+			}
+			ret = kstrtoint(arg_p, 0, &tmpval);
+			kfree(arg_p);
+			if (ret < 0) {
+				pr_err("kstrtoint() failed for max_data_area_mb=\n");
+				break;
+			}
+
+			if (tmpval <= 0) {
+				pr_err("Invalid max_data_area %d\n", tmpval);
+				ret = -EINVAL;
+				break;
+			}
+
+			udev->max_blocks = TCMU_MBS_TO_BLOCKS(tmpval);
+			if (udev->max_blocks > tcmu_global_max_blocks) {
+				pr_err("%d is too large. Adjusting max_data_area_mb to global limit of %u\n",
+				       tmpval,
+				       TCMU_BLOCKS_TO_MBS(tcmu_global_max_blocks));
+				udev->max_blocks = tcmu_global_max_blocks;
+			}
+			break;
 		default:
 			break;
 		}
@@ -1875,7 +1966,9 @@ static ssize_t tcmu_show_configfs_dev_params(struct se_device *dev, char *b)
 
 	bl = sprintf(b + bl, "Config: %s ",
 		     udev->dev_config[0] ? udev->dev_config : "NULL");
-	bl += sprintf(b + bl, "Size: %zu\n", udev->dev_size);
+	bl += sprintf(b + bl, "Size: %zu ", udev->dev_size);
+	bl += sprintf(b + bl, "MaxDataAreaMB: %u\n",
+		      TCMU_BLOCKS_TO_MBS(udev->max_blocks));
 
 	return bl;
 }
@@ -1960,6 +2053,17 @@ static ssize_t tcmu_qfull_time_out_store(struct config_item *item,
 }
 CONFIGFS_ATTR(tcmu_, qfull_time_out);
 
+static ssize_t tcmu_max_data_area_mb_show(struct config_item *item, char *page)
+{
+	struct se_dev_attrib *da = container_of(to_config_group(item),
+						struct se_dev_attrib, da_group);
+	struct tcmu_dev *udev = TCMU_DEV(da->da_dev);
+
+	return snprintf(page, PAGE_SIZE, "%u\n",
+			TCMU_BLOCKS_TO_MBS(udev->max_blocks));
+}
+CONFIGFS_ATTR_RO(tcmu_, max_data_area_mb);
+
 static ssize_t tcmu_dev_config_show(struct config_item *item, char *page)
 {
 	struct se_dev_attrib *da = container_of(to_config_group(item),
@@ -2106,6 +2210,7 @@ static ssize_t tcmu_emulate_write_cache_store(struct config_item *item,
 static struct configfs_attribute *tcmu_attrib_attrs[] = {
 	&tcmu_attr_cmd_time_out,
 	&tcmu_attr_qfull_time_out,
+	&tcmu_attr_max_data_area_mb,
 	&tcmu_attr_dev_config,
 	&tcmu_attr_dev_size,
 	&tcmu_attr_emulate_write_cache,
@@ -2139,7 +2244,7 @@ static void find_free_blocks(void)
 	loff_t off;
 	u32 start, end, block, total_freed = 0;
 
-	if (atomic_read(&global_db_count) <= TCMU_GLOBAL_MAX_BLOCKS)
+	if (atomic_read(&global_db_count) <= tcmu_global_max_blocks)
 		return;
 
 	mutex_lock(&root_udev_mutex);
@@ -2187,7 +2292,7 @@ static void find_free_blocks(void)
 	}
 	mutex_unlock(&root_udev_mutex);
 
-	if (atomic_read(&global_db_count) > TCMU_GLOBAL_MAX_BLOCKS)
+	if (atomic_read(&global_db_count) > tcmu_global_max_blocks)
 		schedule_delayed_work(&tcmu_unmap_work, msecs_to_jiffies(5000));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 19/19] target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES
  2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
                   ` (17 preceding siblings ...)
  2017-10-30  3:44 ` [PATCH 18/19] tcmu: allow max block and global max blocks to be settable Mike Christie
@ 2017-10-30  3:44 ` Mike Christie
  18 siblings, 0 replies; 20+ messages in thread
From: Mike Christie @ 2017-10-30  3:44 UTC (permalink / raw)
  To: martin.petersen, jejb, linux-scsi, target-devel, nab; +Cc: Mike Christie

TCM_OUT_OF_RESOURCES is currently translated to
TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, which is a heavy
error for a case where we just cannot allocate a resource
that may become allocatable later. This has us translate
TCM_OUT_OF_RESOURCES to SAM_STAT_TASK_SET_FULL instead.
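
A standalone sketch of the mapping change (plain userspace C; the
TCM enum values are illustrative, the SAM status codes are the
standard ones):

/* Standalone sketch of the new mapping: an out-of-resources failure
 * surfaces as TASK SET FULL (a retryable queue-full status) rather
 * than a CHECK CONDITION with LUN communication failure sense. */
#include <stdio.h>

#define SAM_STAT_CHECK_CONDITION 0x02
#define SAM_STAT_TASK_SET_FULL   0x28

enum sense_reason { TCM_OUT_OF_RESOURCES = 1, TCM_OTHER_FAILURE };

static unsigned char scsi_status_for(enum sense_reason reason)
{
	if (reason == TCM_OUT_OF_RESOURCES)
		return SAM_STAT_TASK_SET_FULL;	/* no sense data needed */
	return SAM_STAT_CHECK_CONDITION;	/* sense data describes it */
}

int main(void)
{
	printf("0x%02x\n", scsi_status_for(TCM_OUT_OF_RESOURCES)); /* 0x28 */
	return 0;
}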

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/target/target_core_transport.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 836d552..a3b264b 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1772,8 +1772,8 @@ void transport_generic_request_failure(struct se_cmd *cmd,
 	case TCM_UNSUPPORTED_SEGMENT_DESC_TYPE_CODE:
 		break;
 	case TCM_OUT_OF_RESOURCES:
-		sense_reason = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
-		break;
+		cmd->scsi_status = SAM_STAT_TASK_SET_FULL;
+		goto queue_status;
 	case TCM_RESERVATION_CONFLICT:
 		/*
 		 * No SENSE Data payload for this case, set SCSI Status
@@ -1795,11 +1795,8 @@ void transport_generic_request_failure(struct se_cmd *cmd,
 					       cmd->orig_fe_lun, 0x2C,
 					ASCQ_2CH_PREVIOUS_RESERVATION_CONFLICT_STATUS);
 		}
-		trace_target_cmd_complete(cmd);
-		ret = cmd->se_tfo->queue_status(cmd);
-		if (ret)
-			goto queue_full;
-		goto check_stop;
+
+		goto queue_status;
 	default:
 		pr_err("Unknown transport error for CDB 0x%02x: %d\n",
 			cmd->t_task_cdb[0], sense_reason);
@@ -1816,6 +1813,11 @@ void transport_generic_request_failure(struct se_cmd *cmd,
 	transport_cmd_check_stop_to_fabric(cmd);
 	return;
 
+queue_status:
+	trace_target_cmd_complete(cmd);
+	ret = cmd->se_tfo->queue_status(cmd);
+	if (!ret)
+		goto check_stop;
 queue_full:
 	transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-10-30  3:45 UTC | newest]

Thread overview: 20+ messages
2017-10-30  3:44 [PATCH 00/19] target/target_core_user: changes for 4.16 Mike Christie
2017-10-30  3:44 ` [PATCH 01/19] tcmu: fix crash when removing the tcmu device v4 Mike Christie
2017-10-30  3:44 ` [PATCH 02/19] tcmu: Add netlink command reply supported option for each device Mike Christie
2017-10-30  3:44 ` [PATCH 03/19] tcmu: Use macro to call container_of in tcmu_cmd_time_out_show Mike Christie
2017-10-30  3:44 ` [PATCH 04/19] tcmu: fix double se_cmd completion Mike Christie
2017-10-30  3:44 ` [PATCH 05/19] tcmu: merge common block release code Mike Christie
2017-10-30  3:44 ` [PATCH 06/19] tcmu: split unmap_thread_fn Mike Christie
2017-10-30  3:44 ` [PATCH 07/19] tcmu: fix unmap thread race Mike Christie
2017-10-30  3:44 ` [PATCH 08/19] tcmu: move expired command completion to unmap thread Mike Christie
2017-10-30  3:44 ` [PATCH 09/19] tcmu: remove commands_lock Mike Christie
2017-10-30  3:44 ` [PATCH 10/19] tcmu: release blocks for partially setup cmds Mike Christie
2017-10-30  3:44 ` [PATCH 11/19] tcmu: simplify scatter_data_area error handling Mike Christie
2017-10-30  3:44 ` [PATCH 12/19] tcmu: fix free block calculation Mike Christie
2017-10-30  3:44 ` [PATCH 13/19] tcmu: clean up the scatter helper Mike Christie
2017-10-30  3:44 ` [PATCH 14/19] tcmu: prep queue_cmd_ring to be used by unmap wq Mike Christie
2017-10-30  3:44 ` [PATCH 15/19] tcmu: simplify dbi thresh handling Mike Christie
2017-10-30  3:44 ` [PATCH 16/19] tcmu: don't block submitting context for block waits Mike Christie
2017-10-30  3:44 ` [PATCH 17/19] tcmu: make ring buffer timer configurable Mike Christie
2017-10-30  3:44 ` [PATCH 18/19] tcmu: allow max block and global max blocks to be settable Mike Christie
2017-10-30  3:44 ` [PATCH 19/19] target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES Mike Christie
