public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
       [not found] <1306523240-15543-1-git-send-email-agrover@redhat.com>
@ 2011-05-27 22:15 ` Nicholas A. Bellinger
  2011-05-27 22:35   ` Christoph Hellwig
       [not found] ` <1306523240-15543-11-git-send-email-agrover@redhat.com>
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-27 22:15 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: linux-scsi, Christoph Hellwig, James Bottomley

On Fri, 2011-05-27 at 12:06 -0700, Andy Grover wrote:
> Hi Nicholas,
> 
> The first 24 are a resend (rebased to current lio-41 git) of patches
> from a couple weeks ago.
> 
> The last 18 are new. They include some misc. cleanups, but the heart
> of what they do is to allocate the memory for data buffers in the
> fabric (and from fabric cxt), instead of the core. This saves two context
> switches per command. It also reduces code size, since we no longer need
> to synchronize between the two threads, and we have more direct knowledge
> about the layout of the data buffer.
> 
> iscsi fabric does not rely on se_cmd->t_mem_list at all anymore. After some
> more cleanups of other fabrics, t_mem_list and struct se_mem will be an
> implementation detail of tcm core (as opposed to part of its API), which
> will allow further progress.
> 
> The code has been tested and appears to work fine, and although I haven't
> done any performance measurements, I'd expect to see command servicing
> latencies improve a bit.
> 

Hi Andy,

My apologies for the delayed on the last series..  I will have a look at
these updated bits ahead of next week and start merging these into
lio-core-2.6.git and will follow up with specific comments.

Also, I CC'ed the linux-scsi list here just to keep folks in the loop
for the future.

Btw, I am quiet excited about the iscsi-target conversion to drop direct
se_cmd->t_mem_list usage and some of the other improvements.  Great
stuff.  :)

Thank you!

--nab

> Thanks -- Regards -- Andy
> 
> 
> The following changes since commit 41cb785aaaeeaa2ee3532afe36b9c3ff3e3d94b5:
> 
>   Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 into lio-4.1 (2011-05-27 07:57:12 +0000)
> 
> are available in the git repository at:
> 
>   git://fedorapeople.org/home/fedora/grover/public_git/linux-2.6.git for-nab
> 
> Andy Grover (42):
>       target: Replace custom sg<->buf functions with lib funcs
>       target: Simplify sector limiting code
>       target: get_cdb should never return NULL
>       target: Simplify transport_memcpy_se_mem_read_contig
>       target: Use assignment rather than increment for t_task_cdbs
>       target: Don't pass dma_size to generic_get_mem. Don't memset.
>       target: Pass sg with type scatterlist in transport_map_sg_to_mem
>       target: Move task_sg_num next to task_sg in struct se_task
>       target: inline struct se_transport_task into struct se_task
>       target: Rewrite transport_init_task_sg()
>       target: Simplify transport_do_task_sg_chain()
>       target: Change name & semantics of transport_get_sectors()
>       target: Remove unused members of se_cmd
>       target: Rename se_cmd.t_task_cdbs to t_task_list_num
>       target: Fix some spelling
>       target/iscsi: rearrange iscsi_session members to show lock coverage
>       target/iscsi: Fix spelling of enum immediate_data_ret_table members
>       target/iscsi: Remove iscsi_ooo_cmdsn.batch_count
>       target: Remove unused members of iscsi_cmd and iscsi_queue_req
>       target: Remove unused var from transport_generic_do_tmr
>       target: Remove iscsi_session.cmdsn_outoforder flag
>       target: Remove iscsi_session.ooo_cmdsn_count
>       target: Simplify CmdSN sequencing
>       target: Silence a warning in a printk in target_emulate_write_same
>       target: Rename transport_generic_allocate_tasks to _process_cdb
>       target: Rename transport_generic_handle_cdb to _new_cmd
>       target: map_sg_to_mem: return sg_count in return value
>       target/pscsi: Use min_t for sector limits
>       target/pscsi: Unused param for pscsi_get_bio()
>       target: Rename get_cdb_count to allocate_tasks
>       target: Have iscsi fabric allocate its own buffers
>       target/iscsi: Inline params from map_sg into set_iovec_ptrs
>       target/iscsi: pass data_length by value to iscsit_calculate_map_segment
>       target/iscsi: Don't pass lmap to iscsit_get_offset()
>       target/iscsi: Rewrite iscsit_get_offset()
>       target/iscsi: Remove unused function iscsit_send_async_msg
>       target/iscsi: Do not use t_mem_list anymore
>       target/iscsi: use pad_bytes in cmd over local vars
>       target: Call transport_new_cmd instead of adding to cmd queue
>       target: Remove fabric callback to allocate iovecs
>       target/iscsi: remove unsolicited_data_comp completion
>       target/file: Alloc iov[] off the stack
> 
>  drivers/infiniband/ulp/srpt/ib_srpt.c           |   22 +-
>  drivers/target/iscsi/iscsi_target.c             |  894 +++++------------
>  drivers/target/iscsi/iscsi_target_configfs.c    |   35 +-
>  drivers/target/iscsi/iscsi_target_core.h        |   39 +-
>  drivers/target/iscsi/iscsi_target_erl1.c        |   24 +-
>  drivers/target/iscsi/iscsi_target_erl2.c        |    1 -
>  drivers/target/iscsi/iscsi_target_tmr.c         |    4 +-
>  drivers/target/iscsi/iscsi_target_util.c        |  257 ++---
>  drivers/target/iscsi/iscsi_target_util.h        |   42 +-
>  drivers/target/loopback/tcm_loop.c              |   17 +-
>  drivers/target/target_core_alua.c               |    4 +-
>  drivers/target/target_core_cdb.c                |   41 +-
>  drivers/target/target_core_device.c             |    6 +-
>  drivers/target/target_core_file.c               |   27 +-
>  drivers/target/target_core_iblock.c             |    4 +-
>  drivers/target/target_core_pr.c                 |   22 +-
>  drivers/target/target_core_pscsi.c              |   20 +-
>  drivers/target/target_core_rd.c                 |    8 +-
>  drivers/target/target_core_tmr.c                |   56 +-
>  drivers/target/target_core_transport.c          | 1213 ++++++++++-------------
>  drivers/target/target_core_ua.c                 |    2 +-
>  drivers/target/tcm_fc/tfc_cmd.c                 |   18 +-
>  drivers/target/tcm_fc/tfc_io.c                  |   21 +-
>  drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c |   34 +-
>  drivers/target/tcm_vhost/tcm_vhost_fabric.c     |    2 +-
>  drivers/target/tcm_vhost/tcm_vhost_scsi.c       |    2 +-
>  include/target/target_core_base.h               |  105 +--
>  include/target/target_core_fabric_ops.h         |    5 -
>  include/target/target_core_transport.h          |    6 +-
>  29 files changed, 1071 insertions(+), 1860 deletions(-)
> 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-27 22:15 ` [PATCH 0/42 RESEND+NEW] Target updates for May 27 Nicholas A. Bellinger
@ 2011-05-27 22:35   ` Christoph Hellwig
  2011-05-27 23:39     ` Nicholas A. Bellinger
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-27 22:35 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: linux-iscsi-target-dev, linux-scsi, Christoph Hellwig,
	James Bottomley

Nick, Andi,

any chance we can get these patches posted to the list for review?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-27 22:35   ` Christoph Hellwig
@ 2011-05-27 23:39     ` Nicholas A. Bellinger
  2011-05-28  7:43       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-27 23:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-iscsi-target-dev, linux-scsi, Christoph Hellwig,
	James Bottomley

On Fri, 2011-05-27 at 18:35 -0400, Christoph Hellwig wrote:
> Nick, Andi,
> 
> any chance we can get these patches posted to the list for review?
> 

I will send these out as they are committed into lio-core-2.6.git for
next week and will be sure to include you in the specific discussions if
we run into concerns.  Also Andy, please start including Christoph and
the scsi list on subsequent drivers/target/ patch series for LIO
upstream for more reviewer eyeballs..

Btw, the LIO v4.1 upstream tree has been diverging from mainline v4.0
drivers/target/ quite a bit recently, and a number these cleanups and
improvements that Andy has made are running behind the initial mainline
merge of iscsi-target.  This series is pretty obviously all non .40
stuff, but would have really liked to get more of Andy's cleanup of
target core ahead of an initial iscsi-target .40 merge, but these
improvements to target core and the rest of Andy's changes will have to
be .41 items.

--nab












^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-27 23:39     ` Nicholas A. Bellinger
@ 2011-05-28  7:43       ` Christoph Hellwig
  2011-05-28 19:09         ` Nicholas A. Bellinger
  2011-05-29  2:23         ` Andy Grover
  0 siblings, 2 replies; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-28  7:43 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christoph Hellwig, linux-iscsi-target-dev, linux-scsi,
	Christoph Hellwig, James Bottomley

On Fri, May 27, 2011 at 04:39:23PM -0700, Nicholas A. Bellinger wrote:
> I will send these out as they are committed into lio-core-2.6.git for

Nick, the whole point is to get things reviewed before they are
commited.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-28  7:43       ` Christoph Hellwig
@ 2011-05-28 19:09         ` Nicholas A. Bellinger
  2011-05-29  2:23         ` Andy Grover
  1 sibling, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-28 19:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-iscsi-target-dev, linux-scsi, Christoph Hellwig,
	James Bottomley

On Sat, 2011-05-28 at 03:43 -0400, Christoph Hellwig wrote:
> On Fri, May 27, 2011 at 04:39:23PM -0700, Nicholas A. Bellinger wrote:
> > I will send these out as they are committed into lio-core-2.6.git for
> 
> Nick, the whole point is to get things reviewed before they are
> commited.
> 

Committed where..?  Andy has been sending lots cleanup patches regularly
the last months, but because of the very slow pace of target changes
moving into mainline these all end up going into lio-core-2.6.git.

I don't have time to seperate out all of the cleanup items that depend
upon iscsi-target from Andy, make drastic changes that James wants for
iscsi-target at the last minute of a merge window, and get everyone's
signoff on linux-scsi for everything into goes into lio-core-2.6.git.
It's not physically possible.

This workflow is quickly becoming unmanageable and extremely frustrating
when people are moving forward on lio-core-2.6.git, and we still cant
agree on basic design of iscsi-target for mainline.  I am not advocating
for less review, I want more review but the current workflow is
inefficent for the amount of changes that full time developers like Andy
are producing.

--nab



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-28  7:43       ` Christoph Hellwig
  2011-05-28 19:09         ` Nicholas A. Bellinger
@ 2011-05-29  2:23         ` Andy Grover
  2011-05-30 12:56           ` Christoph Hellwig
  1 sibling, 1 reply; 24+ messages in thread
From: Andy Grover @ 2011-05-29  2:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Nicholas A. Bellinger, linux-iscsi-target-dev, linux-scsi,
	James Bottomley

On 05/28/2011 12:43 AM, Christoph Hellwig wrote:
> On Fri, May 27, 2011 at 04:39:23PM -0700, Nicholas A. Bellinger wrote:
>> I will send these out as they are committed into lio-core-2.6.git for
> 
> Nick, the whole point is to get things reviewed before they are
> commited.

I guess this is my fault for not CCing linux-scsi. I'm working on tcm
cleanups and have been sending them to linux-iscsi-target-dev, because
there are a lot of them. Aren't you guys on that list too?

In any case, here's my git tree for review hch, I've been developing
against nab's lio-41, under the perhaps mistaken assumption that it all
would go in and then patched up to snuff:

git://fedorapeople.org/home/fedora/grover/public_git/linux-2.6.git for-nab

My work is completely unrelated to the user/kernel auth question, but
was done under the assumption that nab's iscsi fabric was going to go in.

Regards -- Andy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/42 RESEND+NEW] Target updates for May 27
  2011-05-29  2:23         ` Andy Grover
@ 2011-05-30 12:56           ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-30 12:56 UTC (permalink / raw)
  To: Andy Grover
  Cc: Christoph Hellwig, Nicholas A. Bellinger, linux-iscsi-target-dev,
	linux-scsi, James Bottomley

On Sat, May 28, 2011 at 07:23:43PM -0700, Andy Grover wrote:
> I guess this is my fault for not CCing linux-scsi. I'm working on tcm
> cleanups and have been sending them to linux-iscsi-target-dev, because
> there are a lot of them. Aren't you guys on that list too?

I'm not on it.  In fact I didn't really know about it until known.

I'll take all my complains back in that case, my fault for not beeing
on the proper list.

> In any case, here's my git tree for review hch, I've been developing
> against nab's lio-41, under the perhaps mistaken assumption that it all
> would go in and then patched up to snuff:
> 
> git://fedorapeople.org/home/fedora/grover/public_git/linux-2.6.git for-nab

Looks pretty good, and removes a few items from my todo list.

Some comments on the patches:

"target: get_cdb should never return NULL":

	looks good by itself, but I think the real fix here is to not
	leave the CDB allocation to the I/O backend, but just include
	it in the se_task.  Or maybe even in the se_cmd once that
	whole CDB splitting mess is moved into pscsi, which will
	simplify the common code further,

"target: inline struct se_transport_task into struct se_task"

	The description is wrong, it's moved into the se_cmd, which
	is the correct thing to do anyway.

"target: Rename transport_generic_handle_cdb to _new_cmd"

	About the naming:  For now it's transport_generic for exported
	APIs I think.  IMHO it's a pretty bad choice and should be
	replaced by target_, but let's do that in one big sweep.

"target: Have iscsi fabric allocate its own buffers"

	ceil seems to be a reimplementation of DIV_ROUND_UP from
	kernel.h

"target/iscsi: Do not use t_mem_list anymore"

	Very nice cleanup!

"target: Call transport_new_cmd instead of adding to cmd queue"

	Did you test all other frontends that they are fine with this
	change?
	Also the code really needs to handle errors from transport_new_cmd.

	The comment above transport_generic_new_cmd needs an update,
	or even better the wrapper should be killed and callers should
	use transport_new_cmd directly.
	The TRANSPORT_NEW_CMD enum value and code switch case in
	transport_processing_thread cab be removed now.

	Btw, it seems like transport_processing_thread and the various
	helpers to queue up work to it could be nicely cleaned up
	by using a workqueue with a work item embedded into the
	se_cmd.  That way this work could a) be made to scale to
	multiple CPUs, and b) we could kill new_cmd_map by just
	allowing the frontend to setup their own work queue handlers
	instead of going through this abstraction.

"target/file: Alloc iov[] off the stack"

	We can get pretty large iovecs, and blindly allocation them on
	the stack seems like a bad idea.  It might make sense to do
	so for the fast path, which means we'd need a __fd_do_readv
	that gets the vector passed, and then a smart wrapper using
	the stack for small vectors, and some technic making sure
	gcc doesn't bloat the stack otherwise.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 10/42] target: Rewrite transport_init_task_sg()
       [not found] ` <1306523240-15543-11-git-send-email-agrover@redhat.com>
@ 2011-05-31  7:08   ` Nicholas A. Bellinger
  2011-05-31 21:04     ` Andy Grover
  0 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31  7:08 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: target-devel, Christoph Hellwig, linux-scsi

On Fri, 2011-05-27 at 12:06 -0700, Andy Grover wrote:
> Don't set task->task_sg_num until task_sg is successfully alloced, to
> prevent it from possibly being out of sync.
> 
> Simplify loop counting sgs needed. Handle possible (?) case of offset
> greater than se_mem->se_len. Remove some cluttering debugprints. Reduce
> local variable usage.
> 
> Do not set termination bit in padded-1 entry in the padded_sg case. If
> something barfs on an extra zeroed sg, (will happen if chain-capable sgs
> don't actually get chained) then it's broken.
> 
> Remove unneeded parens.
> 
> Signed-off-by: Andy Grover <agrover@redhat.com>
> ---

So I am not exactly sure there is an issue here with task->task_sg_num
assignment in the failure case, as nothing depends AFAICT upon
task->task_sg_num in the release path before an possible
transport_init_task_sg() failure..

However, my main concern is the removal of sg_mark_end() calls  to
signify the end of task->task_sg[] memory for individual task I/O
dispatch.  As you have noticed this looks OK with existing code wrt to
backend *_map_task_sg() and *_do_task() using for_each_sg() w/
task->task_sg_num, but am wondering if it really makes sense to remove
this for the actual scatterlist dispatch for all backends..?  Is there
any possible code below TCM backend drivers where HW drivers can pose an
issue..?

I am happy to merge this for tcm v4.1 along with patch #11 if this is
really safe to do, but will defer for the moment until we can get a few
more ACKs on these two..

Thank you,

--nab

>  drivers/target/target_core_transport.c |  104 ++++++++++----------------------
>  1 files changed, 32 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index c8e61d7..7b97c6b 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -4122,116 +4122,76 @@ out:
>  	return -ENOMEM;
>  }
>  
> +/*
> + * Allocate and init task->task_sg and ->task_sg_bidi so they are large enough
> + * to handle task->task_size bytes, and based on t_mem_list se_mem chunking.
> + */
>  int transport_init_task_sg(
>  	struct se_task *task,
> -	struct se_mem *in_se_mem,
> +	struct se_mem *se_mem,
>  	u32 task_offset)
>  {
>  	struct se_cmd *se_cmd = task->task_se_cmd;
>  	struct se_device *se_dev = se_cmd->se_dev;
> -	struct se_mem *se_mem = in_se_mem;
> -	struct target_core_fabric_ops *tfo = se_cmd->se_tfo;
> -	u32 sg_length, task_size = task->task_size, task_sg_num_padded;
> -
> -	while (task_size != 0) {
> -		DEBUG_SC("se_mem->se_page(%p) se_mem->se_len(%u)"
> -			" se_mem->se_off(%u) task_offset(%u)\n",
> -			se_mem->se_page, se_mem->se_len,
> -			se_mem->se_off, task_offset);
> -
> -		if (task_offset == 0) {
> -			if (task_size >= se_mem->se_len) {
> -				sg_length = se_mem->se_len;
> +	u32 task_size = task->task_size;
> +	u32 sgs_needed = 0;
>  
> -				if (!(list_is_last(&se_mem->se_list,
> -						&se_cmd->t_mem_list)))
> -					se_mem = list_entry(se_mem->se_list.next,
> -							struct se_mem, se_list);
> -			} else {
> -				sg_length = task_size;
> -				task_size -= sg_length;
> -				goto next;
> -			}
> +	/* count how many sgs we need */
> +	while (1) {
> +		/* handle if offset is greater than se_mem->se_len */
> +		u32 se_mem_offset = min_t(u32, task_offset, se_mem->se_len);
> +		u32 se_mem_length = (se_mem->se_len - se_mem_offset);
>  
> -			DEBUG_SC("sg_length(%u) task_size(%u)\n",
> -					sg_length, task_size);
> -		} else {
> -			if ((se_mem->se_len - task_offset) > task_size) {
> -				sg_length = task_size;
> -				task_size -= sg_length;
> -				goto next;
> -			 } else {
> -				sg_length = (se_mem->se_len - task_offset);
> +		sgs_needed++;
>  
> -				if (!(list_is_last(&se_mem->se_list,
> -						&se_cmd->t_mem_list)))
> -					se_mem = list_entry(se_mem->se_list.next,
> -							struct se_mem, se_list);
> -			}
> +		if (se_mem_length >= task_size)
> +			break;
>  
> -			DEBUG_SC("sg_length(%u) task_size(%u)\n",
> -					sg_length, task_size);
> +		task_offset -= se_mem_offset;
> +		task_size -= se_mem_length;
>  
> -			task_offset = 0;
> -		}
> -		task_size -= sg_length;
> -next:
> -		DEBUG_SC("task[%u] - Reducing task_size to(%u)\n",
> -			task->task_no, task_size);
> +		BUG_ON(list_is_last(&se_mem->se_list, &se_cmd->t_mem_list));
>  
> -		task->task_sg_num++;
> +		se_mem = list_entry(se_mem->se_list.next, struct se_mem, se_list);
>  	}
> +
>  	/*
>  	 * Check if the fabric module driver is requesting that all
>  	 * struct se_task->task_sg[] be chained together..  If so,
>  	 * then allocate an extra padding SG entry for linking and
>  	 * marking the end of the chained SGL.
>  	 */
> -	if (tfo->task_sg_chaining) {
> -		task_sg_num_padded = (task->task_sg_num + 1);
> +	if (se_cmd->se_tfo->task_sg_chaining) {
> +		sgs_needed++;
>  		task->task_padded_sg = 1;
> -	} else
> -		task_sg_num_padded = task->task_sg_num;
> +	}
>  
> -	task->task_sg = kzalloc(task_sg_num_padded *
> +	task->task_sg = kzalloc(sgs_needed *
>  			sizeof(struct scatterlist), GFP_KERNEL);
>  	if (!(task->task_sg)) {
>  		printk(KERN_ERR "Unable to allocate memory for"
>  				" task->task_sg\n");
>  		return -ENOMEM;
>  	}
> -	sg_init_table(&task->task_sg[0], task_sg_num_padded);
> +	sg_init_table(task->task_sg, sgs_needed);
> +	task->task_sg_num = sgs_needed;
> +
>  	/*
>  	 * Setup task->task_sg_bidi for SCSI READ payload for
>  	 * TCM/pSCSI passthrough if present for BIDI-COMMAND
>  	 */
>  	if (!list_empty(&se_cmd->t_mem_bidi_list) &&
>  	    (se_dev->transport->transport_type == TRANSPORT_PLUGIN_PHBA_PDEV)) {
> -		task->task_sg_bidi = kzalloc(task_sg_num_padded *
> +		task->task_sg_bidi = kzalloc(sgs_needed *
>  				sizeof(struct scatterlist), GFP_KERNEL);
> -		if (!(task->task_sg_bidi)) {
> +		if (!task->task_sg_bidi) {
>  			kfree(task->task_sg);
>  			task->task_sg = NULL;
>  			printk(KERN_ERR "Unable to allocate memory for"
>  				" task->task_sg_bidi\n");
>  			return -ENOMEM;
>  		}
> -		sg_init_table(&task->task_sg_bidi[0], task_sg_num_padded);
> -	}
> -	/*
> -	 * For the chaining case, setup the proper end of SGL for the
> -	 * initial submission struct task into struct se_subsystem_api.
> -	 * This will be cleared later by transport_do_task_sg_chain()
> -	 */
> -	if (task->task_padded_sg) {
> -		sg_mark_end(&task->task_sg[task->task_sg_num - 1]);
> -		/*
> -		 * Added the 'if' check before marking end of bi-directional
> -		 * scatterlist (which gets created only in case of request
> -		 * (RD + WR).
> -		 */
> -		if (task->task_sg_bidi)
> -			sg_mark_end(&task->task_sg_bidi[task->task_sg_num - 1]);
> +		sg_init_table(task->task_sg_bidi, sgs_needed);
>  	}
>  
>  	DEBUG_SC("Successfully allocated task->task_sg_num(%u),"
> -- 
> 1.7.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 37/42] target/iscsi: Do not use t_mem_list anymore
       [not found] ` <1306523240-15543-38-git-send-email-agrover@redhat.com>
@ 2011-05-31  9:00   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31  9:00 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: target-devel, Christoph Hellwig, linux-scsi

On Fri, 2011-05-27 at 12:07 -0700, Andy Grover wrote:
> This patchset switches over from using se_cmd->t_mem_list as the base
> for building and mapping iovecs, to using iscsi's internal t_mem_sg,
> which we have since we allocated the memory ourselves. The se_offset_map
> and se_unmap_sg structs are no longer needed, replaced where needed by
> data members in iscsi_cmd itself.
> 

So for the amount of code and complexity this removes I am really very
happy to unify iscsi-target usage to follow iscsit_alloc_buffs() in
patch #31..  Again, very well done with ~300 LOC dropped for the main
iscsi-target I/O codepath.

The only small concern here (aside from more real testing with the whole
series of course) is the PAGE_SIZE direct usage in iscsit_map_iovec(),
as this would apply to the comments in patch #6 wrt to adding contigious
page scatterlist support in the future.  But with your changes that is
straight-forward enough for me to be comfortable with.  Thanks for
making that an easy decision..

> Modify iscsit_fe_sendpage_sg. It appears it used to be possible to have
> multiple pages attached to each se_mem (of which there could also be
> multiple.) Having a single array of pages simplifies the code greatly.
> 
> Modify crypto_hash fn to take sgs instead of an iovec. This works out well
> in that we don't need to rebuild the iovec, and the crypto has functions
> wanted sgs anyways.
> 
> Modify tx thread to not use struct unmap_sg, and put vars on their own
> lines.
> 
> Signed-off-by: Andy Grover <agrover@redhat.com>
> ---

A big +1 for agrover!!  Committed as 69077551ef

--nab

>  drivers/target/iscsi/iscsi_target.c      |  447 +++++++-----------------------
>  drivers/target/iscsi/iscsi_target_core.h |    9 +-
>  drivers/target/iscsi/iscsi_target_util.c |  102 ++-----
>  drivers/target/iscsi/iscsi_target_util.h |   31 +--
>  4 files changed, 142 insertions(+), 447 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 34749e8..2fe6004 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -684,259 +684,58 @@ int iscsit_add_reject_from_cmd(
>  	return (!fail_conn) ? 0 : -1;
>  }
>  
> -static void iscsit_calculate_map_segment(
> -	u32 data_length,
> -	struct se_offset_map *lm)
> -{
> -	u32 sg_offset = 0;
> -	struct se_mem *se_mem = lm->map_se_mem;
> -	/*
> -	 * Still working on pages in the current struct se_mem.
> -	 */
> -	if (!lm->map_reset) {
> -		lm->iovec_length = (lm->sg_length > PAGE_SIZE) ?
> -					PAGE_SIZE : lm->sg_length;
> -		if (data_length < lm->iovec_length)
> -			lm->iovec_length = data_length;
> -
> -		lm->iovec_base = page_address(lm->sg_page) + sg_offset;
> -		return;
> -	}
> -
> -	/*
> -	 * First run of an iscsi_linux_map_t.
> -	 *
> -	 * OR:
> -	 *
> -	 * Mapped all of the pages in the current scatterlist, move
> -	 * on to the next one.
> -	 */
> -	lm->map_reset = 0;
> -	sg_offset = se_mem->se_off;
> -	lm->sg_page = se_mem->se_page;
> -	lm->sg_length = se_mem->se_len;
> -	/*
> -	 * Get the base and length of the current page for use with the iovec.
> -	 */
> -recalc:
> -	lm->iovec_length = (lm->sg_length > (PAGE_SIZE - sg_offset)) ?
> -			   (PAGE_SIZE - sg_offset) : lm->sg_length;
> -	/*
> -	 * See if there is any iSCSI offset we need to deal with.
> -	 */
> -	if (!lm->current_offset) {
> -		lm->iovec_base = page_address(lm->sg_page) + sg_offset;
> -
> -		if (data_length < lm->iovec_length)
> -			lm->iovec_length = data_length;
> -
> -		return;
> -	}
> -
> -	/*
> -	 * We know the iSCSI offset is in the next page of the current
> -	 * scatterlist.  Increase the lm->sg_page pointer and try again.
> -	 */
> -	if (lm->current_offset >= lm->iovec_length) {
> -		lm->current_offset -= lm->iovec_length;
> -		lm->sg_length -= lm->iovec_length;
> -		lm->sg_page++;
> -		sg_offset = 0;
> -
> -		goto recalc;
> -	}
> -
> -	/*
> -	 * The iSCSI offset is in the current page, increment the iovec
> -	 * base and reduce iovec length.
> -	 */
> -	lm->iovec_base = page_address(lm->sg_page);
> -
> -	lm->iovec_base += sg_offset;
> -	lm->iovec_base += lm->current_offset;
> -
> -	if ((lm->iovec_length - lm->current_offset) < data_length)
> -		lm->iovec_length -= lm->current_offset;
> -	else
> -		lm->iovec_length = data_length;
> -
> -	if ((lm->sg_length - lm->current_offset) < data_length)
> -		lm->sg_length -= lm->current_offset;
> -	else
> -		lm->sg_length = data_length;
> -
> -	lm->current_offset = 0;
> -}
> -
>  /*
> - * Find the first se_mem we should be using, based on the offset used so far.
> - * Update lmap to look at first se_mem with room, and the offset of unused room
> - * in it.
> + * Map some portion of the allocated scatterlist to an iovec, suitable for
> + * kernel sockets to copy data in/out. This handles both pages and slab-allocated
> + * buffers, since we have been tricky and mapped t_mem_sg to the buffer in
> + * either case (see iscsit_alloc_buffs)
>   */
> -static void iscsit_get_offset(struct se_unmap_sg *usg)
> -{
> -	struct se_offset_map *lmap = &usg->lmap;
> -	struct se_mem *se_mem = NULL;
> -
> -	int offset = 0;
> -	int found = 0;
> -	int cur_se_mem_offset;
> -
> -	/* Burn through se_mems until the offset is consumed */
> -	list_for_each_entry(se_mem, &usg->se_cmd->t_mem_list, se_list) {
> -
> -		if ((offset + se_mem->se_len) < lmap->iscsi_offset) {
> -			found = 1;
> -			break;
> -		}
> -
> -		offset += se_mem->se_len;
> -	}
> -
> -	BUG_ON(!found);
> -
> -	lmap->map_orig_se_mem = se_mem;
> -	usg->cur_se_mem = se_mem;
> -
> -	cur_se_mem_offset = (lmap->iscsi_offset - offset);
> -	usg->t_offset = cur_se_mem_offset;
> -	lmap->orig_offset = cur_se_mem_offset;
> -	lmap->current_offset = cur_se_mem_offset;
> -}
> -
> -static int iscsit_set_iovec_ptrs(
> +static int iscsit_map_iovec(
>  	struct iscsi_cmd *cmd,
>  	struct kvec *iov,
>  	u32 data_offset,
> -	u32 data_length,
> -	int sg_kmap_active,
> -	struct se_unmap_sg *unmap_sg)
> +	u32 data_length)
>  {
> -	u32 i = 0; /* For iovecs */
> -	u32 j = 0; /* For scatterlists */
> -	struct se_cmd *se_cmd = &cmd->se_cmd;
> -	struct se_offset_map *lmap = &unmap_sg->lmap;
> +	u32 i;
> +	struct scatterlist *sg;
> +	unsigned int page_off;
>  
>  	/*
> -	 * Used for non scatterlist operations, assume a single iovec.
> +	 * We have a private mapping of the allocated pages in t_mem_sg.
> +	 * At this point, we also know each contains a page.
>  	 */
> -	if (!se_cmd->t_tasks_se_num) {
> -		iov[0].iov_base = se_cmd->t_task_buf + data_offset;
> -		iov[0].iov_len  = data_length;
> -		return 1;
> -	}
> +	sg = &cmd->t_mem_sg[data_offset / PAGE_SIZE];
> +	page_off = (data_offset % PAGE_SIZE) + sg->offset;
>  
> -	/*
> -	 * Set lmap->map_reset = 1 so the first call to
> -	 * iscsit_calculate_map_segment() sets up the initial
> -	 * values for struct se_offset_map.
> -	 */
> -	lmap->map_reset = 1;
> -	/*
> -	 * Get a pointer to the first used scatterlist based on the passed
> -	 * offset. Also set the rest of the needed values in iscsi_linux_map_t.
> -	 */
> -	lmap->iscsi_offset = data_offset;
> -	if (sg_kmap_active) {
> -		unmap_sg->se_cmd = se_cmd;
> -		iscsit_get_offset(unmap_sg);
> -		unmap_sg->data_length = data_length;
> -	} else {
> -		lmap->current_offset = lmap->orig_offset;
> -	}
> -	lmap->map_se_mem = lmap->map_orig_se_mem;
> +	cmd->first_data_sg = sg;
> +	cmd->first_data_sg_off = page_off;
>  
> +	i = 0;
>  	while (data_length) {
> -		/*
> -		 * Time to get the virtual address for use with iovec pointers.
> -		 * This function will return the expected iovec_base address
> -		 * and iovec_length.
> -		 */
> -		iscsit_calculate_map_segment(data_length, lmap);
> +		u32 cur_len = min_t(u32, data_length, (sg[i].length - page_off));
>  
> -		/*
> -		 * Set the iov.iov_base and iov.iov_len from the current values
> -		 * in iscsi_linux_map_t.
> -		 */
> -		iov[i].iov_base = lmap->iovec_base;
> -		iov[i].iov_len = lmap->iovec_length;
> +		iov[i].iov_base = kmap(sg_page(&sg[i]));
> +		iov[i].iov_len = cur_len;
>  
> -		/*
> -		 * Subtract the final iovec length from the total length to be
> -		 * mapped, and the length of the current scatterlist.  Also
> -		 * perform the paranoid check to make sure we are not going to
> -		 * overflow the iovecs allocated for this command in the next
> -		 * pass.
> -		 */
> -		data_length -= iov[i].iov_len;
> -		lmap->sg_length -= iov[i].iov_len;
> -
> -		if ((++i + 1) > cmd->orig_iov_data_count) {
> -			printk(KERN_ERR "Current iovec count %u is greater than"
> -				" struct se_cmd->orig_data_iov_count %u, cannot"
> -				" continue.\n", i+1, cmd->orig_iov_data_count);
> -			return -1;
> -		}
> -
> -		/*
> -		 * All done mapping this scatterlist's pages, move on to
> -		 * the next scatterlist by setting lmap.map_reset = 1;
> -		 */
> -		if (!lmap->sg_length || !data_length) {
> -			list_for_each_entry(lmap->map_se_mem,
> -					&lmap->map_se_mem->se_list, se_list)
> -				break;
> -
> -			if (!lmap->map_se_mem) {
> -				printk(KERN_ERR "Unable to locate next"
> -					" lmap->map_struct se_mem entry\n");
> -				return -1;
> -			}
> -			j++;
> -
> -			lmap->sg_page = NULL;
> -			lmap->map_reset = 1;
> -		} else
> -			lmap->sg_page++;
> +		data_length -= cur_len;
> +		page_off = 0;
> +		i++;
>  	}
>  
> -	unmap_sg->sg_count = j;
> +	cmd->kmapped_nents = i;
>  
>  	return i;
>  }
>  
> -static void iscsit_map_SG_segments(struct se_unmap_sg *unmap_sg)
> +static void iscsit_unmap_iovec(struct iscsi_cmd *cmd)
>  {
> -	u32 i = 0;
> -	struct se_cmd *cmd = unmap_sg->se_cmd;
> -	struct se_mem *se_mem = unmap_sg->cur_se_mem;
> -
> -	if (!cmd->t_tasks_se_num)
> -		return;
> -
> -	list_for_each_entry_continue(se_mem, &cmd->t_mem_list, se_list) {
> -		kmap(se_mem->se_page);
> -
> -		if (++i == unmap_sg->sg_count)
> -			break;
> -	}
> -}
> -
> -static void iscsit_unmap_SG_segments(struct se_unmap_sg *unmap_sg)
> -{
> -	u32 i = 0;
> -	struct se_cmd *cmd = unmap_sg->se_cmd;
> -	struct se_mem *se_mem = unmap_sg->cur_se_mem;
> -
> -	if (!cmd->t_tasks_se_num)
> -		return;
> +	u32 i;
> +	struct scatterlist *sg;
>  
> -	list_for_each_entry_continue(se_mem, &cmd->t_mem_list, se_list) {
> -		kunmap(se_mem->se_page);
> +	sg = cmd->first_data_sg;
>  
> -		if (++i == unmap_sg->sg_count)
> -			break;
> -	}
> +	for (i = 0; i < cmd->kmapped_nents; i++)
> +		kunmap(sg_page(&sg[i]));
>  }
>  
>  static void iscsit_ack_from_expstatsn(struct iscsi_conn *conn, u32 exp_statsn)
> @@ -969,19 +768,23 @@ static int iscsit_alloc_buffs(struct iscsi_cmd *cmd)
>  	u32 length = cmd->se_cmd.data_length;
>  	struct scatterlist *sgl;
>  	int nents = ceil(length, PAGE_SIZE);
> -	int i = 0;
> +	int i;
>  	void *buf;
>  	void *cur;
>  	struct page *page;
>  
> +	if (!length)
> +		return 0;
> +
>  	sgl = kzalloc(sizeof(*sgl) * nents, GFP_KERNEL);
>  	if (!sgl)
>  		return -ENOMEM;
> +	sg_init_table(sgl, nents);
>  
>  	/*
>  	 * Allocate from slab if nonsg, but sgl should point
>  	 * to the malloced mem.
> -	 * We know we have to kfree it if t_task_buf is set.
> +	 * We know we have to kfree it if t_mem is set.
>  	 * Alloc pages if sg.
>  	 */
>  	if (cmd->se_cmd.se_cmd_flags & SCF_SCSI_CONTROL_NONSG_IO_CDB) {
> @@ -1004,7 +807,7 @@ static int iscsit_alloc_buffs(struct iscsi_cmd *cmd)
>  		cmd->t_mem = buf;
>  
>  	} else {
> -
> +		i = 0;
>  		while (length) {
>  			int buf_size = min_t(int, length, PAGE_SIZE);
>  
> @@ -1258,6 +1061,12 @@ attach_cmd:
>  	 * Active/NonOptimized primary access state..
>  	 */
>  	core_alua_check_nonop_delay(SE_CMD(cmd));
> +
> +	ret = iscsit_alloc_buffs(cmd);
> +	if (ret < 0) {
> +		return ret;
> +	}
> +
>  	/*
>  	 * Check the CmdSN against ExpCmdSN/MaxCmdSN here if
>  	 * the Immediate Bit is not set, and no Immediate
> @@ -1307,11 +1116,6 @@ attach_cmd:
>  		goto after_immediate_data;
>  	}
>  
> -	ret = iscsit_alloc_buffs(cmd);
> -	if (ret < 0) {
> -		return ret;
> -	}
> -
>  	/*
>  	 * Immediate Data is present, send to the transport and block until
>  	 * the underlying transport plugin has allocated the buffer to
> @@ -1376,33 +1180,44 @@ after_immediate_data:
>  	return 0;
>  }
>  
> -static void iscsit_do_crypto_hash_iovec(
> +static u32 iscsit_do_crypto_hash_sg(
>  	struct hash_desc *hash,
> -	struct kvec *iov,
> -	u32 counter,
> +	struct iscsi_cmd *cmd,
> +	u32 data_offset,
> +	u32 data_length,
>  	u32 padding,
> -	u8 *pad_bytes,
> -	u8 *data_crc)
> +	u8 *pad_bytes)
>  {
> -	struct scatterlist sg;
> -	struct kvec *iov_ptr = iov;
> +	u32 data_crc;
> +	u32 i;
> +	struct scatterlist *sg;
> +	unsigned int page_off;
>  
>  	crypto_hash_init(hash);
>  
> -	while (counter > 0) {
> -		sg_init_one(&sg, iov_ptr->iov_base,
> -				iov_ptr->iov_len);
> -		crypto_hash_update(hash, &sg, iov_ptr->iov_len);
> -		
> -		counter -= iov_ptr->iov_len;
> -		iov_ptr++;
> +	sg = cmd->first_data_sg;
> +	page_off = cmd->first_data_sg_off;
> +
> +	i = 0;
> +	while (data_length) {
> +		u32 cur_len = min_t(u32, data_length, (sg[i].length - page_off));
> +
> +		crypto_hash_update(hash, sg, cur_len);
> +
> +		data_length -= cur_len;
> +		page_off = 0;
> +		i++;
>  	}
>  
>  	if (padding) {
> -		sg_init_one(&sg, pad_bytes, padding);
> -		crypto_hash_update(hash, &sg, padding);
> +		struct scatterlist pad_sg;
> +
> +		sg_init_one(&pad_sg, pad_bytes, padding);
> +		crypto_hash_update(hash, &pad_sg, padding);
>  	}
> -	crypto_hash_final(hash, data_crc);
> +	crypto_hash_final(hash, (u8 *) &data_crc);
> +
> +	return data_crc;
>  }
>  
>  static void iscsit_do_crypto_hash_buf(
> @@ -1435,7 +1250,6 @@ static int iscsit_handle_data_out(struct iscsi_conn *conn, unsigned char *buf)
>  	u32 rx_size = 0, payload_length;
>  	struct iscsi_cmd *cmd = NULL;
>  	struct se_cmd *se_cmd;
> -	struct se_unmap_sg unmap_sg;
>  	struct iscsi_data *hdr;
>  	struct kvec *iov;
>  	unsigned long flags;
> @@ -1606,11 +1420,7 @@ static int iscsit_handle_data_out(struct iscsi_conn *conn, unsigned char *buf)
>  	rx_size += payload_length;
>  	iov = &cmd->iov_data[0];
>  
> -	memset(&unmap_sg, 0, sizeof(struct se_unmap_sg));
> -	unmap_sg.fabric_cmd = (void *)cmd;
> -	unmap_sg.se_cmd = SE_CMD(cmd);
> -
> -	iov_ret = iscsit_set_iovec_ptrs(cmd, iov, hdr->offset, payload_length, 1, &unmap_sg);
> +	iov_ret = iscsit_map_iovec(cmd, iov, hdr->offset, payload_length);
>  	if (iov_ret < 0)
>  		return -1;
>  
> @@ -1630,30 +1440,19 @@ static int iscsit_handle_data_out(struct iscsi_conn *conn, unsigned char *buf)
>  		rx_size += ISCSI_CRC_LEN;
>  	}
>  
> -	iscsit_map_SG_segments(&unmap_sg);
> -
>  	rx_got = rx_data(conn, &cmd->iov_data[0], iov_count, rx_size);
>  
> -	iscsit_unmap_SG_segments(&unmap_sg);
> +	iscsit_unmap_iovec(cmd);
>  
>  	if (rx_got != rx_size)
>  		return -1;
>  
>  	if (conn->conn_ops->DataDigest) {
> -		u32 counter = payload_length, data_crc = 0;
> -		struct kvec *iov_ptr = &cmd->iov_data[0];
> +		u32 data_crc;
>  
> -		/*
> -		 * Thanks to the IP stack shitting on passed iovecs,  we have to
> -		 * call set_iovec_data_ptrs() again in order to have a iMD/PSCSI
> -		 * agnostic way of doing datadigests computations.
> -		 */
> -		if (iscsit_set_iovec_ptrs(cmd, iov_ptr, hdr->offset, payload_length, 0, &unmap_sg) < 0)
> -			return -1;
> -
> -		iscsit_do_crypto_hash_iovec(&conn->conn_rx_hash,
> -				iov_ptr, counter, padding,
> -				(u8 *)&pad_bytes, (u8 *)&data_crc);
> +		data_crc = iscsit_do_crypto_hash_sg(&conn->conn_rx_hash, cmd,
> +						    hdr->offset, payload_length, padding,
> +						    (u8 *)&pad_bytes);
>  
>  		if (checksum != data_crc) {
>  			printk(KERN_ERR "ITT: 0x%08x, Offset: %u, Length: %u,"
> @@ -2515,15 +2314,9 @@ static int iscsit_handle_immediate_data(
>  	int iov_ret, rx_got = 0, rx_size = 0;
>  	u32 checksum, iov_count = 0, padding = 0, pad_bytes = 0;
>  	struct iscsi_conn *conn = cmd->conn;
> -	struct se_unmap_sg unmap_sg;
>  	struct kvec *iov;
>  
> -	memset(&unmap_sg, 0, sizeof(struct se_unmap_sg));
> -	unmap_sg.fabric_cmd = (void *)cmd;
> -	unmap_sg.se_cmd = SE_CMD(cmd);
> -
> -	iov_ret = iscsit_set_iovec_ptrs(cmd, cmd->iov_data, cmd->write_data_done,
> -					length, 1, &unmap_sg);
> +	iov_ret = iscsit_map_iovec(cmd, cmd->iov_data, cmd->write_data_done, length);
>  	if (iov_ret < 0)
>  		return IMMEDIATE_DATA_CANNOT_RECOVER;
>  
> @@ -2544,11 +2337,9 @@ static int iscsit_handle_immediate_data(
>  		rx_size += ISCSI_CRC_LEN;
>  	}
>  
> -	iscsit_map_SG_segments(&unmap_sg);
> -
>  	rx_got = rx_data(conn, &cmd->iov_data[0], iov_count, rx_size);
>  
> -	iscsit_unmap_SG_segments(&unmap_sg);
> +	iscsit_unmap_iovec(cmd);
>  
>  	if (rx_got != rx_size) {
>  		iscsit_rx_thread_wait_for_tcp(conn);
> @@ -2556,19 +2347,11 @@ static int iscsit_handle_immediate_data(
>  	}
>  
>  	if (conn->conn_ops->DataDigest) {
> -		u32 counter = length, data_crc;
> -		struct kvec *iov_ptr = &cmd->iov_data[0];
> -		/*
> -		 * Thanks to the IP stack shitting on passed iovecs,  we have to
> -		 * call set_iovec_data_ptrs again in order to have a iMD/PSCSI
> -		 * agnostic way of doing datadigests computations.
> -		 */
> -		if (iscsit_set_iovec_ptrs(cmd, iov_ptr, cmd->write_data_done, length, 0, &unmap_sg) < 0)
> -			return IMMEDIATE_DATA_CANNOT_RECOVER;
> +		u32 data_crc;
>  
> -		iscsit_do_crypto_hash_iovec(&conn->conn_rx_hash, iov_ptr,
> -					counter, padding,
> -					(u8 *)&pad_bytes, (u8 *)&data_crc);
> +		data_crc = iscsit_do_crypto_hash_sg(&conn->conn_rx_hash, cmd,
> +						    cmd->write_data_done, length, padding,
> +						    (u8 *)&pad_bytes);
>  
>  		if (checksum != data_crc) {
>  			printk(KERN_ERR "ImmediateData CRC32C DataDigest 0x%08x"
> @@ -2699,7 +2482,6 @@ static int iscsit_send_conn_drop_async_message(
>  static int iscsit_send_data_in(
>  	struct iscsi_cmd *cmd,
>  	struct iscsi_conn *conn,
> -	struct se_unmap_sg *unmap_sg,
>  	int *eodr)
>  {
>  	int iov_ret = 0, set_statsn = 0;
> @@ -2709,6 +2491,7 @@ static int iscsit_send_data_in(
>  	struct iscsi_datain_req *dr;
>  	struct iscsi_data_rsp *hdr;
>  	struct kvec *iov;
> +	u32 padding;
>  
>  	memset(&datain, 0, sizeof(struct iscsi_datain));
>  	dr = iscsit_get_datain_values(cmd, &datain);
> @@ -2805,37 +2588,26 @@ static int iscsit_send_data_in(
>  			" for DataIN PDU 0x%08x\n", *header_digest);
>  	}
>  
> -	iov_ret = iscsit_set_iovec_ptrs(cmd, &cmd->iov_data[iov_count], datain.offset,
> -					datain.length, 1, unmap_sg);
> +	iov_ret = iscsit_map_iovec(cmd, &cmd->iov_data[1], datain.offset, datain.length);
>  	if (iov_ret < 0)
>  		return -1;
>  
>  	iov_count += iov_ret;
>  	tx_size += datain.length;
>  
> -	unmap_sg->padding = ((-datain.length) & 3);
> -	if (unmap_sg->padding != 0) {
> -		pad_bytes = kzalloc(unmap_sg->padding * sizeof(u8),
> -					GFP_KERNEL);
> -		if (!pad_bytes) {
> -			printk(KERN_ERR "Unable to allocate memory for"
> -					" pad_bytes.\n");
> -			return -1;
> -		}
> -		cmd->buf_ptr = pad_bytes;
> +	padding = ((-datain.length) & 3);
> +	if (padding) {
> +		pad_bytes = cmd->pad_bytes;
>  		iov[iov_count].iov_base		= pad_bytes;
> -		iov[iov_count++].iov_len	= unmap_sg->padding;
> -		tx_size += unmap_sg->padding;
> +		iov[iov_count++].iov_len	= cmd->padding;
> +		tx_size += cmd->padding;
>  
>  		TRACE(TRACE_ISCSI, "Attaching %u padding bytes\n",
> -				unmap_sg->padding);
> +				cmd->padding);
>  	}
>  	if (conn->conn_ops->DataDigest) {
> -		u32 counter = (datain.length + unmap_sg->padding);
> -		struct kvec *iov_ptr = &cmd->iov_data[1];
> -
> -		iscsit_do_crypto_hash_iovec(&conn->conn_tx_hash, iov_ptr,
> -				counter, 0, NULL, (u8 *)&cmd->data_crc);
> +		cmd->data_crc = iscsit_do_crypto_hash_sg(&conn->conn_tx_hash, cmd,
> +			 datain.offset, datain.length, cmd->padding, cmd->pad_bytes);
>  
>  		iov[iov_count].iov_base	= &cmd->data_crc;
>  		iov[iov_count++].iov_len = ISCSI_CRC_LEN;
> @@ -3741,13 +3513,16 @@ static inline void iscsit_thread_check_cpumask(
>  int iscsi_target_tx_thread(void *arg)
>  {
>  	u8 state;
> -	int eodr = 0, map_sg = 0, ret = 0, sent_status = 0, use_misc = 0;
> +	int eodr = 0;
> +	int ret = 0;
> +	int sent_status = 0;
> +	int use_misc = 0;
> +	int map_sg = 0;
>  	struct iscsi_cmd *cmd = NULL;
>  	struct iscsi_conn *conn;
>  	struct iscsi_queue_req *qr = NULL;
>  	struct se_cmd *se_cmd;
>  	struct iscsi_thread_set *ts = (struct iscsi_thread_set *)arg;
> -	struct se_unmap_sg unmap_sg;
>  	/*
>  	 * Allow ourselves to be interrupted by SIGINT so that a
>  	 * connection recovery / failure event can be triggered externally.
> @@ -3878,13 +3653,9 @@ check_rsp_state:
>  			switch (state) {
>  			case ISTATE_SEND_DATAIN:
>  				spin_unlock_bh(&cmd->istate_lock);
> -				memset(&unmap_sg, 0,
> -						sizeof(struct se_unmap_sg));
> -				unmap_sg.fabric_cmd = (void *)cmd;
> -				unmap_sg.se_cmd = SE_CMD(cmd);
> -				map_sg = 1;
>  				ret = iscsit_send_data_in(cmd, conn,
> -						&unmap_sg, &eodr);
> +							  &eodr);
> +				map_sg = 1;
>  				break;
>  			case ISTATE_SEND_STATUS:
>  			case ISTATE_SEND_STATUS_RECOVERY:
> @@ -3943,32 +3714,22 @@ check_rsp_state:
>  
>  			se_cmd = &cmd->se_cmd;
>  
> -			if (map_sg && !conn->conn_ops->IFMarker &&
> -			    se_cmd->t_tasks_se_num) {
> -				iscsit_map_SG_segments(&unmap_sg);
> -				if (iscsit_fe_sendpage_sg(&unmap_sg, conn) < 0) {
> +			if (map_sg && !conn->conn_ops->IFMarker) {
> +				if (iscsit_fe_sendpage_sg(cmd, conn) < 0) {
>  					conn->tx_response_queue = 0;
>  					iscsit_tx_thread_wait_for_tcp(conn);
> -					iscsit_unmap_SG_segments(&unmap_sg);
> +					iscsit_unmap_iovec(cmd);
>  					goto transport_err;
>  				}
> -				iscsit_unmap_SG_segments(&unmap_sg);
> -				map_sg = 0;
>  			} else {
> -				if (map_sg)
> -					iscsit_map_SG_segments(&unmap_sg);
>  				if (iscsit_send_tx_data(cmd, conn, use_misc) < 0) {
>  					conn->tx_response_queue = 0;
>  					iscsit_tx_thread_wait_for_tcp(conn);
> -					if (map_sg)
> -						iscsit_unmap_SG_segments(&unmap_sg);
> +					iscsit_unmap_iovec(cmd);
>  					goto transport_err;
>  				}
> -				if (map_sg) {
> -					iscsit_unmap_SG_segments(&unmap_sg);
> -					map_sg = 0;
> -				}
>  			}
> +			iscsit_unmap_iovec(cmd);
>  
>  			spin_lock_bh(&cmd->istate_lock);
>  			switch (state) {
> diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h
> index d64cd30..693f8c8 100644
> --- a/drivers/target/iscsi/iscsi_target_core.h
> +++ b/drivers/target/iscsi/iscsi_target_core.h
> @@ -388,8 +388,6 @@ struct iscsi_cmd {
>  	u32			orig_iov_data_count;
>  	/* Number of miscellaneous iovecs used for IP stack calls */
>  	u32			iov_misc_count;
> -	/* Bytes used for 32-bit word padding */
> -	u32			pad_bytes;
>  	/* Number of struct iscsi_pdu in struct iscsi_cmd->pdu_list */
>  	u32			pdu_count;
>  	/* Next struct iscsi_pdu to send in struct iscsi_cmd->pdu_list */
> @@ -477,6 +475,13 @@ struct iscsi_cmd {
>  	struct scatterlist	*t_mem_sg;
>  	u32			t_mem_sg_nents;
>  
> +	u32			padding;
> +	u8			pad_bytes[4];
> +
> +	struct scatterlist	*first_data_sg;
> +	u32			first_data_sg_off;
> +	u32			kmapped_nents;
> +
>  }  ____cacheline_aligned;
>  
>  #define SE_CMD(cmd)		(&(cmd)->se_cmd)
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index 6d770a4..79d3b6e 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -1275,22 +1275,26 @@ send_data:
>  }
>  
>  int iscsit_fe_sendpage_sg(
> -	struct se_unmap_sg *u_sg,
> +	struct iscsi_cmd *cmd,
>  	struct iscsi_conn *conn)
>  {
>  	int tx_sent;
> -	struct iscsi_cmd *cmd = (struct iscsi_cmd *)u_sg->fabric_cmd;
> -	struct se_cmd *se_cmd = SE_CMD(cmd);
> -	u32 len = cmd->tx_size, pg_len, se_len, se_off, tx_size;
> -	struct kvec *iov = &cmd->iov_data[0];
> -	struct page *page;
> -	struct se_mem *se_mem = u_sg->cur_se_mem;
> +	u32 tx_hdr_size;
> +	u32 data_len;
> +	struct kvec iov;
> +	struct scatterlist *sg = cmd->first_data_sg;
> +	u32 offset = cmd->first_data_sg_off;
>  
>  send_hdr:
> -	tx_size = (conn->conn_ops->HeaderDigest) ? ISCSI_HDR_LEN + ISCSI_CRC_LEN :
> -			ISCSI_HDR_LEN;
> -	tx_sent = tx_data(conn, iov, 1, tx_size);
> -	if (tx_size != tx_sent) {
> +	tx_hdr_size = ISCSI_HDR_LEN;
> +	if (conn->conn_ops->HeaderDigest)
> +		tx_hdr_size += ISCSI_CRC_LEN;
> +
> +	iov.iov_base = cmd->pdu;
> +	iov.iov_len = tx_hdr_size;
> +
> +	tx_sent = tx_data(conn, &iov, 1, tx_hdr_size);
> +	if (tx_hdr_size != tx_sent) {
>  		if (tx_sent == -EAGAIN) {
>  			printk(KERN_ERR "tx_data() returned -EAGAIN\n");
>  			goto send_hdr;
> @@ -1298,46 +1302,20 @@ send_hdr:
>  		return -1;
>  	}
>  
> -	len -= tx_size;
> -	len -= u_sg->padding;
> +	data_len = cmd->tx_size - tx_hdr_size - cmd->padding;
>  	if (conn->conn_ops->DataDigest)
> -		len -= ISCSI_CRC_LEN;
> -	/*
> -	 * Start calculating from the first page of current struct se_mem.
> -	 */
> -	page = se_mem->se_page;
> -	pg_len = (PAGE_SIZE - se_mem->se_off);
> -	se_len = se_mem->se_len;
> -	if (se_len < pg_len)
> -		pg_len = se_len;
> -	se_off = se_mem->se_off;
> -	/*
> -	 * Calucate new se_len and se_off based upon u_sg->t_offset into
> -	 * the current struct se_mem and possibily a different page.
> -	 */
> -	while (u_sg->t_offset) {
> -		if (u_sg->t_offset >= pg_len) {
> -			u_sg->t_offset -= pg_len;
> -			se_len -= pg_len;
> -			se_off = 0;
> -			pg_len = PAGE_SIZE;
> -			page++;
> -		} else {
> -			se_off += u_sg->t_offset;
> -			se_len -= u_sg->t_offset;
> -			u_sg->t_offset = 0;
> -		}
> -	}
> +		data_len -= ISCSI_CRC_LEN;
> +
>  	/*
> -	 * Perform sendpage() for each page in the struct se_mem
> +	 * Perform sendpage() for each page in the scatterlist
>  	 */
> -	while (len) {
> -		if (se_len > len)
> -			se_len = len;
> +	while (data_len) {
> +		u32 space = (sg->length - offset);
> +		u32 sub_len = min_t(u32, data_len, space);
>  send_pg:
>  		tx_sent = conn->sock->ops->sendpage(conn->sock,
> -				page, se_off, se_len, 0);
> -		if (tx_sent != se_len) {
> +						    sg_page(sg), offset, sub_len, 0);
> +		if (tx_sent != sub_len) {
>  			if (tx_sent == -EAGAIN) {
>  				printk(KERN_ERR "tcp_sendpage() returned"
>  						" -EAGAIN\n");
> @@ -1349,38 +1327,18 @@ send_pg:
>  			return -1;
>  		}
>  
> -		len -= se_len;
> -		if (!(len))
> -			break;
> -
> -		se_len -= tx_sent;
> -		if (!(se_len)) {
> -			list_for_each_entry_continue(se_mem,
> -					&se_cmd->t_mem_list, se_list)
> -				break;
> -
> -			if (!se_mem) {
> -				printk(KERN_ERR "Unable to locate next struct se_mem\n");
> -				return -1;
> -			}
> -
> -			se_len = se_mem->se_len;
> -			se_off = se_mem->se_off;
> -			page = se_mem->se_page;
> -		} else {
> -			se_len = PAGE_SIZE;
> -			se_off = 0;
> -			page++;
> -		}
> +		data_len -= sub_len;
> +		offset = 0;
> +		sg++;
>  	}
>  
>  send_padding:
> -	if (u_sg->padding) {
> +	if (cmd->padding) {
>  		struct kvec *iov_p =
>  			&cmd->iov_data[cmd->iov_data_count-2];
>  
> -		tx_sent = tx_data(conn, iov_p, 1, u_sg->padding);
> -		if (u_sg->padding != tx_sent) {
> +		tx_sent = tx_data(conn, iov_p, 1, cmd->padding);
> +		if (cmd->padding != tx_sent) {
>  			if (tx_sent == -EAGAIN) {
>  				printk(KERN_ERR "tx_data() returned -EAGAIN\n");
>  				goto send_padding;
> diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target/iscsi/iscsi_target_util.h
> index 9bd5cc0..2cd49d6 100644
> --- a/drivers/target/iscsi/iscsi_target_util.h
> +++ b/drivers/target/iscsi/iscsi_target_util.h
> @@ -3,35 +3,6 @@
>  
>  #define MARKER_SIZE	8
>  
> -struct se_cmd;
> -
> -struct se_offset_map {
> -	int                     map_reset;
> -	u32                     iovec_length;
> -	u32                     iscsi_offset;
> -	u32                     current_offset;
> -	u32                     orig_offset;
> -	u32                     sg_count;
> -	u32                     sg_current;
> -	u32                     sg_length;
> -	struct page		*sg_page;
> -	struct se_mem		*map_se_mem;
> -	struct se_mem		*map_orig_se_mem;
> -	void			*iovec_base;
> -};
> -
> -struct se_unmap_sg {
> -	u32			data_length;
> -	u32			sg_count;
> -	u32			sg_offset;
> -	u32			padding;
> -	u32			t_offset;
> -	void			*fabric_cmd;
> -	struct se_cmd		*se_cmd;
> -	struct se_offset_map	lmap;
> -	struct se_mem		*cur_se_mem;
> -};
> -
>  extern int iscsit_add_r2t_to_list(struct iscsi_cmd *, u32, u32, int, u32);
>  extern struct iscsi_r2t *iscsit_get_r2t_for_eos(struct iscsi_cmd *, u32, u32);
>  extern struct iscsi_r2t *iscsit_get_r2t_from_list(struct iscsi_cmd *);
> @@ -75,7 +46,7 @@ extern void __iscsit_start_nopin_timer(struct iscsi_conn *);
>  extern void iscsit_start_nopin_timer(struct iscsi_conn *);
>  extern void iscsit_stop_nopin_timer(struct iscsi_conn *);
>  extern int iscsit_send_tx_data(struct iscsi_cmd *, struct iscsi_conn *, int);
> -extern int iscsit_fe_sendpage_sg(struct se_unmap_sg *, struct iscsi_conn *);
> +extern int iscsit_fe_sendpage_sg(struct iscsi_cmd *, struct iscsi_conn *);
>  extern int iscsit_tx_login_rsp(struct iscsi_conn *, u8, u8);
>  extern void iscsit_print_session_params(struct iscsi_session *);
>  extern int iscsit_print_dev_to_proc(char *, char **, off_t, int);
> -- 
> 1.7.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
       [not found] ` <1306523240-15543-40-git-send-email-agrover@redhat.com>
@ 2011-05-31  9:32   ` Nicholas A. Bellinger
  2011-05-31  9:48     ` Christoph Hellwig
  2011-05-31 10:17     ` Christoph Hellwig
  0 siblings, 2 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31  9:32 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: target-devel, Christoph Hellwig, linux-scsi

On Fri, 2011-05-27 at 12:07 -0700, Andy Grover wrote:
> This will cause transport_new_cmd to be executed in the calling thread's
> context, instead of the backstore thread's context. This saves two
> context switches per cmd.
> 
> Signed-off-by: Andy Grover <agrover@redhat.com>
> ---

As Christoph mentioned earler, this original patch to enable your
iscsi-target optimization series breaks all non iscsi-target fabrics.
But hey, that is easy enough for me to fix now.  ;-)

Committed as 5bb520f61 with the following changes to skip patch #25/#26
for the wholesale target_*() naming for the moment, and make your
original change here optional based upon TFO->new_cmd_map() usage for
other fabric modules.

Thanks again Andy,

--nab

-----------------------------------------------------------------------

[PATCH] target: Add optional transport_generic_new_cmd instead of adding to cmd queue

This will cause transport_generic_new_cmd() to be optionally executed in the calling
thread's context, instead of the backstore thread's context.  This saves two
context switches per cmd, and assumes that fabric modules (like iscsi-target) who
do not provide a struct target_core_fabric_ops->new_cmd_map() in order to work
with existing non iscsi-target fabric modules.

Signed-off-by: Andy Grover <agrover@redhat.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_transport.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 915b2ef..7596f6f 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -203,6 +203,7 @@ typedef int (*map_func_t)(struct se_task *, u32);

 static int transport_generic_write_pending(struct se_cmd *);
 static int transport_processing_thread(void *param);
+static int transport_generic_new_cmd(struct se_cmd *cmd);
 static int __transport_execute_tasks(struct se_device *dev);
 static void transport_complete_task_attr(struct se_cmd *cmd);
 static void transport_direct_request_timeout(struct se_cmd *cmd);
@@ -751,7 +752,7 @@ void transport_cmd_finish_abort_tmr(struct se_cmd *cmd)
        transport_generic_remove(cmd, 0, 0);
 }

-static void transport_add_cmd_to_queue(
+static int transport_add_cmd_to_queue(
        struct se_cmd *cmd,
        int t_state)
 {
@@ -775,6 +776,8 @@ static void transport_add_cmd_to_queue(

        atomic_inc(&qobj->queue_cnt);
        wake_up_interruptible(&qobj->thread_wq);
+
+       return 0;
 }

 static struct se_cmd *
@@ -1847,9 +1850,14 @@ int transport_generic_handle_cdb(
                printk(KERN_ERR "cmd->se_lun is NULL\n");
                return -EINVAL;
        }
-
-       transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD);
-       return 0;
+       /*
+        * Assume that target fabrics not presenting an cmd->se_tfo->new_cmd_map
+        * function pointer (like iscsi-target) want direct execution of
+        * transport_generic_new_cmd()
+        */
+       return (cmd->se_tfo->new_cmd_map) ?
+               transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD) :
+               transport_generic_new_cmd(cmd);
 }
 EXPORT_SYMBOL(transport_generic_handle_cdb);

-- 
1.7.2.5






^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31  9:32   ` [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue Nicholas A. Bellinger
@ 2011-05-31  9:48     ` Christoph Hellwig
  2011-05-31 10:10       ` Nicholas A. Bellinger
  2011-05-31 10:17     ` Christoph Hellwig
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-31  9:48 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: linux-iscsi-target-dev, target-devel, Christoph Hellwig,
	linux-scsi

On Tue, May 31, 2011 at 02:32:47AM -0700, Nicholas A. Bellinger wrote:
> -static void transport_add_cmd_to_queue(
> +static int transport_add_cmd_to_queue(
>         struct se_cmd *cmd,
>         int t_state)
>  {
> @@ -775,6 +776,8 @@ static void transport_add_cmd_to_queue(
> 
>         atomic_inc(&qobj->queue_cnt);
>         wake_up_interruptible(&qobj->thread_wq);
> +
> +       return 0;
>  }
> 
>  static struct se_cmd *
> @@ -1847,9 +1850,14 @@ int transport_generic_handle_cdb(
>                 printk(KERN_ERR "cmd->se_lun is NULL\n");
>                 return -EINVAL;
>         }
> -
> -       transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD);
> -       return 0;
> +       /*
> +        * Assume that target fabrics not presenting an cmd->se_tfo->new_cmd_map
> +        * function pointer (like iscsi-target) want direct execution of
> +        * transport_generic_new_cmd()
> +        */
> +       return (cmd->se_tfo->new_cmd_map) ?
> +               transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD) :
> +               transport_generic_new_cmd(cmd);
>  }

This it too ugly to live, please revert it.

The proper way to do it is to simply mark transport_generic_new_cmd
non-static and call it directly from the iscsi code.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 41/42] target/iscsi: remove unsolicited_data_comp completion
       [not found] ` <1306523240-15543-42-git-send-email-agrover@redhat.com>
@ 2011-05-31  9:58   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31  9:58 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: target-devel, Christoph Hellwig, linux-scsi

On Fri, 2011-05-27 at 12:07 -0700, Andy Grover wrote:
> Since iscsi now allocates its own buffers and in its own thread, we don't
> need the unsolicited_data_comp completion.
> 
> Signed-off-by: Andy Grover <agrover@redhat.com>
> ---

Committed as 26bc999ea68 with a few very minor changes in order to apply
w/o patch #25-#26..

Once again, very well done for enabling direct task allocation from
iscsi RX thread context and saving the initial context switch, and the
second switch with immediate write data and/or unsolicited data out.

I am excited to see to see how much of an latency improvement this buys
gives us.  Really great work Andy.  :)

--nab

>  drivers/target/iscsi/iscsi_target.c          |   41 +++----------------------
>  drivers/target/iscsi/iscsi_target_configfs.c |   13 ++------
>  drivers/target/iscsi/iscsi_target_core.h     |    2 -
>  drivers/target/iscsi/iscsi_target_util.c     |    1 -
>  drivers/target/target_core_transport.c       |    2 +-
>  5 files changed, 9 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index a36064b..e8b72d4 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -1139,15 +1139,9 @@ attach_cmd:
>  		goto after_immediate_data;
>  	}
>  
> -	/*
> -	 * Immediate Data is present, send to the transport and block until
> -	 * the underlying transport plugin has allocated the buffer to
> -	 * receive the Immediate Write Data into.
> -	 */
> +	/* Allocates backstore tasks */
>  	transport_generic_new_cmd(SE_CMD(cmd));
>  
> -	wait_for_completion(&cmd->unsolicited_data_comp);
> -
>  	if (SE_CMD(cmd)->se_cmd_flags & SCF_SE_CMD_FAILED) {
>  		immed_ret = IMMEDIATE_DATA_NORMAL_OPERATION;
>  		dump_immediate_data = 1;
> @@ -1344,7 +1338,7 @@ static int iscsit_handle_data_out(struct iscsi_conn *conn, unsigned char *buf)
>  	}
>  
>  	if (cmd->unsolicited_data) {
> -		int dump_unsolicited_data = 0, wait_for_transport = 0;
> +		int dump_unsolicited_data = 0;
>  
>  		if (conn->sess->sess_ops->InitialR2T) {
>  			printk(KERN_ERR "Received unexpected unsolicited data"
> @@ -1358,37 +1352,12 @@ static int iscsit_handle_data_out(struct iscsi_conn *conn, unsigned char *buf)
>  		 * and Unsupported SAM WRITE Opcodes and SE resource allocation
>  		 * failures;
>  		 */
> +
> +		/* Something's amiss if we're not in WRITE_PENDING state... */
>  		spin_lock_irqsave(&se_cmd->t_state_lock, flags);
> -		/*
> -		 * Handle cases where we do or do not want to sleep on
> -		 * unsolicited_data_comp
> -		 *
> -		 * First, if TRANSPORT_WRITE_PENDING state has not been reached,
> -		 * we need assume we need to wait and sleep..
> -		 */
> -		 wait_for_transport =
> -				(se_cmd->t_state != TRANSPORT_WRITE_PENDING);
> -		/*
> -		 * For the ImmediateData=Yes cases, there will already be
> -		 * generic target memory allocated with the original
> -		 * ISCSI_OP_SCSI_CMD PDU, so do not sleep for that case.
> -		 *
> -		 * The last is a check for a delayed TASK_ABORTED status that
> -		 * means the data payload will be dropped because
> -		 * SCF_SE_CMD_FAILED has been set to indicate that an exception
> -		 * condition for this struct sse_cmd has occured in generic target
> -		 * code that requires us to drop payload.
> -		 */
> -		wait_for_transport =
> -				(se_cmd->t_state != TRANSPORT_WRITE_PENDING);
> -		if ((cmd->immediate_data != 0) ||
> -		    (atomic_read(&se_cmd->t_transport_aborted) != 0))
> -			wait_for_transport = 0;
> +		WARN_ON(se_cmd->t_state != TRANSPORT_WRITE_PENDING);
>  		spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
>  
> -		if (wait_for_transport)
> -			wait_for_completion(&cmd->unsolicited_data_comp);
> -
>  		spin_lock_irqsave(&se_cmd->t_state_lock, flags);
>  		if (!(se_cmd->se_cmd_flags & SCF_SUPPORTED_SAM_OPCODE) ||
>  		     (se_cmd->se_cmd_flags & SCF_SE_CMD_FAILED))
> diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c
> index 4d712cf..fafeddf 100644
> --- a/drivers/target/iscsi/iscsi_target_configfs.c
> +++ b/drivers/target/iscsi/iscsi_target_configfs.c
> @@ -1510,10 +1510,7 @@ static int iscsi_get_cmd_state(struct se_cmd *se_cmd)
>  
>  static void iscsi_new_cmd_failure(struct se_cmd *se_cmd)
>  {
> -	struct iscsi_cmd *cmd = container_of(se_cmd, struct iscsi_cmd, se_cmd);
> -
> -	if (cmd->immediate_data || cmd->unsolicited_data)
> -		complete(&cmd->unsolicited_data_comp);
> +	/* We no longer sleep while core allocs (we do alloc), so we can ignore this */
>  }
>  
>  static int iscsi_is_state_remove(struct se_cmd *se_cmd)
> @@ -1572,12 +1569,8 @@ static int lio_write_pending(struct se_cmd *se_cmd)
>  {
>  	struct iscsi_cmd *cmd = container_of(se_cmd, struct iscsi_cmd, se_cmd);
>  
> -	if (cmd->immediate_data || cmd->unsolicited_data)
> -		complete(&cmd->unsolicited_data_comp);
> -	else {
> -		if (iscsit_build_r2ts_for_cmd(cmd, cmd->conn, 1) < 0)
> -			return PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES;
> -	}
> +	if (!cmd->immediate_data && !cmd->unsolicited_data)
> +		return iscsit_build_r2ts_for_cmd(cmd, cmd->conn, 1);
>  
>  	return 0;
>  }
> diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h
> index 693f8c8..fa45d87 100644
> --- a/drivers/target/iscsi/iscsi_target_core.h
> +++ b/drivers/target/iscsi/iscsi_target_core.h
> @@ -438,8 +438,6 @@ struct iscsi_cmd {
>  	/* R2T List */
>  	struct list_head	cmd_r2t_list;
>  	struct completion	reject_comp;
> -	/* Semaphore used for allocating buffer */
> -	struct completion	unsolicited_data_comp;
>  	/* Timer for DataOUT */
>  	struct timer_list	dataout_timer;
>  	/* Iovecs for SCSI data payload RX/TX w/ kernel level sockets */
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index 79d3b6e..c6219e5 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -153,7 +153,6 @@ struct iscsi_cmd *iscsit_allocate_cmd(struct iscsi_conn *conn, gfp_t gfp_mask)
>  	INIT_LIST_HEAD(&cmd->datain_list);
>  	INIT_LIST_HEAD(&cmd->cmd_r2t_list);
>  	init_completion(&cmd->reject_comp);
> -	init_completion(&cmd->unsolicited_data_comp);
>  	spin_lock_init(&cmd->datain_lock);
>  	spin_lock_init(&cmd->dataout_timeout_lock);
>  	spin_lock_init(&cmd->istate_lock);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 3c2c74f..7759404 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -4715,7 +4715,7 @@ static int transport_new_cmd(struct se_cmd *cmd)
>  	}
>  
>  	/*
> -	 * For WRITEs, let the iSCSI Target RX Thread know its buffer is ready..
> +	 * For WRITEs, let the fabric know its buffer is ready..
>  	 * This WRITE struct se_cmd (and all of its associated struct se_task's)
>  	 * will be added to the struct se_device execution queue after its WRITE
>  	 * data has arrived. (ie: It gets handled by the transport processing
> -- 
> 1.7.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31  9:48     ` Christoph Hellwig
@ 2011-05-31 10:10       ` Nicholas A. Bellinger
  2011-05-31 10:22         ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31 10:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-iscsi-target-dev, target-devel, linux-scsi

On Tue, 2011-05-31 at 11:48 +0200, Christoph Hellwig wrote:
> On Tue, May 31, 2011 at 02:32:47AM -0700, Nicholas A. Bellinger wrote:
> > -static void transport_add_cmd_to_queue(
> > +static int transport_add_cmd_to_queue(
> >         struct se_cmd *cmd,
> >         int t_state)
> >  {
> > @@ -775,6 +776,8 @@ static void transport_add_cmd_to_queue(
> > 
> >         atomic_inc(&qobj->queue_cnt);
> >         wake_up_interruptible(&qobj->thread_wq);
> > +
> > +       return 0;
> >  }
> > 
> >  static struct se_cmd *
> > @@ -1847,9 +1850,14 @@ int transport_generic_handle_cdb(
> >                 printk(KERN_ERR "cmd->se_lun is NULL\n");
> >                 return -EINVAL;
> >         }
> > -
> > -       transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD);
> > -       return 0;
> > +       /*
> > +        * Assume that target fabrics not presenting an cmd->se_tfo->new_cmd_map
> > +        * function pointer (like iscsi-target) want direct execution of
> > +        * transport_generic_new_cmd()
> > +        */
> > +       return (cmd->se_tfo->new_cmd_map) ?
> > +               transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD) :
> > +               transport_generic_new_cmd(cmd);
> >  }
> 
> This it too ugly to live, please revert it.
> 
> The proper way to do it is to simply mark transport_generic_new_cmd
> non-static and call it directly from the iscsi code.
> 

<nod>, rebased patch #39 to the following commit.

Thanks,

--nab

---------------------------------------------------------------------

[PATCH] target: Make transport_generic_new_cmd() available for iscsi-target

This patch makes transport_generic_new_cmd() defined as EXPORT_SYMBOL()
for use with iscsi-target direct task allocation.  It then changes
iscsi-target to use this direct method instead of the legacy
transport_generic_handle_cdb() queue usage.

Signed-off-by: Andy Grover <agrover@redhat.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/iscsi/iscsi_target.c    |    2 +-
 drivers/target/target_core_transport.c |    4 ++--
 include/target/target_core_transport.h |    1 +
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index bf4261e..16be7b7 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -1118,7 +1118,7 @@ attach_cmd:
         * the underlying transport plugin has allocated the buffer to
         * receive the Immediate Write Data into.
         */
-       transport_generic_handle_cdb(SE_CMD(cmd));
+       transport_generic_new_cmd(SE_CMD(cmd));

        wait_for_completion(&cmd->unsolicited_data_comp);

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 915b2ef..5a689fd 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1847,7 +1847,6 @@ int transport_generic_handle_cdb(
                printk(KERN_ERR "cmd->se_lun is NULL\n");
                return -EINVAL;
        }
-
        transport_add_cmd_to_queue(cmd, TRANSPORT_NEW_CMD);
        return 0;
 }
@@ -4754,7 +4753,7 @@ transport_map_control_cmd_to_task(struct se_cmd *cmd)
        /*
         * Generate struct se_task(s) and/or their payloads for this CDB.
         */
-static int transport_generic_new_cmd(struct se_cmd *cmd)
+int transport_generic_new_cmd(struct se_cmd *cmd)
 {
        struct se_portal_group *se_tpg;
        struct se_task *task;
@@ -4824,6 +4823,7 @@ static int transport_generic_new_cmd(struct se_cmd *cmd)
        transport_execute_tasks(cmd);
        return 0;
 }
+EXPORT_SYMBOL(transport_generic_new_cmd);



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31  9:32   ` [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue Nicholas A. Bellinger
  2011-05-31  9:48     ` Christoph Hellwig
@ 2011-05-31 10:17     ` Christoph Hellwig
  2011-05-31 11:18       ` Nicholas A. Bellinger
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-31 10:17 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: linux-iscsi-target-dev, target-devel, Christoph Hellwig,
	linux-scsi

On Tue, May 31, 2011 at 02:32:47AM -0700, Nicholas A. Bellinger wrote:
> As Christoph mentioned earler, this original patch to enable your
> iscsi-target optimization series breaks all non iscsi-target fabrics.
> But hey, that is easy enough for me to fix now.  ;-)

I don't think it breaks all of them.  Besides iscsi only tcm_fc (btw, why did
this one get commited with the stupd tcm prefix that we killed in all other
places?) and SRP use this function.  FC uses the same scheme as iSCSI
and should be fine with the direct call.  SRP has a three callers leading
towards srpt_handle_new_iu, of which two also come from the main handling
thread, and the third comes from an IB layer callback.  I've not looked
over the IB code yet, but it appears to either be fine to do the direct
call here, or easily doable by offloading the IB callback from a higher
level.

Doing this would have the huge advantage of greatly simplifying the
entry points into the command processing - either call transport_new_cmd
directly, or use the thread for the TRANSPORT_NEW_CMD case.  Mid-term
getting rid of that entry point also sounds fine, if we replace the
current main thread with a work queue where frontends can queue their
own items, and use those for deferring things from IRQ context only
when needed.  This would also mean we can remove the new_cmd_map callback.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31 10:10       ` Nicholas A. Bellinger
@ 2011-05-31 10:22         ` Christoph Hellwig
  2011-05-31 11:22           ` Nicholas A. Bellinger
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-31 10:22 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christoph Hellwig, linux-iscsi-target-dev, target-devel,
	linux-scsi

> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -1118,7 +1118,7 @@ attach_cmd:
>          * the underlying transport plugin has allocated the buffer to
>          * receive the Immediate Write Data into.
>          */
> -       transport_generic_handle_cdb(SE_CMD(cmd));
> +       transport_generic_new_cmd(SE_CMD(cmd));

This is still missing the error handling, as well as the other callers
in the iscsi code.  When adding the proper error handling please do it
by checking the transport_generic_new_cmd return value instead of
through transport_new_cmd_failure.  With a bit more work this should
also allow killing off the SCF_SE_CMD_FAILED flag.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 42/42] target/file: Alloc iov[] off the stack
       [not found] ` <1306523240-15543-43-git-send-email-agrover@redhat.com>
@ 2011-05-31 10:59   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31 10:59 UTC (permalink / raw)
  To: linux-iscsi-target-dev; +Cc: target-devel, Christoph Hellwig, linux-scsi

On Fri, 2011-05-27 at 12:07 -0700, Andy Grover wrote:
> Apparently C supports variable arrays on the stack now, pretty sweet.
> Perfect for the iovec array.
> 
> Signed-off-by: Andy Grover <agrover@redhat.com>
> ---

So as Christoph mentioned, this is really not safe to do for all cases..

What about something along the lines of the following..?  Christoph, is
there a smarter way to determine when *_onstack() should be getting
called..?

Thanks,

--nab

--------------------------------------------------------------------

diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 5c47f42..f437c51 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -284,23 +284,16 @@ fd_alloc_task(struct se_cmd *cmd)
        return &fd_req->fd_task;
 }
 
-static int fd_do_readv(struct se_task *task)
+static int __fd_do_readv(struct se_task *task, struct iovec *iov)
 {
        struct fd_request *req = FILE_REQ(task);
        struct file *fd = req->fd_dev->fd_file;
        struct scatterlist *sg = task->task_sg;
-       struct iovec *iov;
        mm_segment_t old_fs;
        loff_t pos = (task->task_lba *
                      task->se_dev->se_sub_dev->se_dev_attrib.block_size);
        int ret = 0, i;
 
-       iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL);
-       if (!(iov)) {
-               printk(KERN_ERR "Unable to allocate fd_do_readv iov[]\n");
-               return -ENOMEM;
-       }
-
        for (i = 0; i < task->task_sg_num; i++) {
                iov[i].iov_len = sg[i].length;
                iov[i].iov_base = sg_virt(&sg[i]);
@@ -310,8 +303,6 @@ static int fd_do_readv(struct se_task *task)
        set_fs(get_ds());
        ret = vfs_readv(fd, &iov[0], task->task_sg_num, &pos);
        set_fs(old_fs);
-
-       kfree(iov);
        /*
         * Return zeros and GOOD status even if the READ did not return
         * the expected virt_size for struct file w/o a backing struct
@@ -335,23 +326,39 @@ static int fd_do_readv(struct se_task *task)
        return 1;
 }
 
-static int fd_do_writev(struct se_task *task)
+static int fd_do_readv_onstack(struct se_task *task)
+{
+       struct iovec iov[task->task_sg_num];
+
+       return __fd_do_readv(task, &iov[0]);
+}
+
+static int fd_do_readv_offstack(struct se_task *task)
+{
+       struct iovec *iov;
+       int ret;
+
+       iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL);
+       if (!iov) {
+               printk(KERN_ERR "Unable to allocate fd_do_readv iov[]\n");
+               return -ENOMEM;
+       }
+
+       ret = __fd_do_readv(task, iov);
+       kfree(iov);
+       return ret;
+}
+
+static int __fd_do_writev(struct se_task *task, struct iovec *iov)
 {
        struct fd_request *req = FILE_REQ(task);
        struct file *fd = req->fd_dev->fd_file;
        struct scatterlist *sg = task->task_sg;
-       struct iovec *iov;
        mm_segment_t old_fs;
        loff_t pos = (task->task_lba *
                      task->se_dev->se_sub_dev->se_dev_attrib.block_size);
        int ret, i = 0;
 
-       iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL);
-       if (!(iov)) {
-               printk(KERN_ERR "Unable to allocate fd_do_writev iov[]\n");
-               return -ENOMEM;
-       }
-
        for (i = 0; i < task->task_sg_num; i++) {
                iov[i].iov_len = sg[i].length;
                iov[i].iov_base = sg_virt(&sg[i]);
@@ -362,8 +369,6 @@ static int fd_do_writev(struct se_task *task)
        ret = vfs_writev(fd, &iov[0], task->task_sg_num, &pos);
        set_fs(old_fs);
 
-       kfree(iov);
-
        if (ret < 0 || ret != task->task_size) {
                printk(KERN_ERR "vfs_writev() returned %d\n", ret);
                return (ret < 0 ? ret : -EINVAL);
@@ -372,6 +377,29 @@ static int fd_do_writev(struct se_task *task)
        return 1;
 }
 
+static int fd_do_writev_onstack(struct se_task *task)
+{
+       struct iovec iov[task->task_sg_num];
+
+       return __fd_do_writev(task, &iov[0]);
+}
+
+static int fd_do_writev_offstack(struct se_task *task)
+{
+       struct iovec *iov;
+       int ret;
+
+       iov = kzalloc(sizeof(struct iovec) * task->task_sg_num, GFP_KERNEL);
+       if (!iov) {
+               printk(KERN_ERR "Unable to allocate fd_do_writev iov[]\n");
+               return -ENOMEM;
+       }
+
+       ret = __fd_do_writev(task, iov);
+       kfree(iov);
+       return ret;
+}
+
 static void fd_emulate_sync_cache(struct se_task *task)
 {
        struct se_cmd *cmd = task->task_se_cmd;
@@ -461,6 +489,7 @@ static int fd_do_task(struct se_task *task)
 {
        struct se_cmd *cmd = task->task_se_cmd;
        struct se_device *dev = cmd->se_dev;
+       u32 iovec_bytes = task->task_sg_num * sizeof(struct iovec);
        int ret = 0;
 
        /*
@@ -468,9 +497,15 @@ static int fd_do_task(struct se_task *task)
         * physical memory addresses to struct iovec virtual memory.
         */
        if (task->task_data_direction == DMA_FROM_DEVICE) {
-               ret = fd_do_readv(task);
+               if (iovec_bytes < (THREAD_SIZE / 2))
+                       ret = fd_do_readv_onstack(task);
+               else
+                       ret = fd_do_readv_offstack(task);
        } else {
-               ret = fd_do_writev(task);
+               if (iovec_bytes < (THREAD_SIZE / 2))
+                       ret = fd_do_writev_onstack(task);
+               else
+                       ret = fd_do_writev_offstack(task);
 
                if (ret > 0 &&
                    dev->se_sub_dev->se_dev_attrib.emulate_write_cache > 0 &&





^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31 10:17     ` Christoph Hellwig
@ 2011-05-31 11:18       ` Nicholas A. Bellinger
  2011-06-01  4:09         ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31 11:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-iscsi-target-dev, target-devel, linux-scsi

On Tue, 2011-05-31 at 12:17 +0200, Christoph Hellwig wrote:
> On Tue, May 31, 2011 at 02:32:47AM -0700, Nicholas A. Bellinger wrote:
> > As Christoph mentioned earler, this original patch to enable your
> > iscsi-target optimization series breaks all non iscsi-target fabrics.
> > But hey, that is easy enough for me to fix now.  ;-)
> 
> I don't think it breaks all of them.  Besides iscsi only tcm_fc (btw, why did
> this one get commited with the stupd tcm prefix that we killed in all other
> places?) and SRP use this function.  FC uses the same scheme as iSCSI
> and should be fine with the direct call.  SRP has a three callers leading
> towards srpt_handle_new_iu, of which two also come from the main handling
> thread, and the third comes from an IB layer callback.  I've not looked
> over the IB code yet, but it appears to either be fine to do the direct
> call here, or easily doable by offloading the IB callback from a higher
> level.
> 

Hmmm yes, tcm_fc is still using transport_generic_handle_cdb() and
should be OK with the direct call here..  I will get this converted
later this afternoon.

Also need to take a deeper look wrt to SRP and see how this conversion
would be possible..

> Doing this would have the huge advantage of greatly simplifying the
> entry points into the command processing - either call transport_new_cmd
> directly, or use the thread for the TRANSPORT_NEW_CMD case.  Mid-term
> getting rid of that entry point also sounds fine, if we replace the
> current main thread with a work queue where frontends can queue their
> own items, and use those for deferring things from IRQ context only
> when needed.  This would also mean we can remove the new_cmd_map callback.
> 
> 

I very much agree with these points, and moving to workqueues for TCM
v4.1 (mainline v3.1) for these reasons makes alot of sense to me.  I
need to get a better idea of how this will actually look, but am eager
to move ahead..

Thanks,

--nab



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31 10:22         ` Christoph Hellwig
@ 2011-05-31 11:22           ` Nicholas A. Bellinger
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-05-31 11:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-iscsi-target-dev, target-devel, linux-scsi

On Tue, 2011-05-31 at 12:22 +0200, Christoph Hellwig wrote:
> > +++ b/drivers/target/iscsi/iscsi_target.c
> > @@ -1118,7 +1118,7 @@ attach_cmd:
> >          * the underlying transport plugin has allocated the buffer to
> >          * receive the Immediate Write Data into.
> >          */
> > -       transport_generic_handle_cdb(SE_CMD(cmd));
> > +       transport_generic_new_cmd(SE_CMD(cmd));
> 
> This is still missing the error handling, as well as the other callers
> in the iscsi code.  When adding the proper error handling please do it
> by checking the transport_generic_new_cmd return value instead of
> through transport_new_cmd_failure.  With a bit more work this should
> also allow killing off the SCF_SE_CMD_FAILED flag.
> 

<nod>.

I will have a look getting this addressed later today as well and send
out a follow-up patch on top of what has been pushed so far of Andy's
patches into lio-core-2.6.git.

Thanks,

--nab



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 10/42] target: Rewrite transport_init_task_sg()
  2011-05-31  7:08   ` [PATCH 10/42] target: Rewrite transport_init_task_sg() Nicholas A. Bellinger
@ 2011-05-31 21:04     ` Andy Grover
  2011-05-31 21:08       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Andy Grover @ 2011-05-31 21:04 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: linux-iscsi-target-dev, target-devel, Christoph Hellwig,
	linux-scsi

On 05/31/2011 12:08 AM, Nicholas A. Bellinger wrote:
> On Fri, 2011-05-27 at 12:06 -0700, Andy Grover wrote:
>> Don't set task->task_sg_num until task_sg is successfully alloced, to
>> prevent it from possibly being out of sync.
>>
>> Simplify loop counting sgs needed. Handle possible (?) case of offset
>> greater than se_mem->se_len. Remove some cluttering debugprints. Reduce
>> local variable usage.
>>
>> Do not set termination bit in padded-1 entry in the padded_sg case. If
>> something barfs on an extra zeroed sg, (will happen if chain-capable sgs
>> don't actually get chained) then it's broken.
>>
>> Remove unneeded parens.
>>
>> Signed-off-by: Andy Grover <agrover@redhat.com>
>> ---
> 
> So I am not exactly sure there is an issue here with task->task_sg_num
> assignment in the failure case, as nothing depends AFAICT upon
> task->task_sg_num in the release path before an possible
> transport_init_task_sg() failure..

Yeah I was just being paranoid..

> However, my main concern is the removal of sg_mark_end() calls  to
> signify the end of task->task_sg[] memory for individual task I/O
> dispatch.  As you have noticed this looks OK with existing code wrt to
> backend *_map_task_sg() and *_do_task() using for_each_sg() w/
> task->task_sg_num, but am wondering if it really makes sense to remove
> this for the actual scatterlist dispatch for all backends..?  Is there
> any possible code below TCM backend drivers where HW drivers can pose an
> issue..?
> 
> I am happy to merge this for tcm v4.1 along with patch #11 if this is
> really safe to do, but will defer for the moment until we can get a few
> more ACKs on these two..

Every sg[] that is initialized by sg_init_table (or sg_init_one) should
have sg_mark_end called on its last element[1] so I think we're ok, or
am I misunderstanding the issue you're getting at?

Regards -- Andy

[1] http://lxr.linux.no/linux+v2.6.38/lib/scatterlist.c#L85

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 10/42] target: Rewrite transport_init_task_sg()
  2011-05-31 21:04     ` Andy Grover
@ 2011-05-31 21:08       ` Christoph Hellwig
  2011-06-01  0:33         ` Nicholas A. Bellinger
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-05-31 21:08 UTC (permalink / raw)
  To: Andy Grover
  Cc: Nicholas A. Bellinger, linux-iscsi-target-dev, target-devel,
	Christoph Hellwig, linux-scsi

On Tue, May 31, 2011 at 02:04:52PM -0700, Andy Grover wrote:
> Every sg[] that is initialized by sg_init_table (or sg_init_one) should
> have sg_mark_end called on its last element[1] so I think we're ok, or
> am I misunderstanding the issue you're getting at?

No, you understand it right.  That's how chained S/G lists are supposed
to work.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 10/42] target: Rewrite transport_init_task_sg()
  2011-05-31 21:08       ` Christoph Hellwig
@ 2011-06-01  0:33         ` Nicholas A. Bellinger
  0 siblings, 0 replies; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-06-01  0:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andy Grover, linux-iscsi-target-dev, target-devel,
	Christoph Hellwig, linux-scsi

On Tue, 2011-05-31 at 17:08 -0400, Christoph Hellwig wrote:
> On Tue, May 31, 2011 at 02:04:52PM -0700, Andy Grover wrote:
> > Every sg[] that is initialized by sg_init_table (or sg_init_one) should
> > have sg_mark_end called on its last element[1] so I think we're ok, or
> > am I misunderstanding the issue you're getting at?
> 
> No, you understand it right.  That's how chained S/G lists are supposed
> to work.
> 

Ok, thanks for the clarification here guys.  I am going take a little
extra time for reviewing and testing these on existing HW target mode
code, and will plan to push for tomorrow if everything looks OK.

Thanks!

--nab


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-05-31 11:18       ` Nicholas A. Bellinger
@ 2011-06-01  4:09         ` Christoph Hellwig
  2011-06-04  2:33           ` Nicholas A. Bellinger
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2011-06-01  4:09 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christoph Hellwig, linux-iscsi-target-dev, target-devel,
	linux-scsi

On Tue, May 31, 2011 at 04:18:14AM -0700, Nicholas A. Bellinger wrote:
> I very much agree with these points, and moving to workqueues for TCM
> v4.1 (mainline v3.1) for these reasons makes alot of sense to me.  I
> need to get a better idea of how this will actually look, but am eager
> to move ahead..

I think the first step would be to move split the se_task execution
from the current thread into a workqueue.  That would also help with
making the file backend scale much better.  If it wasn't for head of queue
semantics this would be almost trivial - just embedd a work_struct in
every task and queue it up to the work queue, which handles concurrency.
I'm not entirely sure how to handle head of queue semantics correctly,
maybe just adding a pointer to pending head of queue tasks to the
se_dev and executing those first might work.


Btw, I can't fully make sense of the current task execution, so
here's a few questions:

 - why do we have separate __transport_execute_tasks and
   transport_execute_tasks, or rather why is it safe to use
   __transport_execute_tasks transport_processing_thread
   without the checks for shutting down?
 - what is the point of the transport_generic_process_write wrapper?
 - why does transport_generic_new_cmd call transport_execute_tasks
   for reads, but for writes we leave it to the ->write_pending
   method which only calls transport_execute_tasks for some of
   the fabrics?  Note that with the current code transport_generic_new_cmd
   might not get called from the fabric thread and thus stall the
   caller - it might make sense switch to calling transport_generic_handle_data
   for that case.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-06-01  4:09         ` Christoph Hellwig
@ 2011-06-04  2:33           ` Nicholas A. Bellinger
  2011-06-04 14:18             ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Nicholas A. Bellinger @ 2011-06-04  2:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-iscsi-target-dev, target-devel, linux-scsi

On Wed, 2011-06-01 at 06:09 +0200, Christoph Hellwig wrote:
> On Tue, May 31, 2011 at 04:18:14AM -0700, Nicholas A. Bellinger wrote:
> > I very much agree with these points, and moving to workqueues for TCM
> > v4.1 (mainline v3.1) for these reasons makes alot of sense to me.  I
> > need to get a better idea of how this will actually look, but am eager
> > to move ahead..
> 
> I think the first step would be to move split the se_task execution
> from the current thread into a workqueue.  That would also help with
> making the file backend scale much better.  If it wasn't for head of queue
> semantics this would be almost trivial - just embedd a work_struct in
> every task and queue it up to the work queue, which handles concurrency.
> I'm not entirely sure how to handle head of queue semantics correctly,
> maybe just adding a pointer to pending head of queue tasks to the
> se_dev and executing those first might work.
> 
> 

<nod>.  I need to think about this a bit more wrt to HOQ, but I think
this sounds like a reasonable first step.

> Btw, I can't fully make sense of the current task execution, so
> here's a few questions:
> 
>  - why do we have separate __transport_execute_tasks and
>    transport_execute_tasks, or rather why is it safe to use
>    __transport_execute_tasks transport_processing_thread
>    without the checks for shutting down?

This is because the current code in transport_processing_thread() cannot
be shutdown until last fabric port symlink reference has dropped from
configfs.  We currrently expect all outstanding I/O to be accounted
during fabric port shutdown for individual struct se_lun <-> struct
se_device fabric mappings in __transport_clear_lun_from_sessions()

>  - what is the point of the transport_generic_process_write wrapper?

Minus the stubbed WRITE overflow copy stuff, this primarly used by
tcm_loop to queue up the cmd's tasks for execution via
transport_execute_tasks() instead of a direct fabric module call..

Currently tcm_loop is the only user of this as we know the full WRITE
payload is available once the struct scsi_cmd as been dispatched via
->queuecommand().

>  - why does transport_generic_new_cmd call transport_execute_tasks
>    for reads, but for writes we leave it to the ->write_pending
>    method which only calls transport_execute_tasks for some of
>    the fabrics?

This is because all of the necessary SCSI WRITE payload may have not
been received to execute tasks to the struct se_device backends.

>    Note that with the current code transport_generic_new_cmd
>    might not get called from the fabric thread and thus stall the
>    caller - it might make sense switch to calling transport_generic_handle_data
>    for that case.

Not exactly..  Andy's patches call transport_generic_new_cmd() directly
only for the immediate WRITE payload case in iscsit_handle_scsi_cmd() to
alloc + build + queue via RX thread context, but not for the other non
immediate WRITEs cases.  The other case currently follows:

    iscsit_handle_scsi_cmd() ->
         iscsit_sequence_cmd () ->
              iscsit_execute_cmd() ->
                 transport_generic_handle_cdb() (queue via processing context)

This was originally done because it's possible for
transport_generic_handle_cdb() to be called from a different iSCSI
connection RX context depending upon CmdSN ordering.

So the immediate WRITE case appear to be working as expected now with
the most recent series, and I am still investigating the other codepaths
for possible breakage..

--nab



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
  2011-06-04  2:33           ` Nicholas A. Bellinger
@ 2011-06-04 14:18             ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2011-06-04 14:18 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christoph Hellwig, linux-iscsi-target-dev, target-devel,
	linux-scsi

On Fri, Jun 03, 2011 at 07:33:58PM -0700, Nicholas A. Bellinger wrote:
> >  - what is the point of the transport_generic_process_write wrapper?
> 
> Minus the stubbed WRITE overflow copy stuff, this primarly used by
> tcm_loop to queue up the cmd's tasks for execution via
> transport_execute_tasks() instead of a direct fabric module call..
> 
> Currently tcm_loop is the only user of this as we know the full WRITE
> payload is available once the struct scsi_cmd as been dispatched via
> ->queuecommand().

As a first step the stubbed out code really should go away ASAP, it's
really rather confusing as it has little chance to work for the current
code base.  As a follow-on transport_generic_process_write should be
replaced by a direct call to transport_execute_tasks, to make it more
obvious what's going on here.

If we then plan to keep the direct task execution from the submitter
context we should document why we're doing that.  One option would be
to make ->write_pending return a boolean indicator whether we should
call transport_exectute_tasks from
transport_generic_write_pending/transport_generic_new_cmd, and thus
making it more obvious that's it's actually similar to the read path
in that respect.

I'm still not convinced that running the read/write/flush/discard tasks
from the caller is an all that good idea.  They will 100% block for the
file backend, and can block when we reach the request limit with iblock,
and the blocking behaviour might differ for different kinds of requests.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2011-06-04 14:18 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1306523240-15543-1-git-send-email-agrover@redhat.com>
2011-05-27 22:15 ` [PATCH 0/42 RESEND+NEW] Target updates for May 27 Nicholas A. Bellinger
2011-05-27 22:35   ` Christoph Hellwig
2011-05-27 23:39     ` Nicholas A. Bellinger
2011-05-28  7:43       ` Christoph Hellwig
2011-05-28 19:09         ` Nicholas A. Bellinger
2011-05-29  2:23         ` Andy Grover
2011-05-30 12:56           ` Christoph Hellwig
     [not found] ` <1306523240-15543-11-git-send-email-agrover@redhat.com>
2011-05-31  7:08   ` [PATCH 10/42] target: Rewrite transport_init_task_sg() Nicholas A. Bellinger
2011-05-31 21:04     ` Andy Grover
2011-05-31 21:08       ` Christoph Hellwig
2011-06-01  0:33         ` Nicholas A. Bellinger
     [not found] ` <1306523240-15543-38-git-send-email-agrover@redhat.com>
2011-05-31  9:00   ` [PATCH 37/42] target/iscsi: Do not use t_mem_list anymore Nicholas A. Bellinger
     [not found] ` <1306523240-15543-40-git-send-email-agrover@redhat.com>
2011-05-31  9:32   ` [PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue Nicholas A. Bellinger
2011-05-31  9:48     ` Christoph Hellwig
2011-05-31 10:10       ` Nicholas A. Bellinger
2011-05-31 10:22         ` Christoph Hellwig
2011-05-31 11:22           ` Nicholas A. Bellinger
2011-05-31 10:17     ` Christoph Hellwig
2011-05-31 11:18       ` Nicholas A. Bellinger
2011-06-01  4:09         ` Christoph Hellwig
2011-06-04  2:33           ` Nicholas A. Bellinger
2011-06-04 14:18             ` Christoph Hellwig
     [not found] ` <1306523240-15543-42-git-send-email-agrover@redhat.com>
2011-05-31  9:58   ` [PATCH 41/42] target/iscsi: remove unsolicited_data_comp completion Nicholas A. Bellinger
     [not found] ` <1306523240-15543-43-git-send-email-agrover@redhat.com>
2011-05-31 10:59   ` [PATCH 42/42] target/file: Alloc iov[] off the stack Nicholas A. Bellinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox