Re: [RFC 0/2] target: Add TFO->complete_irq queue_work bypass

From: Sagi Grimberg <sagig@dev.mellanox.co.il>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>,
	Christoph Hellwig <hch@lst.de>
Cc: "Nicholas A. Bellinger" <nab@daterainc.com>,
	target-devel <target-devel@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Hannes Reinecke <hare@suse.de>,
	Sagi Grimberg <sagig@mellanox.com>
Subject: Re: [RFC 0/2] target: Add TFO->complete_irq queue_work bypass
Date: Thu, 04 Jun 2015 20:01:09 +0300
Message-ID: <55708455.2080500@dev.mellanox.co.il>
In-Reply-To: <1433401569.18125.112.camel@haakon3.risingtidesystems.com>

On 6/4/2015 10:06 AM, Nicholas A. Bellinger wrote:
> On Wed, 2015-06-03 at 14:57 +0200, Christoph Hellwig wrote:
>> This makes lockdep very unhappy, rightly so.  If you execute
>> one end_io function inside another you basically nest every possible
>> lock taken in the I/O completion path.  Also adding more work
>> to the hardirq path generally isn't a smart idea.  Can you explain
>> what issues you were seeing and how much this helps?  Note that
>> the workqueue usage in the target core so far is fairly basic, so
>> there should be some low hanging fruit.
>
> So I've been using tcm_loop + RAMDISK backends for prototyping, but this
> patch is intended for vhost-scsi so it can avoid the unnecessary
> queue_work() context switch within target_complete_cmd() for all backend
> driver types.
>
> This is because vhost_work_queue() is just updating vhost_dev->work_list
> and immediately wake_up_process() into a different vhost_worker()
> process context.  For heavy small block workloads into fast IBLOCK
> backends, avoiding this extra context switch should be a nice efficiency
> win.

I can see that. Did you get a chance to measure the expected latency
improvement?
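
To make the two completion paths concrete, here is a minimal userspace
sketch, not kernel code. It only models the difference between deferring
the fabric callback to a worker (one handoff per command) and calling it
directly from the completion path. All names (sketch_cmd, sketch_queue,
sketch_complete, sketch_drain) are hypothetical and are not taken from
the patch or from the target core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a command with a fabric completion callback. */
struct sketch_cmd {
	struct sketch_cmd *next;	/* link in the deferred work list */
	void (*fabric_done)(struct sketch_cmd *cmd);
	int done;			/* set by the fabric callback */
};

/* Hypothetical stand-in for a deferred work list drained by a worker. */
struct sketch_queue {
	struct sketch_cmd *head;
	struct sketch_cmd *tail;
	unsigned int handoffs;		/* deferrals; each implies a worker wakeup */
};

/* Example fabric callback: just marks the command as completed. */
void sketch_fabric_done(struct sketch_cmd *cmd)
{
	cmd->done = 1;
}

/*
 * Complete a command. With bypass set, the fabric callback runs in the
 * caller's context; otherwise the command is appended to the work list
 * and a later sketch_drain() call (the "worker") runs the callback.
 */
void sketch_complete(struct sketch_queue *q, struct sketch_cmd *cmd, bool bypass)
{
	assert(cmd->fabric_done != NULL);

	if (bypass) {
		cmd->fabric_done(cmd);
		return;
	}

	cmd->next = NULL;
	if (q->tail)
		q->tail->next = cmd;
	else
		q->head = cmd;
	q->tail = cmd;
	q->handoffs++;
}

/* Worker side: run the callback for every queued command, in FIFO order. */
unsigned int sketch_drain(struct sketch_queue *q)
{
	unsigned int n = 0;

	while (q->head) {
		struct sketch_cmd *cmd = q->head;

		q->head = cmd->next;
		cmd->next = NULL;
		cmd->fabric_done(cmd);
		n++;
	}
	q->tail = NULL;
	return n;
}
```

The sketch ignores locking and interrupt context entirely, which is
exactly where the lockdep concern above comes in; it says nothing about
how large the saving is, hence the question about measurements.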

>
> Also, AFAIK RDMA fabrics are allowed to do ib_post_send() response
> callbacks directly from IRQ context as well.

This is correct in general; ib_post_send() is not allowed to schedule.
isert/srpt might benefit here in latency, but it would require the
drivers to pre-allocate the sgls (ib_sge's) using a worst-case size
(or to use GFP_ATOMIC allocations - I'm not sure which is better...)
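
As a rough illustration of the pre-allocation option, here is a
userspace sketch. The field names addr/length/lkey mirror struct ib_sge,
but everything else (sketch_resp_ctx, sketch_build_sges, SKETCH_MAX_SGE
and its value) is hypothetical; a real driver would derive the worst
case from the device limits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical worst-case number of SGEs per response. */
#define SKETCH_MAX_SGE 4

/* Userspace stand-in with the same three fields as struct ib_sge. */
struct sketch_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* One already-mapped buffer segment to describe in the response. */
struct sketch_seg {
	uint64_t addr;
	uint32_t length;
};

/*
 * Per-command response context. The SGE array is sized for the worst
 * case and allocated together with the command, so nothing has to be
 * allocated when the completion fires.
 */
struct sketch_resp_ctx {
	struct sketch_sge sge[SKETCH_MAX_SGE];
	unsigned int num_sge;
};

/*
 * Fill the pre-allocated SGE array from the segments. Never allocates
 * and never sleeps. Returns 0 on success, or -1 (leaving ctx untouched)
 * if the request exceeds the worst case chosen at setup time.
 */
int sketch_build_sges(struct sketch_resp_ctx *ctx,
		      const struct sketch_seg *segs, size_t nsegs,
		      uint32_t lkey)
{
	size_t i;

	if (nsegs > SKETCH_MAX_SGE)
		return -1;

	for (i = 0; i < nsegs; i++) {
		ctx->sge[i].addr = segs[i].addr;
		ctx->sge[i].length = segs[i].length;
		ctx->sge[i].lkey = lkey;
	}
	ctx->num_sge = (unsigned int)nsegs;
	return 0;
}
```

The trade-off is memory per command (SKETCH_MAX_SGE entries whether or
not they are used) against an allocation in the completion path that can
fail under memory pressure, which is what the GFP_ATOMIC alternative
would have to handle.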
