Linux RDMA and InfiniBand development
* Re: [PATCH 02/12] blk-mq: Introduce blk_mq_hctx_stopped()
From: Johannes Thumshirn @ 2016-10-27  8:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, Mike Snitzer, Doug Ledford, Keith Busch,
	Ming Lei, Laurence Oberman, linux-block@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nvme@lists.infradead.org
In-Reply-To: <0de50789-e3b7-0a07-73c1-4fb87b1f957e@sandisk.com>

On Wed, Oct 26, 2016 at 03:51:33PM -0700, Bart Van Assche wrote:
> Multiple functions test the BLK_MQ_S_STOPPED bit so introduce
> a helper function that performs this test.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Christoph Hellwig <hch@lst.de>

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply
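
As an illustration of the helper discussed in this message, here is a
minimal user-space sketch in C. It is not the kernel code: the struct,
the bit position, and every model_* name are assumptions made for the
example, and the "any hardware context stopped" rule for the whole
queue is an assumed semantic, not something this message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit position; the kernel's actual value may differ. */
enum { BLK_MQ_S_STOPPED = 0 };

/* Stand-in for struct blk_mq_hw_ctx: only the state word is modelled. */
struct hctx_model {
	unsigned long state;
};

/* User-space replacement for the kernel's test_bit(). */
static bool model_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* The helper: one place that knows how "stopped" is encoded. */
static bool model_hctx_stopped(const struct hctx_model *hctx)
{
	return model_test_bit(BLK_MQ_S_STOPPED, &hctx->state);
}

/* Assumed rule: a queue is stopped if any hardware context is stopped. */
static bool model_queue_stopped(const struct hctx_model *hctxs, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (model_hctx_stopped(&hctxs[i]))
			return true;
	return false;
}
```

Callers then ask model_hctx_stopped() instead of open-coding the bit
test, so a later change to the encoding touches a single function.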

* Re: [PATCH 01/12] blk-mq: Do not invoke .queue_rq() for a stopped queue
From: Johannes Thumshirn @ 2016-10-27  8:14 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, Mike Snitzer, Doug Ledford, Keith Busch,
	Ming Lin, Laurence Oberman,
	linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
In-Reply-To: <1debcf7f-c950-308b-d297-3e48a77e08d7-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>

On Wed, Oct 26, 2016 at 03:50:44PM -0700, Bart Van Assche wrote:
> The meaning of the BLK_MQ_S_STOPPED flag is "do not call
> .queue_rq()". Hence modify blk_mq_make_request() such that requests
> are queued instead of issued if a queue has been stopped.
> 
> Reported-by: Ming Lei <tom.leiming-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn-l3A5Bk7waGM@public.gmane.org>

-- 
Johannes Thumshirn                                          Storage
jthumshirn-l3A5Bk7waGM@public.gmane.org                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply
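
The rule quoted in this message ("stopped" means "do not call
.queue_rq()") can be sketched in user space as follows. This is a toy
model under stated assumptions, not blk_mq_make_request(): the struct
and all model_* functions are hypothetical, and requests are reduced
to counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hardware queue. */
struct hwq_model {
	bool stopped;   /* models the BLK_MQ_S_STOPPED flag */
	int  issued;    /* requests handed to the driver callback */
	int  queued;    /* requests parked for a later queue run */
};

/* Models the driver's .queue_rq(); it must never run while stopped. */
static void model_queue_rq(struct hwq_model *q)
{
	assert(!q->stopped);
	q->issued++;
}

/* Submit one request: issue it directly unless the queue is stopped. */
static void model_submit(struct hwq_model *q)
{
	if (q->stopped)
		q->queued++;
	else
		model_queue_rq(q);
}

/* Restart: clear the flag, then issue everything that was parked. */
static void model_start(struct hwq_model *q)
{
	q->stopped = false;
	while (q->queued > 0) {
		q->queued--;
		model_queue_rq(q);
	}
}
```

The assert in model_queue_rq() encodes the invariant the patch is
after: no path reaches the driver callback while the flag is set.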

* Re: [PATCH v2 perftest] Support for Chelsio T6 devices
From: Leon Romanovsky @ 2016-10-27  7:54 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Gil Rockah', 'Zohar Ben Aharon',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <00c501d22ecc$aca8dec0$05fa9c40$@opengridcomputing.com>

[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]

On Tue, Oct 25, 2016 at 09:32:55AM -0500, Steve Wise wrote:
> Hey guys,
>
> Has this patch been integrated yet?  Also, where is the official upstream
> perftest git repo now?

Hi Steve,

Sorry for the late response; because of the holidays, our replies are
a little delayed.

We moved the perftest repo under GitHub's linux-rdma organization [1],
and it now lives at [2].

I'll remind Zohar to take it.

[1] https://github.com/linux-rdma/
[2] https://github.com/linux-rdma/perftest

>
> Thanks,
>
> Steve.
>
>
> > -----Original Message-----
> > From: Steve Wise [mailto:swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org]
> > Sent: Tuesday, October 11, 2016 4:34 PM
> > To: 'Gil Rockah'; 'Zohar Ben Aharon'
> > Subject: RE: [PATCH v2 perftest] Support for Chelsio T6 devices
> >
> >
> > > -----Original Message-----
> > > From: Gil Rockah [mailto:gilr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org]
> > > Sent: Monday, September 26, 2016 2:01 AM
> > > To: Zohar Ben Aharon; swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org
> > > Subject: FW: [PATCH v2 perftest] Support for Chelsio T6 devices
> > >
> > > Hi Steve,
> > > Thanks for the patch.
> > > Please notice that Zohar is the new owner of perftest.
> > >
> > > Thanks,
> > > Gil
> > >
> >
> > Hey Zohar,
> >
> > Where is the formal perftest git repo now?
> >
> > Thanks,
> >
> > Steve.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* RE: [bug report] qedr: Add GSI support
From: Amrani, Ram @ 2016-10-27  7:46 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20161026132501.GA3628-mgFCXtclrQlZLf2FXnZxJA@public.gmane.org>

> The patch 048867793046: "qedr: Add GSI support" from Oct 10, 2016, leads to
> the following Smatch complaint:
> 
> drivers/infiniband/hw/qedr/qedr_cm.c:284 qedr_gsi_build_header()
> 	 warn: variable dereferenced before check 'sgid_attr.ndev' (see line 281)
> 
> drivers/infiniband/hw/qedr/qedr_cm.c
>    280
>    281		vlan_id = rdma_vlan_dev_vlan_id(sgid_attr.ndev);
>                                                 ^^^^^^^^^^^^^^ Dereference inside function.
> 
>    282		if (vlan_id < VLAN_CFI_MASK)
>    283			has_vlan = true;
>    284		if (sgid_attr.ndev)
>                     ^^^^^^^^^^^^^^
> Check too late.
> 
>    285			dev_put(sgid_attr.ndev);
>    286

Thanks Dan. I'll fix this.
Ram

^ permalink raw reply
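
The Smatch warning in this message is the classic "dereference before
NULL check" ordering. One possible way to reorder the logic is
sketched below as a user-space model; it is not necessarily the fix
that went into qedr. The struct, the model_* helpers, and the mask
value (assumed to mirror VLAN_CFI_MASK) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; not the kernel's struct net_device. */
struct ndev_model {
	unsigned short vlan_id;
	int refcount;
};

#define MODEL_VLAN_CFI_MASK 0x1000  /* assumed to mirror VLAN_CFI_MASK */

/* Like the flagged call, this dereferences its argument unconditionally. */
static unsigned short model_vlan_id(const struct ndev_model *ndev)
{
	return ndev->vlan_id;
}

/* Drops one reference, standing in for dev_put(). */
static void model_dev_put(struct ndev_model *ndev)
{
	ndev->refcount--;
}

/* Check first, then dereference and drop the reference in one branch. */
static bool model_has_vlan(struct ndev_model *ndev)
{
	bool has_vlan = false;

	if (ndev) {
		if (model_vlan_id(ndev) < MODEL_VLAN_CFI_MASK)
			has_vlan = true;
		model_dev_put(ndev);
	}
	return has_vlan;
}
```

With the check hoisted above the first use, a NULL device simply
yields "no VLAN" instead of a crash inside the helper.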

* Re: [PATCH v2] Avoid possible hang on device removal
From: Leon Romanovsky @ 2016-10-27  7:29 UTC (permalink / raw)
  To: Mustafa Ismail
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb,
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW
In-Reply-To: <1477438557-15884-1-git-send-email-mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 644 bytes --]

On Tue, Oct 25, 2016 at 06:35:57PM -0500, Mustafa Ismail wrote:
> When we get an RDMA_CM_EVENT_DEVICE_REMOVAL the cm_thread will
> exit and because flush errors are ignored the cb->sem may not get signaled.
> So just signal on device removal event.
>
> v1 -> v2: Add Fixes tag
>
> Fixes: 612eae1f6fe3 ("rping: ignore flushed completions")
> Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
>  librdmacm/examples/rping.c | 2 ++
>  1 file changed, 2 insertions(+)

Mustafa,

For future submissions, please use rdma-core notation in [PATCH ..]
section, so we will be able to handle them faster.

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: Trouble enabling iSER for ConnectX-4 Lx
From: Alaa Hleihel @ 2016-10-27  7:29 UTC (permalink / raw)
  To: Robert LeBlanc, linux-rdma
In-Reply-To: <CAANLjFr_we+33Nen-NYp1xQPzQ-wbR=GL4LBkEZb9azMUN-_=Q@mail.gmail.com>

Hi Robert,

You've installed mlnx-en package, which does not provide iSER modules.
Instead, you should get MLNX_OFED from:
http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers

Note that ib_isert in MLNX_OFED is not backported to all kernels, and it's enabled by default only for specific kernels.
To force building the module against your kernel, use the following command for MLNX_OFED installation:
# MLNX_EXTRA_FLAGS=--with-isert ./mlnxofedinstall --force --add-kernel-support  --with-isert --skip-repo

As for configuring RoCE, please refer to the MLNX_OFED User Manual:
http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_User_Manual_v3.40.pdf

Regards,
Alaa


On 10/27/2016 02:13, Robert LeBlanc wrote:
> We have some ConnectX-4 Lx cards that I'm trying to test RoCE and iSER
> on. I downloaded and installed the Mellanox drivers with VMA [0]. I
> was able to run the ib_read_bw tests over the adapters after
> installing the infiniband-diags and perftest RPMs. When I went to
> configure LIO for iSER, I'm getting the message "Cannot change iser"
> on step 6 in the procedure here [1] which I've done many times with
> Infiniband without issues. I navigated to
> /sys/kernel/config/target/iscsi/{iqn}/tpgt_1/np/{portal_ip:port} and
> sure enough, I can't write '1' into iser. The kernel is not giving any
> messages and the ib_isert module is loaded. This is on 4.4.27,
> Mellanox driver 3.4-1.0.0.3 built with `./install --add-kernel-support
> --skip-repo --tmpdir /root/junk --vma`
>
> # mstflint -d 4:00.0 q
> Image type:          FS3
> FW Version:          14.16.1020
> FW Release Date:     20.6.2016
> Rom Info:            type=UEFI version=14.10.16
>                     type=PXE version=3.4.812 devid=4117
> Description:         UID                GuidsNumber
> Base GUID:           0cc47a000089f706        4
> Base MAC:            00000cc47a89f706        4
> Image VSD:
> Device VSD:
> PSID:                SM_2001000001034
>
> # ibstatus
> Infiniband device 'mlx5_0' port 1 status:
>        default gid:     fe80:0000:0000:0000:0ec4:7aff:fe89:f706
>        base lid:        0x0
>        sm lid:          0x0
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            25 Gb/sec (1X EDR)
>        link_layer:      Ethernet
>
> Infiniband device 'mlx5_1' port 1 status:
>        default gid:     fe80:0000:0000:0000:0ec4:7aff:fe89:f707
>        base lid:        0x0
>        sm lid:          0x0
>        state:           4: ACTIVE
>        phys state:      5: LinkUp
>        rate:            25 Gb/sec (1X EDR)
>        link_layer:      Ethernet
>
> Any ideas of what I'm doing wrong here? I don't have any experience
> with RoCE, so I'm sure I'm doing something wrong. And the manual has
> nothing about configuring RoCE other than enabling --vma when
> installing the drivers [2].
>
> Thanks,
> Robert LeBlanc
>
> [0] http://www.mellanox.com/page/products_dyn?product_family=27
> [1] https://community.mellanox.com/docs/DOC-1472
> [2] http://www.mellanox.com/related-docs/prod_software/Mellanox_EN_for_Linux_User_Manual_v3_40.pdf
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

^ permalink raw reply

* Re: [PATCH 07/12] dm: Use BLK_MQ_S_STOPPED instead of QUEUE_FLAG_STOPPED in blk-mq code
From: Hannes Reinecke @ 2016-10-27  7:27 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: Christoph Hellwig, James Bottomley, Martin K. Petersen,
	Mike Snitzer, Doug Ledford, Keith Busch, Ming Lei,
	Laurence Oberman, linux-block@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nvme@lists.infradead.org
In-Reply-To: <a0733adf-065e-2099-3850-cb1c55df1e35@sandisk.com>

On 10/27/2016 12:54 AM, Bart Van Assche wrote:
> Instead of manipulating both QUEUE_FLAG_STOPPED and BLK_MQ_S_STOPPED
> in the dm start and stop queue functions, only manipulate the latter
> flag. Change blk_queue_stopped() tests into blk_mq_queue_stopped().
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-rq.c | 18 ++----------------
>  1 file changed, 2 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index fbd37b4..d47a504 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -75,12 +75,6 @@ static void dm_old_start_queue(struct request_queue *q)
>  
>  static void dm_mq_start_queue(struct request_queue *q)
>  {
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(q->queue_lock, flags);
> -	queue_flag_clear(QUEUE_FLAG_STOPPED, q);
> -	spin_unlock_irqrestore(q->queue_lock, flags);
> -
>  	blk_mq_start_stopped_hw_queues(q, true);
>  	blk_mq_kick_requeue_list(q);
>  }
> @@ -105,16 +99,8 @@ static void dm_old_stop_queue(struct request_queue *q)
>  
>  static void dm_mq_stop_queue(struct request_queue *q)
>  {
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(q->queue_lock, flags);
> -	if (blk_queue_stopped(q)) {
> -		spin_unlock_irqrestore(q->queue_lock, flags);
> +	if (blk_mq_queue_stopped(q))
>  		return;
> -	}
> -
> -	queue_flag_set(QUEUE_FLAG_STOPPED, q);
> -	spin_unlock_irqrestore(q->queue_lock, flags);
>  
>  	/* Avoid that requeuing could restart the queue. */
>  	blk_mq_cancel_requeue_work(q);
> @@ -341,7 +327,7 @@ static void __dm_mq_kick_requeue_list(struct request_queue *q, unsigned long mse
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(q->queue_lock, flags);
> -	if (!blk_queue_stopped(q))
> +	if (!blk_mq_queue_stopped(q))
>  		blk_mq_delay_kick_requeue_list(q, msecs);
>  	spin_unlock_irqrestore(q->queue_lock, flags);
>  }
> 
Ah. Right. That answers my previous question.

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH 06/12] blk-mq: Add a kick_requeue_list argument to blk_mq_requeue_request()
From: Hannes Reinecke @ 2016-10-27  7:26 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: Christoph Hellwig, James Bottomley, Martin K. Petersen,
	Mike Snitzer, Doug Ledford, Keith Busch, Ming Lei,
	Laurence Oberman, linux-block@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nvme@lists.infradead.org
In-Reply-To: <3944826d-bfde-f1e6-40ec-2c9a3c259e3a@sandisk.com>

On 10/27/2016 12:53 AM, Bart Van Assche wrote:
> Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls
> are followed by kicking the requeue list. Hence add an argument to
> these two functions that allows kicking the requeue list. This was
> proposed by Christoph Hellwig.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-flush.c            |  5 +----
>  block/blk-mq.c               | 10 +++++++---
>  drivers/block/xen-blkfront.c |  2 +-
>  drivers/md/dm-rq.c           |  2 +-
>  drivers/nvme/host/core.c     |  2 +-
>  drivers/scsi/scsi_lib.c      |  4 +---
>  include/linux/blk-mq.h       |  5 +++--
>  7 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index 6a14b68..a834aed 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -134,10 +134,7 @@ static void blk_flush_restore_request(struct request *rq)
>  static bool blk_flush_queue_rq(struct request *rq, bool add_front)
>  {
>  	if (rq->q->mq_ops) {
> -		struct request_queue *q = rq->q;
> -
> -		blk_mq_add_to_requeue_list(rq, add_front);
> -		blk_mq_kick_requeue_list(q);
> +		blk_mq_add_to_requeue_list(rq, add_front, true);
>  		return false;
>  	} else {
>  		if (add_front)
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4945437..688bcc3 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -492,12 +492,12 @@ static void __blk_mq_requeue_request(struct request *rq)
>  	}
>  }
>  
> -void blk_mq_requeue_request(struct request *rq)
> +void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
>  {
>  	__blk_mq_requeue_request(rq);
>  
>  	BUG_ON(blk_queued_rq(rq));
> -	blk_mq_add_to_requeue_list(rq, true);
> +	blk_mq_add_to_requeue_list(rq, true, kick_requeue_list);
>  }
>  EXPORT_SYMBOL(blk_mq_requeue_request);
>  
> @@ -535,7 +535,8 @@ static void blk_mq_requeue_work(struct work_struct *work)
>  	blk_mq_start_hw_queues(q);
>  }
>  
> -void blk_mq_add_to_requeue_list(struct request *rq, bool at_head)
> +void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
> +				bool kick_requeue_list)
>  {
>  	struct request_queue *q = rq->q;
>  	unsigned long flags;
> @@ -554,6 +555,9 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head)
>  		list_add_tail(&rq->queuelist, &q->requeue_list);
>  	}
>  	spin_unlock_irqrestore(&q->requeue_lock, flags);
> +
> +	if (kick_requeue_list)
> +		blk_mq_kick_requeue_list(q);
>  }
>  EXPORT_SYMBOL(blk_mq_add_to_requeue_list);
>  
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 9908597..1ca702d 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -2043,7 +2043,7 @@ static int blkif_recover(struct blkfront_info *info)
>  		/* Requeue pending requests (flush or discard) */
>  		list_del_init(&req->queuelist);
>  		BUG_ON(req->nr_phys_segments > segs);
> -		blk_mq_requeue_request(req);
> +		blk_mq_requeue_request(req, false);
>  	}
>  	blk_mq_kick_requeue_list(info->rq);
>  
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index dc75bea..fbd37b4 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -354,7 +354,7 @@ EXPORT_SYMBOL(dm_mq_kick_requeue_list);
>  
>  static void dm_mq_delay_requeue_request(struct request *rq, unsigned long msecs)
>  {
> -	blk_mq_requeue_request(rq);
> +	blk_mq_requeue_request(rq, false);
>  	__dm_mq_kick_requeue_list(rq->q, msecs);
>  }
>  
Hmm. __dm_mq_kick_requeue_list() does essentially the same.
Have you checked if you can use 'true' here and drop the call to it?
However, it does take the queue_lock when doing so.
Is that required? None of the other drivers do it ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply
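
The API shape under review here (an extra boolean that decides whether
adding to the requeue list also kicks it) can be modelled in a few
lines of user-space C. Everything below is a hypothetical sketch: the
struct and the model_* functions only count events, and the delayed
kick and queue_lock questions raised above are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a request queue's requeue state. */
struct rq_model {
	int pending;   /* requests sitting on the requeue list */
	int kicks;     /* how many times the requeue work was kicked */
};

static void model_kick_requeue_list(struct rq_model *q)
{
	q->kicks++;
}

/* Add one request; optionally kick so the caller need not do it. */
static void model_add_to_requeue_list(struct rq_model *q, bool kick)
{
	q->pending++;
	if (kick)
		model_kick_requeue_list(q);
}

/* Batch pattern as in the blkif_recover() hunk: add n, then kick once. */
static void model_requeue_batch(struct rq_model *q, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		model_add_to_requeue_list(q, false);
	model_kick_requeue_list(q);
}
```

Passing true suits the common single-request case, while false keeps
batch callers from kicking once per request.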

* Re: [PATCH v4 00/10] infiniband: Remove semaphores
From: Leon Romanovsky @ 2016-10-27  7:13 UTC (permalink / raw)
  To: Binoy Jayan
  Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, Arnd Bergmann,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 449 bytes --]

On Thu, Oct 27, 2016 at 12:29:04PM +0530, Binoy Jayan wrote:
> Hi,
>
> This is a set of patches [v4] which removes semaphores from InfiniBand.
> These are part of a bigger effort to eliminate all semaphores from the
> Linux kernel.
>
> v3 -> v4:
>
> IB/mlx5: Added patch - Replace semaphore umr_common:sem with wait_event
> IB/mlx5: Fixed a bug pointed out by Leon Romanovsky

Please keep full changelog for your next submissions/respins.

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: A question regarding "multiple SGL"
From: Christoph Hellwig @ 2016-10-27  7:10 UTC (permalink / raw)
  To: Qiuxin (robert)
  Cc: Bart Van Assche, Jens Axboe, linux-block@vger.kernel.org,
	James Bottomley, Martin K. Petersen, Mike Snitzer,
	linux-rdma@vger.kernel.org, Ming Lei,
	linux-nvme@lists.infradead.org, Keith Busch, Doug Ledford,
	linux-scsi@vger.kernel.org, Laurence Oberman, Tiger zhao
In-Reply-To: <B78D1A958924E54594997EFF975251625BD49445@szxeml592-mbx.china.huawei.com>

Hi Robert,

please explain which of your use cases isn't handled.  The one and only
reason to set MSDBD to 1 is to make the code a lot simpler given that
there is no real use case for supporting more.

RDMA uses memory registrations to register large and possibly
discontiguous data regions for a single rkey, aka single SGL descriptor
in NVMe terms.  There would be two reasons to support multiple SGL
descriptors:  a) to support a larger I/O size than supported by a single
MR, or b) to support a data region format not mappable by a single
MR.

iSER only supports a single rkey (or stag in IETF terminology) and has
been doing fine on a) and mostly fine on b).   There are a few possible
data layouts not supported by the traditional IB/iWarp FR WRs, but the
limit is in fact exactly the same as imposed by the NVMe PRPs used for
PCIe NVMe devices, so the Linux block layer has support to not generate
them.  Also with modern Mellanox IB/RoCE hardware we can actually
register completely arbitrary SGLs.  iSER supports using this registration
mode already with a trivial code addition, but for NVMe we didn't have a
pressing need yet.

^ permalink raw reply
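
Case a) in the message above comes down to a ceiling division: if one
memory registration covers at most some number of bytes, a larger I/O
needs one descriptor per chunk. The sketch below only illustrates that
arithmetic; io_len, max_mr_len, and msdbd are hypothetical parameter
names, and real limits depend on the device and the registration mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Descriptors needed when each one covers at most max_mr_len bytes. */
static size_t model_sgl_descs(size_t io_len, size_t max_mr_len)
{
	assert(max_mr_len > 0);
	/* Ceiling division written so that it cannot overflow. */
	return io_len / max_mr_len + (io_len % max_mr_len != 0);
}

/* True if the I/O fits within the advertised descriptor limit. */
static bool model_fits_msdbd(size_t io_len, size_t max_mr_len, size_t msdbd)
{
	return model_sgl_descs(io_len, max_mr_len) <= msdbd;
}
```

With a limit of one descriptor, any I/O no larger than a single
registration fits, which is the situation the message describes.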

* [PATCH v4 10/10] IB/mlx5: Simplify completion into a wait_event
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

Convert the completion 'mlx5_ib_umr_context:done' to a wait_event as it
just waits for the return value to be filled.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +-
 drivers/infiniband/hw/mlx5/mr.c      | 9 +++++----
 include/rdma/ib_verbs.h              | 1 +
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index de31b5f..cf496b5 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -524,7 +524,7 @@ struct mlx5_ib_mw {
 struct mlx5_ib_umr_context {
 	struct ib_cqe		cqe;
 	enum ib_wc_status	status;
-	struct completion	done;
+	wait_queue_head_t	wq;
 };
 
 struct umr_common {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index dfaf6f6..49ff2af 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -846,14 +846,14 @@ static void mlx5_ib_umr_done(struct ib_cq *cq, struct ib_wc *wc)
 		container_of(wc->wr_cqe, struct mlx5_ib_umr_context, cqe);
 
 	context->status = wc->status;
-	complete(&context->done);
+	wake_up(&context->wq);
 }
 
 static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
 {
 	context->cqe.done = mlx5_ib_umr_done;
-	context->status = -1;
-	init_completion(&context->done);
+	context->status = IB_WC_STATUS_NONE;
+	init_waitqueue_head(&context->wq);
 }
 
 static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
@@ -873,7 +873,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
 	if (err) {
 		mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
 	} else {
-		wait_for_completion(&umr_context.done);
+		wait_event(umr_context.wq,
+			   umr_context.status != IB_WC_STATUS_NONE);
 		if (umr_context.status != IB_WC_SUCCESS) {
 			mlx5_ib_warn(dev, "reg umr failed (%u)\n",
 				     umr_context.status);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..8b15b6f 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -823,6 +823,7 @@ struct ib_ah_attr {
 };
 
 enum ib_wc_status {
+	IB_WC_STATUS_NONE = -1,
 	IB_WC_SUCCESS,
 	IB_WC_LOC_LEN_ERR,
 	IB_WC_LOC_QP_OP_ERR,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related
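
The key idea in this patch is the sentinel status: the waiter's
condition is "status is no longer NONE". A single-threaded user-space
sketch of just that predicate follows. The type and function names are
hypothetical, only three enum values are modelled, and the actual
sleeping and waking done by wait_event() and wake_up() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the shape of the patched enum; only three values modelled. */
enum wc_status_model {
	WC_STATUS_NONE = -1,   /* sentinel: no completion seen yet */
	WC_SUCCESS,            /* still 0, as before the sentinel existed */
	WC_LOC_LEN_ERR,
};

struct umr_ctx_model {
	enum wc_status_model status;
};

static void model_init_ctx(struct umr_ctx_model *ctx)
{
	ctx->status = WC_STATUS_NONE;
}

/* The wait condition: true once the handler has stored a status. */
static bool model_ctx_done(const struct umr_ctx_model *ctx)
{
	return ctx->status != WC_STATUS_NONE;
}

/* Models the completion handler; a real one would also wake the waiter. */
static void model_umr_done(struct umr_ctx_model *ctx, enum wc_status_model s)
{
	assert(s != WC_STATUS_NONE);
	ctx->status = s;
}
```

Placing the sentinel at -1 ahead of the first enumerator leaves the
numeric values of the existing status codes unchanged.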

* [PATCH v4 09/10] IB/mlx5: Replace semaphore umr_common:sem with wait_event
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

Remove semaphore umr_common:sem used to limit concurrent access to umr qp
and introduce an atomic value 'users' to keep track of the same. Use a
wait_event to block when the limit is reached.

Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    | 6 +-----
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 7 ++++++-
 drivers/infiniband/hw/mlx5/mr.c      | 6 ++++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 2217477..eb72bff 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2437,10 +2437,6 @@ static void destroy_umrc_res(struct mlx5_ib_dev *dev)
 	ib_dealloc_pd(dev->umrc.pd);
 }
 
-enum {
-	MAX_UMR_WR = 128,
-};
-
 static int create_umr_res(struct mlx5_ib_dev *dev)
 {
 	struct ib_qp_init_attr *init_attr = NULL;
@@ -2520,7 +2516,7 @@ static int create_umr_res(struct mlx5_ib_dev *dev)
 	dev->umrc.cq = cq;
 	dev->umrc.pd = pd;
 
-	sema_init(&dev->umrc.sem, MAX_UMR_WR);
+	init_waitqueue_head(&dev->umrc.wq);
 	ret = mlx5_mr_cache_init(dev);
 	if (ret) {
 		mlx5_ib_warn(dev, "mr cache init failed %d\n", ret);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dcdcd19..de31b5f 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -533,7 +533,12 @@ struct umr_common {
 	struct ib_qp	*qp;
 	/* control access to UMR QP
 	 */
-	struct semaphore	sem;
+	wait_queue_head_t	wq;
+	atomic_t		users;
+};
+
+enum {
+	MAX_UMR_WR = 128,
 };
 
 enum {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 1593856..dfaf6f6 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -867,7 +867,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
 	mlx5_ib_init_umr_context(&umr_context);
 	umrwr->wr.wr_cqe = &umr_context.cqe;
 
-	down(&umrc->sem);
+	/* limit number of concurrent ib_post_send() on qp */
+	wait_event(umrc->wq, atomic_add_unless(&umrc->users, 1, MAX_UMR_WR));
 	err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
 	if (err) {
 		mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
@@ -879,7 +880,8 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
 			err = -EFAULT;
 		}
 	}
-	up(&umrc->sem);
+	atomic_dec(&umrc->users);
+	wake_up(&umrc->wq);
 	return err;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related
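
The limiter in this patch rests on "add one unless the counter is
already at the maximum". A user-space sketch of that primitive with
C11 atomics is shown below. It is an approximation under stated
assumptions: the model_* names are hypothetical, the slot grab is
non-blocking where the patch sleeps in wait_event(), and no wake-up is
modelled on release.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { MODEL_MAX_UMR_WR = 128 };  /* same value as MAX_UMR_WR in the patch */

/* Add 'a' to *v unless *v == u; returns true if the add happened. */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* Non-blocking slot grab; the patch sleeps until this would succeed. */
static bool model_try_get_slot(atomic_int *users)
{
	return model_add_unless(users, 1, MODEL_MAX_UMR_WR);
}

/* Release a slot; the patch follows this with a wake_up(). */
static void model_put_slot(atomic_int *users)
{
	int prev = atomic_fetch_sub(users, 1);

	assert(prev > 0);  /* every put must pair with a successful get */
	(void)prev;
}
```

Because the compare-and-exchange only succeeds against the value that
was just checked, the counter can never be pushed past the limit even
when several callers race.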

* [PATCH v4 08/10] IB/mlx5: Add helper mlx5_ib_post_send_wait
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

Clean up the following common code (to post a list of work requests to the
send queue of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' to implement the same.

 - Initialize 'mlx5_ib_umr_context' on stack
 - Assign 'mlx5_umr_wr:wr:wr_cqe' to umr_context.cqe
 - Acquire the semaphore
 - Call ib_post_send with a single ib_send_wr
 - Call wait_for_completion()
 - Check for umr_context.status
 - Release the semaphore

As semaphores are going away in the future, moving all of these into the
shared helper leaves only a single function using the semaphore, which
can then be rewritten to use something else.

Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/mr.c | 115 +++++++++++-----------------------------
 1 file changed, 32 insertions(+), 83 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d4ad672..1593856 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -856,16 +856,40 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
 	init_completion(&context->done);
 }
 
+static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
+					 struct mlx5_umr_wr *umrwr)
+{
+	struct umr_common *umrc = &dev->umrc;
+	struct ib_send_wr *bad;
+	int err;
+	struct mlx5_ib_umr_context umr_context;
+
+	mlx5_ib_init_umr_context(&umr_context);
+	umrwr->wr.wr_cqe = &umr_context.cqe;
+
+	down(&umrc->sem);
+	err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
+	if (err) {
+		mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
+	} else {
+		wait_for_completion(&umr_context.done);
+		if (umr_context.status != IB_WC_SUCCESS) {
+			mlx5_ib_warn(dev, "reg umr failed (%u)\n",
+				     umr_context.status);
+			err = -EFAULT;
+		}
+	}
+	up(&umrc->sem);
+	return err;
+}
+
 static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
 				  u64 virt_addr, u64 len, int npages,
 				  int page_shift, int order, int access_flags)
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct device *ddev = dev->ib_dev.dma_device;
-	struct umr_common *umrc = &dev->umrc;
-	struct mlx5_ib_umr_context umr_context;
 	struct mlx5_umr_wr umrwr = {};
-	struct ib_send_wr *bad;
 	struct mlx5_ib_mr *mr;
 	struct ib_sge sg;
 	int size;
@@ -894,24 +918,12 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
 	if (err)
 		goto free_mr;
 
-	mlx5_ib_init_umr_context(&umr_context);
-
-	umrwr.wr.wr_cqe = &umr_context.cqe;
 	prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
 			 page_shift, virt_addr, len, access_flags);
 
-	down(&umrc->sem);
-	err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
-	if (err) {
-		mlx5_ib_warn(dev, "post send failed, err %d\n", err);
+	err = mlx5_ib_post_send_wait(dev, &umrwr);
+	if (err && err != -EFAULT)
 		goto unmap_dma;
-	} else {
-		wait_for_completion(&umr_context.done);
-		if (umr_context.status != IB_WC_SUCCESS) {
-			mlx5_ib_warn(dev, "reg umr failed\n");
-			err = -EFAULT;
-		}
-	}
 
 	mr->mmkey.iova = virt_addr;
 	mr->mmkey.size = len;
@@ -920,7 +932,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
 	mr->live = 1;
 
 unmap_dma:
-	up(&umrc->sem);
 	dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
 
 	kfree(mr_pas);
@@ -940,13 +951,10 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
 {
 	struct mlx5_ib_dev *dev = mr->dev;
 	struct device *ddev = dev->ib_dev.dma_device;
-	struct umr_common *umrc = &dev->umrc;
-	struct mlx5_ib_umr_context umr_context;
 	struct ib_umem *umem = mr->umem;
 	int size;
 	__be64 *pas;
 	dma_addr_t dma;
-	struct ib_send_wr *bad;
 	struct mlx5_umr_wr wr;
 	struct ib_sge sg;
 	int err = 0;
@@ -1011,10 +1019,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
 
 		dma_sync_single_for_device(ddev, dma, size, DMA_TO_DEVICE);
 
-		mlx5_ib_init_umr_context(&umr_context);
-
 		memset(&wr, 0, sizeof(wr));
-		wr.wr.wr_cqe = &umr_context.cqe;
 
 		sg.addr = dma;
 		sg.length = ALIGN(npages * sizeof(u64),
@@ -1031,19 +1036,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
 		wr.mkey = mr->mmkey.key;
 		wr.target.offset = start_page_index;
 
-		down(&umrc->sem);
-		err = ib_post_send(umrc->qp, &wr.wr, &bad);
-		if (err) {
-			mlx5_ib_err(dev, "UMR post send failed, err %d\n", err);
-		} else {
-			wait_for_completion(&umr_context.done);
-			if (umr_context.status != IB_WC_SUCCESS) {
-				mlx5_ib_err(dev, "UMR completion failed, code %d\n",
-					    umr_context.status);
-				err = -EFAULT;
-			}
-		}
-		up(&umrc->sem);
+		err = mlx5_ib_post_send_wait(dev, &wr);
 	}
 	dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
 
@@ -1210,39 +1203,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 {
 	struct mlx5_core_dev *mdev = dev->mdev;
-	struct umr_common *umrc = &dev->umrc;
-	struct mlx5_ib_umr_context umr_context;
 	struct mlx5_umr_wr umrwr = {};
-	struct ib_send_wr *bad;
-	int err;
 
 	if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
 		return 0;
 
-	mlx5_ib_init_umr_context(&umr_context);
-
-	umrwr.wr.wr_cqe = &umr_context.cqe;
 	prep_umr_unreg_wqe(dev, &umrwr.wr, mr->mmkey.key);
 
-	down(&umrc->sem);
-	err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
-	if (err) {
-		up(&umrc->sem);
-		mlx5_ib_dbg(dev, "err %d\n", err);
-		goto error;
-	} else {
-		wait_for_completion(&umr_context.done);
-		up(&umrc->sem);
-	}
-	if (umr_context.status != IB_WC_SUCCESS) {
-		mlx5_ib_warn(dev, "unreg umr failed\n");
-		err = -EFAULT;
-		goto error;
-	}
-	return 0;
-
-error:
-	return err;
+	return mlx5_ib_post_send_wait(dev, &umrwr);
 }
 
 static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
@@ -1251,19 +1219,13 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct device *ddev = dev->ib_dev.dma_device;
-	struct mlx5_ib_umr_context umr_context;
-	struct ib_send_wr *bad;
 	struct mlx5_umr_wr umrwr = {};
 	struct ib_sge sg;
-	struct umr_common *umrc = &dev->umrc;
 	dma_addr_t dma = 0;
 	__be64 *mr_pas = NULL;
 	int size;
 	int err;
 
-	mlx5_ib_init_umr_context(&umr_context);
-
-	umrwr.wr.wr_cqe = &umr_context.cqe;
 	umrwr.wr.send_flags = MLX5_IB_SEND_UMR_FAIL_IF_FREE;
 
 	if (flags & IB_MR_REREG_TRANS) {
@@ -1291,21 +1253,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
 	}
 
 	/* post send request to UMR QP */
-	down(&umrc->sem);
-	err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
+	err = mlx5_ib_post_send_wait(dev, &umrwr);
 
-	if (err) {
-		mlx5_ib_warn(dev, "post send failed, err %d\n", err);
-	} else {
-		wait_for_completion(&umr_context.done);
-		if (umr_context.status != IB_WC_SUCCESS) {
-			mlx5_ib_warn(dev, "reg umr failed (%u)\n",
-				     umr_context.status);
-			err = -EFAULT;
-		}
-	}
-
-	up(&umrc->sem);
 	if (flags & IB_MR_REREG_TRANS) {
 		dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
 		kfree(mr_pas);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
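
The hunks above replace four hand-written copies of "take the semaphore, post
the work request, wait for the completion, check the status" with one call to
mlx5_ib_post_send_wait(). The helper body is not part of this excerpt, so the
following is only a userspace sketch of that pattern, assuming POSIX threads
and semaphores. Every name in it (wait_ctx, work_queue, post_send_wait and the
stub posters) is hypothetical and is not an mlx5 symbol.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* All names below are hypothetical stand-ins, not mlx5 driver symbols. */
enum wc_status { WC_SUCCESS, WC_ERROR };

struct wait_ctx {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	int             done;
	enum wc_status  status;
};

/* A "post" callback either fails immediately or arranges for
 * ctx_complete() to be called once the work request finishes. */
typedef int (*post_fn)(struct wait_ctx *ctx);

struct work_queue {
	sem_t   slots;	/* limits the number of requests in flight */
	post_fn post;
};

void ctx_complete(struct wait_ctx *ctx, enum wc_status status)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->status = status;
	ctx->done = 1;
	pthread_cond_signal(&ctx->cond);
	pthread_mutex_unlock(&ctx->lock);
}

/* Post one request, wait for its completion, and fold both failure
 * modes into a single return value. */
int post_send_wait(struct work_queue *q)
{
	struct wait_ctx ctx = { .done = 0, .status = WC_SUCCESS };
	int err;

	pthread_mutex_init(&ctx.lock, NULL);
	pthread_cond_init(&ctx.cond, NULL);

	while (sem_wait(&q->slots) == -1 && errno == EINTR)
		;			/* retry if interrupted by a signal */

	err = q->post(&ctx);
	if (!err) {
		pthread_mutex_lock(&ctx.lock);
		while (!ctx.done)
			pthread_cond_wait(&ctx.cond, &ctx.lock);
		pthread_mutex_unlock(&ctx.lock);
		if (ctx.status != WC_SUCCESS)
			err = -EFAULT;	/* posted fine, but completed badly */
	}

	sem_post(&q->slots);
	pthread_cond_destroy(&ctx.cond);
	pthread_mutex_destroy(&ctx.lock);
	return err;
}

/* Stub posters used only to exercise the three outcomes. */
int post_ok(struct wait_ctx *ctx)     { ctx_complete(ctx, WC_SUCCESS); return 0; }
int post_bad_wc(struct wait_ctx *ctx) { ctx_complete(ctx, WC_ERROR); return 0; }
int post_reject(struct wait_ctx *ctx) { (void)ctx; return -EINVAL; }
```

Returning -EFAULT only for a failed completion lets a caller tell "the post
itself failed" apart from "the request completed with an error", which is the
distinction reg_umr() draws with its "err && err != -EFAULT" test.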

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v4 07/10] IB/mthca: Replace counting semaphore event_sem with wait_event
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with a conditional wait_event.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/hw/mthca/mthca_cmd.c | 47 ++++++++++++++++++++++-----------
 drivers/infiniband/hw/mthca/mthca_dev.h |  3 ++-
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 49c6e19..d6a048a 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -405,6 +405,34 @@ void mthca_cmd_event(struct mthca_dev *dev,
 	complete(&context->done);
 }
 
+static inline struct mthca_cmd_context *
+mthca_try_get_context(struct mthca_cmd *cmd)
+{
+	struct mthca_cmd_context *context = NULL;
+
+	spin_lock(&cmd->context_lock);
+
+	if (cmd->free_head < 0)
+		goto out;
+
+	context = &cmd->context[cmd->free_head];
+	context->token += cmd->token_mask + 1;
+	cmd->free_head = context->next;
+out:
+	spin_unlock(&cmd->context_lock);
+	return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct mthca_cmd_context *
+mthca_get_free_context(struct mthca_cmd *cmd)
+{
+	struct mthca_cmd_context *context;
+
+	wait_event(cmd->wq, (context = mthca_try_get_context(cmd)));
+	return context;
+}
+
 static int mthca_cmd_wait(struct mthca_dev *dev,
 			  u64 in_param,
 			  u64 *out_param,
@@ -417,15 +445,7 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
 	int err = 0;
 	struct mthca_cmd_context *context;
 
-	down(&dev->cmd.event_sem);
-
-	spin_lock(&dev->cmd.context_lock);
-	BUG_ON(dev->cmd.free_head < 0);
-	context = &dev->cmd.context[dev->cmd.free_head];
-	context->token += dev->cmd.token_mask + 1;
-	dev->cmd.free_head = context->next;
-	spin_unlock(&dev->cmd.context_lock);
-
+	context = mthca_get_free_context(&dev->cmd);
 	init_completion(&context->done);
 
 	err = mthca_cmd_post(dev, in_param,
@@ -458,8 +478,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
 	context->next = dev->cmd.free_head;
 	dev->cmd.free_head = context - dev->cmd.context;
 	spin_unlock(&dev->cmd.context_lock);
+	wake_up(&dev->cmd.wq);
 
-	up(&dev->cmd.event_sem);
 	return err;
 }
 
@@ -571,7 +591,7 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
 	dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
 	dev->cmd.free_head = 0;
 
-	sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+	init_waitqueue_head(&dev->cmd.wq);
 	spin_lock_init(&dev->cmd.context_lock);
 
 	for (dev->cmd.token_mask = 1;
@@ -590,12 +610,9 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
  */
 void mthca_cmd_use_polling(struct mthca_dev *dev)
 {
-	int i;
-
 	dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;
 
-	for (i = 0; i < dev->cmd.max_cmds; ++i)
-		down(&dev->cmd.event_sem);
+	dev->cmd.free_head = -1;
 
 	kfree(dev->cmd.context);
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 87ab964..2fc86db 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -46,6 +46,7 @@
 #include <linux/list.h>
 #include <linux/semaphore.h>
 
+#include <rdma/ib_sa.h>
 #include "mthca_provider.h"
 #include "mthca_doorbell.h"
 
@@ -121,7 +122,7 @@ struct mthca_cmd {
 	struct pci_pool          *pool;
 	struct mutex              hcr_mutex;
 	struct mutex		  poll_mutex;
-	struct semaphore 	  event_sem;
+	wait_queue_head_t	  wq;
 	int              	  max_cmds;
 	spinlock_t                context_lock;
 	int                       free_head;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
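
The idea of this patch is that the free list itself can serve as the counter:
a caller sleeps until free_head is non-negative instead of first taking a
counting semaphore. A minimal userspace sketch of that scheme follows, assuming
POSIX threads, with a condition variable standing in for the kernel wait queue.
The names cmd_pool, pool_get, pool_try_get and pool_put are illustrative only
and do not appear in the driver.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_CMDS 4	/* arbitrary pool size for the sketch */

struct ctx {
	int next;	/* index of the next free context, -1 at the end */
};

struct cmd_pool {
	pthread_mutex_t lock;		/* stands in for context_lock */
	pthread_cond_t  wq;		/* stands in for the wait queue */
	int             free_head;	/* -1 means nothing is free */
	struct ctx      context[MAX_CMDS];
};

void pool_init(struct cmd_pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->wq, NULL);
	for (int i = 0; i < MAX_CMDS; i++)
		p->context[i].next = i + 1;
	p->context[MAX_CMDS - 1].next = -1;
	p->free_head = 0;
}

/* Pop the head of the free list; the caller must hold p->lock. */
static struct ctx *pool_pop_locked(struct cmd_pool *p)
{
	struct ctx *c;

	if (p->free_head < 0)
		return NULL;
	c = &p->context[p->free_head];
	p->free_head = c->next;
	return c;
}

/* Non-blocking: returns NULL when every context is in use. */
struct ctx *pool_try_get(struct cmd_pool *p)
{
	struct ctx *c;

	pthread_mutex_lock(&p->lock);
	c = pool_pop_locked(p);
	pthread_mutex_unlock(&p->lock);
	return c;
}

/* Blocking: sleeps until a context is returned to the pool. */
struct ctx *pool_get(struct cmd_pool *p)
{
	struct ctx *c;

	pthread_mutex_lock(&p->lock);
	while (!(c = pool_pop_locked(p)))
		pthread_cond_wait(&p->wq, &p->lock);
	pthread_mutex_unlock(&p->lock);
	return c;
}

/* Push a context back and wake one sleeper, if any. */
void pool_put(struct cmd_pool *p, struct ctx *c)
{
	pthread_mutex_lock(&p->lock);
	c->next = p->free_head;
	p->free_head = (int)(c - p->context);
	pthread_mutex_unlock(&p->lock);
	pthread_cond_signal(&p->wq);
}
```

Because the emptiness check and the pop happen under the same lock, a woken
waiter that loses the race simply goes back to sleep, which is the same retry
behaviour wait_event() gives the try-get helper in the patch.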

^ permalink raw reply related

* [PATCH v4 06/10] IB/hns: Replace counting semaphore event_sem with wait_event
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

Counting semaphores are going away in the future, so replace the semaphore
hns_roce_cmdq::event_sem with a conditional wait_event.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c    | 46 ++++++++++++++++++++---------
 drivers/infiniband/hw/hns/hns_roce_device.h |  2 +-
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 51a0675..12ef3d8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -189,6 +189,34 @@ void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 token, u8 status,
 	complete(&context->done);
 }
 
+static inline struct hns_roce_cmd_context *
+hns_roce_try_get_context(struct hns_roce_cmdq *cmd)
+{
+	struct hns_roce_cmd_context *context = NULL;
+
+	spin_lock(&cmd->context_lock);
+
+	if (cmd->free_head < 0)
+		goto out;
+
+	context = &cmd->context[cmd->free_head];
+	context->token += cmd->token_mask + 1;
+	cmd->free_head = context->next;
+out:
+	spin_unlock(&cmd->context_lock);
+	return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct hns_roce_cmd_context *
+hns_roce_get_free_context(struct hns_roce_cmdq *cmd)
+{
+	struct hns_roce_cmd_context *context;
+
+	wait_event(cmd->wq, (context = hns_roce_try_get_context(cmd)));
+	return context;
+}
+
 /* this should be called with "use_events" */
 static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
 				    u64 out_param, unsigned long in_modifier,
@@ -200,13 +228,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
 	struct hns_roce_cmd_context *context;
 	int ret = 0;
 
-	spin_lock(&cmd->context_lock);
-	WARN_ON(cmd->free_head < 0);
-	context = &cmd->context[cmd->free_head];
-	context->token += cmd->token_mask + 1;
-	cmd->free_head = context->next;
-	spin_unlock(&cmd->context_lock);
-
+	context = hns_roce_get_free_context(cmd);
 	init_completion(&context->done);
 
 	ret = hns_roce_cmd_mbox_post_hw(hr_dev, in_param, out_param,
@@ -238,6 +260,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
 	context->next = cmd->free_head;
 	cmd->free_head = context - cmd->context;
 	spin_unlock(&cmd->context_lock);
+	wake_up(&cmd->wq);
 
 	return ret;
 }
@@ -248,10 +271,8 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
 {
 	int ret = 0;
 
-	down(&hr_dev->cmd.event_sem);
 	ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
 				       in_modifier, op_modifier, op, timeout);
-	up(&hr_dev->cmd.event_sem);
 
 	return ret;
 }
@@ -313,7 +334,7 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
 	hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
 	hr_cmd->free_head = 0;
 
-	sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+	init_waitqueue_head(&hr_cmd->wq);
 	spin_lock_init(&hr_cmd->context_lock);
 
 	hr_cmd->token_mask = CMD_TOKEN_MASK;
@@ -325,12 +346,9 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
 void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
 {
 	struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
-	int i;
 
 	hr_cmd->use_events = 0;
-
-	for (i = 0; i < hr_cmd->max_cmds; ++i)
-		down(&hr_cmd->event_sem);
+	hr_cmd->free_head = -1;
 
 	kfree(hr_cmd->context);
 }
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2afe075..ac95f52 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -364,7 +364,7 @@ struct hns_roce_cmdq {
 	* Event mode: cmd register mutex protection,
 	* ensure to not exceed max_cmds and user use limit region
 	*/
-	struct semaphore	event_sem;
+	wait_queue_head_t       wq;
 	int			max_cmds;
 	spinlock_t		context_lock;
 	int			free_head;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
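
Both try-get helpers advance the token with "token += token_mask + 1". Assuming
token_mask has the form 2^k - 1, that addition changes only the bits above the
mask, so the low bits keep identifying the same context slot while the upper
bits act as a generation counter. The lookup that uses the low bits is not
shown in this diff, so the slot function below is an assumption, and the mask
value 0x1f in the usage note is arbitrary. A small worked sketch:

```c
#include <stdint.h>

/* Bump the generation bits of a token while leaving the low
 * (index) bits untouched. token_mask must be 2^k - 1. */
uint16_t token_next(uint16_t token, uint16_t token_mask)
{
	return (uint16_t)(token + token_mask + 1u);
}

/* Recover the context slot from a token (assumed lookup rule). */
unsigned int token_slot(uint16_t token, uint16_t token_mask)
{
	return token & token_mask;
}
```

With a mask of 0x1f, token 0x0005 becomes 0x0025 and still maps to slot 5; at
the top of the 16-bit range, 0xffe5 wraps around to 0x0005 and again maps to
slot 5.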

^ permalink raw reply related

* [PATCH v4 05/10] IB/isert: Replace semaphore sem with completion
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

The semaphore 'sem' in isert_device is used as completion, so convert
it to struct completion. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
 drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6dd43f6..de80f56 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -619,7 +619,7 @@
 	mutex_unlock(&isert_np->mutex);
 
 	isert_info("np %p: Allow accept_np to continue\n", isert_np);
-	up(&isert_np->sem);
+	complete(&isert_np->comp);
 }
 
 static void
@@ -2311,7 +2311,7 @@ struct rdma_cm_id *
 		isert_err("Unable to allocate struct isert_np\n");
 		return -ENOMEM;
 	}
-	sema_init(&isert_np->sem, 0);
+	init_completion(&isert_np->comp);
 	mutex_init(&isert_np->mutex);
 	INIT_LIST_HEAD(&isert_np->accepted);
 	INIT_LIST_HEAD(&isert_np->pending);
@@ -2427,7 +2427,7 @@ struct rdma_cm_id *
 	int ret;
 
 accept_wait:
-	ret = down_interruptible(&isert_np->sem);
+	ret = wait_for_completion_interruptible(&isert_np->comp);
 	if (ret)
 		return -ENODEV;
 
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index c02ada5..a1277c0 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -3,6 +3,7 @@
 #include <linux/in6.h>
 #include <rdma/ib_verbs.h>
 #include <rdma/rdma_cm.h>
+#include <linux/completion.h>
 #include <rdma/rw.h>
 #include <scsi/iser.h>
 
@@ -190,7 +191,7 @@ struct isert_device {
 
 struct isert_np {
 	struct iscsi_np         *np;
-	struct semaphore	sem;
+	struct completion	comp;
 	struct rdma_cm_id	*cm_id;
 	struct mutex		mutex;
 	struct list_head	accepted;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
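
A semaphore initialised to 0 and a completion both remember how many times they
were signalled, so each complete() lets exactly one later waiter through even
if nobody was waiting at that moment. The sketch below is a userspace analogue
of that counting behaviour, assuming POSIX threads. The comp_* names are made
up for the example and are not the kernel completion API.

```c
#include <pthread.h>
#include <stdbool.h>

struct comp {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	unsigned int    done;	/* signals not yet consumed by a waiter */
};

void comp_init(struct comp *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

/* Record one signal and wake one waiter, if any. */
void comp_complete(struct comp *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until a signal is available, then consume it. */
void comp_wait(struct comp *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;
	pthread_mutex_unlock(&c->lock);
}

/* Consume a signal only if one is already pending. */
bool comp_try_wait(struct comp *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	ok = c->done > 0;
	if (ok)
		c->done--;
	pthread_mutex_unlock(&c->lock);
	return ok;
}
```

Two signals followed by two waits both return without blocking, which matches
the way an accept loop can drain several pending connection requests.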

^ permalink raw reply related

* [PATCH v4 04/10] IB/mthca: Replace semaphore poll_sem with mutex
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

The semaphore 'poll_sem' is used as a simple mutex, so it should be written as
one. Semaphores are going away in the future, so replace it with a mutex. Also,
remove the down() and up() calls on poll_sem from mthca_cmd_use_events and
mthca_cmd_use_polling respectively.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/hw/mthca/mthca_cmd.c | 10 +++-------
 drivers/infiniband/hw/mthca/mthca_cmd.h |  1 +
 drivers/infiniband/hw/mthca/mthca_dev.h |  2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..49c6e19 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -347,7 +347,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
 	unsigned long end;
 	u8 status;
 
-	down(&dev->cmd.poll_sem);
+	mutex_lock(&dev->cmd.poll_mutex);
 
 	err = mthca_cmd_post(dev, in_param,
 			     out_param ? *out_param : 0,
@@ -382,7 +382,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
 	}
 
 out:
-	up(&dev->cmd.poll_sem);
+	mutex_unlock(&dev->cmd.poll_mutex);
 	return err;
 }
 
@@ -520,7 +520,7 @@ static int mthca_cmd_imm(struct mthca_dev *dev,
 int mthca_cmd_init(struct mthca_dev *dev)
 {
 	mutex_init(&dev->cmd.hcr_mutex);
-	sema_init(&dev->cmd.poll_sem, 1);
+	mutex_init(&dev->cmd.poll_mutex);
 	dev->cmd.flags = 0;
 
 	dev->hcr = ioremap(pci_resource_start(dev->pdev, 0) + MTHCA_HCR_BASE,
@@ -582,8 +582,6 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
 
 	dev->cmd.flags |= MTHCA_CMD_USE_EVENTS;
 
-	down(&dev->cmd.poll_sem);
-
 	return 0;
 }
 
@@ -600,8 +598,6 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
 		down(&dev->cmd.event_sem);
 
 	kfree(dev->cmd.context);
-
-	up(&dev->cmd.poll_sem);
 }
 
 struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h
index d2e5b19..a7f197e 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.h
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.h
@@ -35,6 +35,7 @@
 #ifndef MTHCA_CMD_H
 #define MTHCA_CMD_H
 
+#include <linux/mutex.h>
 #include <rdma/ib_verbs.h>
 
 #define MTHCA_MAILBOX_SIZE 4096
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..87ab964 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -120,7 +120,7 @@ enum {
 struct mthca_cmd {
 	struct pci_pool          *pool;
 	struct mutex              hcr_mutex;
-	struct semaphore 	  poll_sem;
+	struct mutex		  poll_mutex;
 	struct semaphore 	  event_sem;
 	int              	  max_cmds;
 	spinlock_t                context_lock;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
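
After this change the lock is taken and released inside the same function,
which is the usage a mutex is meant for; the removed down() in
mthca_cmd_use_events and up() in mthca_cmd_use_polling held the semaphore
across two separate calls. A minimal userspace sketch of the resulting
"one polled command at a time" rule, assuming POSIX threads, with hypothetical
names (cmd_if, cmd_poll, cmd_try_poll) and a counter standing in for the real
hardware access:

```c
#include <errno.h>
#include <pthread.h>

struct cmd_if {
	pthread_mutex_t poll_mutex;	/* serializes polled commands */
	int             polled;		/* number of commands run so far */
};

void cmd_if_init(struct cmd_if *c)
{
	pthread_mutex_init(&c->poll_mutex, NULL);
	c->polled = 0;
}

/* Run one polled command; blocks while another caller is polling. */
int cmd_poll(struct cmd_if *c)
{
	pthread_mutex_lock(&c->poll_mutex);
	c->polled++;	/* placeholder for posting and polling the hardware */
	pthread_mutex_unlock(&c->poll_mutex);
	return 0;
}

/* Non-blocking variant: reports -EBUSY instead of sleeping. */
int cmd_try_poll(struct cmd_if *c)
{
	if (pthread_mutex_trylock(&c->poll_mutex))
		return -EBUSY;
	c->polled++;
	pthread_mutex_unlock(&c->poll_mutex);
	return 0;
}
```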

^ permalink raw reply related

* [PATCH v4 03/10] IB/hns: Replace semaphore poll_sem with mutex
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

The semaphore 'poll_sem' is used as a simple mutex, so it should be written as
one. Semaphores are going away in the future, so replace it with a mutex. Also,
remove the down() and up() calls on poll_sem from hns_roce_cmd_use_events and
hns_roce_cmd_use_polling respectively.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c    | 11 ++++-------
 drivers/infiniband/hw/hns/hns_roce_device.h |  3 ++-
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..51a0675 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -119,7 +119,7 @@ static int hns_roce_cmd_mbox_post_hw(struct hns_roce_dev *hr_dev, u64 in_param,
 	return ret;
 }
 
-/* this should be called with "poll_sem" */
+/* this should be called with "poll_mutex" */
 static int __hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
 				    u64 out_param, unsigned long in_modifier,
 				    u8 op_modifier, u16 op,
@@ -167,10 +167,10 @@ static int hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
 {
 	int ret;
 
-	down(&hr_dev->cmd.poll_sem);
+	mutex_lock(&hr_dev->cmd.poll_mutex);
 	ret = __hns_roce_cmd_mbox_poll(hr_dev, in_param, out_param, in_modifier,
 				       op_modifier, op, timeout);
-	up(&hr_dev->cmd.poll_sem);
+	mutex_unlock(&hr_dev->cmd.poll_mutex);
 
 	return ret;
 }
@@ -275,7 +275,7 @@ int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
 	struct device *dev = &hr_dev->pdev->dev;
 
 	mutex_init(&hr_dev->cmd.hcr_mutex);
-	sema_init(&hr_dev->cmd.poll_sem, 1);
+	mutex_init(&hr_dev->cmd.poll_mutex);
 	hr_dev->cmd.use_events = 0;
 	hr_dev->cmd.toggle = 1;
 	hr_dev->cmd.max_cmds = CMD_MAX_NUM;
@@ -319,8 +319,6 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
 	hr_cmd->token_mask = CMD_TOKEN_MASK;
 	hr_cmd->use_events = 1;
 
-	down(&hr_cmd->poll_sem);
-
 	return 0;
 }
 
@@ -335,7 +333,6 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
 		down(&hr_cmd->event_sem);
 
 	kfree(hr_cmd->context);
-	up(&hr_cmd->poll_sem);
 }
 
 struct hns_roce_cmd_mailbox
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..2afe075 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
 #define _HNS_ROCE_DEVICE_H
 
 #include <rdma/ib_verbs.h>
+#include <linux/mutex.h>
 
 #define DRV_NAME "hns_roce"
 
@@ -358,7 +359,7 @@ struct hns_roce_cmdq {
 	struct dma_pool		*pool;
 	u8 __iomem		*hcr;
 	struct mutex		hcr_mutex;
-	struct semaphore	poll_sem;
+	struct mutex		poll_mutex;
 	/*
 	* Event mode: cmd register mutex protection,
 	* ensure to not exceed max_cmds and user use limit region
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related

* [PATCH v4 02/10] IB/core: Replace semaphore sm_sem with an atomic wait
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

The semaphore 'sm_sem' is used for an exclusive ownership of the device
so model the same as an atomic variable with an associated wait_event.
Semaphores are going away in the future.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/core/user_mad.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 415a318..6101c0a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -67,6 +67,8 @@ enum {
 	IB_UMAD_MINOR_BASE = 0
 };
 
+#define UMAD_F_CLAIM	0x01
+
 /*
  * Our lifetime rules for these structs are the following:
  * device special file is opened, we take a reference on the
@@ -87,7 +89,8 @@ struct ib_umad_port {
 
 	struct cdev           sm_cdev;
 	struct device	      *sm_dev;
-	struct semaphore       sm_sem;
+	wait_queue_head_t     wq;
+	unsigned long         flags;
 
 	struct mutex	       file_mutex;
 	struct list_head       file_list;
@@ -1030,12 +1033,14 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
 	port = container_of(inode->i_cdev, struct ib_umad_port, sm_cdev);
 
 	if (filp->f_flags & O_NONBLOCK) {
-		if (down_trylock(&port->sm_sem)) {
+		if (test_and_set_bit(UMAD_F_CLAIM, &port->flags)) {
 			ret = -EAGAIN;
 			goto fail;
 		}
 	} else {
-		if (down_interruptible(&port->sm_sem)) {
+		if (wait_event_interruptible(port->wq,
+					     !test_and_set_bit(UMAD_F_CLAIM,
+					     &port->flags))) {
 			ret = -ERESTARTSYS;
 			goto fail;
 		}
@@ -1060,7 +1065,8 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
 	ib_modify_port(port->ib_dev, port->port_num, 0, &props);
 
 err_up_sem:
-	up(&port->sm_sem);
+	clear_bit(UMAD_F_CLAIM, &port->flags);
+	wake_up(&port->wq);
 
 fail:
 	return ret;
@@ -1079,7 +1085,8 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp)
 		ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
 	mutex_unlock(&port->file_mutex);
 
-	up(&port->sm_sem);
+	clear_bit(UMAD_F_CLAIM, &port->flags);
+	wake_up(&port->wq);
 
 	kobject_put(&port->umad_dev->kobj);
 
@@ -1177,7 +1184,8 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
 
 	port->ib_dev   = device;
 	port->port_num = port_num;
-	sema_init(&port->sm_sem, 1);
+	init_waitqueue_head(&port->wq);
+	__clear_bit(UMAD_F_CLAIM, &port->flags);
 	mutex_init(&port->file_mutex);
 	INIT_LIST_HEAD(&port->file_list);
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
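
The patch models exclusive ownership as "atomically set a flag; if it was
already set, either fail (O_NONBLOCK) or sleep until the owner clears it". Note
that test_and_set_bit() takes a bit number rather than a mask, so a value of
0x01 selects bit 1 of the flags word. The sketch below is a userspace analogue
only, assuming C11 atomics and POSIX threads; struct port, port_open and
port_close are illustrative names, not the user_mad.c code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct port {
	atomic_flag     claimed;	/* set while someone owns the port */
	pthread_mutex_t lock;		/* pairs with wq for sleeping */
	pthread_cond_t  wq;
};

void port_init(struct port *p)
{
	atomic_flag_clear(&p->claimed);
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->wq, NULL);
}

/* Claim the port. Non-blocking callers get -EAGAIN when it is taken;
 * blocking callers sleep until the current owner releases it. */
int port_open(struct port *p, bool nonblock)
{
	if (nonblock)
		return atomic_flag_test_and_set(&p->claimed) ? -EAGAIN : 0;

	pthread_mutex_lock(&p->lock);
	while (atomic_flag_test_and_set(&p->claimed))
		pthread_cond_wait(&p->wq, &p->lock);
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Release the claim, then wake one sleeper so it can retry. */
void port_close(struct port *p)
{
	atomic_flag_clear(&p->claimed);
	pthread_mutex_lock(&p->lock);
	pthread_cond_signal(&p->wq);
	pthread_mutex_unlock(&p->lock);
}
```

Clearing the flag before signalling mirrors the clear_bit() followed by
wake_up() order in the patch: a woken waiter re-runs the test-and-set and only
proceeds if it wins the claim.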

^ permalink raw reply related

* [PATCH v4 01/10] IB/core: iwpm_nlmsg_request: Replace semaphore with completion
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org>

Semaphore sem in iwpm_nlmsg_request is used as completion, so
convert it to a struct completion type. Semaphores are going
away in the future.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 drivers/infiniband/core/iwpm_msg.c  | 8 ++++----
 drivers/infiniband/core/iwpm_util.c | 7 +++----
 drivers/infiniband/core/iwpm_util.h | 3 ++-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index 1c41b95..761358f 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -394,7 +394,7 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	/* always for found nlmsg_request */
 	kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
 	barrier();
-	up(&nlmsg_request->sem);
+	complete(&nlmsg_request->comp);
 	return 0;
 }
 EXPORT_SYMBOL(iwpm_register_pid_cb);
@@ -463,7 +463,7 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	/* always for found request */
 	kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
 	barrier();
-	up(&nlmsg_request->sem);
+	complete(&nlmsg_request->comp);
 	return 0;
 }
 EXPORT_SYMBOL(iwpm_add_mapping_cb);
@@ -555,7 +555,7 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
 	/* always for found request */
 	kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
 	barrier();
-	up(&nlmsg_request->sem);
+	complete(&nlmsg_request->comp);
 	return 0;
 }
 EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
@@ -749,7 +749,7 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	/* always for found request */
 	kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
 	barrier();
-	up(&nlmsg_request->sem);
+	complete(&nlmsg_request->comp);
 	return 0;
 }
 EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c b/drivers/infiniband/core/iwpm_util.c
index ade71e7..08ddd2e 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -323,8 +323,7 @@ struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
 	nlmsg_request->nl_client = nl_client;
 	nlmsg_request->request_done = 0;
 	nlmsg_request->err_code = 0;
-	sema_init(&nlmsg_request->sem, 1);
-	down(&nlmsg_request->sem);
+	init_completion(&nlmsg_request->comp);
 	return nlmsg_request;
 }
 
@@ -368,8 +367,8 @@ int iwpm_wait_complete_req(struct iwpm_nlmsg_request *nlmsg_request)
 {
 	int ret;
 
-	ret = down_timeout(&nlmsg_request->sem, IWPM_NL_TIMEOUT);
-	if (ret) {
+	ret = wait_for_completion_timeout(&nlmsg_request->comp, IWPM_NL_TIMEOUT);
+	if (!ret) {
 		ret = -EINVAL;
 		pr_info("%s: Timeout %d sec for netlink request (seq = %u)\n",
 			__func__, (IWPM_NL_TIMEOUT/HZ), nlmsg_request->nlmsg_seq);
diff --git a/drivers/infiniband/core/iwpm_util.h b/drivers/infiniband/core/iwpm_util.h
index af1fc14..ea6c299 100644
--- a/drivers/infiniband/core/iwpm_util.h
+++ b/drivers/infiniband/core/iwpm_util.h
@@ -43,6 +43,7 @@
 #include <linux/delay.h>
 #include <linux/workqueue.h>
 #include <linux/mutex.h>
+#include <linux/completion.h>
 #include <linux/jhash.h>
 #include <linux/kref.h>
 #include <net/netlink.h>
@@ -69,7 +70,7 @@ struct iwpm_nlmsg_request {
 	u8	            nl_client;
 	u8                  request_done;
 	u16                 err_code;
-	struct semaphore    sem;
+	struct completion   comp;
 	struct kref         kref;
 };
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
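
The "if (ret)" to "if (!ret)" change in iwpm_wait_complete_req() follows from
the two return conventions: down_timeout() returns 0 on success and a negative
error on timeout, whereas wait_for_completion_timeout() returns 0 on timeout
and a positive value otherwise. The userspace sketch below imitates the second
convention, assuming POSIX threads and C11 timespec_get(). The timed_comp type
and the tc_* helpers are invented for the example, and the wrapper only shows
the timeout mapping, not the rest of the kernel function.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

struct timed_comp {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	unsigned int    done;
};

void tc_init(struct timed_comp *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

void tc_complete(struct timed_comp *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Returns 0 on timeout and 1 on completion, i.e. "zero means failure". */
int tc_wait_timeout(struct timed_comp *c, long timeout_ms)
{
	struct timespec deadline;
	int rc = 0, completed;

	timespec_get(&deadline, TIME_UTC);
	deadline.tv_sec  += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done && rc == 0)
		rc = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
	completed = c->done > 0;
	if (completed)
		c->done--;
	pthread_mutex_unlock(&c->lock);
	return completed;
}

/* Caller side: a zero return is the timeout case and becomes an error. */
int wait_complete_req(struct timed_comp *c, long timeout_ms)
{
	if (!tc_wait_timeout(c, timeout_ms))
		return -EINVAL;
	return 0;
}
```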

^ permalink raw reply related

* [PATCH v4 00/10] infiniband: Remove semaphores
From: Binoy Jayan @ 2016-10-27  6:59 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan

Hi,

This is a set of patches [v4] that removes semaphores from InfiniBand.
It is part of a bigger effort to eliminate all semaphores from the
Linux kernel.

v3 -> v4:

IB/mlx5: Added patch - Replace semaphore umr_common:sem with wait_event
IB/mlx5: Fixed a bug pointed out by Leon Romanovsky

Thanks,
Binoy

Binoy Jayan (10):
  IB/core: iwpm_nlmsg_request: Replace semaphore with completion
  IB/core: Replace semaphore sm_sem with an atomic wait
  IB/hns: Replace semaphore poll_sem with mutex
  IB/mthca: Replace semaphore poll_sem with mutex
  IB/isert: Replace semaphore sem with completion
  IB/hns: Replace counting semaphore event_sem with wait_event
  IB/mthca: Replace counting semaphore event_sem with wait_event
  IB/mlx5: Add helper mlx5_ib_post_send_wait
  IB/mlx5: Replace semaphore umr_common:sem with wait_event
  IB/mlx5: Simplify completion into a wait_event

 drivers/infiniband/core/iwpm_msg.c          |   8 +-
 drivers/infiniband/core/iwpm_util.c         |   7 +-
 drivers/infiniband/core/iwpm_util.h         |   3 +-
 drivers/infiniband/core/user_mad.c          |  20 +++--
 drivers/infiniband/hw/hns/hns_roce_cmd.c    |  57 ++++++++-----
 drivers/infiniband/hw/hns/hns_roce_device.h |   5 +-
 drivers/infiniband/hw/mlx5/main.c           |   6 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h        |   9 +-
 drivers/infiniband/hw/mlx5/mr.c             | 124 +++++++++-------------------
 drivers/infiniband/hw/mthca/mthca_cmd.c     |  57 ++++++++-----
 drivers/infiniband/hw/mthca/mthca_cmd.h     |   1 +
 drivers/infiniband/hw/mthca/mthca_dev.h     |   5 +-
 drivers/infiniband/ulp/isert/ib_isert.c     |   6 +-
 drivers/infiniband/ulp/isert/ib_isert.h     |   3 +-
 include/rdma/ib_verbs.h                     |   1 +
 15 files changed, 153 insertions(+), 159 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply

* Re: A question regarding "multiple SGL"
From: Qiuxin (robert) @ 2016-10-27  6:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bart Van Assche, Jens Axboe, linux-block@vger.kernel.org,
	James Bottomley, Martin K. Petersen, Mike Snitzer,
	linux-rdma@vger.kernel.org, Ming Lei,
	linux-nvme@lists.infradead.org, Keith Busch, Doug Ledford,
	linux-scsi@vger.kernel.org, Laurence Oberman, Tiger zhao
In-Reply-To: <20161027064115.GA5864@lst.de>

Hi Christoph,

Thanks, got it.

Could you please do me a favor and let me know the background for why we ONLY support "MSDBD == 1"? I am NOT trying to resist or oppose anything; I just want to know the reason. It seems a little weird to me, as "MSDBD == 1" does not cover all the use cases that are described in the spec.

Best,
Robert Qiuxin
________________________________________
Robert Qiuxin
Huawei Technologies Co., Ltd.
Phone: +86-755-28420357
Mobile: +86 15986638429
Email: qiuxin@huawei.com
Huawei Technologies Co., Ltd.
Bantian, Longgang District, Shenzhen 518129, P.R.China
http://www.huawei.com 
________________________________________
This e-mail and its attachments contain confidential information from HUAWEI, which 
is intended only for the person or entity whose address is listed above. Any use of the 
information contained herein in any way (including, but not limited to, total or partial 
disclosure, reproduction, or dissemination) by persons other than the intended 
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by 
phone or email immediately and delete it!
-----Original Message-----
From: Christoph Hellwig [mailto:hch@lst.de] 
Sent: October 27, 2016 14:41
To: 鑫愿
Cc: Bart Van Assche; Jens Axboe; linux-block@vger.kernel.org; James Bottomley; Martin K. Petersen; Mike Snitzer; linux-rdma@vger.kernel.org; Ming Lei; linux-nvme@lists.infradead.org; Keith Busch; Doug Ledford; linux-scsi@vger.kernel.org; Laurence Oberman; Christoph Hellwig; Tiger zhao; Qiuxin (robert)
Subject: Re: A question regarding "multiple SGL"

Hi Robert,

There is no feature called "Multiple SGL in one NVMe capsule".  The NVMe over Fabrics specification allows a controller to advertise how many SGL descriptors it supports using the MSDBD Identify field:

"Maximum SGL Data Block Descriptors (MSDBD): This field indicates the maximum number of (Keyed) SGL Data Block descriptors that a host is allowed to place in a capsule. A value of 0h indicates no limit."

Setting this value to 1 is perfectly valid.  Similarly a host is free to choose any number of SGL descriptors between 0 (only for commands that don't transfer data) and the limit imposed by the controller using the MSDBD field.

There are no plans to support a MSDBD value larger than 1 in the Linux NVMe target, and there are no plans to ever submit commands with multiple SGLs from the host driver either.

Cheers,
	Christoph

^ permalink raw reply

* Re: A question regarding "multiple SGL"
From: Christoph Hellwig @ 2016-10-27  6:41 UTC (permalink / raw)
  To: 鑫愿
  Cc: Bart Van Assche, Jens Axboe,
	linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	James Bottomley, Martin K. Petersen, Mike Snitzer,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Ming Lei,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Keith Busch, Doug Ledford,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Laurence Oberman, Christoph Hellwig, tiger.zhao, qiuxin
In-Reply-To: <20161027005230.9904DC00097-2RFepEojUI2gQzYKMK1YzK/p1tWXv8elb9TvmfFkwKk@public.gmane.org>

Hi Robert,

There is no feature called "Multiple SGL in one NVMe capsule".  The
NVMe over Fabrics specification allows a controller to advertise how
many SGL descriptors it supports using the MSDBD Identify field:

"Maximum SGL Data Block Descriptors (MSDBD): This field indicates the
maximum number of (Keyed) SGL Data Block descriptors that a host is allowed to
place in a capsule. A value of 0h indicates no limit."

Setting this value to 1 is perfectly valid.  Similarly a host is free
to choose any number of SGL descriptors between 0 (only for commands that
don't transfer data) and the limit imposed by the controller using the
MSDBD field.

There are no plans to support a MSDBD value larger than 1 in the Linux
NVMe target, and there are no plans to ever submit commands with multiple
SGLs from the host driver either.

Cheers,
	Christoph
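
The MSDBD rule quoted above can be sketched in C as follows. This is a
minimal illustration only, not code from the Linux NVMe host or target:
the names sgl_desc_count_allowed() and sgl_check_examples() are
hypothetical, and MSDBD is assumed to be an 8-bit value read from the
controller's Identify data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: may a host place 'ndesc' SGL data block
 * descriptors in one capsule, given the controller's MSDBD value?
 */
bool sgl_desc_count_allowed(uint8_t msdbd, unsigned int ndesc)
{
	if (msdbd == 0)
		return true;	/* 0h means the controller sets no limit */

	/* 0 descriptors is fine for commands that transfer no data */
	return ndesc <= msdbd;
}

/* Usage example for a controller that advertises MSDBD == 1 */
void sgl_check_examples(void)
{
	assert(sgl_desc_count_allowed(1, 0));	/* no data transfer */
	assert(sgl_desc_count_allowed(1, 1));	/* single descriptor */
	assert(!sgl_desc_count_allowed(1, 2));	/* over the limit */
}
```

With MSDBD == 1, a capsule carrying two descriptors fails this check,
while zero or one descriptor passes.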

^ permalink raw reply

* Re: [PATCH v2 8/8] IB/mlx5: Add helper mlx5_ib_post_send_wait
From: Binoy Jayan @ 2016-10-27  6:23 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, Arnd Bergmann,
	linux-rdma, Linux kernel mailing list
In-Reply-To: <20161027060546.GA3617@leon.nu>

On 27 October 2016 at 11:35, Leon Romanovsky <leon@kernel.org> wrote:
> On Tue, Oct 25, 2016 at 06:46:58PM +0530, Binoy Jayan wrote:
>> On 25 October 2016 at 17:56, Leon Romanovsky <leon@kernel.org> wrote:
>> > On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
>>
>> > In case of success (err == 0), you will jump to unmap_dma instead of
>> > following the normal flow.
>> >
>> > NAK,
>> > Leon Romanovsky <leonro@mellanox.com>
>>
>> Hi Leon,
>>
>> Even in the original code, the regular flow seems to reach 'unmap_dma' after
>> returning from 'wait_for_completion'().
>
> In the original flow, the code executed the assignments to mr->mmkey. In
> your code, they are skipped.
>

Yes, you are right, I just noticed it. My bad. I've changed it now.

Thanks,
Binoy

^ permalink raw reply

* Re: [PATCH v3 8/9] IB/mlx5: Add helper mlx5_ib_post_send_wait
From: Leon Romanovsky @ 2016-10-27  6:09 UTC (permalink / raw)
  To: Binoy Jayan
  Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, Arnd Bergmann,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1477468874-16328-9-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 3204 bytes --]

On Wed, Oct 26, 2016 at 01:31:13PM +0530, Binoy Jayan wrote:
> Clean up the following common code (to post a list of work requests to the
> send queue of the specified QP) at various places and add a helper function
> 'mlx5_ib_post_send_wait' to implement the same.
>
>  - Initialize 'mlx5_ib_umr_context' on stack
>  - Point mlx5_umr_wr.wr.wr_cqe at umr_context.cqe
>  - Acquire the semaphore
>  - call ib_post_send with a single ib_send_wr
>  - wait_for_completion()
>  - Check for umr_context.status
>  - Release the semaphore
>
> As semaphores are going away in the future, moving all of these into the
> shared helper leaves only a single function using the semaphore, which
> can then be rewritten to use something else.
>
> Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 115 +++++++++++-----------------------------
>  1 file changed, 32 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index d4ad672..19c292a 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -856,16 +856,40 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
>  	init_completion(&context->done);
>  }
>
> +static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
> +					 struct mlx5_umr_wr *umrwr)
> +{
> +	struct umr_common *umrc = &dev->umrc;
> +	struct ib_send_wr *bad;
> +	int err;
> +	struct mlx5_ib_umr_context umr_context;
> +
> +	mlx5_ib_init_umr_context(&umr_context);
> +	umrwr->wr.wr_cqe = &umr_context.cqe;
> +
> +	down(&umrc->sem);
> +	err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
> +	if (err) {
> +		mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
> +	} else {
> +		wait_for_completion(&umr_context.done);
> +		if (umr_context.status != IB_WC_SUCCESS) {
> +			mlx5_ib_warn(dev, "reg umr failed (%u)\n",
> +				     umr_context.status);
> +			err = -EFAULT;
> +		}
> +	}
> +	up(&umrc->sem);
> +	return err;
> +}
> +
>  static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
>  				  u64 virt_addr, u64 len, int npages,
>  				  int page_shift, int order, int access_flags)
>  {
>  	struct mlx5_ib_dev *dev = to_mdev(pd->device);
>  	struct device *ddev = dev->ib_dev.dma_device;
> -	struct umr_common *umrc = &dev->umrc;
> -	struct mlx5_ib_umr_context umr_context;
>  	struct mlx5_umr_wr umrwr = {};
> -	struct ib_send_wr *bad;
>  	struct mlx5_ib_mr *mr;
>  	struct ib_sge sg;
>  	int size;
> @@ -894,24 +918,12 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
>  	if (err)
>  		goto free_mr;
>
> -	mlx5_ib_init_umr_context(&umr_context);
> -
> -	umrwr.wr.wr_cqe = &umr_context.cqe;
>  	prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
>  			 page_shift, virt_addr, len, access_flags);
>
> -	down(&umrc->sem);
> -	err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
> -	if (err) {
> -		mlx5_ib_warn(dev, "post send failed, err %d\n", err);
> +	err = mlx5_ib_post_send_wait(dev, &umrwr);
> +	if (err != -EFAULT)
>  		goto unmap_dma;

NAK,
You are breaking the driver.

It should be "if (err && err != -EFAULT)"

Thanks
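
The difference between the two checks can be sketched as a small,
self-contained C model. It is a simplification under stated
assumptions, not driver code: the function names are hypothetical, and
the only return values considered are 0 (success), -EFAULT (the UMR
completed with a bad status) and any other negative errno
(ib_post_send() itself failed).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Check as posted in the patch: true means "goto unmap_dma" is taken. */
bool jumps_to_unmap_as_posted(int err)
{
	assert(err <= 0);	/* model only 0 or a negative errno */
	return err != -EFAULT;
}

/* Check suggested in the review: jump only when the post itself failed. */
bool jumps_to_unmap_as_suggested(int err)
{
	assert(err <= 0);
	return err && err != -EFAULT;
}

/* Usage example: compare both checks on the three modeled outcomes. */
void umr_check_examples(void)
{
	/* Success: the posted check skips the normal flow, the fix does not. */
	assert(jumps_to_unmap_as_posted(0));
	assert(!jumps_to_unmap_as_suggested(0));

	/* Post failure: both checks jump to the cleanup label. */
	assert(jumps_to_unmap_as_posted(-ENOMEM));
	assert(jumps_to_unmap_as_suggested(-ENOMEM));

	/* Bad completion status: neither check jumps. */
	assert(!jumps_to_unmap_as_posted(-EFAULT));
	assert(!jumps_to_unmap_as_suggested(-EFAULT));
}
```

The two checks differ only for err == 0, which is the success case in
which the posted patch skips the normal flow.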


^ permalink raw reply

