From mboxrd@z Thu Jan 1 00:00:00 1970 From: Sagi Grimberg Subject: Re: [PATCH 9/9] IB/srp: Add fast registration support Date: Wed, 07 May 2014 20:19:43 +0300 Message-ID: <536A6B2F.70807@dev.mellanox.co.il> References: <5368DA5B.80609@acm.org> <5368DC09.70608@acm.org> <536A1A3A.9090208@dev.mellanox.co.il> <536A4A68.4080605@acm.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <536A4A68.4080605-HInyCGIudOg@public.gmane.org> Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Bart Van Assche , Roland Dreier Cc: Sagi Grimberg , Vu Pham , David Dillow , Sebastian Parschauer , linux-rdma List-Id: linux-rdma@vger.kernel.org On 5/7/2014 5:59 PM, Bart Van Assche wrote: > On 05/07/14 13:34, Sagi Grimberg wrote: >>> +module_param(prefer_fr, bool, 0444); >>> +MODULE_PARM_DESC(prefer_fr, >>> + "Whether to use FR if both FMR and FR are supported"); >> Will it be better to make it a per-target attribute? > Hello Sagi, > > The only reason I introduced this kernel module parameter was to allow > me to test fast registration support on HCA's that support both FMR and > FR. I'm not sure it is useful to end users to configure this parameter. > Hence my preference to leave it global instead of converting it into a > per-target attribute. If you add a modparam, the user is allowed to use it. AFAICT a user that will ask prefer_fr on station with device that supports FR and a device that doesn't (don't know if even exists). per-target the conditions in add_one are correct per-device. >>> +module_param(register_always, bool, 0444); >>> +MODULE_PARM_DESC(register_always, >>> + "Use memory registration even for contiguous memory regions"); >>> + >> This is a separate patch not related to fast registration support, I >> suggest to post it as patch #10 > OK, I will split this out into a separate patch. > >>> +static struct srp_fr_pool *srp_alloc_fr_pool(struct srp_target_port >>> *target) >>> +{ >>> + struct srp_device *dev = target->srp_host->srp_dev; >>> + struct srp_fr_pool *pool; >>> + int max_pages_per_mr; >>> + >>> + for (max_pages_per_mr = SRP_MAX_PAGES_PER_MR; >>> + max_pages_per_mr >= SRP_MIN_PAGES_PER_MR; >>> + max_pages_per_mr /= 2) { >> Isn't there some device capability for that? maybe max_mr_size? >> Also for smaller max_pages you won't be able to handle 512 pages in a >> single cmnd (you would need 2 FRMRs). >> At least a warning print? > I will look into using max_mr_size here. Using multiple registrations > per command should be fine in the context of the SRP protocol as long as > cmd_sg_entries has not been set to a very small value. > >>> + pool = srp_create_fr_pool(dev->dev, dev->pd, >>> + SRP_MDESC_PER_POOL, max_pages_per_mr); >> 1024 FRMRs per connection?! we use 1024 FMRs for all connections >> (per-device). I'd say that's a major over-allocation. > It depends on how many discontiguous I/O requests are submitted > concurrently. Anyway, how about limiting the number of memory regions to > the queue size ? Perhaps, but we will need to reserve some more for discontinuous IOs. It is heuristic, for FS in most cases IO will align nicely, for some crazy DB applications - I'm not sure. > >>> - init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; >>> + init_attr->sq_sig_type = IB_SIGNAL_REQ_WR; >> Will it be better to do this in a preparation patch? (not so important >> to me) > This patch is the first patch for the SRP initiator in which work > requests are queued for which the IB_SEND_SIGNALED flag is not set. That > is why I had included the above change in this patch. Agreed P.S, can we consider also skip scsi_cmnd SEND completions? Currently SRP asks for completion but never really arm the CQ (only for the first time). >>> @@ -898,20 +1112,25 @@ static int srp_rport_reconnect(struct srp_rport >>> *rport) >>> * callbacks will have finished before a new QP is allocated. >>> */ >>> ret = srp_new_cm_id(target); >>> + >>> + for (i = 0; i < target->req_ring_size; ++i) { >>> + struct srp_request *req = &target->req_ring[i]; >>> + srp_finish_req(target, req, NULL, DID_RESET << 16); >>> + } >> Can you explain why this moved here? I'm not sure I understand. > In fast registration mode calling srp_finish_req() can trigger rkey > invalidation. Trying to invalidate an rkey after the queue pair has been > destroyed and recreated would be wrong since that would cause the new > queue pair to transition into the error state. That is why this loop has > been moved up. Moving the srp_finish_req() loop up is safe since > srp_claim_req() prevents that any work completions that are received > after srp_finish_req() has finished cause any harm. Got it. > >>> +static int srp_map_finish_fr(struct srp_map_state *state, >>> + struct srp_target_port *target) >>> +{ >>> + struct srp_device *dev = target->srp_host->srp_dev; >>> + struct ib_send_wr *bad_wr; >>> + struct ib_send_wr wr; >>> + struct srp_fr_desc *desc; >>> + u32 rkey; >>> + >>> + desc = srp_fr_pool_get(target->fr_pool); >>> + if (!desc) >>> + return -ENOMEM; >>> + >>> + rkey = ib_inc_rkey(desc->mr->rkey); >>> + ib_update_fast_reg_key(desc->mr, rkey); >> Usually the inc_rkey is done right after the invalidation (but that's >> not really an issue). >> Maybe it is better here in case the target did remote invalidate (not >> supported at the moment but maybe in the future?) > Indeed - if remote invalidation would be added later on the code for > incrementing the rkey would have to be moved anyway. > >>> @@ -2910,7 +3213,8 @@ static void srp_add_one(struct ib_device *device) >>> if (IS_ERR(srp_dev->mr)) >>> goto err_pd; >>> - srp_alloc_fmr_pool(srp_dev); >>> + if (!srp_dev->use_fast_reg) >>> + srp_alloc_fmr_pool(srp_dev); >> So I ask here, are we OK with this asymmetry? FMRs are per-device and >> FRMRs are per-target? >> I have a feeling that converting FMRs to per-connection might simplify >> things, what do you think? > Making the FMR pools per-device instead of per-target would cause the > FMR and FR pool allocation and deallocation code to appear in the same > functions but unfortunately wouldn't make this patch shorter. My goal > with this patch was to introduce fast registration support without > changing the FMR code too much. I understand, my concern is that by this asymmetric design we may get different behaviors for a specific workload. For example revisiting the configurable queue_size work in SRP, I pointed that for a large number of connection with long queues all stressing out the system, the global FMR pool may not suffice. This may appear to the user by a periodic IO stalls, This will not happen for a per-target FRMR pools (at least less likely to happen, or at the very least will happen under different conditions). Sagi. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html