From mboxrd@z Thu Jan 1 00:00:00 1970 From: Michael Neuling Subject: Re: [PATCH v4 2/3] cxlflash: Superpipe support Date: Wed, 12 Aug 2015 13:54:35 +1000 Message-ID: <1439351675.28873.57.camel@neuling.org> References: <1439226588-7886-1-git-send-email-mrochs@linux.vnet.ibm.com> <1439270596.5081.58.camel@neuling.org> <164CACE9-6195-4F1C-AA23-779FFF7D9DF8@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: Received: from ozlabs.org ([103.22.144.67]:34055 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932265AbbHLDyi convert rfc822-to-8bit (ORCPT ); Tue, 11 Aug 2015 23:54:38 -0400 In-Reply-To: <164CACE9-6195-4F1C-AA23-779FFF7D9DF8@linux.vnet.ibm.com> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: "Matthew R. Ochs" Cc: linux-scsi@vger.kernel.org, James.Bottomley@HansenPartnership.com, nab@linux-iscsi.org, brking@linux.vnet.ibm.com, wenxiong@linux.vnet.ibm.com, hch@infradead.org, imunsie@au1.ibm.com, dja@ozlabs.au.ibm.com, "Manoj N. Kumar" > >> + pr_debug("%s: Wait for user context to quiesce...\n", __func__)= ; > >> + wake_up_all(&cfg->limbo_waitq); > >> + ssleep(1); > >=20 > > Why 1 sec and why in a loop? Can=E2=80=99t you poll/wait for compl= etion somewhere? >=20 > This routine is called when the device is being removed and we need t= o stall until > the user contexts are made aware of removal [by marking them in error= ] and have > completely gone away [there are no longer any contexts in our table o= r list]. >=20 > The 1 second sleep is simply to give the users time to cleanup once t= hey see > the notification. We can make it a larger/smaller value or remove it = entirely, but > I felt that since this is not a hot path there was no reason to perfo= rm a busy-wait > style of loop here and yield to someone else. >=20 > As for the completion/poll, I did consider that but couldn=E2=80=99t = come up with a clean > way of implementing given how we designed the notification/cleanup me= chanism > (really just a failed recovery). We can certainly look at doing that = as an > enhancement in the future. Does the API allow this flexibility in the future? >=20 > >> + if (likely(ctxid < MAX_CONTEXT)) { > >> +retry: > >> + rc =3D mutex_lock_interruptible(&cfg->ctx_tbl_list_mutex); > >> + if (rc) > >> + goto out; > >> + > >> + ctxi =3D cfg->ctx_tbl[ctxid]; > >> + if (ctxi) > >> + if ((file && (ctxi->file !=3D file)) || > >> + (!file && (ctxi->ctxid !=3D rctxid))) > >> + ctxi =3D NULL; > >> + > >> + if ((ctx_ctrl & CTX_CTRL_ERR) || > >> + (!ctxi && (ctx_ctrl & CTX_CTRL_ERR_FALLBACK))) > >> + ctxi =3D find_error_context(cfg, rctxid, file); > >> + if (!ctxi) { > >> + mutex_unlock(&cfg->ctx_tbl_list_mutex); > >> + goto out; > >> + } > >> + > >> + /* > >> + * Need to acquire ownership of the context while still under > >> + * the table/list lock to serialize with a remove thread. Use > >> + * the 'try' to avoid stalling the table/list lock for a single > >> + * context. > >> + */ > >> + rc =3D mutex_trylock(&ctxi->mutex); > >> + mutex_unlock(&cfg->ctx_tbl_list_mutex); > >> + if (!rc) > >> + goto retry; > >=20 > > Please just create a loop rather than this goto retry. >=20 > Okay. >=20 > >> + rhte_checkin(ctxi, rhte); > >> + cxlflash_lun_detach(gli); > >> + > >> +out: > >> + if (unlock_ctx) > >> + mutex_unlock(&ctxi->mutex); > >=20 > > Where is the matching lock for this? >=20 > One of the main design points of our context serialization strategy i= s that > any context returned from get_context() has been fully validated, wil= l not > be removed from under you, and _is holding the context mutex_. Thus > for each of these mutex_unlock(ctxi) you see at the bottom of externa= l > entry points, the mutex was obtained in get_context(). Should we have something like put_context() that does this? So there i= s matching calls to get/put_context >=20 > >> + char *tmp =3D NULL; > >> + size_t size; > >> + struct afu *afu =3D cfg->afu; > >> + struct ctx_info *ctxi =3D NULL; > >> + struct sisl_rht_entry *rhte; > >> + > >> + size =3D (MAX_RHT_PER_CONTEXT * sizeof(*ctxi->rht_lun)); > >> + size +=3D sizeof(*ctxi); > >> + > >> + tmp =3D kzalloc(size, GFP_KERNEL); > >=20 > > Just do two allocs. One for ctxi and one for rht_lun. This is over= ly > > complicated. >=20 > I disagree that it=E2=80=99s overly complicated. The intention here w= as to get this > stuff together to avoid multiple function calls and possibly help out= perf-wise > via locality. That said, I=E2=80=99m not opposed to performing multip= le allocations. I'm not sure I buy it's a performance issue on create_context(). How often are you calling that? Doesn't that do a lot of things other than just this? Anyway if you want to do that, that's ok I guess, but you need to document why and what you're doing.=20 > Look for this in v5. >=20 > >=20 > >> + if (unlikely(!tmp)) { > >> + pr_err("%s: Unable to allocate context! (%ld)\n", > >> + __func__, size); > >> + goto out; > >> + } > >> + > >> + rhte =3D (struct sisl_rht_entry *)get_zeroed_page(GFP_KERNEL); > >> + if (unlikely(!rhte)) { > >> + pr_err("%s: Unable to allocate RHT!\n", __func__); > >> + goto err; > >> + } > >> + > >> + ctxi =3D (struct ctx_info *)tmp; > >> + tmp +=3D sizeof(*ctxi); > >> + ctxi->rht_lun =3D (struct llun_info **)tmp; > >=20 > > Yuck=E2=80=A6 just do two allocs rather than this throbbing. >=20 > You got it. >=20 > >> +out: > >> + if (unlock_ctx) > >> + mutex_unlock(&ctxi->mutex); > >=20 > > Where is the matching lock for this? >=20 > See earlier comment about get_context(). >=20 > >> + if (likely(ctxi)) > >> + mutex_unlock(&ctxi->mutex); > >=20 > > Where is the matching lock for this? >=20 > Ditto. >=20 > >> + file =3D cxl_get_fd(ctx, &cfg->cxl_fops, &fd); > >=20 > > You should create a new fops for each call here. We write the fops= to fill it > > out. I think it=E2=80=99ll work as you have now but it's a bit dod= gy. >=20 > Are you saying that instead of having a single fops for the device, e= ach of our > context=E2=80=99s should have an fops that we pass here? >=20 > >> + list_add(&lun_access->list, &ctxi->luns); > >> + mutex_unlock(&ctxi->mutex); > >=20 > > Where is the matching lock for this? >=20 > Just like with get_context(), we leave create_context() with the mute= x held. >=20 > >> + rc =3D cxlflash_lun_attach(gli, MODE_PHYSICAL); > >> + if (unlikely(rc)) { > >> + dev_err(dev, "%s: Failed to attach to LUN! (PHYSICAL)\n", > >=20 > > Is this going to spam the console from userspace? Same below. >=20 > With a well-behaved application it shouldn=E2=80=99t We can certainly= look at moving > to a debug if you feel it=E2=80=99s likely that someone would use thi= s to spam the > console. >=20 > >> + struct cxlflash_cfg *cfg =3D (struct cxlflash_cfg *)sdev->host->= hostdata; > >> + struct afu *afu =3D cfg->afu; > >> + struct dk_cxlflash_hdr *hdr; > >> + char buf[MAX_CXLFLASH_IOCTL_SZ]; > >=20 > > why is buf not just a =E2=80=9Cunion cxlflash_ioctls"? >=20 > Because I wanted you to have to look up this define? ;) :-P > In all seriousness, we originally had this define as a stand-alone va= lue (it > wasn=E2=80=99t tied to the union). When I implemented the union, I fi= gured the define > was more self-descriptive in what we were trying to convey. I=E2=80=99= ll change it > over to the union in v5. >=20 > >> + hdr =3D (struct dk_cxlflash_hdr *)&buf; > >> + if (hdr->version !=3D 0) { > >> + pr_err("%s: Version %u not supported for %s\n", > >> + __func__, hdr->version, decode_ioctl(cmd)); > >> + rc =3D -EINVAL; > >> + goto cxlflash_ioctl_exit; > >> + } > >=20 > > Do you advertise this version anywhere? Users just have to call it= and fail? >=20 > We don=E2=80=99t. We can add a version define to the exported ioctl h= eader. It needs to be exported dynamically by the kernel. Not the headers. Think new software on old kernel and visa versa. >=20 > > You should check hdr->flags are zero incase some new userspace trie= s to set > > them. Same for hdr->rsvd. >=20 > We can=E2=80=99t do that for the flags because those they are used fo= r some of the ioctls. > The reserved=E2=80=99s however _can_ be checked under the version 0 c= lause. OK > > Also, can you do these checks earlier. It seems you've already don= e a bunch of > > stuff before here. >=20 > From a runtime perspective, we haven=E2=80=99t actually done that muc= h prior to the version > check. I suppose we could put the check earlier but I don=E2=80=99t l= ike the idea of touching > data that hasn=E2=80=99t been copied in. So we would need to copy in = just the header, then > check the version. Then if it=E2=80=99s valid, copy in the rest. At s= ome point later on if/when more > versions are supported we might need to do something like this, but I= think it seems a > bit silly to do that now. Ok. > >> +#define DK_CXLFLASH_ATTACH CXL_IOWR(0x80, dk_cxlflash_attach) > >> +#define DK_CXLFLASH_USER_DIRECT CXL_IOWR(0x81, dk_cxlflash_udire= ct) > >> +#define DK_CXLFLASH_RELEASE CXL_IOWR(0x84, dk_cxlflash_release) > >> +#define DK_CXLFLASH_DETACH CXL_IOWR(0x85, dk_cxlflash_detach) > >> +#define DK_CXLFLASH_VERIFY CXL_IOWR(0x86, dk_cxlflash_verify) > >> +#define DK_CXLFLASH_RECOVER_AFU CXL_IOWR(0x88, dk_cxlflash_recov= er_afu) > >> +#define DK_CXLFLASH_MANAGE_LUN CXL_IOWR(0x89, dk_cxlflash_manage= _lun) > >=20 > > I'm not sure I'd leave these sparse. What happens if the vlun patc= hes don't > > get in? >=20 > I do agree with you. I had wanted to keep them like this as their ord= ering matched > closer with how they are expected to be used. That said, I=E2=80=99m = okay with moving > them around to avoid the sparseness. YEah, plus it breaks your table in the ioctl Mikey -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" i= n the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html