From: Naresh Kumar Inna <naresh@chelsio.com>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: "JBottomley@parallels.com" <JBottomley@parallels.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
Dimitrios Michailidis <dm@chelsio.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Chethan Seshadri <chethan@chelsio.com>
Subject: Re: [PATCH 5/8] csiostor: Chelsio FCoE offload driver submission (sources part 5).
Date: Sun, 26 Aug 2012 00:06:07 +0530 [thread overview]
Message-ID: <50391B17.2040505@chelsio.com> (raw)
In-Reply-To: <1345841784.28432.44.camel@haakon2.linux-iscsi.org>
On 8/25/2012 2:26 AM, Nicholas A. Bellinger wrote:
> On Fri, 2012-08-24 at 23:10 +0530, Naresh Kumar Inna wrote:
>> On 8/24/2012 1:18 AM, Nicholas A. Bellinger wrote:
>>> On Fri, 2012-08-24 at 03:57 +0530, Naresh Kumar Inna wrote:
>>>> This patch contains code to implement the interrupt handling and the fast
>>>> path I/O functionality. The interrupt handling includes allocation of
>>>> MSIX vectors, registering and implemeting the interrupt service routines.
>>>> The fast path I/O functionality includes posting the I/O request to firmware
>>>> via Work Requests, tracking/completing them, and handling task management
>>>> requests. SCSI midlayer host template implementation is also covered by
>>>> this patch.
>>>>
>>>> Signed-off-by: Naresh Kumar Inna <naresh@chelsio.com>
>>>> ---
>>>
>>> Hi Naresh,
>>>
>>> My review comments are inline below..
>>
>> Hi Nicholas,
>>
>> Thanks for taking the time to review the driver. Please find my replies
>> inline.
>>
>> Regards,
>> Naresh.
>>
>>>
>>>> drivers/scsi/csiostor/csio_isr.c | 631 ++++++++++
>>>> drivers/scsi/csiostor/csio_scsi.c | 2498 +++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 3129 insertions(+), 0 deletions(-)
>>>> create mode 100644 drivers/scsi/csiostor/csio_isr.c
>>>> create mode 100644 drivers/scsi/csiostor/csio_scsi.c
>>>>
>
> <SNIP>
>
>>>
>>>> diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c
>>>> new file mode 100644
>>>> index 0000000..0f87b00
>>>> --- /dev/null
>>>> +++ b/drivers/scsi/csiostor/csio_scsi.c
>>>> +
>>>> +#define CSIO_SCSI_CMD_WR_SZ(_imm) \
>>>> + (sizeof(struct fw_scsi_cmd_wr) + /* WR size */ \
>>>> + ALIGN((_imm), 16)) /* Immed data */
>>>> +
>>>> +#define CSIO_SCSI_CMD_WR_SZ_16(_imm) \
>>>> + (ALIGN(CSIO_SCSI_CMD_WR_SZ((_imm)), 16))
>>>> +
>>>> +/*
>>>> + * csio_scsi_cmd - Create a SCSI CMD WR.
>>>> + * @req: IO req structure.
>>>> + *
>>>> + * Gets a WR slot in the ingress queue and initializes it with SCSI CMD WR.
>>>> + *
>>>> + */
>>>> +static inline void
>>>> +csio_scsi_cmd(struct csio_ioreq *req)
>>>> +{
>>>> + struct csio_wr_pair wrp;
>>>> + struct csio_hw *hw = req->lnode->hwp;
>>>> + struct csio_scsim *scsim = csio_hw_to_scsim(hw);
>>>> + uint32_t size = CSIO_SCSI_CMD_WR_SZ_16(scsim->proto_cmd_len);
>>>> +
>>>> + req->drv_status = csio_wr_get(hw, req->eq_idx, size, &wrp);
>>>> + if (unlikely(req->drv_status != CSIO_SUCCESS))
>>>> + return;
>>>> +
>>>> + if (wrp.size1 >= size) {
>>>> + /* Initialize WR in one shot */
>>>> + csio_scsi_init_cmd_wr(req, wrp.addr1, size);
>>>> + } else {
>>>> + uint8_t tmpwr[512];
>>>
>>> Mmmm, putting this large of a buffer on the local stack is probably not
>>> a good idea.
>>>
>>> This should become an allocation.. If it's a hot path then you'll
>>> probably want to set this up before-hand.
>>>
>>
>> The else switch above is entered only when we near the end of the DMA
>> ring. This is not a frequent occurrence. If it is a strict no-no to have
>> so many on-stack bytes, I have to think of a way re-work it to use
>> pre-allocated memory. Please let me know.
>>
>
> Considering it's used a number of times, it would be better to just
> figure out a sensible manner to pre-allocate this, especially if it's
> only a single occurrence to an individual ring held under a lock.
>
I will see what I can do - I can probably add these bytes into every
queues metadata structure.
>>>> + /*
>>>> + * Make a temporary copy of the WR and write back
>>>> + * the copy into the WR pair.
>>>> + */
>>>> + csio_scsi_init_cmd_wr(req, (void *)tmpwr, size);
>>>> + memcpy(wrp.addr1, tmpwr, wrp.size1);
>>>> + memcpy(wrp.addr2, tmpwr + wrp.size1, size - wrp.size1);
>>>> + }
>>>> +}
>>>> +
>>>> +/*
>>>> + * The following is fast path code. Therefore it is inlined with multi-line
>>>> + * macros using name substitution, thus avoiding if-else switches for
>>>> + * operation (read/write), as well as serving the purpose of code re-use.
>>>> + */
>>>> +/*
>>>> + * csio_scsi_init_ulptx_dsgl - Fill in a ULP_TX_SC_DSGL
>>>> + * @hw: HW module
>>>> + * @req: IO request
>>>> + * @sgl: ULP TX SGL pointer.
>>>> + *
>>>> + */
>>>> +#define csio_scsi_init_ultptx_dsgl(hw, req, sgl) \
>>>> +do { \
>>>> + struct ulptx_sge_pair *_sge_pair = NULL; \
>>>> + struct scatterlist *_sgel; \
>>>> + uint32_t _i = 0; \
>>>> + uint32_t _xfer_len; \
>>>> + struct list_head *_tmp; \
>>>> + struct csio_dma_buf *_dma_buf; \
>>>> + struct scsi_cmnd *scmnd = csio_scsi_cmnd((req)); \
>>>> + \
>>>> + (sgl)->cmd_nsge = htonl(ULPTX_CMD(ULP_TX_SC_DSGL) | ULPTX_MORE | \
>>>> + ULPTX_NSGE((req)->nsge)); \
>>>> + /* Now add the data SGLs */ \
>>>> + if (likely(!(req)->dcopy)) { \
>>>> + scsi_for_each_sg(scmnd, _sgel, (req)->nsge, _i) { \
>>>> + if (_i == 0) { \
>>>> + (sgl)->addr0 = cpu_to_be64( \
>>>> + sg_dma_address(_sgel)); \
>>>> + (sgl)->len0 = cpu_to_be32( \
>>>> + sg_dma_len(_sgel)); \
>>>> + _sge_pair = \
>>>> + (struct ulptx_sge_pair *)((sgl) + 1); \
>>>> + continue; \
>>>> + } \
>>>> + if ((_i - 1) & 0x1) { \
>>>> + _sge_pair->addr[1] = cpu_to_be64( \
>>>> + sg_dma_address(_sgel)); \
>>>> + _sge_pair->len[1] = cpu_to_be32( \
>>>> + sg_dma_len(_sgel)); \
>>>> + _sge_pair++; \
>>>> + } else { \
>>>> + _sge_pair->addr[0] = cpu_to_be64( \
>>>> + sg_dma_address(_sgel)); \
>>>> + _sge_pair->len[0] = cpu_to_be32( \
>>>> + sg_dma_len(_sgel)); \
>>>> + } \
>>>> + } \
>>>> + } else { \
>>>> + /* Program sg elements with driver's DDP buffer */ \
>>>> + _xfer_len = scsi_bufflen(scmnd); \
>>>> + list_for_each(_tmp, &(req)->gen_list) { \
>>>> + _dma_buf = (struct csio_dma_buf *)_tmp; \
>>>> + if (_i == 0) { \
>>>> + (sgl)->addr0 = cpu_to_be64(_dma_buf->paddr); \
>>>> + (sgl)->len0 = cpu_to_be32( \
>>>> + min(_xfer_len, _dma_buf->len)); \
>>>> + _sge_pair = \
>>>> + (struct ulptx_sge_pair *)((sgl) + 1); \
>>>> + } \
>>>> + else if ((_i - 1) & 0x1) { \
>>>> + _sge_pair->addr[1] = cpu_to_be64( \
>>>> + _dma_buf->paddr); \
>>>> + _sge_pair->len[1] = cpu_to_be32( \
>>>> + min(_xfer_len, _dma_buf->len)); \
>>>> + _sge_pair++; \
>>>> + } else { \
>>>> + _sge_pair->addr[0] = cpu_to_be64( \
>>>> + _dma_buf->paddr); \
>>>> + _sge_pair->len[0] = cpu_to_be32( \
>>>> + min(_xfer_len, _dma_buf->len)); \
>>>> + } \
>>>> + _xfer_len -= min(_xfer_len, _dma_buf->len); \
>>>> + _i++; \
>>>> + } \
>>>> + } \
>>>> +} while (0)
>>>> +
>>>
>>> I don't see any reason why this can't just be a static function..? Why
>>> is the macro usage necessary here..?
>>
>> OK, I will make this static-inline.
>>
>>>
>>>> +/*
>>>> + * csio_scsi_init_data_wr - Initialize the READ/WRITE SCSI WR.
>>>> + * @req: IO req structure.
>>>> + * @oper: read/write
>>>> + * @wrp: DMA location to place the payload.
>>>> + * @size: Size of WR (including FW WR + immed data + rsp SG entry + data SGL
>>>> + * @wrop: _READ_/_WRITE_
>>>> + *
>>>> + * Wrapper for populating fw_scsi_read_wr/fw_scsi_write_wr.
>>>> + */
>>>> +#define csio_scsi_init_data_wr(req, oper, wrp, size, wrop) \
>>>> +do { \
>>>> + struct csio_hw *_hw = (req)->lnode->hwp; \
>>>> + struct csio_rnode *_rn = (req)->rnode; \
>>>> + struct fw_scsi_##oper##_wr *__wr = (struct fw_scsi_##oper##_wr *)(wrp);\
>>>> + struct ulptx_sgl *_sgl; \
>>>> + struct csio_dma_buf *_dma_buf; \
>>>> + uint8_t _imm = csio_hw_to_scsim(_hw)->proto_cmd_len; \
>>>> + struct scsi_cmnd *scmnd = csio_scsi_cmnd((req)); \
>>>> + \
>>>> + __wr->op_immdlen = cpu_to_be32(FW_WR_OP(FW_SCSI##wrop##WR) | \
>>>> + FW_SCSI##wrop##WR_IMMDLEN(_imm)); \
>>>> + __wr->flowid_len16 = cpu_to_be32(FW_WR_FLOWID(_rn->flowid) | \
>>>> + FW_WR_LEN16( \
>>>> + CSIO_ROUNDUP((size), 16))); \
>>>> + __wr->cookie = (uintptr_t) (req); \
>>>> + __wr->iqid = (uint16_t)cpu_to_be16(csio_q_physiqid(_hw, \
>>>> + (req)->iq_idx));\
>>>> + __wr->tmo_val = (uint8_t)((req)->tmo); \
>>>> + __wr->use_xfer_cnt = 1; \
>>>> + __wr->xfer_cnt = cpu_to_be32(scsi_bufflen(scmnd)); \
>>>> + __wr->ini_xfer_cnt = cpu_to_be32(scsi_bufflen(scmnd)); \
>>>> + /* Get RSP DMA buffer */ \
>>>> + _dma_buf = &(req)->dma_buf; \
>>>> + \
>>>> + /* Prepare RSP SGL */ \
>>>> + __wr->rsp_dmalen = cpu_to_be32(_dma_buf->len); \
>>>> + __wr->rsp_dmaaddr = cpu_to_be64(_dma_buf->paddr); \
>>>> + \
>>>> + __wr->r4 = 0; \
>>>> + \
>>>> + __wr->u.fcoe.ctl_pri = 0; \
>>>> + __wr->u.fcoe.cp_en_class = 0; \
>>>> + __wr->u.fcoe.r3_lo[0] = 0; \
>>>> + __wr->u.fcoe.r3_lo[1] = 0; \
>>>> + csio_scsi_fcp_cmnd((req), (void *)((uintptr_t)(wrp) + \
>>>> + sizeof(struct fw_scsi_##oper##_wr))); \
>>>> + \
>>>> + /* Move WR pointer past command and immediate data */ \
>>>> + _sgl = (struct ulptx_sgl *) ((uintptr_t)(wrp) + \
>>>> + sizeof(struct fw_scsi_##oper##_wr) + \
>>>> + ALIGN(_imm, 16)); \
>>>> + \
>>>> + /* Fill in the DSGL */ \
>>>> + csio_scsi_init_ultptx_dsgl(_hw, (req), _sgl); \
>>>> + \
>>>> +} while (0)
>>>> +
>>>
>>> This one has four uses of CPP keys. Just turn those into macros, and
>>> leave the rest of the code in a static function.
>>>
>>
>> So what you are suggesting is to have all the lines of the macro
>> csio_scsi_init_data_wr() added into a static function, but for the ones
>> with the 4 keys. csio_scsi_init_data_wr() will then invoke this new
>> function. Is that correct?
>>
>
> Not sure how the above should actually look without actually doing it,
> but IMHO the usage of macro just obfuscates what is going on..
>
> If it's only used a few times, just inline the code into seperate static
> functions. If it's used more than a few times, then use a single static
> funciton with macro accessors for the assignment of the various '__wr'
> structure members.
>
> The larger problem with all of these macros is that you can't tell what
> is a macro and what is a function.
>
> If you need to use a CPP macro, please make sure to capitalize the name
> of the macro in order to tell the difference between the two.
>
OK, I will see what I can do to convert this macro into a function.
>>>> +/* Calculate WR size needed for fw_scsi_read_wr/fw_scsi_write_wr */
>>>> +#define csio_scsi_data_wrsz(req, oper, sz, imm) \
>>>> +do { \
>>>> + (sz) = sizeof(struct fw_scsi_##oper##_wr) + /* WR size */ \
>>>> + ALIGN((imm), 16) + /* Immed data */ \
>>>> + sizeof(struct ulptx_sgl); /* ulptx_sgl */ \
>>>> + \
>>>> + if (unlikely((req)->nsge > 1)) \
>>>> + (sz) += (sizeof(struct ulptx_sge_pair) * \
>>>> + (ALIGN(((req)->nsge - 1), 2) / 2)); \
>>>> + /* Data SGE */ \
>>>> +} while (0)
>>>> +
>>>> +/*
>>>> + * csio_scsi_data - Create a SCSI WRITE/READ WR.
>>>> + * @req: IO req structure.
>>>> + * @oper: read/write
>>>> + * @wrop: _READ_/_WRITE_ (string subsitutions to use with the FW bit field
>>>> + * macros).
>>>> + *
>>>> + * Gets a WR slot in the ingress queue and initializes it with
>>>> + * SCSI CMD READ/WRITE WR.
>>>> + *
>>>> + */
>>>> +#define csio_scsi_data(req, oper, wrop) \
>>>> +do { \
>>>> + struct csio_wr_pair _wrp; \
>>>> + uint32_t _size; \
>>>> + struct csio_hw *_hw = (req)->lnode->hwp; \
>>>> + struct csio_scsim *_scsim = csio_hw_to_scsim(_hw); \
>>>> + \
>>>> + csio_scsi_data_wrsz((req), oper, _size, _scsim->proto_cmd_len); \
>>>> + _size = ALIGN(_size, 16); \
>>>> + \
>>>> + (req)->drv_status = csio_wr_get(_hw, (req)->eq_idx, _size, &_wrp); \
>>>> + if (likely((req)->drv_status == CSIO_SUCCESS)) { \
>>>> + if (likely(_wrp.size1 >= _size)) { \
>>>> + /* Initialize WR in one shot */ \
>>>> + csio_scsi_init_data_wr((req), oper, _wrp.addr1, \
>>>> + _size, wrop); \
>>>> + } else { \
>>>> + uint8_t tmpwr[512]; \
>>>> + /* \
>>>> + * Make a temporary copy of the WR and write back \
>>>> + * the copy into the WR pair. \
>>>> + */ \
>>>> + csio_scsi_init_data_wr((req), oper, (void *)tmpwr, \
>>>> + _size, wrop); \
>>>> + memcpy(_wrp.addr1, tmpwr, _wrp.size1); \
>>>> + memcpy(_wrp.addr2, tmpwr + _wrp.size1, \
>>>> + _size - _wrp.size1); \
>>>> + } \
>>>> + } \
>>>> +} while (0)
>>>> +
>>>
>>> Ditto on this one, along with the tmpwr[512] stack usage..
>>>
>>>> +static inline void
>>>> +csio_scsi_abrt_cls(struct csio_ioreq *req, bool abort)
>>>> +{
>>>> + struct csio_wr_pair wrp;
>>>> + struct csio_hw *hw = req->lnode->hwp;
>>>> + uint32_t size = ALIGN(sizeof(struct fw_scsi_abrt_cls_wr), 16);
>>>> +
>>>> + req->drv_status = csio_wr_get(hw, req->eq_idx, size, &wrp);
>>>> + if (req->drv_status != CSIO_SUCCESS)
>>>> + return;
>>>> +
>>>> + if (wrp.size1 >= size) {
>>>> + /* Initialize WR in one shot */
>>>> + csio_scsi_init_abrt_cls_wr(req, wrp.addr1, size, abort);
>>>> + } else {
>>>> + uint8_t tmpwr[512];
>>>
>>> Ditto here on local scope stack usage..
>>>
>>>> + /*
>>>> + * Make a temporary copy of the WR and write back
>>>> + * the copy into the WR pair.
>>>> + */
>>>> + csio_scsi_init_abrt_cls_wr(req, (void *)tmpwr, size, abort);
>>>> + memcpy(wrp.addr1, tmpwr, wrp.size1);
>>>> + memcpy(wrp.addr2, tmpwr + wrp.size1, size - wrp.size1);
>>>> + }
>>>> +}
>>>> +
>>>> +/*****************************************************************************/
>>>> +/* START: SCSI SM */
>>>> +/*****************************************************************************/
>>>> +static void
>>>> +csio_scsis_uninit(struct csio_ioreq *req, enum csio_scsi_ev evt)
>>>> +{
>>>> + struct csio_hw *hw = req->lnode->hwp;
>>>> + struct csio_scsim *scsim = csio_hw_to_scsim(hw);
>>>> +
>>>> + switch (evt) {
>>>> +
>>>> + case CSIO_SCSIE_START_IO:
>>>
>>> Extra space between start of first switch case
>>
>> OK, I will remove it. Is there any tool that catches such deviations?
>> checkpath.pl didnt report it.
>>
>
> Sorry, can't help you there.. ;)
>
>>>
>>>> +
>>>> +/**
>>>> + * csio_queuecommand_lck - Entry point to kickstart an I/O request.
>>>> + * @cmnd: The I/O request from ML.
>>>> + * @done: The ML callback routine.
>>>> + *
>>>> + * This routine does the following:
>>>> + * - Checks for HW and Rnode module readiness.
>>>> + * - Gets a free ioreq structure (which is already initialized
>>>> + * to uninit during its allocation).
>>>> + * - Maps SG elements.
>>>> + * - Initializes ioreq members.
>>>> + * - Kicks off the SCSI state machine for this IO.
>>>> + * - Returns busy status on error.
>>>> + */
>>>> +static int
>>>> +csio_queuecommand_lck(struct scsi_cmnd *cmnd, void (*done)(struct scsi_cmnd *))
>>>> +{
>>>> + struct csio_lnode *ln = shost_priv(cmnd->device->host);
>>>> + struct csio_hw *hw = csio_lnode_to_hw(ln);
>>>> + struct csio_scsim *scsim = csio_hw_to_scsim(hw);
>>>> + struct csio_rnode *rn = (struct csio_rnode *)(cmnd->device->hostdata);
>>>> + struct csio_ioreq *ioreq = NULL;
>>>> + unsigned long flags;
>>>> + int nsge = 0;
>>>> + int rv = SCSI_MLQUEUE_HOST_BUSY, nr;
>>>> + csio_retval_t retval;
>>>> + int cpu;
>>>> + struct csio_scsi_qset *sqset;
>>>> + struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
>>>> +
>>>> + if (!blk_rq_cpu_valid(cmnd->request))
>>>> + cpu = smp_processor_id();
>>>> + else
>>>> + cpu = cmnd->request->cpu;
>>>> +
>>>> + sqset = &hw->sqset[ln->portid][cpu];
>>>> +
>>>> + nr = fc_remote_port_chkready(rport);
>>>> + if (nr) {
>>>> + cmnd->result = nr;
>>>> + csio_inc_stats(scsim, n_rn_nr_error);
>>>> + goto err_done;
>>>> + }
>>>> +
>>>> + if (unlikely(!csio_is_hw_ready(hw))) {
>>>> + cmnd->result = (DID_REQUEUE << 16);
>>>> + csio_inc_stats(scsim, n_hw_nr_error);
>>>> + goto err_done;
>>>> + }
>>>> +
>>>> + /* Get req->nsge, if there are SG elements to be mapped */
>>>> + nsge = scsi_dma_map(cmnd);
>>>> + if (unlikely(nsge < 0)) {
>>>> + csio_inc_stats(scsim, n_dmamap_error);
>>>> + goto err;
>>>> + }
>>>> +
>>>> + /* Do we support so many mappings? */
>>>> + if (unlikely(nsge > scsim->max_sge)) {
>>>> + csio_warn(hw,
>>>> + "More SGEs than can be supported."
>>>> + " SGEs: %d, Max SGEs: %d\n", nsge, scsim->max_sge);
>>>> + csio_inc_stats(scsim, n_unsupp_sge_error);
>>>> + goto err_dma_unmap;
>>>> + }
>>>> +
>>>> + /* Get a free ioreq structure - SM is already set to uninit */
>>>> + ioreq = csio_get_scsi_ioreq_lock(hw, scsim);
>>>> + if (!ioreq) {
>>>> + csio_err(hw, "Out of I/O request elements. Active #:%d\n",
>>>> + scsim->stats.n_active);
>>>> + csio_inc_stats(scsim, n_no_req_error);
>>>> + goto err_dma_unmap;
>>>> + }
>>>> +
>>>> + ioreq->nsge = nsge;
>>>> + ioreq->lnode = ln;
>>>> + ioreq->rnode = rn;
>>>> + ioreq->iq_idx = sqset->iq_idx;
>>>> + ioreq->eq_idx = sqset->eq_idx;
>>>> + ioreq->wr_status = 0;
>>>> + ioreq->drv_status = CSIO_SUCCESS;
>>>> + csio_scsi_cmnd(ioreq) = (void *)cmnd;
>>>> + ioreq->tmo = 0;
>>>> +
>>>> + switch (cmnd->sc_data_direction) {
>>>> + case DMA_BIDIRECTIONAL:
>>>> + ioreq->datadir = CSIO_IOREQF_DMA_BIDI;
>>>> + csio_inc_stats(ln, n_control_requests);
>>>> + break;
>>>> + case DMA_TO_DEVICE:
>>>> + ioreq->datadir = CSIO_IOREQF_DMA_WRITE;
>>>> + csio_inc_stats(ln, n_output_requests);
>>>> + ln->stats.n_output_bytes += scsi_bufflen(cmnd);
>>>> + break;
>>>> + case DMA_FROM_DEVICE:
>>>> + ioreq->datadir = CSIO_IOREQF_DMA_READ;
>>>> + csio_inc_stats(ln, n_input_requests);
>>>> + ln->stats.n_input_bytes += scsi_bufflen(cmnd);
>>>> + break;
>>>> + case DMA_NONE:
>>>> + ioreq->datadir = CSIO_IOREQF_DMA_NONE;
>>>> + csio_inc_stats(ln, n_control_requests);
>>>> + break;
>>>> + default:
>>>> + CSIO_DB_ASSERT(0);
>>>> + break;
>>>> + }
>>>> +
>>>> + /* Set cbfn */
>>>> + ioreq->io_cbfn = csio_scsi_cbfn;
>>>> +
>>>> + /* Needed during abort */
>>>> + cmnd->host_scribble = (unsigned char *)ioreq;
>>>> + cmnd->scsi_done = done;
>>>> + cmnd->SCp.Message = 0;
>>>> +
>>>> + /* Kick off SCSI IO SM on the ioreq */
>>>> + spin_lock_irqsave(&hw->lock, flags);
>>>> + retval = csio_scsi_start_io(ioreq);
>>>> + spin_unlock_irqrestore(&hw->lock, flags);
>>>> +
>>>> + if (retval != CSIO_SUCCESS) {
>>>> + csio_err(hw, "ioreq: %p couldnt be started, status:%d\n",
>>>> + ioreq, retval);
>>>> + csio_inc_stats(scsim, n_busy_error);
>>>> + goto err_put_req;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +
>>>> +err_put_req:
>>>> + csio_put_scsi_ioreq_lock(hw, scsim, ioreq);
>>>> +err_dma_unmap:
>>>> + if (nsge > 0)
>>>> + scsi_dma_unmap(cmnd);
>>>> +err:
>>>> + return rv;
>>>> +
>>>> +err_done:
>>>> + done(cmnd);
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static DEF_SCSI_QCMD(csio_queuecommand);
>>>> +
>>>
>>> This means that your running with the host_lock held.. I'm not sure if
>>> that is really what you want to do as it really end's up killing
>>> multi-lun small packet performance..
>>>
>>> How about dropping DEF_SCSI_QCMD usage here, and figure out what
>>> actually needs to be protected by the SCSI host_lock within
>>> csio_queuecommand_lck()..?
>>
>> It is on my TODO list for the next version of the driver, after the
>> initial submission. Per the current design, we shouldnt need the
>> host_lock to be held, but I would like to test this change thoroughly
>> before I submit it.
>>
>
> Sure, it's really quite easy to convert and these days the majority of
> high performance LLDs do run in host-lock-less mode.
>
> IIRC the libfc based FCoE initiator driver is doing this too..
>
next prev parent reply other threads:[~2012-08-25 18:36 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-23 22:27 [PATCH 0/8] csiostor: Chelsio FCoE offload driver submission Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 1/8] csiostor: Chelsio FCoE offload driver submission (sources part 1) Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 2/8] csiostor: Chelsio FCoE offload driver submission (sources part 2) Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 3/8] csiostor: Chelsio FCoE offload driver submission (sources part 3) Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 4/8] csiostor: Chelsio FCoE offload driver submission (sources part 4) Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 5/8] csiostor: Chelsio FCoE offload driver submission (sources part 5) Naresh Kumar Inna
2012-08-23 19:48 ` Nicholas A. Bellinger
2012-08-24 17:40 ` Naresh Kumar Inna
2012-08-24 18:27 ` Joe Perches
2012-08-24 19:07 ` Naresh Kumar Inna
2012-08-24 20:56 ` Nicholas A. Bellinger
2012-08-25 18:36 ` Naresh Kumar Inna [this message]
2012-08-25 18:43 ` Nicholas A. Bellinger
2012-08-29 7:47 ` Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 6/8] csiostor: Chelsio FCoE offload driver submission (headers part 1) Naresh Kumar Inna
2012-08-23 18:15 ` Love, Robert W
2012-08-24 18:45 ` Naresh Kumar Inna
2012-08-23 19:58 ` Nicholas A. Bellinger
2012-08-24 18:36 ` Naresh Kumar Inna
2012-08-24 21:17 ` Nicholas A. Bellinger
2012-08-25 18:09 ` Naresh Kumar Inna
2012-08-25 18:40 ` Nicholas A. Bellinger
2012-08-25 19:01 ` Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 7/8] csiostor: Chelsio FCoE offload driver submission (headers part 2) Naresh Kumar Inna
2012-08-23 22:27 ` [PATCH 8/8] cxgb4: Chelsio FCoE offload driver submission (cxgb4 common header updates) Naresh Kumar Inna
2012-08-24 17:04 ` [PATCH 0/8] csiostor: Chelsio FCoE offload driver submission David Miller
2012-08-24 19:04 ` Naresh Kumar Inna
2012-08-24 19:10 ` David Miller
2012-08-25 19:08 ` Naresh Kumar Inna
2012-09-04 5:13 ` Naresh Kumar Inna
2012-09-04 5:34 ` David Miller
2012-08-24 21:45 ` Paul Gortmaker
2012-08-24 21:58 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50391B17.2040505@chelsio.com \
--to=naresh@chelsio.com \
--cc=JBottomley@parallels.com \
--cc=chethan@chelsio.com \
--cc=dm@chelsio.com \
--cc=linux-scsi@vger.kernel.org \
--cc=nab@linux-iscsi.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.