From: Steve Wise <swise@opengridcomputing.com>
To: Bernard Metzler <BMT@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] SIW: Object management
Date: Tue, 05 Oct 2010 10:02:37 -0500	[thread overview]
Message-ID: <4CAB3E0D.8050609@opengridcomputing.com> (raw)
In-Reply-To: <OF706D1D5F.5849676F-ONC12577B3.004FB763-C12577B3.00521572@ch.ibm.com>

On 10/05/2010 09:56 AM, Bernard Metzler wrote:
> Steve Wise<swise@opengridcomputing.com>  wrote on 10/05/2010 04:26:48 PM:
>
>> On 10/05/2010 01:54 AM, Bernard Metzler wrote:
>>
>> <snip>
>>> +
>>> +/***** routines for WQE handling ***/
>>> +
>>> +/*
>>> + * siw_wqe_get()
>>> + *
>>> + * Get a new WQE. For READ RESPONSE, take it from the free list, which
>>> + * has a maximum size of the maximum number of inbound READs. All other
>>> + * WQEs are malloc'ed, which creates some overhead. Consider changing to:
>>> + *
>>> + * 1. malloc a WR only if it cannot be synchronously completed, or
>>> + * 2. maintain its own cache of reusable WQEs.
>>> + *
>>> + * Current code relies on malloc efficiency.
>>> + */
>>> +inline struct siw_wqe *siw_wqe_get(struct siw_qp *qp, enum siw_wr_opcode op)
>>> +{
>>> +   struct siw_wqe *wqe;
>>> +
>>> +   if (op == SIW_WR_RDMA_READ_RESP) {
>>> +      spin_lock(&qp->freelist_lock);
>>> +      if (!(list_empty(&qp->wqe_freelist))) {
>>> +         wqe = list_entry(qp->wqe_freelist.next,
>>> +                struct siw_wqe, list);
>>> +         list_del(&wqe->list);
>>> +         spin_unlock(&qp->freelist_lock);
>>> +         wqe->processed = 0;
>>> +         dprint(DBG_OBJ|DBG_WR,
>>> +            "(QP%d): WQE from FreeList p: %p\n",
>>> +            QP_ID(qp), wqe);
>>> +      } else {
>>> +         spin_unlock(&qp->freelist_lock);
>>> +         wqe = NULL;
>>> +         dprint(DBG_ON|DBG_OBJ|DBG_WR,
>>> +            "(QP%d): FreeList empty!\n", QP_ID(qp));
>>> +      }
>>> +   } else {
>>> +      wqe = kzalloc(sizeof(struct siw_wqe), GFP_KERNEL);
>>> +      dprint(DBG_OBJ|DBG_WR, "(QP%d): New WQE p: %p\n",
>>> +         QP_ID(qp), wqe);
>>> +   }
>>>
>> I think you can't allocate with GFP_KERNEL here if this is called from the
>> post_ functions.  I think you might want to pre-allocate these when you
>> create the QP...
>>
> The idea was to keep the memory footprint small and flexible
> while using the linux/list.h routines to manipulate all queues
> (no ring buffers etc., just lists). At the same time we
> decided to take the provided uverbs_cmd syscall path down to
> the driver even for the post_ functions (since we would have to ring a
> doorbell on the send path anyway, which in software is a syscall).
> On that path, even ib_uverbs_post_send() does one kmalloc() per WR
> (it would be helpful if the provider could keep and reuse that WR of
> known size, freeing it later at its own discretion; that would avoid
> the second kmalloc here).
>
> Currently, only the work queue elements needed to satisfy
> inbound read requests are pre-allocated (in an amount corresponding
> to the inbound read queue depth), since the read response is
> scheduled in network softirq context, which must not sleep.
>
> That discussion may relate to the spinlock at the entrance to the
> post_ verbs. Going down the uverbs_cmd path may sleep anyway...?
>
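[Editorial sketch, not from the posted patch: one way to pre-fill the
read-response free list at QP creation time, as described above, in a
context where GFP_KERNEL allocations may still sleep. The helper name
and the ird_depth parameter are hypothetical; wqe_freelist,
freelist_lock and the list member of struct siw_wqe are those used by
siw_wqe_get() in the quoted code.]

/* Assumes <linux/slab.h>, <linux/list.h>, <linux/spinlock.h>. */
static int siw_qp_prealloc_rresp_wqes(struct siw_qp *qp, int ird_depth)
{
	struct siw_wqe *wqe;
	int i;

	spin_lock_init(&qp->freelist_lock);
	INIT_LIST_HEAD(&qp->wqe_freelist);

	/* One WQE per potential inbound READ; done in process context
	 * (e.g. at create_qp time), so sleeping allocations are fine. */
	for (i = 0; i < ird_depth; i++) {
		wqe = kzalloc(sizeof(*wqe), GFP_KERNEL);
		if (!wqe)
			return -ENOMEM;	/* caller unwinds wqe_freelist */
		list_add_tail(&wqe->list, &qp->wqe_freelist);
	}
	return 0;
}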


The uverb calls may sleep, but certain kernel verbs must not.  Remember,
the post_send/recv and other functions in your driver are called (almost)
directly by kernel users like NFSRDMA.  These users may be calling from
interrupt context, and thus you cannot block/sleep.

Steve.
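[Editorial aside, illustrative only: the distinction drawn here is
between contexts that may sleep (the uverbs syscall path) and those
that must not (kernel ULPs such as NFS/RDMA posting from interrupt or
softirq context). A minimal sketch of an allocation that is safe in
the latter case follows; the helper name is hypothetical and this is
not the driver's code. GFP_ATOMIC never sleeps but can fail under
memory pressure, so a free list filled at QP creation remains the more
robust choice for the fast path.]

static struct siw_wqe *siw_wqe_alloc_atomic(void)
{
	/* Safe from atomic context; callers must handle NULL. */
	return kzalloc(sizeof(struct siw_wqe), GFP_ATOMIC);
}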

Thread overview: 7+ messages
2010-10-05  6:54 [PATCH] SIW: Object management Bernard Metzler
2010-10-05 14:26 ` Steve Wise
     [not found]   ` <4CAB35A8.6080906-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-10-05 14:56     ` Bernard Metzler
2010-10-05 15:02       ` Steve Wise [this message]
2010-10-05 15:25         ` Bernard Metzler
2010-10-05 15:37           ` Steve Wise
     [not found]             ` <4CAB464D.5030702-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-10-09 14:10               ` Bernard Metzler
