From: Steve Wise <swise@opengridcomputing.com>
To: Bernard Metzler <BMT@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] SIW: Object management
Date: Tue, 05 Oct 2010 10:02:37 -0500
Message-ID: <4CAB3E0D.8050609@opengridcomputing.com>
In-Reply-To: <OF706D1D5F.5849676F-ONC12577B3.004FB763-C12577B3.00521572@ch.ibm.com>

On 10/05/2010 09:56 AM, Bernard Metzler wrote:
> Steve Wise <swise@opengridcomputing.com> wrote on 10/05/2010 04:26:48 PM:
>
>> On 10/05/2010 01:54 AM, Bernard Metzler wrote:
>>
>> <snip>
>>
>>> +
>>> +/***** routines for WQE handling ***/
>>> +
>>> +/*
>>> + * siw_wqe_get()
>>> + *
>>> + * Get new WQE. For READ RESPONSE, take it from the free list which
>>> + * has a maximum size of maximum inbound READs. All other WQE are
>>> + * malloc'ed which creates some overhead. Consider change to
>>> + *
>>> + * 1. malloc WR only if it cannot be synchronously completed, or
>>> + * 2. operate own cache of reusable WQEs.
>>> + *
>>> + * Current code relies on malloc efficiency.
>>> + */
>>> +inline struct siw_wqe *siw_wqe_get(struct siw_qp *qp, enum siw_wr_opcode op)
>>> +{
>>> +   struct siw_wqe *wqe;
>>> +
>>> +   if (op == SIW_WR_RDMA_READ_RESP) {
>>> +      spin_lock(&qp->freelist_lock);
>>> +      if (!(list_empty(&qp->wqe_freelist))) {
>>> +         wqe = list_entry(qp->wqe_freelist.next,
>>> +                struct siw_wqe, list);
>>> +         list_del(&wqe->list);
>>> +         spin_unlock(&qp->freelist_lock);
>>> +         wqe->processed = 0;
>>> +         dprint(DBG_OBJ|DBG_WR,
>>> +            "(QP%d): WQE from FreeList p: %p\n",
>>> +            QP_ID(qp), wqe);
>>> +      } else {
>>> +         spin_unlock(&qp->freelist_lock);
>>> +         wqe = NULL;
>>> +         dprint(DBG_ON|DBG_OBJ|DBG_WR,
>>> +            "(QP%d): FreeList empty!\n", QP_ID(qp));
>>> +      }
>>> +   } else {
>>> +      wqe = kzalloc(sizeof(struct siw_wqe), GFP_KERNEL);
>>> +      dprint(DBG_OBJ|DBG_WR, "(QP%d): New WQE p: %p\n",
>>> +         QP_ID(qp), wqe);
>>> +   }
>>>
>> I think you can't allocate with GFP_KERNEL here if this is called from the
>> post_ functions.  I think you might want to pre-allocate these when you
>> create the QP...
>>
> the idea was to keep the memory footprint small and flexible
> while using the linux/list.h routines to manipulate all queues
> (no ring buffers etc, just lists). at the same time we
> decided to take the provided uverbs_cmd-syscall path down to
> the driver even for the post_-functions (since we would have to ring a
> doorbell on the send path anyway, which in software, is a syscall).
> in that path, even ib_uverbs_post_send() does one kmalloc() per wr
> (it would be helpful if the provider could keep and reuse that wr of
> known size, freeing it later at its own discretion. that would avoid
> the second kmalloc here.)
>
> currently only work queue elements which are needed to satisfy
> inbound read requests are pre-allocated (amount corresponding
> to inbound read queue depth), since the read response is
> scheduled in network softirq context which must not sleep.
>
> that discussion may relate to the spinlock at the entrance to the
> post_ verbs. going down the uverbs_cmd path may sleep anyway...?
>
>    


The uverbs calls may sleep, but certain kernel verbs must not.  Remember,
the post_send/recv and other functions in your driver are called
(almost) directly by kernel users like NFSRDMA.  These users may be
calling in interrupt context, so you cannot block or sleep there.
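
To illustrate the pre-allocation idea, here is a minimal userspace sketch,
not driver code.  All names (struct wqe, wqe_pool_*) are hypothetical and
do not come from the SIW patch; calloc() stands in for a sleeping kernel
allocation, and the locking a real driver would need is only noted in a
comment.  The only point is that the get/put path never allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct siw_wqe; fields are illustrative only. */
struct wqe {
	struct wqe *next;	/* free-list link */
	int processed;
};

/* Hypothetical per-QP pool, filled once when the QP is created. */
struct wqe_pool {
	struct wqe *slab;	/* single allocation backing all WQEs */
	struct wqe *free;	/* head of the free list */
	size_t depth;		/* number of WQEs in the slab */
};

/* May allocate (and, in a kernel, sleep): call only at QP creation. */
int wqe_pool_init(struct wqe_pool *pool, size_t depth)
{
	size_t i;

	pool->slab = calloc(depth, sizeof(*pool->slab));
	if (!pool->slab)
		return -1;
	pool->free = NULL;
	pool->depth = depth;
	/* Thread every pre-allocated WQE onto the free list. */
	for (i = 0; i < depth; i++) {
		pool->slab[i].next = pool->free;
		pool->free = &pool->slab[i];
	}
	return 0;
}

/*
 * Never allocates, so it cannot block.  A real driver would take a
 * spinlock around the list update; that is omitted here.
 */
struct wqe *wqe_pool_get(struct wqe_pool *pool)
{
	struct wqe *wqe = pool->free;

	if (wqe) {
		pool->free = wqe->next;
		wqe->next = NULL;
		wqe->processed = 0;
	}
	return wqe;	/* NULL means all WQEs are in use */
}

/* Return a WQE to the pool; it must have come from this pool's slab. */
void wqe_pool_put(struct wqe_pool *pool, struct wqe *wqe)
{
	assert(wqe >= pool->slab && wqe < pool->slab + pool->depth);
	wqe->next = pool->free;
	pool->free = wqe;
}

/* Release the slab; call only at QP destruction. */
void wqe_pool_destroy(struct wqe_pool *pool)
{
	free(pool->slab);
	pool->slab = NULL;
	pool->free = NULL;
	pool->depth = 0;
}
```

With a pool like this, an exhausted free list shows up as a NULL return
that the post path can turn into an error, instead of a sleeping
allocation.  The cost is a fixed memory footprint per QP.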

Steve.
