From mboxrd@z Thu Jan  1 00:00:00 1970
From: Olivier MATZ <olivier.matz@6wind.com>
Subject: Re: [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update
Date: Tue, 05 Jan 2016 11:57:37 +0100
Message-ID: <568BA1A1.2070300@6wind.com>
References: <d18f2062724a4453a1f709dcf4f30792@XCH-RTP-017.cisco.com>
 <568A7959.7030506@6wind.com>
 <7f5255b98dcb4f2396ada16d5eb43e5a@XCH-RTP-017.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Ido Barnea \(ibarnea\)" <ibarnea@cisco.com>,
 "Itay Marom \(imarom\)" <imarom@cisco.com>
To: "Hanoch Haim (hhaim)" <hhaim@cisco.com>,
 "bruce.richardson@intel.com" <bruce.richardson@intel.com>
Return-path: <dev-bounces@dpdk.org>
Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44])
 by dpdk.org (Postfix) with ESMTP id 5985A9256
 for <dev@dpdk.org>; Tue,  5 Jan 2016 11:58:12 +0100 (CET)
Received: by mail-wm0-f44.google.com with SMTP id u188so18382960wmu.1
 for <dev@dpdk.org>; Tue, 05 Jan 2016 02:58:12 -0800 (PST)
In-Reply-To: <7f5255b98dcb4f2396ada16d5eb43e5a@XCH-RTP-017.cisco.com>
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Olivier,
>
> Let's take your drawing as a reference and add my question.
> The use case is sending a duplicated multicast packet from many threads.
> I can split the work across x threads and track it with an atomic reference count (on my multicast object, not on the mbuf) that is decremented until it reaches zero.
>
> In the following example, the two cores (0 and 1) that send the indirect mbufs m1/m2 each do alloc/attach/send:
>
>      core0                             |      core1
> ---------------------------------------|---------------------------------------
> m_const = rte_pktmbuf_alloc(mp)        |
>                                        |
> while true:                            |  while true:
>    m1 = rte_pktmbuf_alloc(mp_64)       |    m2 = rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)     |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                        |    tx_burst(m2)
>
> Is this example not valid?

For me, m_const is not expected to be used concurrently on
several cores. By "used", I mean calling a function that modifies
the mbuf, which is the case for rte_pktmbuf_attach(): it updates
the reference counter of the mbuf being attached to (m_const here).
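
To make the concern concrete, here is a minimal, self-contained model
in plain C11. It is not DPDK code: struct model_mbuf and both functions
are hypothetical names, and the update function only assumes that the
optimization under discussion skips the atomic operation when the
counter reads 1. The replay function steps through one bad
interleaving by hand, in a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf; only the reference counter is modeled. */
struct model_mbuf {
    _Atomic uint16_t refcnt;
};

/*
 * Assumed shape of the optimized update: if the counter reads 1, the caller
 * is taken to be the only holder and the new value is stored without an
 * atomic read-modify-write. Otherwise an atomic add is used.
 */
static uint16_t model_refcnt_update(struct model_mbuf *m, int16_t delta)
{
    assert(m != NULL);
    if (atomic_load(&m->refcnt) == 1) {
        uint16_t v = (uint16_t)(1 + delta);
        atomic_store(&m->refcnt, v);
        return v;
    }
    return (uint16_t)(atomic_fetch_add(&m->refcnt, (uint16_t)delta) + delta);
}

/*
 * Single-threaded replay of the bad interleaving: both cores evaluate the
 * "refcnt == 1" test before either of them stores the new value.
 */
static uint16_t replay_concurrent_attach(void)
{
    struct model_mbuf m_const = { 1 };
    uint16_t seen0 = atomic_load(&m_const.refcnt);        /* core0 reads 1 */
    uint16_t seen1 = atomic_load(&m_const.refcnt);        /* core1 reads 1 */

    atomic_store(&m_const.refcnt, (uint16_t)(seen0 + 1)); /* core0 stores 2 */
    atomic_store(&m_const.refcnt, (uint16_t)(seen1 + 1)); /* core1 stores 2 */
    return atomic_load(&m_const.refcnt); /* one increment has been lost */
}
```

In this model the replay ends with a counter of 2 although three
references exist. If the counter starts at 2, every update takes the
atomic branch and no increment is lost.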

> BTW this is our workaround:
>
>      core0                             |      core1
> ---------------------------------------|---------------------------------------
> m_const = rte_pktmbuf_alloc(mp)        |
> rte_mbuf_refcnt_update(m_const, 1)     | <<-- workaround
>                                        |
> while true:                            |  while true:
>    m1 = rte_pktmbuf_alloc(mp_64)       |    m2 = rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const)     |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                        |    tx_burst(m2)

This workaround indeed solves the issue. Other solutions would be to
protect the calls to rte_pktmbuf_attach() with a lock, or to make all
the rte_pktmbuf_attach() calls from the same core.
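
As a rough illustration of the lock option, the sketch below
serializes every attach to the shared mbuf behind one spinlock. It is
a plain C11 model rather than DPDK code: struct shared_mbuf,
attach_lock and locked_attach() are hypothetical names, and the
counter increment stands in for the rte_pktmbuf_attach() call. In a
real application the rte_spinlock API would probably be used instead
of a hand-written lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the direct mbuf; refcnt is deliberately not atomic. */
struct shared_mbuf {
    uint16_t refcnt;
};

/* One lock guarding every attach to the shared mbuf. */
static atomic_flag attach_lock = ATOMIC_FLAG_INIT;

static void attach_lock_acquire(void)
{
    /* Spin until the flag was previously clear, i.e. the lock is now ours. */
    while (atomic_flag_test_and_set_explicit(&attach_lock, memory_order_acquire))
        ;
}

static void attach_lock_release(void)
{
    atomic_flag_clear_explicit(&attach_lock, memory_order_release);
}

/* Models an attach: the counter is only touched while the lock is held. */
static uint16_t locked_attach(struct shared_mbuf *md)
{
    uint16_t v;

    assert(md != NULL);
    attach_lock_acquire();
    v = ++md->refcnt;
    attach_lock_release();
    return v;
}
```

The sketch covers the attach side only. Any other code path that
updates the same counter would have to be taken into account as well.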

I'm open to discussing the behavior of the rte_pktmbuf_attach()
function (whether concurrent calls should be allowed or not). In any
case, we may want to document it better in the doxygen API comments.


Regards,
Olivier