From: Flavio Leitner <fbl@sysclose.org>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>,
Shahaf Shuler <shahafs@mellanox.com>,
David Marchand <david.marchand@redhat.com>,
"dev@dpdk.org" <dev@dpdk.org>, Tiwei Bie <tiwei.bie@intel.com>,
Zhihong Wang <zhihong.wang@intel.com>,
Obrembski MichalX <michalx.obrembski@intel.com>,
Stokes Ian <ian.stokes@intel.com>
Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs
Date: Thu, 3 Oct 2019 18:25:52 -0300
Message-ID: <20191003182552.3f978ef5@p50.lan>
In-Reply-To: <088ea83c-cc00-5542-a554-ca857b9ef6ec@ovn.org>
On Thu, 3 Oct 2019 18:57:32 +0200
Ilya Maximets <i.maximets@ovn.org> wrote:
> On 02.10.2019 20:15, Flavio Leitner wrote:
> > On Wed, 2 Oct 2019 17:50:41 +0000
> > Shahaf Shuler <shahafs@mellanox.com> wrote:
> >
> >> Wednesday, October 2, 2019 3:59 PM, Flavio Leitner:
> >>> Obrembski MichalX <michalx.obrembski@intel.com>; Stokes Ian
> >>> <ian.stokes@intel.com>
> >>> Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large linear
> >>> mbufs
> >>>
> >>>
> >>> Hi Shahaf,
> >>>
> >>> Thanks for looking into this, see my inline comments.
> >>>
> >>> On Wed, 2 Oct 2019 09:00:11 +0000
> >>> Shahaf Shuler <shahafs@mellanox.com> wrote:
> >>>
> >>>> Wednesday, October 2, 2019 11:05 AM, David Marchand:
> >>>>> Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large
> >>>>> linear mbufs
> >>>>>
> >>>>> Hello Shahaf,
> >>>>>
> >>>>> On Wed, Oct 2, 2019 at 6:46 AM Shahaf Shuler
> >>>>> <shahafs@mellanox.com> wrote:
> >>>>>>
> >>
> >> [...]
> >>
> >>>>>
> >>>>> I am missing some piece here.
> >>>>> Which pool would the PMD take those external buffers from?
> >>>>
> >>>> The mbuf is always taken from the single mempool associated w/
> >>>> the rxq. The buffer for the mbuf may be allocated (in case virtio
> >>>> payload is bigger than current mbuf size) from DPDK hugepages or
> >>>> any other system memory and be attached to the mbuf.
> >>>>
> >>>> You can see example implementation of it in mlx5 PMD (checkout
> >>>> rte_pktmbuf_attach_extbuf call)
> >>>
> >>> Thanks, I wasn't aware of external buffers.
> >>>
> >>> I see that attaching external buffers of the correct size would be
> >>> more efficient in terms of saving memory/avoiding sparsing.
> >>>
> >>> However, we still need to be prepared to the worse case scenario
> >>> (all packets 64K), so that doesn't help with the total memory
> >>> required.
> >>
> >> Am not sure why.
> >> The allocation can be per demand. That is - only when you
> >> encounter a large buffer.
> >>
> >> Having buffer allocated in advance will benefit only from removing
> >> the cost of the rte_*malloc. However on such big buffers, and
> >> further more w/ device offloads like TSO, am not sure that is an
> >> issue.
> >
> > Now I see what you're saying. I was thinking we had to reserve the
> > memory before, like mempool does, then get the buffers as needed.
> >
> > OK, I can give a try with rte_*malloc and see how it goes.
>
> This way we actually could have a nice API. For example, by
> introducing some new flag RTE_VHOST_USER_NO_CHAINED_MBUFS (there
> might be better name) which could be passed to driver_register().
> On receive, depending on this flag, function will create chained
> mbufs or allocate new contiguous memory chunk and attach it as
> an external buffer if the data could not be stored in a single
> mbuf from the registered memory pool.
>
> Supporting external memory in mbufs will require some additional
> work from the OVS side (e.g. better work with ol_flags), but
> we'll have to do it anyway for upgrade to DPDK 19.11.

Agreed. Looks like rte_malloc is fast enough after all. I have a PoC
running iperf3 from a VM to a separate bare-metal host, using a
vhost-user client with TSO enabled:

[...]
[  5] 140.00-141.00 sec  4.60 GBytes  39.5 Gbits/sec    0   1.26 MBytes
[  5] 141.00-142.00 sec  4.65 GBytes  39.9 Gbits/sec    0   1.26 MBytes
[  5] 142.00-143.00 sec  4.65 GBytes  40.0 Gbits/sec    0   1.26 MBytes
[  5] 143.00-144.00 sec  4.65 GBytes  39.9 Gbits/sec    9   1.04 MBytes
[  5] 144.00-145.00 sec  4.59 GBytes  39.4 Gbits/sec    0   1.16 MBytes
[  5] 145.00-146.00 sec  4.58 GBytes  39.3 Gbits/sec    0   1.26 MBytes
[  5] 146.00-147.00 sec  4.48 GBytes  38.5 Gbits/sec  700    973 KBytes
[...]

(The physical link is 40Gbps)

I will clean that up, test more, and post the patches soon.
Thanks!
fbl
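
For reference, the receive-path choice discussed in this thread (chain
mbufs from the pool, or allocate one contiguous chunk on demand and
attach it as an external buffer) could be sketched roughly as below.
This is only an illustration in plain C with no DPDK dependency: the
flag name is the one floated above and is not a confirmed API, its bit
value is assumed, and plan_rx(), struct rx_plan and enum rx_layout are
hypothetical names. A real implementation would use rte_malloc and
rte_pktmbuf_attach_extbuf and would also have to account for headroom,
which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Flag name proposed in the thread; the bit value here is an assumption. */
#define RTE_VHOST_USER_NO_CHAINED_MBUFS (1ULL << 0)

/* Hypothetical description of how one received packet would be laid out. */
enum rx_layout {
    RX_SINGLE_MBUF,   /* data fits in one mbuf from the registered pool */
    RX_CHAINED_MBUFS, /* data is spread over several pool mbufs */
    RX_EXT_BUFFER     /* one pool mbuf with an external buffer attached */
};

struct rx_plan {
    enum rx_layout layout;
    uint32_t nb_mbufs; /* mbufs taken from the registered pool */
    uint32_t ext_len;  /* bytes to allocate on demand, 0 if none */
};

/*
 * Decide the layout for a packet of data_len bytes when each pool mbuf
 * can hold mbuf_room bytes of data.
 */
static struct rx_plan
plan_rx(uint64_t flags, uint32_t data_len, uint32_t mbuf_room)
{
    struct rx_plan plan = { RX_SINGLE_MBUF, 1, 0 };

    assert(mbuf_room > 0);

    /* Common case: the data fits in a single pool mbuf. */
    if (data_len <= mbuf_room)
        return plan;

    if (flags & RTE_VHOST_USER_NO_CHAINED_MBUFS) {
        /* Allocate only when a large packet shows up. */
        plan.layout = RX_EXT_BUFFER;
        plan.ext_len = data_len;
    } else {
        /* Round up to the number of pool mbufs needed for the chain. */
        plan.layout = RX_CHAINED_MBUFS;
        plan.nb_mbufs = (uint32_t)(((uint64_t)data_len + mbuf_room - 1)
                                   / mbuf_room);
    }
    return plan;
}
```

With a 2048-byte data room and a 65536-byte packet, the sketch plans a
chain of 32 mbufs when the flag is clear, and a single mbuf plus one
65536-byte external buffer when the flag is set. Packets that fit in
one mbuf never trigger an allocation, which is the point of allocating
on demand rather than reserving for the worst case.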
Thread overview: 32+ messages
2019-10-01 22:19 [dpdk-dev] [PATCH] vhost: add support to large linear mbufs Flavio Leitner
2019-10-01 23:10 ` Flavio Leitner
2019-10-02 4:45 ` Shahaf Shuler
2019-10-02 8:04 ` David Marchand
2019-10-02 9:00 ` Shahaf Shuler
2019-10-02 12:58 ` Flavio Leitner
2019-10-02 17:50 ` Shahaf Shuler
2019-10-02 18:15 ` Flavio Leitner
2019-10-03 16:57 ` Ilya Maximets
2019-10-03 21:25 ` Flavio Leitner [this message]
2019-10-02 7:51 ` Maxime Coquelin
2019-10-04 20:10 ` [dpdk-dev] [PATCH v2] vhost: add support for large buffers Flavio Leitner
2019-10-06 4:47 ` Shahaf Shuler
2019-10-10 5:12 ` Tiwei Bie
2019-10-10 12:12 ` Flavio Leitner
2019-10-11 17:09 ` [dpdk-dev] [PATCH v3] " Flavio Leitner
2019-10-14 2:44 ` Tiwei Bie
2019-10-15 16:17 ` [dpdk-dev] [PATCH v4] " Flavio Leitner
2019-10-15 17:41 ` Ilya Maximets
2019-10-15 18:44 ` Flavio Leitner
2019-10-15 18:59 ` [dpdk-dev] [PATCH v5] " Flavio Leitner
2019-10-16 10:02 ` Maxime Coquelin
2019-10-16 11:13 ` Maxime Coquelin
2019-10-16 13:32 ` Ilya Maximets
2019-10-16 13:46 ` Maxime Coquelin
2019-10-16 14:02 ` Flavio Leitner
2019-10-16 14:08 ` Ilya Maximets
2019-10-16 14:14 ` Flavio Leitner
2019-10-16 14:05 ` Ilya Maximets
2019-10-29 9:02 ` David Marchand
2019-10-29 12:21 ` Flavio Leitner
2019-10-29 16:19 ` David Marchand