From: Olivier Matz <olivier.matz@6wind.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>,
Adrien Mazarguil <adrien.mazarguil@6wind.com>
Subject: Re: [PATCH] mbuf: remove inconsistent assert statements
Date: Wed, 8 Jun 2016 16:11:35 +0200 [thread overview]
Message-ID: <57582797.8050300@6wind.com> (raw)
In-Reply-To: <20160608135751.GM7621@6wind.com>
Hi Adrien, Konstantin,
I'm jumping in this (interesting) discussion. I already talked a
bit with Adrien in point to point, and I think its patch is valid.
Please see some comments below.
On 06/08/2016 03:57 PM, Adrien Mazarguil wrote:
> On Wed, Jun 08, 2016 at 01:09:18PM +0000, Ananyev, Konstantin wrote:
>>>
>>> Hi Konstantin,
>>>
>>> On Wed, Jun 08, 2016 at 10:34:17AM +0000, Ananyev, Konstantin wrote:
>>>> Hi Adrien,
>>>>
>>>>>
>>>>> An assertion failure occurs in __rte_mbuf_raw_free() (called by a few PMDs)
>>>>> when compiling DPDK with CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG and starting
>>>>> applications with a log level high enough to trigger it.
>>>>>
>>>>> While rte_mbuf_raw_alloc() sets refcount to 1, __rte_mbuf_raw_free()
>>>>> expects it to be 0.
>>>>> Considering users are not expected to reset the
>>>>> reference count to satisfy assert() and that raw functions are designed on
>>>>> purpose without safety belts, remove these checks.
>>>>
>>>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg().
>>>> Wright now, it is a user responsibility to make sure refcnt==0 before pushing
>>>> mbuf back to the pool.
>>>> Not sure why do you consider that wrong?
>>>
>>> I do not consider this wrong and I'm all for using assert() to catch
>>> programming errors, however in this specific case, I think they are
>>> inconsistent and misleading.
>>
>> Honestly, I don't understand why.
>> Right now the rule of thumb is - when mbuf is in the pool, it's refcnt should be equal zero.
What is the purpose of this? Is there some code relying on this?
>> Yes, as you pointed below - that rule probably can be changed to:
>> when mbuf is in the pool, it's refcnt should equal one, and that would probably allow us
>> to speedup things a bit, but I suppose that's the matter of another aptch/discussion.
>
> Agreed.
I agree this is somehow another discussion, but let's dive into :)
But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all
PMDs were calling an internal function before). We could argue that
rte_mbuf_raw_free() should also be made public for PMDs.
As you said below, no-one promised that the free() reverts the malloc(),
but given the function names, one can legitimately expect that the
following code is valid:
m = __rte_mbuf_raw_alloc();
/* do nothing here */
__rte_mbuf_raw_free(m);
If no code relies on having the refcnt set to 0 when a mbuf is in
the pool, I suggest to relax this constraint as Adrien proposed.
Then, my opinion is that the refcount should be set to 1 in
rte_pktmbuf_reset(). And rte_pktmbuf_free() should not be allowed on
an uninitialized mbuf (yes, this would require some changes in PMDs).
This would open the door for bulk allocation/free in the mbuf api.
This could be done in several steps:
1/ remove these assert(), add introduce a public rte_mbuf_raw_free()
2/ announce that rte_pktmbuf_free() won't work on uninitialized mbufs
in a future version, and ensure that all PMD are inline with this
requirement
3/ later, move refcount initialization in rte_pktmbuf_reset()
What do you think?
Another option is to remove the rte_mbuf_raw_alloc()/rte_mbuf_raw_free()
and use mempool calls. But having a mbuf wrapper does not seem a bad
thing to me.
By the way, __rte_pktmbuf_prefree_seg() is also an internal function.
I think we should try to update the mbuf API so that the PMDs do not
need to call these internal functions.
[1]
http://dpdk.org/browse/dpdk/commit/?id=fbfd99551ca370266f4bfff58ce441cf5cb1203a
Regards,
Olivier
>
>>>> If the user calls __rte_mbuf_raw_free() manualy it is his responsibility to make
>>>> sure mbuf's refcn==0.
>>>
>>> Sure, however what harm does it cause (besides assert() to fail), since the
>>> allocation function sets refcount to 1?
>>>
>>> Why having the allocation function set the refcount if we are sure it is
>>> already 0 (assert() proves it). Removing rte_mbuf_refcnt_set(m, 1) can
>>> surely improve performance.
>>
>> That's' just an assert() enabled when MBUF_DEBUG is on.
>> Its sole purpose is to help troubleshoot the bugs and help to catch situations
>> when someone silently updates mbufs supposed to be free.
>
> I perfectly understand and I cannot agree more with this explanation,
> however the fact these functions are not symmetrical remains an issue that
> needs to be addressed somehow in my opinion.
>
>>>> BTW, why are you doing it?
>>>> The comment clearly states that the function is for internal use:
>>>> /**
>>>> * @internal Put mbuf back into its original mempool.
>>>> * The use of that function is reserved for RTE internal needs.
>>>> * Please use rte_pktmbuf_free().
>>>> *
>>>> * @param m
>>>> * The mbuf to be freed.
>>>> */
>>>> static inline void __attribute__((always_inline))
>>>> __rte_mbuf_raw_free(struct rte_mbuf *m)
>>>
>>> Several PMDs are using it anyway (won't name names, but I know one of them
>>> quite well).
>>
>> Then it probably is a bug in these PMDs that need to be fixed.
>>
>>> I chose to modify this code instead of its users for the
>>> following reasons:
>>>
>>> - Considering their names, these functions should be opposites and able to
>>> work together like malloc()/free().
>>
>> These are internal functions.
>> Comments in mbuf clearly state that library users shouldn't call them directly.
>> They are written to fit internal librte_mbuf needs, and no-one ever promised
>> malloc/free() compatibility here.
>
> So it cannot be provided for the sake of not providing it or is there a good
> reason?
>
> What I meant is that since PMDs already made the mistake of using these
> functions to benefit from the improved performance, DPDK being all about
> performance and stuff, let them use it as intended. Perhaps we should drop
> those "__" like for rte_mbuf_raw_alloc().
>
>>>
>>> - PMDs are relying on these functions for performance reasons, we can assume
>>> they took the extra care necessary to make sure it would work properly.
>>
>> That just doesn't seem correct to me.
>> The proper way to do free fo mbuf segment is:
>>
>> static inline void __attribute__((always_inline))
>> rte_pktmbuf_free_seg(struct rte_mbuf *m)
>> {
>> if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m)))) {
>> m->next = NULL;
>> __rte_mbuf_raw_free(m);
>> }
>> }
>>
>> If by some reason you choose not to use this function, then it is your
>> responsibility to perform similar actions on your own before pushing mbuf into the pool.
>> That's what some TX functions for some Intel NICs do to improve performance:
>> they call _prefree_seg() manually and try to put mbufs into the pool in groups.
>
> Not anymore it seems, but in the current code base both ena and mpipe PMDs
> (soon mlx5 as well) seem to get this wrong.
>
>>> - Preventing it would make these PMDs slower and is not acceptable either.
>>
>> I can hardly imagine that __rte_pktmbuf_prefree_seg() impact would be that severe...
>> But ok, probably you do have some very specific case, but then why you PMD just doesn't call:
>> rte_mempool_put(m->pool, m);
>> directly?
>
> To survive the upcoming transition to mbufs backed by libc malloc() without
> having to fix them? Joke aside, I guess the reason is to use functions with
> "mbuf" in their names when dealing with mbufs.
>
>> Why instead you choose to change common functions and compromise
>> librte_mbuf debug ability?
>
> No, I'm fine with keeping the debug ability, however I did not find a way to
> both keep it and fix the consistency issue without breaking something
> (performance or refcount assumptions I'm not familiar with elsewhere).
>
>>> What remains is the consistency issue, I think these statements were only
>>> added to catch multiple frees,
>>
>> Yes these asserts() here to help catch bugs,
>> and I think it is a good thing to have them here.
>>
>>> and those should be caught at a higher
>>> level, where other consistency checks are also performed.
>>
>> Like where?
>
> Possibly rte_pktmbuf_free_seg().
>
>>>>> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
>>>>> ---
>>>>> lib/librte_mbuf/rte_mbuf.h | 2 --
>>>>> 1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>>>> index 11fa06d..7070bb8 100644
>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>> @@ -1108,7 +1108,6 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>>>>> if (rte_mempool_get(mp, &mb) < 0)
>>>>> return NULL;
>>>>> m = (struct rte_mbuf *)mb;
>>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
>>>>> rte_mbuf_refcnt_set(m, 1);
>>>>> __rte_mbuf_sanity_check(m, 0);
>>>>>
>>>>> @@ -1133,7 +1132,6 @@ __rte_mbuf_raw_alloc(struct rte_mempool *mp)
>>>>> static inline void __attribute__((always_inline))
>>>>> __rte_mbuf_raw_free(struct rte_mbuf *m)
>>>>> {
>>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
>>>>> rte_mempool_put(m->pool, m);
>>>>> }
>>>>>
>>>>> --
>>>>> 2.1.4
>>>>
>>>
>>> --
>>> Adrien Mazarguil
>>> 6WIND
>
next prev parent reply other threads:[~2016-06-08 14:11 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-08 8:31 [PATCH] mbuf: remove inconsistent assert statements Adrien Mazarguil
2016-06-08 10:34 ` Ananyev, Konstantin
2016-06-08 12:27 ` Adrien Mazarguil
2016-06-08 13:09 ` Ananyev, Konstantin
2016-06-08 13:57 ` Adrien Mazarguil
2016-06-08 14:11 ` Olivier Matz [this message]
2016-06-08 16:07 ` Ananyev, Konstantin
2016-06-09 7:46 ` Olivier Matz
2016-06-09 13:21 ` Ananyev, Konstantin
2016-06-09 14:19 ` Bruce Richardson
2016-06-09 15:27 ` Thomas Monjalon
2016-06-09 15:45 ` Ananyev, Konstantin
2016-06-09 18:42 ` Don Provan
2016-06-20 13:49 ` Adrien Mazarguil
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57582797.8050300@6wind.com \
--to=olivier.matz@6wind.com \
--cc=adrien.mazarguil@6wind.com \
--cc=dev@dpdk.org \
--cc=konstantin.ananyev@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.