RE: [PATCH 1/1] ring: safe partial ordering for head/tail update

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: Ola Liljedahl <Ola.Liljedahl@arm.com>,
	Wathsala Vithanage <wathsala.vithanage@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Dhruv Tripathi <Dhruv.Tripathi@arm.com>,
	Bruce Richardson <bruce.richardson@intel.com>
Subject: RE: [PATCH 1/1] ring: safe partial ordering for head/tail update
Date: Wed, 24 Sep 2025 15:03:35 +0000	[thread overview]
Message-ID: <4cec8b9ca5114f90b8ce1047abdaa936@huawei.com> (raw)
In-Reply-To: <BD3A1BB1-B778-46DE-94BA-6E27DA4B0B08@arm.com>



> > > > > > > Sure, I am talking about MT scenario.
> > > > > > > I think I already provided an example: DPDK mempool library (see below).
> > > > > > > In brief, It works like that:
> > > > > > > At init it allocates ring of N memory buffers and ring big enough to hold all
> of
> > > > > them.
> > > > > >
> > > > > > Sorry, I meant to say: "it allocates N memory buffers and ring big enough to
> > > hold
> > > > > all of them".
> > > > > >
> > > > > > > Then it enqueues all allocated memory buffers into the ring.
> > > > > > > mempool_get - retrieves (dequeues) buffers from the ring.
> > > > > > > mempool_put - puts them back (enqueues) to the ring
> > > > > > > get() might fail (ENOMEM), while put is expected to always succeed.
> > > > > But how does the thread which calls mempool_put() get hold of the memory
> > > buffers
> > > > > that
> > > > > were obtained using mempool_get() by some other thread? Or this is not the
> > > > > scenario you
> > > > > are worrying about?
> > > > > Is it rather that multiple threads independently call mempool_get() and then
> > > > > mempool_put()
> > > > > on their own buffers? And you are worried that a thread will fail to return
> > > > > (mempool_put) a
> > > > > buffer that it earlier allocated (mempool_get)? We could create a litmus test
> for
> > > > > that.
> > > >
> > > >
> > > > Both scenarios are possible.
> > > > For Run-To-Completion model each thread usually does: allocate/use/free
> group
> > > of mbufs.
> > > > For pipleline model one thread can allocate bunch of mbufs, then pass them to
> > > other
> > > thread (via another ring for example) for further processing and then releasing.
> > > In the pipeline model, if the last stage (thread) frees (enqueues) buffers onto
> some
> > > ring buffer
> > > and the first stage (thread) allocates (dequeues) buffers from the same ring
> buffer
> > > but there
> > > isn't any other type of synchronization between the threads, we can never
> guarantee
> > > that
> > > the first thread will be able to dequeue buffers because it doesn't know whether
> the
> > > last
> > > thread has enqueued any buffers.
> >
> >
> > Yes, as I said above - for mempool use-case: dequeue can fail, enqueue should
> always succeed.
> > The closest analogy: malloc() can fail, free() should never fail.
> >
> >
> > >
> > > However, enqueue ought to always succeed. We should be able to create a
> litmus
> > > test for that.
> > > Ring 1 is used as mempool, it initially contains capacity elements (full).
> > > Ring 2 is used as pipe between stages 1 and 2, it initially contains 0 elements
> (empty).
> > > Thread 1 allocates/dequeues a buffer from ring 1.
> > > Thread 1 enqueues that buffer onto ring 2.
> > > Thread 2 dequeues a buffer from ring 2.
> > > Thread 2 frees/enqueues that buffer onto ring 1. <<< this must succeed!
> > > Does this reflect the situation you worry about?
> >
> >
> > This is one of the possible scenarios.
> > As I said above - mempool_put() is expected to always be able to enqueue element
> to the ring.
> > TBH, I am not sure what you are trying to prove with the litmus test.
> With a litmus test, we can prove that a specific situation can or cannot occur when
> executing the specified memory accesses in multiple threads. By experimenting with
> different memory access sequences and orderings, we can find the most relaxed
> sequence that still prohibits undesired results.
> 
> So with a litmus test, I could prove that enqueue in thread 2 above will succeed or
> alternatively, may not succeed (which is undesired). And I could find out what
> orderings
> are required to guarantee success.

I understand that part, what I am trying to say: this is just one of possible scenarios.
We could have multiple threads, multiple allocators, and multiple releasers,
and/or threads that will perform both.
Not to mention that this is just one library (mempool), there probably some others
Within DPDK that have similar expectations, not to mention third-party code. 
I am not trying to stop you from run whatever tests you like,
I am just a bit skeptical that it will cover all possible scenarios.

> > Looking at the changes you proposed:
> >
> >
> > + /*
> > + * Ensure the entries calculation was not based on a stale
> > + * and unsafe stail observation that causes underflow.
> > + */
> > + if ((int)*entries < 0)
> > + *entries = 0;
> > +
> >
> >
> > /* check that we have enough room in ring */
> > if (unlikely(n > *entries))
> > n = (behavior == RTE_RING_QUEUE_FIXED) ?
> > 0 : *entries;
> >
> >
> > *new_head = *old_head + n;
> > if (n == 0)
> > return 0;
> >
> >
> > It is clear that with these changes enqueue/dequeue might fail even
> > when there are available entries in the ring.
> Without explicit synchronization between the threads, they will still race and
> a consumer cannot be guaranteed to succeed with its dequeue operation, e.g.
> the dequeue operation could occur before the (supposedly) matching enqueue
> operation in another thread.
> 
> Using acquire/release for all ring buffer metadata accesses doesn't inform the
> thread that its operation is guaranteed to succeed, it just ensures any data is
> transferred (or passed ownership) in a safe way (establishes a happens-before
> relation). But the ring buffer implementation itself cannot ensure that the
> enqueue in thread P happens before the dequeue in thread C. The dequeue
> can still fail and return 0 elements.
> 
> How do you define "there are available entries in the ring"?
> Just reading one
> metadata variable doesn't tell you this. And I assume users usually don't access
> internal ring buffer metadata. So how does a thread know there are available
> entries in the ring that can be dequeued?

As I said - dequeue can fail, that's ok.
enqueue never should. 
As I explained, that is dictated by mempool lib design.
We create a ring with capacity=N, then put N objects into it.
Then we just dequeue/enqueue these elements from/to the ring.
We never try to put any other elements into the ring, so there always
has to be enough space in the ring to return object back to it.  
 
> It seems to me that your perspective is still very much that of sequential
> consistency and total order. But even so, you need synchronization (usually
> based on memory accesses but there are other ways, e.g. using system calls)
> to ensure that operation D in thread 2 is only initiated after operation E in
> thread 1 has completed. Otherwise, the operations will race and the outcome
> is not guaranteed.
> 
> - Ola
> 
> > One simple way that probably will introduce a loop instead of 'if':
> > (keep reading head and tail values till we get a valid results)
> > but again I am not sure it is a good way.
> > Konstantin
> 
> 
> 
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended recipient, please
> notify the sender immediately and do not disclose the contents to any other person,
> use it for any purpose, or store or copy the information in any medium. Thank you.

next prev parent reply	other threads:[~2025-09-24 15:03 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-15 18:54 [PATCH 0/1] ring: correct ordering issue in head/tail update Wathsala Vithanage
2025-09-15 18:54 ` [PATCH 1/1] ring: safe partial ordering for " Wathsala Vithanage
2025-09-16 15:42   ` Bruce Richardson
2025-09-16 18:19     ` Ola Liljedahl
2025-09-17  7:47       ` Bruce Richardson
2025-09-17 15:06         ` Stephen Hemminger
2025-09-18 17:40         ` Wathsala Vithanage
2025-09-16 22:57   ` Konstantin Ananyev
2025-09-16 23:08     ` Konstantin Ananyev
     [not found]     ` <2a611c3cf926d752a54b7655c27d6df874a2d0de.camel@arm.com>
2025-09-17  7:58       ` Konstantin Ananyev
2025-09-17  9:05         ` Ola Liljedahl
2025-09-20 12:01           ` Konstantin Ananyev
     [not found]             ` <cf7e14d4ba5e9d78fddf083b6c92d75942447931.camel@arm.com>
2025-09-22  7:12               ` Konstantin Ananyev
2025-09-23 21:57             ` Ola Liljedahl
2025-09-24  6:56               ` Konstantin Ananyev
2025-09-24  7:50                 ` Konstantin Ananyev
2025-09-24  8:51                   ` Ola Liljedahl
2025-09-24 10:08                     ` Konstantin Ananyev
2025-09-24 11:27                       ` Ola Liljedahl
2025-09-24 11:50                         ` Konstantin Ananyev
2025-09-24 13:28                           ` Ola Liljedahl
2025-09-24 15:03                             ` Konstantin Ananyev [this message]
2025-09-25  4:29                               ` Morten Brørup
2025-09-25  7:11                                 ` Konstantin Ananyev
2025-09-24 15:24               ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4cec8b9ca5114f90b8ce1047abdaa936@huawei.com \
    --to=konstantin.ananyev@huawei.com \
    --cc=Dhruv.Tripathi@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=Ola.Liljedahl@arm.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=wathsala.vithanage@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.