From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Andrea Parri <parri.andrea@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Paul McKenney <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: data_ring head_lpos and tail_lpos synchronization: was [PATCH v2 2/3] printk: add lockless buffer
Date: Wed, 10 Jun 2020 09:53:16 +0200 [thread overview]
Message-ID: <87wo4fpdar.fsf@vostro.fn.ogness.net> (raw)
In-Reply-To: <20200609143811.GF23752@linux-b0ei> (Petr Mladek's message of "Tue, 9 Jun 2020 16:38:11 +0200")
On 2020-06-09, Petr Mladek <pmladek@suse.com> wrote:
>> --- /dev/null
>> +++ b/kernel/printk/printk_ringbuffer.c
>> +/*
>> + * Advance the data ring tail to at least @lpos. This function puts
>> + * descriptors into the reusable state if the tail is pushed beyond
>> + * their associated data block.
>> + */
>> +static bool data_push_tail(struct printk_ringbuffer *rb,
>> + struct prb_data_ring *data_ring,
>> + unsigned long lpos)
>> +{
>> + unsigned long tail_lpos;
>> + unsigned long next_lpos;
>> +
>> + /* If @lpos is not valid, there is nothing to do. */
>> + if (lpos == INVALID_LPOS)
>> + return true;
>> +
>> + tail_lpos = atomic_long_read(&data_ring->tail_lpos);
>
> Hmm, I wonder whether data_ring->tail_lpos and data_ring->head_lpos
> are synchronized enough between each other.
>
> I feel that there should be read barrier here. But it seems that more
> barriers are missing. For example, let's have:
>
>
> CPU0 CPU1
>
> data_alloc()
> begin_lpos = atomic_read(data_ring->head_lpos);
>
> data_alloc()
> data_push_tail()
> cmpxchg(data_ring->tail_lpos);
> // A: no barrier
> cmpxchg(data_ring->head_lpos);
>
> data_push_tail()
> // B: no barrier
> tail_lpos = atomic_read(data_ring->tail_lpos);
>
> Problem 1:
>
> CPU0 might see random ordering of data_ring->tail_lpos and
> head_lpos values modified by CPU1. There are missing both
> write and read barriers.
You need to explain why this is a problem. CPU0 saw some head and some
tail. Both values are at least the current values (i.e. there is no
danger that it sees a tail that is further than the tail really is).
CPU0 then uses the head/tail values to determine how far to advance the
tail and how far to advance the head. Both of these advances use
cmpxchg_relaxed(). So there is no danger of random head/tail
modifications. Upon cmpxchg_relaxed() failure, the new current values
are loaded and it retries based on the new values.
The only issue is if data_push_tail()/data_make_reusable() are able to
recognize that a data area is already recycled. And both functions have
memory barriers in place for that.
> Problem 2:
>
> There might be still a chance because CPU0 does:
>
> if (!data_make_reusable())
> smp_rmb()
> tail_lpos = atomic_read(data_ring->tail_lpos);
>
> But CPU0 might still see old data_ring->tail because CPU1 did not
> do write barrier.
I claim that it does not matter. The smp_rmb() here pairs with the full
memory barrier LMM(desc_reserve:D). The reasoning:
* Guarantee any data ring tail changes are stored before
* recycling the descriptor. Data ring tail changes can happen
* via desc_push_tail()->data_push_tail(). A full memory
* barrier is needed since another task may have pushed the
* data ring tails. This pairs with data_push_tail:A.
So if data_make_reusable() failed due to a descriptor already being
recycled, we know CPU0 will be able to read an updated tail value (and
try again with the new value).
John Ogness
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
WARNING: multiple messages have this Message-ID (diff)
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Andrea Parri <parri.andrea@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
Paul McKenney <paulmck@kernel.org>
Subject: Re: data_ring head_lpos and tail_lpos synchronization: was [PATCH v2 2/3] printk: add lockless buffer
Date: Wed, 10 Jun 2020 09:53:16 +0200 [thread overview]
Message-ID: <87wo4fpdar.fsf@vostro.fn.ogness.net> (raw)
In-Reply-To: <20200609143811.GF23752@linux-b0ei> (Petr Mladek's message of "Tue, 9 Jun 2020 16:38:11 +0200")
On 2020-06-09, Petr Mladek <pmladek@suse.com> wrote:
>> --- /dev/null
>> +++ b/kernel/printk/printk_ringbuffer.c
>> +/*
>> + * Advance the data ring tail to at least @lpos. This function puts
>> + * descriptors into the reusable state if the tail is pushed beyond
>> + * their associated data block.
>> + */
>> +static bool data_push_tail(struct printk_ringbuffer *rb,
>> + struct prb_data_ring *data_ring,
>> + unsigned long lpos)
>> +{
>> + unsigned long tail_lpos;
>> + unsigned long next_lpos;
>> +
>> + /* If @lpos is not valid, there is nothing to do. */
>> + if (lpos == INVALID_LPOS)
>> + return true;
>> +
>> + tail_lpos = atomic_long_read(&data_ring->tail_lpos);
>
> Hmm, I wonder whether data_ring->tail_lpos and data_ring->head_lpos
> are synchronized enough between each other.
>
> I feel that there should be read barrier here. But it seems that more
> barriers are missing. For example, let's have:
>
>
> CPU0 CPU1
>
> data_alloc()
> begin_lpos = atomic_read(data_ring->head_lpos);
>
> data_alloc()
> data_push_tail()
> cmpxchg(data_ring->tail_lpos);
> // A: no barrier
> cmpxchg(data_ring->head_lpos);
>
> data_push_tail()
> // B: no barrier
> tail_lpos = atomic_read(data_ring->tail_lpos);
>
> Problem 1:
>
> CPU0 might see random ordering of data_ring->tail_lpos and
> head_lpos values modified by CPU1. There are missing both
> write and read barriers.
You need to explain why this is a problem. CPU0 saw some head and some
tail. Both values are at least the current values (i.e. there is no
danger that it sees a tail that is further than the tail really is).
CPU0 then uses the head/tail values to determine how far to advance the
tail and how far to advance the head. Both of these advances use
cmpxchg_relaxed(). So there is no danger of random head/tail
modifications. Upon cmpxchg_relaxed() failure, the new current values
are loaded and it retries based on the new values.
The only issue is if data_push_tail()/data_make_reusable() are able to
recognize that a data area is already recycled. And both functions have
memory barriers in place for that.
> Problem 2:
>
> There might be still a chance because CPU0 does:
>
> if (!data_make_reusable())
> smp_rmb()
> tail_lpos = atomic_read(data_ring->tail_lpos);
>
> But CPU0 might still see old data_ring->tail because CPU1 did not
> do write barrier.
I claim that it does not matter. The smp_rmb() here pairs with the full
memory barrier LMM(desc_reserve:D). The reasoning:
* Guarantee any data ring tail changes are stored before
* recycling the descriptor. Data ring tail changes can happen
* via desc_push_tail()->data_push_tail(). A full memory
* barrier is needed since another task may have pushed the
* data ring tails. This pairs with data_push_tail:A.
So if data_make_reusable() failed due to a descriptor already being
recycled, we know CPU0 will be able to read an updated tail value (and
try again with the new value).
John Ogness
next prev parent reply other threads:[~2020-06-10 7:53 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-01 9:40 [PATCH v2 0/3] printk: replace ringbuffer John Ogness
2020-05-01 9:40 ` John Ogness
2020-05-01 9:40 ` [PATCH v2 1/3] crash: add VMCOREINFO macro for anonymous structs John Ogness
2020-05-01 9:40 ` John Ogness
2020-06-03 10:16 ` Petr Mladek
2020-06-03 10:16 ` Petr Mladek
2020-05-01 9:40 ` [PATCH v2 2/3] printk: add lockless buffer John Ogness
2020-05-01 9:40 ` John Ogness
2020-05-18 13:03 ` John Ogness
2020-05-18 17:22 ` Linus Torvalds
2020-05-18 17:22 ` Linus Torvalds
2020-05-19 20:34 ` John Ogness
2020-05-19 20:34 ` John Ogness
2020-06-09 7:10 ` blk->id read race: was: " Petr Mladek
2020-06-09 7:10 ` Petr Mladek
2020-06-09 14:18 ` John Ogness
2020-06-09 14:18 ` John Ogness
2020-06-10 8:42 ` Petr Mladek
2020-06-10 8:42 ` Petr Mladek
2020-06-10 13:55 ` John Ogness
2020-06-10 13:55 ` John Ogness
2020-06-09 9:31 ` redundant check in make_data_reusable(): was " Petr Mladek
2020-06-09 9:31 ` Petr Mladek
2020-06-09 14:48 ` John Ogness
2020-06-09 14:48 ` John Ogness
2020-06-10 9:38 ` Petr Mladek
2020-06-10 9:38 ` Petr Mladek
2020-06-10 10:24 ` John Ogness
2020-06-10 10:24 ` John Ogness
2020-06-10 14:56 ` John Ogness
2020-06-10 14:56 ` John Ogness
2020-06-11 19:51 ` John Ogness
2020-06-11 19:51 ` John Ogness
2020-06-11 13:55 ` Petr Mladek
2020-06-11 13:55 ` Petr Mladek
2020-06-11 20:25 ` John Ogness
2020-06-11 20:25 ` John Ogness
2020-06-09 9:48 ` Full barrier in data_push_tail(): " Petr Mladek
2020-06-09 9:48 ` Petr Mladek
2020-06-09 15:03 ` John Ogness
2020-06-09 15:03 ` John Ogness
2020-06-09 11:37 ` Barrier before pushing desc_ring tail: " Petr Mladek
2020-06-09 11:37 ` Petr Mladek
2020-06-09 15:56 ` John Ogness
2020-06-09 15:56 ` John Ogness
2020-06-11 12:01 ` Petr Mladek
2020-06-11 12:01 ` Petr Mladek
2020-06-11 23:06 ` John Ogness
2020-06-11 23:06 ` John Ogness
2020-06-09 14:38 ` data_ring head_lpos and tail_lpos synchronization: " Petr Mladek
2020-06-09 14:38 ` Petr Mladek
2020-06-10 7:53 ` John Ogness [this message]
2020-06-10 7:53 ` John Ogness
2020-05-01 9:40 ` [PATCH v2 3/3] printk: use the lockless ringbuffer John Ogness
2020-05-01 9:40 ` John Ogness
2020-05-06 14:50 ` John Ogness
2020-05-06 14:50 ` John Ogness
2020-05-13 12:05 ` [PATCH v2 0/3] printk: replace ringbuffer Prarit Bhargava
2020-05-13 12:05 ` Prarit Bhargava
2020-05-15 10:24 ` Sergey Senozhatsky
2020-05-15 10:24 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wo4fpdar.fsf@vostro.fn.ogness.net \
--to=john.ogness@linutronix.de \
--cc=gregkh@linuxfoundation.org \
--cc=kexec@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=parri.andrea@gmail.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.