All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Andrea Parri <parri.andrea@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Paul McKenney <paulmck@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Barrier before pushing desc_ring tail: was [PATCH v2 2/3] printk: add lockless buffer
Date: Fri, 12 Jun 2020 01:06:28 +0200	[thread overview]
Message-ID: <87bllpyzgr.fsf@vostro.fn.ogness.net> (raw)
In-Reply-To: <20200611120107.GD6581@linux-b0ei> (Petr Mladek's message of "Thu, 11 Jun 2020 14:01:08 +0200")

On 2020-06-11, Petr Mladek <pmladek@suse.com> wrote:
> All this relies on the fact the the full barrier is called in
> data_push_tail() and data_push_tail() is called right above.
> But there are two situations where the barrier is not called.
> It is when:
>
>   1. desc.text_blk_lpos.next already is behind data_ring->tail_lpos.
>
>      This is safe.

Yes, and I have since expanded the comment above the data_push_tail()
while loop to mention that:

	/*
	 * Loop until the tail lpos is at or beyond @lpos. This condition
	 * may already be satisfied, resulting in no full memory barrier
	 * from data_push_tail:C being performed. However, since this CPU
	 * sees the new tail lpos, any descriptor states that transitioned to
	 * the reusable state must already be visible.
	 */

>   2. desc.text_blk_lpos == INVALID_LPOS.
>
>      It seems that this is not synchronized and other CPUs might see
>      the old state.

Great catch! This could trigger the WARN_ON at desc_reserve:F and lead
to prb_reserve() unnecessarily failing.

> The question is what to do with the empty data case. I see three
> possibilities:
>
>   1. Ignore the case with empty block because it (probably) does not
>      cause real problems.

It could cause dropped messages.

>   2. Call the full barrier in data_push_tail() even when the data
>      block is empty.

This is the common case, since most records will not have dictionary
data.

>   3. Call the full barrier also in desc_push_tail() as I suggested.
>
> My opinion:
>
> I prefer 3rd solution.

Agreed. For my next version I am folding all smp_mb() and smp_wmb()
calls into their neighboring cmpxchg_relaxed(). This would apply here as
well, changing desc_push_tail:B to a full cmpxchg().

We still need the full memory barrier at data_push_tail:C. Readers rely
on it to be able to verify if their copied data is still valid:

CPU0 (writer0)        CPU1 (writer1)       CPU2 (reader)
                                           desc_read->committed
desc_make_reusable
smp_mb
data_push_tail
                      read new data tail
                      data_push_head
                      smp_mb
                      write new data
                                           read garbage new data
                                           desc_read->reusable

John Ogness

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

WARNING: multiple messages have this Message-ID (diff)
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrea Parri <parri.andrea@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	Paul McKenney <paulmck@kernel.org>
Subject: Re: Barrier before pushing desc_ring tail: was [PATCH v2 2/3] printk: add lockless buffer
Date: Fri, 12 Jun 2020 01:06:28 +0200	[thread overview]
Message-ID: <87bllpyzgr.fsf@vostro.fn.ogness.net> (raw)
In-Reply-To: <20200611120107.GD6581@linux-b0ei> (Petr Mladek's message of "Thu, 11 Jun 2020 14:01:08 +0200")

On 2020-06-11, Petr Mladek <pmladek@suse.com> wrote:
> All this relies on the fact the the full barrier is called in
> data_push_tail() and data_push_tail() is called right above.
> But there are two situations where the barrier is not called.
> It is when:
>
>   1. desc.text_blk_lpos.next already is behind data_ring->tail_lpos.
>
>      This is safe.

Yes, and I have since expanded the comment above the data_push_tail()
while loop to mention that:

	/*
	 * Loop until the tail lpos is at or beyond @lpos. This condition
	 * may already be satisfied, resulting in no full memory barrier
	 * from data_push_tail:C being performed. However, since this CPU
	 * sees the new tail lpos, any descriptor states that transitioned to
	 * the reusable state must already be visible.
	 */

>   2. desc.text_blk_lpos == INVALID_LPOS.
>
>      It seems that this is not synchronized and other CPUs might see
>      the old state.

Great catch! This could trigger the WARN_ON at desc_reserve:F and lead
to prb_reserve() unnecessarily failing.

> The question is what to do with the empty data case. I see three
> possibilities:
>
>   1. Ignore the case with empty block because it (probably) does not
>      cause real problems.

It could cause dropped messages.

>   2. Call the full barrier in data_push_tail() even when the data
>      block is empty.

This is the common case, since most records will not have dictionary
data.

>   3. Call the full barrier also in desc_push_tail() as I suggested.
>
> My opinion:
>
> I prefer 3rd solution.

Agreed. For my next version I am folding all smp_mb() and smp_wmb()
calls into their neighboring cmpxchg_relaxed(). This would apply here as
well, changing desc_push_tail:B to a full cmpxchg().

We still need the full memory barrier at data_push_tail:C. Readers rely
on it to be able to verify if their copied data is still valid:

CPU0 (writer0)        CPU1 (writer1)       CPU2 (reader)
                                           desc_read->committed
desc_make_reusable
smp_mb
data_push_tail
                      read new data tail
                      data_push_head
                      smp_mb
                      write new data
                                           read garbage new data
                                           desc_read->reusable

John Ogness

  reply	other threads:[~2020-06-11 23:06 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-01  9:40 [PATCH v2 0/3] printk: replace ringbuffer John Ogness
2020-05-01  9:40 ` John Ogness
2020-05-01  9:40 ` [PATCH v2 1/3] crash: add VMCOREINFO macro for anonymous structs John Ogness
2020-05-01  9:40   ` John Ogness
2020-06-03 10:16   ` Petr Mladek
2020-06-03 10:16     ` Petr Mladek
2020-05-01  9:40 ` [PATCH v2 2/3] printk: add lockless buffer John Ogness
2020-05-01  9:40   ` John Ogness
2020-05-18 13:03   ` John Ogness
2020-05-18 17:22     ` Linus Torvalds
2020-05-18 17:22       ` Linus Torvalds
2020-05-19 20:34       ` John Ogness
2020-05-19 20:34         ` John Ogness
2020-06-09  7:10   ` blk->id read race: was: " Petr Mladek
2020-06-09  7:10     ` Petr Mladek
2020-06-09 14:18     ` John Ogness
2020-06-09 14:18       ` John Ogness
2020-06-10  8:42       ` Petr Mladek
2020-06-10  8:42         ` Petr Mladek
2020-06-10 13:55         ` John Ogness
2020-06-10 13:55           ` John Ogness
2020-06-09  9:31   ` redundant check in make_data_reusable(): was " Petr Mladek
2020-06-09  9:31     ` Petr Mladek
2020-06-09 14:48     ` John Ogness
2020-06-09 14:48       ` John Ogness
2020-06-10  9:38       ` Petr Mladek
2020-06-10  9:38         ` Petr Mladek
2020-06-10 10:24         ` John Ogness
2020-06-10 10:24           ` John Ogness
2020-06-10 14:56           ` John Ogness
2020-06-10 14:56             ` John Ogness
2020-06-11 19:51             ` John Ogness
2020-06-11 19:51               ` John Ogness
2020-06-11 13:55           ` Petr Mladek
2020-06-11 13:55             ` Petr Mladek
2020-06-11 20:25             ` John Ogness
2020-06-11 20:25               ` John Ogness
2020-06-09  9:48   ` Full barrier in data_push_tail(): " Petr Mladek
2020-06-09  9:48     ` Petr Mladek
2020-06-09 15:03     ` John Ogness
2020-06-09 15:03       ` John Ogness
2020-06-09 11:37   ` Barrier before pushing desc_ring tail: " Petr Mladek
2020-06-09 11:37     ` Petr Mladek
2020-06-09 15:56     ` John Ogness
2020-06-09 15:56       ` John Ogness
2020-06-11 12:01       ` Petr Mladek
2020-06-11 12:01         ` Petr Mladek
2020-06-11 23:06         ` John Ogness [this message]
2020-06-11 23:06           ` John Ogness
2020-06-09 14:38   ` data_ring head_lpos and tail_lpos synchronization: " Petr Mladek
2020-06-09 14:38     ` Petr Mladek
2020-06-10  7:53     ` John Ogness
2020-06-10  7:53       ` John Ogness
2020-05-01  9:40 ` [PATCH v2 3/3] printk: use the lockless ringbuffer John Ogness
2020-05-01  9:40   ` John Ogness
2020-05-06 14:50   ` John Ogness
2020-05-06 14:50     ` John Ogness
2020-05-13 12:05 ` [PATCH v2 0/3] printk: replace ringbuffer Prarit Bhargava
2020-05-13 12:05   ` Prarit Bhargava
2020-05-15 10:24 ` Sergey Senozhatsky
2020-05-15 10:24   ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bllpyzgr.fsf@vostro.fn.ogness.net \
    --to=john.ogness@linutronix.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=parri.andrea@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.