public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Jan Kara <jack@suse.com>,
	Kyle McMartin <kyle@kernel.org>,
	Dave Jones <davej@codemonkey.org.uk>,
	Calvin Owens <calvinowens@fb.com>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk
Date: Wed, 20 Jan 2016 00:00:40 +0900	[thread overview]
Message-ID: <20160119150040.GA558@swordfish> (raw)
In-Reply-To: <20160119133136.GA2148@dhcp128.suse.cz>

Hello,

On (01/19/16 14:31), Petr Mladek wrote:
> > On (01/18/16 16:42), Petr Mladek wrote:
> > > On Thu 2016-01-14 13:57:22, Sergey Senozhatsky wrote:
> > > > vprintk_emit() disables preemption around console_trylock_for_printk()
> > > > and console_unlock() calls for a strong reason -- can_use_console()
> > > > check. The thing is that vprintl_emit() can be called on a CPU that
> > > > is not fully brought up yet (!cpu_online()), which potentially can
> > > > cause problems if console driver accesses per-cpu data. A console
> > > > driver can explicitly state that it's safe to call it from !online
> > > > cpu by setting CON_ANYTIME bit in console ->flags. That's why for
> > > > !cpu_online() can_use_console() iterates all the console to find out
> > > > if there is a CON_ANYTIME console, otherwise console_unlock() must be
> > > > avoided.
> > > > 
> > > > call_console_drivers(), called from console_cont_flush() and
> > > > console_unlock(), does the same test during for_each_console() loop.
> > > > However, we can have the following corner case. Assume that we have 2
> > > > cpus -- CPU0 is online, CPU1 is !online; and no CON_ANYTIME consoles
> > > > available.
> > > > 
> > > > CPU0 online                        CPU1 !online
> > > >                                  console_trylock()
> > > >                                  ...
> > > >                                  console_unlock()
> > > 
> > > Please, where this console_unlock() comes from?
> > 
> > from UP* or DOWN* (_PREPARE f.e.) notifiers on this CPU, for example, we don't
> > know what's going on there. what prevents it from calling console_trylock(),
> > grabbing the console_sem and eventually doing console_unlock()? there is
> > a can_use_console() check, but it handles only one case -- printk().
> 
> So, is it a theoretical problem or do you know about any
> particular path where this happens?

a theoretical one at this point.

> Well, it might make sense to get rid of console_trylock_for_printk()
> and do the can_use_console() check at the beginning of
> unlock_console(). I mean to do
> 
> -	if (console_trylock_for_printk())
> +	if (console_trylock())
> 		unlock_console();

agree, I almost removed it, but decided to keep for a while,
just in case if we would want to add any additional code there.


> But do we really need to repeat the check in every cycle?

well, on every iteration in the best case we check cpu_online()
only. which is what we would have done anyway in vprintk_emit(),
so no additional checks added. at the same time call_console_drivers
does not do '!cpu_online && !CON_ANYTIME' for each console now, so
in some sense there are less checks now (this is far even from a
micro-optimization, just noted).

> can_use_console() checks available consoles and if the CPU
> is online. Consoles could not get added or removed when
> we own console_sem. It seems that CPUs get disabled
> in a process context. Therefore it seems that it might happen
> only when unlock_console() gets rescheduled. I guess that
> it could not get scheduled back on an offline CPU. So, it
> seems that it is enough to check can_use_console() only once at
> the beginning.
> 
> Or did I miss anything?

It does sound interesting, thanks. I was trying to keep the existing
behaviour.

It almost works, there is one case.

console_unlock()    /* w/o can_use_console() in logbuf_lock section */
....
again:
	for (;;) {
		raw_spin_lock_irqsave logbuf_lock
		msg_print_text
		raw_spin_unlock logbuf_lock
		call_console_drivers
		local_irq_restore
	}

	up_console_sem

	raw_spin_lock logbuf_lock
	retry = console_seq != log_next_seq
	raw_spin_unlock_irqrestore logbuf_lock
	
	if retry && console_trylock()
		goto again;


when we up_console_sem(), consoles may appear and disappear, since we
don't keep the semaphore. Suppose that we have OFFLINE cpu and we had a
CON_ANYTIME console, but in between up_console_sem--console_trylock
that single CON_ANYTIME console was removed. So now we have !cpu_online
and !CON_ANYTIME.

So this is why I re-do the can_use_console() check. I can add a bool flag
and do the can_use_console() only once -- when the flag is false, set it
to true on successful can_use_console() and avoid can_use_console() during
this loop; and set the flag to false right after the 'again:' label, which
will imply that console semaphore has been re-acquired and we need to
can_use_console() again.

something like this, perhaps (to be folded.. or should it be a separate
commit -- optimization). will give a test tomorrow.

I reused `bool retry' flag (which is probably a bit hacky, it'll be
better to have a separate byte for this thing). And I moved
can_use_console() from console_cont_flush() to the top -- so if we
are executing the `for (;;)' loop, then can_use_console() was
successful in console_cont_flush() and we have to re-do
can_use_console() only if we released console_sem and jumped to
`again' label.

---
 kernel/printk/printk.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 28b2dec..c7a5ce8 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2170,6 +2170,11 @@ static bool console_cont_flush(char *text, size_t size)
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
 
+	if (!can_use_console()) {
+		ret = false;
+		goto out;
+	}
+
 	if (!cont.len)
 		goto out;
 
@@ -2181,11 +2186,6 @@ static bool console_cont_flush(char *text, size_t size)
 	if (console_seq < log_next_seq && !cont.cons)
 		goto out;
 
-	if (!can_use_console()) {
-		ret = false;
-		goto out;
-	}
-
 	len = cont_print_text(text, size);
 	raw_spin_unlock(&logbuf_lock);
 	stop_critical_timings();
@@ -2219,7 +2219,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
-	bool do_cond_resched, retry, aborted = false;
+	bool do_cond_resched, retry = false, aborted = false;
 
 	if (console_suspended) {
 		up_console_sem();
@@ -2255,9 +2255,20 @@ again:
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
 
-		if (!can_use_console()) {
-			aborted = true;
-			break;
+		/*
+		 * @retry == true implies that we have released console_sem
+		 * and, thus, someone could have added/removed console(-s).
+		 * we need to can_use_console() again. @retry intially set
+		 * to false, because console_cont_flush() does the
+		 * can_use_console() check and we can't execute this loop
+		 * in can_use_console() returned `false` there.
+		 */
+		if (retry) {
+			retry = false;
+			if (!can_use_console()) {
+				aborted = true;
+				break;
+			}
 		}
 
 		if (seen_seq != log_next_seq) {
-- 
2.7.0

  reply	other threads:[~2016-01-19 15:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-14  4:57 [RFC][PATCH -next 0/2] cond_resched() some of console_trylock callers Sergey Senozhatsky
2016-01-14  4:57 ` [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk Sergey Senozhatsky
2016-01-14  5:59   ` [PATCH " Sergey Senozhatsky
2016-01-18 15:42   ` [RFC][PATCH -next " Petr Mladek
2016-01-19  0:42     ` Sergey Senozhatsky
2016-01-19 13:31       ` Petr Mladek
2016-01-19 15:00         ` Sergey Senozhatsky [this message]
2016-01-19 16:16           ` Petr Mladek
2016-01-20  4:18             ` Sergey Senozhatsky
2016-01-20 10:09               ` Petr Mladek
2016-01-14  4:57 ` [RFC][PATCH -next 2/2] printk: set may_schedule for some of console_trylock callers Sergey Senozhatsky
2016-01-17 14:11   ` Sergey Senozhatsky
2016-01-17 14:23     ` Sergey Senozhatsky
2016-01-18 16:17       ` Petr Mladek
2016-01-19  1:15         ` Sergey Senozhatsky
2016-01-19 15:18           ` Petr Mladek
2016-01-20  3:50             ` Sergey Senozhatsky
2016-01-20 11:51               ` Sergey Senozhatsky
2016-01-20 12:38                 ` Petr Mladek
2016-01-20 12:31               ` Petr Mladek
2016-01-21  1:25                 ` Sergey Senozhatsky
2016-01-21  5:51                   ` Sergey Senozhatsky
2016-01-22  9:48                     ` Petr Mladek
2016-01-23  4:46                       ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160119150040.GA558@swordfish \
    --to=sergey.senozhatsky@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=calvinowens@fb.com \
    --cc=davej@codemonkey.org.uk \
    --cc=jack@suse.com \
    --cc=kyle@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox