Re: [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Jan Kara <jack@suse.com>,
	Kyle McMartin <kyle@kernel.org>,
	Dave Jones <davej@codemonkey.org.uk>,
	Calvin Owens <calvinowens@fb.com>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk
Date: Wed, 20 Jan 2016 00:00:40 +0900	[thread overview]
Message-ID: <20160119150040.GA558@swordfish> (raw)
In-Reply-To: <20160119133136.GA2148@dhcp128.suse.cz>

Hello,

On (01/19/16 14:31), Petr Mladek wrote:
> > On (01/18/16 16:42), Petr Mladek wrote:
> > > On Thu 2016-01-14 13:57:22, Sergey Senozhatsky wrote:
> > > > vprintk_emit() disables preemption around console_trylock_for_printk()
> > > > and console_unlock() calls for a strong reason -- can_use_console()
> > > > check. The thing is that vprintl_emit() can be called on a CPU that
> > > > is not fully brought up yet (!cpu_online()), which potentially can
> > > > cause problems if console driver accesses per-cpu data. A console
> > > > driver can explicitly state that it's safe to call it from !online
> > > > cpu by setting CON_ANYTIME bit in console ->flags. That's why for
> > > > !cpu_online() can_use_console() iterates all the console to find out
> > > > if there is a CON_ANYTIME console, otherwise console_unlock() must be
> > > > avoided.
> > > > 
> > > > call_console_drivers(), called from console_cont_flush() and
> > > > console_unlock(), does the same test during for_each_console() loop.
> > > > However, we can have the following corner case. Assume that we have 2
> > > > cpus -- CPU0 is online, CPU1 is !online; and no CON_ANYTIME consoles
> > > > available.
> > > > 
> > > > CPU0 online                        CPU1 !online
> > > >                                  console_trylock()
> > > >                                  ...
> > > >                                  console_unlock()
> > > 
> > > Please, where this console_unlock() comes from?
> > 
> > from UP* or DOWN* (_PREPARE f.e.) notifiers on this CPU, for example, we don't
> > know what's going on there. what prevents it from calling console_trylock(),
> > grabbing the console_sem and eventually doing console_unlock()? there is
> > a can_use_console() check, but it handles only one case -- printk().
> 
> So, is it a theoretical problem or do you know about any
> particular path where this happens?

a theoretical one at this point.

> Well, it might make sense to get rid of console_trylock_for_printk()
> and do the can_use_console() check at the beginning of
> unlock_console(). I mean to do
> 
> -	if (console_trylock_for_printk())
> +	if (console_trylock())
> 		unlock_console();

agree, I almost removed it, but decided to keep for a while,
just in case if we would want to add any additional code there.


> But do we really need to repeat the check in every cycle?

well, on every iteration in the best case we check cpu_online()
only. which is what we would have done anyway in vprintk_emit(),
so no additional checks added. at the same time call_console_drivers
does not do '!cpu_online && !CON_ANYTIME' for each console now, so
in some sense there are less checks now (this is far even from a
micro-optimization, just noted).

> can_use_console() checks available consoles and if the CPU
> is online. Consoles could not get added or removed when
> we own console_sem. It seems that CPUs get disabled
> in a process context. Therefore it seems that it might happen
> only when unlock_console() gets rescheduled. I guess that
> it could not get scheduled back on an offline CPU. So, it
> seems that it is enough to check can_use_console() only once at
> the beginning.
> 
> Or did I miss anything?

It does sound interesting, thanks. I was trying to keep the existing
behaviour.

It almost works, there is one case.

console_unlock()    /* w/o can_use_console() in logbuf_lock section */
....
again:
	for (;;) {
		raw_spin_lock_irqsave logbuf_lock
		msg_print_text
		raw_spin_unlock logbuf_lock
		call_console_drivers
		local_irq_restore
	}

	up_console_sem

	raw_spin_lock logbuf_lock
	retry = console_seq != log_next_seq
	raw_spin_unlock_irqrestore logbuf_lock
	
	if retry && console_trylock()
		goto again;


when we up_console_sem(), consoles may appear and disappear, since we
don't keep the semaphore. Suppose that we have OFFLINE cpu and we had a
CON_ANYTIME console, but in between up_console_sem--console_trylock
that single CON_ANYTIME console was removed. So now we have !cpu_online
and !CON_ANYTIME.

So this is why I re-do the can_use_console() check. I can add a bool flag
and do the can_use_console() only once -- when the flag is false, set it
to true on successful can_use_console() and avoid can_use_console() during
this loop; and set the flag to false right after the 'again:' label, which
will imply that console semaphore has been re-acquired and we need to
can_use_console() again.

something like this, perhaps (to be folded.. or should it be a separate
commit -- optimization). will give a test tomorrow.

I reused `bool retry' flag (which is probably a bit hacky, it'll be
better to have a separate byte for this thing). And I moved
can_use_console() from console_cont_flush() to the top -- so if we
are executing the `for (;;)' loop, then can_use_console() was
successful in console_cont_flush() and we have to re-do
can_use_console() only if we released console_sem and jumped to
`again' label.

---
 kernel/printk/printk.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 28b2dec..c7a5ce8 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2170,6 +2170,11 @@ static bool console_cont_flush(char *text, size_t size)
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
 
+	if (!can_use_console()) {
+		ret = false;
+		goto out;
+	}
+
 	if (!cont.len)
 		goto out;
 
@@ -2181,11 +2186,6 @@ static bool console_cont_flush(char *text, size_t size)
 	if (console_seq < log_next_seq && !cont.cons)
 		goto out;
 
-	if (!can_use_console()) {
-		ret = false;
-		goto out;
-	}
-
 	len = cont_print_text(text, size);
 	raw_spin_unlock(&logbuf_lock);
 	stop_critical_timings();
@@ -2219,7 +2219,7 @@ void console_unlock(void)
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
-	bool do_cond_resched, retry, aborted = false;
+	bool do_cond_resched, retry = false, aborted = false;
 
 	if (console_suspended) {
 		up_console_sem();
@@ -2255,9 +2255,20 @@ again:
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
 
-		if (!can_use_console()) {
-			aborted = true;
-			break;
+		/*
+		 * @retry == true implies that we have released console_sem
+		 * and, thus, someone could have added/removed console(-s).
+		 * we need to can_use_console() again. @retry intially set
+		 * to false, because console_cont_flush() does the
+		 * can_use_console() check and we can't execute this loop
+		 * in can_use_console() returned `false` there.
+		 */
+		if (retry) {
+			retry = false;
+			if (!can_use_console()) {
+				aborted = true;
+				break;
+			}
 		}
 
 		if (seen_seq != log_next_seq) {
-- 
2.7.0

next prev parent reply	other threads:[~2016-01-19 15:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-14  4:57 [RFC][PATCH -next 0/2] cond_resched() some of console_trylock callers Sergey Senozhatsky
2016-01-14  4:57 ` [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk Sergey Senozhatsky
2016-01-14  5:59   ` [PATCH " Sergey Senozhatsky
2016-01-18 15:42   ` [RFC][PATCH -next " Petr Mladek
2016-01-19  0:42     ` Sergey Senozhatsky
2016-01-19 13:31       ` Petr Mladek
2016-01-19 15:00         ` Sergey Senozhatsky [this message]
2016-01-19 16:16           ` Petr Mladek
2016-01-20  4:18             ` Sergey Senozhatsky
2016-01-20 10:09               ` Petr Mladek
2016-01-14  4:57 ` [RFC][PATCH -next 2/2] printk: set may_schedule for some of console_trylock callers Sergey Senozhatsky
2016-01-17 14:11   ` Sergey Senozhatsky
2016-01-17 14:23     ` Sergey Senozhatsky
2016-01-18 16:17       ` Petr Mladek
2016-01-19  1:15         ` Sergey Senozhatsky
2016-01-19 15:18           ` Petr Mladek
2016-01-20  3:50             ` Sergey Senozhatsky
2016-01-20 11:51               ` Sergey Senozhatsky
2016-01-20 12:38                 ` Petr Mladek
2016-01-20 12:31               ` Petr Mladek
2016-01-21  1:25                 ` Sergey Senozhatsky
2016-01-21  5:51                   ` Sergey Senozhatsky
2016-01-22  9:48                     ` Petr Mladek
2016-01-23  4:46                       ` Sergey Senozhatsky

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:28b2dec dfblob:c7a5ce8 )
 OR (
bs:"Re: [RFC][PATCH -next 1/2] printk: move can_use_console out of console_trylock_for_printk" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160119150040.GA558@swordfish \
    --to=sergey.senozhatsky@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=calvinowens@fb.com \
    --cc=davej@codemonkey.org.uk \
    --cc=jack@suse.com \
    --cc=kyle@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.