All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Dave Anderson <anderson@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Kay Sievers <kay@vrfy.org>, Jiri Kosina <jkosina@suse.cz>,
	Michal Hocko <mhocko@suse.cz>, Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org,
	Wang Long <long.wanglong@huawei.com>,
	peifeiyue@huawei.com, dzickus@redhat.com, morgan.wang@huawei.com,
	sasha.levin@oracle.com
Subject: Re: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI
Date: Fri, 29 May 2015 12:56:07 +0200	[thread overview]
Message-ID: <20150529105607.GG3135@pathway.suse.cz> (raw)
In-Reply-To: <20150528130944.9dde0f591a18d656f2a7c519@linux-foundation.org>

On Thu 2015-05-28 13:09:44, Andrew Morton wrote:
> On Thu, 28 May 2015 15:50:54 +0200 Petr Mladek <pmladek@suse.cz> wrote:
> 
> > > > +{
> > > > +	u64 start_time, current_time;
> > > > +	int this_cpu = smp_processor_id();
> > > > +
> > > > +	/* no way if we are already locked on this CPU */
> > > > +	if (logbuf_cpu == this_cpu)
> > > > +		return 0;
> > > > +
> > > > +	/* try hard to get the lock but do not wait forever */
> > > > +	start_time = cpu_clock(this_cpu);
> > > > +	current_time = start_time;
> > > > +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> > > > +		if (raw_spin_trylock(&logbuf_lock))
> > > > +			return 1;
> > > > +		cpu_relax();
> > > > +		current_time = cpu_clock(this_cpu);
> > > > +	}
> > > 
> > > (Looks at the read_seqcount_retry() in
> > > kernel/time/sched_clock.c:sched_clock())
> > >
> > > Running cpu_clock() in NMI context seems a generally bad idea.
> > 
> > I am sorry but this is too cryptic for me :-)
> > read_seqcount_retry() looks safe to me under NMI.
> 
> hmpf.  If you guys say so...
> 
> Note that it's not just a matter of "safe to call from NMI context". 
> The above loop also assume that cpu_clock() is *being updated* within
> the context of single NMI.  Is that true/safe now and in the future?
> Probably.  I didn't check all architectures but ARM looks OK at present.
> 
> We should at least update Documentation/timers/timekeeping.txt: "a sane
> value" becomes "the correct value", no alternatives.
> 
> > > There are many sites in kernel/printk/printk.c which take logbuf_lock,
> > > but this patch only sets logbuf_cpu in one of those cases:
> > > vprintk_emit().  I suggest adding helper functions to take/release
> > > logbuf_lock.  And rename logbuf_lock to something else to ensure that
> > > nobody accidentally takes the lock directly.
> > 
> > IMHO, vprintk_emit() is special. It is the only location where the
> > lock is taken in NMI context. The other functions are used to dump
> > @logbuf and are called in normal context.
> > 
> > try_logbuf_lock_in_nmi() could fail and we need to handle the error
> > path. We do not need to do this in the other locations.
> > 
> > Note that we do not want to get the console in NMI because
> > there are even more locks that might cause a deadlock.
> 
> Consider the case where a CPU has taken logbuf_lock within
> devkmsg_read() and then receives an NMI, from which it calls
> try_logbuf_lock_in_nmi():

I am not sure that I understand. My point is that we do not call
devkmsg_read() from NMI context, so we do not need to use
try_logbuf_lock_in_nmi() there. IMHO, the same is true for
all other locations except for vprintk_emit().


> > +/* We must be careful in NMI when we managed to preempt a running printk */
> > +static int try_logbuf_lock_in_nmi(void)
> > +{
> > +	u64 start_time, current_time;
> > +	int this_cpu = smp_processor_id();
> > +
> > +	/* no way if we are already locked on this CPU */
> > +	if (logbuf_cpu == this_cpu)
> > +		return 0;

Or do you have this check in mind? It will detect the deadlock
immediately but @logbuf_cpu is set only in vprintk_emit(). We
will spin when NMI comes inside the other functions,
e.g. devkmsg_read().


> > +	/* try hard to get the lock but do not wait forever */
> > +	start_time = cpu_clock(this_cpu);
> > +	current_time = start_time;
> > +	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> > +		if (raw_spin_trylock(&logbuf_lock))
> > +			return 1;
> > +		cpu_relax();
> > +		current_time = cpu_clock(this_cpu);
> > +	}
> > +
> > +	return 0;
> > +}
> 
> That CPU is now going to spin around for 100us and then time out.

Yes, there was a deadlock without the patch. So, limited spinning is
still a win.

Or would you like to detect the deadlock immediately in all cases?
I mean to add the proposed wrapper around take/release lock calls
and set/test some cpu-specific variable there?

It sounds interesting. Well, the detection will not be 100% correct
because there is a small race window between taking @logbuf_lock
and setting @lockbuf_cpu. I wonder if it is worth doing. But I will
do it if you want.

Best Regards,
Petr

  reply	other threads:[~2015-05-29 10:56 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-25 12:46 [PATCH 00/10] printk: Avoid deadlock in NMI + vprintk_emit() cleanup Petr Mladek
2015-05-25 12:46 ` [PATCH 01/10] printk: Avoid deadlock in NMI context Petr Mladek
2015-05-27 23:13   ` Andrew Morton
2015-05-28 12:00     ` Petr Mladek
2015-05-25 12:46 ` [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI Petr Mladek
2015-05-27 23:14   ` Andrew Morton
2015-05-28  7:54     ` Jiri Kosina
2015-05-28 13:50     ` Petr Mladek
2015-05-28 20:09       ` Andrew Morton
2015-05-29 10:56         ` Petr Mladek [this message]
2015-05-29 20:46           ` Andrew Morton
2015-05-25 12:46 ` [PATCH 03/10] printk: Move the deferred printk stuff Petr Mladek
2015-05-25 12:46 ` [PATCH 04/10] printk: Merge and flush NMI buffer predictably via IRQ work Petr Mladek
2015-05-27 23:14   ` Andrew Morton
2015-05-28 13:12     ` Petr Mladek
2015-05-25 12:46 ` [PATCH 05/10] printk: Try hard to print Oops message in NMI context Petr Mladek
2015-05-25 12:46 ` [PATCH 06/10] printk: Split delayed printing of warnings from vprintk_emit() Petr Mladek
2015-05-25 12:46 ` [PATCH 07/10] printk: Split text formatting and analyze " Petr Mladek
2015-05-25 12:46 ` [PATCH 08/10] printk: Detect scheduler messages in vprintk_format_and_analyze() Petr Mladek
2015-05-25 12:46 ` [PATCH 09/10] printk: Split text storing logic from vprintk_emit() Petr Mladek
2015-05-25 12:46 ` [PATCH 10/10] printk: Split console call " Petr Mladek
2015-05-29 20:50 ` [PATCH 00/10] printk: Avoid deadlock in NMI + vprintk_emit() cleanup Andrew Morton
2015-06-01 13:06   ` Jan Kara
2015-06-02  9:46     ` long.wanglong
2015-06-02  9:52       ` Jiri Kosina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150529105607.GG3135@pathway.suse.cz \
    --to=pmladek@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=anderson@redhat.com \
    --cc=dzickus@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=jkosina@suse.cz \
    --cc=kay@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=long.wanglong@huawei.com \
    --cc=mhocko@suse.cz \
    --cc=morgan.wang@huawei.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peifeiyue@huawei.com \
    --cc=rostedt@goodmis.org \
    --cc=sasha.levin@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.