From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Andrew Morton <akpm@osdl.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Greg Kroah-Hartman <gregkh@suse.de>,
	Christoph Hellwig <hch@infradead.org>,
	linux-kernel@vger.kernel.org, ltt-dev@shafik.org,
	"Martin J. Bligh" <mbligh@mbligh.org>,
	Douglas Niehaus <niehaus@eecs.ku.edu>,
	systemtap@sources.redhat.com,
	Thomas Gleixner <tglx@linutronix.de>,
	Richard J Moore <richardj_moore@uk.ibm.com>
Subject: Re: [PATCH 1/2] lockdep missing barrier()
Date: Wed, 24 Jan 2007 11:51:50 -0500	[thread overview]
Message-ID: <20070124165150.GC4979@Krystal> (raw)
In-Reply-To: <20070123202637.970e467b.akpm@osdl.org>

* Andrew Morton (akpm@osdl.org) wrote:
> On Tue, 16 Jan 2007 12:56:24 -0500
> Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > This patch adds a barrier() to lockdep.c lockdep_recursion updates. This
> > variable behaves like the preemption count and should therefore use similar
> > memory barriers.
> > 
> > This patch applies on 2.6.20-rc4-git3.
> > 
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> > 
> > --- a/kernel/lockdep.c
> > +++ b/kernel/lockdep.c
> > @@ -166,12 +166,14 @@ static struct list_head chainhash_table[CHAINHASH_SIZE];
> >  void lockdep_off(void)
> >  {
> >  	current->lockdep_recursion++;
> > +	barrier();
> >  }
> >  
> >  EXPORT_SYMBOL(lockdep_off);
> >  
> >  void lockdep_on(void)
> >  {
> > +	barrier();
> >  	current->lockdep_recursion--;
> >  }
> 
> I am allergic to undocumented barriers.  It is often unobvious what the
> barrier is supposed to protect against, yielding mystifying code.  This is
> one such case.
> 
> Please add code comments.

It looks like my fix was not the right one. After looking at the code in more
depth, though, a different fix seems to be required. Summary: the order of the
lockdep_off()/lockdep_on() and local_irq_save()/local_irq_restore() calls in
vprintk() should be changed.


lockdep_off()/lockdep_on() are used in two places: printk and
nmi_enter()/nmi_exit().

* In kernel/printk.c :

vprintk() does :

preempt_disable()
local_irq_save()
lockdep_off()
spin_lock(&logbuf_lock)
spin_unlock(&logbuf_lock)
if(!down_trylock(&console_sem))
   up(&console_sem)
lockdep_on()
local_irq_restore()
preempt_enable()

The goal here is to make sure we do not call printk() recursively from
kernel/lockdep.c:__lock_acquire() (called from spin_* and down/up) nor from
kernel/lockdep.c:trace_hardirqs_on/off() (called from local_irq_restore/save).
This lockdep code can then potentially call printk() through
mark_held_locks/mark_lock.

The current order correctly protects against the spin_lock call and the up/down
call, but it does not protect against local_irq_restore, which runs after
lockdep_on().
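
To make the unprotected window concrete, here is a minimal user-space sketch
of the original call order. It is not kernel code: every model_* name is
hypothetical, the counter stands in for current->lockdep_recursion, and the
two hook functions are simplified stand-ins for __lock_acquire() and
trace_hardirqs_on/off(). The model treats the irq hooks on save and restore
alike, simply because both fall outside the lockdep_off()/lockdep_on() window.

```c
#include <assert.h>

/* User-space model; every model_* name is hypothetical, not kernel code. */
static int model_lockdep_recursion;	/* stands in for current->lockdep_recursion */
static int model_lock_hooks_unguarded;	/* lock hooks run with lockdep on */
static int model_irq_hooks_unguarded;	/* irq hooks run with lockdep on */

static void model_lockdep_off(void)
{
	model_lockdep_recursion++;
}

static void model_lockdep_on(void)
{
	assert(model_lockdep_recursion > 0);
	model_lockdep_recursion--;
}

/* Stand-in for __lock_acquire(): counts calls made while lockdep is on. */
static void model_lock_hook(void)
{
	if (!model_lockdep_recursion)
		model_lock_hooks_unguarded++;
}

/* Stand-in for trace_hardirqs_on/off(): same check, separate counter. */
static void model_irq_hook(void)
{
	if (!model_lockdep_recursion)
		model_irq_hooks_unguarded++;
}

/* Replays the original vprintk() order; returns the unguarded irq hooks. */
int model_vprintk_original(void)
{
	model_lock_hooks_unguarded = 0;
	model_irq_hooks_unguarded = 0;

	model_irq_hook();		/* local_irq_save() */
	model_lockdep_off();
	model_lock_hook();		/* spin_lock(&logbuf_lock) */
	model_lock_hook();		/* spin_unlock(&logbuf_lock) */
	model_lock_hook();		/* down_trylock(&console_sem) */
	model_lock_hook();		/* up(&console_sem) */
	model_lockdep_on();
	model_irq_hook();		/* local_irq_restore() */

	return model_irq_hooks_unguarded;
}

/* Unguarded lock hooks seen by the last replay. */
int model_lock_hooks_seen(void)
{
	return model_lock_hooks_unguarded;
}
```

In this model, model_vprintk_original() reports two irq hooks running with
lockdep enabled, while model_lock_hooks_seen() reports none: the lock calls
are covered, the irq flag calls are not.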

If we change the order so that it becomes correct:

preempt_disable()
lockdep_off()
local_irq_save()
spin_lock(&logbuf_lock)
spin_unlock(&logbuf_lock)
if(!down_trylock(&console_sem))
   up(&console_sem)
local_irq_restore()
lockdep_on()
preempt_enable()

Everything should be fine without a barrier(), because local_irq_save/restore
will hopefully make sure the compiler does not reorder the memory writes across
cli()/sti(), and the lockdep_recursion variable belongs to the current task.
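
The effect of the reordering can be checked with a small user-space replay of
the call sequence. This is only a sketch under stated assumptions: the
model_* names and op codes are hypothetical, a local counter stands in for
current->lockdep_recursion, and preempt_disable()/preempt_enable() are left
out because they sit outside both orderings. It says nothing about compiler
reordering, only about which calls fall inside the lockdep_off() window.

```c
#include <stddef.h>

/* Hypothetical op codes for a user-space model; not kernel definitions. */
enum model_op {
	MODEL_LOCKDEP_OFF,
	MODEL_LOCKDEP_ON,
	MODEL_IRQ_HOOK,		/* local_irq_save() or local_irq_restore() */
	MODEL_LOCK_HOOK		/* spin_lock/unlock, down_trylock, up */
};

/* Count hooks that run while the modeled lockdep_recursion is zero. */
int model_count_unguarded(const enum model_op *ops, size_t n)
{
	int recursion = 0;
	int unguarded = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (ops[i]) {
		case MODEL_LOCKDEP_OFF:
			recursion++;
			break;
		case MODEL_LOCKDEP_ON:
			recursion--;
			break;
		case MODEL_IRQ_HOOK:
		case MODEL_LOCK_HOOK:
			if (recursion == 0)
				unguarded++;
			break;
		}
	}
	return unguarded;
}

/* The reordered vprintk() sequence: lockdep_off/on enclose everything. */
const enum model_op model_reordered[] = {
	MODEL_LOCKDEP_OFF,
	MODEL_IRQ_HOOK,		/* local_irq_save() */
	MODEL_LOCK_HOOK,	/* spin_lock(&logbuf_lock) */
	MODEL_LOCK_HOOK,	/* spin_unlock(&logbuf_lock) */
	MODEL_LOCK_HOOK,	/* down_trylock(&console_sem) */
	MODEL_LOCK_HOOK,	/* up(&console_sem) */
	MODEL_IRQ_HOOK,		/* local_irq_restore() */
	MODEL_LOCKDEP_ON
};

const size_t model_reordered_len =
	sizeof(model_reordered) / sizeof(model_reordered[0]);
```

Calling model_count_unguarded(model_reordered, model_reordered_len) yields
zero in this model, since every hook runs with the counter raised.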



* In include/linux/hardirq.h:nmi_enter()/nmi_exit()

Used, for instance, in arch/i386/kernel/traps.c:do_nmi(), which calls
nmi_enter(). Note that with the "right" config options (preemption disabled)
there is possibly no barrier between lockdep_off() and the end of the
nmi_enter() code:
#define nmi_enter()             do { lockdep_off(); irq_enter(); } while (0)
#define irq_enter()                                     \
        do {                                            \
                account_system_vtime(current);          \
                add_preempt_count(HARDIRQ_OFFSET);      \
                trace_hardirq_enter();                  \
        } while (0)
# define add_preempt_count(val) do { preempt_count() += (val); } while (0)
# define trace_hardirq_enter()  do { current->hardirq_context++; } while (0)

Then calls, for instance, arch/i386/kernel/nmi.c:nmi_watchdog_tick(),
which takes a spinlock and may also call printk.

Because we are within a context where irqs are disabled and we use the
per-task lockdep_recursion only within the current task, there is no need to
make it appear ordered to other CPUs. Also, the compiler should not reorder
lockdep_off() and the call to kernel/lockdep.c:__lock_acquire(), because they
both access the same variable: current->lockdep_recursion. So the NMI case
seems fine without a memory barrier.
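
As a rough illustration of the same-variable argument, the NMI path can be
modeled in user space. Everything here is an assumption made for the sketch:
struct model_task, the model_* functions and MODEL_HARDIRQ_OFFSET are
hypothetical stand-ins, account_system_vtime() is left out, and the exit path
is simply assumed to mirror the enter path in reverse. The point is only that
the increment in the lockdep_off() step and the check in the lock-acquire
step touch the same field, so program order between them has to be preserved
within a single thread.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the task_struct fields involved. */
struct model_task {
	int lockdep_recursion;
	int hardirq_context;
	unsigned long preempt_count;
};

#define MODEL_HARDIRQ_OFFSET (1UL << 16)	/* assumed value, illustration only */

void model_nmi_enter(struct model_task *t)
{
	t->lockdep_recursion++;				/* lockdep_off() */
	t->preempt_count += MODEL_HARDIRQ_OFFSET;	/* add_preempt_count() */
	t->hardirq_context++;				/* trace_hardirq_enter() */
}

void model_nmi_exit(struct model_task *t)
{
	/* Assumed to mirror model_nmi_enter() in reverse order. */
	t->hardirq_context--;
	t->preempt_count -= MODEL_HARDIRQ_OFFSET;
	assert(t->lockdep_recursion > 0);
	t->lockdep_recursion--;				/* lockdep_on() */
}

/* Returns 1 if the lock would be tracked, 0 if lockdep is switched off. */
int model_lock_acquire(const struct model_task *t)
{
	return t->lockdep_recursion == 0;
}
```

Between model_nmi_enter() and model_nmi_exit(), model_lock_acquire() reports
that tracking is off; after the exit it reports that tracking is on again.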

Mathieu

-- 
OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68 

