From: Byungchul Park <byungchul@sk.com>
To: Yunseong Kim <yskelg@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	kernel_team@skhynix.com, torvalds@linux-foundation.org,
	damien.lemoal@opensource.wdc.com, linux-ide@vger.kernel.org,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	mingo@redhat.com, peterz@infradead.org, will@kernel.org,
	tglx@linutronix.de, rostedt@goodmis.org, joel@joelfernandes.org,
	sashal@kernel.org, daniel.vetter@ffwll.ch, duyuyang@gmail.com,
	johannes.berg@intel.com, tj@kernel.org, tytso@mit.edu,
	willy@infradead.org, david@fromorbit.com, amir73il@gmail.com,
	gregkh@linuxfoundation.org, kernel-team@lge.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
	minchan@kernel.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com,
	sj@kernel.org, jglisse@redhat.com, dennis@kernel.org,
	cl@linux.com, penberg@kernel.org, rientjes@google.com,
	vbabka@suse.cz, ngupta@vflare.org, linux-block@vger.kernel.org,
	josef@toxicpanda.com, linux-fsdevel@vger.kernel.org,
	jack@suse.cz, jlayton@kernel.org, dan.j.williams@intel.com,
	hch@infradead.org, djwong@kernel.org,
	dri-devel@lists.freedesktop.org, rodrigosiqueiramelo@gmail.com,
	melissa.srw@gmail.com, hamohammed.sa@gmail.com,
	42.hyeyoo@gmail.com, chris.p.wilson@intel.com,
	gwan-gyeong.mun@intel.com, max.byungchul.park@gmail.com,
	boqun.feng@gmail.com, longman@redhat.com, hdanton@sina.com,
	her0gyugyu@gmail.com, Yeoreum Yun <yeoreum.yun@arm.com>
Subject: Re: [PATCH v14 2/28] dept: Implement Dept(Dependency Tracker)
Date: Mon, 25 Nov 2024 10:05:43 +0900
Message-ID: <20241125010543.GA9137@system.software.com>
In-Reply-To: <489d941f-c4e8-4d1f-92ee-02074c713dd1@gmail.com>

On Sun, Nov 24, 2024 at 10:34:02PM +0900, Yunseong Kim wrote:
> Hi Byungchul,
> 
> Thank you for the great feature. Currently, DEPT has a bug in the
> 'dept_key_destroy()' function that must be fixed to ensure proper
> operation in the upstream Linux kernel.
> 
> On 5/8/24 6:46 PM, Byungchul Park wrote:
> > CURRENT STATUS
> > --------------
> > Lockdep tracks the acquisition order of locks in order to detect
> > deadlock, and it also tracks IRQ and IRQ enable/disable state to take
> > accidental acquisitions into account.
> > 
> > Lockdep should be turned off once it detects and reports a deadlock
> > since the data structure and algorithm are not reusable after detection
> > because of the complex design.
> > 
> > PROBLEM
> > -------
> > *Waits* whose *events* are never reached eventually cause deadlock.
> > However, Lockdep is only interested in lock acquisition order, which
> > forces users to emulate lock acquisition even for plain waits and
> > events that have nothing to do with real locks.
> > 
> > Even worse, no one likes Lockdep's false positive detection because it
> > prevents further reports that might be more valuable. That's why all
> > kernel developers are sensitive to Lockdep's false positives.
> > 
> > Besides, because it tracks acquisition order, it cannot correctly deal
> > with read locks and cross-events, e.g. wait_for_completion()/complete(),
> > for deadlock detection. Lockdep is no longer a good tool for that purpose.
> > 
> > SOLUTION
> > --------
> > Again, *waits* whose *events* are never reached eventually cause
> > deadlock. The new solution, Dept (DEPendency Tracker), focuses on waits
> > and events themselves. Dept tracks waits and events and reports it if
> > any event would never be reachable.
> > 
> > Dept:
> >    . Works with read locks in the right way.
> >    . Works with any wait and event, i.e. cross-event.
> >    . Continues to work even after reporting multiple times.
> >    . Provides simple and intuitive APIs.
> >    . Does exactly what a dependency checker should do.
> > 
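
A minimal userspace sketch of the idea above, not code from this patch:
each class is a small integer, an edge a -> b means "event a cannot be
triggered until event b has happened", and a report is raised whenever
a new edge closes a cycle. The names add_dep(), reachable(), nr_reports
and the limit MAX_CLASSES are hypothetical, and the real Dept data
structures are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 16	/* hypothetical limit for this sketch */

/* dep[a][b] is true when event a cannot happen until event b has happened. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Number of cycles reported so far; reporting never disables tracking. */
static int nr_reports;

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *visited)
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (dep[from][next] && !visited[next] &&
		    reachable(next, to, visited))
			return true;
	}
	return false;
}

/*
 * Record "event a waits for event b". Returns true when the new edge
 * closes a cycle, i.e. none of the events on the cycle can ever be
 * reached. The edge is recorded either way, so later calls keep working.
 */
static bool add_dep(int a, int b)
{
	bool visited[MAX_CLASSES];
	bool cycle;

	assert(a >= 0 && a < MAX_CLASSES && b >= 0 && b < MAX_CLASSES);

	memset(visited, 0, sizeof(visited));
	cycle = reachable(b, a, visited);
	dep[a][b] = true;
	if (cycle)
		nr_reports++;
	return cycle;
}
```

For example, let class 0 be the release of a mutex and class 1 be a
complete() call. A thread that holds the mutex across
wait_for_completion() gives add_dep(0, 1), and a thread that takes the
same mutex before calling complete() gives add_dep(1, 0), which returns
true. An unrelated cycle added afterwards is still reported, which is
the "continues to work after reporting" property in this toy model.
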
> > Q & A
> > -----
> > Q. Is this the first try ever to address the problem?
> > A. No. The cross-release feature (b09be676e0ff2 locking/lockdep:
> >    Implement the 'crossrelease' feature) addressed it 2 years ago. It
> >    was a Lockdep extension that was merged but reverted shortly after
> >    because:
> > 
> >    Cross-release started to report valuable hidden problems but also
> >    started to give false positive reports. For sure, no one likes
> >    Lockdep's false positive reports since they make Lockdep stop,
> >    preventing it from reporting further real problems.
> > 
> > Q. Why was Dept not developed as an extension of Lockdep?
> > A. Lockdep definitely includes all the efforts great developers have
> >    made over a long time, so it is quite stable. But I had to design
> >    and implement Dept from scratch because of the following:
> > 
> >    1) Lockdep was designed to track lock acquisition order. Its APIs
> >       and implementation do not fit the wait-event model.
> >    2) Lockdep is turned off on detection, including false positives,
> >       which is terrible and prevents developing any extension for
> >       stronger detection.
> > 
> > Q. Do you intend to totally replace Lockdep?
> > A. No. Lockdep also checks if lock usage is correct. Of course, the
> >    dependency check routine should be replaced, but the other functions
> >    should still be there.
> > 
> > Q. Do you mean the dependency check routine should be replaced right
> >    away?
> > A. No. I admit Lockdep is stable enough thanks to the great efforts
> >    kernel developers have made. Both Lockdep and Dept should be in the
> >    kernel until Dept is considered stable.
> > 
> > Q. Stronger detection capability would give more false positive reports,
> >    which was a big problem when cross-release was introduced. Is that
> >    ok with Dept?
> > A. It's ok. Dept allows multiple reports thanks to its simple and quite
> >    generalized design. Of course, false positive reports should be fixed
> >    anyway, but they are no longer as critical a problem as they were.
> > 
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> 
> If a module whose dependencies DEPT has previously checked is loaded and
> then unloaded, a kernel panic occurs when the kernel

Hi,

Thank you for sharing the issue.  Yes, I'm aware of the problem you
describe and will fix it with high priority.

Thanks again.

	Byungchul

> reuses the corresponding memory area for other purposes. This issue must
> be addressed as a priority to enable the use of DEPT. Testing this patch
> on the Ubuntu kernel confirms the problem.
> 
> > +void dept_key_destroy(struct dept_key *k)
> > +{
> > +	struct dept_task *dt = dept_task();
> > +	unsigned long flags;
> > +	int sub_id;
> > +
> > +	if (unlikely(!dept_working()))
> > +		return;
> > +
> > +	if (dt->recursive == 1 && dt->task_exit) {
> > +		/*
> > +		 * Need to allow to go ahead in this case where
> > +		 * ->recursive has been set to 1 by dept_off() in
> > +		 * dept_task_exit() and ->task_exit has been set to
> > +		 * true in dept_task_exit().
> > +		 */
> > +	} else if (dt->recursive) {
> > +		DEPT_STOP("Key destroying fails.\n");
> > +		return;
> > +	}
> > +
> > +	flags = dept_enter();
> > +
> > +	/*
> > +	 * dept_key_destroy() should not fail.
> > +	 *
> > +	 * FIXME: Should be fixed if dept_key_destroy() causes deadlock
> > +	 * with dept_lock().
> > +	 */
> > +	while (unlikely(!dept_lock()))
> > +		cpu_relax();
> > +
> > +	for (sub_id = 0; sub_id < DEPT_MAX_SUBCLASSES; sub_id++) {
> > +		struct dept_class *c;
> > +
> > +		c = lookup_class((unsigned long)k->base + sub_id);
> > +		if (!c)
> > +			continue;
> > +
> > +		hash_del_class(c);
> > +		disconnect_class(c);
> > +		list_del(&c->all_node);
> > +		invalidate_class(c);
> > +
> > +		/*
> > +		 * Actual deletion will happen on the rcu callback
> > +		 * that has been added in disconnect_class().
> > +		 */
> > +		del_class(c);
> > +	}
> > +
> > +	dept_unlock();
> > +	dept_exit(flags);
> > +
> > +	/*
> > +	 * Wait until even lockless hash_lookup_class() for the class
> > +	 * returns NULL.
> > +	 */
> > +	might_sleep();
> > +	synchronize_rcu();
> > +}
> > +EXPORT_SYMBOL_GPL(dept_key_destroy);
> 
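
The loop in the quoted function can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions, not the patch's
implementation: struct sketch_key, struct sketch_class, register_class(),
key_destroy(), the table size and the value of MAX_SUBCLASSES are all
hypothetical, and dept_lock(), the hash table and the RCU-deferred free
are not modelled. It illustrates one plausible reading of the report:
classes are identified by addresses inside the key, so when the key lives
in module memory its classes have to be dropped before that memory is
reused, otherwise lookups by the old addresses still find stale entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SUBCLASSES 8	/* stand-in for DEPT_MAX_SUBCLASSES; value is an assumption */
#define TABLE_SIZE 32		/* hypothetical registry capacity */

/* Hypothetical key: the address of each byte of base[] names one subclass. */
struct sketch_key {
	char base[MAX_SUBCLASSES];
};

/* One registered class; id == 0 marks a free slot. */
struct sketch_class {
	uintptr_t id;
};

static struct sketch_class table[TABLE_SIZE];

/* Linear search standing in for the hashed class lookup. */
static struct sketch_class *lookup_class(uintptr_t id)
{
	if (!id)
		return NULL;
	for (size_t i = 0; i < TABLE_SIZE; i++)
		if (table[i].id == id)
			return &table[i];
	return NULL;
}

/* Register subclass sub_id of key k; returns NULL when the table is full. */
static struct sketch_class *register_class(struct sketch_key *k, int sub_id)
{
	uintptr_t id;
	struct sketch_class *c;

	assert(sub_id >= 0 && sub_id < MAX_SUBCLASSES);
	id = (uintptr_t)k->base + sub_id;

	c = lookup_class(id);
	if (c)
		return c;
	for (size_t i = 0; i < TABLE_SIZE; i++) {
		if (!table[i].id) {
			table[i].id = id;
			return &table[i];
		}
	}
	return NULL;
}

/* Drop every class derived from k; returns how many were removed. */
static int key_destroy(struct sketch_key *k)
{
	int removed = 0;

	for (int sub_id = 0; sub_id < MAX_SUBCLASSES; sub_id++) {
		struct sketch_class *c = lookup_class((uintptr_t)k->base + sub_id);

		if (!c)
			continue;
		c->id = 0;	/* unregister only; the quoted code also defers the free */
		removed++;
	}
	return removed;
}
```

In this model, calling key_destroy() on a key before its storage goes
away leaves no entry whose id points into that storage, while classes
registered through other keys stay untouched.
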
> Best regards,
> Yunseong Kim
