linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v17 00/47] DEPT(DEPendency Tracker)
@ 2025-10-02  8:12 Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h Byungchul Park
                   ` (46 more replies)
  0 siblings, 47 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Found out a recent deadlock issue can be reported by dept.  The issue is:

   https://lore.kernel.org/all/20250513093448.592150-1-gavinguo@igalia.com/

I'm happy to see that dept reported real problems in practice.  See:

   https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/
   https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
   https://lore.kernel.org/all/b6e00e77-4a8c-4e05-ab79-266bf05fcc2d@igalia.com/

I added documents describing dept, that would help you understand what
dept is and how dept works.  You can use dept just with CONFIG_DEPT on
and checking dmesg at runtime.

There are still false positives and some of them are already in progress
to suppress and the efforts need to be kept for a while as lockdep
experienced.  Especially, since dept tracks PG_locked but folios have
never been split in class - which needs help from maybe fs guys tho.. -
we should put up with the AA report of PG_locked for a while, for
instance, any nested folio_lock()s will give the dept splat for now :(

It's worth noting that *EXPERIMENTAL* in Kconfig is tagged, which means
dept is not proper for an automation tool yet.

Thanks for the support and contribution, to:

   Harry Yoo <harry.yoo@oracle.com>
   Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
   Yunseong Kim <ysk@kzalloc.com>
   Yeoreum Yun <yeoreum.yun@arm.com>

---

Hi Linus and folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock acquisition order to try to cover
all synchonization machanisms.

Benefits:

	0. Works with all lock primitives.
	1. Works with wait_for_completion()/complete().
	2. Works with PG_locked.
	3. Works with swait/wakeup.
	4. Works with waitqueue.
	5. Works with wait_bit.
	6. Multiple reports are allowed.
	7. Deduplication control on multiple reports.
	8. Withstand false positives thanks to 7.
	9. Easy to annotate on waits/events.

Future works after getting merged:

	0. To separates dept from lockdep.
	1. To use dept as a dependency engine for lockdep.
	2. To add missing annotations on waits/events.

How to interpret reports:
(See the document in this patchset for more detail.)

	[S] the start of the event context
	[W] the wait disturbing the event from being triggered
	[E] the event that cannot be reachable

Thanks.

	Byungchul

---

Changes from v16:
	1. Rebase on v6.17.
	2. Fix a false positive from rcu (by Yunseong Kim)
	3. Introduce APIs to set page's usage, dept_set_page_usage() and
	   dept_reset_page_usage() to avoid false positives.
	4. Consider lock_page() as a potential wait unconditionally.
	5. Consider folio_lock_killable() as a potential wait
	   unconditionally.
	6. Add support for tracking PG_writeback waits and events.
	7. Fix two build errors due to the additional debug information
	   added by dept. (by Yunseong Kim)

Changes from v15:
	1. Fix typo and improve comments and commit messages (feedbacked
	   by ALOK TIWARI, Waiman Long, and kernel test robot).
	2. Do not stop dept on detection of cicular dependency of
	   recover event, allowing to keep reporting.
	3. Add SK hynix to copyright.
	4. Consider folio_lock() as a potential wait unconditionally.
	5. Fix Kconfig dependency bug (feedbacked by kernel test rebot).
	6. Do not suppress reports that involve classes even that have
	   already involved in other reports, allowing to keep
	   reporting.

Changes from v14:
	1. Rebase on the current latest, v6.15-rc6.
	2. Refactor dept code.
	3. With multi event sites for a single wait, even if an event
	   forms a circular dependency, the event can be recovered by
	   other event(or wake up) paths.  Even though informing the
	   circular dependency is worthy but it should be suppressed
	   once informing it, if it doesn't lead an actual deadlock.  So
	   introduce APIs to annotate the relationship between event
	   site and recover site, that are, event_site() and
	   dept_recover_event().
	4. wait_for_completion() worked with dept map embedded in struct
	   completion.  However, it generates a few false positves since
	   all the waits using the instance of struct completion, share
	   the map and key.  To avoid the false positves, make it not to
	   share the map and key but each wait_for_completion() caller
	   have its own key by default.  Of course, external maps also
	   can be used if needed.
	5. Fix a bug about hardirq on/off tracing.
	6. Implement basic unit test for dept.
	7. Add more supports for dma fence synchronization.
	8. Add emergency stop of dept e.g. on panic().
	9. Fix false positives by mmu_notifier_invalidate_*().
	10. Fix recursive call bug by DEPT_WARN_*() and DEPT_STOP().
	11. Fix trivial bugs in DEPT_WARN_*() and DEPT_STOP().
	12. Fix a bug that a spin lock, dept_pool_spin, is used in
	    both contexts of irq disabled and enabled without irq
	    disabled.
	13. Suppress reports with classes, any of that already have
	    been reported, even though they have different chains but
	    being barely meaningful.
	14. Print stacktrace of the wait that an event is now waking up,
	    not only stacktrace of the event.
	15. Make dept aware of lockdep_cmp_fn() that is used to avoid
	    false positives in lockdep so that dept can also avoid them.
	16. Do do_event() only if there are no ecxts have been
	    delimited.
	17. Fix a bug that was not synchronized for stage_m in struct
	    dept_task, using a spin lock, dept_task()->stage_lock.
	18. Fix a bug that dept didn't handle the case that multiple
	    ttwus for a single waiter can be called at the same time
	    e.i. a race issue.
	19. Distinguish each kernel context from others, not only by
	    system call but also by user oriented fault so that dept can
	    work with more accuracy information about kernel context.
	    That helps to avoid a few false positives.
	20. Limit dept's working to x86_64 and arm64.

Changes from v13:

	1. Rebase on the current latest version, v6.9-rc7.
	2. Add 'dept' documentation describing dept APIs.

Changes from v12:

	1. Refine the whole document for dept.
	2. Add 'Interpret dept report' section in the document, using a
	   deadlock report obtained in practice. Hope this version of
	   document helps guys understand dept better.

	   https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
	   https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/

Changes from v11:

	1. Add 'dept' documentation describing the concept of dept.
	2. Rewrite the commit messages of the following commits for
	   using weaker lockdep annotation, for better description.

	   fs/jbd2: Use a weaker annotation in journal handling
	   cpu/hotplug: Use a weaker annotation in AP thread

	   (feedbacked by Thomas Gleixner)

Changes from v10:

	1. Fix noinstr warning when building kernel source.
	2. dept has been reporting some false positives due to the folio
	   lock's unfairness. Reflect it and make dept work based on
	   dept annotaions instead of just wait and wake up primitives.
	3. Remove the support for PG_writeback while working on 2. I
	   will add the support later if needed.
	4. dept didn't print stacktrace for [S] if the participant of a
	   deadlock is not lock mechanism but general wait and event.
	   However, it made hard to interpret the report in that case.
	   So add support to print stacktrace of the requestor who asked
	   the event context to run - usually a waiter of the event does
	   it just before going to wait state.
	5. Give up tracking raw_local_irq_{disable,enable}() since it
	   totally messed up dept's irq tracking. So make it work in the
	   same way as lockdep does. I will consider it once any false
	   positives by those are observed again.
	6. Change the manual rwsem_acquire_read(->j_trans_commit_map)
	   annotation in fs/jbd2/transaction.c to the try version so
	   that it works as much as it exactly needs.
	7. Remove unnecessary 'inline' keyword in dept.c and add
	   '__maybe_unused' to a needed place.

Changes from v9:

	1. Fix a bug. SDT tracking didn't work well because of my big
	   mistake that I should've used waiter's map to indentify its
	   class but it had been working with waker's one. FYI,
	   PG_locked and PG_writeback weren't affected. They still
	   worked well. (reported by YoungJun)
	
Changes from v8:

	1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and
	   EXPORT_SYMBOL(PG_writeback_map) for kernel module build -
	   appologize for that. (reported by kernel test robot)
	2. Fix build error by removing header file's circular dependency
	   that was caused by "atomic.h", "kernel.h" and "irqflags.h",
	   which I introduced - appolgize for that. (reported by kernel
	   test robot)

Changes from v7:

	1. Fix a bug that cannot track rwlock dependency properly,
	   introduced in v7. (reported by Boqun and lockdep selftest)
	2. Track wait/event of PG_{locked,writeback} more aggressively
	   assuming that when a bit of PG_{locked,writeback} is cleared
	   there might be waits on the bit. (reported by Linus, Hillf
	   and syzbot)
	3. Fix and clean bad style code e.i. unnecessarily introduced
	   a randome pattern and so on. (pointed out by Linux)
	4. Clean code for applying dept to wait_for_completion().

Changes from v6:

	1. Tie to task scheduler code to track sleep and try_to_wake_up()
	   assuming sleeps cause waits, try_to_wake_up()s would be the
	   events that those are waiting for, of course with proper dept
	   annotations, sdt_might_sleep_weak(), sdt_might_sleep_strong()
	   and so on. For these cases, class is classified at sleep
	   entrance rather than the synchronization initialization code.
	   Which would extremely reduce false alarms.
	2. Remove the dept associated instance in each page struct for
	   tracking dependencies by PG_locked and PG_writeback thanks to
	   the 1. work above.
	3. Introduce CONFIG_dept_AGGRESIVE_TIMEOUT_WAIT to suppress
	   reports that waits with timeout set are involved, for those
	   who don't like verbose reporting.
	4. Add a mechanism to refill the internal memory pools on
	   running out so that dept could keep working as long as free
	   memory is available in the system.
	5. Re-enable tracking hashed-waitqueue wait. That's going to no
	   longer generate false positives because class is classified
	   at sleep entrance rather than the waitqueue initailization.
	6. Refactor to make it easier to port onto each new version of
	   the kernel.
	7. Apply dept to dma fence.
	8. Do trivial optimizaitions.

Changes from v5:

	1. Use just pr_warn_once() rather than WARN_ONCE() on the lack
	   of internal resources because WARN_*() printing stacktrace is
	   too much for informing the lack. (feedback from Ted, Hyeonggon)
	2. Fix trivial bugs like missing initializing a struct before
	   using it.
	3. Assign a different class per task when handling onstack
	   variables for waitqueue or the like. Which makes dept
	   distinguish between onstack variables of different tasks so
	   as to prevent false positives. (reported by Hyeonggon)
	4. Make dept aware of even raw_local_irq_*() to prevent false
	   positives. (reported by Hyeonggon)
	5. Don't consider dependencies between the events that might be
	   triggered within __schedule() and the waits that requires
	    __schedule(), real ones. (reported by Hyeonggon)
	6. Unstage the staged wait that has prepare_to_wait_event()'ed
	   *and* yet to get to __schedule(), if we encounter __schedule()
	   in-between for another sleep, which is possible if e.g. a
	   mutex_lock() exists in 'condition' of ___wait_event().
	7. Turn on CONFIG_PROVE_LOCKING when CONFIG_DEPT is on, to rely
	   on the hardirq and softirq entrance tracing to make dept more
	   portable for now.

Changes from v4:

	1. Fix some bugs that produce false alarms.
	2. Distinguish each syscall context from another *for arm64*.
	3. Make it not warn it but just print it in case dept ring
	   buffer gets exhausted. (feedback from Hyeonggon)
	4. Explicitely describe "EXPERIMENTAL" and "dept might produce
	   false positive reports" in Kconfig. (feedback from Ted)

Changes from v3:

	1. dept shouldn't create dependencies between different depths
	   of a class that were indicated by *_lock_nested(). dept
	   normally doesn't but it does once another lock class comes
	   in. So fixed it. (feedback from Hyeonggon)
	2. dept considered a wait as a real wait once getting to
	   __schedule() even if it has been set to TASK_RUNNING by wake
	   up sources in advance. Fixed it so that dept doesn't consider
	   the case as a real wait. (feedback from Jan Kara)
	3. Stop tracking dependencies with a map once the event
	   associated with the map has been handled. dept will start to
	   work with the map again, on the next sleep.

Changes from v2:

	1. Disable dept on bit_wait_table[] in sched/wait_bit.c
	   reporting a lot of false positives, which is my fault.
	   Wait/event for bit_wait_table[] should've been tagged in a
	   higher layer for better work, which is a future work.
	   (feedback from Jan Kara)
	2. Disable dept on crypto_larval's completion to prevent a false
	   positive.

Changes from v1:

	1. Fix coding style and typo. (feedback from Steven)
	2. Distinguish each work context from another in workqueue.
	3. Skip checking lock acquisition with nest_lock, which is about
	   correct lock usage that should be checked by lockdep.

Changes from RFC(v0):

	1. Prevent adding a wait tag at prepare_to_wait() but __schedule().
	   (feedback from Linus and Matthew)
	2. Use try version at lockdep_acquire_cpus_lock() annotation.
	3. Distinguish each syscall context from another.

Byungchul Park (47):
  llist: move llist_{head,node} definition to types.h
  dept: implement DEPT(DEPendency Tracker)
  dept: add single event dependency tracker APIs
  dept: add lock dependency tracker APIs
  dept: tie to lockdep and IRQ tracing
  dept: add proc knobs to show stats and dependency graph
  dept: distinguish each kernel context from another
  x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64
  arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  dept: distinguish each work from another
  dept: add a mechanism to refill the internal memory pools on running
    out
  dept: record the latest one out of consecutive waits of the same class
  dept: apply sdt_might_sleep_{start,end}() to
    wait_for_completion()/complete()
  dept: apply sdt_might_sleep_{start,end}() to swait
  dept: apply sdt_might_sleep_{start,end}() to waitqueue wait
  dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait
  dept: apply sdt_might_sleep_{start,end}() to dma fence
  dept: track timeout waits separately with a new Kconfig
  dept: apply timeout consideration to wait_for_completion()/complete()
  dept: apply timeout consideration to swait
  dept: apply timeout consideration to waitqueue wait
  dept: apply timeout consideration to hashed-waitqueue wait
  dept: apply timeout consideration to dma fence wait
  dept: make dept able to work with an external wgen
  dept: track PG_locked with dept
  dept: print staged wait's stacktrace on report
  locking/lockdep: prevent various lockdep assertions when
    lockdep_off()'ed
  dept: add documentation for dept
  cpu/hotplug: use a weaker annotation in AP thread
  fs/jbd2: use a weaker annotation in journal handling
  dept: assign dept map to mmu notifier invalidation synchronization
  dept: assign unique dept_key to each distinct dma fence caller
  dept: make dept aware of lockdep_set_lock_cmp_fn() annotation
  dept: make dept stop from working on debug_locks_off()
  i2c: rename wait_for_completion callback to wait_for_completion_cb
  dept: assign unique dept_key to each distinct wait_for_completion()
    caller
  completion, dept: introduce init_completion_dmap() API
  dept: introduce a new type of dependency tracking between multi event
    sites
  dept: add module support for struct dept_event_site and
    dept_event_site_dep
  dept: introduce event_site() to disable event tracking if it's
    recoverable
  dept: implement a basic unit test for dept
  dept: call dept_hardirqs_off() in local_irq_*() regardless of irq
    state
  rcu/update: fix same dept key collision between various types of RCU
  dept: introduce APIs to set page usage and use subclasses_evt for the
    usage
  dept: track PG_writeback with dept
  SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt
  mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large
    PAGE_SIZE

 Documentation/dependency/dept.txt     |  735 ++++++
 Documentation/dependency/dept_api.txt |  117 +
 arch/arm64/Kconfig                    |    1 +
 arch/arm64/kernel/syscall.c           |    7 +
 arch/arm64/mm/fault.c                 |    7 +
 arch/x86/Kconfig                      |    1 +
 arch/x86/entry/syscall_64.c           |    7 +
 arch/x86/mm/fault.c                   |    7 +
 drivers/dma-buf/dma-fence.c           |   23 +-
 drivers/i2c/algos/i2c-algo-pca.c      |    2 +-
 drivers/i2c/busses/i2c-pca-isa.c      |    2 +-
 drivers/i2c/busses/i2c-pca-platform.c |    2 +-
 fs/jbd2/transaction.c                 |    2 +-
 include/asm-generic/vmlinux.lds.h     |   13 +-
 include/linux/completion.h            |  124 +-
 include/linux/dept.h                  |  647 +++++
 include/linux/dept_ldt.h              |   78 +
 include/linux/dept_sdt.h              |   68 +
 include/linux/dept_unit_test.h        |   67 +
 include/linux/dma-fence.h             |   74 +-
 include/linux/hardirq.h               |    3 +
 include/linux/i2c-algo-pca.h          |    2 +-
 include/linux/irqflags.h              |   21 +-
 include/linux/llist.h                 |    8 -
 include/linux/local_lock_internal.h   |    1 +
 include/linux/lockdep.h               |  105 +-
 include/linux/lockdep_types.h         |    3 +
 include/linux/mm_types.h              |    4 +
 include/linux/mmu_notifier.h          |   26 +
 include/linux/module.h                |    5 +
 include/linux/mutex.h                 |    1 +
 include/linux/page-flags.h            |  204 +-
 include/linux/pagemap.h               |   37 +-
 include/linux/percpu-rwsem.h          |    2 +-
 include/linux/percpu.h                |    4 +
 include/linux/rcupdate_wait.h         |   13 +-
 include/linux/rtmutex.h               |    1 +
 include/linux/rwlock_types.h          |    1 +
 include/linux/rwsem.h                 |    1 +
 include/linux/sched.h                 |  118 +
 include/linux/seqlock.h               |    2 +-
 include/linux/spinlock_types_raw.h    |    3 +
 include/linux/srcu.h                  |    2 +-
 include/linux/sunrpc/xprt.h           |    9 +-
 include/linux/swait.h                 |    3 +
 include/linux/types.h                 |    8 +
 include/linux/wait.h                  |    3 +
 include/linux/wait_bit.h              |    3 +
 init/init_task.c                      |    2 +
 init/main.c                           |    2 +
 kernel/Makefile                       |    1 +
 kernel/cpu.c                          |    2 +-
 kernel/dependency/Makefile            |    5 +
 kernel/dependency/dept.c              | 3499 +++++++++++++++++++++++++
 kernel/dependency/dept_hash.h         |   10 +
 kernel/dependency/dept_internal.h     |   65 +
 kernel/dependency/dept_object.h       |   13 +
 kernel/dependency/dept_proc.c         |   94 +
 kernel/dependency/dept_unit_test.c    |  173 ++
 kernel/exit.c                         |    1 +
 kernel/fork.c                         |    2 +
 kernel/locking/lockdep.c              |   33 +
 kernel/module/main.c                  |   19 +
 kernel/rcu/rcu.h                      |    1 +
 kernel/rcu/update.c                   |    5 +-
 kernel/sched/completion.c             |   62 +-
 kernel/sched/core.c                   |    9 +
 kernel/workqueue.c                    |    3 +
 lib/Kconfig.debug                     |   51 +
 lib/debug_locks.c                     |    2 +
 lib/locking-selftest.c                |    2 +
 mm/filemap.c                          |   37 +
 mm/mm_init.c                          |    3 +
 mm/mmu_notifier.c                     |   31 +-
 74 files changed, 6570 insertions(+), 134 deletions(-)
 create mode 100644 Documentation/dependency/dept.txt
 create mode 100644 Documentation/dependency/dept_api.txt
 create mode 100644 include/linux/dept.h
 create mode 100644 include/linux/dept_ldt.h
 create mode 100644 include/linux/dept_sdt.h
 create mode 100644 include/linux/dept_unit_test.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_object.h
 create mode 100644 kernel/dependency/dept_proc.c
 create mode 100644 kernel/dependency/dept_unit_test.c


base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
-- 
2.17.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:24   ` Greg KH
  2025-10-02  8:12 ` [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker) Byungchul Park
                   ` (45 subsequent siblings)
  46 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

llist_head and llist_node can be used by some other header files.  For
example, dept for tracking dependencies uses llist in its header.  To
avoid header dependency, move them to types.h.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/llist.h | 8 --------
 include/linux/types.h | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index 607b2360c938..6bcdf378ebd7 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -53,14 +53,6 @@
 #include <linux/stddef.h>
 #include <linux/types.h>
 
-struct llist_head {
-	struct llist_node *first;
-};
-
-struct llist_node {
-	struct llist_node *next;
-};
-
 #define LLIST_HEAD_INIT(name)	{ NULL }
 #define LLIST_HEAD(name)	struct llist_head name = LLIST_HEAD_INIT(name)
 
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..58882a3730eb 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -208,6 +208,14 @@ struct hlist_node {
 	struct hlist_node *next, **pprev;
 };
 
+struct llist_head {
+	struct llist_node *first;
+};
+
+struct llist_node {
+	struct llist_node *next;
+};
+
 struct ustat {
 	__kernel_daddr_t	f_tfree;
 #ifdef CONFIG_ARCH_32BIT_USTAT_F_TINODE
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker)
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:25   ` Greg KH
  2025-10-02  8:12 ` [PATCH v17 03/47] dept: add single event dependency tracker APIs Byungchul Park
                   ` (44 subsequent siblings)
  46 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

CURRENT STATUS
--------------
lockdep tracks acquisition order of locks in order to detect deadlock,
and IRQ and IRQ enable/disable state as well to take accident
acquisitions into account.

lockdep should be turned off once it detects and reports a deadlock
since the data structure and algorithm are not reusable after detection
because of the complex design.

PROBLEM
-------
*Waits* and their *events* that never reach eventually cause deadlock.
However, lockdep is only interested in lock acquisition order, forcing
to emulate lock acqusition even for just waits and events that have
nothing to do with real lock.

Even worse, no one likes lockdep's false positive detection because that
prevents further one that might be more valuable.  That's why all the
kernel developers are sensitive to lockdep's false positive.

Besides those, by tracking acquisition order, it cannot correctly deal
with read lock and cross-event e.g. wait_for_completion()/complete() for
deadlock detection.  lockdep is no longer a good tool for that purpose.

SOLUTION
--------
Again, *waits* and their *events* that never reach eventually cause
deadlock.  The new solution, DEPT(DEPendency Tracker), focuses on waits
and events themselves.  dept tracks waits and events and report it if
any event would be never reachable.

dept does:
   . Works with read lock in the right way.
   . Works with any wait and event e.i. cross-event.
   . Continue to work even after reporting multiple times.
   . Provides simple and intuitive APIs.
   . Does exactly what dependency checker should do.

Q & A
-----
Q. Is this the first try ever to address the problem?
A. No, cross-release feature (b09be676e0ff2 locking/lockdep: Implement
   the 'crossrelease' feature) addressed it that was a lockdep extension
   and merged but reverted shortly because:

   cross-release started to report valuable hidden problems but started
   to give report false positive reports as well.  For sure, no one
   likes lockdep's false positive reports since it makes lockdep stop,
   preventing reporting further real problems.

Q. Why not dept was developed as an extension of lockdep?
A. lockdep definitely includes all the efforts great developers have
   made for a long time so as to be quite stable enough.  But I had to
   design and implement newly because of the following:

   1) lockdep was designed to track lock acquisition order.  The APIs
      and implementation do not fit on wait-event model.
   2) lockdep is turned off on detection including false positive.
      Which is terrible and prevents developing any extension for
      stronger detection.

Q. Do you intend to totally replace lockdep?
A. No, lockdep also checks if lock usage is correct.  Of course, the
   dependency check routine should be replaced but the other functions
   should be still there.

Q. Do you mean the dependency check routine should be replaced right
   away?
A. No, I admit lockdep is stable enough thanks to great efforts kernel
   developers have made.  lockdep and dept, both should be in the kernel
   until dept gets considered stable.

Q. Stronger detection capability would give more false positive report.
   Which was a big problem when cross-release was introduced.  Is it ok
   with dept?
A. It's ok.  dept allows multiple reporting thanks to simple and quite
   generalized design.  Of course, false positive reports should be
   fixed anyway but it's no longer as a critical problem as it was.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Tested-by: Yeoreum Yun <yeoreum.yun@arm.com>
Tested-by: Yunseong Kim <yskelg@gmail.com>
---
 include/linux/dept.h            |  446 +++++
 include/linux/hardirq.h         |    3 +
 include/linux/sched.h           |  108 ++
 init/init_task.c                |    2 +
 init/main.c                     |    2 +
 kernel/Makefile                 |    1 +
 kernel/dependency/Makefile      |    3 +
 kernel/dependency/dept.c        | 3002 +++++++++++++++++++++++++++++++
 kernel/dependency/dept_hash.h   |   10 +
 kernel/dependency/dept_object.h |   13 +
 kernel/exit.c                   |    1 +
 kernel/fork.c                   |    2 +
 kernel/module/main.c            |    4 +
 kernel/sched/core.c             |    9 +
 lib/Kconfig.debug               |   26 +
 lib/locking-selftest.c          |    2 +
 16 files changed, 3634 insertions(+)
 create mode 100644 include/linux/dept.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_object.h

diff --git a/include/linux/dept.h b/include/linux/dept.h
new file mode 100644
index 000000000000..5f0d2d8c8cbe
--- /dev/null
+++ b/include/linux/dept.h
@@ -0,0 +1,446 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DEPT(DEPendency Tracker) - runtime dependency tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_H
+#define __LINUX_DEPT_H
+
+#ifdef CONFIG_DEPT
+
+#include <linux/types.h>
+
+struct task_struct;
+
+#define DEPT_MAX_STACK_ENTRY		16
+#define DEPT_MAX_WAIT_HIST		64
+#define DEPT_MAX_ECXT_HELD		48
+
+#define DEPT_MAX_SUBCLASSES		16
+#define DEPT_MAX_SUBCLASSES_EVT		2
+#define DEPT_MAX_SUBCLASSES_USR		(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT)
+#define DEPT_MAX_SUBCLASSES_CACHE	2
+
+#define DEPT_SIRQ			0
+#define DEPT_HIRQ			1
+#define DEPT_IRQS_NR			2
+#define DEPT_SIRQF			(1UL << DEPT_SIRQ)
+#define DEPT_HIRQF			(1UL << DEPT_HIRQ)
+
+struct dept_ecxt;
+struct dept_iecxt {
+	struct dept_ecxt		*ecxt;
+	int				enirq;
+	/*
+	 * for preventing to add a new ecxt
+	 */
+	bool				staled;
+};
+
+struct dept_wait;
+struct dept_iwait {
+	struct dept_wait		*wait;
+	int				irq;
+	/*
+	 * for preventing to add a new wait
+	 */
+	bool				staled;
+	bool				touched;
+};
+
+struct dept_class {
+	union {
+		struct llist_node	pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t	ref;
+
+			/*
+			 * unique information about the class
+			 */
+			const char	*name;
+			unsigned long	key;
+			int		sub_id;
+
+			/*
+			 * for BFS
+			 */
+			unsigned int	bfs_gen;
+			struct dept_class *bfs_parent;
+			struct list_head bfs_node;
+
+			/*
+			 * for hashing this object
+			 */
+			struct hlist_node hash_node;
+
+			/*
+			 * for linking all classes
+			 */
+			struct list_head all_node;
+
+			/*
+			 * for associating its dependencies
+			 */
+			struct list_head dep_head;
+			struct list_head dep_rev_head;
+
+			/*
+			 * for tracking IRQ dependencies
+			 */
+			struct dept_iecxt iecxt[DEPT_IRQS_NR];
+			struct dept_iwait iwait[DEPT_IRQS_NR];
+
+			/*
+			 * classified by a map embedded in task_struct,
+			 * not an explicit map
+			 */
+			bool		sched_map;
+		};
+	};
+};
+
+struct dept_key {
+	union {
+		/*
+		 * Each byte-wise address will be used as its key.
+		 */
+		char			base[DEPT_MAX_SUBCLASSES];
+
+		/*
+		 * for caching the main class pointer
+		 */
+		struct dept_class	*classes[DEPT_MAX_SUBCLASSES_CACHE];
+	};
+};
+
+struct dept_map {
+	const char			*name;
+	struct dept_key			*keys;
+
+	/*
+	 * subclass that can be set from user
+	 */
+	int				sub_u;
+
+	/*
+	 * It's local copy for fast access to the associated classes.
+	 * Also used for dept_key for static maps.
+	 */
+	struct dept_key			map_key;
+
+	/*
+	 * wait timestamp associated to this map
+	 */
+	unsigned int			wgen;
+
+	/*
+	 * whether this map should be going to be checked or not
+	 */
+	bool				nocheck;
+};
+
+#define DEPT_MAP_INITIALIZER(n, k)					\
+{									\
+	.name = #n,							\
+	.keys = (struct dept_key *)(k),					\
+	.sub_u = 0,							\
+	.map_key = { .classes = { NULL, } },				\
+	.wgen = 0U,							\
+	.nocheck = false,						\
+}
+
+struct dept_stack {
+	union {
+		struct llist_node	pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t	ref;
+
+			/*
+			 * backtrace entries
+			 */
+			unsigned long	raw[DEPT_MAX_STACK_ENTRY];
+			int nr;
+		};
+	};
+};
+
+struct dept_ecxt {
+	union {
+		struct llist_node	pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t	ref;
+
+			/*
+			 * function that entered to this ecxt
+			 */
+			const char	*ecxt_fn;
+
+			/*
+			 * event function
+			 */
+			const char	*event_fn;
+
+			/*
+			 * associated class
+			 */
+			struct dept_class *class;
+
+			/*
+			 * flag indicating which IRQ has been
+			 * enabled within the event context
+			 */
+			unsigned long	enirqf;
+
+			/*
+			 * where the IRQ-enabled happened
+			 */
+			unsigned long	enirq_ip[DEPT_IRQS_NR];
+			struct dept_stack *enirq_stack[DEPT_IRQS_NR];
+
+			/*
+			 * where the event context started
+			 */
+			unsigned long	ecxt_ip;
+			struct dept_stack *ecxt_stack;
+
+			/*
+			 * where the event triggered
+			 */
+			unsigned long	event_ip;
+			struct dept_stack *event_stack;
+		};
+	};
+};
+
+struct dept_wait {
+	union {
+		struct llist_node	pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t	ref;
+
+			/*
+			 * function causing this wait
+			 */
+			const char	*wait_fn;
+
+			/*
+			 * the associated class
+			 */
+			struct dept_class *class;
+
+			/*
+			 * which IRQ the wait was placed in
+			 */
+			unsigned long	irqf;
+
+			/*
+			 * where the IRQ wait happened
+			 */
+			unsigned long	irq_ip[DEPT_IRQS_NR];
+			struct dept_stack *irq_stack[DEPT_IRQS_NR];
+
+			/*
+			 * where the wait happened
+			 */
+			unsigned long	wait_ip;
+			struct dept_stack *wait_stack;
+
+			/*
+			 * whether this wait is for commit in scheduler
+			 */
+			bool		sched_sleep;
+		};
+	};
+};
+
+struct dept_dep {
+	union {
+		struct llist_node	pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t	ref;
+
+			/*
+			 * key data of dependency
+			 */
+			struct dept_ecxt *ecxt;
+			struct dept_wait *wait;
+
+			/*
+			 * This object can be referred without dept_lock
+			 * held but with IRQ disabled, e.g. for hash
+			 * lookup. So deferred deletion is needed.
+			 */
+			struct rcu_head rh;
+
+			/*
+			 * for hashing this object
+			 */
+			struct hlist_node hash_node;
+
+			/*
+			 * for linking to a class object
+			 */
+			struct list_head dep_node;
+			struct list_head dep_rev_node;
+		};
+	};
+};
+
+struct dept_hash {
+	/*
+	 * hash table
+	 */
+	struct hlist_head		*table;
+
+	/*
+	 * size of the table e.i. 2^bits
+	 */
+	int				bits;
+};
+
+struct dept_ecxt_held {
+	/*
+	 * associated event context
+	 */
+	struct dept_ecxt		*ecxt;
+
+	/*
+	 * unique key for this dept_ecxt_held
+	 */
+	struct dept_map			*map;
+
+	/*
+	 * class of the ecxt of this dept_ecxt_held
+	 */
+	struct dept_class		*class;
+
+	/*
+	 * the wgen when the event context started
+	 */
+	unsigned int			wgen;
+
+	/*
+	 * subclass that only works in the local context
+	 */
+	int				sub_l;
+};
+
+struct dept_wait_hist {
+	/*
+	 * associated wait
+	 */
+	struct dept_wait		*wait;
+
+	/*
+	 * unique id of all waits system-wise until wrapped
+	 */
+	unsigned int			wgen;
+
+	/*
+	 * local context id to identify IRQ context
+	 */
+	unsigned int			ctxt_id;
+};
+
+extern void dept_on(void);
+extern void dept_off(void);
+extern void dept_init(void);
+extern void dept_task_init(struct task_struct *t);
+extern void dept_task_exit(struct task_struct *t);
+extern void dept_free_range(void *start, unsigned int sz);
+
+extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
+extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
+extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
+extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l);
+extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn);
+extern void dept_request_event_wait_commit(void);
+extern void dept_clean_stage(void);
+extern void dept_ttwu_stage_wait(struct task_struct *t, unsigned long ip);
+extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *c_fn, const char *e_fn, int sub_l);
+extern bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f);
+extern void dept_request_event(struct dept_map *m);
+extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn);
+extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned long ip);
+extern void dept_sched_enter(void);
+extern void dept_sched_exit(void);
+
+static inline void dept_ecxt_enter_nokeep(struct dept_map *m)
+{
+	dept_ecxt_enter(m, 0UL, 0UL, NULL, NULL, 0);
+}
+
+/*
+ * for users who want to manage external keys
+ */
+extern void dept_key_init(struct dept_key *k);
+extern void dept_key_destroy(struct dept_key *k);
+extern void dept_map_ecxt_modify(struct dept_map *m, unsigned long e_f, struct dept_key *new_k, unsigned long new_e_f, unsigned long new_ip, const char *new_c_fn, const char *new_e_fn, int new_sub_l);
+
+extern void dept_softirq_enter(void);
+extern void dept_hardirq_enter(void);
+extern void dept_softirqs_on_ip(unsigned long ip);
+extern void dept_hardirqs_on(void);
+extern void dept_softirqs_off(void);
+extern void dept_hardirqs_off(void);
+#else /* !CONFIG_DEPT */
+struct dept_key { };
+struct dept_map { };
+
+#define DEPT_MAP_INITIALIZER(n, k) { }
+
+#define dept_on()					do { } while (0)
+#define dept_off()					do { } while (0)
+#define dept_init()					do { } while (0)
+#define dept_task_init(t)				do { } while (0)
+#define dept_task_exit(t)				do { } while (0)
+#define dept_free_range(s, sz)				do { } while (0)
+
+#define dept_map_init(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
+#define dept_map_reinit(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
+#define dept_map_copy(t, f)				do { } while (0)
+#define dept_wait(m, w_f, ip, w_fn, sl)			do { (void)(w_fn); } while (0)
+#define dept_stage_wait(m, k, ip, w_fn)			do { (void)(k); (void)(w_fn); } while (0)
+#define dept_request_event_wait_commit()		do { } while (0)
+#define dept_clean_stage()				do { } while (0)
+#define dept_ttwu_stage_wait(t, ip)			do { } while (0)
+#define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, sl)	do { (void)(c_fn); (void)(e_fn); } while (0)
+#define dept_ecxt_holding(m, e_f)			false
+#define dept_request_event(m)				do { } while (0)
+#define dept_event(m, e_f, ip, e_fn)			do { (void)(e_fn); } while (0)
+#define dept_ecxt_exit(m, e_f, ip)			do { } while (0)
+#define dept_sched_enter()				do { } while (0)
+#define dept_sched_exit()				do { } while (0)
+#define dept_ecxt_enter_nokeep(m)			do { } while (0)
+#define dept_key_init(k)				do { (void)(k); } while (0)
+#define dept_key_destroy(k)				do { (void)(k); } while (0)
+#define dept_map_ecxt_modify(m, e_f, n_k, n_e_f, n_ip, n_c_fn, n_e_fn, n_sl) do { (void)(n_k); (void)(n_c_fn); (void)(n_e_fn); } while (0)
+
+#define dept_softirq_enter()				do { } while (0)
+#define dept_hardirq_enter()				do { } while (0)
+#define dept_softirqs_on_ip(ip)				do { } while (0)
+#define dept_hardirqs_on()				do { } while (0)
+#define dept_softirqs_off()				do { } while (0)
+#define dept_hardirqs_off()				do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_H */
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index d57cab4d4c06..bb279dbbe748 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -5,6 +5,7 @@
 #include <linux/context_tracking_state.h>
 #include <linux/preempt.h>
 #include <linux/lockdep.h>
+#include <linux/dept.h>
 #include <linux/ftrace_irq.h>
 #include <linux/sched.h>
 #include <linux/vtime.h>
@@ -106,6 +107,7 @@ void irq_exit_rcu(void);
  */
 #define __nmi_enter()						\
 	do {							\
+		dept_off();					\
 		lockdep_off();					\
 		arch_nmi_enter();				\
 		BUG_ON(in_nmi() == NMI_MASK);			\
@@ -128,6 +130,7 @@ void irq_exit_rcu(void);
 		__preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
 		arch_nmi_exit();				\
 		lockdep_on();					\
+		dept_on();					\
 	} while (0)
 
 #define nmi_exit()						\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e4ce0a76831e..ddb162201ba1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -49,6 +49,8 @@
 #include <linux/tracepoint-defs.h>
 #include <linux/unwind_deferred_types.h>
 #include <asm/kmap_size.h>
+#include <linux/spinlock.h>
+#include <linux/dept.h>
 
 /* task_struct member predeclarations (sorted alphabetically): */
 struct audit_context;
@@ -813,6 +815,110 @@ struct kmap_ctrl {
 #endif
 };
 
+#ifdef CONFIG_DEPT
+struct dept_task {
+	/*
+	 * all event contexts that have entered and before exiting
+	 */
+	struct dept_ecxt_held		ecxt_held[DEPT_MAX_ECXT_HELD];
+	int				ecxt_held_pos;
+
+	/*
+	 * ring buffer holding all waits that have happened
+	 */
+	struct dept_wait_hist		wait_hist[DEPT_MAX_WAIT_HIST];
+	int				wait_hist_pos;
+
+	/*
+	 * sequential id to identify each IRQ context
+	 */
+	unsigned int			irq_id[DEPT_IRQS_NR];
+
+	/*
+	 * for tracking IRQ-enabled points with cross-event
+	 */
+	unsigned int			wgen_enirq[DEPT_IRQS_NR];
+
+	/*
+	 * for keeping up-to-date IRQ-enabled points
+	 */
+	unsigned long			enirq_ip[DEPT_IRQS_NR];
+
+	/*
+	 * for reserving a current stack instance at each operation
+	 */
+	struct dept_stack		*stack;
+
+	/*
+	 * for preventing recursive call into DEPT engine
+	 */
+	int				recursive;
+
+	/*
+	 * for preventing reentrance to WARN*() while warning
+	 */
+	int				in_warning;
+
+	/*
+	 * for staging data to commit a wait
+	 */
+	struct dept_map			stage_m;
+	struct dept_map			*stage_real_m;
+	bool				stage_sched_map;
+	const char			*stage_w_fn;
+	unsigned long			stage_ip;
+	arch_spinlock_t			stage_lock;
+
+	/*
+	 * the number of missing ecxts
+	 */
+	int				missing_ecxt;
+
+	/*
+	 * for tracking IRQ-enable state
+	 */
+	bool				hardirqs_enabled;
+	bool				softirqs_enabled;
+
+	/*
+	 * whether the current is on do_exit()
+	 */
+	bool				task_exit;
+
+	/*
+	 * whether the current is running __schedule()
+	 */
+	bool				in_sched;
+};
+
+#define DEPT_TASK_INITIALIZER(t)				\
+{								\
+	.wait_hist = { { .wait = NULL, } },			\
+	.ecxt_held_pos = 0,					\
+	.wait_hist_pos = 0,					\
+	.irq_id = { 0U },					\
+	.wgen_enirq = { 0U },					\
+	.enirq_ip = { 0UL },					\
+	.stack = NULL,						\
+	.recursive = 0,						\
+	.in_warning = 0,					\
+	.stage_m = DEPT_MAP_INITIALIZER((t)->stage_m, NULL),	\
+	.stage_real_m = NULL,					\
+	.stage_sched_map = false,				\
+	.stage_w_fn = NULL,					\
+	.stage_ip = 0UL,					\
+	.stage_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED,\
+	.missing_ecxt = 0,					\
+	.hardirqs_enabled = false,				\
+	.softirqs_enabled = false,				\
+	.task_exit = false,					\
+	.in_sched = false,					\
+}
+#else
+struct dept_task { };
+#define DEPT_TASK_INITIALIZER(t) { }
+#endif
+
 struct task_struct {
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 	/*
@@ -1266,6 +1372,8 @@ struct task_struct {
 	struct held_lock		held_locks[MAX_LOCK_DEPTH];
 #endif
 
+	struct dept_task		dept_task;
+
 #if defined(CONFIG_UBSAN) && !defined(CONFIG_UBSAN_TRAP)
 	unsigned int			in_ubsan;
 #endif
diff --git a/init/init_task.c b/init/init_task.c
index e557f622bd90..84da2464c390 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -14,6 +14,7 @@
 #include <linux/numa.h>
 #include <linux/scs.h>
 #include <linux/plist.h>
+#include <linux/dept.h>
 
 #include <linux/uaccess.h>
 
@@ -204,6 +205,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 	.curr_chain_key = INITIAL_CHAIN_KEY,
 	.lockdep_recursion = 0,
 #endif
+	.dept_task = DEPT_TASK_INITIALIZER(init_task),
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	.ret_stack		= NULL,
 	.tracing_graph_pause	= ATOMIC_INIT(0),
diff --git a/init/main.c b/init/main.c
index 5753e9539ae6..8a9b289c58e4 100644
--- a/init/main.c
+++ b/init/main.c
@@ -66,6 +66,7 @@
 #include <linux/debug_locks.h>
 #include <linux/debugobjects.h>
 #include <linux/lockdep.h>
+#include <linux/dept.h>
 #include <linux/kmemleak.h>
 #include <linux/padata.h>
 #include <linux/pid_namespace.h>
@@ -1038,6 +1039,7 @@ void start_kernel(void)
 		      panic_param);
 
 	lockdep_init();
+	dept_init();
 
 	/*
 	 * Need to run this when irqs are enabled, because it wants
diff --git a/kernel/Makefile b/kernel/Makefile
index c60623448235..72c0d9767c89 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -56,6 +56,7 @@ obj-y += dma/
 obj-y += entry/
 obj-y += unwind/
 obj-$(CONFIG_MODULES) += module/
+obj-y += dependency/
 
 obj-$(CONFIG_KCMP) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
new file mode 100644
index 000000000000..b5cfb8a03c0c
--- /dev/null
+++ b/kernel/dependency/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DEPT) += dept.o
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
new file mode 100644
index 000000000000..712b7f79a095
--- /dev/null
+++ b/kernel/dependency/dept.c
@@ -0,0 +1,3002 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DEPT(DEPendency Tracker) - Runtime dependency tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
+ *
+ * DEPT provides a general way to detect potential deadlocks at runtime
+ * and the interest is not limited to typical lock but to every
+ * synchronization primitives.
+ *
+ * The following ideas were borrowed from LOCKDEP:
+ *
+ *    1) Use a graph to track relationship between classes.
+ *    2) Prevent performance regression using hash.
+ *
+ * The following items were enhanced from LOCKDEP:
+ *
+ *    1) Cover more deadlock cases.
+ *    2) Allow multiple reports.
+ *
+ * TODO: Both LOCKDEP and DEPT should co-exist until DEPT is considered
+ * stable. Then the dependency check routine should be replaced with
+ * DEPT after. It should finally look like:
+ *
+ *
+ *
+ * As is:
+ *
+ *    LOCKDEP
+ *    +-----------------------------------------+
+ *    | Lock usage correctness check            | <-> locks
+ *    |                                         |
+ *    |                                         |
+ *    | +-------------------------------------+ |
+ *    | | Dependency check                    | |
+ *    | | (by tracking lock acquisition order)| |
+ *    | +-------------------------------------+ |
+ *    |                                         |
+ *    +-----------------------------------------+
+ *
+ *    DEPT
+ *    +-----------------------------------------+
+ *    | Dependency check                        | <-> waits/events
+ *    | (by tracking wait and event context)    |
+ *    +-----------------------------------------+
+ *
+ *
+ *
+ * To be:
+ *
+ *    LOCKDEP
+ *    +-----------------------------------------+
+ *    | Lock usage correctness check            | <-> locks
+ *    |                                         |
+ *    |                                         |
+ *    |       (Request dependency check)        |
+ *    |                    T                    |
+ *    +--------------------|--------------------+
+ *                         |
+ *    DEPT                 V
+ *    +-----------------------------------------+
+ *    | Dependency check                        | <-> waits/events
+ *    | (by tracking wait and event context)    |
+ *    +-----------------------------------------+
+ */
+
+#include <linux/sched.h>
+#include <linux/stacktrace.h>
+#include <linux/spinlock.h>
+#include <linux/kallsyms.h>
+#include <linux/hash.h>
+#include <linux/dept.h>
+#include <linux/utsname.h>
+#include <linux/kernel.h>
+
+static int dept_stop;
+static int dept_per_cpu_ready;
+
+static inline struct dept_task *dept_task(void)
+{
+	return &current->dept_task;
+}
+
+#define DEPT_READY_WARN (!oops_in_progress && !dept_task()->in_warning)
+
+/*
+ * Make all operations using DEPT_WARN_ON() fail on oops_in_progress and
+ * prevent warning message.
+ */
+#define DEPT_WARN_ON_ONCE(c)						\
+	({								\
+		int __ret = !!(c);					\
+									\
+		if (likely(DEPT_READY_WARN)) {				\
+			++dept_task()->in_warning;			\
+			WARN_ONCE(c, "DEPT_WARN_ON_ONCE: " #c);		\
+			--dept_task()->in_warning;			\
+		}							\
+		__ret;							\
+	})
+
+#define DEPT_WARN_ONCE(s...)						\
+	({								\
+		if (likely(DEPT_READY_WARN)) {				\
+			++dept_task()->in_warning;			\
+			WARN_ONCE(1, "DEPT_WARN_ONCE: " s);		\
+			--dept_task()->in_warning;			\
+		}							\
+	})
+
+#define DEPT_WARN_ON(c)							\
+	({								\
+		int __ret = !!(c);					\
+									\
+		if (likely(DEPT_READY_WARN)) {				\
+			++dept_task()->in_warning;			\
+			WARN(c, "DEPT_WARN_ON: " #c);			\
+			--dept_task()->in_warning;			\
+		}							\
+		__ret;							\
+	})
+
+#define DEPT_WARN(s...)							\
+	({								\
+		if (likely(DEPT_READY_WARN)) {				\
+			++dept_task()->in_warning;			\
+			WARN(1, "DEPT_WARN: " s);			\
+			--dept_task()->in_warning;			\
+		}							\
+	})
+
+#define DEPT_STOP(s...)							\
+	({								\
+		WRITE_ONCE(dept_stop, 1);				\
+		if (likely(DEPT_READY_WARN)) {				\
+			++dept_task()->in_warning;			\
+			WARN(1, "DEPT_STOP: " s);			\
+			--dept_task()->in_warning;			\
+		}							\
+	})
+
+#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s)
+
+static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+
+/*
+ * DEPT internal engine should be cautious in using outside functions
+ * e.g. printk at reporting since that kind of usage might cause
+ * untrackable deadlock.
+ */
+static atomic_t dept_outworld = ATOMIC_INIT(0);
+
+static void dept_outworld_enter(void)
+{
+	atomic_inc(&dept_outworld);
+}
+
+static void dept_outworld_exit(void)
+{
+	atomic_dec(&dept_outworld);
+}
+
+static bool dept_outworld_entered(void)
+{
+	return atomic_read(&dept_outworld);
+}
+
+static bool dept_lock(void)
+{
+	while (!arch_spin_trylock(&dept_spin))
+		if (unlikely(dept_outworld_entered()))
+			return false;
+	return true;
+}
+
+static void dept_unlock(void)
+{
+	arch_spin_unlock(&dept_spin);
+}
+
+enum bfs_ret {
+	BFS_CONTINUE,
+	BFS_DONE,
+	BFS_SKIP,
+};
+
+static bool before(unsigned int a, unsigned int b)
+{
+	return (int)(a - b) < 0;
+}
+
+static bool valid_stack(struct dept_stack *s)
+{
+	return s && s->nr > 0;
+}
+
+static bool valid_class(struct dept_class *c)
+{
+	return c->key;
+}
+
+static void invalidate_class(struct dept_class *c)
+{
+	c->key = 0UL;
+}
+
+static struct dept_ecxt *dep_e(struct dept_dep *d)
+{
+	return d->ecxt;
+}
+
+static struct dept_wait *dep_w(struct dept_dep *d)
+{
+	return d->wait;
+}
+
+static struct dept_class *dep_fc(struct dept_dep *d)
+{
+	return dep_e(d)->class;
+}
+
+static struct dept_class *dep_tc(struct dept_dep *d)
+{
+	return dep_w(d)->class;
+}
+
+static const char *irq_str(int irq)
+{
+	if (irq == DEPT_SIRQ)
+		return "softirq";
+	if (irq == DEPT_HIRQ)
+		return "hardirq";
+	return "(unknown)";
+}
+
+/*
+ * Dept doesn't work either when it's stopped by DEPT_STOP() or in a nmi
+ * context.
+ */
+static bool dept_working(void)
+{
+	return !READ_ONCE(dept_stop) && !in_nmi();
+}
+
+/*
+ * Even k == NULL is considered as a valid key because it would use
+ * &->map_key as the key in that case.
+ */
+struct dept_key __dept_no_validate__;
+static bool valid_key(struct dept_key *k)
+{
+	return &__dept_no_validate__ != k;
+}
+
+/*
+ * Pool
+ * =====================================================================
+ * DEPT maintains pools to provide objects in a safe way.
+ *
+ *    1) Static pool is used at the beginning of booting time.
+ *    2) Local pool is tried first before the static pool. Objects that
+ *       have been freed will be placed.
+ */
+
+enum object_t {
+#define OBJECT(id, nr) OBJECT_##id,
+	#include "dept_object.h"
+#undef OBJECT
+	OBJECT_NR,
+};
+
+#define OBJECT(id, nr)							\
+static struct dept_##id spool_##id[nr];					\
+static DEFINE_PER_CPU(struct llist_head, lpool_##id);
+	#include "dept_object.h"
+#undef OBJECT
+
+struct dept_pool {
+	const char			*name;
+
+	/*
+	 * object size
+	 */
+	size_t				obj_sz;
+
+	/*
+	 * the number of the static array
+	 */
+	atomic_t			obj_nr;
+
+	/*
+	 * offset of ->pool_node
+	 */
+	size_t				node_off;
+
+	/*
+	 * pointer to the pool
+	 */
+	void				*spool;
+	struct llist_head		boot_pool;
+	struct llist_head __percpu	*lpool;
+};
+
+static struct dept_pool pool[OBJECT_NR] = {
+#define OBJECT(id, nr) {						\
+	.name = #id,							\
+	.obj_sz = sizeof(struct dept_##id),				\
+	.obj_nr = ATOMIC_INIT(nr),					\
+	.node_off = offsetof(struct dept_##id, pool_node),		\
+	.spool = spool_##id,						\
+	.lpool = &lpool_##id, },
+	#include "dept_object.h"
+#undef OBJECT
+};
+
+/*
+ * Can use llist no matter whether CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG is
+ * enabled or not because NMI and other contexts in the same CPU never
+ * run inside of DEPT concurrently by preventing reentrance.
+ */
+static void *from_pool(enum object_t t)
+{
+	struct dept_pool *p;
+	struct llist_head *h;
+	struct llist_node *n;
+
+	/*
+	 * llist_del_first() doesn't allow concurrent access e.g.
+	 * between process and IRQ context.
+	 */
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return NULL;
+
+	p = &pool[t];
+
+	/*
+	 * Try local pool first.
+	 */
+	if (likely(dept_per_cpu_ready))
+		h = this_cpu_ptr(p->lpool);
+	else
+		h = &p->boot_pool;
+
+	n = llist_del_first(h);
+	if (n)
+		return (void *)n - p->node_off;
+
+	/*
+	 * Try static pool.
+	 */
+	if (atomic_read(&p->obj_nr) > 0) {
+		int idx = atomic_dec_return(&p->obj_nr);
+
+		if (idx >= 0)
+			return p->spool + (idx * p->obj_sz);
+	}
+
+	DEPT_INFO_ONCE("---------------------------------------------\n"
+		"  Some of Dept internal resources are run out.\n"
+		"  Dept might still work if the resources get freed.\n"
+		"  However, the chances are Dept will suffer from\n"
+		"  the lack from now. Needs to extend the internal\n"
+		"  resource pools. Ask max.byungchul.park@gmail.com\n");
+	return NULL;
+}
+
+static void to_pool(void *o, enum object_t t)
+{
+	struct dept_pool *p = &pool[t];
+	struct llist_head *h;
+
+	preempt_disable();
+	if (likely(dept_per_cpu_ready))
+		h = this_cpu_ptr(p->lpool);
+	else
+		h = &p->boot_pool;
+
+	llist_add(o + p->node_off, h);
+	preempt_enable();
+}
+
+#define OBJECT(id, nr)							\
+static void (*ctor_##id)(struct dept_##id *a);				\
+static void (*dtor_##id)(struct dept_##id *a);				\
+static struct dept_##id *new_##id(void)					\
+{									\
+	struct dept_##id *a;						\
+									\
+	a = (struct dept_##id *)from_pool(OBJECT_##id);			\
+	if (unlikely(!a))						\
+		return NULL;						\
+									\
+	atomic_set(&a->ref, 1);						\
+									\
+	if (ctor_##id)							\
+		ctor_##id(a);						\
+									\
+	return a;							\
+}									\
+									\
+static struct dept_##id *get_##id(struct dept_##id *a)			\
+{									\
+	atomic_inc(&a->ref);						\
+	return a;							\
+}									\
+									\
+static void put_##id(struct dept_##id *a)				\
+{									\
+	if (!atomic_dec_return(&a->ref)) {				\
+		if (dtor_##id)						\
+			dtor_##id(a);					\
+		to_pool(a, OBJECT_##id);				\
+	}								\
+}									\
+									\
+static void del_##id(struct dept_##id *a)				\
+{									\
+	put_##id(a);							\
+}									\
+									\
+static bool __maybe_unused id##_consumed(struct dept_##id *a)		\
+{									\
+	return a && atomic_read(&a->ref) > 1;				\
+}
+#include "dept_object.h"
+#undef OBJECT
+
+#define SET_CONSTRUCTOR(id, f) \
+static void (*ctor_##id)(struct dept_##id *a) = f
+
+static void initialize_dep(struct dept_dep *d)
+{
+	INIT_LIST_HEAD(&d->dep_node);
+	INIT_LIST_HEAD(&d->dep_rev_node);
+}
+SET_CONSTRUCTOR(dep, initialize_dep);
+
+static void initialize_class(struct dept_class *c)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		struct dept_iecxt *ie = &c->iecxt[i];
+		struct dept_iwait *iw = &c->iwait[i];
+
+		ie->ecxt = NULL;
+		ie->enirq = i;
+		ie->staled = false;
+
+		iw->wait = NULL;
+		iw->irq = i;
+		iw->staled = false;
+		iw->touched = false;
+	}
+	c->bfs_gen = 0U;
+
+	INIT_LIST_HEAD(&c->all_node);
+	INIT_LIST_HEAD(&c->dep_head);
+	INIT_LIST_HEAD(&c->dep_rev_head);
+	INIT_LIST_HEAD(&c->bfs_node);
+}
+SET_CONSTRUCTOR(class, initialize_class);
+
+static void initialize_ecxt(struct dept_ecxt *e)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		e->enirq_stack[i] = NULL;
+		e->enirq_ip[i] = 0UL;
+	}
+	e->ecxt_ip = 0UL;
+	e->ecxt_stack = NULL;
+	e->enirqf = 0UL;
+	e->event_ip = 0UL;
+	e->event_stack = NULL;
+}
+SET_CONSTRUCTOR(ecxt, initialize_ecxt);
+
+static void initialize_wait(struct dept_wait *w)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		w->irq_stack[i] = NULL;
+		w->irq_ip[i] = 0UL;
+	}
+	w->wait_ip = 0UL;
+	w->wait_stack = NULL;
+	w->irqf = 0UL;
+}
+SET_CONSTRUCTOR(wait, initialize_wait);
+
+static void initialize_stack(struct dept_stack *s)
+{
+	s->nr = 0;
+}
+SET_CONSTRUCTOR(stack, initialize_stack);
+
+#define OBJECT(id, nr) \
+static void (*ctor_##id)(struct dept_##id *a);
+	#include "dept_object.h"
+#undef OBJECT
+
+#undef SET_CONSTRUCTOR
+
+#define SET_DESTRUCTOR(id, f) \
+static void (*dtor_##id)(struct dept_##id *a) = f
+
+static void destroy_dep(struct dept_dep *d)
+{
+	if (dep_e(d))
+		put_ecxt(dep_e(d));
+	if (dep_w(d))
+		put_wait(dep_w(d));
+}
+SET_DESTRUCTOR(dep, destroy_dep);
+
+static void destroy_ecxt(struct dept_ecxt *e)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++)
+		if (e->enirq_stack[i])
+			put_stack(e->enirq_stack[i]);
+	if (e->class)
+		put_class(e->class);
+	if (e->ecxt_stack)
+		put_stack(e->ecxt_stack);
+	if (e->event_stack)
+		put_stack(e->event_stack);
+}
+SET_DESTRUCTOR(ecxt, destroy_ecxt);
+
+static void destroy_wait(struct dept_wait *w)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++)
+		if (w->irq_stack[i])
+			put_stack(w->irq_stack[i]);
+	if (w->class)
+		put_class(w->class);
+	if (w->wait_stack)
+		put_stack(w->wait_stack);
+}
+SET_DESTRUCTOR(wait, destroy_wait);
+
+#define OBJECT(id, nr) \
+static void (*dtor_##id)(struct dept_##id *a);
+	#include "dept_object.h"
+#undef OBJECT
+
+#undef SET_DESTRUCTOR
+
+/*
+ * Caching and hashing
+ * =====================================================================
+ * DEPT makes use of caching and hashing to improve performance. Each
+ * object can be obtained in O(1) with its key.
+ *
+ * NOTE: Currently we assume all the objects in the hashs will never be
+ * removed. Implement it when needed.
+ */
+
+/*
+ * Some information might be lost but it's only for hashing key.
+ */
+static unsigned long mix(unsigned long a, unsigned long b)
+{
+	int halfbits = sizeof(unsigned long) * 8 / 2;
+	unsigned long halfmask = (1UL << halfbits) - 1UL;
+
+	return (a << halfbits) | (b & halfmask);
+}
+
+static bool cmp_dep(struct dept_dep *d1, struct dept_dep *d2)
+{
+	return dep_fc(d1)->key == dep_fc(d2)->key &&
+	       dep_tc(d1)->key == dep_tc(d2)->key;
+}
+
+static unsigned long key_dep(struct dept_dep *d)
+{
+	return mix(dep_fc(d)->key, dep_tc(d)->key);
+}
+
+static bool cmp_class(struct dept_class *c1, struct dept_class *c2)
+{
+	return c1->key == c2->key;
+}
+
+static unsigned long key_class(struct dept_class *c)
+{
+	return c->key;
+}
+
+#define HASH(id, bits)							\
+static struct hlist_head table_##id[1 << (bits)];			\
+									\
+static struct hlist_head *head_##id(struct dept_##id *a)		\
+{									\
+	return table_##id + hash_long(key_##id(a), bits);		\
+}									\
+									\
+static struct dept_##id *hash_lookup_##id(struct dept_##id *a)		\
+{									\
+	struct dept_##id *b;						\
+									\
+	hlist_for_each_entry_rcu(b, head_##id(a), hash_node)		\
+		if (cmp_##id(a, b))					\
+			return b;					\
+	return NULL;							\
+}									\
+									\
+static void hash_add_##id(struct dept_##id *a)				\
+{									\
+	get_##id(a);							\
+	hlist_add_head_rcu(&a->hash_node, head_##id(a));		\
+}									\
+									\
+static void hash_del_##id(struct dept_##id *a)				\
+{									\
+	hlist_del_rcu(&a->hash_node);					\
+	put_##id(a);							\
+}
+#include "dept_hash.h"
+#undef HASH
+
+static struct dept_dep *lookup_dep(struct dept_class *fc,
+				   struct dept_class *tc)
+{
+	struct dept_ecxt onetime_e = { .class = fc };
+	struct dept_wait onetime_w = { .class = tc };
+	struct dept_dep  onetime_d = { .ecxt = &onetime_e,
+				       .wait = &onetime_w };
+	return hash_lookup_dep(&onetime_d);
+}
+
+static struct dept_class *lookup_class(unsigned long key)
+{
+	struct dept_class onetime_c = { .key = key };
+
+	return hash_lookup_class(&onetime_c);
+}
+
+/*
+ * Report
+ * =====================================================================
+ * DEPT prints useful information to help debugging on detection of
+ * problematic dependency.
+ */
+
+static void print_ip_stack(unsigned long ip, struct dept_stack *s)
+{
+	if (ip)
+		print_ip_sym(KERN_WARNING, ip);
+
+#ifdef CONFIG_DEPT_DEBUG
+	if (!s)
+		pr_warn("stack is NULL.\n");
+	else if (!s->nr)
+		pr_warn("stack->nr is 0.\n");
+	if (s)
+		pr_warn("stack ref is %d.\n", atomic_read(&s->ref));
+#endif
+
+	if (valid_stack(s)) {
+		pr_warn("stacktrace:\n");
+		stack_trace_print(s->raw, s->nr, 5);
+	}
+
+	if (!ip && !valid_stack(s))
+		pr_warn("(N/A)\n");
+}
+
+#define print_spc(spc, fmt, ...) \
+	pr_warn("%*c" fmt, (spc) * 3, ' ', ##__VA_ARGS__)
+
+static void print_diagram(struct dept_dep *d)
+{
+	struct dept_ecxt *e = dep_e(d);
+	struct dept_wait *w = dep_w(d);
+	struct dept_class *fc = dep_fc(d);
+	struct dept_class *tc = dep_tc(d);
+	unsigned long irqf;
+	int irq;
+	bool firstline = true;
+	int spc = 1;
+	const char *w_fn = w->wait_fn ?: "(unknown)";
+	const char *e_fn = e->event_fn ?: "(unknown)";
+	const char *c_fn = e->ecxt_fn ?: "(unknown)";
+	const char *fc_n = fc->sched_map ? "<sched>" : (fc->name ?: "(unknown)");
+	const char *tc_n = tc->sched_map ? "<sched>" : (tc->name ?: "(unknown)");
+
+	irqf = e->enirqf & w->irqf;
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		if (!firstline)
+			pr_warn("\nor\n\n");
+		firstline = false;
+
+		print_spc(spc, "[S] %s(%s:%d)\n", c_fn, fc_n, fc->sub_id);
+		print_spc(spc, "    <%s interrupt>\n", irq_str(irq));
+		print_spc(spc + 1, "[W] %s(%s:%d)\n", w_fn, tc_n, tc->sub_id);
+		print_spc(spc, "[E] %s(%s:%d)\n", e_fn, fc_n, fc->sub_id);
+	}
+
+	if (!irqf) {
+		print_spc(spc, "[S] %s(%s:%d)\n", c_fn, fc_n, fc->sub_id);
+		print_spc(spc, "[W] %s(%s:%d)\n", w_fn, tc_n, tc->sub_id);
+		print_spc(spc, "[E] %s(%s:%d)\n", e_fn, fc_n, fc->sub_id);
+	}
+}
+
+static void print_dep(struct dept_dep *d)
+{
+	struct dept_ecxt *e = dep_e(d);
+	struct dept_wait *w = dep_w(d);
+	struct dept_class *fc = dep_fc(d);
+	struct dept_class *tc = dep_tc(d);
+	unsigned long irqf;
+	int irq;
+	const char *w_fn = w->wait_fn ?: "(unknown)";
+	const char *e_fn = e->event_fn ?: "(unknown)";
+	const char *c_fn = e->ecxt_fn ?: "(unknown)";
+	const char *fc_n = fc->sched_map ? "<sched>" : (fc->name ?: "(unknown)");
+	const char *tc_n = tc->sched_map ? "<sched>" : (tc->name ?: "(unknown)");
+
+	irqf = e->enirqf & w->irqf;
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		pr_warn("%s has been enabled:\n", irq_str(irq));
+		print_ip_stack(e->enirq_ip[irq], e->enirq_stack[irq]);
+		pr_warn("\n");
+
+		pr_warn("[S] %s(%s:%d):\n", c_fn, fc_n, fc->sub_id);
+		print_ip_stack(e->ecxt_ip, e->ecxt_stack);
+		pr_warn("\n");
+
+		pr_warn("[W] %s(%s:%d) in %s context:\n",
+		       w_fn, tc_n, tc->sub_id, irq_str(irq));
+		print_ip_stack(w->irq_ip[irq], w->irq_stack[irq]);
+		pr_warn("\n");
+
+		pr_warn("[E] %s(%s:%d):\n", e_fn, fc_n, fc->sub_id);
+		print_ip_stack(e->event_ip, e->event_stack);
+	}
+
+	if (!irqf) {
+		pr_warn("[S] %s(%s:%d):\n", c_fn, fc_n, fc->sub_id);
+		print_ip_stack(e->ecxt_ip, e->ecxt_stack);
+		pr_warn("\n");
+
+		pr_warn("[W] %s(%s:%d):\n", w_fn, tc_n, tc->sub_id);
+		print_ip_stack(w->wait_ip, w->wait_stack);
+		pr_warn("\n");
+
+		pr_warn("[E] %s(%s:%d):\n", e_fn, fc_n, fc->sub_id);
+		print_ip_stack(e->event_ip, e->event_stack);
+	}
+}
+
+static void save_current_stack(int skip);
+
+/*
+ * Print all classes in a circle.
+ */
+static void print_circle(struct dept_class *c)
+{
+	struct dept_class *fc = c->bfs_parent;
+	struct dept_class *tc = c;
+	int i;
+
+	dept_outworld_enter();
+	save_current_stack(6);
+
+	pr_warn("===================================================\n");
+	pr_warn("DEPT: Circular dependency has been detected.\n");
+	pr_warn("%s %.*s %s\n", init_utsname()->release,
+		(int)strcspn(init_utsname()->version, " "),
+		init_utsname()->version,
+		print_tainted());
+	pr_warn("---------------------------------------------------\n");
+	pr_warn("summary\n");
+	pr_warn("---------------------------------------------------\n");
+
+	if (fc == tc)
+		pr_warn("*** AA DEADLOCK ***\n\n");
+	else
+		pr_warn("*** DEADLOCK ***\n\n");
+
+	i = 0;
+	do {
+		struct dept_dep *d = lookup_dep(fc, tc);
+
+		pr_warn("context %c\n", 'A' + (i++));
+		print_diagram(d);
+		if (fc != c)
+			pr_warn("\n");
+
+		tc = fc;
+		fc = fc->bfs_parent;
+	} while (tc != c);
+
+	pr_warn("\n");
+	pr_warn("[S]: start of the event context\n");
+	pr_warn("[W]: the wait blocked\n");
+	pr_warn("[E]: the event not reachable\n");
+
+	i = 0;
+	do {
+		struct dept_dep *d = lookup_dep(fc, tc);
+
+		pr_warn("---------------------------------------------------\n");
+		pr_warn("context %c's detail\n", 'A' + i);
+		pr_warn("---------------------------------------------------\n");
+		pr_warn("context %c\n", 'A' + (i++));
+		print_diagram(d);
+		pr_warn("\n");
+		print_dep(d);
+
+		tc = fc;
+		fc = fc->bfs_parent;
+	} while (tc != c);
+
+	pr_warn("---------------------------------------------------\n");
+	pr_warn("information that might be helpful\n");
+	pr_warn("---------------------------------------------------\n");
+	dump_stack();
+
+	dept_outworld_exit();
+}
+
+/*
+ * BFS(Breadth First Search)
+ * =====================================================================
+ * Whenever a new dependency is added into the graph, search the graph
+ * for a new circular dependency.
+ */
+
+struct bfs_ops {
+	void (*bfs_init)(void *, void *, void **);
+	void (*extend)(struct list_head *, void *);
+	void *(*dequeue)(struct list_head *);
+	enum bfs_ret (*callback)(void *, void *, void **);
+};
+
+static unsigned int bfs_gen;
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void bfs(void *root, struct bfs_ops *ops, void *in, void **out)
+{
+	LIST_HEAD(q);
+	enum bfs_ret ret;
+
+	if (DEPT_WARN_ON(!ops || !ops->bfs_init || !ops->extend ||
+				!ops->dequeue || !ops->callback))
+		return;
+
+	/*
+	 * Avoid zero bfs_gen.
+	 */
+	bfs_gen = bfs_gen + 1 ?: 1;
+	ops->bfs_init(root, in, out);
+
+	ret = ops->callback(root, in, out);
+	if (ret != BFS_CONTINUE)
+		return;
+
+	ops->extend(&q, root);
+	while (!list_empty(&q)) {
+		void *node = ops->dequeue(&q);
+
+		if (ret == BFS_DONE)
+			continue;
+
+		ret = ops->callback(node, in, out);
+		if (ret == BFS_CONTINUE)
+			ops->extend(&q, node);
+	}
+}
+
+/*
+ * Main operations
+ * =====================================================================
+ * Add dependencies - Each new dependency is added into the graph and
+ * checked if it forms a circular dependency.
+ *
+ * Track waits - Waits are queued into the ring buffer for later use to
+ * generate appropriate dependencies with cross-event.
+ *
+ * Track event contexts(ecxt) - Event contexts are pushed into local
+ * stack for later use to generate appropriate dependencies with waits.
+ */
+
+static unsigned long cur_enirqf(void);
+static int cur_irq(void);
+static unsigned int cur_ctxt_id(void);
+
+static struct dept_iecxt *iecxt(struct dept_class *c, int irq)
+{
+	return &c->iecxt[irq];
+}
+
+static struct dept_iwait *iwait(struct dept_class *c, int irq)
+{
+	return &c->iwait[irq];
+}
+
+static void stale_iecxt(struct dept_iecxt *ie)
+{
+	if (ie->ecxt)
+		put_ecxt(ie->ecxt);
+
+	WRITE_ONCE(ie->ecxt, NULL);
+	WRITE_ONCE(ie->staled, true);
+}
+
+static void set_iecxt(struct dept_iecxt *ie, struct dept_ecxt *e)
+{
+	/*
+	 * ->ecxt will never be updated once getting set until the class
+	 * gets removed.
+	 */
+	if (ie->ecxt)
+		DEPT_WARN_ON(1);
+	else
+		WRITE_ONCE(ie->ecxt, get_ecxt(e));
+}
+
+static void stale_iwait(struct dept_iwait *iw)
+{
+	if (iw->wait)
+		put_wait(iw->wait);
+
+	WRITE_ONCE(iw->wait, NULL);
+	WRITE_ONCE(iw->staled, true);
+}
+
+static void set_iwait(struct dept_iwait *iw, struct dept_wait *w)
+{
+	/*
+	 * ->wait will never be updated once getting set until the class
+	 * gets removed.
+	 */
+	if (iw->wait)
+		DEPT_WARN_ON(1);
+	else
+		WRITE_ONCE(iw->wait, get_wait(w));
+
+	iw->touched = true;
+}
+
+static void touch_iwait(struct dept_iwait *iw)
+{
+	iw->touched = true;
+}
+
+static void untouch_iwait(struct dept_iwait *iw)
+{
+	iw->touched = false;
+}
+
+static struct dept_stack *get_current_stack(void)
+{
+	struct dept_stack *s = dept_task()->stack;
+
+	return s ? get_stack(s) : NULL;
+}
+
+static void prepare_current_stack(void)
+{
+	DEPT_WARN_ON(dept_task()->stack);
+
+	dept_task()->stack = new_stack();
+}
+
+static void save_current_stack(int skip)
+{
+	struct dept_stack *s = dept_task()->stack;
+
+	if (!s)
+		return;
+
+	if (valid_stack(s))
+		return;
+
+	s->nr = stack_trace_save(s->raw, DEPT_MAX_STACK_ENTRY, skip);
+}
+
+static void finish_current_stack(void)
+{
+	struct dept_stack *s = dept_task()->stack;
+
+	/*
+	 * Fill the struct dept_stack with a valid stracktrace if it has
+	 * been referred at least once.
+	 */
+	if (stack_consumed(s))
+		save_current_stack(2);
+
+	dept_task()->stack = NULL;
+
+	/*
+	 * Actual deletion will happen at put_stack() if the stack has
+	 * been referred.
+	 */
+	if (s)
+		del_stack(s);
+}
+
+/*
+ * FIXME: For now, disable LOCKDEP while DEPT is working.
+ *
+ * Both LOCKDEP and DEPT report it on a deadlock detection using
+ * printk taking the risk of another deadlock that might be caused by
+ * locks of console or printk between inside and outside of them.
+ *
+ * For DEPT, it's no problem since multiple reports are allowed. But it
+ * would be a bad idea for LOCKDEP since it will stop even on a singe
+ * report. So we need to prevent LOCKDEP from its reporting the risk
+ * DEPT would take when reporting something.
+ */
+#include <linux/lockdep.h>
+
+void noinstr dept_off(void)
+{
+	dept_task()->recursive++;
+	lockdep_off();
+}
+
+void noinstr dept_on(void)
+{
+	lockdep_on();
+	dept_task()->recursive--;
+}
+
+static unsigned long dept_enter(void)
+{
+	unsigned long flags;
+
+	flags = arch_local_irq_save();
+	dept_off();
+	prepare_current_stack();
+	return flags;
+}
+
+static void dept_exit(unsigned long flags)
+{
+	finish_current_stack();
+	dept_on();
+	arch_local_irq_restore(flags);
+}
+
+static unsigned long dept_enter_recursive(void)
+{
+	unsigned long flags;
+
+	flags = arch_local_irq_save();
+	return flags;
+}
+
+static void dept_exit_recursive(unsigned long flags)
+{
+	arch_local_irq_restore(flags);
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static struct dept_dep *__add_dep(struct dept_ecxt *e,
+				  struct dept_wait *w)
+{
+	struct dept_dep *d;
+
+	if (DEPT_WARN_ON(!valid_class(e->class)))
+		return NULL;
+
+	if (DEPT_WARN_ON(!valid_class(w->class)))
+		return NULL;
+
+	if (lookup_dep(e->class, w->class))
+		return NULL;
+
+	d = new_dep();
+	if (unlikely(!d))
+		return NULL;
+
+	d->ecxt = get_ecxt(e);
+	d->wait = get_wait(w);
+
+	/*
+	 * Add the dependency into hash and graph.
+	 */
+	hash_add_dep(d);
+	list_add(&d->dep_node, &dep_fc(d)->dep_head);
+	list_add(&d->dep_rev_node, &dep_tc(d)->dep_rev_head);
+	return d;
+}
+
+static void bfs_init_check_dl(void *node, void *in, void **out)
+{
+	struct dept_class *root = (struct dept_class *)node;
+	struct dept_dep *new = (struct dept_dep *)in;
+
+	root->bfs_gen = bfs_gen;
+	dep_tc(new)->bfs_parent = dep_fc(new);
+}
+
+static void bfs_extend_dep(struct list_head *h, void *node)
+{
+	struct dept_class *cur = (struct dept_class *)node;
+	struct dept_dep *d;
+
+	list_for_each_entry(d, &cur->dep_head, dep_node) {
+		struct dept_class *next = dep_tc(d);
+
+		if (bfs_gen == next->bfs_gen)
+			continue;
+		next->bfs_parent = cur;
+		next->bfs_gen = bfs_gen;
+		list_add_tail(&next->bfs_node, h);
+	}
+}
+
+static void *bfs_dequeue_dep(struct list_head *h)
+{
+	struct dept_class *c;
+
+	DEPT_WARN_ON(list_empty(h));
+
+	c = list_first_entry(h, struct dept_class, bfs_node);
+	list_del(&c->bfs_node);
+	return c;
+}
+
+static enum bfs_ret cb_check_dl(void *node, void *in, void **out)
+{
+	struct dept_class *cur = (struct dept_class *)node;
+	struct dept_dep *new = (struct dept_dep *)in;
+
+	if (cur == dep_fc(new)) {
+		print_circle(dep_tc(new));
+		return BFS_DONE;
+	}
+
+	return BFS_CONTINUE;
+}
+
+/*
+ * This function is actually in charge of reporting.
+ */
+static void check_dl_bfs(struct dept_dep *d)
+{
+	struct bfs_ops ops = {
+		.bfs_init = bfs_init_check_dl,
+		.extend = bfs_extend_dep,
+		.dequeue = bfs_dequeue_dep,
+		.callback = cb_check_dl,
+	};
+
+	bfs((void *)dep_tc(d), &ops, (void *)d, NULL);
+}
+
+static void bfs_init_dep(void *node, void *in, void **out)
+{
+	struct dept_class *root = (struct dept_class *)node;
+
+	root->bfs_gen = bfs_gen;
+}
+
+static void bfs_extend_dep_rev(struct list_head *h, void *node)
+{
+	struct dept_class *cur = (struct dept_class *)node;
+	struct dept_dep *d;
+
+	list_for_each_entry(d, &cur->dep_rev_head, dep_rev_node) {
+		struct dept_class *next = dep_fc(d);
+
+		if (bfs_gen == next->bfs_gen)
+			continue;
+		next->bfs_parent = cur;
+		next->bfs_gen = bfs_gen;
+		list_add_tail(&next->bfs_node, h);
+	}
+}
+
+static enum bfs_ret cb_find_iw(void *node, void *in, void **out)
+{
+	struct dept_class *cur = (struct dept_class *)node;
+	int irq = *(int *)in;
+	struct dept_iwait *iw;
+
+	if (DEPT_WARN_ON(!out))
+		return BFS_DONE;
+
+	iw = iwait(cur, irq);
+
+	/*
+	 * If any parent's ->wait was set, then the children would've
+	 * been touched.
+	 */
+	if (!iw->touched)
+		return BFS_SKIP;
+
+	if (!iw->wait)
+		return BFS_CONTINUE;
+
+	*out = iw;
+	return BFS_DONE;
+}
+
+static struct dept_iwait *find_iw_bfs(struct dept_class *c, int irq)
+{
+	struct dept_iwait *iw = iwait(c, irq);
+	struct dept_iwait *found = NULL;
+	struct bfs_ops ops = {
+		.bfs_init = bfs_init_dep,
+		.extend = bfs_extend_dep_rev,
+		.dequeue = bfs_dequeue_dep,
+		.callback = cb_find_iw,
+	};
+
+	bfs((void *)c, &ops, (void *)&irq, (void **)&found);
+
+	if (found)
+		return found;
+
+	untouch_iwait(iw);
+	return NULL;
+}
+
+static enum bfs_ret cb_touch_iw_find_ie(void *node, void *in, void **out)
+{
+	struct dept_class *cur = (struct dept_class *)node;
+	int irq = *(int *)in;
+	struct dept_iecxt *ie = iecxt(cur, irq);
+	struct dept_iwait *iw = iwait(cur, irq);
+
+	if (DEPT_WARN_ON(!out))
+		return BFS_DONE;
+
+	touch_iwait(iw);
+
+	if (!ie->ecxt)
+		return BFS_CONTINUE;
+	if (!*out)
+		*out = ie;
+
+	/*
+	 * Do touch_iwait() all the way.
+	 */
+	return BFS_CONTINUE;
+}
+
+static struct dept_iecxt *touch_iw_find_ie_bfs(struct dept_class *c,
+					       int irq)
+{
+	struct dept_iecxt *found = NULL;
+	struct bfs_ops ops = {
+		.bfs_init = bfs_init_dep,
+		.extend = bfs_extend_dep,
+		.dequeue = bfs_dequeue_dep,
+		.callback = cb_touch_iw_find_ie,
+	};
+
+	bfs((void *)c, &ops, (void *)&irq, (void **)&found);
+	return found;
+}
+
+/*
+ * Should be called with dept_lock held.
+ */
+static void __add_idep(struct dept_iecxt *ie, struct dept_iwait *iw)
+{
+	struct dept_dep *new;
+
+	/*
+	 * There's nothing to do.
+	 */
+	if (!ie || !iw || !ie->ecxt || !iw->wait)
+		return;
+
+	new = __add_dep(ie->ecxt, iw->wait);
+
+	/*
+	 * Deadlock detected. Let check_dl_bfs() report it.
+	 */
+	if (new) {
+		check_dl_bfs(new);
+		stale_iecxt(ie);
+		stale_iwait(iw);
+	}
+
+	/*
+	 * If !new, it would be the case of lack of object resource.
+	 * Just let it go and get checked by other chances. Retrying is
+	 * meaningless in that case.
+	 */
+}
+
+static void set_check_iecxt(struct dept_class *c, int irq,
+			    struct dept_ecxt *e)
+{
+	struct dept_iecxt *ie = iecxt(c, irq);
+
+	set_iecxt(ie, e);
+	__add_idep(ie, find_iw_bfs(c, irq));
+}
+
+static void set_check_iwait(struct dept_class *c, int irq,
+			    struct dept_wait *w)
+{
+	struct dept_iwait *iw = iwait(c, irq);
+
+	set_iwait(iw, w);
+	__add_idep(touch_iw_find_ie_bfs(c, irq), iw);
+}
+
+static void add_iecxt(struct dept_class *c, int irq, struct dept_ecxt *e,
+		      bool stack)
+{
+	/*
+	 * This access is safe since we ensure e->class has set locally.
+	 */
+	struct dept_task *dt = dept_task();
+	struct dept_iecxt *ie = iecxt(c, irq);
+
+	if (DEPT_WARN_ON(!valid_class(c)))
+		return;
+
+	if (unlikely(READ_ONCE(ie->staled)))
+		return;
+
+	/*
+	 * Skip add_iecxt() if ie->ecxt has ever been set at least once.
+	 * Which means it has a valid ->ecxt or been staled.
+	 */
+	if (READ_ONCE(ie->ecxt))
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	if (unlikely(ie->staled))
+		goto unlock;
+	if (ie->ecxt)
+		goto unlock;
+
+	e->enirqf |= (1UL << irq);
+
+	/*
+	 * Should be NULL since it's the first time that these
+	 * enirq_{ip,stack}[irq] have ever set.
+	 */
+	DEPT_WARN_ON(e->enirq_ip[irq]);
+	DEPT_WARN_ON(e->enirq_stack[irq]);
+
+	e->enirq_ip[irq] = dt->enirq_ip[irq];
+	e->enirq_stack[irq] = stack ? get_current_stack() : NULL;
+
+	set_check_iecxt(c, irq, e);
+unlock:
+	dept_unlock();
+}
+
+static void add_iwait(struct dept_class *c, int irq, struct dept_wait *w)
+{
+	struct dept_iwait *iw = iwait(c, irq);
+
+	if (DEPT_WARN_ON(!valid_class(c)))
+		return;
+
+	if (unlikely(READ_ONCE(iw->staled)))
+		return;
+
+	/*
+	 * Skip add_iwait() if iw->wait has ever been set at least once.
+	 * Which means it has a valid ->wait or been staled.
+	 */
+	if (READ_ONCE(iw->wait))
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	if (unlikely(iw->staled))
+		goto unlock;
+	if (iw->wait)
+		goto unlock;
+
+	w->irqf |= (1UL << irq);
+
+	/*
+	 * Should be NULL since it's the first time that these
+	 * irq_{ip,stack}[irq] have ever set.
+	 */
+	DEPT_WARN_ON(w->irq_ip[irq]);
+	DEPT_WARN_ON(w->irq_stack[irq]);
+
+	w->irq_ip[irq] = w->wait_ip;
+	w->irq_stack[irq] = get_current_stack();
+
+	set_check_iwait(c, irq, w);
+unlock:
+	dept_unlock();
+}
+
+static struct dept_wait_hist *hist(int pos)
+{
+	struct dept_task *dt = dept_task();
+
+	return dt->wait_hist + (pos % DEPT_MAX_WAIT_HIST);
+}
+
+static int hist_pos_next(void)
+{
+	struct dept_task *dt = dept_task();
+
+	return dt->wait_hist_pos % DEPT_MAX_WAIT_HIST;
+}
+
+static void hist_advance(void)
+{
+	struct dept_task *dt = dept_task();
+
+	dt->wait_hist_pos++;
+	dt->wait_hist_pos %= DEPT_MAX_WAIT_HIST;
+}
+
+static struct dept_wait_hist *new_hist(void)
+{
+	struct dept_wait_hist *wh = hist(hist_pos_next());
+
+	hist_advance();
+	return wh;
+}
+
+static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int ctxt_id)
+{
+	struct dept_wait_hist *wh = new_hist();
+
+	if (likely(wh->wait))
+		put_wait(wh->wait);
+
+	wh->wait = get_wait(w);
+	wh->wgen = wg;
+	wh->ctxt_id = ctxt_id;
+}
+
+/*
+ * Should be called after setting up e's iecxt and w's iwait.
+ */
+static void add_dep(struct dept_ecxt *e, struct dept_wait *w)
+{
+	struct dept_class *fc = e->class;
+	struct dept_class *tc = w->class;
+	struct dept_dep *d;
+	int i;
+
+	if (lookup_dep(fc, tc))
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	/*
+	 * __add_dep() will lookup_dep() again with lock held.
+	 */
+	d = __add_dep(e, w);
+	if (d) {
+		check_dl_bfs(d);
+
+		for (i = 0; i < DEPT_IRQS_NR; i++) {
+			struct dept_iwait *fiw = iwait(fc, i);
+			struct dept_iecxt *found_ie;
+			struct dept_iwait *found_iw;
+
+			/*
+			 * '->touched == false' guarantees there's no
+			 * parent that has been set ->wait.
+			 */
+			if (!fiw->touched)
+				continue;
+
+			/*
+			 * find_iw_bfs() will untouch the iwait if
+			 * not found.
+			 */
+			found_iw = find_iw_bfs(fc, i);
+
+			if (!found_iw)
+				continue;
+
+			found_ie = touch_iw_find_ie_bfs(tc, i);
+			__add_idep(found_ie, found_iw);
+		}
+	}
+	dept_unlock();
+}
+
+static atomic_t wgen = ATOMIC_INIT(1);
+
+static int next_wgen(void)
+{
+	/*
+	 * Avoid zero wgen.
+	 */
+	return atomic_inc_return(&wgen) ?: atomic_inc_return(&wgen);
+}
+
+static void add_wait(struct dept_class *c, unsigned long ip,
+		     const char *w_fn, int sub_l, bool sched_sleep)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_wait *w;
+	unsigned int wg;
+	int irq;
+	int i;
+
+	if (DEPT_WARN_ON(!valid_class(c)))
+		return;
+
+	w = new_wait();
+	if (unlikely(!w))
+		return;
+
+	WRITE_ONCE(w->class, get_class(c));
+	w->wait_ip = ip;
+	w->wait_fn = w_fn;
+	w->wait_stack = get_current_stack();
+	w->sched_sleep = sched_sleep;
+
+	irq = cur_irq();
+	if (irq < DEPT_IRQS_NR)
+		add_iwait(c, irq, w);
+
+	/*
+	 * Avoid adding dependency between user aware nested ecxt and
+	 * wait.
+	 */
+	for (i = dt->ecxt_held_pos - 1; i >= 0; i--) {
+		struct dept_ecxt_held *eh;
+
+		eh = dt->ecxt_held + i;
+
+		/*
+		 * the case of invalid key'ed one
+		 */
+		if (!eh->ecxt)
+			continue;
+
+		if (eh->ecxt->class != c || eh->sub_l == sub_l)
+			add_dep(eh->ecxt, w);
+	}
+
+	wg = next_wgen();
+	add_hist(w, wg, cur_ctxt_id());
+
+	del_wait(w);
+}
+
+static struct dept_ecxt_held *add_ecxt(struct dept_map *m,
+		struct dept_class *c, unsigned long ip, const char *c_fn,
+		const char *e_fn, int sub_l)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_ecxt_held *eh;
+	struct dept_ecxt *e;
+	unsigned long irqf;
+	unsigned int wg;
+	int irq;
+
+	if (DEPT_WARN_ON(!valid_class(c)))
+		return NULL;
+
+	if (DEPT_WARN_ON_ONCE(dt->ecxt_held_pos >= DEPT_MAX_ECXT_HELD))
+		return NULL;
+
+	wg = next_wgen();
+	if (m->nocheck) {
+		eh = dt->ecxt_held + (dt->ecxt_held_pos++);
+		eh->ecxt = NULL;
+		eh->map = m;
+		eh->class = get_class(c);
+		eh->wgen = wg;
+		eh->sub_l = sub_l;
+
+		return eh;
+	}
+
+	e = new_ecxt();
+	if (unlikely(!e))
+		return NULL;
+
+	e->class = get_class(c);
+	e->ecxt_ip = ip;
+	e->ecxt_stack = ip ? get_current_stack() : NULL;
+	e->event_fn = e_fn;
+	e->ecxt_fn = c_fn;
+
+	eh = dt->ecxt_held + (dt->ecxt_held_pos++);
+	eh->ecxt = get_ecxt(e);
+	eh->map = m;
+	eh->class = get_class(c);
+	eh->wgen = wg;
+	eh->sub_l = sub_l;
+
+	irqf = cur_enirqf();
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR)
+		add_iecxt(c, irq, e, false);
+
+	del_ecxt(e);
+	return eh;
+}
+
+static int find_ecxt_pos(struct dept_map *m, struct dept_class *c,
+			 bool newfirst)
+{
+	struct dept_task *dt = dept_task();
+	int i;
+
+	if (newfirst) {
+		for (i = dt->ecxt_held_pos - 1; i >= 0; i--) {
+			struct dept_ecxt_held *eh;
+
+			eh = dt->ecxt_held + i;
+			if (eh->map == m && eh->class == c)
+				return i;
+		}
+	} else {
+		for (i = 0; i < dt->ecxt_held_pos; i++) {
+			struct dept_ecxt_held *eh;
+
+			eh = dt->ecxt_held + i;
+			if (eh->map == m && eh->class == c)
+				return i;
+		}
+	}
+	return -1;
+}
+
+static bool pop_ecxt(struct dept_map *m, struct dept_class *c)
+{
+	struct dept_task *dt = dept_task();
+	int pos;
+	int i;
+
+	pos = find_ecxt_pos(m, c, true);
+	if (pos == -1)
+		return false;
+
+	if (dt->ecxt_held[pos].class)
+		put_class(dt->ecxt_held[pos].class);
+
+	if (dt->ecxt_held[pos].ecxt)
+		put_ecxt(dt->ecxt_held[pos].ecxt);
+
+	dt->ecxt_held_pos--;
+
+	for (i = pos; i < dt->ecxt_held_pos; i++)
+		dt->ecxt_held[i] = dt->ecxt_held[i + 1];
+	return true;
+}
+
+static bool good_hist(struct dept_wait_hist *wh, unsigned int wg)
+{
+	return wh->wait != NULL && before(wg, wh->wgen);
+}
+
+/*
+ * Binary-search the ring buffer for the earliest valid wait.
+ */
+static int find_hist_pos(unsigned int wg)
+{
+	int oldest;
+	int l;
+	int r;
+	int pos;
+
+	oldest = hist_pos_next();
+	if (unlikely(good_hist(hist(oldest), wg))) {
+		DEPT_INFO_ONCE("Need to expand the ring buffer.\n");
+		return oldest;
+	}
+
+	l = oldest + 1;
+	r = oldest + DEPT_MAX_WAIT_HIST - 1;
+	for (pos = (l + r) / 2; l <= r; pos = (l + r) / 2) {
+		struct dept_wait_hist *p = hist(pos - 1);
+		struct dept_wait_hist *wh = hist(pos);
+
+		if (!good_hist(p, wg) && good_hist(wh, wg))
+			return pos % DEPT_MAX_WAIT_HIST;
+		if (good_hist(wh, wg))
+			r = pos - 1;
+		else
+			l = pos + 1;
+	}
+	return -1;
+}
+
+static void do_event(struct dept_map *m, struct dept_map *real_m,
+		struct dept_class *c, unsigned int wg, unsigned long ip,
+		const char *e_fn)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_wait_hist *wh;
+	struct dept_ecxt_held *eh;
+	unsigned int ctxt_id;
+	int end;
+	int pos;
+	int i;
+
+	if (DEPT_WARN_ON(!valid_class(c)))
+		return;
+
+	if (m->nocheck)
+		return;
+
+	/*
+	 * The event was triggered before wait.
+	 */
+	if (!wg)
+		return;
+
+	/*
+	 * If an ecxt for this map exists, let the ecxt work for this
+	 * event and do not proceed it in do_event().
+	 */
+	if (find_ecxt_pos(real_m, c, false) != -1)
+		return;
+	eh = add_ecxt(m, c, 0UL, NULL, e_fn, 0);
+
+	if (!eh)
+		return;
+
+	if (DEPT_WARN_ON(!eh->ecxt))
+		goto out;
+
+	eh->ecxt->event_ip = ip;
+	eh->ecxt->event_stack = get_current_stack();
+
+	pos = find_hist_pos(wg);
+	if (pos == -1)
+		goto out;
+
+	ctxt_id = cur_ctxt_id();
+	end = hist_pos_next();
+	end = end > pos ? end : end + DEPT_MAX_WAIT_HIST;
+	for (wh = hist(pos); pos < end; wh = hist(++pos)) {
+		if (dt->in_sched && wh->wait->sched_sleep)
+			continue;
+
+		if (wh->ctxt_id == ctxt_id)
+			add_dep(eh->ecxt, wh->wait);
+	}
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		struct dept_ecxt *e;
+
+		if (before(dt->wgen_enirq[i], wg))
+			continue;
+
+		e = eh->ecxt;
+		add_iecxt(e->class, i, e, false);
+	}
+out:
+	/*
+	 * Pop ecxt that temporarily has been added to handle this event.
+	 */
+	pop_ecxt(m, c);
+}
+
+static void del_dep_rcu(struct rcu_head *rh)
+{
+	struct dept_dep *d = container_of(rh, struct dept_dep, rh);
+
+	preempt_disable();
+	del_dep(d);
+	preempt_enable();
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void disconnect_class(struct dept_class *c)
+{
+	struct dept_dep *d, *n;
+	int i;
+
+	list_for_each_entry_safe(d, n, &c->dep_head, dep_node) {
+		list_del_rcu(&d->dep_node);
+		list_del_rcu(&d->dep_rev_node);
+		hash_del_dep(d);
+		call_rcu(&d->rh, del_dep_rcu);
+	}
+
+	list_for_each_entry_safe(d, n, &c->dep_rev_head, dep_rev_node) {
+		list_del_rcu(&d->dep_node);
+		list_del_rcu(&d->dep_rev_node);
+		hash_del_dep(d);
+		call_rcu(&d->rh, del_dep_rcu);
+	}
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		stale_iecxt(iecxt(c, i));
+		stale_iwait(iwait(c, i));
+	}
+}
+
+/*
+ * Context control
+ * =====================================================================
+ * Whether a wait is in {hard,soft}-IRQ context or whether
+ * {hard,soft}-IRQ has been enabled on the way to an event is very
+ * important to check dependency. All those things should be tracked.
+ */
+
+static unsigned long cur_enirqf(void)
+{
+	struct dept_task *dt = dept_task();
+	int he = dt->hardirqs_enabled;
+	int se = dt->softirqs_enabled;
+
+	if (he)
+		return DEPT_HIRQF | (se ? DEPT_SIRQF : 0UL);
+	return 0UL;
+}
+
+static int cur_irq(void)
+{
+	if (lockdep_softirq_context(current))
+		return DEPT_SIRQ;
+	if (lockdep_hardirq_context())
+		return DEPT_HIRQ;
+	return DEPT_IRQS_NR;
+}
+
+static unsigned int cur_ctxt_id(void)
+{
+	struct dept_task *dt = dept_task();
+	int irq = cur_irq();
+
+	/*
+	 * Normal process context
+	 */
+	if (irq == DEPT_IRQS_NR)
+		return 0U;
+
+	return dt->irq_id[irq] | (1UL << irq);
+}
+
+static void enirq_transition(int irq)
+{
+	struct dept_task *dt = dept_task();
+	int i;
+
+	/*
+	 * IRQ can cut in on the way to the event. Used for cross-event
+	 * detection.
+	 *
+	 *    wait context	event context(ecxt)
+	 *    ------------	-------------------
+	 *    wait event
+	 *       UPDATE wgen
+	 *			observe IRQ enabled
+	 *			   UPDATE wgen
+	 *			   keep the wgen locally
+	 *
+	 *			on the event
+	 *			   check the wgen kept
+	 */
+
+	dt->wgen_enirq[irq] = next_wgen();
+
+	for (i = dt->ecxt_held_pos - 1; i >= 0; i--) {
+		struct dept_ecxt_held *eh;
+		struct dept_ecxt *e;
+
+		eh = dt->ecxt_held + i;
+		e = eh->ecxt;
+		if (e)
+			add_iecxt(e->class, irq, e, true);
+	}
+}
+
+static void dept_enirq(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long irqf = cur_enirqf();
+	int irq;
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	/*
+	 * IRQ ON/OFF transition might happen while Dept is working.
+	 * We cannot handle recursive entrance. Just ignore it.
+	 * Only transitions outside of Dept will be considered.
+	 */
+	if (dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		dt->enirq_ip[irq] = ip;
+		enirq_transition(irq);
+	}
+
+	dept_exit(flags);
+}
+
+void dept_softirqs_on_ip(unsigned long ip)
+{
+	/*
+	 * Assumes that it's called with IRQ disabled so that accessing
+	 * current's fields is not racy.
+	 */
+	dept_task()->softirqs_enabled = true;
+	dept_enirq(ip);
+}
+
+void dept_hardirqs_on(void)
+{
+	/*
+	 * Assumes that it's called with IRQ disabled so that accessing
+	 * current's fields is not racy.
+	 */
+	dept_task()->hardirqs_enabled = true;
+	dept_enirq(_RET_IP_);
+}
+
+void dept_softirqs_off(void)
+{
+	/*
+	 * Assumes that it's called with IRQ disabled so that accessing
+	 * current's fields is not racy.
+	 */
+	dept_task()->softirqs_enabled = false;
+}
+
+void dept_hardirqs_off(void)
+{
+	/*
+	 * Assumes that it's called with IRQ disabled so that accessing
+	 * current's fields is not racy.
+	 */
+	dept_task()->hardirqs_enabled = false;
+}
+
+/*
+ * Ensure it's the outmost softirq context.
+ */
+void dept_softirq_enter(void)
+{
+	struct dept_task *dt = dept_task();
+
+	dt->irq_id[DEPT_SIRQ] += 1UL << DEPT_IRQS_NR;
+}
+
+/*
+ * Ensure it's the outmost hardirq context.
+ */
+void dept_hardirq_enter(void)
+{
+	struct dept_task *dt = dept_task();
+
+	dt->irq_id[DEPT_HIRQ] += 1UL << DEPT_IRQS_NR;
+}
+
+void dept_sched_enter(void)
+{
+	dept_task()->in_sched = true;
+}
+
+void dept_sched_exit(void)
+{
+	dept_task()->in_sched = false;
+}
+
+/*
+ * Exposed APIs
+ * =====================================================================
+ */
+
+static void clean_classes_cache(struct dept_key *k)
+{
+	int i;
+
+	for (i = 0; i < DEPT_MAX_SUBCLASSES_CACHE; i++) {
+		if (!READ_ONCE(k->classes[i]))
+			continue;
+
+		WRITE_ONCE(k->classes[i], NULL);
+	}
+}
+
+/*
+ * Assume we don't have to consider race with the map when
+ * dept_map_init() is called.
+ */
+void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u,
+		   const char *n)
+{
+	unsigned long flags;
+
+	if (unlikely(!dept_working())) {
+		m->nocheck = true;
+		return;
+	}
+
+	if (DEPT_WARN_ON(sub_u < 0)) {
+		m->nocheck = true;
+		return;
+	}
+
+	if (DEPT_WARN_ON(sub_u >= DEPT_MAX_SUBCLASSES_USR)) {
+		m->nocheck = true;
+		return;
+	}
+
+	/*
+	 * Allow recursive entrance.
+	 */
+	flags = dept_enter_recursive();
+
+	clean_classes_cache(&m->map_key);
+
+	m->keys = k;
+	m->sub_u = sub_u;
+	m->name = n;
+	m->wgen = 0U;
+	m->nocheck = !valid_key(k);
+
+	dept_exit_recursive(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_init);
+
+/*
+ * Assume we don't have to consider race with the map when
+ * dept_map_reinit() is called.
+ */
+void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u,
+		     const char *n)
+{
+	unsigned long flags;
+
+	if (unlikely(!dept_working())) {
+		m->nocheck = true;
+		return;
+	}
+
+	/*
+	 * Allow recursive entrance.
+	 */
+	flags = dept_enter_recursive();
+
+	if (k) {
+		clean_classes_cache(&m->map_key);
+		m->keys = k;
+		m->nocheck = !valid_key(k);
+	}
+
+	if (sub_u >= 0 && sub_u < DEPT_MAX_SUBCLASSES_USR)
+		m->sub_u = sub_u;
+
+	if (n)
+		m->name = n;
+
+	m->wgen = 0U;
+
+	dept_exit_recursive(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_reinit);
+
+void dept_map_copy(struct dept_map *to, struct dept_map *from)
+{
+	if (unlikely(!dept_working())) {
+		to->nocheck = true;
+		return;
+	}
+
+	*to = *from;
+
+	/*
+	 * XXX: 'to' might be in a stack or something. Using the address
+	 * in a stack segment as a key is meaningless. Just ignore the
+	 * case for now.
+	 */
+	if (!to->keys) {
+		to->nocheck = true;
+		return;
+	}
+
+	/*
+	 * Since the class cache can be modified concurrently we could
+	 * observe half pointers (64bit arch using 32bit copy
+	 * instructions).  Therefore clear the caches and take the
+	 * performance hit.
+	 */
+	clean_classes_cache(&to->map_key);
+}
+
+static LIST_HEAD(classes);
+
+static bool within(const void *addr, void *start, unsigned long size)
+{
+	return addr >= start && addr < start + size;
+}
+
+void dept_free_range(void *start, unsigned int sz)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_class *c, *n;
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive) {
+		DEPT_STOP("Failed to successfully free Dept objects.\n");
+		return;
+	}
+
+	flags = dept_enter();
+
+	/*
+	 * dept_free_range() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_free_range() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()))
+		cpu_relax();
+
+	list_for_each_entry_safe(c, n, &classes, all_node) {
+		if (!within((void *)c->key, start, sz) &&
+		    !within(c->name, start, sz))
+			continue;
+
+		hash_del_class(c);
+		disconnect_class(c);
+		list_del(&c->all_node);
+		invalidate_class(c);
+
+		/*
+		 * Actual deletion will happen on the rcu callback
+		 * that has been added in disconnect_class().
+		 */
+		del_class(c);
+	}
+	dept_unlock();
+	dept_exit(flags);
+
+	/*
+	 * Wait until even lockless hash_lookup_class() for the class
+	 * returns NULL.
+	 */
+	might_sleep();
+	synchronize_rcu();
+}
+
+static int sub_id(struct dept_map *m, int e)
+{
+	return (m ? m->sub_u : 0) + e * DEPT_MAX_SUBCLASSES_USR;
+}
+
+static struct dept_class *check_new_class(struct dept_key *local,
+					  struct dept_key *k, int sub_id,
+					  const char *n, bool sched_map)
+{
+	struct dept_class *c = NULL;
+
+	if (DEPT_WARN_ON(sub_id >= DEPT_MAX_SUBCLASSES))
+		return NULL;
+
+	if (DEPT_WARN_ON(!k))
+		return NULL;
+
+	/*
+	 * XXX: Assume that users prevent the map from using if any of
+	 * the cached keys has been invalidated. If not, the cache,
+	 * local->classes should not be used because it would be racy
+	 * with class deletion.
+	 */
+	if (local && sub_id < DEPT_MAX_SUBCLASSES_CACHE)
+		c = READ_ONCE(local->classes[sub_id]);
+
+	if (c)
+		return c;
+
+	c = lookup_class((unsigned long)k->base + sub_id);
+	if (c)
+		goto caching;
+
+	if (unlikely(!dept_lock()))
+		return NULL;
+
+	c = lookup_class((unsigned long)k->base + sub_id);
+	if (unlikely(c))
+		goto unlock;
+
+	c = new_class();
+	if (unlikely(!c))
+		goto unlock;
+
+	c->name = n;
+	c->sched_map = sched_map;
+	c->sub_id = sub_id;
+	c->key = (unsigned long)(k->base + sub_id);
+	hash_add_class(c);
+	list_add(&c->all_node, &classes);
+unlock:
+	dept_unlock();
+caching:
+	if (local && sub_id < DEPT_MAX_SUBCLASSES_CACHE)
+		WRITE_ONCE(local->classes[sub_id], c);
+
+	return c;
+}
+
+/*
+ * Called between dept_enter() and dept_exit().
+ */
+static void __dept_wait(struct dept_map *m, unsigned long w_f,
+			unsigned long ip, const char *w_fn, int sub_l,
+			bool sched_sleep, bool sched_map)
+{
+	int e;
+
+	/*
+	 * Be as conservative as possible. In case of multiple waits for
+	 * a single dept_map, we are going to keep only the last wait's
+	 * wgen for simplicity - keeping all wgens seems overengineering.
+	 *
+	 * Of course, it might cause missing some dependencies that
+	 * would rarely, probably never, happen but it helps avoid
+	 * false positive reports.
+	 */
+	for_each_set_bit(e, &w_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+		struct dept_key *k;
+
+		k = m->keys ?: &m->map_key;
+		c = check_new_class(&m->map_key, k,
+				    sub_id(m, e), m->name, sched_map);
+		if (!c)
+			continue;
+
+		add_wait(c, ip, w_fn, sub_l, sched_sleep);
+	}
+}
+
+/*
+ * Called between dept_enter() and dept_exit().
+ */
+static void __dept_event(struct dept_map *m, struct dept_map *real_m,
+		unsigned long e_f, unsigned long ip, const char *e_fn,
+		bool sched_map)
+{
+	struct dept_class *c;
+	struct dept_key *k;
+	int e;
+
+	e = find_first_bit(&e_f, DEPT_MAX_SUBCLASSES_EVT);
+
+	if (DEPT_WARN_ON(e >= DEPT_MAX_SUBCLASSES_EVT))
+		return;
+
+	/*
+	 * An event is an event. If the caller passed more than single
+	 * event, then warn it and handle the event corresponding to
+	 * the first bit anyway.
+	 */
+	DEPT_WARN_ON(1UL << e != e_f);
+
+	k = m->keys ?: &m->map_key;
+	c = check_new_class(&m->map_key, k, sub_id(m, e), m->name, sched_map);
+
+	if (c)
+		do_event(m, real_m, c, READ_ONCE(m->wgen), ip, e_fn);
+}
+
+void dept_wait(struct dept_map *m, unsigned long w_f,
+	       unsigned long ip, const char *w_fn, int sub_l)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive)
+		return;
+
+	if (m->nocheck)
+		return;
+
+	flags = dept_enter();
+
+	__dept_wait(m, w_f, ip, w_fn, sub_l, false, false);
+
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_wait);
+
+void dept_stage_wait(struct dept_map *m, struct dept_key *k,
+		     unsigned long ip, const char *w_fn)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (m && m->nocheck)
+		return;
+
+	/*
+	 * Either m or k should be passed. Which means Dept relies on
+	 * either its own map or the caller's position in the code when
+	 * determining its class.
+	 */
+	if (DEPT_WARN_ON(!m && !k))
+		return;
+
+	/*
+	 * Allow recursive entrance.
+	 */
+	flags = dept_enter_recursive();
+
+	/*
+	 * Ensure the outmost dept_stage_wait() works.
+	 */
+	if (dt->stage_m.keys)
+		goto exit;
+
+	arch_spin_lock(&dt->stage_lock);
+	if (m) {
+		dt->stage_m = *m;
+		dt->stage_real_m = m;
+
+		/*
+		 * Ensure dt->stage_m.keys != NULL and it works with the
+		 * map's map_key, not stage_m's one when ->keys == NULL.
+		 */
+		if (!m->keys)
+			dt->stage_m.keys = &m->map_key;
+	} else {
+		dt->stage_m.name = w_fn;
+		dt->stage_sched_map = true;
+		dt->stage_real_m = &dt->stage_m;
+	}
+
+	/*
+	 * dept_map_reinit() includes WRITE_ONCE(->wgen, 0U) that
+	 * effectively disables the map just in case real sleep won't
+	 * happen. dept_request_event_wait_commit() will enable it.
+	 */
+	dept_map_reinit(&dt->stage_m, k, -1, NULL);
+
+	dt->stage_w_fn = w_fn;
+	dt->stage_ip = ip;
+	arch_spin_unlock(&dt->stage_lock);
+exit:
+	dept_exit_recursive(flags);
+}
+EXPORT_SYMBOL_GPL(dept_stage_wait);
+
+static void __dept_clean_stage(struct dept_task *dt)
+{
+	memset(&dt->stage_m, 0x0, sizeof(struct dept_map));
+	dt->stage_real_m = NULL;
+	dt->stage_sched_map = false;
+	dt->stage_w_fn = NULL;
+	dt->stage_ip = 0UL;
+}
+
+void dept_clean_stage(void)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	/*
+	 * Allow recursive entrance.
+	 */
+	flags = dept_enter_recursive();
+	arch_spin_lock(&dt->stage_lock);
+	__dept_clean_stage(dt);
+	arch_spin_unlock(&dt->stage_lock);
+	dept_exit_recursive(flags);
+}
+EXPORT_SYMBOL_GPL(dept_clean_stage);
+
+/*
+ * Always called from __schedule().
+ */
+void dept_request_event_wait_commit(void)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	unsigned int wg;
+	unsigned long ip;
+	const char *w_fn;
+	bool sched_map;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	/*
+	 * It's impossible that __schedule() is called while Dept is
+	 * working that already disabled IRQ at the entrance.
+	 */
+	if (DEPT_WARN_ON(dt->recursive))
+		return;
+
+	flags = dept_enter();
+
+	arch_spin_lock(&dt->stage_lock);
+
+	/*
+	 * Checks if current has staged a wait.
+	 */
+	if (!dt->stage_m.keys) {
+		arch_spin_unlock(&dt->stage_lock);
+		goto exit;
+	}
+
+	w_fn = dt->stage_w_fn;
+	ip = dt->stage_ip;
+	sched_map = dt->stage_sched_map;
+
+	wg = next_wgen();
+	WRITE_ONCE(dt->stage_m.wgen, wg);
+	arch_spin_unlock(&dt->stage_lock);
+
+	__dept_wait(&dt->stage_m, 1UL, ip, w_fn, 0, true, sched_map);
+exit:
+	dept_exit(flags);
+}
+
+/*
+ * Always called from try_to_wake_up().
+ */
+void dept_ttwu_stage_wait(struct task_struct *requestor, unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_task *dt_req = &requestor->dept_task;
+	unsigned long flags;
+	struct dept_map m;
+	struct dept_map *real_m;
+	bool sched_map;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	arch_spin_lock(&dt_req->stage_lock);
+
+	/*
+	 * Serializing is unnecessary as long as it always comes from
+	 * try_to_wake_up().
+	 */
+	m = dt_req->stage_m;
+	sched_map = dt_req->stage_sched_map;
+	real_m = dt_req->stage_real_m;
+	__dept_clean_stage(dt_req);
+	arch_spin_unlock(&dt_req->stage_lock);
+
+	/*
+	 * ->stage_m.keys should not be NULL if it's in use. Should
+	 * make sure that it's not NULL when staging a valid map.
+	 */
+	if (!m.keys)
+		goto exit;
+
+	__dept_event(&m, real_m, 1UL, ip, "try_to_wake_up", sched_map);
+exit:
+	dept_exit(flags);
+}
+
+/*
+ * Modifies the latest ecxt corresponding to m and e_f.
+ */
+void dept_map_ecxt_modify(struct dept_map *m, unsigned long e_f,
+			  struct dept_key *new_k, unsigned long new_e_f,
+			  unsigned long new_ip, const char *new_c_fn,
+			  const char *new_e_fn, int new_sub_l)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_ecxt_held *eh;
+	struct dept_class *c;
+	struct dept_key *k;
+	unsigned long flags;
+	int pos = -1;
+	int new_e;
+	int e;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	/*
+	 * XXX: Couldn't handle re-enterance cases. Ignore it for now.
+	 */
+	if (dt->recursive)
+		return;
+
+	/*
+	 * Should go ahead no matter whether ->nocheck == true or not
+	 * because ->nocheck value can be changed within the ecxt area
+	 * delimitated by dept_ecxt_enter() and dept_ecxt_exit().
+	 */
+
+	flags = dept_enter();
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		k = m->keys ?: &m->map_key;
+		c = check_new_class(&m->map_key, k,
+				    sub_id(m, e), m->name, false);
+		if (!c)
+			continue;
+
+		/*
+		 * When it found an ecxt for any event in e_f, done.
+		 */
+		pos = find_ecxt_pos(m, c, true);
+		if (pos != -1)
+			break;
+	}
+
+	if (unlikely(pos == -1))
+		goto exit;
+
+	eh = dt->ecxt_held + pos;
+	new_sub_l = new_sub_l >= 0 ? new_sub_l : eh->sub_l;
+
+	new_e = find_first_bit(&new_e_f, DEPT_MAX_SUBCLASSES_EVT);
+
+	if (new_e < DEPT_MAX_SUBCLASSES_EVT)
+		/*
+		 * Let it work with the first bit anyway.
+		 */
+		DEPT_WARN_ON(1UL << new_e != new_e_f);
+	else
+		new_e = e;
+
+	pop_ecxt(m, c);
+
+	/*
+	 * Apply the key to the map.
+	 */
+	if (new_k)
+		dept_map_reinit(m, new_k, -1, NULL);
+
+	k = m->keys ?: &m->map_key;
+	c = check_new_class(&m->map_key, k, sub_id(m, new_e), m->name, false);
+
+	if (c && add_ecxt(m, c, new_ip, new_c_fn, new_e_fn, new_sub_l))
+		goto exit;
+
+	/*
+	 * Successfully pop_ecxt()ed but failed to add_ecxt().
+	 */
+	dt->missing_ecxt++;
+exit:
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_ecxt_modify);
+
+void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip,
+		     const char *c_fn, const char *e_fn, int sub_l)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	struct dept_class *c;
+	struct dept_key *k;
+	int e;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive) {
+		dt->missing_ecxt++;
+		return;
+	}
+
+	/*
+	 * Should go ahead no matter whether ->nocheck == true or not
+	 * because ->nocheck value can be changed within the ecxt area
+	 * delimitated by dept_ecxt_enter() and dept_ecxt_exit().
+	 */
+
+	flags = dept_enter();
+
+	e = find_first_bit(&e_f, DEPT_MAX_SUBCLASSES_EVT);
+
+	if (e >= DEPT_MAX_SUBCLASSES_EVT)
+		goto missing_ecxt;
+
+	/*
+	 * An event is an event. If the caller passed more than single
+	 * event, then warn it and handle the event corresponding to
+	 * the first bit anyway.
+	 */
+	DEPT_WARN_ON(1UL << e != e_f);
+
+	k = m->keys ?: &m->map_key;
+	c = check_new_class(&m->map_key, k, sub_id(m, e), m->name, false);
+
+	if (c && add_ecxt(m, c, ip, c_fn, e_fn, sub_l))
+		goto exit;
+missing_ecxt:
+	dt->missing_ecxt++;
+exit:
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_ecxt_enter);
+
+bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	bool ret = false;
+	int e;
+
+	if (unlikely(!dept_working()))
+		return false;
+
+	if (dt->recursive)
+		return false;
+
+	flags = dept_enter();
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+		struct dept_key *k;
+
+		k = m->keys ?: &m->map_key;
+		c = check_new_class(&m->map_key, k,
+				    sub_id(m, e), m->name, false);
+		if (!c)
+			continue;
+
+		if (find_ecxt_pos(m, c, true) != -1) {
+			ret = true;
+			break;
+		}
+	}
+
+	dept_exit(flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dept_ecxt_holding);
+
+void dept_request_event(struct dept_map *m)
+{
+	unsigned long flags;
+	unsigned int wg;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (m->nocheck)
+		return;
+
+	/*
+	 * Allow recursive entrance.
+	 */
+	flags = dept_enter_recursive();
+
+	wg = next_wgen();
+	WRITE_ONCE(m->wgen, wg);
+
+	dept_exit_recursive(flags);
+}
+EXPORT_SYMBOL_GPL(dept_request_event);
+
+void dept_event(struct dept_map *m, unsigned long e_f,
+		unsigned long ip, const char *e_fn)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (m->nocheck)
+		return;
+
+	if (dt->recursive) {
+		/*
+		 * Dept won't work with this even though an event
+		 * context has been asked. Don't make it confused at
+		 * handling the event. Disable it until the next.
+		 */
+		WRITE_ONCE(m->wgen, 0U);
+		return;
+	}
+
+	flags = dept_enter();
+
+	__dept_event(m, m, e_f, ip, e_fn, false);
+
+	/*
+	 * Keep the map diabled until the next sleep.
+	 */
+	WRITE_ONCE(m->wgen, 0U);
+
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_event);
+
+void dept_ecxt_exit(struct dept_map *m, unsigned long e_f,
+		    unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int e;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive) {
+		dt->missing_ecxt--;
+		return;
+	}
+
+	/*
+	 * Should go ahead no matter whether ->nocheck == true or not
+	 * because ->nocheck value can be changed within the ecxt area
+	 * delimitated by dept_ecxt_enter() and dept_ecxt_exit().
+	 */
+
+	flags = dept_enter();
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+		struct dept_key *k;
+
+		k = m->keys ?: &m->map_key;
+		c = check_new_class(&m->map_key, k,
+				    sub_id(m, e), m->name, false);
+		if (!c)
+			continue;
+
+		/*
+		 * When it found an ecxt for any event in e_f, done.
+		 */
+		if (pop_ecxt(m, c))
+			goto exit;
+	}
+
+	dt->missing_ecxt--;
+exit:
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_ecxt_exit);
+
+void dept_task_exit(struct task_struct *t)
+{
+	struct dept_task *dt = &t->dept_task;
+	int i;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	raw_local_irq_disable();
+
+	if (dt->stack) {
+		put_stack(dt->stack);
+		dt->stack = NULL;
+	}
+
+	for (i = 0; i < dt->ecxt_held_pos; i++) {
+		if (dt->ecxt_held[i].class) {
+			put_class(dt->ecxt_held[i].class);
+			dt->ecxt_held[i].class = NULL;
+		}
+		if (dt->ecxt_held[i].ecxt) {
+			put_ecxt(dt->ecxt_held[i].ecxt);
+			dt->ecxt_held[i].ecxt = NULL;
+		}
+	}
+
+	for (i = 0; i < DEPT_MAX_WAIT_HIST; i++) {
+		if (dt->wait_hist[i].wait) {
+			put_wait(dt->wait_hist[i].wait);
+			dt->wait_hist[i].wait = NULL;
+		}
+	}
+
+	dt->task_exit = true;
+	dept_off();
+
+	raw_local_irq_enable();
+}
+
+void dept_task_init(struct task_struct *t)
+{
+	memset(&t->dept_task, 0x0, sizeof(struct dept_task));
+	t->dept_task.stage_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+}
+
+void dept_key_init(struct dept_key *k)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int sub_id;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive) {
+		DEPT_STOP("Key initialization fails.\n");
+		return;
+	}
+
+	flags = dept_enter();
+
+	clean_classes_cache(k);
+
+	/*
+	 * dept_key_init() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_key_init() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()))
+		cpu_relax();
+
+	for (sub_id = 0; sub_id < DEPT_MAX_SUBCLASSES; sub_id++) {
+		struct dept_class *c;
+
+		c = lookup_class((unsigned long)k->base + sub_id);
+		if (!c)
+			continue;
+
+		DEPT_STOP("The class(%s/%d) has not been removed.\n",
+			  c->name, sub_id);
+		break;
+	}
+
+	dept_unlock();
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_key_init);
+
+void dept_key_destroy(struct dept_key *k)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int sub_id;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive == 1 && dt->task_exit) {
+		/*
+		 * Need to allow to go ahead in this case where
+		 * ->recursive has been set to 1 by dept_off() in
+		 * dept_task_exit() and ->task_exit has been set to
+		 * true in dept_task_exit().
+		 */
+	} else if (dt->recursive) {
+		DEPT_STOP("Key destroying fails.\n");
+		return;
+	}
+
+	flags = dept_enter();
+
+	/*
+	 * dept_key_destroy() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_key_destroy() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()))
+		cpu_relax();
+
+	for (sub_id = 0; sub_id < DEPT_MAX_SUBCLASSES; sub_id++) {
+		struct dept_class *c;
+
+		c = lookup_class((unsigned long)k->base + sub_id);
+		if (!c)
+			continue;
+
+		hash_del_class(c);
+		disconnect_class(c);
+		list_del(&c->all_node);
+		invalidate_class(c);
+
+		/*
+		 * Actual deletion will happen on the rcu callback
+		 * that has been added in disconnect_class().
+		 */
+		del_class(c);
+	}
+
+	dept_unlock();
+	dept_exit(flags);
+
+	/*
+	 * Wait until even lockless hash_lookup_class() for the class
+	 * returns NULL.
+	 */
+	might_sleep();
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(dept_key_destroy);
+
+static void move_llist(struct llist_head *to, struct llist_head *from)
+{
+	struct llist_node *first = llist_del_all(from);
+	struct llist_node *last = first;
+
+	if (!first)
+		return;
+
+	while (llist_next(last))
+		last = llist_next(last);
+	llist_add_batch(first, last, to);
+}
+
+static void migrate_per_cpu_pool(void)
+{
+	const int boot_cpu = 0;
+	int i;
+
+	/*
+	 * The boot CPU has been using the temporal local pool so far.
+	 * From now on that per_cpu areas have been ready, use the
+	 * per_cpu local pool instead.
+	 */
+	DEPT_WARN_ON(smp_processor_id() != boot_cpu);
+	for (i = 0; i < OBJECT_NR; i++) {
+		struct llist_head *from;
+		struct llist_head *to;
+
+		from = &pool[i].boot_pool;
+		to = per_cpu_ptr(pool[i].lpool, boot_cpu);
+		move_llist(to, from);
+	}
+}
+
+#define B2KB(B) ((B) / 1024)
+
+/*
+ * Should be called after setup_per_cpu_areas() and before no non-boot
+ * CPUs have been on.
+ */
+void __init dept_init(void)
+{
+	size_t mem_total = 0;
+
+	local_irq_disable();
+	dept_per_cpu_ready = 1;
+	migrate_per_cpu_pool();
+	local_irq_enable();
+
+#define HASH(id, bits) BUILD_BUG_ON(1 << (bits) <= 0);
+	#include "dept_hash.h"
+#undef HASH
+#define OBJECT(id, nr) mem_total += sizeof(struct dept_##id) * nr;
+	#include "dept_object.h"
+#undef OBJECT
+#define HASH(id, bits) mem_total += sizeof(struct hlist_head) * (1 << (bits));
+	#include "dept_hash.h"
+#undef HASH
+
+	pr_info("DEPendency Tracker: Copyright (c) 2020 LG Electronics, Inc., Byungchul Park\n");
+	pr_info("... DEPT_MAX_STACK_ENTRY: %d\n", DEPT_MAX_STACK_ENTRY);
+	pr_info("... DEPT_MAX_WAIT_HIST  : %d\n", DEPT_MAX_WAIT_HIST);
+	pr_info("... DEPT_MAX_ECXT_HELD  : %d\n", DEPT_MAX_ECXT_HELD);
+	pr_info("... DEPT_MAX_SUBCLASSES : %d\n", DEPT_MAX_SUBCLASSES);
+#define OBJECT(id, nr)							\
+	pr_info("... memory used by %s: %zu KB\n",			\
+	       #id, B2KB(sizeof(struct dept_##id) * nr));
+	#include "dept_object.h"
+#undef OBJECT
+#define HASH(id, bits)							\
+	pr_info("... hash list head used by %s: %zu KB\n",		\
+	       #id, B2KB(sizeof(struct hlist_head) * (1 << (bits))));
+	#include "dept_hash.h"
+#undef HASH
+	pr_info("... total memory used by objects and hashs: %zu KB\n", B2KB(mem_total));
+	pr_info("... per task memory footprint: %zu bytes\n", sizeof(struct dept_task));
+}
diff --git a/kernel/dependency/dept_hash.h b/kernel/dependency/dept_hash.h
new file mode 100644
index 000000000000..fd85aab1fdfb
--- /dev/null
+++ b/kernel/dependency/dept_hash.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * HASH(id, bits)
+ *
+ * id  : Id for the object of struct dept_##id.
+ * bits: 1UL << bits is the hash table size.
+ */
+
+HASH(dep, 12)
+HASH(class, 12)
diff --git a/kernel/dependency/dept_object.h b/kernel/dependency/dept_object.h
new file mode 100644
index 000000000000..0b7eb16fe9fb
--- /dev/null
+++ b/kernel/dependency/dept_object.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * OBJECT(id, nr)
+ *
+ * id: Id for the object of struct dept_##id.
+ * nr: # of the object that should be kept in the pool.
+ */
+
+OBJECT(dep, 1024 * 8)
+OBJECT(class, 1024 * 8)
+OBJECT(stack, 1024 * 32)
+OBJECT(ecxt, 1024 * 16)
+OBJECT(wait, 1024 * 32)
diff --git a/kernel/exit.c b/kernel/exit.c
index 343eb97543d5..88c0fbec9967 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1001,6 +1001,7 @@ void __noreturn do_exit(long code)
 	exit_tasks_rcu_finish();
 
 	lockdep_free_task(tsk);
+	dept_task_exit(tsk);
 	do_task_dead();
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 6ca8689a83b5..c6fe9a23ac0a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -106,6 +106,7 @@
 #include <linux/pidfs.h>
 #include <linux/tick.h>
 #include <linux/unwind_deferred.h>
+#include <linux/dept.h>
 
 #include <asm/pgalloc.h>
 #include <linux/uaccess.h>
@@ -2127,6 +2128,7 @@ __latent_entropy struct task_struct *copy_process(
 #ifdef CONFIG_LOCKDEP
 	lockdep_init_task(p);
 #endif
+	dept_task_init(p);
 
 	p->blocked_on = NULL; /* not blocked yet */
 
diff --git a/kernel/module/main.c b/kernel/module/main.c
index c66b26184936..6ad78f0a58b6 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1375,12 +1375,14 @@ static void free_mod_mem(struct module *mod)
 
 		/* Free lock-classes; relies on the preceding sync_rcu(). */
 		lockdep_free_key_range(mod_mem->base, mod_mem->size);
+		dept_free_range(mod_mem->base, mod_mem->size);
 		if (mod_mem->size)
 			module_memory_free(mod, type);
 	}
 
 	/* MOD_DATA hosts mod, so free it at last */
 	lockdep_free_key_range(mod->mem[MOD_DATA].base, mod->mem[MOD_DATA].size);
+	dept_free_range(mod->mem[MOD_DATA].base, mod->mem[MOD_DATA].size);
 	module_memory_free(mod, MOD_DATA);
 }
 
@@ -3548,6 +3550,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	for_class_mod_mem_type(type, core_data) {
 		lockdep_free_key_range(mod->mem[type].base,
 				       mod->mem[type].size);
+		dept_free_range(mod->mem[type].base,
+				mod->mem[type].size);
 	}
 
 	module_memory_restore_rox(mod);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ccba6fc3c3fe..db942591fb1a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -67,6 +67,7 @@
 #include <linux/wait_api.h>
 #include <linux/workqueue_api.h>
 #include <linux/livepatch_sched.h>
+#include <linux/dept.h>
 
 #ifdef CONFIG_PREEMPT_DYNAMIC
 # ifdef CONFIG_GENERIC_IRQ_ENTRY
@@ -4246,6 +4247,8 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 		if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
 			break;
 
+		dept_ttwu_stage_wait(p, _RET_IP_);
+
 		/*
 		 * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
 		 * possible to, falsely, observe p->on_cpu == 0.
@@ -6835,6 +6838,11 @@ static void __sched notrace __schedule(int sched_mode)
 	rq = cpu_rq(cpu);
 	prev = rq->curr;
 
+	prev_state = READ_ONCE(prev->__state);
+	if (sched_mode != SM_PREEMPT && prev_state & TASK_NORMAL)
+		dept_request_event_wait_commit();
+
+	dept_sched_enter();
 	schedule_debug(prev, preempt);
 
 	if (sched_feat(HRTICK) || sched_feat(HRTICK_DL))
@@ -6969,6 +6977,7 @@ static void __sched notrace __schedule(int sched_mode)
 		raw_spin_rq_unlock_irq(rq);
 	}
 	trace_sched_exit_tp(is_switch);
+	dept_sched_exit();
 }
 
 void __noreturn do_task_dead(void)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dc0e0c6ed075..b9cff0bec6f2 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1365,6 +1365,32 @@ config DEBUG_PREEMPT
 
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
+config DEPT
+	bool "Dependency tracking (EXPERIMENTAL)"
+	depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
+	select DEBUG_SPINLOCK
+	select DEBUG_MUTEXES if !PREEMPT_RT
+	select DEBUG_RT_MUTEXES if RT_MUTEXES
+	select DEBUG_RWSEMS if !PREEMPT_RT
+	select DEBUG_WW_MUTEX_SLOWPATH
+	select DEBUG_LOCK_ALLOC
+	select TRACE_IRQFLAGS
+	select STACKTRACE
+	select KALLSYMS
+	select KALLSYMS_ALL
+	select PROVE_LOCKING
+	default n
+	help
+	  Check dependencies between wait and event and report it if
+	  deadlock possibility has been detected. Multiple reports are
+	  allowed if there are more than a single problem.
+
+	  This feature is considered EXPERIMENTAL that might produce
+	  false positive reports because new dependencies start to be
+	  tracked, that have never been tracked before. It's worth
+	  noting, to mitigate the impact by the false positives, multi
+	  reporting has been supported.
+
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index ed99344317f5..18228afccea5 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1398,6 +1398,8 @@ static void reset_locks(void)
 	local_irq_disable();
 	lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
 	lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
+	dept_free_range(&ww_lockdep.acquire_key, 1);
+	dept_free_range(&ww_lockdep.mutex_key, 1);
 
 	I1(A); I1(B); I1(C); I1(D);
 	I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 03/47] dept: add single event dependency tracker APIs
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker) Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 04/47] dept: add lock " Byungchul Park
                   ` (43 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Wrapped the base APIs for easier annotation on wait and event.  Start
with supporting waiters on each single event.  More general support for
multiple events is a future work.  Do more when the need arises.

How to annotate:

1. Initaialize a map for the interesting wait.

   /*
    * Place along with the wait instance.
    */
   struct dept_map my_wait;

   /*
    * Place in the initialization code.
    */
   sdt_map_init(&my_wait);

2. Place the following at the wait code.

   sdt_wait(&my_wait);

3. Place the following at the event code.

   sdt_event(&my_wait);

That's it!

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept_sdt.h | 65 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)
 create mode 100644 include/linux/dept_sdt.h

diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
new file mode 100644
index 000000000000..0535f763b21b
--- /dev/null
+++ b/include/linux/dept_sdt.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Single-event Dependency Tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_SDT_H
+#define __LINUX_DEPT_SDT_H
+
+#include <linux/kernel.h>
+#include <linux/dept.h>
+
+#ifdef CONFIG_DEPT
+#define sdt_map_init(m)							\
+	do {								\
+		static struct dept_key __key;				\
+		dept_map_init(m, &__key, 0, #m);			\
+	} while (0)
+
+#define sdt_map_init_key(m, k)		dept_map_init(m, k, 0, #m)
+
+#define sdt_wait(m)							\
+	do {								\
+		dept_request_event(m);					\
+		dept_wait(m, 1UL, _THIS_IP_, __func__, 0);		\
+	} while (0)
+
+/*
+ * sdt_might_sleep() and its family will be committed in __schedule()
+ * when it actually gets to __schedule(). Both dept_request_event() and
+ * dept_wait() will be performed on the commit.
+ */
+
+/*
+ * Use the code location as the class key if an explicit map is not used.
+ */
+#define sdt_might_sleep_start(m)					\
+	do {								\
+		struct dept_map *__m = m;				\
+		static struct dept_key __key;				\
+		dept_stage_wait(__m, __m ? NULL : &__key, _THIS_IP_, __func__);\
+	} while (0)
+
+#define sdt_might_sleep_end()		dept_clean_stage()
+
+#define sdt_ecxt_enter(m)		dept_ecxt_enter(m, 1UL, _THIS_IP_, "start", "event", 0)
+#define sdt_event(m)			dept_event(m, 1UL, _THIS_IP_, __func__)
+#define sdt_ecxt_exit(m)		dept_ecxt_exit(m, 1UL, _THIS_IP_)
+#define sdt_request_event(m)		dept_request_event(m)
+#else /* !CONFIG_DEPT */
+#define sdt_map_init(m)			do { } while (0)
+#define sdt_map_init_key(m, k)		do { (void)(k); } while (0)
+#define sdt_wait(m)			do { } while (0)
+#define sdt_might_sleep_start(m)	do { } while (0)
+#define sdt_might_sleep_end()		do { } while (0)
+#define sdt_ecxt_enter(m)		do { } while (0)
+#define sdt_event(m)			do { } while (0)
+#define sdt_ecxt_exit(m)		do { } while (0)
+#define sdt_request_event(m)		do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_SDT_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 04/47] dept: add lock dependency tracker APIs
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (2 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 03/47] dept: add single event dependency tracker APIs Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 05/47] dept: tie to lockdep and IRQ tracing Byungchul Park
                   ` (42 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Wrap the base APIs for easier annotation on typical lock.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept_ldt.h | 78 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
 create mode 100644 include/linux/dept_ldt.h

diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h
new file mode 100644
index 000000000000..8047d0a531f1
--- /dev/null
+++ b/include/linux/dept_ldt.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lock Dependency Tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_LDT_H
+#define __LINUX_DEPT_LDT_H
+
+#include <linux/dept.h>
+
+#ifdef CONFIG_DEPT
+#define LDT_EVT_L			1UL
+#define LDT_EVT_R			2UL
+#define LDT_EVT_W			1UL
+#define LDT_EVT_RW			(LDT_EVT_R | LDT_EVT_W)
+#define LDT_EVT_ALL			(LDT_EVT_L | LDT_EVT_RW)
+
+#define ldt_init(m, k, su, n)		dept_map_init(m, k, su, n)
+#define ldt_lock(m, sl, t, n, i)					\
+	do {								\
+		if (n)							\
+			dept_ecxt_enter_nokeep(m);			\
+		else if (t)						\
+			dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", sl);\
+		else {							\
+			dept_wait(m, LDT_EVT_L, i, "lock", sl);		\
+			dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\
+		}							\
+	} while (0)
+
+#define ldt_rlock(m, sl, t, n, i, q)					\
+	do {								\
+		if (n)							\
+			dept_ecxt_enter_nokeep(m);			\
+		else if (t)						\
+			dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", "read_unlock", sl);\
+		else {							\
+			dept_wait(m, q ? LDT_EVT_RW : LDT_EVT_W, i, "read_lock", sl);\
+			dept_ecxt_enter(m, LDT_EVT_R, i, "read_lock", "read_unlock", sl);\
+		}							\
+	} while (0)
+
+#define ldt_wlock(m, sl, t, n, i)					\
+	do {								\
+		if (n)							\
+			dept_ecxt_enter_nokeep(m);			\
+		else if (t)						\
+			dept_ecxt_enter(m, LDT_EVT_W, i, "write_trylock", "write_unlock", sl);\
+		else {							\
+			dept_wait(m, LDT_EVT_RW, i, "write_lock", sl);	\
+			dept_ecxt_enter(m, LDT_EVT_W, i, "write_lock", "write_unlock", sl);\
+		}							\
+	} while (0)
+
+#define ldt_unlock(m, i)		dept_ecxt_exit(m, LDT_EVT_ALL, i)
+
+#define ldt_downgrade(m, i)						\
+	do {								\
+		if (dept_ecxt_holding(m, LDT_EVT_W))			\
+			dept_map_ecxt_modify(m, LDT_EVT_W, NULL, LDT_EVT_R, i, "downgrade", "read_unlock", -1);\
+	} while (0)
+
+#define ldt_set_class(m, n, k, sl, i)	dept_map_ecxt_modify(m, LDT_EVT_ALL, k, 0UL, i, "lock_set_class", "(any)unlock", sl)
+#else /* !CONFIG_DEPT */
+#define ldt_init(m, k, su, n)		do { (void)(k); } while (0)
+#define ldt_lock(m, sl, t, n, i)	do { } while (0)
+#define ldt_rlock(m, sl, t, n, i, q)	do { } while (0)
+#define ldt_wlock(m, sl, t, n, i)	do { } while (0)
+#define ldt_unlock(m, i)		do { } while (0)
+#define ldt_downgrade(m, i)		do { } while (0)
+#define ldt_set_class(m, n, k, sl, i)	do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_LDT_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 05/47] dept: tie to lockdep and IRQ tracing
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (3 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 04/47] dept: add lock " Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 06/47] dept: add proc knobs to show stats and dependency graph Byungchul Park
                   ` (41 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

How to place dept this way looks so ugly.  However, it's inevitable for
now.  The way should be enhanced gradually.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/irqflags.h            |   7 +-
 include/linux/local_lock_internal.h |   1 +
 include/linux/lockdep.h             | 102 ++++++++++++++++++++++------
 include/linux/lockdep_types.h       |   3 +
 include/linux/mutex.h               |   1 +
 include/linux/percpu-rwsem.h        |   2 +-
 include/linux/rtmutex.h             |   1 +
 include/linux/rwlock_types.h        |   1 +
 include/linux/rwsem.h               |   1 +
 include/linux/seqlock.h             |   2 +-
 include/linux/spinlock_types_raw.h  |   3 +
 include/linux/srcu.h                |   2 +-
 kernel/dependency/dept.c            |   8 +--
 kernel/locking/lockdep.c            |  22 ++++++
 14 files changed, 127 insertions(+), 29 deletions(-)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 57b074e0cfbb..d8b9cf093f83 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -15,6 +15,7 @@
 #include <linux/irqflags_types.h>
 #include <linux/typecheck.h>
 #include <linux/cleanup.h>
+#include <linux/dept.h>
 #include <asm/irqflags.h>
 #include <asm/percpu.h>
 
@@ -55,8 +56,10 @@ extern void trace_hardirqs_off(void);
 # define lockdep_softirqs_enabled(p)	((p)->softirqs_enabled)
 # define lockdep_hardirq_enter()			\
 do {							\
-	if (__this_cpu_inc_return(hardirq_context) == 1)\
+	if (__this_cpu_inc_return(hardirq_context) == 1) { \
 		current->hardirq_threaded = 0;		\
+		dept_hardirq_enter();			\
+	}						\
 } while (0)
 # define lockdep_hardirq_threaded()		\
 do {						\
@@ -131,6 +134,8 @@ do {						\
 # define lockdep_softirq_enter()		\
 do {						\
 	current->softirq_context++;		\
+	if (current->softirq_context == 1)	\
+		dept_softirq_enter();		\
 } while (0)
 # define lockdep_softirq_exit()			\
 do {						\
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index d80b5306a2c0..3b74da2fec50 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -27,6 +27,7 @@ typedef struct {
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_CONFIG,	\
 		.lock_type = LD_LOCK_PERCPU,		\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 	},						\
 	.owner = NULL,
 
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 67964dc4db95..ef03d8808c10 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -12,6 +12,7 @@
 
 #include <linux/lockdep_types.h>
 #include <linux/smp.h>
+#include <linux/dept_ldt.h>
 #include <asm/percpu.h>
 
 struct task_struct;
@@ -39,6 +40,8 @@ static inline void lockdep_copy_map(struct lockdep_map *to,
 	 */
 	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
 		to->class_cache[i] = NULL;
+
+	dept_map_copy(&to->dmap, &from->dmap);
 }
 
 /*
@@ -428,7 +431,8 @@ enum xhlock_context_t {
  * Note that _name must not be NULL.
  */
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
-	{ .name = (_name), .key = (void *)(_key), }
+	{ .name = (_name), .key = (void *)(_key), \
+	  .dmap = DEPT_MAP_INITIALIZER(_name, _key) }
 
 static inline void lockdep_invariant_state(bool force) {}
 static inline void lockdep_free_task(struct task_struct *task) {}
@@ -510,33 +514,89 @@ extern bool read_lock_is_recursive(void);
 #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
 #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
 
-#define spin_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
-#define spin_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
-#define spin_release(l, i)			lock_release(l, i)
-
-#define rwlock_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
+#define spin_acquire(l, s, t, i)					\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_exclusive(l, s, t, NULL, i);			\
+} while (0)
+#define spin_acquire_nest(l, s, t, n, i)				\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, n, i);				\
+	lock_acquire_exclusive(l, s, t, n, i);				\
+} while (0)
+#define spin_release(l, i)						\
+do {									\
+	ldt_unlock(&(l)->dmap, i);					\
+	lock_release(l, i);						\
+} while (0)
+#define rwlock_acquire(l, s, t, i)					\
+do {									\
+	ldt_wlock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_exclusive(l, s, t, NULL, i);			\
+} while (0)
 #define rwlock_acquire_read(l, s, t, i)					\
 do {									\
+	ldt_rlock(&(l)->dmap, s, t, NULL, i, !read_lock_is_recursive());\
 	if (read_lock_is_recursive())					\
 		lock_acquire_shared_recursive(l, s, t, NULL, i);	\
 	else								\
 		lock_acquire_shared(l, s, t, NULL, i);			\
 } while (0)
-
-#define rwlock_release(l, i)			lock_release(l, i)
-
-#define seqcount_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
-#define seqcount_acquire_read(l, s, t, i)	lock_acquire_shared_recursive(l, s, t, NULL, i)
-#define seqcount_release(l, i)			lock_release(l, i)
-
-#define mutex_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
-#define mutex_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
-#define mutex_release(l, i)			lock_release(l, i)
-
-#define rwsem_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
-#define rwsem_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
-#define rwsem_acquire_read(l, s, t, i)		lock_acquire_shared(l, s, t, NULL, i)
-#define rwsem_release(l, i)			lock_release(l, i)
+#define rwlock_release(l, i)						\
+do {									\
+	ldt_unlock(&(l)->dmap, i);					\
+	lock_release(l, i);						\
+} while (0)
+#define seqcount_acquire(l, s, t, i)					\
+do {									\
+	ldt_wlock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_exclusive(l, s, t, NULL, i);			\
+} while (0)
+#define seqcount_acquire_read(l, s, t, i)				\
+do {									\
+	ldt_rlock(&(l)->dmap, s, t, NULL, i, false);			\
+	lock_acquire_shared_recursive(l, s, t, NULL, i);		\
+} while (0)
+#define seqcount_release(l, i)						\
+do {									\
+	ldt_unlock(&(l)->dmap, i);					\
+	lock_release(l, i);						\
+} while (0)
+#define mutex_acquire(l, s, t, i)					\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_exclusive(l, s, t, NULL, i);			\
+} while (0)
+#define mutex_acquire_nest(l, s, t, n, i)				\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, n, i);				\
+	lock_acquire_exclusive(l, s, t, n, i);				\
+} while (0)
+#define mutex_release(l, i)						\
+do {									\
+	ldt_unlock(&(l)->dmap, i);					\
+	lock_release(l, i);						\
+} while (0)
+#define rwsem_acquire(l, s, t, i)					\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_exclusive(l, s, t, NULL, i);			\
+} while (0)
+#define rwsem_acquire_nest(l, s, t, n, i)				\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, n, i);				\
+	lock_acquire_exclusive(l, s, t, n, i);				\
+} while (0)
+#define rwsem_acquire_read(l, s, t, i)					\
+do {									\
+	ldt_lock(&(l)->dmap, s, t, NULL, i);				\
+	lock_acquire_shared(l, s, t, NULL, i);				\
+} while (0)
+#define rwsem_release(l, i)						\
+do {									\
+	ldt_unlock(&(l)->dmap, i);					\
+	lock_release(l, i);						\
+} while (0)
 
 #define lock_map_acquire(l)			lock_acquire_exclusive(l, 0, 0, NULL, _THIS_IP_)
 #define lock_map_acquire_try(l)			lock_acquire_exclusive(l, 0, 1, NULL, _THIS_IP_)
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index eae115a26488..0c3389ed26b6 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -11,6 +11,7 @@
 #define __LINUX_LOCKDEP_TYPES_H
 
 #include <linux/types.h>
+#include <linux/dept.h>
 
 #define MAX_LOCKDEP_SUBCLASSES		8UL
 
@@ -77,6 +78,7 @@ struct lock_class_key {
 		struct hlist_node		hash_entry;
 		struct lockdep_subclass_key	subkeys[MAX_LOCKDEP_SUBCLASSES];
 	};
+	struct dept_key				dkey;
 };
 
 extern struct lock_class_key __lockdep_no_validate__;
@@ -195,6 +197,7 @@ struct lockdep_map {
 	int				cpu;
 	unsigned long			ip;
 #endif
+	struct dept_map			dmap;
 };
 
 struct pin_cookie { unsigned int val; };
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 847b81ca6436..f8d7f02be04d 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -29,6 +29,7 @@ struct device;
 		, .dep_map = {					\
 			.name = #lockname,			\
 			.wait_type_inner = LD_WAIT_SLEEP,	\
+			.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 		}
 #else
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 288f5235649a..11eece738f1f 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -22,7 +22,7 @@ struct percpu_rw_semaphore {
 };
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)	.dep_map = { .name = #lockname },
+#define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)	.dep_map = { .name = #lockname, .dmap = DEPT_MAP_INITIALIZER(lockname, NULL) },
 #else
 #define __PERCPU_RWSEM_DEP_MAP_INIT(lockname)
 #endif
diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index fa9f1021541e..4dc7f046b0a6 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -81,6 +81,7 @@ do { \
 	.dep_map = {					\
 		.name = #mutexname,			\
 		.wait_type_inner = LD_WAIT_SLEEP,	\
+		.dmap = DEPT_MAP_INITIALIZER(mutexname, NULL),\
 	}
 #else
 #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)
diff --git a/include/linux/rwlock_types.h b/include/linux/rwlock_types.h
index 1948442e7750..6e58dfc84997 100644
--- a/include/linux/rwlock_types.h
+++ b/include/linux/rwlock_types.h
@@ -10,6 +10,7 @@
 	.dep_map = {							\
 		.name = #lockname,					\
 		.wait_type_inner = LD_WAIT_CONFIG,			\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),		\
 	}
 #else
 # define RW_DEP_MAP_INIT(lockname)
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index f1aaf676a874..0f349c83a7dc 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -22,6 +22,7 @@
 	.dep_map = {					\
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_SLEEP,	\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 	},
 #else
 # define __RWSEM_DEP_MAP_INIT(lockname)
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 5ce48eab7a2a..5f3447449fe0 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -51,7 +51,7 @@ static inline void __seqcount_init(seqcount_t *s, const char *name,
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 # define SEQCOUNT_DEP_MAP_INIT(lockname)				\
-		.dep_map = { .name = #lockname }
+		.dep_map = { .name = #lockname, .dmap = DEPT_MAP_INITIALIZER(lockname, NULL) }
 
 /**
  * seqcount_init() - runtime initializer for seqcount_t
diff --git a/include/linux/spinlock_types_raw.h b/include/linux/spinlock_types_raw.h
index 91cb36b65a17..3dcc551ded25 100644
--- a/include/linux/spinlock_types_raw.h
+++ b/include/linux/spinlock_types_raw.h
@@ -31,11 +31,13 @@ typedef struct raw_spinlock {
 	.dep_map = {					\
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_SPIN,	\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 	}
 # define SPIN_DEP_MAP_INIT(lockname)			\
 	.dep_map = {					\
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_CONFIG,	\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 	}
 
 # define LOCAL_SPIN_DEP_MAP_INIT(lockname)		\
@@ -43,6 +45,7 @@ typedef struct raw_spinlock {
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_CONFIG,	\
 		.lock_type = LD_LOCK_PERCPU,		\
+		.dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
 	}
 #else
 # define RAW_SPIN_DEP_MAP_INIT(lockname)
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index f179700fecaf..f1961554ed1a 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -35,7 +35,7 @@ int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
 	__init_srcu_struct((ssp), #ssp, &__srcu_key); \
 })
 
-#define __SRCU_DEP_MAP_INIT(srcu_name)	.dep_map = { .name = #srcu_name },
+#define __SRCU_DEP_MAP_INIT(srcu_name)	.dep_map = { .name = #srcu_name, .dmap = DEPT_MAP_INITIALIZER(srcu_name, NULL) },
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 int init_srcu_struct(struct srcu_struct *ssp);
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 712b7f79a095..cbb036e8cc1d 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -249,10 +249,10 @@ static bool dept_working(void)
  * Even k == NULL is considered as a valid key because it would use
  * &->map_key as the key in that case.
  */
-struct dept_key __dept_no_validate__;
+extern struct lock_class_key __lockdep_no_validate__;
 static bool valid_key(struct dept_key *k)
 {
-	return &__dept_no_validate__ != k;
+	return &__lockdep_no_validate__.dkey != k;
 }
 
 /*
@@ -1946,7 +1946,7 @@ void dept_softirqs_off(void)
 	dept_task()->softirqs_enabled = false;
 }
 
-void dept_hardirqs_off(void)
+void noinstr dept_hardirqs_off(void)
 {
 	/*
 	 * Assumes that it's called with IRQ disabled so that accessing
@@ -1968,7 +1968,7 @@ void dept_softirq_enter(void)
 /*
  * Ensure it's the outmost hardirq context.
  */
-void dept_hardirq_enter(void)
+void noinstr dept_hardirq_enter(void)
 {
 	struct dept_task *dt = dept_task();
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2d4c5bab5af8..dc97f2753ef8 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1224,6 +1224,8 @@ void lockdep_register_key(struct lock_class_key *key)
 	struct lock_class_key *k;
 	unsigned long flags;
 
+	dept_key_init(&key->dkey);
+
 	if (WARN_ON_ONCE(static_obj(key)))
 		return;
 	hash_head = keyhashentry(key);
@@ -4361,6 +4363,8 @@ static void __trace_hardirqs_on_caller(void)
  */
 void lockdep_hardirqs_on_prepare(void)
 {
+	dept_hardirqs_on();
+
 	if (unlikely(!debug_locks))
 		return;
 
@@ -4481,6 +4485,8 @@ EXPORT_SYMBOL_GPL(lockdep_hardirqs_on);
  */
 void noinstr lockdep_hardirqs_off(unsigned long ip)
 {
+	dept_hardirqs_off();
+
 	if (unlikely(!debug_locks))
 		return;
 
@@ -4525,6 +4531,8 @@ void lockdep_softirqs_on(unsigned long ip)
 {
 	struct irqtrace_events *trace = &current->irqtrace;
 
+	dept_softirqs_on_ip(ip);
+
 	if (unlikely(!lockdep_enabled()))
 		return;
 
@@ -4563,6 +4571,8 @@ void lockdep_softirqs_on(unsigned long ip)
  */
 void lockdep_softirqs_off(unsigned long ip)
 {
+	dept_softirqs_off();
+
 	if (unlikely(!lockdep_enabled()))
 		return;
 
@@ -4940,6 +4950,8 @@ void lockdep_init_map_type(struct lockdep_map *lock, const char *name,
 {
 	int i;
 
+	ldt_init(&lock->dmap, &key->dkey, subclass, name);
+
 	for (i = 0; i < NR_LOCKDEP_CACHING_CLASSES; i++)
 		lock->class_cache[i] = NULL;
 
@@ -5736,6 +5748,12 @@ void lock_set_class(struct lockdep_map *lock, const char *name,
 {
 	unsigned long flags;
 
+	/*
+	 * dept_map_(re)init() might be called twice redundantly. But
+	 * there's no choice as long as Dept relies on Lockdep.
+	 */
+	ldt_set_class(&lock->dmap, name, &key->dkey, subclass, ip);
+
 	if (unlikely(!lockdep_enabled()))
 		return;
 
@@ -5753,6 +5771,8 @@ void lock_downgrade(struct lockdep_map *lock, unsigned long ip)
 {
 	unsigned long flags;
 
+	ldt_downgrade(&lock->dmap, ip);
+
 	if (unlikely(!lockdep_enabled()))
 		return;
 
@@ -6588,6 +6608,8 @@ void lockdep_unregister_key(struct lock_class_key *key)
 	bool found = false;
 	bool need_callback = false;
 
+	dept_key_destroy(&key->dkey);
+
 	might_sleep();
 
 	if (WARN_ON_ONCE(static_obj(key)))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 06/47] dept: add proc knobs to show stats and dependency graph
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (4 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 05/47] dept: tie to lockdep and IRQ tracing Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 07/47] dept: distinguish each kernel context from another Byungchul Park
                   ` (40 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

It'd be useful to show dept internal stats and dependency graph on
runtime via proc for better information.  Introduce the knobs.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 kernel/dependency/Makefile        |  1 +
 kernel/dependency/dept.c          | 50 +++-------------
 kernel/dependency/dept_internal.h | 54 +++++++++++++++++
 kernel/dependency/dept_proc.c     | 96 +++++++++++++++++++++++++++++++
 4 files changed, 160 insertions(+), 41 deletions(-)
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_proc.c

diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
index b5cfb8a03c0c..92f165400187 100644
--- a/kernel/dependency/Makefile
+++ b/kernel/dependency/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DEPT) += dept.o
+obj-$(CONFIG_DEPT) += dept_proc.o
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index cbb036e8cc1d..dfe9dfdb6991 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -75,6 +75,7 @@
 #include <linux/dept.h>
 #include <linux/utsname.h>
 #include <linux/kernel.h>
+#include "dept_internal.h"
 
 static int dept_stop;
 static int dept_per_cpu_ready;
@@ -265,46 +266,13 @@ static bool valid_key(struct dept_key *k)
  *       have been freed will be placed.
  */
 
-enum object_t {
-#define OBJECT(id, nr) OBJECT_##id,
-	#include "dept_object.h"
-#undef OBJECT
-	OBJECT_NR,
-};
-
 #define OBJECT(id, nr)							\
 static struct dept_##id spool_##id[nr];					\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
 	#include "dept_object.h"
 #undef OBJECT
 
-struct dept_pool {
-	const char			*name;
-
-	/*
-	 * object size
-	 */
-	size_t				obj_sz;
-
-	/*
-	 * the number of the static array
-	 */
-	atomic_t			obj_nr;
-
-	/*
-	 * offset of ->pool_node
-	 */
-	size_t				node_off;
-
-	/*
-	 * pointer to the pool
-	 */
-	void				*spool;
-	struct llist_head		boot_pool;
-	struct llist_head __percpu	*lpool;
-};
-
-static struct dept_pool pool[OBJECT_NR] = {
+struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {						\
 	.name = #id,							\
 	.obj_sz = sizeof(struct dept_##id),				\
@@ -334,7 +302,7 @@ static void *from_pool(enum object_t t)
 	if (DEPT_WARN_ON(!irqs_disabled()))
 		return NULL;
 
-	p = &pool[t];
+	p = &dept_pool[t];
 
 	/*
 	 * Try local pool first.
@@ -369,7 +337,7 @@ static void *from_pool(enum object_t t)
 
 static void to_pool(void *o, enum object_t t)
 {
-	struct dept_pool *p = &pool[t];
+	struct dept_pool *p = &dept_pool[t];
 	struct llist_head *h;
 
 	preempt_disable();
@@ -2108,7 +2076,7 @@ void dept_map_copy(struct dept_map *to, struct dept_map *from)
 	clean_classes_cache(&to->map_key);
 }
 
-static LIST_HEAD(classes);
+LIST_HEAD(dept_classes);
 
 static bool within(const void *addr, void *start, unsigned long size)
 {
@@ -2140,7 +2108,7 @@ void dept_free_range(void *start, unsigned int sz)
 	while (unlikely(!dept_lock()))
 		cpu_relax();
 
-	list_for_each_entry_safe(c, n, &classes, all_node) {
+	list_for_each_entry_safe(c, n, &dept_classes, all_node) {
 		if (!within((void *)c->key, start, sz) &&
 		    !within(c->name, start, sz))
 			continue;
@@ -2216,7 +2184,7 @@ static struct dept_class *check_new_class(struct dept_key *local,
 	c->sub_id = sub_id;
 	c->key = (unsigned long)(k->base + sub_id);
 	hash_add_class(c);
-	list_add(&c->all_node, &classes);
+	list_add(&c->all_node, &dept_classes);
 unlock:
 	dept_unlock();
 caching:
@@ -2951,8 +2919,8 @@ static void migrate_per_cpu_pool(void)
 		struct llist_head *from;
 		struct llist_head *to;
 
-		from = &pool[i].boot_pool;
-		to = per_cpu_ptr(pool[i].lpool, boot_cpu);
+		from = &dept_pool[i].boot_pool;
+		to = per_cpu_ptr(dept_pool[i].lpool, boot_cpu);
 		move_llist(to, from);
 	}
 }
diff --git a/kernel/dependency/dept_internal.h b/kernel/dependency/dept_internal.h
new file mode 100644
index 000000000000..6b39e5a2a830
--- /dev/null
+++ b/kernel/dependency/dept_internal.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Dept(DEPendency Tracker) - runtime dependency tracker internal header
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
+ */
+
+#ifndef __DEPT_INTERNAL_H
+#define __DEPT_INTERNAL_H
+
+#ifdef CONFIG_DEPT
+#include <linux/percpu.h>
+
+struct dept_pool {
+	const char			*name;
+
+	/*
+	 * object size
+	 */
+	size_t				obj_sz;
+
+	/*
+	 * the number of the static array
+	 */
+	atomic_t			obj_nr;
+
+	/*
+	 * offset of ->pool_node
+	 */
+	size_t				node_off;
+
+	/*
+	 * pointer to the pool
+	 */
+	void				*spool;
+	struct llist_head		boot_pool;
+	struct llist_head __percpu	*lpool;
+};
+
+enum object_t {
+#define OBJECT(id, nr) OBJECT_##id,
+	#include "dept_object.h"
+#undef OBJECT
+	OBJECT_NR,
+};
+
+extern struct list_head dept_classes;
+extern struct dept_pool dept_pool[];
+
+#endif
+#endif /* __DEPT_INTERNAL_H */
diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c
new file mode 100644
index 000000000000..97beaf397715
--- /dev/null
+++ b/kernel/dependency/dept_proc.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Procfs knobs for Dept(DEPendency Tracker)
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (C) 2021 LG Electronics, Inc. , Byungchul Park
+ *  Copyright (C) 2024 SK hynix, Inc. , Byungchul Park
+ */
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/dept.h>
+#include "dept_internal.h"
+
+static void *l_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	/*
+	 * XXX: Serialize list traversal if needed. The following might
+	 * give a wrong information on contention.
+	 */
+	return seq_list_next(v, &dept_classes, pos);
+}
+
+static void *l_start(struct seq_file *m, loff_t *pos)
+{
+	/*
+	 * XXX: Serialize list traversal if needed. The following might
+	 * give a wrong information on contention.
+	 */
+	return seq_list_start_head(&dept_classes, *pos);
+}
+
+static void l_stop(struct seq_file *m, void *v)
+{
+}
+
+static int l_show(struct seq_file *m, void *v)
+{
+	struct dept_class *fc = list_entry(v, struct dept_class, all_node);
+	struct dept_dep *d;
+	const char *prefix;
+
+	if (v == &dept_classes) {
+		seq_puts(m, "All classes:\n\n");
+		return 0;
+	}
+
+	prefix = fc->sched_map ? "<sched> " : "";
+	seq_printf(m, "[%p] %s%s\n", (void *)fc->key, prefix, fc->name);
+
+	/*
+	 * XXX: Serialize list traversal if needed. The following might
+	 * give a wrong information on contention.
+	 */
+	list_for_each_entry(d, &fc->dep_head, dep_node) {
+		struct dept_class *tc = d->wait->class;
+
+		prefix = tc->sched_map ? "<sched> " : "";
+		seq_printf(m, " -> [%p] %s%s\n", (void *)tc->key, prefix, tc->name);
+	}
+	seq_puts(m, "\n");
+
+	return 0;
+}
+
+static const struct seq_operations dept_deps_ops = {
+	.start	= l_start,
+	.next	= l_next,
+	.stop	= l_stop,
+	.show	= l_show,
+};
+
+static int dept_stats_show(struct seq_file *m, void *v)
+{
+	int r;
+
+	seq_puts(m, "Availability in the static pools:\n\n");
+#define OBJECT(id, nr)							\
+	r = atomic_read(&dept_pool[OBJECT_##id].obj_nr);		\
+	if (r < 0)							\
+		r = 0;							\
+	seq_printf(m, "%s\t%d/%d(%d%%)\n", #id, r, nr, (r * 100) / (nr));
+	#include "dept_object.h"
+#undef  OBJECT
+
+	return 0;
+}
+
+static int __init dept_proc_init(void)
+{
+	proc_create_seq("dept_deps", S_IRUSR, NULL, &dept_deps_ops);
+	proc_create_single("dept_stats", S_IRUSR, NULL, dept_stats_show);
+	return 0;
+}
+
+__initcall(dept_proc_init);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 07/47] dept: distinguish each kernel context from another
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (5 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 06/47] dept: add proc knobs to show stats and dependency graph Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64 Byungchul Park
                   ` (39 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Each unique kernel context, in dept's point of view, should be
identified on every entrance to kernel mode e.g. system call or user
oriented fault.  Otherwise, dept may track meaningless dependencies
across different kernel context.

Plus, in order to update kernel context id at the very beginning of each
entrance, arch code support is required, that could be configured by
CONFIG_ARCH_HAS_DEPT_SUPPORT.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 29 ++++++++++-------
 include/linux/sched.h    | 10 +++---
 kernel/dependency/dept.c | 67 ++++++++++++++++++++--------------------
 lib/Kconfig.debug        |  5 ++-
 4 files changed, 61 insertions(+), 50 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 5f0d2d8c8cbe..cb1b1beea077 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -26,11 +26,16 @@ struct task_struct;
 #define DEPT_MAX_SUBCLASSES_USR		(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT)
 #define DEPT_MAX_SUBCLASSES_CACHE	2
 
-#define DEPT_SIRQ			0
-#define DEPT_HIRQ			1
-#define DEPT_IRQS_NR			2
-#define DEPT_SIRQF			(1UL << DEPT_SIRQ)
-#define DEPT_HIRQF			(1UL << DEPT_HIRQ)
+enum {
+	DEPT_CXT_SIRQ = 0,
+	DEPT_CXT_HIRQ,
+	DEPT_CXT_IRQS_NR,
+	DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR,
+	DEPT_CXTS_NR
+};
+
+#define DEPT_SIRQF			(1UL << DEPT_CXT_SIRQ)
+#define DEPT_HIRQF			(1UL << DEPT_CXT_HIRQ)
 
 struct dept_ecxt;
 struct dept_iecxt {
@@ -95,8 +100,8 @@ struct dept_class {
 			/*
 			 * for tracking IRQ dependencies
 			 */
-			struct dept_iecxt iecxt[DEPT_IRQS_NR];
-			struct dept_iwait iwait[DEPT_IRQS_NR];
+			struct dept_iecxt iecxt[DEPT_CXT_IRQS_NR];
+			struct dept_iwait iwait[DEPT_CXT_IRQS_NR];
 
 			/*
 			 * classified by a map embedded in task_struct,
@@ -208,8 +213,8 @@ struct dept_ecxt {
 			/*
 			 * where the IRQ-enabled happened
 			 */
-			unsigned long	enirq_ip[DEPT_IRQS_NR];
-			struct dept_stack *enirq_stack[DEPT_IRQS_NR];
+			unsigned long	enirq_ip[DEPT_CXT_IRQS_NR];
+			struct dept_stack *enirq_stack[DEPT_CXT_IRQS_NR];
 
 			/*
 			 * where the event context started
@@ -253,8 +258,8 @@ struct dept_wait {
 			/*
 			 * where the IRQ wait happened
 			 */
-			unsigned long	irq_ip[DEPT_IRQS_NR];
-			struct dept_stack *irq_stack[DEPT_IRQS_NR];
+			unsigned long	irq_ip[DEPT_CXT_IRQS_NR];
+			struct dept_stack *irq_stack[DEPT_CXT_IRQS_NR];
 
 			/*
 			 * where the wait happened
@@ -384,6 +389,7 @@ extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip,
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned long ip);
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
+extern void dept_update_cxt(void);
 
 static inline void dept_ecxt_enter_nokeep(struct dept_map *m)
 {
@@ -431,6 +437,7 @@ struct dept_map { };
 #define dept_ecxt_exit(m, e_f, ip)			do { } while (0)
 #define dept_sched_enter()				do { } while (0)
 #define dept_sched_exit()				do { } while (0)
+#define dept_update_cxt()				do { } while (0)
 #define dept_ecxt_enter_nokeep(m)			do { } while (0)
 #define dept_key_init(k)				do { (void)(k); } while (0)
 #define dept_key_destroy(k)				do { (void)(k); } while (0)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ddb162201ba1..05c3f8a45405 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -830,19 +830,19 @@ struct dept_task {
 	int				wait_hist_pos;
 
 	/*
-	 * sequential id to identify each IRQ context
+	 * sequential id to identify each context
 	 */
-	unsigned int			irq_id[DEPT_IRQS_NR];
+	unsigned int			cxt_id[DEPT_CXTS_NR];
 
 	/*
 	 * for tracking IRQ-enabled points with cross-event
 	 */
-	unsigned int			wgen_enirq[DEPT_IRQS_NR];
+	unsigned int			wgen_enirq[DEPT_CXT_IRQS_NR];
 
 	/*
 	 * for keeping up-to-date IRQ-enabled points
 	 */
-	unsigned long			enirq_ip[DEPT_IRQS_NR];
+	unsigned long			enirq_ip[DEPT_CXT_IRQS_NR];
 
 	/*
 	 * for reserving a current stack instance at each operation
@@ -896,7 +896,7 @@ struct dept_task {
 	.wait_hist = { { .wait = NULL, } },			\
 	.ecxt_held_pos = 0,					\
 	.wait_hist_pos = 0,					\
-	.irq_id = { 0U },					\
+	.cxt_id = { 0U },					\
 	.wgen_enirq = { 0U },					\
 	.enirq_ip = { 0UL },					\
 	.stack = NULL,						\
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index dfe9dfdb6991..953e1b81a81f 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -230,9 +230,9 @@ static struct dept_class *dep_tc(struct dept_dep *d)
 
 static const char *irq_str(int irq)
 {
-	if (irq == DEPT_SIRQ)
+	if (irq == DEPT_CXT_SIRQ)
 		return "softirq";
-	if (irq == DEPT_HIRQ)
+	if (irq == DEPT_CXT_HIRQ)
 		return "hardirq";
 	return "(unknown)";
 }
@@ -410,7 +410,7 @@ static void initialize_class(struct dept_class *c)
 {
 	int i;
 
-	for (i = 0; i < DEPT_IRQS_NR; i++) {
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 		struct dept_iecxt *ie = &c->iecxt[i];
 		struct dept_iwait *iw = &c->iwait[i];
 
@@ -436,7 +436,7 @@ static void initialize_ecxt(struct dept_ecxt *e)
 {
 	int i;
 
-	for (i = 0; i < DEPT_IRQS_NR; i++) {
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 		e->enirq_stack[i] = NULL;
 		e->enirq_ip[i] = 0UL;
 	}
@@ -452,7 +452,7 @@ static void initialize_wait(struct dept_wait *w)
 {
 	int i;
 
-	for (i = 0; i < DEPT_IRQS_NR; i++) {
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 		w->irq_stack[i] = NULL;
 		w->irq_ip[i] = 0UL;
 	}
@@ -491,7 +491,7 @@ static void destroy_ecxt(struct dept_ecxt *e)
 {
 	int i;
 
-	for (i = 0; i < DEPT_IRQS_NR; i++)
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++)
 		if (e->enirq_stack[i])
 			put_stack(e->enirq_stack[i]);
 	if (e->class)
@@ -507,7 +507,7 @@ static void destroy_wait(struct dept_wait *w)
 {
 	int i;
 
-	for (i = 0; i < DEPT_IRQS_NR; i++)
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++)
 		if (w->irq_stack[i])
 			put_stack(w->irq_stack[i]);
 	if (w->class)
@@ -665,7 +665,7 @@ static void print_diagram(struct dept_dep *d)
 	const char *tc_n = tc->sched_map ? "<sched>" : (tc->name ?: "(unknown)");
 
 	irqf = e->enirqf & w->irqf;
-	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+	for_each_set_bit(irq, &irqf, DEPT_CXT_IRQS_NR) {
 		if (!firstline)
 			pr_warn("\nor\n\n");
 		firstline = false;
@@ -698,7 +698,7 @@ static void print_dep(struct dept_dep *d)
 	const char *tc_n = tc->sched_map ? "<sched>" : (tc->name ?: "(unknown)");
 
 	irqf = e->enirqf & w->irqf;
-	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+	for_each_set_bit(irq, &irqf, DEPT_CXT_IRQS_NR) {
 		pr_warn("%s has been enabled:\n", irq_str(irq));
 		print_ip_stack(e->enirq_ip[irq], e->enirq_stack[irq]);
 		pr_warn("\n");
@@ -866,7 +866,7 @@ static void bfs(void *root, struct bfs_ops *ops, void *in, void **out)
  */
 
 static unsigned long cur_enirqf(void);
-static int cur_irq(void);
+static int cur_cxt(void);
 static unsigned int cur_ctxt_id(void);
 
 static struct dept_iecxt *iecxt(struct dept_class *c, int irq)
@@ -1443,7 +1443,7 @@ static void add_dep(struct dept_ecxt *e, struct dept_wait *w)
 	if (d) {
 		check_dl_bfs(d);
 
-		for (i = 0; i < DEPT_IRQS_NR; i++) {
+		for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 			struct dept_iwait *fiw = iwait(fc, i);
 			struct dept_iecxt *found_ie;
 			struct dept_iwait *found_iw;
@@ -1487,7 +1487,7 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 	struct dept_task *dt = dept_task();
 	struct dept_wait *w;
 	unsigned int wg;
-	int irq;
+	int cxt;
 	int i;
 
 	if (DEPT_WARN_ON(!valid_class(c)))
@@ -1503,9 +1503,9 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 	w->wait_stack = get_current_stack();
 	w->sched_sleep = sched_sleep;
 
-	irq = cur_irq();
-	if (irq < DEPT_IRQS_NR)
-		add_iwait(c, irq, w);
+	cxt = cur_cxt();
+	if (cxt == DEPT_CXT_HIRQ || cxt == DEPT_CXT_SIRQ)
+		add_iwait(c, cxt, w);
 
 	/*
 	 * Avoid adding dependency between user aware nested ecxt and
@@ -1579,7 +1579,7 @@ static struct dept_ecxt_held *add_ecxt(struct dept_map *m,
 	eh->sub_l = sub_l;
 
 	irqf = cur_enirqf();
-	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR)
+	for_each_set_bit(irq, &irqf, DEPT_CXT_IRQS_NR)
 		add_iecxt(c, irq, e, false);
 
 	del_ecxt(e);
@@ -1728,7 +1728,7 @@ static void do_event(struct dept_map *m, struct dept_map *real_m,
 			add_dep(eh->ecxt, wh->wait);
 	}
 
-	for (i = 0; i < DEPT_IRQS_NR; i++) {
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 		struct dept_ecxt *e;
 
 		if (before(dt->wgen_enirq[i], wg))
@@ -1775,7 +1775,7 @@ static void disconnect_class(struct dept_class *c)
 		call_rcu(&d->rh, del_dep_rcu);
 	}
 
-	for (i = 0; i < DEPT_IRQS_NR; i++) {
+	for (i = 0; i < DEPT_CXT_IRQS_NR; i++) {
 		stale_iecxt(iecxt(c, i));
 		stale_iwait(iwait(c, i));
 	}
@@ -1800,27 +1800,21 @@ static unsigned long cur_enirqf(void)
 	return 0UL;
 }
 
-static int cur_irq(void)
+static int cur_cxt(void)
 {
 	if (lockdep_softirq_context(current))
-		return DEPT_SIRQ;
+		return DEPT_CXT_SIRQ;
 	if (lockdep_hardirq_context())
-		return DEPT_HIRQ;
-	return DEPT_IRQS_NR;
+		return DEPT_CXT_HIRQ;
+	return DEPT_CXT_PROCESS;
 }
 
 static unsigned int cur_ctxt_id(void)
 {
 	struct dept_task *dt = dept_task();
-	int irq = cur_irq();
+	int cxt = cur_cxt();
 
-	/*
-	 * Normal process context
-	 */
-	if (irq == DEPT_IRQS_NR)
-		return 0U;
-
-	return dt->irq_id[irq] | (1UL << irq);
+	return dt->cxt_id[cxt] | (1UL << cxt);
 }
 
 static void enirq_transition(int irq)
@@ -1877,7 +1871,7 @@ static void dept_enirq(unsigned long ip)
 
 	flags = dept_enter();
 
-	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+	for_each_set_bit(irq, &irqf, DEPT_CXT_IRQS_NR) {
 		dt->enirq_ip[irq] = ip;
 		enirq_transition(irq);
 	}
@@ -1923,6 +1917,13 @@ void noinstr dept_hardirqs_off(void)
 	dept_task()->hardirqs_enabled = false;
 }
 
+void noinstr dept_update_cxt(void)
+{
+	struct dept_task *dt = dept_task();
+
+	dt->cxt_id[DEPT_CXT_PROCESS] += 1UL << DEPT_CXTS_NR;
+}
+
 /*
  * Ensure it's the outmost softirq context.
  */
@@ -1930,7 +1931,7 @@ void dept_softirq_enter(void)
 {
 	struct dept_task *dt = dept_task();
 
-	dt->irq_id[DEPT_SIRQ] += 1UL << DEPT_IRQS_NR;
+	dt->cxt_id[DEPT_CXT_SIRQ] += 1UL << DEPT_CXTS_NR;
 }
 
 /*
@@ -1940,7 +1941,7 @@ void noinstr dept_hardirq_enter(void)
 {
 	struct dept_task *dt = dept_task();
 
-	dt->irq_id[DEPT_HIRQ] += 1UL << DEPT_IRQS_NR;
+	dt->cxt_id[DEPT_CXT_HIRQ] += 1UL << DEPT_CXTS_NR;
 }
 
 void dept_sched_enter(void)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index b9cff0bec6f2..3669b069337b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1365,9 +1365,12 @@ config DEBUG_PREEMPT
 
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
+config ARCH_HAS_DEPT_SUPPORT
+	bool
+
 config DEPT
 	bool "Dependency tracking (EXPERIMENTAL)"
-	depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
+	depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT && ARCH_HAS_DEPT_SUPPORT
 	select DEBUG_SPINLOCK
 	select DEBUG_MUTEXES if !PREEMPT_RT
 	select DEBUG_RT_MUTEXES if RT_MUTEXES
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (6 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 07/47] dept: distinguish each kernel context from another Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02 15:22   ` Dave Hansen
  2025-10-02  8:12 ` [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64 Byungchul Park
                   ` (38 subsequent siblings)
  46 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

dept needs to notice every entrance from user to kernel mode to treat
every kernel context independently when tracking wait-event dependencies.
Roughly, system call and user oriented fault are the cases.

Make dept aware of the entrances of x86_64 and add support
CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/x86/Kconfig            | 1 +
 arch/x86/entry/syscall_64.c | 7 +++++++
 arch/x86/mm/fault.c         | 7 +++++++
 3 files changed, 15 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 05880301212e..46021cf5934b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -38,6 +38,7 @@ config X86_64
 	select ZONE_DMA32
 	select EXECMEM if DYNAMIC_FTRACE
 	select ACPI_MRRM if ACPI
+	select ARCH_HAS_DEPT_SUPPORT
 
 config FORCE_DYNAMIC_FTRACE
 	def_bool y
diff --git a/arch/x86/entry/syscall_64.c b/arch/x86/entry/syscall_64.c
index b6e68ea98b83..66bd5af5aff1 100644
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -8,6 +8,7 @@
 #include <linux/entry-common.h>
 #include <linux/nospec.h>
 #include <asm/syscall.h>
+#include <linux/dept.h>
 
 #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
 #define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
@@ -86,6 +87,12 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
 /* Returns true to return using SYSRET, or false to use IRET */
 __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
 {
+	/*
+	 * This is a system call from user mode.  Make dept work with a
+	 * new kernel mode context.
+	 */
+	dept_update_cxt();
+
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 998bd807fc7b..017edb75f0a0 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -19,6 +19,7 @@
 #include <linux/mm_types.h>
 #include <linux/mm.h>			/* find_and_lock_vma() */
 #include <linux/vmalloc.h>
+#include <linux/dept.h>
 
 #include <asm/cpufeature.h>		/* boot_cpu_has, ...		*/
 #include <asm/traps.h>			/* dotraplinkage, ...		*/
@@ -1219,6 +1220,12 @@ void do_user_addr_fault(struct pt_regs *regs,
 	tsk = current;
 	mm = tsk->mm;
 
+	/*
+	 * This fault comes from user mode.  Make dept work with a new
+	 * kernel mode context.
+	 */
+	dept_update_cxt();
+
 	if (unlikely((error_code & (X86_PF_USER | X86_PF_INSTR)) == X86_PF_INSTR)) {
 		/*
 		 * Whoops, this is kernel mode code trying to execute from
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (7 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64 Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02 11:39   ` Mark Brown
  2025-10-03 14:36   ` Mark Rutland
  2025-10-02  8:12 ` [PATCH v17 10/47] dept: distinguish each work from another Byungchul Park
                   ` (37 subsequent siblings)
  46 siblings, 2 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

dept needs to notice every entrance from user to kernel mode to treat
every kernel context independently when tracking wait-event dependencies.
Roughly, system call and user oriented fault are the cases.

Make dept aware of the entrances of arm64 and add support
CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/arm64/Kconfig          | 1 +
 arch/arm64/kernel/syscall.c | 7 +++++++
 arch/arm64/mm/fault.c       | 7 +++++++
 3 files changed, 15 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e9bbfacc35a6..a8fab2c052dc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -281,6 +281,7 @@ config ARM64
 	select USER_STACKTRACE_SUPPORT
 	select VDSO_GETRANDOM
 	select VMAP_STACK
+	select ARCH_HAS_DEPT_SUPPORT
 	help
 	  ARM 64-bit (AArch64) Linux support.
 
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index c442fcec6b9e..bbd306335179 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -7,6 +7,7 @@
 #include <linux/ptrace.h>
 #include <linux/randomize_kstack.h>
 #include <linux/syscalls.h>
+#include <linux/dept.h>
 
 #include <asm/debug-monitors.h>
 #include <asm/exception.h>
@@ -96,6 +97,12 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 	 * (Similarly for HVC and SMC elsewhere.)
 	 */
 
+	/*
+	 * This is a system call from user mode.  Make dept work with a
+	 * new kernel mode context.
+	 */
+	dept_update_cxt();
+
 	if (flags & _TIF_MTE_ASYNC_FAULT) {
 		/*
 		 * Process the asynchronous tag check fault before the actual
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index d816ff44faff..96827b999d18 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -26,6 +26,7 @@
 #include <linux/pkeys.h>
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
+#include <linux/dept.h>
 
 #include <asm/acpi.h>
 #include <asm/bug.h>
@@ -622,6 +623,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	if (!(mm_flags & FAULT_FLAG_USER))
 		goto lock_mmap;
 
+	/*
+	 * This fault comes from user mode.  Make dept work with a new
+	 * kernel mode context.
+	 */
+	dept_update_cxt();
+
 	vma = lock_vma_under_rcu(mm, addr);
 	if (!vma)
 		goto lock_mmap;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 10/47] dept: distinguish each work from another
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (8 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64 Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 11/47] dept: add a mechanism to refill the internal memory pools on running out Byungchul Park
                   ` (36 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Workqueue already provides concurrency control.  By that, any wait in a
work doesn't prevents events in other works with the control enabled.
Thus, each work would better be considered a different context.

So let dept assign a different context id to each work.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 kernel/workqueue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c6b79b3675c3..0e05648b4501 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -55,6 +55,7 @@
 #include <linux/kvm_para.h>
 #include <linux/delay.h>
 #include <linux/irq_work.h>
+#include <linux/dept.h>
 
 #include "workqueue_internal.h"
 
@@ -3153,6 +3154,8 @@ __acquires(&pool->lock)
 
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
+	dept_update_cxt();
+
 	/* ensure we're on the correct CPU */
 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 		     raw_smp_processor_id() != pool->cpu);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 11/47] dept: add a mechanism to refill the internal memory pools on running out
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (9 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 10/47] dept: distinguish each work from another Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 12/47] dept: record the latest one out of consecutive waits of the same class Byungchul Park
                   ` (35 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

dept engine works in a constrained environment.  For example, dept
cannot make use of dynamic allocation e.g. kmalloc().  So dept has been
using static pools to keep memory chunks dept uses.

However, dept would barely work once any of the pools gets run out.  So
implemented a mechanism for the refill on the lack, using irq work and
workqueue that fits on the contrained environment.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 kernel/dependency/dept.c          | 108 +++++++++++++++++++++++++-----
 kernel/dependency/dept_internal.h |  19 ++++--
 kernel/dependency/dept_object.h   |  10 +--
 kernel/dependency/dept_proc.c     |   8 +--
 4 files changed, 116 insertions(+), 29 deletions(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 953e1b81a81f..1b16a6095b3c 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -75,6 +75,9 @@
 #include <linux/dept.h>
 #include <linux/utsname.h>
 #include <linux/kernel.h>
+#include <linux/workqueue.h>
+#include <linux/irq_work.h>
+#include <linux/vmalloc.h>
 #include "dept_internal.h"
 
 static int dept_stop;
@@ -143,9 +146,11 @@ static inline struct dept_task *dept_task(void)
 		}							\
 	})
 
-#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO_ONCE(s...)	pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO(s...)		pr_warn("DEPT_INFO: " s)
 
 static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+static arch_spinlock_t dept_pool_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
 /*
  * DEPT internal engine should be cautious in using outside functions
@@ -268,6 +273,7 @@ static bool valid_key(struct dept_key *k)
 
 #define OBJECT(id, nr)							\
 static struct dept_##id spool_##id[nr];					\
+static struct dept_##id rpool_##id[nr];					\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
 	#include "dept_object.h"
 #undef OBJECT
@@ -276,14 +282,74 @@ struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {						\
 	.name = #id,							\
 	.obj_sz = sizeof(struct dept_##id),				\
-	.obj_nr = ATOMIC_INIT(nr),					\
+	.obj_nr = nr,							\
+	.tot_nr = nr,							\
+	.acc_sz = ATOMIC_INIT(sizeof(spool_##id) + sizeof(rpool_##id)), \
 	.node_off = offsetof(struct dept_##id, pool_node),		\
 	.spool = spool_##id,						\
+	.rpool = rpool_##id,						\
 	.lpool = &lpool_##id, },
 	#include "dept_object.h"
 #undef OBJECT
 };
 
+static void dept_wq_work_fn(struct work_struct *work)
+{
+	int i;
+
+	for (i = 0; i < OBJECT_NR; i++) {
+		struct dept_pool *p = dept_pool + i;
+		int sz = p->tot_nr * p->obj_sz;
+		void *rpool;
+		bool need;
+
+		local_irq_disable();
+		arch_spin_lock(&dept_pool_spin);
+		need = !p->rpool;
+		arch_spin_unlock(&dept_pool_spin);
+		local_irq_enable();
+
+		if (!need)
+			continue;
+
+		rpool = vmalloc(sz);
+
+		if (!rpool) {
+			DEPT_STOP("Failed to extend internal resources.\n");
+			break;
+		}
+
+		local_irq_disable();
+		arch_spin_lock(&dept_pool_spin);
+		if (!p->rpool) {
+			p->rpool = rpool;
+			rpool = NULL;
+			atomic_add(sz, &p->acc_sz);
+		}
+		arch_spin_unlock(&dept_pool_spin);
+		local_irq_enable();
+
+		if (rpool)
+			vfree(rpool);
+		else
+			DEPT_INFO("Dept object(%s) just got refilled successfully.\n", p->name);
+	}
+}
+
+static DECLARE_WORK(dept_wq_work, dept_wq_work_fn);
+
+static void dept_irq_work_fn(struct irq_work *w)
+{
+	schedule_work(&dept_wq_work);
+}
+
+static DEFINE_IRQ_WORK(dept_irq_work, dept_irq_work_fn);
+
+static void request_rpool_refill(void)
+{
+	irq_work_queue(&dept_irq_work);
+}
+
 /*
  * Can use llist no matter whether CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG is
  * enabled or not because NMI and other contexts in the same CPU never
@@ -319,19 +385,31 @@ static void *from_pool(enum object_t t)
 	/*
 	 * Try static pool.
 	 */
-	if (atomic_read(&p->obj_nr) > 0) {
-		int idx = atomic_dec_return(&p->obj_nr);
+	arch_spin_lock(&dept_pool_spin);
+
+	if (!p->obj_nr) {
+		p->spool = p->rpool;
+		p->obj_nr = p->rpool ? p->tot_nr : 0;
+		p->rpool = NULL;
+		request_rpool_refill();
+	}
+
+	if (p->obj_nr) {
+		void *ret;
+
+		p->obj_nr--;
+		ret = p->spool + (p->obj_nr * p->obj_sz);
+		arch_spin_unlock(&dept_pool_spin);
 
-		if (idx >= 0)
-			return p->spool + (idx * p->obj_sz);
+		return ret;
 	}
+	arch_spin_unlock(&dept_pool_spin);
 
-	DEPT_INFO_ONCE("---------------------------------------------\n"
-		"  Some of Dept internal resources are run out.\n"
-		"  Dept might still work if the resources get freed.\n"
-		"  However, the chances are Dept will suffer from\n"
-		"  the lack from now. Needs to extend the internal\n"
-		"  resource pools. Ask max.byungchul.park@gmail.com\n");
+	DEPT_INFO("------------------------------------------\n"
+		"  Dept object(%s) is run out.\n"
+		"  Dept is trying to refill the object.\n"
+		"  Nevertheless, if it fails, Dept will stop.\n",
+		p->name);
 	return NULL;
 }
 
@@ -2957,8 +3035,8 @@ void __init dept_init(void)
 	pr_info("... DEPT_MAX_ECXT_HELD  : %d\n", DEPT_MAX_ECXT_HELD);
 	pr_info("... DEPT_MAX_SUBCLASSES : %d\n", DEPT_MAX_SUBCLASSES);
 #define OBJECT(id, nr)							\
-	pr_info("... memory used by %s: %zu KB\n",			\
-	       #id, B2KB(sizeof(struct dept_##id) * nr));
+	pr_info("... memory initially used by %s: %zu KB\n",		\
+	       #id, B2KB(sizeof(spool_##id) + sizeof(rpool_##id)));
 	#include "dept_object.h"
 #undef OBJECT
 #define HASH(id, bits)							\
@@ -2966,6 +3044,6 @@ void __init dept_init(void)
 	       #id, B2KB(sizeof(struct hlist_head) * (1 << (bits))));
 	#include "dept_hash.h"
 #undef HASH
-	pr_info("... total memory used by objects and hashs: %zu KB\n", B2KB(mem_total));
+	pr_info("... total memory initially used by objects and hashs: %zu KB\n", B2KB(mem_total));
 	pr_info("... per task memory footprint: %zu bytes\n", sizeof(struct dept_task));
 }
diff --git a/kernel/dependency/dept_internal.h b/kernel/dependency/dept_internal.h
index 6b39e5a2a830..b2a44632ee4d 100644
--- a/kernel/dependency/dept_internal.h
+++ b/kernel/dependency/dept_internal.h
@@ -23,9 +23,19 @@ struct dept_pool {
 	size_t				obj_sz;
 
 	/*
-	 * the number of the static array
+	 * the remaining number of the object in spool
 	 */
-	atomic_t			obj_nr;
+	int				obj_nr;
+
+	/*
+	 * the number of the object in spool
+	 */
+	int				tot_nr;
+
+	/*
+	 * accumulated amount of memory used by the object in byte
+	 */
+	atomic_t			acc_sz;
 
 	/*
 	 * offset of ->pool_node
@@ -35,9 +45,10 @@ struct dept_pool {
 	/*
 	 * pointer to the pool
 	 */
-	void				*spool;
+	void				*spool; /* static pool */
+	void				*rpool; /* reserved pool */
 	struct llist_head		boot_pool;
-	struct llist_head __percpu	*lpool;
+	struct llist_head __percpu	*lpool; /* local pool */
 };
 
 enum object_t {
diff --git a/kernel/dependency/dept_object.h b/kernel/dependency/dept_object.h
index 0b7eb16fe9fb..4f936adfa8ee 100644
--- a/kernel/dependency/dept_object.h
+++ b/kernel/dependency/dept_object.h
@@ -6,8 +6,8 @@
  * nr: # of the object that should be kept in the pool.
  */
 
-OBJECT(dep, 1024 * 8)
-OBJECT(class, 1024 * 8)
-OBJECT(stack, 1024 * 32)
-OBJECT(ecxt, 1024 * 16)
-OBJECT(wait, 1024 * 32)
+OBJECT(dep, 1024 * 4 * 2)
+OBJECT(class, 1024 * 4)
+OBJECT(stack, 1024 * 4 * 8)
+OBJECT(ecxt, 1024 * 4 * 2)
+OBJECT(wait, 1024 * 4 * 4)
diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c
index 97beaf397715..f28992834588 100644
--- a/kernel/dependency/dept_proc.c
+++ b/kernel/dependency/dept_proc.c
@@ -74,12 +74,10 @@ static int dept_stats_show(struct seq_file *m, void *v)
 {
 	int r;
 
-	seq_puts(m, "Availability in the static pools:\n\n");
+	seq_puts(m, "Accumulated amount of memory used by pools:\n\n");
 #define OBJECT(id, nr)							\
-	r = atomic_read(&dept_pool[OBJECT_##id].obj_nr);		\
-	if (r < 0)							\
-		r = 0;							\
-	seq_printf(m, "%s\t%d/%d(%d%%)\n", #id, r, nr, (r * 100) / (nr));
+	r = atomic_read(&dept_pool[OBJECT_##id].acc_sz);		\
+	seq_printf(m, "%s\t%d KB\n", #id, r / 1024);
 	#include "dept_object.h"
 #undef  OBJECT
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 12/47] dept: record the latest one out of consecutive waits of the same class
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (10 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 11/47] dept: add a mechanism to refill the internal memory pools on running out Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 13/47] dept: apply sdt_might_sleep_{start,end}() to wait_for_completion()/complete() Byungchul Park
                   ` (34 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

The current code records all the waits for later use to track relation
between waits and events within each context.  However, since the same
class is handled the same way, it'd be okay to record only one on behalf
of the others if they all have the same class.

Even though it's the ideal to search the whole history buffer for that,
since it'd cost too high, alternatively, let's keep the latest one when
the same class'ed waits consecutively appear.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 kernel/dependency/dept.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 1b16a6095b3c..f4c08758f8db 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1486,9 +1486,28 @@ static struct dept_wait_hist *new_hist(void)
 	return wh;
 }
 
+static struct dept_wait_hist *last_hist(void)
+{
+	int pos_n = hist_pos_next();
+	struct dept_wait_hist *wh_n = hist(pos_n);
+
+	/*
+	 * This is the first try.
+	 */
+	if (!pos_n && !wh_n->wait)
+		return NULL;
+
+	return hist(pos_n + DEPT_MAX_WAIT_HIST - 1);
+}
+
 static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int ctxt_id)
 {
-	struct dept_wait_hist *wh = new_hist();
+	struct dept_wait_hist *wh;
+
+	wh = last_hist();
+
+	if (!wh || wh->wait->class != w->class || wh->ctxt_id != ctxt_id)
+		wh = new_hist();
 
 	if (likely(wh->wait))
 		put_wait(wh->wait);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 13/47] dept: apply sdt_might_sleep_{start,end}() to wait_for_completion()/complete()
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (11 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 12/47] dept: record the latest one out of consecutive waits of the same class Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 14/47] dept: apply sdt_might_sleep_{start,end}() to swait Byungchul Park
                   ` (33 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Make dept able to track dependencies by wait_for_completion()/complete().

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/completion.h | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index fb2915676574..bd2c207481d6 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -10,6 +10,7 @@
  */
 
 #include <linux/swait.h>
+#include <linux/dept_sdt.h>
 
 /*
  * struct completion - structure used to maintain state for a "completion"
@@ -26,14 +27,33 @@
 struct completion {
 	unsigned int done;
 	struct swait_queue_head wait;
+	struct dept_map dmap;
 };
 
+#define init_completion(x)				\
+do {							\
+	sdt_map_init(&(x)->dmap);			\
+	__init_completion(x);				\
+} while (0)
+
+/*
+ * XXX: No use cases for now. Fill the body when needed.
+ */
 #define init_completion_map(x, m) init_completion(x)
-static inline void complete_acquire(struct completion *x) {}
-static inline void complete_release(struct completion *x) {}
+
+static inline void complete_acquire(struct completion *x)
+{
+	sdt_might_sleep_start(&x->dmap);
+}
+
+static inline void complete_release(struct completion *x)
+{
+	sdt_might_sleep_end();
+}
 
 #define COMPLETION_INITIALIZER(work) \
-	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
+	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+	  .dmap = DEPT_MAP_INITIALIZER(work, NULL), }
 
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
 	(*({ init_completion_map(&(work), &(map)); &(work); }))
@@ -75,13 +95,13 @@ static inline void complete_release(struct completion *x) {}
 #endif
 
 /**
- * init_completion - Initialize a dynamically allocated completion
+ * __init_completion - Initialize a dynamically allocated completion
  * @x:  pointer to completion structure that is to be initialized
  *
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void init_completion(struct completion *x)
+static inline void __init_completion(struct completion *x)
 {
 	x->done = 0;
 	init_swait_queue_head(&x->wait);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 14/47] dept: apply sdt_might_sleep_{start,end}() to swait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (12 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 13/47] dept: apply sdt_might_sleep_{start,end}() to wait_for_completion()/complete() Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 15/47] dept: apply sdt_might_sleep_{start,end}() to waitqueue wait Byungchul Park
                   ` (32 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Make dept able to track dependencies by swaits.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/swait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index d324419482a0..277ac74f61c3 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -6,6 +6,7 @@
 #include <linux/stddef.h>
 #include <linux/spinlock.h>
 #include <linux/wait.h>
+#include <linux/dept_sdt.h>
 #include <asm/current.h>
 
 /*
@@ -161,6 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait);
 	struct swait_queue __wait;					\
 	long __ret = ret;						\
 									\
+	sdt_might_sleep_start(NULL);					\
 	INIT_LIST_HEAD(&__wait.task_list);				\
 	for (;;) {							\
 		long __int = prepare_to_swait_event(&wq, &__wait, state);\
@@ -176,6 +178,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait);
 		cmd;							\
 	}								\
 	finish_swait(&wq, &__wait);					\
+	sdt_might_sleep_end();						\
 __out:	__ret;								\
 })
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 15/47] dept: apply sdt_might_sleep_{start,end}() to waitqueue wait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (13 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 14/47] dept: apply sdt_might_sleep_{start,end}() to swait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 16/47] dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait Byungchul Park
                   ` (31 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Make dept able to track dependencies by waitqueue waits.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/wait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 09855d819418..65dd50f25e54 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -7,6 +7,7 @@
 #include <linux/list.h>
 #include <linux/stddef.h>
 #include <linux/spinlock.h>
+#include <linux/dept_sdt.h>
 
 #include <asm/current.h>
 
@@ -305,6 +306,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags);
 	struct wait_queue_entry __wq_entry;					\
 	long __ret = ret;	/* explicit shadow */				\
 										\
+	sdt_might_sleep_start(NULL);						\
 	init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);	\
 	for (;;) {								\
 		long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);\
@@ -323,6 +325,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags);
 			break;							\
 	}									\
 	finish_wait(&wq_head, &__wq_entry);					\
+	sdt_might_sleep_end();							\
 __out:	__ret;									\
 })
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 16/47] dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (14 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 15/47] dept: apply sdt_might_sleep_{start,end}() to waitqueue wait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 17/47] dept: apply sdt_might_sleep_{start,end}() to dma fence Byungchul Park
                   ` (30 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Make dept able to track dependencies by hashed-waitqueue waits.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/wait_bit.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 9e29d79fc790..179a616ad245 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -6,6 +6,7 @@
  * Linux wait-bit related types and methods:
  */
 #include <linux/wait.h>
+#include <linux/dept_sdt.h>
 
 struct wait_bit_key {
 	unsigned long		*flags;
@@ -257,6 +258,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
 	struct wait_bit_queue_entry __wbq_entry;			\
 	long __ret = ret; /* explicit shadow */				\
 									\
+	sdt_might_sleep_start(NULL);					\
 	init_wait_var_entry(&__wbq_entry, var,				\
 			    exclusive ? WQ_FLAG_EXCLUSIVE : 0);		\
 	for (;;) {							\
@@ -274,6 +276,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
 		cmd;							\
 	}								\
 	finish_wait(__wq_head, &__wbq_entry.wq_entry);			\
+	sdt_might_sleep_end();						\
 __out:	__ret;								\
 })
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 17/47] dept: apply sdt_might_sleep_{start,end}() to dma fence
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (15 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 16/47] dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 18/47] dept: track timeout waits separately with a new Kconfig Byungchul Park
                   ` (29 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Make dept able to track dependencies by dma fence waits and signals.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/dma-buf/dma-fence.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3f78c56b58dc..787fa10a49f1 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -16,6 +16,7 @@
 #include <linux/dma-fence.h>
 #include <linux/sched/signal.h>
 #include <linux/seq_file.h>
+#include <linux/dept_sdt.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/dma_fence.h>
@@ -800,6 +801,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 	cb.task = current;
 	list_add(&cb.base.node, &fence->cb_list);
 
+	sdt_might_sleep_start(NULL);
 	while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) {
 		if (intr)
 			__set_current_state(TASK_INTERRUPTIBLE);
@@ -813,6 +815,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 		if (ret > 0 && intr && signal_pending(current))
 			ret = -ERESTARTSYS;
 	}
+	sdt_might_sleep_end();
 
 	if (!list_empty(&cb.base.node))
 		list_del(&cb.base.node);
@@ -902,6 +905,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 		}
 	}
 
+	sdt_might_sleep_start(NULL);
 	while (ret > 0) {
 		if (intr)
 			set_current_state(TASK_INTERRUPTIBLE);
@@ -916,6 +920,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 		if (ret > 0 && intr && signal_pending(current))
 			ret = -ERESTARTSYS;
 	}
+	sdt_might_sleep_end();
 
 	__set_current_state(TASK_RUNNING);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 18/47] dept: track timeout waits separately with a new Kconfig
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (16 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 17/47] dept: apply sdt_might_sleep_{start,end}() to dma fence Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 19/47] dept: apply timeout consideration to wait_for_completion()/complete() Byungchul Park
                   ` (28 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Waits with valid timeouts don't actually cause deadlocks.  However, dept
has been reporting the cases as well because it's worth informing the
circular dependency for some cases where, for example, timeout is used
to avoid a deadlock.

However, yes, there are also a lot of, even more, cases where timeout
is used for its clear purpose and meant to be expired.

Report these as an information rather than warning DEADLOCK.  Plus,
introduce CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT Kconfig to make it
optional so that any reports involving waits with timeouts can be turned
on/off depending on the purpose.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 13 +++++---
 include/linux/dept_ldt.h |  6 ++--
 include/linux/dept_sdt.h | 13 +++++---
 include/linux/sched.h    |  2 ++
 kernel/dependency/dept.c | 66 ++++++++++++++++++++++++++++++++++------
 lib/Kconfig.debug        | 10 ++++++
 6 files changed, 89 insertions(+), 21 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index cb1b1beea077..49f457390521 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -271,6 +271,11 @@ struct dept_wait {
 			 * whether this wait is for commit in scheduler
 			 */
 			bool		sched_sleep;
+
+			/*
+			 * whether a timeout is set
+			 */
+			bool				timeout;
 		};
 	};
 };
@@ -377,8 +382,8 @@ extern void dept_free_range(void *start, unsigned int sz);
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
-extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l);
-extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn);
+extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l, long timeout);
+extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn, long timeout);
 extern void dept_request_event_wait_commit(void);
 extern void dept_clean_stage(void);
 extern void dept_ttwu_stage_wait(struct task_struct *t, unsigned long ip);
@@ -425,8 +430,8 @@ struct dept_map { };
 #define dept_map_init(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
 #define dept_map_copy(t, f)				do { } while (0)
-#define dept_wait(m, w_f, ip, w_fn, sl)			do { (void)(w_fn); } while (0)
-#define dept_stage_wait(m, k, ip, w_fn)			do { (void)(k); (void)(w_fn); } while (0)
+#define dept_wait(m, w_f, ip, w_fn, sl, t)		do { (void)(w_fn); } while (0)
+#define dept_stage_wait(m, k, ip, w_fn, t)		do { (void)(k); (void)(w_fn); } while (0)
 #define dept_request_event_wait_commit()		do { } while (0)
 #define dept_clean_stage()				do { } while (0)
 #define dept_ttwu_stage_wait(t, ip)			do { } while (0)
diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h
index 8047d0a531f1..730af2517ecd 100644
--- a/include/linux/dept_ldt.h
+++ b/include/linux/dept_ldt.h
@@ -28,7 +28,7 @@
 		else if (t)						\
 			dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", sl);\
 		else {							\
-			dept_wait(m, LDT_EVT_L, i, "lock", sl);		\
+			dept_wait(m, LDT_EVT_L, i, "lock", sl, false);	\
 			dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\
 		}							\
 	} while (0)
@@ -40,7 +40,7 @@
 		else if (t)						\
 			dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", "read_unlock", sl);\
 		else {							\
-			dept_wait(m, q ? LDT_EVT_RW : LDT_EVT_W, i, "read_lock", sl);\
+			dept_wait(m, q ? LDT_EVT_RW : LDT_EVT_W, i, "read_lock", sl, false);\
 			dept_ecxt_enter(m, LDT_EVT_R, i, "read_lock", "read_unlock", sl);\
 		}							\
 	} while (0)
@@ -52,7 +52,7 @@
 		else if (t)						\
 			dept_ecxt_enter(m, LDT_EVT_W, i, "write_trylock", "write_unlock", sl);\
 		else {							\
-			dept_wait(m, LDT_EVT_RW, i, "write_lock", sl);	\
+			dept_wait(m, LDT_EVT_RW, i, "write_lock", sl, false);\
 			dept_ecxt_enter(m, LDT_EVT_W, i, "write_lock", "write_unlock", sl);\
 		}							\
 	} while (0)
diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
index 0535f763b21b..14917df0cc30 100644
--- a/include/linux/dept_sdt.h
+++ b/include/linux/dept_sdt.h
@@ -23,11 +23,12 @@
 
 #define sdt_map_init_key(m, k)		dept_map_init(m, k, 0, #m)
 
-#define sdt_wait(m)							\
+#define sdt_wait_timeout(m, t)						\
 	do {								\
 		dept_request_event(m);					\
-		dept_wait(m, 1UL, _THIS_IP_, __func__, 0);		\
+		dept_wait(m, 1UL, _THIS_IP_, __func__, 0, t);		\
 	} while (0)
+#define sdt_wait(m) sdt_wait_timeout(m, -1L)
 
 /*
  * sdt_might_sleep() and its family will be committed in __schedule()
@@ -38,13 +39,13 @@
 /*
  * Use the code location as the class key if an explicit map is not used.
  */
-#define sdt_might_sleep_start(m)					\
+#define sdt_might_sleep_start_timeout(m, t)				\
 	do {								\
 		struct dept_map *__m = m;				\
 		static struct dept_key __key;				\
-		dept_stage_wait(__m, __m ? NULL : &__key, _THIS_IP_, __func__);\
+		dept_stage_wait(__m, __m ? NULL : &__key, _THIS_IP_, __func__, t);\
 	} while (0)
-
+#define sdt_might_sleep_start(m)	sdt_might_sleep_start_timeout(m, -1L)
 #define sdt_might_sleep_end()		dept_clean_stage()
 
 #define sdt_ecxt_enter(m)		dept_ecxt_enter(m, 1UL, _THIS_IP_, "start", "event", 0)
@@ -54,7 +55,9 @@
 #else /* !CONFIG_DEPT */
 #define sdt_map_init(m)			do { } while (0)
 #define sdt_map_init_key(m, k)		do { (void)(k); } while (0)
+#define sdt_wait_timeout(m, t)		do { } while (0)
 #define sdt_wait(m)			do { } while (0)
+#define sdt_might_sleep_start_timeout(m, t) do { } while (0)
 #define sdt_might_sleep_start(m)	do { } while (0)
 #define sdt_might_sleep_end()		do { } while (0)
 #define sdt_ecxt_enter(m)		do { } while (0)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 05c3f8a45405..95b1450f22fa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -867,6 +867,7 @@ struct dept_task {
 	bool				stage_sched_map;
 	const char			*stage_w_fn;
 	unsigned long			stage_ip;
+	bool				stage_timeout;
 	arch_spinlock_t			stage_lock;
 
 	/*
@@ -907,6 +908,7 @@ struct dept_task {
 	.stage_sched_map = false,				\
 	.stage_w_fn = NULL,					\
 	.stage_ip = 0UL,					\
+	.stage_timeout = false,					\
 	.stage_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED,\
 	.missing_ecxt = 0,					\
 	.hardirqs_enabled = false,				\
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index f4c08758f8db..519b2151403e 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -757,6 +757,8 @@ static void print_diagram(struct dept_dep *d)
 	if (!irqf) {
 		print_spc(spc, "[S] %s(%s:%d)\n", c_fn, fc_n, fc->sub_id);
 		print_spc(spc, "[W] %s(%s:%d)\n", w_fn, tc_n, tc->sub_id);
+		if (w->timeout)
+			print_spc(spc, "--------------- >8 timeout ---------------\n");
 		print_spc(spc, "[E] %s(%s:%d)\n", e_fn, fc_n, fc->sub_id);
 	}
 }
@@ -810,6 +812,24 @@ static void print_dep(struct dept_dep *d)
 
 static void save_current_stack(int skip);
 
+static bool is_timeout_wait_circle(struct dept_class *c)
+{
+	struct dept_class *fc = c->bfs_parent;
+	struct dept_class *tc = c;
+
+	do {
+		struct dept_dep *d = lookup_dep(fc, tc);
+
+		if (d->wait->timeout)
+			return true;
+
+		tc = fc;
+		fc = fc->bfs_parent;
+	} while (tc != c);
+
+	return false;
+}
+
 /*
  * Print all classes in a circle.
  */
@@ -832,10 +852,14 @@ static void print_circle(struct dept_class *c)
 	pr_warn("summary\n");
 	pr_warn("---------------------------------------------------\n");
 
-	if (fc == tc)
+	if (is_timeout_wait_circle(c)) {
+		pr_warn("NOT A DEADLOCK BUT A CIRCULAR DEPENDENCY\n");
+		pr_warn("CHECK IF THE TIMEOUT IS INTENDED\n\n");
+	} else if (fc == tc) {
 		pr_warn("*** AA DEADLOCK ***\n\n");
-	else
+	} else {
 		pr_warn("*** DEADLOCK ***\n\n");
+	}
 
 	i = 0;
 	do {
@@ -1579,7 +1603,8 @@ static int next_wgen(void)
 }
 
 static void add_wait(struct dept_class *c, unsigned long ip,
-		     const char *w_fn, int sub_l, bool sched_sleep)
+		     const char *w_fn, int sub_l, bool sched_sleep,
+		     bool timeout)
 {
 	struct dept_task *dt = dept_task();
 	struct dept_wait *w;
@@ -1599,6 +1624,7 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 	w->wait_fn = w_fn;
 	w->wait_stack = get_current_stack();
 	w->sched_sleep = sched_sleep;
+	w->timeout = timeout;
 
 	cxt = cur_cxt();
 	if (cxt == DEPT_CXT_HIRQ || cxt == DEPT_CXT_SIRQ)
@@ -2297,7 +2323,7 @@ static struct dept_class *check_new_class(struct dept_key *local,
  */
 static void __dept_wait(struct dept_map *m, unsigned long w_f,
 			unsigned long ip, const char *w_fn, int sub_l,
-			bool sched_sleep, bool sched_map)
+			bool sched_sleep, bool sched_map, bool timeout)
 {
 	int e;
 
@@ -2320,7 +2346,7 @@ static void __dept_wait(struct dept_map *m, unsigned long w_f,
 		if (!c)
 			continue;
 
-		add_wait(c, ip, w_fn, sub_l, sched_sleep);
+		add_wait(c, ip, w_fn, sub_l, sched_sleep, timeout);
 	}
 }
 
@@ -2355,14 +2381,23 @@ static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 }
 
 void dept_wait(struct dept_map *m, unsigned long w_f,
-	       unsigned long ip, const char *w_fn, int sub_l)
+	       unsigned long ip, const char *w_fn, int sub_l,
+	       long timeoutval)
 {
 	struct dept_task *dt = dept_task();
 	unsigned long flags;
+	bool timeout;
 
 	if (unlikely(!dept_working()))
 		return;
 
+	timeout = timeoutval > 0 && timeoutval < MAX_SCHEDULE_TIMEOUT;
+
+#if !defined(CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT)
+	if (timeout)
+		return;
+#endif
+
 	if (dt->recursive)
 		return;
 
@@ -2371,21 +2406,30 @@ void dept_wait(struct dept_map *m, unsigned long w_f,
 
 	flags = dept_enter();
 
-	__dept_wait(m, w_f, ip, w_fn, sub_l, false, false);
+	__dept_wait(m, w_f, ip, w_fn, sub_l, false, false, timeout);
 
 	dept_exit(flags);
 }
 EXPORT_SYMBOL_GPL(dept_wait);
 
 void dept_stage_wait(struct dept_map *m, struct dept_key *k,
-		     unsigned long ip, const char *w_fn)
+		     unsigned long ip, const char *w_fn,
+		     long timeoutval)
 {
 	struct dept_task *dt = dept_task();
 	unsigned long flags;
+	bool timeout;
 
 	if (unlikely(!dept_working()))
 		return;
 
+	timeout = timeoutval > 0 && timeoutval < MAX_SCHEDULE_TIMEOUT;
+
+#if !defined(CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT)
+	if (timeout)
+		return;
+#endif
+
 	if (m && m->nocheck)
 		return;
 
@@ -2434,6 +2478,7 @@ void dept_stage_wait(struct dept_map *m, struct dept_key *k,
 
 	dt->stage_w_fn = w_fn;
 	dt->stage_ip = ip;
+	dt->stage_timeout = timeout;
 	arch_spin_unlock(&dt->stage_lock);
 exit:
 	dept_exit_recursive(flags);
@@ -2447,6 +2492,7 @@ static void __dept_clean_stage(struct dept_task *dt)
 	dt->stage_sched_map = false;
 	dt->stage_w_fn = NULL;
 	dt->stage_ip = 0UL;
+	dt->stage_timeout = false;
 }
 
 void dept_clean_stage(void)
@@ -2479,6 +2525,7 @@ void dept_request_event_wait_commit(void)
 	unsigned long ip;
 	const char *w_fn;
 	bool sched_map;
+	bool timeout;
 
 	if (unlikely(!dept_working()))
 		return;
@@ -2505,12 +2552,13 @@ void dept_request_event_wait_commit(void)
 	w_fn = dt->stage_w_fn;
 	ip = dt->stage_ip;
 	sched_map = dt->stage_sched_map;
+	timeout = dt->stage_timeout;
 
 	wg = next_wgen();
 	WRITE_ONCE(dt->stage_m.wgen, wg);
 	arch_spin_unlock(&dt->stage_lock);
 
-	__dept_wait(&dt->stage_m, 1UL, ip, w_fn, 0, true, sched_map);
+	__dept_wait(&dt->stage_m, 1UL, ip, w_fn, 0, true, sched_map, timeout);
 exit:
 	dept_exit(flags);
 }
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3669b069337b..290563fa8b58 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1394,6 +1394,16 @@ config DEPT
 	  noting, to mitigate the impact by the false positives, multi
 	  reporting has been supported.
 
+config DEPT_AGGRESSIVE_TIMEOUT_WAIT
+	bool "Aggressively track even timeout waits"
+	depends on DEPT
+	default n
+	help
+	  Timeout wait doesn't contribute to a deadlock. However,
+	  informing a circular dependency might be helpful for cases
+	  that timeout is used to avoid a deadlock. Say N if you'd like
+	  to avoid verbose reports.
+
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 19/47] dept: apply timeout consideration to wait_for_completion()/complete()
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (17 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 18/47] dept: track timeout waits separately with a new Kconfig Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 20/47] dept: apply timeout consideration to swait Byungchul Park
                   ` (27 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to wait_for_completion()/complete().

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/completion.h | 4 ++--
 kernel/sched/completion.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index bd2c207481d6..3200b741de28 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -41,9 +41,9 @@ do {							\
  */
 #define init_completion_map(x, m) init_completion(x)
 
-static inline void complete_acquire(struct completion *x)
+static inline void complete_acquire(struct completion *x, long timeout)
 {
-	sdt_might_sleep_start(&x->dmap);
+	sdt_might_sleep_start_timeout(&x->dmap, timeout);
 }
 
 static inline void complete_release(struct completion *x)
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 19ee702273c0..5e45a60ff7b3 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -115,7 +115,7 @@ __wait_for_common(struct completion *x,
 {
 	might_sleep();
 
-	complete_acquire(x);
+	complete_acquire(x, timeout);
 
 	raw_spin_lock_irq(&x->wait.lock);
 	timeout = do_wait_for_common(x, action, timeout, state);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 20/47] dept: apply timeout consideration to swait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (18 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 19/47] dept: apply timeout consideration to wait_for_completion()/complete() Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 21/47] dept: apply timeout consideration to waitqueue wait Byungchul Park
                   ` (26 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to swait, assuming an input 'ret' in ___swait_event()
macro is used as a timeout value.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/swait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 277ac74f61c3..233acdf55e9b 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -162,7 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait);
 	struct swait_queue __wait;					\
 	long __ret = ret;						\
 									\
-	sdt_might_sleep_start(NULL);					\
+	sdt_might_sleep_start_timeout(NULL, __ret);			\
 	INIT_LIST_HEAD(&__wait.task_list);				\
 	for (;;) {							\
 		long __int = prepare_to_swait_event(&wq, &__wait, state);\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 21/47] dept: apply timeout consideration to waitqueue wait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (19 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 20/47] dept: apply timeout consideration to swait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 22/47] dept: apply timeout consideration to hashed-waitqueue wait Byungchul Park
                   ` (25 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to waitqueue wait, assuming an input 'ret' in
___wait_event() macro is used as a timeout value.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/wait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 65dd50f25e54..902a2e5381db 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -306,7 +306,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags);
 	struct wait_queue_entry __wq_entry;					\
 	long __ret = ret;	/* explicit shadow */				\
 										\
-	sdt_might_sleep_start(NULL);						\
+	sdt_might_sleep_start_timeout(NULL, __ret);				\
 	init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);	\
 	for (;;) {								\
 		long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 22/47] dept: apply timeout consideration to hashed-waitqueue wait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (20 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 21/47] dept: apply timeout consideration to waitqueue wait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 23/47] dept: apply timeout consideration to dma fence wait Byungchul Park
                   ` (24 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to hashed-waitqueue wait, assuming an input 'ret' in
___wait_var_event() macro is used as a timeout value.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/wait_bit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 179a616ad245..9885ac4e1ded 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -258,7 +258,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
 	struct wait_bit_queue_entry __wbq_entry;			\
 	long __ret = ret; /* explicit shadow */				\
 									\
-	sdt_might_sleep_start(NULL);					\
+	sdt_might_sleep_start_timeout(NULL, __ret);			\
 	init_wait_var_entry(&__wbq_entry, var,				\
 			    exclusive ? WQ_FLAG_EXCLUSIVE : 0);		\
 	for (;;) {							\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 23/47] dept: apply timeout consideration to dma fence wait
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (21 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 22/47] dept: apply timeout consideration to hashed-waitqueue wait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 24/47] dept: make dept able to work with an external wgen Byungchul Park
                   ` (23 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to dma fence wait.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/dma-buf/dma-fence.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 787fa10a49f1..0a4b519e3351 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -801,7 +801,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 	cb.task = current;
 	list_add(&cb.base.node, &fence->cb_list);
 
-	sdt_might_sleep_start(NULL);
+	sdt_might_sleep_start_timeout(NULL, timeout);
 	while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) {
 		if (intr)
 			__set_current_state(TASK_INTERRUPTIBLE);
@@ -905,7 +905,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 		}
 	}
 
-	sdt_might_sleep_start(NULL);
+	sdt_might_sleep_start_timeout(NULL, timeout);
 	while (ret > 0) {
 		if (intr)
 			set_current_state(TASK_INTERRUPTIBLE);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 24/47] dept: make dept able to work with an external wgen
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (22 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 23/47] dept: apply timeout consideration to dma fence wait Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 25/47] dept: track PG_locked with dept Byungchul Park
                   ` (22 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

There is a case where the total map size of waits of a class is so large.
For instance, PG_locked is the case if every struct page embeds its
regular map for PG_locked.  The total size for the maps will be 'the #
of pages * sizeof(struct dept_map)', which is too big to accept.

Keep the minimum data in the case, timestamp called 'wgen', that dept
uses.  Make dept able to work with the wgen instead of whole regular map.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 18 ++++++++++++++----
 include/linux/dept_sdt.h |  6 +++---
 kernel/dependency/dept.c | 30 +++++++++++++++++++++---------
 3 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 49f457390521..236e4f06e5c8 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -372,6 +372,13 @@ struct dept_wait_hist {
 	unsigned int			ctxt_id;
 };
 
+/*
+ * for subsystems that requires compact use of memory e.g. struct page
+ */
+struct dept_ext_wgen {
+	unsigned int wgen;
+};
+
 extern void dept_on(void);
 extern void dept_off(void);
 extern void dept_init(void);
@@ -381,6 +388,7 @@ extern void dept_free_range(void *start, unsigned int sz);
 
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
+extern void dept_ext_wgen_init(struct dept_ext_wgen *ewg);
 extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
 extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l, long timeout);
 extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn, long timeout);
@@ -389,8 +397,8 @@ extern void dept_clean_stage(void);
 extern void dept_ttwu_stage_wait(struct task_struct *t, unsigned long ip);
 extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *c_fn, const char *e_fn, int sub_l);
 extern bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f);
-extern void dept_request_event(struct dept_map *m);
-extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn);
+extern void dept_request_event(struct dept_map *m, struct dept_ext_wgen *ewg);
+extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn, struct dept_ext_wgen *ewg);
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned long ip);
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
@@ -417,6 +425,7 @@ extern void dept_hardirqs_off(void);
 #else /* !CONFIG_DEPT */
 struct dept_key { };
 struct dept_map { };
+struct dept_ext_wgen { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
 
@@ -429,6 +438,7 @@ struct dept_map { };
 
 #define dept_map_init(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
+#define dept_ext_wgen_init(wg)				do { } while (0)
 #define dept_map_copy(t, f)				do { } while (0)
 #define dept_wait(m, w_f, ip, w_fn, sl, t)		do { (void)(w_fn); } while (0)
 #define dept_stage_wait(m, k, ip, w_fn, t)		do { (void)(k); (void)(w_fn); } while (0)
@@ -437,8 +447,8 @@ struct dept_map { };
 #define dept_ttwu_stage_wait(t, ip)			do { } while (0)
 #define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, sl)	do { (void)(c_fn); (void)(e_fn); } while (0)
 #define dept_ecxt_holding(m, e_f)			false
-#define dept_request_event(m)				do { } while (0)
-#define dept_event(m, e_f, ip, e_fn)			do { (void)(e_fn); } while (0)
+#define dept_request_event(m, wg)			do { } while (0)
+#define dept_event(m, e_f, ip, e_fn, wg)		do { (void)(e_fn); } while (0)
 #define dept_ecxt_exit(m, e_f, ip)			do { } while (0)
 #define dept_sched_enter()				do { } while (0)
 #define dept_sched_exit()				do { } while (0)
diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
index 14917df0cc30..9cd70affaf35 100644
--- a/include/linux/dept_sdt.h
+++ b/include/linux/dept_sdt.h
@@ -25,7 +25,7 @@
 
 #define sdt_wait_timeout(m, t)						\
 	do {								\
-		dept_request_event(m);					\
+		dept_request_event(m, NULL);				\
 		dept_wait(m, 1UL, _THIS_IP_, __func__, 0, t);		\
 	} while (0)
 #define sdt_wait(m) sdt_wait_timeout(m, -1L)
@@ -49,9 +49,9 @@
 #define sdt_might_sleep_end()		dept_clean_stage()
 
 #define sdt_ecxt_enter(m)		dept_ecxt_enter(m, 1UL, _THIS_IP_, "start", "event", 0)
-#define sdt_event(m)			dept_event(m, 1UL, _THIS_IP_, __func__)
+#define sdt_event(m)			dept_event(m, 1UL, _THIS_IP_, __func__, NULL)
 #define sdt_ecxt_exit(m)		dept_ecxt_exit(m, 1UL, _THIS_IP_)
-#define sdt_request_event(m)		dept_request_event(m)
+#define sdt_request_event(m)		dept_request_event(m, NULL)
 #else /* !CONFIG_DEPT */
 #define sdt_map_init(m)			do { } while (0)
 #define sdt_map_init_key(m, k)		do { (void)(k); } while (0)
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 519b2151403e..f928c12111fe 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -2172,6 +2172,11 @@ void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u,
 }
 EXPORT_SYMBOL_GPL(dept_map_reinit);
 
+void dept_ext_wgen_init(struct dept_ext_wgen *ewg)
+{
+	ewg->wgen = 0U;
+}
+
 void dept_map_copy(struct dept_map *to, struct dept_map *from)
 {
 	if (unlikely(!dept_working())) {
@@ -2355,7 +2360,7 @@ static void __dept_wait(struct dept_map *m, unsigned long w_f,
  */
 static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 		unsigned long e_f, unsigned long ip, const char *e_fn,
-		bool sched_map)
+		bool sched_map, unsigned int wg)
 {
 	struct dept_class *c;
 	struct dept_key *k;
@@ -2377,7 +2382,7 @@ static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 	c = check_new_class(&m->map_key, k, sub_id(m, e), m->name, sched_map);
 
 	if (c)
-		do_event(m, real_m, c, READ_ONCE(m->wgen), ip, e_fn);
+		do_event(m, real_m, c, wg, ip, e_fn);
 }
 
 void dept_wait(struct dept_map *m, unsigned long w_f,
@@ -2602,7 +2607,7 @@ void dept_ttwu_stage_wait(struct task_struct *requestor, unsigned long ip)
 	if (!m.keys)
 		goto exit;
 
-	__dept_event(&m, real_m, 1UL, ip, "try_to_wake_up", sched_map);
+	__dept_event(&m, real_m, 1UL, ip, "try_to_wake_up", sched_map, m.wgen);
 exit:
 	dept_exit(flags);
 }
@@ -2781,10 +2786,11 @@ bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f)
 }
 EXPORT_SYMBOL_GPL(dept_ecxt_holding);
 
-void dept_request_event(struct dept_map *m)
+void dept_request_event(struct dept_map *m, struct dept_ext_wgen *ewg)
 {
 	unsigned long flags;
 	unsigned int wg;
+	unsigned int *wg_p;
 
 	if (unlikely(!dept_working()))
 		return;
@@ -2797,18 +2803,22 @@ void dept_request_event(struct dept_map *m)
 	 */
 	flags = dept_enter_recursive();
 
+	wg_p = ewg ? &ewg->wgen : &m->wgen;
+
 	wg = next_wgen();
-	WRITE_ONCE(m->wgen, wg);
+	WRITE_ONCE(*wg_p, wg);
 
 	dept_exit_recursive(flags);
 }
 EXPORT_SYMBOL_GPL(dept_request_event);
 
 void dept_event(struct dept_map *m, unsigned long e_f,
-		unsigned long ip, const char *e_fn)
+		unsigned long ip, const char *e_fn,
+		struct dept_ext_wgen *ewg)
 {
 	struct dept_task *dt = dept_task();
 	unsigned long flags;
+	unsigned int *wg_p;
 
 	if (unlikely(!dept_working()))
 		return;
@@ -2816,24 +2826,26 @@ void dept_event(struct dept_map *m, unsigned long e_f,
 	if (m->nocheck)
 		return;
 
+	wg_p = ewg ? &ewg->wgen : &m->wgen;
+
 	if (dt->recursive) {
 		/*
 		 * Dept won't work with this even though an event
 		 * context has been asked. Don't make it confused at
 		 * handling the event. Disable it until the next.
 		 */
-		WRITE_ONCE(m->wgen, 0U);
+		WRITE_ONCE(*wg_p, 0U);
 		return;
 	}
 
 	flags = dept_enter();
 
-	__dept_event(m, m, e_f, ip, e_fn, false);
+	__dept_event(m, m, e_f, ip, e_fn, false, READ_ONCE(*wg_p));
 
 	/*
 	 * Keep the map diabled until the next sleep.
 	 */
-	WRITE_ONCE(m->wgen, 0U);
+	WRITE_ONCE(*wg_p, 0U);
 
 	dept_exit(flags);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 25/47] dept: track PG_locked with dept
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (23 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 24/47] dept: make dept able to work with an external wgen Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 26/47] dept: print staged wait's stacktrace on report Byungchul Park
                   ` (21 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Makes dept able to track PG_locked waits and events, which will be
useful in practice.  See the following link that shows dept worked with
PG_locked and detected real issues in practice:

   https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm_types.h   |   2 +
 include/linux/page-flags.h | 125 +++++++++++++++++++++++++++++++++----
 include/linux/pagemap.h    |  37 ++++++++++-
 mm/filemap.c               |  26 ++++++++
 mm/mm_init.c               |   2 +
 5 files changed, 179 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index a643fae8a349..5ebc565309af 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -20,6 +20,7 @@
 #include <linux/seqlock.h>
 #include <linux/percpu_counter.h>
 #include <linux/types.h>
+#include <linux/dept.h>
 
 #include <asm/mmu.h>
 
@@ -223,6 +224,7 @@ struct page {
 	struct page *kmsan_shadow;
 	struct page *kmsan_origin;
 #endif
+	struct dept_ext_wgen pg_locked_wgen;
 } _struct_page_alignment;
 
 /*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 8d3fa3a91ce4..d3c4954c4218 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -198,6 +198,61 @@ enum pageflags {
 
 #ifndef __GENERATING_BOUNDS_H
 
+#ifdef CONFIG_DEPT
+#include <linux/kernel.h>
+#include <linux/dept.h>
+
+extern struct dept_map pg_locked_map;
+
+/*
+ * Place the following annotations in its suitable point in code:
+ *
+ *	Annotate dept_page_set_bit() around firstly set_bit*()
+ *	Annotate dept_page_clear_bit() around clear_bit*()
+ *	Annotate dept_page_wait_on_bit() around wait_on_bit*()
+ */
+
+static inline void dept_page_set_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_request_event(&pg_locked_map, &p->pg_locked_wgen);
+}
+
+static inline void dept_page_clear_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_event(&pg_locked_map, 1UL, _RET_IP_, __func__, &p->pg_locked_wgen);
+}
+
+static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
+{
+	if (bit_nr == PG_locked)
+		dept_wait(&pg_locked_map, 1UL, _RET_IP_, __func__, 0, -1L);
+}
+
+static inline void dept_folio_set_bit(struct folio *f, int bit_nr)
+{
+	dept_page_set_bit(&f->page, bit_nr);
+}
+
+static inline void dept_folio_clear_bit(struct folio *f, int bit_nr)
+{
+	dept_page_clear_bit(&f->page, bit_nr);
+}
+
+static inline void dept_folio_wait_on_bit(struct folio *f, int bit_nr)
+{
+	dept_page_wait_on_bit(&f->page, bit_nr);
+}
+#else
+#define dept_page_set_bit(p, bit_nr)		do { } while (0)
+#define dept_page_clear_bit(p, bit_nr)		do { } while (0)
+#define dept_page_wait_on_bit(p, bit_nr)	do { } while (0)
+#define dept_folio_set_bit(f, bit_nr)		do { } while (0)
+#define dept_folio_clear_bit(f, bit_nr)		do { } while (0)
+#define dept_folio_wait_on_bit(f, bit_nr)	do { } while (0)
+#endif
+
 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 
@@ -419,27 +474,51 @@ static __always_inline bool folio_test_##name(const struct folio *folio) \
 
 #define FOLIO_SET_FLAG(name, page)					\
 static __always_inline void folio_set_##name(struct folio *folio)	\
-{ set_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	set_bit(PG_##name, folio_flags(folio, page));			\
+	dept_folio_set_bit(folio, PG_##name);				\
+}
 
 #define FOLIO_CLEAR_FLAG(name, page)					\
 static __always_inline void folio_clear_##name(struct folio *folio)	\
-{ clear_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	clear_bit(PG_##name, folio_flags(folio, page));			\
+	dept_folio_clear_bit(folio, PG_##name);				\
+}
 
 #define __FOLIO_SET_FLAG(name, page)					\
 static __always_inline void __folio_set_##name(struct folio *folio)	\
-{ __set_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	__set_bit(PG_##name, folio_flags(folio, page));			\
+	dept_folio_set_bit(folio, PG_##name);				\
+}
 
 #define __FOLIO_CLEAR_FLAG(name, page)					\
 static __always_inline void __folio_clear_##name(struct folio *folio)	\
-{ __clear_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	__clear_bit(PG_##name, folio_flags(folio, page));		\
+	dept_folio_clear_bit(folio, PG_##name);				\
+}
 
 #define FOLIO_TEST_SET_FLAG(name, page)					\
 static __always_inline bool folio_test_set_##name(struct folio *folio)	\
-{ return test_and_set_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	bool __ret = test_and_set_bit(PG_##name, folio_flags(folio, page)); \
+									\
+	if (!__ret)							\
+		dept_folio_set_bit(folio, PG_##name);			\
+	return __ret;							\
+}
 
 #define FOLIO_TEST_CLEAR_FLAG(name, page)				\
 static __always_inline bool folio_test_clear_##name(struct folio *folio) \
-{ return test_and_clear_bit(PG_##name, folio_flags(folio, page)); }
+{									\
+	bool __ret = test_and_clear_bit(PG_##name, folio_flags(folio, page)); \
+									\
+	if (__ret)							\
+		dept_folio_clear_bit(folio, PG_##name);			\
+	return __ret;							\
+}
 
 #define FOLIO_FLAG(name, page)						\
 FOLIO_TEST_FLAG(name, page)						\
@@ -454,32 +533,54 @@ static __always_inline int Page##uname(const struct page *page)		\
 #define SETPAGEFLAG(uname, lname, policy)				\
 FOLIO_SET_FLAG(lname, FOLIO_##policy)					\
 static __always_inline void SetPage##uname(struct page *page)		\
-{ set_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	set_bit(PG_##lname, &policy(page, 1)->flags);			\
+	dept_page_set_bit(page, PG_##lname);				\
+}
 
 #define CLEARPAGEFLAG(uname, lname, policy)				\
 FOLIO_CLEAR_FLAG(lname, FOLIO_##policy)					\
 static __always_inline void ClearPage##uname(struct page *page)		\
-{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	clear_bit(PG_##lname, &policy(page, 1)->flags);			\
+	dept_page_clear_bit(page, PG_##lname);				\
+}
 
 #define __SETPAGEFLAG(uname, lname, policy)				\
 __FOLIO_SET_FLAG(lname, FOLIO_##policy)					\
 static __always_inline void __SetPage##uname(struct page *page)		\
-{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	__set_bit(PG_##lname, &policy(page, 1)->flags);			\
+	dept_page_set_bit(page, PG_##lname);				\
+}
 
 #define __CLEARPAGEFLAG(uname, lname, policy)				\
 __FOLIO_CLEAR_FLAG(lname, FOLIO_##policy)				\
 static __always_inline void __ClearPage##uname(struct page *page)	\
-{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	__clear_bit(PG_##lname, &policy(page, 1)->flags);		\
+	dept_page_clear_bit(page, PG_##lname);				\
+}
 
 #define TESTSETFLAG(uname, lname, policy)				\
 FOLIO_TEST_SET_FLAG(lname, FOLIO_##policy)				\
 static __always_inline int TestSetPage##uname(struct page *page)	\
-{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	bool ret = test_and_set_bit(PG_##lname, &policy(page, 1)->flags);\
+	if (!ret)							\
+		dept_page_set_bit(page, PG_##lname);			\
+	return ret;							\
+}
 
 #define TESTCLEARFLAG(uname, lname, policy)				\
 FOLIO_TEST_CLEAR_FLAG(lname, FOLIO_##policy)				\
 static __always_inline int TestClearPage##uname(struct page *page)	\
-{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
+{									\
+	bool ret = test_and_clear_bit(PG_##lname, &policy(page, 1)->flags);\
+	if (ret)							\
+		dept_page_clear_bit(page, PG_##lname);			\
+	return ret;							\
+}
 
 #define PAGEFLAG(uname, lname, policy)					\
 	TESTPAGEFLAG(uname, lname, policy)				\
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 12a12dae727d..53b68b7a3f17 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1093,7 +1093,12 @@ void folio_unlock(struct folio *folio);
  */
 static inline bool folio_trylock(struct folio *folio)
 {
-	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+	bool ret = !test_and_set_bit_lock(PG_locked, folio_flags(folio, 0));
+
+	if (ret)
+		dept_page_set_bit(&folio->page, PG_locked);
+
+	return likely(ret);
 }
 
 /*
@@ -1129,6 +1134,16 @@ static inline bool trylock_page(struct page *page)
 static inline void folio_lock(struct folio *folio)
 {
 	might_sleep();
+
+	/*
+	 * dept_page_wait_on_bit() will be called if __folio_lock() goes
+	 * through a real wait path.  However, for better job to detect
+	 * *potential* deadlocks, let's assume that folio_lock() always
+	 * goes through wait so that dept can take into account all the
+	 * potential cases.
+	 */
+	dept_page_wait_on_bit(&folio->page, PG_locked);
+
 	if (!folio_trylock(folio))
 		__folio_lock(folio);
 }
@@ -1149,6 +1164,15 @@ static inline void lock_page(struct page *page)
 	struct folio *folio;
 	might_sleep();
 
+	/*
+	 * dept_page_wait_on_bit() will be called if __folio_lock() goes
+	 * through a real wait path.  However, for better job to detect
+	 * *potential* deadlocks, let's assume that lock_page() always
+	 * goes through wait so that dept can take into account all the
+	 * potential cases.
+	 */
+	dept_page_wait_on_bit(page, PG_locked);
+
 	folio = page_folio(page);
 	if (!folio_trylock(folio))
 		__folio_lock(folio);
@@ -1167,6 +1191,17 @@ static inline void lock_page(struct page *page)
 static inline int folio_lock_killable(struct folio *folio)
 {
 	might_sleep();
+
+	/*
+	 * dept_page_wait_on_bit() will be called if
+	 * __folio_lock_killable() goes through a real wait path.
+	 * However, for better job to detect *potential* deadlocks,
+	 * let's assume that folio_lock_killable() always goes through
+	 * wait so that dept can take into account all the potential
+	 * cases.
+	 */
+	dept_page_wait_on_bit(&folio->page, PG_locked);
+
 	if (!folio_trylock(folio))
 		return __folio_lock_killable(folio);
 	return 0;
diff --git a/mm/filemap.c b/mm/filemap.c
index 751838ef05e5..edb0710ddb3f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -48,6 +48,7 @@
 #include <linux/rcupdate_wait.h>
 #include <linux/sched/mm.h>
 #include <linux/sysctl.h>
+#include <linux/dept.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include "internal.h"
@@ -1145,6 +1146,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 		if (flags & WQ_FLAG_CUSTOM) {
 			if (test_and_set_bit(key->bit_nr, &key->folio->flags))
 				return -1;
+			dept_page_set_bit(&key->folio->page, key->bit_nr);
 			flags |= WQ_FLAG_DONE;
 		}
 	}
@@ -1228,6 +1230,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
 		if (test_and_set_bit(bit_nr, &folio->flags))
 			return false;
+		dept_page_set_bit(&folio->page, bit_nr);
 	} else if (test_bit(bit_nr, &folio->flags))
 		return false;
 
@@ -1235,6 +1238,9 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 	return true;
 }
 
+struct dept_map __maybe_unused pg_locked_map = DEPT_MAP_INITIALIZER(pg_locked_map, NULL);
+EXPORT_SYMBOL(pg_locked_map);
+
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
@@ -1246,6 +1252,8 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 	unsigned long pflags;
 	bool in_thrashing;
 
+	dept_page_wait_on_bit(&folio->page, bit_nr);
+
 	if (bit_nr == PG_locked &&
 	    !folio_test_uptodate(folio) && folio_test_workingset(folio)) {
 		delayacct_thrashing_start(&in_thrashing);
@@ -1339,6 +1347,23 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		break;
 	}
 
+	/*
+	 * dept_page_set_bit() might have been called already in
+	 * folio_trylock_flag(), wake_page_function() or somewhere.
+	 * However, call it again to reset the wgen of dept to ensure
+	 * dept_page_wait_on_bit() is called prior to
+	 * dept_page_set_bit().
+	 *
+	 * Remind dept considers all the waits between
+	 * dept_page_set_bit() and dept_page_clear_bit() as potential
+	 * event disturbers. Ensure the correct sequence so that dept
+	 * can make correct decisions:
+	 *
+	 *	wait -> acquire(set bit) -> release(clear bit)
+	 */
+	if (wait->flags & WQ_FLAG_DONE)
+		dept_page_set_bit(&folio->page, bit_nr);
+
 	/*
 	 * If a signal happened, this 'finish_wait()' may remove the last
 	 * waiter from the wait-queues, but the folio waiters bit will remain
@@ -1496,6 +1521,7 @@ void folio_unlock(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	BUILD_BUG_ON(PG_locked > 7);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
+	dept_page_clear_bit(&folio->page, PG_locked);
 	if (folio_xor_flags_has_waiters(folio, 1 << PG_locked))
 		folio_wake_bit(folio, PG_locked);
 }
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 5c21b3af216b..09e4ac6a73c7 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -32,6 +32,7 @@
 #include <linux/vmstat.h>
 #include <linux/kexec_handover.h>
 #include <linux/hugetlb.h>
+#include <linux/dept.h>
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -587,6 +588,7 @@ void __meminit __init_single_page(struct page *page, unsigned long pfn,
 	atomic_set(&page->_mapcount, -1);
 	page_cpupid_reset_last(page);
 	page_kasan_tag_reset(page);
+	dept_ext_wgen_init(&page->pg_locked_wgen);
 
 	INIT_LIST_HEAD(&page->lru);
 #ifdef WANT_PAGE_VIRTUAL
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 26/47] dept: print staged wait's stacktrace on report
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (24 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 25/47] dept: track PG_locked with dept Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 27/47] locking/lockdep: prevent various lockdep assertions when lockdep_off()'ed Byungchul Park
                   ` (20 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Currently, print nothing about what event wakes up in report.  However,
it makes hard to interpret dept's report.

Make it print wait's stacktrace that the event wakes up.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     |  5 ++++
 include/linux/sched.h    |  2 ++
 kernel/dependency/dept.c | 59 ++++++++++++++++++++++++++++++++++------
 3 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 236e4f06e5c8..b6dc4ff19537 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -227,6 +227,11 @@ struct dept_ecxt {
 			 */
 			unsigned long	event_ip;
 			struct dept_stack *event_stack;
+
+			/*
+			 * wait that this event ttwu
+			 */
+			struct dept_stack *ewait_stack;
 		};
 	};
 };
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 95b1450f22fa..a01c10f28dfd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -868,6 +868,7 @@ struct dept_task {
 	const char			*stage_w_fn;
 	unsigned long			stage_ip;
 	bool				stage_timeout;
+	struct dept_stack		*stage_wait_stack;
 	arch_spinlock_t			stage_lock;
 
 	/*
@@ -909,6 +910,7 @@ struct dept_task {
 	.stage_w_fn = NULL,					\
 	.stage_ip = 0UL,					\
 	.stage_timeout = false,					\
+	.stage_wait_stack = NULL,				\
 	.stage_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED,\
 	.missing_ecxt = 0,					\
 	.hardirqs_enabled = false,				\
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index f928c12111fe..1c6eb0a6d0a6 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -523,6 +523,7 @@ static void initialize_ecxt(struct dept_ecxt *e)
 	e->enirqf = 0UL;
 	e->event_ip = 0UL;
 	e->event_stack = NULL;
+	e->ewait_stack = NULL;
 }
 SET_CONSTRUCTOR(ecxt, initialize_ecxt);
 
@@ -578,6 +579,8 @@ static void destroy_ecxt(struct dept_ecxt *e)
 		put_stack(e->ecxt_stack);
 	if (e->event_stack)
 		put_stack(e->event_stack);
+	if (e->ewait_stack)
+		put_stack(e->ewait_stack);
 }
 SET_DESTRUCTOR(ecxt, destroy_ecxt);
 
@@ -794,6 +797,11 @@ static void print_dep(struct dept_dep *d)
 
 		pr_warn("[E] %s(%s:%d):\n", e_fn, fc_n, fc->sub_id);
 		print_ip_stack(e->event_ip, e->event_stack);
+
+		if (valid_stack(e->ewait_stack)) {
+			pr_warn("(wait to wake up)\n");
+			print_ip_stack(0, e->ewait_stack);
+		}
 	}
 
 	if (!irqf) {
@@ -807,6 +815,11 @@ static void print_dep(struct dept_dep *d)
 
 		pr_warn("[E] %s(%s:%d):\n", e_fn, fc_n, fc->sub_id);
 		print_ip_stack(e->event_ip, e->event_stack);
+
+		if (valid_stack(e->ewait_stack)) {
+			pr_warn("(wait to wake up)\n");
+			print_ip_stack(0, e->ewait_stack);
+		}
 	}
 }
 
@@ -1657,7 +1670,8 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 
 static struct dept_ecxt_held *add_ecxt(struct dept_map *m,
 		struct dept_class *c, unsigned long ip, const char *c_fn,
-		const char *e_fn, int sub_l)
+		const char *e_fn, int sub_l,
+		struct dept_stack *ewait_stack)
 {
 	struct dept_task *dt = dept_task();
 	struct dept_ecxt_held *eh;
@@ -1691,6 +1705,7 @@ static struct dept_ecxt_held *add_ecxt(struct dept_map *m,
 	e->class = get_class(c);
 	e->ecxt_ip = ip;
 	e->ecxt_stack = ip ? get_current_stack() : NULL;
+	e->ewait_stack = ewait_stack ? get_stack(ewait_stack) : NULL;
 	e->event_fn = e_fn;
 	e->ecxt_fn = c_fn;
 
@@ -1797,7 +1812,7 @@ static int find_hist_pos(unsigned int wg)
 
 static void do_event(struct dept_map *m, struct dept_map *real_m,
 		struct dept_class *c, unsigned int wg, unsigned long ip,
-		const char *e_fn)
+		const char *e_fn, struct dept_stack *ewait_stack)
 {
 	struct dept_task *dt = dept_task();
 	struct dept_wait_hist *wh;
@@ -1825,7 +1840,7 @@ static void do_event(struct dept_map *m, struct dept_map *real_m,
 	 */
 	if (find_ecxt_pos(real_m, c, false) != -1)
 		return;
-	eh = add_ecxt(m, c, 0UL, NULL, e_fn, 0);
+	eh = add_ecxt(m, c, 0UL, NULL, e_fn, 0, ewait_stack);
 
 	if (!eh)
 		return;
@@ -2360,7 +2375,8 @@ static void __dept_wait(struct dept_map *m, unsigned long w_f,
  */
 static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 		unsigned long e_f, unsigned long ip, const char *e_fn,
-		bool sched_map, unsigned int wg)
+		bool sched_map, unsigned int wg,
+		struct dept_stack *ewait_stack)
 {
 	struct dept_class *c;
 	struct dept_key *k;
@@ -2382,7 +2398,7 @@ static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 	c = check_new_class(&m->map_key, k, sub_id(m, e), m->name, sched_map);
 
 	if (c)
-		do_event(m, real_m, c, wg, ip, e_fn);
+		do_event(m, real_m, c, wg, ip, e_fn, ewait_stack);
 }
 
 void dept_wait(struct dept_map *m, unsigned long w_f,
@@ -2498,6 +2514,9 @@ static void __dept_clean_stage(struct dept_task *dt)
 	dt->stage_w_fn = NULL;
 	dt->stage_ip = 0UL;
 	dt->stage_timeout = false;
+	if (dt->stage_wait_stack)
+		put_stack(dt->stage_wait_stack);
+	dt->stage_wait_stack = NULL;
 }
 
 void dept_clean_stage(void)
@@ -2561,6 +2580,14 @@ void dept_request_event_wait_commit(void)
 
 	wg = next_wgen();
 	WRITE_ONCE(dt->stage_m.wgen, wg);
+
+	/*
+	 * __schedule() can be hit multiple times between
+	 * dept_stage_wait() and dept_clean_stage().  In that case,
+	 * keep the first stacktrace only.  That's enough.
+	 */
+	if (!dt->stage_wait_stack)
+		dt->stage_wait_stack = get_current_stack();
 	arch_spin_unlock(&dt->stage_lock);
 
 	__dept_wait(&dt->stage_m, 1UL, ip, w_fn, 0, true, sched_map, timeout);
@@ -2579,6 +2606,7 @@ void dept_ttwu_stage_wait(struct task_struct *requestor, unsigned long ip)
 	struct dept_map m;
 	struct dept_map *real_m;
 	bool sched_map;
+	struct dept_stack *ewait_stack;
 
 	if (unlikely(!dept_working()))
 		return;
@@ -2597,6 +2625,10 @@ void dept_ttwu_stage_wait(struct task_struct *requestor, unsigned long ip)
 	m = dt_req->stage_m;
 	sched_map = dt_req->stage_sched_map;
 	real_m = dt_req->stage_real_m;
+	ewait_stack = dt_req->stage_wait_stack;
+	if (ewait_stack)
+		get_stack(ewait_stack);
+
 	__dept_clean_stage(dt_req);
 	arch_spin_unlock(&dt_req->stage_lock);
 
@@ -2607,8 +2639,12 @@ void dept_ttwu_stage_wait(struct task_struct *requestor, unsigned long ip)
 	if (!m.keys)
 		goto exit;
 
-	__dept_event(&m, real_m, 1UL, ip, "try_to_wake_up", sched_map, m.wgen);
+	__dept_event(&m, real_m, 1UL, ip, "try_to_wake_up", sched_map,
+			m.wgen, ewait_stack);
 exit:
+	if (ewait_stack)
+		put_stack(ewait_stack);
+
 	dept_exit(flags);
 }
 
@@ -2688,7 +2724,7 @@ void dept_map_ecxt_modify(struct dept_map *m, unsigned long e_f,
 	k = m->keys ?: &m->map_key;
 	c = check_new_class(&m->map_key, k, sub_id(m, new_e), m->name, false);
 
-	if (c && add_ecxt(m, c, new_ip, new_c_fn, new_e_fn, new_sub_l))
+	if (c && add_ecxt(m, c, new_ip, new_c_fn, new_e_fn, new_sub_l, NULL))
 		goto exit;
 
 	/*
@@ -2740,7 +2776,7 @@ void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip,
 	k = m->keys ?: &m->map_key;
 	c = check_new_class(&m->map_key, k, sub_id(m, e), m->name, false);
 
-	if (c && add_ecxt(m, c, ip, c_fn, e_fn, sub_l))
+	if (c && add_ecxt(m, c, ip, c_fn, e_fn, sub_l, NULL))
 		goto exit;
 missing_ecxt:
 	dt->missing_ecxt++;
@@ -2840,7 +2876,7 @@ void dept_event(struct dept_map *m, unsigned long e_f,
 
 	flags = dept_enter();
 
-	__dept_event(m, m, e_f, ip, e_fn, false, READ_ONCE(*wg_p));
+	__dept_event(m, m, e_f, ip, e_fn, false, READ_ONCE(*wg_p), NULL);
 
 	/*
 	 * Keep the map diabled until the next sleep.
@@ -2912,6 +2948,11 @@ void dept_task_exit(struct task_struct *t)
 		dt->stack = NULL;
 	}
 
+	if (dt->stage_wait_stack) {
+		put_stack(dt->stage_wait_stack);
+		dt->stage_wait_stack = NULL;
+	}
+
 	for (i = 0; i < dt->ecxt_held_pos; i++) {
 		if (dt->ecxt_held[i].class) {
 			put_class(dt->ecxt_held[i].class);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 27/47] locking/lockdep: prevent various lockdep assertions when lockdep_off()'ed
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (25 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 26/47] dept: print staged wait's stacktrace on report Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
                   ` (19 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

lockdep provides APIs for assertion only if lockdep is enabled at the
moment asserting to avoid unnecessary confusion, using the following
condition, debug_locks && !this_cpu_read(lockdep_recursion).

However, lockdep_{off,on}() are also used for disabling and enabling
lockdep for a simular purpose.  Add !lockdep_recursing(current) that is
updated by lockdep_{off,on}() to the condition so that the assertions
are aware of !__lockdep_enabled if lockdep_off()'ed.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/lockdep.h  |  3 ++-
 kernel/locking/lockdep.c | 10 ++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index ef03d8808c10..c83fe95199db 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -303,6 +303,7 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 	lockdep_assert_once(!current->lockdep_depth)
 
 #define lockdep_recursing(tsk)	((tsk)->lockdep_recursion)
+extern bool lockdep_recursing_current(void);
 
 #define lockdep_pin_lock(l)	lock_pin_lock(&(l)->dep_map)
 #define lockdep_repin_lock(l,c)	lock_repin_lock(&(l)->dep_map, (c))
@@ -630,7 +631,7 @@ DECLARE_PER_CPU(int, hardirqs_enabled);
 DECLARE_PER_CPU(int, hardirq_context);
 DECLARE_PER_CPU(unsigned int, lockdep_recursion);
 
-#define __lockdep_enabled	(debug_locks && !this_cpu_read(lockdep_recursion))
+#define __lockdep_enabled	(debug_locks && !this_cpu_read(lockdep_recursion) && !lockdep_recursing_current())
 
 #define lockdep_assert_irqs_enabled()					\
 do {									\
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index dc97f2753ef8..39b9e3e27c0b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6900,3 +6900,13 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	warn_rcu_exit(rcu);
 }
 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
+
+/*
+ * For avoiding header dependency when using (struct task_struct *)current
+ * and lockdep_recursing() at the same time.
+ */
+noinstr bool lockdep_recursing_current(void)
+{
+	return lockdep_recursing(current);
+}
+EXPORT_SYMBOL_GPL(lockdep_recursing_current);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 28/47] dept: add documentation for dept
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (26 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 27/47] locking/lockdep: prevent various lockdep assertions when lockdep_off()'ed Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-03  2:44   ` Bagas Sanjaya
                     ` (2 more replies)
  2025-10-02  8:12 ` [PATCH v17 29/47] cpu/hotplug: use a weaker annotation in AP thread Byungchul Park
                   ` (18 subsequent siblings)
  46 siblings, 3 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

This document describes the concept and APIs of dept.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
 Documentation/dependency/dept_api.txt | 117 ++++
 2 files changed, 852 insertions(+)
 create mode 100644 Documentation/dependency/dept.txt
 create mode 100644 Documentation/dependency/dept_api.txt

diff --git a/Documentation/dependency/dept.txt b/Documentation/dependency/dept.txt
new file mode 100644
index 000000000000..5dd358b96734
--- /dev/null
+++ b/Documentation/dependency/dept.txt
@@ -0,0 +1,735 @@
+DEPT(DEPendency Tracker)
+========================
+
+Started by Byungchul Park <max.byungchul.park@sk.com>
+
+How lockdep works
+-----------------
+
+Lockdep detects a deadlock by checking lock acquisition order. For
+example, a graph to track acquisition order built by lockdep might look
+like:
+
+   A -> B -
+           \
+            -> E
+           /
+   C -> D -
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+Lockdep keeps adding each new acquisition order into the graph in
+runtime. For example, 'E -> C' will be added when the two locks have
+been acquired in the order, E and then C. The graph will look like:
+
+       A -> B -
+               \
+                -> E -
+               /      \
+    -> C -> D -        \
+   /                   /
+   \                  /
+    ------------------
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+This graph contains a subgraph that demonstrates a loop like:
+
+                -> E -
+               /      \
+    -> C -> D -        \
+   /                   /
+   \                  /
+    ------------------
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+Lockdep reports it as a deadlock on detection of a loop and stops its
+working.
+
+CONCLUSION
+
+Lockdep detects a deadlock by checking if a loop has been created after
+adding a new acquisition order into the graph.
+
+
+Limitation of lockdep
+---------------------
+
+Lockdep deals with a deadlock by typical lock e.g. spinlock and mutex,
+that are supposed to be released within the acquisition context. However,
+when it comes to a deadlock by folio lock that is not supposed to be
+released within the acquisition context or other general synchronization
+mechanisms, lockdep doesn't work.
+
+Can lockdep detect the following deadlock?
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+   folio_lock B
+		   folio_lock B <- DEADLOCK
+				   mutex_lock A <- DEADLOCK
+				   folio_unlock B
+		   folio_unlock B
+		   mutex_unlock A
+				   mutex_unlock A
+
+No. What about the following?
+
+   context X		   context Y
+
+			   mutex_lock A
+   mutex_lock A <- DEADLOCK
+			   wait_for_complete B <- DEADLOCK
+   complete B
+			   mutex_unlock A
+   mutex_unlock A
+
+No.
+
+CONCLUSION
+
+Lockdep cannot detect a deadlock by folio lock or other general
+synchronization mechanisms.
+
+
+What leads a deadlock
+---------------------
+
+A deadlock occurs when one or multi contexts are waiting for events that
+will never happen. For example:
+
+   context X	   context Y	   context Z
+
+   |		   |		   |
+   v		   |		   |
+   1 wait for A    v		   |
+   .		   2 wait for C    v
+   event C	   .		   3 wait for B
+		   event B	   .
+				   event A
+
+Event C cannot be triggered because context X is stuck at 1, event B
+cannot be triggered because context Y is stuck at 2, and event A cannot
+be triggered because context Z is stuck at 3. All the contexts are stuck.
+We call this *deadlock*.
+
+If an event occurrence is a prerequisite to reaching another event, we
+call it *dependency*. In this example:
+
+   Event A occurrence is a prerequisite to reaching event C.
+   Event C occurrence is a prerequisite to reaching event B.
+   Event B occurrence is a prerequisite to reaching event A.
+
+In terms of dependency:
+
+   Event C depends on event A.
+   Event B depends on event C.
+   Event A depends on event B.
+
+Dependency graph reflecting this example will look like:
+
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+A circular dependency exists. Such a circular dependency leads a
+deadlock since no waiters can have desired events triggered.
+
+CONCLUSION
+
+A circular dependency of events leads a deadlock.
+
+
+Introduce DEPT
+--------------
+
+DEPT(DEPendency Tracker) tracks wait and event instead of lock
+acquisition order so as to recognize the following situation:
+
+   context X	   context Y	   context Z
+
+   |		   |		   |
+   v		   |		   |
+   wait for A	   v		   |
+   .		   wait for C	   v
+   event C	   .		   wait for B
+		   event B	   .
+				   event A
+
+and builds up a dependency graph in runtime that is similar to lockdep.
+The graph might look like:
+
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+DEPT keeps adding each new dependency into the graph in runtime. For
+example, 'B -> D' will be added when event D occurrence is a
+prerequisite to reaching event B like:
+
+   |
+   v
+   wait for D
+   .
+   event B
+
+After the addition, the graph will look like:
+
+                     -> D
+                    /
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+DEPT is going to report a deadlock on detection of a new loop.
+
+CONCLUSION
+
+DEPT works on wait and event so as to theoretically detect all the
+potential deadlocks.
+
+
+How DEPT works
+--------------
+
+Let's take a look how DEPT works with the 1st example in the section
+'Limitation of lockdep'.
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+   folio_lock B
+		   folio_lock B <- DEADLOCK
+				   mutex_lock A <- DEADLOCK
+				   folio_unlock B
+		   folio_unlock B
+		   mutex_unlock A
+				   mutex_unlock A
+
+Adding comments to describe DEPT's view in terms of wait and event:
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* wait for A */
+   folio_lock B
+   /* wait for A */
+   /* start event A context */
+
+		   folio_lock B
+		   /* wait for B */ <- DEADLOCK
+		   /* start event B context */
+
+				   mutex_lock A
+				   /* wait for A */ <- DEADLOCK
+				   /* start event A context */
+
+				   folio_unlock B
+				   /* event B */
+		   folio_unlock B
+		   /* event B */
+
+		   mutex_unlock A
+		   /* event A */
+				   mutex_unlock A
+				   /* event A */
+
+Adding more supplementary comments to describe DEPT's view in detail:
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* might wait for A */
+		   /* start to take into account event A's context */
+		   /* 1 */
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+		   folio_lock B
+		   /* might wait for B */ <- DEADLOCK
+		   /* start to take into account event B's context */
+		   /* 3 */
+
+				   mutex_lock A
+				   /* might wait for A */ <- DEADLOCK
+				   /* start to take into account
+				      event A's context */
+				   /* 4 */
+
+				   folio_unlock B
+				   /* event B that's been valid since 2 */
+		   folio_unlock B
+		   /* event B that's been valid since 3 */
+
+		   mutex_unlock A
+		   /* event A that's been valid since 1 */
+
+				   mutex_unlock A
+				   /* event A that's been valid since 4 */
+
+Let's build up dependency graph with this example. Firstly, context X:
+
+   context X
+
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+There are no events to create dependency. Next, context Y:
+
+   context Y
+
+   mutex_lock A
+   /* might wait for A */
+   /* start to take into account event A's context */
+   /* 1 */
+
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 3 */
+
+   folio_unlock B
+   /* event B that's been valid since 3 */
+
+   mutex_unlock A
+   /* event A that's been valid since 1 */
+
+There are two events. For event B, folio_unlock B, since there are no
+waits between 3 and the event, event B does not create dependency. For
+event A, there is a wait, folio_lock B, between 1 and the event. Which
+means event A cannot be triggered if event B does not wake up the wait.
+Therefore, we can say event A depends on event B, say, 'A -> B'. The
+graph will look like after adding the dependency:
+
+   A -> B
+
+   where 'A -> B' means that event A depends on event B.
+
+Lastly, context Z:
+
+   context Z
+
+   mutex_lock A
+   /* might wait for A */
+   /* start to take into account event A's context */
+   /* 4 */
+
+   folio_unlock B
+   /* event B that's been valid since 2 */
+
+   mutex_unlock A
+   /* event A that's been valid since 4 */
+
+There are also two events. For event B, folio_unlock B, there is a
+wait, mutex_lock A, between 2 and the event - remind 2 is at a very
+start and before the wait in timeline. Which means event B cannot be
+triggered if event A does not wake up the wait. Therefore, we can say
+event B depends on event A, say, 'B -> A'. The graph will look like
+after adding the dependency:
+
+    -> A -> B -
+   /           \
+   \           /
+    -----------
+
+   where 'A -> B' means that event A depends on event B.
+
+A new loop has been created. So DEPT can report it as a deadlock. For
+event A, mutex_unlock A, since there are no waits between 4 and the
+event, event A does not create dependency. That's it.
+
+CONCLUSION
+
+DEPT works well with any general synchronization mechanisms by focusing
+on wait, event and its context.
+
+
+Interpret DEPT report
+---------------------
+
+The following is the example in the section 'How DEPT works'.
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* might wait for A */
+		   /* start to take into account event A's context */
+		   /* 1 */
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+		   folio_lock B
+		   /* might wait for B */ <- DEADLOCK
+		   /* start to take into account event B's context */
+		   /* 3 */
+
+				   mutex_lock A
+				   /* might wait for A */ <- DEADLOCK
+				   /* start to take into account
+				      event A's context */
+				   /* 4 */
+
+				   folio_unlock B
+				   /* event B that's been valid since 2 */
+		   folio_unlock B
+		   /* event B that's been valid since 3 */
+
+		   mutex_unlock A
+		   /* event A that's been valid since 1 */
+
+				   mutex_unlock A
+				   /* event A that's been valid since 4 */
+
+We can Simplify this by replacing each waiting point with [W], each
+point where its event's context starts with [S] and each event with [E].
+This example will look like after the replacement:
+
+   context X	   context Y	   context Z
+
+		   [W][S] mutex_lock A
+   [W][S] folio_lock B
+		   [W][S] folio_lock B <- DEADLOCK
+
+				   [W][S] mutex_lock A <- DEADLOCK
+				   [E] folio_unlock B
+		   [E] folio_unlock B
+		   [E] mutex_unlock A
+				   [E] mutex_unlock A
+
+DEPT uses the symbols [W], [S] and [E] in its report as described above.
+The following is an example reported by DEPT for a real problem.
+
+   Link: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
+   Link: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
+
+   ===================================================
+   DEPT: Circular dependency has been detected.
+   6.2.0-rc1-00025-gb0c20ebf51ac-dirty #28 Not tainted
+   ---------------------------------------------------
+   summary
+   ---------------------------------------------------
+   *** DEADLOCK ***
+
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S]: start of the event context
+   [W]: the wait blocked
+   [E]: the event not reachable
+   ---------------------------------------------------
+   context A's detail
+   ---------------------------------------------------
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   [S] lock(&ni->ni_lock:0):
+   [<ffffffff82b396fb>] ntfs3_setattr+0x54b/0xd40
+   stacktrace:
+         ntfs3_setattr+0x54b/0xd40
+         notify_change+0xcb3/0x1430
+         do_truncate+0x149/0x210
+         path_openat+0x21a3/0x2a90
+         do_filp_open+0x1ba/0x410
+         do_sys_openat2+0x16d/0x4e0
+         __x64_sys_creat+0xcd/0x120
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+   [W] folio_wait_bit_common(PG_locked_map:0):
+   [<ffffffff81b228b0>] truncate_inode_pages_range+0x9b0/0xf20
+   stacktrace:
+         folio_wait_bit_common+0x5e0/0xaf0
+         truncate_inode_pages_range+0x9b0/0xf20
+         truncate_pagecache+0x67/0x90
+         ntfs3_setattr+0x55a/0xd40
+         notify_change+0xcb3/0x1430
+         do_truncate+0x149/0x210
+         path_openat+0x21a3/0x2a90
+         do_filp_open+0x1ba/0x410
+         do_sys_openat2+0x16d/0x4e0
+         __x64_sys_creat+0xcd/0x120
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+   [E] unlock(&ni->ni_lock:0):
+   (N/A)
+   ---------------------------------------------------
+   context B's detail
+   ---------------------------------------------------
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S] (unknown)(PG_locked_map:0):
+   (N/A)
+
+   [W] lock(&ni->ni_lock:0):
+   [<ffffffff82b009ec>] attr_data_get_block+0x32c/0x19f0
+   stacktrace:
+         attr_data_get_block+0x32c/0x19f0
+         ntfs_get_block_vbo+0x264/0x1330
+         __block_write_begin_int+0x3bd/0x14b0
+         block_write_begin+0xb9/0x4d0
+         ntfs_write_begin+0x27e/0x480
+         generic_perform_write+0x256/0x570
+         __generic_file_write_iter+0x2ae/0x500
+         ntfs_file_write_iter+0x66d/0x1d70
+         do_iter_readv_writev+0x20b/0x3c0
+         do_iter_write+0x188/0x710
+         vfs_iter_write+0x74/0xa0
+         iter_file_splice_write+0x745/0xc90
+         direct_splice_actor+0x114/0x180
+         splice_direct_to_actor+0x33b/0x8b0
+         do_splice_direct+0x1b7/0x280
+         do_sendfile+0xb49/0x1310
+
+   [E] folio_unlock(PG_locked_map:0):
+   [<ffffffff81f10222>] generic_write_end+0xf2/0x440
+   stacktrace:
+         generic_write_end+0xf2/0x440
+         ntfs_write_end+0x42e/0x980
+         generic_perform_write+0x316/0x570
+         __generic_file_write_iter+0x2ae/0x500
+         ntfs_file_write_iter+0x66d/0x1d70
+         do_iter_readv_writev+0x20b/0x3c0
+         do_iter_write+0x188/0x710
+         vfs_iter_write+0x74/0xa0
+         iter_file_splice_write+0x745/0xc90
+         direct_splice_actor+0x114/0x180
+         splice_direct_to_actor+0x33b/0x8b0
+         do_splice_direct+0x1b7/0x280
+         do_sendfile+0xb49/0x1310
+         __x64_sys_sendfile64+0x1d0/0x210
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+   ---------------------------------------------------
+   information that might be helpful
+   ---------------------------------------------------
+   CPU: 1 PID: 8060 Comm: a.out Not tainted
+	6.2.0-rc1-00025-gb0c20ebf51ac-dirty #28
+   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
+	BIOS Bochs 01/01/2011
+   Call Trace:
+    <TASK>
+    dump_stack_lvl+0xf2/0x169
+    print_circle.cold+0xca4/0xd28
+    ? lookup_dep+0x240/0x240
+    ? extend_queue+0x223/0x300
+    cb_check_dl+0x1e7/0x260
+    bfs+0x27b/0x610
+    ? print_circle+0x240/0x240
+    ? llist_add_batch+0x180/0x180
+    ? extend_queue_rev+0x300/0x300
+    ? __add_dep+0x60f/0x810
+    add_dep+0x221/0x5b0
+    ? __add_idep+0x310/0x310
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    __dept_wait+0x600/0x1490
+    ? add_iecxt+0x1bc/0xa60
+    ? truncate_inode_pages_range+0x9b0/0xf20
+    ? check_new_class+0x790/0x790
+    ? dept_enirq_transition+0x519/0x9c0
+    dept_wait+0x159/0x3b0
+    ? truncate_inode_pages_range+0x9b0/0xf20
+    folio_wait_bit_common+0x5e0/0xaf0
+    ? filemap_get_folios_contig+0xa30/0xa30
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x10e/0x160
+    ? lock_is_held_type+0x11e/0x160
+    truncate_inode_pages_range+0x9b0/0xf20
+    ? truncate_inode_partial_folio+0xba0/0xba0
+    ? setattr_prepare+0x142/0xc40
+    truncate_pagecache+0x67/0x90
+    ntfs3_setattr+0x55a/0xd40
+    ? ktime_get_coarse_real_ts64+0x1e5/0x2f0
+    ? ntfs_extend+0x5c0/0x5c0
+    ? mode_strip_sgid+0x210/0x210
+    ? ntfs_extend+0x5c0/0x5c0
+    notify_change+0xcb3/0x1430
+    ? do_truncate+0x149/0x210
+    do_truncate+0x149/0x210
+    ? file_open_root+0x430/0x430
+    ? process_measurement+0x18c0/0x18c0
+    ? ntfs_file_release+0x230/0x230
+    path_openat+0x21a3/0x2a90
+    ? path_lookupat+0x840/0x840
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x10e/0x160
+    do_filp_open+0x1ba/0x410
+    ? may_open_dev+0xf0/0xf0
+    ? find_held_lock+0x2d/0x110
+    ? lock_release+0x43c/0x830
+    ? dept_ecxt_exit+0x31a/0x590
+    ? _raw_spin_unlock+0x3b/0x50
+    ? alloc_fd+0x2de/0x6e0
+    do_sys_openat2+0x16d/0x4e0
+    ? __ia32_sys_get_robust_list+0x3b0/0x3b0
+    ? build_open_flags+0x6f0/0x6f0
+    ? dept_enirq_transition+0x519/0x9c0
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x4e/0x160
+    ? lock_is_held_type+0x4e/0x160
+    __x64_sys_creat+0xcd/0x120
+    ? __x64_compat_sys_openat+0x1f0/0x1f0
+    do_syscall_64+0x41/0xc0
+    entry_SYSCALL_64_after_hwframe+0x63/0xcd
+   RIP: 0033:0x7f8b9e4e4469
+   Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48
+   89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48>
+   3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
+   RSP: 002b:00007f8b9eea4ef8 EFLAGS: 00000202 ORIG_RAX: 0000000000000055
+   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8b9e4e4469
+   RDX: 0000000000737562 RSI: 0000000000000000 RDI: 0000000020000000
+   RBP: 00007f8b9eea4f20 R08: 0000000000000000 R09: 0000000000000000
+   R10: 0000000000000000 R11: 0000000000000202 R12: 00007fffa75511ee
+   R13: 00007fffa75511ef R14: 00007f8b9ee85000 R15: 0000000000000003
+    </TASK>
+
+Let's take a look at the summary that is the most important part.
+
+   ---------------------------------------------------
+   summary
+   ---------------------------------------------------
+   *** DEADLOCK ***
+
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S]: start of the event context
+   [W]: the wait blocked
+   [E]: the event not reachable
+
+The summary shows the following scenario:
+
+   context A	   context B	   context ?(unknown)
+
+				   [S] folio_lock(&f1)
+   [S] lock(&ni->ni_lock:0)
+   [W] folio_wait_bit_common(PG_locked_map:0)
+
+		   [W] lock(&ni->ni_lock:0)
+		   [E] folio_unlock(&f1)
+
+   [E] unlock(&ni->ni_lock:0)
+
+Adding supplementary comments to describe DEPT's view in detail:
+
+   context A	   context B	   context ?(unknown)
+
+				   [S] folio_lock(&f1)
+				   /* start to take into account context
+				      B heading for folio_unlock(&f1) */
+				   /* 1 */
+   [S] lock(&ni->ni_lock:0)
+   /* start to take into account this context heading for
+      unlock(&ni->ni_lock:0) */
+   /* 2 */
+
+   [W] folio_wait_bit_common(PG_locked_map:0) (= folio_lock(&f1))
+   /* might wait for folio_unlock(&f1) */
+
+		   [W] lock(&ni->ni_lock:0)
+		   /* might wait for unlock(&ni->ni_lock:0) */
+
+		   [E] folio_unlock(&f1)
+		   /* event that's been valid since 1 */
+
+   [E] unlock(&ni->ni_lock:0)
+   /* event that's been valid since 2 */
+
+Let's build up dependency graph with this report. Firstly, context A:
+
+   context A
+
+   [S] lock(&ni->ni_lock:0)
+   /* start to take into account this context heading for
+      unlock(&ni->ni_lock:0) */
+   /* 2 */
+
+   [W] folio_wait_bit_common(PG_locked_map:0) (= folio_lock(&f1))
+   /* might wait for folio_unlock(&f1) */
+
+   [E] unlock(&ni->ni_lock:0)
+   /* event that's been valid since 2 */
+
+There is one interesting event, unlock(&ni->ni_lock:0). There is a
+wait, folio_lock(&f1), between 2 and the event. Which means
+unlock(&ni->ni_lock:0) is not reachable if folio_unlock(&f1) does not
+wake up the wait. Therefore, we can say unlock(&ni->ni_lock:0) depends
+on folio_unlock(&f1), say, 'unlock(&ni->ni_lock:0) -> folio_unlock(&f1)'.
+The graph will look like after adding the dependency:
+
+   unlock(&ni->ni_lock:0) -> folio_unlock(&f1)
+
+   where 'A -> B' means that event A depends on event B.
+
+Secondly, context B:
+
+   context B
+
+   [W] lock(&ni->ni_lock:0)
+   /* might wait for unlock(&ni->ni_lock:0) */
+
+   [E] folio_unlock(&f1)
+   /* event that's been valid since 1 */
+
+There is also one interesting event, folio_unlock(&f1). There is a
+wait, lock(&ni->ni_lock:0), between 1 and the event - remind 1 is at a
+very start and before the wait in timeline. Which means folio_unlock(&f1)
+is not reachable if unlock(&ni->ni_lock:0) does not wake up the wait.
+Therefore, we can say folio_unlock(&f1) depends on unlock(&ni->ni_lock:0),
+say, 'folio_unlock(&f1) -> unlock(&ni->ni_lock:0)'. The graph will look
+like after adding the dependency:
+
+    -> unlock(&ni->ni_lock:0) -> folio_unlock(&f1) -
+   /                                                \
+   \                                                /
+    ------------------------------------------------
+
+   where 'A -> B' means that event A depends on event B.
+
+A new loop has been created. So DEPT can report it as a deadlock! Cool!
+
+CONCLUSION
+
+DEPT works awesome!
diff --git a/Documentation/dependency/dept_api.txt b/Documentation/dependency/dept_api.txt
new file mode 100644
index 000000000000..8e0d5a118a46
--- /dev/null
+++ b/Documentation/dependency/dept_api.txt
@@ -0,0 +1,117 @@
+DEPT(DEPendency Tracker) APIs
+=============================
+
+Started by Byungchul Park <max.byungchul.park@sk.com>
+
+SDT(Single-event Dependency Tracker) APIs
+-----------------------------------------
+Use these APIs to annotate on either wait or event.  These have been
+already applied into the existing synchronization primitives e.g.
+waitqueue, swait, wait_for_completion(), dma fence and so on.  The basic
+APIs of SDT are:
+
+   /*
+    * After defining 'struct dept_map map', initialize the instance.
+    */
+   sdt_map_init(map);
+
+   /*
+    * Place just before the interesting wait.
+    */
+   sdt_wait(map);
+
+   /*
+    * Place just before the interesting event.
+    */
+   sdt_event(map);
+
+The advanced APIs of SDT are:
+
+   /*
+    * After defining 'struct dept_map map', initialize the instance
+    * using an external key.
+    */
+   sdt_map_init_key(map, key);
+
+   /*
+    * Place just before the interesting timeout wait.
+    */
+   sdt_wait_timeout(map, time);
+
+   /*
+    * Use sdt_might_sleep_start() and sdt_might_sleep_end() in pair.
+    * Place at the start of the interesting section that might enter
+    * schedule() or its family that needs to be woken up by
+    * try_to_wake_up().
+    */
+   sdt_might_sleep_start(map);
+
+   /*
+    * Use sdt_might_sleep_start_timeout() and sdt_might_sleep_end() in
+    * pair.  Place at the start of the interesting section that might
+    * enter schedule_timeout() or its family that needs to be woken up
+    * by try_to_wake_up().
+    */
+   sdt_might_sleep_start_timeout(map, time);
+
+   /*
+    * Use sdt_might_sleep_start() and sdt_might_sleep_end() in pair.
+    * Place at the end of the interesting section that might enter
+    * schedule(), schedule_timeout() or its family that needs to be
+    * woken up by try_to_wake_up().
+    */
+   sdt_might_sleep_end();
+
+   /*
+    * Use sdt_ecxt_enter() and sdt_ecxt_exit() in pair.  Place at the
+    * start of the interesting section where the interesting event might
+    * be triggered.
+    */
+   sdt_ecxt_enter(map);
+
+   /*
+    * Use sdt_ecxt_enter() and sdt_ecxt_exit() in pair.  Place at the
+    * end of the interesting section where the interesting event might
+    * be triggered.
+    */
+   sdt_ecxt_exit(map);
+
+
+LDT(Lock Dependency Tracker) APIs
+---------------------------------
+Do not use these APIs directly.  These are the wrappers for typical
+locks, that have been already applied into major locks internally e.g.
+spin lock, mutex, rwlock and so on.  The APIs of LDT are:
+
+   ldt_init(map, key, sub, name);
+   ldt_lock(map, sub_local, try, nest, ip);
+   ldt_rlock(map, sub_local, try, nest, ip, queued);
+   ldt_wlock(map, sub_local, try, nest, ip);
+   ldt_unlock(map, ip);
+   ldt_downgrade(map, ip);
+   ldt_set_class(map, name, key, sub_local, ip);
+
+
+Raw APIs
+--------
+Do not use these APIs directly.  The raw APIs of dept are:
+
+   dept_free_range(start, size);
+   dept_map_init(map, key, sub, name);
+   dept_map_reinit(map, key, sub, name);
+   dept_ext_wgen_init(ext_wgen);
+   dept_map_copy(map_to, map_from);
+   dept_wait(map, wait_flags, ip, wait_func, sub_local, time);
+   dept_stage_wait(map, key, ip, wait_func, time);
+   dept_request_event_wait_commit();
+   dept_clean_stage();
+   dept_stage_event(task, ip);
+   dept_ecxt_enter(map, evt_flags, ip, ecxt_func, evt_func, sub_local);
+   dept_ecxt_holding(map, evt_flags);
+   dept_request_event(map, ext_wgen);
+   dept_event(map, evt_flags, ip, evt_func, ext_wgen);
+   dept_ecxt_exit(map, evt_flags, ip);
+   dept_ecxt_enter_nokeep(map);
+   dept_key_init(key);
+   dept_key_destroy(key);
+   dept_map_ecxt_modify(map, cur_evt_flags, key, evt_flags, ip, ecxt_func, evt_func, sub_local);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 29/47] cpu/hotplug: use a weaker annotation in AP thread
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (27 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling Byungchul Park
                   ` (17 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

cb92173d1f0 ("locking/lockdep, cpu/hotplug: Annotate AP thread") was
introduced to make lockdep_assert_cpus_held() work in AP thread.

However, the annotation is too strong for that purpose.  We don't have
to use more than try lock annotation for that.

rwsem_acquire() implies:

   1. might be a waiter on contention of the lock.
   2. enter to the critical section of the lock.

All we need in here is to act 2, not 1.  So trylock version of
annotation is sufficient for that purpose.  Now that dept partially
relies on lockdep annotaions, dept interpets rwsem_acquire() as a
potential wait and might report a deadlock by the wait.

Replace it with trylock version of annotation.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 kernel/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index db9f6c539b28..285d7fa55523 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -538,7 +538,7 @@ int lockdep_is_cpus_held(void)
 
 static void lockdep_acquire_cpus_lock(void)
 {
-	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 0, _THIS_IP_);
+	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 1, _THIS_IP_);
 }
 
 static void lockdep_release_cpus_lock(void)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (28 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 29/47] cpu/hotplug: use a weaker annotation in AP thread Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:40   ` Jan Kara
  2025-10-02  8:12 ` [PATCH v17 31/47] dept: assign dept map to mmu notifier invalidation synchronization Byungchul Park
                   ` (16 subsequent siblings)
  46 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

jbd2 journal handling code doesn't want jbd2_might_wait_for_commit()
to be placed between start_this_handle() and stop_this_handle().  So it
marks the region with rwsem_acquire_read() and rwsem_release().

However, the annotation is too strong for that purpose.  We don't have
to use more than try lock annotation for that.

rwsem_acquire_read() implies:

   1. might be a waiter on contention of the lock.
   2. enter to the critical section of the lock.

All we need in here is to act 2, not 1.  So trylock version of
annotation is sufficient for that purpose.  Now that dept partially
relies on lockdep annotaions, dept interpets rwsem_acquire_read() as a
potential wait and might report a deadlock by the wait.

Replace it with trylock version of annotation.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 fs/jbd2/transaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index c7867139af69..b4e65f51bf5e 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -441,7 +441,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
 	read_unlock(&journal->j_state_lock);
 	current->journal_info = handle;
 
-	rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
+	rwsem_acquire_read(&journal->j_trans_commit_map, 0, 1, _THIS_IP_);
 	jbd2_journal_free_transaction(new_transaction);
 	/*
 	 * Ensure that no allocations done while the transaction is open are
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 31/47] dept: assign dept map to mmu notifier invalidation synchronization
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (29 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 32/47] dept: assign unique dept_key to each distinct dma fence caller Byungchul Park
                   ` (15 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Resolved the following false positive by introducing explicit dept map
and annotations for dealing with this case:

   *** DEADLOCK ***
   context A
       [S] (unknown)(<sched>:0)
       [W] lock(&mm->mmap_lock:0)
       [E] try_to_wake_up(<sched>:0)

   context B
       [S] lock(&mm->mmap_lock:0)
       [W] mmu_interval_read_begin(<sched>:0)
       [E] unlock(&mm->mmap_lock:0)

   [S]: start of the event context
   [W]: the wait blocked
   [E]: the event not reachable

dept already tracks dependencies between scheduler sleep and ttwu based
on internal timestamp called wgen.  However, in case that more than one
event contexts are overwrapped, dept has chance to wrongly guess the
start of the event context like the following:

   <before this patch>

   context A: lock L
   context A: mmu_notifier_invalidate_range_start()

   context B: lock L'
   context B: mmu_interval_read_begin() : wait
   <- here is the start of the event context of C.
   context B: unlock L'

   context C: lock L''
   context C: mmu_notifier_invalidate_range_start()

   context A: mmu_notifier_invalidate_range_end()
   context A: unlock L

   context C: mmu_notifier_invalidate_range_end() : ttwu
   <- here is the end of the event context of C.  dept observes a wait,
      lock L'' within the event context of C.  Which causes a false
      positive dept report.

   context C: unlock L''

By explicitly annotating the interesting event context range, make dept
work with more precise information like:

   <after this patch>

   context A: lock L
   context A: mmu_notifier_invalidate_range_start()

   context B: lock L'
   context B: mmu_interval_read_begin() : wait
   context B: unlock L'

   context C: lock L''
   context C: mmu_notifier_invalidate_range_start()
   <- here is the start of the event context of C.

   context A: mmu_notifier_invalidate_range_end()
   context A: unlock L

   context C: mmu_notifier_invalidate_range_end() : ttwu
   <- here is the end of the event context of C.  dept doesn't observe
      the wait, lock L'' within the event context of C.  context C is
      responsible only for the range delimited by
      mmu_notifier_invalidate_range_{start,end}().

   context C: unlock L''

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mmu_notifier.h | 26 ++++++++++++++++++++++++++
 mm/mmu_notifier.c            | 31 +++++++++++++++++++++++++++++--
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index d1094c2d5fb6..2b70dce149f0 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -428,6 +428,14 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm,
 	return 0;
 }
 
+#ifdef CONFIG_DEPT
+void mmu_notifier_invalidate_dept_ecxt_start(struct mmu_notifier_range *range);
+void mmu_notifier_invalidate_dept_ecxt_end(struct mmu_notifier_range *range);
+#else
+static inline void mmu_notifier_invalidate_dept_ecxt_start(struct mmu_notifier_range *range) {}
+static inline void mmu_notifier_invalidate_dept_ecxt_end(struct mmu_notifier_range *range) {}
+#endif
+
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
@@ -439,6 +447,12 @@ mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 		__mmu_notifier_invalidate_range_start(range);
 	}
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+
+	/*
+	 * From now on, waiters could be there by this start until
+	 * mmu_notifier_invalidate_range_end().
+	 */
+	mmu_notifier_invalidate_dept_ecxt_start(range);
 }
 
 /*
@@ -459,6 +473,12 @@ mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
 		ret = __mmu_notifier_invalidate_range_start(range);
 	}
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+
+	/*
+	 * From now on, waiters could be there by this start until
+	 * mmu_notifier_invalidate_range_end().
+	 */
+	mmu_notifier_invalidate_dept_ecxt_start(range);
 	return ret;
 }
 
@@ -470,6 +490,12 @@ mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range)
 
 	if (mm_has_notifiers(range->mm))
 		__mmu_notifier_invalidate_range_end(range);
+
+	/*
+	 * The event context that has been started by
+	 * mmu_notifier_invalidate_range_start() ends.
+	 */
+	mmu_notifier_invalidate_dept_ecxt_end(range);
 }
 
 static inline void mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 8e0125dc0522..31af5ea54a0c 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -46,6 +46,7 @@ struct mmu_notifier_subscriptions {
 	unsigned long active_invalidate_ranges;
 	struct rb_root_cached itree;
 	wait_queue_head_t wq;
+	struct dept_map dmap;
 	struct hlist_head deferred_list;
 };
 
@@ -165,6 +166,25 @@ static void mn_itree_inv_end(struct mmu_notifier_subscriptions *subscriptions)
 	wake_up_all(&subscriptions->wq);
 }
 
+#ifdef CONFIG_DEPT
+void mmu_notifier_invalidate_dept_ecxt_start(struct mmu_notifier_range *range)
+{
+	struct mmu_notifier_subscriptions *subscriptions =
+		range->mm->notifier_subscriptions;
+
+	if (subscriptions)
+		sdt_ecxt_enter(&subscriptions->dmap);
+}
+void mmu_notifier_invalidate_dept_ecxt_end(struct mmu_notifier_range *range)
+{
+	struct mmu_notifier_subscriptions *subscriptions =
+		range->mm->notifier_subscriptions;
+
+	if (subscriptions)
+		sdt_ecxt_exit(&subscriptions->dmap);
+}
+#endif
+
 /**
  * mmu_interval_read_begin - Begin a read side critical section against a VA
  *                           range
@@ -246,9 +266,12 @@ mmu_interval_read_begin(struct mmu_interval_notifier *interval_sub)
 	 */
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
-	if (is_invalidating)
+	if (is_invalidating) {
+		sdt_might_sleep_start(&subscriptions->dmap);
 		wait_event(subscriptions->wq,
 			   READ_ONCE(subscriptions->invalidate_seq) != seq);
+		sdt_might_sleep_end();
+	}
 
 	/*
 	 * Notice that mmu_interval_read_retry() can already be true at this
@@ -625,6 +648,7 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
 
 		INIT_HLIST_HEAD(&subscriptions->list);
 		spin_lock_init(&subscriptions->lock);
+		sdt_map_init(&subscriptions->dmap);
 		subscriptions->invalidate_seq = 2;
 		subscriptions->itree = RB_ROOT_CACHED;
 		init_waitqueue_head(&subscriptions->wq);
@@ -1070,9 +1094,12 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *interval_sub)
 	 */
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
-	if (seq)
+	if (seq) {
+		sdt_might_sleep_start(&subscriptions->dmap);
 		wait_event(subscriptions->wq,
 			   mmu_interval_seq_released(subscriptions, seq));
+		sdt_might_sleep_end();
+	}
 
 	/* pairs with mmgrab in mmu_interval_notifier_insert() */
 	mmdrop(mm);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 32/47] dept: assign unique dept_key to each distinct dma fence caller
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (30 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 31/47] dept: assign dept map to mmu notifier invalidation synchronization Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 33/47] dept: make dept aware of lockdep_set_lock_cmp_fn() annotation Byungchul Park
                   ` (14 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

dma fence can be used at various points in the code and it's very hard
to distinguish dma fences between different usages.  Using a single
dept_key for all the dma fences could trigger false positive reports.

Assign unique dept_key to each distinct dma fence wait to avoid false
positive reports.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/dma-buf/dma-fence.c | 18 ++++-----
 include/linux/dma-fence.h   | 74 +++++++++++++++++++++++++++++--------
 2 files changed, 68 insertions(+), 24 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0a4b519e3351..df52a7ae336a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -481,7 +481,7 @@ int dma_fence_signal(struct dma_fence *fence)
 EXPORT_SYMBOL(dma_fence_signal);
 
 /**
- * dma_fence_wait_timeout - sleep until the fence gets signaled
+ * __dma_fence_wait_timeout - sleep until the fence gets signaled
  * or until timeout elapses
  * @fence: the fence to wait on
  * @intr: if true, do an interruptible wait
@@ -499,7 +499,7 @@ EXPORT_SYMBOL(dma_fence_signal);
  * See also dma_fence_wait() and dma_fence_wait_any_timeout().
  */
 signed long
-dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
+__dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
 {
 	signed long ret;
 
@@ -528,7 +528,7 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
 	}
 	return ret;
 }
-EXPORT_SYMBOL(dma_fence_wait_timeout);
+EXPORT_SYMBOL(__dma_fence_wait_timeout);
 
 /**
  * dma_fence_release - default release function for fences
@@ -764,7 +764,7 @@ dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 }
 
 /**
- * dma_fence_default_wait - default sleep until the fence gets signaled
+ * __dma_fence_default_wait - default sleep until the fence gets signaled
  * or until timeout elapses
  * @fence: the fence to wait on
  * @intr: if true, do an interruptible wait
@@ -776,7 +776,7 @@ dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
  * functions taking a jiffies timeout.
  */
 signed long
-dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
+__dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 {
 	struct default_wait_cb cb;
 	unsigned long flags;
@@ -825,7 +825,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 	spin_unlock_irqrestore(fence->lock, flags);
 	return ret;
 }
-EXPORT_SYMBOL(dma_fence_default_wait);
+EXPORT_SYMBOL(__dma_fence_default_wait);
 
 static bool
 dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
@@ -845,7 +845,7 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
 }
 
 /**
- * dma_fence_wait_any_timeout - sleep until any fence gets signaled
+ * __dma_fence_wait_any_timeout - sleep until any fence gets signaled
  * or until timeout elapses
  * @fences: array of fences to wait on
  * @count: number of fences to wait on
@@ -865,7 +865,7 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
  * See also dma_fence_wait() and dma_fence_wait_timeout().
  */
 signed long
-dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
+__dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 			   bool intr, signed long timeout, uint32_t *idx)
 {
 	struct default_wait_cb *cb;
@@ -933,7 +933,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 
 	return ret;
 }
-EXPORT_SYMBOL(dma_fence_wait_any_timeout);
+EXPORT_SYMBOL(__dma_fence_wait_any_timeout);
 
 /**
  * DOC: deadline hints
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 64639e104110..1062fbb637e5 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -369,8 +369,22 @@ int dma_fence_signal_locked(struct dma_fence *fence);
 int dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp);
 int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
 				      ktime_t timestamp);
-signed long dma_fence_default_wait(struct dma_fence *fence,
+signed long __dma_fence_default_wait(struct dma_fence *fence,
 				   bool intr, signed long timeout);
+
+/*
+ * Associate every caller with its own dept map.
+ */
+#define dma_fence_default_wait(f, intr, t)				\
+({									\
+	signed long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __dma_fence_default_wait(f, intr, t);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+
 int dma_fence_add_callback(struct dma_fence *fence,
 			   struct dma_fence_cb *cb,
 			   dma_fence_func_t func);
@@ -607,12 +621,37 @@ static inline ktime_t dma_fence_timestamp(struct dma_fence *fence)
 	return fence->timestamp;
 }
 
-signed long dma_fence_wait_timeout(struct dma_fence *,
+signed long __dma_fence_wait_timeout(struct dma_fence *,
 				   bool intr, signed long timeout);
-signed long dma_fence_wait_any_timeout(struct dma_fence **fences,
+signed long __dma_fence_wait_any_timeout(struct dma_fence **fences,
 				       uint32_t count,
 				       bool intr, signed long timeout,
 				       uint32_t *idx);
+/*
+ * Associate every caller with its own dept map.
+ */
+#define dma_fence_wait_timeout(f, intr, t)				\
+({									\
+	signed long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __dma_fence_wait_timeout(f, intr, t);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+
+/*
+ * Associate every caller with its own dept map.
+ */
+#define dma_fence_wait_any_timeout(fpp, count, intr, t, idx)		\
+({									\
+	signed long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __dma_fence_wait_any_timeout(fpp, count, intr, t, idx);	\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
 
 /**
  * dma_fence_wait - sleep until the fence gets signaled
@@ -628,19 +667,24 @@ signed long dma_fence_wait_any_timeout(struct dma_fence **fences,
  * fence might be freed before return, resulting in undefined behavior.
  *
  * See also dma_fence_wait_timeout() and dma_fence_wait_any_timeout().
+ *
+ * Associate every caller with its own dept map.
  */
-static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
-{
-	signed long ret;
-
-	/* Since dma_fence_wait_timeout cannot timeout with
-	 * MAX_SCHEDULE_TIMEOUT, only valid return values are
-	 * -ERESTARTSYS and MAX_SCHEDULE_TIMEOUT.
-	 */
-	ret = dma_fence_wait_timeout(fence, intr, MAX_SCHEDULE_TIMEOUT);
-
-	return ret < 0 ? ret : 0;
-}
+#define dma_fence_wait(f, intr)						\
+({									\
+	signed long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, MAX_SCHEDULE_TIMEOUT);	\
+	__ret = __dma_fence_wait_timeout(f, intr, MAX_SCHEDULE_TIMEOUT);\
+	sdt_might_sleep_end();						\
+									\
+	/*								\
+	 * Since dma_fence_wait_timeout cannot timeout with		\
+	 * MAX_SCHEDULE_TIMEOUT, only valid return values are		\
+	 * -ERESTARTSYS and MAX_SCHEDULE_TIMEOUT.			\
+	 */								\
+	__ret < 0 ? __ret : 0;						\
+})
 
 void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 33/47] dept: make dept aware of lockdep_set_lock_cmp_fn() annotation
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (31 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 32/47] dept: assign unique dept_key to each distinct dma fence caller Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 34/47] dept: make dept stop from working on debug_locks_off() Byungchul Park
                   ` (13 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

commit eb1cfd09f788e ("lockdep: Add lock_set_cmp_fn() annotation") has
been added to address the issue that lockdep was not able to detect a
true deadlock like the following:

   https://lore.kernel.org/lkml/20220510232633.GA18445@X58A-UD3R/

The approach is only for lockdep but dept should work being aware of it
because the new annotation is already used to avoid false positive of
lockdep in the code.

Make dept aware of the new lockdep annotation.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 10 +++++++++
 kernel/dependency/dept.c | 48 +++++++++++++++++++++++++++++++++++-----
 kernel/locking/lockdep.c |  1 +
 3 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index b6dc4ff19537..8b4d97fc4627 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -130,6 +130,11 @@ struct dept_map {
 	const char			*name;
 	struct dept_key			*keys;
 
+	/*
+	 * keep lockdep map to handle lockdep_set_lock_cmp_fn().
+	 */
+	void				*lockdep_map;
+
 	/*
 	 * subclass that can be set from user
 	 */
@@ -156,6 +161,7 @@ struct dept_map {
 {									\
 	.name = #n,							\
 	.keys = (struct dept_key *)(k),					\
+	.lockdep_map = NULL,						\
 	.sub_u = 0,							\
 	.map_key = { .classes = { NULL, } },				\
 	.wgen = 0U,							\
@@ -427,6 +433,8 @@ extern void dept_softirqs_on_ip(unsigned long ip);
 extern void dept_hardirqs_on(void);
 extern void dept_softirqs_off(void);
 extern void dept_hardirqs_off(void);
+
+#define dept_set_lockdep_map(m, lockdep_m) ({ (m)->lockdep_map = lockdep_m; })
 #else /* !CONFIG_DEPT */
 struct dept_key { };
 struct dept_map { };
@@ -469,5 +477,7 @@ struct dept_ext_wgen { };
 #define dept_hardirqs_on()				do { } while (0)
 #define dept_softirqs_off()				do { } while (0)
 #define dept_hardirqs_off()				do { } while (0)
+
+#define dept_set_lockdep_map(m, lockdep_m)		do { } while (0)
 #endif
 #endif /* __LINUX_DEPT_H */
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 1c6eb0a6d0a6..a0eb4b108de0 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1615,9 +1615,39 @@ static int next_wgen(void)
 	return atomic_inc_return(&wgen) ?: atomic_inc_return(&wgen);
 }
 
-static void add_wait(struct dept_class *c, unsigned long ip,
-		     const char *w_fn, int sub_l, bool sched_sleep,
-		     bool timeout)
+/*
+ * XXX: This is a temporary patch needed until lockdep stops tracking
+ * dependency in wrong way.  lockdep has added an annotation to specify
+ * a callback to determin whether the given lock aquisition order is
+ * okay or not in its own way.  Even though dept is already working
+ * correctly with sub class on that issue, it needs to be aware of the
+ * annotation anyway.
+ */
+static bool lockdep_cmp_fn(struct dept_map *prev, struct dept_map *next)
+{
+	/*
+	 * Assumes the cmp_fn thing comes from struct lockdep_map.
+	 */
+	struct lockdep_map *p_lock = (struct lockdep_map *)prev->lockdep_map;
+	struct lockdep_map *n_lock = (struct lockdep_map *)next->lockdep_map;
+	struct lock_class *p_class = p_lock ? p_lock->class_cache[0] : NULL;
+	struct lock_class *n_class = n_lock ? n_lock->class_cache[0] : NULL;
+
+	if (!p_class || !n_class)
+		return false;
+
+	if (p_class != n_class)
+		return false;
+
+	if (!p_class->cmp_fn)
+		return false;
+
+	return p_class->cmp_fn(p_lock, n_lock) < 0;
+}
+
+static void add_wait(struct dept_map *m, struct dept_class *c,
+		unsigned long ip, const char *w_fn, int sub_l,
+		bool sched_sleep, bool timeout)
 {
 	struct dept_task *dt = dept_task();
 	struct dept_wait *w;
@@ -1658,8 +1688,13 @@ static void add_wait(struct dept_class *c, unsigned long ip,
 		if (!eh->ecxt)
 			continue;
 
-		if (eh->ecxt->class != c || eh->sub_l == sub_l)
-			add_dep(eh->ecxt, w);
+		if (eh->ecxt->class == c && eh->sub_l != sub_l)
+			continue;
+
+		if (i == dt->ecxt_held_pos - 1 && lockdep_cmp_fn(eh->map, m))
+			continue;
+
+		add_dep(eh->ecxt, w);
 	}
 
 	wg = next_wgen();
@@ -2145,6 +2180,7 @@ void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u,
 	m->name = n;
 	m->wgen = 0U;
 	m->nocheck = !valid_key(k);
+	m->lockdep_map = NULL;
 
 	dept_exit_recursive(flags);
 }
@@ -2366,7 +2402,7 @@ static void __dept_wait(struct dept_map *m, unsigned long w_f,
 		if (!c)
 			continue;
 
-		add_wait(c, ip, w_fn, sub_l, sched_sleep, timeout);
+		add_wait(m, c, ip, w_fn, sub_l, sched_sleep, timeout);
 	}
 }
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 39b9e3e27c0b..c99f91f7a54d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5035,6 +5035,7 @@ void lockdep_set_lock_cmp_fn(struct lockdep_map *lock, lock_cmp_fn cmp_fn,
 		class->print_fn = print_fn;
 	}
 
+	dept_set_lockdep_map(&lock->dmap, lock);
 	lockdep_recursion_finish();
 	raw_local_irq_restore(flags);
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 34/47] dept: make dept stop from working on debug_locks_off()
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (32 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 33/47] dept: make dept aware of lockdep_set_lock_cmp_fn() annotation Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb Byungchul Park
                   ` (12 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

For many reasons, debug_locks_off() is called to stop lock debuging
feature e.g. on panic().  dept should also stop it in the conditions.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 2 ++
 kernel/dependency/dept.c | 6 ++++++
 lib/debug_locks.c        | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 8b4d97fc4627..b164f74e86e5 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -390,6 +390,7 @@ struct dept_ext_wgen {
 	unsigned int wgen;
 };
 
+extern void dept_stop_emerg(void);
 extern void dept_on(void);
 extern void dept_off(void);
 extern void dept_init(void);
@@ -442,6 +443,7 @@ struct dept_ext_wgen { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
 
+#define dept_stop_emerg()				do { } while (0)
 #define dept_on()					do { } while (0)
 #define dept_off()					do { } while (0)
 #define dept_init()					do { } while (0)
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index a0eb4b108de0..1de61306418b 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -187,6 +187,12 @@ static void dept_unlock(void)
 	arch_spin_unlock(&dept_spin);
 }
 
+void dept_stop_emerg(void)
+{
+	WRITE_ONCE(dept_stop, 1);
+}
+EXPORT_SYMBOL_GPL(dept_stop_emerg);
+
 enum bfs_ret {
 	BFS_CONTINUE,
 	BFS_DONE,
diff --git a/lib/debug_locks.c b/lib/debug_locks.c
index a75ee30b77cb..14a965914a8f 100644
--- a/lib/debug_locks.c
+++ b/lib/debug_locks.c
@@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(debug_locks_silent);
  */
 int debug_locks_off(void)
 {
+	dept_stop_emerg();
+
 	if (debug_locks && __debug_locks_off()) {
 		if (!debug_locks_silent) {
 			console_verbose();
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (33 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 34/47] dept: make dept stop from working on debug_locks_off() Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-04 16:39   ` Wolfram Sang
  2025-10-02  8:12 ` [PATCH v17 36/47] dept: assign unique dept_key to each distinct wait_for_completion() caller Byungchul Park
                   ` (11 subsequent siblings)
  46 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Functionally no change.  This patch is a preparation for DEPT(DEPendency
Tracker) to track dependencies related to a scheduler API,
wait_for_completion().

Unfortunately, struct i2c_algo_pca_data has a callback member named
wait_for_completion, that is the same as the scheduler API, which makes
it hard to change the scheduler API to a macro form because of the
ambiguity.

Add a postfix _cb to the callback member to remove the ambiguity.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/i2c/algos/i2c-algo-pca.c      | 2 +-
 drivers/i2c/busses/i2c-pca-isa.c      | 2 +-
 drivers/i2c/busses/i2c-pca-platform.c | 2 +-
 include/linux/i2c-algo-pca.h          | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/i2c/algos/i2c-algo-pca.c b/drivers/i2c/algos/i2c-algo-pca.c
index 74b66aec33d4..ee86df4cff4b 100644
--- a/drivers/i2c/algos/i2c-algo-pca.c
+++ b/drivers/i2c/algos/i2c-algo-pca.c
@@ -30,7 +30,7 @@ static int i2c_debug;
 #define pca_clock(adap) adap->i2c_clock
 #define pca_set_con(adap, val) pca_outw(adap, I2C_PCA_CON, val)
 #define pca_get_con(adap) pca_inw(adap, I2C_PCA_CON)
-#define pca_wait(adap) adap->wait_for_completion(adap->data)
+#define pca_wait(adap) adap->wait_for_completion_cb(adap->data)
 
 static void pca_reset(struct i2c_algo_pca_data *adap)
 {
diff --git a/drivers/i2c/busses/i2c-pca-isa.c b/drivers/i2c/busses/i2c-pca-isa.c
index 85e8cf58e8bf..0cbf2f509527 100644
--- a/drivers/i2c/busses/i2c-pca-isa.c
+++ b/drivers/i2c/busses/i2c-pca-isa.c
@@ -95,7 +95,7 @@ static struct i2c_algo_pca_data pca_isa_data = {
 	/* .data intentionally left NULL, not needed with ISA */
 	.write_byte		= pca_isa_writebyte,
 	.read_byte		= pca_isa_readbyte,
-	.wait_for_completion	= pca_isa_waitforcompletion,
+	.wait_for_completion_cb	= pca_isa_waitforcompletion,
 	.reset_chip		= pca_isa_resetchip,
 };
 
diff --git a/drivers/i2c/busses/i2c-pca-platform.c b/drivers/i2c/busses/i2c-pca-platform.c
index 87da8241b927..c0f35ebbe37d 100644
--- a/drivers/i2c/busses/i2c-pca-platform.c
+++ b/drivers/i2c/busses/i2c-pca-platform.c
@@ -180,7 +180,7 @@ static int i2c_pca_pf_probe(struct platform_device *pdev)
 	}
 
 	i2c->algo_data.data = i2c;
-	i2c->algo_data.wait_for_completion = i2c_pca_pf_waitforcompletion;
+	i2c->algo_data.wait_for_completion_cb = i2c_pca_pf_waitforcompletion;
 	if (i2c->gpio)
 		i2c->algo_data.reset_chip = i2c_pca_pf_resetchip;
 	else
diff --git a/include/linux/i2c-algo-pca.h b/include/linux/i2c-algo-pca.h
index 7c522fdd9ea7..e305bf32e40a 100644
--- a/include/linux/i2c-algo-pca.h
+++ b/include/linux/i2c-algo-pca.h
@@ -71,7 +71,7 @@ struct i2c_algo_pca_data {
 	void 				*data;	/* private low level data */
 	void (*write_byte)		(void *data, int reg, int val);
 	int  (*read_byte)		(void *data, int reg);
-	int  (*wait_for_completion)	(void *data);
+	int  (*wait_for_completion_cb)	(void *data);
 	void (*reset_chip)		(void *data);
 	/* For PCA9564, use one of the predefined frequencies:
 	 * 330000, 288000, 217000, 146000, 88000, 59000, 44000, 36000
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 36/47] dept: assign unique dept_key to each distinct wait_for_completion() caller
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (34 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 37/47] completion, dept: introduce init_completion_dmap() API Byungchul Park
                   ` (10 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

wait_for_completion() can be used at various points in the code and it's
very hard to distinguish wait_for_completion()s between different usages.
Using a single dept_key for all the wait_for_completion()s could trigger
false positive reports.

Assign unique dept_key to each distinct wait_for_completion() caller to
avoid false positive reports.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/completion.h | 100 +++++++++++++++++++++++++++++++------
 kernel/sched/completion.c  |  60 +++++++++++-----------
 2 files changed, 115 insertions(+), 45 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 3200b741de28..4d8fb1d95c0a 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -27,12 +27,10 @@
 struct completion {
 	unsigned int done;
 	struct swait_queue_head wait;
-	struct dept_map dmap;
 };
 
 #define init_completion(x)				\
 do {							\
-	sdt_map_init(&(x)->dmap);			\
 	__init_completion(x);				\
 } while (0)
 
@@ -43,17 +41,14 @@ do {							\
 
 static inline void complete_acquire(struct completion *x, long timeout)
 {
-	sdt_might_sleep_start_timeout(&x->dmap, timeout);
 }
 
 static inline void complete_release(struct completion *x)
 {
-	sdt_might_sleep_end();
 }
 
 #define COMPLETION_INITIALIZER(work) \
-	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
-	  .dmap = DEPT_MAP_INITIALIZER(work, NULL), }
+	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), }
 
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
 	(*({ init_completion_map(&(work), &(map)); &(work); }))
@@ -119,18 +114,18 @@ static inline void reinit_completion(struct completion *x)
 	x->done = 0;
 }
 
-extern void wait_for_completion(struct completion *);
-extern void wait_for_completion_io(struct completion *);
-extern int wait_for_completion_interruptible(struct completion *x);
-extern int wait_for_completion_killable(struct completion *x);
-extern int wait_for_completion_state(struct completion *x, unsigned int state);
-extern unsigned long wait_for_completion_timeout(struct completion *x,
+extern void __wait_for_completion(struct completion *);
+extern void __wait_for_completion_io(struct completion *);
+extern int __wait_for_completion_interruptible(struct completion *x);
+extern int __wait_for_completion_killable(struct completion *x);
+extern int __wait_for_completion_state(struct completion *x, unsigned int state);
+extern unsigned long __wait_for_completion_timeout(struct completion *x,
 						   unsigned long timeout);
-extern unsigned long wait_for_completion_io_timeout(struct completion *x,
+extern unsigned long __wait_for_completion_io_timeout(struct completion *x,
 						    unsigned long timeout);
-extern long wait_for_completion_interruptible_timeout(
+extern long __wait_for_completion_interruptible_timeout(
 	struct completion *x, unsigned long timeout);
-extern long wait_for_completion_killable_timeout(
+extern long __wait_for_completion_killable_timeout(
 	struct completion *x, unsigned long timeout);
 extern bool try_wait_for_completion(struct completion *x);
 extern bool completion_done(struct completion *x);
@@ -139,4 +134,79 @@ extern void complete(struct completion *);
 extern void complete_on_current_cpu(struct completion *x);
 extern void complete_all(struct completion *);
 
+#define wait_for_completion(x)						\
+({									\
+	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	__wait_for_completion(x);					\
+	sdt_might_sleep_end();						\
+})
+#define wait_for_completion_io(x)					\
+({									\
+	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	__wait_for_completion_io(x);					\
+	sdt_might_sleep_end();						\
+})
+#define wait_for_completion_interruptible(x)				\
+({									\
+	int __ret;							\
+									\
+	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	__ret = __wait_for_completion_interruptible(x);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_killable(x)					\
+({									\
+	int __ret;							\
+									\
+	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	__ret = __wait_for_completion_killable(x);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_state(x, s)					\
+({									\
+	int __ret;							\
+									\
+	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	__ret = __wait_for_completion_state(x, s);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_timeout(x, t)				\
+({									\
+	unsigned long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __wait_for_completion_timeout(x, t);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_io_timeout(x, t)				\
+({									\
+	unsigned long __ret;						\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __wait_for_completion_io_timeout(x, t);			\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_interruptible_timeout(x, t)			\
+({									\
+	long __ret;							\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __wait_for_completion_interruptible_timeout(x, t);	\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
+#define wait_for_completion_killable_timeout(x, t)			\
+({									\
+	long __ret;							\
+									\
+	sdt_might_sleep_start_timeout(NULL, t);				\
+	__ret = __wait_for_completion_killable_timeout(x, t);		\
+	sdt_might_sleep_end();						\
+	__ret;								\
+})
 #endif
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 5e45a60ff7b3..7262000db114 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -4,7 +4,7 @@
  * Generic wait-for-completion handler;
  *
  * It differs from semaphores in that their default case is the opposite,
- * wait_for_completion default blocks whereas semaphore default non-block. The
+ * __wait_for_completion default blocks whereas semaphore default non-block. The
  * interface also makes it easy to 'complete' multiple waiting threads,
  * something which isn't entirely natural for semaphores.
  *
@@ -42,7 +42,7 @@ void complete_on_current_cpu(struct completion *x)
  * This will wake up a single thread waiting on this completion. Threads will be
  * awakened in the same order in which they were queued.
  *
- * See also complete_all(), wait_for_completion() and related routines.
+ * See also complete_all(), __wait_for_completion() and related routines.
  *
  * If this function wakes up a task, it executes a full memory barrier before
  * accessing the task state.
@@ -139,23 +139,23 @@ wait_for_common_io(struct completion *x, long timeout, int state)
 }
 
 /**
- * wait_for_completion: - waits for completion of a task
+ * __wait_for_completion: - waits for completion of a task
  * @x:  holds the state of this particular completion
  *
  * This waits to be signaled for completion of a specific task. It is NOT
  * interruptible and there is no timeout.
  *
- * See also similar routines (i.e. wait_for_completion_timeout()) with timeout
+ * See also similar routines (i.e. __wait_for_completion_timeout()) with timeout
  * and interrupt capability. Also see complete().
  */
-void __sched wait_for_completion(struct completion *x)
+void __sched __wait_for_completion(struct completion *x)
 {
 	wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
 }
-EXPORT_SYMBOL(wait_for_completion);
+EXPORT_SYMBOL(__wait_for_completion);
 
 /**
- * wait_for_completion_timeout: - waits for completion of a task (w/timeout)
+ * __wait_for_completion_timeout: - waits for completion of a task (w/timeout)
  * @x:  holds the state of this particular completion
  * @timeout:  timeout value in jiffies
  *
@@ -167,28 +167,28 @@ EXPORT_SYMBOL(wait_for_completion);
  * till timeout) if completed.
  */
 unsigned long __sched
-wait_for_completion_timeout(struct completion *x, unsigned long timeout)
+__wait_for_completion_timeout(struct completion *x, unsigned long timeout)
 {
 	return wait_for_common(x, timeout, TASK_UNINTERRUPTIBLE);
 }
-EXPORT_SYMBOL(wait_for_completion_timeout);
+EXPORT_SYMBOL(__wait_for_completion_timeout);
 
 /**
- * wait_for_completion_io: - waits for completion of a task
+ * __wait_for_completion_io: - waits for completion of a task
  * @x:  holds the state of this particular completion
  *
  * This waits to be signaled for completion of a specific task. It is NOT
  * interruptible and there is no timeout. The caller is accounted as waiting
  * for IO (which traditionally means blkio only).
  */
-void __sched wait_for_completion_io(struct completion *x)
+void __sched __wait_for_completion_io(struct completion *x)
 {
 	wait_for_common_io(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
 }
-EXPORT_SYMBOL(wait_for_completion_io);
+EXPORT_SYMBOL(__wait_for_completion_io);
 
 /**
- * wait_for_completion_io_timeout: - waits for completion of a task (w/timeout)
+ * __wait_for_completion_io_timeout: - waits for completion of a task (w/timeout)
  * @x:  holds the state of this particular completion
  * @timeout:  timeout value in jiffies
  *
@@ -201,14 +201,14 @@ EXPORT_SYMBOL(wait_for_completion_io);
  * till timeout) if completed.
  */
 unsigned long __sched
-wait_for_completion_io_timeout(struct completion *x, unsigned long timeout)
+__wait_for_completion_io_timeout(struct completion *x, unsigned long timeout)
 {
 	return wait_for_common_io(x, timeout, TASK_UNINTERRUPTIBLE);
 }
-EXPORT_SYMBOL(wait_for_completion_io_timeout);
+EXPORT_SYMBOL(__wait_for_completion_io_timeout);
 
 /**
- * wait_for_completion_interruptible: - waits for completion of a task (w/intr)
+ * __wait_for_completion_interruptible: - waits for completion of a task (w/intr)
  * @x:  holds the state of this particular completion
  *
  * This waits for completion of a specific task to be signaled. It is
@@ -216,7 +216,7 @@ EXPORT_SYMBOL(wait_for_completion_io_timeout);
  *
  * Return: -ERESTARTSYS if interrupted, 0 if completed.
  */
-int __sched wait_for_completion_interruptible(struct completion *x)
+int __sched __wait_for_completion_interruptible(struct completion *x)
 {
 	long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_INTERRUPTIBLE);
 
@@ -224,10 +224,10 @@ int __sched wait_for_completion_interruptible(struct completion *x)
 		return t;
 	return 0;
 }
-EXPORT_SYMBOL(wait_for_completion_interruptible);
+EXPORT_SYMBOL(__wait_for_completion_interruptible);
 
 /**
- * wait_for_completion_interruptible_timeout: - waits for completion (w/(to,intr))
+ * __wait_for_completion_interruptible_timeout: - waits for completion (w/(to,intr))
  * @x:  holds the state of this particular completion
  * @timeout:  timeout value in jiffies
  *
@@ -238,15 +238,15 @@ EXPORT_SYMBOL(wait_for_completion_interruptible);
  * or number of jiffies left till timeout) if completed.
  */
 long __sched
-wait_for_completion_interruptible_timeout(struct completion *x,
+__wait_for_completion_interruptible_timeout(struct completion *x,
 					  unsigned long timeout)
 {
 	return wait_for_common(x, timeout, TASK_INTERRUPTIBLE);
 }
-EXPORT_SYMBOL(wait_for_completion_interruptible_timeout);
+EXPORT_SYMBOL(__wait_for_completion_interruptible_timeout);
 
 /**
- * wait_for_completion_killable: - waits for completion of a task (killable)
+ * __wait_for_completion_killable: - waits for completion of a task (killable)
  * @x:  holds the state of this particular completion
  *
  * This waits to be signaled for completion of a specific task. It can be
@@ -254,7 +254,7 @@ EXPORT_SYMBOL(wait_for_completion_interruptible_timeout);
  *
  * Return: -ERESTARTSYS if interrupted, 0 if completed.
  */
-int __sched wait_for_completion_killable(struct completion *x)
+int __sched __wait_for_completion_killable(struct completion *x)
 {
 	long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_KILLABLE);
 
@@ -262,9 +262,9 @@ int __sched wait_for_completion_killable(struct completion *x)
 		return t;
 	return 0;
 }
-EXPORT_SYMBOL(wait_for_completion_killable);
+EXPORT_SYMBOL(__wait_for_completion_killable);
 
-int __sched wait_for_completion_state(struct completion *x, unsigned int state)
+int __sched __wait_for_completion_state(struct completion *x, unsigned int state)
 {
 	long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, state);
 
@@ -272,10 +272,10 @@ int __sched wait_for_completion_state(struct completion *x, unsigned int state)
 		return t;
 	return 0;
 }
-EXPORT_SYMBOL(wait_for_completion_state);
+EXPORT_SYMBOL(__wait_for_completion_state);
 
 /**
- * wait_for_completion_killable_timeout: - waits for completion of a task (w/(to,killable))
+ * __wait_for_completion_killable_timeout: - waits for completion of a task (w/(to,killable))
  * @x:  holds the state of this particular completion
  * @timeout:  timeout value in jiffies
  *
@@ -287,12 +287,12 @@ EXPORT_SYMBOL(wait_for_completion_state);
  * or number of jiffies left till timeout) if completed.
  */
 long __sched
-wait_for_completion_killable_timeout(struct completion *x,
+__wait_for_completion_killable_timeout(struct completion *x,
 				     unsigned long timeout)
 {
 	return wait_for_common(x, timeout, TASK_KILLABLE);
 }
-EXPORT_SYMBOL(wait_for_completion_killable_timeout);
+EXPORT_SYMBOL(__wait_for_completion_killable_timeout);
 
 /**
  *	try_wait_for_completion - try to decrement a completion without blocking
@@ -334,7 +334,7 @@ EXPORT_SYMBOL(try_wait_for_completion);
  *	completion_done - Test to see if a completion has any waiters
  *	@x:	completion structure
  *
- *	Return: 0 if there are waiters (wait_for_completion() in progress)
+ *	Return: 0 if there are waiters (__wait_for_completion() in progress)
  *		 1 if there are no waiters.
  *
  *	Note, this will always return true if complete_all() was called on @X.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 37/47] completion, dept: introduce init_completion_dmap() API
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (35 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 36/47] dept: assign unique dept_key to each distinct wait_for_completion() caller Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 38/47] dept: introduce a new type of dependency tracking between multi event sites Byungchul Park
                   ` (9 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Currently, dept uses dept's map embedded in task_struct to track
dependencies related to wait_for_completion() and its family.  So it
doesn't need an explicit map basically.

However, for those who want to set the maps with customized class or
key, introduce a new API to use external maps.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/completion.h | 40 +++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 4d8fb1d95c0a..e50f7d9b4b97 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -27,17 +27,15 @@
 struct completion {
 	unsigned int done;
 	struct swait_queue_head wait;
+	struct dept_map *dmap;
 };
 
-#define init_completion(x)				\
-do {							\
-	__init_completion(x);				\
-} while (0)
+#define init_completion(x) init_completion_dmap(x, NULL)
 
 /*
- * XXX: No use cases for now. Fill the body when needed.
+ * XXX: This usage using lockdep's map should be deprecated.
  */
-#define init_completion_map(x, m) init_completion(x)
+#define init_completion_map(x, m) init_completion_dmap(x, NULL)
 
 static inline void complete_acquire(struct completion *x, long timeout)
 {
@@ -48,8 +46,11 @@ static inline void complete_release(struct completion *x)
 }
 
 #define COMPLETION_INITIALIZER(work) \
-	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), }
+	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), .dmap = NULL, }
 
+/*
+ * XXX: This usage using lockdep's map should be deprecated.
+ */
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
 	(*({ init_completion_map(&(work), &(map)); &(work); }))
 
@@ -90,15 +91,18 @@ static inline void complete_release(struct completion *x)
 #endif
 
 /**
- * __init_completion - Initialize a dynamically allocated completion
+ * init_completion_dmap - Initialize a dynamically allocated completion
  * @x:  pointer to completion structure that is to be initialized
+ * @dmap:  pointer to external dept's map to be used as a separated map
  *
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void __init_completion(struct completion *x)
+static inline void init_completion_dmap(struct completion *x,
+		struct dept_map *dmap)
 {
 	x->done = 0;
+	x->dmap = dmap;
 	init_swait_queue_head(&x->wait);
 }
 
@@ -136,13 +140,13 @@ extern void complete_all(struct completion *);
 
 #define wait_for_completion(x)						\
 ({									\
-	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	sdt_might_sleep_start_timeout((x)->dmap, -1L);			\
 	__wait_for_completion(x);					\
 	sdt_might_sleep_end();						\
 })
 #define wait_for_completion_io(x)					\
 ({									\
-	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	sdt_might_sleep_start_timeout((x)->dmap, -1L);			\
 	__wait_for_completion_io(x);					\
 	sdt_might_sleep_end();						\
 })
@@ -150,7 +154,7 @@ extern void complete_all(struct completion *);
 ({									\
 	int __ret;							\
 									\
-	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	sdt_might_sleep_start_timeout((x)->dmap, -1L);			\
 	__ret = __wait_for_completion_interruptible(x);			\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -159,7 +163,7 @@ extern void complete_all(struct completion *);
 ({									\
 	int __ret;							\
 									\
-	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	sdt_might_sleep_start_timeout((x)->dmap, -1L);			\
 	__ret = __wait_for_completion_killable(x);			\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -168,7 +172,7 @@ extern void complete_all(struct completion *);
 ({									\
 	int __ret;							\
 									\
-	sdt_might_sleep_start_timeout(NULL, -1L);			\
+	sdt_might_sleep_start_timeout((x)->dmap, -1L);			\
 	__ret = __wait_for_completion_state(x, s);			\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -177,7 +181,7 @@ extern void complete_all(struct completion *);
 ({									\
 	unsigned long __ret;						\
 									\
-	sdt_might_sleep_start_timeout(NULL, t);				\
+	sdt_might_sleep_start_timeout((x)->dmap, t);			\
 	__ret = __wait_for_completion_timeout(x, t);			\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -186,7 +190,7 @@ extern void complete_all(struct completion *);
 ({									\
 	unsigned long __ret;						\
 									\
-	sdt_might_sleep_start_timeout(NULL, t);				\
+	sdt_might_sleep_start_timeout((x)->dmap, t);			\
 	__ret = __wait_for_completion_io_timeout(x, t);			\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -195,7 +199,7 @@ extern void complete_all(struct completion *);
 ({									\
 	long __ret;							\
 									\
-	sdt_might_sleep_start_timeout(NULL, t);				\
+	sdt_might_sleep_start_timeout((x)->dmap, t);			\
 	__ret = __wait_for_completion_interruptible_timeout(x, t);	\
 	sdt_might_sleep_end();						\
 	__ret;								\
@@ -204,7 +208,7 @@ extern void complete_all(struct completion *);
 ({									\
 	long __ret;							\
 									\
-	sdt_might_sleep_start_timeout(NULL, t);				\
+	sdt_might_sleep_start_timeout((x)->dmap, t);			\
 	__ret = __wait_for_completion_killable_timeout(x, t);		\
 	sdt_might_sleep_end();						\
 	__ret;								\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 38/47] dept: introduce a new type of dependency tracking between multi event sites
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (36 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 37/47] completion, dept: introduce init_completion_dmap() API Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 39/47] dept: add module support for struct dept_event_site and dept_event_site_dep Byungchul Park
                   ` (8 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

It's worth reporting wait-event circular dependency even if it doesn't
lead to an actual deadlock, because it's a good information about a
circular dependency anyway.  However, it should be suppressed once
turning out it doesn't lead an actual deadlock, for instance, there are
other wake-up(or event) paths.

The report needs to be suppressed by annotating that an event can be
recovered by other sites triggering the desired wake-up, using a newly
introduced API, dept_recover_event() specifying an event site and its
recover site.

By the introduction, need of a new type of dependency tracking arises
since a loop of recover dependency could trigger another type of
deadlock.  So implement a logic to track the new type of dependency
between multi event sites for a single wait.

Lastly, to make sure that recover sites must be used in code, introduce
a section '.dept.event_sites' to mark it as 'used' only if used in code,
and warn it if dept_recover_event()s are annotated with recover sites,
not used in code.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/asm-generic/vmlinux.lds.h |  13 +-
 include/linux/dept.h              |  91 ++++++++++++++
 kernel/dependency/dept.c          | 196 ++++++++++++++++++++++++++++++
 3 files changed, 299 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ae2d2359b79e..704bb47ed843 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -700,6 +700,16 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 #define KERNEL_CTORS()
 #endif
 
+#ifdef CONFIG_DEPT
+#define DEPT_EVNET_SITES_USED()						\
+	. = ALIGN(8);							\
+	__dept_event_sites_start = .;					\
+	KEEP(*(.dept.event_sites))					\
+	__dept_event_sites_end = .;
+#else
+#define DEPT_EVNET_SITES_USED()
+#endif
+
 /* init and exit section handling */
 #define INIT_DATA							\
 	KEEP(*(SORT(___kentry+*)))					\
@@ -724,7 +734,8 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 	EARLYCON_TABLE()						\
 	LSM_TABLE()							\
 	EARLY_LSM_TABLE()						\
-	KUNIT_INIT_TABLE()
+	KUNIT_INIT_TABLE()						\
+	DEPT_EVNET_SITES_USED()
 
 #define INIT_TEXT							\
 	*(.init.text .init.text.*)					\
diff --git a/include/linux/dept.h b/include/linux/dept.h
index b164f74e86e5..988aceee36ad 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -390,6 +390,82 @@ struct dept_ext_wgen {
 	unsigned int wgen;
 };
 
+struct dept_event_site {
+	/*
+	 * event site name
+	 */
+	const char			*name;
+
+	/*
+	 * function name where the event is triggered in
+	 */
+	const char			*func_name;
+
+	/*
+	 * for associating its recover dependencies
+	 */
+	struct list_head		dep_head;
+	struct list_head		dep_rev_head;
+
+	/*
+	 * for BFS
+	 */
+	unsigned int			bfs_gen;
+	struct dept_event_site		*bfs_parent;
+	struct list_head		bfs_node;
+
+	/*
+	 * flag indicating the event is not only declared but also
+	 * actually used in code
+	 */
+	bool				used;
+};
+
+struct dept_event_site_dep {
+	struct dept_event_site		*evt_site;
+	struct dept_event_site		*recover_site;
+
+	/*
+	 * for linking to dept_event objects
+	 */
+	struct list_head		dep_node;
+	struct list_head		dep_rev_node;
+};
+
+#define DEPT_EVENT_SITE_INITIALIZER(es)					\
+{									\
+	.name = #es,							\
+	.func_name = NULL,						\
+	.dep_head = LIST_HEAD_INIT((es).dep_head),			\
+	.dep_rev_head = LIST_HEAD_INIT((es).dep_rev_head),		\
+	.bfs_gen = 0,							\
+	.bfs_parent = NULL,						\
+	.bfs_node = LIST_HEAD_INIT((es).bfs_node),			\
+	.used = false,							\
+}
+
+#define DEPT_EVENT_SITE_DEP_INITIALIZER(esd)				\
+{									\
+	.evt_site = NULL,						\
+	.recover_site = NULL,						\
+	.dep_node = LIST_HEAD_INIT((esd).dep_node),			\
+	.dep_rev_node = LIST_HEAD_INIT((esd).dep_rev_node),		\
+}
+
+struct dept_event_site_init {
+	struct dept_event_site *evt_site;
+	const char *func_name;
+};
+
+#define dept_event_site_used(es)					\
+do {									\
+	static struct dept_event_site_init _evtinit __initdata =	\
+		{ .evt_site = (es), .func_name = __func__ };		\
+	static struct dept_event_site_init *_evtinitp __used		\
+		__attribute__((__section__(".dept.event_sites"))) =	\
+		&_evtinit;						\
+} while (0)
+
 extern void dept_stop_emerg(void);
 extern void dept_on(void);
 extern void dept_off(void);
@@ -427,6 +503,14 @@ static inline void dept_ecxt_enter_nokeep(struct dept_map *m)
 extern void dept_key_init(struct dept_key *k);
 extern void dept_key_destroy(struct dept_key *k);
 extern void dept_map_ecxt_modify(struct dept_map *m, unsigned long e_f, struct dept_key *new_k, unsigned long new_e_f, unsigned long new_ip, const char *new_c_fn, const char *new_e_fn, int new_sub_l);
+extern void __dept_recover_event(struct dept_event_site_dep *esd, struct dept_event_site *es, struct dept_event_site *rs);
+
+#define dept_recover_event(es, rs)					\
+do {									\
+	static struct dept_event_site_dep _esd = DEPT_EVENT_SITE_DEP_INITIALIZER(_esd);\
+									\
+	__dept_recover_event(&_esd, es, rs);				\
+} while (0)
 
 extern void dept_softirq_enter(void);
 extern void dept_hardirq_enter(void);
@@ -440,8 +524,10 @@ extern void dept_hardirqs_off(void);
 struct dept_key { };
 struct dept_map { };
 struct dept_ext_wgen { };
+struct dept_event_site { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
+#define DEPT_EVENT_SITE_INITIALIZER(es) { }
 
 #define dept_stop_emerg()				do { } while (0)
 #define dept_on()					do { } while (0)
@@ -472,6 +558,7 @@ struct dept_ext_wgen { };
 #define dept_key_init(k)				do { (void)(k); } while (0)
 #define dept_key_destroy(k)				do { (void)(k); } while (0)
 #define dept_map_ecxt_modify(m, e_f, n_k, n_e_f, n_ip, n_c_fn, n_e_fn, n_sl) do { (void)(n_k); (void)(n_c_fn); (void)(n_e_fn); } while (0)
+#define dept_recover_event(es, rs)			do { } while (0)
 
 #define dept_softirq_enter()				do { } while (0)
 #define dept_hardirq_enter()				do { } while (0)
@@ -482,4 +569,8 @@ struct dept_ext_wgen { };
 
 #define dept_set_lockdep_map(m, lockdep_m)		do { } while (0)
 #endif
+
+#define DECLARE_DEPT_EVENT_SITE(es) extern struct dept_event_site (es)
+#define DEFINE_DEPT_EVENT_SITE(es) struct dept_event_site (es) = DEPT_EVENT_SITE_INITIALIZER(es)
+
 #endif /* __LINUX_DEPT_H */
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 1de61306418b..b14400c4f83b 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -973,6 +973,117 @@ static void bfs(void *root, struct bfs_ops *ops, void *in, void **out)
 	}
 }
 
+/*
+ * Recover dependency between event sites
+ * =====================================================================
+ * Even though an event is in a chain of wait-event circular dependency,
+ * the corresponding wait might be woken up by another site triggering
+ * the desired event.  To reflect that, dept allows to annotate the
+ * recover relationship between event sites using __dept_recover_event().
+ * However, that requires to track a new type of dependency between the
+ * event sites.
+ */
+
+/*
+ * Print all events in the circle.
+ */
+static void print_recover_circle(struct dept_event_site *es)
+{
+	struct dept_event_site *from = es->bfs_parent;
+	struct dept_event_site *to = es;
+
+	dept_outworld_enter();
+
+	pr_warn("===================================================\n");
+	pr_warn("DEPT: Circular recover dependency has been detected.\n");
+	pr_warn("%s %.*s %s\n", init_utsname()->release,
+		(int)strcspn(init_utsname()->version, " "),
+		init_utsname()->version,
+		print_tainted());
+	pr_warn("---------------------------------------------------\n");
+
+	do {
+		print_spc(1, "event site(%s@%s)\n", from->name, from->func_name);
+		print_spc(1, "-> event site(%s@%s)\n", to->name, to->func_name);
+		to = from;
+		from = from->bfs_parent;
+
+		if (to != es)
+			pr_warn("\n");
+	} while (to != es);
+
+	pr_warn("---------------------------------------------------\n");
+	pr_warn("information that might be helpful\n");
+	pr_warn("---------------------------------------------------\n");
+	dump_stack();
+
+	dept_outworld_exit();
+}
+
+static void bfs_init_recover(void *node, void *in, void **out)
+{
+	struct dept_event_site *root = (struct dept_event_site *)node;
+	struct dept_event_site_dep *new = (struct dept_event_site_dep *)in;
+
+	root->bfs_gen = bfs_gen;
+	new->recover_site->bfs_parent = new->evt_site;
+}
+
+static void bfs_extend_recover(struct list_head *h, void *node)
+{
+	struct dept_event_site *cur = (struct dept_event_site *)node;
+	struct dept_event_site_dep *esd;
+
+	list_for_each_entry(esd, &cur->dep_head, dep_node) {
+		struct dept_event_site *next = esd->recover_site;
+
+		if (bfs_gen == next->bfs_gen)
+			continue;
+		next->bfs_parent = cur;
+		next->bfs_gen = bfs_gen;
+		list_add_tail(&next->bfs_node, h);
+	}
+}
+
+static void *bfs_dequeue_recover(struct list_head *h)
+{
+	struct dept_event_site *es;
+
+	DEPT_WARN_ON(list_empty(h));
+
+	es = list_first_entry(h, struct dept_event_site, bfs_node);
+	list_del(&es->bfs_node);
+	return es;
+}
+
+static enum bfs_ret cb_check_recover_dl(void *node, void *in, void **out)
+{
+	struct dept_event_site *cur = (struct dept_event_site *)node;
+	struct dept_event_site_dep *new = (struct dept_event_site_dep *)in;
+
+	if (cur == new->evt_site) {
+		print_recover_circle(new->recover_site);
+		return BFS_DONE;
+	}
+
+	return BFS_CONTINUE;
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void check_recover_dl_bfs(struct dept_event_site_dep *esd)
+{
+	struct bfs_ops ops = {
+		.bfs_init = bfs_init_recover,
+		.extend = bfs_extend_recover,
+		.dequeue = bfs_dequeue_recover,
+		.callback = cb_check_recover_dl,
+	};
+
+	bfs((void *)esd->recover_site, &ops, (void *)esd, NULL);
+}
+
 /*
  * Main operations
  * =====================================================================
@@ -3166,8 +3277,78 @@ static void migrate_per_cpu_pool(void)
 	}
 }
 
+static bool dept_recover_ready;
+
+void __dept_recover_event(struct dept_event_site_dep *esd,
+		struct dept_event_site *es, struct dept_event_site *rs)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (unlikely(!dept_working()))
+		return;
+
+	if (dt->recursive)
+		return;
+
+	if (!esd || !es || !rs) {
+		DEPT_WARN_ONCE("All the parameters should be !NULL.\n");
+		return;
+	}
+
+	/*
+	 * Check locklessly if another already has done it for us.
+	 */
+	if (READ_ONCE(esd->evt_site))
+		return;
+
+	if (!dept_recover_ready) {
+		DEPT_WARN("Should be called once dept_recover_ready.\n");
+		return;
+	}
+
+	flags = dept_enter();
+	if (unlikely(!dept_lock()))
+		goto exit;
+
+	/*
+	 * Check if another already has done it for us with lock held.
+	 */
+	if (esd->evt_site)
+		goto unlock;
+
+	/*
+	 * Can be used as an indicator of whether this
+	 * __dept_recover_event() has been processed or not as well as
+	 * for storing its associated events.
+	 */
+	WRITE_ONCE(esd->evt_site, es);
+	esd->recover_site = rs;
+
+	if (!es->used || !rs->used) {
+		if (!es->used)
+			DEPT_INFO("dept_event_site %s has never been used.\n", es->name);
+		if (!rs->used)
+			DEPT_INFO("dept_event_site %s has never been used.\n", rs->name);
+
+		DEPT_WARN("Cannot track recover dependency with events that never used.\n");
+		goto unlock;
+	}
+
+	list_add(&esd->dep_node, &es->dep_head);
+	list_add(&esd->dep_rev_node, &rs->dep_rev_head);
+	check_recover_dl_bfs(esd);
+unlock:
+	dept_unlock();
+exit:
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(__dept_recover_event);
+
 #define B2KB(B) ((B) / 1024)
 
+extern char __dept_event_sites_start[], __dept_event_sites_end[];
+
 /*
  * Should be called after setup_per_cpu_areas() and before no non-boot
  * CPUs have been on.
@@ -3175,6 +3356,21 @@ static void migrate_per_cpu_pool(void)
 void __init dept_init(void)
 {
 	size_t mem_total = 0;
+	struct dept_event_site_init **evtinitpp;
+
+	/*
+	 * dept recover dependency tracking works from now on.
+	 */
+	for (evtinitpp = (struct dept_event_site_init **)__dept_event_sites_start;
+	     evtinitpp < (struct dept_event_site_init **)__dept_event_sites_end;
+	     evtinitpp++) {
+		(*evtinitpp)->evt_site->used = true;
+		(*evtinitpp)->evt_site->func_name = (*evtinitpp)->func_name;
+		pr_info("dept_event %s@%s is initialized.\n",
+				(*evtinitpp)->evt_site->name,
+				(*evtinitpp)->evt_site->func_name);
+	}
+	dept_recover_ready = true;
 
 	local_irq_disable();
 	dept_per_cpu_ready = 1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 39/47] dept: add module support for struct dept_event_site and dept_event_site_dep
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (37 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 38/47] dept: introduce a new type of dependency tracking between multi event sites Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 40/47] dept: introduce event_site() to disable event tracking if it's recoverable Byungchul Park
                   ` (7 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

struct dept_event_site and struct dept_event_site_dep have been
introduced to track dependencies between multi event sites for a single
wait, that will be loaded to data segment.  Plus, a custom section,
'.dept.event_sites', also has been introduced to keep pointers to the
objects to make sure all the event sites defined exist in code.

dept should work with the section and segment of module.  Add the
support to handle the section and segment properly whenever modules are
loaded and unloaded.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 14 +++++++
 include/linux/module.h   |  5 +++
 kernel/dependency/dept.c | 79 +++++++++++++++++++++++++++++++++++-----
 kernel/module/main.c     | 15 ++++++++
 4 files changed, 103 insertions(+), 10 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 988aceee36ad..25fdd324614a 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -414,6 +414,11 @@ struct dept_event_site {
 	struct dept_event_site		*bfs_parent;
 	struct list_head		bfs_node;
 
+	/*
+	 * for linking all dept_event_site's
+	 */
+	struct list_head		all_node;
+
 	/*
 	 * flag indicating the event is not only declared but also
 	 * actually used in code
@@ -430,6 +435,11 @@ struct dept_event_site_dep {
 	 */
 	struct list_head		dep_node;
 	struct list_head		dep_rev_node;
+
+	/*
+	 * for linking all dept_event_site_dep's
+	 */
+	struct list_head		all_node;
 };
 
 #define DEPT_EVENT_SITE_INITIALIZER(es)					\
@@ -441,6 +451,7 @@ struct dept_event_site_dep {
 	.bfs_gen = 0,							\
 	.bfs_parent = NULL,						\
 	.bfs_node = LIST_HEAD_INIT((es).bfs_node),			\
+	.all_node = LIST_HEAD_INIT((es).all_node),			\
 	.used = false,							\
 }
 
@@ -450,6 +461,7 @@ struct dept_event_site_dep {
 	.recover_site = NULL,						\
 	.dep_node = LIST_HEAD_INIT((esd).dep_node),			\
 	.dep_rev_node = LIST_HEAD_INIT((esd).dep_rev_node),		\
+	.all_node = LIST_HEAD_INIT((esd).all_node),			\
 }
 
 struct dept_event_site_init {
@@ -473,6 +485,7 @@ extern void dept_init(void);
 extern void dept_task_init(struct task_struct *t);
 extern void dept_task_exit(struct task_struct *t);
 extern void dept_free_range(void *start, unsigned int sz);
+extern void dept_mark_event_site_used(void *start, void *end);
 
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
@@ -536,6 +549,7 @@ struct dept_event_site { };
 #define dept_task_init(t)				do { } while (0)
 #define dept_task_exit(t)				do { } while (0)
 #define dept_free_range(s, sz)				do { } while (0)
+#define dept_mark_event_site_used(s, e)			do { } while (0)
 
 #define dept_map_init(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
diff --git a/include/linux/module.h b/include/linux/module.h
index 3319a5269d28..4f360c7c9e96 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -29,6 +29,7 @@
 #include <linux/srcu.h>
 #include <linux/static_call_types.h>
 #include <linux/dynamic_debug.h>
+#include <linux/dept.h>
 
 #include <linux/percpu.h>
 #include <asm/module.h>
@@ -579,6 +580,10 @@ struct module {
 #ifdef CONFIG_DYNAMIC_DEBUG_CORE
 	struct _ddebug_info dyndbg_info;
 #endif
+#ifdef CONFIG_DEPT
+	struct dept_event_site **dept_event_sites;
+	unsigned int num_dept_event_sites;
+#endif
 } ____cacheline_aligned __randomize_layout;
 #ifndef MODULE_ARCH_INIT
 #define MODULE_ARCH_INIT {}
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index b14400c4f83b..07d883579269 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -984,6 +984,9 @@ static void bfs(void *root, struct bfs_ops *ops, void *in, void **out)
  * event sites.
  */
 
+static LIST_HEAD(dept_event_sites);
+static LIST_HEAD(dept_event_site_deps);
+
 /*
  * Print all events in the circle.
  */
@@ -2043,6 +2046,33 @@ static void del_dep_rcu(struct rcu_head *rh)
 	preempt_enable();
 }
 
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void disconnect_event_site_dep(struct dept_event_site_dep *esd)
+{
+	list_del_rcu(&esd->dep_node);
+	list_del_rcu(&esd->dep_rev_node);
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void disconnect_event_site(struct dept_event_site *es)
+{
+	struct dept_event_site_dep *esd, *next_esd;
+
+	list_for_each_entry_safe(esd, next_esd, &es->dep_head, dep_node) {
+		list_del_rcu(&esd->dep_node);
+		list_del_rcu(&esd->dep_rev_node);
+	}
+
+	list_for_each_entry_safe(esd, next_esd, &es->dep_rev_head, dep_rev_node) {
+		list_del_rcu(&esd->dep_node);
+		list_del_rcu(&esd->dep_rev_node);
+	}
+}
+
 /*
  * NOTE: Must be called with dept_lock held.
  */
@@ -2384,6 +2414,8 @@ void dept_free_range(void *start, unsigned int sz)
 {
 	struct dept_task *dt = dept_task();
 	struct dept_class *c, *n;
+	struct dept_event_site_dep *esd, *next_esd;
+	struct dept_event_site *es, *next_es;
 	unsigned long flags;
 
 	if (unlikely(!dept_working()))
@@ -2405,6 +2437,24 @@ void dept_free_range(void *start, unsigned int sz)
 	while (unlikely(!dept_lock()))
 		cpu_relax();
 
+	list_for_each_entry_safe(esd, next_esd, &dept_event_site_deps, all_node) {
+		if (!within((void *)esd, start, sz))
+			continue;
+
+		disconnect_event_site_dep(esd);
+		list_del(&esd->all_node);
+	}
+
+	list_for_each_entry_safe(es, next_es, &dept_event_sites, all_node) {
+		if (!within((void *)es, start, sz) &&
+		    !within(es->name, start, sz) &&
+		    !within(es->func_name, start, sz))
+			continue;
+
+		disconnect_event_site(es);
+		list_del(&es->all_node);
+	}
+
 	list_for_each_entry_safe(c, n, &dept_classes, all_node) {
 		if (!within((void *)c->key, start, sz) &&
 		    !within(c->name, start, sz))
@@ -3337,6 +3387,7 @@ void __dept_recover_event(struct dept_event_site_dep *esd,
 
 	list_add(&esd->dep_node, &es->dep_head);
 	list_add(&esd->dep_rev_node, &rs->dep_rev_head);
+	list_add(&esd->all_node, &dept_event_site_deps);
 	check_recover_dl_bfs(esd);
 unlock:
 	dept_unlock();
@@ -3347,6 +3398,23 @@ EXPORT_SYMBOL_GPL(__dept_recover_event);
 
 #define B2KB(B) ((B) / 1024)
 
+void dept_mark_event_site_used(void *start, void *end)
+{
+	struct dept_event_site_init **evtinitpp;
+
+	for (evtinitpp = (struct dept_event_site_init **)start;
+	     evtinitpp < (struct dept_event_site_init **)end;
+	     evtinitpp++) {
+		(*evtinitpp)->evt_site->used = true;
+		(*evtinitpp)->evt_site->func_name = (*evtinitpp)->func_name;
+		list_add(&(*evtinitpp)->evt_site->all_node, &dept_event_sites);
+
+		pr_info("dept_event_site %s@%s is initialized.\n",
+				(*evtinitpp)->evt_site->name,
+				(*evtinitpp)->evt_site->func_name);
+	}
+}
+
 extern char __dept_event_sites_start[], __dept_event_sites_end[];
 
 /*
@@ -3356,20 +3424,11 @@ extern char __dept_event_sites_start[], __dept_event_sites_end[];
 void __init dept_init(void)
 {
 	size_t mem_total = 0;
-	struct dept_event_site_init **evtinitpp;
 
 	/*
 	 * dept recover dependency tracking works from now on.
 	 */
-	for (evtinitpp = (struct dept_event_site_init **)__dept_event_sites_start;
-	     evtinitpp < (struct dept_event_site_init **)__dept_event_sites_end;
-	     evtinitpp++) {
-		(*evtinitpp)->evt_site->used = true;
-		(*evtinitpp)->evt_site->func_name = (*evtinitpp)->func_name;
-		pr_info("dept_event %s@%s is initialized.\n",
-				(*evtinitpp)->evt_site->name,
-				(*evtinitpp)->evt_site->func_name);
-	}
+	dept_mark_event_site_used(__dept_event_sites_start, __dept_event_sites_end);
 	dept_recover_ready = true;
 
 	local_irq_disable();
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 6ad78f0a58b6..fe0b62a45ed2 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2720,6 +2720,11 @@ static int find_module_sections(struct module *mod, struct load_info *info)
 						&mod->dyndbg_info.num_classes);
 #endif
 
+#ifdef CONFIG_DEPT
+	mod->dept_event_sites = section_objs(info, ".dept.event_sites",
+					sizeof(*mod->dept_event_sites),
+					&mod->num_dept_event_sites);
+#endif
 	return 0;
 }
 
@@ -3346,6 +3351,14 @@ static int early_mod_check(struct load_info *info, int flags)
 	return err;
 }
 
+static void dept_mark_event_site_used_module(struct module *mod)
+{
+#ifdef CONFIG_DEPT
+	dept_mark_event_site_used(mod->dept_event_sites,
+			     mod->dept_event_sites + mod->num_dept_event_sites);
+#endif
+}
+
 /*
  * Allocate and load the module: note that size of section 0 is always
  * zero, and we rely on this for optional sections.
@@ -3508,6 +3521,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	/* Done! */
 	trace_module_load(mod);
 
+	dept_mark_event_site_used_module(mod);
+
 	return do_init_module(mod);
 
  sysfs_cleanup:
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 40/47] dept: introduce event_site() to disable event tracking if it's recoverable
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (38 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 39/47] dept: add module support for struct dept_event_site and dept_event_site_dep Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 41/47] dept: implement a basic unit test for dept Byungchul Park
                   ` (6 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

With multi event sites for a single wait, dept allows to skip tracking
an event that is recoverable by other recover paths.

Introduce an API, event_site(), to skip tracking the event in the case.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h     | 30 ++++++++++++++++++++++++++++++
 include/linux/sched.h    |  6 ++++++
 kernel/dependency/dept.c | 20 ++++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 25fdd324614a..0ac13129f308 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -487,6 +487,31 @@ extern void dept_task_exit(struct task_struct *t);
 extern void dept_free_range(void *start, unsigned int sz);
 extern void dept_mark_event_site_used(void *start, void *end);
 
+extern void disable_event_track(void);
+extern void enable_event_track(void);
+
+#define event_site(es, evt_func, ...)					\
+do {									\
+	unsigned long _flags;						\
+	bool _disable;							\
+									\
+	local_irq_save(_flags);						\
+	dept_event_site_used(es);					\
+	/*								\
+	 * If !list_empty(&(es)->dept_head), the event site can be	\
+	 * recovered by others.  Do not track event dependency if so.	\
+	 */								\
+	_disable = !list_empty(&(es)->dep_head);			\
+	if (_disable)							\
+		disable_event_track();					\
+									\
+	evt_func(__VA_ARGS__);						\
+									\
+	if (_disable)							\
+		enable_event_track();					\
+	local_irq_restore(_flags);					\
+} while (0)
+
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n);
 extern void dept_ext_wgen_init(struct dept_ext_wgen *ewg);
@@ -550,6 +575,11 @@ struct dept_event_site { };
 #define dept_task_exit(t)				do { } while (0)
 #define dept_free_range(s, sz)				do { } while (0)
 #define dept_mark_event_site_used(s, e)			do { } while (0)
+#define event_site(es, evt_func, ...)					\
+do {									\
+	(void)(es);							\
+	evt_func(__VA_ARGS__);						\
+} while (0)
 
 #define dept_map_init(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)			do { (void)(n); (void)(k); } while (0)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a01c10f28dfd..24a9dc9d6f5f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -876,6 +876,11 @@ struct dept_task {
 	 */
 	int				missing_ecxt;
 
+	/*
+	 * not to track events
+	 */
+	int				disable_event_track_cnt;
+
 	/*
 	 * for tracking IRQ-enable state
 	 */
@@ -913,6 +918,7 @@ struct dept_task {
 	.stage_wait_stack = NULL,				\
 	.stage_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED,\
 	.missing_ecxt = 0,					\
+	.disable_event_track_cnt = 0,				\
 	.hardirqs_enabled = false,				\
 	.softirqs_enabled = false,				\
 	.task_exit = false,					\
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 07d883579269..3c3ec2701bd6 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -2573,6 +2573,23 @@ static void __dept_wait(struct dept_map *m, unsigned long w_f,
 	}
 }
 
+void disable_event_track(void)
+{
+	dept_task()->disable_event_track_cnt++;
+}
+EXPORT_SYMBOL_GPL(disable_event_track);
+
+void enable_event_track(void)
+{
+	dept_task()->disable_event_track_cnt--;
+}
+EXPORT_SYMBOL_GPL(enable_event_track);
+
+static bool event_track_disabled(void)
+{
+	return !!dept_task()->disable_event_track_cnt;
+}
+
 /*
  * Called between dept_enter() and dept_exit().
  */
@@ -2585,6 +2602,9 @@ static void __dept_event(struct dept_map *m, struct dept_map *real_m,
 	struct dept_key *k;
 	int e;
 
+	if (event_track_disabled())
+		return;
+
 	e = find_first_bit(&e_f, DEPT_MAX_SUBCLASSES_EVT);
 
 	if (DEPT_WARN_ON(e >= DEPT_MAX_SUBCLASSES_EVT))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 41/47] dept: implement a basic unit test for dept
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (39 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 40/47] dept: introduce event_site() to disable event tracking if it's recoverable Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 42/47] dept: call dept_hardirqs_off() in local_irq_*() regardless of irq state Byungchul Park
                   ` (5 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Implement CONFIG_DEPT_UNIT_TEST introducing a kernel module that runs
basic unit test for dept.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept_unit_test.h     |  67 +++++++++++
 kernel/dependency/Makefile         |   1 +
 kernel/dependency/dept.c           |  12 ++
 kernel/dependency/dept_unit_test.c | 173 +++++++++++++++++++++++++++++
 lib/Kconfig.debug                  |  12 ++
 5 files changed, 265 insertions(+)
 create mode 100644 include/linux/dept_unit_test.h
 create mode 100644 kernel/dependency/dept_unit_test.c

diff --git a/include/linux/dept_unit_test.h b/include/linux/dept_unit_test.h
new file mode 100644
index 000000000000..7612b4e97e69
--- /dev/null
+++ b/include/linux/dept_unit_test.h
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * DEPT unit test
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2025 SK hynix, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_UNIT_TEST_H
+#define __LINUX_DEPT_UNIT_TEST_H
+
+#if defined(CONFIG_DEPT_UNIT_TEST) || defined(CONFIG_DEPT_UNIT_TEST_MODULE)
+struct dept_ut {
+	bool circle_detected;
+	bool recover_circle_detected;
+
+	int ecxt_stack_total_cnt;
+	int wait_stack_total_cnt;
+	int evnt_stack_total_cnt;
+	int ecxt_stack_valid_cnt;
+	int wait_stack_valid_cnt;
+	int evnt_stack_valid_cnt;
+};
+
+extern struct dept_ut dept_ut_results;
+
+static inline void dept_ut_circle_detect(void)
+{
+	dept_ut_results.circle_detected = true;
+}
+static inline void dept_ut_recover_circle_detect(void)
+{
+	dept_ut_results.recover_circle_detected = true;
+}
+static inline void dept_ut_ecxt_stack_account(bool valid)
+{
+	dept_ut_results.ecxt_stack_total_cnt++;
+
+	if (valid)
+		dept_ut_results.ecxt_stack_valid_cnt++;
+}
+static inline void dept_ut_wait_stack_account(bool valid)
+{
+	dept_ut_results.wait_stack_total_cnt++;
+
+	if (valid)
+		dept_ut_results.wait_stack_valid_cnt++;
+}
+static inline void dept_ut_evnt_stack_account(bool valid)
+{
+	dept_ut_results.evnt_stack_total_cnt++;
+
+	if (valid)
+		dept_ut_results.evnt_stack_valid_cnt++;
+}
+#else
+struct dept_ut {};
+
+#define dept_ut_circle_detect() do { } while (0)
+#define dept_ut_recover_circle_detect() do { } while (0)
+#define dept_ut_ecxt_stack_account(v) do { } while (0)
+#define dept_ut_wait_stack_account(v) do { } while (0)
+#define dept_ut_evnt_stack_account(v) do { } while (0)
+
+#endif
+#endif /* __LINUX_DEPT_UNIT_TEST_H */
diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
index 92f165400187..fc584ca87124 100644
--- a/kernel/dependency/Makefile
+++ b/kernel/dependency/Makefile
@@ -2,3 +2,4 @@
 
 obj-$(CONFIG_DEPT) += dept.o
 obj-$(CONFIG_DEPT) += dept_proc.o
+obj-$(CONFIG_DEPT_UNIT_TEST) += dept_unit_test.o
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 3c3ec2701bd6..0f4464657288 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -78,8 +78,12 @@
 #include <linux/workqueue.h>
 #include <linux/irq_work.h>
 #include <linux/vmalloc.h>
+#include <linux/dept_unit_test.h>
 #include "dept_internal.h"
 
+struct dept_ut dept_ut_results;
+EXPORT_SYMBOL_GPL(dept_ut_results);
+
 static int dept_stop;
 static int dept_per_cpu_ready;
 
@@ -826,6 +830,10 @@ static void print_dep(struct dept_dep *d)
 			pr_warn("(wait to wake up)\n");
 			print_ip_stack(0, e->ewait_stack);
 		}
+
+		dept_ut_ecxt_stack_account(valid_stack(e->ecxt_stack));
+		dept_ut_wait_stack_account(valid_stack(w->wait_stack));
+		dept_ut_evnt_stack_account(valid_stack(e->event_stack));
 	}
 }
 
@@ -920,6 +928,8 @@ static void print_circle(struct dept_class *c)
 	dump_stack();
 
 	dept_outworld_exit();
+
+	dept_ut_circle_detect();
 }
 
 /*
@@ -1021,6 +1031,8 @@ static void print_recover_circle(struct dept_event_site *es)
 	dump_stack();
 
 	dept_outworld_exit();
+
+	dept_ut_recover_circle_detect();
 }
 
 static void bfs_init_recover(void *node, void *in, void **out)
diff --git a/kernel/dependency/dept_unit_test.c b/kernel/dependency/dept_unit_test.c
new file mode 100644
index 000000000000..88e846b9f876
--- /dev/null
+++ b/kernel/dependency/dept_unit_test.c
@@ -0,0 +1,173 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * DEPT unit test
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2025 SK hynix, Inc., Byungchul Park
+ */
+
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/dept.h>
+#include <linux/dept_unit_test.h>
+
+MODULE_DESCRIPTION("DEPT unit test");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Byungchul Park <max.byungchul.park@sk.com>");
+
+struct unit {
+	const char *name;
+	bool (*func)(void);
+	bool result;
+};
+
+static DEFINE_SPINLOCK(s1);
+static DEFINE_SPINLOCK(s2);
+static bool test_spin_lock_deadlock(void)
+{
+	dept_ut_results.circle_detected = false;
+
+	spin_lock(&s1);
+	spin_lock(&s2);
+	spin_unlock(&s2);
+	spin_unlock(&s1);
+
+	spin_lock(&s2);
+	spin_lock(&s1);
+	spin_unlock(&s1);
+	spin_unlock(&s2);
+
+	return dept_ut_results.circle_detected;
+}
+
+static DEFINE_MUTEX(m1);
+static DEFINE_MUTEX(m2);
+static bool test_mutex_lock_deadlock(void)
+{
+	dept_ut_results.circle_detected = false;
+
+	mutex_lock(&m1);
+	mutex_lock(&m2);
+	mutex_unlock(&m2);
+	mutex_unlock(&m1);
+
+	mutex_lock(&m2);
+	mutex_lock(&m1);
+	mutex_unlock(&m1);
+	mutex_unlock(&m2);
+
+	return dept_ut_results.circle_detected;
+}
+
+static bool test_wait_event_deadlock(void)
+{
+	struct dept_map dmap1;
+	struct dept_map dmap2;
+
+	sdt_map_init(&dmap1);
+	sdt_map_init(&dmap2);
+
+	dept_ut_results.circle_detected = false;
+
+	sdt_request_event(&dmap1); /* [S] */
+	sdt_wait(&dmap2); /* [W] */
+	sdt_event(&dmap1); /* [E] */
+
+	sdt_request_event(&dmap2); /* [S] */
+	sdt_wait(&dmap1); /* [W] */
+	sdt_event(&dmap2); /* [E] */
+
+	return dept_ut_results.circle_detected;
+}
+
+static void dummy_event(void)
+{
+	/* Do nothing. */
+}
+
+static DEFINE_DEPT_EVENT_SITE(es1);
+static DEFINE_DEPT_EVENT_SITE(es2);
+static bool test_recover_deadlock(void)
+{
+	dept_ut_results.recover_circle_detected = false;
+
+	dept_recover_event(&es1, &es2);
+	dept_recover_event(&es2, &es1);
+
+	event_site(&es1, dummy_event);
+	event_site(&es2, dummy_event);
+
+	return dept_ut_results.recover_circle_detected;
+}
+
+static struct unit units[] = {
+	{
+		.name = "spin lock deadlock test",
+		.func = test_spin_lock_deadlock,
+	},
+	{
+		.name = "mutex lock deadlock test",
+		.func = test_mutex_lock_deadlock,
+	},
+	{
+		.name = "wait event deadlock test",
+		.func = test_wait_event_deadlock,
+	},
+	{
+		.name = "event recover deadlock test",
+		.func = test_recover_deadlock,
+	},
+};
+
+static int __init dept_ut_init(void)
+{
+	int i;
+
+	lockdep_off();
+
+	dept_ut_results.ecxt_stack_valid_cnt = 0;
+	dept_ut_results.ecxt_stack_total_cnt = 0;
+	dept_ut_results.wait_stack_valid_cnt = 0;
+	dept_ut_results.wait_stack_total_cnt = 0;
+	dept_ut_results.evnt_stack_valid_cnt = 0;
+	dept_ut_results.evnt_stack_total_cnt = 0;
+
+	for (i = 0; i < ARRAY_SIZE(units); i++)
+		units[i].result = units[i].func();
+
+	pr_info("\n");
+	pr_info("******************************************\n");
+	pr_info("DEPT unit test results\n");
+	pr_info("******************************************\n");
+	for (i = 0; i < ARRAY_SIZE(units); i++) {
+		pr_info("(%s) %s\n", units[i].result ? "pass" : "fail",
+				units[i].name);
+	}
+	pr_info("ecxt stack valid count = %d/%d\n",
+			dept_ut_results.ecxt_stack_valid_cnt,
+			dept_ut_results.ecxt_stack_total_cnt);
+	pr_info("wait stack valid count = %d/%d\n",
+			dept_ut_results.wait_stack_valid_cnt,
+			dept_ut_results.wait_stack_total_cnt);
+	pr_info("event stack valid count = %d/%d\n",
+			dept_ut_results.evnt_stack_valid_cnt,
+			dept_ut_results.evnt_stack_total_cnt);
+	pr_info("******************************************\n");
+	pr_info("\n");
+
+	lockdep_on();
+
+	return 0;
+}
+
+static void dept_ut_cleanup(void)
+{
+	/*
+	 * Do nothing for now.
+	 */
+}
+
+module_init(dept_ut_init);
+module_exit(dept_ut_cleanup);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 290563fa8b58..f0c58bee263a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1404,6 +1404,18 @@ config DEPT_AGGRESSIVE_TIMEOUT_WAIT
 	  that timeout is used to avoid a deadlock. Say N if you'd like
 	  to avoid verbose reports.
 
+config DEPT_UNIT_TEST
+	tristate "unit test for DEPT"
+	depends on DEBUG_KERNEL && DEPT
+	default n
+	help
+	  This option provides a kernel module that runs unit test for
+	  DEPT.
+
+	  Say Y if you want DEPT unit test to be built into the kernel.
+	  Say M if you want DEPT unit test to build as a module.
+	  Say N if you are unsure.
+
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 42/47] dept: call dept_hardirqs_off() in local_irq_*() regardless of irq state
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (40 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 41/47] dept: implement a basic unit test for dept Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 43/47] rcu/update: fix same dept key collision between various types of RCU Byungchul Park
                   ` (4 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

For dept to function properly, dept_task()->hardirqs_enabled must be set
correctly.  If it fails to set this value to false, for example, dept
may mistakenly think irq is still enabled even when it's not.

Do dept_hardirqs_off() regardless of irq state not to miss any
unexpected cases by any chance e.g. changes of the state by asm code.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/irqflags.h | 14 ++++++++++++++
 kernel/dependency/dept.c |  1 +
 2 files changed, 15 insertions(+)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index d8b9cf093f83..586f5bad4da7 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -214,6 +214,13 @@ extern void warn_bogus_irq_restore(void);
 		raw_local_irq_disable();		\
 		if (!was_disabled)			\
 			trace_hardirqs_off();		\
+		/*					\
+		 * Just in case that C code has missed	\
+		 * trace_hardirqs_off() at the first	\
+		 * place e.g. disabling irq at asm code.\
+		 */					\
+		else					\
+			dept_hardirqs_off();		\
 	} while (0)
 
 #define local_irq_save(flags)				\
@@ -221,6 +228,13 @@ extern void warn_bogus_irq_restore(void);
 		raw_local_irq_save(flags);		\
 		if (!raw_irqs_disabled_flags(flags))	\
 			trace_hardirqs_off();		\
+		/*					\
+		 * Just in case that C code has missed	\
+		 * trace_hardirqs_off() at the first	\
+		 * place e.g. disabling irq at asm code.\
+		 */					\
+		else					\
+			dept_hardirqs_off();		\
 	} while (0)
 
 #define local_irq_restore(flags)			\
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 0f4464657288..a17b185d6a6a 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -2248,6 +2248,7 @@ void noinstr dept_hardirqs_off(void)
 	 */
 	dept_task()->hardirqs_enabled = false;
 }
+EXPORT_SYMBOL_GPL(dept_hardirqs_off);
 
 void noinstr dept_update_cxt(void)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 43/47] rcu/update: fix same dept key collision between various types of RCU
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (41 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 42/47] dept: call dept_hardirqs_off() in local_irq_*() regardless of irq state Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 44/47] dept: introduce APIs to set page usage and use subclasses_evt for the usage Byungchul Park
                   ` (3 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

The current implementation shares the same dept key for multiple
synchronization points, which can lead to false positive reports in
dependency tracking and potential confusion in debugging.  For example,
both normal RCU and tasks trace RCU synchronization points use the same
dept key.  Specifically:

   1. synchronize_rcu() uses a dept key embedded in __wait_rcu_gp():

      synchronize_rcu()
         synchronize_rcu_normal()
            _wait_rcu_gp()
               __wait_rcu_gp() <- the key as static variable

   2. synchronize_rcu_tasks_trace() uses the dept key, too:

      synchronize_rcu_tasks_trace()
         synchronize_rcu_tasks_generic()
            _wait_rcu_gp()
               __wait_rcu_gp() <- the key as static variable

Since the both rely on the same dept key, dept may report false positive
circular dependency.  To resolve this, separate dept keys and maps
should be assigned to each struct rcu_synchronize.

   ===================================================
   DEPT: Circular dependency has been detected.
   6.15.0-rc6-00042-ged94bafc6405 #2 Not tainted
   ---------------------------------------------------
   summary
   ---------------------------------------------------
   *** DEADLOCK ***

   context A
      [S] lock(cpu_hotplug_lock:0)
      [W] __wait_rcu_gp(<sched>:0)
      [E] unlock(cpu_hotplug_lock:0)

   context B
      [S] (unknown)(<sched>:0)
      [W] lock(cpu_hotplug_lock:0)
      [E] try_to_wake_up(<sched>:0)

   [S]: start of the event context
   [W]: the wait blocked
   [E]: the event not reachable
   ---------------------------------------------------
   context A's detail
   ---------------------------------------------------
   context A
      [S] lock(cpu_hotplug_lock:0)
      [W] __wait_rcu_gp(<sched>:0)
      [E] unlock(cpu_hotplug_lock:0)

   [S] lock(cpu_hotplug_lock:0):
   [<ffff8000802ce964>] cpus_read_lock+0x14/0x20
   stacktrace:
         percpu_down_read.constprop.0+0x88/0x2ec
         cpus_read_lock+0x14/0x20
         cgroup_procs_write_start+0x164/0x634
         __cgroup_procs_write+0xdc/0x4d0
         cgroup_procs_write+0x34/0x74
         cgroup_file_write+0x25c/0x670
         kernfs_fop_write_iter+0x2ec/0x498
         vfs_write+0x574/0xc30
         ksys_write+0x124/0x244
         __arm64_sys_write+0x70/0xa4
         invoke_syscall+0x88/0x2e0
         el0_svc_common.constprop.0+0xe8/0x2e0
         do_el0_svc+0x44/0x60
         el0_svc+0x50/0x188
         el0t_64_sync_handler+0x10c/0x140
         el0t_64_sync+0x198/0x19c

   [W] __wait_rcu_gp(<sched>:0):
   [<ffff8000804ce88c>] __wait_rcu_gp+0x324/0x498
   stacktrace:
         schedule+0xcc/0x348
         schedule_timeout+0x1a4/0x268
         __wait_for_common+0x1c4/0x3f0
         __wait_for_completion_state+0x20/0x38
         __wait_rcu_gp+0x35c/0x498
         synchronize_rcu_normal+0x200/0x218
         synchronize_rcu+0x234/0x2a0
         rcu_sync_enter+0x11c/0x300
         percpu_down_write+0xb4/0x3e0
         cgroup_procs_write_start+0x174/0x634
         __cgroup_procs_write+0xdc/0x4d0
         cgroup_procs_write+0x34/0x74
         cgroup_file_write+0x25c/0x670
         kernfs_fop_write_iter+0x2ec/0x498
         vfs_write+0x574/0xc30
         ksys_write+0x124/0x244

   [E] unlock(cpu_hotplug_lock:0):
   (N/A)
   ---------------------------------------------------
   context B's detail
   ---------------------------------------------------
   context B
      [S] (unknown)(<sched>:0)
      [W] lock(cpu_hotplug_lock:0)
      [E] try_to_wake_up(<sched>:0)

   [S] (unknown)(<sched>:0):
   (N/A)

   [W] lock(cpu_hotplug_lock:0):
   [<ffff8000802ce964>] cpus_read_lock+0x14/0x20
   stacktrace:
         percpu_down_read.constprop.0+0x6c/0x2ec
         cpus_read_lock+0x14/0x20
         check_all_holdout_tasks_trace+0x90/0xa30
         rcu_tasks_wait_gp+0x47c/0x938
         rcu_tasks_one_gp+0x75c/0xef8
         rcu_tasks_kthread+0x180/0x1dc
         kthread+0x3ac/0x74c
         ret_from_fork+0x10/0x20

   [E] try_to_wake_up(<sched>:0):
   [<ffff8000804233b8>] complete+0xb8/0x1e8
   stacktrace:
         try_to_wake_up+0x374/0x1164
         complete+0xb8/0x1e8
         wakeme_after_rcu+0x14/0x20
         rcu_tasks_invoke_cbs+0x218/0xaa8
         rcu_tasks_one_gp+0x834/0xef8
         rcu_tasks_kthread+0x180/0x1dc
         kthread+0x3ac/0x74c
         ret_from_fork+0x10/0x20
   (wait to wake up)
   stacktrace:
         __schedule+0xf64/0x3614
         schedule+0xcc/0x348
         schedule_timeout+0x1a4/0x268
         __wait_for_common+0x1c4/0x3f0
         __wait_for_completion_state+0x20/0x38
         __wait_rcu_gp+0x35c/0x498
         synchronize_rcu_tasks_generic+0x14c/0x220
         synchronize_rcu_tasks_trace+0x24/0x8c
         rcu_init_tasks_generic+0x168/0x194
         do_one_initcall+0x174/0xa00
         kernel_init_freeable+0x744/0x7dc
         kernel_init+0x78/0x220
         ret_from_fork+0x10/0x20

Separating the dept key and map for each of struct rcu_synchronize,
ensuring proper tracking for each execution context.

Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
[ Rewrote the changelog. ]
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/rcupdate_wait.h | 13 ++++++++-----
 kernel/rcu/rcu.h              |  1 +
 kernel/rcu/update.c           |  5 +++--
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/rcupdate_wait.h b/include/linux/rcupdate_wait.h
index 4c92d4291cce..ee598e70b4bc 100644
--- a/include/linux/rcupdate_wait.h
+++ b/include/linux/rcupdate_wait.h
@@ -19,17 +19,20 @@ struct rcu_synchronize {
 
 	/* This is for debugging. */
 	struct rcu_gp_oldstate oldstate;
+	struct dept_map dmap;
+	struct dept_key dkey;
 };
 void wakeme_after_rcu(struct rcu_head *head);
 
 void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
-		   struct rcu_synchronize *rs_array);
+		   struct rcu_synchronize *rs_array, struct dept_key *dkey);
 
 #define _wait_rcu_gp(checktiny, state, ...) \
-do {												\
-	call_rcu_func_t __crcu_array[] = { __VA_ARGS__ };					\
-	struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)];				\
-	__wait_rcu_gp(checktiny, state, ARRAY_SIZE(__crcu_array), __crcu_array, __rs_array);	\
+do {													\
+	call_rcu_func_t __crcu_array[] = { __VA_ARGS__ };						\
+	static struct dept_key __key;									\
+	struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)];					\
+	__wait_rcu_gp(checktiny, state, ARRAY_SIZE(__crcu_array), __crcu_array, __rs_array, &__key);	\
 } while (0)
 
 #define wait_rcu_gp(...) _wait_rcu_gp(false, TASK_UNINTERRUPTIBLE, __VA_ARGS__)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 9cf01832a6c3..c0d8ea139596 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -12,6 +12,7 @@
 
 #include <linux/slab.h>
 #include <trace/events/rcu.h>
+#include <linux/dept_sdt.h>
 
 /*
  * Grace-period counter management.
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index c912b594ba98..82292337d5b0 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -409,7 +409,7 @@ void wakeme_after_rcu(struct rcu_head *head)
 EXPORT_SYMBOL_GPL(wakeme_after_rcu);
 
 void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
-		   struct rcu_synchronize *rs_array)
+		   struct rcu_synchronize *rs_array, struct dept_key *dkey)
 {
 	int i;
 	int j;
@@ -426,7 +426,8 @@ void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *c
 				break;
 		if (j == i) {
 			init_rcu_head_on_stack(&rs_array[i].head);
-			init_completion(&rs_array[i].completion);
+			sdt_map_init_key(&rs_array[i].dmap, dkey);
+			init_completion_dmap(&rs_array[i].completion, &rs_array[i].dmap);
 			(crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
 		}
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 44/47] dept: introduce APIs to set page usage and use subclasses_evt for the usage
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (42 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 43/47] rcu/update: fix same dept key collision between various types of RCU Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 45/47] dept: track PG_writeback with dept Byungchul Park
                   ` (2 subsequent siblings)
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

False positive reports have been observed since dept works with the
assumption that all the pages have the same dept class, but the class
should be split since the problematic call paths are different depending
on what the page is used for.

At least, ones in block device's address_space and ones in regular
file's address_space have exclusively different usages.

Thus, define usage candidates like:

   DEPT_PAGE_REGFILE_CACHE /* page in regular file's address_space */
   DEPT_PAGE_BDEV_CACHE    /* page in block device's address_space */
   DEPT_PAGE_DEFAULT       /* the others */

Introduce APIs to set each page's usage properly and make sure not to
interact between at least between DEPT_PAGE_REGFILE_CACHE and
DEPT_PAGE_BDEV_CACHE.  However, besides the exclusive usages, allow any
other combinations to interact to the other for example:

   PG_locked for DEPT_PAGE_DEFAULT page can wait for PG_locked for
   DEPT_PAGE_REGFILE_CACHE page and vice versa.

   PG_locked for DEPT_PAGE_DEFAULT page can wait for PG_locked for
   DEPT_PAGE_BDEV_CACHE page and vice versa.

   PG_locked for DEPT_PAGE_DEFAULT page can wait for PG_locked for
   DEPT_PAGE_DEFAULT page.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/dept.h       | 31 +++++++++++++++-
 include/linux/mm_types.h   |  1 +
 include/linux/page-flags.h | 76 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 0ac13129f308..fbbc41048fac 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -21,8 +21,8 @@ struct task_struct;
 #define DEPT_MAX_WAIT_HIST		64
 #define DEPT_MAX_ECXT_HELD		48
 
-#define DEPT_MAX_SUBCLASSES		16
-#define DEPT_MAX_SUBCLASSES_EVT		2
+#define DEPT_MAX_SUBCLASSES		24
+#define DEPT_MAX_SUBCLASSES_EVT		3
 #define DEPT_MAX_SUBCLASSES_USR		(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT)
 #define DEPT_MAX_SUBCLASSES_CACHE	2
 
@@ -390,6 +390,32 @@ struct dept_ext_wgen {
 	unsigned int wgen;
 };
 
+enum {
+	DEPT_PAGE_DEFAULT = 0,
+	DEPT_PAGE_REGFILE_CACHE,	/* regular file page cache */
+	DEPT_PAGE_BDEV_CACHE,		/* block device cache */
+	DEPT_PAGE_USAGE_NR,		/* nr of usages options */
+};
+
+#define DEPT_PAGE_USAGE_SHIFT 16
+#define DEPT_PAGE_USAGE_MASK ((1U << DEPT_PAGE_USAGE_SHIFT) - 1)
+#define DEPT_PAGE_USAGE_PENDING_MASK (DEPT_PAGE_USAGE_MASK << DEPT_PAGE_USAGE_SHIFT)
+
+/*
+ * Identify each page's usage type
+ */
+struct dept_page_usage {
+	/*
+	 * low 16 bits  : the current usage type
+	 * high 16 bits : usage type requested to be set
+	 *
+	 * Do not apply the type requested immediately but defer until
+	 * after clearing PG_locked bit of the folio or page e.g. by
+	 * folio_unlock().
+	 */
+	atomic_t type; /* Update and read atomically */
+};
+
 struct dept_event_site {
 	/*
 	 * event site name
@@ -562,6 +588,7 @@ extern void dept_hardirqs_off(void);
 struct dept_key { };
 struct dept_map { };
 struct dept_ext_wgen { };
+struct dept_page_usage { };
 struct dept_event_site { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5ebc565309af..8ccbb030500c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,7 @@ struct page {
 	struct page *kmsan_shadow;
 	struct page *kmsan_origin;
 #endif
+	struct dept_page_usage usage;
 	struct dept_ext_wgen pg_locked_wgen;
 } _struct_page_alignment;
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index d3c4954c4218..3fd3660ddc6f 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -204,6 +204,68 @@ enum pageflags {
 
 extern struct dept_map pg_locked_map;
 
+static inline int dept_set_page_usage(struct page *p,
+		unsigned int new_type)
+{
+	unsigned int type = atomic_read(&p->usage.type);
+
+	if (WARN_ON_ONCE(new_type >= DEPT_PAGE_USAGE_NR))
+		return -1;
+
+	new_type <<= DEPT_PAGE_USAGE_SHIFT;
+retry:
+	new_type &= ~DEPT_PAGE_USAGE_MASK;
+	new_type |= type & DEPT_PAGE_USAGE_MASK;
+
+	if (!atomic_try_cmpxchg(&p->usage.type, &type, new_type))
+		goto retry;
+
+	return 0;
+}
+
+static inline int dept_reset_page_usage(struct page *p)
+{
+	return dept_set_page_usage(p, DEPT_PAGE_DEFAULT);
+}
+
+static inline void dept_update_page_usage(struct page *p)
+{
+	unsigned int type = atomic_read(&p->usage.type);
+	unsigned int new_type;
+
+retry:
+	new_type = type & DEPT_PAGE_USAGE_PENDING_MASK;
+	new_type >>= DEPT_PAGE_USAGE_SHIFT;
+	new_type |= type & DEPT_PAGE_USAGE_PENDING_MASK;
+
+	/*
+	 * Already updated by others.
+	 */
+	if (type == new_type)
+		return;
+
+	if (!atomic_try_cmpxchg(&p->usage.type, &type, new_type))
+		goto retry;
+}
+
+static inline unsigned long dept_event_flags(struct page *p, bool wait)
+{
+	unsigned int type;
+
+	type = atomic_read(&p->usage.type) & DEPT_PAGE_USAGE_MASK;
+
+	if (WARN_ON_ONCE(type >= DEPT_PAGE_USAGE_NR))
+		return 0;
+
+	/*
+	 * event
+	 */
+	if (!wait)
+		return 1UL << type;
+
+	return (1UL << DEPT_PAGE_DEFAULT) | (1UL << type);
+}
+
 /*
  * Place the following annotations in its suitable point in code:
  *
@@ -214,20 +276,28 @@ extern struct dept_map pg_locked_map;
 
 static inline void dept_page_set_bit(struct page *p, int bit_nr)
 {
+	dept_update_page_usage(p);
 	if (bit_nr == PG_locked)
 		dept_request_event(&pg_locked_map, &p->pg_locked_wgen);
 }
 
 static inline void dept_page_clear_bit(struct page *p, int bit_nr)
 {
+	unsigned long evt_f;
+
+	evt_f = dept_event_flags(p, false);
 	if (bit_nr == PG_locked)
-		dept_event(&pg_locked_map, 1UL, _RET_IP_, __func__, &p->pg_locked_wgen);
+		dept_event(&pg_locked_map, evt_f, _RET_IP_, __func__, &p->pg_locked_wgen);
 }
 
 static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
 {
+	unsigned long evt_f;
+
+	dept_update_page_usage(p);
+	evt_f = dept_event_flags(p, true);
 	if (bit_nr == PG_locked)
-		dept_wait(&pg_locked_map, 1UL, _RET_IP_, __func__, 0, -1L);
+		dept_wait(&pg_locked_map, evt_f, _RET_IP_, __func__, 0, -1L);
 }
 
 static inline void dept_folio_set_bit(struct folio *f, int bit_nr)
@@ -245,6 +315,8 @@ static inline void dept_folio_wait_on_bit(struct folio *f, int bit_nr)
 	dept_page_wait_on_bit(&f->page, bit_nr);
 }
 #else
+#define dept_set_page_usage(p, t)		do { } while (0)
+#define dept_reset_page_usage(p)		do { } while (0)
 #define dept_page_set_bit(p, bit_nr)		do { } while (0)
 #define dept_page_clear_bit(p, bit_nr)		do { } while (0)
 #define dept_page_wait_on_bit(p, bit_nr)	do { } while (0)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 45/47] dept: track PG_writeback with dept
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (43 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 44/47] dept: introduce APIs to set page usage and use subclasses_evt for the usage Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 46/47] SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 47/47] mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large PAGE_SIZE Byungchul Park
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Makes dept able to track PG_writeback waits and events, which will be
useful in practice.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm_types.h   |  1 +
 include/linux/page-flags.h |  7 +++++++
 mm/filemap.c               | 11 +++++++++++
 mm/mm_init.c               |  1 +
 4 files changed, 20 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8ccbb030500c..bed1a3bc81e1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -226,6 +226,7 @@ struct page {
 #endif
 	struct dept_page_usage usage;
 	struct dept_ext_wgen pg_locked_wgen;
+	struct dept_ext_wgen pg_writeback_wgen;
 } _struct_page_alignment;
 
 /*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 3fd3660ddc6f..b965b16c8cee 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -203,6 +203,7 @@ enum pageflags {
 #include <linux/dept.h>
 
 extern struct dept_map pg_locked_map;
+extern struct dept_map pg_writeback_map;
 
 static inline int dept_set_page_usage(struct page *p,
 		unsigned int new_type)
@@ -279,6 +280,8 @@ static inline void dept_page_set_bit(struct page *p, int bit_nr)
 	dept_update_page_usage(p);
 	if (bit_nr == PG_locked)
 		dept_request_event(&pg_locked_map, &p->pg_locked_wgen);
+	else if (bit_nr == PG_writeback)
+		dept_request_event(&pg_writeback_map, &p->pg_writeback_wgen);
 }
 
 static inline void dept_page_clear_bit(struct page *p, int bit_nr)
@@ -288,6 +291,8 @@ static inline void dept_page_clear_bit(struct page *p, int bit_nr)
 	evt_f = dept_event_flags(p, false);
 	if (bit_nr == PG_locked)
 		dept_event(&pg_locked_map, evt_f, _RET_IP_, __func__, &p->pg_locked_wgen);
+	else if (bit_nr == PG_writeback)
+		dept_event(&pg_writeback_map, evt_f, _RET_IP_, __func__, &p->pg_writeback_wgen);
 }
 
 static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
@@ -298,6 +303,8 @@ static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
 	evt_f = dept_event_flags(p, true);
 	if (bit_nr == PG_locked)
 		dept_wait(&pg_locked_map, evt_f, _RET_IP_, __func__, 0, -1L);
+	else if (bit_nr == PG_writeback)
+		dept_wait(&pg_writeback_map, evt_f, _RET_IP_, __func__, 0, -1L);
 }
 
 static inline void dept_folio_set_bit(struct folio *f, int bit_nr)
diff --git a/mm/filemap.c b/mm/filemap.c
index edb0710ddb3f..d8f1816dc6c2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1187,6 +1187,13 @@ static void folio_wake_bit(struct folio *folio, int bit_nr)
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
+	/*
+	 * dept_page_clear_bit() being called multiple times is harmless.
+	 * The worst case is to miss some dependencies but it's okay.
+	 */
+	if (bit_nr == PG_locked || bit_nr == PG_writeback)
+		dept_page_clear_bit(&folio->page, bit_nr);
+
 	spin_lock_irqsave(&q->lock, flags);
 	__wake_up_locked_key(q, TASK_NORMAL, &key);
 
@@ -1241,6 +1248,9 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 struct dept_map __maybe_unused pg_locked_map = DEPT_MAP_INITIALIZER(pg_locked_map, NULL);
 EXPORT_SYMBOL(pg_locked_map);
 
+struct dept_map __maybe_unused pg_writeback_map = DEPT_MAP_INITIALIZER(pg_writeback_map, NULL);
+EXPORT_SYMBOL(pg_writeback_map);
+
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
@@ -1683,6 +1693,7 @@ void folio_end_writeback(struct folio *folio)
 	 * reused before the folio_wake_bit().
 	 */
 	folio_get(folio);
+	dept_page_clear_bit(&folio->page, PG_writeback);
 	if (__folio_end_writeback(folio))
 		folio_wake_bit(folio, PG_writeback);
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 09e4ac6a73c7..fd2bf6689afa 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -589,6 +589,7 @@ void __meminit __init_single_page(struct page *page, unsigned long pfn,
 	page_cpupid_reset_last(page);
 	page_kasan_tag_reset(page);
 	dept_ext_wgen_init(&page->pg_locked_wgen);
+	dept_ext_wgen_init(&page->pg_writeback_wgen);
 
 	INIT_LIST_HEAD(&page->lru);
 #ifdef WANT_PAGE_VIRTUAL
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 46/47] SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (44 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 45/47] dept: track PG_writeback with dept Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  2025-10-02  8:12 ` [PATCH v17 47/47] mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large PAGE_SIZE Byungchul Park
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

While compiling Linux kernel with DEPT on, the following error was
observed:

   ./include/linux/rcupdate.h:1084:17: note: in expansion of macro
   ‘BUILD_BUG_ON’
   1084 | BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096);	\
        | ^~~~~~~~~~~~
   ./include/linux/rcupdate.h:1047:29: note: in expansion of macro
   'kvfree_rcu_arg_2'
   1047 | #define kfree_rcu(ptr, rhf) kvfree_rcu_arg_2(ptr, rhf)
        |                             ^~~~~~~~~~~~~~~~
   net/sunrpc/xprt.c:1856:9: note: in expansion of macro 'kfree_rcu'
   1856 | kfree_rcu(xprt, rcu);
        | ^~~~~~~~~
    CC net/kcm/kcmproc.o
   make[4]: *** [scripts/Makefile.build:203: net/sunrpc/xprt.o] Error 1

Since kfree_rcu() assumes 'offset of struct rcu_head in a rcu-managed
struct < 4096', the offest of struct rcu_head in struct rpc_xprt should
not exceed 4096 but does, due to the debug information added by DEPT.

Relocate struct rcu_head to the first field of struct rpc_xprt from an
arbitrary location to avoid the issue and meet the assumption.

Reported-by: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/sunrpc/xprt.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index f46d1fb8f71a..666e42a17a31 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -211,6 +211,14 @@ enum xprt_transports {
 
 struct rpc_sysfs_xprt;
 struct rpc_xprt {
+	/*
+	 * Place struct rcu_head within the first 4096 bytes of struct
+	 * rpc_xprt if sizeof(struct rpc_xprt) > 4096, so that
+	 * kfree_rcu() can simply work assuming that.  See the comment
+	 * in kfree_rcu().
+	 */
+	struct rcu_head		rcu;
+
 	struct kref		kref;		/* Reference count */
 	const struct rpc_xprt_ops *ops;		/* transport methods */
 	unsigned int		id;		/* transport id */
@@ -317,7 +325,6 @@ struct rpc_xprt {
 #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
 	struct dentry		*debugfs;		/* debugfs directory */
 #endif
-	struct rcu_head		rcu;
 	const struct xprt_class	*xprt_class;
 	struct rpc_sysfs_xprt	*xprt_sysfs;
 	bool			main; /*mark if this is the 1st transport */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v17 47/47] mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large PAGE_SIZE
  2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
                   ` (45 preceding siblings ...)
  2025-10-02  8:12 ` [PATCH v17 46/47] SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt Byungchul Park
@ 2025-10-02  8:12 ` Byungchul Park
  46 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-02  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Yunseong reported a build failure due to the BUILD_BUG_ON() statement in
alloc_kmem_cache_cpus().  In the following test:

  PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu)

The following factors increase the right side of the equation:

  1. PAGE_SIZE > 4KiB increases KMALLOC_SHIFT_HIGH.
  2. DEPT increases the size of the local_lock_t in kmem_cache_cpu.

Increase PERCPU_DYNAMIC_SIZE_SHIFT to 11 on configs with PAGE_SIZE
larger than 4KiB and DEPT enabled.

Reported-by: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/percpu.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 85bf8dd9f087..dd74321d4bbd 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -43,7 +43,11 @@
 # define PERCPU_DYNAMIC_SIZE_SHIFT      12
 #endif /* LOCKDEP and PAGE_SIZE > 4KiB */
 #else
+#if defined(CONFIG_DEPT) && !defined(CONFIG_PAGE_SIZE_4KB)
+#define PERCPU_DYNAMIC_SIZE_SHIFT      11
+#else
 #define PERCPU_DYNAMIC_SIZE_SHIFT      10
+#endif /* DEPT and PAGE_SIZE > 4KiB */
 #endif
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h
  2025-10-02  8:12 ` [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h Byungchul Park
@ 2025-10-02  8:24   ` Greg KH
  2025-10-02 13:53     ` Mathieu Desnoyers
  0 siblings, 1 reply; 73+ messages in thread
From: Greg KH @ 2025-10-02  8:24 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, kernel-team, linux-mm, akpm, mhocko,
	minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg,
	rientjes, vbabka, ngupta, linux-block, josef, linux-fsdevel, jack,
	jlayton, dan.j.williams, hch, djwong, dri-devel,
	rodrigosiqueiramelo, melissa.srw, hamohammed.sa, harry.yoo,
	chris.p.wilson, gwan-gyeong.mun, max.byungchul.park, boqun.feng,
	longman, yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost,
	her0gyugyu, corbet, catalin.marinas, bp, dave.hansen, x86, hpa,
	luto, sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 05:12:01PM +0900, Byungchul Park wrote:
> llist_head and llist_node can be used by some other header files.  For
> example, dept for tracking dependencies uses llist in its header.  To
> avoid header dependency, move them to types.h.

If you need llist in your code, then include llist.h.  Don't force all
types.h users to do so as there is not a dependency in types.h for
llist.h.

This patch shouldn't be needed as you are hiding "header dependency" for
other files.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker)
  2025-10-02  8:12 ` [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker) Byungchul Park
@ 2025-10-02  8:25   ` Greg KH
  2025-10-02 12:56     ` Geert Uytterhoeven
  0 siblings, 1 reply; 73+ messages in thread
From: Greg KH @ 2025-10-02  8:25 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, kernel-team, linux-mm, akpm, mhocko,
	minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg,
	rientjes, vbabka, ngupta, linux-block, josef, linux-fsdevel, jack,
	jlayton, dan.j.williams, hch, djwong, dri-devel,
	rodrigosiqueiramelo, melissa.srw, hamohammed.sa, harry.yoo,
	chris.p.wilson, gwan-gyeong.mun, max.byungchul.park, boqun.feng,
	longman, yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost,
	her0gyugyu, corbet, catalin.marinas, bp, dave.hansen, x86, hpa,
	luto, sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

> @@ -0,0 +1,446 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * DEPT(DEPendency Tracker) - runtime dependency tracker
> + *
> + * Started by Byungchul Park <max.byungchul.park@gmail.com>:
> + *
> + *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
> + *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park

Nit, it's now 2025 :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling
  2025-10-02  8:12 ` [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling Byungchul Park
@ 2025-10-02  8:40   ` Jan Kara
  2025-10-03  1:13     ` Byungchul Park
  0 siblings, 1 reply; 73+ messages in thread
From: Jan Kara @ 2025-10-02  8:40 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu 02-10-25 17:12:30, Byungchul Park wrote:
> jbd2 journal handling code doesn't want jbd2_might_wait_for_commit()
> to be placed between start_this_handle() and stop_this_handle().  So it
> marks the region with rwsem_acquire_read() and rwsem_release().
> 
> However, the annotation is too strong for that purpose.  We don't have
> to use more than try lock annotation for that.
> 
> rwsem_acquire_read() implies:
> 
>    1. might be a waiter on contention of the lock.
>    2. enter to the critical section of the lock.
> 
> All we need in here is to act 2, not 1.  So trylock version of
> annotation is sufficient for that purpose.  Now that dept partially
> relies on lockdep annotaions, dept interpets rwsem_acquire_read() as a
> potential wait and might report a deadlock by the wait.
> 
> Replace it with trylock version of annotation.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>

Indeed. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/transaction.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index c7867139af69..b4e65f51bf5e 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -441,7 +441,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
>  	read_unlock(&journal->j_state_lock);
>  	current->journal_info = handle;
>  
> -	rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
> +	rwsem_acquire_read(&journal->j_trans_commit_map, 0, 1, _THIS_IP_);
>  	jbd2_journal_free_transaction(new_transaction);
>  	/*
>  	 * Ensure that no allocations done while the transaction is open are
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-02  8:12 ` [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64 Byungchul Park
@ 2025-10-02 11:39   ` Mark Brown
  2025-10-03  1:46     ` Byungchul Park
  2025-10-03 14:36   ` Mark Rutland
  1 sibling, 1 reply; 73+ messages in thread
From: Mark Brown @ 2025-10-02 11:39 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> dept needs to notice every entrance from user to kernel mode to treat
> every kernel context independently when tracking wait-event dependencies.
> Roughly, system call and user oriented fault are the cases.
> 
> Make dept aware of the entrances of arm64 and add support
> CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.

The description of what needs to be tracked probably needs some
tightening up here, it's not clear to me for example why exceptions for
mops or the vector extensions aren't included here, or what the
distinction is with error faults like BTI or GCS not being tracked?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker)
  2025-10-02  8:25   ` Greg KH
@ 2025-10-02 12:56     ` Geert Uytterhoeven
  0 siblings, 0 replies; 73+ messages in thread
From: Geert Uytterhoeven @ 2025-10-02 12:56 UTC (permalink / raw)
  To: Greg KH
  Cc: Byungchul Park, linux-kernel, kernel_team, torvalds,
	damien.lemoal, linux-ide, adilger.kernel, linux-ext4, mingo,
	peterz, will, tglx, rostedt, joel, sashal, daniel.vetter,
	duyuyang, johannes.berg, tj, tytso, willy, david, amir73il,
	kernel-team, linux-mm, akpm, mhocko, minchan, hannes,
	vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes, vbabka,
	ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Hi Greg,

On Thu, 2 Oct 2025 at 10:25, Greg KH <gregkh@linuxfoundation.org> wrote:
> > @@ -0,0 +1,446 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * DEPT(DEPendency Tracker) - runtime dependency tracker
> > + *
> > + * Started by Byungchul Park <max.byungchul.park@gmail.com>:
> > + *
> > + *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
> > + *  Copyright (c) 2024 SK hynix, Inc., Byungchul Park
>
> Nit, it's now 2025 :)

The last non-trivial change to this file was between the last version
posted in 2024 (v14) and the first version posted in 2025 (v15),
so 2024 doesn't sound that off to me.
You are not supposed to bump the copyright year when republishing
without any actual changes.  It is meant to be the work’s first year
of publication.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h
  2025-10-02  8:24   ` Greg KH
@ 2025-10-02 13:53     ` Mathieu Desnoyers
  2025-10-02 23:19       ` Arnd Bergmann
  0 siblings, 1 reply; 73+ messages in thread
From: Mathieu Desnoyers @ 2025-10-02 13:53 UTC (permalink / raw)
  To: Greg KH, Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, kernel-team, linux-mm, akpm, mhocko,
	minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg,
	rientjes, vbabka, ngupta, linux-block, josef, linux-fsdevel, jack,
	jlayton, dan.j.williams, hch, djwong, dri-devel,
	rodrigosiqueiramelo, melissa.srw, hamohammed.sa, harry.yoo,
	chris.p.wilson, gwan-gyeong.mun, max.byungchul.park, boqun.feng,
	longman, yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost,
	her0gyugyu, corbet, catalin.marinas, bp, dave.hansen, x86, hpa,
	luto, sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On 2025-10-02 04:24, Greg KH wrote:
> On Thu, Oct 02, 2025 at 05:12:01PM +0900, Byungchul Park wrote:
>> llist_head and llist_node can be used by some other header files.  For
>> example, dept for tracking dependencies uses llist in its header.  To
>> avoid header dependency, move them to types.h.
> 
> If you need llist in your code, then include llist.h.  Don't force all
> types.h users to do so as there is not a dependency in types.h for
> llist.h.
> 
> This patch shouldn't be needed as you are hiding "header dependency" for
> other files.

I agree that moving this into a catch-all types.h is not what we should
aim for.

However, it's a good practice to move the type declarations to a
separate header file, so code that only cares about type and not
implementation of static inline functions can include just that.

Perhaps we can move struct llist_head and struct llist_node to a new
include/linux/llist_types.h instead ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64
  2025-10-02  8:12 ` [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64 Byungchul Park
@ 2025-10-02 15:22   ` Dave Hansen
  2025-10-03  1:12     ` Byungchul Park
  0 siblings, 1 reply; 73+ messages in thread
From: Dave Hansen @ 2025-10-02 15:22 UTC (permalink / raw)
  To: Byungchul Park, linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On 10/2/25 01:12, Byungchul Park wrote:
> dept needs to notice every entrance from user to kernel mode to treat
> every kernel context independently when tracking wait-event dependencies.
> Roughly, system call and user oriented fault are the cases.

"Roughly"?

>  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
>  #define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
> @@ -86,6 +87,12 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
>  /* Returns true to return using SYSRET, or false to use IRET */
>  __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
>  {
> +	/*
> +	 * This is a system call from user mode.  Make dept work with a
> +	 * new kernel mode context.
> +	 */
> +	dept_update_cxt();
> +
>  	add_random_kstack_offset();
>  	nr = syscall_enter_from_user_mode(regs, nr);

Please take a look in syscall_enter_from_user_mode(). You'll see the
quite nicely-named function: enter_from_user_mode(). That might be a
nice place to put code that you want to run when the kernel is entered
from user mode.

> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 998bd807fc7b..017edb75f0a0 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -19,6 +19,7 @@
>  #include <linux/mm_types.h>
>  #include <linux/mm.h>			/* find_and_lock_vma() */
>  #include <linux/vmalloc.h>
> +#include <linux/dept.h>
>  
>  #include <asm/cpufeature.h>		/* boot_cpu_has, ...		*/
>  #include <asm/traps.h>			/* dotraplinkage, ...		*/
> @@ -1219,6 +1220,12 @@ void do_user_addr_fault(struct pt_regs *regs,
>  	tsk = current;
>  	mm = tsk->mm;
>  
> +	/*
> +	 * This fault comes from user mode.  Make dept work with a new
> +	 * kernel mode context.
> +	 */
> +	dept_update_cxt();
No, this fault does not come from user mode. That's why we call it "user
addr" fault, not "user mode" fault. You end up here if, for instance,
the kernel faults doing a copy_from_user().

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h
  2025-10-02 13:53     ` Mathieu Desnoyers
@ 2025-10-02 23:19       ` Arnd Bergmann
  0 siblings, 0 replies; 73+ messages in thread
From: Arnd Bergmann @ 2025-10-02 23:19 UTC (permalink / raw)
  To: Mathieu Desnoyers, Greg Kroah-Hartman, Byungchul Park
  Cc: linux-kernel, kernel_team, Linus Torvalds, Damien Le Moal,
	linux-ide, Andreas Dilger, linux-ext4, Ingo Molnar,
	Peter Zijlstra, Will Deacon, Thomas Gleixner, Steven Rostedt,
	Joel Fernandes, Sasha Levin, Daniel Vetter, duyuyang,
	Johannes Berg, Tejun Heo, Theodore Ts'o, Matthew Wilcox,
	Dave Chinner, Amir Goldstein, kernel-team, linux-mm,
	Andrew Morton, Michal Hocko, Minchan Kim, Johannes Weiner,
	vdavydov.dev, SeongJae Park, jglisse, Dennis Zhou,
	Christoph Lameter, Pekka Enberg, David Rientjes, Vlastimil Babka,
	ngupta, linux-block, Josef Bacik, linux-fsdevel, Jan Kara,
	Jeff Layton, Dan Williams, Christoph Hellwig, Darrick J. Wong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, Gwan-gyeong Mun, max.byungchul.park,
	Boqun Feng, Waiman Long, yunseong.kim, ysk, Yeoreum Yun, Netdev,
	Matthew Brost, her0gyugyu, Jonathan Corbet, Catalin Marinas,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Andy Lutomirski, Sumit Semwal, gustavo, Christian König,
	Andi Shyti, Lorenzo Stoakes, Liam R. Howlett, Mike Rapoport,
	Suren Baghdasaryan, Luis Chamberlain, Petr Pavlu, da.gomez,
	Sami Tolvanen, Paul E. McKenney, Frederic Weisbecker,
	neeraj.upadhyay, joelagnelf, Josh Triplett,
	Uladzislau Rezki (Sony), Lai Jiangshan, qiang.zhang, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Benjamin Segall, Mel Gorman,
	Valentin Schneider, Chuck Lever, neil, okorniev, Dai Ngo,
	Tom Talpey, trondmy, Anna Schumaker, Kees Cook,
	Sebastian Andrzej Siewior, Clark Williams, Mark Rutland,
	ada.coupriediaz, kristina.martsenko, Kefeng Wang, Mark Brown,
	Kevin Brodsky, David Woodhouse, Shakeel Butt, Alexei Starovoitov,
	Zi Yan, Yu Zhao, Baolin Wang, usamaarif642, joel.granados,
	Wei Yang, Geert Uytterhoeven, tim.c.chen, linux,
	Alexander Shishkin, lillian, Huacai Chen, francesco,
	guoweikang.kernel, link, Josh Poimboeuf, Masahiro Yamada,
	Christian Brauner, Thomas Weißschuh, Oleg Nesterov,
	Mateusz Guzik, Andrii Nakryiko, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	Linux-Arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 2, 2025, at 15:53, Mathieu Desnoyers wrote:
> On 2025-10-02 04:24, Greg KH wrote:
>> On Thu, Oct 02, 2025 at 05:12:01PM +0900, Byungchul Park wrote:
>>> llist_head and llist_node can be used by some other header files.  For
>>> example, dept for tracking dependencies uses llist in its header.  To
>>> avoid header dependency, move them to types.h.
>> 
>> If you need llist in your code, then include llist.h.  Don't force all
>> types.h users to do so as there is not a dependency in types.h for
>> llist.h.
>> 
>> This patch shouldn't be needed as you are hiding "header dependency" for
>> other files.
>
> I agree that moving this into a catch-all types.h is not what we should
> aim for.
>
> However, it's a good practice to move the type declarations to a
> separate header file, so code that only cares about type and not
> implementation of static inline functions can include just that.
>
> Perhaps we can move struct llist_head and struct llist_node to a new
> include/linux/llist_types.h instead ?

We have around a dozen types of linked lists, and the most common
two of them are currently defined in linux/types.h, while the
rest of them are each defined in the same header as the inteface
definition.

Duplicating each of those headers by splitting out the trivial
type definition doesn't quite seem right either, as we'd end
up with even more headers that have to be included indirectly
in each compilation unit.

Maybe a shared linux/list_types.h would work, to specifically
contain all the list_head variants that are meant to be included
in larger structures?

    Arnd

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64
  2025-10-02 15:22   ` Dave Hansen
@ 2025-10-03  1:12     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-03  1:12 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 08:22:29AM -0700, Dave Hansen wrote:
> On 10/2/25 01:12, Byungchul Park wrote:
> > dept needs to notice every entrance from user to kernel mode to treat
> > every kernel context independently when tracking wait-event dependencies.
> > Roughly, system call and user oriented fault are the cases.
> 
> "Roughly"?

I will change it to a better one.

> >  #define __SYSCALL(nr, sym) extern long __x64_##sym(const struct pt_regs *);
> >  #define __SYSCALL_NORETURN(nr, sym) extern long __noreturn __x64_##sym(const struct pt_regs *);
> > @@ -86,6 +87,12 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
> >  /* Returns true to return using SYSRET, or false to use IRET */
> >  __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> >  {
> > +     /*
> > +      * This is a system call from user mode.  Make dept work with a
> > +      * new kernel mode context.
> > +      */
> > +     dept_update_cxt();
> > +
> >       add_random_kstack_offset();
> >       nr = syscall_enter_from_user_mode(regs, nr);
> 
> Please take a look in syscall_enter_from_user_mode(). You'll see the
> quite nicely-named function: enter_from_user_mode(). That might be a
> nice place to put code that you want to run when the kernel is entered
> from user mode.

I wanted to put dept_update_cxt() to the very beginning of c code but..
yeah enter_from_user_mode() looks fine or even better.  Thanks a lot.

> > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> > index 998bd807fc7b..017edb75f0a0 100644
> > --- a/arch/x86/mm/fault.c
> > +++ b/arch/x86/mm/fault.c
> > @@ -19,6 +19,7 @@
> >  #include <linux/mm_types.h>
> >  #include <linux/mm.h>                        /* find_and_lock_vma() */
> >  #include <linux/vmalloc.h>
> > +#include <linux/dept.h>
> >
> >  #include <asm/cpufeature.h>          /* boot_cpu_has, ...            */
> >  #include <asm/traps.h>                       /* dotraplinkage, ...           */
> > @@ -1219,6 +1220,12 @@ void do_user_addr_fault(struct pt_regs *regs,
> >       tsk = current;
> >       mm = tsk->mm;
> >
> > +     /*
> > +      * This fault comes from user mode.  Make dept work with a new
> > +      * kernel mode context.
> > +      */
> > +     dept_update_cxt();
> No, this fault does not come from user mode. That's why we call it "user
> addr" fault, not "user mode" fault. You end up here if, for instance,
> the kernel faults doing a copy_from_user().

My bad.  Thank you.  I will fix it.  Thank you very much.

	Byungchul

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling
  2025-10-02  8:40   ` Jan Kara
@ 2025-10-03  1:13     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-03  1:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jlayton, dan.j.williams, hch, djwong, dri-devel,
	rodrigosiqueiramelo, melissa.srw, hamohammed.sa, harry.yoo,
	chris.p.wilson, gwan-gyeong.mun, max.byungchul.park, boqun.feng,
	longman, yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost,
	her0gyugyu, corbet, catalin.marinas, bp, dave.hansen, x86, hpa,
	luto, sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 10:40:56AM +0200, Jan Kara wrote:
> On Thu 02-10-25 17:12:30, Byungchul Park wrote:
> > jbd2 journal handling code doesn't want jbd2_might_wait_for_commit()
> > to be placed between start_this_handle() and stop_this_handle().  So it
> > marks the region with rwsem_acquire_read() and rwsem_release().
> >
> > However, the annotation is too strong for that purpose.  We don't have
> > to use more than try lock annotation for that.
> >
> > rwsem_acquire_read() implies:
> >
> >    1. might be a waiter on contention of the lock.
> >    2. enter to the critical section of the lock.
> >
> > All we need in here is to act 2, not 1.  So trylock version of
> > annotation is sufficient for that purpose.  Now that dept partially
> > relies on lockdep annotaions, dept interpets rwsem_acquire_read() as a
> > potential wait and might report a deadlock by the wait.
> >
> > Replace it with trylock version of annotation.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> 
> Indeed. Feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Thank you, Jan.

	Byungchul

>                                                                 Honza
> 
> > ---
> >  fs/jbd2/transaction.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> > index c7867139af69..b4e65f51bf5e 100644
> > --- a/fs/jbd2/transaction.c
> > +++ b/fs/jbd2/transaction.c
> > @@ -441,7 +441,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
> >       read_unlock(&journal->j_state_lock);
> >       current->journal_info = handle;
> >
> > -     rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
> > +     rwsem_acquire_read(&journal->j_trans_commit_map, 0, 1, _THIS_IP_);
> >       jbd2_journal_free_transaction(new_transaction);
> >       /*
> >        * Ensure that no allocations done while the transaction is open are
> > --
> > 2.17.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-02 11:39   ` Mark Brown
@ 2025-10-03  1:46     ` Byungchul Park
  2025-10-03 11:33       ` Mark Brown
  0 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-03  1:46 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 12:39:31PM +0100, Mark Brown wrote:
> On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> > dept needs to notice every entrance from user to kernel mode to treat
> > every kernel context independently when tracking wait-event dependencies.
> > Roughly, system call and user oriented fault are the cases.
> > 
> > Make dept aware of the entrances of arm64 and add support
> > CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.
> 
> The description of what needs to be tracked probably needs some
> tightening up here, it's not clear to me for example why exceptions for
> mops or the vector extensions aren't included here, or what the
> distinction is with error faults like BTI or GCS not being tracked?

Thanks for the feedback but I'm afraid I don't get you.  Can you explain
in more detail with example?

JFYI, pairs of wait and its event need to be tracked to see if each
event can be prevented from being reachable by other waits like:

   context X				context Y

   lock L
   ...
   initiate event A context		start toward event A
   ...					...
   wait A // wait for event A and	lock L // wait for unlock L and
          // prevent unlock L		       // prevent event A
   ...					...
   unlock L				unlock L
					...
					event A

I meant things like this need to be tracked.

	Byungchul

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
@ 2025-10-03  2:44   ` Bagas Sanjaya
  2025-10-13  1:28     ` Byungchul Park
  2025-10-03  5:36   ` Jonathan Corbet
  2025-10-03  6:55   ` NeilBrown
  2 siblings, 1 reply; 73+ messages in thread
From: Bagas Sanjaya @ 2025-10-03  2:44 UTC (permalink / raw)
  To: Byungchul Park, linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	corbet, catalin.marinas, bp, dave.hansen, x86, hpa, luto,
	sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
	lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
	da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
	joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
	qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
	ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
	kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
	usamaarif642, joel.granados, richard.weiyang, geert+renesas,
	tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
	francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 05:12:28PM +0900, Byungchul Park wrote:
> This document describes the concept and APIs of dept.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
>  Documentation/dependency/dept_api.txt | 117 ++++
>  2 files changed, 852 insertions(+)
>  create mode 100644 Documentation/dependency/dept.txt
>  create mode 100644 Documentation/dependency/dept_api.txt

What about writing dept docs in reST (like the rest of kernel documentation)?

---- >8 ----
diff --git a/Documentation/dependency/dept.txt b/Documentation/locking/dept.rst
similarity index 92%
rename from Documentation/dependency/dept.txt
rename to Documentation/locking/dept.rst
index 5dd358b96734e6..7b90a0d95f0876 100644
--- a/Documentation/dependency/dept.txt
+++ b/Documentation/locking/dept.rst
@@ -8,7 +8,7 @@ How lockdep works
 
 Lockdep detects a deadlock by checking lock acquisition order. For
 example, a graph to track acquisition order built by lockdep might look
-like:
+like::
 
    A -> B -
            \
@@ -16,12 +16,12 @@ like:
            /
    C -> D -
 
-   where 'A -> B' means that acquisition A is prior to acquisition B
-   with A still held.
+where 'A -> B' means that acquisition A is prior to acquisition B
+with A still held.
 
 Lockdep keeps adding each new acquisition order into the graph in
 runtime. For example, 'E -> C' will be added when the two locks have
-been acquired in the order, E and then C. The graph will look like:
+been acquired in the order, E and then C. The graph will look like::
 
        A -> B -
                \
@@ -32,10 +32,10 @@ been acquired in the order, E and then C. The graph will look like:
    \                  /
     ------------------
 
-   where 'A -> B' means that acquisition A is prior to acquisition B
-   with A still held.
+where 'A -> B' means that acquisition A is prior to acquisition B
+with A still held.
 
-This graph contains a subgraph that demonstrates a loop like:
+This graph contains a subgraph that demonstrates a loop like::
 
                 -> E -
                /      \
@@ -67,6 +67,8 @@ mechanisms, lockdep doesn't work.
 
 Can lockdep detect the following deadlock?
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -80,6 +82,8 @@ Can lockdep detect the following deadlock?
 
 No. What about the following?
 
+::
+
    context X		   context Y
 
 			   mutex_lock A
@@ -101,7 +105,7 @@ What leads a deadlock
 ---------------------
 
 A deadlock occurs when one or multi contexts are waiting for events that
-will never happen. For example:
+will never happen. For example::
 
    context X	   context Y	   context Z
 
@@ -121,24 +125,24 @@ We call this *deadlock*.
 If an event occurrence is a prerequisite to reaching another event, we
 call it *dependency*. In this example:
 
-   Event A occurrence is a prerequisite to reaching event C.
-   Event C occurrence is a prerequisite to reaching event B.
-   Event B occurrence is a prerequisite to reaching event A.
+   * Event A occurrence is a prerequisite to reaching event C.
+   * Event C occurrence is a prerequisite to reaching event B.
+   * Event B occurrence is a prerequisite to reaching event A.
 
 In terms of dependency:
 
-   Event C depends on event A.
-   Event B depends on event C.
-   Event A depends on event B.
+   * Event C depends on event A.
+   * Event B depends on event C.
+   * Event A depends on event B.
 
-Dependency graph reflecting this example will look like:
+Dependency graph reflecting this example will look like::
 
     -> C -> A -> B -
    /                \
    \                /
     ----------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 A circular dependency exists. Such a circular dependency leads a
 deadlock since no waiters can have desired events triggered.
@@ -152,7 +156,7 @@ Introduce DEPT
 --------------
 
 DEPT(DEPendency Tracker) tracks wait and event instead of lock
-acquisition order so as to recognize the following situation:
+acquisition order so as to recognize the following situation::
 
    context X	   context Y	   context Z
 
@@ -165,18 +169,18 @@ acquisition order so as to recognize the following situation:
 				   event A
 
 and builds up a dependency graph in runtime that is similar to lockdep.
-The graph might look like:
+The graph might look like::
 
     -> C -> A -> B -
    /                \
    \                /
     ----------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 DEPT keeps adding each new dependency into the graph in runtime. For
 example, 'B -> D' will be added when event D occurrence is a
-prerequisite to reaching event B like:
+prerequisite to reaching event B like::
 
    |
    v
@@ -184,7 +188,7 @@ prerequisite to reaching event B like:
    .
    event B
 
-After the addition, the graph will look like:
+After the addition, the graph will look like::
 
                      -> D
                     /
@@ -209,6 +213,8 @@ How DEPT works
 Let's take a look how DEPT works with the 1st example in the section
 'Limitation of lockdep'.
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -220,7 +226,7 @@ Let's take a look how DEPT works with the 1st example in the section
 		   mutex_unlock A
 				   mutex_unlock A
 
-Adding comments to describe DEPT's view in terms of wait and event:
+Adding comments to describe DEPT's view in terms of wait and event::
 
    context X	   context Y	   context Z
 
@@ -248,7 +254,7 @@ Adding comments to describe DEPT's view in terms of wait and event:
 				   mutex_unlock A
 				   /* event A */
 
-Adding more supplementary comments to describe DEPT's view in detail:
+Adding more supplementary comments to describe DEPT's view in detail::
 
    context X	   context Y	   context Z
 
@@ -283,7 +289,7 @@ Adding more supplementary comments to describe DEPT's view in detail:
 				   mutex_unlock A
 				   /* event A that's been valid since 4 */
 
-Let's build up dependency graph with this example. Firstly, context X:
+Let's build up dependency graph with this example. Firstly, context X::
 
    context X
 
@@ -292,7 +298,7 @@ Let's build up dependency graph with this example. Firstly, context X:
    /* start to take into account event B's context */
    /* 2 */
 
-There are no events to create dependency. Next, context Y:
+There are no events to create dependency. Next, context Y::
 
    context Y
 
@@ -317,13 +323,13 @@ waits between 3 and the event, event B does not create dependency. For
 event A, there is a wait, folio_lock B, between 1 and the event. Which
 means event A cannot be triggered if event B does not wake up the wait.
 Therefore, we can say event A depends on event B, say, 'A -> B'. The
-graph will look like after adding the dependency:
+graph will look like after adding the dependency::
 
    A -> B
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
-Lastly, context Z:
+Lastly, context Z::
 
    context Z
 
@@ -343,7 +349,7 @@ wait, mutex_lock A, between 2 and the event - remind 2 is at a very
 start and before the wait in timeline. Which means event B cannot be
 triggered if event A does not wake up the wait. Therefore, we can say
 event B depends on event A, say, 'B -> A'. The graph will look like
-after adding the dependency:
+after adding the dependency::
 
     -> A -> B -
    /           \
@@ -367,6 +373,8 @@ Interpret DEPT report
 
 The following is the example in the section 'How DEPT works'.
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -402,7 +410,7 @@ The following is the example in the section 'How DEPT works'.
 
 We can Simplify this by replacing each waiting point with [W], each
 point where its event's context starts with [S] and each event with [E].
-This example will look like after the replacement:
+This example will look like after the replacement::
 
    context X	   context Y	   context Z
 
@@ -419,6 +427,8 @@ This example will look like after the replacement:
 DEPT uses the symbols [W], [S] and [E] in its report as described above.
 The following is an example reported by DEPT for a real problem.
 
+::
+
    Link: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
    Link: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
 
@@ -620,6 +630,8 @@ The following is an example reported by DEPT for a real problem.
 
 Let's take a look at the summary that is the most important part.
 
+::
+
    ---------------------------------------------------
    summary
    ---------------------------------------------------
@@ -639,7 +651,7 @@ Let's take a look at the summary that is the most important part.
    [W]: the wait blocked
    [E]: the event not reachable
 
-The summary shows the following scenario:
+The summary shows the following scenario::
 
    context A	   context B	   context ?(unknown)
 
@@ -652,7 +664,7 @@ The summary shows the following scenario:
 
    [E] unlock(&ni->ni_lock:0)
 
-Adding supplementary comments to describe DEPT's view in detail:
+Adding supplementary comments to describe DEPT's view in detail::
 
    context A	   context B	   context ?(unknown)
 
@@ -677,7 +689,7 @@ Adding supplementary comments to describe DEPT's view in detail:
    [E] unlock(&ni->ni_lock:0)
    /* event that's been valid since 2 */
 
-Let's build up dependency graph with this report. Firstly, context A:
+Let's build up dependency graph with this report. Firstly, context A::
 
    context A
 
@@ -697,13 +709,13 @@ wait, folio_lock(&f1), between 2 and the event. Which means
 unlock(&ni->ni_lock:0) is not reachable if folio_unlock(&f1) does not
 wake up the wait. Therefore, we can say unlock(&ni->ni_lock:0) depends
 on folio_unlock(&f1), say, 'unlock(&ni->ni_lock:0) -> folio_unlock(&f1)'.
-The graph will look like after adding the dependency:
+The graph will look like after adding the dependency::
 
    unlock(&ni->ni_lock:0) -> folio_unlock(&f1)
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
-Secondly, context B:
+Secondly, context B::
 
    context B
 
@@ -719,14 +731,14 @@ very start and before the wait in timeline. Which means folio_unlock(&f1)
 is not reachable if unlock(&ni->ni_lock:0) does not wake up the wait.
 Therefore, we can say folio_unlock(&f1) depends on unlock(&ni->ni_lock:0),
 say, 'folio_unlock(&f1) -> unlock(&ni->ni_lock:0)'. The graph will look
-like after adding the dependency:
+like after adding the dependency::
 
     -> unlock(&ni->ni_lock:0) -> folio_unlock(&f1) -
    /                                                \
    \                                                /
     ------------------------------------------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 A new loop has been created. So DEPT can report it as a deadlock! Cool!
 
diff --git a/Documentation/dependency/dept_api.txt b/Documentation/locking/dept_api.rst
similarity index 97%
rename from Documentation/dependency/dept_api.txt
rename to Documentation/locking/dept_api.rst
index 8e0d5a118a460e..96c4d65f4a9a2d 100644
--- a/Documentation/dependency/dept_api.txt
+++ b/Documentation/locking/dept_api.rst
@@ -10,6 +10,8 @@ already applied into the existing synchronization primitives e.g.
 waitqueue, swait, wait_for_completion(), dma fence and so on.  The basic
 APIs of SDT are:
 
+.. code-block:: c
+
    /*
     * After defining 'struct dept_map map', initialize the instance.
     */
@@ -27,6 +29,8 @@ APIs of SDT are:
 
 The advanced APIs of SDT are:
 
+.. code-block:: c
+
    /*
     * After defining 'struct dept_map map', initialize the instance
     * using an external key.
@@ -83,6 +87,8 @@ Do not use these APIs directly.  These are the wrappers for typical
 locks, that have been already applied into major locks internally e.g.
 spin lock, mutex, rwlock and so on.  The APIs of LDT are:
 
+.. code-block:: c
+   
    ldt_init(map, key, sub, name);
    ldt_lock(map, sub_local, try, nest, ip);
    ldt_rlock(map, sub_local, try, nest, ip, queued);
@@ -96,6 +102,8 @@ Raw APIs
 --------
 Do not use these APIs directly.  The raw APIs of dept are:
 
+.. code-block:: c
+
    dept_free_range(start, size);
    dept_map_init(map, key, sub, name);
    dept_map_reinit(map, key, sub, name);
diff --git a/Documentation/locking/index.rst b/Documentation/locking/index.rst
index 6a9ea96c8bcb70..7ec3dce7fee425 100644
--- a/Documentation/locking/index.rst
+++ b/Documentation/locking/index.rst
@@ -24,6 +24,8 @@ Locking
     percpu-rw-semaphore
     robust-futexes
     robust-futex-ABI
+    dept
+    dept_api
 
 .. only::  subproject and html
 

> +Can lockdep detect the following deadlock?
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +   folio_lock B
> +		   folio_lock B <- DEADLOCK
> +				   mutex_lock A <- DEADLOCK
> +				   folio_unlock B
> +		   folio_unlock B
> +		   mutex_unlock A
> +				   mutex_unlock A
> +
> +No. What about the following?
> +
> +   context X		   context Y
> +
> +			   mutex_lock A
> +   mutex_lock A <- DEADLOCK
> +			   wait_for_complete B <- DEADLOCK
> +   complete B
> +			   mutex_unlock A
> +   mutex_unlock A

Can you explain how DEPT detects deadlock on the second example above (like
the first one being described in "How DEPT works" section)?

Confused...

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
  2025-10-03  2:44   ` Bagas Sanjaya
@ 2025-10-03  5:36   ` Jonathan Corbet
  2025-10-13  1:03     ` Byungchul Park
  2025-10-03  6:55   ` NeilBrown
  2 siblings, 1 reply; 73+ messages in thread
From: Jonathan Corbet @ 2025-10-03  5:36 UTC (permalink / raw)
  To: Byungchul Park, linux-kernel
  Cc: kernel_team, torvalds, damien.lemoal, linux-ide, adilger.kernel,
	linux-ext4, mingo, peterz, will, tglx, rostedt, joel, sashal,
	daniel.vetter, duyuyang, johannes.berg, tj, tytso, willy, david,
	amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko, minchan,
	hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg, rientjes,
	vbabka, ngupta, linux-block, josef, linux-fsdevel, jack, jlayton,
	dan.j.williams, hch, djwong, dri-devel, rodrigosiqueiramelo,
	melissa.srw, hamohammed.sa, harry.yoo, chris.p.wilson,
	gwan-gyeong.mun, max.byungchul.park, boqun.feng, longman,
	yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost, her0gyugyu,
	catalin.marinas, bp, dave.hansen, x86, hpa, luto, sumit.semwal,
	gustavo, christian.koenig, andi.shyti, arnd, lorenzo.stoakes,
	Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu, da.gomez,
	samitolvanen, paulmck, frederic, neeraj.upadhyay, joelagnelf,
	josh, urezki, mathieu.desnoyers, jiangshanlai, qiang.zhang,
	juri.lelli, vincent.guittot, dietmar.eggemann, bsegall, mgorman,
	vschneid, chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy,
	anna, kees, bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

Byungchul Park <byungchul@sk.com> writes:

> This document describes the concept and APIs of dept.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
>  Documentation/dependency/dept_api.txt | 117 ++++
>  2 files changed, 852 insertions(+)
>  create mode 100644 Documentation/dependency/dept.txt
>  create mode 100644 Documentation/dependency/dept_api.txt

As already suggested, this should be in RST; you're already 95% of the
way there.  Also, please put it under Documentation/dev-tools; we don't
need another top-level directory for this.

Thanks,

jon

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
  2025-10-03  2:44   ` Bagas Sanjaya
  2025-10-03  5:36   ` Jonathan Corbet
@ 2025-10-03  6:55   ` NeilBrown
  2025-10-13  5:23     ` Byungchul Park
  2 siblings, 1 reply; 73+ messages in thread
From: NeilBrown @ 2025-10-03  6:55 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, okorniev, Dai.Ngo, tom, trondmy, anna, kees, bigeasy,
	clrkwllms, mark.rutland, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, 02 Oct 2025, Byungchul Park wrote:
> This document describes the concept and APIs of dept.
> 

Thanks for the documentation.  I've been trying to understand it.


> +How DEPT works
> +--------------
> +
> +Let's take a look how DEPT works with the 1st example in the section
> +'Limitation of lockdep'.
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +   folio_lock B
> +		   folio_lock B <- DEADLOCK
> +				   mutex_lock A <- DEADLOCK
> +				   folio_unlock B
> +		   folio_unlock B
> +		   mutex_unlock A
> +				   mutex_unlock A
> +
> +Adding comments to describe DEPT's view in terms of wait and event:
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +		   /* wait for A */
> +   folio_lock B
> +   /* wait for A */
> +   /* start event A context */
> +
> +		   folio_lock B
> +		   /* wait for B */ <- DEADLOCK
> +		   /* start event B context */
> +
> +				   mutex_lock A
> +				   /* wait for A */ <- DEADLOCK
> +				   /* start event A context */
> +
> +				   folio_unlock B
> +				   /* event B */
> +		   folio_unlock B
> +		   /* event B */
> +
> +		   mutex_unlock A
> +		   /* event A */
> +				   mutex_unlock A
> +				   /* event A */
> +

I can't see the value of the above section.
The first section with no comments is useful as it is easy to see the
deadlock being investigate.  The section below is useful as it add
comments to explain how DEPT sees the situation.  But the above section,
with some but not all of the comments, does seem (to me) to add anything
useful.

> +Adding more supplementary comments to describe DEPT's view in detail:
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +		   /* might wait for A */
> +		   /* start to take into account event A's context */

What do you mean precisely by "context".
You use the word in the heading "context X  context Y  context Z"
so it seems like "context" means "task" or "process".  But then as I
read on, I think - maybe it means something else.  If it does, then you
should use different words.  Maybe "task X ..." in the heading.

If the examples that follow It seems that the "context" for event A
starts at "mutex lock A" when it (possibly) waits for a mutex and ends
at "mutex unlock A" - which are both in the same process.  Clearly
various other events that happen between these two points in the same
process could be seen as the "context" for event A.

However event B starts in "context X" with "folio_lock B" and ends in
"context Z" or "context Y" with "folio_unlock B".  Is that right?

My question then is: how do you decide which, of all the event in all
the processes in all the system, between the start[S] and the end[E] are
considered to be part of the "context" of event A.

I think it would help me if you defined what a "context" is earlier.
What sorts of things appear in a context?

Thanks,
NeilBrown


> +		   /* 1 */
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 2 */
> +
> +		   folio_lock B
> +		   /* might wait for B */ <- DEADLOCK
> +		   /* start to take into account event B's context */
> +		   /* 3 */
> +
> +				   mutex_lock A
> +				   /* might wait for A */ <- DEADLOCK
> +				   /* start to take into account
> +				      event A's context */
> +				   /* 4 */
> +
> +				   folio_unlock B
> +				   /* event B that's been valid since 2 */
> +		   folio_unlock B
> +		   /* event B that's been valid since 3 */
> +
> +		   mutex_unlock A
> +		   /* event A that's been valid since 1 */
> +
> +				   mutex_unlock A
> +				   /* event A that's been valid since 4 */
> +
> +Let's build up dependency graph with this example. Firstly, context X:
> +
> +   context X
> +
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 2 */
> +
> +There are no events to create dependency. Next, context Y:
> +
> +   context Y
> +
> +   mutex_lock A
> +   /* might wait for A */
> +   /* start to take into account event A's context */
> +   /* 1 */
> +
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 3 */
> +
> +   folio_unlock B
> +   /* event B that's been valid since 3 */
> +
> +   mutex_unlock A
> +   /* event A that's been valid since 1 */
> +
> +There are two events. For event B, folio_unlock B, since there are no
> +waits between 3 and the event, event B does not create dependency. For
> +event A, there is a wait, folio_lock B, between 1 and the event. Which
> +means event A cannot be triggered if event B does not wake up the wait.
> +Therefore, we can say event A depends on event B, say, 'A -> B'. The
> +graph will look like after adding the dependency:
> +
> +   A -> B
> +
> +   where 'A -> B' means that event A depends on event B.
> +
> +Lastly, context Z:
> +
> +   context Z
> +
> +   mutex_lock A
> +   /* might wait for A */
> +   /* start to take into account event A's context */
> +   /* 4 */
> +
> +   folio_unlock B
> +   /* event B that's been valid since 2 */
> +
> +   mutex_unlock A
> +   /* event A that's been valid since 4 */
> +
> +There are also two events. For event B, folio_unlock B, there is a
> +wait, mutex_lock A, between 2 and the event - remind 2 is at a very
> +start and before the wait in timeline. Which means event B cannot be
> +triggered if event A does not wake up the wait. Therefore, we can say
> +event B depends on event A, say, 'B -> A'. The graph will look like
> +after adding the dependency:
> +
> +    -> A -> B -
> +   /           \
> +   \           /
> +    -----------
> +
> +   where 'A -> B' means that event A depends on event B.
> +
> +A new loop has been created. So DEPT can report it as a deadlock. For
> +event A, mutex_unlock A, since there are no waits between 4 and the
> +event, event A does not create dependency. That's it.
> +
> +CONCLUSION
> +
> +DEPT works well with any general synchronization mechanisms by focusing
> +on wait, event and its context.
> +

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-03  1:46     ` Byungchul Park
@ 2025-10-03 11:33       ` Mark Brown
  2025-10-13  1:51         ` Byungchul Park
  0 siblings, 1 reply; 73+ messages in thread
From: Mark Brown @ 2025-10-03 11:33 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

[-- Attachment #1: Type: text/plain, Size: 1888 bytes --]

On Fri, Oct 03, 2025 at 10:46:41AM +0900, Byungchul Park wrote:
> On Thu, Oct 02, 2025 at 12:39:31PM +0100, Mark Brown wrote:
> > On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> > > dept needs to notice every entrance from user to kernel mode to treat
> > > every kernel context independently when tracking wait-event dependencies.
> > > Roughly, system call and user oriented fault are the cases.

> > > Make dept aware of the entrances of arm64 and add support
> > > CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.

> > The description of what needs to be tracked probably needs some
> > tightening up here, it's not clear to me for example why exceptions for
> > mops or the vector extensions aren't included here, or what the
> > distinction is with error faults like BTI or GCS not being tracked?

> Thanks for the feedback but I'm afraid I don't get you.  Can you explain
> in more detail with example?

Your commit log says we need to track every entrance from user mode to
kernel mode but the code only adds tracking to syscalls and some memory
faults.  The exception types listed above (and some others) also result
in entries to the kernel from userspace.

> JFYI, pairs of wait and its event need to be tracked to see if each
> event can be prevented from being reachable by other waits like:

>    context X				context Y
> 
>    lock L
>    ...
>    initiate event A context		start toward event A
>    ...					...
>    wait A // wait for event A and	lock L // wait for unlock L and
>           // prevent unlock L		       // prevent event A
>    ...					...
>    unlock L				unlock L
> 					...
> 					event A

> I meant things like this need to be tracked.

I don't think that's at all clear from the above context, and the
handling for some of the above exception types (eg, the vector
extensions) includes taking locks.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-02  8:12 ` [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64 Byungchul Park
  2025-10-02 11:39   ` Mark Brown
@ 2025-10-03 14:36   ` Mark Rutland
  2025-10-13  4:28     ` Byungchul Park
  1 sibling, 1 reply; 73+ messages in thread
From: Mark Rutland @ 2025-10-03 14:36 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> dept needs to notice every entrance from user to kernel mode to treat
> every kernel context independently when tracking wait-event dependencies.
> Roughly, system call and user oriented fault are the cases.
> 
> Make dept aware of the entrances of arm64 and add support
> CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  arch/arm64/Kconfig          | 1 +
>  arch/arm64/kernel/syscall.c | 7 +++++++
>  arch/arm64/mm/fault.c       | 7 +++++++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index e9bbfacc35a6..a8fab2c052dc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -281,6 +281,7 @@ config ARM64
>  	select USER_STACKTRACE_SUPPORT
>  	select VDSO_GETRANDOM
>  	select VMAP_STACK
> +	select ARCH_HAS_DEPT_SUPPORT
>  	help
>  	  ARM 64-bit (AArch64) Linux support.
>  
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index c442fcec6b9e..bbd306335179 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -7,6 +7,7 @@
>  #include <linux/ptrace.h>
>  #include <linux/randomize_kstack.h>
>  #include <linux/syscalls.h>
> +#include <linux/dept.h>
>  
>  #include <asm/debug-monitors.h>
>  #include <asm/exception.h>
> @@ -96,6 +97,12 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>  	 * (Similarly for HVC and SMC elsewhere.)
>  	 */
>  
> +	/*
> +	 * This is a system call from user mode.  Make dept work with a
> +	 * new kernel mode context.
> +	 */
> +	dept_update_cxt();

As Mark Brown pointed out in his replies, this patch is missing a whole
bunch of cases and does not work correctly as-is.

As Dave Hansen pointed out on the x86 patch, you shouldn't do this
piecemeal in architecture code, and should instead work with the
existing context tracking, e.g. by adding logic to
enter_from_user_mode() and exit_to_user_mode(), or by reusing some
existing context tracking logic that's called there.

Mark.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb
  2025-10-02  8:12 ` [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb Byungchul Park
@ 2025-10-04 16:39   ` Wolfram Sang
  2025-10-13  5:27     ` Byungchul Park
  0 siblings, 1 reply; 73+ messages in thread
From: Wolfram Sang @ 2025-10-04 16:39 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 05:12:35PM +0900, Byungchul Park wrote:
> Functionally no change.  This patch is a preparation for DEPT(DEPendency
> Tracker) to track dependencies related to a scheduler API,
> wait_for_completion().
> 
> Unfortunately, struct i2c_algo_pca_data has a callback member named
> wait_for_completion, that is the same as the scheduler API, which makes
> it hard to change the scheduler API to a macro form because of the
> ambiguity.
> 
> Add a postfix _cb to the callback member to remove the ambiguity.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>

This patch seems reasonable in any case. I'll pick it, so you have one
dependency less. Good luck with the series!

Applied to for-next, thanks!


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-03  5:36   ` Jonathan Corbet
@ 2025-10-13  1:03     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  1:03 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, catalin.marinas, bp, dave.hansen, x86,
	hpa, luto, sumit.semwal, gustavo, christian.koenig, andi.shyti,
	arnd, lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof,
	petr.pavlu, da.gomez, samitolvanen, paulmck, frederic,
	neeraj.upadhyay, joelagnelf, josh, urezki, mathieu.desnoyers,
	jiangshanlai, qiang.zhang, juri.lelli, vincent.guittot,
	dietmar.eggemann, bsegall, mgorman, vschneid, chuck.lever, neil,
	okorniev, Dai.Ngo, tom, trondmy, anna, kees, bigeasy, clrkwllms,
	mark.rutland, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Thu, Oct 02, 2025 at 11:36:16PM -0600, Jonathan Corbet wrote:
> Byungchul Park <byungchul@sk.com> writes:
> 
> > This document describes the concept and APIs of dept.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
> >  Documentation/dependency/dept_api.txt | 117 ++++
> >  2 files changed, 852 insertions(+)
> >  create mode 100644 Documentation/dependency/dept.txt
> >  create mode 100644 Documentation/dependency/dept_api.txt
> 
> As already suggested, this should be in RST; you're already 95% of the
> way there.  Also, please put it under Documentation/dev-tools; we don't
> need another top-level directory for this.

Sure.  I will.  Thanks!

	Byungchul
> 
> Thanks,
> 
> jon

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-03  2:44   ` Bagas Sanjaya
@ 2025-10-13  1:28     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  1:28 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Fri, Oct 03, 2025 at 09:44:28AM +0700, Bagas Sanjaya wrote:
> On Thu, Oct 02, 2025 at 05:12:28PM +0900, Byungchul Park wrote:
> > This document describes the concept and APIs of dept.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
> >  Documentation/dependency/dept_api.txt | 117 ++++
> >  2 files changed, 852 insertions(+)
> >  create mode 100644 Documentation/dependency/dept.txt
> >  create mode 100644 Documentation/dependency/dept_api.txt
> 
> What about writing dept docs in reST (like the rest of kernel documentation)?

Sorry for late reply, but sure.  I should and will.  Thank you!

> ---- >8 ----
> diff --git a/Documentation/dependency/dept.txt b/Documentation/locking/dept.rst
> similarity index 92%
> rename from Documentation/dependency/dept.txt
> rename to Documentation/locking/dept.rst
> index 5dd358b96734e6..7b90a0d95f0876 100644
> --- a/Documentation/dependency/dept.txt
> +++ b/Documentation/locking/dept.rst

However, I'm not sure if dept is about locking.  dept is about general
waits e.g. wait_for_completion(), wait_event(), and so on, rather than
just waits involved in typical locking mechanisms e.g. spin lock, mutex,
and so on.

[snip]

> > +
> > +   context X            context Y
> > +
			     /*
			      * Acquired A.
			      */
> > +                        mutex_lock A

			     /*
			      * Request something that will be handled
			      * through e.g. wq, deamon or any its own
			      * way, and then do 'complete B'.
			      */
			     request_xxx_and_complete_B();
        /*
	 * The request from
	 * context Y has been
	 * done.  So running
	 * toward 'complete B'.
	 */

	/*
	 * Wait for A to be
	 * released, but will
	 * never happen.
	 */
> > +   mutex_lock A <- DEADLOCK
			     /*
			      * Wait for 'complete B' to happen, but
			      * will never happen.
			      */
> > +                        wait_for_complete B <- DEADLOCK
	/*
	 * Never reachable.
	 */
> > +   complete B

			     /*
			      * Never reachable.
			      */
> > +                        mutex_unlock A
> > +   mutex_unlock A
> 
> Can you explain how DEPT detects deadlock on the second example above (like
> the first one being described in "How DEPT works" section)?

Sure.  I added the explanation inline above.  Don't hesitate if you have
any questions.  Thanks.

	Byungchul
> 
> Confused...
> 
> --
> An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-03 11:33       ` Mark Brown
@ 2025-10-13  1:51         ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  1:51 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Fri, Oct 03, 2025 at 12:33:03PM +0100, Mark Brown wrote:
> On Fri, Oct 03, 2025 at 10:46:41AM +0900, Byungchul Park wrote:
> > On Thu, Oct 02, 2025 at 12:39:31PM +0100, Mark Brown wrote:
> > > On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> > > > dept needs to notice every entrance from user to kernel mode to treat
> > > > every kernel context independently when tracking wait-event dependencies.
> > > > Roughly, system call and user oriented fault are the cases.
> 
> > > > Make dept aware of the entrances of arm64 and add support
> > > > CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.
> 
> > > The description of what needs to be tracked probably needs some
> > > tightening up here, it's not clear to me for example why exceptions for
> > > mops or the vector extensions aren't included here, or what the
> > > distinction is with error faults like BTI or GCS not being tracked?
> 
> > Thanks for the feedback but I'm afraid I don't get you.  Can you explain
> > in more detail with example?
> 
> Your commit log says we need to track every entrance from user mode to
> kernel mode but the code only adds tracking to syscalls and some memory
> faults.  The exception types listed above (and some others) also result
> in entries to the kernel from userspace.

You're right.  Each kernel mode context needs to be tracked
independently.  Just for your information, context ID is used for making
DEPT only track waits and events within each context, preventing
tracking those across independent contexts to avoid false positives.

Currently, irq, fault, and system calls are covered.  It should be taken
into account if any other exceptions can include waits and events anyway.

> > JFYI, pairs of wait and its event need to be tracked to see if each
> > event can be prevented from being reachable by other waits like:
> 
> >    context X				context Y
> > 
> >    lock L
> >    ...
> >    initiate event A context		start toward event A
> >    ...					...
> >    wait A // wait for event A and	lock L // wait for unlock L and
> >           // prevent unlock L		       // prevent event A
> >    ...					...
> >    unlock L				unlock L
> > 					...
> > 					event A
> 
> > I meant things like this need to be tracked.
> 
> I don't think that's at all clear from the above context, and the
> handling for some of the above exception types (eg, the vector
> extensions) includes taking locks.

A trivial thing to mention, each typical lock pair, lock and unlock, can
only happen within each independent context, not across different
contexts.  So the context ID is not necessary for typical locks.

   exception X

   lock A;
   ...
   unlock A; // already guarateed to unlock A in the context that lock A
             // has been acqured in.
   ...
   finish exception X and return back to user mode;

But yes, as you concern, we should take into account all the exceptions
if any can include general waits and events in it, to avoid unnecessary
false positives.

Thank you!

	Byungchul

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64
  2025-10-03 14:36   ` Mark Rutland
@ 2025-10-13  4:28     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  4:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Fri, Oct 03, 2025 at 03:36:42PM +0100, Mark Rutland wrote:
> On Thu, Oct 02, 2025 at 05:12:09PM +0900, Byungchul Park wrote:
> > dept needs to notice every entrance from user to kernel mode to treat
> > every kernel context independently when tracking wait-event dependencies.
> > Roughly, system call and user oriented fault are the cases.
> >
> > Make dept aware of the entrances of arm64 and add support
> > CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  arch/arm64/Kconfig          | 1 +
> >  arch/arm64/kernel/syscall.c | 7 +++++++
> >  arch/arm64/mm/fault.c       | 7 +++++++
> >  3 files changed, 15 insertions(+)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index e9bbfacc35a6..a8fab2c052dc 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -281,6 +281,7 @@ config ARM64
> >       select USER_STACKTRACE_SUPPORT
> >       select VDSO_GETRANDOM
> >       select VMAP_STACK
> > +     select ARCH_HAS_DEPT_SUPPORT
> >       help
> >         ARM 64-bit (AArch64) Linux support.
> >
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index c442fcec6b9e..bbd306335179 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/ptrace.h>
> >  #include <linux/randomize_kstack.h>
> >  #include <linux/syscalls.h>
> > +#include <linux/dept.h>
> >
> >  #include <asm/debug-monitors.h>
> >  #include <asm/exception.h>
> > @@ -96,6 +97,12 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> >        * (Similarly for HVC and SMC elsewhere.)
> >        */
> >
> > +     /*
> > +      * This is a system call from user mode.  Make dept work with a
> > +      * new kernel mode context.
> > +      */
> > +     dept_update_cxt();
> 
> As Mark Brown pointed out in his replies, this patch is missing a whole
> bunch of cases and does not work correctly as-is.
> 
> As Dave Hansen pointed out on the x86 patch, you shouldn't do this
> piecemeal in architecture code, and should instead work with the
> existing context tracking, e.g. by adding logic to
> enter_from_user_mode() and exit_to_user_mode(), or by reusing some

I will consider it.  However, I need to check if there are not any waits
and events before enter_from_user_mode(), or after exit_to_user_mode()
since those functions aren't the outmost functions for kernel mode C
code anyway.

	Byungchul

> existing context tracking logic that's called there.
> 
> Mark.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-03  6:55   ` NeilBrown
@ 2025-10-13  5:23     ` Byungchul Park
  2025-10-14  6:03       ` NeilBrown
  0 siblings, 1 reply; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  5:23 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, okorniev, Dai.Ngo, tom, trondmy, anna, kees, bigeasy,
	clrkwllms, mark.rutland, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> On Thu, 02 Oct 2025, Byungchul Park wrote:
> > This document describes the concept and APIs of dept.
> >
> 
> Thanks for the documentation.  I've been trying to understand it.

You're welcome.  Feel free to ask me if you have any questions.

> > +How DEPT works
> > +--------------
> > +
> > +Let's take a look how DEPT works with the 1st example in the section
> > +'Limitation of lockdep'.
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +   folio_lock B
> > +                folio_lock B <- DEADLOCK
> > +                                mutex_lock A <- DEADLOCK
> > +                                folio_unlock B
> > +                folio_unlock B
> > +                mutex_unlock A
> > +                                mutex_unlock A
> > +
> > +Adding comments to describe DEPT's view in terms of wait and event:
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +                /* wait for A */
> > +   folio_lock B
> > +   /* wait for A */
> > +   /* start event A context */
> > +
> > +                folio_lock B
> > +                /* wait for B */ <- DEADLOCK
> > +                /* start event B context */
> > +
> > +                                mutex_lock A
> > +                                /* wait for A */ <- DEADLOCK
> > +                                /* start event A context */
> > +
> > +                                folio_unlock B
> > +                                /* event B */
> > +                folio_unlock B
> > +                /* event B */
> > +
> > +                mutex_unlock A
> > +                /* event A */
> > +                                mutex_unlock A
> > +                                /* event A */
> > +
> 
> I can't see the value of the above section.
> The first section with no comments is useful as it is easy to see the
> deadlock being investigate.  The section below is useful as it add
> comments to explain how DEPT sees the situation.  But the above section,
> with some but not all of the comments, does seem (to me) to add anything
> useful.

I just wanted to convert 'locking terms' to 'wait and event terms' by
one step.  However, I can remove the section you pointed out that you
thought was useless.

> > +Adding more supplementary comments to describe DEPT's view in detail:
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +                /* might wait for A */
> > +                /* start to take into account event A's context */
> 
> What do you mean precisely by "context".

That means one of task context, irq context, wq worker context (even
though it can also be considered as task context), or something.

Of course, in the example above, it must be task context since it showed
a case involving only sleepible ones.  However, I wanted to use general
terms in the document to cover all the types of context e.g. irq, task,
and so on.

> You use the word in the heading "context X  context Y  context Z"
> so it seems like "context" means "task" or "process".  But then as I
> read on, I think - maybe it means something else.  If it does, then you
> should use different words.  Maybe "task X ..." in the heading.

It should cover all the types of context.  What word would you recommend
for that purpose?

> If the examples that follow It seems that the "context" for event A
> starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> at "mutex unlock A" - which are both in the same process.  Clearly
> various other events that happen between these two points in the same
> process could be seen as the "context" for event A.
> 
> However event B starts in "context X" with "folio_lock B" and ends in
> "context Z" or "context Y" with "folio_unlock B".  Is that right?

Right.

> My question then is: how do you decide which, of all the event in all
> the processes in all the system, between the start[S] and the end[E] are
> considered to be part of the "context" of event A.

DEPT can identify the "context" of event A only *once* the event A is
actually executed, and builds dependencies between the event and the
recorded waits in the "context" of event A since [S].

> I think it would help me if you defined what a "context" is earlier.

Sorry if my description was not clear enough.

	Byungchul

> What sorts of things appear in a context?
> 
> Thanks,
> NeilBrown
> 
> 
> > +                /* 1 */
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 2 */
> > +
> > +                folio_lock B
> > +                /* might wait for B */ <- DEADLOCK
> > +                /* start to take into account event B's context */
> > +                /* 3 */
> > +
> > +                                mutex_lock A
> > +                                /* might wait for A */ <- DEADLOCK
> > +                                /* start to take into account
> > +                                   event A's context */
> > +                                /* 4 */
> > +
> > +                                folio_unlock B
> > +                                /* event B that's been valid since 2 */
> > +                folio_unlock B
> > +                /* event B that's been valid since 3 */
> > +
> > +                mutex_unlock A
> > +                /* event A that's been valid since 1 */
> > +
> > +                                mutex_unlock A
> > +                                /* event A that's been valid since 4 */
> > +
> > +Let's build up dependency graph with this example. Firstly, context X:
> > +
> > +   context X
> > +
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 2 */
> > +
> > +There are no events to create dependency. Next, context Y:
> > +
> > +   context Y
> > +
> > +   mutex_lock A
> > +   /* might wait for A */
> > +   /* start to take into account event A's context */
> > +   /* 1 */
> > +
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 3 */
> > +
> > +   folio_unlock B
> > +   /* event B that's been valid since 3 */
> > +
> > +   mutex_unlock A
> > +   /* event A that's been valid since 1 */
> > +
> > +There are two events. For event B, folio_unlock B, since there are no
> > +waits between 3 and the event, event B does not create dependency. For
> > +event A, there is a wait, folio_lock B, between 1 and the event. Which
> > +means event A cannot be triggered if event B does not wake up the wait.
> > +Therefore, we can say event A depends on event B, say, 'A -> B'. The
> > +graph will look like after adding the dependency:
> > +
> > +   A -> B
> > +
> > +   where 'A -> B' means that event A depends on event B.
> > +
> > +Lastly, context Z:
> > +
> > +   context Z
> > +
> > +   mutex_lock A
> > +   /* might wait for A */
> > +   /* start to take into account event A's context */
> > +   /* 4 */
> > +
> > +   folio_unlock B
> > +   /* event B that's been valid since 2 */
> > +
> > +   mutex_unlock A
> > +   /* event A that's been valid since 4 */
> > +
> > +There are also two events. For event B, folio_unlock B, there is a
> > +wait, mutex_lock A, between 2 and the event - remind 2 is at a very
> > +start and before the wait in timeline. Which means event B cannot be
> > +triggered if event A does not wake up the wait. Therefore, we can say
> > +event B depends on event A, say, 'B -> A'. The graph will look like
> > +after adding the dependency:
> > +
> > +    -> A -> B -
> > +   /           \
> > +   \           /
> > +    -----------
> > +
> > +   where 'A -> B' means that event A depends on event B.
> > +
> > +A new loop has been created. So DEPT can report it as a deadlock. For
> > +event A, mutex_unlock A, since there are no waits between 4 and the
> > +event, event A does not create dependency. That's it.
> > +
> > +CONCLUSION
> > +
> > +DEPT works well with any general synchronization mechanisms by focusing
> > +on wait, event and its context.
> > +

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb
  2025-10-04 16:39   ` Wolfram Sang
@ 2025-10-13  5:27     ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-13  5:27 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, neil, okorniev, Dai.Ngo, tom, trondmy, anna, kees,
	bigeasy, clrkwllms, mark.rutland, ada.coupriediaz,
	kristina.martsenko, wangkefeng.wang, broonie, kevin.brodsky, dwmw,
	shakeel.butt, ast, ziy, yuzhao, baolin.wang, usamaarif642,
	joel.granados, richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Sat, Oct 04, 2025 at 06:39:43PM +0200, Wolfram Sang wrote:
> On Thu, Oct 02, 2025 at 05:12:35PM +0900, Byungchul Park wrote:
> > Functionally no change.  This patch is a preparation for DEPT(DEPendency
> > Tracker) to track dependencies related to a scheduler API,
> > wait_for_completion().
> >
> > Unfortunately, struct i2c_algo_pca_data has a callback member named
> > wait_for_completion, that is the same as the scheduler API, which makes
> > it hard to change the scheduler API to a macro form because of the
> > ambiguity.
> >
> > Add a postfix _cb to the callback member to remove the ambiguity.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> 
> This patch seems reasonable in any case. I'll pick it, so you have one
> dependency less. Good luck with the series!

Thanks, Wolfram Sang!

	Byungchul

> Applied to for-next, thanks!

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-13  5:23     ` Byungchul Park
@ 2025-10-14  6:03       ` NeilBrown
  2025-10-14  6:38         ` Byungchul Park
  0 siblings, 1 reply; 73+ messages in thread
From: NeilBrown @ 2025-10-14  6:03 UTC (permalink / raw)
  To: Byungchul Park
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, okorniev, Dai.Ngo, tom, trondmy, anna, kees, bigeasy,
	clrkwllms, mark.rutland, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Mon, 13 Oct 2025, Byungchul Park wrote:
> On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> > On Thu, 02 Oct 2025, Byungchul Park wrote:
> > > This document describes the concept and APIs of dept.
> > >
> > 
> > Thanks for the documentation.  I've been trying to understand it.
> 
> You're welcome.  Feel free to ask me if you have any questions.
> 
> > > +How DEPT works
> > > +--------------
> > > +
> > > +Let's take a look how DEPT works with the 1st example in the section
> > > +'Limitation of lockdep'.
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +   folio_lock B
> > > +                folio_lock B <- DEADLOCK
> > > +                                mutex_lock A <- DEADLOCK
> > > +                                folio_unlock B
> > > +                folio_unlock B
> > > +                mutex_unlock A
> > > +                                mutex_unlock A
> > > +
> > > +Adding comments to describe DEPT's view in terms of wait and event:
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +                /* wait for A */
> > > +   folio_lock B
> > > +   /* wait for A */
> > > +   /* start event A context */
> > > +
> > > +                folio_lock B
> > > +                /* wait for B */ <- DEADLOCK
> > > +                /* start event B context */
> > > +
> > > +                                mutex_lock A
> > > +                                /* wait for A */ <- DEADLOCK
> > > +                                /* start event A context */
> > > +
> > > +                                folio_unlock B
> > > +                                /* event B */
> > > +                folio_unlock B
> > > +                /* event B */
> > > +
> > > +                mutex_unlock A
> > > +                /* event A */
> > > +                                mutex_unlock A
> > > +                                /* event A */
> > > +
> > 
> > I can't see the value of the above section.
> > The first section with no comments is useful as it is easy to see the
> > deadlock being investigate.  The section below is useful as it add
> > comments to explain how DEPT sees the situation.  But the above section,
> > with some but not all of the comments, does seem (to me) to add anything
> > useful.
> 
> I just wanted to convert 'locking terms' to 'wait and event terms' by
> one step.  However, I can remove the section you pointed out that you
> thought was useless.

But it seems you did it in two steps???

If you think the middle section with some but not all of the comments
adds value (And maybe it does - maybe I just haven't seen it yet), the
please explain what value is being added at each step.

It is currently documented as:

 +Adding comments to describe DEPT's view in terms of wait and event:

then

 +Adding more supplementary comments to describe DEPT's view in detail:

Maybe if you said more DEPT's view so at this point so that when we see
the supplementary comments, we can understand how they relate to DEPT's
view.


> 
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +                /* might wait for A */
> > > +                /* start to take into account event A's context */
> > 
> > What do you mean precisely by "context".
> 
> That means one of task context, irq context, wq worker context (even
> though it can also be considered as task context), or something.

OK, that makes sense.  If you provide this definition for "context"
before you use the term, I think that will help the reader.


> > If the examples that follow It seems that the "context" for event A
> > starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> > at "mutex unlock A" - which are both in the same process.  Clearly
> > various other events that happen between these two points in the same
> > process could be seen as the "context" for event A.
> > 
> > However event B starts in "context X" with "folio_lock B" and ends in
> > "context Z" or "context Y" with "folio_unlock B".  Is that right?
> 
> Right.
> 
> > My question then is: how do you decide which, of all the event in all
> > the processes in all the system, between the start[S] and the end[E] are
> > considered to be part of the "context" of event A.
> 
> DEPT can identify the "context" of event A only *once* the event A is
> actually executed, and builds dependencies between the event and the
> recorded waits in the "context" of event A since [S].

So a dependency is an ordered set of pairs of "context" and "wait" or
"context" and "event".  "wait"s and "event"s are linked by some abstract
identifier for the event (like lockdep's lock classes).
How are the contexts abstracted. Is it just "same" or "different"

I'll try reading the document again and see how much further I get.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v17 28/47] dept: add documentation for dept
  2025-10-14  6:03       ` NeilBrown
@ 2025-10-14  6:38         ` Byungchul Park
  0 siblings, 0 replies; 73+ messages in thread
From: Byungchul Park @ 2025-10-14  6:38 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
	adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
	joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
	willy, david, amir73il, gregkh, kernel-team, linux-mm, akpm,
	mhocko, minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl,
	penberg, rientjes, vbabka, ngupta, linux-block, josef,
	linux-fsdevel, jack, jlayton, dan.j.williams, hch, djwong,
	dri-devel, rodrigosiqueiramelo, melissa.srw, hamohammed.sa,
	harry.yoo, chris.p.wilson, gwan-gyeong.mun, max.byungchul.park,
	boqun.feng, longman, yunseong.kim, ysk, yeoreum.yun, netdev,
	matthew.brost, her0gyugyu, corbet, catalin.marinas, bp,
	dave.hansen, x86, hpa, luto, sumit.semwal, gustavo,
	christian.koenig, andi.shyti, arnd, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mcgrof, petr.pavlu, da.gomez, samitolvanen, paulmck,
	frederic, neeraj.upadhyay, joelagnelf, josh, urezki,
	mathieu.desnoyers, jiangshanlai, qiang.zhang, juri.lelli,
	vincent.guittot, dietmar.eggemann, bsegall, mgorman, vschneid,
	chuck.lever, okorniev, Dai.Ngo, tom, trondmy, anna, kees, bigeasy,
	clrkwllms, mark.rutland, ada.coupriediaz, kristina.martsenko,
	wangkefeng.wang, broonie, kevin.brodsky, dwmw, shakeel.butt, ast,
	ziy, yuzhao, baolin.wang, usamaarif642, joel.granados,
	richard.weiyang, geert+renesas, tim.c.chen, linux,
	alexander.shishkin, lillian, chenhuacai, francesco,
	guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
	thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
	linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
	linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel

On Tue, Oct 14, 2025 at 05:03:58PM +1100, NeilBrown wrote:
> On Mon, 13 Oct 2025, Byungchul Park wrote:
> > On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> > > On Thu, 02 Oct 2025, Byungchul Park wrote:
> > > > This document describes the concept and APIs of dept.
> > > >
> > >
> > > Thanks for the documentation.  I've been trying to understand it.
> >
> > You're welcome.  Feel free to ask me if you have any questions.
> >
> > > > +How DEPT works
> > > > +--------------
> > > > +
> > > > +Let's take a look how DEPT works with the 1st example in the section
> > > > +'Limitation of lockdep'.
> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +   folio_lock B
> > > > +                folio_lock B <- DEADLOCK
> > > > +                                mutex_lock A <- DEADLOCK
> > > > +                                folio_unlock B
> > > > +                folio_unlock B
> > > > +                mutex_unlock A
> > > > +                                mutex_unlock A
> > > > +
> > > > +Adding comments to describe DEPT's view in terms of wait and event:
> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +                /* wait for A */
> > > > +   folio_lock B
> > > > +   /* wait for A */
> > > > +   /* start event A context */
> > > > +
> > > > +                folio_lock B
> > > > +                /* wait for B */ <- DEADLOCK
> > > > +                /* start event B context */
> > > > +
> > > > +                                mutex_lock A
> > > > +                                /* wait for A */ <- DEADLOCK
> > > > +                                /* start event A context */
> > > > +
> > > > +                                folio_unlock B
> > > > +                                /* event B */
> > > > +                folio_unlock B
> > > > +                /* event B */
> > > > +
> > > > +                mutex_unlock A
> > > > +                /* event A */
> > > > +                                mutex_unlock A
> > > > +                                /* event A */
> > > > +
> > >
> > > I can't see the value of the above section.
> > > The first section with no comments is useful as it is easy to see the
> > > deadlock being investigate.  The section below is useful as it add
> > > comments to explain how DEPT sees the situation.  But the above section,
> > > with some but not all of the comments, does seem (to me) to add anything
> > > useful.
> >
> > I just wanted to convert 'locking terms' to 'wait and event terms' by
> > one step.  However, I can remove the section you pointed out that you
> > thought was useless.
> 
> But it seems you did it in two steps???
> 
> If you think the middle section with some but not all of the comments
> adds value (And maybe it does - maybe I just haven't seen it yet), the
> please explain what value is being added at each step.
> 
> It is currently documented as:
> 
>  +Adding comments to describe DEPT's view in terms of wait and event:
> 
> then
> 
>  +Adding more supplementary comments to describe DEPT's view in detail:
> 
> Maybe if you said more DEPT's view so at this point so that when we see
> the supplementary comments, we can understand how they relate to DEPT's
> view.

As you pointed out, I'd better remove the middle part so as to simplify
it.  It doesn't give much information I also think.

> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +                /* might wait for A */
> > > > +                /* start to take into account event A's context */
> > >
> > > What do you mean precisely by "context".
> >
> > That means one of task context, irq context, wq worker context (even
> > though it can also be considered as task context), or something.
> 
> OK, that makes sense.  If you provide this definition for "context"
> before you use the term, I think that will help the reader.

Thank you.  I will add it.

> > > If the examples that follow It seems that the "context" for event A
> > > starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> > > at "mutex unlock A" - which are both in the same process.  Clearly
> > > various other events that happen between these two points in the same
> > > process could be seen as the "context" for event A.
> > >
> > > However event B starts in "context X" with "folio_lock B" and ends in
> > > "context Z" or "context Y" with "folio_unlock B".  Is that right?
> >
> > Right.
> >
> > > My question then is: how do you decide which, of all the event in all
> > > the processes in all the system, between the start[S] and the end[E] are
> > > considered to be part of the "context" of event A.
> >
> > DEPT can identify the "context" of event A only *once* the event A is
> > actually executed, and builds dependencies between the event and the
> > recorded waits in the "context" of event A since [S].
> 
> So a dependency is an ordered set of pairs of "context" and "wait" or

I don't get what you were trying to tell here.  FWIW, DEPT focuses on
*event* contexts and, within each event context, it tracks pairs of
waits that appears since [S] and the interesting event that identifies
the event context.

> "context" and "event".  "wait"s and "event"s are linked by some abstract
> identifier for the event (like lockdep's lock classes).

Yeah, kind of.

> How are the contexts abstracted. Is it just "same" or "different"

I don't get this.  Can you explain in more detail?

	Byungchul

> I'll try reading the document again and see how much further I get.
> 
> Thanks,
> NeilBrown

^ permalink raw reply	[flat|nested] 73+ messages in thread

end of thread, other threads:[~2025-10-14  6:38 UTC | newest]

Thread overview: 73+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-02  8:12 [PATCH v17 00/47] DEPT(DEPendency Tracker) Byungchul Park
2025-10-02  8:12 ` [PATCH v17 01/47] llist: move llist_{head,node} definition to types.h Byungchul Park
2025-10-02  8:24   ` Greg KH
2025-10-02 13:53     ` Mathieu Desnoyers
2025-10-02 23:19       ` Arnd Bergmann
2025-10-02  8:12 ` [PATCH v17 02/47] dept: implement DEPT(DEPendency Tracker) Byungchul Park
2025-10-02  8:25   ` Greg KH
2025-10-02 12:56     ` Geert Uytterhoeven
2025-10-02  8:12 ` [PATCH v17 03/47] dept: add single event dependency tracker APIs Byungchul Park
2025-10-02  8:12 ` [PATCH v17 04/47] dept: add lock " Byungchul Park
2025-10-02  8:12 ` [PATCH v17 05/47] dept: tie to lockdep and IRQ tracing Byungchul Park
2025-10-02  8:12 ` [PATCH v17 06/47] dept: add proc knobs to show stats and dependency graph Byungchul Park
2025-10-02  8:12 ` [PATCH v17 07/47] dept: distinguish each kernel context from another Byungchul Park
2025-10-02  8:12 ` [PATCH v17 08/47] x86_64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to x86_64 Byungchul Park
2025-10-02 15:22   ` Dave Hansen
2025-10-03  1:12     ` Byungchul Park
2025-10-02  8:12 ` [PATCH v17 09/47] arm64, dept: add support CONFIG_ARCH_HAS_DEPT_SUPPORT to arm64 Byungchul Park
2025-10-02 11:39   ` Mark Brown
2025-10-03  1:46     ` Byungchul Park
2025-10-03 11:33       ` Mark Brown
2025-10-13  1:51         ` Byungchul Park
2025-10-03 14:36   ` Mark Rutland
2025-10-13  4:28     ` Byungchul Park
2025-10-02  8:12 ` [PATCH v17 10/47] dept: distinguish each work from another Byungchul Park
2025-10-02  8:12 ` [PATCH v17 11/47] dept: add a mechanism to refill the internal memory pools on running out Byungchul Park
2025-10-02  8:12 ` [PATCH v17 12/47] dept: record the latest one out of consecutive waits of the same class Byungchul Park
2025-10-02  8:12 ` [PATCH v17 13/47] dept: apply sdt_might_sleep_{start,end}() to wait_for_completion()/complete() Byungchul Park
2025-10-02  8:12 ` [PATCH v17 14/47] dept: apply sdt_might_sleep_{start,end}() to swait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 15/47] dept: apply sdt_might_sleep_{start,end}() to waitqueue wait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 16/47] dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 17/47] dept: apply sdt_might_sleep_{start,end}() to dma fence Byungchul Park
2025-10-02  8:12 ` [PATCH v17 18/47] dept: track timeout waits separately with a new Kconfig Byungchul Park
2025-10-02  8:12 ` [PATCH v17 19/47] dept: apply timeout consideration to wait_for_completion()/complete() Byungchul Park
2025-10-02  8:12 ` [PATCH v17 20/47] dept: apply timeout consideration to swait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 21/47] dept: apply timeout consideration to waitqueue wait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 22/47] dept: apply timeout consideration to hashed-waitqueue wait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 23/47] dept: apply timeout consideration to dma fence wait Byungchul Park
2025-10-02  8:12 ` [PATCH v17 24/47] dept: make dept able to work with an external wgen Byungchul Park
2025-10-02  8:12 ` [PATCH v17 25/47] dept: track PG_locked with dept Byungchul Park
2025-10-02  8:12 ` [PATCH v17 26/47] dept: print staged wait's stacktrace on report Byungchul Park
2025-10-02  8:12 ` [PATCH v17 27/47] locking/lockdep: prevent various lockdep assertions when lockdep_off()'ed Byungchul Park
2025-10-02  8:12 ` [PATCH v17 28/47] dept: add documentation for dept Byungchul Park
2025-10-03  2:44   ` Bagas Sanjaya
2025-10-13  1:28     ` Byungchul Park
2025-10-03  5:36   ` Jonathan Corbet
2025-10-13  1:03     ` Byungchul Park
2025-10-03  6:55   ` NeilBrown
2025-10-13  5:23     ` Byungchul Park
2025-10-14  6:03       ` NeilBrown
2025-10-14  6:38         ` Byungchul Park
2025-10-02  8:12 ` [PATCH v17 29/47] cpu/hotplug: use a weaker annotation in AP thread Byungchul Park
2025-10-02  8:12 ` [PATCH v17 30/47] fs/jbd2: use a weaker annotation in journal handling Byungchul Park
2025-10-02  8:40   ` Jan Kara
2025-10-03  1:13     ` Byungchul Park
2025-10-02  8:12 ` [PATCH v17 31/47] dept: assign dept map to mmu notifier invalidation synchronization Byungchul Park
2025-10-02  8:12 ` [PATCH v17 32/47] dept: assign unique dept_key to each distinct dma fence caller Byungchul Park
2025-10-02  8:12 ` [PATCH v17 33/47] dept: make dept aware of lockdep_set_lock_cmp_fn() annotation Byungchul Park
2025-10-02  8:12 ` [PATCH v17 34/47] dept: make dept stop from working on debug_locks_off() Byungchul Park
2025-10-02  8:12 ` [PATCH v17 35/47] i2c: rename wait_for_completion callback to wait_for_completion_cb Byungchul Park
2025-10-04 16:39   ` Wolfram Sang
2025-10-13  5:27     ` Byungchul Park
2025-10-02  8:12 ` [PATCH v17 36/47] dept: assign unique dept_key to each distinct wait_for_completion() caller Byungchul Park
2025-10-02  8:12 ` [PATCH v17 37/47] completion, dept: introduce init_completion_dmap() API Byungchul Park
2025-10-02  8:12 ` [PATCH v17 38/47] dept: introduce a new type of dependency tracking between multi event sites Byungchul Park
2025-10-02  8:12 ` [PATCH v17 39/47] dept: add module support for struct dept_event_site and dept_event_site_dep Byungchul Park
2025-10-02  8:12 ` [PATCH v17 40/47] dept: introduce event_site() to disable event tracking if it's recoverable Byungchul Park
2025-10-02  8:12 ` [PATCH v17 41/47] dept: implement a basic unit test for dept Byungchul Park
2025-10-02  8:12 ` [PATCH v17 42/47] dept: call dept_hardirqs_off() in local_irq_*() regardless of irq state Byungchul Park
2025-10-02  8:12 ` [PATCH v17 43/47] rcu/update: fix same dept key collision between various types of RCU Byungchul Park
2025-10-02  8:12 ` [PATCH v17 44/47] dept: introduce APIs to set page usage and use subclasses_evt for the usage Byungchul Park
2025-10-02  8:12 ` [PATCH v17 45/47] dept: track PG_writeback with dept Byungchul Park
2025-10-02  8:12 ` [PATCH v17 46/47] SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt Byungchul Park
2025-10-02  8:12 ` [PATCH v17 47/47] mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large PAGE_SIZE Byungchul Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).