* [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-10 12:01 ` Thomas Hellström (Intel)
` (2 more replies)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups Daniel Vetter
` (29 subsequent siblings)
30 siblings, 3 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, linux-mm, Jason Gunthorpe, Daniel Vetter, Andrew Morton,
Christian König
fs_reclaim_acquire/release nicely catch recursion issues when
allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
to use to keep the excessive caches in check). For mmu notifier
recursions we do have lockdep annotations since 23b68395c7c7
("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
But these only fire if a path actually results in some pte
invalidation - for most small allocations that's very rarely the case.
The other trouble is that pte invalidation can happen any time when
__GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
recursion.
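To make the flag relationships above concrete, here is a rough userspace
model, not kernel code. The MODEL_* names and bit values are invented for
this sketch (the real definitions live in include/linux/gfp.h, and the real
GFP_ATOMIC carries more bits than shown). It assumes that direct reclaim,
and hence possible pte invalidation, is gated on the direct-reclaim bit.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's actual encoding. */
#define MODEL_GFP_IO             0x01u
#define MODEL_GFP_FS             0x02u
#define MODEL_GFP_DIRECT_RECLAIM 0x04u
#define MODEL_GFP_KSWAPD_RECLAIM 0x08u
#define MODEL_GFP_HIGH           0x10u
#define MODEL_GFP_RECLAIM \
	(MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)

/* Composite masks, mirroring how the kernel builds them from the bits. */
#define MODEL_GFP_ATOMIC (MODEL_GFP_HIGH | MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/*
 * Hypothetical helper: true if an allocation with this mask may enter
 * direct reclaim and therefore may end up invalidating ptes, which in
 * turn may call into mmu notifiers.
 */
static bool model_may_invalidate_ptes(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_DIRECT_RECLAIM) != 0;
}
```

In this model only MODEL_GFP_ATOMIC comes back false; MODEL_GFP_NOIO and
MODEL_GFP_NOFS drop the IO/FS bits but still allow direct reclaim, which is
why they don't help against mmu notifier recursion.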
I was pondering whether we should just do the general annotation, but
there's always the risk for false positives. Plus I'm assuming that
the core fs and io code is a lot better reviewed and tested than
random mmu notifier code in drivers. Hence why I decided to only
annotate for that specific case.
Furthermore even if we'd create a lockdep map for direct reclaim, we'd
still need to explicitly pull in the mmu notifier map - there's a lot
more places that do pte invalidation than just direct reclaim, these
two contexts aren't the same.
Note that the mmu notifiers needing their own independent lockdep map
is also the reason we can't hold them from fs_reclaim_acquire to
fs_reclaim_release - it would nest with the acquisition in the pte
invalidation code, causing a lockdep splat. And we can't remove the
annotations from pte invalidation and all the other places since
they're called from many other places than page reclaim. Hence we can
only do the equivalent of might_lock, but on the raw lockdep map.
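The "equivalent of might_lock on the raw lockdep map" can be pictured with
a toy dependency tracker. This is only a sketch under heavy simplification:
every model_* name is hypothetical, lock classes are plain integers, and
real lockdep tracks far more state. It records an edge from each held class
to the class being taken and reports when a new acquisition would recurse
or close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_CLASSES 8

struct model_lockdep {
	/* edge[a][b] is set once class b was taken while class a was held */
	bool edge[MODEL_MAX_CLASSES][MODEL_MAX_CLASSES];
	int held[MODEL_MAX_CLASSES];
	int nheld;
};

static void model_init(struct model_lockdep *ld)
{
	memset(ld, 0, sizeof(*ld));
}

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool model_reachable(const struct model_lockdep *ld, int from, int to,
			    bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MODEL_MAX_CLASSES; next++)
		if (ld->edge[from][next] && !seen[next] &&
		    model_reachable(ld, next, to, seen))
			return true;
	return false;
}

/* Returns false where lockdep would splat: recursion or lock inversion. */
static bool model_acquire(struct model_lockdep *ld, int cls)
{
	bool ok = true;

	assert(cls >= 0 && cls < MODEL_MAX_CLASSES);
	assert(ld->nheld < MODEL_MAX_CLASSES);
	for (int i = 0; i < ld->nheld; i++) {
		int h = ld->held[i];
		bool seen[MODEL_MAX_CLASSES] = { false };

		if (h == cls || model_reachable(ld, cls, h, seen))
			ok = false;
		else
			ld->edge[h][cls] = true;
	}
	ld->held[ld->nheld++] = cls;
	return ok;
}

static void model_release(struct model_lockdep *ld, int cls)
{
	assert(ld->nheld > 0 && ld->held[ld->nheld - 1] == cls);
	ld->nheld--;
}

/* Acquire and drop at once: records the dependency without holding it. */
static bool model_might_lock(struct model_lockdep *ld, int cls)
{
	bool ok = model_acquire(ld, cls);

	model_release(ld, cls);
	return ok;
}
```

With class 0 standing in for a driver lock and class 1 for the notifier
map, calling model_might_lock(&ld, 1) while holding 0 records the 0 -> 1
edge; a later model_acquire(&ld, 0) while holding 1 then returns false.
Holding class 1 across a region and acquiring it again inside also returns
false, which mirrors why the map cannot simply be held from acquire to
release.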
With this we can also remove the lockdep priming added in 66204f1d2d1b
("mm/mmu_notifiers: prime lockdep") since the new annotations are
strictly more powerful.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: linux-mm@kvack.org
Cc: linux-rdma@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
This is part of a gpu lockdep annotation series simply because it
really helps to catch issues where gpu subsystem locks and primitives
can deadlock with themselves through allocations and mmu notifiers.
But aside from that motivation it should be completely free-standing,
and can land through -mm/-rdma/-hmm or any other tree really whenever.
-Daniel
---
mm/mmu_notifier.c | 7 -------
mm/page_alloc.c | 23 ++++++++++++++---------
2 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 06852b896fa6..5d578b9122f8 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
lockdep_assert_held_write(&mm->mmap_sem);
BUG_ON(atomic_read(&mm->mm_users) <= 0);
- if (IS_ENABLED(CONFIG_LOCKDEP)) {
- fs_reclaim_acquire(GFP_KERNEL);
- lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
- lock_map_release(&__mmu_notifier_invalidate_range_start_map);
- fs_reclaim_release(GFP_KERNEL);
- }
-
if (!mm->notifier_subscriptions) {
/*
* kmalloc cannot be called under mm_take_all_locks(), but we
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 13cc653122b7..f8a222db4a53 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -57,6 +57,7 @@
#include <trace/events/oom.h>
#include <linux/prefetch.h>
#include <linux/mm_inline.h>
+#include <linux/mmu_notifier.h>
#include <linux/migrate.h>
#include <linux/hugetlb.h>
#include <linux/sched/rt.h>
@@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
static struct lockdep_map __fs_reclaim_map =
STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
-static bool __need_fs_reclaim(gfp_t gfp_mask)
+static bool __need_reclaim(gfp_t gfp_mask)
{
gfp_mask = current_gfp_context(gfp_mask);
@@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
if (current->flags & PF_MEMALLOC)
return false;
- /* We're only interested __GFP_FS allocations for now */
- if (!(gfp_mask & __GFP_FS))
- return false;
-
if (gfp_mask & __GFP_NOLOCKDEP)
return false;
@@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
 
 void fs_reclaim_acquire(gfp_t gfp_mask)
 {
-	if (__need_fs_reclaim(gfp_mask))
-		__fs_reclaim_acquire();
+	if (__need_reclaim(gfp_mask)) {
+		if (!(gfp_mask & __GFP_FS))
+			__fs_reclaim_acquire();
+
+		lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
+		lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+
+	}
 }
 EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
 
 void fs_reclaim_release(gfp_t gfp_mask)
 {
-	if (__need_fs_reclaim(gfp_mask))
-		__fs_reclaim_release();
+	if (__need_reclaim(gfp_mask)) {
+		if (!(gfp_mask & __GFP_FS))
+			__fs_reclaim_release();
+	}
 }
 EXPORT_SYMBOL_GPL(fs_reclaim_release);
 #endif
--
2.26.2
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-04 8:12 ` [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release Daniel Vetter
@ 2020-06-10 12:01 ` Thomas Hellström (Intel)
2020-06-10 12:25 ` Daniel Vetter
2020-06-10 19:41 ` [Intel-gfx] [PATCH] " Daniel Vetter
2020-06-21 17:00 ` [Intel-gfx] [PATCH 01/18] " Qian Cai
2 siblings, 1 reply; 106+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-10 12:01 UTC
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx, linux-mm,
Jason Gunthorpe, Daniel Vetter, Andrew Morton,
Christian König
Hi, Daniel,
Please see below.
On 6/4/20 10:12 AM, Daniel Vetter wrote:
> fs_reclaim_acquire/release nicely catch recursion issues when
> allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> to use to keep the excessive caches in check). For mmu notifier
> recursions we do have lockdep annotations since 23b68395c7c7
> ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
>
> But these only fire if a path actually results in some pte
> invalidation - for most small allocations that's very rarely the case.
> The other trouble is that pte invalidation can happen any time when
> __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> recursion.
>
> I was pondering whether we should just do the general annotation, but
> there's always the risk for false positives. Plus I'm assuming that
> the core fs and io code is a lot better reviewed and tested than
> random mmu notifier code in drivers. Hence why I decided to only
> annotate for that specific case.
>
> Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> still need to explicitly pull in the mmu notifier map - there's a lot
> more places that do pte invalidation than just direct reclaim, these
> two contexts aren't the same.
>
> Note that the mmu notifiers needing their own independent lockdep map
> is also the reason we can't hold them from fs_reclaim_acquire to
> fs_reclaim_release - it would nest with the acquisition in the pte
> invalidation code, causing a lockdep splat. And we can't remove the
> annotations from pte invalidation and all the other places since
> they're called from many other places than page reclaim. Hence we can
> only do the equivalent of might_lock, but on the raw lockdep map.
>
> With this we can also remove the lockdep priming added in 66204f1d2d1b
> ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> strictly more powerful.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: linux-mm@kvack.org
> Cc: linux-rdma@vger.kernel.org
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> This is part of a gpu lockdep annotation series simply because it
> really helps to catch issues where gpu subsystem locks and primitives
> can deadlock with themselves through allocations and mmu notifiers.
> But aside from that motivation it should be completely free-standing,
> and can land through -mm/-rdma/-hmm or any other tree really whenever.
> -Daniel
> ---
> mm/mmu_notifier.c | 7 -------
> mm/page_alloc.c | 23 ++++++++++++++---------
> 2 files changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 06852b896fa6..5d578b9122f8 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> lockdep_assert_held_write(&mm->mmap_sem);
> BUG_ON(atomic_read(&mm->mm_users) <= 0);
>
> - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> - fs_reclaim_acquire(GFP_KERNEL);
> - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> - fs_reclaim_release(GFP_KERNEL);
> - }
> -
> if (!mm->notifier_subscriptions) {
> /*
> * kmalloc cannot be called under mm_take_all_locks(), but we
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 13cc653122b7..f8a222db4a53 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -57,6 +57,7 @@
> #include <trace/events/oom.h>
> #include <linux/prefetch.h>
> #include <linux/mm_inline.h>
> +#include <linux/mmu_notifier.h>
> #include <linux/migrate.h>
> #include <linux/hugetlb.h>
> #include <linux/sched/rt.h>
> @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> static struct lockdep_map __fs_reclaim_map =
> STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
>
> -static bool __need_fs_reclaim(gfp_t gfp_mask)
> +static bool __need_reclaim(gfp_t gfp_mask)
> {
> gfp_mask = current_gfp_context(gfp_mask);
>
> @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> if (current->flags & PF_MEMALLOC)
> return false;
>
> - /* We're only interested __GFP_FS allocations for now */
> - if (!(gfp_mask & __GFP_FS))
> - return false;
> -
> if (gfp_mask & __GFP_NOLOCKDEP)
> return false;
>
> @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
>
> void fs_reclaim_acquire(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_acquire();
> + if (__need_reclaim(gfp_mask)) {
> + if (!(gfp_mask & __GFP_FS))
Hmm. Shouldn't this be "if (gfp_mask & __GFP_FS)" or am I misunderstanding?
> + __fs_reclaim_acquire();
#ifdef CONFIG_MMU_NOTIFIER?
> +
> + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> +
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
>
> void fs_reclaim_release(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_release();
> + if (__need_reclaim(gfp_mask)) {
> + if (!(gfp_mask & __GFP_FS))
Same here?
> + __fs_reclaim_release();
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_release);
> #endif
One suggested test case would perhaps be to call madvise(MADV_DONTNEED)
on a subpart of a transhuge page. That would IIRC trigger a page split
and interesting mmu notifier calls...
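A minimal userspace sketch of that test idea might look as follows. It
assumes a 2 MiB transparent hugepage size (the SKETCH_HUGE_SIZE constant
and the function name are made up here), and whether the range is really
backed by a huge page, and whether a split is actually triggered, depends
on the kernel configuration and THP settings.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SKETCH_HUGE_SIZE (2UL << 20)	/* assumed THP size, e.g. x86-64 */

/* Returns 0 if the partial MADV_DONTNEED succeeded, -1 otherwise. */
static int partial_dontneed_sketch(void)
{
	size_t len = 2 * SKETCH_HUGE_SIZE;
	long page = sysconf(_SC_PAGESIZE);
	char *map, *huge;
	int ret, zeroed;

	if (page <= 0)
		return -1;

	/* Over-allocate so a hugepage-aligned window fits inside. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	huge = (char *)(((uintptr_t)map + SKETCH_HUGE_SIZE - 1) &
			~(uintptr_t)(SKETCH_HUGE_SIZE - 1));

#ifdef MADV_HUGEPAGE
	/* Only a hint; it may fail when THP is unavailable, so ignore it. */
	(void)madvise(huge, SKETCH_HUGE_SIZE, MADV_HUGEPAGE);
#endif
	/* Fault the whole window in so there is something to split. */
	memset(huge, 0x5a, SKETCH_HUGE_SIZE);

	/* Drop a single base page from the middle of the window. */
	ret = madvise(huge + page, (size_t)page, MADV_DONTNEED);

	/* The dropped page reads back as zeroes, its neighbour is intact. */
	zeroed = (huge[page] == 0) && (huge[0] == 0x5a);

	munmap(map, len);
	return (ret == 0 && zeroed) ? 0 : -1;
}
```

Running this in a process that has a driver mmu notifier registered on its
mm, on a lockdep-enabled kernel, would be the interesting case; on its own
the program only exercises the madvise path.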
Thanks,
Thomas
* Re: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-10 12:01 ` Thomas Hellström (Intel)
@ 2020-06-10 12:25 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-10 12:25 UTC
To: Thomas Hellström (Intel)
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Linux MM, Jason Gunthorpe, DRI Development, Daniel Vetter,
Andrew Morton, Christian König
On Wed, Jun 10, 2020 at 2:01 PM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
> Hi, Daniel,
>
> Please see below.
>
> On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > fs_reclaim_acquire/release nicely catch recursion issues when
> > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > to use to keep the excessive caches in check). For mmu notifier
> > recursions we do have lockdep annotations since 23b68395c7c7
> > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> >
> > But these only fire if a path actually results in some pte
> > invalidation - for most small allocations that's very rarely the case.
> > The other trouble is that pte invalidation can happen any time when
> > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > recursion.
> >
> > I was pondering whether we should just do the general annotation, but
> > there's always the risk for false positives. Plus I'm assuming that
> > the core fs and io code is a lot better reviewed and tested than
> > random mmu notifier code in drivers. Hence why I decided to only
> > annotate for that specific case.
> >
> > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > still need to explicitly pull in the mmu notifier map - there's a lot
> > more places that do pte invalidation than just direct reclaim, these
> > two contexts aren't the same.
> >
> > Note that the mmu notifiers needing their own independent lockdep map
> > is also the reason we can't hold them from fs_reclaim_acquire to
> > fs_reclaim_release - it would nest with the acquisition in the pte
> > invalidation code, causing a lockdep splat. And we can't remove the
> > annotations from pte invalidation and all the other places since
> > they're called from many other places than page reclaim. Hence we can
> > only do the equivalent of might_lock, but on the raw lockdep map.
> >
> > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > strictly more powerful.
> >
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> > This is part of a gpu lockdep annotation series simply because it
> > really helps to catch issues where gpu subsystem locks and primitives
> > can deadlock with themselves through allocations and mmu notifiers.
> > But aside from that motivation it should be completely free-standing,
> > and can land through -mm/-rdma/-hmm or any other tree really whenever.
> > -Daniel
> > ---
> > mm/mmu_notifier.c | 7 -------
> > mm/page_alloc.c | 23 ++++++++++++++---------
> > 2 files changed, 14 insertions(+), 16 deletions(-)
> >
> > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> > index 06852b896fa6..5d578b9122f8 100644
> > --- a/mm/mmu_notifier.c
> > +++ b/mm/mmu_notifier.c
> > @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> > lockdep_assert_held_write(&mm->mmap_sem);
> > BUG_ON(atomic_read(&mm->mm_users) <= 0);
> >
> > - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> > - fs_reclaim_acquire(GFP_KERNEL);
> > - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > - fs_reclaim_release(GFP_KERNEL);
> > - }
> > -
> > if (!mm->notifier_subscriptions) {
> > /*
> > * kmalloc cannot be called under mm_take_all_locks(), but we
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 13cc653122b7..f8a222db4a53 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -57,6 +57,7 @@
> > #include <trace/events/oom.h>
> > #include <linux/prefetch.h>
> > #include <linux/mm_inline.h>
> > +#include <linux/mmu_notifier.h>
> > #include <linux/migrate.h>
> > #include <linux/hugetlb.h>
> > #include <linux/sched/rt.h>
> > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > static struct lockdep_map __fs_reclaim_map =
> > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> >
> > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > +static bool __need_reclaim(gfp_t gfp_mask)
> > {
> > gfp_mask = current_gfp_context(gfp_mask);
> >
> > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > if (current->flags & PF_MEMALLOC)
> > return false;
> >
> > - /* We're only interested __GFP_FS allocations for now */
> > - if (!(gfp_mask & __GFP_FS))
> > - return false;
> > -
> > if (gfp_mask & __GFP_NOLOCKDEP)
> > return false;
> >
> > @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
> >
> > void fs_reclaim_acquire(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_acquire();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (!(gfp_mask & __GFP_FS))
> Hmm. Shouldn't this be "if (gfp_mask & __GFP_FS)" or am I misunderstanding?
Uh yes :-( I guess what saved me is that I immediately went for the
lockdep splat in drivers/gpu. And I guess there aren't any obvious
inversions for GFP_NOFS/GFP_NOIO, and since I made the mistake
consistently the GFP_FS annotation was still consistent, but simply for
GFP_NOFS. Oops.
Will fix in the next version.
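For reference, the effect of the inverted check can be sketched as a small
userspace truth table. This is not kernel code: the SKETCH_* bit values and
the two helper names are invented, both helpers assume the earlier
__need_reclaim() test already passed, and the kswapd-reclaim bit is left
out of the composite masks for brevity.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's actual encoding. */
#define SKETCH_GFP_IO             0x1u
#define SKETCH_GFP_FS             0x2u
#define SKETCH_GFP_DIRECT_RECLAIM 0x4u

#define SKETCH_GFP_NOFS   (SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO)
#define SKETCH_GFP_KERNEL (SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO | \
			   SKETCH_GFP_FS)

/* Condition as posted in the first version: inverted by mistake. */
static bool sketch_v1_takes_fs_reclaim(unsigned int gfp_mask)
{
	return !(gfp_mask & SKETCH_GFP_FS);
}

/* Condition as suggested in review: only FS-capable allocations. */
static bool sketch_fixed_takes_fs_reclaim(unsigned int gfp_mask)
{
	return (gfp_mask & SKETCH_GFP_FS) != 0;
}
```

Since the same inverted test guarded both acquire and release, the
fs_reclaim map stayed balanced; it was just taken for the NOFS-style masks
instead of the GFP_KERNEL-style ones.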
> > + __fs_reclaim_acquire();
>
>
> #ifdef CONFIG_MMU_NOTIFIER?
Hm indeed. Will fix too.
Thanks for your review.
>
> > +
> > + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > +
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> >
> > void fs_reclaim_release(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_release();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (!(gfp_mask & __GFP_FS))
> Same here?
> > + __fs_reclaim_release();
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_release);
> > #endif
>
> One suggested test case would perhaps be to call madvise(MADV_DONTNEED)
> on a subpart of a transhuge page. That would IIRC trigger a page split
> and interesting mmu notifier calls...
The neat thing about the mmu notifier lockdep key is that we take it
whether there are notifiers or not - it's called outside of any of these
paths. So as long as you have hit a hugepage split at some point since
boot, and you've hit your driver's mmu_notifier paths, lockdep will
connect the dots. Explicit testcases for all combinations are not needed
anymore. This patch here just makes sure that the same holds for
memory allocations and direct reclaim (which is a lot harder to
trigger intentionally in testcases).
That was at least the idea, seems to have caught a few things already.
-Daniel
>
> Thanks,
> Thomas
>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-04 8:12 ` [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release Daniel Vetter
2020-06-10 12:01 ` Thomas Hellström (Intel)
@ 2020-06-10 19:41 ` Daniel Vetter
2020-06-21 17:42 ` Qian Cai
2020-06-21 17:00 ` [Intel-gfx] [PATCH 01/18] " Qian Cai
2 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-10 19:41 UTC
To: Intel Graphics Development
Cc: linux-rdma, Daniel Vetter, LKML, amd-gfx, linux-mm,
Jason Gunthorpe, DRI Development, Daniel Vetter, Andrew Morton,
Christian König
fs_reclaim_acquire/release nicely catch recursion issues when
allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
to use to keep the excessive caches in check). For mmu notifier
recursions we do have lockdep annotations since 23b68395c7c7
("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
But these only fire if a path actually results in some pte
invalidation - for most small allocations that's very rarely the case.
The other trouble is that pte invalidation can happen any time when
__GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
recursion.
I was pondering whether we should just do the general annotation, but
there's always the risk for false positives. Plus I'm assuming that
the core fs and io code is a lot better reviewed and tested than
random mmu notifier code in drivers. Hence why I decided to only
annotate for that specific case.
Furthermore even if we'd create a lockdep map for direct reclaim, we'd
still need to explicitly pull in the mmu notifier map - there's a lot
more places that do pte invalidation than just direct reclaim, these
two contexts aren't the same.
Note that the mmu notifiers needing their own independent lockdep map
is also the reason we can't hold them from fs_reclaim_acquire to
fs_reclaim_release - it would nest with the acquisition in the pte
invalidation code, causing a lockdep splat. And we can't remove the
annotations from pte invalidation and all the other places since
they're called from many other places than page reclaim. Hence we can
only do the equivalent of might_lock, but on the raw lockdep map.
With this we can also remove the lockdep priming added in 66204f1d2d1b
("mm/mmu_notifiers: prime lockdep") since the new annotations are
strictly more powerful.
v2: Review from Thomas Hellstrom:
- unbotch the fs_reclaim context check: I accidentally inverted it,
but it didn't blow up because I inverted it consistently in both
fs_reclaim_acquire and fs_reclaim_release
- fix compiling for !CONFIG_MMU_NOTIFIER
Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: linux-mm@kvack.org
Cc: linux-rdma@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
This is part of a gpu lockdep annotation series simply because it
really helps to catch issues where gpu subsystem locks and primitives
can deadlock with themselves through allocations and mmu notifiers.
But aside from that motivation it should be completely free-standing,
and can land through -mm/-rdma/-hmm or any other tree really whenever.
-Daniel
---
mm/mmu_notifier.c | 7 -------
mm/page_alloc.c | 25 ++++++++++++++++---------
2 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 06852b896fa6..5d578b9122f8 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
lockdep_assert_held_write(&mm->mmap_sem);
BUG_ON(atomic_read(&mm->mm_users) <= 0);
- if (IS_ENABLED(CONFIG_LOCKDEP)) {
- fs_reclaim_acquire(GFP_KERNEL);
- lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
- lock_map_release(&__mmu_notifier_invalidate_range_start_map);
- fs_reclaim_release(GFP_KERNEL);
- }
-
if (!mm->notifier_subscriptions) {
/*
* kmalloc cannot be called under mm_take_all_locks(), but we
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 13cc653122b7..7536faaaa0fd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -57,6 +57,7 @@
#include <trace/events/oom.h>
#include <linux/prefetch.h>
#include <linux/mm_inline.h>
+#include <linux/mmu_notifier.h>
#include <linux/migrate.h>
#include <linux/hugetlb.h>
#include <linux/sched/rt.h>
@@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
static struct lockdep_map __fs_reclaim_map =
STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
-static bool __need_fs_reclaim(gfp_t gfp_mask)
+static bool __need_reclaim(gfp_t gfp_mask)
{
gfp_mask = current_gfp_context(gfp_mask);
@@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
if (current->flags & PF_MEMALLOC)
return false;
- /* We're only interested __GFP_FS allocations for now */
- if (!(gfp_mask & __GFP_FS))
- return false;
-
if (gfp_mask & __GFP_NOLOCKDEP)
return false;
@@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
 
 void fs_reclaim_acquire(gfp_t gfp_mask)
 {
-	if (__need_fs_reclaim(gfp_mask))
-		__fs_reclaim_acquire();
+	if (__need_reclaim(gfp_mask)) {
+		if (gfp_mask & __GFP_FS)
+			__fs_reclaim_acquire();
+
+#ifdef CONFIG_MMU_NOTIFIER
+		lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
+		lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+#endif
+
+	}
 }
 EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
 
 void fs_reclaim_release(gfp_t gfp_mask)
 {
-	if (__need_fs_reclaim(gfp_mask))
-		__fs_reclaim_release();
+	if (__need_reclaim(gfp_mask)) {
+		if (gfp_mask & __GFP_FS)
+			__fs_reclaim_release();
+	}
 }
 EXPORT_SYMBOL_GPL(fs_reclaim_release);
 #endif
--
2.26.2
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-10 19:41 ` [Intel-gfx] [PATCH] " Daniel Vetter
@ 2020-06-21 17:42 ` Qian Cai
2020-06-21 18:07 ` Daniel Vetter
2020-06-23 22:31 ` Dave Chinner
0 siblings, 2 replies; 106+ messages in thread
From: Qian Cai @ 2020-06-21 17:42 UTC
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Christian König, linux-xfs, linux-mm, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> fs_reclaim_acquire/release nicely catch recursion issues when
> allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> to use to keep the excessive caches in check). For mmu notifier
> recursions we do have lockdep annotations since 23b68395c7c7
> ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
>
> But these only fire if a path actually results in some pte
> invalidation - for most small allocations that's very rarely the case.
> The other trouble is that pte invalidation can happen any time when
> __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> recursion.
>
> I was pondering whether we should just do the general annotation, but
> there's always the risk for false positives. Plus I'm assuming that
> the core fs and io code is a lot better reviewed and tested than
> random mmu notifier code in drivers. Hence why I decided to only
> annotate for that specific case.
>
> Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> still need to explicitly pull in the mmu notifier map - there's a lot
> more places that do pte invalidation than just direct reclaim, these
> two contexts aren't the same.
>
> Note that the mmu notifiers needing their own independent lockdep map
> is also the reason we can't hold them from fs_reclaim_acquire to
> fs_reclaim_release - it would nest with the acquisition in the pte
> invalidation code, causing a lockdep splat. And we can't remove the
> annotations from pte invalidation and all the other places since
> they're called from many other places than page reclaim. Hence we can
> only do the equivalent of might_lock, but on the raw lockdep map.
>
> With this we can also remove the lockdep priming added in 66204f1d2d1b
> ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> strictly more powerful.
>
> v2: Review from Thomas Hellstrom:
> - unbotch the fs_reclaim context check: I accidentally inverted it,
> but it didn't blow up because I inverted it consistently in both
> fs_reclaim_acquire and fs_reclaim_release
> - fix compiling for !CONFIG_MMU_NOTIFIER
>
> Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: linux-mm@kvack.org
> Cc: linux-rdma@vger.kernel.org
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Replying to the right patch here...
Reverting this commit [1] fixed the lockdep warning below while applying
some memory pressure.
[1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
[ 190.455003][ T369] WARNING: possible circular locking dependency detected
[ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
[ 190.512363][ T369] ------------------------------------------------------
[ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
[ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
spin_lock at include/linux/spinlock.h:353
(inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
(inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
(inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
[ 190.614359][ T369]
[ 190.614359][ T369] but task is already holding lock:
[ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
__fs_reclaim_acquire at mm/page_alloc.c:4200
[ 190.687845][ T369]
[ 190.687845][ T369] which lock already depends on the new lock.
[ 190.687845][ T369]
[ 190.734890][ T369]
[ 190.734890][ T369] the existing dependency chain (in reverse order) is:
[ 190.775991][ T369]
[ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
[ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
[ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
slab_pre_alloc_hook at mm/slab.h:507
[ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
[ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
kmem_zone_alloc at fs/xfs/kmem.c:129
[ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
[ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
[ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
xfs_setattr_nonsize at fs/xfs/xfs_iops.c:716
[ 190.987331][ T369] xfs_vn_setattr+0x133/0x160
xfs_vn_setattr at fs/xfs/xfs_iops.c:1081
[ 191.010476][ T369] notify_change+0x6c5/0xba1
notify_change at fs/attr.c:336
[ 191.033317][ T369] chmod_common+0x19b/0x390
[ 191.055770][ T369] ksys_fchmod+0x28/0x60
[ 191.077957][ T369] __x64_sys_fchmod+0x4e/0x70
[ 191.102767][ T369] do_syscall_64+0x5f/0x310
[ 191.125090][ T369] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 191.153749][ T369]
[ 191.153749][ T369] -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
[ 191.191267][ T369] __lock_acquire+0x2efc/0x4da0
[ 191.215974][ T369] lock_acquire+0x1ac/0xaf0
[ 191.238953][ T369] down_write_nested+0x92/0x150
[ 191.262955][ T369] xfs_reclaim_inode+0xdf/0x860
[ 191.287149][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
[ 191.313291][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
[ 191.338357][ T369] super_cache_scan+0x2fd/0x430
[ 191.362354][ T369] do_shrink_slab+0x317/0x990
[ 191.385341][ T369] shrink_slab+0x3a8/0x4b0
[ 191.407214][ T369] shrink_node+0x49c/0x17b0
[ 191.429841][ T369] balance_pgdat+0x59c/0xed0
[ 191.455041][ T369] kswapd+0x5a4/0xc40
[ 191.477524][ T369] kthread+0x358/0x420
[ 191.499285][ T369] ret_from_fork+0x22/0x30
[ 191.521107][ T369]
[ 191.521107][ T369] other info that might help us debug this:
[ 191.521107][ T369]
[ 191.567490][ T369] Possible unsafe locking scenario:
[ 191.567490][ T369]
[ 191.600947][ T369] CPU0 CPU1
[ 191.624808][ T369] ---- ----
[ 191.649236][ T369] lock(fs_reclaim);
[ 191.667607][ T369] lock(&xfs_nondir_ilock_class);
[ 191.702096][ T369] lock(fs_reclaim);
[ 191.731243][ T369] lock(&xfs_nondir_ilock_class);
[ 191.754025][ T369]
[ 191.754025][ T369] *** DEADLOCK ***
[ 191.754025][ T369]
[ 191.791126][ T369] 4 locks held by kswapd3/369:
[ 191.812198][ T369] #0: ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 191.854319][ T369] #1: ffffffffb5074c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
[ 191.896043][ T369] #2: ffff8890279b40e0 (&type->s_umount_key#27){++++}-{3:3}, at: trylock_super+0x11/0xb0
[ 191.940538][ T369] #3: ffff889027a73a28 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
[ 191.995314][ T369]
[ 191.995314][ T369] stack backtrace:
[ 192.022934][ T369] CPU: 42 PID: 369 Comm: kswapd3 Not tainted 5.8.0-rc1-next-20200621 #1
[ 192.060546][ T369] Hardware name: HP ProLiant BL660c Gen9, BIOS I38 10/17/2018
[ 192.094518][ T369] Call Trace:
[ 192.109005][ T369] dump_stack+0x9d/0xe0
[ 192.127468][ T369] check_noncircular+0x347/0x400
[ 192.149526][ T369] ? print_circular_bug+0x360/0x360
[ 192.172584][ T369] ? freezing_slow_path.cold.2+0x2a/0x2a
[ 192.197251][ T369] __lock_acquire+0x2efc/0x4da0
[ 192.218737][ T369] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 192.246736][ T369] ? __lock_acquire+0x3541/0x4da0
[ 192.269673][ T369] lock_acquire+0x1ac/0xaf0
[ 192.290192][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.313158][ T369] ? rcu_read_unlock+0x50/0x50
[ 192.335057][ T369] down_write_nested+0x92/0x150
[ 192.358409][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.380890][ T369] ? rwsem_down_write_slowpath+0xf50/0xf50
[ 192.406891][ T369] ? find_held_lock+0x33/0x1c0
[ 192.427925][ T369] ? xfs_ilock+0x2ef/0x370
[ 192.447496][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.472315][ T369] xfs_reclaim_inode+0xdf/0x860
[ 192.496649][ T369] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
[ 192.524188][ T369] ? do_raw_spin_unlock+0x4f/0x250
[ 192.546852][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
[ 192.570473][ T369] ? xfs_reclaim_inode+0x860/0x860
[ 192.592692][ T369] ? mark_held_locks+0xb0/0x110
[ 192.614287][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 192.640800][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 192.666695][ T369] ? try_to_wake_up+0xcf/0xf40
[ 192.688265][ T369] ? migrate_swap_stop+0xc10/0xc10
[ 192.711966][ T369] ? do_raw_spin_unlock+0x4f/0x250
[ 192.735032][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
xfs_reclaim_inodes_nr at fs/xfs/xfs_icache.c:1399
[ 192.757674][ T369] ? xfs_reclaim_inodes+0x90/0x90
[ 192.780028][ T369] ? list_lru_count_one+0x177/0x300
[ 192.803010][ T369] super_cache_scan+0x2fd/0x430
super_cache_scan at fs/super.c:115
[ 192.824491][ T369] do_shrink_slab+0x317/0x990
do_shrink_slab at mm/vmscan.c:514
[ 192.845160][ T369] shrink_slab+0x3a8/0x4b0
shrink_slab_memcg at mm/vmscan.c:584
(inlined by) shrink_slab at mm/vmscan.c:662
[ 192.864722][ T369] ? do_shrink_slab+0x990/0x990
[ 192.886137][ T369] ? rcu_is_watching+0x2c/0x80
[ 192.907289][ T369] ? mem_cgroup_protected+0x228/0x470
[ 192.931166][ T369] ? vmpressure+0x25/0x290
[ 192.950595][ T369] shrink_node+0x49c/0x17b0
[ 192.972332][ T369] balance_pgdat+0x59c/0xed0
kswapd_shrink_node at mm/vmscan.c:3521
(inlined by) balance_pgdat at mm/vmscan.c:3670
[ 192.994918][ T369] ? __node_reclaim+0x950/0x950
[ 193.018625][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 193.046566][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
[ 193.070214][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
[ 193.093176][ T369] ? finish_task_switch+0x129/0x650
[ 193.116225][ T369] ? finish_task_switch+0xf2/0x650
[ 193.138809][ T369] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 193.163323][ T369] kswapd+0x5a4/0xc40
[ 193.182690][ T369] ? __kthread_parkme+0x4d/0x1a0
[ 193.204660][ T369] ? balance_pgdat+0xed0/0xed0
[ 193.225776][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 193.252306][ T369] ? finish_wait+0x270/0x270
[ 193.272473][ T369] ? __kthread_parkme+0x4d/0x1a0
[ 193.294476][ T369] ? __kthread_parkme+0xcc/0x1a0
[ 193.316704][ T369] ? balance_pgdat+0xed0/0xed0
[ 193.337808][ T369] kthread+0x358/0x420
[ 193.355666][ T369] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 193.381884][ T369] ret_from_fork+0x22/0x30
> ---
> This is part of a gpu lockdep annotation series simply because it
> really helps to catch issues where gpu subsystem locks and primitives
> can deadlock with themselves through allocations and mmu notifiers.
> But aside from that motivation it should be completely free-standing,
> and can land through -mm/-rdma/-hmm or any other tree really whenever.
> -Daniel
> ---
> mm/mmu_notifier.c | 7 -------
> mm/page_alloc.c | 25 ++++++++++++++++---------
> 2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 06852b896fa6..5d578b9122f8 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> lockdep_assert_held_write(&mm->mmap_sem);
> BUG_ON(atomic_read(&mm->mm_users) <= 0);
>
> - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> - fs_reclaim_acquire(GFP_KERNEL);
> - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> - fs_reclaim_release(GFP_KERNEL);
> - }
> -
> if (!mm->notifier_subscriptions) {
> /*
> * kmalloc cannot be called under mm_take_all_locks(), but we
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 13cc653122b7..7536faaaa0fd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -57,6 +57,7 @@
> #include <trace/events/oom.h>
> #include <linux/prefetch.h>
> #include <linux/mm_inline.h>
> +#include <linux/mmu_notifier.h>
> #include <linux/migrate.h>
> #include <linux/hugetlb.h>
> #include <linux/sched/rt.h>
> @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> static struct lockdep_map __fs_reclaim_map =
> STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
>
> -static bool __need_fs_reclaim(gfp_t gfp_mask)
> +static bool __need_reclaim(gfp_t gfp_mask)
> {
> gfp_mask = current_gfp_context(gfp_mask);
>
> @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> if (current->flags & PF_MEMALLOC)
> return false;
>
> - /* We're only interested __GFP_FS allocations for now */
> - if (!(gfp_mask & __GFP_FS))
> - return false;
> -
> if (gfp_mask & __GFP_NOLOCKDEP)
> return false;
>
> @@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
>
> void fs_reclaim_acquire(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_acquire();
> + if (__need_reclaim(gfp_mask)) {
> + if (gfp_mask & __GFP_FS)
> + __fs_reclaim_acquire();
> +
> +#ifdef CONFIG_MMU_NOTIFIER
> + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> +#endif
> +
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
>
> void fs_reclaim_release(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_release();
> + if (__need_reclaim(gfp_mask)) {
> + if (gfp_mask & __GFP_FS)
> + __fs_reclaim_release();
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_release);
> #endif
> --
> 2.26.2
>
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 17:42 ` Qian Cai
@ 2020-06-21 18:07 ` Daniel Vetter
2020-06-21 20:01 ` Daniel Vetter
2020-06-23 22:31 ` Dave Chinner
1 sibling, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-21 18:07 UTC (permalink / raw)
To: Qian Cai
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
On Sun, Jun 21, 2020 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
>
> On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > fs_reclaim_acquire/release nicely catch recursion issues when
> > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > to use to keep the excessive caches in check). For mmu notifier
> > recursions we do have lockdep annotations since 23b68395c7c7
> > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> >
> > But these only fire if a path actually results in some pte
> > invalidation - for most small allocations that's very rarely the case.
> > The other trouble is that pte invalidation can happen any time when
> > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > recursion.
> >
> > I was pondering whether we should just do the general annotation, but
> > there's always the risk for false positives. Plus I'm assuming that
> > the core fs and io code is a lot better reviewed and tested than
> > > random mmu notifier code in drivers. Hence I decided to only
> > > annotate for that specific case.
> > >
> > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > still need to explicitly pull in the mmu notifier map - there's a lot
> > > more places that do pte invalidation than just direct reclaim, these
> > > two contexts aren't the same.
> > >
> > > Note that the mmu notifiers needing their own independent lockdep map
> > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > fs_reclaim_release - it would nest with the acquisition in the pte
> > invalidation code, causing a lockdep splat. And we can't remove the
> > annotations from pte invalidation and all the other places since
> > they're called from many other places than page reclaim. Hence we can
> > only do the equivalent of might_lock, but on the raw lockdep map.
> >
> > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > strictly more powerful.
> >
> > v2: Review from Thomas Hellstrom:
> > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > but it didn't blow up because I inverted it immediately
> > - fix compiling for !CONFIG_MMU_NOTIFIER
> >
> > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>
> Replying to the right patch here...
>
> Reverting this commit [1] fixed the lockdep warning below while applying
> some memory pressure.
>
> [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
Hm, then I'm confused because
- there's no mmu notifier lockdep map in the splat at all
- the patch is supposed to not change anything for fs_reclaim (but the
interim version got that wrong)
- looking at the paths it's kmalloc vs kswapd, both places where I totally
expect fs_reclaim to be used.

But you're claiming reverting this prevents the lockdep splat. If
that's right, then my reasoning above is broken somewhere. Does someone
less blind than me have an idea?

Aside: this is the first email I typed, before I realized the first
report was against the broken patch, which looked like a much more
reasonable explanation (but didn't quite match up with the code
paths).
Thanks, Daniel
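
For readers following the splat quoted in this message, the reported cycle can be replayed in a toy userspace model. This is only a sketch: the enum names, record_dependency() and replay_splat() are made up for illustration and are not lockdep's real data structures. The assumption that the fchmod path already holds the XFS ilock when it allocates is read off the "Possible unsafe locking scenario" table, not checked against the XFS source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model of lock-class ordering; NOT the kernel's lockdep implementation. */
enum { FS_RECLAIM, XFS_ILOCK, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = {
	"fs_reclaim", "&xfs_nondir_ilock_class",
};

/* dep[a][b] == true: class b was acquired while class a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record the ordering "held -> wanted". Returns false, and records nothing,
 * if the reverse ordering wanted -> ... -> held is already known (a cycle).
 */
static bool record_dependency(int held, int wanted)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(wanted, held, seen)) {
		printf("possible circular locking dependency: %s -> %s\n",
		       class_name[held], class_name[wanted]);
		return false;
	}
	dep[held][wanted] = true;
	return true;
}

/* Replays both chains from the splat; returns 1 if the second one is flagged. */
int replay_splat(void)
{
	/* Chain #1: fchmod path holds the ilock, then allocates (fs_reclaim). */
	bool first = record_dependency(XFS_ILOCK, FS_RECLAIM);
	/* Chain #0: kswapd holds fs_reclaim, then the shrinker takes the ilock. */
	bool second = record_dependency(FS_RECLAIM, XFS_ILOCK);

	return first && !second;
}
```

Calling replay_splat() from a small main() reports the second ordering as circular. Neither class in this model is the mmu notifier map, which is the first point in the list above.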
>
> [ 190.455003][ T369] WARNING: possible circular locking dependency detected
> [ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
> [ 190.512363][ T369] ------------------------------------------------------
> [ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
> [ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
> spin_lock at include/linux/spinlock.h:353
> (inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
> (inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
> (inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
> [ 190.614359][ T369]
> [ 190.614359][ T369] but task is already holding lock:
> [ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> __fs_reclaim_acquire at mm/page_alloc.c:4200
> [ 190.687845][ T369]
> [ 190.687845][ T369] which lock already depends on the new lock.
> [ 190.687845][ T369]
> [ 190.734890][ T369]
> [ 190.734890][ T369] the existing dependency chain (in reverse order) is:
> [ 190.775991][ T369]
> [ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
> [ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
> slab_pre_alloc_hook at mm/slab.h:507
> [ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
> [ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
> kmem_zone_alloc at fs/xfs/kmem.c:129
> [ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
> xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
> [ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
> xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
> [ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
> xfs_setattr_nonsize at fs/xfs/xfs_iops.c:716
> [ 190.987331][ T369] xfs_vn_setattr+0x133/0x160
> xfs_vn_setattr at fs/xfs/xfs_iops.c:1081
> [ 191.010476][ T369] notify_change+0x6c5/0xba1
> notify_change at fs/attr.c:336
> [ 191.033317][ T369] chmod_common+0x19b/0x390
> [ 191.055770][ T369] ksys_fchmod+0x28/0x60
> [ 191.077957][ T369] __x64_sys_fchmod+0x4e/0x70
> [ 191.102767][ T369] do_syscall_64+0x5f/0x310
> [ 191.125090][ T369] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 191.153749][ T369]
> [ 191.153749][ T369] -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
> [ 191.191267][ T369] __lock_acquire+0x2efc/0x4da0
> [ 191.215974][ T369] lock_acquire+0x1ac/0xaf0
> [ 191.238953][ T369] down_write_nested+0x92/0x150
> [ 191.262955][ T369] xfs_reclaim_inode+0xdf/0x860
> [ 191.287149][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> [ 191.313291][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> [ 191.338357][ T369] super_cache_scan+0x2fd/0x430
> [ 191.362354][ T369] do_shrink_slab+0x317/0x990
> [ 191.385341][ T369] shrink_slab+0x3a8/0x4b0
> [ 191.407214][ T369] shrink_node+0x49c/0x17b0
> [ 191.429841][ T369] balance_pgdat+0x59c/0xed0
> [ 191.455041][ T369] kswapd+0x5a4/0xc40
> [ 191.477524][ T369] kthread+0x358/0x420
> [ 191.499285][ T369] ret_from_fork+0x22/0x30
> [ 191.521107][ T369]
> [ 191.521107][ T369] other info that might help us debug this:
> [ 191.521107][ T369]
> [ 191.567490][ T369] Possible unsafe locking scenario:
> [ 191.567490][ T369]
> [ 191.600947][ T369] CPU0 CPU1
> [ 191.624808][ T369] ---- ----
> [ 191.649236][ T369] lock(fs_reclaim);
> [ 191.667607][ T369] lock(&xfs_nondir_ilock_class);
> [ 191.702096][ T369] lock(fs_reclaim);
> [ 191.731243][ T369] lock(&xfs_nondir_ilock_class);
> [ 191.754025][ T369]
> [ 191.754025][ T369] *** DEADLOCK ***
> [ 191.754025][ T369]
> [ 191.791126][ T369] 4 locks held by kswapd3/369:
> [ 191.812198][ T369] #0: ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> [ 191.854319][ T369] #1: ffffffffb5074c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
> [ 191.896043][ T369] #2: ffff8890279b40e0 (&type->s_umount_key#27){++++}-{3:3}, at: trylock_super+0x11/0xb0
> [ 191.940538][ T369] #3: ffff889027a73a28 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
> [ 191.995314][ T369]
> [ 191.995314][ T369] stack backtrace:
> [ 192.022934][ T369] CPU: 42 PID: 369 Comm: kswapd3 Not tainted 5.8.0-rc1-next-20200621 #1
> [ 192.060546][ T369] Hardware name: HP ProLiant BL660c Gen9, BIOS I38 10/17/2018
> [ 192.094518][ T369] Call Trace:
> [ 192.109005][ T369] dump_stack+0x9d/0xe0
> [ 192.127468][ T369] check_noncircular+0x347/0x400
> [ 192.149526][ T369] ? print_circular_bug+0x360/0x360
> [ 192.172584][ T369] ? freezing_slow_path.cold.2+0x2a/0x2a
> [ 192.197251][ T369] __lock_acquire+0x2efc/0x4da0
> [ 192.218737][ T369] ? lockdep_hardirqs_on_prepare+0x550/0x550
> [ 192.246736][ T369] ? __lock_acquire+0x3541/0x4da0
> [ 192.269673][ T369] lock_acquire+0x1ac/0xaf0
> [ 192.290192][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.313158][ T369] ? rcu_read_unlock+0x50/0x50
> [ 192.335057][ T369] down_write_nested+0x92/0x150
> [ 192.358409][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.380890][ T369] ? rwsem_down_write_slowpath+0xf50/0xf50
> [ 192.406891][ T369] ? find_held_lock+0x33/0x1c0
> [ 192.427925][ T369] ? xfs_ilock+0x2ef/0x370
> [ 192.447496][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.472315][ T369] xfs_reclaim_inode+0xdf/0x860
> [ 192.496649][ T369] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
> [ 192.524188][ T369] ? do_raw_spin_unlock+0x4f/0x250
> [ 192.546852][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> [ 192.570473][ T369] ? xfs_reclaim_inode+0x860/0x860
> [ 192.592692][ T369] ? mark_held_locks+0xb0/0x110
> [ 192.614287][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> [ 192.640800][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> [ 192.666695][ T369] ? try_to_wake_up+0xcf/0xf40
> [ 192.688265][ T369] ? migrate_swap_stop+0xc10/0xc10
> [ 192.711966][ T369] ? do_raw_spin_unlock+0x4f/0x250
> [ 192.735032][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> xfs_reclaim_inodes_nr at fs/xfs/xfs_icache.c:1399
> [ 192.757674][ T369] ? xfs_reclaim_inodes+0x90/0x90
> [ 192.780028][ T369] ? list_lru_count_one+0x177/0x300
> [ 192.803010][ T369] super_cache_scan+0x2fd/0x430
> super_cache_scan at fs/super.c:115
> [ 192.824491][ T369] do_shrink_slab+0x317/0x990
> do_shrink_slab at mm/vmscan.c:514
> [ 192.845160][ T369] shrink_slab+0x3a8/0x4b0
> shrink_slab_memcg at mm/vmscan.c:584
> (inlined by) shrink_slab at mm/vmscan.c:662
> [ 192.864722][ T369] ? do_shrink_slab+0x990/0x990
> [ 192.886137][ T369] ? rcu_is_watching+0x2c/0x80
> [ 192.907289][ T369] ? mem_cgroup_protected+0x228/0x470
> [ 192.931166][ T369] ? vmpressure+0x25/0x290
> [ 192.950595][ T369] shrink_node+0x49c/0x17b0
> [ 192.972332][ T369] balance_pgdat+0x59c/0xed0
> kswapd_shrink_node at mm/vmscan.c:3521
> (inlined by) balance_pgdat at mm/vmscan.c:3670
> [ 192.994918][ T369] ? __node_reclaim+0x950/0x950
> [ 193.018625][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> [ 193.046566][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> [ 193.070214][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> [ 193.093176][ T369] ? finish_task_switch+0x129/0x650
> [ 193.116225][ T369] ? finish_task_switch+0xf2/0x650
> [ 193.138809][ T369] ? rcu_read_lock_bh_held+0xc0/0xc0
> [ 193.163323][ T369] kswapd+0x5a4/0xc40
> [ 193.182690][ T369] ? __kthread_parkme+0x4d/0x1a0
> [ 193.204660][ T369] ? balance_pgdat+0xed0/0xed0
> [ 193.225776][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> [ 193.252306][ T369] ? finish_wait+0x270/0x270
> [ 193.272473][ T369] ? __kthread_parkme+0x4d/0x1a0
> [ 193.294476][ T369] ? __kthread_parkme+0xcc/0x1a0
> [ 193.316704][ T369] ? balance_pgdat+0xed0/0xed0
> [ 193.337808][ T369] kthread+0x358/0x420
> [ 193.355666][ T369] ? kthread_create_worker_on_cpu+0xc0/0xc0
> [ 193.381884][ T369] ret_from_fork+0x22/0x30
>
> > ---
> > This is part of a gpu lockdep annotation series simply because it
> > really helps to catch issues where gpu subsystem locks and primitives
> > can deadlock with themselves through allocations and mmu notifiers.
> > But aside from that motivation it should be completely free-standing,
> > and can land through -mm/-rdma/-hmm or any other tree really whenever.
> > -Daniel
> > ---
> > mm/mmu_notifier.c | 7 -------
> > mm/page_alloc.c | 25 ++++++++++++++++---------
> > 2 files changed, 16 insertions(+), 16 deletions(-)
> >
> > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> > index 06852b896fa6..5d578b9122f8 100644
> > --- a/mm/mmu_notifier.c
> > +++ b/mm/mmu_notifier.c
> > @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> > lockdep_assert_held_write(&mm->mmap_sem);
> > BUG_ON(atomic_read(&mm->mm_users) <= 0);
> >
> > - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> > - fs_reclaim_acquire(GFP_KERNEL);
> > - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > - fs_reclaim_release(GFP_KERNEL);
> > - }
> > -
> > if (!mm->notifier_subscriptions) {
> > /*
> > * kmalloc cannot be called under mm_take_all_locks(), but we
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 13cc653122b7..7536faaaa0fd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -57,6 +57,7 @@
> > #include <trace/events/oom.h>
> > #include <linux/prefetch.h>
> > #include <linux/mm_inline.h>
> > +#include <linux/mmu_notifier.h>
> > #include <linux/migrate.h>
> > #include <linux/hugetlb.h>
> > #include <linux/sched/rt.h>
> > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > static struct lockdep_map __fs_reclaim_map =
> > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> >
> > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > +static bool __need_reclaim(gfp_t gfp_mask)
> > {
> > gfp_mask = current_gfp_context(gfp_mask);
> >
> > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > if (current->flags & PF_MEMALLOC)
> > return false;
> >
> > - /* We're only interested __GFP_FS allocations for now */
> > - if (!(gfp_mask & __GFP_FS))
> > - return false;
> > -
> > if (gfp_mask & __GFP_NOLOCKDEP)
> > return false;
> >
> > @@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
> >
> > void fs_reclaim_acquire(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_acquire();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (gfp_mask & __GFP_FS)
> > + __fs_reclaim_acquire();
> > +
> > +#ifdef CONFIG_MMU_NOTIFIER
> > + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > +#endif
> > +
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> >
> > void fs_reclaim_release(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_release();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (gfp_mask & __GFP_FS)
> > + __fs_reclaim_release();
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_release);
> > #endif
> > --
> > 2.26.2
> >
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 18:07 ` Daniel Vetter
@ 2020-06-21 20:01 ` Daniel Vetter
2020-06-21 22:09 ` Qian Cai
2020-06-23 16:17 ` Qian Cai
0 siblings, 2 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-21 20:01 UTC (permalink / raw)
To: Qian Cai
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
On Sun, Jun 21, 2020 at 08:07:08PM +0200, Daniel Vetter wrote:
> On Sun, Jun 21, 2020 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
> >
> > On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > to use to keep the excessive caches in check). For mmu notifier
> > > recursions we do have lockdep annotations since 23b68395c7c7
> > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > >
> > > But these only fire if a path actually results in some pte
> > > invalidation - for most small allocations that's very rarely the case.
> > > The other trouble is that pte invalidation can happen any time when
> > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > recursion.
> > >
> > > I was pondering whether we should just do the general annotation, but
> > > there's always the risk for false positives. Plus I'm assuming that
> > > the core fs and io code is a lot better reviewed and tested than
> > > > random mmu notifier code in drivers. Hence I decided to only
> > > > annotate for that specific case.
> > > >
> > > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > > still need to explicitly pull in the mmu notifier map - there's a lot
> > > > more places that do pte invalidation than just direct reclaim, these
> > > > two contexts aren't the same.
> > > >
> > > > Note that the mmu notifiers needing their own independent lockdep map
> > > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > > fs_reclaim_release - it would nest with the acquisition in the pte
> > > invalidation code, causing a lockdep splat. And we can't remove the
> > > annotations from pte invalidation and all the other places since
> > > they're called from many other places than page reclaim. Hence we can
> > > only do the equivalent of might_lock, but on the raw lockdep map.
> > >
> > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > strictly more powerful.
> > >
> > > v2: Review from Thomas Hellstrom:
> > > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > > but it didn't blow up because I inverted it immediately
> > > - fix compiling for !CONFIG_MMU_NOTIFIER
> > >
> > > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > Cc: linux-mm@kvack.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Cc: Christian König <christian.koenig@amd.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >
> > Replying to the right patch here...
> >
> > Reverting this commit [1] fixed the lockdep warning below while applying
> > some memory pressure.
> >
> > [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
>
> Hm, then I'm confused because
> - there's no mmu notifier lockdep map in the splat at all
> - the patch is supposed to not change anything for fs_reclaim (but the
> interim version got that wrong)
> - looking at the paths it's kmalloc vs kswapd, both places I totally
> expect fs_reclaim to be used.
>
> But you're claiming reverting this prevents the lockdep splat. If
> that's right, then my reasoning above is broken somewhere. Someone
> less blind than me having an idea?
>
> Aside this is the first email I've typed, until I realized the first
> report was against the broken patch and that looked like a much more
> reasonable explanation (but didn't quite match up with the code
> paths).
The diff below should undo the functional change in my patch. Can you
please test whether the lockdep splat is really gone with that? It might
need a lot of testing and memory pressure to be sure, since all these
reclaim paths aren't very deterministic.
-Daniel
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d807587c9ae6..27ea763c6155 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4191,11 +4191,6 @@ void fs_reclaim_acquire(gfp_t gfp_mask)
 		if (gfp_mask & __GFP_FS)
 			__fs_reclaim_acquire();
 
-#ifdef CONFIG_MMU_NOTIFIER
-		lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
-		lock_map_release(&__mmu_notifier_invalidate_range_start_map);
-#endif
-
 	}
 }
 EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
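
The annotation removed by the diff above is the "equivalent of might_lock" described in the commit message: acquire the mmu notifier lockdep map and release it right away, so that only the ordering is recorded. A toy userspace model suggests why that is enough to flag an inversion even when the allocation never triggers pte invalidation. The toy_* helpers and the driver lock are hypothetical names for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model; names echo the thread but none of this is kernel code. */
enum { DRIVER_LOCK, MMU_NOTIFIER_MAP, NR_CLASSES };

static bool dep[NR_CLASSES][NR_CLASSES];	/* dep[a][b]: b taken under a */
static int held[NR_CLASSES];			/* stack of held classes */
static int nr_held;
static int nr_reports;

static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/* Record "held -> cls" for every held class, or count a report on a cycle. */
static void toy_acquire(int cls)
{
	for (int i = 0; i < nr_held; i++) {
		bool seen[NR_CLASSES] = { false };

		if (reachable(cls, held[i], seen)) {
			nr_reports++;
			printf("inversion: class %d taken under class %d\n",
			       cls, held[i]);
		} else {
			dep[held[i]][cls] = true;
		}
	}
	held[nr_held++] = cls;
}

static void toy_release(void)
{
	nr_held--;
}

/* might_lock() equivalent: record the ordering, then hold nothing. */
static void toy_might_acquire(int cls)
{
	toy_acquire(cls);
	toy_release();
}

/* Hypothetical driver path that allocates while holding its own lock. */
static void driver_alloc_path(bool annotate_allocation)
{
	toy_acquire(DRIVER_LOCK);
	if (annotate_allocation)	/* what the patch adds to fs_reclaim_acquire() */
		toy_might_acquire(MMU_NOTIFIER_MAP);
	toy_release();
}

/* Hypothetical invalidate callback: driver lock taken under the notifier map. */
static void driver_invalidate_path(void)
{
	toy_acquire(MMU_NOTIFIER_MAP);
	toy_acquire(DRIVER_LOCK);
	toy_release();
	toy_release();
}

/* Runs both paths from a clean state and returns the number of reports. */
int count_reports(bool annotate_allocation)
{
	for (int a = 0; a < NR_CLASSES; a++)
		for (int b = 0; b < NR_CLASSES; b++)
			dep[a][b] = false;
	nr_held = 0;
	nr_reports = 0;

	driver_alloc_path(annotate_allocation);
	driver_invalidate_path();
	return nr_reports;
}
```

In this model count_reports(false) stays silent, while count_reports(true) flags the inversion without the allocation path ever running the invalidate callback.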
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 20:01 ` Daniel Vetter
@ 2020-06-21 22:09 ` Qian Cai
2020-06-23 16:17 ` Qian Cai
1 sibling, 0 replies; 106+ messages in thread
From: Qian Cai @ 2020-06-21 22:09 UTC (permalink / raw)
To: Intel Graphics Development, DRI Development, LKML, amd-gfx list,
Thomas Hellström, Andrew Morton, Jason Gunthorpe, Linux MM,
linux-rdma, Maarten Lankhorst, Christian König,
Daniel Vetter, linux-xfs
On Sun, Jun 21, 2020 at 10:01:03PM +0200, Daniel Vetter wrote:
> On Sun, Jun 21, 2020 at 08:07:08PM +0200, Daniel Vetter wrote:
> > On Sun, Jun 21, 2020 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
> > >
> > > On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > > to use to keep the excessive caches in check). For mmu notifier
> > > > recursions we do have lockdep annotations since 23b68395c7c7
> > > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > > >
> > > > But these only fire if a path actually results in some pte
> > > > invalidation - for most small allocations that's very rarely the case.
> > > > The other trouble is that pte invalidation can happen any time when
> > > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > > recursion.
> > > >
> > > > I was pondering whether we should just do the general annotation, but
> > > > there's always the risk for false positives. Plus I'm assuming that
> > > > the core fs and io code is a lot better reviewed and tested than
> > > > > random mmu notifier code in drivers. Hence I decided to only
> > > > > annotate for that specific case.
> > > > >
> > > > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > > > still need to explicitly pull in the mmu notifier map - there's a lot
> > > > > more places that do pte invalidation than just direct reclaim, these
> > > > > two contexts aren't the same.
> > > > >
> > > > > Note that the mmu notifiers needing their own independent lockdep map
> > > > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > > > fs_reclaim_release - it would nest with the acquisition in the pte
> > > > invalidation code, causing a lockdep splat. And we can't remove the
> > > > annotations from pte invalidation and all the other places since
> > > > they're called from many other places than page reclaim. Hence we can
> > > > only do the equivalent of might_lock, but on the raw lockdep map.
> > > >
> > > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > > strictly more powerful.
> > > >
> > > > v2: Review from Thomas Hellstrom:
> > > > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > > > but it didn't blow up because I inverted it immediately
> > > > - fix compiling for !CONFIG_MMU_NOTIFIER
> > > >
> > > > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > > Cc: linux-mm@kvack.org
> > > > Cc: linux-rdma@vger.kernel.org
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Christian König <christian.koenig@amd.com>
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > >
> > > Replying the right patch here...
> > >
> > > Reverting this commit [1] fixed the lockdep warning below while applying
> > > some memory pressure.
> > >
> > > [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
> >
> > Hm, then I'm confused because
> > - there's not mmut notifier lockdep map in the splat at a..
> > - the patch is supposed to not change anything for fs_reclaim (but the
> > interim version got that wrong)
> > - looking at the paths it's kmalloc vs kswapd, both places I totally
> > expect fs_reflaim to be used.
> >
> > But you're claiming reverting this prevents the lockdep splat. If
> > that's right, then my reasoning above is broken somewhere. Someone
> > less blind than me having an idea?
> >
> > Aside this is the first email I've typed, until I realized the first
> > report was against the broken patch and that looked like a much more
> > reasonable explanation (but didn't quite match up with the code
> > paths).
>
> Below diff should undo the functional change in my patch. Can you pls test
> whether the lockdep splat is really gone with that? Might need a lot of
> testing and memory pressure to be sure, since all these reclaim paths
> aren't very deterministic.
Well, I run heavy memory-pressure workloads on linux-next nearly every
day, and never saw this splat until today, when your patch first showed
up.

Since I am rather busy tracking another regression, here are the steps to
reproduce (super easy to reproduce on multiple machines here):

# git clone https://github.com/cailca/linux-mm.git
# cd linux-mm; make
# ./random 0

The .config is in there as well, if it ever matters.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 20:01 ` Daniel Vetter
2020-06-21 22:09 ` Qian Cai
@ 2020-06-23 16:17 ` Qian Cai
2020-06-23 22:13 ` Daniel Vetter
1 sibling, 1 reply; 106+ messages in thread
From: Qian Cai @ 2020-06-23 16:17 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
On Sun, Jun 21, 2020 at 10:01:03PM +0200, Daniel Vetter wrote:
> On Sun, Jun 21, 2020 at 08:07:08PM +0200, Daniel Vetter wrote:
> > On Sun, Jun 21, 2020 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
> > >
> > > On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > > to use to keep the excessive caches in check). For mmu notifier
> > > > recursions we do have lockdep annotations since 23b68395c7c7
> > > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > > >
> > > > But these only fire if a path actually results in some pte
> > > > invalidation - for most small allocations that's very rarely the case.
> > > > The other trouble is that pte invalidation can happen any time when
> > > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > > recursion.
> > > >
> > > > I was pondering whether we should just do the general annotation, but
> > > > there's always the risk for false positives. Plus I'm assuming that
> > > > the core fs and io code is a lot better reviewed and tested than
> > > > random mmu notifier code in drivers. Hence why I decide to only
> > > > annotate for that specific case.
> > > >
> > > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > > still need to explicit pull in the mmu notifier map - there's a lot
> > > > more places that do pte invalidation than just direct reclaim, these
> > > > two contexts arent the same.
> > > >
> > > > Note that the mmu notifiers needing their own independent lockdep map
> > > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > > fs_reclaim_release - it would nest with the acquistion in the pte
> > > > invalidation code, causing a lockdep splat. And we can't remove the
> > > > annotations from pte invalidation and all the other places since
> > > > they're called from many other places than page reclaim. Hence we can
> > > > only do the equivalent of might_lock, but on the raw lockdep map.
> > > >
> > > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > > strictly more powerful.
> > > >
> > > > v2: Review from Thomas Hellstrom:
> > > > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > > > but it didn't blow up because I inverted it immediately
> > > > - fix compiling for !CONFIG_MMU_NOTIFIER
> > > >
> > > > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > > Cc: linux-mm@kvack.org
> > > > Cc: linux-rdma@vger.kernel.org
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Christian König <christian.koenig@amd.com>
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > >
> > > Replying the right patch here...
> > >
> > > Reverting this commit [1] fixed the lockdep warning below while applying
> > > some memory pressure.
> > >
> > > [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
> >
> > Hm, then I'm confused because
> > - there's not mmut notifier lockdep map in the splat at a..
> > - the patch is supposed to not change anything for fs_reclaim (but the
> > interim version got that wrong)
> > - looking at the paths it's kmalloc vs kswapd, both places I totally
> > expect fs_reflaim to be used.
> >
> > But you're claiming reverting this prevents the lockdep splat. If
> > that's right, then my reasoning above is broken somewhere. Someone
> > less blind than me having an idea?
> >
> > Aside this is the first email I've typed, until I realized the first
> > report was against the broken patch and that looked like a much more
> > reasonable explanation (but didn't quite match up with the code
> > paths).
>
> Below diff should undo the functional change in my patch. Can you pls test
> whether the lockdep splat is really gone with that? Might need a lot of
> testing and memory pressure to be sure, since all these reclaim paths
> aren't very deterministic.
No, this patch does not help, but reverting the whole patch still fixes
the splat.
> -Daniel
>
> ---
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d807587c9ae6..27ea763c6155 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4191,11 +4191,6 @@ void fs_reclaim_acquire(gfp_t gfp_mask)
> if (gfp_mask & __GFP_FS)
> __fs_reclaim_acquire();
>
> -#ifdef CONFIG_MMU_NOTIFIER
> - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> -#endif
> -
> }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-23 16:17 ` Qian Cai
@ 2020-06-23 22:13 ` Daniel Vetter
2020-06-23 22:29 ` Qian Cai
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-23 22:13 UTC (permalink / raw)
To: Qian Cai
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
On Tue, Jun 23, 2020 at 6:18 PM Qian Cai <cai@lca.pw> wrote:
>
> On Sun, Jun 21, 2020 at 10:01:03PM +0200, Daniel Vetter wrote:
> > On Sun, Jun 21, 2020 at 08:07:08PM +0200, Daniel Vetter wrote:
> > > On Sun, Jun 21, 2020 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
> > > >
> > > > On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > > > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > > > to use to keep the excessive caches in check). For mmu notifier
> > > > > recursions we do have lockdep annotations since 23b68395c7c7
> > > > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > > > >
> > > > > But these only fire if a path actually results in some pte
> > > > > invalidation - for most small allocations that's very rarely the case.
> > > > > The other trouble is that pte invalidation can happen any time when
> > > > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > > > recursion.
> > > > >
> > > > > I was pondering whether we should just do the general annotation, but
> > > > > there's always the risk for false positives. Plus I'm assuming that
> > > > > the core fs and io code is a lot better reviewed and tested than
> > > > > random mmu notifier code in drivers. Hence why I decide to only
> > > > > annotate for that specific case.
> > > > >
> > > > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > > > still need to explicit pull in the mmu notifier map - there's a lot
> > > > > more places that do pte invalidation than just direct reclaim, these
> > > > > two contexts arent the same.
> > > > >
> > > > > Note that the mmu notifiers needing their own independent lockdep map
> > > > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > > > fs_reclaim_release - it would nest with the acquistion in the pte
> > > > > invalidation code, causing a lockdep splat. And we can't remove the
> > > > > annotations from pte invalidation and all the other places since
> > > > > they're called from many other places than page reclaim. Hence we can
> > > > > only do the equivalent of might_lock, but on the raw lockdep map.
> > > > >
> > > > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > > > strictly more powerful.
> > > > >
> > > > > v2: Review from Thomas Hellstrom:
> > > > > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > > > > but it didn't blow up because I inverted it immediately
> > > > > - fix compiling for !CONFIG_MMU_NOTIFIER
> > > > >
> > > > > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > > > Cc: linux-mm@kvack.org
> > > > > Cc: linux-rdma@vger.kernel.org
> > > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > > Cc: Christian König <christian.koenig@amd.com>
> > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > >
> > > > Replying the right patch here...
> > > >
> > > > Reverting this commit [1] fixed the lockdep warning below while applying
> > > > some memory pressure.
> > > >
> > > > [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
> > >
> > > Hm, then I'm confused because
> > > - there's not mmut notifier lockdep map in the splat at a..
> > > - the patch is supposed to not change anything for fs_reclaim (but the
> > > interim version got that wrong)
> > > - looking at the paths it's kmalloc vs kswapd, both places I totally
> > > expect fs_reflaim to be used.
> > >
> > > But you're claiming reverting this prevents the lockdep splat. If
> > > that's right, then my reasoning above is broken somewhere. Someone
> > > less blind than me having an idea?
> > >
> > > Aside this is the first email I've typed, until I realized the first
> > > report was against the broken patch and that looked like a much more
> > > reasonable explanation (but didn't quite match up with the code
> > > paths).
> >
> > Below diff should undo the functional change in my patch. Can you pls test
> > whether the lockdep splat is really gone with that? Might need a lot of
> > testing and memory pressure to be sure, since all these reclaim paths
> > aren't very deterministic.
>
> No, this patch does not help but reverting the whole patch still fixed
> the splat.
OK, I tested this. I can't use your script to reproduce it because:
- I don't have a setup with xfs, and the splat points at an issue in xfs
- reproducing lockdep splats in shrinker callbacks is always a bit tricky

So instead I wrote a quick test to validate whether the fs_reclaim
annotations still work correctly, and nothing has changed:
+ printk("GFP_NOFS block\n");
+ fs_reclaim_acquire(GFP_NOFS);
+ printk("allocate atomic\n");
+ kfree(kmalloc(16, GFP_ATOMIC));
+ printk("allocate noio\n");
+ kfree(kmalloc(16, GFP_NOIO));
The two kmalloc calls below are wrong, but the current annotations only
track __GFP_FS, not __GFP_IO or the other levels, so there are no lockdep
splats here.
+ printk("allocate nofs\n");
+ kfree(kmalloc(16, GFP_NOFS));
+ printk("allocate kernel\n");
+ kfree(kmalloc(16, GFP_KERNEL));
+ fs_reclaim_release(GFP_NOFS);
+
+
+ printk("GFP_KERNEL block\n");
+ fs_reclaim_acquire(GFP_KERNEL);
+ printk("allocate atomic\n");
+ kfree(kmalloc(16, GFP_ATOMIC));
+ printk("allocate noio\n");
+ kfree(kmalloc(16, GFP_NOIO));
+ printk("allocate nofs\n");
+ kfree(kmalloc(16, GFP_NOFS));
The next allocation is buggy and should splat. It does, both with my
patch applied and with it reverted.
+ printk("allocate kernel\n");
+ kfree(kmalloc(16, GFP_KERNEL));
+ fs_reclaim_release(GFP_KERNEL);
I also looked at the paths in your xfs lockdep splat: this is simply
GFP_KERNEL vs. a shrinker reclaim in kswapd.
Summary: Everything is working as expected, there's no change in the
lockdep annotations.
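For reference, the expectations in the test above can be captured in a
small userspace model. This is only a sketch: the SK_* flag values and the
sk_* helpers are made-up stand-ins, not the kernel's definitions, and the
single counter merely imitates lockdep's recursion check on the fs_reclaim
map. It also ignores PF_MEMALLOC and any per-task allocation context.

```c
#include <stdbool.h>

/* Stand-in flag bits (hypothetical values, not those of the kernel). */
#define SK_GFP_IO             0x1u
#define SK_GFP_FS             0x2u
#define SK_GFP_DIRECT_RECLAIM 0x4u

/* Simplified versions of the usual GFP levels. */
#define SK_GFP_ATOMIC 0u
#define SK_GFP_NOIO   (SK_GFP_DIRECT_RECLAIM)
#define SK_GFP_NOFS   (SK_GFP_DIRECT_RECLAIM | SK_GFP_IO)
#define SK_GFP_KERNEL (SK_GFP_DIRECT_RECLAIM | SK_GFP_IO | SK_GFP_FS)

/* Models how often the fs_reclaim lockdep map is currently held. */
static int sk_fs_reclaim_held;

/* Only allocations that may sleep in reclaim and allow FS are tracked. */
static bool sk_tracks_fs_reclaim(unsigned int gfp)
{
	return (gfp & SK_GFP_DIRECT_RECLAIM) && (gfp & SK_GFP_FS);
}

static void sk_fs_reclaim_acquire(unsigned int gfp)
{
	if (sk_tracks_fs_reclaim(gfp))
		sk_fs_reclaim_held++;
}

static void sk_fs_reclaim_release(unsigned int gfp)
{
	if (sk_tracks_fs_reclaim(gfp))
		sk_fs_reclaim_held--;
}

/*
 * An allocation annotates fs_reclaim itself, so it would be reported
 * as recursion when the map is already held by the caller.
 */
static bool sk_alloc_would_splat(unsigned int gfp)
{
	return sk_tracks_fs_reclaim(gfp) && sk_fs_reclaim_held > 0;
}
```

In this model nothing splats inside a GFP_NOFS block, because the block
never takes the map, and only the GFP_KERNEL allocation splats inside a
GFP_KERNEL block.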
I really think the problem is that either your testcase doesn't hit
the issue reliably enough, or that you're not actually testing the
same kernels and there are some other changes (xfs most likely, but
really it could be anywhere) which are causing this regression. After
this test I'm rather convinced that it's not my stuff.
Thanks, Daniel
>
> > -Daniel
> >
> > ---
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index d807587c9ae6..27ea763c6155 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4191,11 +4191,6 @@ void fs_reclaim_acquire(gfp_t gfp_mask)
> > if (gfp_mask & __GFP_FS)
> > __fs_reclaim_acquire();
> >
> > -#ifdef CONFIG_MMU_NOTIFIER
> > - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > -#endif
> > -
> > }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-23 22:13 ` Daniel Vetter
@ 2020-06-23 22:29 ` Qian Cai
0 siblings, 0 replies; 106+ messages in thread
From: Qian Cai @ 2020-06-23 22:29 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
DRI Development, Daniel Vetter, Andrew Morton
> On Jun 23, 2020, at 6:13 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>
> Ok I tested this. I can't use your script to repro because
> - I don't have a setup with xfs, and the splat points at an issue in xfs
> - reproducing lockdep splats in shrinker callbacks is always a bit tricky
What xfs setup are you talking about? This is a simple xfs rootfs, and then you trigger swapping. Nothing tricky here, as it hits on multiple machines within a few seconds on linux-next.
> Summary: Everything is working as expected, there's no change in the
> lockdep annotations.
> I really think the problem is that either your testcase doesn't hit
> the issue reliably enough, or that you're not actually testing the
> same kernels and there's some other changes (xfs most likely, but
> really it could be anywhere) which is causing this regression. I'm
> rather convinced now after this test that it's not my stuff.
Well, the memory-pressure workloads have been running for years on daily linux-next builds and never hit this once. Also, reverting ONLY your patch on top of linux-next stops the splat, so there is no question of not testing the same kernel.
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 17:42 ` Qian Cai
2020-06-21 18:07 ` Daniel Vetter
@ 2020-06-23 22:31 ` Dave Chinner
2020-06-23 22:36 ` Daniel Vetter
1 sibling, 1 reply; 106+ messages in thread
From: Dave Chinner @ 2020-06-23 22:31 UTC (permalink / raw)
To: Qian Cai
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Christian König, linux-xfs, linux-mm,
Jason Gunthorpe, DRI Development, Daniel Vetter, Andrew Morton
On Sun, Jun 21, 2020 at 01:42:05PM -0400, Qian Cai wrote:
> On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > fs_reclaim_acquire/release nicely catch recursion issues when
> > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > to use to keep the excessive caches in check). For mmu notifier
> > recursions we do have lockdep annotations since 23b68395c7c7
> > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> >
> > But these only fire if a path actually results in some pte
> > invalidation - for most small allocations that's very rarely the case.
> > The other trouble is that pte invalidation can happen any time when
> > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > recursion.
> >
> > I was pondering whether we should just do the general annotation, but
> > there's always the risk for false positives. Plus I'm assuming that
> > the core fs and io code is a lot better reviewed and tested than
> > random mmu notifier code in drivers. Hence why I decide to only
> > annotate for that specific case.
> >
> > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > still need to explicit pull in the mmu notifier map - there's a lot
> > more places that do pte invalidation than just direct reclaim, these
> > two contexts arent the same.
> >
> > Note that the mmu notifiers needing their own independent lockdep map
> > is also the reason we can't hold them from fs_reclaim_acquire to
> > fs_reclaim_release - it would nest with the acquistion in the pte
> > invalidation code, causing a lockdep splat. And we can't remove the
> > annotations from pte invalidation and all the other places since
> > they're called from many other places than page reclaim. Hence we can
> > only do the equivalent of might_lock, but on the raw lockdep map.
> >
> > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > strictly more powerful.
> >
> > v2: Review from Thomas Hellstrom:
> > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > but it didn't blow up because I inverted it immediately
> > - fix compiling for !CONFIG_MMU_NOTIFIER
> >
> > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>
> Replying the right patch here...
>
> Reverting this commit [1] fixed the lockdep warning below while applying
> some memory pressure.
>
> [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
>
> [ 190.455003][ T369] WARNING: possible circular locking dependency detected
> [ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
> [ 190.512363][ T369] ------------------------------------------------------
> [ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
> [ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
> spin_lock at include/linux/spinlock.h:353
> (inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
> (inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
> (inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
> [ 190.614359][ T369]
> [ 190.614359][ T369] but task is already holding lock:
> [ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> __fs_reclaim_acquire at mm/page_alloc.c:4200
> [ 190.687845][ T369]
> [ 190.687845][ T369] which lock already depends on the new lock.
> [ 190.687845][ T369]
> [ 190.734890][ T369]
> [ 190.734890][ T369] the existing dependency chain (in reverse order) is:
> [ 190.775991][ T369]
> [ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
> [ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
> slab_pre_alloc_hook at mm/slab.h:507
> [ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
> [ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
> kmem_zone_alloc at fs/xfs/kmem.c:129
> [ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
> xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
> [ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
> xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
> [ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
OK, this patch has royally screwed something up if this path thinks
it can enter memory reclaim. This path is inside a transaction, so
it is running under PF_MEMALLOC_NOFS context, so should *never*
enter memory reclaim.
I'd suggest that whatever mods were made to fs_reclaim_acquire by
this patch broke its basic functionality....
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 13cc653122b7..7536faaaa0fd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -57,6 +57,7 @@
> > #include <trace/events/oom.h>
> > #include <linux/prefetch.h>
> > #include <linux/mm_inline.h>
> > +#include <linux/mmu_notifier.h>
> > #include <linux/migrate.h>
> > #include <linux/hugetlb.h>
> > #include <linux/sched/rt.h>
> > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > static struct lockdep_map __fs_reclaim_map =
> > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> >
> > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > +static bool __need_reclaim(gfp_t gfp_mask)
> > {
> > gfp_mask = current_gfp_context(gfp_mask);
This applies the per-task memory allocation context flags to the
mask that is checked here.
> > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > if (current->flags & PF_MEMALLOC)
> > return false;
> >
> > - /* We're only interested __GFP_FS allocations for now */
> > - if (!(gfp_mask & __GFP_FS))
> > - return false;
> > -
> > if (gfp_mask & __GFP_NOLOCKDEP)
> > return false;
> >
> > @@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
> >
> > void fs_reclaim_acquire(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_acquire();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (gfp_mask & __GFP_FS)
> > + __fs_reclaim_acquire();
.... and they have not been applied in this path. There's your
breakage.
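A minimal userspace sketch of that mismatch, under stated assumptions:
the SK_* flag values, sk_task_nofs and the sk_* helpers are hypothetical
stand-ins for __GFP_FS, PF_MEMALLOC_NOFS and current_gfp_context(), and
the "fixed" variant is just one possible way to apply the task context
before the __GFP_FS test, not necessarily what a respin would do.

```c
#include <stdbool.h>

/* Stand-in flag bits (hypothetical values, not those of the kernel). */
#define SK_GFP_FS             0x1u
#define SK_GFP_DIRECT_RECLAIM 0x2u
#define SK_GFP_KERNEL         (SK_GFP_DIRECT_RECLAIM | SK_GFP_FS)

/* Models PF_MEMALLOC_NOFS being set on the current task. */
static bool sk_task_nofs;

/* Models current_gfp_context(): a NOFS scope strips the FS bit. */
static unsigned int sk_current_gfp_context(unsigned int gfp)
{
	return sk_task_nofs ? (gfp & ~SK_GFP_FS) : gfp;
}

/* The context is applied to a local copy only, as in the quoted hunk. */
static bool sk_need_reclaim(unsigned int gfp)
{
	gfp = sk_current_gfp_context(gfp);
	return (gfp & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* Shape of the patched code: the FS test sees the caller's raw mask. */
static bool sk_acquires_fs_reclaim_buggy(unsigned int gfp)
{
	return sk_need_reclaim(gfp) && (gfp & SK_GFP_FS);
}

/* One possible fix: test the task-adjusted mask instead. */
static bool sk_acquires_fs_reclaim_fixed(unsigned int gfp)
{
	if (!sk_need_reclaim(gfp))
		return false;
	gfp = sk_current_gfp_context(gfp);
	return (gfp & SK_GFP_FS) != 0;
}
```

With sk_task_nofs set, a GFP_KERNEL-style mask still takes fs_reclaim in
the buggy variant but not in the fixed one, which matches the splat
coming from an allocation inside a transaction.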
For future reference, please post anything that changes NOFS
allocation contexts or behaviours to linux-fsdevel, as filesystem
developers need to know about proposed changes to infrastructure
that is critical to the correct functioning of filesystems...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-23 22:31 ` Dave Chinner
@ 2020-06-23 22:36 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-23 22:36 UTC (permalink / raw)
To: Dave Chinner
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Christian König, linux-xfs, Linux MM, Jason Gunthorpe,
Qian Cai, DRI Development, Daniel Vetter, Andrew Morton
On Wed, Jun 24, 2020 at 12:31 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sun, Jun 21, 2020 at 01:42:05PM -0400, Qian Cai wrote:
> > On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote:
> > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > to use to keep the excessive caches in check). For mmu notifier
> > > recursions we do have lockdep annotations since 23b68395c7c7
> > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > >
> > > But these only fire if a path actually results in some pte
> > > invalidation - for most small allocations that's very rarely the case.
> > > The other trouble is that pte invalidation can happen any time when
> > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > recursion.
> > >
> > > I was pondering whether we should just do the general annotation, but
> > > there's always the risk for false positives. Plus I'm assuming that
> > > the core fs and io code is a lot better reviewed and tested than
> > > random mmu notifier code in drivers. Hence why I decide to only
> > > annotate for that specific case.
> > >
> > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > still need to explicit pull in the mmu notifier map - there's a lot
> > > more places that do pte invalidation than just direct reclaim, these
> > > two contexts arent the same.
> > >
> > > Note that the mmu notifiers needing their own independent lockdep map
> > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > fs_reclaim_release - it would nest with the acquistion in the pte
> > > invalidation code, causing a lockdep splat. And we can't remove the
> > > annotations from pte invalidation and all the other places since
> > > they're called from many other places than page reclaim. Hence we can
> > > only do the equivalent of might_lock, but on the raw lockdep map.
> > >
> > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > strictly more powerful.
> > >
> > > v2: Review from Thomas Hellstrom:
> > > - unbotch the fs_reclaim context check, I accidentally inverted it,
> > > but it didn't blow up because I inverted it immediately
> > > - fix compiling for !CONFIG_MMU_NOTIFIER
> > >
> > > Cc: Thomas Hellström (Intel) <thomas_os@shipmail.org>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > Cc: linux-mm@kvack.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Cc: Christian König <christian.koenig@amd.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >
> > Replying the right patch here...
> >
> > Reverting this commit [1] fixed the lockdep warning below while applying
> > some memory pressure.
> >
> > [1] linux-next cbf7c9d86d75 ("mm: track mmu notifiers in fs_reclaim_acquire/release")
> >
> > [ 190.455003][ T369] WARNING: possible circular locking dependency detected
> > [ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
> > [ 190.512363][ T369] ------------------------------------------------------
> > [ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
> > [ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
> > spin_lock at include/linux/spinlock.h:353
> > (inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
> > (inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
> > (inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
> > [ 190.614359][ T369]
> > [ 190.614359][ T369] but task is already holding lock:
> > [ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> > __fs_reclaim_acquire at mm/page_alloc.c:4200
> > [ 190.687845][ T369]
> > [ 190.687845][ T369] which lock already depends on the new lock.
> > [ 190.687845][ T369]
> > [ 190.734890][ T369]
> > [ 190.734890][ T369] the existing dependency chain (in reverse order) is:
> > [ 190.775991][ T369]
> > [ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
> > [ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
> > [ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
> > slab_pre_alloc_hook at mm/slab.h:507
> > [ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
> > [ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
> > kmem_zone_alloc at fs/xfs/kmem.c:129
> > [ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
> > xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
> > [ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
> > xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
> > [ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
>
> OK, this patch has royally screwed something up if this path thinks
> it can enter memory reclaim. This path is inside a transaction, so
> it is running under PF_MEMALLOC_NOFS context, so should *never*
> enter memory reclaim.
>
> I'd suggest that whatever mods were made to fs_reclaim_acquire by
> this patch broke it's basic functionality....
>
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 13cc653122b7..7536faaaa0fd 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -57,6 +57,7 @@
> > > #include <trace/events/oom.h>
> > > #include <linux/prefetch.h>
> > > #include <linux/mm_inline.h>
> > > +#include <linux/mmu_notifier.h>
> > > #include <linux/migrate.h>
> > > #include <linux/hugetlb.h>
> > > #include <linux/sched/rt.h>
> > > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > > static struct lockdep_map __fs_reclaim_map =
> > > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> > >
> > > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > > +static bool __need_reclaim(gfp_t gfp_mask)
> > > {
> > > gfp_mask = current_gfp_context(gfp_mask);
>
> This is applies the per-task memory allocation context flags to the
> mask that is checked here.
>
> > > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > > if (current->flags & PF_MEMALLOC)
> > > return false;
> > >
> > > - /* We're only interested __GFP_FS allocations for now */
> > > - if (!(gfp_mask & __GFP_FS))
> > > - return false;
> > > -
> > > if (gfp_mask & __GFP_NOLOCKDEP)
> > > return false;
> > >
> > > @@ -4158,15 +4155,25 @@ void __fs_reclaim_release(void)
> > >
> > > void fs_reclaim_acquire(gfp_t gfp_mask)
> > > {
> > > - if (__need_fs_reclaim(gfp_mask))
> > > - __fs_reclaim_acquire();
> > > + if (__need_reclaim(gfp_mask)) {
> > > + if (gfp_mask & __GFP_FS)
> > > + __fs_reclaim_acquire();
>
> .... and they have not been applied in this path. There's your
> breakage.
>
> For future reference, please post anything that changes NOFS
> allocation contexts or behaviours to linux-fsdevel, as filesystem
> developers need to know about proposed changes to infrastructure
> that is critical to the correct functioning of filesystems...
Uh crap I totally missed that. Apologies for wasting everyone's time here.
Andrew, please drop this for now, I'll respin this thing.
-Daniel
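For readers following along, the breakage Dave describes can be reproduced in a small userspace model. This is only a sketch: the NOFS_* flag values, nofs_task_flags and every nofs_* helper are made-up stand-ins, not the kernel's definitions. The "raw" variant tests the FS bit on the mask the caller passed in, as the posted hunk does; the "ctx" variant is one possible repair that tests it after the per-task context has been applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; the values are illustrative, not the kernel's. */
typedef unsigned int nofs_gfp_t;
#define NOFS_GFP_DIRECT_RECLAIM 0x1u
#define NOFS_GFP_FS             0x2u
#define NOFS_GFP_KERNEL         (NOFS_GFP_DIRECT_RECLAIM | NOFS_GFP_FS)
#define NOFS_PF_MEMALLOC_NOFS   0x1u

/* Stands in for current->flags. */
static unsigned int nofs_task_flags;

/* Models current_gfp_context(): a NOFS task context masks out the FS bit. */
static nofs_gfp_t nofs_gfp_context(nofs_gfp_t mask)
{
	if (nofs_task_flags & NOFS_PF_MEMALLOC_NOFS)
		mask &= ~NOFS_GFP_FS;
	return mask;
}

/* Shape of the posted hunk: FS bit tested on the caller's raw mask. */
static bool nofs_annotates_raw(nofs_gfp_t mask)
{
	if (!(nofs_gfp_context(mask) & NOFS_GFP_DIRECT_RECLAIM))
		return false;
	return (mask & NOFS_GFP_FS) != 0;
}

/* One possible repair: test the FS bit after applying the task context. */
static bool nofs_annotates_ctx(nofs_gfp_t mask)
{
	mask = nofs_gfp_context(mask);
	if (!(mask & NOFS_GFP_DIRECT_RECLAIM))
		return false;
	return (mask & NOFS_GFP_FS) != 0;
}

/* Walks through the transaction case from the report. */
static void nofs_demo(void)
{
	nofs_task_flags = NOFS_PF_MEMALLOC_NOFS;      /* inside a transaction */
	assert(nofs_annotates_raw(NOFS_GFP_KERNEL));  /* false positive */
	assert(!nofs_annotates_ctx(NOFS_GFP_KERNEL)); /* NOFS respected */

	nofs_task_flags = 0;                          /* ordinary context */
	assert(nofs_annotates_raw(NOFS_GFP_KERNEL));
	assert(nofs_annotates_ctx(NOFS_GFP_KERNEL));
}
```

Calling nofs_demo() from a main() exercises both cases. In the model, only the raw variant still claims the fs_reclaim annotation while the task is in NOFS context, which is the situation the xfs transaction path hit.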
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-04 8:12 ` [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release Daniel Vetter
2020-06-10 12:01 ` Thomas Hellström (Intel)
2020-06-10 19:41 ` [Intel-gfx] [PATCH] " Daniel Vetter
@ 2020-06-21 17:00 ` Qian Cai
2020-06-21 17:28 ` Daniel Vetter
2 siblings, 1 reply; 106+ messages in thread
From: Qian Cai @ 2020-06-21 17:00 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx, linux-xfs,
linux-mm, Jason Gunthorpe, DRI Development, Daniel Vetter,
Andrew Morton, Christian König
On Thu, Jun 04, 2020 at 10:12:07AM +0200, Daniel Vetter wrote:
> fs_reclaim_acquire/release nicely catch recursion issues when
> allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> to use to keep the excessive caches in check). For mmu notifier
> recursions we do have lockdep annotations since 23b68395c7c7
> ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
>
> But these only fire if a path actually results in some pte
> invalidation - for most small allocations that's very rarely the case.
> The other trouble is that pte invalidation can happen any time when
> __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> recursion.
>
> I was pondering whether we should just do the general annotation, but
> there's always the risk for false positives. Plus I'm assuming that
> the core fs and io code is a lot better reviewed and tested than
> random mmu notifier code in drivers. Hence why I decided to only
> annotate for that specific case.
>
> Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> still need to explicitly pull in the mmu notifier map - there's a lot
> more places that do pte invalidation than just direct reclaim, these
> two contexts aren't the same.
>
> Note that the mmu notifiers needing their own independent lockdep map
> is also the reason we can't hold them from fs_reclaim_acquire to
> fs_reclaim_release - it would nest with the acquisition in the pte
> invalidation code, causing a lockdep splat. And we can't remove the
> annotations from pte invalidation and all the other places since
> they're called from many other places than page reclaim. Hence we can
> only do the equivalent of might_lock, but on the raw lockdep map.
>
> With this we can also remove the lockdep priming added in 66204f1d2d1b
> ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> strictly more powerful.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: linux-mm@kvack.org
> Cc: linux-rdma@vger.kernel.org
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
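The might_lock-style annotation described in the commit message can be illustrated with a toy lock-order tracker. This is a userspace sketch, not lockdep: the ml_* names, the three lock classes and the fixed-size held stack are assumptions made purely for illustration. It shows the two properties the message relies on: acquiring and immediately releasing a class records the ordering against everything currently held, and holding the class across a region would trip over the real acquisition of the same class inside that region.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum ml_class { ML_FS_RECLAIM, ML_MMU_NOTIFIER, ML_DRIVER_LOCK, ML_NR_CLASSES };

#define ML_MAX_HELD 8

/* ml_edge[a][b] is true once b has been acquired while a was held. */
static bool ml_edge[ML_NR_CLASSES][ML_NR_CLASSES];
static enum ml_class ml_held[ML_MAX_HELD];
static int ml_depth;

static void ml_reset(void)
{
	memset(ml_edge, 0, sizeof(ml_edge));
	ml_depth = 0;
}

/* Returns false if cls is already held (a recursive acquisition). */
static bool ml_acquire(enum ml_class cls)
{
	for (int i = 0; i < ml_depth; i++)
		if (ml_held[i] == cls)
			return false;

	/* Record "held -> cls" for every class currently held. */
	for (int i = 0; i < ml_depth; i++)
		ml_edge[ml_held[i]][cls] = true;

	assert(ml_depth < ML_MAX_HELD);
	ml_held[ml_depth++] = cls;
	return true;
}

/* Releases the most recently acquired class, which must be cls. */
static void ml_release(enum ml_class cls)
{
	assert(ml_depth > 0 && ml_held[ml_depth - 1] == cls);
	ml_depth--;
}

/* might_lock equivalent: record the ordering, but leave nothing held. */
static bool ml_might_lock(enum ml_class cls)
{
	if (!ml_acquire(cls))
		return false;
	ml_release(cls);
	return true;
}
```

In this model, calling ml_might_lock(ML_MMU_NOTIFIER) while ML_DRIVER_LOCK is held records the driver-lock-to-notifier ordering even though no invalidation ran, and a later ml_acquire(ML_MMU_NOTIFIER) from the invalidation path still succeeds because nothing was left held.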
Reverting this commit fixed the lockdep splat below while applying some
memory pressure,
[ 190.455003][ T369] WARNING: possible circular locking dependency detected
[ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
[ 190.512363][ T369] ------------------------------------------------------
[ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
[ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
spin_lock at include/linux/spinlock.h:353
(inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
(inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
(inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
[ 190.614359][ T369]
[ 190.614359][ T369] but task is already holding lock:
[ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
__fs_reclaim_acquire at mm/page_alloc.c:4200
[ 190.687845][ T369]
[ 190.687845][ T369] which lock already depends on the new lock.
[ 190.687845][ T369]
[ 190.734890][ T369]
[ 190.734890][ T369] the existing dependency chain (in reverse order) is:
[ 190.775991][ T369]
[ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
[ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
[ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
slab_pre_alloc_hook at mm/slab.h:507
[ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
[ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
kmem_zone_alloc at fs/xfs/kmem.c:129
[ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
[ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
[ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
xfs_setattr_nonsize at fs/xfs/xfs_iops.c:716
[ 190.987331][ T369] xfs_vn_setattr+0x133/0x160
xfs_vn_setattr at fs/xfs/xfs_iops.c:1081
[ 191.010476][ T369] notify_change+0x6c5/0xba1
notify_change at fs/attr.c:336
[ 191.033317][ T369] chmod_common+0x19b/0x390
[ 191.055770][ T369] ksys_fchmod+0x28/0x60
[ 191.077957][ T369] __x64_sys_fchmod+0x4e/0x70
[ 191.102767][ T369] do_syscall_64+0x5f/0x310
[ 191.125090][ T369] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 191.153749][ T369]
[ 191.153749][ T369] -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
[ 191.191267][ T369] __lock_acquire+0x2efc/0x4da0
[ 191.215974][ T369] lock_acquire+0x1ac/0xaf0
[ 191.238953][ T369] down_write_nested+0x92/0x150
[ 191.262955][ T369] xfs_reclaim_inode+0xdf/0x860
[ 191.287149][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
[ 191.313291][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
[ 191.338357][ T369] super_cache_scan+0x2fd/0x430
[ 191.362354][ T369] do_shrink_slab+0x317/0x990
[ 191.385341][ T369] shrink_slab+0x3a8/0x4b0
[ 191.407214][ T369] shrink_node+0x49c/0x17b0
[ 191.429841][ T369] balance_pgdat+0x59c/0xed0
[ 191.455041][ T369] kswapd+0x5a4/0xc40
[ 191.477524][ T369] kthread+0x358/0x420
[ 191.499285][ T369] ret_from_fork+0x22/0x30
[ 191.521107][ T369]
[ 191.521107][ T369] other info that might help us debug this:
[ 191.521107][ T369]
[ 191.567490][ T369] Possible unsafe locking scenario:
[ 191.567490][ T369]
[ 191.600947][ T369] CPU0 CPU1
[ 191.624808][ T369] ---- ----
[ 191.649236][ T369] lock(fs_reclaim);
[ 191.667607][ T369] lock(&xfs_nondir_ilock_class);
[ 191.702096][ T369] lock(fs_reclaim);
[ 191.731243][ T369] lock(&xfs_nondir_ilock_class);
[ 191.754025][ T369]
[ 191.754025][ T369] *** DEADLOCK ***
[ 191.754025][ T369]
[ 191.791126][ T369] 4 locks held by kswapd3/369:
[ 191.812198][ T369] #0: ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
[ 191.854319][ T369] #1: ffffffffb5074c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
[ 191.896043][ T369] #2: ffff8890279b40e0 (&type->s_umount_key#27){++++}-{3:3}, at: trylock_super+0x11/0xb0
[ 191.940538][ T369] #3: ffff889027a73a28 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
[ 191.995314][ T369]
[ 191.995314][ T369] stack backtrace:
[ 192.022934][ T369] CPU: 42 PID: 369 Comm: kswapd3 Not tainted 5.8.0-rc1-next-20200621 #1
[ 192.060546][ T369] Hardware name: HP ProLiant BL660c Gen9, BIOS I38 10/17/2018
[ 192.094518][ T369] Call Trace:
[ 192.109005][ T369] dump_stack+0x9d/0xe0
[ 192.127468][ T369] check_noncircular+0x347/0x400
[ 192.149526][ T369] ? print_circular_bug+0x360/0x360
[ 192.172584][ T369] ? freezing_slow_path.cold.2+0x2a/0x2a
[ 192.197251][ T369] __lock_acquire+0x2efc/0x4da0
[ 192.218737][ T369] ? lockdep_hardirqs_on_prepare+0x550/0x550
[ 192.246736][ T369] ? __lock_acquire+0x3541/0x4da0
[ 192.269673][ T369] lock_acquire+0x1ac/0xaf0
[ 192.290192][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.313158][ T369] ? rcu_read_unlock+0x50/0x50
[ 192.335057][ T369] down_write_nested+0x92/0x150
[ 192.358409][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.380890][ T369] ? rwsem_down_write_slowpath+0xf50/0xf50
[ 192.406891][ T369] ? find_held_lock+0x33/0x1c0
[ 192.427925][ T369] ? xfs_ilock+0x2ef/0x370
[ 192.447496][ T369] ? xfs_reclaim_inode+0xdf/0x860
[ 192.472315][ T369] xfs_reclaim_inode+0xdf/0x860
[ 192.496649][ T369] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
[ 192.524188][ T369] ? do_raw_spin_unlock+0x4f/0x250
[ 192.546852][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
[ 192.570473][ T369] ? xfs_reclaim_inode+0x860/0x860
[ 192.592692][ T369] ? mark_held_locks+0xb0/0x110
[ 192.614287][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 192.640800][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 192.666695][ T369] ? try_to_wake_up+0xcf/0xf40
[ 192.688265][ T369] ? migrate_swap_stop+0xc10/0xc10
[ 192.711966][ T369] ? do_raw_spin_unlock+0x4f/0x250
[ 192.735032][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
xfs_reclaim_inodes_nr at fs/xfs/xfs_icache.c:1399
[ 192.757674][ T369] ? xfs_reclaim_inodes+0x90/0x90
[ 192.780028][ T369] ? list_lru_count_one+0x177/0x300
[ 192.803010][ T369] super_cache_scan+0x2fd/0x430
super_cache_scan at fs/super.c:115
[ 192.824491][ T369] do_shrink_slab+0x317/0x990
do_shrink_slab at mm/vmscan.c:514
[ 192.845160][ T369] shrink_slab+0x3a8/0x4b0
shrink_slab_memcg at mm/vmscan.c:584
(inlined by) shrink_slab at mm/vmscan.c:662
[ 192.864722][ T369] ? do_shrink_slab+0x990/0x990
[ 192.886137][ T369] ? rcu_is_watching+0x2c/0x80
[ 192.907289][ T369] ? mem_cgroup_protected+0x228/0x470
[ 192.931166][ T369] ? vmpressure+0x25/0x290
[ 192.950595][ T369] shrink_node+0x49c/0x17b0
[ 192.972332][ T369] balance_pgdat+0x59c/0xed0
kswapd_shrink_node at mm/vmscan.c:3521
(inlined by) balance_pgdat at mm/vmscan.c:3670
[ 192.994918][ T369] ? __node_reclaim+0x950/0x950
[ 193.018625][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 193.046566][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
[ 193.070214][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
[ 193.093176][ T369] ? finish_task_switch+0x129/0x650
[ 193.116225][ T369] ? finish_task_switch+0xf2/0x650
[ 193.138809][ T369] ? rcu_read_lock_bh_held+0xc0/0xc0
[ 193.163323][ T369] kswapd+0x5a4/0xc40
[ 193.182690][ T369] ? __kthread_parkme+0x4d/0x1a0
[ 193.204660][ T369] ? balance_pgdat+0xed0/0xed0
[ 193.225776][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 193.252306][ T369] ? finish_wait+0x270/0x270
[ 193.272473][ T369] ? __kthread_parkme+0x4d/0x1a0
[ 193.294476][ T369] ? __kthread_parkme+0xcc/0x1a0
[ 193.316704][ T369] ? balance_pgdat+0xed0/0xed0
[ 193.337808][ T369] kthread+0x358/0x420
[ 193.355666][ T369] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 193.381884][ T369] ret_from_fork+0x22/0x30
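The report above boils down to two dependency edges that point in opposite directions: chain #1 takes fs_reclaim while the xfs ilock class is held, and chain #0 takes the xfs ilock class while fs_reclaim is held. A minimal sketch of that check, assuming a toy dependency matrix (the splat_* names are hypothetical, and lockdep's real implementation is considerably more involved):

```c
#include <assert.h>
#include <stdbool.h>

enum splat_class { SPLAT_FS_RECLAIM, SPLAT_XFS_ILOCK, SPLAT_NR_CLASSES };

/* splat_dep[a][b] is true once "b acquired while a held" has been seen. */
static bool splat_dep[SPLAT_NR_CLASSES][SPLAT_NR_CLASSES];

/* Depth-first search: can "to" be reached from "from" via recorded edges? */
static bool splat_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < SPLAT_NR_CLASSES; next++)
		if (splat_dep[from][next] && !seen[next] &&
		    splat_reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "held -> wanted". Returns false, without recording anything,
 * if a path from wanted back to held already exists, i.e. the new edge
 * would close a cycle.
 */
static bool splat_add_dependency(int held, int wanted)
{
	bool seen[SPLAT_NR_CLASSES] = { false };

	assert(held >= 0 && held < SPLAT_NR_CLASSES);
	assert(wanted >= 0 && wanted < SPLAT_NR_CLASSES);

	if (splat_reachable(wanted, held, seen))
		return false;
	splat_dep[held][wanted] = true;
	return true;
}
```

Feeding the model the setattr path first (ilock held, fs_reclaim wanted) and then the kswapd path (fs_reclaim held, ilock wanted) makes the second call return false, which corresponds to the "possible circular locking dependency" warning.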
> ---
> This is part of a gpu lockdep annotation series simply because it
> really helps to catch issues where gpu subsystem locks and primitives
> can deadlock with themselves through allocations and mmu notifiers.
> But aside from that motivation it should be completely free-standing,
> and can land through -mm/-rdma/-hmm or any other tree really whenever.
> -Daniel
> ---
> mm/mmu_notifier.c | 7 -------
> mm/page_alloc.c | 23 ++++++++++++++---------
> 2 files changed, 14 insertions(+), 16 deletions(-)
>
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 06852b896fa6..5d578b9122f8 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> lockdep_assert_held_write(&mm->mmap_sem);
> BUG_ON(atomic_read(&mm->mm_users) <= 0);
>
> - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> - fs_reclaim_acquire(GFP_KERNEL);
> - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> - fs_reclaim_release(GFP_KERNEL);
> - }
> -
> if (!mm->notifier_subscriptions) {
> /*
> * kmalloc cannot be called under mm_take_all_locks(), but we
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 13cc653122b7..f8a222db4a53 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -57,6 +57,7 @@
> #include <trace/events/oom.h>
> #include <linux/prefetch.h>
> #include <linux/mm_inline.h>
> +#include <linux/mmu_notifier.h>
> #include <linux/migrate.h>
> #include <linux/hugetlb.h>
> #include <linux/sched/rt.h>
> @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> static struct lockdep_map __fs_reclaim_map =
> STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
>
> -static bool __need_fs_reclaim(gfp_t gfp_mask)
> +static bool __need_reclaim(gfp_t gfp_mask)
> {
> gfp_mask = current_gfp_context(gfp_mask);
>
> @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> if (current->flags & PF_MEMALLOC)
> return false;
>
> - /* We're only interested __GFP_FS allocations for now */
> - if (!(gfp_mask & __GFP_FS))
> - return false;
> -
> if (gfp_mask & __GFP_NOLOCKDEP)
> return false;
>
> @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
>
> void fs_reclaim_acquire(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_acquire();
> + if (__need_reclaim(gfp_mask)) {
> + if (!(gfp_mask & __GFP_FS))
> + __fs_reclaim_acquire();
> +
> + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> +
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
>
> void fs_reclaim_release(gfp_t gfp_mask)
> {
> - if (__need_fs_reclaim(gfp_mask))
> - __fs_reclaim_release();
> + if (__need_reclaim(gfp_mask)) {
> + if (!(gfp_mask & __GFP_FS))
> + __fs_reclaim_release();
> + }
> }
> EXPORT_SYMBOL_GPL(fs_reclaim_release);
> #endif
> --
> 2.26.2
>
>
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 17:00 ` [Intel-gfx] [PATCH 01/18] " Qian Cai
@ 2020-06-21 17:28 ` Daniel Vetter
2020-06-21 17:46 ` Qian Cai
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-21 17:28 UTC (permalink / raw)
To: Qian Cai
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
linux-xfs, Linux MM, Jason Gunthorpe, DRI Development,
Daniel Vetter, Andrew Morton, Christian König
On Sun, Jun 21, 2020 at 7:01 PM Qian Cai <cai@lca.pw> wrote:
>
> On Thu, Jun 04, 2020 at 10:12:07AM +0200, Daniel Vetter wrote:
> > fs_reclaim_acquire/release nicely catch recursion issues when
> > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > to use to keep the excessive caches in check). For mmu notifier
> > recursions we do have lockdep annotations since 23b68395c7c7
> > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> >
> > But these only fire if a path actually results in some pte
> > invalidation - for most small allocations that's very rarely the case.
> > The other trouble is that pte invalidation can happen any time when
> > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > recursion.
> >
> > I was pondering whether we should just do the general annotation, but
> > there's always the risk for false positives. Plus I'm assuming that
> > the core fs and io code is a lot better reviewed and tested than
> > random mmu notifier code in drivers. Hence why I decided to only
> > annotate for that specific case.
> >
> > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > still need to explicitly pull in the mmu notifier map - there's a lot
> > more places that do pte invalidation than just direct reclaim, these
> > two contexts aren't the same.
> >
> > Note that the mmu notifiers needing their own independent lockdep map
> > is also the reason we can't hold them from fs_reclaim_acquire to
> > fs_reclaim_release - it would nest with the acquisition in the pte
> > invalidation code, causing a lockdep splat. And we can't remove the
> > annotations from pte invalidation and all the other places since
> > they're called from many other places than page reclaim. Hence we can
> > only do the equivalent of might_lock, but on the raw lockdep map.
> >
> > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > strictly more powerful.
> >
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>
> Reverting this commit fixed the lockdep splat below while applying some
> memory pressure,
This is a broken version of the patch, please use the one Andrew
merged into -mm.
Thanks, Daniel
>
> [ 190.455003][ T369] WARNING: possible circular locking dependency detected
> [ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
> [ 190.512363][ T369] ------------------------------------------------------
> [ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
> [ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
> spin_lock at include/linux/spinlock.h:353
> (inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
> (inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
> (inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
> [ 190.614359][ T369]
> [ 190.614359][ T369] but task is already holding lock:
> [ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> __fs_reclaim_acquire at mm/page_alloc.c:4200
> [ 190.687845][ T369]
> [ 190.687845][ T369] which lock already depends on the new lock.
> [ 190.687845][ T369]
> [ 190.734890][ T369]
> [ 190.734890][ T369] the existing dependency chain (in reverse order) is:
> [ 190.775991][ T369]
> [ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
> [ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
> slab_pre_alloc_hook at mm/slab.h:507
> [ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
> [ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
> kmem_zone_alloc at fs/xfs/kmem.c:129
> [ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
> xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
> [ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
> xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
> [ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
> xfs_setattr_nonsize at fs/xfs/xfs_iops.c:716
> [ 190.987331][ T369] xfs_vn_setattr+0x133/0x160
> xfs_vn_setattr at fs/xfs/xfs_iops.c:1081
> [ 191.010476][ T369] notify_change+0x6c5/0xba1
> notify_change at fs/attr.c:336
> [ 191.033317][ T369] chmod_common+0x19b/0x390
> [ 191.055770][ T369] ksys_fchmod+0x28/0x60
> [ 191.077957][ T369] __x64_sys_fchmod+0x4e/0x70
> [ 191.102767][ T369] do_syscall_64+0x5f/0x310
> [ 191.125090][ T369] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 191.153749][ T369]
> [ 191.153749][ T369] -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
> [ 191.191267][ T369] __lock_acquire+0x2efc/0x4da0
> [ 191.215974][ T369] lock_acquire+0x1ac/0xaf0
> [ 191.238953][ T369] down_write_nested+0x92/0x150
> [ 191.262955][ T369] xfs_reclaim_inode+0xdf/0x860
> [ 191.287149][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> [ 191.313291][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> [ 191.338357][ T369] super_cache_scan+0x2fd/0x430
> [ 191.362354][ T369] do_shrink_slab+0x317/0x990
> [ 191.385341][ T369] shrink_slab+0x3a8/0x4b0
> [ 191.407214][ T369] shrink_node+0x49c/0x17b0
> [ 191.429841][ T369] balance_pgdat+0x59c/0xed0
> [ 191.455041][ T369] kswapd+0x5a4/0xc40
> [ 191.477524][ T369] kthread+0x358/0x420
> [ 191.499285][ T369] ret_from_fork+0x22/0x30
> [ 191.521107][ T369]
> [ 191.521107][ T369] other info that might help us debug this:
> [ 191.521107][ T369]
> [ 191.567490][ T369] Possible unsafe locking scenario:
> [ 191.567490][ T369]
> [ 191.600947][ T369] CPU0 CPU1
> [ 191.624808][ T369] ---- ----
> [ 191.649236][ T369] lock(fs_reclaim);
> [ 191.667607][ T369] lock(&xfs_nondir_ilock_class);
> [ 191.702096][ T369] lock(fs_reclaim);
> [ 191.731243][ T369] lock(&xfs_nondir_ilock_class);
> [ 191.754025][ T369]
> [ 191.754025][ T369] *** DEADLOCK ***
> [ 191.754025][ T369]
> [ 191.791126][ T369] 4 locks held by kswapd3/369:
> [ 191.812198][ T369] #0: ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> [ 191.854319][ T369] #1: ffffffffb5074c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
> [ 191.896043][ T369] #2: ffff8890279b40e0 (&type->s_umount_key#27){++++}-{3:3}, at: trylock_super+0x11/0xb0
> [ 191.940538][ T369] #3: ffff889027a73a28 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
> [ 191.995314][ T369]
> [ 191.995314][ T369] stack backtrace:
> [ 192.022934][ T369] CPU: 42 PID: 369 Comm: kswapd3 Not tainted 5.8.0-rc1-next-20200621 #1
> [ 192.060546][ T369] Hardware name: HP ProLiant BL660c Gen9, BIOS I38 10/17/2018
> [ 192.094518][ T369] Call Trace:
> [ 192.109005][ T369] dump_stack+0x9d/0xe0
> [ 192.127468][ T369] check_noncircular+0x347/0x400
> [ 192.149526][ T369] ? print_circular_bug+0x360/0x360
> [ 192.172584][ T369] ? freezing_slow_path.cold.2+0x2a/0x2a
> [ 192.197251][ T369] __lock_acquire+0x2efc/0x4da0
> [ 192.218737][ T369] ? lockdep_hardirqs_on_prepare+0x550/0x550
> [ 192.246736][ T369] ? __lock_acquire+0x3541/0x4da0
> [ 192.269673][ T369] lock_acquire+0x1ac/0xaf0
> [ 192.290192][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.313158][ T369] ? rcu_read_unlock+0x50/0x50
> [ 192.335057][ T369] down_write_nested+0x92/0x150
> [ 192.358409][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.380890][ T369] ? rwsem_down_write_slowpath+0xf50/0xf50
> [ 192.406891][ T369] ? find_held_lock+0x33/0x1c0
> [ 192.427925][ T369] ? xfs_ilock+0x2ef/0x370
> [ 192.447496][ T369] ? xfs_reclaim_inode+0xdf/0x860
> [ 192.472315][ T369] xfs_reclaim_inode+0xdf/0x860
> [ 192.496649][ T369] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
> [ 192.524188][ T369] ? do_raw_spin_unlock+0x4f/0x250
> [ 192.546852][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> [ 192.570473][ T369] ? xfs_reclaim_inode+0x860/0x860
> [ 192.592692][ T369] ? mark_held_locks+0xb0/0x110
> [ 192.614287][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> [ 192.640800][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> [ 192.666695][ T369] ? try_to_wake_up+0xcf/0xf40
> [ 192.688265][ T369] ? migrate_swap_stop+0xc10/0xc10
> [ 192.711966][ T369] ? do_raw_spin_unlock+0x4f/0x250
> [ 192.735032][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> xfs_reclaim_inodes_nr at fs/xfs/xfs_icache.c:1399
> [ 192.757674][ T369] ? xfs_reclaim_inodes+0x90/0x90
> [ 192.780028][ T369] ? list_lru_count_one+0x177/0x300
> [ 192.803010][ T369] super_cache_scan+0x2fd/0x430
> super_cache_scan at fs/super.c:115
> [ 192.824491][ T369] do_shrink_slab+0x317/0x990
> do_shrink_slab at mm/vmscan.c:514
> [ 192.845160][ T369] shrink_slab+0x3a8/0x4b0
> shrink_slab_memcg at mm/vmscan.c:584
> (inlined by) shrink_slab at mm/vmscan.c:662
> [ 192.864722][ T369] ? do_shrink_slab+0x990/0x990
> [ 192.886137][ T369] ? rcu_is_watching+0x2c/0x80
> [ 192.907289][ T369] ? mem_cgroup_protected+0x228/0x470
> [ 192.931166][ T369] ? vmpressure+0x25/0x290
> [ 192.950595][ T369] shrink_node+0x49c/0x17b0
> [ 192.972332][ T369] balance_pgdat+0x59c/0xed0
> kswapd_shrink_node at mm/vmscan.c:3521
> (inlined by) balance_pgdat at mm/vmscan.c:3670
> [ 192.994918][ T369] ? __node_reclaim+0x950/0x950
> [ 193.018625][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> [ 193.046566][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> [ 193.070214][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> [ 193.093176][ T369] ? finish_task_switch+0x129/0x650
> [ 193.116225][ T369] ? finish_task_switch+0xf2/0x650
> [ 193.138809][ T369] ? rcu_read_lock_bh_held+0xc0/0xc0
> [ 193.163323][ T369] kswapd+0x5a4/0xc40
> [ 193.182690][ T369] ? __kthread_parkme+0x4d/0x1a0
> [ 193.204660][ T369] ? balance_pgdat+0xed0/0xed0
> [ 193.225776][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> [ 193.252306][ T369] ? finish_wait+0x270/0x270
> [ 193.272473][ T369] ? __kthread_parkme+0x4d/0x1a0
> [ 193.294476][ T369] ? __kthread_parkme+0xcc/0x1a0
> [ 193.316704][ T369] ? balance_pgdat+0xed0/0xed0
> [ 193.337808][ T369] kthread+0x358/0x420
> [ 193.355666][ T369] ? kthread_create_worker_on_cpu+0xc0/0xc0
> [ 193.381884][ T369] ret_from_fork+0x22/0x30
>
> > ---
> > This is part of a gpu lockdep annotation series simply because it
> > really helps to catch issues where gpu subsystem locks and primitives
> > can deadlock with themselves through allocations and mmu notifiers.
> > But aside from that motivation it should be completely free-standing,
> > and can land through -mm/-rdma/-hmm or any other tree really whenever.
> > -Daniel
> > ---
> > mm/mmu_notifier.c | 7 -------
> > mm/page_alloc.c | 23 ++++++++++++++---------
> > 2 files changed, 14 insertions(+), 16 deletions(-)
> >
> > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> > index 06852b896fa6..5d578b9122f8 100644
> > --- a/mm/mmu_notifier.c
> > +++ b/mm/mmu_notifier.c
> > @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> > lockdep_assert_held_write(&mm->mmap_sem);
> > BUG_ON(atomic_read(&mm->mm_users) <= 0);
> >
> > - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> > - fs_reclaim_acquire(GFP_KERNEL);
> > - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > - fs_reclaim_release(GFP_KERNEL);
> > - }
> > -
> > if (!mm->notifier_subscriptions) {
> > /*
> > * kmalloc cannot be called under mm_take_all_locks(), but we
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 13cc653122b7..f8a222db4a53 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -57,6 +57,7 @@
> > #include <trace/events/oom.h>
> > #include <linux/prefetch.h>
> > #include <linux/mm_inline.h>
> > +#include <linux/mmu_notifier.h>
> > #include <linux/migrate.h>
> > #include <linux/hugetlb.h>
> > #include <linux/sched/rt.h>
> > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > static struct lockdep_map __fs_reclaim_map =
> > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> >
> > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > +static bool __need_reclaim(gfp_t gfp_mask)
> > {
> > gfp_mask = current_gfp_context(gfp_mask);
> >
> > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > if (current->flags & PF_MEMALLOC)
> > return false;
> >
> > - /* We're only interested __GFP_FS allocations for now */
> > - if (!(gfp_mask & __GFP_FS))
> > - return false;
> > -
> > if (gfp_mask & __GFP_NOLOCKDEP)
> > return false;
> >
> > @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
> >
> > void fs_reclaim_acquire(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_acquire();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (!(gfp_mask & __GFP_FS))
> > + __fs_reclaim_acquire();
> > +
> > + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > +
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> >
> > void fs_reclaim_release(gfp_t gfp_mask)
> > {
> > - if (__need_fs_reclaim(gfp_mask))
> > - __fs_reclaim_release();
> > + if (__need_reclaim(gfp_mask)) {
> > + if (!(gfp_mask & __GFP_FS))
> > + __fs_reclaim_release();
> > + }
> > }
> > EXPORT_SYMBOL_GPL(fs_reclaim_release);
> > #endif
> > --
> > 2.26.2
> >
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release
2020-06-21 17:28 ` Daniel Vetter
@ 2020-06-21 17:46 ` Qian Cai
0 siblings, 0 replies; 106+ messages in thread
From: Qian Cai @ 2020-06-21 17:46 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
linux-xfs, Linux MM, Jason Gunthorpe, DRI Development,
Daniel Vetter, Andrew Morton, Christian König
On Sun, Jun 21, 2020 at 07:28:40PM +0200, Daniel Vetter wrote:
> On Sun, Jun 21, 2020 at 7:01 PM Qian Cai <cai@lca.pw> wrote:
> >
> > On Thu, Jun 04, 2020 at 10:12:07AM +0200, Daniel Vetter wrote:
> > > fs_reclaim_acquire/release nicely catch recursion issues when
> > > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend
> > > to use to keep the excessive caches in check). For mmu notifier
> > > recursions we do have lockdep annotations since 23b68395c7c7
> > > ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end").
> > >
> > > But these only fire if a path actually results in some pte
> > > invalidation - for most small allocations that's very rarely the case.
> > > The other trouble is that pte invalidation can happen any time when
> > > __GFP_RECLAIM is set. Which means only really GFP_ATOMIC is a safe
> > > choice, GFP_NOIO isn't good enough to avoid potential mmu notifier
> > > recursion.
> > >
> > > I was pondering whether we should just do the general annotation, but
> > > there's always the risk for false positives. Plus I'm assuming that
> > > the core fs and io code is a lot better reviewed and tested than
> > > random mmu notifier code in drivers. Hence why I decided to only
> > > annotate for that specific case.
> > >
> > > Furthermore even if we'd create a lockdep map for direct reclaim, we'd
> > > still need to explicitly pull in the mmu notifier map - there's a lot
> > > more places that do pte invalidation than just direct reclaim, these
> > > two contexts aren't the same.
> > >
> > > Note that the mmu notifiers needing their own independent lockdep map
> > > is also the reason we can't hold them from fs_reclaim_acquire to
> > > fs_reclaim_release - it would nest with the acquisition in the pte
> > > invalidation code, causing a lockdep splat. And we can't remove the
> > > annotations from pte invalidation and all the other places since
> > > they're called from many other places than page reclaim. Hence we can
> > > only do the equivalent of might_lock, but on the raw lockdep map.
> > >
> > > With this we can also remove the lockdep priming added in 66204f1d2d1b
> > > ("mm/mmu_notifiers: prime lockdep") since the new annotations are
> > > strictly more powerful.
> > >
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > Cc: linux-mm@kvack.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Cc: Christian König <christian.koenig@amd.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >
> > Reverting this commit fixed the lockdep splat below while applying some
> > memory pressure,
>
> This is a broken version of the patch, please use the one Andrew
> merged into -mm.
Yes, this is 5.8.0-rc1-next-20200621, which I believe includes the
latest version from -mm. Anyway, I replied again to your latest patch:
https://lore.kernel.org/lkml/20200621174205.GB1398@lca.pw/
>
> Thanks, Daniel
>
>
> >
> > [ 190.455003][ T369] WARNING: possible circular locking dependency detected
> > [ 190.487291][ T369] 5.8.0-rc1-next-20200621 #1 Not tainted
> > [ 190.512363][ T369] ------------------------------------------------------
> > [ 190.543354][ T369] kswapd3/369 is trying to acquire lock:
> > [ 190.568523][ T369] ffff889fcf694528 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_reclaim_inode+0xdf/0x860
> > spin_lock at include/linux/spinlock.h:353
> > (inlined by) xfs_iflags_test_and_set at fs/xfs/xfs_inode.h:166
> > (inlined by) xfs_iflock_nowait at fs/xfs/xfs_inode.h:249
> > (inlined by) xfs_reclaim_inode at fs/xfs/xfs_icache.c:1127
> > [ 190.614359][ T369]
> > [ 190.614359][ T369] but task is already holding lock:
> > [ 190.647763][ T369] ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> > __fs_reclaim_acquire at mm/page_alloc.c:4200
> > [ 190.687845][ T369]
> > [ 190.687845][ T369] which lock already depends on the new lock.
> > [ 190.687845][ T369]
> > [ 190.734890][ T369]
> > [ 190.734890][ T369] the existing dependency chain (in reverse order) is:
> > [ 190.775991][ T369]
> > [ 190.775991][ T369] -> #1 (fs_reclaim){+.+.}-{0:0}:
> > [ 190.808150][ T369] fs_reclaim_acquire+0x77/0x80
> > [ 190.832152][ T369] slab_pre_alloc_hook.constprop.52+0x20/0x120
> > slab_pre_alloc_hook at mm/slab.h:507
> > [ 190.862173][ T369] kmem_cache_alloc+0x43/0x2a0
> > [ 190.885602][ T369] kmem_zone_alloc+0x113/0x3ef
> > kmem_zone_alloc at fs/xfs/kmem.c:129
> > [ 190.908702][ T369] xfs_inode_item_init+0x1d/0xa0
> > xfs_inode_item_init at fs/xfs/xfs_inode_item.c:639
> > [ 190.934461][ T369] xfs_trans_ijoin+0x96/0x100
> > xfs_trans_ijoin at fs/xfs/libxfs/xfs_trans_inode.c:34
> > [ 190.961530][ T369] xfs_setattr_nonsize+0x1a6/0xcd0
> > xfs_setattr_nonsize at fs/xfs/xfs_iops.c:716
> > [ 190.987331][ T369] xfs_vn_setattr+0x133/0x160
> > xfs_vn_setattr at fs/xfs/xfs_iops.c:1081
> > [ 191.010476][ T369] notify_change+0x6c5/0xba1
> > notify_change at fs/attr.c:336
> > [ 191.033317][ T369] chmod_common+0x19b/0x390
> > [ 191.055770][ T369] ksys_fchmod+0x28/0x60
> > [ 191.077957][ T369] __x64_sys_fchmod+0x4e/0x70
> > [ 191.102767][ T369] do_syscall_64+0x5f/0x310
> > [ 191.125090][ T369] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 191.153749][ T369]
> > [ 191.153749][ T369] -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
> > [ 191.191267][ T369] __lock_acquire+0x2efc/0x4da0
> > [ 191.215974][ T369] lock_acquire+0x1ac/0xaf0
> > [ 191.238953][ T369] down_write_nested+0x92/0x150
> > [ 191.262955][ T369] xfs_reclaim_inode+0xdf/0x860
> > [ 191.287149][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> > [ 191.313291][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> > [ 191.338357][ T369] super_cache_scan+0x2fd/0x430
> > [ 191.362354][ T369] do_shrink_slab+0x317/0x990
> > [ 191.385341][ T369] shrink_slab+0x3a8/0x4b0
> > [ 191.407214][ T369] shrink_node+0x49c/0x17b0
> > [ 191.429841][ T369] balance_pgdat+0x59c/0xed0
> > [ 191.455041][ T369] kswapd+0x5a4/0xc40
> > [ 191.477524][ T369] kthread+0x358/0x420
> > [ 191.499285][ T369] ret_from_fork+0x22/0x30
> > [ 191.521107][ T369]
> > [ 191.521107][ T369] other info that might help us debug this:
> > [ 191.521107][ T369]
> > [ 191.567490][ T369] Possible unsafe locking scenario:
> > [ 191.567490][ T369]
> > [ 191.600947][ T369] CPU0 CPU1
> > [ 191.624808][ T369] ---- ----
> > [ 191.649236][ T369] lock(fs_reclaim);
> > [ 191.667607][ T369] lock(&xfs_nondir_ilock_class);
> > [ 191.702096][ T369] lock(fs_reclaim);
> > [ 191.731243][ T369] lock(&xfs_nondir_ilock_class);
> > [ 191.754025][ T369]
> > [ 191.754025][ T369] *** DEADLOCK ***
> > [ 191.754025][ T369]
> > [ 191.791126][ T369] 4 locks held by kswapd3/369:
> > [ 191.812198][ T369] #0: ffffffffb50ced00 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30
> > [ 191.854319][ T369] #1: ffffffffb5074c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x219/0x4b0
> > [ 191.896043][ T369] #2: ffff8890279b40e0 (&type->s_umount_key#27){++++}-{3:3}, at: trylock_super+0x11/0xb0
> > [ 191.940538][ T369] #3: ffff889027a73a28 (&pag->pag_ici_reclaim_lock){+.+.}-{3:3}, at: xfs_reclaim_inodes_ag+0x135/0xb00
> > [ 191.995314][ T369]
> > [ 191.995314][ T369] stack backtrace:
> > [ 192.022934][ T369] CPU: 42 PID: 369 Comm: kswapd3 Not tainted 5.8.0-rc1-next-20200621 #1
> > [ 192.060546][ T369] Hardware name: HP ProLiant BL660c Gen9, BIOS I38 10/17/2018
> > [ 192.094518][ T369] Call Trace:
> > [ 192.109005][ T369] dump_stack+0x9d/0xe0
> > [ 192.127468][ T369] check_noncircular+0x347/0x400
> > [ 192.149526][ T369] ? print_circular_bug+0x360/0x360
> > [ 192.172584][ T369] ? freezing_slow_path.cold.2+0x2a/0x2a
> > [ 192.197251][ T369] __lock_acquire+0x2efc/0x4da0
> > [ 192.218737][ T369] ? lockdep_hardirqs_on_prepare+0x550/0x550
> > [ 192.246736][ T369] ? __lock_acquire+0x3541/0x4da0
> > [ 192.269673][ T369] lock_acquire+0x1ac/0xaf0
> > [ 192.290192][ T369] ? xfs_reclaim_inode+0xdf/0x860
> > [ 192.313158][ T369] ? rcu_read_unlock+0x50/0x50
> > [ 192.335057][ T369] down_write_nested+0x92/0x150
> > [ 192.358409][ T369] ? xfs_reclaim_inode+0xdf/0x860
> > [ 192.380890][ T369] ? rwsem_down_write_slowpath+0xf50/0xf50
> > [ 192.406891][ T369] ? find_held_lock+0x33/0x1c0
> > [ 192.427925][ T369] ? xfs_ilock+0x2ef/0x370
> > [ 192.447496][ T369] ? xfs_reclaim_inode+0xdf/0x860
> > [ 192.472315][ T369] xfs_reclaim_inode+0xdf/0x860
> > [ 192.496649][ T369] ? xfs_inode_clear_reclaim_tag+0xa0/0xa0
> > [ 192.524188][ T369] ? do_raw_spin_unlock+0x4f/0x250
> > [ 192.546852][ T369] xfs_reclaim_inodes_ag+0x505/0xb00
> > [ 192.570473][ T369] ? xfs_reclaim_inode+0x860/0x860
> > [ 192.592692][ T369] ? mark_held_locks+0xb0/0x110
> > [ 192.614287][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> > [ 192.640800][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> > [ 192.666695][ T369] ? try_to_wake_up+0xcf/0xf40
> > [ 192.688265][ T369] ? migrate_swap_stop+0xc10/0xc10
> > [ 192.711966][ T369] ? do_raw_spin_unlock+0x4f/0x250
> > [ 192.735032][ T369] xfs_reclaim_inodes_nr+0x93/0xd0
> > xfs_reclaim_inodes_nr at fs/xfs/xfs_icache.c:1399
> > [ 192.757674][ T369] ? xfs_reclaim_inodes+0x90/0x90
> > [ 192.780028][ T369] ? list_lru_count_one+0x177/0x300
> > [ 192.803010][ T369] super_cache_scan+0x2fd/0x430
> > super_cache_scan at fs/super.c:115
> > [ 192.824491][ T369] do_shrink_slab+0x317/0x990
> > do_shrink_slab at mm/vmscan.c:514
> > [ 192.845160][ T369] shrink_slab+0x3a8/0x4b0
> > shrink_slab_memcg at mm/vmscan.c:584
> > (inlined by) shrink_slab at mm/vmscan.c:662
> > [ 192.864722][ T369] ? do_shrink_slab+0x990/0x990
> > [ 192.886137][ T369] ? rcu_is_watching+0x2c/0x80
> > [ 192.907289][ T369] ? mem_cgroup_protected+0x228/0x470
> > [ 192.931166][ T369] ? vmpressure+0x25/0x290
> > [ 192.950595][ T369] shrink_node+0x49c/0x17b0
> > [ 192.972332][ T369] balance_pgdat+0x59c/0xed0
> > kswapd_shrink_node at mm/vmscan.c:3521
> > (inlined by) balance_pgdat at mm/vmscan.c:3670
> > [ 192.994918][ T369] ? __node_reclaim+0x950/0x950
> > [ 193.018625][ T369] ? lockdep_hardirqs_on_prepare+0x38c/0x550
> > [ 193.046566][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> > [ 193.070214][ T369] ? _raw_spin_unlock_irq+0x1f/0x30
> > [ 193.093176][ T369] ? finish_task_switch+0x129/0x650
> > [ 193.116225][ T369] ? finish_task_switch+0xf2/0x650
> > [ 193.138809][ T369] ? rcu_read_lock_bh_held+0xc0/0xc0
> > [ 193.163323][ T369] kswapd+0x5a4/0xc40
> > [ 193.182690][ T369] ? __kthread_parkme+0x4d/0x1a0
> > [ 193.204660][ T369] ? balance_pgdat+0xed0/0xed0
> > [ 193.225776][ T369] ? _raw_spin_unlock_irqrestore+0x39/0x40
> > [ 193.252306][ T369] ? finish_wait+0x270/0x270
> > [ 193.272473][ T369] ? __kthread_parkme+0x4d/0x1a0
> > [ 193.294476][ T369] ? __kthread_parkme+0xcc/0x1a0
> > [ 193.316704][ T369] ? balance_pgdat+0xed0/0xed0
> > [ 193.337808][ T369] kthread+0x358/0x420
> > [ 193.355666][ T369] ? kthread_create_worker_on_cpu+0xc0/0xc0
> > [ 193.381884][ T369] ret_from_fork+0x22/0x30
> >
> > > ---
> > > This is part of a gpu lockdep annotation series simply because it
> > > really helps to catch issues where gpu subsystem locks and primitives
> > > can deadlock with themselves through allocations and mmu notifiers.
> > > But aside from that motivation it should be completely free-standing,
> > > and can land through -mm/-rdma/-hmm or any other tree really whenever.
> > > -Daniel
> > > ---
> > > mm/mmu_notifier.c | 7 -------
> > > mm/page_alloc.c | 23 ++++++++++++++---------
> > > 2 files changed, 14 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> > > index 06852b896fa6..5d578b9122f8 100644
> > > --- a/mm/mmu_notifier.c
> > > +++ b/mm/mmu_notifier.c
> > > @@ -612,13 +612,6 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
> > > lockdep_assert_held_write(&mm->mmap_sem);
> > > BUG_ON(atomic_read(&mm->mm_users) <= 0);
> > >
> > > - if (IS_ENABLED(CONFIG_LOCKDEP)) {
> > > - fs_reclaim_acquire(GFP_KERNEL);
> > > - lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > > - lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > > - fs_reclaim_release(GFP_KERNEL);
> > > - }
> > > -
> > > if (!mm->notifier_subscriptions) {
> > > /*
> > > * kmalloc cannot be called under mm_take_all_locks(), but we
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 13cc653122b7..f8a222db4a53 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -57,6 +57,7 @@
> > > #include <trace/events/oom.h>
> > > #include <linux/prefetch.h>
> > > #include <linux/mm_inline.h>
> > > +#include <linux/mmu_notifier.h>
> > > #include <linux/migrate.h>
> > > #include <linux/hugetlb.h>
> > > #include <linux/sched/rt.h>
> > > @@ -4124,7 +4125,7 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
> > > static struct lockdep_map __fs_reclaim_map =
> > > STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> > >
> > > -static bool __need_fs_reclaim(gfp_t gfp_mask)
> > > +static bool __need_reclaim(gfp_t gfp_mask)
> > > {
> > > gfp_mask = current_gfp_context(gfp_mask);
> > >
> > > @@ -4136,10 +4137,6 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > > if (current->flags & PF_MEMALLOC)
> > > return false;
> > >
> > > - /* We're only interested __GFP_FS allocations for now */
> > > - if (!(gfp_mask & __GFP_FS))
> > > - return false;
> > > -
> > > if (gfp_mask & __GFP_NOLOCKDEP)
> > > return false;
> > >
> > > @@ -4158,15 +4155,23 @@ void __fs_reclaim_release(void)
> > >
> > > void fs_reclaim_acquire(gfp_t gfp_mask)
> > > {
> > > - if (__need_fs_reclaim(gfp_mask))
> > > - __fs_reclaim_acquire();
> > > + if (__need_reclaim(gfp_mask)) {
> > > + if (!(gfp_mask & __GFP_FS))
> > > + __fs_reclaim_acquire();
> > > +
> > > + lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
> > > + lock_map_release(&__mmu_notifier_invalidate_range_start_map);
> > > +
> > > + }
> > > }
> > > EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
> > >
> > > void fs_reclaim_release(gfp_t gfp_mask)
> > > {
> > > - if (__need_fs_reclaim(gfp_mask))
> > > - __fs_reclaim_release();
> > > + if (__need_reclaim(gfp_mask)) {
> > > + if (!(gfp_mask & __GFP_FS))
> > > + __fs_reclaim_release();
> > > + }
> > > }
> > > EXPORT_SYMBOL_GPL(fs_reclaim_release);
> > > #endif
> > > --
> > > 2.26.2
> > >
> > >
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-10 13:07 ` Thomas Hellström (Intel)
2020-06-12 7:05 ` [Intel-gfx] [PATCH] " Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
` (28 subsequent siblings)
30 siblings, 2 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Daniel Vetter
Just some tiny edits:
- fix link to struct dma_fence
- give slightly more meaningful title - the polling here is about
implicit fences, explicit fences (in sync_file or drm_syncobj) also
have their own polling
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/dma-buf/dma-buf.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 01ce125f8e8d..e018ef80451e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -161,11 +161,11 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
}
/**
- * DOC: fence polling
+ * DOC: implicit fence polling
*
* To support cross-device and cross-driver synchronization of buffer access
- * implicit fences (represented internally in the kernel with &struct fence) can
- * be attached to a &dma_buf. The glue for that and a few related things are
+ * implicit fences (represented internally in the kernel with &struct dma_fence)
+ * can be attached to a &dma_buf. The glue for that and a few related things are
* provided in the &dma_resv structure.
*
* Userspace can query the state of these implicitly tracked fences using poll()
--
2.26.2
* Re: [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups
2020-06-04 8:12 ` [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups Daniel Vetter
@ 2020-06-10 13:07 ` Thomas Hellström (Intel)
2020-06-12 7:05 ` [Intel-gfx] [PATCH] " Daniel Vetter
1 sibling, 0 replies; 106+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-10 13:07 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx
On 6/4/20 10:12 AM, Daniel Vetter wrote:
> Just some tiny edits:
> - fix link to struct dma_fence
> - give slightly more meaningful title - the polling here is about
> implicit fences, explicit fences (in sync_file or drm_syncobj) also
> have their own polling
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@intel.com>
* [Intel-gfx] [PATCH] dma-buf: minor doc touch-ups
2020-06-04 8:12 ` [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups Daniel Vetter
2020-06-10 13:07 ` Thomas Hellström (Intel)
@ 2020-06-12 7:05 ` Daniel Vetter
2020-06-24 19:32 ` Daniel Vetter
1 sibling, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-12 7:05 UTC (permalink / raw)
To: DRI Development
Cc: Daniel Vetter, Intel Graphics Development, Thomas Hellstrom,
Daniel Vetter
Just some tiny edits:
- fix link to struct dma_fence
- give slightly more meaningful title - the polling here is about
implicit fences, explicit fences (in sync_file or drm_syncobj) also
have their own polling
v2: I misplaced the .rst include change corresponding to this patch.
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 6 +++---
drivers/dma-buf/dma-buf.c | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 63dec76d1d8d..7fb7b661febd 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
.. kernel-doc:: drivers/dma-buf/dma-buf.c
:doc: cpu access
-Fence Poll Support
-~~~~~~~~~~~~~~~~~~
+Implicit Fence Poll Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. kernel-doc:: drivers/dma-buf/dma-buf.c
- :doc: fence polling
+ :doc: implicit fence polling
Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 01ce125f8e8d..e018ef80451e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -161,11 +161,11 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
}
/**
- * DOC: fence polling
+ * DOC: implicit fence polling
*
* To support cross-device and cross-driver synchronization of buffer access
- * implicit fences (represented internally in the kernel with &struct fence) can
- * be attached to a &dma_buf. The glue for that and a few related things are
+ * implicit fences (represented internally in the kernel with &struct dma_fence)
+ * can be attached to a &dma_buf. The glue for that and a few related things are
* provided in the &dma_resv structure.
*
* Userspace can query the state of these implicitly tracked fences using poll()
--
2.26.2
* Re: [Intel-gfx] [PATCH] dma-buf: minor doc touch-ups
2020-06-12 7:05 ` [Intel-gfx] [PATCH] " Daniel Vetter
@ 2020-06-24 19:32 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-24 19:32 UTC (permalink / raw)
To: DRI Development
Cc: Daniel Vetter, Intel Graphics Development, Thomas Hellstrom,
Daniel Vetter
On Fri, Jun 12, 2020 at 09:05:35AM +0200, Daniel Vetter wrote:
> Just some tiny edits:
> - fix link to struct dma_fence
> - give slightly more meaningful title - the polling here is about
> implicit fences, explicit fences (in sync_file or drm_syncobj) also
> have their own polling
>
> v2: I misplaced the .rst include change corresponding to this patch.
>
> Reviewed-by: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
I went ahead and merged this one, shouldn't be the controversial part of
the series :-)
-Daniel
> ---
> Documentation/driver-api/dma-buf.rst | 6 +++---
> drivers/dma-buf/dma-buf.c | 6 +++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 63dec76d1d8d..7fb7b661febd 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> :doc: cpu access
>
> -Fence Poll Support
> -~~~~~~~~~~~~~~~~~~
> +Implicit Fence Poll Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> - :doc: fence polling
> + :doc: implicit fence polling
>
> Kernel Functions and Structures Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 01ce125f8e8d..e018ef80451e 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -161,11 +161,11 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
> }
>
> /**
> - * DOC: fence polling
> + * DOC: implicit fence polling
> *
> * To support cross-device and cross-driver synchronization of buffer access
> - * implicit fences (represented internally in the kernel with &struct fence) can
> - * be attached to a &dma_buf. The glue for that and a few related things are
> + * implicit fences (represented internally in the kernel with &struct dma_fence)
> + * can be attached to a &dma_buf. The glue for that and a few related things are
> * provided in the &dma_resv structure.
> *
> * Userspace can query the state of these implicitly tracked fences using poll()
> --
> 2.26.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 01/18] mm: Track mmu notifiers in fs_reclaim_acquire/release Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 02/18] dma-buf: minor doc touch-ups Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:57 ` Thomas Hellström (Intel)
` (4 more replies)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 04/18] dma-fence: prime " Daniel Vetter
` (27 subsequent siblings)
30 siblings, 5 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Thomas Hellstrom,
Daniel Vetter, linux-media, Christian König, Mika Kuoppala
Design is similar to the lockdep annotations for workers, but with
some twists:
- We use a read-lock for the execution/worker/completion side, so that
this explicit annotation can be more liberally sprinkled around.
With read locks lockdep isn't going to complain if the read-side
isn't nested the same way under all circumstances, so ABBA deadlocks
are ok. Which they are, since this is an annotation only.
- We're using non-recursive lockdep read lock mode, since in recursive
read lock mode lockdep does not catch read side hazards. And we
_very_ much want read side hazards to be caught. For full details of
this limitation see
commit e91498589746065e3ae95d9a00b068e525eec34f
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed Aug 23 13:13:11 2017 +0200
locking/lockdep/selftests: Add mixed read-write ABBA tests
- To allow nesting of the read-side explicit annotations we explicitly
keep track of the nesting. lock_is_held() allows us to do that.
- The wait-side annotation is a write lock, and entirely done within
dma_fence_wait() for everyone by default.
- To be able to freely annotate helper functions I want to make it ok
to call dma_fence_begin/end_signalling from soft/hardirq context.
First attempt was using the hardirq locking context for the write
side in lockdep, but this forces all normal spinlocks nested within
dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
The approach now is to simply check in_atomic(), and for these cases
entirely rely on the might_sleep() check in dma_fence_wait(). That
will catch any wrong nesting against spinlocks from soft/hardirq
contexts.
The idea here is that every code path that's critical for eventually
signalling a dma_fence should be annotated with
dma_fence_begin/end_signalling. The annotation ideally starts right
after a dma_fence is published (added to a dma_resv, exposed as a
sync_file fd, attached to a drm_syncobj fd, or anything else that
makes the dma_fence visible to other kernel threads), up to and
including the dma_fence_wait(). Examples are irq handlers, the
scheduler rt threads, the tail of execbuf (after the corresponding
fences are visible), any workers that end up signalling dma_fences and
really anything else. Not annotated should be code paths that only
complete fences opportunistically as the gpu progresses, like e.g.
shrinker/eviction code.
The main class of deadlocks this is supposed to catch are:
Thread A:
	mutex_lock(A);
	mutex_unlock(A);
	dma_fence_signal();
Thread B:
	mutex_lock(A);
	dma_fence_wait();
	mutex_unlock(A);
Thread B is blocked on A signalling the fence, but A never gets around
to that because it cannot acquire the lock A.
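As a rough illustration of how an annotation turns this scenario into something a lock-order checker can flag, here is a minimal userspace sketch. It is not kernel code and not how lockdep is implemented; the names order_graph, record_order and demo_deadlock_detected are made up for this example. It treats the fence annotation as one more lock class and rejects any acquisition order that would close a cycle.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: two lock classes, the mutex A and the fence map. */
enum { LOCK_A, FENCE_MAP, NUM_CLASSES };

struct order_graph {
	/* edge[x][y] is true once y was acquired while x was held */
	bool edge[NUM_CLASSES][NUM_CLASSES];
};

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reaches(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/* Record "taken acquired while held"; false means it would close a cycle. */
static bool record_order(struct order_graph *g, int held, int taken)
{
	bool seen[NUM_CLASSES] = { false };

	if (reaches(g, taken, held, seen))
		return false;
	g->edge[held][taken] = true;
	return true;
}

/* Replays Thread A (annotated signaller) and Thread B (waiter) from above. */
static bool demo_deadlock_detected(void)
{
	struct order_graph g;
	bool a_ok, b_ok;

	memset(&g, 0, sizeof(g));
	/* Thread A: inside the signalling section it takes mutex A. */
	a_ok = record_order(&g, FENCE_MAP, LOCK_A);
	/* Thread B: holding mutex A it waits on the fence. */
	b_ok = record_order(&g, LOCK_A, FENCE_MAP);
	return a_ok && !b_ok;
}
```

The second ordering is refused without either thread ever having to block, which is the property the annotations are after: the inversion is reported the first time both orders are seen, not only when the timing happens to produce the hang.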
Note that dma_fence_wait() is allowed to be nested within
dma_fence_begin/end_signalling sections. To allow this to happen the
read lock needs to be upgraded to a write lock, which means that if any
other lock is acquired between the dma_fence_begin_signalling() call and
the call to dma_fence_wait(), and is still held, this will result in an
immediate lockdep complaint. The only other option would be to not
annotate such calls, defeating the point. Therefore, to avoid false
positives, these annotations cannot be sprinkled over the code entirely
mindlessly.
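The cookie protocol that makes nested and atomic-context calls safe can be sketched in plain userspace C. This is only a model under stated assumptions: struct task_model and the model_* functions are hypothetical stand-ins, with map_held playing the role of lock_is_held_type() on the fence map and in_atomic standing in for in_atomic(). The real implementation is in the diff below.

```c
#include <stdbool.h>

/* Hypothetical per-task state standing in for lockdep and preempt state. */
struct task_model {
	bool map_held;	/* models "fence map already held as reader" */
	bool in_atomic;	/* models in_atomic() */
};

/* Returns true when nothing was acquired, so the matching end is a no-op. */
static bool model_begin_signalling(struct task_model *t)
{
	if (t->map_held)	/* explicit nesting: outer section owns it */
		return true;
	if (t->in_atomic)	/* soft/hardirq: rely on might_sleep() instead */
		return true;
	t->map_held = true;	/* models the non-recursive read acquire */
	return false;
}

static void model_end_signalling(struct task_model *t, bool cookie)
{
	if (cookie)
		return;
	t->map_held = false;	/* models lock_release() */
}

/* Nested sections: only the outermost pair acquires and releases. */
static bool demo_nested_sections(void)
{
	struct task_model t = { .map_held = false, .in_atomic = false };
	bool outer, inner, held_after_inner_end;

	outer = model_begin_signalling(&t);
	inner = model_begin_signalling(&t);
	model_end_signalling(&t, inner);
	held_after_inner_end = t.map_held;
	model_end_signalling(&t, outer);

	return !outer && inner && held_after_inner_end && !t.map_held;
}
```

Passing the cookie back is what lets helper functions be annotated without knowing whether their caller already opened a section: an inner end call with a true cookie leaves the outer section intact.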
v2: handle soft/hardirq ctx better against write side and don't forget
EXPORT_SYMBOL, drivers can't use this otherwise.
v3: Kerneldoc.
v4: Some spelling fixes from Mika
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 12 +-
drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
include/linux/dma-fence.h | 12 ++
3 files changed, 182 insertions(+), 3 deletions(-)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 63dec76d1d8d..05d856131140 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
.. kernel-doc:: drivers/dma-buf/dma-buf.c
:doc: cpu access
-Fence Poll Support
-~~~~~~~~~~~~~~~~~~
+Implicit Fence Poll Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. kernel-doc:: drivers/dma-buf/dma-buf.c
- :doc: fence polling
+ :doc: implicit fence polling
Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -133,6 +133,12 @@ DMA Fences
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: DMA fences overview
+DMA Fence Signalling Annotations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: fence signalling annotation
+
DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 656e9ac2d028..0005bc002529 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
}
EXPORT_SYMBOL(dma_fence_context_alloc);
+/**
+ * DOC: fence signalling annotation
+ *
+ * Proving correctness of all the kernel code around &dma_fence through code
+ * review and testing is tricky for a few reasons:
+ *
+ * * It is a cross-driver contract, and therefore all drivers must follow the
+ * same rules for lock nesting order, calling contexts for various functions
+ * and anything else significant for in-kernel interfaces. But it is also
+ * impossible to test all drivers in a single machine, hence brute-force N vs.
+ * N testing of all combinations is impossible. Even just limiting to the
+ * possible combinations is infeasible.
+ *
+ * * There is an enormous amount of driver code involved. For render drivers
+ * there's the tail of command submission, after fences are published,
+ * scheduler code, interrupt and workers to process job completion,
+ * and timeout, gpu reset and gpu hang recovery code. Plus for integration
+ * with core mm we have &mmu_notifier, respectively &mmu_interval_notifier,
+ * and &shrinker. For modesetting drivers there's the commit tail functions
+ * between when fences for an atomic modeset are published, and when the
+ * corresponding vblank completes, including any interrupt processing and
+ * related workers. Auditing all that code, across all drivers, is not
+ * feasible.
+ *
+ * * Due to how many other subsystems are involved and the locking hierarchies
+ * this pulls in there is extremely thin wiggle-room for driver-specific
+ * differences. &dma_fence interacts with almost all of the core memory
+ * handling through page fault handlers via &dma_resv, dma_resv_lock() and
+ * dma_resv_unlock(). On the other side it also interacts through all
+ * allocation sites through &mmu_notifier and &shrinker.
+ *
+ * Furthermore lockdep does not handle cross-release dependencies, which means
+ * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
+ * at runtime with some quick testing. The simplest example is one thread
+ * waiting on a &dma_fence while holding a lock::
+ *
+ * lock(A);
+ * dma_fence_wait(B);
+ * unlock(A);
+ *
+ * while the other thread is stuck trying to acquire the same lock, which
+ * prevents it from signalling the fence the previous thread is stuck waiting
+ * on::
+ *
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ *
+ * By manually annotating all code relevant to signalling a &dma_fence we can
+ * teach lockdep about these dependencies, which also helps with the validation
+ * headache since now lockdep can check all the rules for us::
+ *
+ * cookie = dma_fence_begin_signalling();
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ * dma_fence_end_signalling(cookie);
+ *
+ * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
+ * annotate critical sections the following rules need to be observed:
+ *
+ * * All code necessary to complete a &dma_fence must be annotated, from the
+ * point where a fence is accessible to other threads, to the point where
+ * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
+ * and due to the very strict rules and many corner cases it is infeasible to
+ * catch these just with review or normal stress testing.
+ *
+ * * &struct dma_resv deserves a special note, since the readers are only
+ * protected by rcu. This means the signalling critical section starts as soon
+ * as the new fences are installed, even before dma_resv_unlock() is called.
+ *
+ * * The only exception are fast paths and opportunistic signalling code, which
+ * calls dma_fence_signal() purely as an optimization, but is not required to
+ * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
+ * which calls dma_fence_signal(), while the mandatory completion path goes
+ * through a hardware interrupt and possible job completion worker.
+ *
+ * * To aid composability of code, the annotations can be freely nested, as long
+ * as the overall locking hierarchy is consistent. The annotations also work
+ * both in interrupt and process context. Due to implementation details this
+ * requires that callers pass an opaque cookie from
+ * dma_fence_begin_signalling() to dma_fence_end_signalling().
+ *
+ * * Validation against the cross driver contract is implemented by priming
+ * lockdep with the relevant hierarchy at boot-up. This means even just
+ * testing with a single device is enough to validate a driver, at least as
+ * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
+ * concerned.
+ */
+#ifdef CONFIG_LOCKDEP
+struct lockdep_map dma_fence_lockdep_map = {
+ .name = "dma_fence_map"
+};
+
+/**
+ * dma_fence_begin_signalling - begin a critical DMA fence signalling section
+ *
+ * Drivers should use this to annotate the beginning of any code section
+ * required to eventually complete &dma_fence by calling dma_fence_signal().
+ *
+ * The end of these critical sections are annotated with
+ * dma_fence_end_signalling().
+ *
+ * Returns:
+ *
+ * Opaque cookie needed by the implementation, which needs to be passed to
+ * dma_fence_end_signalling().
+ */
+bool dma_fence_begin_signalling(void)
+{
+ /* explicitly nesting ... */
+ if (lock_is_held_type(&dma_fence_lockdep_map, 1))
+ return true;
+
+ /* rely on might_sleep check for soft/hardirq locks */
+ if (in_atomic())
+ return true;
+
+ /* ... and non-recursive readlock */
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
+
+ return false;
+}
+EXPORT_SYMBOL(dma_fence_begin_signalling);
+
+/**
+ * dma_fence_end_signalling - end a critical DMA fence signalling section
+ *
+ * Closes a critical section annotation opened by dma_fence_begin_signalling().
+ */
+void dma_fence_end_signalling(bool cookie)
+{
+ if (cookie)
+ return;
+
+ lock_release(&dma_fence_lockdep_map, _RET_IP_);
+}
+EXPORT_SYMBOL(dma_fence_end_signalling);
+
+void __dma_fence_might_wait(void)
+{
+ bool tmp;
+
+ tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
+ if (tmp)
+ lock_release(&dma_fence_lockdep_map, _THIS_IP_);
+ lock_map_acquire(&dma_fence_lockdep_map);
+ lock_map_release(&dma_fence_lockdep_map);
+ if (tmp)
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
+}
+#endif
+
+
/**
* dma_fence_signal_locked - signal completion of a fence
* @fence: the fence to signal
@@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
{
unsigned long flags;
int ret;
+ bool tmp;
if (!fence)
return -EINVAL;
+ tmp = dma_fence_begin_signalling();
+
spin_lock_irqsave(fence->lock, flags);
ret = dma_fence_signal_locked(fence);
spin_unlock_irqrestore(fence->lock, flags);
+ dma_fence_end_signalling(tmp);
+
return ret;
}
EXPORT_SYMBOL(dma_fence_signal);
@@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
might_sleep();
+ __dma_fence_might_wait();
+
trace_dma_fence_wait_start(fence);
if (fence->ops->wait)
ret = fence->ops->wait(fence, intr, timeout);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 3347c54f3a87..3f288f7db2ef 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
} while (1);
}
+#ifdef CONFIG_LOCKDEP
+bool dma_fence_begin_signalling(void);
+void dma_fence_end_signalling(bool cookie);
+#else
+static inline bool dma_fence_begin_signalling(void)
+{
+ return true;
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
+static inline void __dma_fence_might_wait(void) {}
+#endif
+
int dma_fence_signal(struct dma_fence *fence);
int dma_fence_signal_locked(struct dma_fence *fence);
signed long dma_fence_default_wait(struct dma_fence *fence,
--
2.26.2
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 106+ messages in thread* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
@ 2020-06-04 8:57 ` Thomas Hellström (Intel)
2020-06-04 9:21 ` Daniel Vetter
2020-06-05 13:29 ` [Intel-gfx] [PATCH] " Daniel Vetter
` (3 subsequent siblings)
4 siblings, 1 reply; 106+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-04 8:57 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellstrom, Daniel Vetter,
Mika Kuoppala, Christian König, linux-media
On 6/4/20 10:12 AM, Daniel Vetter wrote:
...
> Thread A:
>
> mutex_lock(A);
> mutex_unlock(A);
>
> dma_fence_signal();
>
> Thread B:
>
> mutex_lock(A);
> dma_fence_wait();
> mutex_unlock(A);
>
> Thread B is blocked on A signalling the fence, but A never gets around
> to that because it cannot acquire the lock A.
>
> Note that dma_fence_wait() is allowed to be nested within
> dma_fence_begin/end_signalling sections. To allow this to happen the
> read lock needs to be upgraded to a write lock, which means that if any
> other lock is acquired between the dma_fence_begin_signalling() call and
> the call to dma_fence_wait(), and is still held, this will result in an
> immediate lockdep complaint. The only other option would be to not
> annotate such calls, defeating the point. Therefore these annotations
> cannot be sprinkled over the code entirely mindlessly, to avoid false
> positives.
Just realized, isn't that example actually a true positive, or at least
a great candidate for a true positive, since if another thread reenters
that signaling path, it will block on that mutex, and the fence would
never be signaled unless there is another signaling path?
Although I agree the conclusion is sound: These annotations cannot be
sprinkled mindlessly over the code.
/Thomas
>
> v2: handle soft/hardirq ctx better against write side and dont forget
> EXPORT_SYMBOL, drivers can't use this otherwise.
>
> v3: Kerneldoc.
>
> v4: Some spelling fixes from Mika
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> Documentation/driver-api/dma-buf.rst | 12 +-
> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> include/linux/dma-fence.h | 12 ++
> 3 files changed, 182 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 63dec76d1d8d..05d856131140 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> :doc: cpu access
>
> -Fence Poll Support
> -~~~~~~~~~~~~~~~~~~
> +Implicit Fence Poll Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> - :doc: fence polling
> + :doc: implicit fence polling
>
> Kernel Functions and Structures Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> @@ -133,6 +133,12 @@ DMA Fences
> .. kernel-doc:: drivers/dma-buf/dma-fence.c
> :doc: DMA fences overview
>
> +DMA Fence Signalling Annotations
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> + :doc: fence signalling annotation
> +
> DMA Fences Functions Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 656e9ac2d028..0005bc002529 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> }
> EXPORT_SYMBOL(dma_fence_context_alloc);
>
> +/**
> + * DOC: fence signalling annotation
> + *
> + * Proving correctness of all the kernel code around &dma_fence through code
> + * review and testing is tricky for a few reasons:
> + *
> + * * It is a cross-driver contract, and therefore all drivers must follow the
> + * same rules for lock nesting order, calling contexts for various functions
> + * and anything else significant for in-kernel interfaces. But it is also
> + * impossible to test all drivers in a single machine, hence brute-force N vs.
> + * N testing of all combinations is impossible. Even just limiting to the
> + * possible combinations is infeasible.
> + *
> + * * There is an enormous amount of driver code involved. For render drivers
> + * there's the tail of command submission, after fences are published,
> + * scheduler code, interrupt and workers to process job completion,
> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
> + * and &shrinker. For modesetting drivers there's the commit tail functions
> + * between when fences for an atomic modeset are published, and when the
> + * corresponding vblank completes, including any interrupt processing and
> + * related workers. Auditing all that code, across all drivers, is not
> + * feasible.
> + *
> + * * Due to how many other subsystems are involved and the locking hierarchies
> + * this pulls in there is extremely thin wiggle-room for driver-specific
> + * differences. &dma_fence interacts with almost all of the core memory
> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> + * dma_resv_unlock(). On the other side it also interacts through all
> + * allocation sites through &mmu_notifier and &shrinker.
> + *
> + * Furthermore lockdep does not handle cross-release dependencies, which means
> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> + * at runtime with some quick testing. The simplest example is one thread
> + * waiting on a &dma_fence while holding a lock::
> + *
> + * lock(A);
> + * dma_fence_wait(B);
> + * unlock(A);
> + *
> + * while the other thread is stuck trying to acquire the same lock, which
> + * prevents it from signalling the fence the previous thread is stuck waiting
> + * on::
> + *
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + *
> + * By manually annotating all code relevant to signalling a &dma_fence we can
> + * teach lockdep about these dependencies, which also helps with the validation
> + * headache since now lockdep can check all the rules for us::
> + *
> + * cookie = dma_fence_begin_signalling();
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + * dma_fence_end_signalling(cookie);
> + *
> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> + * annotate critical sections the following rules need to be observed:
> + *
> + * * All code necessary to complete a &dma_fence must be annotated, from the
> + * point where a fence is accessible to other threads, to the point where
> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> + * and due to the very strict rules and many corner cases it is infeasible to
> + * catch these just with review or normal stress testing.
> + *
> + * * &struct dma_resv deserves a special note, since the readers are only
> + * protected by rcu. This means the signalling critical section starts as soon
> + * as the new fences are installed, even before dma_resv_unlock() is called.
> + *
> + * * The only exception are fast paths and opportunistic signalling code, which
> + * calls dma_fence_signal() purely as an optimization, but is not required to
> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> + * which calls dma_fence_signal(), while the mandatory completion path goes
> + * through a hardware interrupt and possible job completion worker.
> + *
> + * * To aid composability of code, the annotations can be freely nested, as long
> + * as the overall locking hierarchy is consistent. The annotations also work
> + * both in interrupt and process context. Due to implementation details this
> + * requires that callers pass an opaque cookie from
> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> + *
> + * * Validation against the cross driver contract is implemented by priming
> + * lockdep with the relevant hierarchy at boot-up. This means even just
> + * testing with a single device is enough to validate a driver, at least as
> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> + * concerned.
> + */
> +#ifdef CONFIG_LOCKDEP
> +struct lockdep_map dma_fence_lockdep_map = {
> + .name = "dma_fence_map"
> +};
> +
> +/**
> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> + *
> + * Drivers should use this to annotate the beginning of any code section
> + * required to eventually complete &dma_fence by calling dma_fence_signal().
> + *
> + * The end of these critical sections are annotated with
> + * dma_fence_end_signalling().
> + *
> + * Returns:
> + *
> + * Opaque cookie needed by the implementation, which needs to be passed to
> + * dma_fence_end_signalling().
> + */
> +bool dma_fence_begin_signalling(void)
> +{
> + /* explicitly nesting ... */
> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> + return true;
> +
> + /* rely on might_sleep check for soft/hardirq locks */
> + if (in_atomic())
> + return true;
> +
> + /* ... and non-recursive readlock */
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> +
> + return false;
> +}
> +EXPORT_SYMBOL(dma_fence_begin_signalling);
> +
> +/**
> + * dma_fence_end_signalling - end a critical DMA fence signalling section
> + *
> + * Closes a critical section annotation opened by dma_fence_begin_signalling().
> + */
> +void dma_fence_end_signalling(bool cookie)
> +{
> + if (cookie)
> + return;
> +
> + lock_release(&dma_fence_lockdep_map, _RET_IP_);
> +}
> +EXPORT_SYMBOL(dma_fence_end_signalling);
> +
> +void __dma_fence_might_wait(void)
> +{
> + bool tmp;
> +
> + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> + if (tmp)
> + lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> + lock_map_acquire(&dma_fence_lockdep_map);
> + lock_map_release(&dma_fence_lockdep_map);
> + if (tmp)
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> +}
> +#endif
> +
> +
> /**
> * dma_fence_signal_locked - signal completion of a fence
> * @fence: the fence to signal
> @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
> {
> unsigned long flags;
> int ret;
> + bool tmp;
>
> if (!fence)
> return -EINVAL;
>
> + tmp = dma_fence_begin_signalling();
> +
> spin_lock_irqsave(fence->lock, flags);
> ret = dma_fence_signal_locked(fence);
> spin_unlock_irqrestore(fence->lock, flags);
>
> + dma_fence_end_signalling(tmp);
> +
> return ret;
> }
> EXPORT_SYMBOL(dma_fence_signal);
> @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
>
> might_sleep();
>
> + __dma_fence_might_wait();
> +
> trace_dma_fence_wait_start(fence);
> if (fence->ops->wait)
> ret = fence->ops->wait(fence, intr, timeout);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 3347c54f3a87..3f288f7db2ef 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> } while (1);
> }
>
> +#ifdef CONFIG_LOCKDEP
> +bool dma_fence_begin_signalling(void);
> +void dma_fence_end_signalling(bool cookie);
> +#else
> +static inline bool dma_fence_begin_signalling(void)
> +{
> + return true;
> +}
> +static inline void dma_fence_end_signalling(bool cookie) {}
> +static inline void __dma_fence_might_wait(void) {}
> +#endif
> +
> int dma_fence_signal(struct dma_fence *fence);
> int dma_fence_signal_locked(struct dma_fence *fence);
> signed long dma_fence_default_wait(struct dma_fence *fence,
^ permalink raw reply [flat|nested] 106+ messages in thread* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 8:57 ` Thomas Hellström (Intel)
@ 2020-06-04 9:21 ` Daniel Vetter
2020-06-04 9:26 ` Chris Wilson
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 9:21 UTC (permalink / raw)
To: Thomas Hellström (Intel)
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On Thu, Jun 4, 2020 at 10:57 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 6/4/20 10:12 AM, Daniel Vetter wrote:
> ...
> > Thread A:
> >
> > mutex_lock(A);
> > mutex_unlock(A);
> >
> > dma_fence_signal();
> >
> > Thread B:
> >
> > mutex_lock(A);
> > dma_fence_wait();
> > mutex_unlock(A);
> >
> > Thread B is blocked on A signalling the fence, but A never gets around
> > to that because it cannot acquire the lock A.
> >
> > Note that dma_fence_wait() is allowed to be nested within
> > dma_fence_begin/end_signalling sections. To allow this to happen the
> > read lock needs to be upgraded to a write lock, which means that if any
> > other lock is acquired between the dma_fence_begin_signalling() call and
> > the call to dma_fence_wait(), and is still held, this will result in an
> > immediate lockdep complaint. The only other option would be to not
> > annotate such calls, defeating the point. Therefore these annotations
> > cannot be sprinkled over the code entirely mindlessly, to avoid false
> > positives.
>
> Just realized, isn't that example actually a true positive, or at least
> a great candidate for a true positive, since if another thread reenters
> that signaling path, it will block on that mutex, and the fence would
> never be signaled unless there is another signaling path?
Not sure I understand fully, but I think the answer is "it's complicated".
dma_fence are meant to be a DAG (directed acyclic graph). Now it would
be nice to enforce that, and i915 has some attempts to that effect,
but these annotations here don't try to pull off that miracle. I'm
assuming that all the dependencies between dma_fence don't create a
loop, and instead I'm only focusing on deadlocks between dma_fences
and other locks. Usually an async work looks like this:
1. wait for a bunch of dma_fence that we have as dependencies
2. do work (e.g. atomic commit)
3. signal the dma_fence that represents our work
This can happen on the cpu in a kthread or worker, or on the gpu. Now
for various reasons you might want to have a per-work mutex or something and
hold that while going through all this, and this is the false positive
I'm thinking of. Of course, if your fences aren't a DAG, or if you're
holding a mutex that's shared with some other work which is part of
your dependency chain, then this goes boom. But it doesn't have to.
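To make the nesting concrete, here is a minimal userspace sketch of the three steps above, with the signal step nested inside an outer annotated section. It only models the cookie logic of dma_fence_begin_signalling() and dma_fence_end_signalling() from the patch. All toy_* names are hypothetical stand-ins, a single flag replaces lockdep's bookkeeping, and the in_atomic() early return is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for "dma_fence_lockdep_map is held"; not kernel code. */
static bool toy_map_held;
/* Counts how often the map was really acquired (outermost sections only). */
static int toy_acquire_count;

/* Models the cookie logic of dma_fence_begin_signalling() in the patch. */
static bool toy_begin_signalling(void)
{
	if (toy_map_held)
		return true;	/* nested call: nothing to release later */

	toy_map_held = true;
	toy_acquire_count++;
	return false;		/* outermost call: end must release */
}

/* Models dma_fence_end_signalling(): only the outermost cookie releases. */
static void toy_end_signalling(bool cookie)
{
	if (cookie)
		return;

	assert(toy_map_held);
	toy_map_held = false;
}

/* Step 3: signalling annotates itself, like dma_fence_signal() in the patch. */
static void toy_signal(void)
{
	bool cookie = toy_begin_signalling();

	/* ... the fence would be marked as signalled here ... */
	toy_end_signalling(cookie);
}

/* Whole work item in one outer section; returns true if nesting balanced. */
static bool toy_run_work(void)
{
	bool cookie = toy_begin_signalling();
	bool held_after_signal;

	/* steps 1 and 2 (wait for dependencies, do the work) would go here */
	toy_signal();
	held_after_signal = toy_map_held;	/* inner end must not release */
	toy_end_signalling(cookie);

	return held_after_signal && !toy_map_held;
}
```

The point of the cookie is visible in toy_signal(): the nested end call gets true back from begin and therefore leaves the outer section alone.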
I think in general it's best to purely rely on ordering, and remove as
much locking as possible. This is the design behind the atomic modeset
commit code, which does not take any mutexes in the commit path, at
least not in the helpers. Drivers can still do stuff of course. Then
the only locks you're left with are spinlocks (maybe irq safe ones) to
coordinate with interrupt handlers, workers, handle the wait/wake
queues, manage work/scheduler run queues and all that stuff, and no
mutexes.
Now for the case where you have something like the below:
thread 1:
	cookie = dma_fence_begin_signalling();
	mutex_lock(a);
	dma_fence_wait(b1);
	mutex_unlock(a);
	dma_fence_signal(b2);
	dma_fence_end_signalling(cookie);
That's indeed a bit problematic, assuming you're annotating stuff
correctly, and the locking is actually required. I've seen a few of
these, and annotating them properly needs care:
- often the mutex_lock/unlock is not needed, and just gets in the way.
This was the case for the original atomic modeset commit work patches,
which again locked all the modeset locks. But strict ordering of
commit work was all that was needed to make this work, plus making
sure data structure lifetimes are handled correctly too. I think the
tendency to abuse locking to handle lifetime and ordering problems is
fairly common, but it can lead to lots of trouble. Ime all async work
items with the above problematic pattern can be fixed like this.
- the other common case is that dma_fence_begin_signalling() can and should
be pushed down past the mutex_lock, and maybe even past the
dma_fence_wait, depending upon when/how the dma_fence is published.
The fence signalling critical section can still extend past the
mutex_unlock, lockdep and semantics are fine with that (I think at
least). This is more the case for execbuf tails, where you take locks,
set up some async work, publish the fences and then begin to process
these fences (which could just be pushing the work to the job
scheduler, but could also involve running it directly in the userspace
process thread context, but with locks already dropped).
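The second case can be illustrated with a deliberately tiny model of lock ordering. This is a sketch under strong assumptions and not how lockdep works internally: it records only two ordering edges between the signalling annotation and a single mutex "a", and reports a complaint when both directions have been seen. All toy_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock-order recorder; real lockdep tracks far more than two edges. */
static bool toy_in_section;	/* signalling annotation currently held */
static bool toy_mutex_held;	/* mutex "a" currently held */
static bool toy_map_then_mutex;	/* saw: annotation held while taking "a" */
static bool toy_mutex_then_map;	/* saw: "a" held while annotating or waiting */

static void toy_reset(void)
{
	toy_in_section = toy_mutex_held = false;
	toy_map_then_mutex = toy_mutex_then_map = false;
}

static void toy_begin(void)
{
	if (toy_mutex_held)
		toy_mutex_then_map = true;
	toy_in_section = true;
}

static void toy_end(void)
{
	toy_in_section = false;
}

static void toy_mutex_lock(void)
{
	if (toy_in_section)
		toy_map_then_mutex = true;
	toy_mutex_held = true;
}

static void toy_mutex_unlock(void)
{
	toy_mutex_held = false;
}

/* Waiting on a fence counts as taking the map while "a" is held. */
static void toy_fence_wait(void)
{
	if (toy_mutex_held)
		toy_mutex_then_map = true;
}

/* Both orders seen means the toy checker complains. */
static bool toy_has_cycle(void)
{
	return toy_map_then_mutex && toy_mutex_then_map;
}

/* Annotation opened before mutex_lock(a): the problematic pattern. */
static bool toy_annotate_early(void)
{
	toy_reset();
	toy_begin();
	toy_mutex_lock();
	toy_fence_wait();
	toy_mutex_unlock();
	/* dma_fence_signal(b2) would go here */
	toy_end();
	return toy_has_cycle();
}

/* Annotation pushed down past the lock and the wait. */
static bool toy_annotate_late(void)
{
	toy_reset();
	toy_mutex_lock();
	toy_fence_wait();
	toy_begin();		/* section starts only once the fence is published */
	toy_mutex_unlock();	/* section extends past the unlock */
	/* dma_fence_signal(b2) would go here */
	toy_end();
	return toy_has_cycle();
}
```

In this model only toy_annotate_early() sees both orderings. Whether moving the annotation is correct in a real driver still depends on when the fence actually becomes visible to other threads.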
So I wouldn't go out and say these are true positives, just maybe
unnecessary locking and over-eager annotations, without any real bugs
in the code.
Or am I completely off the track and you're thinking of something else?
> Although I agree the conclusion is sound: These annotations cannot be
> sprinkled mindlessly over the code.
Yup, that much is for sure.
-Daniel
>
> /Thomas
>
>
>
>
>
>
> >
> > v2: handle soft/hardirq ctx better against write side and dont forget
> > EXPORT_SYMBOL, drivers can't use this otherwise.
> >
> > v3: Kerneldoc.
> >
> > v4: Some spelling fixes from Mika
> >
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> > Documentation/driver-api/dma-buf.rst | 12 +-
> > drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> > include/linux/dma-fence.h | 12 ++
> > 3 files changed, 182 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> > index 63dec76d1d8d..05d856131140 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> > .. kernel-doc:: drivers/dma-buf/dma-buf.c
> > :doc: cpu access
> >
> > -Fence Poll Support
> > -~~~~~~~~~~~~~~~~~~
> > +Implicit Fence Poll Support
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > .. kernel-doc:: drivers/dma-buf/dma-buf.c
> > - :doc: fence polling
> > + :doc: implicit fence polling
> >
> > Kernel Functions and Structures Reference
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > @@ -133,6 +133,12 @@ DMA Fences
> > .. kernel-doc:: drivers/dma-buf/dma-fence.c
> > :doc: DMA fences overview
> >
> > +DMA Fence Signalling Annotations
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> > + :doc: fence signalling annotation
> > +
> > DMA Fences Functions Reference
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 656e9ac2d028..0005bc002529 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> > }
> > EXPORT_SYMBOL(dma_fence_context_alloc);
> >
> > +/**
> > + * DOC: fence signalling annotation
> > + *
> > + * Proving correctness of all the kernel code around &dma_fence through code
> > + * review and testing is tricky for a few reasons:
> > + *
> > + * * It is a cross-driver contract, and therefore all drivers must follow the
> > + * same rules for lock nesting order, calling contexts for various functions
> > + * and anything else significant for in-kernel interfaces. But it is also
> > + * impossible to test all drivers in a single machine, hence brute-force N vs.
> > + * N testing of all combinations is impossible. Even just limiting to the
> > + * possible combinations is infeasible.
> > + *
> > + * * There is an enormous amount of driver code involved. For render drivers
> > + * there's the tail of command submission, after fences are published,
> > + * scheduler code, interrupt and workers to process job completion,
> > + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> > + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
> > + * and &shrinker. For modesetting drivers there's the commit tail functions
> > + * between when fences for an atomic modeset are published, and when the
> > + * corresponding vblank completes, including any interrupt processing and
> > + * related workers. Auditing all that code, across all drivers, is not
> > + * feasible.
> > + *
> > + * * Due to how many other subsystems are involved and the locking hierarchies
> > + * this pulls in there is extremely thin wiggle-room for driver-specific
> > + * differences. &dma_fence interacts with almost all of the core memory
> > + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> > + * dma_resv_unlock(). On the other side it also interacts through all
> > + * allocation sites through &mmu_notifier and &shrinker.
> > + *
> > + * Furthermore lockdep does not handle cross-release dependencies, which means
> > + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> > + * at runtime with some quick testing. The simplest example is one thread
> > + * waiting on a &dma_fence while holding a lock::
> > + *
> > + * lock(A);
> > + * dma_fence_wait(B);
> > + * unlock(A);
> > + *
> > + * while the other thread is stuck trying to acquire the same lock, which
> > + * prevents it from signalling the fence the previous thread is stuck waiting
> > + * on::
> > + *
> > + * lock(A);
> > + * unlock(A);
> > + * dma_fence_signal(B);
> > + *
> > + * By manually annotating all code relevant to signalling a &dma_fence we can
> > + * teach lockdep about these dependencies, which also helps with the validation
> > + * headache since now lockdep can check all the rules for us::
> > + *
> > + * cookie = dma_fence_begin_signalling();
> > + * lock(A);
> > + * unlock(A);
> > + * dma_fence_signal(B);
> > + * dma_fence_end_signalling(cookie);
> > + *
> > + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> > + * annotate critical sections the following rules need to be observed:
> > + *
> > + * * All code necessary to complete a &dma_fence must be annotated, from the
> > + * point where a fence is accessible to other threads, to the point where
> > + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> > + * and due to the very strict rules and many corner cases it is infeasible to
> > + * catch these just with review or normal stress testing.
> > + *
> > + * * &struct dma_resv deserves a special note, since the readers are only
> > + * protected by rcu. This means the signalling critical section starts as soon
> > + * as the new fences are installed, even before dma_resv_unlock() is called.
> > + *
> > + * * The only exception are fast paths and opportunistic signalling code, which
> > + * calls dma_fence_signal() purely as an optimization, but is not required to
> > + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> > + * which calls dma_fence_signal(), while the mandatory completion path goes
> > + * through a hardware interrupt and possible job completion worker.
> > + *
> > + * * To aid composability of code, the annotations can be freely nested, as long
> > + * as the overall locking hierarchy is consistent. The annotations also work
> > + * both in interrupt and process context. Due to implementation details this
> > + * requires that callers pass an opaque cookie from
> > + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> > + *
> > + * * Validation against the cross driver contract is implemented by priming
> > + * lockdep with the relevant hierarchy at boot-up. This means even just
> > + * testing with a single device is enough to validate a driver, at least as
> > + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> > + * concerned.
> > + */
> > +#ifdef CONFIG_LOCKDEP
> > +struct lockdep_map dma_fence_lockdep_map = {
> > + .name = "dma_fence_map"
> > +};
> > +
> > +/**
> > + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> > + *
> > + * Drivers should use this to annotate the beginning of any code section
> > + * required to eventually complete &dma_fence by calling dma_fence_signal().
> > + *
> > + * The end of these critical sections are annotated with
> > + * dma_fence_end_signalling().
> > + *
> > + * Returns:
> > + *
> > + * Opaque cookie needed by the implementation, which needs to be passed to
> > + * dma_fence_end_signalling().
> > + */
> > +bool dma_fence_begin_signalling(void)
> > +{
> > + /* explicitly nesting ... */
> > + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> > + return true;
> > +
> > + /* rely on might_sleep check for soft/hardirq locks */
> > + if (in_atomic())
> > + return true;
> > +
> > + /* ... and non-recursive readlock */
> > + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> > +
> > + return false;
> > +}
> > +EXPORT_SYMBOL(dma_fence_begin_signalling);
> > +
> > +/**
> > + * dma_fence_end_signalling - end a critical DMA fence signalling section
> > + *
> > + * Closes a critical section annotation opened by dma_fence_begin_signalling().
> > + */
> > +void dma_fence_end_signalling(bool cookie)
> > +{
> > + if (cookie)
> > + return;
> > +
> > + lock_release(&dma_fence_lockdep_map, _RET_IP_);
> > +}
> > +EXPORT_SYMBOL(dma_fence_end_signalling);
> > +
> > +void __dma_fence_might_wait(void)
> > +{
> > + bool tmp;
> > +
> > + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> > + if (tmp)
> > + lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> > + lock_map_acquire(&dma_fence_lockdep_map);
> > + lock_map_release(&dma_fence_lockdep_map);
> > + if (tmp)
> > + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> > +}
> > +#endif
> > +
> > +
> > /**
> > * dma_fence_signal_locked - signal completion of a fence
> > * @fence: the fence to signal
> > @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
> > {
> > unsigned long flags;
> > int ret;
> > + bool tmp;
> >
> > if (!fence)
> > return -EINVAL;
> >
> > + tmp = dma_fence_begin_signalling();
> > +
> > spin_lock_irqsave(fence->lock, flags);
> > ret = dma_fence_signal_locked(fence);
> > spin_unlock_irqrestore(fence->lock, flags);
> >
> > + dma_fence_end_signalling(tmp);
> > +
> > return ret;
> > }
> > EXPORT_SYMBOL(dma_fence_signal);
> > @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
> >
> > might_sleep();
> >
> > + __dma_fence_might_wait();
> > +
> > trace_dma_fence_wait_start(fence);
> > if (fence->ops->wait)
> > ret = fence->ops->wait(fence, intr, timeout);
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index 3347c54f3a87..3f288f7db2ef 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> > } while (1);
> > }
> >
> > +#ifdef CONFIG_LOCKDEP
> > +bool dma_fence_begin_signalling(void);
> > +void dma_fence_end_signalling(bool cookie);
> > +#else
> > +static inline bool dma_fence_begin_signalling(void)
> > +{
> > + return true;
> > +}
> > +static inline void dma_fence_end_signalling(bool cookie) {}
> > +static inline void __dma_fence_might_wait(void) {}
> > +#endif
> > +
> > int dma_fence_signal(struct dma_fence *fence);
> > int dma_fence_signal_locked(struct dma_fence *fence);
> > signed long dma_fence_default_wait(struct dma_fence *fence,
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 106+ messages in thread* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 9:21 ` Daniel Vetter
@ 2020-06-04 9:26 ` Chris Wilson
2020-06-04 9:36 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Chris Wilson @ 2020-06-04 9:26 UTC (permalink / raw)
To: Thomas Hellström, Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
Quoting Daniel Vetter (2020-06-04 10:21:46)
> On Thu, Jun 4, 2020 at 10:57 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
> >
> >
> > On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > ...
> > > Thread A:
> > >
> > > mutex_lock(A);
> > > mutex_unlock(A);
> > >
> > > dma_fence_signal();
> > >
> > > Thread B:
> > >
> > > mutex_lock(A);
> > > dma_fence_wait();
> > > mutex_unlock(A);
> > >
> > > Thread B is blocked on A signalling the fence, but A never gets around
> > > to that because it cannot acquire the lock A.
> > >
> > > Note that dma_fence_wait() is allowed to be nested within
> > > dma_fence_begin/end_signalling sections. To allow this to happen the
> > > read lock needs to be upgraded to a write lock, which means that any
> > > other lock is acquired between the dma_fence_begin_signalling() call and
> > > the call to dma_fence_wait(), and still held, this will result in an
> > > immediate lockdep complaint. The only other option would be to not
> > > annotate such calls, defeating the point. Therefore these annotations
> > > cannot be sprinkled over the code entirely mindless to avoid false
> > > positives.
> >
> > Just realized, isn't that example actually a true positive, or at least
> > a great candidate for a true positive, since if another thread reenters
> > that signaling path, it will block on that mutex, and the fence would
> > never be signaled unless there is another signaling path?
>
> Not sure I understand fully, but I think the answer is "it's complicated".
See cd8084f91c02 ("locking/lockdep: Apply crossrelease to completions")
dma_fence usage here is nothing but another name for a completion.
-Chris
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 9:26 ` Chris Wilson
@ 2020-06-04 9:36 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 9:36 UTC (permalink / raw)
To: Chris Wilson
Cc: linux-rdma, Thomas Hellström, LKML, amd-gfx list,
Christian König, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter,
open list:DMA BUFFER SHARING FRAMEWORK,
Intel Graphics Development, Mika Kuoppala
On Thu, Jun 4, 2020 at 11:27 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2020-06-04 10:21:46)
> > On Thu, Jun 4, 2020 at 10:57 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > >
> > >
> > > On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > > ...
> > > > Thread A:
> > > >
> > > > mutex_lock(A);
> > > > mutex_unlock(A);
> > > >
> > > > dma_fence_signal();
> > > >
> > > > Thread B:
> > > >
> > > > mutex_lock(A);
> > > > dma_fence_wait();
> > > > mutex_unlock(A);
> > > >
> > > > Thread B is blocked on A signalling the fence, but A never gets around
> > > > to that because it cannot acquire the lock A.
> > > >
> > > > Note that dma_fence_wait() is allowed to be nested within
> > > > dma_fence_begin/end_signalling sections. To allow this to happen the
> > > > read lock needs to be upgraded to a write lock, which means that any
> > > > other lock is acquired between the dma_fence_begin_signalling() call and
> > > > the call to dma_fence_wait(), and still held, this will result in an
> > > > immediate lockdep complaint. The only other option would be to not
> > > > annotate such calls, defeating the point. Therefore these annotations
> > > > cannot be sprinkled over the code entirely mindless to avoid false
> > > > positives.
> > >
> > > Just realized, isn't that example actually a true positive, or at least
> > > a great candidate for a true positive, since if another thread reenters
> > > that signaling path, it will block on that mutex, and the fence would
> > > never be signaled unless there is another signaling path?
> >
> > Not sure I understand fully, but I think the answer is "it's complicated".
>
> See cd8084f91c02 ("locking/lockdep: Apply crossrelease to completions")
>
> dma_fence usage here is nothing but another name for a completion.
Quoting from my previous cover letter:
"I've dragged my feet for years on this, hoping that cross-release lockdep
would do this for us, but well that never really happened unfortunately.
So here we are."
I discussed this with Peter, cross-release not getting in is pretty
final it seems. The trouble is false positives without explicit
begin/end annotations reviewed by humans - ime from just these few
examples you just can't guess this stuff by computers, you need real
brains thinking about all the edge cases, and where exactly the
critical section starts and ends. Without that you're just going to
drown in a sea of false positives and yuck.
So yeah I had hopes for cross-release too, unfortunately that was
entirely in vain and a distraction.
Now I guess it would be nice if there's a per-class
completion_begin/end annotation for the more generic problem. But then
also most people don't have a cross-driver completion api contract
like dma_fence is, with some of the most ridiculous over the top
constraints of what's possible and what's not possible on each side of
the cross-release. We do have a bit of an outsized benefit (in pain
reduction) vs cost ratio here.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] [PATCH] dma-fence: basic lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
2020-06-04 8:57 ` Thomas Hellström (Intel)
@ 2020-06-05 13:29 ` Daniel Vetter
2020-06-05 14:30 ` Thomas Hellström (Intel)
2020-06-11 9:57 ` Maarten Lankhorst
2020-06-10 14:21 ` [Intel-gfx] [PATCH 03/18] " Tvrtko Ursulin
` (2 subsequent siblings)
4 siblings, 2 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-05 13:29 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Thomas Hellstrom,
Daniel Vetter, linux-media, Christian König, Mika Kuoppala
Design is similar to the lockdep annotations for workers, but with
some twists:
- We use a read-lock for the execution/worker/completion side, so that
this explicit annotation can be more liberally sprinkled around.
With read locks lockdep isn't going to complain if the read-side
isn't nested the same way under all circumstances, so ABBA deadlocks
are ok. Which they are, since this is an annotation only.
- We're using non-recursive lockdep read lock mode, since in recursive
read lock mode lockdep does not catch read side hazards. And we
_very_ much want read side hazards to be caught. For full details of
this limitation see
commit e91498589746065e3ae95d9a00b068e525eec34f
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed Aug 23 13:13:11 2017 +0200
locking/lockdep/selftests: Add mixed read-write ABBA tests
- To allow nesting of the read-side explicit annotations we explicitly
keep track of the nesting. lock_is_held() allows us to do that.
- The wait-side annotation is a write lock, and entirely done within
dma_fence_wait() for everyone by default.
- To be able to freely annotate helper functions I want to make it ok
to call dma_fence_begin/end_signalling from soft/hardirq context.
First attempt was using the hardirq locking context for the write
side in lockdep, but this forces all normal spinlocks nested within
dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
The approach now is to simply check in_atomic(), and for these cases
entirely rely on the might_sleep() check in dma_fence_wait(). That
will catch any wrong nesting against spinlocks from soft/hardirq
contexts.
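The nesting and atomic-context handling above boils down to a small
cookie protocol. As a rough userspace model (not kernel code;
model_read_held and model_in_atomic are hypothetical stand-ins for
lock_is_held_type() on the annotation map and for in_atomic()), it
could look like this:

```c
/* Userspace model of the cookie protocol; illustration only. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for lockdep state and in_atomic(). */
int  model_read_held;   /* 1 while the annotation read lock is held */
bool model_in_atomic;   /* set to mimic soft/hardirq context */

/* Models dma_fence_begin_signalling(): true means "nothing acquired". */
bool model_begin_signalling(void)
{
	if (model_read_held)    /* nested section: the outer one owns the lock */
		return true;
	if (model_in_atomic)    /* atomic context: rely on might_sleep() instead */
		return true;
	model_read_held = 1;    /* outermost section takes the read lock */
	return false;
}

/* Models dma_fence_end_signalling(): only the owner releases. */
void model_end_signalling(bool cookie)
{
	if (cookie)
		return;
	assert(model_read_held == 1);
	model_read_held = 0;
}
```

Only the outermost begin/end pair in process context touches the lock,
which is why the cookie has to be handed back to the end call unchanged.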
The idea here is that every code path that's critical for eventually
signalling a dma_fence should be annotated with
dma_fence_begin/end_signalling. The annotation ideally starts right
after a dma_fence is published (added to a dma_resv, exposed as a
sync_file fd, attached to a drm_syncobj fd, or anything else that
makes the dma_fence visible to other kernel threads), up to and
including the dma_fence_wait(). Examples are irq handlers, the
scheduler rt threads, the tail of execbuf (after the corresponding
fences are visible), any workers that end up signalling dma_fences and
really anything else. Not annotated should be code paths that only
complete fences opportunistically as the gpu progresses, like e.g.
shrinker/eviction code.
The main class of deadlocks this is supposed to catch is:
Thread A:
mutex_lock(A);
mutex_unlock(A);
dma_fence_signal();
Thread B:
mutex_lock(A);
dma_fence_wait();
mutex_unlock(A);
Thread B is blocked on A signalling the fence, but A never gets around
to that because it cannot acquire the lock A.
Note that dma_fence_wait() is allowed to be nested within
dma_fence_begin/end_signalling sections. To allow this to happen the
read lock needs to be upgraded to a write lock, which means that if any
other lock is acquired between the dma_fence_begin_signalling() call and
the call to dma_fence_wait(), and is still held, this will result in an
immediate lockdep complaint. The only other option would be to not
annotate such calls, defeating the point. Therefore these annotations
cannot be sprinkled over the code entirely mindlessly to avoid false
positives.
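To illustrate why the read/write annotation catches the deadlock class
above, and why a nested wait with another lock held complains
immediately, here is a heavily simplified userspace model of a
lock-order checker. It is not how lockdep is implemented; the lock
classes, struct model_thread and the model_* helpers are all made up
for illustration. Signalling sections take FENCE_MAP like an ordinary
lock, and a wait records a dependency from every other held lock to
FENCE_MAP:

```c
/* Toy lock-order checker; illustration only, not lockdep. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { FENCE_MAP, LOCK_A, NUM_CLASSES };    /* hypothetical lock classes */
enum { MAX_HELD = 8 };

static bool edge[NUM_CLASSES][NUM_CLASSES]; /* edge[x][y]: y taken while x held */

struct model_thread {
	int held[MAX_HELD];
	int depth;
};

void model_reset(void)
{
	memset(edge, 0, sizeof(edge));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NUM_CLASSES; n++)
		if (edge[from][n] && !seen[n] && reachable(n, to, seen))
			return true;
	return false;
}

/* Record "cls is taken while t's locks are held"; false on an inversion. */
static bool record_deps(const struct model_thread *t, int cls)
{
	bool ok = true;

	for (int i = 0; i < t->depth; i++) {
		int h = t->held[i];
		bool seen[NUM_CLASSES] = { false };

		if (h == cls)   /* models dropping the read side before the write */
			continue;
		if (reachable(cls, h, seen))
			ok = false;     /* the reverse order was seen before */
		edge[h][cls] = true;
	}
	return ok;
}

bool model_acquire(struct model_thread *t, int cls)
{
	bool ok = record_deps(t, cls);

	assert(t->depth < MAX_HELD);
	t->held[t->depth++] = cls;
	return ok;
}

void model_release(struct model_thread *t)
{
	assert(t->depth > 0);
	t->depth--;
}

/* Models the wait side: write-acquire FENCE_MAP and release it again. */
bool model_fence_wait(const struct model_thread *t)
{
	return record_deps(t, FENCE_MAP);
}
```

In this model a signaller that enters a signalling section and then
takes A records FENCE_MAP -> A, so a waiter that holds A and then waits
asks for A -> FENCE_MAP and is rejected. The same happens within a
single thread if A is taken between the begin call and the wait, while
a wait directly inside a signalling section with nothing else held
passes.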
Originally I hoped that the cross-release lockdep extensions would
alleviate the need for explicit annotations:
https://lwn.net/Articles/709849/
But there's a few reasons why that's not an option:
- It's not happening in upstream, since it got reverted due to too
many false positives:
commit e966eaeeb623f09975ef362c2866fae6f86844f9
Author: Ingo Molnar <mingo@kernel.org>
Date: Tue Dec 12 12:31:16 2017 +0100
locking/lockdep: Remove the cross-release locking checks
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.
- cross-release uses the complete() call to annotate the end of
critical sections, for dma_fence that would be dma_fence_signal().
But we do not want all dma_fence_signal() calls to be treated as
critical, since many are opportunistic cleanup of gpu requests. If
these get stuck there's still the main completion interrupt and
workers who can unblock everyone. Automatically annotating all
dma_fence_signal() calls would hence cause false positives.
- cross-release had some educated guesses for when a critical section
starts, like fresh syscall or fresh work callback. This would again
cause false positives without explicit annotations, since for
dma_fence the critical section only starts when we publish a fence.
- Furthermore there can be cases where a thread never does a
dma_fence_signal, but is still critical for reaching completion of
fences. One example would be a scheduler kthread which picks up jobs
and pushes them into hardware, where the interrupt handler or
another completion thread calls dma_fence_signal(). But if the
scheduler thread hangs, then all the fences hang, hence we need to
manually annotate it. cross-release aimed to solve this by chaining
cross-release dependencies, but the dependency from scheduler thread
to the completion interrupt handler goes through hw where
cross-release code can't observe it.
In short, without manual annotations and careful review of the start
and end of critical sections, cross-release dependency tracking doesn't
work. We need explicit annotations.
v2: handle soft/hardirq ctx better against write side and don't forget
EXPORT_SYMBOL, drivers can't use this otherwise.
v3: Kerneldoc.
v4: Some spelling fixes from Mika
v5: Amend commit message to explain in detail why cross-release isn't
the solution.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 12 +-
drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
include/linux/dma-fence.h | 12 ++
3 files changed, 182 insertions(+), 3 deletions(-)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 63dec76d1d8d..05d856131140 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
.. kernel-doc:: drivers/dma-buf/dma-buf.c
:doc: cpu access
-Fence Poll Support
-~~~~~~~~~~~~~~~~~~
+Implicit Fence Poll Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. kernel-doc:: drivers/dma-buf/dma-buf.c
- :doc: fence polling
+ :doc: implicit fence polling
Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -133,6 +133,12 @@ DMA Fences
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: DMA fences overview
+DMA Fence Signalling Annotations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: fence signalling annotation
+
DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 656e9ac2d028..0005bc002529 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
}
EXPORT_SYMBOL(dma_fence_context_alloc);
+/**
+ * DOC: fence signalling annotation
+ *
+ * Proving correctness of all the kernel code around &dma_fence through code
+ * review and testing is tricky for a few reasons:
+ *
+ * * It is a cross-driver contract, and therefore all drivers must follow the
+ * same rules for lock nesting order, calling contexts for various functions
+ * and anything else significant for in-kernel interfaces. But it is also
+ * impossible to test all drivers in a single machine, hence brute-force N vs.
+ * N testing of all combinations is impossible. Even just limiting to the
+ * possible combinations is infeasible.
+ *
+ * * There is an enormous amount of driver code involved. For render drivers
+ * there's the tail of command submission, after fences are published,
+ * scheduler code, interrupt and workers to process job completion,
+ * and timeout, gpu reset and gpu hang recovery code. Plus for integration
+ * with core mm we have &mmu_notifier, respectively &mmu_interval_notifier,
+ * and &shrinker. For modesetting drivers there's the commit tail functions
+ * between when fences for an atomic modeset are published, and when the
+ * corresponding vblank completes, including any interrupt processing and
+ * related workers. Auditing all that code, across all drivers, is not
+ * feasible.
+ *
+ * * Due to how many other subsystems are involved and the locking hierarchies
+ * this pulls in there is extremely thin wiggle-room for driver-specific
+ * differences. &dma_fence interacts with almost all of the core memory
+ * handling through page fault handlers via &dma_resv, dma_resv_lock() and
+ * dma_resv_unlock(). On the other side it also interacts through all
+ * allocation sites through &mmu_notifier and &shrinker.
+ *
+ * Furthermore lockdep does not handle cross-release dependencies, which means
+ * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
+ * at runtime with some quick testing. The simplest example is one thread
+ * waiting on a &dma_fence while holding a lock::
+ *
+ * lock(A);
+ * dma_fence_wait(B);
+ * unlock(A);
+ *
+ * while the other thread is stuck trying to acquire the same lock, which
+ * prevents it from signalling the fence the previous thread is stuck waiting
+ * on::
+ *
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ *
+ * By manually annotating all code relevant to signalling a &dma_fence we can
+ * teach lockdep about these dependencies, which also helps with the validation
+ * headache since now lockdep can check all the rules for us::
+ *
+ * cookie = dma_fence_begin_signalling();
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ * dma_fence_end_signalling(cookie);
+ *
+ * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
+ * annotate critical sections the following rules need to be observed:
+ *
+ * * All code necessary to complete a &dma_fence must be annotated, from the
+ * point where a fence is accessible to other threads, to the point where
+ * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
+ * and due to the very strict rules and many corner cases it is infeasible to
+ * catch these just with review or normal stress testing.
+ *
+ * * &struct dma_resv deserves a special note, since the readers are only
+ * protected by rcu. This means the signalling critical section starts as soon
+ * as the new fences are installed, even before dma_resv_unlock() is called.
+ *
+ * * The only exception are fast paths and opportunistic signalling code, which
+ * calls dma_fence_signal() purely as an optimization, but is not required to
+ * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
+ * which calls dma_fence_signal(), while the mandatory completion path goes
+ * through a hardware interrupt and possible job completion worker.
+ *
+ * * To aid composability of code, the annotations can be freely nested, as long
+ * as the overall locking hierarchy is consistent. The annotations also work
+ * both in interrupt and process context. Due to implementation details this
+ * requires that callers pass an opaque cookie from
+ * dma_fence_begin_signalling() to dma_fence_end_signalling().
+ *
+ * * Validation against the cross driver contract is implemented by priming
+ * lockdep with the relevant hierarchy at boot-up. This means even just
+ * testing with a single device is enough to validate a driver, at least as
+ * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
+ * concerned.
+ */
+#ifdef CONFIG_LOCKDEP
+struct lockdep_map dma_fence_lockdep_map = {
+ .name = "dma_fence_map"
+};
+
+/**
+ * dma_fence_begin_signalling - begin a critical DMA fence signalling section
+ *
+ * Drivers should use this to annotate the beginning of any code section
+ * required to eventually complete &dma_fence by calling dma_fence_signal().
+ *
+ * The end of these critical sections are annotated with
+ * dma_fence_end_signalling().
+ *
+ * Returns:
+ *
+ * Opaque cookie needed by the implementation, which needs to be passed to
+ * dma_fence_end_signalling().
+ */
+bool dma_fence_begin_signalling(void)
+{
+ /* explicitly nesting ... */
+ if (lock_is_held_type(&dma_fence_lockdep_map, 1))
+ return true;
+
+ /* rely on might_sleep check for soft/hardirq locks */
+ if (in_atomic())
+ return true;
+
+ /* ... and non-recursive readlock */
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
+
+ return false;
+}
+EXPORT_SYMBOL(dma_fence_begin_signalling);
+
+/**
+ * dma_fence_end_signalling - end a critical DMA fence signalling section
+ *
+ * Closes a critical section annotation opened by dma_fence_begin_signalling().
+ */
+void dma_fence_end_signalling(bool cookie)
+{
+ if (cookie)
+ return;
+
+ lock_release(&dma_fence_lockdep_map, _RET_IP_);
+}
+EXPORT_SYMBOL(dma_fence_end_signalling);
+
+void __dma_fence_might_wait(void)
+{
+ bool tmp;
+
+ tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
+ if (tmp)
+ lock_release(&dma_fence_lockdep_map, _THIS_IP_);
+ lock_map_acquire(&dma_fence_lockdep_map);
+ lock_map_release(&dma_fence_lockdep_map);
+ if (tmp)
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
+}
+#endif
+
+
/**
* dma_fence_signal_locked - signal completion of a fence
* @fence: the fence to signal
@@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
{
unsigned long flags;
int ret;
+ bool tmp;
if (!fence)
return -EINVAL;
+ tmp = dma_fence_begin_signalling();
+
spin_lock_irqsave(fence->lock, flags);
ret = dma_fence_signal_locked(fence);
spin_unlock_irqrestore(fence->lock, flags);
+ dma_fence_end_signalling(tmp);
+
return ret;
}
EXPORT_SYMBOL(dma_fence_signal);
@@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
might_sleep();
+ __dma_fence_might_wait();
+
trace_dma_fence_wait_start(fence);
if (fence->ops->wait)
ret = fence->ops->wait(fence, intr, timeout);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 3347c54f3a87..3f288f7db2ef 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
} while (1);
}
+#ifdef CONFIG_LOCKDEP
+bool dma_fence_begin_signalling(void);
+void dma_fence_end_signalling(bool cookie);
+#else
+static inline bool dma_fence_begin_signalling(void)
+{
+ return true;
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
+static inline void __dma_fence_might_wait(void) {}
+#endif
+
int dma_fence_signal(struct dma_fence *fence);
int dma_fence_signal_locked(struct dma_fence *fence);
signed long dma_fence_default_wait(struct dma_fence *fence,
--
2.26.2
^ permalink raw reply related [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] dma-fence: basic lockdep annotations
2020-06-05 13:29 ` [Intel-gfx] [PATCH] " Daniel Vetter
@ 2020-06-05 14:30 ` Thomas Hellström (Intel)
2020-06-11 9:57 ` Maarten Lankhorst
1 sibling, 0 replies; 106+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-05 14:30 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellstrom, Daniel Vetter,
Mika Kuoppala, Christian König, linux-media
On 6/5/20 3:29 PM, Daniel Vetter wrote:
> Design is similar to the lockdep annotations for workers, but with
> some twists:
>
> - We use a read-lock for the execution/worker/completion side, so that
> this explicit annotation can be more liberally sprinkled around.
> With read locks lockdep isn't going to complain if the read-side
> isn't nested the same way under all circumstances, so ABBA deadlocks
> are ok. Which they are, since this is an annotation only.
>
> - We're using non-recursive lockdep read lock mode, since in recursive
> read lock mode lockdep does not catch read side hazards. And we
> _very_ much want read side hazards to be caught. For full details of
> this limitation see
>
> commit e91498589746065e3ae95d9a00b068e525eec34f
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Aug 23 13:13:11 2017 +0200
>
> locking/lockdep/selftests: Add mixed read-write ABBA tests
>
> - To allow nesting of the read-side explicit annotations we explicitly
> keep track of the nesting. lock_is_held() allows us to do that.
>
> - The wait-side annotation is a write lock, and entirely done within
> dma_fence_wait() for everyone by default.
>
> - To be able to freely annotate helper functions I want to make it ok
> to call dma_fence_begin/end_signalling from soft/hardirq context.
> First attempt was using the hardirq locking context for the write
> side in lockdep, but this forces all normal spinlocks nested within
> dma_fence_begin/end_signalling to be spinlocks. That bollocks.
>
> The approach now is to simple check in_atomic(), and for these cases
> entirely rely on the might_sleep() check in dma_fence_wait(). That
> will catch any wrong nesting against spinlocks from soft/hardirq
> contexts.
>
> The idea here is that every code path that's critical for eventually
> signalling a dma_fence should be annotated with
> dma_fence_begin/end_signalling. The annotation ideally starts right
> after a dma_fence is published (added to a dma_resv, exposed as a
> sync_file fd, attached to a drm_syncobj fd, or anything else that
> makes the dma_fence visible to other kernel threads), up to and
> including the dma_fence_wait(). Examples are irq handlers, the
> scheduler rt threads, the tail of execbuf (after the corresponding
> fences are visible), any workers that end up signalling dma_fences and
> really anything else. Not annotated should be code paths that only
> complete fences opportunistically as the gpu progresses, like e.g.
> shrinker/eviction code.
>
> The main class of deadlocks this is supposed to catch are:
>
> Thread A:
>
> mutex_lock(A);
> mutex_unlock(A);
>
> dma_fence_signal();
>
> Thread B:
>
> mutex_lock(A);
> dma_fence_wait();
> mutex_unlock(A);
>
> Thread B is blocked on A signalling the fence, but A never gets around
> to that because it cannot acquire the lock A.
>
> Note that dma_fence_wait() is allowed to be nested within
> dma_fence_begin/end_signalling sections. To allow this to happen the
> read lock needs to be upgraded to a write lock, which means that any
> other lock is acquired between the dma_fence_begin_signalling() call and
> the call to dma_fence_wait(), and still held, this will result in an
> immediate lockdep complaint. The only other option would be to not
> annotate such calls, defeating the point. Therefore these annotations
> cannot be sprinkled over the code entirely mindless to avoid false
> positives.
>
> Originally I hope that the cross-release lockdep extensions would
> alleviate the need for explicit annotations:
>
> https://lwn.net/Articles/709849/
>
> But there's a few reasons why that's not an option:
>
> - It's not happening in upstream, since it got reverted due to too
> many false positives:
>
> commit e966eaeeb623f09975ef362c2866fae6f86844f9
> Author: Ingo Molnar <mingo@kernel.org>
> Date: Tue Dec 12 12:31:16 2017 +0100
>
> locking/lockdep: Remove the cross-release locking checks
>
> This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
> while it found a number of old bugs initially, was also causing too many
> false positives that caused people to disable lockdep - which is arguably
> a worse overall outcome.
>
> - cross-release uses the complete() call to annotate the end of
> critical sections, for dma_fence that would be dma_fence_signal().
> But we do not want all dma_fence_signal() calls to be treated as
> critical, since many are opportunistic cleanup of gpu requests. If
> these get stuck there's still the main completion interrupt and
> workers who can unblock everyone. Automatically annotating all
> dma_fence_signal() calls would hence cause false positives.
>
> - cross-release had some educated guesses for when a critical section
> starts, like fresh syscall or fresh work callback. This would again
> cause false positives without explicit annotations, since for
> dma_fence the critical sections only starts when we publish a fence.
>
> - Furthermore there can be cases where a thread never does a
> dma_fence_signal, but is still critical for reaching completion of
> fences. One example would be a scheduler kthread which picks up jobs
> and pushes them into hardware, where the interrupt handler or
> another completion thread calls dma_fence_signal(). But if the
> scheduler thread hangs, then all the fences hang, hence we need to
> manually annotate it. cross-release aimed to solve this by chaining
> cross-release dependencies, but the dependency from scheduler thread
> to the completion interrupt handler goes through hw where
> cross-release code can't observe it.
>
> In short, without manual annotations and careful review of the start
> and end of critical sections, cross-relese dependency tracking doesn't
> work. We need explicit annotations.
>
> v2: handle soft/hardirq ctx better against write side and dont forget
> EXPORT_SYMBOL, drivers can't use this otherwise.
>
> v3: Kerneldoc.
>
> v4: Some spelling fixes from Mika
>
> v5: Amend commit message to explain in detail why cross-release isn't
> the solution.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH] dma-fence: basic lockdep annotations
2020-06-05 13:29 ` [Intel-gfx] [PATCH] " Daniel Vetter
2020-06-05 14:30 ` Thomas Hellström (Intel)
@ 2020-06-11 9:57 ` Maarten Lankhorst
1 sibling, 0 replies; 106+ messages in thread
From: Maarten Lankhorst @ 2020-06-11 9:57 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellstrom, Daniel Vetter,
linux-media, Christian König, Mika Kuoppala
Op 05-06-2020 om 15:29 schreef Daniel Vetter:
> Design is similar to the lockdep annotations for workers, but with
> some twists:
>
> - We use a read-lock for the execution/worker/completion side, so that
> this explicit annotation can be more liberally sprinkled around.
> With read locks lockdep isn't going to complain if the read-side
> isn't nested the same way under all circumstances, so ABBA deadlocks
> are ok. Which they are, since this is an annotation only.
>
> - We're using non-recursive lockdep read lock mode, since in recursive
> read lock mode lockdep does not catch read side hazards. And we
> _very_ much want read side hazards to be caught. For full details of
> this limitation see
>
> commit e91498589746065e3ae95d9a00b068e525eec34f
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Aug 23 13:13:11 2017 +0200
>
> locking/lockdep/selftests: Add mixed read-write ABBA tests
>
> - To allow nesting of the read-side explicit annotations we explicitly
> keep track of the nesting. lock_is_held() allows us to do that.
>
> - The wait-side annotation is a write lock, and entirely done within
> dma_fence_wait() for everyone by default.
>
> - To be able to freely annotate helper functions I want to make it ok
> to call dma_fence_begin/end_signalling from soft/hardirq context.
> First attempt was using the hardirq locking context for the write
> side in lockdep, but this forces all normal spinlocks nested within
> dma_fence_begin/end_signalling to be spinlocks. That bollocks.
>
> The approach now is to simple check in_atomic(), and for these cases
> entirely rely on the might_sleep() check in dma_fence_wait(). That
> will catch any wrong nesting against spinlocks from soft/hardirq
> contexts.
>
> The idea here is that every code path that's critical for eventually
> signalling a dma_fence should be annotated with
> dma_fence_begin/end_signalling. The annotation ideally starts right
> after a dma_fence is published (added to a dma_resv, exposed as a
> sync_file fd, attached to a drm_syncobj fd, or anything else that
> makes the dma_fence visible to other kernel threads), up to and
> including the dma_fence_wait(). Examples are irq handlers, the
> scheduler rt threads, the tail of execbuf (after the corresponding
> fences are visible), any workers that end up signalling dma_fences and
> really anything else. Not annotated should be code paths that only
> complete fences opportunistically as the gpu progresses, like e.g.
> shrinker/eviction code.
>
> The main class of deadlocks this is supposed to catch are:
>
> Thread A:
>
> mutex_lock(A);
> mutex_unlock(A);
>
> dma_fence_signal();
>
> Thread B:
>
> mutex_lock(A);
> dma_fence_wait();
> mutex_unlock(A);
>
> Thread B is blocked on A signalling the fence, but A never gets around
> to that because it cannot acquire the lock A.
>
> Note that dma_fence_wait() is allowed to be nested within
> dma_fence_begin/end_signalling sections. To allow this to happen the
> read lock needs to be upgraded to a write lock, which means that if any
> other lock is acquired between the dma_fence_begin_signalling() call and
> the call to dma_fence_wait(), and is still held, this will result in an
> immediate lockdep complaint. The only other option would be to not
> annotate such calls, defeating the point. Therefore these annotations
> cannot be sprinkled over the code entirely mindlessly, to avoid false
> positives.
>
> Originally I hoped that the cross-release lockdep extensions would
> alleviate the need for explicit annotations:
>
> https://lwn.net/Articles/709849/
>
> But there's a few reasons why that's not an option:
>
> - It's not happening in upstream, since it got reverted due to too
> many false positives:
>
> commit e966eaeeb623f09975ef362c2866fae6f86844f9
> Author: Ingo Molnar <mingo@kernel.org>
> Date: Tue Dec 12 12:31:16 2017 +0100
>
> locking/lockdep: Remove the cross-release locking checks
>
> This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
> while it found a number of old bugs initially, was also causing too many
> false positives that caused people to disable lockdep - which is arguably
> a worse overall outcome.
>
> - cross-release uses the complete() call to annotate the end of
> critical sections, for dma_fence that would be dma_fence_signal().
> But we do not want all dma_fence_signal() calls to be treated as
> critical, since many are opportunistic cleanup of gpu requests. If
> these get stuck there's still the main completion interrupt and
> workers who can unblock everyone. Automatically annotating all
> dma_fence_signal() calls would hence cause false positives.
>
> - cross-release had some educated guesses for when a critical section
> starts, like fresh syscall or fresh work callback. This would again
> cause false positives without explicit annotations, since for
> dma_fence the critical section only starts when we publish a fence.
>
> - Furthermore there can be cases where a thread never does a
> dma_fence_signal, but is still critical for reaching completion of
> fences. One example would be a scheduler kthread which picks up jobs
> and pushes them into hardware, where the interrupt handler or
> another completion thread calls dma_fence_signal(). But if the
> scheduler thread hangs, then all the fences hang, hence we need to
> manually annotate it. cross-release aimed to solve this by chaining
> cross-release dependencies, but the dependency from scheduler thread
> to the completion interrupt handler goes through hw where
> cross-release code can't observe it.
>
> In short, without manual annotations and careful review of the start
> and end of critical sections, cross-release dependency tracking doesn't
> work. We need explicit annotations.
>
> v2: handle soft/hardirq ctx better against write side and don't forget
> EXPORT_SYMBOL, drivers can't use this otherwise.
>
> v3: Kerneldoc.
>
> v4: Some spelling fixes from Mika
>
> v5: Amend commit message to explain in detail why cross-release isn't
> the solution.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> Documentation/driver-api/dma-buf.rst | 12 +-
> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> include/linux/dma-fence.h | 12 ++
> 3 files changed, 182 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 63dec76d1d8d..05d856131140 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> :doc: cpu access
>
> -Fence Poll Support
> -~~~~~~~~~~~~~~~~~~
> +Implicit Fence Poll Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> - :doc: fence polling
> + :doc: implicit fence polling
>
> Kernel Functions and Structures Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> @@ -133,6 +133,12 @@ DMA Fences
> .. kernel-doc:: drivers/dma-buf/dma-fence.c
> :doc: DMA fences overview
>
> +DMA Fence Signalling Annotations
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> + :doc: fence signalling annotation
> +
> DMA Fences Functions Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 656e9ac2d028..0005bc002529 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> }
> EXPORT_SYMBOL(dma_fence_context_alloc);
>
> +/**
> + * DOC: fence signalling annotation
> + *
> + * Proving correctness of all the kernel code around &dma_fence through code
> + * review and testing is tricky for a few reasons:
> + *
> + * * It is a cross-driver contract, and therefore all drivers must follow the
> + * same rules for lock nesting order, calling contexts for various functions
> + * and anything else significant for in-kernel interfaces. But it is also
> + * impossible to test all drivers in a single machine, hence brute-force N vs.
> + * N testing of all combinations is impossible. Even just limiting to the
> + * possible combinations is infeasible.
> + *
> + * * There is an enormous amount of driver code involved. For render drivers
> + * there's the tail of command submission, after fences are published,
> + * scheduler code, interrupt and workers to process job completion,
> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> + * with core mm we have &mmu_notifier, respectively &mmu_interval_notifier,
> + * and &shrinker. For modesetting drivers there's the commit tail functions
> + * between when fences for an atomic modeset are published, and when the
> + * corresponding vblank completes, including any interrupt processing and
> + * related workers. Auditing all that code, across all drivers, is not
> + * feasible.
> + *
> + * * Due to how many other subsystems are involved and the locking hierarchies
> + * this pulls in there is extremely thin wiggle-room for driver-specific
> + * differences. &dma_fence interacts with almost all of the core memory
> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> + * dma_resv_unlock(). On the other side it also interacts through all
> + * allocation sites through &mmu_notifier and &shrinker.
> + *
> + * Furthermore lockdep does not handle cross-release dependencies, which means
> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> + * at runtime with some quick testing. The simplest example is one thread
> + * waiting on a &dma_fence while holding a lock::
> + *
> + * lock(A);
> + * dma_fence_wait(B);
> + * unlock(A);
> + *
> + * while the other thread is stuck trying to acquire the same lock, which
> + * prevents it from signalling the fence the previous thread is stuck waiting
> + * on::
> + *
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + *
> + * By manually annotating all code relevant to signalling a &dma_fence we can
> + * teach lockdep about these dependencies, which also helps with the validation
> + * headache since now lockdep can check all the rules for us::
> + *
> + * cookie = dma_fence_begin_signalling();
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + * dma_fence_end_signalling(cookie);
> + *
> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> + * annotate critical sections the following rules need to be observed:
> + *
> + * * All code necessary to complete a &dma_fence must be annotated, from the
> + * point where a fence is accessible to other threads, to the point where
> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> + * and due to the very strict rules and many corner cases it is infeasible to
> + * catch these just with review or normal stress testing.
> + *
> + * * &struct dma_resv deserves a special note, since the readers are only
> + * protected by rcu. This means the signalling critical section starts as soon
> + * as the new fences are installed, even before dma_resv_unlock() is called.
> + *
> + * * The only exception are fast paths and opportunistic signalling code, which
> + * calls dma_fence_signal() purely as an optimization, but is not required to
> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> + * which calls dma_fence_signal(), while the mandatory completion path goes
> + * through a hardware interrupt and possible job completion worker.
> + *
> + * * To aid composability of code, the annotations can be freely nested, as long
> + * as the overall locking hierarchy is consistent. The annotations also work
> + * both in interrupt and process context. Due to implementation details this
> + * requires that callers pass an opaque cookie from
> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> + *
> + * * Validation against the cross driver contract is implemented by priming
> + * lockdep with the relevant hierarchy at boot-up. This means even just
> + * testing with a single device is enough to validate a driver, at least as
> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> + * concerned.
> + */
> +#ifdef CONFIG_LOCKDEP
> +struct lockdep_map dma_fence_lockdep_map = {
> + .name = "dma_fence_map"
> +};
> +
> +/**
> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> + *
> + * Drivers should use this to annotate the beginning of any code section
> + * required to eventually complete &dma_fence by calling dma_fence_signal().
> + *
> + * The end of these critical sections is annotated with
> + * dma_fence_end_signalling().
> + *
> + * Returns:
> + *
> + * Opaque cookie needed by the implementation, which needs to be passed to
> + * dma_fence_end_signalling().
> + */
> +bool dma_fence_begin_signalling(void)
> +{
> + /* explicitly nesting ... */
> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> + return true;
> +
> + /* rely on might_sleep check for soft/hardirq locks */
> + if (in_atomic())
> + return true;
> +
> + /* ... and non-recursive readlock */
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> +
> + return false;
> +}
> +EXPORT_SYMBOL(dma_fence_begin_signalling);
> +
> +/**
> + * dma_fence_end_signalling - end a critical DMA fence signalling section
> + *
> + * Closes a critical section annotation opened by dma_fence_begin_signalling().
> + */
> +void dma_fence_end_signalling(bool cookie)
> +{
> + if (cookie)
> + return;
> +
> + lock_release(&dma_fence_lockdep_map, _RET_IP_);
> +}
> +EXPORT_SYMBOL(dma_fence_end_signalling);
> +
> +void __dma_fence_might_wait(void)
> +{
> + bool tmp;
> +
> + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> + if (tmp)
> + lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> + lock_map_acquire(&dma_fence_lockdep_map);
> + lock_map_release(&dma_fence_lockdep_map);
> + if (tmp)
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> +}
> +#endif
> +
> +
> /**
> * dma_fence_signal_locked - signal completion of a fence
> * @fence: the fence to signal
> @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
> {
> unsigned long flags;
> int ret;
> + bool tmp;
>
> if (!fence)
> return -EINVAL;
>
> + tmp = dma_fence_begin_signalling();
> +
> spin_lock_irqsave(fence->lock, flags);
> ret = dma_fence_signal_locked(fence);
> spin_unlock_irqrestore(fence->lock, flags);
>
> + dma_fence_end_signalling(tmp);
> +
> return ret;
> }
> EXPORT_SYMBOL(dma_fence_signal);
> @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
>
> might_sleep();
>
> + __dma_fence_might_wait();
> +
> trace_dma_fence_wait_start(fence);
> if (fence->ops->wait)
> ret = fence->ops->wait(fence, intr, timeout);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 3347c54f3a87..3f288f7db2ef 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> } while (1);
> }
>
> +#ifdef CONFIG_LOCKDEP
> +bool dma_fence_begin_signalling(void);
> +void dma_fence_end_signalling(bool cookie);
> +#else
> +static inline bool dma_fence_begin_signalling(void)
> +{
> + return true;
> +}
> +static inline void dma_fence_end_signalling(bool cookie) {}
> +static inline void __dma_fence_might_wait(void) {}
> +#endif
> +
> int dma_fence_signal(struct dma_fence *fence);
> int dma_fence_signal_locked(struct dma_fence *fence);
> signed long dma_fence_default_wait(struct dma_fence *fence,
As the original author of dma-fence, I enjoy seeing more lockdep annotations. Fence was always meant to be cross-driver, so strict driver annotations that can be verified by lockdep are a good thing. Because drivers have to interact with other drivers that use dma-fence, the rules must be the same for everyone, and the above code makes sense.
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
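For readers following the cookie handling in dma_fence_begin_signalling() and dma_fence_end_signalling() above, the nesting rule can be sketched in plain userspace C. This is only a toy model, not kernel code: model_read_held and model_in_atomic are hypothetical stand-ins for lock_is_held_type() and in_atomic(), and no lockdep state is involved. It illustrates that only the outermost begin/end pair takes and drops the annotation, while nested or atomic-context calls get a cookie of true and do nothing on the way out.

```c
/* Userspace toy model of the cookie protocol; not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for lockdep state and in_atomic(). */
static int model_read_held;   /* 1 while the modelled read-side annotation is held */
static bool model_in_atomic;  /* set by the caller to mimic soft/hardirq context */

/* Mirrors the patch: true means "nothing was acquired, end must not release". */
static bool model_begin_signalling(void)
{
	if (model_read_held)      /* explicit nesting: already annotated */
		return true;
	if (model_in_atomic)      /* rely on the might_sleep() check instead */
		return true;
	model_read_held = 1;      /* stands in for lock_acquire() */
	return false;
}

/* Releases the modelled annotation only for the call that acquired it. */
static void model_end_signalling(bool cookie)
{
	if (cookie)
		return;
	assert(model_read_held);
	model_read_held = 0;      /* stands in for lock_release() */
}
```

Calling model_begin_signalling() twice in a row and then unwinding in reverse order keeps model_read_held set until the outermost model_end_signalling() runs, which is the property the opaque cookie is there to provide.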
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
2020-06-04 8:57 ` Thomas Hellström (Intel)
2020-06-05 13:29 ` [Intel-gfx] [PATCH] " Daniel Vetter
@ 2020-06-10 14:21 ` Tvrtko Ursulin
2020-06-10 15:17 ` Daniel Vetter
2020-06-11 8:00 ` Chris Wilson
2020-06-12 7:06 ` [Intel-gfx] [PATCH] " Daniel Vetter
4 siblings, 1 reply; 106+ messages in thread
From: Tvrtko Ursulin @ 2020-06-10 14:21 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellstrom, Daniel Vetter,
Mika Kuoppala, Christian König, linux-media
On 04/06/2020 09:12, Daniel Vetter wrote:
> Design is similar to the lockdep annotations for workers, but with
> some twists:
>
> - We use a read-lock for the execution/worker/completion side, so that
> this explicit annotation can be more liberally sprinkled around.
> With read locks lockdep isn't going to complain if the read-side
> isn't nested the same way under all circumstances, so ABBA deadlocks
> are ok. Which they are, since this is an annotation only.
>
> - We're using non-recursive lockdep read lock mode, since in recursive
> read lock mode lockdep does not catch read side hazards. And we
> _very_ much want read side hazards to be caught. For full details of
> this limitation see
>
> commit e91498589746065e3ae95d9a00b068e525eec34f
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Aug 23 13:13:11 2017 +0200
>
> locking/lockdep/selftests: Add mixed read-write ABBA tests
>
> - To allow nesting of the read-side explicit annotations we explicitly
> keep track of the nesting. lock_is_held() allows us to do that.
>
> - The wait-side annotation is a write lock, and entirely done within
> dma_fence_wait() for everyone by default.
>
> - To be able to freely annotate helper functions I want to make it ok
> to call dma_fence_begin/end_signalling from soft/hardirq context.
> First attempt was using the hardirq locking context for the write
> side in lockdep, but this forces all normal spinlocks nested within
> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
>
> The approach now is to simply check in_atomic(), and for these cases
> entirely rely on the might_sleep() check in dma_fence_wait(). That
> will catch any wrong nesting against spinlocks from soft/hardirq
> contexts.
>
> The idea here is that every code path that's critical for eventually
> signalling a dma_fence should be annotated with
> dma_fence_begin/end_signalling. The annotation ideally starts right
> after a dma_fence is published (added to a dma_resv, exposed as a
> sync_file fd, attached to a drm_syncobj fd, or anything else that
> makes the dma_fence visible to other kernel threads), up to and
> including the dma_fence_wait(). Examples are irq handlers, the
> scheduler rt threads, the tail of execbuf (after the corresponding
> fences are visible), any workers that end up signalling dma_fences and
> really anything else. Not annotated should be code paths that only
> complete fences opportunistically as the gpu progresses, like e.g.
> shrinker/eviction code.
>
> The main class of deadlocks this is supposed to catch are:
>
> Thread A:
>
> mutex_lock(A);
> mutex_unlock(A);
>
> dma_fence_signal();
>
> Thread B:
>
> mutex_lock(A);
> dma_fence_wait();
> mutex_unlock(A);
>
> Thread B is blocked on A signalling the fence, but A never gets around
> to that because it cannot acquire the lock A.
>
> Note that dma_fence_wait() is allowed to be nested within
> dma_fence_begin/end_signalling sections. To allow this to happen the
> read lock needs to be upgraded to a write lock, which means that if any
> other lock is acquired between the dma_fence_begin_signalling() call and
> the call to dma_fence_wait(), and is still held, this will result in an
> immediate lockdep complaint. The only other option would be to not
> annotate such calls, defeating the point. Therefore these annotations
> cannot be sprinkled over the code entirely mindlessly, to avoid false
> positives.
>
> v2: handle soft/hardirq ctx better against write side and don't forget
> EXPORT_SYMBOL, drivers can't use this otherwise.
>
> v3: Kerneldoc.
>
> v4: Some spelling fixes from Mika
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> Documentation/driver-api/dma-buf.rst | 12 +-
> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> include/linux/dma-fence.h | 12 ++
> 3 files changed, 182 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 63dec76d1d8d..05d856131140 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> :doc: cpu access
>
> -Fence Poll Support
> -~~~~~~~~~~~~~~~~~~
> +Implicit Fence Poll Support
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> - :doc: fence polling
> + :doc: implicit fence polling
>
> Kernel Functions and Structures Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> @@ -133,6 +133,12 @@ DMA Fences
> .. kernel-doc:: drivers/dma-buf/dma-fence.c
> :doc: DMA fences overview
>
> +DMA Fence Signalling Annotations
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> + :doc: fence signalling annotation
> +
> DMA Fences Functions Reference
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 656e9ac2d028..0005bc002529 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> }
> EXPORT_SYMBOL(dma_fence_context_alloc);
>
> +/**
> + * DOC: fence signalling annotation
> + *
> + * Proving correctness of all the kernel code around &dma_fence through code
> + * review and testing is tricky for a few reasons:
> + *
> + * * It is a cross-driver contract, and therefore all drivers must follow the
> + * same rules for lock nesting order, calling contexts for various functions
> + * and anything else significant for in-kernel interfaces. But it is also
> + * impossible to test all drivers in a single machine, hence brute-force N vs.
> + * N testing of all combinations is impossible. Even just limiting to the
> + * possible combinations is infeasible.
> + *
> + * * There is an enormous amount of driver code involved. For render drivers
> + * there's the tail of command submission, after fences are published,
> + * scheduler code, interrupt and workers to process job completion,
> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> + * with core mm we have &mmu_notifier, respectively &mmu_interval_notifier,
> + * and &shrinker. For modesetting drivers there's the commit tail functions
> + * between when fences for an atomic modeset are published, and when the
> + * corresponding vblank completes, including any interrupt processing and
> + * related workers. Auditing all that code, across all drivers, is not
> + * feasible.
> + *
> + * * Due to how many other subsystems are involved and the locking hierarchies
> + * this pulls in there is extremely thin wiggle-room for driver-specific
> + * differences. &dma_fence interacts with almost all of the core memory
> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> + * dma_resv_unlock(). On the other side it also interacts through all
> + * allocation sites through &mmu_notifier and &shrinker.
> + *
> + * Furthermore lockdep does not handle cross-release dependencies, which means
> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> + * at runtime with some quick testing. The simplest example is one thread
> + * waiting on a &dma_fence while holding a lock::
> + *
> + * lock(A);
> + * dma_fence_wait(B);
> + * unlock(A);
> + *
> + * while the other thread is stuck trying to acquire the same lock, which
> + * prevents it from signalling the fence the previous thread is stuck waiting
> + * on::
> + *
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + *
> + * By manually annotating all code relevant to signalling a &dma_fence we can
> + * teach lockdep about these dependencies, which also helps with the validation
> + * headache since now lockdep can check all the rules for us::
> + *
> + * cookie = dma_fence_begin_signalling();
> + * lock(A);
> + * unlock(A);
> + * dma_fence_signal(B);
> + * dma_fence_end_signalling(cookie);
> + *
> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> + * annotate critical sections the following rules need to be observed:
> + *
> + * * All code necessary to complete a &dma_fence must be annotated, from the
> + * point where a fence is accessible to other threads, to the point where
> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> + * and due to the very strict rules and many corner cases it is infeasible to
> + * catch these just with review or normal stress testing.
> + *
> + * * &struct dma_resv deserves a special note, since the readers are only
> + * protected by rcu. This means the signalling critical section starts as soon
> + * as the new fences are installed, even before dma_resv_unlock() is called.
> + *
> + * * The only exception are fast paths and opportunistic signalling code, which
> + * calls dma_fence_signal() purely as an optimization, but is not required to
> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> + * which calls dma_fence_signal(), while the mandatory completion path goes
> + * through a hardware interrupt and possible job completion worker.
> + *
> + * * To aid composability of code, the annotations can be freely nested, as long
> + * as the overall locking hierarchy is consistent. The annotations also work
> + * both in interrupt and process context. Due to implementation details this
> + * requires that callers pass an opaque cookie from
> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> + *
> + * * Validation against the cross driver contract is implemented by priming
> + * lockdep with the relevant hierarchy at boot-up. This means even just
> + * testing with a single device is enough to validate a driver, at least as
> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> + * concerned.
> + */
> +#ifdef CONFIG_LOCKDEP
> +struct lockdep_map dma_fence_lockdep_map = {
> + .name = "dma_fence_map"
> +};
Maybe a stupid question because this is definitely complicated, but... if
you have a single/static/global lockdep map, doesn't this mean _all_
locks, from _all_ drivers happening to use dma-fences, will get recorded
in it? Will this work and not cause false positives?
Sounds like it could create a common link between two completely
unconnected usages. Because below you do add annotations to generic
dma_fence_signal and dma_fence_wait.
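To make the question concrete, here is a toy userspace model of how a single shared map class ends up linked to every lock taken around it. It is a sketch under heavy assumptions, not how lockdep is implemented: the class names, toy_acquire() and toy_release() are made up for illustration, and read versus write acquisition is not distinguished. It records "B was taken while A was held" edges and reports when a new acquisition would close a cycle.

```c
/* Toy lock-order graph; a hypothetical illustration, not lockdep itself. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8
#define MAX_HELD 8

/* Made-up lock classes: one shared fence map plus two driver mutexes. */
enum { CLASS_FENCE_MAP = 0, CLASS_MUTEX_A = 1, CLASS_MUTEX_B = 2 };

/* edge[a][b] is true once class b was acquired while class a was held. */
static bool edge[MAX_CLASSES][MAX_CLASSES];

struct toy_thread {
	int held[MAX_HELD];   /* classes currently held, in acquisition order */
	int nheld;
};

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (edge[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/* Returns false if acquiring 'cls' would close a cycle (potential deadlock). */
static bool toy_acquire(struct toy_thread *t, int cls)
{
	bool ok = true;

	assert(t->nheld < MAX_HELD);
	for (int i = 0; i < t->nheld; i++) {
		bool seen[MAX_CLASSES] = { false };

		/* New edge held[i] -> cls; an existing path cls -> held[i] is a cycle. */
		if (reachable(cls, t->held[i], seen))
			ok = false;
		edge[t->held[i]][cls] = true;
	}
	t->held[t->nheld++] = cls;
	return ok;
}

/* Drops the most recent hold of 'cls', if any. */
static void toy_release(struct toy_thread *t, int cls)
{
	for (int i = t->nheld - 1; i >= 0; i--) {
		if (t->held[i] == cls) {
			memmove(&t->held[i], &t->held[i + 1],
				(size_t)(t->nheld - i - 1) * sizeof(int));
			t->nheld--;
			return;
		}
	}
}
```

In this model a signalling path that takes mutex A inside the fence map, followed by a waiter that holds mutex A while acquiring the fence map, is flagged, while a waiter holding an unrelated mutex B is not. Whether real drivers stay free of false positives depends on their actual lock ordering, which this sketch cannot show.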
> +
> +/**
> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> + *
> + * Drivers should use this to annotate the beginning of any code section
> + * required to eventually complete &dma_fence by calling dma_fence_signal().
> + *
> + * The end of these critical sections is annotated with
> + * dma_fence_end_signalling().
> + *
> + * Returns:
> + *
> + * Opaque cookie needed by the implementation, which needs to be passed to
> + * dma_fence_end_signalling().
> + */
> +bool dma_fence_begin_signalling(void)
> +{
> + /* explicitly nesting ... */
> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> + return true;
> +
> + /* rely on might_sleep check for soft/hardirq locks */
> + if (in_atomic())
> + return true;
> +
> + /* ... and non-recursive readlock */
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
Would it work if the signalling path marked itself as a write lock? I am
thinking it would be nice to see in lockdep splats what are signals and
what are waits.
The recursive usage wouldn't work then, right? Would a write annotation on
the wait path work?
Regards,
Tvrtko
> +
> + return false;
> +}
> +EXPORT_SYMBOL(dma_fence_begin_signalling);
> +
> +/**
> + * dma_fence_end_signalling - end a critical DMA fence signalling section
> + *
> + * Closes a critical section annotation opened by dma_fence_begin_signalling().
> + */
> +void dma_fence_end_signalling(bool cookie)
> +{
> + if (cookie)
> + return;
> +
> + lock_release(&dma_fence_lockdep_map, _RET_IP_);
> +}
> +EXPORT_SYMBOL(dma_fence_end_signalling);
> +
> +void __dma_fence_might_wait(void)
> +{
> + bool tmp;
> +
> + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> + if (tmp)
> + lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> + lock_map_acquire(&dma_fence_lockdep_map);
> + lock_map_release(&dma_fence_lockdep_map);
> + if (tmp)
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> +}
> +#endif
> +
> +
> /**
> * dma_fence_signal_locked - signal completion of a fence
> * @fence: the fence to signal
> @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
> {
> unsigned long flags;
> int ret;
> + bool tmp;
>
> if (!fence)
> return -EINVAL;
>
> + tmp = dma_fence_begin_signalling();
> +
> spin_lock_irqsave(fence->lock, flags);
> ret = dma_fence_signal_locked(fence);
> spin_unlock_irqrestore(fence->lock, flags);
>
> + dma_fence_end_signalling(tmp);
> +
> return ret;
> }
> EXPORT_SYMBOL(dma_fence_signal);
> @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
>
> might_sleep();
>
> + __dma_fence_might_wait();
> +
> trace_dma_fence_wait_start(fence);
> if (fence->ops->wait)
> ret = fence->ops->wait(fence, intr, timeout);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 3347c54f3a87..3f288f7db2ef 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> } while (1);
> }
>
> +#ifdef CONFIG_LOCKDEP
> +bool dma_fence_begin_signalling(void);
> +void dma_fence_end_signalling(bool cookie);
> +#else
> +static inline bool dma_fence_begin_signalling(void)
> +{
> + return true;
> +}
> +static inline void dma_fence_end_signalling(bool cookie) {}
> +static inline void __dma_fence_might_wait(void) {}
> +#endif
> +
> int dma_fence_signal(struct dma_fence *fence);
> int dma_fence_signal_locked(struct dma_fence *fence);
> signed long dma_fence_default_wait(struct dma_fence *fence,
>
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-10 14:21 ` [Intel-gfx] [PATCH 03/18] " Tvrtko Ursulin
@ 2020-06-10 15:17 ` Daniel Vetter
2020-06-11 10:36 ` Tvrtko Ursulin
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-10 15:17 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 04/06/2020 09:12, Daniel Vetter wrote:
> > Design is similar to the lockdep annotations for workers, but with
> > some twists:
> >
> > - We use a read-lock for the execution/worker/completion side, so that
> > this explicit annotation can be more liberally sprinkled around.
> > With read locks lockdep isn't going to complain if the read-side
> > isn't nested the same way under all circumstances, so ABBA deadlocks
> > are ok. Which they are, since this is an annotation only.
> >
> > - We're using non-recursive lockdep read lock mode, since in recursive
> > read lock mode lockdep does not catch read side hazards. And we
> > _very_ much want read side hazards to be caught. For full details of
> > this limitation see
> >
> > commit e91498589746065e3ae95d9a00b068e525eec34f
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date: Wed Aug 23 13:13:11 2017 +0200
> >
> > locking/lockdep/selftests: Add mixed read-write ABBA tests
> >
> > - To allow nesting of the read-side explicit annotations we explicitly
> > keep track of the nesting. lock_is_held() allows us to do that.
> >
> > - The wait-side annotation is a write lock, and entirely done within
> > dma_fence_wait() for everyone by default.
> >
> > - To be able to freely annotate helper functions I want to make it ok
> > to call dma_fence_begin/end_signalling from soft/hardirq context.
> > First attempt was using the hardirq locking context for the write
> > side in lockdep, but this forces all normal spinlocks nested within
> > dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
> >
> > The approach now is to simply check in_atomic(), and for these cases
> > entirely rely on the might_sleep() check in dma_fence_wait(). That
> > will catch any wrong nesting against spinlocks from soft/hardirq
> > contexts.
> >
> > The idea here is that every code path that's critical for eventually
> > signalling a dma_fence should be annotated with
> > dma_fence_begin/end_signalling. The annotation ideally starts right
> > after a dma_fence is published (added to a dma_resv, exposed as a
> > sync_file fd, attached to a drm_syncobj fd, or anything else that
> > makes the dma_fence visible to other kernel threads), up to and
> > including the dma_fence_wait(). Examples are irq handlers, the
> > scheduler rt threads, the tail of execbuf (after the corresponding
> > fences are visible), any workers that end up signalling dma_fences and
> > really anything else. Not annotated should be code paths that only
> > complete fences opportunistically as the gpu progresses, like e.g.
> > shrinker/eviction code.
> >
> > The main class of deadlocks this is supposed to catch are:
> >
> > Thread A:
> >
> > mutex_lock(A);
> > mutex_unlock(A);
> >
> > dma_fence_signal();
> >
> > Thread B:
> >
> > mutex_lock(A);
> > dma_fence_wait();
> > mutex_unlock(A);
> >
> > Thread B is blocked on A signalling the fence, but A never gets around
> > to that because it cannot acquire the lock A.
> >
> > Note that dma_fence_wait() is allowed to be nested within
> > dma_fence_begin/end_signalling sections. To allow this to happen the
> > > read lock needs to be upgraded to a write lock, which means that if any
> > > other lock is acquired between the dma_fence_begin_signalling() call and
> > > the call to dma_fence_wait(), and still held, this will result in an
> > > immediate lockdep complaint. The only other option would be to not
> > > annotate such calls, defeating the point. Therefore, to avoid false
> > > positives, these annotations cannot be sprinkled over the code entirely
> > > mindlessly.
> >
> > v2: handle soft/hardirq ctx better against write side and dont forget
> > EXPORT_SYMBOL, drivers can't use this otherwise.
> >
> > v3: Kerneldoc.
> >
> > v4: Some spelling fixes from Mika
> >
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> > Documentation/driver-api/dma-buf.rst | 12 +-
> > drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> > include/linux/dma-fence.h | 12 ++
> > 3 files changed, 182 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> > index 63dec76d1d8d..05d856131140 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> > .. kernel-doc:: drivers/dma-buf/dma-buf.c
> > :doc: cpu access
> >
> > -Fence Poll Support
> > -~~~~~~~~~~~~~~~~~~
> > +Implicit Fence Poll Support
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > .. kernel-doc:: drivers/dma-buf/dma-buf.c
> > - :doc: fence polling
> > + :doc: implicit fence polling
> >
> > Kernel Functions and Structures Reference
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > @@ -133,6 +133,12 @@ DMA Fences
> > .. kernel-doc:: drivers/dma-buf/dma-fence.c
> > :doc: DMA fences overview
> >
> > +DMA Fence Signalling Annotations
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> > + :doc: fence signalling annotation
> > +
> > DMA Fences Functions Reference
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 656e9ac2d028..0005bc002529 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> > }
> > EXPORT_SYMBOL(dma_fence_context_alloc);
> >
> > +/**
> > + * DOC: fence signalling annotation
> > + *
> > + * Proving correctness of all the kernel code around &dma_fence through code
> > + * review and testing is tricky for a few reasons:
> > + *
> > + * * It is a cross-driver contract, and therefore all drivers must follow the
> > + * same rules for lock nesting order, calling contexts for various functions
> > + * and anything else significant for in-kernel interfaces. But it is also
> > + * impossible to test all drivers in a single machine, hence brute-force N vs.
> > + * N testing of all combinations is impossible. Even just limiting to the
> > + * possible combinations is infeasible.
> > + *
> > + * * There is an enormous amount of driver code involved. For render drivers
> > + * there's the tail of command submission, after fences are published,
> > + * scheduler code, interrupt and workers to process job completion,
> > + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> > + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
> > + * and &shrinker. For modesetting drivers there's the commit tail functions
> > + * between when fences for an atomic modeset are published, and when the
> > + * corresponding vblank completes, including any interrupt processing and
> > + * related workers. Auditing all that code, across all drivers, is not
> > + * feasible.
> > + *
> > + * * Due to how many other subsystems are involved and the locking hierarchies
> > + * this pulls in there is extremely thin wiggle-room for driver-specific
> > + * differences. &dma_fence interacts with almost all of the core memory
> > + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> > + * dma_resv_unlock(). On the other side it also interacts through all
> > + * allocation sites through &mmu_notifier and &shrinker.
> > + *
> > + * Furthermore lockdep does not handle cross-release dependencies, which means
> > + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> > + * at runtime with some quick testing. The simplest example is one thread
> > + * waiting on a &dma_fence while holding a lock::
> > + *
> > + * lock(A);
> > + * dma_fence_wait(B);
> > + * unlock(A);
> > + *
> > + * while the other thread is stuck trying to acquire the same lock, which
> > + * prevents it from signalling the fence the previous thread is stuck waiting
> > + * on::
> > + *
> > + * lock(A);
> > + * unlock(A);
> > + * dma_fence_signal(B);
> > + *
> > + * By manually annotating all code relevant to signalling a &dma_fence we can
> > + * teach lockdep about these dependencies, which also helps with the validation
> > + * headache since now lockdep can check all the rules for us::
> > + *
> > + * cookie = dma_fence_begin_signalling();
> > + * lock(A);
> > + * unlock(A);
> > + * dma_fence_signal(B);
> > + * dma_fence_end_signalling(cookie);
> > + *
> > + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> > + * annotate critical sections the following rules need to be observed:
> > + *
> > + * * All code necessary to complete a &dma_fence must be annotated, from the
> > + * point where a fence is accessible to other threads, to the point where
> > + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> > + * and due to the very strict rules and many corner cases it is infeasible to
> > + * catch these just with review or normal stress testing.
> > + *
> > + * * &struct dma_resv deserves a special note, since the readers are only
> > + * protected by rcu. This means the signalling critical section starts as soon
> > + * as the new fences are installed, even before dma_resv_unlock() is called.
> > + *
> > + * * The only exception are fast paths and opportunistic signalling code, which
> > + * calls dma_fence_signal() purely as an optimization, but is not required to
> > + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> > + * which calls dma_fence_signal(), while the mandatory completion path goes
> > + * through a hardware interrupt and possible job completion worker.
> > + *
> > + * * To aid composability of code, the annotations can be freely nested, as long
> > + * as the overall locking hierarchy is consistent. The annotations also work
> > + * both in interrupt and process context. Due to implementation details this
> > + * requires that callers pass an opaque cookie from
> > + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> > + *
> > + * * Validation against the cross driver contract is implemented by priming
> > + * lockdep with the relevant hierarchy at boot-up. This means even just
> > + * testing with a single device is enough to validate a driver, at least as
> > + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> > + * concerned.
> > + */
> > +#ifdef CONFIG_LOCKDEP
> > +struct lockdep_map dma_fence_lockdep_map = {
> > + .name = "dma_fence_map"
> > +};
>
> Maybe a stupid question because this is definitely complicated, but.. If
> you have a single/static/global lockdep map, doesn't this mean _all_
> locks, from _all_ drivers happening to use dma-fences will get recorded
> in it. Will this work and not cause false positives?
>
> Sounds like it could create a common link between two completely
> unconnected usages. Because below you do add annotations to generic
> dma_fence_signal and dma_fence_wait.
This is fully intentional. dma-fence is a cross-driver interface; if
every driver invents its own rules about how this should work we have
an unmaintainable and unreviewable mess.

I've typed up the full length rant already here:

https://lore.kernel.org/dri-devel/CAKMK7uGnFhbpuurRsnZ4dvRV9gQ_3-rmSJaoqSFY=+Kvepz_CA@mail.gmail.com/
> > +
> > +/**
> > + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> > + *
> > + * Drivers should use this to annotate the beginning of any code section
> > + * required to eventually complete &dma_fence by calling dma_fence_signal().
> > + *
> > + * The end of these critical sections are annotated with
> > + * dma_fence_end_signalling().
> > + *
> > + * Returns:
> > + *
> > + * Opaque cookie needed by the implementation, which needs to be passed to
> > + * dma_fence_end_signalling().
> > + */
> > +bool dma_fence_begin_signalling(void)
> > +{
> > + /* explicitly nesting ... */
> > + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> > + return true;
> > +
> > + /* rely on might_sleep check for soft/hardirq locks */
> > + if (in_atomic())
> > + return true;
> > +
> > + /* ... and non-recursive readlock */
> > + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
>
> Would it work if signalling path would mark itself as a write lock? I am
> thinking it would be nice to see in lockdep splats what are signals and
> what are waits.
Yeah it'd be nice to have a read vs write name for the lock. But we
already have this problem for e.g. flush_work(), from which I've
stolen this idea. So it's not really new. Essentially look at the
backtraces lockdep gives you, and reconstruct the deadlock. I'm hoping
that people will notice the special functions on the backtrace, e.g.
dma_fence_begin_signalling will be listed as offending function/lock
holder, and then read the kerneldoc.
> The recursive usage wouldn't work then right? Would write annotation on
> the wait path work?
Wait path is write annotations already, but yeah annotating the
signalling side as write would cause endless amounts of false
positives. It would also break composability of these annotations, e.g.
what I've done in amdgpu with annotations in tdr work in drm/scheduler,
annotations in the amdgpu gpu reset code and then also annotations in
atomic code, which all nest within each other in some call chains, but
not others. Dropping the recursion would break that and make it really
awkward to annotate such cases correctly.

And the recursion only works if it's read locks, otherwise lockdep
complains if you have inconsistent annotations on the signalling side
(which again would make it more or less impossible to annotate the
above case fully).
Cheers, Daniel
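
The nesting behaviour described above can be sketched in plain userspace C. This is only a model under stated assumptions: the map_held flag stands in for lock_is_held_type() on dma_fence_lockdep_map, the in_atomic() shortcut is left out, and model_gpu_reset() and model_tdr_work() are hypothetical call sites, not the real amdgpu or drm/scheduler functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for "current task holds dma_fence_lockdep_map as a reader". */
static bool map_held;
/* Count real acquire/release pairs, for illustration only. */
static int acquires, releases;

/* Model of dma_fence_begin_signalling(): true means "nothing to undo". */
static bool model_begin_signalling(void)
{
	if (map_held)		/* nested section: the outer one owns the map */
		return true;
	map_held = true;	/* outermost section takes the read lock */
	acquires++;
	return false;
}

/* Model of dma_fence_end_signalling(): only the outermost caller releases. */
static void model_end_signalling(bool cookie)
{
	if (cookie)
		return;
	assert(map_held);	/* a false cookie implies we took the map */
	map_held = false;
	releases++;
}

/* Hypothetical helper that carries its own annotation. */
static void model_gpu_reset(void)
{
	bool cookie = model_begin_signalling();
	/* ... work needed to complete fences ... */
	model_end_signalling(cookie);
}

/* Hypothetical worker that calls the helper inside its own section. */
static void model_tdr_work(void)
{
	bool cookie = model_begin_signalling();
	model_gpu_reset();	/* nests, must not end the outer section */
	model_end_signalling(cookie);
}
```

The same helper works whether it is called standalone or from inside another annotated section, which is the composability the cookie is there to provide: in the nested call chain only the outermost begin/end pair touches the map.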
>
> Regards,
>
> Tvrtko
>
> > +
> > + return false;
> > +}
> > +EXPORT_SYMBOL(dma_fence_begin_signalling);
> > +
> > +/**
> > + * dma_fence_end_signalling - end a critical DMA fence signalling section
> > + *
> > + * Closes a critical section annotation opened by dma_fence_begin_signalling().
> > + */
> > +void dma_fence_end_signalling(bool cookie)
> > +{
> > + if (cookie)
> > + return;
> > +
> > + lock_release(&dma_fence_lockdep_map, _RET_IP_);
> > +}
> > +EXPORT_SYMBOL(dma_fence_end_signalling);
> > +
> > +void __dma_fence_might_wait(void)
> > +{
> > + bool tmp;
> > +
> > + tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> > + if (tmp)
> > + lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> > + lock_map_acquire(&dma_fence_lockdep_map);
> > + lock_map_release(&dma_fence_lockdep_map);
> > + if (tmp)
> > + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> > +}
> > +#endif
> > +
> > +
> > /**
> > * dma_fence_signal_locked - signal completion of a fence
> > * @fence: the fence to signal
> > @@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
> > {
> > unsigned long flags;
> > int ret;
> > + bool tmp;
> >
> > if (!fence)
> > return -EINVAL;
> >
> > + tmp = dma_fence_begin_signalling();
> > +
> > spin_lock_irqsave(fence->lock, flags);
> > ret = dma_fence_signal_locked(fence);
> > spin_unlock_irqrestore(fence->lock, flags);
> >
> > + dma_fence_end_signalling(tmp);
> > +
> > return ret;
> > }
> > EXPORT_SYMBOL(dma_fence_signal);
> > @@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
> >
> > might_sleep();
> >
> > + __dma_fence_might_wait();
> > +
> > trace_dma_fence_wait_start(fence);
> > if (fence->ops->wait)
> > ret = fence->ops->wait(fence, intr, timeout);
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index 3347c54f3a87..3f288f7db2ef 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> > } while (1);
> > }
> >
> > +#ifdef CONFIG_LOCKDEP
> > +bool dma_fence_begin_signalling(void);
> > +void dma_fence_end_signalling(bool cookie);
> > +#else
> > +static inline bool dma_fence_begin_signalling(void)
> > +{
> > + return true;
> > +}
> > +static inline void dma_fence_end_signalling(bool cookie) {}
> > +static inline void __dma_fence_might_wait(void) {}
> > +#endif
> > +
> > int dma_fence_signal(struct dma_fence *fence);
> > int dma_fence_signal_locked(struct dma_fence *fence);
> > signed long dma_fence_default_wait(struct dma_fence *fence,
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-10 15:17 ` Daniel Vetter
@ 2020-06-11 10:36 ` Tvrtko Ursulin
2020-06-11 11:29 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Tvrtko Ursulin @ 2020-06-11 10:36 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On 10/06/2020 16:17, Daniel Vetter wrote:
> On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 04/06/2020 09:12, Daniel Vetter wrote:
>>> Design is similar to the lockdep annotations for workers, but with
>>> some twists:
>>>
>>> - We use a read-lock for the execution/worker/completion side, so that
>>> this explicit annotation can be more liberally sprinkled around.
>>> With read locks lockdep isn't going to complain if the read-side
>>> isn't nested the same way under all circumstances, so ABBA deadlocks
>>> are ok. Which they are, since this is an annotation only.
>>>
>>> - We're using non-recursive lockdep read lock mode, since in recursive
>>> read lock mode lockdep does not catch read side hazards. And we
>>> _very_ much want read side hazards to be caught. For full details of
>>> this limitation see
>>>
>>> commit e91498589746065e3ae95d9a00b068e525eec34f
>>> Author: Peter Zijlstra <peterz@infradead.org>
>>> Date: Wed Aug 23 13:13:11 2017 +0200
>>>
>>> locking/lockdep/selftests: Add mixed read-write ABBA tests
>>>
>>> - To allow nesting of the read-side explicit annotations we explicitly
>>> keep track of the nesting. lock_is_held() allows us to do that.
>>>
>>> - The wait-side annotation is a write lock, and entirely done within
>>> dma_fence_wait() for everyone by default.
>>>
>>> - To be able to freely annotate helper functions I want to make it ok
>>> to call dma_fence_begin/end_signalling from soft/hardirq context.
>>> First attempt was using the hardirq locking context for the write
>>> side in lockdep, but this forces all normal spinlocks nested within
>>> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
>>>
>>> The approach now is to simply check in_atomic(), and for these cases
>>> entirely rely on the might_sleep() check in dma_fence_wait(). That
>>> will catch any wrong nesting against spinlocks from soft/hardirq
>>> contexts.
>>>
>>> The idea here is that every code path that's critical for eventually
>>> signalling a dma_fence should be annotated with
>>> dma_fence_begin/end_signalling. The annotation ideally starts right
>>> after a dma_fence is published (added to a dma_resv, exposed as a
>>> sync_file fd, attached to a drm_syncobj fd, or anything else that
>>> makes the dma_fence visible to other kernel threads), up to and
>>> including the dma_fence_wait(). Examples are irq handlers, the
>>> scheduler rt threads, the tail of execbuf (after the corresponding
>>> fences are visible), any workers that end up signalling dma_fences and
>>> really anything else. Not annotated should be code paths that only
>>> complete fences opportunistically as the gpu progresses, like e.g.
>>> shrinker/eviction code.
>>>
>>> The main class of deadlocks this is supposed to catch are:
>>>
>>> Thread A:
>>>
>>> mutex_lock(A);
>>> mutex_unlock(A);
>>>
>>> dma_fence_signal();
>>>
>>> Thread B:
>>>
>>> mutex_lock(A);
>>> dma_fence_wait();
>>> mutex_unlock(A);
>>>
>>> Thread B is blocked on A signalling the fence, but A never gets around
>>> to that because it cannot acquire the lock A.
>>>
>>> Note that dma_fence_wait() is allowed to be nested within
>>> dma_fence_begin/end_signalling sections. To allow this to happen the
>>> read lock needs to be upgraded to a write lock, which means that if any
>>> other lock is acquired between the dma_fence_begin_signalling() call and
>>> the call to dma_fence_wait(), and still held, this will result in an
>>> immediate lockdep complaint. The only other option would be to not
>>> annotate such calls, defeating the point. Therefore, to avoid false
>>> positives, these annotations cannot be sprinkled over the code entirely
>>> mindlessly.
>>>
>>> v2: handle soft/hardirq ctx better against write side and dont forget
>>> EXPORT_SYMBOL, drivers can't use this otherwise.
>>>
>>> v3: Kerneldoc.
>>>
>>> v4: Some spelling fixes from Mika
>>>
>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
>>> Cc: linux-media@vger.kernel.org
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> Cc: linux-rdma@vger.kernel.org
>>> Cc: amd-gfx@lists.freedesktop.org
>>> Cc: intel-gfx@lists.freedesktop.org
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> ---
>>> Documentation/driver-api/dma-buf.rst | 12 +-
>>> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
>>> include/linux/dma-fence.h | 12 ++
>>> 3 files changed, 182 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
>>> index 63dec76d1d8d..05d856131140 100644
>>> --- a/Documentation/driver-api/dma-buf.rst
>>> +++ b/Documentation/driver-api/dma-buf.rst
>>> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
>>> :doc: cpu access
>>>
>>> -Fence Poll Support
>>> -~~~~~~~~~~~~~~~~~~
>>> +Implicit Fence Poll Support
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
>>> - :doc: fence polling
>>> + :doc: implicit fence polling
>>>
>>> Kernel Functions and Structures Reference
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> @@ -133,6 +133,12 @@ DMA Fences
>>> .. kernel-doc:: drivers/dma-buf/dma-fence.c
>>> :doc: DMA fences overview
>>>
>>> +DMA Fence Signalling Annotations
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
>>> + :doc: fence signalling annotation
>>> +
>>> DMA Fences Functions Reference
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>> index 656e9ac2d028..0005bc002529 100644
>>> --- a/drivers/dma-buf/dma-fence.c
>>> +++ b/drivers/dma-buf/dma-fence.c
>>> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
>>> }
>>> EXPORT_SYMBOL(dma_fence_context_alloc);
>>>
>>> +/**
>>> + * DOC: fence signalling annotation
>>> + *
>>> + * Proving correctness of all the kernel code around &dma_fence through code
>>> + * review and testing is tricky for a few reasons:
>>> + *
>>> + * * It is a cross-driver contract, and therefore all drivers must follow the
>>> + * same rules for lock nesting order, calling contexts for various functions
>>> + * and anything else significant for in-kernel interfaces. But it is also
>>> + * impossible to test all drivers in a single machine, hence brute-force N vs.
>>> + * N testing of all combinations is impossible. Even just limiting to the
>>> + * possible combinations is infeasible.
>>> + *
>>> + * * There is an enormous amount of driver code involved. For render drivers
>>> + * there's the tail of command submission, after fences are published,
>>> + * scheduler code, interrupt and workers to process job completion,
>>> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
>>> + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
>>> + * and &shrinker. For modesetting drivers there's the commit tail functions
>>> + * between when fences for an atomic modeset are published, and when the
>>> + * corresponding vblank completes, including any interrupt processing and
>>> + * related workers. Auditing all that code, across all drivers, is not
>>> + * feasible.
>>> + *
>>> + * * Due to how many other subsystems are involved and the locking hierarchies
>>> + * this pulls in there is extremely thin wiggle-room for driver-specific
>>> + * differences. &dma_fence interacts with almost all of the core memory
>>> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
>>> + * dma_resv_unlock(). On the other side it also interacts through all
>>> + * allocation sites through &mmu_notifier and &shrinker.
>>> + *
>>> + * Furthermore lockdep does not handle cross-release dependencies, which means
>>> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
>>> + * at runtime with some quick testing. The simplest example is one thread
>>> + * waiting on a &dma_fence while holding a lock::
>>> + *
>>> + * lock(A);
>>> + * dma_fence_wait(B);
>>> + * unlock(A);
>>> + *
>>> + * while the other thread is stuck trying to acquire the same lock, which
>>> + * prevents it from signalling the fence the previous thread is stuck waiting
>>> + * on::
>>> + *
>>> + * lock(A);
>>> + * unlock(A);
>>> + * dma_fence_signal(B);
>>> + *
>>> + * By manually annotating all code relevant to signalling a &dma_fence we can
>>> + * teach lockdep about these dependencies, which also helps with the validation
>>> + * headache since now lockdep can check all the rules for us::
>>> + *
>>> + * cookie = dma_fence_begin_signalling();
>>> + * lock(A);
>>> + * unlock(A);
>>> + * dma_fence_signal(B);
>>> + * dma_fence_end_signalling(cookie);
>>> + *
>>> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
>>> + * annotate critical sections the following rules need to be observed:
>>> + *
>>> + * * All code necessary to complete a &dma_fence must be annotated, from the
>>> + * point where a fence is accessible to other threads, to the point where
>>> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
>>> + * and due to the very strict rules and many corner cases it is infeasible to
>>> + * catch these just with review or normal stress testing.
>>> + *
>>> + * * &struct dma_resv deserves a special note, since the readers are only
>>> + * protected by rcu. This means the signalling critical section starts as soon
>>> + * as the new fences are installed, even before dma_resv_unlock() is called.
>>> + *
>>> + * * The only exception are fast paths and opportunistic signalling code, which
>>> + * calls dma_fence_signal() purely as an optimization, but is not required to
>>> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
>>> + * which calls dma_fence_signal(), while the mandatory completion path goes
>>> + * through a hardware interrupt and possible job completion worker.
>>> + *
>>> + * * To aid composability of code, the annotations can be freely nested, as long
>>> + * as the overall locking hierarchy is consistent. The annotations also work
>>> + * both in interrupt and process context. Due to implementation details this
>>> + * requires that callers pass an opaque cookie from
>>> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
>>> + *
>>> + * * Validation against the cross driver contract is implemented by priming
>>> + * lockdep with the relevant hierarchy at boot-up. This means even just
>>> + * testing with a single device is enough to validate a driver, at least as
>>> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
>>> + * concerned.
>>> + */
>>> +#ifdef CONFIG_LOCKDEP
>>> +struct lockdep_map dma_fence_lockdep_map = {
>>> + .name = "dma_fence_map"
>>> +};
>>
>> Maybe a stupid question because this is definitely complicated, but.. If
>> you have a single/static/global lockdep map, doesn't this mean _all_
>> locks, from _all_ drivers happening to use dma-fences will get recorded
>> in it. Will this work and not cause false positives?
>>
>> Sounds like it could create a common link between two completely
>> unconnected usages. Because below you do add annotations to generic
>> dma_fence_signal and dma_fence_wait.
>
> This is fully intentional. dma-fence is a cross-driver interface, if
> every driver invents its own rules about how this should work we have
> an unmaintainable and unreviewable mess.
>
> I've typed up the full length rant already here:
>
> https://lore.kernel.org/dri-devel/CAKMK7uGnFhbpuurRsnZ4dvRV9gQ_3-rmSJaoqSFY=+Kvepz_CA@mail.gmail.com/
But the "perfect storm" of:

 + global fence lockmap
 + mmu notifiers
 + fs reclaim
 + default annotations in dma_fence_signal / dma_fence_wait

means that anything ever using dma_fence will end up in impossible chains
with random other drivers, even if neither driver has code to export/share
that fence.

Example from the CI run:

[25.918788] Chain exists of:
  fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

[25.918794]  Possible unsafe locking scenario:

[25.918797]        CPU0                    CPU1
[25.918799]        ----                    ----
[25.918801]   lock(dma_fence_map);
[25.918803]                                lock(mmu_notifier_invalidate_range_start);
[25.918807]                                lock(dma_fence_map);
[25.918809]   lock(fs_reclaim);

What about a dma_fence_export helper which would "arm" the annotations? It
would be called as soon as the fence is exported, maybe when added to
dma_resv, or exported via sync_file, etc. Before that point
begin/end_signalling and so on would be no-ops.
>>> +
>>> +/**
>>> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
>>> + *
>>> + * Drivers should use this to annotate the beginning of any code section
>>> + * required to eventually complete &dma_fence by calling dma_fence_signal().
>>> + *
>>> + * The end of these critical sections are annotated with
>>> + * dma_fence_end_signalling().
>>> + *
>>> + * Returns:
>>> + *
>>> + * Opaque cookie needed by the implementation, which needs to be passed to
>>> + * dma_fence_end_signalling().
>>> + */
>>> +bool dma_fence_begin_signalling(void)
>>> +{
>>> + /* explicitly nesting ... */
>>> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
>>> + return true;
>>> +
>>> + /* rely on might_sleep check for soft/hardirq locks */
>>> + if (in_atomic())
>>> + return true;
>>> +
>>> + /* ... and non-recursive readlock */
>>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
>>
>> Would it work if signalling path would mark itself as a write lock? I am
>> thinking it would be nice to see in lockdep splats what are signals and
>> what are waits.
>
> Yeah it'd be nice to have a read vs write name for the lock. But we
> already have this problem for e.g. flush_work(), from which I've
> stolen this idea. So it's not really new. Essentially look at the
> backtraces lockdep gives you, and reconstruct the deadlock. I'm hoping
> that people will notice the special functions on the backtrace, e.g.
> dma_fence_begin_signalling will be listed as offending function/lock
> holder, and then read the kerneldoc.
>
>> The recursive usage wouldn't work then right? Would write annotation on
>> the wait path work?
>
> Wait path is write annotations already, but yeah annotating the
> signalling side as write would cause endless amounts of false
> positives. Also it makes composability of these e.g. what I've done in
> amdgpu with annotations in tdr work in drm/scheduler, annotations in
> the amdgpu gpu reset code and then also annotations in atomic code,
> which all nest within each other in some call chains, but not others.
> Dropping the recursion would break that and make it really awkward to
> annotate such cases correctly.
>
> And the recursion only works if it's read locks, otherwise lockdep
> complains if you have inconsistent annotations on the signalling side
> (which again would make it more or less impossible to annotate the
> above case fully).
How do I see in lockdep splats if it was a read or write user? Your patch
appears to have:

dma_fence_signal:
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);

__dma_fence_might_wait:
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);

Both of these look like read locks. I don't fully understand the lockdep
API so I might be wrong. But neither do I see a difference in the splats
telling me which path is which.
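
On the read-versus-write question, a small userspace sketch may help. It assumes that the fourth argument of lock_acquire() is the "read" mode as defined in include/linux/lockdep.h around the time of this series (0 = exclusive, 1 = non-recursive read, 2 = recursive read), and that lock_map_acquire() expands to an exclusive acquire; both are assumptions to check against the actual tree. The stubbed lock_acquire(), last_read and the *_mode() helpers are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct lockdep_map { const char *name; };

/* Userspace stub: remembers the "read" argument of the last call. */
static int last_read = -1;

static void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
			 int trylock, int read, int check,
			 struct lockdep_map *nest_lock, unsigned long ip)
{
	(void)lock; (void)subclass; (void)trylock;
	(void)check; (void)nest_lock; (void)ip;
	assert(read >= 0 && read <= 2);
	last_read = read;
}

/* Assumed to mirror the lockdep.h wrappers: read == 0 means exclusive. */
#define lock_acquire_exclusive(l, s, t, n, i)	lock_acquire(l, s, t, 0, 1, n, i)
#define lock_map_acquire(l)	lock_acquire_exclusive(l, 0, 0, NULL, 0UL)

static struct lockdep_map dma_fence_lockdep_map = { .name = "dma_fence_map" };

/* Human-readable name for the assumed "read" values. */
static const char *read_mode_name(int read)
{
	switch (read) {
	case 0: return "write (exclusive)";
	case 1: return "read (non-recursive)";
	case 2: return "read (recursive)";
	default: return "unknown";
	}
}

/* Signalling side, same arguments as in dma_fence_begin_signalling(). */
static int signalling_side_mode(void)
{
	lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, 0UL);
	return last_read;
}

/* Wait side, the lock_map_acquire() in the middle of __dma_fence_might_wait(). */
static int wait_side_mode(void)
{
	lock_map_acquire(&dma_fence_lockdep_map);
	return last_read;
}
```

Under those assumptions the wait-side write annotation comes from the lock_map_acquire()/lock_map_release() pair in __dma_fence_might_wait(), while the lock_acquire(..., 1, 1, ...) that follows it only re-takes the read lock that was dropped for a nested signalling section.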
Regards,
Tvrtko
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 10:36 ` Tvrtko Ursulin
@ 2020-06-11 11:29 ` Daniel Vetter
2020-06-11 14:29 ` Tvrtko Ursulin
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-11 11:29 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 10/06/2020 16:17, Daniel Vetter wrote:
> > On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 04/06/2020 09:12, Daniel Vetter wrote:
> >>> Design is similar to the lockdep annotations for workers, but with
> >>> some twists:
> >>>
> >>> - We use a read-lock for the execution/worker/completion side, so that
> >>> this explicit annotation can be more liberally sprinkled around.
> >>> With read locks lockdep isn't going to complain if the read-side
> >>> isn't nested the same way under all circumstances, so ABBA deadlocks
> >>> are ok. Which they are, since this is an annotation only.
> >>>
> >>> - We're using non-recursive lockdep read lock mode, since in recursive
> >>> read lock mode lockdep does not catch read side hazards. And we
> >>> _very_ much want read side hazards to be caught. For full details of
> >>> this limitation see
> >>>
> >>> commit e91498589746065e3ae95d9a00b068e525eec34f
> >>> Author: Peter Zijlstra <peterz@infradead.org>
> >>> Date: Wed Aug 23 13:13:11 2017 +0200
> >>>
> >>> locking/lockdep/selftests: Add mixed read-write ABBA tests
> >>>
> >>> - To allow nesting of the read-side explicit annotations we explicitly
> >>> keep track of the nesting. lock_is_held() allows us to do that.
> >>>
> >>> - The wait-side annotation is a write lock, and entirely done within
> >>> dma_fence_wait() for everyone by default.
> >>>
> >>> - To be able to freely annotate helper functions I want to make it ok
> >>> to call dma_fence_begin/end_signalling from soft/hardirq context.
> >>> First attempt was using the hardirq locking context for the write
> >>> side in lockdep, but this forces all normal spinlocks nested within
> >>> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
> >>>
> >>> The approach now is to simply check in_atomic(), and for these cases
> >>> entirely rely on the might_sleep() check in dma_fence_wait(). That
> >>> will catch any wrong nesting against spinlocks from soft/hardirq
> >>> contexts.
> >>>
> >>> The idea here is that every code path that's critical for eventually
> >>> signalling a dma_fence should be annotated with
> >>> dma_fence_begin/end_signalling. The annotation ideally starts right
> >>> after a dma_fence is published (added to a dma_resv, exposed as a
> >>> sync_file fd, attached to a drm_syncobj fd, or anything else that
> >>> makes the dma_fence visible to other kernel threads), up to and
> >>> including the dma_fence_wait(). Examples are irq handlers, the
> >>> scheduler rt threads, the tail of execbuf (after the corresponding
> >>> fences are visible), any workers that end up signalling dma_fences and
> >>> really anything else. Not annotated should be code paths that only
> >>> complete fences opportunistically as the gpu progresses, like e.g.
> >>> shrinker/eviction code.
> >>>
> >>> The main class of deadlocks this is supposed to catch are:
> >>>
> >>> Thread A:
> >>>
> >>> mutex_lock(A);
> >>> mutex_unlock(A);
> >>>
> >>> dma_fence_signal();
> >>>
> >>> Thread B:
> >>>
> >>> mutex_lock(A);
> >>> dma_fence_wait();
> >>> mutex_unlock(A);
> >>>
> >>> Thread B is blocked on A signalling the fence, but A never gets around
> >>> to that because it cannot acquire the lock A.
> >>>
> >>> Note that dma_fence_wait() is allowed to be nested within
> >>> dma_fence_begin/end_signalling sections. To allow this to happen the
> >>> read lock needs to be upgraded to a write lock, which means that if any
> >>> other lock is acquired between the dma_fence_begin_signalling() call and
> >>> the call to dma_fence_wait(), and is still held, this will result in an
> >>> immediate lockdep complaint. The only other option would be to not
> >>> annotate such calls, defeating the point. Therefore, to avoid false
> >>> positives, these annotations cannot be sprinkled over the code entirely
> >>> mindlessly.
> >>>
> >>> v2: handle soft/hardirq ctx better against write side and don't forget
> >>> EXPORT_SYMBOL, drivers can't use this otherwise.
> >>>
> >>> v3: Kerneldoc.
> >>>
> >>> v4: Some spelling fixes from Mika
> >>>
> >>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >>> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> >>> Cc: linux-media@vger.kernel.org
> >>> Cc: linaro-mm-sig@lists.linaro.org
> >>> Cc: linux-rdma@vger.kernel.org
> >>> Cc: amd-gfx@lists.freedesktop.org
> >>> Cc: intel-gfx@lists.freedesktop.org
> >>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>> Cc: Christian König <christian.koenig@amd.com>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> ---
> >>> Documentation/driver-api/dma-buf.rst | 12 +-
> >>> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> >>> include/linux/dma-fence.h | 12 ++
> >>> 3 files changed, 182 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> >>> index 63dec76d1d8d..05d856131140 100644
> >>> --- a/Documentation/driver-api/dma-buf.rst
> >>> +++ b/Documentation/driver-api/dma-buf.rst
> >>> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> >>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> >>> :doc: cpu access
> >>>
> >>> -Fence Poll Support
> >>> -~~~~~~~~~~~~~~~~~~
> >>> +Implicit Fence Poll Support
> >>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>
> >>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> >>> - :doc: fence polling
> >>> + :doc: implicit fence polling
> >>>
> >>> Kernel Functions and Structures Reference
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> @@ -133,6 +133,12 @@ DMA Fences
> >>> .. kernel-doc:: drivers/dma-buf/dma-fence.c
> >>> :doc: DMA fences overview
> >>>
> >>> +DMA Fence Signalling Annotations
> >>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> +
> >>> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> >>> + :doc: fence signalling annotation
> >>> +
> >>> DMA Fences Functions Reference
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>
> >>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> >>> index 656e9ac2d028..0005bc002529 100644
> >>> --- a/drivers/dma-buf/dma-fence.c
> >>> +++ b/drivers/dma-buf/dma-fence.c
> >>> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> >>> }
> >>> EXPORT_SYMBOL(dma_fence_context_alloc);
> >>>
> >>> +/**
> >>> + * DOC: fence signalling annotation
> >>> + *
> >>> + * Proving correctness of all the kernel code around &dma_fence through code
> >>> + * review and testing is tricky for a few reasons:
> >>> + *
> >>> + * * It is a cross-driver contract, and therefore all drivers must follow the
> >>> + * same rules for lock nesting order, calling contexts for various functions
> >>> + * and anything else significant for in-kernel interfaces. But it is also
> >>> + * impossible to test all drivers in a single machine, hence brute-force N vs.
> >>> + * N testing of all combinations is impossible. Even just limiting to the
> >>> + * possible combinations is infeasible.
> >>> + *
> >>> + * * There is an enormous amount of driver code involved. For render drivers
> >>> + * there's the tail of command submission, after fences are published,
> >>> + * scheduler code, interrupt and workers to process job completion,
> >>> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> >>> + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
> >>> + * and &shrinker. For modesetting drivers there's the commit tail functions
> >>> + * between when fences for an atomic modeset are published, and when the
> >>> + * corresponding vblank completes, including any interrupt processing and
> >>> + * related workers. Auditing all that code, across all drivers, is not
> >>> + * feasible.
> >>> + *
> >>> + * * Due to how many other subsystems are involved and the locking hierarchies
> >>> + * this pulls in there is extremely thin wiggle-room for driver-specific
> >>> + * differences. &dma_fence interacts with almost all of the core memory
> >>> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> >>> + * dma_resv_unlock(). On the other side it also interacts through all
> >>> + * allocation sites through &mmu_notifier and &shrinker.
> >>> + *
> >>> + * Furthermore lockdep does not handle cross-release dependencies, which means
> >>> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> >>> + * at runtime with some quick testing. The simplest example is one thread
> >>> + * waiting on a &dma_fence while holding a lock::
> >>> + *
> >>> + * lock(A);
> >>> + * dma_fence_wait(B);
> >>> + * unlock(A);
> >>> + *
> >>> + * while the other thread is stuck trying to acquire the same lock, which
> >>> + * prevents it from signalling the fence the previous thread is stuck waiting
> >>> + * on::
> >>> + *
> >>> + * lock(A);
> >>> + * unlock(A);
> >>> + * dma_fence_signal(B);
> >>> + *
> >>> + * By manually annotating all code relevant to signalling a &dma_fence we can
> >>> + * teach lockdep about these dependencies, which also helps with the validation
> >>> + * headache since now lockdep can check all the rules for us::
> >>> + *
> >>> + * cookie = dma_fence_begin_signalling();
> >>> + * lock(A);
> >>> + * unlock(A);
> >>> + * dma_fence_signal(B);
> >>> + * dma_fence_end_signalling(cookie);
> >>> + *
> >>> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> >>> + * annotate critical sections the following rules need to be observed:
> >>> + *
> >>> + * * All code necessary to complete a &dma_fence must be annotated, from the
> >>> + * point where a fence is accessible to other threads, to the point where
> >>> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> >>> + * and due to the very strict rules and many corner cases it is infeasible to
> >>> + * catch these just with review or normal stress testing.
> >>> + *
> >>> + * * &struct dma_resv deserves a special note, since the readers are only
> >>> + * protected by rcu. This means the signalling critical section starts as soon
> >>> + * as the new fences are installed, even before dma_resv_unlock() is called.
> >>> + *
> >>> + * * The only exception are fast paths and opportunistic signalling code, which
> >>> + * calls dma_fence_signal() purely as an optimization, but is not required to
> >>> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> >>> + * which calls dma_fence_signal(), while the mandatory completion path goes
> >>> + * through a hardware interrupt and possible job completion worker.
> >>> + *
> >>> + * * To aid composability of code, the annotations can be freely nested, as long
> >>> + * as the overall locking hierarchy is consistent. The annotations also work
> >>> + * both in interrupt and process context. Due to implementation details this
> >>> + * requires that callers pass an opaque cookie from
> >>> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> >>> + *
> >>> + * * Validation against the cross driver contract is implemented by priming
> >>> + * lockdep with the relevant hierarchy at boot-up. This means even just
> >>> + * testing with a single device is enough to validate a driver, at least as
> >>> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> >>> + * concerned.
> >>> + */
> >>> +#ifdef CONFIG_LOCKDEP
> >>> +struct lockdep_map dma_fence_lockdep_map = {
> >>> + .name = "dma_fence_map"
> >>> +};
> >>
> >> Maybe a stupid question because this is definitely complicated, but.. If
> >> you have a single/static/global lockdep map, doesn't this mean _all_
> >> locks, from _all_ drivers happening to use dma-fences will get recorded
> >> in it. Will this work and not cause false positives?
> >>
> >> Sounds like it could create a common link between two completely
> >> unconnected usages. Because below you do add annotations to generic
> >> dma_fence_signal and dma_fence_wait.
> >
> > This is fully intentional. dma-fence is a cross-driver interface, if
> > every driver invents its own rules about how this should work we have
> > an unmaintainable and unreviewable mess.
> >
> > I've typed up the full length rant already here:
> >
> > https://lore.kernel.org/dri-devel/CAKMK7uGnFhbpuurRsnZ4dvRV9gQ_3-rmSJaoqSFY=+Kvepz_CA@mail.gmail.com/
>
> But "perfect storm" of:
>
> + global fence lockmap
> + mmu notifiers
> + fs reclaim
> + default annotations in dma_fence_signal / dma_fence_wait
>
> Equals to anything ever using dma_fence will be in impossible chains with random other drivers, even if neither driver has code to export/share that fence.
>
> Example from the CI run:
>
> [25.918788] Chain exists of:
>   fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
> [25.918794]  Possible unsafe locking scenario:
> [25.918797]        CPU0                    CPU1
> [25.918799]        ----                    ----
> [25.918801]   lock(dma_fence_map);
> [25.918803]                                lock(mmu_notifier_invalidate_range_start);
> [25.918807]                                lock(dma_fence_map);
> [25.918809]   lock(fs_reclaim);
>
> What about a dma_fence_export helper which would "arm" the annotations? It would be called as soon as the fence is exported. Maybe when added to dma_resv, or exported via sync_file, etc. Before that point begin/end_signalling and so on would be no-ops.
Run CI without the i915 annotation patch, nothing breaks.
So we can gradually fix up existing code that doesn't quite get it
right and move on.
> >>> +
> >>> +/**
> >>> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> >>> + *
> >>> + * Drivers should use this to annotate the beginning of any code section
> >>> + * required to eventually complete &dma_fence by calling dma_fence_signal().
> >>> + *
> >>> + * The end of these critical sections are annotated with
> >>> + * dma_fence_end_signalling().
> >>> + *
> >>> + * Returns:
> >>> + *
> >>> + * Opaque cookie needed by the implementation, which needs to be passed to
> >>> + * dma_fence_end_signalling().
> >>> + */
> >>> +bool dma_fence_begin_signalling(void)
> >>> +{
> >>> + /* explicitly nesting ... */
> >>> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> >>> + return true;
> >>> +
> >>> + /* rely on might_sleep check for soft/hardirq locks */
> >>> + if (in_atomic())
> >>> + return true;
> >>> +
> >>> + /* ... and non-recursive readlock */
> >>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> >>
> >> Would it work if signalling path would mark itself as a write lock? I am
> >> thinking it would be nice to see in lockdep splats what are signals and
> >> what are waits.
> >
> > Yeah it'd be nice to have a read vs write name for the lock. But we
> > already have this problem for e.g. flush_work(), from which I've
> > stolen this idea. So it's not really new. Essentially look at the
> > backtraces lockdep gives you, and reconstruct the deadlock. I'm hoping
> > that people will notice the special functions on the backtrace, e.g.
> > dma_fence_begin_signalling will be listed as offending function/lock
> > holder, and then read the kerneldoc.
> >
> >> The recursive usage wouldn't work then right? Would write annotation on
> >> the wait path work?
> >
> > Wait path is write annotations already, but yeah annotating the
> > signalling side as write would cause endless amounts of false
> > positives. Also it makes composability of these e.g. what I've done in
> > amdgpu with annotations in tdr work in drm/scheduler, annotations in
> > the amdgpu gpu reset code and then also annotations in atomic code,
> > which all nest within each other in some call chains, but not others.
> > Dropping the recursion would break that and make it really awkward to
> > annotate such cases correctly.
> >
> > And the recursion only works if it's read locks, otherwise lockdep
> > complains if you have inconsistent annotations on the signalling side
> > (which again would make it more or less impossible to annotate the
> > above case fully).
>
> How do I see in lockdep splats if it was a read or write user? Your patch appears to have:
>
> dma_fence_signal:
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
>
> __dma_fence_might_wait:
> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
>
> Which both seem like read locks. I don't fully understand the lockdep API so I might be wrong. Nor do I see a difference in the splats telling me which path is which.
I think you got tricked by the implementation, this isn't quite what's
going on. There are two things which make the annotations special:

- We want a recursive read lock on the signalling critical section.
  The problem is that lockdep doesn't implement full validation for
  recursive read locks, only non-recursive read/write locks are fully
  validated. There are some checks for recursive read locks, but exactly
  the checks we need to catch common dma_fence_wait deadlocks aren't
  done. That's why we need to implement manual lock recursion on the
  reader side.

- Now on the write side we additionally need to implement a
  read2write upgrade, and a write2read downgrade. Lockdep doesn't
  implement that, so again we have to hand-roll this.

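As an aside, the manual reader-side recursion from the first point can be modelled in plain userspace C. This is only a sketch, not the kernel code: struct sig_model and the model_* names are made up for illustration, the begin side follows the dma_fence_begin_signalling() hunk quoted earlier in this mail, and the end side is an assumption (it skips the release whenever the cookie is true).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the lockdep state involved. */
struct sig_model {
	bool read_held;  /* stands in for lock_is_held_type(map, 1) */
	bool in_atomic;  /* stands in for in_atomic() */
	int acquires;    /* how often the read lock was really taken */
	int releases;    /* how often it was really dropped */
};

/* Returns the cookie: true means "nothing was acquired at this level". */
bool model_begin_signalling(struct sig_model *m)
{
	if (m->read_held)     /* explicit nesting: an outer section holds it */
		return true;
	if (m->in_atomic)     /* rely on the might_sleep() check instead */
		return true;
	m->read_held = true;  /* non-recursive read acquire */
	m->acquires++;
	return false;
}

/* Assumed counterpart: only the outermost level drops the read lock. */
void model_end_signalling(struct sig_model *m, bool cookie)
{
	if (cookie)
		return;
	assert(m->read_held);
	m->read_held = false;
	m->releases++;
}
```

Nesting two sections in this model takes the read lock once, and only the outermost model_end_signalling() call drops it, which is why the cookie has to be passed back by the caller.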
Let's go through the __dma_fence_might_wait() code line-by-line:

    bool tmp;

    tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);

We check whether someone is holding the non-recursive read lock already.

    if (tmp)
        lock_release(&dma_fence_lockdep_map, _THIS_IP_);

If that's the case, we drop that read lock.

    lock_map_acquire(&dma_fence_lockdep_map);

Then we do the actual might_wait annotation, the above takes the full
write lock ...

    lock_map_release(&dma_fence_lockdep_map);

... and now we release the write lock again.

    if (tmp)
        lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);

Finally we need to re-acquire the read lock, if we've held that when
entering this function. This annotation naturally has to exactly match
what begin_signalling would do, otherwise the hand-rolled nesting
would fall apart.

I hope that explains what's going on here, and assures you that
might_wait() is indeed a write lock annotation, but with a big pile of
complications.
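
The sequence above can be replayed in a small userspace model to see which acquire is the write one. This is a sketch under assumptions, not kernel code: struct model_map and the model_* helpers are hypothetical, and the only thing borrowed from lockdep is the meaning of the "read" argument of lock_acquire() (0 is an exclusive/write acquire, 1 a non-recursive read, 2 a recursive read). lock_map_acquire() is modelled as the read == 0 case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Meaning of lock_acquire()'s "read" argument. */
enum model_read_mode {
	MODEL_WRITE = 0,          /* exclusive (write) acquire */
	MODEL_READ = 1,           /* read acquire, no recursion allowed */
	MODEL_READ_RECURSIVE = 2, /* read acquire, recursion allowed */
};

enum model_event { EV_ACQ_WRITE, EV_ACQ_READ, EV_RELEASE };

#define MODEL_MAX_EVENTS 16

/* Hypothetical stand-in for the lockdep map: holder state plus a trace. */
struct model_map {
	bool held_read;
	bool held_write;
	enum model_event trace[MODEL_MAX_EVENTS];
	size_t nr_events;
};

void model_log(struct model_map *m, enum model_event ev)
{
	assert(m->nr_events < MODEL_MAX_EVENTS);
	m->trace[m->nr_events++] = ev;
}

void model_acquire(struct model_map *m, enum model_read_mode read)
{
	assert(!m->held_read && !m->held_write); /* no recursion in this model */
	if (read == MODEL_WRITE) {
		m->held_write = true;
		model_log(m, EV_ACQ_WRITE);
	} else {
		m->held_read = true;
		model_log(m, EV_ACQ_READ);
	}
}

void model_release(struct model_map *m)
{
	assert(m->held_read || m->held_write);
	m->held_read = false;
	m->held_write = false;
	model_log(m, EV_RELEASE);
}

/* Same order of operations as the walk-through above. */
void model_might_wait(struct model_map *m)
{
	bool tmp = m->held_read;          /* lock_is_held_type(map, 1) */

	if (tmp)
		model_release(m);         /* drop the read lock */
	model_acquire(m, MODEL_WRITE);    /* lock_map_acquire(): write lock */
	model_release(m);                 /* lock_map_release() */
	if (tmp)
		model_acquire(m, MODEL_READ); /* same mode as begin_signalling */
}
```

Called from inside a signalling section the recorded trace is read-release, write-acquire, write-release, read-acquire, so the lock_acquire(..., 1, ...) in the patch is only the last, restoring step and not the wait annotation itself.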
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 11:29 ` Daniel Vetter
@ 2020-06-11 14:29 ` Tvrtko Ursulin
2020-06-11 15:03 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Tvrtko Ursulin @ 2020-06-11 14:29 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On 11/06/2020 12:29, Daniel Vetter wrote:
> On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>> On 10/06/2020 16:17, Daniel Vetter wrote:
>>> On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 04/06/2020 09:12, Daniel Vetter wrote:
>>>>> Design is similar to the lockdep annotations for workers, but with
>>>>> some twists:
>>>>>
>>>>> - We use a read-lock for the execution/worker/completion side, so that
>>>>> this explicit annotation can be more liberally sprinkled around.
>>>>> With read locks lockdep isn't going to complain if the read-side
>>>>> isn't nested the same way under all circumstances, so ABBA deadlocks
>>>>> are ok. Which they are, since this is an annotation only.
>>>>>
>>>>> - We're using non-recursive lockdep read lock mode, since in recursive
>>>>> read lock mode lockdep does not catch read side hazards. And we
>>>>> _very_ much want read side hazards to be caught. For full details of
>>>>> this limitation see
>>>>>
>>>>> commit e91498589746065e3ae95d9a00b068e525eec34f
>>>>> Author: Peter Zijlstra <peterz@infradead.org>
>>>>> Date: Wed Aug 23 13:13:11 2017 +0200
>>>>>
>>>>> locking/lockdep/selftests: Add mixed read-write ABBA tests
>>>>>
>>>>> - To allow nesting of the read-side explicit annotations we explicitly
>>>>> keep track of the nesting. lock_is_held() allows us to do that.
>>>>>
>>>>> - The wait-side annotation is a write lock, and entirely done within
>>>>> dma_fence_wait() for everyone by default.
>>>>>
>>>>> - To be able to freely annotate helper functions I want to make it ok
>>>>> to call dma_fence_begin/end_signalling from soft/hardirq context.
>>>>> First attempt was using the hardirq locking context for the write
>>>>> side in lockdep, but this forces all normal spinlocks nested within
>>>>> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
>>>>>
>>>>> The approach now is to simply check in_atomic(), and for these cases
>>>>> entirely rely on the might_sleep() check in dma_fence_wait(). That
>>>>> will catch any wrong nesting against spinlocks from soft/hardirq
>>>>> contexts.
>>>>>
>>>>> The idea here is that every code path that's critical for eventually
>>>>> signalling a dma_fence should be annotated with
>>>>> dma_fence_begin/end_signalling. The annotation ideally starts right
>>>>> after a dma_fence is published (added to a dma_resv, exposed as a
>>>>> sync_file fd, attached to a drm_syncobj fd, or anything else that
>>>>> makes the dma_fence visible to other kernel threads), up to and
>>>>> including the dma_fence_wait(). Examples are irq handlers, the
>>>>> scheduler rt threads, the tail of execbuf (after the corresponding
>>>>> fences are visible), any workers that end up signalling dma_fences and
>>>>> really anything else. Not annotated should be code paths that only
>>>>> complete fences opportunistically as the gpu progresses, like e.g.
>>>>> shrinker/eviction code.
>>>>>
>>>>> The main class of deadlocks this is supposed to catch are:
>>>>>
>>>>> Thread A:
>>>>>
>>>>> mutex_lock(A);
>>>>> mutex_unlock(A);
>>>>>
>>>>> dma_fence_signal();
>>>>>
>>>>> Thread B:
>>>>>
>>>>> mutex_lock(A);
>>>>> dma_fence_wait();
>>>>> mutex_unlock(A);
>>>>>
>>>>> Thread B is blocked on A signalling the fence, but A never gets around
>>>>> to that because it cannot acquire the lock A.
>>>>>
>>>>> Note that dma_fence_wait() is allowed to be nested within
>>>>> dma_fence_begin/end_signalling sections. To allow this to happen the
>>>>> read lock needs to be upgraded to a write lock, which means that if any
>>>>> other lock is acquired between the dma_fence_begin_signalling() call and
>>>>> the call to dma_fence_wait(), and is still held, this will result in an
>>>>> immediate lockdep complaint. The only other option would be to not
>>>>> annotate such calls, defeating the point. Therefore, to avoid false
>>>>> positives, these annotations cannot be sprinkled over the code entirely
>>>>> mindlessly.
>>>>>
>>>>> v2: handle soft/hardirq ctx better against write side and don't forget
>>>>> EXPORT_SYMBOL, drivers can't use this otherwise.
>>>>>
>>>>> v3: Kerneldoc.
>>>>>
>>>>> v4: Some spelling fixes from Mika
>>>>>
>>>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>>>> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
>>>>> Cc: linux-media@vger.kernel.org
>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>> Cc: linux-rdma@vger.kernel.org
>>>>> Cc: amd-gfx@lists.freedesktop.org
>>>>> Cc: intel-gfx@lists.freedesktop.org
>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> ---
>>>>> Documentation/driver-api/dma-buf.rst | 12 +-
>>>>> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
>>>>> include/linux/dma-fence.h | 12 ++
>>>>> 3 files changed, 182 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
>>>>> index 63dec76d1d8d..05d856131140 100644
>>>>> --- a/Documentation/driver-api/dma-buf.rst
>>>>> +++ b/Documentation/driver-api/dma-buf.rst
>>>>> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
>>>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
>>>>> :doc: cpu access
>>>>>
>>>>> -Fence Poll Support
>>>>> -~~~~~~~~~~~~~~~~~~
>>>>> +Implicit Fence Poll Support
>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>>>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
>>>>> - :doc: fence polling
>>>>> + :doc: implicit fence polling
>>>>>
>>>>> Kernel Functions and Structures Reference
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> @@ -133,6 +133,12 @@ DMA Fences
>>>>> .. kernel-doc:: drivers/dma-buf/dma-fence.c
>>>>> :doc: DMA fences overview
>>>>>
>>>>> +DMA Fence Signalling Annotations
>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> +
>>>>> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
>>>>> + :doc: fence signalling annotation
>>>>> +
>>>>> DMA Fences Functions Reference
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>>>> index 656e9ac2d028..0005bc002529 100644
>>>>> --- a/drivers/dma-buf/dma-fence.c
>>>>> +++ b/drivers/dma-buf/dma-fence.c
>>>>> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
>>>>> }
>>>>> EXPORT_SYMBOL(dma_fence_context_alloc);
>>>>>
>>>>> +/**
>>>>> + * DOC: fence signalling annotation
>>>>> + *
>>>>> + * Proving correctness of all the kernel code around &dma_fence through code
>>>>> + * review and testing is tricky for a few reasons:
>>>>> + *
>>>>> + * * It is a cross-driver contract, and therefore all drivers must follow the
>>>>> + * same rules for lock nesting order, calling contexts for various functions
>>>>> + * and anything else significant for in-kernel interfaces. But it is also
>>>>> + * impossible to test all drivers in a single machine, hence brute-force N vs.
>>>>> + * N testing of all combinations is impossible. Even just limiting to the
>>>>> + * possible combinations is infeasible.
>>>>> + *
>>>>> + * * There is an enormous amount of driver code involved. For render drivers
>>>>> + * there's the tail of command submission, after fences are published,
>>>>> + * scheduler code, interrupt and workers to process job completion,
>>>>> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
>>>>> + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
>>>>> + * and &shrinker. For modesetting drivers there's the commit tail functions
>>>>> + * between when fences for an atomic modeset are published, and when the
>>>>> + * corresponding vblank completes, including any interrupt processing and
>>>>> + * related workers. Auditing all that code, across all drivers, is not
>>>>> + * feasible.
>>>>> + *
>>>>> + * * Due to how many other subsystems are involved and the locking hierarchies
>>>>> + * this pulls in there is extremely thin wiggle-room for driver-specific
>>>>> + * differences. &dma_fence interacts with almost all of the core memory
>>>>> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
>>>>> + * dma_resv_unlock(). On the other side it also interacts through all
>>>>> + * allocation sites through &mmu_notifier and &shrinker.
>>>>> + *
>>>>> + * Furthermore lockdep does not handle cross-release dependencies, which means
>>>>> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
>>>>> + * at runtime with some quick testing. The simplest example is one thread
>>>>> + * waiting on a &dma_fence while holding a lock::
>>>>> + *
>>>>> + * lock(A);
>>>>> + * dma_fence_wait(B);
>>>>> + * unlock(A);
>>>>> + *
>>>>> + * while the other thread is stuck trying to acquire the same lock, which
>>>>> + * prevents it from signalling the fence the previous thread is stuck waiting
>>>>> + * on::
>>>>> + *
>>>>> + * lock(A);
>>>>> + * unlock(A);
>>>>> + * dma_fence_signal(B);
>>>>> + *
>>>>> + * By manually annotating all code relevant to signalling a &dma_fence we can
>>>>> + * teach lockdep about these dependencies, which also helps with the validation
>>>>> + * headache since now lockdep can check all the rules for us::
>>>>> + *
>>>>> + * cookie = dma_fence_begin_signalling();
>>>>> + * lock(A);
>>>>> + * unlock(A);
>>>>> + * dma_fence_signal(B);
>>>>> + * dma_fence_end_signalling(cookie);
>>>>> + *
>>>>> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
>>>>> + * annotate critical sections the following rules need to be observed:
>>>>> + *
>>>>> + * * All code necessary to complete a &dma_fence must be annotated, from the
>>>>> + * point where a fence is accessible to other threads, to the point where
>>>>> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
>>>>> + * and due to the very strict rules and many corner cases it is infeasible to
>>>>> + * catch these just with review or normal stress testing.
>>>>> + *
>>>>> + * * &struct dma_resv deserves a special note, since the readers are only
>>>>> + * protected by rcu. This means the signalling critical section starts as soon
>>>>> + * as the new fences are installed, even before dma_resv_unlock() is called.
>>>>> + *
>>>>> + * * The only exception are fast paths and opportunistic signalling code, which
>>>>> + * calls dma_fence_signal() purely as an optimization, but is not required to
>>>>> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
>>>>> + * which calls dma_fence_signal(), while the mandatory completion path goes
>>>>> + * through a hardware interrupt and possible job completion worker.
>>>>> + *
>>>>> + * * To aid composability of code, the annotations can be freely nested, as long
>>>>> + * as the overall locking hierarchy is consistent. The annotations also work
>>>>> + * both in interrupt and process context. Due to implementation details this
>>>>> + * requires that callers pass an opaque cookie from
>>>>> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
>>>>> + *
>>>>> + * * Validation against the cross driver contract is implemented by priming
>>>>> + * lockdep with the relevant hierarchy at boot-up. This means even just
>>>>> + * testing with a single device is enough to validate a driver, at least as
>>>>> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
>>>>> + * concerned.
>>>>> + */
>>>>> +#ifdef CONFIG_LOCKDEP
>>>>> +struct lockdep_map dma_fence_lockdep_map = {
>>>>> + .name = "dma_fence_map"
>>>>> +};
>>>>
>>>> Maybe a stupid question because this is definitely complicated, but.. If
>>>> you have a single/static/global lockdep map, doesn't this mean _all_
>>>> locks, from _all_ drivers happening to use dma-fences will get recorded
>>>> in it. Will this work and not cause false positives?
>>>>
>>>> Sounds like it could create a common link between two completely
>>>> unconnected usages. Because below you do add annotations to generic
>>>> dma_fence_signal and dma_fence_wait.
>>>
>>> This is fully intentional. dma-fence is a cross-driver interface, if
>>> every driver invents its own rules about how this should work we have
>>> an unmaintainable and unreviewable mess.
>>>
>>> I've typed up the full length rant already here:
>>>
>>> https://lore.kernel.org/dri-devel/CAKMK7uGnFhbpuurRsnZ4dvRV9gQ_3-rmSJaoqSFY=+Kvepz_CA@mail.gmail.com/
>>
>> But "perfect storm" of:
>>
>> + global fence lockmap
>> + mmu notifiers
>> + fs reclaim
>> + default annotations in dma_fence_signal / dma_fence_wait
>>
>> Equals to anything ever using dma_fence will be in impossible chains with random other drivers, even if neither driver has code to export/share that fence.
>>
>> Example from the CI run:
>>
>> [25.918788] Chain exists of:
>>   fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
>> [25.918794]  Possible unsafe locking scenario:
>> [25.918797]        CPU0                    CPU1
>> [25.918799]        ----                    ----
>> [25.918801]   lock(dma_fence_map);
>> [25.918803]                                lock(mmu_notifier_invalidate_range_start);
>> [25.918807]                                lock(dma_fence_map);
>> [25.918809]   lock(fs_reclaim);
>>
>> What about a dma_fence_export helper which would "arm" the annotations? It would be called as soon as the fence is exported. Maybe when added to dma_resv, or exported via sync_file, etc. Before that point begin/end_signalling and so on would be no-ops.
>
> Run CI without the i915 annotation patch, nothing breaks.
I think some parts of i915 would still break with my idea to only apply annotations on exported fences. What do you dislike about that idea? I thought the point is to enforce rules for _exported_ fences.

With the way you have annotated dma_fence_work you can't tell whether the fence is exported or not. I think it is, btw, so the splats would still be there, but I am not sure it is conceptually correct.

At least my understanding is that GFP_KERNEL allocations are only disallowed by virtue of the global dma-fence contract. If you want to enforce that they are never used for anything but exporting, then that would be a bit harsh, no?

Another example from the CI run:
[26.585357] CPU0 CPU1
[26.585359] ---- ----
[26.585360] lock(dma_fence_map);
[26.585362] lock(mmu_notifier_invalidate_range_start);
[26.585365] lock(dma_fence_map);
[26.585367] lock(i915_gem_object_internal/1);
[26.585369]
*** DEADLOCK ***
Let's say someone submitted an execbuf using userptr as a batch and then unmapped it immediately. That would explain CPU1 getting into the mmu notifier and waiting on this batch to unbind the object.
Meanwhile CPU0 is the async command parser for this request trying to lock the shadow batch buffer. Because it uses the dma_fence_work this is between the begin/end signalling markers.
It can be the same dma-fence I think, since we install the async parser fence on the real batch dma-resv, but dma_fence_map is not a real lock, so what is actually preventing progress in this case?
CPU1 is waiting on a fence, but CPU0 can obtain the lock(i915_gem_object_internal/1), proceed to parse the batch, and exit the signalling section. At which point CPU1 is still blocked, waiting until the execbuf finishes and then mmu notifier can finish and invalidate the pages.
Maybe I am missing something but I don't see how this one is real.
> So we can gradually fix up existing code that doesn't quite get it
> right and move on.
>
>>>>> +
>>>>> +/**
>>>>> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
>>>>> + *
>>>>> + * Drivers should use this to annotate the beginning of any code section
>>>>> + * required to eventually complete &dma_fence by calling dma_fence_signal().
>>>>> + *
>>>>> + * The end of these critical sections are annotated with
>>>>> + * dma_fence_end_signalling().
>>>>> + *
>>>>> + * Returns:
>>>>> + *
>>>>> + * Opaque cookie needed by the implementation, which needs to be passed to
>>>>> + * dma_fence_end_signalling().
>>>>> + */
>>>>> +bool dma_fence_begin_signalling(void)
>>>>> +{
>>>>> + /* explicitly nesting ... */
>>>>> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
>>>>> + return true;
>>>>> +
>>>>> + /* rely on might_sleep check for soft/hardirq locks */
>>>>> + if (in_atomic())
>>>>> + return true;
>>>>> +
>>>>> + /* ... and non-recursive readlock */
>>>>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
>>>>
>>>> Would it work if signalling path would mark itself as a write lock? I am
>>>> thinking it would be nice to see in lockdep splats what are signals and
>>>> what are waits.
>>>
>>> Yeah it'd be nice to have a read vs write name for the lock. But we
>>> already have this problem for e.g. flush_work(), from which I've
>>> stolen this idea. So it's not really new. Essentially look at the
>>> backtraces lockdep gives you, and reconstruct the deadlock. I'm hoping
>>> that people will notice the special functions on the backtrace, e.g.
>>> dma_fence_begin_signalling will be listed as offending function/lock
>>> holder, and then read the kerneldoc.
>>>
>>>> The recursive usage wouldn't work then right? Would write annotation on
>>>> the wait path work?
>>>
>>> Wait path is write annotations already, but yeah annotating the
>>> signalling side as write would cause endless amounts of false
>>> positives. Also it makes composability of these e.g. what I've done in
>>> amdgpu with annotations in tdr work in drm/scheduler, annotations in
>>> the amdgpu gpu reset code and then also annotations in atomic code,
>>> which all nest within each other in some call chains, but not others.
>>> Dropping the recursion would break that and make it really awkward to
>>> annotate such cases correctly.
>>>
>>> And the recursion only works if it's read locks, otherwise lockdep
>>> complains if you have inconsistent annotations on the signalling side
>>> (which again would make it more or less impossible to annotate the
>>> above case fully).
>>
>> How do I see in lockdep splats if it was a read or write user? Your patch appears to have:
>>
>> dma_fence_signal:
>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
>>
>> __dma_fence_might_wait:
>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
>>
>> Which both seem like read lock. I don't fully understand the lockdep API so I might be wrong, not sure. But neither I see a difference in splats telling me which path is which.
>
> I think you got tricked by the implementation, this isn't quite what's
> going on. There's two things which make the annotations special:
>
> - we want a recursive read lock on the signalling critical section.
> The problem is that lockdep doesn't implement full validation for
> recursive read locks, only non-recursive read/write locks fully
> validated. There's some checks for recursive read locks, but exactly
> the checks we need to catch common dma_fence_wait deadlocks aren't
> done. That's why we need to implement manual lock recursion on the
> reader side
>
> - now on the write side we additionally need to implement an
> read2write upgrade, and a write2read downgrade. Lockdep doesn't
> implement that, so again we have to hand-roll this.
>
> Let's go through the code line-by-line:
>
> bool tmp;
>
> tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
>
> We check whether someone is holding the non-recursive read lock already.
>
> if (tmp)
> lock_release(&dma_fence_lockdep_map, _THIS_IP_);
>
> If that's the case, we drop that read lock.
>
> lock_map_acquire(&dma_fence_lockdep_map);
>
> Then we do the actual might_wait annotation, the above takes the full
> write lock ...
>
> lock_map_release(&dma_fence_lockdep_map);
>
> ... and now we release the write lock again.
>
>
> if (tmp)
> lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
>
> Finally we need to re-acquire the read lock, if we've held that when
> entering this function. This annotation naturally has to exactly match
> what begin_signalling would do, otherwise the hand-rolled nesting
> would fall apart.
>
> I hope that explains what's going on here, and assures you that
> might_wait() is indeed a write lock annotation, but with a big pile of
> complications.
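The walk-through above can be assembled into one function and paired with the nesting cookie from dma_fence_begin_signalling(). What follows is only a minimal userland sketch under stated assumptions: everything named mock_ or model_ is a hypothetical stand-in and not kernel API, the mocks merely record read ('R'/'r') and write ('W'/'w') acquire/release events for a single map, the in_atomic() shortcut is left out, and model_end_signalling() is assumed to simply skip the release when the cookie is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for lockdep state: one map, tracked by hand. */
enum held_state { NOT_HELD, HELD_READ, HELD_WRITE };

static enum held_state fence_map_state = NOT_HELD;
static char trace[64]; /* 'R'/'W' = acquire read/write, 'r'/'w' = release */

static void trace_add(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

/* read != 0 models the non-recursive read mode, read == 0 the write mode. */
static void mock_acquire(int read)
{
	assert(fence_map_state == NOT_HELD);
	fence_map_state = read ? HELD_READ : HELD_WRITE;
	trace_add(read ? 'R' : 'W');
}

static void mock_release(void)
{
	assert(fence_map_state != NOT_HELD);
	trace_add(fence_map_state == HELD_READ ? 'r' : 'w');
	fence_map_state = NOT_HELD;
}

static void model_reset(void)
{
	fence_map_state = NOT_HELD;
	trace[0] = '\0';
}

/* Returns true when the section is nested, so the matching end is a no-op. */
static bool model_begin_signalling(void)
{
	if (fence_map_state == HELD_READ)
		return true;
	mock_acquire(1);
	return false;
}

static void model_end_signalling(bool cookie)
{
	if (cookie)
		return;
	mock_release();
}

/* Drop the read side if held, take and drop the write side, restore. */
static void model_might_wait(void)
{
	bool tmp = (fence_map_state == HELD_READ);

	if (tmp)
		mock_release();
	mock_acquire(0);
	mock_release();
	if (tmp)
		mock_acquire(1);
}

/* Nested signalling sections with a wait inside the inner one. */
const char *model_run(void)
{
	bool outer, inner;

	model_reset();
	outer = model_begin_signalling();
	inner = model_begin_signalling();
	model_might_wait();
	model_end_signalling(inner);
	model_end_signalling(outer);
	return trace;
}

/* A wait outside of any signalling section. */
const char *model_wait_only(void)
{
	model_reset();
	model_might_wait();
	return trace;
}
```

In this model model_run() records R, r, W, w, R, r: the nested begin adds nothing, and the wait temporarily swaps the read annotation for a write one before restoring it. model_wait_only() records just W, w.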
I am certainly confused by the difference between lock_map_acquire/release and lock_acquire/release. What is the difference between the two?
Regards,
Tvrtko
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 14:29 ` Tvrtko Ursulin
@ 2020-06-11 15:03 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-11 15:03 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, DRI Development, Daniel Vetter, Mika Kuoppala,
Christian König, open list:DMA BUFFER SHARING FRAMEWORK
On Thu, Jun 11, 2020 at 4:29 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 11/06/2020 12:29, Daniel Vetter wrote:
> > On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 10/06/2020 16:17, Daniel Vetter wrote:
> >>> On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin
> >>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>>
> >>>>
> >>>> On 04/06/2020 09:12, Daniel Vetter wrote:
> >>>>> Design is similar to the lockdep annotations for workers, but with
> >>>>> some twists:
> >>>>>
> >>>>> - We use a read-lock for the execution/worker/completion side, so that
> >>>>> this explicit annotation can be more liberally sprinkled around.
> >>>>> With read locks lockdep isn't going to complain if the read-side
> >>>>> isn't nested the same way under all circumstances, so ABBA deadlocks
> >>>>> are ok. Which they are, since this is an annotation only.
> >>>>>
> >>>>> - We're using non-recursive lockdep read lock mode, since in recursive
> >>>>> read lock mode lockdep does not catch read side hazards. And we
> >>>>> _very_ much want read side hazards to be caught. For full details of
> >>>>> this limitation see
> >>>>>
> >>>>> commit e91498589746065e3ae95d9a00b068e525eec34f
> >>>>> Author: Peter Zijlstra <peterz@infradead.org>
> >>>>> Date: Wed Aug 23 13:13:11 2017 +0200
> >>>>>
> >>>>> locking/lockdep/selftests: Add mixed read-write ABBA tests
> >>>>>
> >>>>> - To allow nesting of the read-side explicit annotations we explicitly
> >>>>> keep track of the nesting. lock_is_held() allows us to do that.
> >>>>>
> >>>>> - The wait-side annotation is a write lock, and entirely done within
> >>>>> dma_fence_wait() for everyone by default.
> >>>>>
> >>>>> - To be able to freely annotate helper functions I want to make it ok
> >>>>> to call dma_fence_begin/end_signalling from soft/hardirq context.
> >>>>> First attempt was using the hardirq locking context for the write
> >>>>> side in lockdep, but this forces all normal spinlocks nested within
> >>>>> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
> >>>>>
> >>>>> The approach now is to simply check in_atomic(), and for these cases
> >>>>> entirely rely on the might_sleep() check in dma_fence_wait(). That
> >>>>> will catch any wrong nesting against spinlocks from soft/hardirq
> >>>>> contexts.
> >>>>>
> >>>>> The idea here is that every code path that's critical for eventually
> >>>>> signalling a dma_fence should be annotated with
> >>>>> dma_fence_begin/end_signalling. The annotation ideally starts right
> >>>>> after a dma_fence is published (added to a dma_resv, exposed as a
> >>>>> sync_file fd, attached to a drm_syncobj fd, or anything else that
> >>>>> makes the dma_fence visible to other kernel threads), up to and
> >>>>> including the dma_fence_wait(). Examples are irq handlers, the
> >>>>> scheduler rt threads, the tail of execbuf (after the corresponding
> >>>>> fences are visible), any workers that end up signalling dma_fences and
> >>>>> really anything else. Not annotated should be code paths that only
> >>>>> complete fences opportunistically as the gpu progresses, like e.g.
> >>>>> shrinker/eviction code.
> >>>>>
> >>>>> The main class of deadlocks this is supposed to catch are:
> >>>>>
> >>>>> Thread A:
> >>>>>
> >>>>> mutex_lock(A);
> >>>>> mutex_unlock(A);
> >>>>>
> >>>>> dma_fence_signal();
> >>>>>
> >>>>> Thread B:
> >>>>>
> >>>>> mutex_lock(A);
> >>>>> dma_fence_wait();
> >>>>> mutex_unlock(A);
> >>>>>
> >>>>> Thread B is blocked on A signalling the fence, but A never gets around
> >>>>> to that because it cannot acquire the lock A.
> >>>>>
> >>>>> Note that dma_fence_wait() is allowed to be nested within
> >>>>> dma_fence_begin/end_signalling sections. To allow this to happen the
> >>>>> read lock needs to be upgraded to a write lock, which means that if any
> >>>>> other lock is acquired between the dma_fence_begin_signalling() call and
> >>>>> the call to dma_fence_wait(), and still held, this will result in an
> >>>>> immediate lockdep complaint. The only other option would be to not
> >>>>> annotate such calls, defeating the point. Therefore these annotations
> >>>>> cannot be sprinkled over the code entirely mindless to avoid false
> >>>>> positives.
> >>>>>
> >>>>> v2: handle soft/hardirq ctx better against write side and dont forget
> >>>>> EXPORT_SYMBOL, drivers can't use this otherwise.
> >>>>>
> >>>>> v3: Kerneldoc.
> >>>>>
> >>>>> v4: Some spelling fixes from Mika
> >>>>>
> >>>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >>>>> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> >>>>> Cc: linux-media@vger.kernel.org
> >>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>> Cc: linux-rdma@vger.kernel.org
> >>>>> Cc: amd-gfx@lists.freedesktop.org
> >>>>> Cc: intel-gfx@lists.freedesktop.org
> >>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>>>> Cc: Christian König <christian.koenig@amd.com>
> >>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>> ---
> >>>>> Documentation/driver-api/dma-buf.rst | 12 +-
> >>>>> drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
> >>>>> include/linux/dma-fence.h | 12 ++
> >>>>> 3 files changed, 182 insertions(+), 3 deletions(-)
> >>>>>
> >>>>> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> >>>>> index 63dec76d1d8d..05d856131140 100644
> >>>>> --- a/Documentation/driver-api/dma-buf.rst
> >>>>> +++ b/Documentation/driver-api/dma-buf.rst
> >>>>> @@ -100,11 +100,11 @@ CPU Access to DMA Buffer Objects
> >>>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> >>>>> :doc: cpu access
> >>>>>
> >>>>> -Fence Poll Support
> >>>>> -~~~~~~~~~~~~~~~~~~
> >>>>> +Implicit Fence Poll Support
> >>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>>
> >>>>> .. kernel-doc:: drivers/dma-buf/dma-buf.c
> >>>>> - :doc: fence polling
> >>>>> + :doc: implicit fence polling
> >>>>>
> >>>>> Kernel Functions and Structures Reference
> >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>> @@ -133,6 +133,12 @@ DMA Fences
> >>>>> .. kernel-doc:: drivers/dma-buf/dma-fence.c
> >>>>> :doc: DMA fences overview
> >>>>>
> >>>>> +DMA Fence Signalling Annotations
> >>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>> +
> >>>>> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> >>>>> + :doc: fence signalling annotation
> >>>>> +
> >>>>> DMA Fences Functions Reference
> >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>>>
> >>>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> >>>>> index 656e9ac2d028..0005bc002529 100644
> >>>>> --- a/drivers/dma-buf/dma-fence.c
> >>>>> +++ b/drivers/dma-buf/dma-fence.c
> >>>>> @@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
> >>>>> }
> >>>>> EXPORT_SYMBOL(dma_fence_context_alloc);
> >>>>>
> >>>>> +/**
> >>>>> + * DOC: fence signalling annotation
> >>>>> + *
> >>>>> + * Proving correctness of all the kernel code around &dma_fence through code
> >>>>> + * review and testing is tricky for a few reasons:
> >>>>> + *
> >>>>> + * * It is a cross-driver contract, and therefore all drivers must follow the
> >>>>> + * same rules for lock nesting order, calling contexts for various functions
> >>>>> + * and anything else significant for in-kernel interfaces. But it is also
> >>>>> + * impossible to test all drivers in a single machine, hence brute-force N vs.
> >>>>> + * N testing of all combinations is impossible. Even just limiting to the
> >>>>> + * possible combinations is infeasible.
> >>>>> + *
> >>>>> + * * There is an enormous amount of driver code involved. For render drivers
> >>>>> + * there's the tail of command submission, after fences are published,
> >>>>> + * scheduler code, interrupt and workers to process job completion,
> >>>>> + * and timeout, gpu reset and gpu hang recovery code. Plus for integration
> >>>>> + * with core mm with have &mmu_notifier, respectively &mmu_interval_notifier,
> >>>>> + * and &shrinker. For modesetting drivers there's the commit tail functions
> >>>>> + * between when fences for an atomic modeset are published, and when the
> >>>>> + * corresponding vblank completes, including any interrupt processing and
> >>>>> + * related workers. Auditing all that code, across all drivers, is not
> >>>>> + * feasible.
> >>>>> + *
> >>>>> + * * Due to how many other subsystems are involved and the locking hierarchies
> >>>>> + * this pulls in there is extremely thin wiggle-room for driver-specific
> >>>>> + * differences. &dma_fence interacts with almost all of the core memory
> >>>>> + * handling through page fault handlers via &dma_resv, dma_resv_lock() and
> >>>>> + * dma_resv_unlock(). On the other side it also interacts through all
> >>>>> + * allocation sites through &mmu_notifier and &shrinker.
> >>>>> + *
> >>>>> + * Furthermore lockdep does not handle cross-release dependencies, which means
> >>>>> + * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
> >>>>> + * at runtime with some quick testing. The simplest example is one thread
> >>>>> + * waiting on a &dma_fence while holding a lock::
> >>>>> + *
> >>>>> + * lock(A);
> >>>>> + * dma_fence_wait(B);
> >>>>> + * unlock(A);
> >>>>> + *
> >>>>> + * while the other thread is stuck trying to acquire the same lock, which
> >>>>> + * prevents it from signalling the fence the previous thread is stuck waiting
> >>>>> + * on::
> >>>>> + *
> >>>>> + * lock(A);
> >>>>> + * unlock(A);
> >>>>> + * dma_fence_signal(B);
> >>>>> + *
> >>>>> + * By manually annotating all code relevant to signalling a &dma_fence we can
> >>>>> + * teach lockdep about these dependencies, which also helps with the validation
> >>>>> + * headache since now lockdep can check all the rules for us::
> >>>>> + *
> >>>>> + * cookie = dma_fence_begin_signalling();
> >>>>> + * lock(A);
> >>>>> + * unlock(A);
> >>>>> + * dma_fence_signal(B);
> >>>>> + * dma_fence_end_signalling(cookie);
> >>>>> + *
> >>>>> + * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
> >>>>> + * annotate critical sections the following rules need to be observed:
> >>>>> + *
> >>>>> + * * All code necessary to complete a &dma_fence must be annotated, from the
> >>>>> + * point where a fence is accessible to other threads, to the point where
> >>>>> + * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
> >>>>> + * and due to the very strict rules and many corner cases it is infeasible to
> >>>>> + * catch these just with review or normal stress testing.
> >>>>> + *
> >>>>> + * * &struct dma_resv deserves a special note, since the readers are only
> >>>>> + * protected by rcu. This means the signalling critical section starts as soon
> >>>>> + * as the new fences are installed, even before dma_resv_unlock() is called.
> >>>>> + *
> >>>>> + * * The only exception are fast paths and opportunistic signalling code, which
> >>>>> + * calls dma_fence_signal() purely as an optimization, but is not required to
> >>>>> + * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
> >>>>> + * which calls dma_fence_signal(), while the mandatory completion path goes
> >>>>> + * through a hardware interrupt and possible job completion worker.
> >>>>> + *
> >>>>> + * * To aid composability of code, the annotations can be freely nested, as long
> >>>>> + * as the overall locking hierarchy is consistent. The annotations also work
> >>>>> + * both in interrupt and process context. Due to implementation details this
> >>>>> + * requires that callers pass an opaque cookie from
> >>>>> + * dma_fence_begin_signalling() to dma_fence_end_signalling().
> >>>>> + *
> >>>>> + * * Validation against the cross driver contract is implemented by priming
> >>>>> + * lockdep with the relevant hierarchy at boot-up. This means even just
> >>>>> + * testing with a single device is enough to validate a driver, at least as
> >>>>> + * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
> >>>>> + * concerned.
> >>>>> + */
> >>>>> +#ifdef CONFIG_LOCKDEP
> >>>>> +struct lockdep_map dma_fence_lockdep_map = {
> >>>>> + .name = "dma_fence_map"
> >>>>> +};
> >>>>
> >>>> Maybe a stupid question because this is definitely complicated, but.. If
> >>>> you have a single/static/global lockdep map, doesn't this mean _all_
> >>>> locks, from _all_ drivers happening to use dma-fences will get recorded
> >>>> in it. Will this work and not cause false positives?
> >>>>
> >>>> Sounds like it could create a common link between two completely
> >>>> unconnected usages. Because below you do add annotations to generic
> >>>> dma_fence_signal and dma_fence_wait.
> >>>
> >>> This is fully intentional. dma-fence is a cross-driver interface, if
> >>> every driver invents its own rules about how this should work we have
> >>> an unmaintainable and unreviewable mess.
> >>>
> >>> I've typed up the full length rant already here:
> >>>
> >>> https://lore.kernel.org/dri-devel/CAKMK7uGnFhbpuurRsnZ4dvRV9gQ_3-rmSJaoqSFY=+Kvepz_CA@mail.gmail.com/
> >>
> >> But "perfect storm" of:
> >>
> >> + global fence lockmap
> >> + mmu notifiers
> >> + fs reclaim
> >> + default annotations in dma_fence_signal / dma_fence_wait
> >>
> >> Equals to anything ever using dma_fence will be in impossible chains with random other drivers, even if neither driver has code to export/share that fence.
> >>
> >> Example from the CI run:
> >>
> >> [25.918788] Chain exists of:
> >> fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
> >> [25.918794] Possible unsafe locking scenario:
> >> [25.918797] CPU0 CPU1
> >> [25.918799] ---- ----
> >> [25.918801] lock(dma_fence_map);
> >> [25.918803] lock(mmu_notifier_invalidate_range_start);
> >> [25.918807] lock(dma_fence_map);
> >> [25.918809] lock(fs_reclaim);
> >>
> >> What about a dma_fence_export helper which would "arm" the annotations? It would be called as soon as the fence is exported. Maybe when added to dma_resv, or exported via sync_file, etc. Before that point begin/end_signaling and so would be no-ops.
> >
> > Run CI without the i915 annotation patch, nothing breaks.
>
> I think some parts of i915 would still break with my idea to only apply annotations on exported fences. What do you dislike about that idea? I thought the point is to enforce rules for _exported_ fences.
dma_fence is a shared concept, this is upstream, drivers are expected
to a) use shared concepts and b) use them in a consistent way. If
drivers do whatever they feel like then they're not maintainable in the
upstream sense of "maintainable even if the vendor walks away". This
was the reason why amd had to spend 2 refactoring from DAL (which used
all the helpers they shared with their firmware/windows driver) to DC
(which uses all the upstream kms helpers and datastructures directly).
> How you have annotated dma_fence_work you can't say, maybe it is exported maybe it isn't. I think it is btw, so splats would still be there, but I am not sure it is conceptually correct.
>
> At least my understanding is GFP_KERNEL allocations are only disallowed by the virtue of the global dma-fence contract. If you want to enforce they are never used for anything but exporting, then that would be a bit harsh, no?
>
> Another example from the CI run:
>
> [26.585357] CPU0 CPU1
> [26.585359] ---- ----
> [26.585360] lock(dma_fence_map);
> [26.585362] lock(mmu_notifier_invalidate_range_start);
> [26.585365] lock(dma_fence_map);
> [26.585367] lock(i915_gem_object_internal/1);
> [26.585369]
> *** DEADLOCK ***
So ime the above deadlock summaries tend to be wrong as soon as you
have more than 2 locks involved, which is the case here. They only ever
show at most 2 threads, with each thread taking only 2 locks in total,
and a scenario like that isn't going to deadlock on its own when more
than 2 locks are involved.
Personally I just ignore the above deadlock scenario and just always
look at all the locks and backtraces lockdep gives me, and then
reconstruct the dependency graph by hand myself, including deadlock
scenario.
> Let's say someone submitted an execbuf using userptr as a batch and then unmapped it immediately. That would explain CPU1 getting into the mmu notifier and waiting on this batch to unbind the object.
>
> Meanwhile CPU0 is the async command parser for this request trying to lock the shadow batch buffer. Because it uses the dma_fence_work this is between the begin/end signalling markers.
>
> It can be the same dma-fence I think, since we install the async parser fence on the real batch dma-resv, but dma_fence_map is not a real lock, so what is actually preventing progress in this case?
>
> CPU1 is waiting on a fence, but CPU0 can obtain the lock(i915_gem_object_internal/1), proceed to parse the batch, and exit the signalling section. At which point CPU1 is still blocked, waiting until the execbuf finishes and then mmu notifier can finish and invalidate the pages.
>
> Maybe I am missing something but I don't see how this one is real.
The above doesn't deadlock, and it also shouldn't result in a lockdep
splat. The trouble is when the signalling thread also grabs
i915_gem_object_internal/1 somewhere. If you go through the full CI
results you'll see there's more involved (and at least one of the splats
is all just lockdep priming and might_lock, so it could be an annotation
bug on top), and there is indeed a path where we take the driver
private lock in more places, and the wrong way round. That's the thing
lockdep is complaining about, it's just not making that clear in the
summary because the summary is only ever correct for 2 locks. Not if
more is involved.
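To illustrate why the two-column summary is not enough, here is a small userland sketch of reconstructing the dependency graph by hand. It is not how lockdep is implemented; it only records "held A, then took B" edges between three lock classes and searches for a cycle. The first two edges are read off the quoted summary. The third edge (object lock held while entering the mmu notifier) is an assumption made for illustration, since the summary does not say where the remaining dependency was recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named after the entries in the quoted splat. */
enum lock_class { FENCE_MAP, MMU_NOTIFIER, OBJ_LOCK, NUM_CLASSES };

/* edge[a][b] is true when b was taken while a was held. */
static bool edge[NUM_CLASSES][NUM_CLASSES];

static void dep_reset(void)
{
	memset(edge, 0, sizeof(edge));
}

static void dep_add(enum lock_class held, enum lock_class taken)
{
	assert(held < NUM_CLASSES && taken < NUM_CLASSES);
	edge[held][taken] = true;
}

/* Depth-first search: can we get from 'from' to 'to' along recorded edges? */
static bool reaches(int from, int to, bool seen[NUM_CLASSES])
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++)
		if (edge[from][next] && !seen[next] && reaches(next, to, seen))
			return true;
	return false;
}

/* A cycle exists if some edge a -> b can be closed by a path b -> ... -> a. */
static bool dep_has_cycle(void)
{
	for (int a = 0; a < NUM_CLASSES; a++)
		for (int b = 0; b < NUM_CLASSES; b++) {
			bool seen[NUM_CLASSES] = { false };

			if (edge[a][b] && reaches(b, a, seen))
				return true;
		}
	return false;
}

/* Only the two dependencies visible in the CPU0/CPU1 summary. */
bool summary_only_has_cycle(void)
{
	dep_reset();
	dep_add(FENCE_MAP, OBJ_LOCK);      /* CPU0: signalling, then object lock */
	dep_add(MMU_NOTIFIER, FENCE_MAP);  /* CPU1: notifier, then fence wait */
	return dep_has_cycle();
}

/* Same, plus the assumed third dependency recorded somewhere else. */
bool with_assumed_edge_has_cycle(void)
{
	dep_reset();
	dep_add(FENCE_MAP, OBJ_LOCK);
	dep_add(MMU_NOTIFIER, FENCE_MAP);
	dep_add(OBJ_LOCK, MMU_NOTIFIER);   /* assumption, not from the splat */
	return dep_has_cycle();
}
```

With only the two summary edges there is no cycle, which matches the observation that the printed scenario does not deadlock by itself. The cycle only closes once the third dependency is in the graph, and that is the one you have to dig out of the full backtraces.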
> > So we can gradually fix up existing code that doesn't quite get it
> > right and move on.
> >
> >>>>> +
> >>>>> +/**
> >>>>> + * dma_fence_begin_signalling - begin a critical DMA fence signalling section
> >>>>> + *
> >>>>> + * Drivers should use this to annotate the beginning of any code section
> >>>>> + * required to eventually complete &dma_fence by calling dma_fence_signal().
> >>>>> + *
> >>>>> + * The end of these critical sections are annotated with
> >>>>> + * dma_fence_end_signalling().
> >>>>> + *
> >>>>> + * Returns:
> >>>>> + *
> >>>>> + * Opaque cookie needed by the implementation, which needs to be passed to
> >>>>> + * dma_fence_end_signalling().
> >>>>> + */
> >>>>> +bool dma_fence_begin_signalling(void)
> >>>>> +{
> >>>>> + /* explicitly nesting ... */
> >>>>> + if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> >>>>> + return true;
> >>>>> +
> >>>>> + /* rely on might_sleep check for soft/hardirq locks */
> >>>>> + if (in_atomic())
> >>>>> + return true;
> >>>>> +
> >>>>> + /* ... and non-recursive readlock */
> >>>>> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> >>>>
> >>>> Would it work if signalling path would mark itself as a write lock? I am
> >>>> thinking it would be nice to see in lockdep splats what are signals and
> >>>> what are waits.
> >>>
> >>> Yeah it'd be nice to have a read vs write name for the lock. But we
> >>> already have this problem for e.g. flush_work(), from which I've
> >>> stolen this idea. So it's not really new. Essentially look at the
> >>> backtraces lockdep gives you, and reconstruct the deadlock. I'm hoping
> >>> that people will notice the special functions on the backtrace, e.g.
> >>> dma_fence_begin_signalling will be listed as offending function/lock
> >>> holder, and then read the kerneldoc.
> >>>
> >>>> The recursive usage wouldn't work then right? Would write annotation on
> >>>> the wait path work?
> >>>
> >>> Wait path is write annotations already, but yeah annotating the
> >>> signalling side as write would cause endless amounts of false
> >>> positives. Also it makes composability of these e.g. what I've done in
> >>> amdgpu with annotations in tdr work in drm/scheduler, annotations in
> >>> the amdgpu gpu reset code and then also annotations in atomic code,
> >>> which all nest within each other in some call chains, but not others.
> >>> Dropping the recursion would break that and make it really awkward to
> >>> annotate such cases correctly.
> >>>
> >>> And the recursion only works if it's read locks, otherwise lockdep
> >>> complains if you have inconsistent annotations on the signalling side
> >>> (which again would make it more or less impossible to annotate the
> >>> above case fully).
> >>
> >> How do I see in lockdep splats if it was a read or write user? Your patch appears to have:
> >>
> >> dma_fence_signal:
> >> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> >>
> >> __dma_fence_might_wait:
> >> + lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> >>
> >> Which both seem like read lock. I don't fully understand the lockdep API so I might be wrong, not sure. But neither I see a difference in splats telling me which path is which.
> >
> > I think you got tricked by the implementation, this isn't quite what's
> > going on. There's two things which make the annotations special:
> >
> > - we want a recursive read lock on the signalling critical section.
> > The problem is that lockdep doesn't implement full validation for
> > recursive read locks, only non-recursive read/write locks fully
> > validated. There's some checks for recursive read locks, but exactly
> > the checks we need to catch common dma_fence_wait deadlocks aren't
> > done. That's why we need to implement manual lock recursion on the
> > reader side
> >
> > - now on the write side we additionally need to implement an
> > read2write upgrade, and a write2read downgrade. Lockdep doesn't
> > implement that, so again we have to hand-roll this.
> >
> > Let's go through the code line-by-line:
> >
> > bool tmp;
> >
> > tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> >
> > We check whether someone is holding the non-recursive read lock already.
> >
> > if (tmp)
> > lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> >
> > If that's the case, we drop that read lock.
> >
> > lock_map_acquire(&dma_fence_lockdep_map);
> >
> > Then we do the actual might_wait annotation, the above takes the full
> > write lock ...
> >
> > lock_map_release(&dma_fence_lockdep_map);
> >
> > ... and now we release the write lock again.
> >
> >
> > if (tmp)
> > lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> >
> > Finally we need to re-acquire the read lock, if we've held that when
> > entering this function. This annotation naturally has to exactly match
> > what begin_signalling would do, otherwise the hand-rolled nesting
> > would fall apart.
> >
> > I hope that explains what's going on here, and assures you that
> > might_wait() is indeed a write lock annotation, but with a big pile of
> > complications.
>
> I am certainly confused by the difference between lock_map_acquire/release and lock_acquire/release. What is the difference between the two?
lock_map_acquire/release is a wrapper around lock_acquire/release.
This is all lockdep internal, it's a completely undocumented maze, so
unfortunately the only option is to really carefully follow all the
definitions from the various locking primitives. And then compare with
the lockdep self-tests (which use the locking primitives, not the lockdep
internals) to see which flag controls which kind of behaviour.
That's at least what I do, and it's horrible. But yeah lockdep doesn't
have documentation for this.
If you think it's better to open code the lock_map_acquire/release, I guess I
can do that. But it's a mess, so I need to carefully retest everything
and make sure I've set the right flags and bits - for added fun they
also change ordering in some of the wrappers!
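For orientation, here is a hedged userland sketch of how the wrapper and the open-coded call relate. The two macros are written the way I remember them from include/linux/lockdep.h in kernels of this era, so treat them as an assumption and check your own tree. The lock_acquire() stub only keeps the kernel's parameter order and records the read argument, and THIS_IP_PLACEHOLDER is a made-up substitute for the kernel's _THIS_IP_.

```c
#include <assert.h>
#include <stddef.h>

/* Stub type; the real struct lockdep_map has more fields. */
struct lockdep_map {
	const char *name;
};

static struct lockdep_map demo_map = { .name = "dma_fence_map" };
static int last_read_mode = -1;

/*
 * Stub with the kernel's parameter order. For 'read':
 * 0 = exclusive (write), 1 = non-recursive read, 2 = recursive read.
 */
void lock_acquire(struct lockdep_map *lock, unsigned int subclass, int trylock,
		  int read, int check, struct lockdep_map *nest_lock,
		  unsigned long ip)
{
	assert(lock != NULL);
	(void)subclass;
	(void)trylock;
	(void)check;
	(void)nest_lock;
	(void)ip;
	last_read_mode = read;
}

/* Hypothetical placeholder for the caller's instruction pointer. */
#define THIS_IP_PLACEHOLDER 0UL

/* Wrappers as recalled from lockdep.h (assumption, verify in your tree). */
#define lock_acquire_exclusive(l, s, t, n, i) lock_acquire(l, s, t, 0, 1, n, i)
#define lock_map_acquire(l) \
	lock_acquire_exclusive(l, 0, 0, NULL, THIS_IP_PLACEHOLDER)

/* The wrapper used for the might_wait annotation. */
int read_mode_of_lock_map_acquire(void)
{
	lock_map_acquire(&demo_map);
	return last_read_mode;
}

/* The open-coded call used on the signalling side in the patch. */
int read_mode_of_open_coded_read(void)
{
	lock_acquire(&demo_map, 0, 0, 1, 1, NULL, THIS_IP_PLACEHOLDER);
	return last_read_mode;
}
```

Under these assumptions the wrapper ends up passing read = 0 (the write annotation), while the open-coded call from the patch passes read = 1 (the non-recursive read annotation). Only the fourth argument differs between the two.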
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
` (2 preceding siblings ...)
2020-06-10 14:21 ` [Intel-gfx] [PATCH 03/18] " Tvrtko Ursulin
@ 2020-06-11 8:00 ` Chris Wilson
2020-06-11 8:44 ` Dave Airlie
2020-06-12 7:06 ` [Intel-gfx] [PATCH] " Daniel Vetter
4 siblings, 1 reply; 106+ messages in thread
From: Chris Wilson @ 2020-06-11 8:00 UTC (permalink / raw)
To: DRI Development, Daniel Vetter
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, linaro-mm-sig, Thomas Hellstrom, Daniel Vetter,
linux-media, Christian König, Mika Kuoppala
Quoting Daniel Vetter (2020-06-04 09:12:09)
> Design is similar to the lockdep annotations for workers, but with
> some twists:
>
> - We use a read-lock for the execution/worker/completion side, so that
> this explicit annotation can be more liberally sprinkled around.
> With read locks lockdep isn't going to complain if the read-side
> isn't nested the same way under all circumstances, so ABBA deadlocks
> are ok. Which they are, since this is an annotation only.
>
> - We're using non-recursive lockdep read lock mode, since in recursive
> read lock mode lockdep does not catch read side hazards. And we
> _very_ much want read side hazards to be caught. For full details of
> this limitation see
>
> commit e91498589746065e3ae95d9a00b068e525eec34f
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Aug 23 13:13:11 2017 +0200
>
> locking/lockdep/selftests: Add mixed read-write ABBA tests
>
> - To allow nesting of the read-side explicit annotations we explicitly
> keep track of the nesting. lock_is_held() allows us to do that.
>
> - The wait-side annotation is a write lock, and entirely done within
> dma_fence_wait() for everyone by default.
>
> - To be able to freely annotate helper functions I want to make it ok
> to call dma_fence_begin/end_signalling from soft/hardirq context.
> First attempt was using the hardirq locking context for the write
> side in lockdep, but this forces all normal spinlocks nested within
> dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
>
> The approach now is to simply check in_atomic(), and for these cases
> entirely rely on the might_sleep() check in dma_fence_wait(). That
> will catch any wrong nesting against spinlocks from soft/hardirq
> contexts.
>
> The idea here is that every code path that's critical for eventually
> signalling a dma_fence should be annotated with
> dma_fence_begin/end_signalling. The annotation ideally starts right
> after a dma_fence is published (added to a dma_resv, exposed as a
> sync_file fd, attached to a drm_syncobj fd, or anything else that
> makes the dma_fence visible to other kernel threads), up to and
> including the dma_fence_signal(). Examples are irq handlers, the
> scheduler rt threads, the tail of execbuf (after the corresponding
> fences are visible), any workers that end up signalling dma_fences and
> really anything else. Not annotated should be code paths that only
> complete fences opportunistically as the gpu progresses, like e.g.
> shrinker/eviction code.
>
> The main class of deadlocks this is supposed to catch are:
>
> Thread A:
>
> mutex_lock(A);
> mutex_unlock(A);
>
> dma_fence_signal();
>
> Thread B:
>
> mutex_lock(A);
> dma_fence_wait();
> mutex_unlock(A);
>
> Thread B is blocked on A signalling the fence, but A never gets around
> to that because it cannot acquire the lock A.
>
> Note that dma_fence_wait() is allowed to be nested within
> dma_fence_begin/end_signalling sections. To allow this to happen the
> read lock needs to be upgraded to a write lock, which means that if any
> other lock is acquired between the dma_fence_begin_signalling() call and
> the call to dma_fence_wait(), and still held, this will result in an
> immediate lockdep complaint. The only other option would be to not
> annotate such calls, defeating the point. Therefore, to avoid false
> positives, these annotations cannot be sprinkled over the code entirely
> mindlessly.
>
> v2: handle soft/hardirq ctx better against write side and don't forget
> EXPORT_SYMBOL, drivers can't use this otherwise.
>
> v3: Kerneldoc.
>
> v4: Some spelling fixes from Mika
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
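To make the deadlock class quoted above concrete, here is a toy userspace model of the ordering check, assuming a single fence class and a single mutex class. All names (demo_checker, demo_acquire, demo_release) are hypothetical. It ignores the read/write distinction and the transitive chains that real lockdep handles, and is only meant to show why the two threads' orderings conflict.

```c
#include <assert.h>
#include <stdbool.h>

/* Two lock classes: the global fence annotation and "mutex A" from the example. */
enum demo_class { DEMO_FENCE, DEMO_MUTEX_A, DEMO_NCLASS };

struct demo_checker {
	/* before[x][y]: class x was held at some point while class y was taken */
	bool before[DEMO_NCLASS][DEMO_NCLASS];
	bool held[DEMO_NCLASS];
};

/*
 * Record "every currently held class -> c" and return false if the
 * opposite order was recorded earlier (a direct two-class inversion).
 */
static bool demo_acquire(struct demo_checker *ck, enum demo_class c)
{
	bool ok = true;

	assert(c < DEMO_NCLASS);
	for (int h = 0; h < DEMO_NCLASS; h++) {
		if (!ck->held[h] || h == (int)c)
			continue;
		if (ck->before[c][h])
			ok = false; /* c -> h seen before, now h -> c */
		ck->before[h][c] = true;
	}
	ck->held[c] = true;
	return ok;
}

static void demo_release(struct demo_checker *ck, enum demo_class c)
{
	ck->held[c] = false;
}
```

Replaying Thread A as "take the fence class (begin signalling), then mutex A" and Thread B as "take mutex A, then the fence class (wait)" makes the second demo_acquire() in Thread B return false, without either thread ever actually blocking.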
Introducing a global lockmap that cannot capture the rules correctly,
Nacked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 8:00 ` Chris Wilson
@ 2020-06-11 8:44 ` Dave Airlie
2020-06-11 9:01 ` Daniel Stone
0 siblings, 1 reply; 106+ messages in thread
From: Dave Airlie @ 2020-06-11 8:44 UTC (permalink / raw)
To: Chris Wilson
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, amd-gfx mailing list, Daniel Vetter,
Mika Kuoppala, Christian König, Linux Media Mailing List
On Thu, 11 Jun 2020 at 18:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2020-06-04 09:12:09)
> > Design is similar to the lockdep annotations for workers, but with
> > some twists:
> >
> > - We use a read-lock for the execution/worker/completion side, so that
> > this explicit annotation can be more liberally sprinkled around.
> > With read locks lockdep isn't going to complain if the read-side
> > isn't nested the same way under all circumstances, so ABBA deadlocks
> > are ok. Which they are, since this is an annotation only.
> >
> > - We're using non-recursive lockdep read lock mode, since in recursive
> > read lock mode lockdep does not catch read side hazards. And we
> > _very_ much want read side hazards to be caught. For full details of
> > this limitation see
> >
> > commit e91498589746065e3ae95d9a00b068e525eec34f
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date: Wed Aug 23 13:13:11 2017 +0200
> >
> > locking/lockdep/selftests: Add mixed read-write ABBA tests
> >
> > - To allow nesting of the read-side explicit annotations we explicitly
> > keep track of the nesting. lock_is_held() allows us to do that.
> >
> > - The wait-side annotation is a write lock, and entirely done within
> > dma_fence_wait() for everyone by default.
> >
> > - To be able to freely annotate helper functions I want to make it ok
> > to call dma_fence_begin/end_signalling from soft/hardirq context.
> > First attempt was using the hardirq locking context for the write
> > side in lockdep, but this forces all normal spinlocks nested within
> > dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
> >
> > The approach now is to simply check in_atomic(), and for these cases
> > entirely rely on the might_sleep() check in dma_fence_wait(). That
> > will catch any wrong nesting against spinlocks from soft/hardirq
> > contexts.
> >
> > The idea here is that every code path that's critical for eventually
> > signalling a dma_fence should be annotated with
> > dma_fence_begin/end_signalling. The annotation ideally starts right
> > after a dma_fence is published (added to a dma_resv, exposed as a
> > sync_file fd, attached to a drm_syncobj fd, or anything else that
> > makes the dma_fence visible to other kernel threads), up to and
> > including the dma_fence_signal(). Examples are irq handlers, the
> > scheduler rt threads, the tail of execbuf (after the corresponding
> > fences are visible), any workers that end up signalling dma_fences and
> > really anything else. Not annotated should be code paths that only
> > complete fences opportunistically as the gpu progresses, like e.g.
> > shrinker/eviction code.
> >
> > The main class of deadlocks this is supposed to catch are:
> >
> > Thread A:
> >
> > mutex_lock(A);
> > mutex_unlock(A);
> >
> > dma_fence_signal();
> >
> > Thread B:
> >
> > mutex_lock(A);
> > dma_fence_wait();
> > mutex_unlock(A);
> >
> > Thread B is blocked on A signalling the fence, but A never gets around
> > to that because it cannot acquire the lock A.
> >
> > Note that dma_fence_wait() is allowed to be nested within
> > dma_fence_begin/end_signalling sections. To allow this to happen the
> > read lock needs to be upgraded to a write lock, which means that if any
> > other lock is acquired between the dma_fence_begin_signalling() call and
> > the call to dma_fence_wait(), and still held, this will result in an
> > immediate lockdep complaint. The only other option would be to not
> > annotate such calls, defeating the point. Therefore, to avoid false
> > positives, these annotations cannot be sprinkled over the code entirely
> > mindlessly.
> >
> > v2: handle soft/hardirq ctx better against write side and don't forget
> > EXPORT_SYMBOL, drivers can't use this otherwise.
> >
> > v3: Kerneldoc.
> >
> > v4: Some spelling fixes from Mika
> >
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>
> Introducing a global lockmap that cannot capture the rules correctly,
Can you document the rules all drivers should be following then,
because from here it looks to get refactored every version of i915,
and it would be nice if we could all aim for the same set of things
roughly. We've already had enough problems with amdgpu vs i915 vs
everyone else with fences, if this stops that in the future then I'd
rather we have that than just some unwritten rules per driver and
untestable.
Dave.
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 8:44 ` Dave Airlie
@ 2020-06-11 9:01 ` Daniel Stone
2020-06-19 8:25 ` Chris Wilson
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Stone @ 2020-06-11 9:01 UTC (permalink / raw)
To: Dave Airlie
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, Chris Wilson,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
amd-gfx mailing list, Daniel Vetter, Linux Media Mailing List,
Christian König, Mika Kuoppala
Hi,
On Thu, 11 Jun 2020 at 09:44, Dave Airlie <airlied@gmail.com> wrote:
> On Thu, 11 Jun 2020 at 18:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Introducing a global lockmap that cannot capture the rules correctly,
>
> Can you document the rules all drivers should be following then,
> because from here it looks to get refactored every version of i915,
> and it would be nice if we could all aim for the same set of things
> roughly. We've already had enough problems with amdgpu vs i915 vs
> everyone else with fences, if this stops that in the future then I'd
> rather we have that than just some unwritten rules per driver and
> untestable.
As someone who has sunk a bunch of work into explicit-fencing
awareness in my compositor so I can never be blocked, I'd be
disappointed if the infrastructure was ultimately pointless because
the documented fencing rules were \_o_/ or thereabouts. Lockdep
definitely isn't my area of expertise so I can't comment on the patch
per se, but having something to ensure we don't hit deadlocks sure
seems a lot better than nothing.
Cheers,
Daniel
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-11 9:01 ` Daniel Stone
@ 2020-06-19 8:25 ` Chris Wilson
2020-06-19 8:51 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Chris Wilson @ 2020-06-19 8:25 UTC (permalink / raw)
To: Daniel Stone, Dave Airlie
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, amd-gfx mailing list, Daniel Vetter,
Linux Media Mailing List, Christian König, Mika Kuoppala
Quoting Daniel Stone (2020-06-11 10:01:46)
> Hi,
>
> On Thu, 11 Jun 2020 at 09:44, Dave Airlie <airlied@gmail.com> wrote:
> > On Thu, 11 Jun 2020 at 18:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > Introducing a global lockmap that cannot capture the rules correctly,
> >
> > Can you document the rules all drivers should be following then,
> > because from here it looks to get refactored every version of i915,
> > and it would be nice if we could all aim for the same set of things
> > roughly. We've already had enough problems with amdgpu vs i915 vs
> > everyone else with fences, if this stops that in the future then I'd
> > rather we have that than just some unwritten rules per driver and
> > untestable.
>
> As someone who has sunk a bunch of work into explicit-fencing
> awareness in my compositor so I can never be blocked, I'd be
> disappointed if the infrastructure was ultimately pointless because
> the documented fencing rules were \_o_/ or thereabouts. Lockdep
> definitely isn't my area of expertise so I can't comment on the patch
> per se, but having something to ensure we don't hit deadlocks sure
> seems a lot better than nothing.
This is doing dependency analysis on execution contexts which is a far
cry from doing the fence dependency analysis, and so has to actively
ignore the cycles that must exist on the dma side, and also the cycles
that prevent entering execution contexts on the CPU. It has to actively
ignore scheduler execution contexts, for lockdep cries, and so we do not
get analysis of the locking contexts along that path. This would be
solvable along the lines of extending lockdep ala lockdep_dma_enter().
Had i915's execution flow been marked up, it should have found the
dubious wait for external fences inside the dead GPU recovery, and
probably found a few more things to complain about with the reset locking.
[Note we already do the same annotations for wait-vs-reset, but not
reset-vs-execution.]
Determination of which waits are legal and which are not is entirely ad
hoc, for there is no status change tracking in the dependency analysis
[that is once an execution context is linked to a published fence, again
integral to lockdep.] Consider if the completion chain in atomic is
swapped out for the morally equivalent fences along intertwined timelines,
and so it does a bunch of dma_fence_wait() instead. Why are those waits
legal despite them being after we have committed to fulfilling the out
fence? [Why are the waits on and for the GPU legal, since they equally
block execution flow?]
Forcing a generic primitive to always be part of the same global map is
horrible. You forgo being able to use the primitive for unrelated tasks,
lose the ability to name particular contexts to gain more informative
dependency cycle reports from having the explicit linkage. You can add
wait_map tracking without loss of generality [in less than 10 lines],
and you can still enforce that all fences used for a common purpose
follow the same rules [the simplest way being to default to the singular
wait_map]. But it's the explicitly named execution contexts that are the
biggest boon to reading the code and reading the lockdep warns.
This is a bunch of ad hoc tracking for a very narrow purpose applied
globally, with loss of information.
-Chris
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 8:25 ` Chris Wilson
@ 2020-06-19 8:51 ` Daniel Vetter
2020-06-19 9:13 ` Chris Wilson
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-19 8:51 UTC (permalink / raw)
To: Chris Wilson
Cc: amd-gfx mailing list, linux-rdma, Intel Graphics Development,
LKML, DRI Development,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
Daniel Vetter, Mika Kuoppala, Christian König,
Linux Media Mailing List
On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Stone (2020-06-11 10:01:46)
> > Hi,
> >
> > On Thu, 11 Jun 2020 at 09:44, Dave Airlie <airlied@gmail.com> wrote:
> > > On Thu, 11 Jun 2020 at 18:01, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > > Introducing a global lockmap that cannot capture the rules correctly,
> > >
> > > Can you document the rules all drivers should be following then,
> > > because from here it looks to get refactored every version of i915,
> > > and it would be nice if we could all aim for the same set of things
> > > roughly. We've already had enough problems with amdgpu vs i915 vs
> > > everyone else with fences, if this stops that in the future then I'd
> > > rather we have that than just some unwritten rules per driver and
> > > untestable.
> >
> > As someone who has sunk a bunch of work into explicit-fencing
> > awareness in my compositor so I can never be blocked, I'd be
> > disappointed if the infrastructure was ultimately pointless because
> > the documented fencing rules were \_o_/ or thereabouts. Lockdep
> > definitely isn't my area of expertise so I can't comment on the patch
> > per se, but having something to ensure we don't hit deadlocks sure
> > seems a lot better than nothing.
>
> This is doing dependency analysis on execution contexts which is a far
> cry from doing the fence dependency analysis, and so has to actively
> ignore the cycles that must exist on the dma side, and also the cycles
> that prevent entering execution contexts on the CPU. It has to actively
> ignore scheduler execution contexts, for lockdep cries, and so we do not
> get analysis of the locking contexts along that path. This would be
> solvable along the lines of extending lockdep ala lockdep_dma_enter().
drm/scheduler is annotated, and that found some issues which are rather
improbable to hit in practice. But from the quick chat I've had with
König and others I think he agrees that they're real, at least in the
theoretical sense. Probably should consider playing the lottery if you
hit one in practice though :-)
> Had i915's execution flow been marked up, it should have found the
> dubious wait for external fences inside the dead GPU recovery, and
> probably found a few more things to complain about with the reset locking.
> [Note we already do the same annotations for wait-vs-reset, but not
> reset-vs-execution.]
I know it splats, that's why the tdr annotation patch comes with a
spec proposal for lifting the wait busting we do in i915 to the
dma_fence level. I included that because amdgpu has the same problem
on modern hw. Apparently their planned fix (because they've hit this
bug in testing) was to push some shared lock down into their
atomic_commit_tail function and use that in gpu reset, so they don't seem
that interested in extending dma_fence.
For i915 it's just gen2/3 display, and cross-driver dma-buf/fence
usage for those is nil and won't change. Pragmatic solution imo would
be to just not annotate gpu reset on these platforms, and rely on
our wait busting plus igt tests to make sure it keeps working as-is.
The point of the explicit annotations for the signalling side is very
much that it can be rolled out gradually, and entirely left out for
old legacy paths that aren't worth fixing.
> Determination of which waits are legal and which are not is entirely ad
> hoc, for there is no status change tracking in the dependency analysis
> [that is once an execution context is linked to a published fence, again
> integral to lockdep.] Consider if the completion chain in atomic is
> swapped out for the morally equivalent fences along intertwined timelines,
> and so it does a bunch of dma_fence_wait() instead. Why are those waits
> legal despite them being after we have committed to fulfilling the out
> fence? [Why are the waits on and for the GPU legal, since they equally
> block execution flow?]
No need to consider, it's already real and resulted in some pretty
splats until I got the recursion handling right.
> Forcing a generic primitive to always be part of the same global map is
> horrible. You forgo being able to use the primitive for unrelated tasks,
> lose the ability to name particular contexts to gain more informative
> dependency cycle reports from having the explicit linkage. You can add
> wait_map tracking without loss of generality [in less than 10 lines],
> and you can still enforce that all fences used for a common purpose
> follow the same rules [the simplest way being to default to the singular
> wait_map]. But it's the explicitly named execution contexts that are the
> biggest boon to reading the code and reading the lockdep warns.
So one thing that's maybe not clear here: This doesn't track the DAG
of dependencies. Doesn't even try, I'm still faithfully assuming
drivers get that part right. Which is a gap and maybe we should fix
this, but not the goal here.
All this does is validate fences against anything else that might be
going on in the system. E.g. your recursion example for atomic is
handled by just assuming that any dma_fence_wait within a signalling
section is legit and correct. We can add checking for that later on, but
not with lockdep, since lockdep works with classes. And proving that
dma_fences are acyclic requires you to track them all as individuals.
Entirely different things.
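To illustrate the difference, a check for acyclic-ness has to walk individual fences rather than classes. A minimal userspace sketch, assuming a hypothetical struct dep_fence that records which other fences it waits on (none of this is existing dma_fence API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEP_FENCE_MAX_DEPS 4

/* Hypothetical per-fence node; not the kernel's struct dma_fence. */
struct dep_fence {
	struct dep_fence *deps[DEP_FENCE_MAX_DEPS]; /* fences this one waits on */
	size_t ndeps;
	int state; /* 0 = unvisited, 1 = on the DFS stack, 2 = fully explored */
};

static void dep_fence_add_dep(struct dep_fence *f, struct dep_fence *dep)
{
	assert(f->ndeps < DEP_FENCE_MAX_DEPS);
	f->deps[f->ndeps++] = dep;
}

/* Depth-first search: returns true if a dependency cycle is reachable from f. */
static bool dep_fence_has_cycle(struct dep_fence *f)
{
	if (f->state == 1)
		return true;  /* reached a fence that is still on the stack */
	if (f->state == 2)
		return false; /* already explored, no cycle through here */

	f->state = 1;
	for (size_t i = 0; i < f->ndeps; i++)
		if (dep_fence_has_cycle(f->deps[i]))
			return true;
	f->state = 2;
	return false;
}
```

Every fence instance carries its own state here, which is exactly what a class-based tool like lockdep does not track. The state fields have to be reset to 0 before the walk is repeated.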
That still leaves the below:
> Forcing a generic primitive to always be part of the same global map is
> horrible.
And no concrete example or reason for why that's not possible.
Because frankly it's not horrible, this is what upstream is all about:
Shared concepts, shared contracts, shared code.
The proposed patches might very well encode the wrong contract, that's
all up for discussion. But fundamentally questioning that we need one
is missing what upstream is all about.
> This is a bunch of ad hoc tracking for a very narrow purpose applied
> globally, with loss of information.
It doesn't solve every problem indeed. I'm happy to review patches to
check acyclic-ness of dma-fence at the global level from you, I
haven't figured out yet how to make that happen. I know i915-gem has
that, but this is about the cross-driver contract here.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 8:51 ` Daniel Vetter
@ 2020-06-19 9:13 ` Chris Wilson
2020-06-19 9:43 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Chris Wilson @ 2020-06-19 9:13 UTC (permalink / raw)
To: Daniel Vetter
Cc: amd-gfx mailing list, linux-rdma, Intel Graphics Development,
LKML, DRI Development,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
Daniel Vetter, Mika Kuoppala, Christian König,
Linux Media Mailing List
Quoting Daniel Vetter (2020-06-19 09:51:59)
> On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Forcing a generic primitive to always be part of the same global map is
> > horrible.
>
> And no concrete example or reason for why that's not possible.
> Because frankly it's not horrible, this is what upstream is all about:
> Shared concepts, shared contracts, shared code.
>
> The proposed patches might very well encode the wrong contract, that's
> all up for discussion. But fundamentally questioning that we need one
> is missing what upstream is all about.
Then I have not clearly communicated, as my opinion is not that
validation is worthless, but that the implementation is enshrining a
global property on a low level primitive that prevents it from being
used elsewhere. And I want to replace completion [chains] with fences, and
bio with fences, and closures with fences, and what other equivalencies
there are in the kernel. The fence is as central a locking construct as
struct completion and deserves to be a foundational primitive provided
by kernel/ used throughout all drivers for discrete problem domains.
This is narrowing dma_fence whereby adding
struct lockdep_map *dma_fence::wait_map
and annotating linkage, allows you to continue to specify that all
dma_fence used for a particular purpose must follow common rules,
without restricting the primitive for uses outside of this scope.
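A rough userspace sketch of that idea, assuming hypothetical names throughout (demo_lockdep_map, mapped_fence, mapped_fence_wait_map); it only shows the "per-fence map pointer with a shared default" lookup, not any real lockdep integration:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct lockdep_map; only carries a name for reports. */
struct demo_lockdep_map {
	const char *name;
};

/* The single shared map that fences fall back to when nothing else is set. */
static struct demo_lockdep_map demo_global_wait_map = { "dma_fence_map" };

/* Hypothetical fence with an optional, explicitly named wait map. */
struct mapped_fence {
	struct demo_lockdep_map *wait_map; /* NULL means "use the shared default" */
};

/* Pick the map that wait/signal annotations for this fence would use. */
static struct demo_lockdep_map *mapped_fence_wait_map(const struct mapped_fence *f)
{
	assert(f != NULL);
	return f->wait_map ? f->wait_map : &demo_global_wait_map;
}
```

Fences that leave wait_map at NULL all share one set of rules, while a user with an unrelated problem domain can point wait_map at its own named map.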
-Chris
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 9:13 ` Chris Wilson
@ 2020-06-19 9:43 ` Daniel Vetter
2020-06-19 13:12 ` Chris Wilson
2020-07-09 7:29 ` Daniel Stone
0 siblings, 2 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-19 9:43 UTC (permalink / raw)
To: Chris Wilson
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, amd-gfx mailing list, Daniel Vetter,
Linux Media Mailing List, Christian König, Mika Kuoppala
On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2020-06-19 09:51:59)
> > On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > Forcing a generic primitive to always be part of the same global map is
> > > horrible.
> >
> > And no concrete example or reason for why that's not possible.
> > Because frankly it's not horrible, this is what upstream is all about:
> > Shared concepts, shared contracts, shared code.
> >
> > The proposed patches might very well encode the wrong contract, that's
> > all up for discussion. But fundamentally questioning that we need one
> > is missing what upstream is all about.
>
> Then I have not clearly communicated, as my opinion is not that
> validation is worthless, but that the implementation is enshrining a
> global property on a low level primitive that prevents it from being
> used elsewhere. And I want to replace completion [chains] with fences, and
> bio with fences, and closures with fences, and what other equivalencies
> there are in the kernel. The fence is as central a locking construct as
> struct completion and deserves to be a foundational primitive provided
> by kernel/ used throughout all drivers for discrete problem domains.
>
> This is narrowing dma_fence whereby adding
> struct lockdep_map *dma_fence::wait_map
> and annotating linkage, allows you to continue to specify that all
> dma_fence used for a particular purpose must follow common rules,
> without restricting the primitive for uses outside of this scope.
Somewhere else in this thread I had discussions with Jason Gunthorpe about
this topic. It might maybe change somewhat depending upon exact rules, but
his take is very much "I don't want dma_fence in rdma". Or pretty close to
that at least.
Similar discussions with habanalabs, they're using dma_fence internally
without any of the uapi. Discussion there has also now concluded that it's
best if they remove them, and simply switch over to a wait_queue or
completion like every other driver does.
The next round of the patches already have a paragraph to at least
somewhat limit how non-gpu drivers use dma_fence. And I guess actual
consensus might be pointing even more strongly at dma_fence being solely
something for gpus and closely related subsystems (maybe media) for syncing
dma-buf access.
So dma_fence as general replacement for completion chains I think just
won't happen.
What might make sense is if e.g. the lockdep annotations could be reused,
at least in design, for wait_queue or completion or anything else
really. I do think that has a fair chance compared to the automagic
cross-release annotations approach, which relied way too heavily on
guessing where barriers are. My experience from just a bit of playing
around with these patches here and discussing them with other driver
maintainers is that accurately deciding where critical sections start and
end is a job for humans only. And if you get it wrong, you will have a
false positive.
And you're indeed correct that if we'd do annotations for completions and
wait queues, then that would need to have a class per semantically
equivalent user, like we have lockdep classes for mutexes, not just one
overall.
But dma_fence otoh is something very specific, which comes with very
specific rules attached - it's not a generic wait_queue at all. Originally
it even did start out as one, but by now it is a very specialized wait_queue.
So there's imo two cases:
- Your completion is entirely orthogonal of dma_fences, and can never ever
block a dma_fence. Don't use dma_fence for this, and no problem. It's
just another wait_queue somewhere.
- Your completion can eventually, maybe through lots of convolutions and
dependencies, block a dma_fence. In that case full dma_fence rules apply,
and the only thing you can do with a custom annotation is make the rules
even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to
take certain scheduler locks. But the userspace visible/published fences
do take them, maybe as part of command submission or retirement.
Entirely hypothetical, no idea whether any driver actually needs this.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 9:43 ` Daniel Vetter
@ 2020-06-19 13:12 ` Chris Wilson
2020-06-22 9:16 ` Daniel Vetter
2020-07-09 7:29 ` Daniel Stone
1 sibling, 1 reply; 106+ messages in thread
From: Chris Wilson @ 2020-06-19 13:12 UTC (permalink / raw)
To: Daniel Vetter
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
Thomas Hellstrom, amd-gfx mailing list, Daniel Vetter,
Linux Media Mailing List, Christian König, Mika Kuoppala
Quoting Daniel Vetter (2020-06-19 10:43:09)
> On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2020-06-19 09:51:59)
> > > On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > > Forcing a generic primitive to always be part of the same global map is
> > > > horrible.
> > >
> > > And no concrete example or reason for why that's not possible.
> > > Because frankly it's not horrible, this is what upstream is all about:
> > > Shared concepts, shared contracts, shared code.
> > >
> > > The proposed patches might very well encode the wrong contract, that's
> > > all up for discussion. But fundamentally questioning that we need one
> > > is missing what upstream is all about.
> >
> > Then I have not clearly communicated, as my opinion is not that
> > validation is worthless, but that the implementation is enshrining a
> > global property on a low level primitive that prevents it from being
> > used elsewhere. And I want to replace completion [chains] with fences, and
> > bio with fences, and closures with fences, and what other equivalencies
> > there are in the kernel. The fence is as central a locking construct as
> > struct completion and deserves to be a foundational primitive provided
> > by kernel/ used throughout all drivers for discrete problem domains.
> >
> > This is narrowing dma_fence whereby adding
> > struct lockdep_map *dma_fence::wait_map
> > and annotating linkage, allows you to continue to specify that all
> > dma_fence used for a particular purpose must follow common rules,
> > without restricting the primitive for uses outside of this scope.
>
> Somewhere else in this thread I had discussions with Jason Gunthorpe about
> this topic. It might maybe change somewhat depending upon exact rules, but
> his take is very much "I don't want dma_fence in rdma". Or pretty close to
> that at least.
>
> Similar discussions with habanalabs, they're using dma_fence internally
> without any of the uapi. Discussion there has also now concluded that it's
> best if they remove them, and simply switch over to a wait_queue or
> completion like every other driver does.
>
> The next round of the patches already have a paragraph to at least
> somewhat limit how non-gpu drivers use dma_fence. And I guess actual
> consensus might be pointing even more strongly at dma_fence being solely
> something for gpus and closely related subsystem (maybe media) for syncing
> dma-buf access.
>
> So dma_fence as general replacement for completion chains I think just
> won't happen.
That is sad. I cannot comprehend going back to pure completions after a
taste of fence scheduling. And we are not even close to fully utilising
them, as not all the async cpu [allocation!] tasks are fully tracked by
fences yet and are still stuck in a FIFO workqueue.
> What might make sense is if e.g. the lockdep annotations could be reused,
> at least in design, for wait_queue or completion or anything else
> really. I do think that has a fair chance compared to the automagic
> cross-release annotations approach, which relied way too heavily on
> guessing where barriers are. My experience from just a bit of playing
> around with these patches here and discussing them with other driver
> maintainers is that accurately deciding where critical sections start and
> end is a job for humans only. And if you get it wrong, you will have a
> false positive.
>
> And you're indeed correct that if we'd do annotations for completions and
> wait queues, then that would need to have a class per semantically
> equivalent user, like we have lockdep classes for mutexes, not just one
> overall.
>
> But dma_fence otoh is something very specific, which comes with very
> specific rules attached - it's not a generic wait_queue at all. Originally
> it did start out as one even, but it is a very specialized wait_queue.
>
> So there's imo two cases:
>
> - Your completion is entirely orthogonal of dma_fences, and can never ever
> block a dma_fence. Don't use dma_fence for this, and no problem. It's
> just another wait_queue somewhere.
>
> - Your completion can eventually, maybe through lots of convolutions and
> dependencies, block a dma_fence. In that case full dma_fence rules apply,
> and the only thing you can do with a custom annotation is make the rules
> even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to
> take certain scheduler locks. But the userspace visible/published fence
> do take them, maybe as part of command submission or retirement.
> Entirely hypothetical, no idea any driver actually needs this.
I think we are faced with this very real problem.
The papering we have today over userptr is so very thin, and if you
squint you can already see it is coupled into the completion signal. It
just happens to be on the other side of the fence.
The next batch of priority inversions involves integrating the async cpu
tasks into the scheduler, and having full dependency tracking over every
internal fence. I do not see any way to avoid coupling the completion
signal from the GPU to the earliest resource allocation, as it's an
unbroken chain of work, at least from the user's perspective. [Next up
for annotations is that we need to always assume that userspace has an
implicit lock on GPU resources; having to break that lock with a GPU
reset should be a breach of our data integrity, and best avoided, for
compute does not care one iota about system integrity and insists
userspace knows best.] Such allocations have to be allowed to fail and
for that failure to propagate cancelling the queued work, such that I'm
considering what rules we need for gfp_t. That might allow enough
leverage to break any fs_reclaim loops, but userptr is likely forever
doomed [aside from its fs_reclaim loop is as preventable as the normal
shrinker paths], but we still need to suggest to pin_user_pages that
failure is better than oom and that is not clear atm. Plus the usual
failure can happen at any time after updating the user facing
bookkeeping, but that is just extra layers in the execution monitor
ready to step in and replace failing work with the error propagation.
Or where the system grinds to a halt, requiring the monitor to patch in
a new page / resource.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 13:12 ` Chris Wilson
@ 2020-06-22 9:16 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-22 9:16 UTC (permalink / raw)
To: Chris Wilson
Cc: linux-rdma, Intel Graphics Development, LKML, DRI Development,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
amd-gfx mailing list, Daniel Vetter, Linux Media Mailing List,
Christian König, Mika Kuoppala
On Fri, Jun 19, 2020 at 3:12 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2020-06-19 10:43:09)
> > On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote:
> > > Quoting Daniel Vetter (2020-06-19 09:51:59)
> > > > On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > > > Forcing a generic primitive to always be part of the same global map is
> > > > > horrible.
> > > >
> > > > And no concrete example or reason for why that's not possible.
> > > > Because frankly it's not horrible, this is what upstream is all about:
> > > > Shared concepts, shared contracts, shared code.
> > > >
> > > > The proposed patches might very well encode the wrong contract, that's
> > > > all up for discussion. But fundamentally questioning that we need one
> > > > is missing what upstream is all about.
> > >
> > > Then I have not clearly communicated, as my opinion is not that
> > > validation is worthless, but that the implementation is enshrining a
> > > global property on a low level primitive that prevents it from being
> > > used elsewhere. And I want to replace completion [chains] with fences, and
> > > bio with fences, and closures with fences, and what other equivalencies
> > > there are in the kernel. The fence is as central a locking construct as
> > > struct completion and deserves to be a foundational primitive provided
> > > by kernel/ used throughout all drivers for discrete problem domains.
> > >
> > > This is narrowing dma_fence whereby adding
> > > struct lockdep_map *dma_fence::wait_map
> > > and annotating linkage, allows you to continue to specify that all
> > > dma_fence used for a particular purpose must follow common rules,
> > > without restricting the primitive for uses outside of this scope.
> >
> > Somewhere else in this thread I had discussions with Jason Gunthorpe about
> > this topic. It might maybe change somewhat depending upon exact rules, but
> > his take is very much "I don't want dma_fence in rdma". Or pretty close to
> > that at least.
> >
> > Similar discussions with habanalabs, they're using dma_fence internally
> > without any of the uapi. Discussion there has also now concluded that it's
> > best if they remove them, and simply switch over to a wait_queue or
> > completion like every other driver does.
> >
> > The next round of the patches already have a paragraph to at least
> > somewhat limit how non-gpu drivers use dma_fence. And I guess actual
> > consensus might be pointing even more strongly at dma_fence being solely
> > something for gpus and closely related subsystem (maybe media) for syncing
> > dma-buf access.
> >
> > So dma_fence as general replacement for completion chains I think just
> > won't happen.
>
> That is sad. I cannot comprehend going back to pure completions after a
> taste of fence scheduling. And we are not even close to fully utilising
> them, as not all the async cpu [allocation!] tasks are fully tracked by
> fences yet and are still stuck in a FIFO workqueue.
>
> > What might make sense is if e.g. the lockdep annotations could be reused,
> > at least in design, for wait_queue or completion or anything else
> > really. I do think that has a fair chance compared to the automagic
> > cross-release annotations approach, which relied way too heavily on
> > guessing where barriers are. My experience from just a bit of playing
> > around with these patches here and discussing them with other driver
> > maintainers is that accurately deciding where critical sections start and
> > end is a job for humans only. And if you get it wrong, you will have a
> > false positive.
> >
> > And you're indeed correct that if we'd do annotations for completions and
> > wait queues, then that would need to have a class per semantically
> > equivalent user, like we have lockdep classes for mutexes, not just one
> > overall.
> >
> > But dma_fence otoh is something very specific, which comes with very
> > specific rules attached - it's not a generic wait_queue at all. Originally
> > it did start out as one even, but it is a very specialized wait_queue.
> >
> > So there's imo two cases:
> >
> > - Your completion is entirely orthogonal of dma_fences, and can never ever
> > block a dma_fence. Don't use dma_fence for this, and no problem. It's
> > just another wait_queue somewhere.
> >
> > - Your completion can eventually, maybe through lots of convolutions and
> > dependencies, block a dma_fence. In that case full dma_fence rules apply,
> > and the only thing you can do with a custom annotation is make the rules
> > even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to
> > take certain scheduler locks. But the userspace visible/published fence
> > do take them, maybe as part of command submission or retirement.
> > Entirely hypothetical, no idea any driver actually needs this.
>
> I think we are faced with this very real problem.
>
> The papering we have today over userptr is so very thin, and if you
> squint you can already see it is coupled into the completion signal. Just
> it happens to be on the other side of the fence.
>
> The next batch of priority inversions involves integrating the async cpu
> tasks into the scheduler, and having full dependency tracking over every
> internal fence. I do not see any way to avoid coupling the completion
> signal from the GPU to the earliest resource allocation, as it's an
> unbroken chain of work, at least from the user's perspective. [Next up
> for annotations is that we need to always assume that userspace has an
> implicit lock on GPU resources; having to break that lock with a GPU
> reset should be a breach of our data integrity, and best avoided, for
> compute does not care one iota about system integrity and insists
> userspace knows best.] Such allocations have to be allowed to fail and
> for that failure to propagate cancelling the queued work, such that I'm
> considering what rules we need for gfp_t. That might allow enough
> leverage to break any fs_reclaim loops, but userptr is likely forever
> doomed [aside from its fs_reclaim loop is as preventable as the normal
> shrinker paths], but we still need to suggest to pin_user_pages that
> failure is better than oom and that is not clear atm. Plus the usual
> failure can happen at any time after updating the user facing
> bookkeeping, but that is just extra layers in the execution monitor
> ready to step in and replace failing work with the error propagation.
> Or where the system grinds to a halt, requiring the monitor to patch in
> a new page / resource.
Zooming out a bunch, since this is a lot about the details of making
this happen, and I want to make sure I'm understanding your aim
correctly. I think we have 2 big things here interacting:
On one side the "everything async" push, for some value of everything.
Once everything is async we let either the linux scheduler (for
dma_fence_work) or the gpu scheduler (for i915_request) figure out how
to order everything, with all the dependencies. For memory allocations
there's likely quite a bit of retrying (on the allocation side) and
skipping (on the shrinker/mmu notifier side) involved to make this all
pan out. Maybe something like a GFP_NOGPU flag.
On the other side we have opinionated userspace with both very
long-running batches (they might as well be infinite, best we can do
is check that they still preempt within a reasonable amount of time,
lack of hw support for preemption in all cases notwithstanding). And
batches which synchronize across engines and whatever entirely under
userspace control, with stuff like gpu semaphore waits entirely in
the cmd stream, without any kernel or gpu scheduler involvement. Well
maybe a slightly smarter gpu scheduler which converts the semaphore
wait from a pure busy loop into a "repoll on each scheduler
timeslice". But not actual dependency tracking awareness in the kernel
(or guc/hw fwiw) of what userspace is really trying to do.
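To make the "repoll on each scheduler timeslice" idea concrete, here is a minimal userspace sketch. Everything in it (struct sketch_ctx, sketch_timeslice(), the greater-or-equal compare) is a hypothetical model, not i915 or guc code. It only shows that the scheduler samples the semaphore once per slice and otherwise knows nothing about who will eventually write it.

```c
#include <assert.h>
#include <stdint.h>

/* Outcome of offering one timeslice to a context (hypothetical names). */
enum sketch_slice { SKETCH_BLOCKED, SKETCH_RAN };

struct sketch_ctx {
	const volatile uint32_t *semaphore; /* memory written by some other batch */
	uint32_t wait_value;                /* batch may proceed once *semaphore >= this */
	unsigned int slices_run;            /* timeslices actually spent executing */
};

/*
 * Offer one timeslice. The semaphore is sampled exactly once: if the wait
 * is not satisfied the context gives the slice back instead of spinning.
 * The scheduler tracks no dependency here, it merely repolls next time.
 */
static enum sketch_slice sketch_timeslice(struct sketch_ctx *ctx)
{
	assert(ctx && ctx->semaphore);
	if (*ctx->semaphore < ctx->wait_value)
		return SKETCH_BLOCKED;
	ctx->slices_run++;
	return SKETCH_RAN;
}
```

If nobody ever writes the semaphore, the context simply stays blocked forever, which is the "any wait can take forever" property that comes up in the next paragraph.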
The latter is a big motivator for the former, since with arbitrary long
batches and arbitrary fences any wait for a batch to complete can take
forever, hence anything that might end up doing that needs to be done
async and without locks. That way we don't have to shoot anything if a
batch takes too long.
Finally if anything goes wrong (on the kernel side at least) we just
propagate fence error state through the entire ladder of in-flight
things (only if it goes wrong terminally ofc).
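As a rough illustration of that error propagation, here is a small userspace model. It is a sketch under assumptions: struct sketch_fence and both helpers are made up for this mail and are not the kernel's struct dma_fence API, and the fixed-size waiter table only keeps the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_WAITERS 4

/* Hypothetical stand-in for a fence; not the kernel's struct dma_fence. */
struct sketch_fence {
	int signaled;                                     /* 1 once completed, ok or not */
	int error;                                        /* 0 or a negative errno */
	struct sketch_fence *waiters[SKETCH_MAX_WAITERS]; /* work queued behind us */
	size_t nr_waiters;
};

/* Queue 'waiter' behind 'dep'. Returns -ENOSPC when the table is full. */
static int sketch_fence_add_waiter(struct sketch_fence *dep,
				   struct sketch_fence *waiter)
{
	if (dep->nr_waiters == SKETCH_MAX_WAITERS)
		return -ENOSPC;
	dep->waiters[dep->nr_waiters++] = waiter;
	return 0;
}

/*
 * Signal 'f'. On success the waiters are left alone, they still have their
 * own work to do. On a terminal error everything queued behind 'f' is
 * cancelled with the same error, all the way down the ladder.
 */
static void sketch_fence_signal(struct sketch_fence *f, int error)
{
	size_t i;

	if (f->signaled)
		return; /* already completed, keep the first result */
	f->signaled = 1;
	f->error = error;
	if (!error)
		return;
	for (i = 0; i < f->nr_waiters; i++)
		sketch_fence_signal(f->waiters[i], error);
}
```

With an allocation fence feeding a copy which feeds a render job, signalling the allocation with -ENOMEM leaves all three completed with -ENOMEM and nothing is left waiting.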
Roughly correct or did I miss a big (or small but really important) thing?
Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-06-19 9:43 ` Daniel Vetter
2020-06-19 13:12 ` Chris Wilson
@ 2020-07-09 7:29 ` Daniel Stone
2020-07-09 8:01 ` Daniel Vetter
1 sibling, 1 reply; 106+ messages in thread
From: Daniel Stone @ 2020-07-09 7:29 UTC (permalink / raw)
To: Chris Wilson, amd-gfx mailing list, linux-rdma,
Intel Graphics Development, LKML, DRI Development,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
Daniel Vetter, Mika Kuoppala, Christian König,
Linux Media Mailing List
Cc: Daniel Vetter
Hi,
Jumping in after a couple of weeks where I've paged most everything
out of my brain ...
On Fri, 19 Jun 2020 at 10:43, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote:
> > > The proposed patches might very well encode the wrong contract, that's
> > > all up for discussion. But fundamentally questioning that we need one
> > > is missing what upstream is all about.
> >
> > Then I have not clearly communicated, as my opinion is not that
> > validation is worthless, but that the implementation is enshrining a
> > global property on a low level primitive that prevents it from being
> > used elsewhere. And I want to replace completion [chains] with fences, and
> > bio with fences, and closures with fences, and what other equivalencies
> > there are in the kernel. The fence is as central a locking construct as
> > struct completion and deserves to be a foundational primitive provided
> > by kernel/ used throughout all drivers for discrete problem domains.
> >
> > This is narrowing dma_fence whereby adding
> > struct lockdep_map *dma_fence::wait_map
> > and annotating linkage, allows you to continue to specify that all
> > dma_fence used for a particular purpose must follow common rules,
> > without restricting the primitive for uses outside of this scope.
>
> Somewhere else in this thread I had discussions with Jason Gunthorpe about
> this topic. It might maybe change somewhat depending upon exact rules, but
> his take is very much "I don't want dma_fence in rdma". Or pretty close to
> that at least.
>
> Similar discussions with habanalabs, they're using dma_fence internally
> without any of the uapi. Discussion there has also now concluded that it's
> best if they remove them, and simply switch over to a wait_queue or
> completion like every other driver does.
>
> The next round of the patches already have a paragraph to at least
> somewhat limit how non-gpu drivers use dma_fence. And I guess actual
> consensus might be pointing even more strongly at dma_fence being solely
> something for gpus and closely related subsystem (maybe media) for syncing
> dma-buf access.
>
> So dma_fence as general replacement for completion chains I think just
> won't happen.
>
> What might make sense is if e.g. the lockdep annotations could be reused,
> at least in design, for wait_queue or completion or anything else
> really. I do think that has a fair chance compared to the automagic
> cross-release annotations approach, which relied way too heavily on
> guessing where barriers are. My experience from just a bit of playing
> around with these patches here and discussing them with other driver
> maintainers is that accurately deciding where critical sections start and
> end is a job for humans only. And if you get it wrong, you will have a
> false positive.
>
> And you're indeed correct that if we'd do annotations for completions and
> wait queues, then that would need to have a class per semantically
> equivalent user, like we have lockdep classes for mutexes, not just one
> overall.
>
> But dma_fence otoh is something very specific, which comes with very
> specific rules attached - it's not a generic wait_queue at all. Originally
> it did start out as one even, but it is a very specialized wait_queue.
>
> So there's imo two cases:
>
> - Your completion is entirely orthogonal of dma_fences, and can never ever
> block a dma_fence. Don't use dma_fence for this, and no problem. It's
> just another wait_queue somewhere.
>
> - Your completion can eventually, maybe through lots of convolutions and
> dependencies, block a dma_fence. In that case full dma_fence rules apply,
> and the only thing you can do with a custom annotation is make the rules
> even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to
> take certain scheduler locks. But the userspace visible/published fence
> do take them, maybe as part of command submission or retirement.
> Entirely hypothetical, no idea any driver actually needs this.
I don't claim to understand the implementation of i915's scheduler and
GEM handling, and it seems like there's some public context missing
here. But to me, the above is a good statement of what I (and a lot of
other userspace) have been relying on - that dma-fence is a very
tightly scoped thing which is very predictable but in extremis.
It would be great to have something like this enshrined in dma-fence
documentation, visible to both kernel and external users. The
properties we've so far been assuming for the graphics pipeline -
covering production & execution of vertex/fragment workloads on the
GPU, framebuffer display, and to the extent this is necessary
involving compute - are something like this:
A single dma-fence with no dependencies represents (the tail of) a
unit of work, which has been all but committed to the hardware. Once
committed to the hardware, this work will complete (successfully or in
error) in bounded time. The unit of work referred to by a dma-fence
may carry dependencies on other dma-fences, which must of course be
subject to the same restrictions as above. No action from any
userspace component is required to ensure that the completion occurs.
The cases I know of which legitimately blow holes in this are:
- the work is scheduled but GPU execution resource contention
prevents it from completion, e.g. something on a higher-priority
context repeatedly gets scheduled in front of it - this is OK because
by definition it's what should happen
- the work is scheduled but CPU execution resource contention
prevents it from completion, e.g. the DRM scheduler does not get to
trigger the hardware to execute the work - this is OK because at this
point we have a big system-wide problem
- the work is scheduled but non-execution resource contention
prevents it from making progress, e.g. VRAM contention and/or a paging
storm - this is OK because again we have a larger problem here and we
can't reasonably expect the driver to solve this
- the work is executed but execution does not complete due to the
nature of the work, e.g. a chain of work contains a hostile compute
shader which does not complete in any reasonable time - this is OK
because we require TDR; even without a smart compositor detecting
based on fence waits that the work is unsuitable and should not hold
up other work, the driver will probably ban the context and lock it
out anyway
The first three are general system resource-overload cases, no
different from the CPU-side equivalent where it's up to the admin to
impose ulimits to prevent forkbombs or runaway memory usage, or up to
the user to run fewer Electron apps. The last one is more difficult,
because we can't solve the halting problem to know ahead of time that
the user has submitted an infinite workload, so we have to live with
that as a real hazard and mitigate it where we can (by returning -EIO
and killing the app from inside Mesa).
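The "bounded time" property for that last case can be sketched as a watchdog, in the spirit of TDR. This is a userspace model with assumed names (struct sketch_job, sketch_tdr_check(), a plain tick counter); real drivers do considerably more, such as resetting the engine and banning the context.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of one unit of work and its fence state. */
struct sketch_job {
	int done;             /* fence signaled, successfully or in error */
	int error;            /* 0 or a negative errno */
	unsigned int started; /* tick at which the hardware picked the job up */
	unsigned int timeout; /* ticks the job may run before it is reset */
};

/* Called when the hardware reports completion. A prior reset wins. */
static void sketch_job_complete(struct sketch_job *job)
{
	if (job->done)
		return;
	job->done = 1;
	job->error = 0;
}

/*
 * Called periodically by the watchdog. Returns 1 if the job overran its
 * budget and was completed in error, 0 otherwise. Either way the fence
 * signals in bounded time without any help from userspace.
 */
static int sketch_tdr_check(struct sketch_job *job, unsigned int now)
{
	if (job->done || now - job->started < job->timeout)
		return 0;
	job->done = 1;
	job->error = -EIO;
	return 1;
}
```

A waiter on the fence therefore sees either success or -EIO after at most the timeout, never an unbounded wait.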
If repurposing dma-fence for non-graphics uses (like general-purpose
compute or driver-internal tracking for things other than GPU
workloads) makes it more difficult to guarantee the above properties,
then I don't want to do it. Maybe the answer is that dma-fence gets
split into its core infrastructure which can be used for completion
chains, with actual dma-fence being layered above generic completion
APIs: other-completion-API can consume fences, but fences _cannot_
consume non-fence things.
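One way to picture that layering is to let the types enforce the direction of the dependency. The sketch below is purely illustrative: struct layered_fence and struct sketch_completion are invented names, and the userspace_ready flag stands in for any condition without a completion guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* A fence may depend only on another fence. There is deliberately no
 * field through which a non-fence object could be attached. */
struct layered_fence {
	int signaled;
	struct layered_fence *dep; /* NULL or another fence, nothing else */
};

/* The generic completion sits above fences and may consume one, together
 * with a condition that carries no guarantee of ever becoming true. */
struct sketch_completion {
	struct layered_fence *fence; /* optional fence to wait for */
	int userspace_ready;         /* unbounded: set whenever userspace likes */
};

/* A fence can start executing once its fence-only dependency signaled. */
static int layered_fence_ready(const struct layered_fence *f)
{
	return !f->dep || f->dep->signaled;
}

/* A completion is done once its fence (if any) signaled and the
 * unbounded condition holds. */
static int sketch_completion_done(const struct sketch_completion *c)
{
	if (c->fence && !c->fence->signaled)
		return 0;
	return c->userspace_ready;
}
```

Because layered_fence has no pointer to a sketch_completion, a fence can never end up waiting on userspace, while the completion is free to wait on both.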
This does force a split between graphics (GL/Vulkan/display) workloads
and compute (CL/oneAPI/HSA/CUDA), which I get is really difficult to
resolve in the driver. But the two are hard split anyway: graphics
requires upfront and explicit buffer management, in return dangling
the carrot that you can pipeline your workloads and expect completion
in reasonable time. General-purpose compute lets you go far more YOLO
on resource access, including full userptr SVM, but the flipside is
that your execution time might be measured in weeks; as a result you
don't get to do execution pipelining because even if you could, it's
not a big enough win relative to your execution time to be worth the
extra driver and system complexity. I don't think there's a reasonable
lowest common denominator between the two that we can try to reuse a
generic model for both, because you make too many compromises to try
to fit conflicting interests.
In the pre-syncobj days, we did look at what we called 'empty fences'
or 'future fences' with the ChromeOS team: a synchronisation object
which wasn't backed by a promise of completion as dma-fence is, but
instead by the meta-promise (from userspace) of a promise of
completion. Ultimately it never became a real thing for the same
reason that swsync isn't either; it needed so much special-case
handling and so many disclaimers and opt-ins everywhere that by the
end, we weren't sure why we were trying to shoehorn it into dma-fence
apart from dma-fence already existing - but by removing all its
guarantees, we also removed all its usefulness as a primitive.
Cheers,
Daniel
* Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
2020-07-09 7:29 ` Daniel Stone
@ 2020-07-09 8:01 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-07-09 8:01 UTC (permalink / raw)
To: Daniel Stone
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, Chris Wilson,
moderated list:DMA BUFFER SHARING FRAMEWORK, Thomas Hellstrom,
amd-gfx mailing list, Daniel Vetter, Linux Media Mailing List,
Christian König, Mika Kuoppala
On Thu, Jul 09, 2020 at 08:29:21AM +0100, Daniel Stone wrote:
> Hi,
> Jumping in after a couple of weeks where I've paged most everything
> out of my brain ...
>
> On Fri, 19 Jun 2020 at 10:43, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Fri, Jun 19, 2020 at 10:13:35AM +0100, Chris Wilson wrote:
> > > > The proposed patches might very well encode the wrong contract, that's
> > > > all up for discussion. But fundamentally questioning that we need one
> > > > is missing what upstream is all about.
> > >
> > > Then I have not clearly communicated, as my opinion is not that
> > > validation is worthless, but that the implementation is enshrining a
> > > global property on a low level primitive that prevents it from being
> > > used elsewhere. And I want to replace completion [chains] with fences, and
> > > bio with fences, and closures with fences, and what other equivalencies
> > > there are in the kernel. The fence is as central a locking construct as
> > > struct completion and deserves to be a foundational primitive provided
> > > by kernel/ used throughout all drivers for discrete problem domains.
> > >
> > > This is narrowing dma_fence whereby adding
> > > struct lockdep_map *dma_fence::wait_map
> > > and annotating linkage, allows you to continue to specify that all
> > > dma_fence used for a particular purpose must follow common rules,
> > > without restricting the primitive for uses outside of this scope.
> >
> > Somewhere else in this thread I had discussions with Jason Gunthorpe about
> > this topic. It might maybe change somewhat depending upon exact rules, but
> > his take is very much "I don't want dma_fence in rdma". Or pretty close to
> > that at least.
> >
> > Similar discussions with habanalabs, they're using dma_fence internally
> > without any of the uapi. Discussion there has also now concluded that it's
> > best if they remove them, and simply switch over to a wait_queue or
> > completion like every other driver does.
> >
> > The next round of the patches already have a paragraph to at least
> > somewhat limit how non-gpu drivers use dma_fence. And I guess actual
> > consensus might be pointing even more strongly at dma_fence being solely
> > something for gpus and closely related subsystem (maybe media) for syncing
> > dma-buf access.
> >
> > So dma_fence as general replacement for completion chains I think just
> > won't happen.
> >
> > What might make sense is if e.g. the lockdep annotations could be reused,
> > at least in design, for wait_queue or completion or anything else
> > really. I do think that has a fair chance compared to the automagic
> > cross-release annotations approach, which relied way too heavily on
> > guessing where barriers are. My experience from just a bit of playing
> > around with these patches here and discussing them with other driver
> > maintainers is that accurately deciding where critical sections start and
> > end is a job for humans only. And if you get it wrong, you will have a
> > false positive.
> >
> > And you're indeed correct that if we'd do annotations for completions and
> > wait queues, then that would need to have a class per semantically
> > equivalent user, like we have lockdep classes for mutexes, not just one
> > overall.
> >
> > But dma_fence otoh is something very specific, which comes with very
> > specific rules attached - it's not a generic wait_queue at all. Originally
> > it did start out as one even, but it is a very specialized wait_queue.
> >
> > So there's imo two cases:
> >
> > - Your completion is entirely orthogonal of dma_fences, and can never ever
> > block a dma_fence. Don't use dma_fence for this, and no problem. It's
> > just another wait_queue somewhere.
> >
> > - Your completion can eventually, maybe through lots of convolutions and
> > dependencies, block a dma_fence. In that case full dma_fence rules apply,
> > and the only thing you can do with a custom annotation is make the rules
> > even stricter. E.g. if a sub-timeline in the scheduler isn't allowed to
> > take certain scheduler locks. But the userspace visible/published fence
> > do take them, maybe as part of command submission or retirement.
> > Entirely hypothetical, no idea any driver actually needs this.
>
> I don't claim to understand the implementation of i915's scheduler and
> GEM handling, and it seems like there's some public context missing
> here. But to me, the above is a good statement of what I (and a lot of
> other userspace) have been relying on - that dma-fence is a very
> tightly scoped thing which is very predictable but in extremis.
>
> It would be great to have something like this enshrined in dma-fence
> documentation, visible to both kernel and external users. The
> properties we've so far been assuming for the graphics pipeline -
> covering production & execution of vertex/fragment workloads on the
> GPU, framebuffer display, and to the extent this is necessary
> involving compute - are something like this:
>
> A single dma-fence with no dependencies represents (the tail of) a
> unit of work, which has been all but committed to the hardware. Once
> committed to the hardware, this work will complete (successfully or in
> error) in bounded time. The unit of work referred to by a dma-fence
> may carry dependencies on other dma-fences, which must of course be
> subject to the same restrictions as above. No action from any
> userspace component is required to ensure that the completion occurs.
>
> The cases I know of which legitimately blow holes in this are:
> - the work is scheduled but GPU execution resource contention
> prevents it from completion, e.g. something on a higher-priority
> context repeatedly gets scheduled in front of it - this is OK because
> by definition it's what should happen
> - the work is scheduled but CPU execution resource contention
> prevents it from completion, e.g. the DRM scheduler does not get to
> trigger the hardware to execute the work - this is OK because at this
> point we have a big system-wide problem
> - the work is scheduled but non-execution resource contention
> prevents it from making progress, e.g. VRAM contention and/or a paging
> storm - this is OK because again we have a larger problem here and we
> can't reasonably expect the driver to solve this
> - the work is executed but execution does not complete due to the
> nature of the work, e.g. a chain of work contains a hostile compute
> shader which does not complete in any reasonable time - this is OK
> because we require TDR; even without a smart compositor detecting
> based on fence waits that the work is unsuitable and should not hold
> up other work, the driver will probably ban the context and lock it
> out anyway
>
> The first three are general system resource-overload cases, no
> different from the CPU-side equivalent where it's up to the admin to
> impose ulimits to prevent forkbombs or runaway memory usage, or up to
> the user to run fewer Electron apps. The last one is more difficult,
> because we can't solve the halting problem to know ahead of time that
> the user has submitted an infinite workload, so we have to live with
> that as a real hazard and mitigate it where we can (by returning -EIO
> and killing the app from inside Mesa).
>
> If repurposing dma-fence for non-graphics uses (like general-purpose
> compute or driver-internal tracking for things other than GPU
> workloads) makes it more difficult to guarantee the above properties,
> then I don't want to do it. Maybe the answer is that dma-fence gets
> split into its core infrastructure which can be used for completion
> chains, with actual dma-fence being layered above generic completion
> APIs: other-completion-API can consume fences, but fences _cannot_
> consume non-fence things.
>
> This does force a split between graphics (GL/Vulkan/display) workloads
> and compute (CL/oneAPI/HSA/CUDA), which I get is really difficult to
> resolve in the driver. But the two are hard split anyway: graphics
> requires upfront and explicit buffer management, in return dangling
> the carrot that you can pipeline your workloads and expect completion
> in reasonable time. General-purpose compute lets you go far more YOLO
> on resource access, including full userptr SVM, but the flipside is
> that your execution time might be measured in weeks; as a result you
> don't get to do execution pipelining because even if you could, it's
> not a big enough win relative to your execution time to be worth the
> extra driver and system complexity. I don't think there's a reasonable
> lowest common denominator between the two that we can try to reuse a
> generic model for both, because you make too many compromises to try
> to fit conflicting interests.
>
> In the pre-syncobj days, we did look at what we called 'empty fences'
> or 'future fences' with the ChromeOS team: a synchronisation object
> which wasn't backed by a promise of completion as dma-fence is, but
> instead by the meta-promise (from userspace) of a promise of
> completion. Ultimately it never became a real thing for the same
> reason that swsync isn't either; it needed so much special-case
> handling and so many disclaimers and opt-ins everywhere that by the
> end, we weren't sure why we were trying to shoehorn it into dma-fence
> apart from dma-fence already existing - but by removing all its
> guarantees, we also removed all its usefulness as a primitive.
New series has a patch which tries to at least somewhat summarize this
entire problem, and why it just doesn't work. It doesn't yet contain the full
proposed solution, but maybe that's best for a follow-up patch. Anyway,
it's probably best if we poke holes at that text there.
Between the preempt ctx fence in amdgpu and userspace fences or gpu futex
or whatever you want to call it, I do think we can make the compute side
happy. The sad puppy face comes a bit from vulkan, since vulkan would
really like the same execution model, but because it needs to integrate
with the overall dma-fence based compositor stack, it can't.
I think even that is solvable, if we have vulkan-based compositors and a
completely new set of protocols and uapi from client all the way down to
display. That makes it about as bad a flag day as atomic+modifiers.
Also the only reason why the kms driver can then suddenly import a
userspace fence, while nothing else in the kernel can allow such
dependencies is fairly simple: Framebuffers are pinned, which breaks the
dependency loops in the memory manager, and so avoids all the troubles in
a slightly different form.
And of course we'd need a timeout in case userspace just screwed up
somehow.
-Daniel
>
> Cheers,
> Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] [PATCH] dma-fence: basic lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
` (3 preceding siblings ...)
2020-06-11 8:00 ` Chris Wilson
@ 2020-06-12 7:06 ` Daniel Vetter
4 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-12 7:06 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellström, Daniel Vetter,
Mika Kuoppala, Christian König, linux-media
Design is similar to the lockdep annotations for workers, but with
some twists:
- We use a read-lock for the execution/worker/completion side, so that
this explicit annotation can be more liberally sprinkled around.
With read locks lockdep isn't going to complain if the read-side
isn't nested the same way under all circumstances, so ABBA deadlocks
are ok. Which they are, since this is an annotation only.
- We're using non-recursive lockdep read lock mode, since in recursive
read lock mode lockdep does not catch read side hazards. And we
_very_ much want read side hazards to be caught. For full details of
this limitation see
commit e91498589746065e3ae95d9a00b068e525eec34f
Author: Peter Zijlstra <peterz@infradead.org>
Date: Wed Aug 23 13:13:11 2017 +0200
locking/lockdep/selftests: Add mixed read-write ABBA tests
- To allow nesting of the read-side explicit annotations we explicitly
keep track of the nesting. lock_is_held() allows us to do that.
- The wait-side annotation is a write lock, and entirely done within
dma_fence_wait() for everyone by default.
- To be able to freely annotate helper functions I want to make it ok
to call dma_fence_begin/end_signalling from soft/hardirq context.
First attempt was using the hardirq locking context for the write
side in lockdep, but this forces all normal spinlocks nested within
dma_fence_begin/end_signalling to be spinlocks. That's bollocks.
The approach now is to simply check in_atomic(), and for these cases
entirely rely on the might_sleep() check in dma_fence_wait(). That
will catch any wrong nesting against spinlocks from soft/hardirq
contexts.
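The nesting and in_atomic() rules from the list above can be modelled in
plain userspace C. This is only a rough sketch to illustrate the cookie
semantics, not the kernel implementation: every model_* name is
hypothetical, model_map_held stands in for lockdep's "read side of the
dma_fence map is held" state, and model_in_atomic stands in for in_atomic().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "the read-side annotation is currently held". */
int model_map_held;
/* Hypothetical stand-in for in_atomic(). */
bool model_in_atomic;

/*
 * Returns true when nothing was acquired (nested call or atomic context),
 * false when this call actually took the read-side annotation.
 */
bool model_begin_signalling(void)
{
	if (model_map_held)	/* explicit nesting: already inside a section */
		return true;
	if (model_in_atomic)	/* rely on might_sleep() in the wait path */
		return true;
	model_map_held = 1;	/* outermost section takes the annotation */
	return false;
}

/* Only the call that took the annotation is allowed to drop it again. */
void model_end_signalling(bool cookie)
{
	if (cookie)
		return;
	model_map_held = 0;
}
```

The point of the cookie is that an inner begin/end pair becomes a no-op, so
helper functions can be annotated without knowing whether their caller
already is.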
The idea here is that every code path that's critical for eventually
signalling a dma_fence should be annotated with
dma_fence_begin/end_signalling. The annotation ideally starts right
after a dma_fence is published (added to a dma_resv, exposed as a
sync_file fd, attached to a drm_syncobj fd, or anything else that
makes the dma_fence visible to other kernel threads), up to and
including the dma_fence_signal(). Examples are irq handlers, the
scheduler rt threads, the tail of execbuf (after the corresponding
fences are visible), any workers that end up signalling dma_fences and
really anything else. Not annotated should be code paths that only
complete fences opportunistically as the gpu progresses, like e.g.
shrinker/eviction code.
The main class of deadlocks this is supposed to catch are:
Thread A:
	mutex_lock(A);
	mutex_unlock(A);
	dma_fence_signal();
Thread B:
	mutex_lock(A);
	dma_fence_wait();
	mutex_unlock(A);
Thread B is blocked on A signalling the fence, but A never gets around
to that because it cannot acquire the lock A.
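To make the ordering argument concrete, here is a toy userspace sketch of
the dependency that the annotations let lockdep see. It is an assumption-
laden model, not how lockdep works internally: the model_* names are
hypothetical, the fence is treated as just another lock class, and only
direct two-class (ABBA) inversions are detected, whereas real lockdep
searches the full dependency graph.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes: mutex A and the dma_fence annotation map. */
enum { MODEL_LOCK_A, MODEL_FENCE, MODEL_NCLASSES };

/* model_dep[x][y] records "class y was acquired while class x was held". */
bool model_dep[MODEL_NCLASSES][MODEL_NCLASSES];

/*
 * Record the new ordering and report whether the reverse ordering has
 * already been seen, i.e. whether this acquisition closes an ABBA cycle.
 */
bool model_acquire_while_holding(int held, int taken)
{
	model_dep[held][taken] = true;
	return model_dep[taken][held];
}
```

Thread B waiting on the fence under mutex A records A -> fence. Once thread
A is annotated, taking mutex A inside its signalling section records
fence -> A, and the model flags the inversion even if the two threads never
actually race at runtime.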
Note that dma_fence_wait() is allowed to be nested within
dma_fence_begin/end_signalling sections. To allow this to happen the
read lock needs to be upgraded to a write lock, which means that if any
other lock is acquired between the dma_fence_begin_signalling() call and
the call to dma_fence_wait(), and is still held, this will result in an
immediate lockdep complaint. The only other option would be to not
annotate such calls, defeating the point. Therefore these annotations
cannot be sprinkled over the code entirely mindlessly, to avoid false
positives.
Originally I hoped that the cross-release lockdep extensions would
alleviate the need for explicit annotations:
https://lwn.net/Articles/709849/
But there's a few reasons why that's not an option:
- It's not happening in upstream, since it got reverted due to too
many false positives:
commit e966eaeeb623f09975ef362c2866fae6f86844f9
Author: Ingo Molnar <mingo@kernel.org>
Date: Tue Dec 12 12:31:16 2017 +0100
locking/lockdep: Remove the cross-release locking checks
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.
- cross-release uses the complete() call to annotate the end of
critical sections, for dma_fence that would be dma_fence_signal().
But we do not want all dma_fence_signal() calls to be treated as
critical, since many are opportunistic cleanup of gpu requests. If
these get stuck there's still the main completion interrupt and
workers who can unblock everyone. Automatically annotating all
dma_fence_signal() calls would hence cause false positives.
- cross-release had some educated guesses for when a critical section
starts, like fresh syscall or fresh work callback. This would again
cause false positives without explicit annotations, since for
dma_fence the critical section only starts when we publish a fence.
- Furthermore there can be cases where a thread never does a
dma_fence_signal, but is still critical for reaching completion of
fences. One example would be a scheduler kthread which picks up jobs
and pushes them into hardware, where the interrupt handler or
another completion thread calls dma_fence_signal(). But if the
scheduler thread hangs, then all the fences hang, hence we need to
manually annotate it. cross-release aimed to solve this by chaining
cross-release dependencies, but the dependency from scheduler thread
to the completion interrupt handler goes through hw where
cross-release code can't observe it.
In short, without manual annotations and careful review of the start
and end of critical sections, cross-release dependency tracking doesn't
work. We need explicit annotations.
v2: handle soft/hardirq ctx better against write side and don't forget
EXPORT_SYMBOL, drivers can't use this otherwise.
v3: Kerneldoc.
v4: Some spelling fixes from Mika
v5: Amend commit message to explain in detail why cross-release isn't
the solution.
v6: Pull out misplaced .rst hunk.
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 6 +
drivers/dma-buf/dma-fence.c | 161 +++++++++++++++++++++++++++
include/linux/dma-fence.h | 12 ++
3 files changed, 179 insertions(+)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 7fb7b661febd..05d856131140 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -133,6 +133,12 @@ DMA Fences
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: DMA fences overview
+DMA Fence Signalling Annotations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: fence signalling annotation
+
DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 656e9ac2d028..0005bc002529 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -110,6 +110,160 @@ u64 dma_fence_context_alloc(unsigned num)
}
EXPORT_SYMBOL(dma_fence_context_alloc);
+/**
+ * DOC: fence signalling annotation
+ *
+ * Proving correctness of all the kernel code around &dma_fence through code
+ * review and testing is tricky for a few reasons:
+ *
+ * * It is a cross-driver contract, and therefore all drivers must follow the
+ * same rules for lock nesting order, calling contexts for various functions
+ * and anything else significant for in-kernel interfaces. But it is also
+ * impossible to test all drivers in a single machine, hence brute-force N vs.
+ * N testing of all combinations is impossible. Even just limiting to the
+ * possible combinations is infeasible.
+ *
+ * * There is an enormous amount of driver code involved. For render drivers
+ * there's the tail of command submission, after fences are published,
+ * scheduler code, interrupt and workers to process job completion,
+ * and timeout, gpu reset and gpu hang recovery code. Plus for integration
+ * with core mm we have &mmu_notifier, respectively &mmu_interval_notifier,
+ * and &shrinker. For modesetting drivers there's the commit tail functions
+ * between when fences for an atomic modeset are published, and when the
+ * corresponding vblank completes, including any interrupt processing and
+ * related workers. Auditing all that code, across all drivers, is not
+ * feasible.
+ *
+ * * Due to how many other subsystems are involved and the locking hierarchies
+ * this pulls in there is extremely thin wiggle-room for driver-specific
+ * differences. &dma_fence interacts with almost all of the core memory
+ * handling through page fault handlers via &dma_resv, dma_resv_lock() and
+ * dma_resv_unlock(). On the other side it also interacts through all
+ * allocation sites through &mmu_notifier and &shrinker.
+ *
+ * Furthermore lockdep does not handle cross-release dependencies, which means
+ * any deadlocks between dma_fence_wait() and dma_fence_signal() can't be caught
+ * at runtime with some quick testing. The simplest example is one thread
+ * waiting on a &dma_fence while holding a lock::
+ *
+ * lock(A);
+ * dma_fence_wait(B);
+ * unlock(A);
+ *
+ * while the other thread is stuck trying to acquire the same lock, which
+ * prevents it from signalling the fence the previous thread is stuck waiting
+ * on::
+ *
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ *
+ * By manually annotating all code relevant to signalling a &dma_fence we can
+ * teach lockdep about these dependencies, which also helps with the validation
+ * headache since now lockdep can check all the rules for us::
+ *
+ * cookie = dma_fence_begin_signalling();
+ * lock(A);
+ * unlock(A);
+ * dma_fence_signal(B);
+ * dma_fence_end_signalling(cookie);
+ *
+ * For using dma_fence_begin_signalling() and dma_fence_end_signalling() to
+ * annotate critical sections the following rules need to be observed:
+ *
+ * * All code necessary to complete a &dma_fence must be annotated, from the
+ * point where a fence is accessible to other threads, to the point where
+ * dma_fence_signal() is called. Un-annotated code can contain deadlock issues,
+ * and due to the very strict rules and many corner cases it is infeasible to
+ * catch these just with review or normal stress testing.
+ *
+ * * &struct dma_resv deserves a special note, since the readers are only
+ * protected by rcu. This means the signalling critical section starts as soon
+ * as the new fences are installed, even before dma_resv_unlock() is called.
+ *
+ * * The only exception are fast paths and opportunistic signalling code, which
+ * calls dma_fence_signal() purely as an optimization, but is not required to
+ * guarantee completion of a &dma_fence. The usual example is a wait IOCTL
+ * which calls dma_fence_signal(), while the mandatory completion path goes
+ * through a hardware interrupt and possible job completion worker.
+ *
+ * * To aid composability of code, the annotations can be freely nested, as long
+ * as the overall locking hierarchy is consistent. The annotations also work
+ * both in interrupt and process context. Due to implementation details this
+ * requires that callers pass an opaque cookie from
+ * dma_fence_begin_signalling() to dma_fence_end_signalling().
+ *
+ * * Validation against the cross driver contract is implemented by priming
+ * lockdep with the relevant hierarchy at boot-up. This means even just
+ * testing with a single device is enough to validate a driver, at least as
+ * far as deadlocks with dma_fence_wait() against dma_fence_signal() are
+ * concerned.
+ */
+#ifdef CONFIG_LOCKDEP
+struct lockdep_map dma_fence_lockdep_map = {
+ .name = "dma_fence_map"
+};
+
+/**
+ * dma_fence_begin_signalling - begin a critical DMA fence signalling section
+ *
+ * Drivers should use this to annotate the beginning of any code section
+ * required to eventually complete &dma_fence by calling dma_fence_signal().
+ *
+ * The end of these critical sections is annotated with
+ * dma_fence_end_signalling().
+ *
+ * Returns:
+ *
+ * Opaque cookie needed by the implementation, which needs to be passed to
+ * dma_fence_end_signalling().
+ */
+bool dma_fence_begin_signalling(void)
+{
+ /* explicitly nesting ... */
+ if (lock_is_held_type(&dma_fence_lockdep_map, 1))
+ return true;
+
+ /* rely on might_sleep check for soft/hardirq locks */
+ if (in_atomic())
+ return true;
+
+ /* ... and non-recursive readlock */
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
+
+ return false;
+}
+EXPORT_SYMBOL(dma_fence_begin_signalling);
+
+/**
+ * dma_fence_end_signalling - end a critical DMA fence signalling section
+ *
+ * Closes a critical section annotation opened by dma_fence_begin_signalling().
+ */
+void dma_fence_end_signalling(bool cookie)
+{
+ if (cookie)
+ return;
+
+ lock_release(&dma_fence_lockdep_map, _RET_IP_);
+}
+EXPORT_SYMBOL(dma_fence_end_signalling);
+
+void __dma_fence_might_wait(void)
+{
+ bool tmp;
+
+ tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
+ if (tmp)
+ lock_release(&dma_fence_lockdep_map, _THIS_IP_);
+ lock_map_acquire(&dma_fence_lockdep_map);
+ lock_map_release(&dma_fence_lockdep_map);
+ if (tmp)
+ lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
+}
+#endif
+
+
/**
* dma_fence_signal_locked - signal completion of a fence
* @fence: the fence to signal
@@ -170,14 +324,19 @@ int dma_fence_signal(struct dma_fence *fence)
{
unsigned long flags;
int ret;
+ bool tmp;
if (!fence)
return -EINVAL;
+ tmp = dma_fence_begin_signalling();
+
spin_lock_irqsave(fence->lock, flags);
ret = dma_fence_signal_locked(fence);
spin_unlock_irqrestore(fence->lock, flags);
+ dma_fence_end_signalling(tmp);
+
return ret;
}
EXPORT_SYMBOL(dma_fence_signal);
@@ -210,6 +369,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
might_sleep();
+ __dma_fence_might_wait();
+
trace_dma_fence_wait_start(fence);
if (fence->ops->wait)
ret = fence->ops->wait(fence, intr, timeout);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 3347c54f3a87..3f288f7db2ef 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
} while (1);
}
+#ifdef CONFIG_LOCKDEP
+bool dma_fence_begin_signalling(void);
+void dma_fence_end_signalling(bool cookie);
+#else
+static inline bool dma_fence_begin_signalling(void)
+{
+ return true;
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
+static inline void __dma_fence_might_wait(void) {}
+#endif
+
int dma_fence_signal(struct dma_fence *fence);
int dma_fence_signal_locked(struct dma_fence *fence);
signed long dma_fence_default_wait(struct dma_fence *fence,
--
2.26.2
^ permalink raw reply related [flat|nested] 106+ messages in thread
* [Intel-gfx] [PATCH 04/18] dma-fence: prime lockdep annotations
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (2 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-11 7:30 ` [Intel-gfx] [Linaro-mm-sig] " Thomas Hellström (Intel)
2020-06-12 7:01 ` [Intel-gfx] [PATCH] " Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 05/18] drm/vkms: Annotate vblank timer Daniel Vetter
` (26 subsequent siblings)
30 siblings, 2 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Thomas Hellstrom,
Daniel Vetter, linux-media, Christian König, Mika Kuoppala
Two in one go:
- it is allowed to call dma_fence_wait() while holding a
dma_resv_lock(). This is fundamental to how eviction works with ttm,
so required.
- it is allowed to call dma_fence_wait() from memory reclaim contexts,
specifically from shrinker callbacks (which i915 does), and from mmu
notifier callbacks (which amdgpu does, and which i915 sometimes also
does, and probably always should, but that's kinda a debate). Also
for stuff like HMM we really need to be able to do this, or things
get real dicey.
Consequence is that any critical path necessary to get to a
dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
allocate memory with GFP_KERNEL. Also by implication of
dma_resv_lock(), no userspace faulting allowed. That's some supremely
obnoxious limitations, which is why we need to sprinkle the right
annotations to all relevant paths.
The one big locking context we're leaving out here is mmu notifiers,
added in
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Aug 26 22:14:21 2019 +0200
mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
that one covers a lot of other callsites, and it's also allowed to
wait on dma-fences from mmu notifiers. But there's no ready-made
functions exposed to prime this, so I've left it out for now.
v2: Also track against mmu notifier context.
v3: kerneldoc to spec the cross-driver contract. Note that currently
i915 throws in a hard-coded 10s timeout on foreign fences (not sure
why that was done, but it's there), which is why that rule is worded
with SHOULD instead of MUST.
Also some of the mmu_notifier/shrinker rules might surprise SoC
drivers, I haven't fully audited them all. Which is infeasible anyway,
we'll need to run them with lockdep and dma-fence annotations and see
what goes boom.
v4: A spelling fix from Mika
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 6 ++++
drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++
drivers/dma-buf/dma-resv.c | 4 +++
include/linux/dma-fence.h | 1 +
4 files changed, 52 insertions(+)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 05d856131140..f8f6decde359 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -133,6 +133,12 @@ DMA Fences
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: DMA fences overview
+DMA Fence Cross-Driver Contract
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: fence cross-driver contract
+
DMA Fence Signalling Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0005bc002529..754e6fb84fb7 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -64,6 +64,47 @@ static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1);
* &dma_buf.resv pointer.
*/
+/**
+ * DOC: fence cross-driver contract
+ *
+ * Since &dma_fence provides a cross driver contract, all drivers must follow the
+ * same rules:
+ *
+ * * Fences must complete in a reasonable time. Fences which represent kernels
+ * and shaders submitted by userspace, which could run forever, must be backed
+ * up by timeout and gpu hang recovery code. Minimally that code must prevent
+ * further command submission and force complete all in-flight fences, e.g.
+ * when the driver or hardware do not support gpu reset, or if the gpu reset
+ * failed for some reason. Ideally the driver supports gpu recovery which only
+ * affects the offending userspace context, and no other userspace
+ * submissions.
+ *
+ * * Drivers may have different ideas of what completion within a reasonable
+ * time means. Some hang recovery code uses a fixed timeout, others a mix
+ * between observing forward progress and increasingly strict timeouts.
+ * Drivers should not try to second guess timeout handling of fences from
+ * other drivers.
+ *
+ * * To ensure there's no deadlocks of dma_fence_wait() against other locks
+ * drivers should annotate all code required to reach dma_fence_signal(),
+ * which completes the fences, with dma_fence_begin_signalling() and
+ * dma_fence_end_signalling().
+ *
+ * * Drivers are allowed to call dma_fence_wait() while holding dma_resv_lock().
+ * This means any code required for fence completion cannot acquire a
+ * &dma_resv lock. Note that this also pulls in the entire established
+ * locking hierarchy around dma_resv_lock() and dma_resv_unlock().
+ *
+ * * Drivers are allowed to call dma_fence_wait() from their &shrinker
+ * callbacks. This means any code required for fence completion cannot
+ * allocate memory with GFP_KERNEL.
+ *
+ * * Drivers are allowed to call dma_fence_wait() from their &mmu_notifier
+ * respectively &mmu_interval_notifier callbacks. This means any code required
+ * for fence completion cannot allocate memory with GFP_NOFS or GFP_NOIO.
+ * Only GFP_ATOMIC is permissible, which might fail.
+ */
+
static const char *dma_fence_stub_get_name(struct dma_fence *fence)
{
return "stub";
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 99c0a33c918d..c223f32425c4 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -35,6 +35,7 @@
#include <linux/dma-resv.h>
#include <linux/export.h>
#include <linux/sched/mm.h>
+#include <linux/mmu_notifier.h>
/**
* DOC: Reservation Object Overview
@@ -115,6 +116,9 @@ static int __init dma_resv_lockdep(void)
if (ret == -EDEADLK)
dma_resv_lock_slow(&obj, &ctx);
fs_reclaim_acquire(GFP_KERNEL);
+ lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
+ __dma_fence_might_wait();
+ lock_map_release(&__mmu_notifier_invalidate_range_start_map);
fs_reclaim_release(GFP_KERNEL);
ww_mutex_unlock(&obj.lock);
ww_acquire_fini(&ctx);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 3f288f7db2ef..09e23adb351d 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -360,6 +360,7 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
#ifdef CONFIG_LOCKDEP
bool dma_fence_begin_signalling(void);
void dma_fence_end_signalling(bool cookie);
+void __dma_fence_might_wait(void);
#else
static inline bool dma_fence_begin_signalling(void)
{
--
2.26.2
^ permalink raw reply related [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 04/18] dma-fence: prime " Daniel Vetter
@ 2020-06-11 7:30 ` Thomas Hellström (Intel)
2020-06-11 8:34 ` Daniel Vetter
2020-06-12 7:01 ` [Intel-gfx] [PATCH] " Daniel Vetter
1 sibling, 1 reply; 106+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-11 7:30 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
linaro-mm-sig, Thomas Hellstrom, Daniel Vetter, Mika Kuoppala,
Christian König, linux-media
On 6/4/20 10:12 AM, Daniel Vetter wrote:
> Two in one go:
> - it is allowed to call dma_fence_wait() while holding a
> dma_resv_lock(). This is fundamental to how eviction works with ttm,
> so required.
>
> - it is allowed to call dma_fence_wait() from memory reclaim contexts,
> specifically from shrinker callbacks (which i915 does), and from mmu
> notifier callbacks (which amdgpu does, and which i915 sometimes also
> does, and probably always should, but that's kinda a debate). Also
> for stuff like HMM we really need to be able to do this, or things
> get real dicey.
>
> Consequence is that any critical path necessary to get to a
> dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
> allocate memory with GFP_KERNEL. Also by implication of
> dma_resv_lock(), no userspace faulting allowed. That's some supremely
> obnoxious limitations, which is why we need to sprinkle the right
> annotations to all relevant paths.
>
> The one big locking context we're leaving out here is mmu notifiers,
> added in
>
> commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date: Mon Aug 26 22:14:21 2019 +0200
>
> mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
>
> that one covers a lot of other callsites, and it's also allowed to
> wait on dma-fences from mmu notifiers. But there's no ready-made
> functions exposed to prime this, so I've left it out for now.
>
> v2: Also track against mmu notifier context.
>
> v3: kerneldoc to spec the cross-driver contract. Note that currently
> i915 throws in a hard-coded 10s timeout on foreign fences (not sure
> why that was done, but it's there), which is why that rule is worded
> with SHOULD instead of MUST.
>
> Also some of the mmu_notifier/shrinker rules might surprise SoC
> drivers, I haven't fully audited them all. Which is infeasible anyway,
> we'll need to run them with lockdep and dma-fence annotations and see
> what goes boom.
>
> v4: A spelling fix from Mika
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> Documentation/driver-api/dma-buf.rst | 6 ++++
> drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++
> drivers/dma-buf/dma-resv.c | 4 +++
> include/linux/dma-fence.h | 1 +
> 4 files changed, 52 insertions(+)
I still have my doubts about allowing fence waiting from within
shrinkers. IMO ideally they should use a trywait approach, in order to
allow memory allocation during command submission for drivers that
publish fences before command submission. (Since early reservation
object release requires that).
But since drivers are already waiting from within shrinkers and I take
your word for HMM requiring this,
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations
2020-06-11 7:30 ` [Intel-gfx] [Linaro-mm-sig] " Thomas Hellström (Intel)
@ 2020-06-11 8:34 ` Daniel Vetter
[not found] ` <20200611141515.GW6578@ziepe.ca>
0 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-11 8:34 UTC (permalink / raw)
To: Thomas Hellström (Intel)
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
DRI Development, linaro-mm-sig, Thomas Hellstrom, amd-gfx,
Daniel Vetter, Mika Kuoppala, Christian König, linux-media
On Thu, Jun 11, 2020 at 09:30:12AM +0200, Thomas Hellström (Intel) wrote:
>
> On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > Two in one go:
> > - it is allowed to call dma_fence_wait() while holding a
> > dma_resv_lock(). This is fundamental to how eviction works with ttm,
> > so required.
> >
> > - it is allowed to call dma_fence_wait() from memory reclaim contexts,
> > specifically from shrinker callbacks (which i915 does), and from mmu
> > notifier callbacks (which amdgpu does, and which i915 sometimes also
> > does, and probably always should, but that's kinda a debate). Also
> > for stuff like HMM we really need to be able to do this, or things
> > get real dicey.
> >
> > Consequence is that any critical path necessary to get to a
> > dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
> > allocate memory with GFP_KERNEL. Also by implication of
> > dma_resv_lock(), no userspace faulting allowed. That's some supremely
> > obnoxious limitations, which is why we need to sprinkle the right
> > annotations to all relevant paths.
> >
> > The one big locking context we're leaving out here is mmu notifiers,
> > added in
> >
> > commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
> > Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Date: Mon Aug 26 22:14:21 2019 +0200
> >
> > mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
> >
> > that one covers a lot of other callsites, and it's also allowed to
> > wait on dma-fences from mmu notifiers. But there's no ready-made
> > functions exposed to prime this, so I've left it out for now.
> >
> > v2: Also track against mmu notifier context.
> >
> > v3: kerneldoc to spec the cross-driver contract. Note that currently
> > i915 throws in a hard-coded 10s timeout on foreign fences (not sure
> > why that was done, but it's there), which is why that rule is worded
> > with SHOULD instead of MUST.
> >
> > Also some of the mmu_notifier/shrinker rules might surprise SoC
> > drivers, I haven't fully audited them all. Which is infeasible anyway,
> > we'll need to run them with lockdep and dma-fence annotations and see
> > what goes boom.
> >
> > v4: A spelling fix from Mika
> >
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> > Documentation/driver-api/dma-buf.rst | 6 ++++
> > drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++
> > drivers/dma-buf/dma-resv.c | 4 +++
> > include/linux/dma-fence.h | 1 +
> > 4 files changed, 52 insertions(+)
>
> I still have my doubts about allowing fence waiting from within shrinkers.
> IMO ideally they should use a trywait approach, in order to allow memory
> allocation during command submission for drivers that
> publish fences before command submission. (Since early reservation object
> release requires that).
Yeah it is a bit annoying, e.g. for drm/scheduler I think we'll end up
with a mempool to make sure it can handle its allocations.
> But since drivers are already waiting from within shrinkers and I take your
> word for HMM requiring this,
Yeah the big trouble is HMM and mmu notifiers. That's the really awkward
one, the shrinker one is a lot less established.
I do wonder whether the mmu notifier constraint should only be set when
mmu notifiers are enabled, since on a bunch of arm-soc gpu drivers that
stuff just doesn't matter. But I expect that sooner or later these arm
gpus will show up in bigger arm cores, where you might want to have kvm
and maybe device virtualization and stuff, and then you need mmu
notifiers.
Plus having a very clear and consistent cross-driver api contract is imo
better than leaving this up to drivers and then having incompatible
assumptions.
I've pinged a bunch of armsoc gpu driver people and asked them how much this
hurts, so that we have a clear answer. On x86 I don't think we have much
of a choice on this, with userptr in amd and i915 and hmm work in nouveau
(but nouveau I think doesn't use dma_fence in there). I think it'll take
us a while to really bottom out on this specific question here.
-Daniel
>
> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* [Intel-gfx] [PATCH] dma-fence: prime lockdep annotations
2020-06-04 8:12 ` [Intel-gfx] [PATCH 04/18] dma-fence: prime " Daniel Vetter
2020-06-11 7:30 ` [Intel-gfx] [Linaro-mm-sig] " Thomas Hellström (Intel)
@ 2020-06-12 7:01 ` Daniel Vetter
1 sibling, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-12 7:01 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, amd-gfx,
Chris Wilson, linaro-mm-sig, Thomas Hellström, Daniel Vetter,
Mika Kuoppala, Christian König, linux-media
Two in one go:
- it is allowed to call dma_fence_wait() while holding a
dma_resv_lock(). This is fundamental to how eviction works with ttm,
so required.
- it is allowed to call dma_fence_wait() from memory reclaim contexts,
specifically from shrinker callbacks (which i915 does), and from mmu
notifier callbacks (which amdgpu does, and which i915 sometimes also
does, and probably always should, but that's kinda a debate). Also
for stuff like HMM we really need to be able to do this, or things
get real dicey.
Consequence is that any critical path necessary to get to a
dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
allocate memory with GFP_KERNEL. Also by implication of
dma_resv_lock(), no userspace faulting allowed. That's some supremely
obnoxious limitations, which is why we need to sprinkle the right
annotations to all relevant paths.
The one big locking context we're leaving out here is mmu notifiers,
added in
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Aug 26 22:14:21 2019 +0200
mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
that one covers a lot of other callsites, and it's also allowed to
wait on dma-fences from mmu notifiers. But there's no ready-made
functions exposed to prime this, so I've left it out for now.
v2: Also track against mmu notifier context.
v3: kerneldoc to spec the cross-driver contract. Note that currently
i915 throws in a hard-coded 10s timeout on foreign fences (not sure
why that was done, but it's there), which is why that rule is worded
with SHOULD instead of MUST.
Also some of the mmu_notifier/shrinker rules might surprise SoC
drivers, I haven't fully audited them all. Which is infeasible anyway,
we'll need to run them with lockdep and dma-fence annotations and see
what goes boom.
v4: A spelling fix from Mika
v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately
this means lockdep enforcement is slightly inconsistent, it won't spot
GFP_NOIO and GFP_NOFS allocations in the wrong spot if
CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well.
Cc: kernel test robot <lkp@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4)
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
Documentation/driver-api/dma-buf.rst | 6 ++++
drivers/dma-buf/dma-fence.c | 41 ++++++++++++++++++++++++++++
drivers/dma-buf/dma-resv.c | 8 ++++++
include/linux/dma-fence.h | 1 +
4 files changed, 56 insertions(+)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 05d856131140..f8f6decde359 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -133,6 +133,12 @@ DMA Fences
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: DMA fences overview
+DMA Fence Cross-Driver Contract
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: fence cross-driver contract
+
DMA Fence Signalling Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0005bc002529..754e6fb84fb7 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -64,6 +64,47 @@ static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1);
* &dma_buf.resv pointer.
*/
+/**
+ * DOC: fence cross-driver contract
+ *
+ * Since &dma_fence provide a cross driver contract, all drivers must follow the
+ * same rules:
+ *
+ * * Fences must complete in a reasonable time. Fences which represent kernels
+ * and shaders submitted by userspace, which could run forever, must be backed
+ * up by timeout and gpu hang recovery code. Minimally that code must prevent
+ * further command submission and force complete all in-flight fences, e.g.
+ * when the driver or hardware do not support gpu reset, or if the gpu reset
+ * failed for some reason. Ideally the driver supports gpu recovery which only
+ * affects the offending userspace context, and no other userspace
+ * submissions.
+ *
+ * * Drivers may have different ideas of what completion within a reasonable
+ * time means. Some hang recovery code uses a fixed timeout, others a mix
+ * between observing forward progress and increasingly strict timeouts.
+ * Drivers should not try to second guess timeout handling of fences from
+ * other drivers.
+ *
+ * * To ensure there's no deadlocks of dma_fence_wait() against other locks
+ * drivers should annotate all code required to reach dma_fence_signal(),
+ * which completes the fences, with dma_fence_begin_signalling() and
+ * dma_fence_end_signalling().
+ *
+ * * Drivers are allowed to call dma_fence_wait() while holding dma_resv_lock().
+ * This means any code required for fence completion cannot acquire a
+ * &dma_resv lock. Note that this also pulls in the entire established
+ * locking hierarchy around dma_resv_lock() and dma_resv_unlock().
+ *
+ * * Drivers are allowed to call dma_fence_wait() from their &shrinker
+ * callbacks. This means any code required for fence completion cannot
+ * allocate memory with GFP_KERNEL.
+ *
+ * * Drivers are allowed to call dma_fence_wait() from their &mmu_notifier
+ * respectively &mmu_interval_notifier callbacks. This means any code required
+ * for fence completion cannot allocate memory with GFP_NOFS or GFP_NOIO.
+ * Only GFP_ATOMIC is permissible, which might fail.
+ */
+
static const char *dma_fence_stub_get_name(struct dma_fence *fence)
{
return "stub";
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 99c0a33c918d..51f0583ead19 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -35,6 +35,7 @@
#include <linux/dma-resv.h>
#include <linux/export.h>
#include <linux/sched/mm.h>
+#include <linux/mmu_notifier.h>
/**
* DOC: Reservation Object Overview
@@ -115,6 +116,13 @@ static int __init dma_resv_lockdep(void)
if (ret == -EDEADLK)
dma_resv_lock_slow(&obj, &ctx);
fs_reclaim_acquire(GFP_KERNEL);
+#ifdef CONFIG_MMU_NOTIFIER
+ lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
+ __dma_fence_might_wait();
+ lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+#else
+ __dma_fence_might_wait();
+#endif
fs_reclaim_release(GFP_KERNEL);
ww_mutex_unlock(&obj.lock);
ww_acquire_fini(&ctx);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 3f288f7db2ef..09e23adb351d 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -360,6 +360,7 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
#ifdef CONFIG_LOCKDEP
bool dma_fence_begin_signalling(void);
void dma_fence_end_signalling(bool cookie);
+void __dma_fence_might_wait(void);
#else
static inline bool dma_fence_begin_signalling(void)
{
--
2.26.2
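Editorial sketch, not part of the patch: the contract above rules out
sleeping allocations on the fence signalling path, and GFP_ATOMIC might
fail. One way to cope, and the direction floated for drm/scheduler in the
review (a mempool), is to reserve memory before the critical section
starts. The block below is a userspace model of that idea only. The names
job_pool, pool_init, pool_get, pool_put and pool_destroy are hypothetical
and this is not the kernel mempool API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE 4

struct job { int id; };

struct job_pool {
	struct job *slots[POOL_SIZE]; /* stack of reserved, unused elements */
	size_t nfree;
};

/* Runs outside the critical section, where a blocking allocation is fine. */
static int pool_init(struct job_pool *p)
{
	p->nfree = 0;
	for (size_t i = 0; i < POOL_SIZE; i++) {
		struct job *j = malloc(sizeof(*j));

		if (!j) {
			/* Undo the partial reservation before reporting failure. */
			while (p->nfree)
				free(p->slots[--p->nfree]);
			return -1;
		}
		p->slots[p->nfree++] = j;
	}
	return 0;
}

/* Usable on the modelled signalling path: never calls the allocator.
 * Returns NULL once the reservation is exhausted. */
static struct job *pool_get(struct job_pool *p)
{
	return p->nfree ? p->slots[--p->nfree] : NULL;
}

/* Hands an element back to the reservation. */
static void pool_put(struct job_pool *p, struct job *j)
{
	assert(p->nfree < POOL_SIZE);
	p->slots[p->nfree++] = j;
}

/* Frees whatever is currently held in the pool. */
static void pool_destroy(struct job_pool *p)
{
	while (p->nfree)
		free(p->slots[--p->nfree]);
}
```

The design point is that the only step that can block or fail for lack of
memory is pool_init(), which runs before any fence is published. The
caller still has to size the reservation for the worst case it must
handle, since pool_get() returns NULL once the pool is empty.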
* [Intel-gfx] [PATCH 05/18] drm/vkms: Annotate vblank timer
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (3 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 04/18] dma-fence: prime " Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 06/18] drm/vblank: Annotate with dma-fence signalling section Daniel Vetter
` (25 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: Haneen Mohammed, Rodrigo Siqueira, linux-rdma, Daniel Vetter,
Intel Graphics Development, LKML, amd-gfx, Chris Wilson,
linaro-mm-sig, Daniel Vetter, Christian König, linux-media
This is needed to signal the fences from page flips, so annotate it
accordingly. We need to annotate the entire timer callback since if we get
stuck anywhere in there, then the timer stops, and hence fences stop.
Just annotating the top part that does the vblank handling isn't
enough.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
drivers/gpu/drm/vkms/vkms_crtc.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index ac85e17428f8..a53a40848a72 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0+
+#include <linux/dma-fence.h>
+
#include <drm/drm_atomic.h>
#include <drm/drm_atomic_helper.h>
#include <drm/drm_probe_helper.h>
@@ -14,7 +16,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
struct drm_crtc *crtc = &output->crtc;
struct vkms_crtc_state *state;
u64 ret_overrun;
- bool ret;
+ bool ret, fence_cookie;
+
+ fence_cookie = dma_fence_begin_signalling();
ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
output->period_ns);
@@ -49,6 +53,8 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
DRM_DEBUG_DRIVER("Composer worker already queued\n");
}
+ dma_fence_end_signalling(fence_cookie);
+
return HRTIMER_RESTART;
}
--
2.26.2
* [Intel-gfx] [PATCH 06/18] drm/vblank: Annotate with dma-fence signalling section
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (4 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 05/18] drm/vkms: Annotate vblank timer Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 07/18] drm/atomic-helper: Add dma-fence annotations Daniel Vetter
` (24 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
This is rather overkill since currently all drivers call this from
hardirq (or at least timers). But maybe in the future we're going to
have threaded irq handlers and whatnot, so it doesn't hurt to be prepared.
Plus this is an easy start for sprinkling these fence annotations into
shared code.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/drm_vblank.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 85e5f2db1608..93a5bba5f665 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -24,6 +24,7 @@
* OTHER DEALINGS IN THE SOFTWARE.
*/
+#include <linux/dma-fence.h>
#include <linux/export.h>
#include <linux/moduleparam.h>
@@ -1908,7 +1909,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
{
struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
unsigned long irqflags;
- bool disable_irq;
+ bool disable_irq, fence_cookie;
if (drm_WARN_ON_ONCE(dev, !drm_dev_has_vblank(dev)))
return false;
@@ -1916,6 +1917,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
return false;
+ fence_cookie = dma_fence_begin_signalling();
+
spin_lock_irqsave(&dev->event_lock, irqflags);
/* Need timestamp lock to prevent concurrent execution with
@@ -1928,6 +1931,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (!vblank->enabled) {
spin_unlock(&dev->vblank_time_lock);
spin_unlock_irqrestore(&dev->event_lock, irqflags);
+ dma_fence_end_signalling(fence_cookie);
return false;
}
@@ -1953,6 +1957,8 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
if (disable_irq)
vblank_disable_fn(&vblank->disable_timer);
+ dma_fence_end_signalling(fence_cookie);
+
return true;
}
EXPORT_SYMBOL(drm_handle_vblank);
--
2.26.2
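Editorial sketch, not part of the patch: the hunks in drm_handle_vblank()
show that every exit path, including the early return, has to close the
signalling section it opened. The block below models that cookie
convention in userspace. It assumes the cookie records whether an
enclosing section was already open, so that nested sections collapse into
the outermost one. All model_* names are hypothetical stand-ins, not the
kernel implementation, and nothing here does lockdep tracking.

```c
#include <assert.h>
#include <stdbool.h>

/* 1 while inside a modelled signalling critical section, else 0. */
static int signalling_depth;

/* Returns true if a section was already open, so the matching end is a no-op. */
static bool model_begin_signalling(void)
{
	if (signalling_depth > 0)
		return true;
	signalling_depth = 1;
	return false;
}

static void model_end_signalling(bool cookie)
{
	if (cookie)
		return; /* an outer caller owns the section */
	assert(signalling_depth == 1);
	signalling_depth = 0;
}

/* Hypothetical handler shaped like drm_handle_vblank(): both the early
 * return and the normal return close the section they opened. */
static bool model_handle_event(bool enabled)
{
	bool cookie = model_begin_signalling();

	if (!enabled) {
		model_end_signalling(cookie);
		return false;
	}

	/* ... work that leads to signalling a fence would go here ... */

	model_end_signalling(cookie);
	return true;
}

/* Nested use: the inner handler must leave the outer section open.
 * Returns the depth observed right after the inner call. */
static int model_nested_depth_after_inner(void)
{
	bool outer = model_begin_signalling();
	int depth_inside;

	(void)model_handle_event(true);
	depth_inside = signalling_depth;
	model_end_signalling(outer);
	return depth_inside;
}
```

Passing the cookie back to the end call is what lets shared helpers
annotate themselves without knowing whether their caller has already
opened a section.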
* [Intel-gfx] [PATCH 07/18] drm/atomic-helper: Add dma-fence annotations
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (5 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 06/18] drm/vblank: Annotate with dma-fence signalling section Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path Daniel Vetter
` (23 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
This is a bit disappointing since we need to split the annotations
over all the different parts.
I was considering just leaking the critical section into the
->atomic_commit_tail callback of each driver. But that would mean we
need to pass the fence_cookie into each driver (there's a total of 13
implementations of this hook right now), so bad flag day. And also a
bit leaky abstraction.
Hence just do it function-by-function.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/drm_atomic_helper.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 7cd7fe0d57b4..bfcc7857a9a1 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1549,6 +1549,7 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_flip_done);
void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state)
{
struct drm_device *dev = old_state->dev;
+ bool fence_cookie = dma_fence_begin_signalling();
drm_atomic_helper_commit_modeset_disables(dev, old_state);
@@ -1560,6 +1561,8 @@ void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state)
drm_atomic_helper_commit_hw_done(old_state);
+ dma_fence_end_signalling(fence_cookie);
+
drm_atomic_helper_wait_for_vblanks(dev, old_state);
drm_atomic_helper_cleanup_planes(dev, old_state);
@@ -1579,6 +1582,7 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_tail);
void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state)
{
struct drm_device *dev = old_state->dev;
+ bool fence_cookie = dma_fence_begin_signalling();
drm_atomic_helper_commit_modeset_disables(dev, old_state);
@@ -1591,6 +1595,8 @@ void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state)
drm_atomic_helper_commit_hw_done(old_state);
+ dma_fence_end_signalling(fence_cookie);
+
drm_atomic_helper_wait_for_vblanks(dev, old_state);
drm_atomic_helper_cleanup_planes(dev, old_state);
@@ -1606,6 +1612,9 @@ static void commit_tail(struct drm_atomic_state *old_state)
ktime_t start;
s64 commit_time_ms;
unsigned int i, new_self_refresh_mask = 0;
+ bool fence_cookie;
+
+ fence_cookie = dma_fence_begin_signalling();
funcs = dev->mode_config.helper_private;
@@ -1634,6 +1643,8 @@ static void commit_tail(struct drm_atomic_state *old_state)
if (new_crtc_state->self_refresh_active)
new_self_refresh_mask |= BIT(i);
+ dma_fence_end_signalling(fence_cookie);
+
if (funcs && funcs->atomic_commit_tail)
funcs->atomic_commit_tail(old_state);
else
@@ -1789,6 +1800,7 @@ int drm_atomic_helper_commit(struct drm_device *dev,
bool nonblock)
{
int ret;
+ bool fence_cookie;
if (state->async_update) {
ret = drm_atomic_helper_prepare_planes(dev, state);
@@ -1811,6 +1823,8 @@ int drm_atomic_helper_commit(struct drm_device *dev,
if (ret)
return ret;
+ fence_cookie = dma_fence_begin_signalling();
+
if (!nonblock) {
ret = drm_atomic_helper_wait_for_fences(dev, state, true);
if (ret)
@@ -1848,6 +1862,7 @@ int drm_atomic_helper_commit(struct drm_device *dev,
*/
drm_atomic_state_get(state);
+ dma_fence_end_signalling(fence_cookie);
if (nonblock)
queue_work(system_unbound_wq, &state->commit_work);
else
@@ -1856,6 +1871,7 @@ int drm_atomic_helper_commit(struct drm_device *dev,
return 0;
err:
+ dma_fence_end_signalling(fence_cookie);
drm_atomic_helper_cleanup_planes(dev, state);
return ret;
}
--
2.26.2
* [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (6 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 07/18] drm/atomic-helper: Add dma-fence annotations Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-23 10:51 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 09/18] drm/scheduler: use dma-fence annotations in main thread Daniel Vetter
` (22 subsequent siblings)
30 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
I need a canary in a ttm-based atomic driver to make sure the
dma_fence_begin/end_signalling annotations actually work.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bdba0bfd6df1..adabfa929f42 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -57,6 +57,7 @@
#include "ivsrcid/ivsrcid_vislands30.h"
+#include <linux/module.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/version.h>
@@ -7320,6 +7321,9 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
struct drm_connector_state *old_con_state, *new_con_state;
struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state;
int crtc_disable_count = 0;
+ bool fence_cookie;
+
+ fence_cookie = dma_fence_begin_signalling();
drm_atomic_helper_update_legacy_modeset_state(dev, state);
@@ -7600,6 +7604,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
/* Signal HW programming completion */
drm_atomic_helper_commit_hw_done(state);
+ dma_fence_end_signalling(fence_cookie);
+
if (wait_for_vblank)
drm_atomic_helper_wait_for_flip_done(dev, state);
--
2.26.2
* Re: [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path
2020-06-04 8:12 ` [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path Daniel Vetter
@ 2020-06-23 10:51 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-23 10:51 UTC (permalink / raw)
To: DRI Development, Roland Scheidegger, VMware Graphics,
Thomas Hellstrom
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
Daniel Vetter, Christian König,
open list:DMA BUFFER SHARING FRAMEWORK
Hi Roland & vmwgfx maintainers,
Thomas has played around with these annotations on his vmwgfx setup,
and found some issues. Apparently in the atomic_commit_tail path when
handling the dirty rectangle stuff you acquire a ttm reservation,
which is a no-go since it could deadlock with other paths - atomic
commits can produce a dma_fence.
This patch here highlights that with the new annotations, and
apparently causes a lockdep splat if you go through the dirty rect
paths (not sure if it also happens otherwise, Thomas can fill you in
with the details).
Can you pls take a look at this? I'm happy to help out with analyzing
any lockdep splats. For actual fixes Thomas is better since I don't
understand a lot of how drm/vmwgfx works internally.
Cheers, Daniel
On Thu, Jun 4, 2020 at 10:12 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> I need a canary in a ttm-based atomic driver to make sure the
> dma_fence_begin/end_signalling annotations actually work.
>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: linux-rdma@vger.kernel.org
> Cc: amd-gfx@lists.freedesktop.org
> Cc: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index bdba0bfd6df1..adabfa929f42 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -57,6 +57,7 @@
>
> #include "ivsrcid/ivsrcid_vislands30.h"
>
> +#include <linux/module.h>
> #include <linux/module.h>
> #include <linux/moduleparam.h>
> #include <linux/version.h>
> @@ -7320,6 +7321,9 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
> struct drm_connector_state *old_con_state, *new_con_state;
> struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state;
> int crtc_disable_count = 0;
> + bool fence_cookie;
> +
> + fence_cookie = dma_fence_begin_signalling();
>
> drm_atomic_helper_update_legacy_modeset_state(dev, state);
>
> @@ -7600,6 +7604,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
> /* Signal HW programming completion */
> drm_atomic_helper_commit_hw_done(state);
>
> + dma_fence_end_signalling(fence_cookie);
> +
> if (wait_for_vblank)
> drm_atomic_helper_wait_for_flip_done(dev, state);
>
> --
> 2.26.2
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* [Intel-gfx] [PATCH 09/18] drm/scheduler: use dma-fence annotations in main thread
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (7 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 08/18] drm/amdgpu: add dma-fence annotations to atomic commit path Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 10/18] drm/amdgpu: use dma-fence annotations in cs_submit() Daniel Vetter
` (21 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
If the scheduler rt thread gets stuck on a mutex that we're holding
while waiting for gpu workloads to complete, we have a problem.
Add dma-fence annotations so that lockdep can check this for us.
I've tried to review this quite carefully, and I think it's at the
right spot. But obviously I'm no expert on the drm scheduler.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 2f319102ae9f..06a736e506ad 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -763,9 +763,12 @@ static int drm_sched_main(void *param)
struct sched_param sparam = {.sched_priority = 1};
struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
int r;
+ bool fence_cookie;
sched_setscheduler(current, SCHED_FIFO, &sparam);
+ fence_cookie = dma_fence_begin_signalling();
+
while (!kthread_should_stop()) {
struct drm_sched_entity *entity = NULL;
struct drm_sched_fence *s_fence;
@@ -823,6 +826,9 @@ static int drm_sched_main(void *param)
wake_up(&sched->job_scheduled);
}
+
+ dma_fence_end_signalling(fence_cookie);
+
return 0;
}
--
2.26.2
* [Intel-gfx] [PATCH 10/18] drm/amdgpu: use dma-fence annotations in cs_submit()
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (8 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 09/18] drm/scheduler: use dma-fence annotations in main thread Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 11/18] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code Daniel Vetter
` (20 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
This is a bit tricky: since ->notifier_lock is held while calling
dma_fence_wait, we must ensure that the read side (i.e.
dma_fence_begin_signalling) is also on the same side. If we mix this up,
lockdep complains, and that's again why we want to have these
annotations.
A nice side effect of this is that because of the fs_reclaim priming
for dma_fence_enable lockdep now automatically checks for us that
nothing in here allocates memory, without even running any userptr
workloads.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a25fb59c127c..e109666aec14 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1212,6 +1212,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
struct amdgpu_job *job;
uint64_t seq;
int r;
+ bool fence_cookie;
job = p->job;
p->job = NULL;
@@ -1226,6 +1227,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
*/
mutex_lock(&p->adev->notifier_lock);
+ fence_cookie = dma_fence_begin_signalling();
+
/* If userptr are invalidated after amdgpu_cs_parser_bos(), return
* -EAGAIN, drmIoctl in libdrm will restart the amdgpu_cs_ioctl.
*/
@@ -1262,12 +1265,14 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
+ dma_fence_end_signalling(fence_cookie);
mutex_unlock(&p->adev->notifier_lock);
return 0;
error_abort:
drm_sched_job_cleanup(&job->base);
+ dma_fence_end_signalling(fence_cookie);
mutex_unlock(&p->adev->notifier_lock);
error_unlock:
--
2.26.2
* [Intel-gfx] [PATCH 11/18] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (9 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 10/18] drm/amdgpu: use dma-fence annotations in cs_submit() Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 12/18] drm/amdgpu: DC also loves to allocate stuff where it shouldn't Daniel Vetter
` (19 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
My dma-fence lockdep annotations caught an inversion because we
allocate memory where we really shouldn't:
kmem_cache_alloc+0x2b/0x6d0
amdgpu_fence_emit+0x30/0x330 [amdgpu]
amdgpu_ib_schedule+0x306/0x550 [amdgpu]
amdgpu_job_run+0x10f/0x260 [amdgpu]
drm_sched_main+0x1b9/0x490 [gpu_sched]
kthread+0x12e/0x150
Trouble right now is that lockdep only validates against GFP_FS, which
would be good enough for shrinkers. But for mmu_notifiers we actually
need !GFP_ATOMIC, since they can be called from any page laundering,
even if GFP_NOFS or GFP_NOIO are set.
I guess we should improve the lockdep annotations for
fs_reclaim_acquire/release.
Ofc real fix is to properly preallocate this fence and stuff it into
the amdgpu job structure. But GFP_ATOMIC gets the lockdep splat out of
the way.
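The preallocation idea can be sketched in plain userspace C. This is only an
illustration under assumptions: struct toy_job, struct toy_fence and the
toy_job_*() helpers are made-up names, malloc() stands in for the slab
allocator, and nothing here is the actual amdgpu code. The point is that the
allocation moves to the submission path, where it may block or fail, and the
scheduler path only hands over what was prepared.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the fence and the job structure. */
struct toy_fence {
	unsigned int seq;
};

struct toy_job {
	struct toy_fence *prealloc_fence;	/* prepared at submit time */
};

/* Submission path: allocating (and failing) is still fine here. */
int toy_job_init(struct toy_job *job)
{
	job->prealloc_fence = malloc(sizeof(*job->prealloc_fence));
	return job->prealloc_fence ? 0 : -ENOMEM;
}

/* Scheduler path: no allocation, only hands out the prepared fence. */
struct toy_fence *toy_job_run(struct toy_job *job, unsigned int seq)
{
	struct toy_fence *fence = job->prealloc_fence;

	job->prealloc_fence = NULL;	/* ownership moves to the caller */
	if (fence)
		fence->seq = seq;
	return fence;
}

/* Drops a fence that was never consumed; free(NULL) is a no-op. */
void toy_job_fini(struct toy_job *job)
{
	free(job->prealloc_fence);
	job->prealloc_fence = NULL;
}
```

With this split the only failure point is toy_job_init(), which runs before
the job is handed to the scheduler, so the run path has nothing left that
could recurse into reclaim.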
v2: Two more allocations in scheduler paths.
First one:
__kmalloc+0x58/0x720
amdgpu_vmid_grab+0x100/0xca0 [amdgpu]
amdgpu_job_dependency+0xf9/0x120 [amdgpu]
drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
drm_sched_main+0xf9/0x490 [gpu_sched]
Second one:
kmem_cache_alloc+0x2b/0x6d0
amdgpu_sync_fence+0x7e/0x110 [amdgpu]
amdgpu_vmid_grab+0x86b/0xca0 [amdgpu]
amdgpu_job_dependency+0xf9/0x120 [amdgpu]
drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
drm_sched_main+0xf9/0x490 [gpu_sched]
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index d878fe7fee51..055b47241bb1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -143,7 +143,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
uint32_t seq;
int r;
- fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
+ fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_ATOMIC);
if (fence == NULL)
return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index fe92dcd94d4a..fdcd6659f5ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -208,7 +208,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_vm *vm,
if (ring->vmid_wait && !dma_fence_is_signaled(ring->vmid_wait))
return amdgpu_sync_fence(sync, ring->vmid_wait, false);
- fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL);
+ fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
if (!fences)
return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index b87ca171986a..330476cc0c86 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -168,7 +168,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f,
if (amdgpu_sync_add_later(sync, f, explicit))
return 0;
- e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
+ e = kmem_cache_alloc(amdgpu_sync_slab, GFP_ATOMIC);
if (!e)
return -ENOMEM;
--
2.26.2
* [Intel-gfx] [PATCH 12/18] drm/amdgpu: DC also loves to allocate stuff where it shouldn't
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (10 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 11/18] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail Daniel Vetter
` (18 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
Not going to bother with a complete&pretty commit message, just
offending backtrace:
kvmalloc_node+0x47/0x80
dc_create_state+0x1f/0x60 [amdgpu]
dc_commit_state+0xcb/0x9b0 [amdgpu]
amdgpu_dm_atomic_commit_tail+0xd31/0x2010 [amdgpu]
commit_tail+0xa4/0x140 [drm_kms_helper]
drm_atomic_helper_commit+0x152/0x180 [drm_kms_helper]
drm_client_modeset_commit_atomic+0x1ea/0x250 [drm]
drm_client_modeset_commit_locked+0x55/0x190 [drm]
drm_client_modeset_commit+0x24/0x40 [drm]
v2: Found more in DC code, I'm just going to pile them all up.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/amdgpu/atom.c | 2 +-
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
drivers/gpu/drm/amd/display/dc/core/dc.c | 4 +++-
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c b/drivers/gpu/drm/amd/amdgpu/atom.c
index 4cfc786699c7..1b0c674fab25 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1226,7 +1226,7 @@ static int amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index,
ectx.abort = false;
ectx.last_jump = 0;
if (ws)
- ectx.ws = kcalloc(4, ws, GFP_KERNEL);
+ ectx.ws = kcalloc(4, ws, GFP_ATOMIC);
else
ectx.ws = NULL;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index adabfa929f42..c575e7394d03 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6833,7 +6833,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
struct dc_stream_update stream_update;
} *bundle;
- bundle = kzalloc(sizeof(*bundle), GFP_KERNEL);
+ bundle = kzalloc(sizeof(*bundle), GFP_ATOMIC);
if (!bundle) {
dm_error("Failed to allocate update bundle\n");
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 45cfb7c45566..9a8e321a7a15 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1416,8 +1416,10 @@ bool dc_post_update_surfaces_to_stream(struct dc *dc)
struct dc_state *dc_create_state(struct dc *dc)
{
+ /* No you really cant allocate random crap here this late in
+ * atomic_commit_tail. */
struct dc_state *context = kvzalloc(sizeof(struct dc_state),
- GFP_KERNEL);
+ GFP_ATOMIC);
if (!context)
return NULL;
--
2.26.2
* [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (11 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 12/18] drm/amdgpu: DC also loves to allocate stuff where it shouldn't Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-05 8:30 ` Pierre-Eric Pelloux-Prayer
2020-06-04 8:12 ` [Intel-gfx] [PATCH 14/18] drm/scheduler: use dma-fence annotations in tdr work Daniel Vetter
` (17 subsequent siblings)
30 siblings, 1 reply; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
Trying to grab dma_resv_lock while in commit_tail before we've done
all the code that leads to the eventual signalling of the vblank event
(which can be a dma_fence) is deadlock-y. Don't do that.
Here the solution is easy because just grabbing locks to read
something races anyway. We don't need to bother, READ_ONCE is
equivalent. And avoids the locking issue.
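As a rough userspace illustration of the lockless read: the macro below
approximates the kernel's READ_ONCE with a volatile access and relies on the
GNU C __typeof__ extension. TOY_READ_ONCE, struct toy_bo and
toy_snapshot_tiling() are made-up names for this sketch, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace approximation of READ_ONCE: a single volatile load. */
#define TOY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the buffer object. */
struct toy_bo {
	uint64_t tiling_flags;
};

/*
 * Takes a snapshot without any lock. The value may already be stale when
 * the caller uses it, but that was equally true after unlocking in the
 * locked variant.
 */
uint64_t toy_snapshot_tiling(struct toy_bo *bo)
{
	return TOY_READ_ONCE(bo->tiling_flags);
}
```

The volatile access only keeps the compiler from re-reading or caching the
field; it gives no ordering against other fields, which is acceptable here
only because a single value is sampled.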
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c575e7394d03..04c11443b9ca 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6910,7 +6910,11 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
* explicitly on fences instead
* and in general should be called for
* blocking commit to as per framework helpers
+ *
+ * Yes, this deadlocks, since you're calling dma_resv_lock in a
+ * path that leads to a dma_fence_signal(). Don't do that.
*/
+#if 0
r = amdgpu_bo_reserve(abo, true);
if (unlikely(r != 0))
DRM_ERROR("failed to reserve buffer before flip\n");
@@ -6920,6 +6924,12 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
tmz_surface = amdgpu_bo_encrypted(abo);
amdgpu_bo_unreserve(abo);
+#endif
+ /*
+ * this races anyway, so READ_ONCE isn't any better or worse
+ * than the stuff above. Except the stuff above can deadlock.
+ */
+ tiling_flags = READ_ONCE(abo->tiling_flags);
fill_dc_plane_info_and_addr(
dm->adev, new_plane_state, tiling_flags,
--
2.26.2
* Re: [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
2020-06-04 8:12 ` [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail Daniel Vetter
@ 2020-06-05 8:30 ` Pierre-Eric Pelloux-Prayer
2020-06-05 12:41 ` Daniel Vetter
0 siblings, 1 reply; 106+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2020-06-05 8:30 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx,
Chris Wilson, linaro-mm-sig, Daniel Vetter, Christian König,
linux-media
Hi Daniel,
On 04/06/2020 10:12, Daniel Vetter wrote:
[...]
> @@ -6910,7 +6910,11 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> * explicitly on fences instead
> * and in general should be called for
> * blocking commit to as per framework helpers
> + *
> + * Yes, this deadlocks, since you're calling dma_resv_lock in a
> + * path that leads to a dma_fence_signal(). Don't do that.
> */
> +#if 0
> r = amdgpu_bo_reserve(abo, true);
> if (unlikely(r != 0))
> DRM_ERROR("failed to reserve buffer before flip\n");
> @@ -6920,6 +6924,12 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> tmz_surface = amdgpu_bo_encrypted(abo);
>
> amdgpu_bo_unreserve(abo);
> +#endif
> + /*
> + * this races anyway, so READ_ONCE isn't any better or worse
> + * than the stuff above. Except the stuff above can deadlock.
> + */
> + tiling_flags = READ_ONCE(abo->tiling_flags);
With this change "tmz_surface" won't be initialized properly.
Adding the following line should fix it:
tmz_surface = READ_ONCE(abo->flags) & AMDGPU_GEM_CREATE_ENCRYPTED;
Pierre-Eric
>
> fill_dc_plane_info_and_addr(
> dm->adev, new_plane_state, tiling_flags,
>
* Re: [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
2020-06-05 8:30 ` Pierre-Eric Pelloux-Prayer
@ 2020-06-05 12:41 ` Daniel Vetter
0 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-05 12:41 UTC (permalink / raw)
To: Pierre-Eric Pelloux-Prayer
Cc: linux-rdma, Intel Graphics Development, LKML, amd-gfx list,
Chris Wilson, moderated list:DMA BUFFER SHARING FRAMEWORK,
DRI Development, Daniel Vetter, Christian König,
open list:DMA BUFFER SHARING FRAMEWORK
On Fri, Jun 5, 2020 at 10:30 AM Pierre-Eric Pelloux-Prayer
<pierre-eric.pelloux-prayer@amd.com> wrote:
>
> Hi Daniel,
>
> On 04/06/2020 10:12, Daniel Vetter wrote:
> [...]
> > @@ -6910,7 +6910,11 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> > * explicitly on fences instead
> > * and in general should be called for
> > * blocking commit to as per framework helpers
> > + *
> > + * Yes, this deadlocks, since you're calling dma_resv_lock in a
> > + * path that leads to a dma_fence_signal(). Don't do that.
> > */
> > +#if 0
> > r = amdgpu_bo_reserve(abo, true);
> > if (unlikely(r != 0))
> > DRM_ERROR("failed to reserve buffer before flip\n");
> > @@ -6920,6 +6924,12 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> > tmz_surface = amdgpu_bo_encrypted(abo);
> >
> > amdgpu_bo_unreserve(abo);
> > +#endif
> > + /*
> > + * this races anyway, so READ_ONCE isn't any better or worse
> > + * than the stuff above. Except the stuff above can deadlock.
> > + */
> > + tiling_flags = READ_ONCE(abo->tiling_flags);
>
> With this change "tmz_surface" won't be initialized properly.
> Adding the following line should fix it:
>
> tmz_surface = READ_ONCE(abo->flags) & AMDGPU_GEM_CREATE_ENCRYPTED;
So to make this clear, I'm not really proposing to fix up all the
drivers in detail. There's a lot more bugs in all the other drivers,
I'm pretty sure. The driver fixups really are just quick hacks to
illustrate the problem, and at least in some cases, maybe illustrate a
possible solution.
For the real fixes I think this needs driver teams working on this,
and make sure it's all solid. I can help a bit with review (especially
for placing the annotations, e.g. the one I put in cs_submit()
annotates a bit too much), but that's it.
Also I think the patch is from before tmz landed, and I just blindly
rebased over it :-)
-Daniel
>
>
> Pierre-Eric
>
>
> >
> > fill_dc_plane_info_and_addr(
> > dm->adev, new_plane_state, tiling_flags,
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* [Intel-gfx] [PATCH 14/18] drm/scheduler: use dma-fence annotations in tdr work
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (12 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 13/18] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 15/18] drm/amdgpu: use dma-fence annotations for gpu reset code Daniel Vetter
` (16 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
In the face of unprivileged userspace being able to submit bogus gpu
workloads the kernel needs gpu timeout and reset (tdr) to guarantee
that dma_fences actually complete. Annotate this worker to make sure
we don't have any accidental locking inversions or other problems
lurking.
Originally this was part of the overall scheduler annotation patch.
But amdgpu has some glorious inversions here:
- grabs console_lock
- does a full modeset, which grabs all kinds of locks
(drm_modeset_lock, dma_resv_lock) which can deadlock with
dma_fence_wait held inside them.
- almost minor at that point, but the modeset code also allocates
memory
These all look like they'll be very hard to fix properly, the hardware
seems to require a full display reset with any gpu recovery.
Hence split out as a separate patch.
Since amdgpu isn't the only hardware driver that needs to reset the
display (at least gen2/3 on intel have the same problem) we need a
generic solution for this. There's two tricks we could steal from
drm/i915 and lift to dma-fence:
- The big whack, aka force-complete all fences. i915 does this for all
pending jobs if the reset is somehow stuck. Trouble is we'd need to
do this for all fences in the entire system, and just the
book-keeping for that will be fun. Plus lots of drivers use fences
for all kinds of internal stuff like memory management, so
unconditionally resetting all of them doesn't work.
I'm also hoping that with these fence annotations we could enlist
lockdep in finding the last offenders causing deadlocks, and we
could remove this get-out-of-jail trick.
- The more feasible approach (across drivers at least as part of the
dma_fence contract) is what drm/i915 does for gen2/3: When we need
to reset the display we wake up all dma_fence_wait_interruptible
calls, or well at least the equivalent of those in i915 internally.
Relying on ioctl restart we force all other threads to release their
locks, which means the tdr thread is guaranteed to be able to get
them. I think we could implement this at the dma_fence level,
including proper lockdep annotations.
dma_fence_begin_tdr():
- must be nested within a dma_fence_begin/end_signalling section
- will wake up all interruptible (but not the non-interruptible)
dma_fence_wait() calls and force them to complete with a
-ERESTARTSYS errno code. All new interruptible calls to
dma_fence_wait() will immediately fail with the same error code.
dma_fence_end_tdr():
- this will convert dma_fence_wait() calls back to normal.
Of course interrupting dma_fence_wait is only ok if the caller
specified that, which means we need to split the annotations into
interruptible and non-interruptible version. If we then make sure
that we only use interruptible dma_fence_wait() calls while holding
drm_modeset_lock we can grab them in tdr code, and allow display
resets. Doing the same for dma_resv_lock might be a lot harder, so
buffer updates must be avoided.
What's worse, we're not going to be able to make the dma_fence_wait
calls in mmu-notifiers interruptible, that doesn't work. So
allocating memory still won't be allowed, even in tdr sections. Plus
obviously we can use this trick only in tdr, it is rather intrusive.
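The proposed semantics can be modelled with a few lines of userspace C. None
of this exists: toy_begin_tdr(), toy_end_tdr() and toy_fence_wait() are
made-up names, the single global flag is a simplification, and the sketch
assumes that an interruptible wait fails during tdr even if the fence has
already signalled, which the description above leaves open.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_ERESTARTSYS 512	/* numeric value of the kernel's ERESTARTSYS */

/* Hypothetical global "a tdr section is in progress" flag. */
static bool tdr_active;

void toy_begin_tdr(void)
{
	tdr_active = true;
}

void toy_end_tdr(void)
{
	tdr_active = false;
}

/*
 * Models a single wait attempt.
 * Returns -TOY_ERESTARTSYS when an interruptible wait is kicked out by tdr,
 * 0 when the fence has signalled, and 1 when the caller would keep sleeping.
 */
int toy_fence_wait(bool fence_signalled, bool interruptible)
{
	if (interruptible && tdr_active)
		return -TOY_ERESTARTSYS;
	if (fence_signalled)
		return 0;
	return 1;
}
```

Non-interruptible waiters are left alone in this model, which is why any lock
held around such a wait stays off-limits for the tdr code.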
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 06a736e506ad..e34a44376e87 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -279,9 +279,12 @@ static void drm_sched_job_timedout(struct work_struct *work)
{
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+ bool fence_cookie;
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+ fence_cookie = dma_fence_begin_signalling();
+
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->ring_mirror_list,
@@ -313,6 +316,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
spin_lock(&sched->job_list_lock);
drm_sched_start_timeout(sched);
spin_unlock(&sched->job_list_lock);
+
+ dma_fence_end_signalling(fence_cookie);
}
/**
--
2.26.2
* [Intel-gfx] [PATCH 15/18] drm/amdgpu: use dma-fence annotations for gpu reset code
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (13 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 14/18] drm/scheduler: use dma-fence annotations in tdr work Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 16/18] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset" Daniel Vetter
` (15 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Christian König,
linux-media
To improve coverage also annotate the gpu reset code itself, since
that's called from other places than drm/scheduler (which is already
annotated). Annotations nest, so this doesn't break anything, and
allows easier testing.
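Why nesting is harmless can be shown with a small userspace model of the
cookie. This is a sketch under assumptions: the toy_*() functions and the
section_held flag are made up for illustration, the flag stands in for "the
current task holds the dma_fence lockdep map", and the model is
single-threaded. The cookie records whether the section was already held on
entry, so only the outermost end call releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task lockdep state. */
static bool section_held;

/* Returns true when called inside an existing section (nested call). */
bool toy_begin_signalling(void)
{
	if (section_held)
		return true;
	section_held = true;
	return false;
}

/* Only the outermost caller (cookie == false) leaves the section. */
void toy_end_signalling(bool cookie)
{
	if (cookie)
		return;
	section_held = false;
}

bool toy_in_signalling_section(void)
{
	return section_held;
}

/* Models an annotated callee such as the gpu reset function. */
bool toy_inner(void)
{
	bool cookie = toy_begin_signalling();
	bool was_nested = cookie;

	toy_end_signalling(cookie);
	return was_nested;
}
```

Calling toy_inner() from within an outer section leaves the outer section
intact, while calling it on its own opens and closes a section of its own,
which is the property that lets the reset code be annotated independently of
its callers.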
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a027a8f7b281..ac0286a5f2fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4215,6 +4215,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
(amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ?
true : false;
bool audio_suspended = false;
+ bool fence_cookie;
+
+ fence_cookie = dma_fence_begin_signalling();
/*
* Flush RAM to disk so that after reboot
@@ -4243,6 +4246,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
job ? job->base.id : -1, hive->hive_id);
mutex_unlock(&hive->hive_lock);
+ dma_fence_end_signalling(fence_cookie);
return 0;
}
@@ -4253,8 +4257,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
*/
INIT_LIST_HEAD(&device_list);
if (adev->gmc.xgmi.num_physical_nodes > 1) {
- if (!hive)
+ if (!hive) {
+ dma_fence_end_signalling(fence_cookie);
return -ENODEV;
+ }
if (!list_is_first(&adev->gmc.xgmi.head, &hive->device_list))
list_rotate_to_front(&adev->gmc.xgmi.head, &hive->device_list);
device_list_handle = &hive->device_list;
@@ -4269,6 +4275,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
DRM_INFO("Bailing on TDR for s_job:%llx, as another already in progress",
job ? job->base.id : -1);
mutex_unlock(&hive->hive_lock);
+ dma_fence_end_signalling(fence_cookie);
return 0;
}
@@ -4409,6 +4416,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
if (r)
dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
+ dma_fence_end_signalling(fence_cookie);
return r;
}
--
2.26.2
* [Intel-gfx] [PATCH 16/18] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (14 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 15/18] drm/amdgpu: use dma-fence annotations for gpu reset code Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 17/18] drm/amdgpu: gpu recovery does full modesets Daniel Vetter
` (14 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
This is one from the department of "maybe play lottery if you hit
this, karma compensation might work". Or at least lockdep ftw!
This reverts commit 565d1941557756a584ac357d945bc374d5fcd1d0.
It's not quite as low-risk as the commit message claims, because this
grabs console_lock, which might be held when we allocate memory, which
might never happen because the dma_fence_wait() is stuck waiting on
our gpu reset:
[ 136.763714] ======================================================
[ 136.763714] WARNING: possible circular locking dependency detected
[ 136.763715] 5.7.0-rc3+ #346 Tainted: G W
[ 136.763716] ------------------------------------------------------
[ 136.763716] kworker/2:3/682 is trying to acquire lock:
[ 136.763716] ffffffff8226f140 (console_lock){+.+.}-{0:0}, at: drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper]
[ 136.763723]
but task is already holding lock:
[ 136.763724] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
[ 136.763726]
which lock already depends on the new lock.
[ 136.763726]
the existing dependency chain (in reverse order) is:
[ 136.763727]
-> #2 (dma_fence_map){++++}-{0:0}:
[ 136.763730] __dma_fence_might_wait+0x41/0xb0
[ 136.763732] dma_resv_lockdep+0x171/0x202
[ 136.763734] do_one_initcall+0x5d/0x2f0
[ 136.763736] kernel_init_freeable+0x20d/0x26d
[ 136.763738] kernel_init+0xa/0xfb
[ 136.763740] ret_from_fork+0x27/0x50
[ 136.763740]
-> #1 (fs_reclaim){+.+.}-{0:0}:
[ 136.763743] fs_reclaim_acquire.part.0+0x25/0x30
[ 136.763745] kmem_cache_alloc_trace+0x2e/0x6e0
[ 136.763747] device_create_groups_vargs+0x52/0xf0
[ 136.763747] device_create+0x49/0x60
[ 136.763749] fb_console_init+0x25/0x145
[ 136.763750] fbmem_init+0xcc/0xe2
[ 136.763750] do_one_initcall+0x5d/0x2f0
[ 136.763751] kernel_init_freeable+0x20d/0x26d
[ 136.763752] kernel_init+0xa/0xfb
[ 136.763753] ret_from_fork+0x27/0x50
[ 136.763753]
-> #0 (console_lock){+.+.}-{0:0}:
[ 136.763755] __lock_acquire+0x1241/0x23f0
[ 136.763756] lock_acquire+0xad/0x370
[ 136.763757] console_lock+0x47/0x70
[ 136.763761] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper]
[ 136.763809] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu]
[ 136.763850] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
[ 136.763851] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
[ 136.763852] process_one_work+0x23c/0x580
[ 136.763853] worker_thread+0x50/0x3b0
[ 136.763854] kthread+0x12e/0x150
[ 136.763855] ret_from_fork+0x27/0x50
[ 136.763855]
other info that might help us debug this:
[ 136.763856] Chain exists of:
console_lock --> fs_reclaim --> dma_fence_map
[ 136.763857] Possible unsafe locking scenario:
[ 136.763857] CPU0 CPU1
[ 136.763857] ---- ----
[ 136.763857] lock(dma_fence_map);
[ 136.763858] lock(fs_reclaim);
[ 136.763858] lock(dma_fence_map);
[ 136.763858] lock(console_lock);
[ 136.763859]
*** DEADLOCK ***
[ 136.763860] 4 locks held by kworker/2:3/682:
[ 136.763860] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
[ 136.763862] #1: ffffc90000cafe58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
[ 136.763863] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
[ 136.763865] #3: ffff8887ab621748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x5ab/0xe7b [amdgpu]
[ 136.763914]
stack backtrace:
[ 136.763915] CPU: 2 PID: 682 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346
[ 136.763916] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018
[ 136.763918] Workqueue: events drm_sched_job_timedout [gpu_sched]
[ 136.763919] Call Trace:
[ 136.763922] dump_stack+0x8f/0xd0
[ 136.763924] check_noncircular+0x162/0x180
[ 136.763926] __lock_acquire+0x1241/0x23f0
[ 136.763927] lock_acquire+0xad/0x370
[ 136.763932] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper]
[ 136.763933] ? mark_held_locks+0x2d/0x80
[ 136.763934] ? _raw_spin_unlock_irqrestore+0x46/0x60
[ 136.763936] console_lock+0x47/0x70
[ 136.763940] ? drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper]
[ 136.763944] drm_fb_helper_set_suspend_unlocked+0x7b/0xa0 [drm_kms_helper]
[ 136.763993] amdgpu_device_gpu_recover.cold+0x21e/0xe7b [amdgpu]
[ 136.764036] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
[ 136.764038] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
[ 136.764040] process_one_work+0x23c/0x580
[ 136.764041] worker_thread+0x50/0x3b0
[ 136.764042] ? process_one_work+0x580/0x580
[ 136.764044] kthread+0x12e/0x150
[ 136.764045] ? kthread_create_worker_on_cpu+0x70/0x70
[ 136.764046] ret_from_fork+0x27/0x50
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ac0286a5f2fc..4c4492de670c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4063,8 +4063,6 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
if (r)
goto out;
- amdgpu_fbdev_set_suspend(tmp_adev, 0);
-
/* must succeed. */
amdgpu_ras_resume(tmp_adev);
@@ -4305,8 +4303,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
*/
amdgpu_unregister_gpu_instance(tmp_adev);
- amdgpu_fbdev_set_suspend(tmp_adev, 1);
-
/* disable ras on ALL IPs */
if (!(in_ras_intr && !use_baco) &&
amdgpu_device_ip_need_full_reset(tmp_adev))
--
2.26.2
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [Intel-gfx] [PATCH 17/18] drm/amdgpu: gpu recovery does full modesets
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (15 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 16/18] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset" Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:12 ` [Intel-gfx] [PATCH 18/18] drm/i915: Annotate dma_fence_work Daniel Vetter
` (13 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
...
I think it's time to stop this little exercise.
The lockdep splat, for the record:
[ 132.583381] ======================================================
[ 132.584091] WARNING: possible circular locking dependency detected
[ 132.584775] 5.7.0-rc3+ #346 Tainted: G W
[ 132.585461] ------------------------------------------------------
[ 132.586184] kworker/2:3/865 is trying to acquire lock:
[ 132.586857] ffffc90000677c70 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
[ 132.587569]
but task is already holding lock:
[ 132.589044] ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
[ 132.589803]
which lock already depends on the new lock.
[ 132.592009]
the existing dependency chain (in reverse order) is:
[ 132.593507]
-> #2 (dma_fence_map){++++}-{0:0}:
[ 132.595019] dma_fence_begin_signalling+0x50/0x60
[ 132.595767] drm_atomic_helper_commit+0xa1/0x180 [drm_kms_helper]
[ 132.596567] drm_client_modeset_commit_atomic+0x1ea/0x250 [drm]
[ 132.597420] drm_client_modeset_commit_locked+0x55/0x190 [drm]
[ 132.598178] drm_client_modeset_commit+0x24/0x40 [drm]
[ 132.598948] drm_fb_helper_restore_fbdev_mode_unlocked+0x4b/0xa0 [drm_kms_helper]
[ 132.599738] drm_fb_helper_set_par+0x30/0x40 [drm_kms_helper]
[ 132.600539] fbcon_init+0x2e8/0x660
[ 132.601344] visual_init+0xce/0x130
[ 132.602156] do_bind_con_driver+0x1bc/0x2b0
[ 132.602970] do_take_over_console+0x115/0x180
[ 132.603763] do_fbcon_takeover+0x58/0xb0
[ 132.604564] register_framebuffer+0x1ee/0x300
[ 132.605369] __drm_fb_helper_initial_config_and_unlock+0x36e/0x520 [drm_kms_helper]
[ 132.606187] amdgpu_fbdev_init+0xb3/0xf0 [amdgpu]
[ 132.607032] amdgpu_device_init.cold+0xe90/0x1677 [amdgpu]
[ 132.607862] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu]
[ 132.608697] amdgpu_pci_probe+0xf7/0x180 [amdgpu]
[ 132.609511] local_pci_probe+0x42/0x80
[ 132.610324] pci_device_probe+0x104/0x1a0
[ 132.611130] really_probe+0x147/0x3c0
[ 132.611939] driver_probe_device+0xb6/0x100
[ 132.612766] device_driver_attach+0x53/0x60
[ 132.613593] __driver_attach+0x8c/0x150
[ 132.614419] bus_for_each_dev+0x7b/0xc0
[ 132.615249] bus_add_driver+0x14c/0x1f0
[ 132.616071] driver_register+0x6c/0xc0
[ 132.616902] do_one_initcall+0x5d/0x2f0
[ 132.617731] do_init_module+0x5c/0x230
[ 132.618560] load_module+0x2981/0x2bc0
[ 132.619391] __do_sys_finit_module+0xaa/0x110
[ 132.620228] do_syscall_64+0x5a/0x250
[ 132.621064] entry_SYSCALL_64_after_hwframe+0x49/0xb3
[ 132.621903]
-> #1 (crtc_ww_class_mutex){+.+.}-{3:3}:
[ 132.623587] __ww_mutex_lock.constprop.0+0xcc/0x10c0
[ 132.624448] ww_mutex_lock+0x43/0xb0
[ 132.625315] drm_modeset_lock+0x44/0x120 [drm]
[ 132.626184] drmm_mode_config_init+0x2db/0x8b0 [drm]
[ 132.627098] amdgpu_device_init.cold+0xbd1/0x1677 [amdgpu]
[ 132.628007] amdgpu_driver_load_kms+0x5a/0x200 [amdgpu]
[ 132.628920] amdgpu_pci_probe+0xf7/0x180 [amdgpu]
[ 132.629804] local_pci_probe+0x42/0x80
[ 132.630690] pci_device_probe+0x104/0x1a0
[ 132.631583] really_probe+0x147/0x3c0
[ 132.632479] driver_probe_device+0xb6/0x100
[ 132.633379] device_driver_attach+0x53/0x60
[ 132.634275] __driver_attach+0x8c/0x150
[ 132.635170] bus_for_each_dev+0x7b/0xc0
[ 132.636069] bus_add_driver+0x14c/0x1f0
[ 132.636974] driver_register+0x6c/0xc0
[ 132.637870] do_one_initcall+0x5d/0x2f0
[ 132.638765] do_init_module+0x5c/0x230
[ 132.639654] load_module+0x2981/0x2bc0
[ 132.640522] __do_sys_finit_module+0xaa/0x110
[ 132.641372] do_syscall_64+0x5a/0x250
[ 132.642203] entry_SYSCALL_64_after_hwframe+0x49/0xb3
[ 132.643022]
-> #0 (crtc_ww_class_acquire){+.+.}-{0:0}:
[ 132.644643] __lock_acquire+0x1241/0x23f0
[ 132.645469] lock_acquire+0xad/0x370
[ 132.646274] drm_modeset_acquire_init+0xd2/0x100 [drm]
[ 132.647071] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
[ 132.647902] dm_suspend+0x1c/0x60 [amdgpu]
[ 132.648698] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
[ 132.649498] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
[ 132.650300] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu]
[ 132.651084] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
[ 132.651825] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
[ 132.652594] process_one_work+0x23c/0x580
[ 132.653402] worker_thread+0x50/0x3b0
[ 132.654139] kthread+0x12e/0x150
[ 132.654868] ret_from_fork+0x27/0x50
[ 132.655598]
other info that might help us debug this:
[ 132.657739] Chain exists of:
crtc_ww_class_acquire --> crtc_ww_class_mutex --> dma_fence_map
[ 132.659877] Possible unsafe locking scenario:
[ 132.661416]        CPU0                    CPU1
[ 132.662126]        ----                    ----
[ 132.662847]   lock(dma_fence_map);
[ 132.663574]                                lock(crtc_ww_class_mutex);
[ 132.664319]                                lock(dma_fence_map);
[ 132.665063]   lock(crtc_ww_class_acquire);
[ 132.665799]
*** DEADLOCK ***
[ 132.667965] 4 locks held by kworker/2:3/865:
[ 132.668701] #0: ffff8887fb81c938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
[ 132.669462] #1: ffffc90000677e58 ((work_completion)(&(&sched->work_tdr)->work)){+.+.}-{0:0}, at: process_one_work+0x1bc/0x580
[ 132.670242] #2: ffffffff82318c80 (dma_fence_map){++++}-{0:0}, at: drm_sched_job_timedout+0x25/0xf0 [gpu_sched]
[ 132.671039] #3: ffff8887b84a1748 (&adev->lock_reset){+.+.}-{3:3}, at: amdgpu_device_gpu_recover.cold+0x59e/0xe64 [amdgpu]
[ 132.671902]
stack backtrace:
[ 132.673515] CPU: 2 PID: 865 Comm: kworker/2:3 Tainted: G W 5.7.0-rc3+ #346
[ 132.674347] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018
[ 132.675194] Workqueue: events drm_sched_job_timedout [gpu_sched]
[ 132.676046] Call Trace:
[ 132.676897] dump_stack+0x8f/0xd0
[ 132.677748] check_noncircular+0x162/0x180
[ 132.678604] ? stack_trace_save+0x4b/0x70
[ 132.679459] __lock_acquire+0x1241/0x23f0
[ 132.680311] lock_acquire+0xad/0x370
[ 132.681163] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
[ 132.682021] ? cpumask_next+0x16/0x20
[ 132.682880] ? module_assert_mutex_or_preempt+0x14/0x40
[ 132.683737] ? __module_address+0x28/0xf0
[ 132.684601] drm_modeset_acquire_init+0xd2/0x100 [drm]
[ 132.685466] ? drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
[ 132.686335] drm_atomic_helper_suspend+0x38/0x120 [drm_kms_helper]
[ 132.687255] dm_suspend+0x1c/0x60 [amdgpu]
[ 132.688152] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
[ 132.689057] ? amdgpu_fence_process+0x4c/0x150 [amdgpu]
[ 132.689963] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
[ 132.690893] amdgpu_device_gpu_recover.cold+0x4e6/0xe64 [amdgpu]
[ 132.691818] amdgpu_job_timedout+0xfb/0x150 [amdgpu]
[ 132.692707] drm_sched_job_timedout+0x8a/0xf0 [gpu_sched]
[ 132.693597] process_one_work+0x23c/0x580
[ 132.694487] worker_thread+0x50/0x3b0
[ 132.695373] ? process_one_work+0x580/0x580
[ 132.696264] kthread+0x12e/0x150
[ 132.697154] ? kthread_create_worker_on_cpu+0x70/0x70
[ 132.698057] ret_from_fork+0x27/0x50
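Reading the report: lockdep had already recorded the orderings crtc_ww_class_acquire --> crtc_ww_class_mutex --> dma_fence_map, and the timeout worker now takes crtc_ww_class_acquire while inside dma_fence_map, which closes the loop. The following is a toy userspace sketch of that kind of check, not lockdep's actual implementation. The names lock_graph, add_dependency and would_deadlock are made up for illustration; only the three lock class names come from the splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Hypothetical lock classes, named after the ones in the splat. */
enum lock_class {
	CRTC_WW_CLASS_ACQUIRE,
	CRTC_WW_CLASS_MUTEX,
	DMA_FENCE_MAP,
};

/* edge[a][b] is true once "b was acquired while holding a" has been seen. */
struct lock_graph {
	bool edge[MAX_CLASSES][MAX_CLASSES];
};

/* Depth-first search: is "to" reachable from "from" along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to,
		      bool *visited)
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (g->edge[from][next] && !visited[next] &&
		    reachable(g, next, to, visited))
			return true;
	}
	return false;
}

/*
 * Taking "wanted" while holding "held" adds the edge held -> wanted. That
 * closes a cycle when "held" is already reachable from "wanted".
 */
static bool would_deadlock(const struct lock_graph *g, int held, int wanted)
{
	bool visited[MAX_CLASSES];

	memset(visited, 0, sizeof(visited));
	return reachable(g, wanted, held, visited);
}

/* Record an ordering; returns false (and records nothing) on a cycle. */
static bool add_dependency(struct lock_graph *g, int held, int wanted)
{
	if (would_deadlock(g, held, wanted))
		return false;
	g->edge[held][wanted] = true;
	return true;
}
```

Feeding in the two existing orderings and then asking for dma_fence_map -> crtc_ww_class_acquire makes add_dependency() refuse the edge, which corresponds to the "possible circular locking dependency" warning above. Like lockdep, the sketch flags the ordering itself, whether or not the two paths ever actually race.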
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4c4492de670c..3ea4b9258fb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2441,6 +2441,14 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
/* displays are handled separately */
if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) {
/* XXX handle errors */
+
+ /*
+ * This is dm_suspend, which takes modeset locks, and
+ * that's a pretty good inversion against dma_fence_signal
+ * which gpu recovery is supposed to guarantee.
+ *
+ * Don't ask me how to fix this.
+ */
r = adev->ip_blocks[i].version->funcs->suspend(adev);
/* XXX handle errors */
if (r) {
--
2.26.2
* [Intel-gfx] [PATCH 18/18] drm/i915: Annotate dma_fence_work
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (16 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 17/18] drm/amdgpu: gpu recovery does full modesets Daniel Vetter
@ 2020-06-04 8:12 ` Daniel Vetter
2020-06-04 8:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 Patchwork
` (12 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Daniel Vetter @ 2020-06-04 8:12 UTC (permalink / raw)
To: DRI Development
Cc: linux-rdma, Daniel Vetter, Intel Graphics Development, LKML,
amd-gfx, Chris Wilson, linaro-mm-sig, Daniel Vetter,
Christian König, linux-media
i915 does tons of allocations from this worker, which lockdep catches.
Also, generic infrastructure like this, with big potential impact on how
dma_fence or other cross-driver contracts work, really should be
reviewed on dri-devel. Implementing custom wheels for everything
within the driver is a classic case of the "platform problem" [1], which
in upstream we really shouldn't have.
Since, unlike for amdgpu, there's no quick way to solve these splats
(dma_fence_work is used a bunch in basic buffer management and command
submission), I'm giving up at this point. Annotating the i915
scheduler and gpu reset code would be interesting, but since lockdep
is one-shot we can't see what surprises lurk there.
[1]: https://lwn.net/Articles/443531/
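For readers new to the annotation, the idea behind the diff below can be modelled in userspace: bracket the code that must run for a fence to signal, and treat any allocation that may wait on reclaim inside that bracket as a bug. This is a toy single-threaded sketch, not the kernel implementation. Every name in it (begin_signalling, end_signalling, checked_alloc, ALLOC_MAY_RECLAIM, toy_fence_work) is hypothetical, and the real annotation works through lockdep dependency tracking rather than a flag and a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the allocation modes discussed in the series. */
enum alloc_flags {
	ALLOC_ATOMIC,		/* never waits for reclaim */
	ALLOC_MAY_RECLAIM,	/* may block on reclaim, GFP_KERNEL-like */
};

static bool in_signalling_section;
static unsigned int violations;

/* Returns true when already inside a section, so nested use is a no-op. */
static bool begin_signalling(void)
{
	if (in_signalling_section)
		return true;
	in_signalling_section = true;
	return false;
}

/* Only the outermost caller (cookie == false) leaves the section. */
static void end_signalling(bool cookie)
{
	if (!cookie)
		in_signalling_section = false;
}

/* Counts allocations that could wait on reclaim while a fence is pending. */
static void *checked_alloc(size_t size, enum alloc_flags flags)
{
	if (in_signalling_section && flags == ALLOC_MAY_RECLAIM)
		violations++;
	return malloc(size);
}

/* Same shape as the annotated worker: begin, do the work, end. */
static void toy_fence_work(enum alloc_flags flags)
{
	bool cookie = begin_signalling();

	free(checked_alloc(64, flags));
	end_signalling(cookie);
}
```

Calling toy_fence_work(ALLOC_MAY_RECLAIM) bumps the violations counter while toy_fence_work(ALLOC_ATOMIC) does not, which is the class of problem the commit message says lockdep catches in this worker.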
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
drivers/gpu/drm/i915/i915_sw_fence_work.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c
index a3a81bb8f2c3..5b74acadaef5 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -17,12 +17,15 @@ static void fence_work(struct work_struct *work)
{
struct dma_fence_work *f = container_of(work, typeof(*f), work);
int err;
+ bool fence_cookie;
+ fence_cookie = dma_fence_begin_signalling();
err = f->ops->work(f);
if (err)
dma_fence_set_error(&f->dma, err);
fence_complete(f);
+ dma_fence_end_signalling(fence_cookie);
dma_fence_put(&f->dma);
}
--
2.26.2
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (17 preceding siblings ...)
2020-06-04 8:12 ` [Intel-gfx] [PATCH 18/18] drm/i915: Annotate dma_fence_work Daniel Vetter
@ 2020-06-04 8:55 ` Patchwork
2020-06-04 8:57 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
` (11 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-04 8:55 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
199c3f9df986 mm: Track mmu notifiers in fs_reclaim_acquire/release
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#12:
recursions we do have lockdep annotations since 23b68395c7c7
-:41: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep")'
#41:
With this we can also remove the lockdep priming added in 66204f1d2d1b
-:116: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#116: FILE: mm/page_alloc.c:4165:
+
+ }
-:130: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 2 errors, 1 warnings, 1 checks, 65 lines checked
b85d9997eaca dma-buf: minor doc touch-ups
-:32: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
7f8c3b44f8eb dma-fence: basic lockdep annotations
-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests")'
#23:
commit e91498589746065e3ae95d9a00b068e525eec34f
-:261: ERROR:IN_ATOMIC: do not use in_atomic in drivers
#261: FILE: drivers/dma-buf/dma-fence.c:228:
+ if (in_atomic())
-:299: CHECK:LINE_SPACING: Please don't use multiple blank lines
#299: FILE: drivers/dma-buf/dma-fence.c:266:
+
+
-:348: CHECK:LINE_SPACING: Please use a blank line after function/struct/union/enum declarations
#348: FILE: include/linux/dma-fence.h:368:
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
-:354: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 2 errors, 1 warnings, 2 checks, 231 lines checked
96b50a5032df dma-fence: prime lockdep annotations
-:31: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#31:
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
-:169: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 1 errors, 1 warnings, 0 checks, 82 lines checked
2fb5c8b43ac8 drm/vkms: Annotate vblank timer
-:59: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 25 lines checked
93efe4f7dc82 drm/vblank: Annotate with dma-fence signalling section
-:71: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 38 lines checked
9d42aa205b3f drm/atomic-helper: Add dma-fence annotations
-:119: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 76 lines checked
911b274bb909 drm/amdgpu: add dma-fence annotations to atomic commit path
-:52: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 24 lines checked
0485794be8aa drm/scheduler: use dma-fence annotations in main thread
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 21 lines checked
82020872a9a2 drm/amdgpu: use dma-fence annotations in cs_submit()
-:65: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 29 lines checked
05627337ac19 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
-:82: WARNING:ALLOC_ARRAY_ARGS: kmalloc_array uses number as first arg, sizeof is generally wrong
#82: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c:211:
+ fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
-:98: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 24 lines checked
753d44dd8a51 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
-:70: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#70: FILE: drivers/gpu/drm/amd/display/dc/core/dc.c:1420:
+ * atomic_commit_tail. */
-:76: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 27 lines checked
838703bb63b9 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
-:39: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#39: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:6917:
+#if 0
-:55: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 23 lines checked
6fd3e8ef0756 drm/scheduler: use dma-fence annotations in tdr work
-:28: WARNING:TYPO_SPELLING: 'seperate' may be misspelled - perhaps 'separate'?
#28:
Hence split out as a seperate patch.
-:114: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 20 lines checked
552ee41a6739 drm/amdgpu: use dma-fence annotations for gpu reset code
1bb25e8d8189 Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
-:145: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 16 lines checked
77ef9df05cc0 drm/amdgpu: gpu recovery does full modesets
-:186: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
4db85879be37 drm/i915: Annotate dma_fence_work
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 15 lines checked
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for dma-fence lockdep annotations, round 2
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (18 preceding siblings ...)
2020-06-04 8:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 Patchwork
@ 2020-06-04 8:57 ` Patchwork
2020-06-04 9:08 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
` (10 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-04 8:57 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: expected unsigned int [addressable] [usertype] ulClockParams
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: got restricted __le32 [usertype]
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: warning: incorrect type in assignment (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1028:50: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1029:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1037:47: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:184:44: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:283:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:320:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:323:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:326:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:329:18: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:330:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:338:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:340:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:342:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:346:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:348:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:353:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:367:43: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:369:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:374:67: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:375:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:378:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:389:80: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:395:57: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:402:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:403:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:406:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:414:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:423:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:424:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:473:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:476:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:477:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:484:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:52:28: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:531:35: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:53:29: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:533:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:54:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:55:27: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:56:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:57:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:577:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:581:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:58:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:583:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:586:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:590:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:59:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:598:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:600:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:617:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:621:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:623:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:630:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:632:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:644:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:648:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:650:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:657:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:659:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:662:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:664:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:676:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:688:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:691:47: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:697:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:796:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:797:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:800:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:801:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:804:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:805:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:812:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:813:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:816:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:817:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:820:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:821:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:828:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:829:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:832:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:833:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:836:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:837:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:844:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:845:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:848:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:849:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:852:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:853:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:916:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:918:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:920:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:934:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:936:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:938:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:956:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:958:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:960:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:296:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:330:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:360:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:362:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:369:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:383:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:406:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:44:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:447:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:451:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:454:61: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:455:64: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:457:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:483:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:486:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:64:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:85:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:86:24: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:98:39: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:222:29: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:227:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:233:43: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:236:44: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:239:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:464:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:465:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:466:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:468:24: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: expected unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: got unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: expected struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: got struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: expected struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: got struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1256:25: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1257:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1313:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: expected restricted __poll_t ( *poll )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: got unsigned int ( * )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: warning: incorrect type in initializer (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1618:65: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1625:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1627:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1628:56: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1630:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1631:45: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1632:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1633:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1634:57: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1636:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1637:53: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1639:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1641:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1642:46: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1646:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1648:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1650:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1661:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:260:16: error: incompatible types in comparison expression (different type sizes)
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:507:39: warning: cast removes address space '<asn:2>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:527:31: warning: cast removes address space '<asn:2>
* [Intel-gfx] ✗ Fi.CI.BAT: failure for dma-fence lockdep annotations, round 2
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (19 preceding siblings ...)
2020-06-04 8:57 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-04 9:08 ` Patchwork
2020-06-05 13:59 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev2) Patchwork
` (9 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-04 9:08 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2
URL : https://patchwork.freedesktop.org/series/77986/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_8580 -> Patchwork_17864
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_17864 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_17864, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_17864:
### IGT changes ###
#### Possible regressions ####
* igt@gem_close_race@basic-process:
- fi-ivb-3770: [PASS][1] -> [DMESG-WARN][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-ivb-3770/igt@gem_close_race@basic-process.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-ivb-3770/igt@gem_close_race@basic-process.html
- fi-byt-j1900: [PASS][3] -> [DMESG-WARN][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-byt-j1900/igt@gem_close_race@basic-process.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-byt-j1900/igt@gem_close_race@basic-process.html
- fi-hsw-4770: [PASS][5] -> [DMESG-WARN][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-hsw-4770/igt@gem_close_race@basic-process.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-hsw-4770/igt@gem_close_race@basic-process.html
- fi-byt-n2820: [PASS][7] -> [DMESG-WARN][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-byt-n2820/igt@gem_close_race@basic-process.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-byt-n2820/igt@gem_close_race@basic-process.html
* igt@gem_tiled_blits@basic:
- fi-pnv-d510: [PASS][9] -> [DMESG-WARN][10]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-pnv-d510/igt@gem_tiled_blits@basic.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-pnv-d510/igt@gem_tiled_blits@basic.html
* igt@kms_busy@basic@flip:
- fi-snb-2600: [PASS][11] -> [DMESG-WARN][12]
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-snb-2600/igt@kms_busy@basic@flip.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-snb-2600/igt@kms_busy@basic@flip.html
- fi-snb-2520m: [PASS][13] -> [DMESG-WARN][14]
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-snb-2520m/igt@kms_busy@basic@flip.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-snb-2520m/igt@kms_busy@basic@flip.html
* igt@kms_frontbuffer_tracking@basic:
- fi-ilk-650: [PASS][15] -> [DMESG-WARN][16]
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
* igt@runner@aborted:
- fi-pnv-d510: NOTRUN -> [FAIL][17]
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-pnv-d510/igt@runner@aborted.html
- fi-cfl-8700k: NOTRUN -> [FAIL][18]
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-8700k/igt@runner@aborted.html
- fi-tgl-y: NOTRUN -> [FAIL][19]
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-y/igt@runner@aborted.html
- fi-cfl-8109u: NOTRUN -> [FAIL][20]
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-8109u/igt@runner@aborted.html
- fi-icl-u2: NOTRUN -> [FAIL][21]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-u2/igt@runner@aborted.html
- fi-snb-2520m: NOTRUN -> [FAIL][22]
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-snb-2520m/igt@runner@aborted.html
- fi-bdw-5557u: NOTRUN -> [FAIL][23]
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-bdw-5557u/igt@runner@aborted.html
- fi-byt-n2820: NOTRUN -> [FAIL][24]
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-byt-n2820/igt@runner@aborted.html
- fi-icl-guc: NOTRUN -> [FAIL][25]
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-guc/igt@runner@aborted.html
- fi-hsw-4770: NOTRUN -> [FAIL][26]
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-hsw-4770/igt@runner@aborted.html
- fi-snb-2600: NOTRUN -> [FAIL][27]
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-snb-2600/igt@runner@aborted.html
- fi-whl-u: NOTRUN -> [FAIL][28]
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-whl-u/igt@runner@aborted.html
- fi-cml-u2: NOTRUN -> [FAIL][29]
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cml-u2/igt@runner@aborted.html
- fi-ivb-3770: NOTRUN -> [FAIL][30]
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-ivb-3770/igt@runner@aborted.html
- fi-bxt-dsi: NOTRUN -> [FAIL][31]
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-bxt-dsi/igt@runner@aborted.html
- fi-byt-j1900: NOTRUN -> [FAIL][32]
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-byt-j1900/igt@runner@aborted.html
- fi-cml-s: NOTRUN -> [FAIL][33]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cml-s/igt@runner@aborted.html
- fi-cfl-guc: NOTRUN -> [FAIL][34]
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-guc/igt@runner@aborted.html
- fi-icl-y: NOTRUN -> [FAIL][35]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-y/igt@runner@aborted.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* {igt@gem_busy@busy@all}:
- fi-kbl-x1275: [PASS][36] -> [DMESG-WARN][37]
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-x1275/igt@gem_busy@busy@all.html
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-x1275/igt@gem_busy@busy@all.html
- fi-cfl-8700k: [PASS][38] -> [DMESG-WARN][39]
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-cfl-8700k/igt@gem_busy@busy@all.html
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-8700k/igt@gem_busy@busy@all.html
- fi-tgl-y: [PASS][40] -> [DMESG-WARN][41]
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-tgl-y/igt@gem_busy@busy@all.html
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-y/igt@gem_busy@busy@all.html
- fi-skl-6600u: [PASS][42] -> [DMESG-WARN][43]
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-skl-6600u/igt@gem_busy@busy@all.html
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-skl-6600u/igt@gem_busy@busy@all.html
- fi-cfl-8109u: [PASS][44] -> [DMESG-WARN][45]
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-cfl-8109u/igt@gem_busy@busy@all.html
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-8109u/igt@gem_busy@busy@all.html
- fi-icl-u2: [PASS][46] -> [DMESG-WARN][47]
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-icl-u2/igt@gem_busy@busy@all.html
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-u2/igt@gem_busy@busy@all.html
- {fi-tgl-dsi}: [PASS][48] -> [DMESG-WARN][49]
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-tgl-dsi/igt@gem_busy@busy@all.html
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-dsi/igt@gem_busy@busy@all.html
- fi-glk-dsi: [PASS][50] -> [DMESG-WARN][51]
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-glk-dsi/igt@gem_busy@busy@all.html
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-glk-dsi/igt@gem_busy@busy@all.html
- fi-kbl-8809g: [PASS][52] -> [DMESG-WARN][53]
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-8809g/igt@gem_busy@busy@all.html
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-8809g/igt@gem_busy@busy@all.html
- fi-skl-lmem: [PASS][54] -> [DMESG-WARN][55]
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-skl-lmem/igt@gem_busy@busy@all.html
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-skl-lmem/igt@gem_busy@busy@all.html
- fi-kbl-r: [PASS][56] -> [DMESG-WARN][57]
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-r/igt@gem_busy@busy@all.html
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-r/igt@gem_busy@busy@all.html
- fi-bdw-5557u: [PASS][58] -> [DMESG-WARN][59]
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-bdw-5557u/igt@gem_busy@busy@all.html
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-bdw-5557u/igt@gem_busy@busy@all.html
- fi-icl-guc: [PASS][60] -> [DMESG-WARN][61]
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-icl-guc/igt@gem_busy@busy@all.html
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-guc/igt@gem_busy@busy@all.html
- fi-kbl-soraka: [PASS][62] -> [DMESG-WARN][63]
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-soraka/igt@gem_busy@busy@all.html
[63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-soraka/igt@gem_busy@busy@all.html
- {fi-ehl-1}: [PASS][64] -> [DMESG-WARN][65]
[64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-ehl-1/igt@gem_busy@busy@all.html
[65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-ehl-1/igt@gem_busy@busy@all.html
- fi-kbl-7500u: [PASS][66] -> [DMESG-WARN][67]
[66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-7500u/igt@gem_busy@busy@all.html
[67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-7500u/igt@gem_busy@busy@all.html
- fi-kbl-guc: [PASS][68] -> [DMESG-WARN][69]
[68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-guc/igt@gem_busy@busy@all.html
[69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-guc/igt@gem_busy@busy@all.html
- fi-whl-u: [PASS][70] -> [DMESG-WARN][71]
[70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-whl-u/igt@gem_busy@busy@all.html
[71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-whl-u/igt@gem_busy@busy@all.html
- fi-cml-u2: [PASS][72] -> [DMESG-WARN][73]
[72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-cml-u2/igt@gem_busy@busy@all.html
[73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cml-u2/igt@gem_busy@busy@all.html
- fi-bxt-dsi: [PASS][74] -> [DMESG-WARN][75]
[74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-bxt-dsi/igt@gem_busy@busy@all.html
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-bxt-dsi/igt@gem_busy@busy@all.html
- {fi-tgl-u}: [PASS][76] -> [DMESG-WARN][77]
[76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-tgl-u/igt@gem_busy@busy@all.html
[77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-u/igt@gem_busy@busy@all.html
- fi-cml-s: [PASS][78] -> [DMESG-WARN][79]
[78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-cml-s/igt@gem_busy@busy@all.html
[79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cml-s/igt@gem_busy@busy@all.html
- fi-cfl-guc: [PASS][80] -> [DMESG-WARN][81]
[80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-cfl-guc/igt@gem_busy@busy@all.html
[81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-cfl-guc/igt@gem_busy@busy@all.html
- fi-icl-y: [PASS][82] -> [DMESG-WARN][83]
[82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-icl-y/igt@gem_busy@busy@all.html
[83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-icl-y/igt@gem_busy@busy@all.html
- fi-skl-guc: [PASS][84] -> [DMESG-WARN][85]
[84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-skl-guc/igt@gem_busy@busy@all.html
[85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-skl-guc/igt@gem_busy@busy@all.html
- fi-skl-6700k2: [PASS][86] -> [DMESG-WARN][87]
[86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-skl-6700k2/igt@gem_busy@busy@all.html
[87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-skl-6700k2/igt@gem_busy@busy@all.html
* igt@runner@aborted:
- {fi-tgl-dsi}: NOTRUN -> [FAIL][88]
[88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-dsi/igt@runner@aborted.html
- {fi-ehl-1}: NOTRUN -> [FAIL][89]
[89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-ehl-1/igt@runner@aborted.html
- {fi-tgl-u}: NOTRUN -> [FAIL][90]
[90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-tgl-u/igt@runner@aborted.html
Known issues
------------
Here are the changes found in Patchwork_17864 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@debugfs_test@read_all_entries:
- fi-kbl-soraka: [PASS][91] -> [DMESG-WARN][92] ([i915#1982])
[91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8580/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
[92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
Participating hosts (50 -> 44)
------------------------------
Missing (6): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus
Build changes
-------------
* Linux: CI_DRM_8580 -> Patchwork_17864
CI-20190529: 20190529
CI_DRM_8580: dbab119950f978cd41000b0daba1ff332e5b0856 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5694: a9b6c4c74bfddf7d3d2da3be08804fe315945cea @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_17864: 4db85879be37fd460696d0ea753a5eb243880719 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
4db85879be37 drm/i915: Annotate dma_fence_work
77ef9df05cc0 drm/amdgpu: gpu recovery does full modesets
1bb25e8d8189 Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
552ee41a6739 drm/amdgpu: use dma-fence annotations for gpu reset code
6fd3e8ef0756 drm/scheduler: use dma-fence annotations in tdr work
838703bb63b9 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
753d44dd8a51 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
05627337ac19 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
82020872a9a2 drm/amdgpu: use dma-fence annotations in cs_submit()
0485794be8aa drm/scheduler: use dma-fence annotations in main thread
911b274bb909 drm/amdgpu: add dma-fence annotations to atomic commit path
9d42aa205b3f drm/atomic-helper: Add dma-fence annotations
93efe4f7dc82 drm/vblank: Annotate with dma-fence signalling section
2fb5c8b43ac8 drm/vkms: Annotate vblank timer
96b50a5032df dma-fence: prime lockdep annotations
7f8c3b44f8eb dma-fence: basic lockdep annotations
b85d9997eaca dma-buf: minor doc touch-ups
199c3f9df986 mm: Track mmu notifiers in fs_reclaim_acquire/release
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17864/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev2)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (20 preceding siblings ...)
2020-06-04 9:08 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-05 13:59 ` Patchwork
2020-06-05 14:01 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
` (8 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-05 13:59 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev2)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
e78a321ad3b9 mm: Track mmu notifiers in fs_reclaim_acquire/release
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#12:
recursions we do have lockdep annotations since 23b68395c7c7
-:41: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep")'
#41:
With this we can also remove the lockdep priming added in 66204f1d2d1b
-:116: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#116: FILE: mm/page_alloc.c:4165:
+
+ }
-:130: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 2 errors, 1 warnings, 1 checks, 65 lines checked
7e972dd54c14 dma-buf: minor doc touch-ups
-:32: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
83bbee724172 dma-fence: basic lockdep annotations
-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests")'
#23:
commit e91498589746065e3ae95d9a00b068e525eec34f
-:97: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e966eaeeb623 ("locking/lockdep: Remove the cross-release locking checks")'
#97:
commit e966eaeeb623f09975ef362c2866fae6f86844f9
-:103: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#103:
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
-:313: ERROR:IN_ATOMIC: do not use in_atomic in drivers
#313: FILE: drivers/dma-buf/dma-fence.c:228:
+ if (in_atomic())
-:351: CHECK:LINE_SPACING: Please don't use multiple blank lines
#351: FILE: drivers/dma-buf/dma-fence.c:266:
+
+
-:400: CHECK:LINE_SPACING: Please use a blank line after function/struct/union/enum declarations
#400: FILE: include/linux/dma-fence.h:368:
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
-:406: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 3 errors, 2 warnings, 2 checks, 231 lines checked
24dfd7c2f31d dma-fence: prime lockdep annotations
-:31: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#31:
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
-:169: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 1 errors, 1 warnings, 0 checks, 82 lines checked
f0ab547cc6c5 drm/vkms: Annotate vblank timer
-:59: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 25 lines checked
5594f77b32e6 drm/vblank: Annotate with dma-fence signalling section
-:71: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 38 lines checked
46b7c2fd5ffd drm/atomic-helper: Add dma-fence annotations
-:119: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 76 lines checked
ac0b1f52d0fe drm/amdgpu: add dma-fence annotations to atomic commit path
-:52: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 24 lines checked
c0009c21b60f drm/scheduler: use dma-fence annotations in main thread
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 21 lines checked
0e0f7023514c drm/amdgpu: use dma-fence annotations in cs_submit()
-:65: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 29 lines checked
47de02c44f96 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
-:82: WARNING:ALLOC_ARRAY_ARGS: kmalloc_array uses number as first arg, sizeof is generally wrong
#82: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c:211:
+ fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
-:98: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 24 lines checked
4c8281f15785 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
-:70: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#70: FILE: drivers/gpu/drm/amd/display/dc/core/dc.c:1420:
+ * atomic_commit_tail. */
-:76: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 27 lines checked
2a75b848a16e drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
-:39: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#39: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:6917:
+#if 0
-:55: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 23 lines checked
428021cf6ac1 drm/scheduler: use dma-fence annotations in tdr work
-:28: WARNING:TYPO_SPELLING: 'seperate' may be misspelled - perhaps 'separate'?
#28:
Hence split out as a seperate patch.
-:114: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 20 lines checked
edb53cc8cdcc drm/amdgpu: use dma-fence annotations for gpu reset code
ca71ae15991e Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
-:145: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 16 lines checked
92582d6b872d drm/amdgpu: gpu recovery does full modesets
-:186: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
edc7511baed1 drm/i915: Annotate dma_fence_work
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 15 lines checked
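The ALLOC_ARRAY_ARGS warning reported above for amdgpu_ids.c is about argument order: kmalloc_array() takes the element count first and the element size second, while the flagged call passes sizeof(void *) first. The product is the same either way, so this is a readability and convention issue rather than a size bug. Below is a minimal userspace sketch of the conventional order. It assumes a hypothetical helper, alloc_array_sketch(), that only stands in for kmalloc_array(): it uses malloc(), takes no GFP flags, and is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for kmalloc_array(n, size, flags).
 * Element count comes first, element size second. Returns NULL if
 * n * size would overflow size_t, otherwise whatever malloc() returns.
 */
static void *alloc_array_sketch(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* multiplication would overflow */

	return malloc(n * size);
}

/*
 * Conventional call shape: count first, sizeof second. The num_ids
 * parameter is illustrative and mirrors id_mgr->num_ids in the log.
 */
static void **alloc_fence_slots_sketch(size_t num_ids)
{
	return alloc_array_sketch(num_ids, sizeof(void *));
}
```

With this order the call reads as "num_ids elements of sizeof(void *) bytes each", which is the shape checkpatch expects for kmalloc_array().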
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for dma-fence lockdep annotations, round 2 (rev2)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (21 preceding siblings ...)
2020-06-05 13:59 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev2) Patchwork
@ 2020-06-05 14:01 ` Patchwork
2020-06-05 14:15 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
` (7 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-05 14:01 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev2)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: expected unsigned int [addressable] [usertype] ulClockParams
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: got restricted __le32 [usertype]
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: warning: incorrect type in assignment (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1028:50: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1029:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1037:47: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:184:44: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:283:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:320:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:323:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:326:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:329:18: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:330:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:338:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:340:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:342:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:346:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:348:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:353:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:367:43: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:369:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:374:67: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:375:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:378:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:389:80: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:395:57: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:402:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:403:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:406:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:414:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:423:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:424:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:473:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:476:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:477:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:484:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:52:28: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:531:35: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:53:29: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:533:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:54:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:55:27: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:56:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:57:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:577:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:581:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:58:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:583:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:586:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:590:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:59:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:598:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:600:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:617:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:621:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:623:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:630:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:632:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:644:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:648:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:650:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:657:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:659:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:662:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:664:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:676:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:688:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:691:47: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:697:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:796:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:797:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:800:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:801:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:804:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:805:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:812:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:813:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:816:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:817:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:820:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:821:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:828:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:829:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:832:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:833:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:836:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:837:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:844:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:845:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:848:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:849:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:852:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:853:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:916:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:918:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:920:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:934:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:936:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:938:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:956:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:958:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:960:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:296:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:330:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:360:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:362:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:369:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:383:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:406:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:44:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:447:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:451:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:454:61: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:455:64: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:457:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:483:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:486:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:64:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:85:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:86:24: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:98:39: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:222:29: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:227:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:233:43: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:236:44: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:239:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:464:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:465:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:466:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:468:24: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: expected unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: got unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: expected struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: got struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: expected struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: got struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1256:25: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1257:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1313:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: expected restricted __poll_t ( *poll )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: got unsigned int ( * )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: warning: incorrect type in initializer (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1618:65: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1625:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1627:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1628:56: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1630:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1631:45: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1632:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1633:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1634:57: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1636:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1637:53: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1639:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1641:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1642:46: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1646:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1648:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1650:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1661:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:260:16: error: incompatible types in comparison expression (different type sizes)
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:507:39: warning: cast removes address space '<asn:2>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:527:31: warning: cast removes address space '<asn:2>
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BAT: failure for dma-fence lockdep annotations, round 2 (rev2)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (22 preceding siblings ...)
2020-06-05 14:01 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-05 14:15 ` Patchwork
2020-06-10 20:20 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev3) Patchwork
` (6 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-05 14:15 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev2)
URL : https://patchwork.freedesktop.org/series/77986/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_8590 -> Patchwork_17886
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_17886 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_17886, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_17886:
### IGT changes ###
#### Possible regressions ####
* igt@gem_close_race@basic-process:
- fi-ivb-3770: [PASS][1] -> [DMESG-WARN][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-ivb-3770/igt@gem_close_race@basic-process.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-ivb-3770/igt@gem_close_race@basic-process.html
- fi-byt-j1900: [PASS][3] -> [DMESG-WARN][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-byt-j1900/igt@gem_close_race@basic-process.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-byt-j1900/igt@gem_close_race@basic-process.html
- fi-hsw-4770: [PASS][5] -> [DMESG-WARN][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-hsw-4770/igt@gem_close_race@basic-process.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-hsw-4770/igt@gem_close_race@basic-process.html
- fi-byt-n2820: [PASS][7] -> [DMESG-WARN][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-byt-n2820/igt@gem_close_race@basic-process.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-byt-n2820/igt@gem_close_race@basic-process.html
* igt@kms_busy@basic@flip:
- fi-snb-2600: [PASS][9] -> [DMESG-WARN][10]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-snb-2600/igt@kms_busy@basic@flip.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-snb-2600/igt@kms_busy@basic@flip.html
- fi-snb-2520m: [PASS][11] -> [DMESG-WARN][12]
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-snb-2520m/igt@kms_busy@basic@flip.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-snb-2520m/igt@kms_busy@basic@flip.html
* igt@kms_frontbuffer_tracking@basic:
- fi-ilk-650: [PASS][13] -> [DMESG-WARN][14]
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
* igt@runner@aborted:
- fi-cfl-8700k: NOTRUN -> [FAIL][15]
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-8700k/igt@runner@aborted.html
- fi-tgl-y: NOTRUN -> [FAIL][16]
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-y/igt@runner@aborted.html
- fi-cfl-8109u: NOTRUN -> [FAIL][17]
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-8109u/igt@runner@aborted.html
- fi-icl-u2: NOTRUN -> [FAIL][18]
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-u2/igt@runner@aborted.html
- fi-snb-2520m: NOTRUN -> [FAIL][19]
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-snb-2520m/igt@runner@aborted.html
- fi-bdw-5557u: NOTRUN -> [FAIL][20]
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-bdw-5557u/igt@runner@aborted.html
- fi-byt-n2820: NOTRUN -> [FAIL][21]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-byt-n2820/igt@runner@aborted.html
- fi-icl-guc: NOTRUN -> [FAIL][22]
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-guc/igt@runner@aborted.html
- fi-hsw-4770: NOTRUN -> [FAIL][23]
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-hsw-4770/igt@runner@aborted.html
- fi-snb-2600: NOTRUN -> [FAIL][24]
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-snb-2600/igt@runner@aborted.html
- fi-whl-u: NOTRUN -> [FAIL][25]
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-whl-u/igt@runner@aborted.html
- fi-cml-u2: NOTRUN -> [FAIL][26]
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cml-u2/igt@runner@aborted.html
- fi-ivb-3770: NOTRUN -> [FAIL][27]
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-ivb-3770/igt@runner@aborted.html
- fi-bxt-dsi: NOTRUN -> [FAIL][28]
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-bxt-dsi/igt@runner@aborted.html
- fi-byt-j1900: NOTRUN -> [FAIL][29]
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-byt-j1900/igt@runner@aborted.html
- fi-cml-s: NOTRUN -> [FAIL][30]
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cml-s/igt@runner@aborted.html
- fi-cfl-guc: NOTRUN -> [FAIL][31]
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-guc/igt@runner@aborted.html
- fi-icl-y: NOTRUN -> [FAIL][32]
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-y/igt@runner@aborted.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* {igt@gem_busy@busy@all}:
- fi-kbl-x1275: [PASS][33] -> [DMESG-WARN][34]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-x1275/igt@gem_busy@busy@all.html
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-x1275/igt@gem_busy@busy@all.html
- fi-cfl-8700k: [PASS][35] -> [DMESG-WARN][36]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-cfl-8700k/igt@gem_busy@busy@all.html
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-8700k/igt@gem_busy@busy@all.html
- fi-tgl-y: [PASS][37] -> [DMESG-WARN][38]
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-tgl-y/igt@gem_busy@busy@all.html
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-y/igt@gem_busy@busy@all.html
- fi-skl-6600u: [PASS][39] -> [DMESG-WARN][40]
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-skl-6600u/igt@gem_busy@busy@all.html
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-skl-6600u/igt@gem_busy@busy@all.html
- fi-cfl-8109u: [PASS][41] -> [DMESG-WARN][42]
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-cfl-8109u/igt@gem_busy@busy@all.html
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-8109u/igt@gem_busy@busy@all.html
- fi-icl-u2: [PASS][43] -> [DMESG-WARN][44]
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-icl-u2/igt@gem_busy@busy@all.html
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-u2/igt@gem_busy@busy@all.html
- {fi-tgl-dsi}: [PASS][45] -> [DMESG-WARN][46]
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-tgl-dsi/igt@gem_busy@busy@all.html
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-dsi/igt@gem_busy@busy@all.html
- fi-glk-dsi: [PASS][47] -> [DMESG-WARN][48]
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-glk-dsi/igt@gem_busy@busy@all.html
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-glk-dsi/igt@gem_busy@busy@all.html
- fi-kbl-8809g: [PASS][49] -> [DMESG-WARN][50]
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-8809g/igt@gem_busy@busy@all.html
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-8809g/igt@gem_busy@busy@all.html
- fi-skl-lmem: [PASS][51] -> [DMESG-WARN][52]
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-skl-lmem/igt@gem_busy@busy@all.html
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-skl-lmem/igt@gem_busy@busy@all.html
- fi-kbl-r: [PASS][53] -> [DMESG-WARN][54]
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-r/igt@gem_busy@busy@all.html
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-r/igt@gem_busy@busy@all.html
- fi-bdw-5557u: [PASS][55] -> [DMESG-WARN][56]
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-bdw-5557u/igt@gem_busy@busy@all.html
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-bdw-5557u/igt@gem_busy@busy@all.html
- fi-icl-guc: [PASS][57] -> [DMESG-WARN][58]
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-icl-guc/igt@gem_busy@busy@all.html
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-guc/igt@gem_busy@busy@all.html
- fi-kbl-soraka: [PASS][59] -> [DMESG-WARN][60]
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-soraka/igt@gem_busy@busy@all.html
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-soraka/igt@gem_busy@busy@all.html
- {fi-ehl-1}: [PASS][61] -> [DMESG-WARN][62]
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-ehl-1/igt@gem_busy@busy@all.html
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-ehl-1/igt@gem_busy@busy@all.html
- fi-kbl-7500u: [PASS][63] -> [DMESG-WARN][64]
[63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-7500u/igt@gem_busy@busy@all.html
[64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-7500u/igt@gem_busy@busy@all.html
- fi-kbl-guc: [PASS][65] -> [DMESG-WARN][66]
[65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-guc/igt@gem_busy@busy@all.html
[66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-guc/igt@gem_busy@busy@all.html
- fi-whl-u: [PASS][67] -> [DMESG-WARN][68]
[67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-whl-u/igt@gem_busy@busy@all.html
[68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-whl-u/igt@gem_busy@busy@all.html
- fi-cml-u2: [PASS][69] -> [DMESG-WARN][70]
[69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-cml-u2/igt@gem_busy@busy@all.html
[70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cml-u2/igt@gem_busy@busy@all.html
- {fi-kbl-7560u}: NOTRUN -> [DMESG-WARN][71]
[71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-7560u/igt@gem_busy@busy@all.html
- fi-bxt-dsi: [PASS][72] -> [DMESG-WARN][73]
[72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-bxt-dsi/igt@gem_busy@busy@all.html
[73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-bxt-dsi/igt@gem_busy@busy@all.html
- {fi-tgl-u}: [PASS][74] -> [DMESG-WARN][75]
[74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-tgl-u/igt@gem_busy@busy@all.html
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-u/igt@gem_busy@busy@all.html
- fi-cml-s: [PASS][76] -> [DMESG-WARN][77]
[76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-cml-s/igt@gem_busy@busy@all.html
[77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cml-s/igt@gem_busy@busy@all.html
- fi-cfl-guc: [PASS][78] -> [DMESG-WARN][79]
[78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-cfl-guc/igt@gem_busy@busy@all.html
[79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-cfl-guc/igt@gem_busy@busy@all.html
- fi-icl-y: [PASS][80] -> [DMESG-WARN][81]
[80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-icl-y/igt@gem_busy@busy@all.html
[81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-icl-y/igt@gem_busy@busy@all.html
- fi-skl-6700k2: [PASS][82] -> [DMESG-WARN][83]
[82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-skl-6700k2/igt@gem_busy@busy@all.html
[83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-skl-6700k2/igt@gem_busy@busy@all.html
* igt@runner@aborted:
- {fi-tgl-dsi}: NOTRUN -> [FAIL][84]
[84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-dsi/igt@runner@aborted.html
- {fi-ehl-1}: NOTRUN -> [FAIL][85]
[85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-ehl-1/igt@runner@aborted.html
- {fi-tgl-u}: NOTRUN -> [FAIL][86]
[86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-tgl-u/igt@runner@aborted.html
Known issues
------------
Here are the changes found in Patchwork_17886 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@debugfs_test@read_all_entries:
- fi-kbl-soraka: [PASS][87] -> [DMESG-WARN][88] ([i915#1982])
[87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
[88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
#### Warnings ####
* igt@debugfs_test@read_all_entries:
- fi-kbl-x1275: [DMESG-WARN][89] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][90] ([i915#62] / [i915#92])
[89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8590/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html
[90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
[i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
[i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
[i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95
Participating hosts (50 -> 44)
------------------------------
Additional (1): fi-kbl-7560u
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* Linux: CI_DRM_8590 -> Patchwork_17886
CI-20190529: 20190529
CI_DRM_8590: 91c6f0274b54c89679cd23f6fc65e9fe5922971f @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5695: 53e8c878a6fb5708e63c99403691e8960b86ea9c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_17886: edc7511baed1bde446c55372c108b49f5f7acf39 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
edc7511baed1 drm/i915: Annotate dma_fence_work
92582d6b872d drm/amdgpu: gpu recovery does full modesets
ca71ae15991e Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
edb53cc8cdcc drm/amdgpu: use dma-fence annotations for gpu reset code
428021cf6ac1 drm/scheduler: use dma-fence annotations in tdr work
2a75b848a16e drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
4c8281f15785 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
47de02c44f96 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
0e0f7023514c drm/amdgpu: use dma-fence annotations in cs_submit()
c0009c21b60f drm/scheduler: use dma-fence annotations in main thread
ac0b1f52d0fe drm/amdgpu: add dma-fence annotations to atomic commit path
46b7c2fd5ffd drm/atomic-helper: Add dma-fence annotations
5594f77b32e6 drm/vblank: Annotate with dma-fence signalling section
f0ab547cc6c5 drm/vkms: Annotate vblank timer
24dfd7c2f31d dma-fence: prime lockdep annotations
83bbee724172 dma-fence: basic lockdep annotations
7e972dd54c14 dma-buf: minor doc touch-ups
e78a321ad3b9 mm: Track mmu notifiers in fs_reclaim_acquire/release
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17886/index.html
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev3)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (23 preceding siblings ...)
2020-06-05 14:15 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-10 20:20 ` Patchwork
2020-06-10 20:21 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
` (5 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-10 20:20 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev3)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
b91d6e9b2219 mm: Track mmu notifiers in fs_reclaim_acquire/release
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#12:
recursions we do have lockdep annotations since 23b68395c7c7
-:41: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep")'
#41:
With this we can also remove the lockdep priming added in 66204f1d2d1b
-:124: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#124: FILE: mm/page_alloc.c:4167:
+
+ }
-:138: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 2 errors, 1 warnings, 1 checks, 67 lines checked
464bebc66202 dma-buf: minor doc touch-ups
-:33: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
4a356e005b80 dma-fence: basic lockdep annotations
-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests")'
#23:
commit e91498589746065e3ae95d9a00b068e525eec34f
-:97: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e966eaeeb623 ("locking/lockdep: Remove the cross-release locking checks")'
#97:
commit e966eaeeb623f09975ef362c2866fae6f86844f9
-:103: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#103:
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
-:314: ERROR:IN_ATOMIC: do not use in_atomic in drivers
#314: FILE: drivers/dma-buf/dma-fence.c:228:
+ if (in_atomic())
-:352: CHECK:LINE_SPACING: Please don't use multiple blank lines
#352: FILE: drivers/dma-buf/dma-fence.c:266:
+
+
-:401: CHECK:LINE_SPACING: Please use a blank line after function/struct/union/enum declarations
#401: FILE: include/linux/dma-fence.h:368:
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
-:407: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 3 errors, 2 warnings, 2 checks, 231 lines checked
e85757129eef dma-fence: prime lockdep annotations
-:31: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#31:
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
-:169: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 1 errors, 1 warnings, 0 checks, 82 lines checked
abea167ccc2c drm/vkms: Annotate vblank timer
-:59: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 25 lines checked
74cf66d0c736 drm/vblank: Annotate with dma-fence signalling section
-:71: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 38 lines checked
49852bebf34d drm/atomic-helper: Add dma-fence annotations
-:119: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 76 lines checked
812e8d183ea1 drm/amdgpu: add dma-fence annotations to atomic commit path
-:52: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 24 lines checked
e86ec566effc drm/scheduler: use dma-fence annotations in main thread
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 21 lines checked
d05e15f8ad27 drm/amdgpu: use dma-fence annotations in cs_submit()
-:65: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 29 lines checked
805637835bf6 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
-:82: WARNING:ALLOC_ARRAY_ARGS: kmalloc_array uses number as first arg, sizeof is generally wrong
#82: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c:211:
+ fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
-:98: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 24 lines checked
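Note on the ALLOC_ARRAY_ARGS warning above: kmalloc_array() takes the element count first and the element size second, so checkpatch flags a call with sizeof() in the first position. Swapping the two does not change the product, so the warning concerns convention and readability rather than the allocated size. The sketch below illustrates the conventional order in plain userspace C; alloc_array() and alloc_ptr_table() are hypothetical stand-ins written for this note, not kernel APIs, and malloc() stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for an array allocator that takes
 * (element count, element size), the same argument order as calloc().
 * Returns NULL if n * size would overflow size_t.
 */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* multiplication would overflow */
	return malloc(n * size);
}

/* Conventional call: element count first, sizeof() second. */
static void **alloc_ptr_table(size_t num_ids)
{
	return alloc_array(num_ids, sizeof(void *));
}
```

Under that assumption, the flagged line would conventionally read kmalloc_array(id_mgr->num_ids, sizeof(void *), GFP_ATOMIC).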
18403b85aff4 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
-:70: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#70: FILE: drivers/gpu/drm/amd/display/dc/core/dc.c:1436:
+ * atomic_commit_tail. */
-:76: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 27 lines checked
2dbc37297b21 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
-:39: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#39: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:6914:
+#if 0
-:55: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 23 lines checked
d1361c491f79 drm/scheduler: use dma-fence annotations in tdr work
-:28: WARNING:TYPO_SPELLING: 'seperate' may be misspelled - perhaps 'separate'?
#28:
Hence split out as a seperate patch.
-:114: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 20 lines checked
718c082d14cb drm/amdgpu: use dma-fence annotations for gpu reset code
f08fc8bb8383 Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
-:145: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 16 lines checked
c40ab1c8276b drm/amdgpu: gpu recovery does full modesets
-:186: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
c8130cec52d5 drm/i915: Annotate dma_fence_work
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 15 lines checked
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for dma-fence lockdep annotations, round 2 (rev3)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (24 preceding siblings ...)
2020-06-10 20:20 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev3) Patchwork
@ 2020-06-10 20:21 ` Patchwork
2020-06-10 20:35 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
` (4 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-10 20:21 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev3)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: expected unsigned int [addressable] [usertype] ulClockParams
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: got restricted __le32 [usertype]
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: warning: incorrect type in assignment (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1028:50: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1029:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1037:47: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:184:44: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:283:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:320:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:323:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:326:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:329:18: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:330:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:338:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:340:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:342:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:346:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:348:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:353:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:367:43: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:369:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:374:67: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:375:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:378:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:389:80: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:395:57: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:402:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:403:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:406:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:414:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:423:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:424:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:473:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:476:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:477:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:484:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:52:28: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:531:35: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:53:29: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:533:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:54:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:55:27: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:56:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:57:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:577:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:581:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:58:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:583:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:586:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:590:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:59:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:598:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:600:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:617:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:621:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:623:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:630:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:632:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:644:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:648:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:650:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:657:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:659:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:662:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:664:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:676:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:688:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:691:47: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:697:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:796:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:797:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:800:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:801:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:804:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:805:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:812:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:813:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:816:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:817:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:820:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:821:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:828:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:829:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:832:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:833:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:836:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:837:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:844:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:845:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:848:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:849:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:852:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:853:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:916:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:918:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:920:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:934:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:936:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:938:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:956:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:958:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:960:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:296:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:330:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:360:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:362:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:369:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:383:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:406:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:44:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:447:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:451:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:454:61: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:455:64: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:457:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:483:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:486:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:64:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:85:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:86:24: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:98:39: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:222:29: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:227:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:233:43: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:236:44: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:239:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:464:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:465:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:466:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:468:24: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: expected unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: got unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: expected struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: got struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: expected struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: got struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1256:25: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1257:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1313:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: expected restricted __poll_t ( *poll )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: got unsigned int ( * )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: warning: incorrect type in initializer (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1618:65: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1625:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1627:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1628:56: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1630:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1631:45: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1632:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1633:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1634:57: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1636:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1637:53: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1639:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1641:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1642:46: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1646:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1648:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1650:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1661:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:260:16: error: incompatible types in comparison expression (different type sizes)
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:507:39: warning: cast removes address space '<asn:2>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:527:31: warning: cast removes address space '<asn:2>
* [Intel-gfx] ✗ Fi.CI.BAT: failure for dma-fence lockdep annotations, round 2 (rev3)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (25 preceding siblings ...)
2020-06-10 20:21 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-10 20:35 ` Patchwork
2020-06-12 7:18 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev6) Patchwork
` (3 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-10 20:35 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev3)
URL : https://patchwork.freedesktop.org/series/77986/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_8611 -> Patchwork_17923
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_17923 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_17923, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_17923:
### IGT changes ###
#### Possible regressions ####
* igt@gem_busy@busy@all:
- fi-kbl-x1275: [PASS][1] -> [DMESG-WARN][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-x1275/igt@gem_busy@busy@all.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-x1275/igt@gem_busy@busy@all.html
- fi-cfl-8700k: [PASS][3] -> [DMESG-WARN][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-cfl-8700k/igt@gem_busy@busy@all.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-8700k/igt@gem_busy@busy@all.html
- fi-skl-6600u: [PASS][5] -> [DMESG-WARN][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-skl-6600u/igt@gem_busy@busy@all.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-skl-6600u/igt@gem_busy@busy@all.html
- fi-cfl-8109u: [PASS][7] -> [DMESG-WARN][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-cfl-8109u/igt@gem_busy@busy@all.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-8109u/igt@gem_busy@busy@all.html
- fi-icl-u2: [PASS][9] -> [DMESG-WARN][10]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-icl-u2/igt@gem_busy@busy@all.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-u2/igt@gem_busy@busy@all.html
- fi-glk-dsi: [PASS][11] -> [DMESG-WARN][12]
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-glk-dsi/igt@gem_busy@busy@all.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-glk-dsi/igt@gem_busy@busy@all.html
- fi-skl-lmem: [PASS][13] -> [DMESG-WARN][14]
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-skl-lmem/igt@gem_busy@busy@all.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-skl-lmem/igt@gem_busy@busy@all.html
- fi-kbl-r: [PASS][15] -> [DMESG-WARN][16]
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-r/igt@gem_busy@busy@all.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-r/igt@gem_busy@busy@all.html
- fi-bdw-5557u: [PASS][17] -> [DMESG-WARN][18]
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-bdw-5557u/igt@gem_busy@busy@all.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-bdw-5557u/igt@gem_busy@busy@all.html
- fi-icl-guc: [PASS][19] -> [DMESG-WARN][20]
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-icl-guc/igt@gem_busy@busy@all.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-guc/igt@gem_busy@busy@all.html
- fi-kbl-soraka: [PASS][21] -> [DMESG-WARN][22]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-soraka/igt@gem_busy@busy@all.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-soraka/igt@gem_busy@busy@all.html
- fi-kbl-7500u: [PASS][23] -> [DMESG-WARN][24]
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-7500u/igt@gem_busy@busy@all.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-7500u/igt@gem_busy@busy@all.html
- fi-kbl-guc: [PASS][25] -> [DMESG-WARN][26]
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-guc/igt@gem_busy@busy@all.html
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-guc/igt@gem_busy@busy@all.html
- fi-whl-u: [PASS][27] -> [DMESG-WARN][28]
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-whl-u/igt@gem_busy@busy@all.html
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-whl-u/igt@gem_busy@busy@all.html
- fi-cml-u2: [PASS][29] -> [DMESG-WARN][30]
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-cml-u2/igt@gem_busy@busy@all.html
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cml-u2/igt@gem_busy@busy@all.html
- fi-bxt-dsi: [PASS][31] -> [DMESG-WARN][32]
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-bxt-dsi/igt@gem_busy@busy@all.html
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-bxt-dsi/igt@gem_busy@busy@all.html
- fi-cml-s: [PASS][33] -> [DMESG-WARN][34]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-cml-s/igt@gem_busy@busy@all.html
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cml-s/igt@gem_busy@busy@all.html
- fi-cfl-guc: [PASS][35] -> [DMESG-WARN][36]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-cfl-guc/igt@gem_busy@busy@all.html
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-guc/igt@gem_busy@busy@all.html
- fi-icl-y: [PASS][37] -> [DMESG-WARN][38]
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-icl-y/igt@gem_busy@busy@all.html
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-y/igt@gem_busy@busy@all.html
- fi-skl-guc: [PASS][39] -> [DMESG-WARN][40]
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-skl-guc/igt@gem_busy@busy@all.html
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-skl-guc/igt@gem_busy@busy@all.html
- fi-skl-6700k2: [PASS][41] -> [DMESG-WARN][42]
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-skl-6700k2/igt@gem_busy@busy@all.html
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-skl-6700k2/igt@gem_busy@busy@all.html
- fi-tgl-u2: NOTRUN -> [DMESG-WARN][43]
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-tgl-u2/igt@gem_busy@busy@all.html
* igt@gem_close_race@basic-process:
- fi-ivb-3770: [PASS][44] -> [DMESG-WARN][45]
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-ivb-3770/igt@gem_close_race@basic-process.html
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-ivb-3770/igt@gem_close_race@basic-process.html
- fi-byt-j1900: [PASS][46] -> [DMESG-WARN][47]
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-byt-j1900/igt@gem_close_race@basic-process.html
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-byt-j1900/igt@gem_close_race@basic-process.html
- fi-hsw-4770: [PASS][48] -> [DMESG-WARN][49]
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-hsw-4770/igt@gem_close_race@basic-process.html
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-hsw-4770/igt@gem_close_race@basic-process.html
- fi-byt-n2820: [PASS][50] -> [DMESG-WARN][51]
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-byt-n2820/igt@gem_close_race@basic-process.html
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-byt-n2820/igt@gem_close_race@basic-process.html
* igt@kms_busy@basic@flip:
- fi-snb-2600: [PASS][52] -> [DMESG-WARN][53]
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-snb-2600/igt@kms_busy@basic@flip.html
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-snb-2600/igt@kms_busy@basic@flip.html
- fi-snb-2520m: [PASS][54] -> [DMESG-WARN][55]
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-snb-2520m/igt@kms_busy@basic@flip.html
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-snb-2520m/igt@kms_busy@basic@flip.html
* igt@kms_frontbuffer_tracking@basic:
- fi-ilk-650: [PASS][56] -> [DMESG-WARN][57]
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
* igt@runner@aborted:
- fi-cfl-8700k: NOTRUN -> [FAIL][58]
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-8700k/igt@runner@aborted.html
- fi-cfl-8109u: NOTRUN -> [FAIL][59]
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-8109u/igt@runner@aborted.html
- fi-icl-u2: NOTRUN -> [FAIL][60]
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-u2/igt@runner@aborted.html
- fi-snb-2520m: NOTRUN -> [FAIL][61]
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-snb-2520m/igt@runner@aborted.html
- fi-bdw-5557u: NOTRUN -> [FAIL][62]
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-bdw-5557u/igt@runner@aborted.html
- fi-byt-n2820: NOTRUN -> [FAIL][63]
[63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-byt-n2820/igt@runner@aborted.html
- fi-icl-guc: NOTRUN -> [FAIL][64]
[64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-guc/igt@runner@aborted.html
- fi-hsw-4770: NOTRUN -> [FAIL][65]
[65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-hsw-4770/igt@runner@aborted.html
- fi-snb-2600: NOTRUN -> [FAIL][66]
[66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-snb-2600/igt@runner@aborted.html
- fi-whl-u: NOTRUN -> [FAIL][67]
[67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-whl-u/igt@runner@aborted.html
- fi-cml-u2: NOTRUN -> [FAIL][68]
[68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cml-u2/igt@runner@aborted.html
- fi-ivb-3770: NOTRUN -> [FAIL][69]
[69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-ivb-3770/igt@runner@aborted.html
- fi-bxt-dsi: NOTRUN -> [FAIL][70]
[70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-bxt-dsi/igt@runner@aborted.html
- fi-byt-j1900: NOTRUN -> [FAIL][71]
[71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-byt-j1900/igt@runner@aborted.html
- fi-cml-s: NOTRUN -> [FAIL][72]
[72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cml-s/igt@runner@aborted.html
- fi-cfl-guc: NOTRUN -> [FAIL][73]
[73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-cfl-guc/igt@runner@aborted.html
- fi-icl-y: NOTRUN -> [FAIL][74]
[74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-icl-y/igt@runner@aborted.html
- fi-tgl-u2: NOTRUN -> [FAIL][75]
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-tgl-u2/igt@runner@aborted.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* igt@gem_busy@busy@all:
- {fi-tgl-dsi}: [PASS][76] -> [DMESG-WARN][77]
[76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-tgl-dsi/igt@gem_busy@busy@all.html
[77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-tgl-dsi/igt@gem_busy@busy@all.html
- {fi-ehl-1}: [PASS][78] -> [DMESG-WARN][79]
[78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-ehl-1/igt@gem_busy@busy@all.html
[79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-ehl-1/igt@gem_busy@busy@all.html
* igt@runner@aborted:
- {fi-tgl-dsi}: NOTRUN -> [FAIL][80]
[80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-tgl-dsi/igt@runner@aborted.html
- {fi-ehl-1}: NOTRUN -> [FAIL][81]
[81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-ehl-1/igt@runner@aborted.html
Known issues
------------
Here are the changes found in Patchwork_17923 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@debugfs_test@read_all_entries:
- fi-kbl-soraka: [PASS][82] -> [DMESG-WARN][83] ([i915#1982])
[82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8611/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
[83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
Participating hosts (48 -> 42)
------------------------------
Additional (1): fi-tgl-u2
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* Linux: CI_DRM_8611 -> Patchwork_17923
CI-20190529: 20190529
CI_DRM_8611: b87354483fa40fef86da19ade9bfe9349f0cf6d5 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5702: d16ad07e7f2a028e14d61f570931c87fa5ce404c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_17923: c8130cec52d56bce2b4f09cd005c0f7d68806ac0 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
c8130cec52d5 drm/i915: Annotate dma_fence_work
c40ab1c8276b drm/amdgpu: gpu recovery does full modesets
f08fc8bb8383 Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
718c082d14cb drm/amdgpu: use dma-fence annotations for gpu reset code
d1361c491f79 drm/scheduler: use dma-fence annotations in tdr work
2dbc37297b21 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
18403b85aff4 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
805637835bf6 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
d05e15f8ad27 drm/amdgpu: use dma-fence annotations in cs_submit()
e86ec566effc drm/scheduler: use dma-fence annotations in main thread
812e8d183ea1 drm/amdgpu: add dma-fence annotations to atomic commit path
49852bebf34d drm/atomic-helper: Add dma-fence annotations
74cf66d0c736 drm/vblank: Annotate with dma-fence signalling section
abea167ccc2c drm/vkms: Annotate vblank timer
e85757129eef dma-fence: prime lockdep annotations
4a356e005b80 dma-fence: basic lockdep annotations
464bebc66202 dma-buf: minor doc touch-ups
b91d6e9b2219 mm: Track mmu notifiers in fs_reclaim_acquire/release
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17923/index.html
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev6)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (26 preceding siblings ...)
2020-06-10 20:35 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-12 7:18 ` Patchwork
2020-06-12 7:19 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
` (2 subsequent siblings)
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-12 7:18 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev6)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
59ff28b69eed mm: Track mmu notifiers in fs_reclaim_acquire/release
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#12:
recursions we do have lockdep annotations since 23b68395c7c7
-:41: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 66204f1d2d1b ("mm/mmu_notifiers: prime lockdep")'
#41:
With this we can also remove the lockdep priming added in 66204f1d2d1b
-:124: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#124: FILE: mm/page_alloc.c:4167:
+
+ }
-:138: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 2 errors, 1 warnings, 1 checks, 67 lines checked
ceede5e08eb8 dma-buf: minor doc touch-ups
-:54: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 28 lines checked
07c16f051d28 dma-fence: basic lockdep annotations
-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests")'
#23:
commit e91498589746065e3ae95d9a00b068e525eec34f
-:97: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit e966eaeeb623 ("locking/lockdep: Remove the cross-release locking checks")'
#97:
commit e966eaeeb623f09975ef362c2866fae6f86844f9
-:103: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#103:
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
-:302: ERROR:IN_ATOMIC: do not use in_atomic in drivers
#302: FILE: drivers/dma-buf/dma-fence.c:228:
+ if (in_atomic())
-:340: CHECK:LINE_SPACING: Please don't use multiple blank lines
#340: FILE: drivers/dma-buf/dma-fence.c:266:
+
+
-:389: CHECK:LINE_SPACING: Please use a blank line after function/struct/union/enum declarations
#389: FILE: include/linux/dma-fence.h:368:
+}
+static inline void dma_fence_end_signalling(bool cookie) {}
-:395: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 3 errors, 2 warnings, 2 checks, 217 lines checked
6442f8dad95b dma-fence: prime lockdep annotations
-:31: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 23b68395c7c7 ("mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end")'
#31:
commit 23b68395c7c78a764e8963fc15a7cfd318bf187f
-:180: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 1 errors, 1 warnings, 0 checks, 86 lines checked
b874c76322b8 drm/vkms: Annotate vblank timer
-:59: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 25 lines checked
9f0f8c8303fa drm/vblank: Annotate with dma-fence signalling section
-:71: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 38 lines checked
d85809aae908 drm/atomic-helper: Add dma-fence annotations
-:119: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 76 lines checked
b6778d197cf3 drm/amdgpu: add dma-fence annotations to atomic commit path
-:52: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 24 lines checked
c4ab594ebf4a drm/scheduler: use dma-fence annotations in main thread
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 21 lines checked
b4093fcdacd2 drm/amdgpu: use dma-fence annotations in cs_submit()
-:65: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 29 lines checked
d9ed9c09b946 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
-:82: WARNING:ALLOC_ARRAY_ARGS: kmalloc_array uses number as first arg, sizeof is generally wrong
#82: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c:211:
+ fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
-:98: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 24 lines checked
45b663c70065 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
-:70: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#70: FILE: drivers/gpu/drm/amd/display/dc/core/dc.c:1436:
+ * atomic_commit_tail. */
-:76: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 27 lines checked
7a2bb8a3d251 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
-:39: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#39: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:6914:
+#if 0
-:55: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 23 lines checked
36895e1c3363 drm/scheduler: use dma-fence annotations in tdr work
-:28: WARNING:TYPO_SPELLING: 'seperate' may be misspelled - perhaps 'separate'?
#28:
Hence split out as a seperate patch.
-:114: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 2 warnings, 0 checks, 20 lines checked
e8d515333826 drm/amdgpu: use dma-fence annotations for gpu reset code
823a78e8bd4d Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
-:145: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 16 lines checked
20094f452976 drm/amdgpu: gpu recovery does full modesets
-:186: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 14 lines checked
6403c98f95cb drm/i915: Annotate dma_fence_work
-:53: WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter@ffwll.ch>'
total: 0 errors, 1 warnings, 0 checks, 15 lines checked
* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for dma-fence lockdep annotations, round 2 (rev6)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (27 preceding siblings ...)
2020-06-12 7:18 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for dma-fence lockdep annotations, round 2 (rev6) Patchwork
@ 2020-06-12 7:19 ` Patchwork
2020-06-12 7:32 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-06-22 10:11 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for dma-fence lockdep annotations, round 2 (rev7) Patchwork
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-12 7:19 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev6)
URL : https://patchwork.freedesktop.org/series/77986/
State : warning
== Summary ==
$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: expected unsigned int [addressable] [usertype] ulClockParams
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: got restricted __le32 [usertype]
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1019:47: warning: incorrect type in assignment (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1028:50: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1029:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:1037:47: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:184:44: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:283:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:320:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:323:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:326:14: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:329:18: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:330:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:338:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:340:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:342:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:346:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:348:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:353:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:367:43: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:369:38: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:374:67: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:375:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:378:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:389:80: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:395:57: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:402:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:403:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:406:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:414:66: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:423:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:424:69: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:473:30: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:476:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:477:45: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:484:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:52:28: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:531:35: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:53:29: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:533:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:54:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:55:27: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:56:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:57:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:577:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:581:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:58:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:583:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:586:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:590:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:59:26: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:598:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:600:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:617:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:621:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:623:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:630:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:632:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:644:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:648:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:650:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:657:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:659:21: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:662:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:664:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:676:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:688:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:691:47: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:697:25: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:796:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:797:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:800:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:801:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:804:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:805:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:812:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:813:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:816:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:817:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:820:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:821:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:828:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:829:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:832:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:833:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:836:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:837:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:844:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:845:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:848:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:849:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:852:46: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:853:40: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:916:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:918:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:920:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:934:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:936:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:938:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:956:47: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:958:49: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:960:52: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:296:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:330:34: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:360:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:362:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:369:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:383:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:406:40: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:44:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:447:53: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:451:33: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:454:61: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:455:64: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:457:54: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:483:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:486:21: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:64:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:80:17: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:85:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:86:24: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c:98:39: warning: cast to restricted __le16
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:222:29: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:226:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:227:37: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:233:43: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:236:44: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:239:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:458:41: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:464:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:465:30: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:466:39: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:468:24: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: expected unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:140:26: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: got unsigned long long [usertype] *chunk_array_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:141:41: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: expected struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:160:27: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: got struct drm_amdgpu_cs_chunk [noderef] <asn:1> **chunk_ptr
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:161:49: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: expected struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: got void [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1618:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: expected void const [noderef] <asn:1> *from
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: got struct drm_amdgpu_fence *fences_user
+drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1619:36: warning: incorrect type in argument 2 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1256:25: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1257:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1313:17: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: expected restricted __poll_t ( *poll )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: got unsigned int ( * )( ... )
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:138:17: warning: incorrect type in initializer (different base types)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:257:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:259:29: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:346:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:400:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:457:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:511:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:568:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:622:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: expected void const volatile [noderef] <asn:1> *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: got unsigned int [usertype] *
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: cast removes address space '<asn:1>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:719:21: warning: too many warnings
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1618:65: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1625:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1627:50: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1628:56: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1630:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1631:45: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1632:51: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1633:55: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1634:57: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1636:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1637:53: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1639:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1641:25: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1642:46: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1646:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1648:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1650:33: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1661:73: warning: cast to restricted __le32
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:260:16: error: incompatible types in comparison expression (different type sizes)
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:507:39: warning: cast removes address space '<asn:2>' of expression
+drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:527:31: warning: cast removes address space '<asn:2>
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BAT: failure for dma-fence lockdep annotations, round 2 (rev6)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (28 preceding siblings ...)
2020-06-12 7:19 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-06-12 7:32 ` Patchwork
2020-06-22 10:11 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for dma-fence lockdep annotations, round 2 (rev7) Patchwork
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-12 7:32 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev6)
URL : https://patchwork.freedesktop.org/series/77986/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_8618 -> Patchwork_17934
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_17934 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_17934, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_17934:
### IGT changes ###
#### Possible regressions ####
* igt@gem_busy@busy@all:
- fi-kbl-x1275: [PASS][1] -> [DMESG-WARN][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-kbl-x1275/igt@gem_busy@busy@all.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-kbl-x1275/igt@gem_busy@busy@all.html
- fi-cfl-8700k: [PASS][3] -> [DMESG-WARN][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-cfl-8700k/igt@gem_busy@busy@all.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-8700k/igt@gem_busy@busy@all.html
- fi-skl-6600u: [PASS][5] -> [DMESG-WARN][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-skl-6600u/igt@gem_busy@busy@all.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-skl-6600u/igt@gem_busy@busy@all.html
- fi-cfl-8109u: [PASS][7] -> [DMESG-WARN][8]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-cfl-8109u/igt@gem_busy@busy@all.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-8109u/igt@gem_busy@busy@all.html
- fi-icl-u2: [PASS][9] -> [DMESG-WARN][10]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-icl-u2/igt@gem_busy@busy@all.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-u2/igt@gem_busy@busy@all.html
- fi-glk-dsi: [PASS][11] -> [DMESG-WARN][12]
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-glk-dsi/igt@gem_busy@busy@all.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-glk-dsi/igt@gem_busy@busy@all.html
- fi-skl-lmem: [PASS][13] -> [DMESG-WARN][14]
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-skl-lmem/igt@gem_busy@busy@all.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-skl-lmem/igt@gem_busy@busy@all.html
- fi-kbl-r: [PASS][15] -> [DMESG-WARN][16]
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-kbl-r/igt@gem_busy@busy@all.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-kbl-r/igt@gem_busy@busy@all.html
- fi-bdw-5557u: [PASS][17] -> [DMESG-WARN][18]
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-bdw-5557u/igt@gem_busy@busy@all.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bdw-5557u/igt@gem_busy@busy@all.html
- fi-icl-guc: [PASS][19] -> [DMESG-WARN][20]
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-icl-guc/igt@gem_busy@busy@all.html
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-guc/igt@gem_busy@busy@all.html
- fi-kbl-soraka: [PASS][21] -> [DMESG-WARN][22]
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-kbl-soraka/igt@gem_busy@busy@all.html
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-kbl-soraka/igt@gem_busy@busy@all.html
- fi-kbl-7500u: [PASS][23] -> [DMESG-WARN][24]
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-kbl-7500u/igt@gem_busy@busy@all.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-kbl-7500u/igt@gem_busy@busy@all.html
- fi-kbl-guc: [PASS][25] -> [DMESG-WARN][26]
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-kbl-guc/igt@gem_busy@busy@all.html
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-kbl-guc/igt@gem_busy@busy@all.html
- fi-whl-u: [PASS][27] -> [DMESG-WARN][28]
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-whl-u/igt@gem_busy@busy@all.html
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-whl-u/igt@gem_busy@busy@all.html
- fi-cml-u2: [PASS][29] -> [DMESG-WARN][30]
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-cml-u2/igt@gem_busy@busy@all.html
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cml-u2/igt@gem_busy@busy@all.html
- fi-bxt-dsi: [PASS][31] -> [DMESG-WARN][32]
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-bxt-dsi/igt@gem_busy@busy@all.html
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bxt-dsi/igt@gem_busy@busy@all.html
- fi-cml-s: [PASS][33] -> [DMESG-WARN][34]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-cml-s/igt@gem_busy@busy@all.html
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cml-s/igt@gem_busy@busy@all.html
- fi-cfl-guc: [PASS][35] -> [DMESG-WARN][36]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-cfl-guc/igt@gem_busy@busy@all.html
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-guc/igt@gem_busy@busy@all.html
- fi-icl-y: [PASS][37] -> [DMESG-WARN][38]
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-icl-y/igt@gem_busy@busy@all.html
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-y/igt@gem_busy@busy@all.html
- fi-skl-guc: [PASS][39] -> [DMESG-WARN][40]
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-skl-guc/igt@gem_busy@busy@all.html
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-skl-guc/igt@gem_busy@busy@all.html
- fi-skl-6700k2: [PASS][41] -> [DMESG-WARN][42]
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-skl-6700k2/igt@gem_busy@busy@all.html
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-skl-6700k2/igt@gem_busy@busy@all.html
- fi-tgl-u2: [PASS][43] -> [DMESG-WARN][44]
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-tgl-u2/igt@gem_busy@busy@all.html
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-tgl-u2/igt@gem_busy@busy@all.html
* igt@gem_close_race@basic-process:
- fi-ivb-3770: [PASS][45] -> [DMESG-WARN][46]
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-ivb-3770/igt@gem_close_race@basic-process.html
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-ivb-3770/igt@gem_close_race@basic-process.html
- fi-byt-j1900: [PASS][47] -> [DMESG-WARN][48]
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-byt-j1900/igt@gem_close_race@basic-process.html
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-byt-j1900/igt@gem_close_race@basic-process.html
- fi-hsw-4770: [PASS][49] -> [DMESG-WARN][50]
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-hsw-4770/igt@gem_close_race@basic-process.html
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-hsw-4770/igt@gem_close_race@basic-process.html
- fi-byt-n2820: [PASS][51] -> [DMESG-WARN][52]
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-byt-n2820/igt@gem_close_race@basic-process.html
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-byt-n2820/igt@gem_close_race@basic-process.html
* igt@gem_tiled_blits@basic:
- fi-bwr-2160: [PASS][53] -> [DMESG-WARN][54]
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-bwr-2160/igt@gem_tiled_blits@basic.html
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bwr-2160/igt@gem_tiled_blits@basic.html
* igt@kms_busy@basic@flip:
- fi-snb-2600: [PASS][55] -> [DMESG-WARN][56]
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-snb-2600/igt@kms_busy@basic@flip.html
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-snb-2600/igt@kms_busy@basic@flip.html
- fi-snb-2520m: [PASS][57] -> [DMESG-WARN][58]
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-snb-2520m/igt@kms_busy@basic@flip.html
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-snb-2520m/igt@kms_busy@basic@flip.html
* igt@kms_frontbuffer_tracking@basic:
- fi-ilk-650: [PASS][59] -> [DMESG-WARN][60]
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-ilk-650/igt@kms_frontbuffer_tracking@basic.html
* igt@runner@aborted:
- fi-cfl-8700k: NOTRUN -> [FAIL][61]
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-8700k/igt@runner@aborted.html
- fi-cfl-8109u: NOTRUN -> [FAIL][62]
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-8109u/igt@runner@aborted.html
- fi-icl-u2: NOTRUN -> [FAIL][63]
[63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-u2/igt@runner@aborted.html
- fi-snb-2520m: NOTRUN -> [FAIL][64]
[64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-snb-2520m/igt@runner@aborted.html
- fi-bdw-5557u: NOTRUN -> [FAIL][65]
[65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bdw-5557u/igt@runner@aborted.html
- fi-bwr-2160: NOTRUN -> [FAIL][66]
[66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bwr-2160/igt@runner@aborted.html
- fi-byt-n2820: NOTRUN -> [FAIL][67]
[67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-byt-n2820/igt@runner@aborted.html
- fi-icl-guc: NOTRUN -> [FAIL][68]
[68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-guc/igt@runner@aborted.html
- fi-hsw-4770: NOTRUN -> [FAIL][69]
[69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-hsw-4770/igt@runner@aborted.html
- fi-snb-2600: NOTRUN -> [FAIL][70]
[70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-snb-2600/igt@runner@aborted.html
- fi-whl-u: NOTRUN -> [FAIL][71]
[71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-whl-u/igt@runner@aborted.html
- fi-cml-u2: NOTRUN -> [FAIL][72]
[72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cml-u2/igt@runner@aborted.html
- fi-ivb-3770: NOTRUN -> [FAIL][73]
[73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-ivb-3770/igt@runner@aborted.html
- fi-bxt-dsi: NOTRUN -> [FAIL][74]
[74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-bxt-dsi/igt@runner@aborted.html
- fi-byt-j1900: NOTRUN -> [FAIL][75]
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-byt-j1900/igt@runner@aborted.html
- fi-cml-s: NOTRUN -> [FAIL][76]
[76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cml-s/igt@runner@aborted.html
- fi-cfl-guc: NOTRUN -> [FAIL][77]
[77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-cfl-guc/igt@runner@aborted.html
- fi-icl-y: NOTRUN -> [FAIL][78]
[78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-icl-y/igt@runner@aborted.html
- fi-tgl-u2: NOTRUN -> [FAIL][79]
[79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-tgl-u2/igt@runner@aborted.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* igt@gem_busy@busy@all:
- {fi-tgl-dsi}: [PASS][80] -> [DMESG-WARN][81]
[80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-tgl-dsi/igt@gem_busy@busy@all.html
[81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-tgl-dsi/igt@gem_busy@busy@all.html
- {fi-ehl-1}: [PASS][82] -> [DMESG-WARN][83]
[82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8618/fi-ehl-1/igt@gem_busy@busy@all.html
[83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-ehl-1/igt@gem_busy@busy@all.html
* igt@runner@aborted:
- {fi-tgl-dsi}: NOTRUN -> [FAIL][84]
[84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-tgl-dsi/igt@runner@aborted.html
- {fi-ehl-1}: NOTRUN -> [FAIL][85]
[85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/fi-ehl-1/igt@runner@aborted.html
Known issues
------------
Here are the changes found in Patchwork_17934 that come from known issues:
### IGT changes ###
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[i915#1569]: https://gitlab.freedesktop.org/drm/intel/issues/1569
[i915#192]: https://gitlab.freedesktop.org/drm/intel/issues/192
[i915#193]: https://gitlab.freedesktop.org/drm/intel/issues/193
[i915#194]: https://gitlab.freedesktop.org/drm/intel/issues/194
Participating hosts (50 -> 43)
------------------------------
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus
Build changes
-------------
* Linux: CI_DRM_8618 -> Patchwork_17934
CI-20190529: 20190529
CI_DRM_8618: 88841e30e7f8c60ff464be277e5b8fef49ebaea0 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5703: c33471b4aa0a0ae9dd42202048e7037a661e0574 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_17934: 6403c98f95cb651d38c8824b670e1172236e96a3 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
6403c98f95cb drm/i915: Annotate dma_fence_work
20094f452976 drm/amdgpu: gpu recovery does full modesets
823a78e8bd4d Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
e8d515333826 drm/amdgpu: use dma-fence annotations for gpu reset code
36895e1c3363 drm/scheduler: use dma-fence annotations in tdr work
7a2bb8a3d251 drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
45b663c70065 drm/amdgpu: DC also loves to allocate stuff where it shouldn't
d9ed9c09b946 drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
b4093fcdacd2 drm/amdgpu: use dma-fence annotations in cs_submit()
c4ab594ebf4a drm/scheduler: use dma-fence annotations in main thread
b6778d197cf3 drm/amdgpu: add dma-fence annotations to atomic commit path
d85809aae908 drm/atomic-helper: Add dma-fence annotations
9f0f8c8303fa drm/vblank: Annotate with dma-fence signalling section
b874c76322b8 drm/vkms: Annotate vblank timer
6442f8dad95b dma-fence: prime lockdep annotations
07c16f051d28 dma-fence: basic lockdep annotations
ceede5e08eb8 dma-buf: minor doc touch-ups
59ff28b69eed mm: Track mmu notifiers in fs_reclaim_acquire/release
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17934/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 106+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for dma-fence lockdep annotations, round 2 (rev7)
2020-06-04 8:12 [Intel-gfx] [PATCH 00/18] dma-fence lockdep annotations, round 2 Daniel Vetter
` (29 preceding siblings ...)
2020-06-12 7:32 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-06-22 10:11 ` Patchwork
30 siblings, 0 replies; 106+ messages in thread
From: Patchwork @ 2020-06-22 10:11 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
== Series Details ==
Series: dma-fence lockdep annotations, round 2 (rev7)
URL : https://patchwork.freedesktop.org/series/77986/
State : failure
== Summary ==
Applying: mm: Track mmu notifiers in fs_reclaim_acquire/release
error: sha1 information is lacking or useless (mm/page_alloc.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 mm: Track mmu notifiers in fs_reclaim_acquire/release
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
^ permalink raw reply [flat|nested] 106+ messages in thread