From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Jani Nikula <jani.nikula@linux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Daniel Vetter <daniel@ffwll.ch>, David Airlie <airlied@linux.ie>,
Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
dri-devel@lists.freedesktop.org
Subject: [Intel-gfx] drm/i915: __GFP_RETRY_MAYFAIL allocations in stable kernels
Date: Mon, 14 Jun 2021 21:45:37 +0900
Message-ID: <YMdPcWZi4x7vnCxI@google.com>
Hi,
We are observing some user-space crashes (SIGABRT, segfaults, etc.)
under moderate memory pressure (pretty far from severe pressure) which
have one thing in common: a restrictive GFP mask in setup_scratch_page().
For instance, in stable 4.19, drivers/gpu/drm/i915/i915_gem_gtt.c
(trimmed-down version):
	static int gen8_init_scratch(struct i915_address_space *vm)
	{
		setup_scratch_page(vm, __GFP_HIGHMEM);
		vm->scratch_pt = alloc_pt(vm);
		vm->scratch_pd = alloc_pd(vm);
		if (use_4lvl(vm)) {
			vm->scratch_pdp = alloc_pdp(vm);
		}
	}
gen8_init_scratch() puts rather inconsistent restrictions on mm.
Looking at it line by line:
setup_scratch_page() - uses a very restrictive gfp mask:
	__GFP_HIGHMEM | __GFP_ZERO | __GFP_RETRY_MAYFAIL
With no reclaim bits set, it doesn't try to reclaim anything and fails
almost immediately.
alloc_pt() - uses a more permissive gfp mask:
	GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN
alloc_pd() - likewise:
	GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN
alloc_pdp() - uses a very permissive gfp mask:
	GFP_KERNEL
So can all allocations in gen8_init_scratch() use
	GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN
?
E.g.
---
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a12430187108..e862680b9c93 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -792,7 +792,7 @@ alloc_pdp(struct i915_address_space *vm)
 	GEM_BUG_ON(!use_4lvl(vm));
-	pdp = kzalloc(sizeof(*pdp), GFP_KERNEL);
+	pdp = kzalloc(sizeof(*pdp), I915_GFP_ALLOW_FAIL);
 	if (!pdp)
 		return ERR_PTR(-ENOMEM);
@@ -1262,7 +1262,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 {
 	int ret;
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
 	if (ret)
 		return ret;
@@ -1972,7 +1972,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_hw_ppgtt *ppgtt)
 	u32 pde;
 	int ret;
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
 	if (ret)
 		return ret;
@@ -3078,7 +3078,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 		return -ENOMEM;
 	}
-	ret = setup_scratch_page(&ggtt->vm, GFP_DMA32);
+	ret = setup_scratch_page(&ggtt->vm, GFP_KERNEL | GFP_DMA32);
 	if (ret) {
 		DRM_ERROR("Scratch setup failed\n");
 		/* iounmap will also get called at remove, but meh */
---
It's quite similar on stable 5.4: setup_scratch_page() uses a restrictive
gfp mask again.
---
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f614646ed3f9..99d78b1052df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1378,7 +1378,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		return 0;
 	}
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
 	if (ret)
 		return ret;
@@ -1753,7 +1753,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 	struct i915_page_directory * const pd = ppgtt->base.pd;
 	int ret;
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm, GFP_KERNEL | __GFP_HIGHMEM);
 	if (ret)
 		return ret;
@@ -2860,7 +2860,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 		return -ENOMEM;
 	}
-	ret = setup_scratch_page(&ggtt->vm, GFP_DMA32);
+	ret = setup_scratch_page(&ggtt->vm, GFP_KERNEL | GFP_DMA32);
 	if (ret) {
 		DRM_ERROR("Scratch setup failed\n");
 		/* iounmap will also get called at remove, but meh */
---