From: Mel Gorman <mgorman@suse.de>
To: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Glauber Costa <glommer@openvz.org>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Greg Thelen <gthelen@google.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Dave Chinner <dchinner@redhat.com>,
intel-gfx <intel-gfx@lists.freedesktop.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
Kent Overstreet <koverstreet@google.com>,
devel@driverdev.osuosl.org,
Dan Magenheimer <dan.magenheimer@oracle.com>
Subject: Re: [PATCH v4 17/31] drivers: convert shrinkers to new count/scan API
Date: Thu, 2 May 2013 10:31:50 +0100
Message-ID: <20130502093150.GI11497@suse.de>
In-Reply-To: <CAKMK7uEkZ8nYZgB4pGiqJx+PAt6xL10FN3R5nFRgCAHVvPW8iQ@mail.gmail.com>
On Wed, May 01, 2013 at 05:26:38PM +0200, Daniel Vetter wrote:
> On Tue, Apr 30, 2013 at 11:53 PM, Mel Gorman <mgorman@suse.de> wrote:
> > On Sat, Apr 27, 2013 at 03:19:13AM +0400, Glauber Costa wrote:
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> index 6be940e..2e44733 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -1729,15 +1731,20 @@ i915_gem_purge(struct drm_i915_private *dev_priv, long target)
> >>  	return __i915_gem_shrink(dev_priv, target, true);
> >>  }
> >>
> >> -static void
> >> +static long
> >>  i915_gem_shrink_all(struct drm_i915_private *dev_priv)
> >>  {
> >>  	struct drm_i915_gem_object *obj, *next;
> >> +	long freed = 0;
> >>
> >> -	i915_gem_evict_everything(dev_priv->dev);
> >> +	freed += i915_gem_evict_everything(dev_priv->dev);
> >>
> >> -	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
> >> +	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list) {
> >> +		if (obj->pages_pin_count == 0)
> >> +			freed += obj->base.size >> PAGE_SHIFT;
> >>  		i915_gem_object_put_pages(obj);
> >> +	}
> >> +	return freed;
> >>  }
> >>
> >
> > i915_gem_shrink_all is a sledgehammer! That i915_gem_evict_everything
> > looks like it switches to every GPU context, waits for everything to
> > complete and then retires it all. I don't know the details of what it's
> > doing but it sounds very heavy handed and is called from shrinker
> > context if it fails to shrink 128 objects. Those shrinker callbacks can
> > be called very frequently, even from kswapd.
>
> i915_gem_shrink_all is our escape hatch, we only use it as a
> last-ditch effort when all else fails. Imo there's no point in passing
> the number of freed objects around from it since it really never
> should get called (as long as we don't get called with more objects to
> shrink than our counter counted beforehand at least).

The shrinkers can be called quite frequently, so I'm concerned that you do
get called with "more objects to shrink than our counter counted beforehand"
if it's called from direct reclaim and kswapd at the same time. kswapd
can be a very frequent caller via

kswapd
  balance_pgdat
    shrink_slab
      i915_gem_inactive_shrink
        i915_gem_shrink_all

That path can be active just because there is a streaming reader of a video
file that is larger than physical memory. If there is enough additional
pressure then direct reclaimers and kswapd can both call the shrinker.
The mutex on its own is not enough to stop them both reading "500 objects"
and then both trying to free 500 objects each, with the second stalling
in i915_gem_shrink_all.
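
To make that interleaving concrete, here is a minimal userspace sketch. It is not kernel code and not the i915 implementation: `struct pool`, `count_objects`, `scan_objects` and `simulate_stale_count` are hypothetical names, and the two reclaimers are modelled sequentially rather than with real threads. It only illustrates how a lock that serialises the scans still lets the second caller work from a stale count and fall through to the expensive path.

```c
#include <assert.h>

/* Hypothetical stand-in for a driver's pool of freeable objects. */
struct pool {
	long nr_freeable;	/* objects the cheap paths can still free */
	long nr_fallbacks;	/* times the expensive fallback was reached */
};

/* Count pass: report how many objects look freeable right now. */
long count_objects(const struct pool *p)
{
	return p->nr_freeable;
}

/*
 * Scan pass: free up to nr_to_scan objects. If the cheap paths fall
 * short, record that the expensive fallback (shrink_all in the thread)
 * would have run.
 */
long scan_objects(struct pool *p, long nr_to_scan)
{
	long freed;

	assert(nr_to_scan >= 0);
	freed = nr_to_scan < p->nr_freeable ? nr_to_scan : p->nr_freeable;
	p->nr_freeable -= freed;
	if (freed < nr_to_scan)
		p->nr_fallbacks++;
	return freed;
}

/*
 * Two reclaimers (think kswapd and a direct reclaimer) both sample the
 * count before either scans. A lock would only serialise the two scans,
 * so the second scan still uses a stale count. Returns how many times
 * the fallback was reached.
 */
long simulate_stale_count(long initial)
{
	struct pool p = { .nr_freeable = initial, .nr_fallbacks = 0 };
	long seen_a = count_objects(&p);
	long seen_b = count_objects(&p);

	scan_objects(&p, seen_a);	/* first caller frees everything */
	scan_objects(&p, seen_b);	/* second caller finds nothing left */
	return p.nr_fallbacks;
}
```

Calling `simulate_stale_count(500)` in this model reaches the fallback once, on the second scan, even though the first caller already freed everything that was counted.
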
Unfortunately, the laptop I'm using does not have an i915 card to check
how easy this is to trigger, but I suspect filling memory with dd,
starting a video of some sort and writing to a USB stick may be enough
to trigger it.

> >> @@ -4472,3 +4470,36 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >>  	mutex_unlock(&dev->struct_mutex);
> >>  	return cnt;
> >>  }
> >> +static long
> >> +i915_gem_inactive_scan(struct shrinker *shrinker, struct shrink_control *sc)
> >> +{
> >> +	struct drm_i915_private *dev_priv =
> >> +		container_of(shrinker,
> >> +			     struct drm_i915_private,
> >> +			     mm.inactive_shrinker);
> >> +	struct drm_device *dev = dev_priv->dev;
> >> +	int nr_to_scan = sc->nr_to_scan;
> >> +	long freed;
> >> +	bool unlock = true;
> >> +
> >> +	if (!mutex_trylock(&dev->struct_mutex)) {
> >> +		if (!mutex_is_locked_by(&dev->struct_mutex, current))
> >> +			return 0;
> >> +
> >
> > return -1 if it's about preventing potential deadlocks?
> >

In Glauber's series, shrinkers are split into count and scan callbacks.
If a scan returns -1, it tells vmscan.c that the slab could not be shrunk
at this time, whether because of a deadlock risk or because the necessary
locks could not be acquired.
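
As a rough illustration of that contract, the sketch below models the count/scan split in plain userspace C. Everything in it is an assumption made for illustration: `struct toy_shrinker`, `struct toy_cache`, `SCAN_STOP` and `shrink_one` are hypothetical names, and this is neither the kernel's shrinker API nor the logic in vmscan.c. It only shows a caller that stops asking once a scan reports that it cannot make progress.

```c
#include <assert.h>
#include <stdbool.h>

#define SCAN_STOP (-1L)	/* hypothetical: "cannot shrink right now" */

/* Hypothetical cache guarded by a lock the scan may fail to take. */
struct toy_cache {
	long nr_objects;
	bool lock_busy;
};

/* The split interface: one callback counts, the other frees. */
struct toy_shrinker {
	long (*count)(struct toy_cache *c);
	long (*scan)(struct toy_cache *c, long nr_to_scan);
};

long toy_count(struct toy_cache *c)
{
	return c->nr_objects;
}

long toy_scan(struct toy_cache *c, long nr_to_scan)
{
	long freed;

	assert(nr_to_scan > 0);
	if (c->lock_busy)
		return SCAN_STOP;	/* no progress possible, say so */
	freed = nr_to_scan < c->nr_objects ? nr_to_scan : c->nr_objects;
	c->nr_objects -= freed;
	return freed;
}

/*
 * Caller side: count once, then scan in batches until the counted
 * objects have been requested or the scan callback asks us to stop.
 * Returns the total number of objects freed.
 */
long shrink_one(const struct toy_shrinker *s, struct toy_cache *c, long batch)
{
	long total = 0;
	long remaining = s->count(c);

	assert(batch > 0);
	while (remaining > 0) {
		long want = remaining < batch ? remaining : batch;
		long freed = s->scan(c, want);

		if (freed == SCAN_STOP)
			break;
		total += freed;
		remaining -= want;
	}
	return total;
}
```

The distinction the sketch tries to capture is that returning 0 says "nothing was freed" while the stop value says "do not keep retrying this shrinker for now".
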
> >> +		if (dev_priv->mm.shrinker_no_lock_stealing)
> >> +			return 0;
> >> +
> >
> > same?
>
> No idea. Aside, the aggressive shrinking with shrink_all and the lock
> stealing madness here are to paper over our current "one lock for
> everything" approach we have for i915 gem stuff. We've papered over
> the worst offenders through lock-dropping tricks while waiting, the
> lock stealing above plus aggressively calling shrink_all.
>
> Still it's pretty trivial to (spuriously) OOM if you compete a gpu
> workload with something else. Real fix is per-object locking plus some
> watermark limits on how many pages are locked down this way, but
> that's long term (and currently stalling for the wait/wound mutexes
> from Maarten Lankhorst to get in).
>

Hmm, that's unfortunate. Later in Glauber's series, he also hooks up
vmpressure for in-kernel notification, which is a better indication
of pressure than a slab shrinker. Conceivably that could be used to call
shrink_all if the system is about to go OOM. This would only be necessary
if it can be shown that the shrinker currently calls i915_gem_shrink_all
and stalls the world too easily.
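
One way to picture that suggestion is the userspace sketch below. It is purely illustrative and makes several assumptions: the `toy_pressure` levels, `struct toy_gpu_cache`, `toy_scan_cheap`, `toy_shrink_all` and `toy_on_pressure` are hypothetical names, and the in-kernel notification interface in the series may look quite different. The idea shown is only that the regular scan path stays cheap, while the heavy "free everything" path is reserved for a critical pressure event.

```c
#include <assert.h>

/* Hypothetical pressure levels delivered by a notification hook. */
enum toy_pressure {
	TOY_PRESSURE_LOW,
	TOY_PRESSURE_MEDIUM,
	TOY_PRESSURE_CRITICAL
};

struct toy_gpu_cache {
	long nr_cheap;	/* objects freeable without stalling */
	long nr_costly;	/* objects only a full eviction can free */
};

/* Regular shrinker scan: cheap objects only, never falls back. */
long toy_scan_cheap(struct toy_gpu_cache *c, long nr_to_scan)
{
	long freed;

	assert(nr_to_scan >= 0);
	freed = nr_to_scan < c->nr_cheap ? nr_to_scan : c->nr_cheap;
	c->nr_cheap -= freed;
	return freed;
}

/* Heavy path: stands in for the stall-the-world eviction. */
long toy_shrink_all(struct toy_gpu_cache *c)
{
	long freed = c->nr_cheap + c->nr_costly;

	c->nr_cheap = 0;
	c->nr_costly = 0;
	return freed;
}

/* Pressure callback: only a critical event triggers the heavy path. */
long toy_on_pressure(struct toy_gpu_cache *c, enum toy_pressure level)
{
	if (level != TOY_PRESSURE_CRITICAL)
		return 0;
	return toy_shrink_all(c);
}
```

In this model a scan that comes up short simply returns what it freed, and the costly objects stay untouched until a critical notification arrives.
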
> >
> >> +		unlock = false;
> >> +	}
> >> +
> >> +	freed = i915_gem_purge(dev_priv, nr_to_scan);
> >> +	if (freed < nr_to_scan)
> >> +		freed += __i915_gem_shrink(dev_priv, nr_to_scan,
> >> +					   false);
> >> +	if (freed < nr_to_scan)
> >> +		freed += i915_gem_shrink_all(dev_priv);
> >> +
> >
> > Do we *really* want to call i915_gem_shrink_all from the slab shrinker?
> > Are there any bug reports where i915 rendering jitters in low memory
> > situations while shrinkers might be active? Maybe it's really fast.
>
> It's terrible for interactivity in X, but we need it :( See above for
> how we plan to eventually fix this mess.
>
Ok, thanks.
--
Mel Gorman
SUSE Labs