public inbox for cgroups@vger.kernel.org
From: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
To: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Markus Blank-Burian
	<burian-iYtK5bfT9M8b1SvskN2V4Q@public.gmane.org>,
	Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>,
	David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Ying Han <yinghan-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Michel Lespinasse
	<walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Possible regression with cgroups in 3.11
Date: Thu, 7 Nov 2013 18:53:01 -0500
Message-ID: <20131107235301.GB1092@cmpxchg.org>
In-Reply-To: <5278B3F1.9040502-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

On Tue, Nov 05, 2013 at 05:01:37PM +0800, Li Zefan wrote:
> On 2013/11/4 21:43, Markus Blank-Burian wrote:
> >> synchronize_rcu() is a block operation and can keep us waiting for
> >> a long period, so instead it's possible that usage never goes down
> >> to 0 and we are in a dead loop.
> > 
> > Ok, I didn't think of that. Tracing shows that the function keeps
> > looping. The last lines repeat indefinitely.
> > 
> ...
> >      kworker/3:5-7605  [003] ....   987.475678: mem_cgroup_reparent_charges: usage: 1568768
> >      kworker/3:5-7605  [003] ....   987.478677: mem_cgroup_reparent_charges: usage: 1568768
> >      kworker/3:5-7605  [003] ....   987.481675: mem_cgroup_reparent_charges: usage: 1568768
> 
> So it's much more likely this is a memcg bug rather than a cgroup bug.
> I hope memcg guys could look into it, or you could do a git-bisect if
> you can reliably reproduce the bug.

I think there is a problem with ref counting and memcg.

The old scheme would delay the charge reparenting until all
references were gone for good, whereas the new scheme has only an RCU
grace period between disabling tryget and offlining the css.
Unfortunately, memory cgroups don't hold the rcu_read_lock() over both
the tryget and the res_counter charge that would make it visible to
offline_css(), which means that there is a possible race condition
between cgroup teardown and an ongoing charge:

#0: destroy                #1: charge

                           rcu_read_lock()
                           css_tryget()
                           rcu_read_unlock()
disable tryget()
call_rcu()
  offline_css()
    reparent_charges()
                           res_counter_charge()
                           css_put()
                             css_free()
                           pc->mem_cgroup = deadcg
                           add page to lru

If the res_counter is hierarchical, there is now a leaked charge from
the dead group in the parent counter with no corresponding page on the
LRU, which will lead to this endless loop when deleting the parent.
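
The interleaving above can be replayed in a small user-space model.
This is only a sketch under stated assumptions: struct counter, struct
group, charge(), reparent_charges() here, and simulate_race() are
hypothetical stand-ins written for illustration, not the kernel's
res_counter or memcg code, and the two contexts are replayed in order
on a single thread instead of racing for real.

```c
#include <assert.h>

#define PAGE_BYTES 4096L

/* Toy hierarchical counter: a charge is propagated to every ancestor. */
struct counter {
	long usage;
	struct counter *parent;
};

/* Toy group: its counter plus the bytes backed by pages on its LRU. */
struct group {
	struct counter res;
	long lru_bytes;
};

static void charge(struct counter *c, long bytes)
{
	for (; c; c = c->parent)
		c->usage += bytes;
}

/*
 * Move everything on the child's LRU to the parent. Only the child's
 * own counter is uncharged; the parent was already charged
 * hierarchically when the pages were first accounted.
 */
static void reparent_charges(struct group *child, struct group *parent)
{
	assert(child->lru_bytes <= child->res.usage);
	child->res.usage -= child->lru_bytes;
	parent->lru_bytes += child->lru_bytes;
	child->lru_bytes = 0;
}

/* Replay the diagram; return parent bytes charged but not on its LRU. */
static long simulate_race(void)
{
	struct group parent = { .res = { .usage = 0, .parent = 0 }, .lru_bytes = 0 };
	struct group child = { .res = { .usage = 0, .parent = &parent.res }, .lru_bytes = 0 };

	/* Before teardown: one page charged and on the child's LRU. */
	charge(&child.res, PAGE_BYTES);
	child.lru_bytes += PAGE_BYTES;

	/* #1: css_tryget() has succeeded; the charge is not visible yet. */

	/* #0: tryget disabled, grace period over, offline_css() runs. */
	reparent_charges(&child, &parent);

	/* #1: the charge lands after reparenting has finished. */
	charge(&child.res, PAGE_BYTES);
	child.lru_bytes += PAGE_BYTES;	/* page sits on the dead group's LRU */

	return parent.res.usage - parent.lru_bytes;
}
```

In this model simulate_race() returns 4096: the parent counter carries
one page worth of charge that nothing on the parent's LRU can account
for, which is the state a later drain of the parent cannot resolve.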

The race window can be seconds if the res_counter hits its limit and
page reclaim is entered between css_tryget() and the res_counter
charge succeeding.

I thought about calling reparent_charges() again from css_free() at
first to catch any raced charges.  But that won't work if the last
reference is actually put by the charger because then css_free() will
drop into the reparenting loop before the page is put on the LRU.
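
A rough user-space illustration of that ordering problem follows. It
is a sketch, not kernel code: struct toy_memcg, toy_drain(), and
toy_charge_then_put() are hypothetical names, and max_passes is an
artificial bound added only so the example terminates, on the
assumption that the real loop retries until usage reaches zero.

```c
#include <stdbool.h>

#define TOY_PAGE 4096L

struct toy_memcg {
	long usage;	/* bytes charged to the counter */
	long lru_bytes;	/* bytes backed by pages visible on the LRU */
};

/*
 * Model of a drain loop that retries until usage reaches zero. Each
 * pass can only uncharge what is visible on the LRU. Returns true if
 * usage reached zero within max_passes.
 */
static bool toy_drain(struct toy_memcg *cg, int max_passes)
{
	for (int pass = 0; pass < max_passes && cg->usage > 0; pass++) {
		cg->usage -= cg->lru_bytes;
		cg->lru_bytes = 0;
	}
	return cg->usage == 0;
}

/*
 * Charger holding the last reference. If the put (and with it the
 * drain from css_free()) happens before the page is on the LRU, the
 * drain never sees the page it is waiting for.
 */
static bool toy_charge_then_put(bool lru_before_put, int max_passes)
{
	struct toy_memcg cg = { .usage = 0, .lru_bytes = 0 };

	cg.usage += TOY_PAGE;			/* res_counter_charge() */
	if (lru_before_put)
		cg.lru_bytes += TOY_PAGE;	/* page already on the LRU */

	return toy_drain(&cg, max_passes);	/* css_put() -> css_free() */
}
```

With lru_before_put set to false the drain makes no progress however
large max_passes is, because the step that would make the page visible
only comes after the put returns; with true it finishes in one pass.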

The lifetime management in memory cgroups is a disaster and it's going
to require some thought to fix.  Even before the cgroups rewrite,
swapin accounting was prone to this race condition because a task from
a completely different cgroup can start charging a swap-in page
against the cgroup that owned the page on swapout, a cgroup that might
be exiting and had been found to have no tasks, no child groups, and
no outstanding references anymore.
