linux-ext4.vger.kernel.org archive mirror
From: Nikolay Borisov <kernel@kyup.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>,
	linux-ext4@vger.kernel.org, Marian Marinov <mm@1h.com>
Subject: Re: Lockup in wait_transaction_locked under memory pressure
Date: Mon, 29 Jun 2015 13:21:25 +0300
Message-ID: <55911C25.9090700@kyup.com>
In-Reply-To: <20150629093826.GE28471@dhcp22.suse.cz>



On 06/29/2015 12:38 PM, Michal Hocko wrote:
> On Mon 29-06-15 12:23:16, Nikolay Borisov wrote:
>>
>>
>> On 06/29/2015 12:16 PM, Michal Hocko wrote:
>>> On Mon 29-06-15 12:07:54, Nikolay Borisov wrote:
>>>>
>>>>
>>>> On 06/29/2015 11:32 AM, Michal Hocko wrote:
>>>>> On Thu 25-06-15 18:27:10, Nikolay Borisov wrote:
>>>>>>
>>>>>>
>>>>>> On 06/25/2015 06:18 PM, Michal Hocko wrote:
>>>>>>> On Thu 25-06-15 17:34:22, Nikolay Borisov wrote:
>>>>>>>> On 06/25/2015 05:05 PM, Michal Hocko wrote:
>>>>>>>>> On Thu 25-06-15 16:49:43, Nikolay Borisov wrote:
>>>>>>>>> [...]
>>>>>>>>>> How would you advise to rectify such situation?
>>>>>>>>>
>>>>>>>>> As I've said. Check the oom victim traces and see if it is holding any
>>>>>>>>> of those locks.
>>>>>>>>
>>>>>>>> As mentioned previously all OOM traces are identical to the one I've
>>>>>>>> sent - OOM being called from the page fault path.
>>>>>>>  
>>>>>>> By identical you mean that all of them kill the same task? Or just that
>>>>>>> the path is same (which wouldn't be surprising as this is the only path
>>>>>>> which triggers memcg oom killer)?
>>>>>>
>>>>>> The code path is the same, the tasks being killed are different
>>>>>
>>>>> Is the OOM killer triggered only for a single memcg or others misbehave
>>>>> as well?
>>>>
>>>> Generally OOM would be triggered for whichever memcg runs out of
>>>> resources but so far I've only observed that the D state issue happens
>>>> in a single container.
>>>
>>> It is not clear whether it is the OOM memcg which has tasks in the D
>>> state. Anyway I think it all smells like one memcg is throttling others
>>> on another shared resource - journal in your case.
>>
>> Be that as it may, how do I find which cgroup is the culprit?
> 
> Ted has already described that. You have to check all the running tasks
> and try to find which of them is doing the operation which blocks
> others. Transaction commit sounds like the first one to check.

One other, fairly crucial detail: each container is on a separate block
device, and since the journal is per block device, the containers do not
share a journal. I guess this means that whatever is happening is more or
less confined to a single block device, so the possibility that different
memcgs are competing for the same journal can be eliminated?
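
For reference, the check suggested above (look at the running tasks and
find the one doing the operation that blocks the others, transaction
commit first) could be sketched roughly as follows. This is only a sketch
under assumptions: it assumes /proc/<pid>/stack is readable (root, and a
kernel built with stack trace support), it only looks at thread-group
leaders rather than every thread, and the helper names parse_stat,
blocked_tasks and format_report are made up for illustration. It matches
on two kernel symbol names: wait_transaction_locked (from the subject of
this thread) and jbd2_journal_commit_transaction.

```python
import os


def parse_stat(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return comm, state


def read_text(path):
    """Read a small proc file; return '' if it vanished or is unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def blocked_tasks(proc_root="/proc"):
    """List tasks in uninterruptible sleep (state D) with their stacks."""
    pids = sorted((e for e in os.listdir(proc_root) if e.isdigit()), key=int)
    tasks = []
    for pid in pids:
        stat = read_text(os.path.join(proc_root, pid, "stat"))
        try:
            comm, state = parse_stat(stat)
        except (ValueError, IndexError):
            continue  # task exited or stat was unreadable
        if state != "D":
            continue
        stack = read_text(os.path.join(proc_root, pid, "stack"))
        tasks.append({
            "pid": int(pid),
            "comm": comm,
            # Assumption: a substring match on the symbol name is enough
            # to tell a committer from a task waiting on the transaction.
            "committing": "jbd2_journal_commit_transaction" in stack,
            "waiting": "wait_transaction_locked" in stack,
        })
    return tasks


def format_report(tasks):
    """One line per blocked task: pid, comm, and what it appears to do."""
    lines = []
    for t in tasks:
        if t["committing"]:
            role = "in transaction commit"
        elif t["waiting"]:
            role = "waiting for locked transaction"
        else:
            role = "blocked elsewhere"
        lines.append("%d %s: %s" % (t["pid"], t["comm"], role))
    return "\n".join(lines)
```

Running print(format_report(blocked_tasks())) as root on the affected host
would list the D state tasks. A task reported as "in transaction commit"
would be the first one to inspect more closely.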

