linux-mm.kvack.org archive mirror
From: Igor Mammedov <imammedo@redhat.com>
To: Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>,
	linux-kernel@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
	balbir@linux.vnet.ibm.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, Paul Menage <menage@google.com>,
	Li Zefan <lizf@cn.fujitsu.com>,
	containers@lists.linux-foundation.org
Subject: Re: [PATCH] memcg: do not expose uninitialized mem_cgroup_per_node to world
Date: Tue, 07 Jun 2011 15:25:59 +0200	[thread overview]
Message-ID: <4DEE26E7.2060201@redhat.com> (raw)
In-Reply-To: <BANLkTinMamg_qesEffGxKu3QkT=zyQ2MRQ@mail.gmail.com>

Sorry for the late reply,

On 06/03/2011 03:00 PM, Hiroyuki Kamezawa wrote:
> 2011/6/3 Igor Mammedov<imammedo@redhat.com>:
>> On 06/02/2011 01:10 AM, Hiroyuki Kamezawa wrote:
>>>> pc = list_entry(list->prev, struct page_cgroup, lru);
>>> Hmm, I disagree your patch is a fix for mainline. At least, a cgroup
>>> before completion of
>>> create() is not populated to userland and you never be able to rmdir()
>>> it because you can't
>>> find it.
>>>
>>>
>>>   >26:   e8 7d 12 30 00          call   0x3012a8
>>>   >2b:*  8b 73 08                mov    0x8(%ebx),%esi<-- trapping
>>> instruction
>>>   >2e:   8b 7c 24 24             mov    0x24(%esp),%edi
>>>   >32:   8b 07                   mov    (%edi),%eax
>>>
>>> Hm, what is the call 0x3012a8 ?
>>>
>>                 pc = list_entry(list->prev, struct page_cgroup, lru);
>>                 if (busy == pc) {
>>                         list_move(&pc->lru, list);
>>                         busy = 0;
>>                         spin_unlock_irqrestore(&zone->lru_lock, flags);
>>                         continue;
>>                 }
>>                 spin_unlock_irqrestore(&zone->lru_lock, flags);<---- is
>>   call 0x3012a8
>>                 ret = mem_cgroup_move_parent(pc, mem, GFP_KERNEL);
>>
>> and mov 0x8(%ebx),%esi
>> is the dereference of 'pc' in the inlined mem_cgroup_move_parent
>>
> Ah, thank you for the input.. then it panicked at accessing pc->page,
> and "pc" was 0xfffffff4.
> That means list->prev was NULL.
>
Yes, that's the case.
>> I've looked at vmcore once more and indeed there isn't any parallel task
>> that touches cgroups code path.
>> Will investigate if it is xen to blame for incorrect data in place.
>>
>> Thanks very much for your opinion.
> What is curious to me is the fact that "list->prev" is NULL.
> I can see why you doubt the initialization code ... the list pointer never
> contains NULL once it's used....
> It smells like memory corruption or something similar to me. If you have a
> vmcore, what does the problematic mem_cgroup_per_zone(node) contain?

It has all zeros except for the last field:

crash> rd f3446a00 62
f3446a00:  00000000 00000000 00000000 00000000   ................
f3446a10:  00000000 00000000 00000000 00000000   ................
f3446a20:  00000000 00000000 00000000 00000000   ................
f3446a30:  00000000 00000000 00000000 00000000   ................
f3446a40:  00000000 00000000 00000000 00000000   ................
f3446a50:  00000000 00000000 00000000 00000000   ................
f3446a60:  00000000 00000000 00000000 00000000   ................
f3446a70:  00000000 00000000 f36ef800 f3446a7c   ..........n.|jD.
f3446a80:  f3446a7c f3446a84 f3446a84 f3446a8c   |jD..jD..jD..jD.
f3446a90:  f3446a8c f3446a94 f3446a94 f3446a9c   .jD..jD..jD..jD.
f3446aa0:  f3446a9c 00000000 00000000 00000000   .jD.............
f3446ab0:  00000000 00000000 00000000 00000000   ................
f3446ac0:  00000000 00000000 00000000 00000000   ................
f3446ad0:  00000000 00000000 00000000 00000000   ................
f3446ae0:  00000000 00000000 00000000 00000000   ................
f3446af0:  00000000 f36ef800

crash> struct mem_cgroup f36ef800
struct mem_cgroup {
...
info = {
     nodeinfo = {0xf3446a00}
   },
...

It looks like a very targeted corruption of the first zone, except for
the last field, while the second zone and the rest are perfectly
normal (i.e. they have empty, initialized lists).


PS:
It is most easily reproduced on a 32-bit Xen HVM guest under heavy
vcpu contention for real cpu resources (i.e. I had to overcommit
cpus and run several cpu-hog tasks on the host to make the guest
crash during a reboot cycle).
And from the last experiments, the crash happens only on hosts that
don't have the HAP (hardware-assisted paging) feature, or when HAP is
disabled in the hypervisor.

> Thanks,
> -Kame


Thread overview: 29+ messages
     [not found] <1306925044-2828-1-git-send-email-imammedo@redhat.com>
2011-06-01 12:39 ` [PATCH] memcg: do not expose uninitialized mem_cgroup_per_node to world Michal Hocko
2011-06-01 13:07   ` Igor Mammedov
2011-06-01 13:41     ` Michal Hocko
2011-06-01 14:39       ` Igor Mammedov
2011-06-01 15:20         ` Michal Hocko
2011-06-01 16:42           ` Igor Mammedov
2011-06-01 23:10             ` Hiroyuki Kamezawa
2011-06-03 12:35               ` Igor Mammedov
2011-06-03 13:00                 ` Hiroyuki Kamezawa
2011-06-07 13:25                   ` Igor Mammedov [this message]
2011-06-08  3:35                     ` KAMEZAWA Hiroyuki
2011-06-08 21:09                       ` Andrew Morton
2011-06-08 23:44                         ` KAMEZAWA Hiroyuki
2011-06-10 16:57                         ` Igor Mammedov
2011-07-26 21:17                           ` Andrew Morton
2011-07-27  7:58                             ` Michal Hocko
2011-07-27  9:30                               ` Igor Mammedov
2011-07-27  9:57                                 ` Michal Hocko
2011-06-09  8:11                       ` Igor Mammedov
2011-06-09 12:40                         ` Possible shadow bug (was: Re: [PATCH] memcg: do not expose uninitialized mem_cgroup_per_node to world) Stefano Stabellini
2011-06-09 15:01                           ` [Xen-devel] " Tim Deegan
2011-06-09 16:47                             ` [Xen-devel] Possible shadow bug Igor Mammedov
2011-06-10 10:01                               ` Tim Deegan
2011-06-10 10:10                                 ` Tim Deegan
2011-06-10 11:48                                   ` Pasi Kärkkäinen
2011-06-10 12:40                                     ` Tim Deegan
2011-06-10 15:38                                       ` Igor Mammedov
2011-06-10 13:55                                   ` Igor Mammedov
2011-06-01 13:49   ` [PATCH] memcg: do not expose uninitialized mem_cgroup_per_node to world Igor Mammedov
