* [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
@ 2015-10-28 5:36 kernel test robot
2015-10-29 14:24 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: kernel test robot @ 2015-10-28 5:36 UTC (permalink / raw)
To: Mel Gorman
Cc: lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner, Michal Hocko,
Vlastimil Babka, Stephen Rothwell
[-- Attachment #1: Type: text/plain, Size: 3329 bytes --]
FYI, we noticed the below changes on
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd")
We found the OOM possibility increased 88% in a virtual machine with 1G memory.
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/disk/fs/test:
vm-kbuild-1G/xfstests/debian-x86_64-2015-02-07.cgz/x86_64-allyesdebian/gcc-4.9/4HDD/btrfs/generic-mid
commit:
74fad8a3a917b9e0a407af8a4150c61f7b836591
43993977baecd838d66ccabc7f682342fc6ff635
74fad8a3a917b9e0 43993977baecd838d66ccabc7f
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1:24 -4% :24 xfstests.generic.192.fail
1:24 -4% :24 xfstests.nr_fail
:24 88% 21:24 dmesg.Mem-Info
:24 62% 15:24 dmesg.page_allocation_failure:order:#,mode
:24 88% 21:24 dmesg.warn_alloc_failed+0x
:24 75% 18:24 last_state.is_incomplete_run
1:24 -4% :24 last_state.xfstests.exit_code.1
:24 54% 13:24 last_state.xfstests.exit_code.143
:24 71% 17:24 kmsg.SLAB:Unable_to_allocate_memory_on_node#(gfp=#)
1:24 -4% :24 kmsg.TDH<#>
1:24 -4% :24 kmsg.TDH<c7>
1:24 -4% :24 kmsg.TDT<#>
1:24 -4% :24 kmsg.TDT<c7>
1:24 -4% :24 kmsg.Tx_Queue<#>
1:24 -4% :24 kmsg.buffer_info[next_to_clean]
1:24 -4% :24 kmsg.e1000#:#:#eth0:Detected_Tx_Unit_Hang
1:24 -4% :24 kmsg.jiffies<#>
1:24 -4% :24 kmsg.jiffies<#c5c>
1:24 -4% :24 kmsg.next_to_clean<#>
1:24 -4% :24 kmsg.next_to_clean<c7>
1:24 -4% :24 kmsg.next_to_use<#>
1:24 -4% :24 kmsg.next_to_use<c9>
1:24 -4% :24 kmsg.next_to_watch.status<#>
1:24 -4% :24 kmsg.next_to_watch<#>
1:24 -4% :24 kmsg.next_to_watch<c8>
1:24 -4% :24 kmsg.time_stamp<#>
1:24 -4% :24 kmsg.time_stamp<#afd>
vm-kbuild-1G: qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap
Memory: 1G
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Ying Huang
[-- Attachment #2: job.yaml --]
[-- Type: text/plain, Size: 2984 bytes --]
---
LKP_SERVER: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
testcase: xfstests
default-monitors:
  wait: activate-monitor
  kmsg:
  vmstat:
    interval: 10
default-watchdogs:
  oom-killer:
  watchdog:
cpufreq_governor:
model: qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap
nr_vm: 16
nr_cpu: 2
memory: 1G
disk_type: virtio-scsi
rootfs: debian-x86_64-2015-02-07.cgz
hdd_partitions: "/dev/sda /dev/sdb /dev/sdc /dev/sdd"
swap_partitions: "/dev/sde"
ssh_base_port: 23000
category: functional
disk: 4HDD
fs: btrfs
xfstests:
  test: generic-mid
enqueue_time: 2015-10-19 23:36:05.806413316 +08:00
branch: linux-review/Richard-Fitzgerald/Add-support-for-Cirrus-Logic-CS47L24-and-WM1831-codecs/20151019-221754
commit: 70e25d9e15fdae345ae7499131c643db506d08e0
queue: rand
repeat_to: 2
testbox: vm-kbuild-1G-11
tbox_group: vm-kbuild-1G
kconfig: x86_64-allyesdebian
id: 64d645db367175db708ac69d321aa547a7f23900
user: lkp
compiler: gcc-4.9
kernel: "/pkg/linux/x86_64-allyesdebian/gcc-4.9/70e25d9e15fdae345ae7499131c643db506d08e0/vmlinuz-4.3.0-rc5-next-20151016-00003-g70e25d9"
result_root: "/result/xfstests/4HDD-btrfs-generic-mid/vm-kbuild-1G/debian-x86_64-2015-02-07.cgz/x86_64-allyesdebian/gcc-4.9/70e25d9e15fdae345ae7499131c643db506d08e0/0"
job_file: "/lkp/scheduled/vm-kbuild-1G-11/rand_xfstests-4HDD-btrfs-generic-mid-debian-x86_64.cgz-x86_64-allyesdebian-70e25d9e15fdae345ae7499131c643db506d08e0-20151019-128924-ut9rnp-0.yaml"
dequeue_time: 2015-10-20 00:24:58.505681006 +08:00
max_uptime: 3600
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/vm-kbuild-1G-11/rand_xfstests-4HDD-btrfs-generic-mid-debian-x86_64.cgz-x86_64-allyesdebian-70e25d9e15fdae345ae7499131c643db506d08e0-20151019-128924-ut9rnp-0.yaml
- ARCH=x86_64
- kconfig=x86_64-allyesdebian
- branch=linux-review/Richard-Fitzgerald/Add-support-for-Cirrus-Logic-CS47L24-and-WM1831-codecs/20151019-221754
- commit=70e25d9e15fdae345ae7499131c643db506d08e0
- BOOT_IMAGE=/pkg/linux/x86_64-allyesdebian/gcc-4.9/70e25d9e15fdae345ae7499131c643db506d08e0/vmlinuz-4.3.0-rc5-next-20151016-00003-g70e25d9
- max_uptime=3600
- RESULT_ROOT=/result/xfstests/4HDD-btrfs-generic-mid/vm-kbuild-1G/debian-x86_64-2015-02-07.cgz/x86_64-allyesdebian/gcc-4.9/70e25d9e15fdae345ae7499131c643db506d08e0/0
- LKP_SERVER=inn
- |2-
   earlyprintk=ttyS0,115200 systemd.log_level=err
   debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
   panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
   console=ttyS0,115200 console=tty0 vga=normal
   rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-allyesdebian/gcc-4.9/70e25d9e15fdae345ae7499131c643db506d08e0/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs.cgz,/lkp/benchmarks/xfstests.cgz"
job_state: upload_dmesg
[-- Attachment #3: reproduce --]
[-- Type: text/plain, Size: 593 bytes --]
mkfs -t btrfs /dev/sdd
mkfs -t btrfs /dev/sdb
mkfs -t btrfs /dev/sda
mkfs -t btrfs /dev/sdc
mount -t btrfs /dev/sda /fs/sda
mount -t btrfs /dev/sdb /fs/sdb
mount -t btrfs /dev/sdc /fs/sdc
mount -t btrfs /dev/sdd /fs/sdd
export TEST_DIR=/fs/sda
export TEST_DEV=/dev/sda
export FSTYP=btrfs
export SCRATCH_MNT=/fs/scratch
mkdir /fs/scratch -p
export SCRATCH_DEV_POOL="/dev/sdb /dev/sdc /dev/sdd"
./check generic/068 generic/100 generic/117 generic/120 generic/125 generic/130 generic/192 generic/209 generic/223 generic/225 generic/226 generic/230 generic/241 generic/247 generic/256 generic/275
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-10-28 5:36 [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility kernel test robot
@ 2015-10-29 14:24 ` Michal Hocko
2015-10-30 8:21 ` Huang, Ying
0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2015-10-29 14:24 UTC (permalink / raw)
To: kernel test robot
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
On Wed 28-10-15 13:36:02, kernel test robot wrote:
> FYI, we noticed the below changes on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd")
>
> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
Could you provide dmesg output from this test?
Thanks!
--
Michal Hocko
SUSE Labs
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-10-29 14:24 ` Michal Hocko
@ 2015-10-30 8:21 ` Huang, Ying
2015-10-30 10:38 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: Huang, Ying @ 2015-10-30 8:21 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
[-- Attachment #1: Type: text/plain, Size: 565 bytes --]
Michal Hocko <mhocko@kernel.org> writes:
> On Wed 28-10-15 13:36:02, kernel test robot wrote:
>> FYI, we noticed the below changes on
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc:
>> distinguish between being unable to sleep, unwilling to sleep and
>> avoiding waking kswapd")
>>
>> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
>
> Could you provide dmesg output from this test?
Sure, Attached.
Best Regards,
Huang, Ying
[-- Attachment #2: dmesg.xz --]
[-- Type: application/x-xz, Size: 32456 bytes --]
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-10-30 8:21 ` Huang, Ying
@ 2015-10-30 10:38 ` Michal Hocko
2015-11-01 23:20 ` Huang, Ying
0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2015-10-30 10:38 UTC (permalink / raw)
To: Huang, Ying
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
On Fri 30-10-15 16:21:40, Huang, Ying wrote:
> Michal Hocko <mhocko@kernel.org> writes:
>
> > On Wed 28-10-15 13:36:02, kernel test robot wrote:
> >> FYI, we noticed the below changes on
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc:
> >> distinguish between being unable to sleep, unwilling to sleep and
> >> avoiding waking kswapd")
> >>
> >> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
> >
> > Could you provide dmesg output from this test?
>
> Sure, Attached.
I can only see a single allocation failure warning:
kworker/u4:1: page allocation failure: order:0, mode:0x2204000
This is obviously a non sleeping allocation with ___GFP_KSWAPD_RECLAIM
set. ___GFP_HIGH (aka access to memory reserves) is not required so a
failure of such an allocation is something to be expected.
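For illustration, a minimal sketch of the three axes the patch separates (a
paraphrase, not code from the patch; the flag names follow the mainline form
of the change and may not match the tested linux-next commit exactly):

#include <linux/types.h>
#include <linux/gfp.h>
#include <linux/printk.h>

/*
 * Illustrative sketch only: classify a gfp mask along the three axes the
 * commit distinguishes.  mode:0x2204000 above has the kswapd-wakeup bit set
 * but neither the direct-reclaim bit nor __GFP_HIGH, so the allocator may
 * wake kswapd but will neither sleep nor dip into memory reserves.
 */
static void classify_gfp(gfp_t gfp)
{
	bool can_sleep     = !!(gfp & __GFP_DIRECT_RECLAIM); /* may enter direct reclaim */
	bool wakes_kswapd  = !!(gfp & __GFP_KSWAPD_RECLAIM);  /* only kicks background reclaim */
	bool uses_reserves = !!(gfp & __GFP_HIGH);            /* may use memory reserves */

	pr_info("gfp %#x: sleep=%d kswapd=%d reserves=%d\n",
		(unsigned int)gfp, can_sleep, wakes_kswapd, uses_reserves);
}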
[ 2294.616369] Workqueue: btrfs-submit btrfs_submit_helper
[ 2294.616369] 0000000000000000 ffff88000d38f5e0 ffffffff8173f84c 0000000000000000
[ 2294.616369] ffff88000d38f678 ffffffff811abaee 00000000ffffffff 000000010038f618
[ 2294.616369] ffff8800584e4148 00000000ffffffff ffff8800584e2f00 0000000000000001
[ 2294.616369] Call Trace:
[ 2294.616369] [<ffffffff8173f84c>] dump_stack+0x4b/0x63
[ 2294.616369] [<ffffffff811abaee>] warn_alloc_failed+0x125/0x13d
[ 2294.616369] [<ffffffff811aecce>] __alloc_pages_nodemask+0x7c9/0x915
[ 2294.616369] [<ffffffff811ecc7b>] kmem_getpages+0x91/0x155
[ 2294.616369] [<ffffffff811eef0d>] fallback_alloc+0x1cc/0x24c
[ 2294.616369] [<ffffffff811eed32>] ____cache_alloc_node+0x151/0x160
[ 2294.616369] [<ffffffff811ef1ed>] __kmalloc+0xb0/0x134
[ 2294.616369] [<ffffffff8105d7a5>] ? sched_clock+0x9/0xb
[ 2294.616369] [<ffffffff8187d929>] ? virtqueue_add+0x78/0x37f
[ 2294.616369] [<ffffffff8187d929>] virtqueue_add+0x78/0x37f
[ 2294.616369] [<ffffffff81114f72>] ? __lock_acquire+0x751/0xf55
[ 2294.616369] [<ffffffff8187dca6>] virtqueue_add_sgs+0x76/0x85
The patch you are referring to shouldn't make any change in this path,
because alloc_indirect, which I expect is the allocation failing here,
does:
gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH)
and that came in via b92b1b89a33c ("virtio: force vring descriptors to
be allocated from lowmem").
Are there more failed allocations during the test? The subject would
suggest so.
Thanks!
--
Michal Hocko
SUSE Labs
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-10-30 10:38 ` Michal Hocko
@ 2015-11-01 23:20 ` Huang, Ying
2015-11-02 7:45 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: Huang, Ying @ 2015-11-01 23:20 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
Michal Hocko <mhocko@kernel.org> writes:
> On Fri 30-10-15 16:21:40, Huang, Ying wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>>
>> > On Wed 28-10-15 13:36:02, kernel test robot wrote:
>> >> FYI, we noticed the below changes on
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> >> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc:
>> >> distinguish between being unable to sleep, unwilling to sleep and
>> >> avoiding waking kswapd")
>> >>
>> >> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
>> >
>> > Could you provide dmesg output from this test?
>>
>> Sure, Attached.
>
> I can only see a single allocation failure warning:
> kworker/u4:1: page allocation failure: order:0, mode:0x2204000
>
> This is obviously a non sleeping allocation with ___GFP_KSWAPD_RECLAIM
> set. ___GFP_HIGH (aka access to memory reserves) is not required so a
> failure of such an allocation is something to be expected.
>
> [ 2294.616369] Workqueue: btrfs-submit btrfs_submit_helper
> [ 2294.616369] 0000000000000000 ffff88000d38f5e0 ffffffff8173f84c 0000000000000000
> [ 2294.616369] ffff88000d38f678 ffffffff811abaee 00000000ffffffff 000000010038f618
> [ 2294.616369] ffff8800584e4148 00000000ffffffff ffff8800584e2f00 0000000000000001
> [ 2294.616369] Call Trace:
> [ 2294.616369] [<ffffffff8173f84c>] dump_stack+0x4b/0x63
> [ 2294.616369] [<ffffffff811abaee>] warn_alloc_failed+0x125/0x13d
> [ 2294.616369] [<ffffffff811aecce>] __alloc_pages_nodemask+0x7c9/0x915
> [ 2294.616369] [<ffffffff811ecc7b>] kmem_getpages+0x91/0x155
> [ 2294.616369] [<ffffffff811eef0d>] fallback_alloc+0x1cc/0x24c
> [ 2294.616369] [<ffffffff811eed32>] ____cache_alloc_node+0x151/0x160
> [ 2294.616369] [<ffffffff811ef1ed>] __kmalloc+0xb0/0x134
> [ 2294.616369] [<ffffffff8105d7a5>] ? sched_clock+0x9/0xb
> [ 2294.616369] [<ffffffff8187d929>] ? virtqueue_add+0x78/0x37f
> [ 2294.616369] [<ffffffff8187d929>] virtqueue_add+0x78/0x37f
> [ 2294.616369] [<ffffffff81114f72>] ? __lock_acquire+0x751/0xf55
> [ 2294.616369] [<ffffffff8187dca6>] virtqueue_add_sgs+0x76/0x85
>
> The patch you are referring to shouldn't make any change in this path,
> because alloc_indirect, which I expect is the allocation failing here,
> does:
> gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH)
>
> and that came in via b92b1b89a33c ("virtio: force vring descriptors to
> be allocated from lowmem").
>
> Are there more failed allocations during the test? The subject would
> suggest so.
We ran 24 tests for the commit and 24 tests for its parent. There was
no OOM in any test for the parent commit, but there were OOMs in 21 of
the 24 tests for this commit. That is what I wanted to convey in the
subject. Sorry for the confusion.
Best Regards,
Huang, Ying
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-11-01 23:20 ` Huang, Ying
@ 2015-11-02 7:45 ` Michal Hocko
2015-11-02 8:55 ` Huang, Ying
0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2015-11-02 7:45 UTC (permalink / raw)
To: Huang, Ying
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
On Mon 02-11-15 07:20:37, Huang, Ying wrote:
> Michal Hocko <mhocko@kernel.org> writes:
>
> > On Fri 30-10-15 16:21:40, Huang, Ying wrote:
> >> Michal Hocko <mhocko@kernel.org> writes:
> >>
> >> > On Wed 28-10-15 13:36:02, kernel test robot wrote:
> >> >> FYI, we noticed the below changes on
> >> >>
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >> >> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc:
> >> >> distinguish between being unable to sleep, unwilling to sleep and
> >> >> avoiding waking kswapd")
> >> >>
> >> >> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
> >> >
> >> > Could you provide dmesg output from this test?
> >>
> >> Sure, Attached.
> >
> > I can only see a single allocation failure warning:
> > kworker/u4:1: page allocation failure: order:0, mode:0x2204000
> >
> > This is obviously a non sleeping allocation with ___GFP_KSWAPD_RECLAIM
> > set. ___GFP_HIGH (aka access to memory reserves) is not required so a
> > failure of such an allocation is something to be expected.
> >
> > [ 2294.616369] Workqueue: btrfs-submit btrfs_submit_helper
> > [ 2294.616369] 0000000000000000 ffff88000d38f5e0 ffffffff8173f84c 0000000000000000
> > [ 2294.616369] ffff88000d38f678 ffffffff811abaee 00000000ffffffff 000000010038f618
> > [ 2294.616369] ffff8800584e4148 00000000ffffffff ffff8800584e2f00 0000000000000001
> > [ 2294.616369] Call Trace:
> > [ 2294.616369] [<ffffffff8173f84c>] dump_stack+0x4b/0x63
> > [ 2294.616369] [<ffffffff811abaee>] warn_alloc_failed+0x125/0x13d
> > [ 2294.616369] [<ffffffff811aecce>] __alloc_pages_nodemask+0x7c9/0x915
> > [ 2294.616369] [<ffffffff811ecc7b>] kmem_getpages+0x91/0x155
> > [ 2294.616369] [<ffffffff811eef0d>] fallback_alloc+0x1cc/0x24c
> > [ 2294.616369] [<ffffffff811eed32>] ____cache_alloc_node+0x151/0x160
> > [ 2294.616369] [<ffffffff811ef1ed>] __kmalloc+0xb0/0x134
> > [ 2294.616369] [<ffffffff8105d7a5>] ? sched_clock+0x9/0xb
> > [ 2294.616369] [<ffffffff8187d929>] ? virtqueue_add+0x78/0x37f
> > [ 2294.616369] [<ffffffff8187d929>] virtqueue_add+0x78/0x37f
> > [ 2294.616369] [<ffffffff81114f72>] ? __lock_acquire+0x751/0xf55
> > [ 2294.616369] [<ffffffff8187dca6>] virtqueue_add_sgs+0x76/0x85
> >
> > The patch you are referring to shouldn't make any change in this path,
> > because alloc_indirect, which I expect is the allocation failing here,
> > does:
> > gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH)
> >
> > and that came in via b92b1b89a33c ("virtio: force vring descriptors to
> > be allocated from lowmem").
> >
> > Are there more failed allocations during the test? The subject would
> > suggest so.
>
> We ran 24 tests for the commit and 24 tests for its parent. There was
> no OOM in any test for the parent commit, but there were OOMs in 21 of
> the 24 tests for this commit. That is what I wanted to convey in the
> subject. Sorry for the confusion.
It would be interesting to see all the page allocation failure warnings
(if they are different). Maybe other callers have relied on GFP_ATOMIC
and access to memory reserves. The above path is not this case though.
--
Michal Hocko
SUSE Labs
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-11-02 7:45 ` Michal Hocko
@ 2015-11-02 8:55 ` Huang, Ying
2015-11-02 9:53 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: Huang, Ying @ 2015-11-02 8:55 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
Michal Hocko <mhocko@kernel.org> writes:
> On Mon 02-11-15 07:20:37, Huang, Ying wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>>
>> > On Fri 30-10-15 16:21:40, Huang, Ying wrote:
>> >> Michal Hocko <mhocko@kernel.org> writes:
>> >>
>> >> > On Wed 28-10-15 13:36:02, kernel test robot wrote:
>> >> >> FYI, we noticed the below changes on
>> >> >>
>> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> >> >> commit 43993977baecd838d66ccabc7f682342fc6ff635 ("mm, page_alloc:
>> >> >> distinguish between being unable to sleep, unwilling to sleep and
>> >> >> avoiding waking kswapd")
>> >> >>
>> >> >> We found the OOM possibility increased 88% in a virtual machine with 1G memory.
>> >> >
>> >> > Could you provide dmesg output from this test?
>> >>
>> >> Sure, Attached.
>> >
>> > I can only see a single allocation failure warning:
>> > kworker/u4:1: page allocation failure: order:0, mode:0x2204000
>> >
>> > This is obviously a non sleeping allocation with ___GFP_KSWAPD_RECLAIM
>> > set. ___GFP_HIGH (aka access to memory reserves) is not required so a
>> > failure of such an allocation is something to be expected.
>> >
>> > [ 2294.616369] Workqueue: btrfs-submit btrfs_submit_helper
>> > [ 2294.616369] 0000000000000000 ffff88000d38f5e0 ffffffff8173f84c 0000000000000000
>> > [ 2294.616369] ffff88000d38f678 ffffffff811abaee 00000000ffffffff 000000010038f618
>> > [ 2294.616369] ffff8800584e4148 00000000ffffffff ffff8800584e2f00 0000000000000001
>> > [ 2294.616369] Call Trace:
>> > [ 2294.616369] [<ffffffff8173f84c>] dump_stack+0x4b/0x63
>> > [ 2294.616369] [<ffffffff811abaee>] warn_alloc_failed+0x125/0x13d
>> > [ 2294.616369] [<ffffffff811aecce>] __alloc_pages_nodemask+0x7c9/0x915
>> > [ 2294.616369] [<ffffffff811ecc7b>] kmem_getpages+0x91/0x155
>> > [ 2294.616369] [<ffffffff811eef0d>] fallback_alloc+0x1cc/0x24c
>> > [ 2294.616369] [<ffffffff811eed32>] ____cache_alloc_node+0x151/0x160
>> > [ 2294.616369] [<ffffffff811ef1ed>] __kmalloc+0xb0/0x134
>> > [ 2294.616369] [<ffffffff8105d7a5>] ? sched_clock+0x9/0xb
>> > [ 2294.616369] [<ffffffff8187d929>] ? virtqueue_add+0x78/0x37f
>> > [ 2294.616369] [<ffffffff8187d929>] virtqueue_add+0x78/0x37f
>> > [ 2294.616369] [<ffffffff81114f72>] ? __lock_acquire+0x751/0xf55
>> > [ 2294.616369] [<ffffffff8187dca6>] virtqueue_add_sgs+0x76/0x85
>> >
>> > The patch you are referring to shouldn't make any change in this path,
>> > because alloc_indirect, which I expect is the allocation failing here,
>> > does:
>> > gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH)
>> >
>> > and that came in via b92b1b89a33c ("virtio: force vring descriptors to
>> > be allocated from lowmem").
>> >
>> > Are there more failed allocations during the test? The subject would
>> > suggest so.
>>
>> We ran 24 tests for the commit and 24 tests for its parent. There was
>> no OOM in any test for the parent commit, but there were OOMs in 21 of
>> the 24 tests for this commit. That is what I wanted to convey in the
>> subject. Sorry for the confusion.
>
> It would be interesting to see all the page allocation failure warnings
> (if they are different). Maybe other callers have relied on GFP_ATOMIC
> and access to memory reserves. The above path is not this case though.
I took a look at all the dmesgs and found that the backtrace for the page
allocation failure is the same in every one. Is it possible that this
commit causes more memory to be allocated or kept in memory, so that more
OOMs are triggered?
Best Regards,
Huang, Ying
* Re: [lkp] [mm, page_alloc] 43993977ba: +88% OOM possibility
2015-11-02 8:55 ` Huang, Ying
@ 2015-11-02 9:53 ` Michal Hocko
0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2015-11-02 9:53 UTC (permalink / raw)
To: Huang, Ying
Cc: Mel Gorman, lkp, LKML, Andrew Morton, Rik van Riel, Vitaly Wool,
David Rientjes, Christoph Lameter, Johannes Weiner,
Vlastimil Babka, Stephen Rothwell
On Mon 02-11-15 16:55:15, Huang, Ying wrote:
> Michal Hocko <mhocko@kernel.org> writes:
[...]
> > It would be interesting to see all the page allocation failure warnings
> > (if they are different). Maybe other callers have relied on GFP_ATOMIC
> > and access to memory reserves. The above path is not this case though.
>
> I took a look at all the dmesgs and found that the backtrace for the page
> allocation failure is the same in every one. Is it possible that this
> commit causes more memory to be allocated or kept in memory, so that more
> OOMs are triggered?
I can imagine that some of the callers were not converted properly or
were missed, and a lack of __GFP_KSWAPD_RECLAIM could indeed cause kswapd
to be kicked off later. I am staring at the commit but nothing has jumped
out at me yet. Could you collect /proc/vmstat (a snapshot every 1s) on
both the good and bad kernels? I expect the latter would show less
scanning by kswapd.
--
Michal Hocko
SUSE Labs