* order 7 allocations from xt_recent
@ 2013-01-03 16:43 Dave Jones
2013-01-03 16:55 ` Eric Dumazet
0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2013-01-03 16:43 UTC (permalink / raw)
To: netdev; +Cc: h.reindl, Fedora Kernel Team
We had a report from a user that shows this code trying
to do enormous allocations, which isn't going to work too well..
iptables: page allocation failure: order:7, mode:0xc0d0
Pid: 2822, comm: iptables Not tainted 3.6.10-2.fc17.x86_64 #1
Call Trace:
[<ffffffff8113130b>] warn_alloc_failed+0xeb/0x150
[<ffffffff81616576>] ? __alloc_pages_direct_compact+0x17e/0x190
[<ffffffff81135196>] __alloc_pages_nodemask+0x736/0x990
[<ffffffff811710e0>] alloc_pages_current+0xb0/0x120
[<ffffffff8113022a>] __get_free_pages+0x2a/0x80
[<ffffffff811786d9>] kmalloc_order_trace+0x39/0xb0
[<ffffffff8117ae3a>] __kmalloc+0x16a/0x1a0
[<ffffffff8118aa7c>] ? mem_cgroup_bad_page_check+0x1c/0x30
[<ffffffff81134563>] ? get_page_from_freelist+0x453/0x950
[<ffffffffa007696e>] recent_mt_check.isra.6+0x16e/0x2c0 [xt_recent]
[<ffffffffa0076b4b>] recent_mt_check_v0+0x6b/0xa0 [xt_recent]
[<ffffffff8153fdda>] xt_check_match+0xaa/0x1e0
[<ffffffff8153f3ab>] ? xt_find_match+0x11b/0x130
[<ffffffff8153f3ab>] ? xt_find_match+0x11b/0x130
[<ffffffff8159257c>] check_match+0x3c/0x50
[<ffffffff81593ccb>] translate_table+0x39b/0x5b0
[<ffffffff815956f3>] do_ipt_set_ctl+0x133/0x200
[<ffffffff8153e10b>] nf_setsockopt+0x6b/0x90
[<ffffffff8161f236>] ? _raw_spin_lock_bh+0x16/0x40
[<ffffffff8154e41f>] ip_setsockopt+0x8f/0xa0
[<ffffffff8156f49d>] raw_setsockopt+0x1d/0x30
[<ffffffff814fcf14>] sock_common_setsockopt+0x14/0x20
[<ffffffff814fc23c>] sys_setsockopt+0x7c/0xe0
[<ffffffff816270e9>] system_call_fastpath+0x16/0x1b
which looks like it's this..
t = kzalloc(sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size,
GFP_KERNEL);
Which is initialised thus..
ip_list_hash_size = 1 << fls(ip_list_tot);
And ip_list_tot is 10000 in this case. Hmm ?
Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
Dave
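
The size arithmetic behind the order:7 failure can be checked with a small userspace sketch. It is only a model, not the xt_recent code: model_fls() stands in for the kernel's fls(), and the 64-byte header, 16-byte bucket (two pointers on x86_64) and 4 kB page size are assumptions chosen for illustration. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, or 0 when x == 0. */
static int model_fls(unsigned int x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Bytes requested for the table: a header plus one bucket per hash slot,
 * mirroring sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size.
 * header_bytes and bucket_bytes are assumed values supplied by the caller. */
static size_t model_table_bytes(unsigned int ip_list_tot,
                                size_t header_bytes, size_t bucket_bytes)
{
    size_t hash_size = (size_t)1 << model_fls(ip_list_tot);

    return header_bytes + bucket_bytes * hash_size;
}

/* Smallest order whose block (page_size << order) covers `bytes`. */
static int model_order_for(size_t bytes, size_t page_size)
{
    int order = 0;
    size_t block = page_size;

    assert(page_size > 0);
    while (block < bytes) {
        block <<= 1;
        order++;
    }
    return order;
}
```

With ip_list_tot = 10000, model_fls() gives 14, so the hash has 16384 buckets. At 16 bytes each that is exactly 256 kB, and any non-empty header pushes the request past it, so the next power-of-two block is 512 kB, i.e. order 7 with 4 kB pages.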
* Re: order 7 allocations from xt_recent
2013-01-03 16:43 order 7 allocations from xt_recent Dave Jones
@ 2013-01-03 16:55 ` Eric Dumazet
2013-01-03 17:11 ` Dave Jones
0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2013-01-03 16:55 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, h.reindl, Fedora Kernel Team
On Thu, 2013-01-03 at 11:43 -0500, Dave Jones wrote:
> We had a report from a user that shows this code trying
> to do enormous allocations, which isn't going to work too well..
>
> iptables: page allocation failure: order:7, mode:0xc0d0
> Pid: 2822, comm: iptables Not tainted 3.6.10-2.fc17.x86_64 #1
> Call Trace:
> [<ffffffff8113130b>] warn_alloc_failed+0xeb/0x150
> [<ffffffff81616576>] ? __alloc_pages_direct_compact+0x17e/0x190
> [<ffffffff81135196>] __alloc_pages_nodemask+0x736/0x990
> [<ffffffff811710e0>] alloc_pages_current+0xb0/0x120
> [<ffffffff8113022a>] __get_free_pages+0x2a/0x80
> [<ffffffff811786d9>] kmalloc_order_trace+0x39/0xb0
> [<ffffffff8117ae3a>] __kmalloc+0x16a/0x1a0
> [<ffffffff8118aa7c>] ? mem_cgroup_bad_page_check+0x1c/0x30
> [<ffffffff81134563>] ? get_page_from_freelist+0x453/0x950
> [<ffffffffa007696e>] recent_mt_check.isra.6+0x16e/0x2c0 [xt_recent]
> [<ffffffffa0076b4b>] recent_mt_check_v0+0x6b/0xa0 [xt_recent]
> [<ffffffff8153fdda>] xt_check_match+0xaa/0x1e0
> [<ffffffff8153f3ab>] ? xt_find_match+0x11b/0x130
> [<ffffffff8153f3ab>] ? xt_find_match+0x11b/0x130
> [<ffffffff8159257c>] check_match+0x3c/0x50
> [<ffffffff81593ccb>] translate_table+0x39b/0x5b0
> [<ffffffff815956f3>] do_ipt_set_ctl+0x133/0x200
> [<ffffffff8153e10b>] nf_setsockopt+0x6b/0x90
> [<ffffffff8161f236>] ? _raw_spin_lock_bh+0x16/0x40
> [<ffffffff8154e41f>] ip_setsockopt+0x8f/0xa0
> [<ffffffff8156f49d>] raw_setsockopt+0x1d/0x30
> [<ffffffff814fcf14>] sock_common_setsockopt+0x14/0x20
> [<ffffffff814fc23c>] sys_setsockopt+0x7c/0xe0
> [<ffffffff816270e9>] system_call_fastpath+0x16/0x1b
>
> which looks like it's this..
>
> t = kzalloc(sizeof(*t) + sizeof(t->iphash[0]) * ip_list_hash_size,
> GFP_KERNEL);
>
> Which is initialised thus..
>
> ip_list_hash_size = 1 << fls(ip_list_tot);
>
> And ip_list_tot is 10000 in this case. Hmm ?
>
>
> Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
>
> Dave
>
Yes, we had a report and a patch :
http://comments.gmane.org/gmane.linux.network/248216
I'll send it in a more formal way.
Thanks
* Re: order 7 allocations from xt_recent
2013-01-03 16:55 ` Eric Dumazet
@ 2013-01-03 17:11 ` Dave Jones
2013-01-03 17:26 ` Dave Jones
2013-01-03 18:02 ` Eric Dumazet
0 siblings, 2 replies; 7+ messages in thread
From: Dave Jones @ 2013-01-03 17:11 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, h.reindl, Fedora Kernel Team
On Thu, Jan 03, 2013 at 08:55:04AM -0800, Eric Dumazet wrote:
> On Thu, 2013-01-03 at 11:43 -0500, Dave Jones wrote:
> > We had a report from a user that shows this code trying
> > to do enormous allocations, which isn't going to work too well..
> > ...
> > Which is initialised thus..
> >
> > ip_list_hash_size = 1 << fls(ip_list_tot);
> >
> > And ip_list_tot is 10000 in this case. Hmm ?
> >
> > Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
>
> Yes, we had a report and a patch :
>
> http://comments.gmane.org/gmane.linux.network/248216
>
> I'll send it in a more formal way.
Ah! Excellent.
That 'check size and vmalloc/kmalloc accordingly' thing seems to be a pattern
that comes up time and time again. Is it worth maybe making a more generic
version of that instead of open-coding it each time it comes up ?
thanks,
Dave
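
For readers unfamiliar with the 'check size and vmalloc/kmalloc accordingly' pattern, here is a minimal userspace analogue. It is a sketch under stated assumptions, not kernel code and not the patch referenced above: calloc() plays the role of the physically contiguous allocator, an anonymous mmap() plays the role of vmalloc, and the one-page threshold, struct big_buf and both function names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed threshold: one 4 kB page */

enum buf_kind { BUF_HEAP, BUF_MAPPED };

/* Remembers which backend produced the buffer so that the matching
 * release call is used later. */
struct big_buf {
    void *ptr;
    size_t size;
    enum buf_kind kind;
};

/* Small requests come from the heap; large ones get zero-filled pages
 * that only need to be virtually contiguous. Returns 0 on success. */
static int big_buf_alloc(struct big_buf *b, size_t size)
{
    b->size = size;
    if (size <= MODEL_PAGE_SIZE) {
        b->kind = BUF_HEAP;
        b->ptr = calloc(1, size);
        return b->ptr ? 0 : -1;
    }

    b->kind = BUF_MAPPED;
    b->ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (b->ptr == MAP_FAILED) {
        b->ptr = NULL;
        return -1;
    }
    return 0;
}

/* Release with the call that matches the backend chosen at allocation. */
static void big_buf_free(struct big_buf *b)
{
    if (!b->ptr)
        return;
    if (b->kind == BUF_MAPPED)
        munmap(b->ptr, b->size);
    else
        free(b->ptr);
    b->ptr = NULL;
}
```

The part that tends to get open-coded each time is the bookkeeping on the free path: the caller has to know, or be able to find out, which allocator produced the pointer.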
* Re: order 7 allocations from xt_recent
2013-01-03 17:11 ` Dave Jones
@ 2013-01-03 17:26 ` Dave Jones
2013-01-03 18:00 ` Eric Dumazet
2013-01-03 18:02 ` Eric Dumazet
1 sibling, 1 reply; 7+ messages in thread
From: Dave Jones @ 2013-01-03 17:26 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, h.reindl, Fedora Kernel Team
On Thu, Jan 03, 2013 at 12:11:15PM -0500, Dave Jones wrote:
> On Thu, Jan 03, 2013 at 08:55:04AM -0800, Eric Dumazet wrote:
> > On Thu, 2013-01-03 at 11:43 -0500, Dave Jones wrote:
> > > We had a report from a user that shows this code trying
> > > to do enormous allocations, which isn't going to work too well..
> > > ...
> > > Which is initialised thus..
> > >
> > > ip_list_hash_size = 1 << fls(ip_list_tot);
> > >
> > > And ip_list_tot is 10000 in this case. Hmm ?
> > >
> > > Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
> >
> > Yes, we had a report and a patch :
> >
> > http://comments.gmane.org/gmane.linux.network/248216
> >
> > I'll send it in a more formal way.
>
> Ah! Excellent.
>
> That 'check size and vmalloc/kmalloc accordingly' thing seems to be a pattern
> that comes up time and time again. Is it worth maybe making a more generic
> version of that instead of open-coding it each time it comes up ?
Something else that I'm puzzled by.
In the report above, it failed to allocate 512kb, but..
Node 0 Normal: 2388*4kB 347*8kB 1029*16kB 3512*32kB 29*64kB 2*128kB 1*256kB 5*512kB 1*1024kB 0*2048kB 0*4096kB = 147128kB
^^^^^^^^^^^^^^^^
Shouldn't the allocator have been able to satisfy that anyway ?
Dave
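
The free-list line can be cross-checked with a few lines of C. This is just arithmetic on the numbers quoted above. The array and function names are hypothetical, and a 4 kB base page is assumed, so order 0 = 4 kB up to order 10 = 4096 kB.

```c
#include <stddef.h>

/* Free block counts per order, copied from the "Node 0 Normal" line. */
static const unsigned long normal_nr_free[11] = {
    2388, 347, 1029, 3512, 29, 2, 1, 5, 1, 0, 0
};

/* Total free memory in kB held in blocks of order >= min_order,
 * assuming a 4 kB base page. */
static unsigned long free_kb_from_order(const unsigned long *nr_free,
                                        size_t n_orders, size_t min_order)
{
    unsigned long kb = 0;

    for (size_t o = min_order; o < n_orders; o++)
        kb += nr_free[o] * (4ul << o);
    return kb;
}
```

Summing from order 0 reproduces the 147128 kB total on that line. Summing from order 7 gives 3584 kB (5*512 kB + 1*1024 kB), so blocks large enough for a 512 kB request did exist on the free lists.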
* Re: order 7 allocations from xt_recent
2013-01-03 17:26 ` Dave Jones
@ 2013-01-03 18:00 ` Eric Dumazet
2013-01-03 19:51 ` Reindl Harald
0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2013-01-03 18:00 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, h.reindl, Fedora Kernel Team
On Thu, 2013-01-03 at 12:26 -0500, Dave Jones wrote:
> On Thu, Jan 03, 2013 at 12:11:15PM -0500, Dave Jones wrote:
> > On Thu, Jan 03, 2013 at 08:55:04AM -0800, Eric Dumazet wrote:
> > > On Thu, 2013-01-03 at 11:43 -0500, Dave Jones wrote:
> > > > We had a report from a user that shows this code trying
> > > > to do enormous allocations, which isn't going to work too well..
> > > > ...
> > > > Which is initialised thus..
> > > >
> > > > ip_list_hash_size = 1 << fls(ip_list_tot);
> > > >
> > > > And ip_list_tot is 10000 in this case. Hmm ?
> > > >
> > > > Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
> > >
> > > Yes, we had a report and a patch :
> > >
> > > http://comments.gmane.org/gmane.linux.network/248216
> > >
> > > I'll send it in a more formal way.
> >
> > Ah! Excellent.
> >
> > That 'check size and vmalloc/kmalloc accordingly' thing seems to be a pattern
> > that comes up time and time again. Is it worth maybe making a more generic
> > version of that instead of open-coding it each time it comes up ?
>
> Something else that I'm puzzled by.
>
> In the report above, it failed to allocate 512kb, but..
>
> Node 0 Normal: 2388*4kB 347*8kB 1029*16kB 3512*32kB 29*64kB 2*128kB 1*256kB 5*512kB 1*1024kB 0*2048kB 0*4096kB = 147128kB
> ^^^^^^^^^^^^^^^^
>
> Shouldn't the allocator have been able to satisfy that anyway ?
>
> Dave
>
Might be something related to the CONFIG_COMPACTION=y and lumpy reclaim
removal ?
Anyway, we keep a fraction of memory for ATOMIC allocations.
* Re: order 7 allocations from xt_recent
2013-01-03 17:11 ` Dave Jones
2013-01-03 17:26 ` Dave Jones
@ 2013-01-03 18:02 ` Eric Dumazet
1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2013-01-03 18:02 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, h.reindl, Fedora Kernel Team
On Thu, 2013-01-03 at 12:11 -0500, Dave Jones wrote:
> That 'check size and vmalloc/kmalloc accordingly' thing seems to be a pattern
> that comes up time and time again. Is it worth maybe making a more generic
> version of that instead of open-coding it each time it comes up ?
We had numerous discussions and patch submissions in the past, and this
went nowhere. I am not sure anybody wants to spend more cycles on this.
It's sad, but true.
* Re: order 7 allocations from xt_recent
2013-01-03 18:00 ` Eric Dumazet
@ 2013-01-03 19:51 ` Reindl Harald
0 siblings, 0 replies; 7+ messages in thread
From: Reindl Harald @ 2013-01-03 19:51 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Dave Jones, netdev, Fedora Kernel Team
On 03.01.2013 19:00, Eric Dumazet wrote:
> On Thu, 2013-01-03 at 12:26 -0500, Dave Jones wrote:
>> On Thu, Jan 03, 2013 at 12:11:15PM -0500, Dave Jones wrote:
>> > On Thu, Jan 03, 2013 at 08:55:04AM -0800, Eric Dumazet wrote:
>> > > On Thu, 2013-01-03 at 11:43 -0500, Dave Jones wrote:
>> > > > We had a report from a user that shows this code trying
>> > > > to do enormous allocations, which isn't going to work too well..
>> > > > ...
>> > > > Which is initialised thus..
>> > > >
>> > > > ip_list_hash_size = 1 << fls(ip_list_tot);
>> > > >
>> > > > And ip_list_tot is 10000 in this case. Hmm ?
>> > > >
>> > > > Complete report and setup described in his bug report at https://bugzilla.redhat.com/show_bug.cgi?id=890715
>> > >
>> > > Yes, we had a report and a patch :
>> > >
>> > > http://comments.gmane.org/gmane.linux.network/248216
>> > >
>> > > I'll send it in a more formal way.
>> >
>> > Ah! Excellent.
>> >
>> > That 'check size and vmalloc/kmalloc accordingly' thing seems to be a pattern
>> > that comes up time and time again. Is it worth maybe making a more generic
>> > version of that instead of open-coding it each time it comes up ?
>>
>> Something else that I'm puzzled by.
>>
>> In the report above, it failed to allocate 512kb, but..
>>
>> Node 0 Normal: 2388*4kB 347*8kB 1029*16kB 3512*32kB 29*64kB 2*128kB 1*256kB 5*512kB 1*1024kB 0*2048kB 0*4096kB = 147128kB
>> ^^^^^^^^^^^^^^^^
>>
>> Shouldn't the allocator have been able to satisfy that anyway ?
>>
>
> Might be something related to the CONFIG_COMPACTION=y and lumpy reclaim
> removal ?
>
> Anyway, we keep a fraction of memory for ATOMIC allocations
On that machine "vm.min_free_kbytes" is even set to 128 MB.
However, something goes terribly wrong if cache pages lead to
stack traces about failed memory allocations.
vm.swappiness = 0
vm.overcommit_memory = 1
vm.overcommit_ratio = 60
vm.vfs_cache_pressure = 30
vm.dirty_background_ratio = 15
vm.dirty_ratio = 40
vm.dirty_expire_centisecs = 1500
vm.dirty_writeback_centisecs = 1500
vm.zone_reclaim_mode = 0
vm.min_free_kbytes = 131072
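
One way a reserve can make a high-order request fail even though suitable blocks are free is sketched below. It is a simplified userspace model of a per-order watermark check, written from my understanding of how kernels of that era behaved. It ignores lowmem reserves and allocation-flag adjustments, and the names are hypothetical. The actual watermark of the Normal zone is not given in this thread, so the values used in the note after the code are assumptions.

```c
#include <stdbool.h>

#define MODEL_MAX_ORDER 11

/* Free block counts per order from the "Node 0 Normal" line quoted
 * earlier in the thread (order 0 = 4 kB pages). */
static const unsigned long model_normal_zone[MODEL_MAX_ORDER] = {
    2388, 347, 1029, 3512, 29, 2, 1, 5, 1, 0, 0
};

/* Simplified watermark test: when checking for an order-N request, pages
 * held in smaller blocks are progressively discounted and the required
 * reserve is halved at each order. min_pages is an assumed watermark. */
static bool model_watermark_ok(const unsigned long nr_free[MODEL_MAX_ORDER],
                               int order, long min_pages)
{
    long free_pages = 0;

    for (int o = 0; o < MODEL_MAX_ORDER; o++)
        free_pages += (long)(nr_free[o] << o);

    /* Account for the pages the request itself would consume. */
    free_pages -= (1L << order) - 1;
    if (free_pages <= min_pages)
        return false;

    for (int o = 0; o < order; o++) {
        /* Blocks of this order cannot serve a larger request. */
        free_pages -= (long)(nr_free[o] << o);
        /* Require fewer free pages at each higher order. */
        min_pages >>= 1;
        if (free_pages <= min_pages)
            return false;
    }
    return true;
}
```

If, hypothetically, the whole 131072 kB reserve (32768 pages) applied to this one zone, the model rejects an order-7 request once the order-3 blocks have been discounted, while an order-0 request still passes. With a small assumed watermark such as 1024 pages, the same order-7 request passes. In practice the reserve is split across zones, so this shows the mechanism only and does not diagnose the report.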