* Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-09 8:22 UTC
To: Eric Dumazet, netdev

Hi Eric,

I think you may be able to help with this. I tried the following on kernel 6.3.1.

Add VLANs:

for i in $(seq 2 4094); do ip link add link eth1 name vlan$i type vlan id $i; done
for i in $(seq 2 4094); do ip link set dev vlan$i up; done

and after that run:

for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done

Removing these 4093 VLANs takes 5-10 minutes.

Is there an option to make this faster?

The same problem occurs with 5-6k ppp interfaces: the kernel is very slow to unregister the devices.

Best regards,
m.

* Re: Very slow remove interface from kernel
From: Ido Schimmel @ 2023-05-09 10:20 UTC
To: Martin Zaharinov; +Cc: Eric Dumazet, netdev

On Tue, May 09, 2023 at 11:22:13AM +0300, Martin Zaharinov wrote:
> add vlans :
> for i in $(seq 2 4094); do ip link add link eth1 name vlan$i type vlan id $i; done
> for i in $(seq 2 4094); do ip link set dev vlan$i up; done
>
> and after that run :
>
> for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
>
> time for remove for this 4093 vlans is 5-10 min .
>
> Is there options to make fast this ?

If you know you are going to delete all of them together, then you can
add them to the same group during creation:

for i in $(seq 2 4094); do ip link add link eth1 name vlan$i up group 10 type vlan id $i; done

Then delete the group:

ip link del group 10

IIRC, in the past there was a patchset to allow passing a list of
ifindexes instead of a group number, but it never made its way upstream.

* Re: Very slow remove interface from kernel
From: Eric Dumazet @ 2023-05-09 10:32 UTC
To: Ido Schimmel; +Cc: Martin Zaharinov, netdev

On Tue, May 9, 2023 at 12:20 PM Ido Schimmel <idosch@idosch.org> wrote:
>
> If you know you are going to delete all of them together, then you can
> add them to the same group during creation:
>
> for i in $(seq 2 4094); do ip link add link eth1 name vlan$i up group 10 type vlan id $i; done
>
> Then delete the group:
>
> ip link del group 10
>

Another way is to create a netns for retiring devices,
move devices to the 'retirens' when they need to go away.

Then once per minute, delete the retirens and create a new one.

-> This batches netdev deletions.

> IIRC, in the past there was a patchset to allow passing a list of
> ifindexes instead of a group number, but it never made its way upstream.
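
The retire-namespace idea above might look roughly like the sketch below. The namespace name `retirens` comes from the message; the helper names `run`, `retire_dev` and `flush_retired` and the `DRY_RUN` switch are illustrative assumptions, not part of iproute2. It uses only `ip link set ... netns`, `ip netns del` and `ip netns add`, and it assumes the devices being retired are virtual ones (such as VLAN devices) that are destroyed together with their namespace.

```shell
#!/bin/sh
# Sketch of the "retire namespace" approach; needs root unless DRY_RUN is set.
RETIRE_NS=retirens   # name taken from the message above

# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Move one device into the retire namespace instead of deleting it directly.
retire_dev() {
    run ip link set dev "$1" netns "$RETIRE_NS"
}

# Delete the namespace (and the devices moved into it), then recreate it empty.
flush_retired() {
    run ip netns del "$RETIRE_NS"
    run ip netns add "$RETIRE_NS"
}
```

Create the namespace once with `ip netns add retirens`, call `retire_dev vlan$i` wherever a device would otherwise be deleted, and run `flush_retired` periodically (the message suggests once per minute).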

* Re: Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-09 11:10 UTC
To: Eric Dumazet; +Cc: Ido Schimmel, netdev

Hi,

So in short, there is no way to make the kernel itself do it faster.

With older kernels, unregistering a device was much faster.

With the latest kernels (>6.x) it is very slow.

Is there any chance of making this faster?

m.

* Re: Very slow remove interface from kernel
From: Eric Dumazet @ 2023-05-09 12:36 UTC
To: Martin Zaharinov; +Cc: Ido Schimmel, netdev

On Tue, May 9, 2023 at 1:10 PM Martin Zaharinov <micron10@gmail.com> wrote:
>
> Hi
>
> in short, there is no way to make the kernel do it faster.

Make sure your kernel does not include options you do not need.

>
> Before time with old kernel unregister device make more faster .
>
> with latest kernel >6.x this make very slow .
>

Yup, I feel your pain.

Maybe you should start a bisection then...

You might find that you have some CONFIG_ option that makes this
operation very slow.

Some layers (like hamradio and others) lack batch operations in their
netdev removal handlers.

For instance, on one machine I have access to and with my standard
.config, your benchmark gives a not too bad result with pristine
linux-6.3

modprobe dummy
ip link set dev dummy0 up
for i in $(seq 2 4094); do ip link add link dummy0 name vlan$i type vlan id $i; done
for i in $(seq 2 4094); do ip link set dev vlan$i up; done
time for i in $(seq 2 4094); do ip link del link dummy0 name vlan$i type vlan id $i; done
real 0m55.808s
user 0m0.788s
sys 0m6.868s

Without batching, I think one netdev removal needs three synchronize_net() calls.

I am reasonably certain numbers would not look so good if I booted a
"make allyesconfig" kernel.

* Re: Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-09 18:50 UTC
To: Eric Dumazet; +Cc: Ido Schimmel, netdev

Hi Eric,

I tried on kernel 6.3.1:

time for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done

real 4m51.633s (here I stopped with Ctrl+C and reran; the second part finished after 3 min)
user 0m7.479s
sys 0m0.367s

The config is very clean; I removed a large part of the CONFIG options.

Is there a way to debug what is happening?

m

* Re: Very slow remove interface from kernel
From: Ido Schimmel @ 2023-05-09 20:08 UTC
To: Martin Zaharinov; +Cc: Eric Dumazet, netdev

On Tue, May 09, 2023 at 09:50:18PM +0300, Martin Zaharinov wrote:
> i try on kernel 6.3.1
>
> time for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
>
> real 4m51.633s —— here i stop with Ctrl + C - and rerun and second part finish after 3 min
> user 0m7.479s
> sys 0m0.367s

You are off-CPU most of the time, the question is what is blocking. I'm
getting the following results with net-next:

# time -p for i in $(seq 2 4094); do ip link del dev eth0.$i; done
real 177.09
user 3.85
sys 31.26

When using a batch file to perform the deletion:

# time -p ip -b vlan_del.batch
real 35.25
user 0.02
sys 3.61

And to check where we are blocked most of the time while using the batch
file:

# ../bcc/libbpf-tools/offcputime -p `pgrep -nx ip`
[...]
    __schedule
    schedule
    schedule_timeout
    wait_for_completion
    rcu_barrier
    netdev_run_todo
    rtnetlink_rcv_msg
    netlink_rcv_skb
    netlink_unicast
    netlink_sendmsg
    ____sys_sendmsg
    ___sys_sendmsg
    __sys_sendmsg
    do_syscall_64
    entry_SYSCALL_64_after_hwframe
    -                ip (3660)
        25089479
[...]

We are blocked for around 70% of the time on the rcu_barrier() in
netdev_run_todo().

Note that one big difference between my setup and yours is that in my
case eth0 is a dummy device and in your case it's probably a physical
device that actually implements netdev_ops::ndo_vlan_rx_kill_vid(). If
so, it's possible that a non-negligible amount of time is spent talking
to hardware/firmware to delete the 4K VIDs from the device's VLAN
filter.
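
The contents of `vlan_del.batch` are not shown in the message. A minimal sketch of how such a file could be generated, assuming the `eth0.$i` device names from the loop above and iproute2's batch format of one command per line without the leading `ip`:

```shell
#!/bin/sh
# Write one "link del" command per VLAN device into a batch file.
# Device names (eth0.2 .. eth0.4094) are assumed from the loop in the message;
# the output file name matches the one passed to "ip -b" there.
batch=vlan_del.batch
for i in $(seq 2 4094); do
    echo "link del dev eth0.$i"
done > "$batch"
```

Running `ip -b vlan_del.batch` as root then issues all 4093 deletions from a single `ip` process instead of spawning one process per device.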

* Re: Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-09 20:16 UTC
To: Ido Schimmel; +Cc: Eric Dumazet, netdev

Hi Ido,

Yes, it is a physical card: an Intel 82599 dual-port 10G, on a 2-socket system with 24 cores at 3 GHz.

These are the timings:

time ./vlanadd
real 0m12.347s
user 0m8.863s
sys 0m2.594s

time ./vlanrem
real 8m59.105s
user 0m11.931s
sys 0m0.035s

Watching with:

watch -n.1 "ip a | grep UP | wc"

while vlanrem runs shows about 4-5 VLANs removed per second, and I think RCU is the problem.

I found a post from 2009:
https://lore.kernel.org/all/20091024144610.GC6638@linux.vnet.ibm.com/T/

Yes, it is old, and many changes may have been made since then.

I see the same slow removal with ppp interfaces: when more than 800-900 users drop, removing the devices and reconnecting (re-adding) them hits the same problem.

m.

* Re: Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-10 5:31 UTC
To: Ido Schimmel; +Cc: Eric Dumazet, netdev

Hi Eric and Ido,

After a little research, and after changing CONFIG_HZ_100 to CONFIG_HZ_1000:

vlanadd:
real 0m15.106s
user 0m2.420s
sys 0m13.250s

vlandel:
real 1m10.995s
user 0m1.045s
sys 0m7.678s

I have used 100 for the last 10 years; all installations are networking servers.

Do you have any recommendations?

Best regards,
m
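
To put the timings reported in this thread on a common scale, the sketch below divides each wall-clock time by the 4093 deleted devices. The helper name `per_dev_ms` is made up for illustration; the inputs are the `real` values quoted in the messages, converted to seconds, and the HZ=100 label on the first run is an assumption based on the statement that 100 was the previous setting.

```shell
#!/bin/sh
# Average wall-clock cost per deleted device, in milliseconds.
# $1 = total "real" time in seconds, $2 = number of devices.
per_dev_ms() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.1f\n", t * 1000 / n }'
}

per_dev_ms 539.105 4093   # physical NIC, presumably HZ=100 (8m59.105s)
per_dev_ms 70.995 4093    # physical NIC, HZ=1000 (1m10.995s)
per_dev_ms 55.808 4093    # dummy0 run on linux-6.3 (0m55.808s)
```

This prints 131.7, 17.3 and 13.6, i.e. the per-device cost on the physical NIC drops from roughly 130 ms to roughly 17 ms after the HZ change, close to the dummy-device figure.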

* Re: Very slow remove interface from kernel
From: Martin Zaharinov @ 2023-05-10 6:06 UTC
To: Ido Schimmel; +Cc: Eric Dumazet, netdev

I think the problem is in this part of the code in net/core/dev.c:

#define WAIT_REFS_MIN_MSECS 1
#define WAIT_REFS_MAX_MSECS 250
/**
 * netdev_wait_allrefs_any - wait until all references are gone.
 * @list: list of net_devices to wait on
 *
 * This is called when unregistering network devices.
 *
 * Any protocol or device that holds a reference should register
 * for netdevice notification, and cleanup and put back the
 * reference if they receive an UNREGISTER event.
 * We can get stuck here if buggy protocols don't correctly
 * call dev_put.
 */
static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
{
	unsigned long rebroadcast_time, warning_time;
	struct net_device *dev;
	int wait = 0;

	rebroadcast_time = warning_time = jiffies;

	list_for_each_entry(dev, list, todo_list)
		if (netdev_refcnt_read(dev) == 1)
			return dev;

	while (true) {
		if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
			rtnl_lock();

			/* Rebroadcast unregister notification */
			list_for_each_entry(dev, list, todo_list)
				call_netdevice_notifiers(NETDEV_UNREGISTER, dev);

			__rtnl_unlock();
			rcu_barrier();
			rtnl_lock();

			list_for_each_entry(dev, list, todo_list)
				if (test_bit(__LINK_STATE_LINKWATCH_PENDING,
					     &dev->state)) {
					/* We must not have linkwatch events
					 * pending on unregister. If this
					 * happens, we simply run the queue
					 * unscheduled, resulting in a noop
					 * for this device.
					 */
					linkwatch_run_queue();
					break;
				}

			__rtnl_unlock();

			rebroadcast_time = jiffies;
		}

		if (!wait) {
			rcu_barrier();
			wait = WAIT_REFS_MIN_MSECS;
		} else {
			msleep(wait);
			wait = min(wait << 1, WAIT_REFS_MAX_MSECS);
		}

		list_for_each_entry(dev, list, todo_list)
			if (netdev_refcnt_read(dev) == 1)
				return dev;

		if (time_after(jiffies, warning_time +
			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
			list_for_each_entry(dev, list, todo_list) {
				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
					 dev->name, netdev_refcnt_read(dev));
				ref_tracker_dir_print(&dev->refcnt_tracker, 10);
			}

			warning_time = jiffies;
		}
	}
}

m.

* Re: Very slow remove interface from kernel
From: Eric Dumazet @ 2023-05-10 9:40 UTC
To: Martin Zaharinov; +Cc: Ido Schimmel, netdev

On Wed, May 10, 2023 at 8:06 AM Martin Zaharinov <micron10@gmail.com> wrote:
>
> I think problem is in this part of code in net/core/dev.c

What makes you think this ?

msleep() is not called a single time on my test bed.

# perf probe -a msleep
# cat bench.sh
modprobe dummy 2>/dev/null
ip link set dev dummy0 up 2>/dev/null
for i in $(seq 2 4094); do ip link add link dummy0 name vlan$i type vlan id $i; done
for i in $(seq 2 4094); do ip link set dev vlan$i up; done
time for i in $(seq 2 4094); do ip link del link dummy0 name vlan$i type vlan id $i; done

# perf record -e probe:msleep -a -g ./bench.sh

real 0m59.877s
user 0m0.588s
sys 0m7.023s
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 8.561 MB perf.data ]
# perf script
# << empty, nothing >>
* Re: Very slow remove interface from kernel
  2023-05-10  9:40 ` Eric Dumazet
@ 2023-05-10 13:15 ` Martin Zaharinov
  2023-05-25  7:50 ` Martin Zaharinov
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Zaharinov @ 2023-05-10 13:15 UTC
To: Eric Dumazet; +Cc: Ido Schimmel, netdev

OK, I will try setting CONFIG_HZ to 1000 and run the tests.

Thanks Eric

> On 10 May 2023, at 12:40, Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, May 10, 2023 at 8:06 AM Martin Zaharinov <micron10@gmail.com> wrote:
>>
>> I think problem is in this part of code in net/core/dev.c
>
> What makes you think this ?
>
> msleep() is not called a single time on my test bed.
>
> # perf probe -a msleep
> # cat bench.sh
> modprobe dummy 2>/dev/null
> ip link set dev dummy0 up 2>/dev/null
> for i in $(seq 2 4094); do ip link add link dummy0 name vlan$i type vlan id $i; done
> for i in $(seq 2 4094); do ip link set dev vlan$i up; done
> time for i in $(seq 2 4094); do ip link del link dummy0 name vlan$i type vlan id $i; done
>
> # perf record -e probe:msleep -a -g ./bench.sh
>
> real 0m59.877s
> user 0m0.588s
> sys 0m7.023s
> [ perf record: Woken up 6 times to write data ]
> [ perf record: Captured and wrote 8.561 MB perf.data ]
> # perf script
> # << empty, nothing >>
>
>> #define WAIT_REFS_MIN_MSECS 1
>> #define WAIT_REFS_MAX_MSECS 250
>> /**
>>  * netdev_wait_allrefs_any - wait until all references are gone.
>>  * @list: list of net_devices to wait on
>>  *
>>  * This is called when unregistering network devices.
>>  *
>>  * Any protocol or device that holds a reference should register
>>  * for netdevice notification, and cleanup and put back the
>>  * reference if they receive an UNREGISTER event.
>>  * We can get stuck here if buggy protocols don't correctly
>>  * call dev_put.
>>  */
>> static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
>> {
>>         unsigned long rebroadcast_time, warning_time;
>>         struct net_device *dev;
>>         int wait = 0;
>>
>>         rebroadcast_time = warning_time = jiffies;
>>
>>         list_for_each_entry(dev, list, todo_list)
>>                 if (netdev_refcnt_read(dev) == 1)
>>                         return dev;
>>
>>         while (true) {
>>                 if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
>>                         rtnl_lock();
>>
>>                         /* Rebroadcast unregister notification */
>>                         list_for_each_entry(dev, list, todo_list)
>>                                 call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
>>
>>                         __rtnl_unlock();
>>                         rcu_barrier();
>>                         rtnl_lock();
>>
>>                         list_for_each_entry(dev, list, todo_list)
>>                                 if (test_bit(__LINK_STATE_LINKWATCH_PENDING,
>>                                              &dev->state)) {
>>                                         /* We must not have linkwatch events
>>                                          * pending on unregister. If this
>>                                          * happens, we simply run the queue
>>                                          * unscheduled, resulting in a noop
>>                                          * for this device.
>>                                          */
>>                                         linkwatch_run_queue();
>>                                         break;
>>                                 }
>>
>>                         __rtnl_unlock();
>>
>>                         rebroadcast_time = jiffies;
>>                 }
>>
>>                 if (!wait) {
>>                         rcu_barrier();
>>                         wait = WAIT_REFS_MIN_MSECS;
>>                 } else {
>>                         msleep(wait);
>>                         wait = min(wait << 1, WAIT_REFS_MAX_MSECS);
>>                 }
>>
>>                 list_for_each_entry(dev, list, todo_list)
>>                         if (netdev_refcnt_read(dev) == 1)
>>                                 return dev;
>>
>>                 if (time_after(jiffies, warning_time +
>>                                READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
>>                         list_for_each_entry(dev, list, todo_list) {
>>                                 pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
>>                                          dev->name, netdev_refcnt_read(dev));
>>                                 ref_tracker_dir_print(&dev->refcnt_tracker, 10);
>>                         }
>>
>>                         warning_time = jiffies;
>>                 }
>>         }
>> }
>>
>> m.
>>
>>> On 9 May 2023, at 23:08, Ido Schimmel <idosch@idosch.org> wrote:
>>>
>>> On Tue, May 09, 2023 at 09:50:18PM +0300, Martin Zaharinov wrote:
>>>> i try on kernel 6.3.1
>>>>
>>>> time for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
>>>>
>>>> real 4m51.633s  <-- here i stop with Ctrl + C - and rerun and second part finish after 3 min
>>>> user 0m7.479s
>>>> sys 0m0.367s
>>>
>>> You are off-CPU most of the time, the question is what is blocking. I'm
>>> getting the following results with net-next:
>>>
>>> # time -p for i in $(seq 2 4094); do ip link del dev eth0.$i; done
>>> real 177.09
>>> user 3.85
>>> sys 31.26
>>>
>>> When using a batch file to perform the deletion:
>>>
>>> # time -p ip -b vlan_del.batch
>>> real 35.25
>>> user 0.02
>>> sys 3.61
>>>
>>> And to check where we are blocked most of the time while using the batch
>>> file:
>>>
>>> # ../bcc/libbpf-tools/offcputime -p `pgrep -nx ip`
>>> [...]
>>>     __schedule
>>>     schedule
>>>     schedule_timeout
>>>     wait_for_completion
>>>     rcu_barrier
>>>     netdev_run_todo
>>>     rtnetlink_rcv_msg
>>>     netlink_rcv_skb
>>>     netlink_unicast
>>>     netlink_sendmsg
>>>     ____sys_sendmsg
>>>     ___sys_sendmsg
>>>     __sys_sendmsg
>>>     do_syscall_64
>>>     entry_SYSCALL_64_after_hwframe
>>>     -                ip (3660)
>>>         25089479
>>> [...]
>>>
>>> We are blocked for around 70% of the time on the rcu_barrier() in
>>> netdev_run_todo().
>>>
>>> Note that one big difference between my setup and yours is that in my
>>> case eth0 is a dummy device and in your case it's probably a physical
>>> device that actually implements netdev_ops::ndo_vlan_rx_kill_vid(). If
>>> so, it's possible that a non-negligible amount of time is spent talking
>>> to hardware/firmware to delete the 4K VIDs from the device's VLAN
>>> filter.
>>>
>>>> Config is very clean i remove big part of CONFIG options .
>>>>
>>>> is there options to debug what is happen.
>>>>
>>>> m
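The batch-file run that Ido timed above (`ip -b vlan_del.batch`) needs a file with one command per line, written without the leading `ip`. The thread does not show how that file was produced, so the following is only a minimal sketch of one way to generate it. The `PARENT` and `BATCH` variables are illustrative assumptions, and the `eth0.$i` device names follow Ido's example rather than Martin's `vlan$i` naming.

```shell
#!/bin/sh
# Sketch: build a batch file for "ip -b", one "link del" per VLAN device.
# PARENT and BATCH are hypothetical knobs, not something from the thread.
PARENT=${PARENT:-eth0}
BATCH=${BATCH:-vlan_del.batch}

: > "$BATCH"                      # create or truncate the batch file
for i in $(seq 2 4094); do
    # Each batch line is an ip(8) command without the "ip" prefix.
    echo "link del dev $PARENT.$i" >> "$BATCH"
done

echo "wrote $(wc -l < "$BATCH") commands to $BATCH"
# Afterwards, as root:  time -p ip -b "$BATCH"
```

Generating the file needs no privileges; only the final `ip -b` run does. This removes the cost of starting one `ip` process per device, but the kernel still handles each deletion as a separate request, which matches the off-CPU time Ido still sees in rcu_barrier().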
* Re: Very slow remove interface from kernel
  2023-05-10  9:40 ` Eric Dumazet
  2023-05-10 13:15 ` Martin Zaharinov
@ 2023-05-25  7:50 ` Martin Zaharinov
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Zaharinov @ 2023-05-25 7:50 UTC
To: Eric Dumazet; +Cc: Ido Schimmel, netdev

Hi Eric,

After switching to HZ 1666, the time to remove 4093 VLANs dropped to 30 sec.

Do you think there will be a problem?

Best regards,
Martin

> On 10 May 2023, at 12:40, Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, May 10, 2023 at 8:06 AM Martin Zaharinov <micron10@gmail.com> wrote:
>>
>> I think problem is in this part of code in net/core/dev.c
>
> What makes you think this ?
>
> msleep() is not called a single time on my test bed.
>
> [...]
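For reference, the sleep backoff in the netdev_wait_allrefs_any() code quoted in this thread can be tabulated with a small script. This is only a sketch of the arithmetic in that loop (first pass calls rcu_barrier() instead of sleeping, then the delay doubles up to WAIT_REFS_MAX_MSECS); the function name `backoff_schedule` is made up for illustration. Eric's probe found that msleep() was never reached on his test bed, so this path is not necessarily where the time goes.

```shell
#!/bin/sh
# Sketch: print the first N msleep() intervals of the quoted backoff loop.
WAIT_REFS_MIN_MSECS=1
WAIT_REFS_MAX_MSECS=250

# backoff_schedule N  (hypothetical helper, not a kernel or iproute2 tool)
backoff_schedule() {
    n=$1
    # Value of "wait" after the first pass, which does rcu_barrier() only.
    delay=$WAIT_REFS_MIN_MSECS
    total=0
    k=1
    while [ "$k" -le "$n" ]; do
        total=$((total + delay))
        echo "sleep $k: $delay ms (cumulative $total ms)"
        # wait = min(wait << 1, WAIT_REFS_MAX_MSECS)
        delay=$((delay * 2))
        if [ "$delay" -gt "$WAIT_REFS_MAX_MSECS" ]; then
            delay=$WAIT_REFS_MAX_MSECS
        fi
        k=$((k + 1))
    done
}

backoff_schedule 10
```

The figures are nominal requested sleeps; the real time spent in msleep() depends on timer granularity and is not modelled here.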
* Re: Very slow remove interface from kernel
  2023-05-09 20:08 ` Ido Schimmel
  ` (2 preceding siblings ...)
  2023-05-10  6:06 ` Martin Zaharinov
@ 2023-05-10  9:16 ` Martin Zaharinov
  2023-05-10  9:22 ` Eric Dumazet
  3 siblings, 1 reply; 16+ messages in thread
From: Martin Zaharinov @ 2023-05-10 9:16 UTC
To: Ido Schimmel; +Cc: Eric Dumazet, netdev

Hi all,

One more update: I tested directly on Proxmox with kernel 6.2.6.

modprobe dummy numdummies=1
ip link set dev dummy0 up
for i in $(seq 2 1999); do ip link add link dummy0 name vlan$i type vlan id $i; done
for i in $(seq 2 1999); do ip link set dev vlan$i up; done
time for i in $(seq 2 1999); do ip link del link dummy0 name vlan$i type vlan id $i; done

real 1m6.308s
user 0m4.451s
sys 0m1.589s

This kernel is configured with CONFIG_HZ 250 and, as you can see, I added only 1998 VLANs. If I add 4094, removal takes up to 4-5 min.

In the test kernel I set CONFIG_HZ to 1000, but I don't think this is fine for any server.

> On 9 May 2023, at 23:08, Ido Schimmel <idosch@idosch.org> wrote:
>
> [...]
>
> We are blocked for around 70% of the time on the rcu_barrier() in
> netdev_run_todo().
>
> [...]
* Re: Very slow remove interface from kernel
  2023-05-10  9:16 ` Martin Zaharinov
@ 2023-05-10  9:22 ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2023-05-10 9:22 UTC
To: Martin Zaharinov; +Cc: Ido Schimmel, netdev

On Wed, May 10, 2023 at 11:17 AM Martin Zaharinov <micron10@gmail.com> wrote:
>
> Hi all
>
> one more update
>
> i test with Proxmox direct with kernel 6.2.6
>
> modprobe dummy numdummies=1
> ip link set dev dummy0 up
> for i in $(seq 2 1999); do ip link add link dummy0 name vlan$i type vlan id $i; done
> for i in $(seq 2 1999); do ip link set dev vlan$i up; done
> time for i in $(seq 2 1999); do ip link del link dummy0 name vlan$i type vlan id $i; done
>
> real 1m6.308s
> user 0m4.451s
> sys 0m1.589s
>
> This kernel is configured with CONFIG_HZ 250 and as you see i add 1998 vlans if add 4094 is time up to 4-5 min to remove
>
> in test kernel i set CONFIG_HZ to 1000 but i dont this this is fine for any server.

We use CONFIG_HZ=1000 on server builds.

Other values cause suboptimal behavior, for instance in TCP stack.

> > On 9 May 2023, at 23:08, Ido Schimmel <idosch@idosch.org> wrote:
> >
> > [...]
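The timings posted in this thread cover different VLAN counts (1998 in Martin's Proxmox run, 4093 elsewhere), so they are easier to compare as an average cost per deleted device. The helper below is a small sketch of that arithmetic using only figures quoted in the thread; `per_dev_ms` is a made-up name, and the result says nothing about where the time is actually spent.

```shell
#!/bin/sh
# Sketch: average milliseconds per deleted device.
# per_dev_ms SECONDS COUNT  (hypothetical helper)
per_dev_ms() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", s * 1000 / n }'
}

per_dev_ms 66.308 1998   # Martin: kernel 6.2.6, CONFIG_HZ 250, 1m6.308s
per_dev_ms 59.877 4093   # Eric: bench.sh on dummy0, 0m59.877s
per_dev_ms 177.09 4093   # Ido: net-next, one ip invocation per device
per_dev_ms 35.25  4093   # Ido: net-next, ip -b batch file
```

On these numbers Martin's CONFIG_HZ 250 run costs roughly twice as much per device as Eric's run, and Ido's batch-file run is the cheapest of the four.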
* Re: Very slow remove interface from kernel
  2023-05-09 12:36 ` Eric Dumazet
  2023-05-09 18:50 ` Martin Zaharinov
@ 2023-05-09 20:08 ` Martin Zaharinov
  1 sibling, 0 replies; 16+ messages in thread
From: Martin Zaharinov @ 2023-05-09 20:08 UTC
To: Eric Dumazet; +Cc: Ido Schimmel, netdev

[-- Attachment #1: Type: text/plain, Size: 142 bytes --]

One more thing: see this video, recorded while the interfaces were being removed.

I ran this:

watch -n.1 "ip a | grep UP | wc"

to see how many interfaces are removed per second.

[-- Attachment #2: Screen Recording 2023-05-09 at 23.06.52.mov --]
[-- Type: video/quicktime, Size: 606423 bytes --]

[-- Attachment #3: Type: text/plain, Size: 2972 bytes --]

> On 9 May 2023, at 15:36, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, May 9, 2023 at 1:10 PM Martin Zaharinov <micron10@gmail.com> wrote:
>>
>> Hi
>>
>> in short, there is no way to make the kernel do it faster.
>
> Make sure your kernel does not include options you do not need.
>
>>
>> Before time with old kernel unregister device make more faster .
>>
>> with latest kernel >6.x this make very slow .
>>
>
> Yup, I feel your pain.
>
> Maybe you should start a bisection then...
>
> You might find that you have some CONFIG_ option that makes this
> operation very slow.
>
> Some layers (like hamradio and others) lack batch operations in their
> netdev removal handlers.
>
> For instance, on one machine I have access to and with my standard
> .config, your benchmark gives a not too bad result with pristine
> linux-6.3
>
> modprobe dummy
> ip link set dev dummy0 up
> for i in $(seq 2 4094); do ip link add link dummy0 name vlan$i type vlan id $i; done
> for i in $(seq 2 4094); do ip link set dev vlan$i up; done
> time for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
> real 0m55.808s
> user 0m0.788s
> sys 0m6.868s
>
> Without batching, I think one netdev removal needs three synchronize_net() calls
>
> I am reasonably certain numbers would not look so good if I booted a
> "make allyesconfig" kernel.
>
>>
>> is there any chance to try to make this more fast.
>>
>> m.
>>
>>> On 9 May 2023, at 13:32, Eric Dumazet <edumazet@google.com> wrote:
>>>
>>> On Tue, May 9, 2023 at 12:20 PM Ido Schimmel <idosch@idosch.org> wrote:
>>>>
>>>> On Tue, May 09, 2023 at 11:22:13AM +0300, Martin Zaharinov wrote:
>>>>> add vlans :
>>>>> for i in $(seq 2 4094); do ip link add link eth1 name vlan$i type vlan id $i; done
>>>>> for i in $(seq 2 4094); do ip link set dev vlan$i up; done
>>>>>
>>>>> and after that run :
>>>>>
>>>>> for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
>>>>>
>>>>> time for remove for this 4093 vlans is 5-10 min .
>>>>>
>>>>> Is there options to make fast this ?
>>>>
>>>> If you know you are going to delete all of them together, then you can
>>>> add them to the same group during creation:
>>>>
>>>> for i in $(seq 2 4094); do ip link add link eth1 name vlan$i up group 10 type vlan id $i; done
>>>>
>>>> Then delete the group:
>>>>
>>>> ip link del group 10
>>>>
>>>
>>> Another way is to create a netns for retiring devices,
>>> move devices to the 'retirens' when they need to go away.
>>>
>>> Then once per minute, delete the retirens and create a new one.
>>>
>>> -> This batches netdev deletions.
>>>
>>>> IIRC, in the past there was a patchset to allow passing a list of
>>>> ifindexes instead of a group number, but it never made its way upstream.
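Eric's "retirens" suggestion quoted above can be sketched as a short script. This is an illustration under assumptions, not something posted in the thread: the `run` and `retire_range` helpers and the `DRY_RUN` switch are hypothetical, the `vlan$i` names follow Martin's examples, and the sketch compresses Eric's once-per-minute cycle into a single create, move, delete pass. By default it only prints the commands; running them for real needs root.

```shell
#!/bin/sh
# Sketch: batch VLAN removal by moving devices into a throwaway netns
# and deleting the namespace in one go.
RETIRE_NS=${RETIRE_NS:-retirens}

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# retire_range FIRST LAST: move vlanFIRST..vlanLAST away, then drop them.
retire_range() {
    run ip netns add "$RETIRE_NS"
    for i in $(seq "$1" "$2"); do
        run ip link set dev "vlan$i" netns "$RETIRE_NS"
    done
    # Deleting the namespace is what batches the netdev deletions.
    run ip netns del "$RETIRE_NS"
}

retire_range 2 4
```

With `DRY_RUN=0` and a larger range, the per-device work is the move into the namespace, and the unregistration cost is paid once when the namespace goes away. Ido's group approach (`ip link del group 10`) aims at the same batching without a namespace.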
end of thread, other threads:[~2023-05-25 7:50 UTC | newest]

Thread overview: 16+ messages
2023-05-09  8:22 Very slow remove interface from kernel Martin Zaharinov
2023-05-09 10:20 ` Ido Schimmel
2023-05-09 10:32 ` Eric Dumazet
2023-05-09 11:10 ` Martin Zaharinov
2023-05-09 12:36 ` Eric Dumazet
2023-05-09 18:50 ` Martin Zaharinov
2023-05-09 20:08 ` Ido Schimmel
2023-05-09 20:16 ` Martin Zaharinov
2023-05-10  5:31 ` Martin Zaharinov
2023-05-10  6:06 ` Martin Zaharinov
2023-05-10  9:40 ` Eric Dumazet
2023-05-10 13:15 ` Martin Zaharinov
2023-05-25  7:50 ` Martin Zaharinov
2023-05-10  9:16 ` Martin Zaharinov
2023-05-10  9:22 ` Eric Dumazet
2023-05-09 20:08 ` Martin Zaharinov