* [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc
@ 2023-11-17 17:12 Pedro Tammela
2023-11-17 17:12 ` [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores Pedro Tammela
` (8 more replies)
0 siblings, 9 replies; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
Address the issues making tdc timeout on downstream CIs like lkp and
tuxsuite.
Pedro Tammela (6):
selftests: tc-testing: cap parallel tdc to 4 cores
selftests: tc-testing: move back to per test ns setup
selftests: tc-testing: use netns delete from pyroute2
selftests: tc-testing: leverage -all in suite ns teardown
selftests: tc-testing: timeout on unbounded loops
selftests: tc-testing: report number of workers in use
.../tc-testing/plugin-lib/nsPlugin.py | 98 +++++++++----------
tools/testing/selftests/tc-testing/tdc.py | 3 +-
2 files changed, 51 insertions(+), 50 deletions(-)
--
2.40.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:38 ` Simon Horman
2023-11-17 17:12 ` [PATCH net-next 2/6] selftests: tc-testing: move back to per test ns setup Pedro Tammela
` (7 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
We have observed a lot of lock contention and test instability when running
with more than 8 cores, enough to actually make the tests run slower than
with fewer cores.
Cap the maximum cores of parallel tdc to 4, which testing showed to be a
reasonable number for efficiency and stability across different kernel
config scenarios.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
tools/testing/selftests/tc-testing/tdc.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index a6718192aff3..f764b43f112b 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -1017,6 +1017,7 @@ def main():
parser = pm.call_add_args(parser)
(args, remaining) = parser.parse_known_args()
args.NAMES = NAMES
+ args.mp = min(args.mp, 4)
pm.set_args(args)
check_default_settings(args, remaining, pm)
if args.verbose > 2:
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
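To illustrate the cap in patch 1/6, here is a minimal sketch of clamping a requested worker count. The helper `cap_workers` and the constant `MAX_WORKERS` are hypothetical names used only for this example; the patch itself applies `min(args.mp, 4)` inline in `main()`.

```python
import argparse

MAX_WORKERS = 4  # ceiling taken from the patch; the constant name is hypothetical


def cap_workers(requested, limit=MAX_WORKERS):
    """Return the requested worker count, clamped to at most `limit`."""
    return min(requested, limit)


# Stand-in for the parsed tdc arguments; only the `mp` attribute matters here.
args = argparse.Namespace(mp=16)
args.mp = cap_workers(args.mp)
print(args.mp)
```

A request below the ceiling passes through unchanged, so small machines are not affected by the cap.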
* [PATCH net-next 2/6] selftests: tc-testing: move back to per test ns setup
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
2023-11-17 17:12 ` [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:38 ` Simon Horman
2023-11-17 17:12 ` [PATCH net-next 3/6] selftests: tc-testing: use netns delete from pyroute2 Pedro Tammela
` (6 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor, kernel test robot
Surprisingly, in kernel configs with most of the debug knobs turned on,
pre-allocating the test resources makes tdc run much slower overall than
when allocating resources on a per test basis.
As these knobs are used in kselftests in downstream CIs, let's go back
to the old way of doing things to avoid kselftests timeouts.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202311161129.3b45ed53-oliver.sang@intel.com
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
.../tc-testing/plugin-lib/nsPlugin.py | 68 +++++++------------
1 file changed, 25 insertions(+), 43 deletions(-)
diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 62974bd3a4a5..2b8cbfdf1083 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -17,44 +17,6 @@ except ImportError:
netlink = False
print("!!! Consider installing pyroute2 !!!")
-def prepare_suite(obj, test):
- original = obj.args.NAMES
-
- if 'skip' in test and test['skip'] == 'yes':
- return
-
- if 'nsPlugin' not in test['plugins']:
- return
-
- shadow = {}
- shadow['IP'] = original['IP']
- shadow['TC'] = original['TC']
- shadow['NS'] = '{}-{}'.format(original['NS'], test['random'])
- shadow['DEV0'] = '{}id{}'.format(original['DEV0'], test['id'])
- shadow['DEV1'] = '{}id{}'.format(original['DEV1'], test['id'])
- shadow['DUMMY'] = '{}id{}'.format(original['DUMMY'], test['id'])
- shadow['DEV2'] = original['DEV2']
- obj.args.NAMES = shadow
-
- if netlink == True:
- obj._nl_ns_create()
- else:
- obj._ns_create()
-
- # Make sure the netns is visible in the fs
- while True:
- obj._proc_check()
- try:
- ns = obj.args.NAMES['NS']
- f = open('/run/netns/{}'.format(ns))
- f.close()
- break
- except:
- time.sleep(0.1)
- continue
-
- obj.args.NAMES = original
-
class SubPlugin(TdcPlugin):
def __init__(self):
self.sub_class = 'ns/SubPlugin'
@@ -65,19 +27,39 @@ class SubPlugin(TdcPlugin):
super().pre_suite(testcount, testlist)
- print("Setting up namespaces and devices...")
+ def prepare_test(self, test):
+ if 'skip' in test and test['skip'] == 'yes':
+ return
- with Pool(self.args.mp) as p:
- it = zip(cycle([self]), testlist)
- p.starmap(prepare_suite, it)
+ if 'nsPlugin' not in test['plugins']:
+ return
- def pre_case(self, caseinfo, test_skip):
+ if netlink == True:
+ self._nl_ns_create()
+ else:
+ self._ns_create()
+
+ # Make sure the netns is visible in the fs
+ while True:
+ self._proc_check()
+ try:
+ ns = self.args.NAMES['NS']
+ f = open('/run/netns/{}'.format(ns))
+ f.close()
+ break
+ except:
+ time.sleep(0.1)
+ continue
+
+ def pre_case(self, test, test_skip):
if self.args.verbose:
print('{}.pre_case'.format(self.sub_class))
if test_skip:
return
+ self.prepare_test(test)
+
def post_case(self):
if self.args.verbose:
print('{}.post_case'.format(self.sub_class))
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next 3/6] selftests: tc-testing: use netns delete from pyroute2
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
2023-11-17 17:12 ` [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores Pedro Tammela
2023-11-17 17:12 ` [PATCH net-next 2/6] selftests: tc-testing: move back to per test ns setup Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:35 ` Simon Horman
2023-11-17 17:12 ` [PATCH net-next 4/6] selftests: tc-testing: leverage -all in suite ns teardown Pedro Tammela
` (5 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
When pyroute2 is available, use the native netns delete routine instead
of calling iproute2 to do it. As forks are expensive with some kernel
configs, minimize their use to avoid kselftests timeouts.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
.../testing/selftests/tc-testing/plugin-lib/nsPlugin.py | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 2b8cbfdf1083..920dcbedc395 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -64,7 +64,10 @@ class SubPlugin(TdcPlugin):
if self.args.verbose:
print('{}.post_case'.format(self.sub_class))
- self._ns_destroy()
+ if netlink == True:
+ self._nl_ns_destroy()
+ else:
+ self._ns_destroy()
def post_suite(self, index):
if self.args.verbose:
@@ -174,6 +177,10 @@ class SubPlugin(TdcPlugin):
'''
self._exec_cmd_batched('pre', self._ns_create_cmds())
+ def _nl_ns_destroy(self):
+ ns = self.args.NAMES['NS']
+ netns.remove(ns)
+
def _ns_destroy_cmd(self):
return self._replace_keywords('netns delete {}'.format(self.args.NAMES['NS']))
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
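The choice made in patch 3/6 can be sketched as a single helper that prefers the in-process pyroute2 call and falls back to spawning iproute2. This is only an illustration under assumptions: `destroy_netns` and `HAVE_NETLINK` are hypothetical names, the fallback invokes a plain `ip` binary rather than tdc's `$IP` keyword, and running it for real requires root and an existing namespace.

```python
import subprocess

try:
    from pyroute2 import netns  # optional dependency, as in nsPlugin.py
    HAVE_NETLINK = True
except ImportError:
    HAVE_NETLINK = False


def destroy_netns(name, use_netlink=HAVE_NETLINK):
    """Delete a named network namespace (hypothetical helper).

    With pyroute2 available the removal happens in-process; otherwise
    one `ip netns del` child process is spawned.
    """
    if use_netlink:
        netns.remove(name)
    else:
        subprocess.run(["ip", "netns", "del", name], check=False)
```

Keeping the decision in one place means callers such as `post_case` do not need to know which backend is in use.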
* [PATCH net-next 4/6] selftests: tc-testing: leverage -all in suite ns teardown
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (2 preceding siblings ...)
2023-11-17 17:12 ` [PATCH net-next 3/6] selftests: tc-testing: use netns delete from pyroute2 Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:39 ` Simon Horman
2023-11-17 17:12 ` [PATCH net-next 5/6] selftests: tc-testing: timeout on unbounded loops Pedro Tammela
` (4 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
Instead of listing lingering ns pinned files and deleting them one by one,
leverage '-all' from iproute2 to do it in a single process fork.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
.../testing/selftests/tc-testing/plugin-lib/nsPlugin.py | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 920dcbedc395..7b674befceec 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -74,13 +74,12 @@ class SubPlugin(TdcPlugin):
print('{}.post_suite'.format(self.sub_class))
# Make sure we don't leak resources
- for f in os.listdir('/run/netns/'):
- cmd = self._replace_keywords("$IP netns del {}".format(f))
+ cmd = self._replace_keywords("$IP -a netns del")
- if self.args.verbose > 3:
- print('_exec_cmd: command "{}"'.format(cmd))
+ if self.args.verbose > 3:
+ print('_exec_cmd: command "{}"'.format(cmd))
- subprocess.run(cmd, shell=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+ subprocess.run(cmd, shell=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
def adjust_command(self, stage, command):
super().adjust_command(stage, command)
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
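The saving described in patch 4/6 comes from the number of processes spawned. The sketch below only builds the command strings so the two approaches can be compared side by side; the function names are hypothetical and a plain `ip` binary is assumed in place of tdc's `$IP` keyword.

```python
def per_namespace_cmds(names, ip="ip"):
    """Old approach: one `ip netns del <name>` command (one fork) per entry."""
    return ["{} netns del {}".format(ip, name) for name in names]


def delete_all_cmd(ip="ip"):
    """New approach: a single command that deletes every named netns."""
    return "{} -a netns del".format(ip)


leftovers = ["tcut-1", "tcut-2", "tcut-3"]  # made-up namespace names
old = per_namespace_cmds(leftovers)
new = [delete_all_cmd()]
print(len(old), len(new))
```

With N lingering namespaces the old loop spawns N processes, while the `-a` form always spawns one.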
* [PATCH net-next 5/6] selftests: tc-testing: timeout on unbounded loops
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (3 preceding siblings ...)
2023-11-17 17:12 ` [PATCH net-next 4/6] selftests: tc-testing: leverage -all in suite ns teardown Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:35 ` Simon Horman
2023-11-17 17:12 ` [PATCH net-next 6/6] selftests: tc-testing: report number of workers in use Pedro Tammela
` (3 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
In the spirit of failing early, timeout on unbounded loops that take
longer than 20 ticks to complete. Such loops are to ensure that objects
created are already visible so tests can proceed without any issues.
If a test setup takes more than 20 ticks to see an object, there's
definetely something wrong.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
.../selftests/tc-testing/plugin-lib/nsPlugin.py | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 7b674befceec..65c8f3f983b9 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -40,7 +40,10 @@ class SubPlugin(TdcPlugin):
self._ns_create()
# Make sure the netns is visible in the fs
+ ticks = 20
while True:
+ if ticks == 0:
+ raise TimeoutError
self._proc_check()
try:
ns = self.args.NAMES['NS']
@@ -49,6 +52,7 @@ class SubPlugin(TdcPlugin):
break
except:
time.sleep(0.1)
+ ticks -= 1
continue
def pre_case(self, test, test_skip):
@@ -127,7 +131,10 @@ class SubPlugin(TdcPlugin):
with IPRoute() as ip:
ip.link('add', ifname=dev1, kind='veth', peer={'ifname': dev0, 'net_ns_fd':'/proc/1/ns/net'})
ip.link('add', ifname=dummy, kind='dummy')
+ ticks = 20
while True:
+ if ticks == 0:
+ raise TimeoutError
try:
dev1_idx = ip.link_lookup(ifname=dev1)[0]
dummy_idx = ip.link_lookup(ifname=dummy)[0]
@@ -136,17 +143,22 @@ class SubPlugin(TdcPlugin):
break
except:
time.sleep(0.1)
+ ticks -= 1
continue
netns.popns()
with IPRoute() as ip:
+ ticks = 20
while True:
+ if ticks == 0:
+ raise TimeoutError
try:
dev0_idx = ip.link_lookup(ifname=dev0)[0]
ip.link('set', index=dev0_idx, state='up')
break
except:
time.sleep(0.1)
+ ticks -= 1
continue
def _ns_create_cmds(self):
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
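The bounded-retry pattern repeated three times in patch 5/6 can be expressed as one generic helper. This is a sketch, not code from the series: `wait_until` is a hypothetical name, and it catches `Exception` rather than using the bare `except:` seen in the diff. With the patch's values (20 ticks, 0.1 s sleep) the wait is bounded at roughly two seconds.

```python
import time


def wait_until(check, ticks=20, interval=0.1):
    """Call `check` until it stops raising; give up after `ticks` failures."""
    while True:
        if ticks == 0:
            raise TimeoutError("object did not become visible in time")
        try:
            return check()
        except Exception:
            time.sleep(interval)
            ticks -= 1


# Usage: a check that fails twice before succeeding.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise FileNotFoundError("not visible yet")
    return "ready"


result = wait_until(flaky, interval=0.0)
print(result, len(attempts))
```

Raising `TimeoutError` outside the `try` block matters: it is a subclass of `OSError`, so raising it inside would be swallowed by the retry handler.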
* [PATCH net-next 6/6] selftests: tc-testing: report number of workers in use
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (4 preceding siblings ...)
2023-11-17 17:12 ` [PATCH net-next 5/6] selftests: tc-testing: timeout on unbounded loops Pedro Tammela
@ 2023-11-17 17:12 ` Pedro Tammela
2023-11-20 17:40 ` Simon Horman
2023-11-17 20:47 ` [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Jamal Hadi Salim
` (2 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Pedro Tammela @ 2023-11-17 17:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni, shuah,
pctammela, victor
Report the number of workers in use to process the test batches.
Since the number is now subject to a limit, avoid users getting
confused.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
---
tools/testing/selftests/tc-testing/tdc.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index f764b43f112b..669ec89ebfe1 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -616,7 +616,7 @@ def test_runner_mp(pm, args, alltests):
batches.insert(0, serial)
print("Executing {} tests in parallel and {} in serial".format(len(parallel), len(serial)))
- print("Using {} batches".format(len(batches)))
+ print("Using {} batches and {} workers".format(len(batches), args.mp))
# We can't pickle these objects so workaround them
global mp_pm
--
2.40.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
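For reference, the message changed in patch 6/6 renders as follows. The batch contents and the requested worker count are made-up values for illustration; only the format string comes from the diff.

```python
batches = [["t1"], ["t2", "t3"], ["t4"]]  # made-up test batches
workers = min(16, 4)  # a request for 16 workers, capped as in patch 1/6

msg = "Using {} batches and {} workers".format(len(batches), workers)
print(msg)
```

Printing the effective worker count makes the cap visible to a user who asked for more workers than the limit allows.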
* Re: [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (5 preceding siblings ...)
2023-11-17 17:12 ` [PATCH net-next 6/6] selftests: tc-testing: report number of workers in use Pedro Tammela
@ 2023-11-17 20:47 ` Jamal Hadi Salim
2023-11-21 2:07 ` Jakub Kicinski
2023-11-21 2:10 ` patchwork-bot+netdevbpf
8 siblings, 0 replies; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-11-17 20:47 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 12:12 PM Pedro Tammela <pctammela@mojatatu.com> wrote:
>
> Address the issues making tdc timeout on downstream CIs like lkp and
> tuxsuite.
>
> Pedro Tammela (6):
> selftests: tc-testing: cap parallel tdc to 4 cores
> selftests: tc-testing: move back to per test ns setup
> selftests: tc-testing: use netns delete from pyroute2
> selftests: tc-testing: leverage -all in suite ns teardown
> selftests: tc-testing: timeout on unbounded loops
> selftests: tc-testing: report number of workers in use
>
> .../tc-testing/plugin-lib/nsPlugin.py | 98 +++++++++----------
> tools/testing/selftests/tc-testing/tdc.py | 3 +-
> 2 files changed, 51 insertions(+), 50 deletions(-)
For the series:
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 3/6] selftests: tc-testing: use netns delete from pyroute2
2023-11-17 17:12 ` [PATCH net-next 3/6] selftests: tc-testing: use netns delete from pyroute2 Pedro Tammela
@ 2023-11-20 17:35 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:35 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 02:12:05PM -0300, Pedro Tammela wrote:
> When pyroute2 is available, use the native netns delete routine instead
> of calling iproute2 to do it. As forks are expensive with some kernel
> configs, minimize their use to avoid kselftests timeouts.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
I have a suggestion for a follow up below, but this change looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
> ---
> .../testing/selftests/tc-testing/plugin-lib/nsPlugin.py | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
> index 2b8cbfdf1083..920dcbedc395 100644
> --- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
> +++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
> @@ -64,7 +64,10 @@ class SubPlugin(TdcPlugin):
> if self.args.verbose:
> print('{}.post_case'.format(self.sub_class))
>
> - self._ns_destroy()
> + if netlink == True:
> + self._nl_ns_destroy()
> + else:
> + self._ns_destroy()
As an aside, I think it would be good to rename _ns_* to
_iproute2_ns_* or similar, to make the distinction with _nl_ns_* clearer.
>
> def post_suite(self, index):
> if self.args.verbose:
> @@ -174,6 +177,10 @@ class SubPlugin(TdcPlugin):
> '''
> self._exec_cmd_batched('pre', self._ns_create_cmds())
>
> + def _nl_ns_destroy(self):
> + ns = self.args.NAMES['NS']
> + netns.remove(ns)
> +
> def _ns_destroy_cmd(self):
> return self._replace_keywords('netns delete {}'.format(self.args.NAMES['NS']))
>
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 5/6] selftests: tc-testing: timeout on unbounded loops
2023-11-17 17:12 ` [PATCH net-next 5/6] selftests: tc-testing: timeout on unbounded loops Pedro Tammela
@ 2023-11-20 17:35 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:35 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 02:12:07PM -0300, Pedro Tammela wrote:
> In the spirit of failing early, timeout on unbounded loops that take
> longer than 20 ticks to complete. Such loops are to ensure that objects
> created are already visible so tests can proceed without any issues.
>
> If a test setup takes more than 20 ticks to see an object, there's
> definetely something wrong.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Hi Pedro,
no need to respin because of this, but 'definitely' is misspelt above.
Moving on, I am very pleased to see these loops become bounded in time.
So the above nit notwithstanding,
Reviewed-by: Simon Horman <horms@kernel.org>
...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores
2023-11-17 17:12 ` [PATCH net-next 1/6] selftests: tc-testing: cap parallel tdc to 4 cores Pedro Tammela
@ 2023-11-20 17:38 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:38 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 02:12:03PM -0300, Pedro Tammela wrote:
> We have observed a lot of lock contention and test instability when running
> with more than 8 cores, enough to actually make the tests run slower than
> with fewer cores.
>
> Cap the maximum cores of parallel tdc to 4, which testing showed to be a
> reasonable number for efficiency and stability across different kernel
> config scenarios.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Hi Pedro,
This limit seems a bit unfortunate, because it seems dependent on
hardware and software details that are subject to change. Meanwhile
this patch will be long since forgotten. But, OTOH, I can't think of
a better idea at this time, so:
Reviewed-by: Simon Horman <horms@kernel.org>
...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 2/6] selftests: tc-testing: move back to per test ns setup
2023-11-17 17:12 ` [PATCH net-next 2/6] selftests: tc-testing: move back to per test ns setup Pedro Tammela
@ 2023-11-20 17:38 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:38 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor, kernel test robot
On Fri, Nov 17, 2023 at 02:12:04PM -0300, Pedro Tammela wrote:
> Surprisingly, in kernel configs with most of the debug knobs turned on,
> pre-allocating the test resources makes tdc run much slower overall than
> when allocating resources on a per test basis.
>
> As these knobs are used in kselftests in downstream CIs, let's go back
> to the old way of doing things to avoid kselftests timeouts.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202311161129.3b45ed53-oliver.sang@intel.com
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 4/6] selftests: tc-testing: leverage -all in suite ns teardown
2023-11-17 17:12 ` [PATCH net-next 4/6] selftests: tc-testing: leverage -all in suite ns teardown Pedro Tammela
@ 2023-11-20 17:39 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:39 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 02:12:06PM -0300, Pedro Tammela wrote:
> Instead of listing lingering ns pinned files and deleting them one by one,
> leverage '-all' from iproute2 to do it in a single process fork.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 6/6] selftests: tc-testing: report number of workers in use
2023-11-17 17:12 ` [PATCH net-next 6/6] selftests: tc-testing: report number of workers in use Pedro Tammela
@ 2023-11-20 17:40 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2023-11-20 17:40 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
On Fri, Nov 17, 2023 at 02:12:08PM -0300, Pedro Tammela wrote:
> Report the number of workers in use to process the test batches.
> Since the number is now subject to a limit, avoid users getting
> confused.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (6 preceding siblings ...)
2023-11-17 20:47 ` [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Jamal Hadi Salim
@ 2023-11-21 2:07 ` Jakub Kicinski
2023-11-21 2:10 ` patchwork-bot+netdevbpf
8 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2023-11-21 2:07 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, pabeni, shuah,
victor
On Fri, 17 Nov 2023 14:12:02 -0300 Pedro Tammela wrote:
> Address the issues making tdc timeout on downstream CIs like lkp and
> tuxsuite.
Please do CC linux-kselftest@vger.kernel.org in the future.
Perhaps someone wants to read all selftests coming into the kernel.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc
2023-11-17 17:12 [PATCH net-next 0/6] selftests: tc-testing: more updates to tdc Pedro Tammela
` (7 preceding siblings ...)
2023-11-21 2:07 ` Jakub Kicinski
@ 2023-11-21 2:10 ` patchwork-bot+netdevbpf
8 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-11-21 2:10 UTC (permalink / raw)
To: Pedro Tammela
Cc: netdev, jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
shuah, victor
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 17 Nov 2023 14:12:02 -0300 you wrote:
> Address the issues making tdc timeout on downstream CIs like lkp and
> tuxsuite.
>
> Pedro Tammela (6):
> selftests: tc-testing: cap parallel tdc to 4 cores
> selftests: tc-testing: move back to per test ns setup
> selftests: tc-testing: use netns delete from pyroute2
> selftests: tc-testing: leverage -all in suite ns teardown
> selftests: tc-testing: timeout on unbounded loops
> selftests: tc-testing: report number of workers in use
>
> [...]
Here is the summary with links:
- [net-next,1/6] selftests: tc-testing: cap parallel tdc to 4 cores
https://git.kernel.org/netdev/net-next/c/025de7b6a6dd
- [net-next,2/6] selftests: tc-testing: move back to per test ns setup
https://git.kernel.org/netdev/net-next/c/50a5988a7a54
- [net-next,3/6] selftests: tc-testing: use netns delete from pyroute2
https://git.kernel.org/netdev/net-next/c/3d5026fc5adb
- [net-next,4/6] selftests: tc-testing: leverage -all in suite ns teardown
https://git.kernel.org/netdev/net-next/c/3f2d94a4ff48
- [net-next,5/6] selftests: tc-testing: timeout on unbounded loops
https://git.kernel.org/netdev/net-next/c/4b480cfb1066
- [net-next,6/6] selftests: tc-testing: report number of workers in use
https://git.kernel.org/netdev/net-next/c/4968afa0143d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 16+ messages in thread