Netdev List


* Re: [PATCH v2 0/2] neighbour: fix possible DoS due to net iface start/stop loop
From: Alexander Mikhalitsyn @ 2022-08-11 14:57 UTC
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Daniel Borkmann, David Ahern, Yajun Deng, Roopa Prabhu,
	Christian Brauner, linux-kernel, Denis V . Lunev,
	Alexey Kuznetsov, Konstantin Khorenko, Pavel Tikhomirov,
	Andrey Zhadchenko, kernel, devel
In-Reply-To: <20220811075346.22699ece@kernel.org>

On Thu, Aug 11, 2022 at 5:53 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 11 Aug 2022 17:51:32 +0300 Alexander Mikhalitsyn wrote:
> > On Thu, Aug 11, 2022 at 5:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Wed, 10 Aug 2022 19:08:38 +0300 Alexander Mikhalitsyn wrote:
> > > >  include/net/neighbour.h |  1 +
> > > >  net/core/neighbour.c    | 46 +++++++++++++++++++++++++++++++++--------
> > > >  2 files changed, 38 insertions(+), 9 deletions(-)
> > >
> > > Which tree are these based on? They don't seem to apply cleanly
> >
> > It's based on the 5.19 tree, but I can easily resend it based on net-next.
>
> netdev/net would be the most appropriate tree for a fix.
> Not that it differs much from net-next at this stage of
> the merge window.

Yes, thanks Jakub. I'm a newbie here ;) Sorry for the inconvenience.

Will rebase and send patches soon.


* Re: pull-request: bpf 2022-08-10
From: Jakub Kicinski @ 2022-08-11 14:54 UTC
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, David S. Miller, Paolo Abeni, Eric Dumazet,
	Alexei Starovoitov, Andrii Nakryiko, Network Development, bpf
In-Reply-To: <CAADnVQK589CZN1Q9w8huJqkEyEed+ZMTWqcpA1Rm2CjN3a4XoQ@mail.gmail.com>

On Thu, 11 Aug 2022 00:06:52 -0700 Alexei Starovoitov wrote:
> Yeah. It is intentional.
> We used all sorts of hacks to shut up this pointless warning.
> Just grep for __diag_ignore_all("-Wmissing-prototypes
> in two files already.
> Here I've opted for the explicit hack and the comment.
> Pushed this fix to bpf tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=4e4588f1c4d2e67c993208f0550ef3fae33abce4
> 
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Great, pulled in. Thanks!!


* Re: [PATCH v2 0/2] neighbour: fix possible DoS due to net iface start/stop loop
From: Jakub Kicinski @ 2022-08-11 14:53 UTC
  To: Alexander Mikhalitsyn
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Daniel Borkmann, David Ahern, Yajun Deng, Roopa Prabhu,
	Christian Brauner, linux-kernel, Denis V . Lunev,
	Alexey Kuznetsov, Konstantin Khorenko, Pavel Tikhomirov,
	Andrey Zhadchenko, kernel, devel
In-Reply-To: <CAJqdLrq6D+w=H_9t8A7s0c96GyitHFTnY0a2QvUrVeuxaUdtAQ@mail.gmail.com>

On Thu, 11 Aug 2022 17:51:32 +0300 Alexander Mikhalitsyn wrote:
> On Thu, Aug 11, 2022 at 5:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Wed, 10 Aug 2022 19:08:38 +0300 Alexander Mikhalitsyn wrote:  
> > >  include/net/neighbour.h |  1 +
> > >  net/core/neighbour.c    | 46 +++++++++++++++++++++++++++++++++--------
> > >  2 files changed, 38 insertions(+), 9 deletions(-)  
> >
> > Which tree are these based on? They don't seem to apply cleanly  
> 
> It's based on the 5.19 tree, but I can easily resend it based on net-next.

netdev/net would be the most appropriate tree for a fix.
Not that it differs much from net-next at this stage of 
the merge window.


* Re: [PATCH v2 0/2] neighbour: fix possible DoS due to net iface start/stop loop
From: Alexander Mikhalitsyn @ 2022-08-11 14:51 UTC
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Daniel Borkmann, David Ahern, Yajun Deng, Roopa Prabhu,
	Christian Brauner, linux-kernel, Denis V . Lunev,
	Alexey Kuznetsov, Konstantin Khorenko, Pavel Tikhomirov,
	Andrey Zhadchenko, kernel, devel
In-Reply-To: <20220811074630.4784fe6e@kernel.org>

Hi, Jakub

On Thu, Aug 11, 2022 at 5:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 10 Aug 2022 19:08:38 +0300 Alexander Mikhalitsyn wrote:
> >  include/net/neighbour.h |  1 +
> >  net/core/neighbour.c    | 46 +++++++++++++++++++++++++++++++++--------
> >  2 files changed, 38 insertions(+), 9 deletions(-)
>
> Which tree are these based on? They don't seem to apply cleanly

It's based on the 5.19 tree, but I can easily resend it based on net-next.

Regards,
Alex


* Re: [PATCH v2 0/2] neighbour: fix possible DoS due to net iface start/stop loop
From: Jakub Kicinski @ 2022-08-11 14:46 UTC
  To: Alexander Mikhalitsyn
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Daniel Borkmann, David Ahern, Yajun Deng, Roopa Prabhu,
	Christian Brauner, linux-kernel, Denis V . Lunev,
	Alexey Kuznetsov, Konstantin Khorenko, Pavel Tikhomirov,
	Andrey Zhadchenko, Alexander Mikhalitsyn, kernel, devel
In-Reply-To: <20220810160840.311628-1-alexander.mikhalitsyn@virtuozzo.com>

On Wed, 10 Aug 2022 19:08:38 +0300 Alexander Mikhalitsyn wrote:
>  include/net/neighbour.h |  1 +
>  net/core/neighbour.c    | 46 +++++++++++++++++++++++++++++++++--------
>  2 files changed, 38 insertions(+), 9 deletions(-)

Which tree are these based on? They don't seem to apply cleanly


* RE: [PATCH bpf-next v4] selftests: xsk: Update poll test cases
From: Koikkara Reeny, Shibin @ 2022-08-11 14:45 UTC
  To: Daniel Borkmann, Fijalkowski, Maciej
  Cc: bpf@vger.kernel.org, ast@kernel.org, netdev@vger.kernel.org,
	Karlsson, Magnus, bjorn@kernel.org, kuba@kernel.org,
	andrii@kernel.org, Loftus, Ciara
In-Reply-To: <36e36fb2-948f-84c7-0d3b-d97e76373dfa@iogearbox.net>



> -----Original Message-----
> From: Daniel Borkmann <daniel@iogearbox.net>
> Sent: Wednesday, August 10, 2022 4:18 PM
> To: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Koikkara Reeny,
> Shibin <shibin.koikkara.reeny@intel.com>
> Cc: bpf@vger.kernel.org; ast@kernel.org; netdev@vger.kernel.org; Karlsson,
> Magnus <magnus.karlsson@intel.com>; bjorn@kernel.org;
> kuba@kernel.org; andrii@kernel.org; Loftus, Ciara <ciara.loftus@intel.com>
> Subject: Re: [PATCH bpf-next v4] selftests: xsk: Update poll test cases
> 
> On 8/10/22 3:08 PM, Maciej Fijalkowski wrote:
> > On Wed, Aug 03, 2022 at 02:43:54PM +0000, Shibin Koikkara Reeny wrote:
> >> Poll test case was not testing all the functionality of the poll
> >> feature in the testsuite. This patch update the poll test case which
> >> contain 2 testcases to
> >
> > updates, contains, test cases
> >
> >> test the RX and the TX poll functionality and additional
> >> 2 more testcases to check the timeout features of the
> >
> > feature
> >
> >> poll event.
> >>
> >> Poll testsuite have 4 test cases:
> >
> > test suite, has
> >
> >>
> >> 1. TEST_TYPE_RX_POLL:
> >> Check if RX path POLLIN function work as expect. TX path
> >
> > works
> >
> >> can use any method to sent the traffic.
> >
> > send
> >
> >>
> >> 2. TEST_TYPE_TX_POLL:
> >> Check if TX path POLLOUT function work as expect. RX path can use any
> >> method to receive the traffic.
> >>
> >> 3. TEST_TYPE_POLL_RXQ_EMPTY:
> >> Call poll function with parameter POLLIN on empty rx queue will cause
> >> timeout.If return timeout then test case is pass.
> >
> > space after dot
> >
> >>
> >> 4. TEST_TYPE_POLL_TXQ_FULL:
> >> When txq is filled and packets are not cleaned by the kernel then if
> >> we invoke the poll function with POLLOUT then it should trigger
> >> timeout.
> >>
> >> v1:
> >> https://lore.kernel.org/bpf/20220718095712.588513-1-shibin.koikkara.reeny@intel.com/
> >> v2:
> >> https://lore.kernel.org/bpf/20220726101723.250746-1-shibin.koikkara.reeny@intel.com/
> >> v3:
> >> https://lore.kernel.org/bpf/20220729132337.211443-1-shibin.koikkara.reeny@intel.com/
> >>
> >> Changes in v2:
> >>   * Updated the commit message
> >>   * fixed the while loop flow in receive_pkts function.
> >> Changes in v3:
> >>   * Introduced single thread validation function.
> >>   * Removed pkt_stream_invalid().
> >>   * Updated TEST_TYPE_POLL_TXQ_FULL testcase to create invalid frame.
> >>   * Removed timer from send_pkts().
> >>   * Removed boolean variable skip_rx and skip_tx.
> >> Change in v4:
> >>   * Added is_umem_valid()
> >
> > for single patches, I believe that it's considered better practice to
> > include the versioning below the '---' line?
> >
> >>
> >> Signed-off-by: Shibin Koikkara Reeny
> >> <shibin.koikkara.reeny@intel.com>
> >> ---
> >>   tools/testing/selftests/bpf/xskxceiver.c | 166 +++++++++++++++++------
> >>   tools/testing/selftests/bpf/xskxceiver.h |   8 +-
> >>   2 files changed, 134 insertions(+), 40 deletions(-)
> >
> > I don't think these grammar suggestions require a new revision, so:
> > Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> 
> I cleaned these up while applying. Shibin, please take care of this before
> sending out next time, thanks guys!

Thank you Daniel. Appreciate it. 😊


* Re: [PATCH] fec: Restart PPS after link state change
From: Csókás Bence @ 2022-08-11 14:45 UTC
  To: Andrew Lunn; +Cc: netdev, Richard Cochran, Fugang Duan
In-Reply-To: <YvUEgKl6fpHwMwuS@lunn.ch>

On 2022. 08. 11. 15:30, Andrew Lunn wrote:
>> `fep->pps_enable` is the state of the PPS the driver *believes* to be the
>> case. After a reset, this belief may or may not be true anymore: if the
>> driver believed formerly that the PPS is down, then after a reset, its
>> belief will still be correct, thus nothing needs to be done about the
>> situation. If, however, the driver thought that PPS was up, after controller
>> reset, it no longer holds, so we need to update our world-view
>> (`fep->pps_enable = 0;`), and then correct for the fact that PPS just
>> unexpectedly stopped.
> 
> Your way of doing it just seems very unclean. I would make
> fec_ptp_enable_pps() read the actual status from the
> hardware. fep->pps_enable then has the clear meaning of user space
> requested it should be enabled.

1. It is not "my way", it is how it was in the original code. I am 
merely following those who came before me.
2. There is already a variable which holds userspace's wish: parameter 
`uint enable` in `fec_ptp_enable_pps()`. `fep->pps_enable` is whether 
the driver already fulfilled this wish.
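
To make the distinction between the two points above concrete, here is a
minimal userspace sketch. It is not the driver's code: `struct fake_fec`,
`hw_pps_running`, `fake_enable_pps()`, `fake_reset()` and `fake_restart()`
are hypothetical stand-ins invented for illustration; only the `pps_enable`
field name and the `enable` parameter come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller; not the real fec driver. */
struct fake_fec {
	bool hw_pps_running;	/* what the hardware is actually doing */
	int pps_enable;		/* what the driver believes it is doing */
};

/* Early return when the driver's belief already matches the request. */
static int fake_enable_pps(struct fake_fec *fep, unsigned int enable)
{
	assert(fep);
	if (fep->pps_enable == (int)!!enable)
		return 0;
	fep->hw_pps_running = enable;
	fep->pps_enable = !!enable;
	return 0;
}

/* A controller reset silently stops PPS in hardware. */
static void fake_reset(struct fake_fec *fep)
{
	fep->hw_pps_running = false;
}

/* Reset, then correct the stale belief and re-enable PPS if it was up. */
static void fake_restart(struct fake_fec *fep)
{
	int was_enabled = fep->pps_enable;

	fake_reset(fep);
	if (was_enabled) {
		fep->pps_enable = 0;
		fake_enable_pps(fep, 1);
	}
}
```

In this model, calling `fake_enable_pps(fep, 1)` right after a bare
`fake_reset()` is a no-op because the belief is stale, which is why the
belief has to be cleared before re-enabling.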

> 
> 	  Andrew

Honestly, I would rather see the entire `fec` driver re-written from 
scratch, it is really bad code and full of bugs. Plus, Fugang Duan's 
mail server keeps bouncing back all my emails (I can only hope he sees 
these mails through the mailing list). However, that exceeds my 
capabilities unfortunately (I know not nearly enough of the various 
fec-based controllers and their internals, I only have the i.MX6 TX6UL 
to test). So the best I can do is provide fixes to the bugs we 
experienced, while changing as little of the original driver's code as 
possible, so as to (hopefully) not introduce even more bugs.

Bence


* Re: [PATCH] dpaa2-eth: trace the allocated address instead of page struct
From: Ioana Ciornei @ 2022-08-11 14:41 UTC
  To: Chen Lin
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yi Liu, Chen Lin
In-Reply-To: <20220810232948.40636-1-chen45464546@163.com>

On Thu, Aug 11, 2022 at 07:29:48AM +0800, Chen Lin wrote:
> Following commit 27c874867c4 ("dpaa2-eth: Use a single page per Rx buffer"),
> we should trace the allocated address instead of the page struct.
> 
> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>

Is this intended for the net tree? If so, it would be a good idea to add a
Fixes: tag pointing at the commit that you are already referencing.

Ioana


* Re: [PATCH net-next] net: skb: prevent the split of kfree_skb_reason() by gcc
From: Miguel Ojeda @ 2022-08-11 14:35 UTC
  To: menglong8.dong
  Cc: kuba, ojeda, ndesaulniers, davem, edumazet, pabeni, asml.silence,
	imagedong, luiz.von.dentz, vasily.averin, jk, linux-kernel,
	netdev
In-Reply-To: <CANiq72=Eq1265hYhEVTGuh-_ZW+3HjWkwaktEfs7H7yPERfO0w@mail.gmail.com>

On Thu, Aug 11, 2022 at 4:34 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> Two notes on this: please use the double underscore form:
> `__optimize__` and keep the file sorted (it should go after
> `__overloadable__`, since we sort by the actual attribute name).

s/after/before

Cheers,
Miguel


* Re: [PATCH net-next] net: skb: prevent the split of kfree_skb_reason() by gcc
From: Miguel Ojeda @ 2022-08-11 14:34 UTC
  To: menglong8.dong
  Cc: kuba, ojeda, ndesaulniers, davem, edumazet, pabeni, asml.silence,
	imagedong, luiz.von.dentz, vasily.averin, jk, linux-kernel,
	netdev
In-Reply-To: <20220811120708.34912-1-imagedong@tencent.com>

On Thu, Aug 11, 2022 at 2:07 PM <menglong8.dong@gmail.com> wrote:
>
> diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> index 445e80517cab..51f7c13bca98 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -371,4 +371,6 @@
>   */
>  #define __weak                          __attribute__((__weak__))
>
> +#define __nofnsplit                     __attribute__((optimize("O1")))
> +
>  #endif /* __LINUX_COMPILER_ATTRIBUTES_H */

Two notes on this: please use the double underscore form:
`__optimize__` and keep the file sorted (it should go after
`__overloadable__`, since we sort by the actual attribute name).
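
As a rough illustration of the double-underscore spelling, the sketch below
defines the macro in a standalone file. It assumes GCC; the `#else` fallback
and the `add_one()` helper are hypothetical additions for the example and are
not part of the proposed patch. Alphabetically, `__optimize__` sorts before
`__overloadable__`.

```c
#include <assert.h>

/*
 * The double-underscore form cannot collide with a macro that happens
 * to be named "optimize". Clang does not implement this attribute, so
 * the example compiles it away there (an assumption for portability).
 */
#if defined(__GNUC__) && !defined(__clang__)
# define __nofnsplit	__attribute__((__optimize__("O1")))
#else
# define __nofnsplit
#endif

/* Hypothetical function carrying the attribute. */
static __nofnsplit int add_one(int x)
{
	return x + 1;
}
```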

Thanks!

Cheers,
Miguel


* Re: [PATCH 2/2] net/mlx5e: Leverage sched_numa_hop_mask()
From: Valentin Schneider @ 2022-08-11 14:26 UTC
  To: Tariq Toukan, netdev, linux-kernel, Jakub Kicinski
  Cc: Tariq Toukan, David S. Miller, Saeed Mahameed, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Eric Dumazet, Paolo Abeni,
	Gal Pressman, Vincent Guittot
In-Reply-To: <8448dade-a64a-0b6b-1ed0-dd164917eedf@gmail.com>

On 10/08/22 15:57, Tariq Toukan wrote:
> On 8/10/2022 1:51 PM, Valentin Schneider wrote:
>> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 16 ++++++++++++++--
>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>
>
> Missing description.
>
> I had a very detailed description with performance numbers and an
> affinity hints example with before/after tables. I don't want to get
> them lost.
>

Me neither! This here is just a stand-in to show how the interface would be
used, I'd much rather have someone who actually knows the code and can
easily test it do it :)

>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> index 229728c80233..2eb4ffd96a95 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> @@ -809,9 +809,12 @@ static void comp_irqs_release(struct mlx5_core_dev *dev)
>>   static int comp_irqs_request(struct mlx5_core_dev *dev)
>>   {
>>      struct mlx5_eq_table *table = dev->priv.eq_table;
>> +	const struct cpumask *mask;
>>      int ncomp_eqs = table->num_comp_eqs;
>> +	int hops = 0;
>>      u16 *cpus;
>>      int ret;
>> +	int cpu;
>>      int i;
>>
>>      ncomp_eqs = table->num_comp_eqs;
>> @@ -830,8 +833,17 @@ static int comp_irqs_request(struct mlx5_core_dev *dev)
>>              ret = -ENOMEM;
>>              goto free_irqs;
>>      }
>> -	for (i = 0; i < ncomp_eqs; i++)
>> -		cpus[i] = cpumask_local_spread(i, dev->priv.numa_node);
>> +
>> +	rcu_read_lock();
>> +	for_each_numa_hop_mask(dev->priv.numa_node, hops, mask) {
>
> We don't really use this 'hops' iterator. We always pass 0 (not a
> valuable input...), and we do not care about its final value. Probably
> it's best to hide it from the user into the macro.
>

That's a very valid point. After a lot of mulling around, I've found some
way to hide it away in a macro, but it's not pretty :-) cf. other email.



* Re: [PATCH 1/2] sched/topology: Introduce sched_numa_hop_mask()
From: Valentin Schneider @ 2022-08-11 14:26 UTC
  To: Tariq Toukan, netdev, linux-kernel
  Cc: Tariq Toukan, David S. Miller, Saeed Mahameed, Jakub Kicinski,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Eric Dumazet,
	Paolo Abeni, Gal Pressman, Vincent Guittot
In-Reply-To: <03aaf512-3ac5-fdfe-da2d-3fecd24591e2@gmail.com>

On 10/08/22 15:57, Tariq Toukan wrote:
> On 8/10/2022 3:42 PM, Tariq Toukan wrote:
>>
>>
>> On 8/10/2022 1:51 PM, Valentin Schneider wrote:
>>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>>> index 8739c2a5a54e..f0236a0ae65c 100644
>>> --- a/kernel/sched/topology.c
>>> +++ b/kernel/sched/topology.c
>>> @@ -2067,6 +2067,34 @@ int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
>>>       return found;
>>>   }
>>> +/**
>>> + * sched_numa_hop_mask() - Get the cpumask of CPUs at most @hops hops away.
>>> + * @node: The node to count hops from.
>>> + * @hops: Include CPUs up to that many hops away. 0 means local node.
>
> AFAIU, here you work with a specific level/num of hops, description is
> not accurate.
>

Hmph, unfortunately it's the other way around - the masks do include CPUs
*up to* a number of hops, but in my mlx5 example I've used it as if it only
included CPUs a specific distance away :/

As things stand we'd need a temporary cpumask to account for which CPUs we
have visited (which is what you had in your original submission), but with
a for_each_cpu_andnot() we don't need any of that.

Below is what I ended up with. I've tested it on a range of NUMA topologies
and it behaves as I'd expect, and on the plus side the code required in the
driver side is even simpler than before.

If you don't have major gripes with it, I'll shape that into a proper
series and will let you handle the mlx5/enic bits.
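
To show the andnot idea in isolation, here is a userspace toy, assuming at
most 64 CPUs held in a `uint64_t`. The function `spread_by_distance()` and
the `masks` array are hypothetical stand-ins for sched_numa_hop_mask(); they
only mimic the property that each hop level is a superset of the previous
one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * masks[h] holds every CPU within h hops of some node, so each mask is a
 * superset of the previous one. Fill cpus[] with up to 'want' CPU ids in
 * order of increasing distance, visiting each CPU at most once.
 */
static int spread_by_distance(const uint64_t *masks, int levels,
			      int *cpus, int want)
{
	uint64_t prev = 0;	/* plays the role of cpu_none_mask at hops == 0 */
	int n = 0;

	assert(masks && cpus);
	for (int h = 0; h < levels && n < want; h++) {
		uint64_t ring = masks[h] & ~prev;	/* the "andnot" step */

		for (int cpu = 0; cpu < 64 && n < want; cpu++)
			if (ring & (UINT64_C(1) << cpu))
				cpus[n++] = cpu;
		prev = masks[h];
	}
	return n;
}
```

With masks of 0x30, 0x3c and 0xff (local CPUs 4-5, then 2-5, then 0-7), the
CPUs come out as 4, 5, 2, 3, 0, 1, 6, 7, with no temporary "visited" mask.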

---

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 229728c80233..0a5432903edd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -812,6 +812,7 @@ static int comp_irqs_request(struct mlx5_core_dev *dev)
 	int ncomp_eqs = table->num_comp_eqs;
 	u16 *cpus;
 	int ret;
+	int cpu;
 	int i;
 
 	ncomp_eqs = table->num_comp_eqs;
@@ -830,8 +831,16 @@ static int comp_irqs_request(struct mlx5_core_dev *dev)
 		ret = -ENOMEM;
 		goto free_irqs;
 	}
-	for (i = 0; i < ncomp_eqs; i++)
-		cpus[i] = cpumask_local_spread(i, dev->priv.numa_node);
+
+	i = 0;
+	rcu_read_lock();
+	for_each_numa_hop_cpu(cpu, dev->priv.numa_node) {
+		cpus[i] = cpu;
+		if (++i == ncomp_eqs)
+			goto spread_done;
+	}
+spread_done:
+	rcu_read_unlock();
 	ret = mlx5_irqs_request_vectors(dev, cpus, ncomp_eqs, table->comp_irqs);
 	kfree(cpus);
 	if (ret < 0)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index fe29ac7cc469..ccd5d71aefef 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -157,6 +157,13 @@ static inline unsigned int cpumask_next_and(int n,
 	return n+1;
 }
 
+static inline unsigned int cpumask_next_andnot(int n,
+					    const struct cpumask *srcp,
+					    const struct cpumask *andp)
+{
+	return n+1;
+}
+
 static inline unsigned int cpumask_next_wrap(int n, const struct cpumask *mask,
 					     int start, bool wrap)
 {
@@ -194,6 +201,8 @@ static inline int cpumask_any_distribute(const struct cpumask *srcp)
 	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask, (void)(start))
 #define for_each_cpu_and(cpu, mask1, mask2)	\
 	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask1, (void)mask2)
+#define for_each_cpu_andnot(cpu, mask1, mask2)	\
+	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask1, (void)mask2)
 #else
 /**
  * cpumask_first - get the first cpu in a cpumask
@@ -259,6 +268,7 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 }
 
 int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
+int __pure cpumask_next_andnot(int n, const struct cpumask *, const struct cpumask *);
 int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
 unsigned int cpumask_local_spread(unsigned int i, int node);
 int cpumask_any_and_distribute(const struct cpumask *src1p,
@@ -324,6 +334,26 @@ extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool
 	for ((cpu) = -1;						\
 		(cpu) = cpumask_next_and((cpu), (mask1), (mask2)),	\
 		(cpu) < nr_cpu_ids;)
+
+/**
+ * for_each_cpu_andnot - iterate over every cpu in one mask but not in another
+ * @cpu: the (optionally unsigned) integer iterator
+ * @mask1: the first cpumask pointer
+ * @mask2: the second cpumask pointer
+ *
+ * This saves a temporary CPU mask in many places.  It is equivalent to:
+ *	struct cpumask tmp;
+ *	cpumask_andnot(&tmp, &mask1, &mask2);
+ *	for_each_cpu(cpu, &tmp)
+ *		...
+ *
+ * After the loop, cpu is >= nr_cpu_ids.
+ */
+#define for_each_cpu_andnot(cpu, mask1, mask2)				\
+	for ((cpu) = -1;						\
+		(cpu) = cpumask_next_andnot((cpu), (mask1), (mask2)),	\
+		(cpu) < nr_cpu_ids;)
+
 #endif /* SMP */
 
 #define CPU_BITS_NONE						\
diff --git a/include/linux/find.h b/include/linux/find.h
index 424ef67d4a42..454cde69b30b 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -10,7 +10,8 @@
 
 extern unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
-		unsigned long start, unsigned long invert, unsigned long le);
+		unsigned long start, unsigned long invert, unsigned long le,
+		bool negate);
 extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
 extern unsigned long _find_first_and_bit(const unsigned long *addr1,
 					 const unsigned long *addr2, unsigned long size);
@@ -41,7 +42,7 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
 		return val ? __ffs(val) : size;
 	}
 
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 0, 0);
 }
 #endif
 
@@ -71,7 +72,38 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
 		return val ? __ffs(val) : size;
 	}
 
-	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0, 0);
+}
+#endif
+
+#ifndef find_next_andnot_bit
+/**
+ * find_next_andnot_bit - find the next set bit in one memory region
+ *                        but not in the other
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns the bit number for the next set bit
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_next_andnot_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long size,
+		unsigned long offset)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & ~*addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0, 1);
 }
 #endif
 
@@ -99,7 +131,7 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 		return val == ~0UL ? size : ffz(val);
 	}
 
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0, 0);
 }
 #endif
 
@@ -247,7 +279,7 @@ unsigned long find_next_zero_bit_le(const void *addr, unsigned
 		return val == ~0UL ? size : ffz(val);
 	}
 
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1, 0);
 }
 #endif
 
@@ -266,7 +298,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 		return val ? __ffs(val) : size;
 	}
 
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 1, 0);
 }
 #endif
 
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 4564faafd0e1..41bed4b883d3 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -245,5 +245,50 @@ static inline const struct cpumask *cpu_cpu_mask(int cpu)
 	return cpumask_of_node(cpu_to_node(cpu));
 }
 
+#ifdef CONFIG_NUMA
+extern const struct cpumask *sched_numa_hop_mask(int node, int hops);
+#else
+static inline const struct cpumask *sched_numa_hop_mask(int node, int hops)
+{
+	return ERR_PTR(-ENOTSUPP);
+}
+#endif	/* CONFIG_NUMA */
+
+/**
+ * for_each_numa_hop_cpu - iterate over CPUs by increasing NUMA distance,
+ *                         starting from a given node.
+ * @cpu: the iteration variable.
+ * @node: the NUMA node to start the search from.
+ *
+ * Requires rcu_lock to be held.
+ * Careful: this is a double loop, 'break' won't work as expected.
+ *
+ *
+ * Implementation notes:
+ *
+ * Providing it is valid, the mask returned by
+ *  sched_numa_hop_mask(node, hops+1)
+ * is a superset of the one returned by
+ *   sched_numa_hop_mask(node, hops)
+ * which may not be that useful for drivers that try to spread things out and
+ * want to visit a CPU not more than once.
+ *
+ * To accommodate that, we use for_each_cpu_andnot() to iterate over the cpus
+ * of sched_numa_hop_mask(node, hops+1) with the CPUs of
+ * sched_numa_hop_mask(node, hops) removed, IOW we only iterate over CPUs
+ * a given distance away (rather than *up to* a given distance).
+ *
+ * h=0 forces us to play silly games and pass cpu_none_mask to
+ * for_each_cpu_andnot(), which turns it into for_each_cpu().
+ */
+#define for_each_numa_hop_cpu(cpu, node)				       \
+	for (struct { const struct cpumask *mask; int hops; } __v__ =	       \
+		     { sched_numa_hop_mask(node, 0), 0 };		       \
+	     !IS_ERR_OR_NULL(__v__.mask);				       \
+	     __v__.hops++, __v__.mask = sched_numa_hop_mask(node, __v__.hops)) \
+		for_each_cpu_andnot(cpu, __v__.mask,			       \
+				    __v__.hops ?			       \
+				    sched_numa_hop_mask(node, __v__.hops - 1) :\
+				    cpu_none_mask)
 
 #endif /* _LINUX_TOPOLOGY_H */
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 976092b7bd45..9182101f2c4f 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -29,6 +29,6 @@ endif
 # build parallelizes well and finishes roughly at once:
 #
 obj-y += core.o
-obj-y += fair.o
+obj-y += fair.o yolo.o
 obj-y += build_policy.o
 obj-y += build_utility.o
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 8739c2a5a54e..f0236a0ae65c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2067,6 +2067,34 @@ int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
 	return found;
 }
 
+/**
+ * sched_numa_hop_mask() - Get the cpumask of CPUs at most @hops hops away.
+ * @node: The node to count hops from.
+ * @hops: Include CPUs up to that many hops away. 0 means local node.
+ *
+ * Requires rcu_lock to be held. Returned cpumask is only valid within that
+ * read-side section, copy it if required beyond that.
+ *
+ * Note that not all hops are equal in size; see sched_init_numa() for how
+ * distances and masks are handled.
+ *
+ * Also note that this is a reflection of sched_domains_numa_masks, which may change
+ * during the lifetime of the system (offline nodes are taken out of the masks).
+ */
+const struct cpumask *sched_numa_hop_mask(int node, int hops)
+{
+	struct cpumask ***masks = rcu_dereference(sched_domains_numa_masks);
+
+	if (node >= nr_node_ids || hops >= sched_domains_numa_levels)
+		return ERR_PTR(-EINVAL);
+
+	if (!masks)
+		return NULL;
+
+	return masks[hops][node];
+}
+EXPORT_SYMBOL_GPL(sched_numa_hop_mask);
+
 #endif /* CONFIG_NUMA */
 
 static int __sdt_alloc(const struct cpumask *cpu_map)
diff --git a/lib/cpumask.c b/lib/cpumask.c
index a971a82d2f43..8bcf7e919193 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -42,6 +42,25 @@ int cpumask_next_and(int n, const struct cpumask *src1p,
 }
 EXPORT_SYMBOL(cpumask_next_and);
 
+/**
+ * cpumask_next_andnot - get the next cpu in *src1p & ~*src2p
+ * @n: the cpu prior to the place to search (ie. return will be > @n)
+ * @src1p: the first cpumask pointer
+ * @src2p: the second cpumask pointer
+ *
+ * Returns >= nr_cpu_ids if no further cpus set in *src1p & ~*src2p.
+ */
+int cpumask_next_andnot(int n, const struct cpumask *src1p,
+		     const struct cpumask *src2p)
+{
+	/* -1 is a legal arg here. */
+	if (n != -1)
+		cpumask_check(n);
+	return find_next_andnot_bit(cpumask_bits(src1p), cpumask_bits(src2p),
+		nr_cpumask_bits, n + 1);
+}
+EXPORT_SYMBOL(cpumask_next_andnot);
+
 /**
  * cpumask_any_but - return a "random" in a cpumask, but not this one.
  * @mask: the cpumask to search
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 1b8e4b2a9cba..6e5f42c621a9 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -21,17 +21,19 @@
 
 #if !defined(find_next_bit) || !defined(find_next_zero_bit) ||			\
 	!defined(find_next_bit_le) || !defined(find_next_zero_bit_le) ||	\
-	!defined(find_next_and_bit)
+	!defined(find_next_and_bit) || !defined(find_next_andnot_bit)
 /*
  * This is a common helper function for find_next_bit, find_next_zero_bit, and
  * find_next_and_bit. The differences are:
  *  - The "invert" argument, which is XORed with each fetched word before
  *    searching it for one bits.
- *  - The optional "addr2", which is anded with "addr1" if present.
+ *  - The optional "addr2", negated if "negate" and ANDed with "addr1" if
+ *    present.
  */
 unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
-		unsigned long start, unsigned long invert, unsigned long le)
+		unsigned long start, unsigned long invert, unsigned long le,
+		bool negate)
 {
 	unsigned long tmp, mask;
 
@@ -40,7 +42,9 @@ unsigned long _find_next_bit(const unsigned long *addr1,
 
 	tmp = addr1[start / BITS_PER_LONG];
 	if (addr2)
-		tmp &= addr2[start / BITS_PER_LONG];
+		tmp &= negate ?
+		       ~addr2[start / BITS_PER_LONG] :
+			addr2[start / BITS_PER_LONG];
 	tmp ^= invert;
 
 	/* Handle 1st word. */
@@ -59,7 +63,9 @@ unsigned long _find_next_bit(const unsigned long *addr1,
 
 		tmp = addr1[start / BITS_PER_LONG];
 		if (addr2)
-			tmp &= addr2[start / BITS_PER_LONG];
+			tmp &= negate ?
+			       ~addr2[start / BITS_PER_LONG] :
+				addr2[start / BITS_PER_LONG];
 		tmp ^= invert;
 	}
 



* Re: [PATCH v8 0/3] Implement vdpasim suspend operation
From: Eugenio Perez Martin @ 2022-08-11 14:25 UTC
  To: Michael S. Tsirkin
  Cc: virtualization, Jason Wang, kvm list, linux-kernel, netdev,
	ecree.xilinx, Dawar, Gautam, Zhang Min,
	Pablo Cascon Katchadourian, Uminski, Piotr, Dan Carpenter,
	Kamde, Tanuj, Zhu Lingshan, Martin Petrus Hubertus Habets,
	Christophe JAILLET, Laurent Vivier, Martin Porter,
	Harpreet Singh Anand, Eli Cohen, Cindy Lu, habetsm.xilinx,
	Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, Dinan Gunawardena, Xie Yongji
In-Reply-To: <20220811095743-mutt-send-email-mst@kernel.org>

On Thu, Aug 11, 2022 at 3:58 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Aug 11, 2022 at 03:53:50PM +0200, Eugenio Pérez wrote:
> > Implement suspend operation for vdpa_sim devices, so vhost-vdpa will offer
> > that backend feature and userspace can effectively suspend the device.
> >
> > This is a must before getting virtqueue indexes (base) for live migration,
> > since the device could modify them after userland gets them. There are
> > individual ways to perform that action for some devices
> > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> >
> > After a successful return of ioctl the device must not process more virtqueue
> > descriptors. The device can answer reads or writes of config fields as if it
> > were not suspended. In particular, writing to "queue_enable" with a value of 1
> > will not make the device start processing virtqueue buffers.
> >
> > In the future, we will provide features similar to
> > VHOST_USER_GET_INFLIGHT_FD so the device can save pending operations.
> >
> > Applied on top of vhost branch.
> >
> > Comments are welcome.
> >
> > v8:
> > * v7 but incremental from vhost instead of isolated.
>
> Now I'm lost. incremental to what? Does the vhost branch now
> have the correct bits?
>

This patch is intended to be applied on top of the current vhost
branch. In particular, on the top of commit
6a9720576cd00d30722c5f755bd17d4cfa9df636.

It basically deletes the code, the doc, and the unused ioctl argument.

Did I misunderstand what you meant by "incremental" in your previous mail?

> > v7:
> > * Remove ioctl leftover argument and update doc accordingly.
> >
> > v6:
> > * Remove the resume operation, making the ioctl simpler. We can always add
> >   another ioctl for VM_STOP/VM_RESUME operation later.
> > * s/stop/suspend/ to differentiate more from reset.
> > * Clarify scope of the suspend operation.
> >
> > v5:
> > * s/not stop/resume/ in doc.
> >
> > v4:
> > * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
> >
> > v3:
> > * s/VHOST_STOP/VHOST_VDPA_STOP/
> > * Add documentation and requirements of the ioctl above its definition.
> >
> > v2:
> > * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> > * Fix obtaining of stop ioctl arg (it was not obtained but written).
> > * Add stop to vdpa_sim_blk.
> >
> > [1] git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
> >
> > Eugenio Pérez (3):
> >   vdpa: delete unreachable branch on vdpasim_suspend
> >   vdpa: Remove wrong doc of VHOST_VDPA_SUSPEND ioctl
> >   vhost: Remove invalid parameter of VHOST_VDPA_SUSPEND ioctl
> >
> >  drivers/vdpa/vdpa_sim/vdpa_sim.c |  7 -------
> >  include/linux/vdpa.h             |  2 +-
> >  include/uapi/linux/vhost.h       | 17 ++++++-----------
> >  3 files changed, 7 insertions(+), 19 deletions(-)
> >
> > --
> > 2.31.1
> >
>


^ permalink raw reply

* Re: [PATCH] net: dsa: mv88e6060: report max mtu 1536
From: Andrew Lunn @ 2022-08-11 14:12 UTC (permalink / raw)
  To: Sergei Antonov; +Cc: Vladimir Oltean, netdev, Florian Fainelli
In-Reply-To: <CABikg9wUtyNGJ+SvASGC==qezh2eghJ=SyM5hECYVguR3BmGQQ@mail.gmail.com>

> > > 2: eth0: <BROADCAST,MULTICAST> mtu 1504 qdisc noop qlen 1000
> > >     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
> >
> > The DSA master is super odd for starting with an all-zero MAC address.
> > What driver handles this part? Normally, drivers are expected to work
> > with a MAC address provided by the firmware (of_get_mac_address or
> > other, perhaps proprietary, means) and fall back to eth_random_addr()
> > if that is missing.
> 
> eth0 is handled by the CONFIG_ARM_MOXART_ETHER driver. By the way, I
> had to change some code in it to make it work, and I am going to
> submit a patch or two later.
> The driver does not know its MAC address initially. On my hardware it
> is stored in a flash memory chip, so I assign it using "ip link set
> ..." either manually or from an /etc/init.d script. A solution with
> early MAC assignment in the moxart_mac_probe() function is doable. Do
> you think I should implement it?

I would suggest a few patches:

1) Use eth_hw_addr_random() to assign a random MAC address during probe.
2) Remove is_valid_ether_addr() from moxart_mac_open()
3) Add a call to platform_get_ethdev_address() during probe
4) Remove is_valid_ether_addr() from moxart_set_mac_address(). The core does this

platform_get_ethdev_address() will call of_get_mac_addr_nvmem() which
might be able to get your MAC address out of flash, without user space
being involved.
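
The address selection these patches aim for can be modelled in userspace C.
This is a sketch under assumptions, not driver code: every `model_*` name is
hypothetical, `rand()` stands in for the kernel's random source, and the
platform lookup is reduced to an optional pointer.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* A usable station address is neither multicast nor all zeros. */
static bool model_is_valid_addr(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	return !(addr[0] & 0x01) && memcmp(addr, zero, ETH_ALEN) != 0;
}

/* Random address: clear the multicast bit, set the locally administered bit. */
static void model_random_addr(uint8_t *addr)
{
	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);
	addr[0] &= 0xfe;
	addr[0] |= 0x02;
}

/*
 * Probe-time choice: prefer an address supplied by the platform (for
 * example one read from nvmem), otherwise fall back to a random one, so
 * the interface never starts with an all-zero address.
 */
static void model_probe_addr(uint8_t *dev_addr, const uint8_t *platform_addr)
{
	if (platform_addr && model_is_valid_addr(platform_addr))
		memcpy(dev_addr, platform_addr, ETH_ALEN);
	else
		model_random_addr(dev_addr);
}
```

Because probe always leaves a valid address behind, the validity checks in
the open and set-address paths no longer have anything to catch, which is the
reasoning behind steps 2 and 4.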

      Andrew

^ permalink raw reply

* Re: [PATCH bpf-next v7 4/8] bpf: Introduce cgroup iter
From: Yosry Ahmed @ 2022-08-11 14:09 UTC (permalink / raw)
  To: Hao Luo
  Cc: Alexei Starovoitov, Andrii Nakryiko, Linux Kernel Mailing List,
	bpf, Cgroups, Networking, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Tejun Heo, Zefan Li, KP Singh, Johannes Weiner, Michal Hocko,
	Benjamin Tissoires, John Fastabend, Michal Koutny, Roman Gushchin,
	David Rientjes, Stanislav Fomichev, Shakeel Butt
In-Reply-To: <CA+khW7j1Ni_PfvsGisUpUgFtgg=f_qEUVd1VUmocn6L3=kndhw@mail.gmail.com>

On Wed, Aug 10, 2022 at 8:10 PM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Aug 9, 2022 at 11:38 AM Hao Luo <haoluo@google.com> wrote:
> >
> > On Tue, Aug 9, 2022 at 9:23 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Aug 08, 2022 at 05:56:57PM -0700, Hao Luo wrote:
> > > > On Mon, Aug 8, 2022 at 5:19 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 5, 2022 at 2:49 PM Hao Luo <haoluo@google.com> wrote:
> > > > > >
> > > > > > Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
> > > > > >
> > > > > >  - walking a cgroup's descendants in pre-order.
> > > > > >  - walking a cgroup's descendants in post-order.
> > > > > >  - walking a cgroup's ancestors.
> > > > > >  - process only the given cgroup.
> > > > > >
> > [...]
> > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > index 59a217ca2dfd..4d758b2e70d6 100644
> > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > @@ -87,10 +87,37 @@ struct bpf_cgroup_storage_key {
> > > > > >         __u32   attach_type;            /* program attach type (enum bpf_attach_type) */
> > > > > >  };
> > > > > >
> > > > > > +enum bpf_iter_order {
> > > > > > +       BPF_ITER_ORDER_DEFAULT = 0,     /* default order. */
> > > > >
> > > > > why is this default order necessary? It just adds confusion (I had to
> > > > > look up source code to know what is default order). I might have
> > > > > missed some discussion, so if there is some very good reason, then
> > > > > please document this in commit message. But I'd rather not do some
> > > > > magical default order instead. We can set 0 to mean invalid and error
> > > > > out, or just do SELF as the very first value (and if user forgot to
> > > > > specify more fancy mode, they hopefully will quickly discover this in
> > > > > their testing).
> > > > >
> > > >
> > > > PRE/POST/UP are tree-specific orders. SELF applies on all iters and
> > > > yields only a single object. How does task_iter express a non-self
> > > > order? By non-self, I mean something like "I don't care about the
> > > > order, just scan _all_ the objects". And this "don't care" order, IMO,
> > > > may be the common case. I don't think everyone cares about walking
> > > > order for tasks. The DEFAULT is intentionally put at the first value,
> > > > so that if users don't care about order, they don't have to specify
> > > > this field.
> > > >
> > > > If that sounds valid, maybe using "UNSPEC" instead of "DEFAULT" is better?
> > >
> > > I agree with Andrii.
> > > This:
> > > +       if (order == BPF_ITER_ORDER_DEFAULT)
> > > +               order = BPF_ITER_DESCENDANTS_PRE;
> > >
> > > looks like an arbitrary choice.
> > > imo
> > > BPF_ITER_DESCENDANTS_PRE = 0,
> > > would have been more obvious. No need to dig into definition of "default".
> > >
> > > UNSPEC = 0
> > > is fine too if we want user to always be conscious about the order
> > > and the kernel will error if that field is not initialized.
> > > That would be my preference, since it will match the rest of uapi/bpf.h
> > >
> >
> > Sounds good. In the next version, will use
> >
> > enum bpf_iter_order {
> >         BPF_ITER_ORDER_UNSPEC = 0,
> >         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> >         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
> >         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
> >         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
> > };
> >
>
> Sigh, I find that having UNSPEC=0 and erroring out when seeing UNSPEC
> doesn't work. Basically, if we have a non-iter prog and a cgroup_iter
> prog written in the same source file, I can't use
> bpf_object__attach_skeleton to attach them. Because the default
> prog_attach_fn for iter initializes `order` to 0 (that is, UNSPEC),
> which is going to be rejected by the kernel. In order to make
> bpf_object__attach_skeleton work on cgroup_iter, I think I need to use
> the following
>
> enum bpf_iter_order {
>         BPF_ITER_DESCENDANTS_PRE,       /* walk descendants in pre-order. */
>         BPF_ITER_DESCENDANTS_POST,      /* walk descendants in post-order. */
>         BPF_ITER_ANCESTORS_UP,          /* walk ancestors upward. */
>         BPF_ITER_SELF_ONLY,             /* process only a single object. */
> };
>
> So that when calling bpf_object__attach_skeleton() on cgroup_iter, a
> link can be generated and the generated link defaults to pre-order
> walk on the whole hierarchy. Is there a better solution?
>

I think this can be handled by userspace? We can attach the
cgroup_iter separately first (and maybe we will need to set prog->link
as well) so that bpf_object__attach_skeleton() doesn't try to attach
it? I am following this pattern in the selftest in the final patch,
although I think I might be missing setting prog->link, so I am
wondering why there are no issues in that selftest which has the same
scenario that you are talking about.

I think such a pattern will need to be used anyway if the users need
to set any non-default arguments for the cgroup_iter prog (like the
selftest), right? The only case we are discussing here is the case
where the user wants to attach the cgroup_iter with all default
options (in which case the default order will fail).
I agree that it might be inconvenient if the default/uninitialized
options don't work for cgroup_iter, but Alexei pointed out that this
matches other bpf uapis.

My concern is that in the future we may try to reuse enum bpf_iter_order
to set ordering for other iterators, and then the
default/uninitialized value (BPF_ITER_DESCENDANTS_PRE) doesn't make
sense for that iterator (e.g. not a tree). In this case, the same
problem that we are avoiding for cgroup_iter here will show up for
that iterator, and we can't easily change it at this point because
it's uapi.
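
The trade-off can be made concrete with a small userspace model. This is a
sketch under assumptions: the enum is the UNSPEC = 0 layout quoted above,
while `struct model_iter_opts` and `model_cgroup_iter_check()` are
hypothetical stand-ins for the link options and the kernel-side validation.

```c
#include <errno.h>
#include <string.h>

enum bpf_iter_order {
	BPF_ITER_ORDER_UNSPEC = 0,
	BPF_ITER_SELF_ONLY,		/* process only a single object. */
	BPF_ITER_DESCENDANTS_PRE,	/* walk descendants in pre-order. */
	BPF_ITER_DESCENDANTS_POST,	/* walk descendants in post-order. */
	BPF_ITER_ANCESTORS_UP,		/* walk ancestors upward. */
};

/* Hypothetical stand-in for the options a generic attach path fills in. */
struct model_iter_opts {
	enum bpf_iter_order order;
};

/* Hypothetical validation: accept only the orders cgroup_iter understands. */
static int model_cgroup_iter_check(const struct model_iter_opts *opts)
{
	switch (opts->order) {
	case BPF_ITER_SELF_ONLY:
	case BPF_ITER_DESCENDANTS_PRE:
	case BPF_ITER_DESCENDANTS_POST:
	case BPF_ITER_ANCESTORS_UP:
		return 0;
	default:
		return -EINVAL;
	}
}
```

A zero-initialized `struct model_iter_opts` carries UNSPEC and is rejected
with -EINVAL, which is the skeleton-attach failure described above; a caller
that sets `order` explicitly passes the check.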


> > and explicitly list the values acceptable by cgroup_iter, error out if
> > UNSPEC is detected.
> >
> > Also, following Andrii's comments, will change BPF_ITER_SELF to
> > BPF_ITER_SELF_ONLY, which does seem a little bit explicit in
> > comparison.
> >
> > > I applied the first 3 patches to ease respin.
> >
> > Thanks! This helps!
> >
> > > Thanks!

^ permalink raw reply

* Re: [PATCH v8 0/3] Implement vdpasim suspend operation
From: Michael S. Tsirkin @ 2022-08-11 13:58 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: virtualization, Jason Wang, kvm, linux-kernel, netdev,
	ecree.xilinx, gautam.dawar, Zhang Min, pabloc, Piotr.Uminski,
	Dan Carpenter, tanuj.kamde, Zhu Lingshan, martinh,
	Christophe JAILLET, lvivier, martinpo, hanand, Eli Cohen, lulu,
	habetsm.xilinx, Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, dinang, Xie Yongji
In-Reply-To: <20220811135353.2549658-1-eperezma@redhat.com>

On Thu, Aug 11, 2022 at 03:53:50PM +0200, Eugenio Pérez wrote:
> Implement suspend operation for vdpa_sim devices, so vhost-vdpa will offer
> that backend feature and userspace can effectively suspend the device.
> 
> This is a must before getting virtqueue indexes (base) for live migration,
> since the device could modify them after userland gets them. There are
> individual ways to perform that action for some devices
> (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> way to perform it for any vhost device (and, in particular, vhost-vdpa).
> 
> After a successful return of ioctl the device must not process more virtqueue
> descriptors. The device can answer reads or writes of config fields as if it
> were not suspended. In particular, writing to "queue_enable" with a value of 1
> will not make the device start processing virtqueue buffers.
> 
> In the future, we will provide features similar to
> VHOST_USER_GET_INFLIGHT_FD so the device can save pending operations.
> 
> Applied on top of vhost branch.
> 
> Comments are welcome.
> 
> v8:
> * v7 but incremental from vhost instead of isolated.

Now I'm lost. incremental to what? Does the vhost branch now
have the correct bits?

> v7:
> * Remove ioctl leftover argument and update doc accordingly.
> 
> v6:
> * Remove the resume operation, making the ioctl simpler. We can always add
>   another ioctl for VM_STOP/VM_RESUME operation later.
> * s/stop/suspend/ to differentiate more from reset.
> * Clarify scope of the suspend operation.
> 
> v5:
> * s/not stop/resume/ in doc.
> 
> v4:
> * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
> 
> v3:
> * s/VHOST_STOP/VHOST_VDPA_STOP/
> * Add documentation and requirements of the ioctl above its definition.
> 
> v2:
> * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> * Fix obtaining of stop ioctl arg (it was not obtained but written).
> * Add stop to vdpa_sim_blk.
> 
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
> 
> Eugenio Pérez (3):
>   vdpa: delete unreachable branch on vdpasim_suspend
>   vdpa: Remove wrong doc of VHOST_VDPA_SUSPEND ioctl
>   vhost: Remove invalid parameter of VHOST_VDPA_SUSPEND ioctl
> 
>  drivers/vdpa/vdpa_sim/vdpa_sim.c |  7 -------
>  include/linux/vdpa.h             |  2 +-
>  include/uapi/linux/vhost.h       | 17 ++++++-----------
>  3 files changed, 7 insertions(+), 19 deletions(-)
> 
> -- 
> 2.31.1
> 


^ permalink raw reply

* [PATCH v8 3/3] vhost: Remove invalid parameter of VHOST_VDPA_SUSPEND ioctl
From: Eugenio Pérez @ 2022-08-11 13:53 UTC (permalink / raw)
  To: virtualization, Jason Wang, Michael S. Tsirkin, kvm, linux-kernel,
	netdev
  Cc: ecree.xilinx, gautam.dawar, Zhang Min, pabloc, Piotr.Uminski,
	Dan Carpenter, tanuj.kamde, Zhu Lingshan, martinh,
	Christophe JAILLET, lvivier, martinpo, hanand, Eli Cohen, lulu,
	habetsm.xilinx, Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, dinang, Xie Yongji
In-Reply-To: <20220811135353.2549658-1-eperezma@redhat.com>

It was a leftover from previous versions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
Note that I'm not sure this removal is valid. The ioctl is not in the master
branch as of the send date of this patch, but there are commits on the vhost
branch that do have it.
---
 include/uapi/linux/vhost.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 89fcb2afe472..768ec46a88bf 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -178,6 +178,6 @@
  * the possible device specific states) that is required for restoring in the
  * future. The device must not change its configuration after that point.
  */
-#define VHOST_VDPA_SUSPEND		_IOW(VHOST_VIRTIO, 0x7D, int)
+#define VHOST_VDPA_SUSPEND		_IO(VHOST_VIRTIO, 0x7D)
 
 #endif
-- 
2.31.1
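
Why the note above worries about validity: switching from _IOW() to _IO()
changes the encoded command number, so the two definitions are different
ioctls from userspace's point of view. The sketch below assumes a Linux build
environment with <linux/ioctl.h>; the MODEL_* names and model_* helpers are
hypothetical, and 0xAF is the VHOST_VIRTIO type value redefined locally so
that the vhost header is not needed.

```c
#include <linux/ioctl.h>

/* Local copy of the VHOST_VIRTIO ioctl type (assumed to be 0xAF). */
#define MODEL_VHOST_VIRTIO	0xAF

/* The definition before and after this patch. */
#define MODEL_SUSPEND_OLD	_IOW(MODEL_VHOST_VIRTIO, 0x7D, int)
#define MODEL_SUSPEND_NEW	_IO(MODEL_VHOST_VIRTIO, 0x7D)

/* Size of the argument encoded in the command number. */
static unsigned int model_arg_size(unsigned int cmd)
{
	return _IOC_SIZE(cmd);
}

/* Same type and sequence number, ignoring direction and size. */
static int model_same_slot(unsigned int a, unsigned int b)
{
	return _IOC_TYPE(a) == _IOC_TYPE(b) && _IOC_NR(a) == _IOC_NR(b);
}
```

Both macros occupy the same type/number slot, but the old one encodes a
write direction and an argument of sizeof(int), while the new one encodes no
argument at all, so the resulting command values differ.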


^ permalink raw reply related

* [PATCH v8 2/3] vdpa: Remove wrong doc of VHOST_VDPA_SUSPEND ioctl
From: Eugenio Pérez @ 2022-08-11 13:53 UTC (permalink / raw)
  To: virtualization, Jason Wang, Michael S. Tsirkin, kvm, linux-kernel,
	netdev
  Cc: ecree.xilinx, gautam.dawar, Zhang Min, pabloc, Piotr.Uminski,
	Dan Carpenter, tanuj.kamde, Zhu Lingshan, martinh,
	Christophe JAILLET, lvivier, martinpo, hanand, Eli Cohen, lulu,
	habetsm.xilinx, Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, dinang, Xie Yongji
In-Reply-To: <20220811135353.2549658-1-eperezma@redhat.com>

It was a leftover from previous versions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/linux/vdpa.h       |  2 +-
 include/uapi/linux/vhost.h | 15 +++++----------
 2 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index d282f464d2f1..6c4e6ea7f7eb 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -218,7 +218,7 @@ struct vdpa_map_file {
  * @reset:			Reset device
  *				@vdev: vdpa device
  *				Returns integer: success (0) or error (< 0)
- * @suspend:			Suspend or resume the device (optional)
+ * @suspend:			Suspend the device (optional)
  *				@vdev: vdpa device
  *				Returns integer: success (0) or error (< 0)
  * @get_config_size:		Get the size of the configuration space includes
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 6d9f45163155..89fcb2afe472 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -171,17 +171,12 @@
 #define VHOST_VDPA_SET_GROUP_ASID	_IOW(VHOST_VIRTIO, 0x7C, \
 					     struct vhost_vring_state)
 
-/* Suspend or resume a device so it does not process virtqueue requests anymore
+/* Suspend a device so it does not process virtqueue requests anymore
  *
- * After the return of ioctl with suspend != 0, the device must finish any
- * pending operations like in flight requests. It must also preserve all the
- * necessary state (the virtqueue vring base plus the possible device specific
- * states) that is required for restoring in the future. The device must not
- * change its configuration after that point.
- *
- * After the return of ioctl with suspend == 0, the device can continue
- * processing buffers as long as typical conditions are met (vq is enabled,
- * DRIVER_OK status bit is enabled, etc).
+ * After the return of ioctl the device must finish any pending operations. It
+ * must also preserve all the necessary state (the virtqueue vring base plus
+ * the possible device specific states) that is required for restoring in the
+ * future. The device must not change its configuration after that point.
  */
 #define VHOST_VDPA_SUSPEND		_IOW(VHOST_VIRTIO, 0x7D, int)
 
-- 
2.31.1


^ permalink raw reply related

* [PATCH v8 1/3] vdpa: delete unreachable branch on vdpasim_suspend
From: Eugenio Pérez @ 2022-08-11 13:53 UTC (permalink / raw)
  To: virtualization, Jason Wang, Michael S. Tsirkin, kvm, linux-kernel,
	netdev
  Cc: ecree.xilinx, gautam.dawar, Zhang Min, pabloc, Piotr.Uminski,
	Dan Carpenter, tanuj.kamde, Zhu Lingshan, martinh,
	Christophe JAILLET, lvivier, martinpo, hanand, Eli Cohen, lulu,
	habetsm.xilinx, Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, dinang, Xie Yongji
In-Reply-To: <20220811135353.2549658-1-eperezma@redhat.com>

It was a leftover from previous versions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 213883487f9b..79a50edf8998 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -509,16 +509,9 @@ static int vdpasim_reset(struct vdpa_device *vdpa)
 static int vdpasim_suspend(struct vdpa_device *vdpa)
 {
 	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
-	int i;
 
 	spin_lock(&vdpasim->lock);
 	vdpasim->running = false;
-	if (vdpasim->running) {
-		/* Check for missed buffers */
-		for (i = 0; i < vdpasim->dev_attr.nvqs; ++i)
-			vdpasim_kick_vq(vdpa, i);
-
-	}
 	spin_unlock(&vdpasim->lock);
 
 	return 0;
-- 
2.31.1


^ permalink raw reply related

* [PATCH v8 0/3] Implement vdpasim suspend operation
From: Eugenio Pérez @ 2022-08-11 13:53 UTC (permalink / raw)
  To: virtualization, Jason Wang, Michael S. Tsirkin, kvm, linux-kernel,
	netdev
  Cc: ecree.xilinx, gautam.dawar, Zhang Min, pabloc, Piotr.Uminski,
	Dan Carpenter, tanuj.kamde, Zhu Lingshan, martinh,
	Christophe JAILLET, lvivier, martinpo, hanand, Eli Cohen, lulu,
	habetsm.xilinx, Parav Pandit, Longpeng, Wu Zongyong, Si-Wei Liu,
	Stefano Garzarella, dinang, Xie Yongji

Implement suspend operation for vdpa_sim devices, so vhost-vdpa will offer
that backend feature and userspace can effectively suspend the device.

This is a must before getting virtqueue indexes (base) for live migration,
since the device could modify them after userland gets them. There are
individual ways to perform that action for some devices
(VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
way to perform it for any vhost device (and, in particular, vhost-vdpa).

After a successful return of ioctl the device must not process more virtqueue
descriptors. The device can answer reads or writes of config fields as if it
were not suspended. In particular, writing to "queue_enable" with a value of 1
will not make the device start processing virtqueue buffers.

In the future, we will provide features similar to
VHOST_USER_GET_INFLIGHT_FD so the device can save pending operations.

Applied on top of vhost branch.

Comments are welcome.

v8:
* v7 but incremental from vhost instead of isolated.

v7:
* Remove ioctl leftover argument and update doc accordingly.

v6:
* Remove the resume operation, making the ioctl simpler. We can always add
  another ioctl for VM_STOP/VM_RESUME operation later.
* s/stop/suspend/ to differentiate more from reset.
* Clarify scope of the suspend operation.

v5:
* s/not stop/resume/ in doc.

v4:
* Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.

v3:
* s/VHOST_STOP/VHOST_VDPA_STOP/
* Add documentation and requirements of the ioctl above its definition.

v2:
* Replace raw _F_STOP with BIT_ULL(_F_STOP).
* Fix obtaining of stop ioctl arg (it was not obtained but written).
* Add stop to vdpa_sim_blk.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git

Eugenio Pérez (3):
  vdpa: delete unreachable branch on vdpasim_suspend
  vdpa: Remove wrong doc of VHOST_VDPA_SUSPEND ioctl
  vhost: Remove invalid parameter of VHOST_VDPA_SUSPEND ioctl

 drivers/vdpa/vdpa_sim/vdpa_sim.c |  7 -------
 include/linux/vdpa.h             |  2 +-
 include/uapi/linux/vhost.h       | 17 ++++++-----------
 3 files changed, 7 insertions(+), 19 deletions(-)

-- 
2.31.1

^ permalink raw reply

* Re: igc: missing HW timestamps at TX
From: Vinicius Costa Gomes @ 2022-08-11 13:33 UTC (permalink / raw)
  To: Ferenc Fejes, netdev@vger.kernel.org
  Cc: marton12050@gmail.com, peti.antal99@gmail.com
In-Reply-To: <d5571f0ea205e26bced51220044781131296aaac.camel@ericsson.com>

Hi Ferenc,

Ferenc Fejes <ferenc.fejes@ericsson.com> writes:

> Hi Vinicius!
>
> On Tue, 2022-07-19 at 09:40 +0200, Fejes Ferenc wrote:
>> Hi Vinicius!
>> 
>> On Mon, 2022-07-18 at 11:46 -0300, Vinicius Costa Gomes wrote:
>> > Hi Ferenc,
>> > 
>> > Ferenc Fejes <ferenc.fejes@ericsson.com> writes:
>> > 
>> > > (Ctrl+Enter'd by mistake)
>> > > 
>> > > My question here: is there anything I can quickly try to avoid
>> > > that behavior? Even when I send only a few (like 10) packets but
>> > > at a fast rate (5us between packets) I get missing TX HW
>> > > timestamps. The receive side looks much more robust; I have not
>> > > noticed missing HW timestamps there.
>> > 
>> > There's a limitation in the i225/i226 in the number of "in flight"
>> > TX timestamps they are able to handle. The hardware has 4 sets of
>> > registers to handle timestamps.
>> > 
>> > There's an additional issue: the driver, as it is right now, only
>> > uses one set of those registers.
>> > 
>> > I have one only briefly tested series that enables the driver to
>> > use the full set of TX timestamp registers. Another reason that it
>> > was not proposed yet is that I still have to benchmark it and see
>> > what the performance impact is.
>> 
>> Thank you for the quick reply! I'm glad you already have this series
>> right off the bat. I'll be back when we are done with some quick testing,
>> hopefully sooner rather than later.
>
> Sorry for the late reply. I had time for a few tests with the patch.
> For my tests it looks much better. I send a packet every 500us with
> isochron-send, with TX SW and HW timestamping enabled, and for 10000
> packets I see zero lost timestamps. Even for 100000 packets only a few
> dropped HW timestamps are visible.
>
> With the iperf TCP test, line rate is achievable just like without the patch.
>

That's very good to know.

>> > 
>> > If you are feeling adventurous and feel like helping test it, here
>> > is
>> > the link:
>> > 
>> > https://github.com/vcgomes/net-next/tree/igc-multiple-tstamp-timers-lock-new
>> > 
>
> Is there any test in particular you are interested in? My testbed is
> configured so I can do some.
>

The only thing I am worried about is whether, in the "dropped" HW
timestamps case, all the timestamp slots are indeed full, or whether
there's a bug and we missed a timestamp.

Can you verify that, for every dropped HW timestamp in your
application, 'tx_hwtstamp_skipped' (from 'ethtool -S') increases
every time the drop happens? Seeing if 'tx_hwtstamp_timeouts'
also increases would be useful as well.

If for every drop there's one 'tx_hwtstamp_skipped' increment, then it
means that the driver is doing its best, and the workload is requesting
more timestamps than the system is able to handle.

If only 'tx_hwtstamp_timeouts' increases then it's possible that there
could be a bug hiding still.
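
The check described above can be written down as a small classifier. This is
only a sketch under assumptions: the struct, the enum, and
`model_classify()` are hypothetical names, and the counter deltas are
expected to be collected by hand from two 'ethtool -S' readings taken before
and after a test run.

```c
#include <stdint.h>

enum model_verdict {
	MODEL_NO_DROPS,		/* the application lost no timestamps */
	MODEL_EXPLAINED,	/* every loss is matched by a skipped request */
	MODEL_TIMEOUT_ONLY,	/* losses with timeouts but no skips */
	MODEL_UNEXPLAINED,	/* losses that the skip counter does not cover */
};

struct model_tstamp_sample {
	uint64_t app_dropped;	/* timestamps the application never received */
	uint64_t skipped;	/* delta of tx_hwtstamp_skipped */
	uint64_t timeouts;	/* delta of tx_hwtstamp_timeouts */
};

static enum model_verdict model_classify(const struct model_tstamp_sample *s)
{
	if (s->app_dropped == 0)
		return MODEL_NO_DROPS;
	if (s->skipped >= s->app_dropped)
		return MODEL_EXPLAINED;
	if (s->skipped == 0 && s->timeouts > 0)
		return MODEL_TIMEOUT_ONLY;
	return MODEL_UNEXPLAINED;
}
```

MODEL_EXPLAINED corresponds to the "driver is doing its best" case, while
MODEL_TIMEOUT_ONLY and MODEL_UNEXPLAINED are the cases that would point at a
possible bug.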

>> > 
>> > Cheers,
>> 
>> Best,
>> Ferenc
>
> Best,
> Ferenc
>

Cheers,
-- 
Vinicius

^ permalink raw reply

* Re: [PATCH] fec: Restart PPS after link state change
From: Andrew Lunn @ 2022-08-11 13:30 UTC (permalink / raw)
  To: Csókás Bence; +Cc: netdev, Richard Cochran, Fugang Duan
In-Reply-To: <9aa60160-8d8e-477f-991a-b2f4cc72ddf6@prolan.hu>

> `fep->pps_enable` is the state of the PPS the driver *believes* to be the
> case. After a reset, this belief may or may not be true anymore: if the
> driver believed formerly that the PPS is down, then after a reset, its
> belief will still be correct, thus nothing needs to be done about the
> situation. If, however, the driver thought that PPS was up, after controller
> reset, it no longer holds, so we need to update our world-view
> (`fep->pps_enable = 0;`), and then correct for the fact that PPS just
> unexpectedly stopped.

Your way of doing it just seems very unclean. I would make
fec_ptp_enable_pps() read the actual status from the
hardware. fep->pps_enable then has the clear meaning that user space
requested it to be enabled.

	  Andrew

^ permalink raw reply

* Re: [PATCH v2] tcp: adjust rcvbuff according copied rate of user space
From: Neal Cardwell @ 2022-08-11 13:26 UTC (permalink / raw)
  To: Yonglong Li; +Cc: netdev, davem, edumazet, ycheng, dsahern, kuba, pabeni
In-Reply-To: <12489b98-772f-ff2a-0ac4-cb33a06f8870@chinatelecom.cn>

On Thu, Aug 11, 2022 at 4:15 AM Yonglong Li <liyonglong@chinatelecom.cn> wrote:
>
>
>
> On 8/10/2022 8:43 PM, Neal Cardwell wrote:
> > On Wed, Aug 10, 2022 at 3:49 AM Yonglong Li <liyonglong@chinatelecom.cn> wrote:
> >>
> >> Every time data is copied to user space, tcp_rcv_space_adjust() is called.
> >> Currently it adjusts rcvbuff by the length of data copied to user space.
> >> If the interval at which user space copies data from the socket is not
> >> stable, the length of data copied to user space will not accurately
> >> reflect the speed of copying data from rcvbuff.
> >> So in tcp_rcv_space_adjust() it is more reasonable to adjust rcvbuff by
> >> the copy rate (length of copied data / interval) instead of the copied
> >> data length.
> >>
> >> I tested this patch in simulation environment by Mininet:
> >> with 80~120ms RTT / 1% loss link, 100 runs
> >> of (netperf -t TCP_STREAM -l 5), and got an average throughput
> >> of 17715 Kbit instead of 17703 Kbit.
> >> with 80~120ms RTT without loss link, 100 runs of (netperf -t
> >> TCP_STREAM -l 5), and got an average throughput of 18272 Kbit
> >> instead of 18248 Kbit.
> >
> > So with 1% emulated loss that's a 0.06% throughput improvement and
> > without emulated loss that's a 0.13% improvement. That sounds like it
> > may well be statistical noise, particularly given that we would expect
> > the steady-state impact of this change to be negligible.
> >
> Hi neal,
>
> Thank you for your feedback.
> I don't think the improvement is statistical noise, because I get a small
> improvement with the patch every time I test.

Interesting. To help us all understand the dynamics, can you please
share a sender-side tcpdump binary .pcap trace of the emulated tests
without loss, with:

(a) one baseline pcap of a test without the patch, and

(b) one experimental pcap of a test with the patch showing the roughly
0.13% throughput improvement.

It will be interesting to compare the receive window and transmit
behavior in both cases.
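
For reference, the relative gains under discussion follow directly from the
averages quoted in the commit message. The helper below is a hypothetical
illustration of that arithmetic, not part of the patch.

```c
/* Relative change of `patched` over `baseline`, in percent. */
static double model_gain_percent(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}
```

With the quoted figures, model_gain_percent(17703, 17715) is about 0.068
(the 1% loss runs) and model_gain_percent(18248, 18272) is about 0.13 (the
runs without loss).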

thanks,
neal

^ permalink raw reply

* Re: [RFCv7 PATCH net-next 36/36] net: redefine the prototype of netdev_features_t
From: Alexander Lobakin @ 2022-08-11 13:07 UTC (permalink / raw)
  To: shenjian (K)
  Cc: Alexander Lobakin, davem, kuba, andrew, ecree.xilinx, hkallweit1,
	saeed, leon, netdev, linuxarm
In-Reply-To: <3df89822-7dec-c01e-0df9-15b8e6f7d4e5@huawei.com>

From: "shenjian (K)" <shenjian15@huawei.com>
Date: Wed, 10 Aug 2022 21:34:43 +0800

BTW, you replied in HTML instead of plain text and korg mail servers
rejected it. So non-Ccs can't see it. Just be aware that LKML
accepts plain text only :)

> On 2022/8/10 19:35, Alexander Lobakin wrote:
> > From: Jian Shen <shenjian15@huawei.com>
> > Date: Wed, 10 Aug 2022 11:06:24 +0800
> >
> >> The prototype of netdev_features_t is u64, and the number
> >> of netdevice feature bits is now 64, so there is no space to
> >> introduce a new feature bit. Change the prototype of netdev_features_t
> >> from u64 to the structure below:
> >> 	typedef struct {
> >> 		DECLARE_BITMAP(bits, NETDEV_FEATURE_COUNT);
> >> 	} netdev_features_t;
> >>
> >> Rewrite the netdev_features helpers to adapt to the new prototype.
> >>
> >> To avoid mistakenly using NETIF_F_XXX instead of NETIF_F_XXX_BIT as
> >> input for the above helpers, remove all the NETIF_F_XXX macros
> >> for single feature bits. Several macros remain
> >> temporarily because of some precompile dependencies.
> >>
> >> With the prototype no longer being u64, the implementation of the print
> >> interface for netdev features (%pNF) is changed to a bitmap. So
> >> is the implementation of net/ethtool/.
> >>
> >> Signed-off-by: Jian Shen <shenjian15@huawei.com>
> >> ---
> >>   drivers/net/ethernet/amazon/ena/ena_netdev.c  |  12 +-
> >>   .../net/ethernet/intel/i40e/i40e_debugfs.c    |  12 +-
> >>   .../ethernet/netronome/nfp/nfp_net_common.c   |   4 +-
> >>   .../net/ethernet/pensando/ionic/ionic_lif.c   |   4 +-
> >>   include/linux/netdev_features.h               | 101 ++----------
> >>   include/linux/netdev_features_helper.h        | 149 +++++++++++-------
> >>   include/linux/netdevice.h                     |   7 +-
> >>   include/linux/skbuff.h                        |   4 +-
> >>   include/net/ip_tunnels.h                      |   2 +-
> >>   lib/vsprintf.c                                |  11 +-
> >>   net/ethtool/features.c                        |  96 ++++-------
> >>   net/ethtool/ioctl.c                           |  46 ++++--
> >>   net/mac80211/main.c                           |   3 +-
> >>   13 files changed, 201 insertions(+), 250 deletions(-)
> > [...]
> >
> >> -static inline int find_next_netdev_feature(u64 feature, unsigned long start)
> >> -{
> >> -	/* like BITMAP_LAST_WORD_MASK() for u64
> >> -	 * this sets the most significant 64 - start to 0.
> >> -	 */
> >> -	feature &= ~0ULL >> (-start & ((sizeof(feature) * 8) - 1));
> >> -
> >> -	return fls64(feature) - 1;
> >> -}
> >> +#define NETIF_F_HW_VLAN_CTAG_TX
> >> +#define NETIF_F_IPV6_CSUM
> >> +#define NETIF_F_TSO
> >> +#define NETIF_F_GSO
> > Uhm, what are those empty definitions for? They look confusing.
> I kept them temporarily because some drivers use them like below:
> for example in drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
> #ifdef NETIF_F_HW_VLAN_CTAG_TX
> #include <linux/if_vlan.h>
> #endif
> 
> So far I haven't got a good way to replace it.

I believe such constructs snuck in from some development/draft
versions of the code. Those definitions are always present, so the
ifdefs are redundant.
Just remove those ifdefs and always include the file.
The empty definitions you left in netdev_features.h are confusing,
I wouldn't keep them.

> 
> >>   
> >>   /* This goes for the MSB to the LSB through the set feature bits,
> >>    * mask_addr should be a u64 and bit an int
> >>    */

[...]

> >> +#define GSO_ENCAP_FEATURES	(((u64)1 << NETIF_F_GSO_GRE_BIT) |		\
> >> +				 ((u64)1 << NETIF_F_GSO_GRE_CSUM_BIT) |		\
> >> +				 ((u64)1 << NETIF_F_GSO_IPXIP4_BIT) |		\
> >> +				 ((u64)1 << NETIF_F_GSO_IPXIP6_BIT) |		\
> >> +				 (((u64)1 << NETIF_F_GSO_UDP_TUNNEL_BIT) |	\
> >> +				  ((u64)1 << NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT)))
> > 1) 1ULL;
> ok, will fix it
> 
> > 2) what if we get a new GSO encap type whose bit is higher
> >     than 64?
> So far I prefer to use this.  It's assigned to
> skb_shinfo(skb)->gso_type, whose type is 'unsigned int'.  Once a new
> gso encap type is introduced, we should extend gso_type first.

But ::gso_type accepts flags like %SKB_GSO_DODGY and so on, not
netdev_features, doesn't it?
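
To illustrate the concern in 2), here is a minimal user-space sketch (not
kernel code, and not taken from the patch). A `(u64)1 << bit` mask cannot
describe a feature bit at position 64 or above, while a per-bit test on a
bitmap can. All `sketch_*` names and the 128-bit size are assumptions made
up for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel bitmap helpers (illustration only). */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_FEATURE_COUNT 128 /* assumed: more feature bits than a u64 holds */
#define SKETCH_LONGS \
	((SKETCH_FEATURE_COUNT + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

struct sketch_features {
	unsigned long bits[SKETCH_LONGS];
};

/* Set feature bit nr; nr must be below SKETCH_FEATURE_COUNT. */
static inline void sketch_set_bit(struct sketch_features *f, unsigned int nr)
{
	f->bits[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Test feature bit nr; nr must be below SKETCH_FEATURE_COUNT. */
static inline bool sketch_test_bit(const struct sketch_features *f,
				   unsigned int nr)
{
	return (f->bits[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Return true if any of the listed encap bit numbers is set in f.
 * The list holds bit numbers rather than a 64-bit mask, so entries at
 * position 64 or above are handled the same way as the low ones.
 */
static inline bool sketch_any_encap(const struct sketch_features *f,
				    const unsigned int *encap_bits, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sketch_test_bit(f, encap_bits[i]))
			return true;
	}
	return false;
}
```

With a list such as `{ 3, 70 }`, the check keeps working once bit 70 is
set, which a single u64 mask could not express.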

> 
> 
> >> +
> >>   #endif	/* _LINUX_NETDEV_FEATURES_H */

[...]

> >>   static inline netdev_features_t
> >>   netdev_features_and(const netdev_features_t a, const netdev_features_t b)
> >>   {
> >> -	return a & b;
> >> +	netdev_features_t dst;
> >> +
> >> +	bitmap_and(dst.bits, a.bits, b.bits, NETDEV_FEATURE_COUNT);
> >> +	return dst;
> > Yeah, so as I wrote previously, not a good idea to return a whole
> > bitmap/structure.
> >
> > netdev_features_and(*dst, const *a, const *b)
> > {
> > 	return bitmap_and(); // bitmap_and() actually returns useful value
> > }
> >
> > I mean, 16 bytes (currently 8, but some new features will come
> > pretty shortly, I'm sure) are probably okayish, but... let's see
> > what other folks think, but even Linus wrote about this recently
> > BTW.
> Yes, Jakub also mentioned this.
> 
> But there are many existing feature interfaces (e.g. ndo_fix_features,
> ndo_features_check) that use netdev_features_t as the return value. Then
> we would have to change their prototypes.

We have to do 12k lines of changes already :D
You know, 16 bytes is probably fine to return directly and it will
be enough for up to 128 features (64 more compared to
mainline). OTOH, using pointers removes that "what if/when", so
it's more flexible in that respect. So that's why I asked for other
folks' opinions -- two PoVs don't seem enough here.
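
For reference, the pointer-based variant could look roughly like the
user-space sketch below. It is an illustration only, not the patch's code:
the `demo_*` names and the 128-bit size are assumptions, and the loop
stands in for the kernel's bitmap_and(), whose return value reports
whether the result has any bit set.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a bitmap-backed feature set. */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_FEATURE_COUNT 128 /* assumed size, a multiple of the word size */
#define DEMO_LONGS \
	((DEMO_FEATURE_COUNT + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_features {
	unsigned long bits[DEMO_LONGS];
};

/*
 * dst = a & b, written through a pointer instead of returning the
 * whole structure by value. Returns true if dst has at least one bit
 * set, so callers get a useful result without inspecting dst.
 * No last-word masking is done because DEMO_FEATURE_COUNT is assumed
 * to be a multiple of the word size.
 */
static inline bool demo_features_and(struct demo_features *dst,
				     const struct demo_features *a,
				     const struct demo_features *b)
{
	unsigned long any = 0;

	for (size_t i = 0; i < DEMO_LONGS; i++) {
		dst->bits[i] = a->bits[i] & b->bits[i];
		any |= dst->bits[i];
	}
	return any != 0;
}
```

The signature stays the same however many words the set grows to, which
is the flexibility argument; the cost is that every caller currently
returning netdev_features_t by value would need a new prototype.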

> 
> The second problem is the helpers' definitions. For example, suppose
> we introduce a helper like
> netdev_features_zero(netdev_features_t *features)
> without changing the prototype of netdev_features_t.
> Once netdev_features_t is converted from u64 to unsigned long *, it becomes
> netdev_features_zero(unsigned long **features), resulting in much redundant
> work to adjust it to netdev_features_zero(unsigned long *features).

> 
> 
> >>   }

[...]

> >>   static noinline_for_stack
> >> -char *netdev_bits(char *buf, char *end, const void *addr,
> >> +char *netdev_bits(char *buf, char *end, void *addr,
> >>   		  struct printf_spec spec,  const char *fmt)
> >>   {
> >> -	unsigned long long num;
> >> -	int size;
> >> +	netdev_features_t *features;
> > const? We're printing.
> It causes a compile warning, because bitmap_string() takes features->bits
> as an input parameter without "const" in its prototype.

Oof, that's weird. I checked the function you mentioned and don't
see any reason why it would require non-RO access to the bitmap it
prints.
Could you maybe please change its proto to take a const bitmap, so
that it won't complain about your code? As a separate patch going right
before this one in your series.
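
As a rough illustration of why a printing helper only needs read-only
access, here is a user-space sketch. `ro_bitmap_to_hex()` is a made-up
name for this example and is not the kernel's bitmap_string(); it only
shows that a formatter can take `const unsigned long *` and that callers
can then keep their own pointers const.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: writes the bitmap as hex, most significant
 * word first, into buf. It only reads the bitmap, so the parameter is
 * a pointer to const. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if buf is too small.
 */
static int ro_bitmap_to_hex(char *buf, size_t len,
			    const unsigned long *bits, size_t nlongs)
{
	size_t off = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = nlongs; i-- > 0; ) {
		/* Two hex digits per byte of an unsigned long, zero padded. */
		int n = snprintf(buf + off, len - off, "%0*lx",
				 (int)(sizeof(unsigned long) * 2), bits[i]);

		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

A caller holding `const unsigned long bits[]` can pass it straight in
without a cast, which is what a const parameter on the real helper would
allow for the `features` pointer in netdev_bits().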

> >>   
> >>   	if (check_pointer(&buf, end, addr, spec))
> >>   		return buf;
> >>   
> >>   	switch (fmt[1]) {
> >>   	case 'F':
> >> -		num = *(const netdev_features_t *)addr;
> >> -		size = sizeof(netdev_features_t);
> >> +		features = (netdev_features_t *)addr;
> > Casts are not needed when assigning from `void *`.
> ok, will fix it
> >> +		spec.field_width = NETDEV_FEATURE_COUNT;
> >>   		break;
> >>   	default:
> >>   		return error_string(buf, end, "(%pN?)", spec);
> >>   	}
> >>   
> >> -	return special_hex_number(buf, end, num, size);
> >> +	return bitmap_string(buf, end, features->bits, spec, fmt);
> >>   }

[...]

> >> -- 
> >> 2.33.0
> > That's my last review email for now. Insane amount of work, I'm glad
> > someone did it finally. Thanks a lot!
> >
> > Olek
> > .
> Hi Olek,
> Thank you for your review.  You made a lot of valuable suggestions. I will
> check them and continue refining the patchset.
> 
> Thanks again!
> 
> Jian

Thanks!
Olek


* Re: [syzbot] WARNING in ieee80211_ibss_csa_beacon
From: syzbot @ 2022-08-11 13:01 UTC (permalink / raw)
  To: code, davem, johannes, kuba, linux-kernel, linux-wireless, netdev,
	syzkaller-bugs
In-Reply-To: <1828ba28d43.27f8b7ca86738.4232033862850200536@siddh.me>

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

 active interface with an up link
[   50.263416][ T3638] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[   50.285671][ T3638] team0: Port device team_slave_0 added
[   50.292792][ T3638] team0: Port device team_slave_1 added
[   50.310141][ T3638] batman_adv: batadv0: Adding interface: batadv_slave_0
[   50.317225][ T3638] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   50.343683][ T3638] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[   50.356731][ T3638] batman_adv: batadv0: Adding interface: batadv_slave_1
[   50.364022][ T3638] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   50.390221][ T3638] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   50.416765][ T3638] device hsr_slave_0 entered promiscuous mode
[   50.423796][ T3638] device hsr_slave_1 entered promiscuous mode
[   50.500816][ T3638] netdevsim netdevsim0 netdevsim0: renamed from eth0
[   50.511693][ T3638] netdevsim netdevsim0 netdevsim1: renamed from eth1
[   50.521112][ T3638] netdevsim netdevsim0 netdevsim2: renamed from eth2
[   50.530709][ T3638] netdevsim netdevsim0 netdevsim3: renamed from eth3
[   50.551898][ T3638] bridge0: port 2(bridge_slave_1) entered blocking state
[   50.559135][ T3638] bridge0: port 2(bridge_slave_1) entered forwarding state
[   50.566985][ T3638] bridge0: port 1(bridge_slave_0) entered blocking state
[   50.574182][ T3638] bridge0: port 1(bridge_slave_0) entered forwarding state
[   50.620324][ T3638] 8021q: adding VLAN 0 to HW filter on device bond0
[   50.634712][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
[   50.644519][   T14] bridge0: port 1(bridge_slave_0) entered disabled state
[   50.653256][   T14] bridge0: port 2(bridge_slave_1) entered disabled state
[   50.661875][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[   50.675639][ T3638] 8021q: adding VLAN 0 to HW filter on device team0
[   50.686683][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): bridge_slave_0: link becomes ready
[   50.696048][   T14] bridge0: port 1(bridge_slave_0) entered blocking state
[   50.703313][   T14] bridge0: port 1(bridge_slave_0) entered forwarding state
[   50.716587][  T143] IPv6: ADDRCONF(NETDEV_CHANGE): bridge_slave_1: link becomes ready
[   50.726078][  T143] bridge0: port 2(bridge_slave_1) entered blocking state
[   50.733163][  T143] bridge0: port 2(bridge_slave_1) entered forwarding state
[   50.751866][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): team_slave_0: link becomes ready
[   50.766290][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready
[   50.775160][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): team_slave_1: link becomes ready
[   50.787909][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): hsr_slave_0: link becomes ready
[   50.800188][ T3638] hsr0: Slave B (hsr_slave_1) is not up; please bring it up to get a fully working HSR network
[   50.812187][ T3638] IPv6: ADDRCONF(NETDEV_CHANGE): hsr0: link becomes ready
[   50.821509][   T14] IPv6: ADDRCONF(NETDEV_CHANGE): hsr_slave_1: link becomes ready
[   50.845246][ T3638] 8021q: adding VLAN 0 to HW filter on device batadv0
[   50.853191][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): vxcan0: link becomes ready
[   50.861297][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): vxcan1: link becomes ready
[   50.976504][  T143] IPv6: ADDRCONF(NETDEV_CHANGE): veth0_virt_wifi: link becomes ready
[   50.990346][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): veth0_vlan: link becomes ready
[   51.001733][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): vlan0: link becomes ready
[   51.009609][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): vlan1: link becomes ready
[   51.018756][ T3638] device veth0_vlan entered promiscuous mode
[   51.033199][ T3638] device veth1_vlan entered promiscuous mode
[   51.053599][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): macvlan0: link becomes ready
[   51.063508][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): macvlan1: link becomes ready
[   51.072555][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): veth0_macvtap: link becomes ready
[   51.084230][ T3638] device veth0_macvtap entered promiscuous mode
[   51.093816][ T3638] device veth1_macvtap entered promiscuous mode
[   51.116192][ T3638] batman_adv: batadv0: Interface activated: batadv_slave_0
[   51.124686][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): veth0_to_batadv: link becomes ready
[   51.136685][    T6] IPv6: ADDRCONF(NETDEV_CHANGE): macvtap0: link becomes ready
[   51.149935][ T3638] batman_adv: batadv0: Interface activated: batadv_slave_1
[   51.158614][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): batadv_slave_1: link becomes ready
[   51.168462][ T3650] IPv6: ADDRCONF(NETDEV_CHANGE): veth1_to_batadv: link becomes ready
[   51.182170][ T3638] netdevsim netdevsim0 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[   51.192643][ T3638] netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[   51.202462][ T3638] netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[   51.212143][ T3638] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[   51.299438][   T29] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   51.310951][   T29] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   51.322462][  T143] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   51.335671][ T2494] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   51.344772][ T2494] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   51.353568][  T143] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
2022/08/11 13:00:39 building call list...
[   51.514478][ T3638] ------------[ cut here ]------------
[   51.520140][ T3638] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
[   51.530079][ T3638] WARNING: CPU: 0 PID: 3638 at lib/debugobjects.c:505 debug_object_assert_init+0x1fa/0x250
[   51.540259][ T3638] Modules linked in:
[   51.544336][ T3638] CPU: 0 PID: 3638 Comm: syz-executor.0 Not tainted 5.19.0-syzkaller #0
[   51.552766][ T3638] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
[   51.563627][ T3638] RIP: 0010:debug_object_assert_init+0x1fa/0x250
[   51.570150][ T3638] Code: e8 eb 83 d1 fd 4c 8b 45 00 48 c7 c7 60 9f 6a 8a 48 c7 c6 60 9c 6a 8a 48 c7 c2 00 a1 6a 8a 31 c9 49 89 d9 31 c0 e8 66 73 4e fd <0f> 0b ff 05 9a 20 8a 09 48 83 c5 38 48 89 e8 48 c1 e8 03 42 80 3c
[   51.590417][ T3638] RSP: 0018:ffffc90003b0f8c8 EFLAGS: 00010046
[   51.596584][ T3638] RAX: af76cc1e655f6400 RBX: 0000000000000000 RCX: ffff888014741d40
[   51.604841][ T3638] RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
[   51.612846][ T3638] RBP: ffffffff8a0fc700 R08: ffffffff8165f59d R09: ffffed1017384f14
[   51.621025][ T3638] R10: ffffed1017384f14 R11: 1ffff11017384f13 R12: dffffc0000000000
[   51.629090][ T3638] R13: ffff88807f3d09d0 R14: 0000000000000011 R15: ffffffff90029988
[   51.637103][ T3638] FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
[   51.646131][ T3638] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.652815][ T3638] CR2: 000000c00060a001 CR3: 0000000020940000 CR4: 00000000003506f0
[   51.660973][ T3638] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   51.668940][ T3638] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.677688][ T3638] Call Trace:
[   51.680962][ T3638]  <TASK>
[   51.683992][ T3638]  del_timer+0x3d/0x2d0
[   51.688212][ T3638]  ? try_to_grab_pending+0xb1/0x700
[   51.693412][ T3638]  try_to_grab_pending+0xbf/0x700
[   51.698619][ T3638]  __cancel_work_timer+0x81/0x5b0
[   51.703983][ T3638]  ? mgmt_send_event_skb+0x2ee/0x4e0
[   51.709401][ T3638]  ? kmem_cache_free+0x95/0x1d0
[   51.714437][ T3638]  ? mgmt_send_event_skb+0x2ee/0x4e0
[   51.719724][ T3638]  mgmt_index_removed+0x244/0x330
[   51.725464][ T3638]  hci_unregister_dev+0x28e/0x460
[   51.730623][ T3638]  ? vhci_open+0x360/0x360
[   51.735209][ T3638]  vhci_release+0x7f/0xd0
[   51.739538][ T3638]  __fput+0x3b9/0x820
[   51.743800][ T3638]  task_work_run+0x146/0x1c0
[   51.748501][ T3638]  do_exit+0x4ed/0x1f30
[   51.752680][ T3638]  ? rcu_read_lock_sched_held+0x41/0xb0
[   51.758365][ T3638]  do_group_exit+0x23b/0x2f0
[   51.762963][ T3638]  ? _raw_spin_unlock_irq+0x1f/0x40
[   51.768653][ T3638]  ? lockdep_hardirqs_on+0x8d/0x130
[   51.774245][ T3638]  get_signal+0x16a3/0x1700
[   51.779122][ T3638]  arch_do_signal_or_restart+0x29/0x5d0
[   51.784691][ T3638]  exit_to_user_mode_loop+0x74/0x150
[   51.790147][ T3638]  exit_to_user_mode_prepare+0xb2/0x140
[   51.796127][ T3638]  syscall_exit_to_user_mode+0x26/0x60
[   51.802152][ T3638]  do_syscall_64+0x49/0x90
[   51.806773][ T3638]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   51.812939][ T3638] RIP: 0033:0x4191dc
[   51.817111][ T3638] Code: Unable to access opcode bytes at RIP 0x4191b2.
[   51.823960][ T3638] RSP: 002b:00007fffcb50dcd0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   51.832477][ T3638] RAX: fffffffffffffe00 RBX: 00007fffcb50dd90 RCX: 00000000004191dc
[   51.840566][ T3638] RDX: 0000000000000050 RSI: 0000000000568020 RDI: 00000000000000f9
[   51.848559][ T3638] RBP: 0000000000000003 R08: 0000000000000000 R09: 0079746972756365
[   51.857099][ T3638] R10: 00000000005436a0 R11: 0000000000000246 R12: 0000000000000032
[   51.865523][ T3638] R13: 000000000000c8b1 R14: 0000000000000000 R15: 00007fffcb50ddd0
[   51.873863][ T3638]  </TASK>
[   51.876995][ T3638] Kernel panic - not syncing: panic_on_warn set ...
[   51.883575][ T3638] CPU: 0 PID: 3638 Comm: syz-executor.0 Not tainted 5.19.0-syzkaller #0
[   51.892096][ T3638] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
[   51.902161][ T3638] Call Trace:
[   51.905527][ T3638]  <TASK>
[   51.908545][ T3638]  dump_stack_lvl+0x131/0x1c8
[   51.913322][ T3638]  panic+0x26b/0x693
[   51.917497][ T3638]  ? __warn+0x131/0x220
[   51.921826][ T3638]  ? debug_object_assert_init+0x1fa/0x250
[   51.927674][ T3638]  __warn+0x1fa/0x220
[   51.931700][ T3638]  ? debug_object_assert_init+0x1fa/0x250
[   51.937520][ T3638]  report_bug+0x1b3/0x2d0
[   51.941878][ T3638]  handle_bug+0x3d/0x70
[   51.946174][ T3638]  exc_invalid_op+0x16/0x40
[   51.950780][ T3638]  asm_exc_invalid_op+0x16/0x20
[   51.955734][ T3638] RIP: 0010:debug_object_assert_init+0x1fa/0x250
[   51.962102][ T3638] Code: e8 eb 83 d1 fd 4c 8b 45 00 48 c7 c7 60 9f 6a 8a 48 c7 c6 60 9c 6a 8a 48 c7 c2 00 a1 6a 8a 31 c9 49 89 d9 31 c0 e8 66 73 4e fd <0f> 0b ff 05 9a 20 8a 09 48 83 c5 38 48 89 e8 48 c1 e8 03 42 80 3c
[   51.981793][ T3638] RSP: 0018:ffffc90003b0f8c8 EFLAGS: 00010046
[   51.988057][ T3638] RAX: af76cc1e655f6400 RBX: 0000000000000000 RCX: ffff888014741d40
[   51.996245][ T3638] RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
[   52.005054][ T3638] RBP: ffffffff8a0fc700 R08: ffffffff8165f59d R09: ffffed1017384f14
[   52.013065][ T3638] R10: ffffed1017384f14 R11: 1ffff11017384f13 R12: dffffc0000000000
[   52.021767][ T3638] R13: ffff88807f3d09d0 R14: 0000000000000011 R15: ffffffff90029988
[   52.031524][ T3638]  ? __wake_up_klogd+0xcd/0x100
[   52.036840][ T3638]  ? debug_object_assert_init+0x1fa/0x250
[   52.042873][ T3638]  del_timer+0x3d/0x2d0
[   52.047348][ T3638]  ? try_to_grab_pending+0xb1/0x700
[   52.052749][ T3638]  try_to_grab_pending+0xbf/0x700
[   52.057800][ T3638]  __cancel_work_timer+0x81/0x5b0
[   52.062831][ T3638]  ? mgmt_send_event_skb+0x2ee/0x4e0
[   52.068345][ T3638]  ? kmem_cache_free+0x95/0x1d0
[   52.073224][ T3638]  ? mgmt_send_event_skb+0x2ee/0x4e0
[   52.078681][ T3638]  mgmt_index_removed+0x244/0x330
[   52.083977][ T3638]  hci_unregister_dev+0x28e/0x460
[   52.089128][ T3638]  ? vhci_open+0x360/0x360
[   52.093621][ T3638]  vhci_release+0x7f/0xd0
[   52.097961][ T3638]  __fput+0x3b9/0x820
[   52.102128][ T3638]  task_work_run+0x146/0x1c0
[   52.106746][ T3638]  do_exit+0x4ed/0x1f30
[   52.110908][ T3638]  ? rcu_read_lock_sched_held+0x41/0xb0
[   52.116673][ T3638]  do_group_exit+0x23b/0x2f0
[   52.121452][ T3638]  ? _raw_spin_unlock_irq+0x1f/0x40
[   52.126661][ T3638]  ? lockdep_hardirqs_on+0x8d/0x130
[   52.131858][ T3638]  get_signal+0x16a3/0x1700
[   52.136804][ T3638]  arch_do_signal_or_restart+0x29/0x5d0
[   52.142556][ T3638]  exit_to_user_mode_loop+0x74/0x150
[   52.147840][ T3638]  exit_to_user_mode_prepare+0xb2/0x140
[   52.153731][ T3638]  syscall_exit_to_user_mode+0x26/0x60
[   52.159196][ T3638]  do_syscall_64+0x49/0x90
[   52.163644][ T3638]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   52.169629][ T3638] RIP: 0033:0x4191dc
[   52.173526][ T3638] Code: Unable to access opcode bytes at RIP 0x4191b2.
[   52.180458][ T3638] RSP: 002b:00007fffcb50dcd0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   52.189133][ T3638] RAX: fffffffffffffe00 RBX: 00007fffcb50dd90 RCX: 00000000004191dc
[   52.197189][ T3638] RDX: 0000000000000050 RSI: 0000000000568020 RDI: 00000000000000f9
[   52.205251][ T3638] RBP: 0000000000000003 R08: 0000000000000000 R09: 0079746972756365
[   52.213269][ T3638] R10: 00000000005436a0 R11: 0000000000000246 R12: 0000000000000032
[   52.221332][ T3638] R13: 000000000000c8b1 R14: 0000000000000000 R15: 00007fffcb50ddd0
[   52.229405][ T3638]  </TASK>
[   52.232869][ T3638] Kernel Offset: disabled
[   52.237327][ T3638] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOCACHE="/syzkaller/.cache/go-build"
GOENV="/syzkaller/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/syzkaller/jobs/linux/gopath/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/syzkaller/jobs/linux/gopath"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/syzkaller/jobs/linux/gopath/src/github.com/google/syzkaller/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1977670166=/tmp/go-build -gno-record-gcc-switches"

git status (err=<nil>)
HEAD detached at 607e3baf1
nothing to commit, working tree clean


go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=607e3baf1c25928040d05fc22eff6fce7edd709e -X 'github.com/google/syzkaller/prog.gitRevisionDate=20210324-183421'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=607e3baf1c25928040d05fc22eff6fce7edd709e -X 'github.com/google/syzkaller/prog.gitRevisionDate=20210324-183421'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=607e3baf1c25928040d05fc22eff6fce7edd709e -X 'github.com/google/syzkaller/prog.gitRevisionDate=20210324-183421'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -static -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"607e3baf1c25928040d05fc22eff6fce7edd709e\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=13730f25080000


Tested on:

commit:         64737995 wifi: mac80211: Don't finalize CSA in IBSS mo..
git tree:       https://github.com/siddhpant/linux.git warning_ibss_csa_beacon
kernel config:  https://syzkaller.appspot.com/x/.config?x=9b770cb261c0c061
dashboard link: https://syzkaller.appspot.com/bug?extid=b6c9fe29aefe68e4ad34
compiler:       Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2

Note: no patches were applied.

