Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 1/7] net: qrtr: Invoke sk_error_report() after setting sk_err
From: Chris Lew @ 2017-09-21 18:27 UTC (permalink / raw)
  To: Bjorn Andersson, David S. Miller; +Cc: netdev, linux-kernel, linux-arm-msm
In-Reply-To: <20170907060329.32402-2-bjorn.andersson@linaro.org>



On 9/6/2017 11:03 PM, Bjorn Andersson wrote:
> Rather than manually waking up any context sleeping on the sock to
> signal an error we should call sk_error_report(). This has the added
> benefit that in-kernel consumers can override this notificatino with
> its own callback.
> 

Typo with notification.

Thanks,
Chris

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply

* Re: [PATCH 00/64] use setup_timer() helper function.
From: David Miller @ 2017-09-21 18:39 UTC (permalink / raw)
  To: allen.lkml
  Cc: netdev, linux-kernel, m.grzeschik, dmitry.tarnyagin, wg, mkl,
	mark.einon, linux, pcnet32, f.fainelli, bcm-kernel-feedback-list,
	venza, ajk, jes, romieu, khc, simon, linux-can,
	adi-buildroot-devel
In-Reply-To: <1506013525-29291-1-git-send-email-allen.lkml@gmail.com>

From: Allen Pais <allen.lkml@gmail.com>
Date: Thu, 21 Sep 2017 22:34:21 +0530

>  This series uses setup_timer() helper function. The series
> addresses the files under drivers/net/*.

I've reviewed this series and will apply it to net-next.

But please send out smaller chunks next time, maybe 10-15
in a bunch?  64 patches at once makes it really hard for
reviewiers.

Thanks.

^ permalink raw reply

* Re: [Patch net] net_sched: remove cls_flower idr on failure
From: Cong Wang @ 2017-09-21 18:47 UTC (permalink / raw)
  To: Linux Kernel Network Developers; +Cc: Cong Wang, Chris Mi, Jiri Pirko
In-Reply-To: <20170920161845.28753-1-xiyou.wangcong@gmail.com>

On Wed, Sep 20, 2017 at 9:18 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> +errout_idr:
> +       if (fnew->handle)
> +               idr_remove_ext(&head->handle_idr, fnew->handle);

Hmm, I should check fold instead of fnew->handle here.
I will update this patch.

^ permalink raw reply

* Re: [Patch net] net_sched: remove cls_flower idr on failure
From: Cong Wang @ 2017-09-21 18:50 UTC (permalink / raw)
  To: Linux Kernel Network Developers; +Cc: Cong Wang, Chris Mi, Jiri Pirko
In-Reply-To: <CAM_iQpX2QHqPFJ-vZz+6M4oDW2UgTbYdRuCxtNLm14S4sRVONg@mail.gmail.com>

On Thu, Sep 21, 2017 at 11:47 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 9:18 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> +errout_idr:
>> +       if (fnew->handle)
>> +               idr_remove_ext(&head->handle_idr, fnew->handle);
>
> Hmm, I should check fold instead of fnew->handle here.
> I will update this patch.

Never mind, it should be the same logic. If fold is non-NULL,
then fnew->handle == 0.

It can be applied as it is.

^ permalink raw reply

* Re: [PATCH net v2] l2tp: fix race condition in l2tp_tunnel_delete
From: David Miller @ 2017-09-21 18:53 UTC (permalink / raw)
  To: sd; +Cc: netdev, g.nault, lucien.xin, tparkin
In-Reply-To: <6bfc5aceda47773af4c75fe7e0e3c0d255a2342d.1505828155.git.sd@queasysnail.net>

From: Sabrina Dubroca <sd@queasysnail.net>
Date: Tue, 19 Sep 2017 15:40:40 +0200

> If we try to delete the same tunnel twice, the first delete operation
> does a lookup (l2tp_tunnel_get), finds the tunnel, calls
> l2tp_tunnel_delete, which queues it for deletion by
> l2tp_tunnel_del_work.
> 
> The second delete operation also finds the tunnel and calls
> l2tp_tunnel_delete. If the workqueue has already fired and started
> running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
> same tunnel a second time, and try to free the socket again.
> 
> Add a dead flag to prevent firing the workqueue twice. Then we can
> remove the check of queue_work's result that was meant to prevent that
> race but doesn't.
> 
> Also check the flag in the tunnel lookup functions, to avoid returning a
> tunnel that is already scheduled for destruction.

Sabrina, please respond to Guillaume's feedback.

Thank you.

^ permalink raw reply

* Re: [PATCH] net_sched: always reset qdisc backlog in qdisc_reset()
From: David Miller @ 2017-09-21 18:57 UTC (permalink / raw)
  To: khlebnikov; +Cc: netdev, xiyou.wangcong, jiri, jhs
In-Reply-To: <150591153693.113604.8604505743746410801.stgit@buzz>

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date: Wed, 20 Sep 2017 15:45:36 +0300

> SKB stored in qdisc->gso_skb also counted into backlog.
> 
> Some qdiscs don't reset backlog to zero in ->reset(),
> for example sfq just dequeue and free all queued skb.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too")

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] net_sched/hfsc: fix curve activation in hfsc_change_class()
From: David Miller @ 2017-09-21 18:57 UTC (permalink / raw)
  To: khlebnikov; +Cc: netdev, xiyou.wangcong, jiri, jhs
In-Reply-To: <150591157194.113686.4485981339195559372.stgit@buzz>

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date: Wed, 20 Sep 2017 15:46:11 +0300

> If real-time or fair-share curves are enabled in hfsc_change_class()
> class isn't inserted into rb-trees yet. Thus init_ed() and init_vf()
> must be called in place of update_ed() and update_vf().
> 
> Remove isn't required because for now curves cannot be disabled.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Applied.

^ permalink raw reply

* Re: [PATCH v4 0/4] Add cross-compilation support to eBPF samples
From: David Miller @ 2017-09-21 18:59 UTC (permalink / raw)
  To: joelaf
  Cc: linux-kernel, netdev, alison, juri.lelli, fengc, daniel, ast,
	kernel-team
In-Reply-To: <20170920160436.24689-1-joelaf@google.com>

From: Joel Fernandes <joelaf@google.com>
Date: Wed, 20 Sep 2017 09:04:32 -0700

> These patches fix issues seen when cross-compiling eBPF samples on arm64.
> Compared to [1], I dropped the controversial inline-asm patch and exploring
> other options to fix it. However these patches are a step in the right
> direction and I look forward to getting them into -next and the merge window.
> 
> Changes since v3:
> - just a repost with acks
> 
> [1] https://lkml.org/lkml/2017/8/7/417

Series applied with typo in patch #4 corrected.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions
From: Daniel Borkmann @ 2017-09-21 19:29 UTC (permalink / raw)
  To: Edward Cree, Y Song; +Cc: Alexei Starovoitov, David Miller, netdev
In-Reply-To: <207ecd4c-b1b4-3dcd-62a6-30824c19dbf7@solarflare.com>

On 09/21/2017 06:58 PM, Edward Cree wrote:
> On 21/09/17 17:40, Y Song wrote:
>> On Thu, Sep 21, 2017 at 9:24 AM, Edward Cree <ecree@solarflare.com> wrote:
>>> On 21/09/17 16:52, Alexei Starovoitov wrote:
>>>> imo
>>>> (u16) r4 endian be
>>>> isn't intuitive.
>>>> Can we come up with some better syntax?
>>>> Like
>>>> bswap16be r4
>>>> bswap32le r4
>>> Hmm, I don't like these, since bswapbe is a swap on *le* and a nop on be.

Agree, a bit too much 'swap' semantics in the name that could be
confusing perhaps, at least the be/le could be missed easily.

>>>> or
>>>>
>>>> to_be16 r4
>>>> to_le32 r4
>>> And the problem here is that it's not just to_be, it's also from_be.

More intuitive, but agree on the from_be/le. Maybe we should
just drop the "to_" prefix altogether, and leave the rest as is since
it's not surrounded by braces, it's also not a cast but rather an op.

>> Could you explain what is "from_be" here? Do not quite understand.
> Taking the example of a little-endian processor:
> cpu_to_be16() is a byte-swap, converting a u16 (cpu-endian) to a __be16.
> be16_to_cpu(), to convert a __be16 to a u16, is *also* a byte-swap.
> Meanwhile, cpu_to_le16() and le16_to_cpu() are both no-ops.
>
> More generally, the conversions between cpu-endian and fixed-endian for
>   any given size are self-inverses.  eBPF takes advantage of this by only
>   having a single opcode for both the "to" and "from" direction.  So to
>   specify an endianness conversion, you need only the size and the fixed
>   endianness (le or be), not the to/from direction.  Conversely, when
>   disassembling one of these instructions, you don't know whether it's a
>   cpu_to_be16() or a be16_to_cpu(), because they both look the same at an
>   instruction level (they only differ in what types the programmer thought
>   of the register as holding before and after).

Yeah, exactly to the point. :)

^ permalink raw reply

* Re: [v2,1/3] can: m_can: Make hclk optional
From: Franklin S Cooper Jr @ 2017-09-21 19:35 UTC (permalink / raw)
  To: Sekhar Nori, linux-kernel, devicetree, netdev, linux-can, wg, mkl,
	robh+dt, quentin.schulz
  Cc: Kristo, Tero, Tony Lindgren, Linux OMAP List
In-Reply-To: <895ab55a-fede-3a67-27aa-1ac1af9843d7@ti.com>



On 09/21/2017 09:08 AM, Sekhar Nori wrote:
> On Thursday 21 September 2017 06:01 AM, Franklin S Cooper Jr wrote:
>>
>>
>> On 08/24/2017 03:00 AM, Sekhar Nori wrote:
>>> + some OMAP folks and Linux OMAP list
>>>
>>> On Tuesday 25 July 2017 04:21 AM, Franklin Cooper wrote:
>>>> Hclk is the MCAN's interface clock. However, for OMAP based devices such as
>>>> DRA7 SoC family the interface clock is handled by hwmod. Therefore, this
>>>> interface clock is managed by hwmod driver via pm_runtime_get and
>>>> pm_runtime_put calls. Therefore, this interface clock isn't defined in DT
>>>> and thus the driver shouldn't fail if this clock isn't found.
>>>
>>> I agree that hclk is defined as interface clock for M_CAN IP on DRA76x.
>>>
>>> However, there may be a need for the driver to know the value of hclk to
>>> properly configure the RAM watchdog register which has a counter
>>> counting down using hclk. Looks like the driver does not use the RAM
>>> watchdog today. But if there is a need to configure it in future, it
>>> might be a problem.
>>
>> Honestly the RAM watchdog seems like a fundamental design problem.
>> This RAM watchdog seems to be used in case a request to access the
>> message ram is made but it hangs for what ever reason. Its even more
>> complicated since the Message RAM is external to the MCAN IP so its
>> implementation or how its handled probably differs from device to
>> device. From example say you do have this error it isn't clear how you
>> would recover from it. A logically answer would be to reset the entire
>> IP but that also assumes that Message RAM will be reset along with the
>> ip which likely depends on each SoC.
>>
>> But if a readl/writel command hangs will the kernel eventually throw an
>> error on its on or will the driver just hang? If it does hang can a
>> driver in the ISR do something to properly terminate the driver or even
>> recover from it?
>>>
>>> Is there a restriction in OMAP architecture against passing the
>>> interface clock also in the 'clocks' property in DT. I have not tried it
>>> myself, but wonder if you hit an issue that led to this patch.
>>
>> No but not passing the interface clock is typical.
> 
> Okay, then it sounds like it will just be easier to pass the hclk too?
> 
> So it can be used if needed in future and also so that the driver can
> stay the same as today.

That is fine. For now I can just drop this patch unless I discover when
enabling it on DRA76x I am unable to add the interface clock to dt.
> 
> Thanks,
> Sekhar
> 

^ permalink raw reply

* Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions
From: Alexei Starovoitov @ 2017-09-21 19:44 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Edward Cree, Y Song, David Miller, netdev
In-Reply-To: <59C4131D.8050003@iogearbox.net>

On Thu, Sep 21, 2017 at 09:29:33PM +0200, Daniel Borkmann wrote:
> On 09/21/2017 06:58 PM, Edward Cree wrote:
> > On 21/09/17 17:40, Y Song wrote:
> > > On Thu, Sep 21, 2017 at 9:24 AM, Edward Cree <ecree@solarflare.com> wrote:
> > > > On 21/09/17 16:52, Alexei Starovoitov wrote:
> > > > > imo
> > > > > (u16) r4 endian be
> > > > > isn't intuitive.
> > > > > Can we come up with some better syntax?
> > > > > Like
> > > > > bswap16be r4
> > > > > bswap32le r4
> > > > Hmm, I don't like these, since bswapbe is a swap on *le* and a nop on be.
> 
> Agree, a bit too much 'swap' semantics in the name that could be
> confusing perhaps, at least the be/le could be missed easily.
> 
> > > > > or
> > > > > 
> > > > > to_be16 r4
> > > > > to_le32 r4
> > > > And the problem here is that it's not just to_be, it's also from_be.
> 
> More intuitive, but agree on the from_be/le. Maybe we should
> just drop the "to_" prefix altogether, and leave the rest as is since
> it's not surrounded by braces, it's also not a cast but rather an op.

'be16 r4' is ambiguous regarding upper bits.

what about my earlier suggestion:
r4 = (be16) (u16) r4
r4 = (le64) (u64) r4

It will be pretty clear what instruction is doing (that upper bits become zero).

^ permalink raw reply

* Re: [PATCH net-next] bpf/verifier: improve disassembly of BPF_END instructions
From: Edward Cree @ 2017-09-21 19:58 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann; +Cc: Y Song, David Miller, netdev
In-Reply-To: <20170921194426.tnd5xos5irm3gred@ast-mbp>

On 21/09/17 20:44, Alexei Starovoitov wrote:
> On Thu, Sep 21, 2017 at 09:29:33PM +0200, Daniel Borkmann wrote:
>> More intuitive, but agree on the from_be/le. Maybe we should
>> just drop the "to_" prefix altogether, and leave the rest as is since
>> it's not surrounded by braces, it's also not a cast but rather an op.
That works for me.
> 'be16 r4' is ambiguous regarding upper bits.
>
> what about my earlier suggestion:
> r4 = (be16) (u16) r4
> r4 = (le64) (u64) r4
>
> It will be pretty clear what instruction is doing (that upper bits become zero).
Trouble with that is that's very *not* what C will do with those casts
 and it doesn't really capture the bidirectional/symmetry thing.  The
 closest I could see with that is something like `r4 = (be16/u16) r4`,
 but that's quite an ugly mongrel.
I think Daniel's idea of `be16`, `le32` etc one-arg opcodes is the
 cleanest and clearest.  Should it be
    r4 = be16 r4
 or just
    be16 r4
?  Personally I incline towards the latter, but admit it doesn't really
 match the syntax of other opcodes.


To shed a few more bikes, I did also wonder about the BPF_NEG opcode,
 which (if I'm reading the code correctly) currently renders as
    r4 = neg r4 0
    (u32) r4 = neg (u32) r4 0
That printing of the insn->imm, while harmless, is needless and
 potentially confusing.  Should we get rid of it?

^ permalink raw reply

* net: macb: fail when there's no PHY
From: Grant Edwards @ 2017-09-21 19:59 UTC (permalink / raw)
  To: netdev

Several years back (circa 2.6.33) I had to hack up macb.c to work on
an at91 board that didn't have a PHY connected to the macb controller.
Now I might need to get a recent kernel version running on that board.

It looks like the macb driver still can't handle boards that don't
have a PHY.  Is that correct?

What's the right way to deal with this?

With the older macb driver, I ended up adding code to macb.c that
presented a "fake" PHY that discarded MDIO writes and returned some
hard-wired values for MDIO reads.  That seemed like a pretty ugly way
to deal with the situation, so I never bothered to submit a patch.

-- 
Grant Edwards
grant.b.edwards@gmail.com

^ permalink raw reply

* Re: net: macb: fail when there's no PHY
From: Florian Fainelli @ 2017-09-21 20:05 UTC (permalink / raw)
  To: Grant Edwards, netdev
In-Reply-To: <20170921195905.GA29873@grante>

On 09/21/2017 12:59 PM, Grant Edwards wrote:
> Several years back (circa 2.6.33) I had to hack up macb.c to work on
> an at91 board that didn't have a PHY connected to the macb controller.
> Now I might need to get a recent kernel version running on that board.
> 
> It looks like the macb driver still can't handle boards that don't
> have a PHY.  Is that correct?

Not since:

dacdbb4dfc1a1a1378df8ebc914d4fe82259ed46 ("net: macb: add fixed-link
node support")

> 
> What's the right way to deal with this?

Declaring a fixed PHY that will present an emulated link UP, with a
fixed speed/duplex etc. is the way to go.

> 
> With the older macb driver, I ended up adding code to macb.c that
> presented a "fake" PHY that discarded MDIO writes and returned some
> hard-wired values for MDIO reads.  That seemed like a pretty ugly way
> to deal with the situation, so I never bothered to submit a patch.
> 

Yeah, no :)
-- 
Florian

^ permalink raw reply

* Re: net: macb: fail when there's no PHY
From: Grant Edwards @ 2017-09-21 20:36 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev
In-Reply-To: <66c0a032-4d20-69f1-deb4-6c65af6ec740@gmail.com>

On Thu, Sep 21, 2017 at 01:05:57PM -0700, Florian Fainelli wrote:

>> It looks like the macb driver still can't handle boards that don't
>> have a PHY.  Is that correct?
> 
> Not since:
> 
> dacdbb4dfc1a1a1378df8ebc914d4fe82259ed46 ("net: macb: add fixed-link
> node support")

Yep, it's obvious now that I've got the diff in front of me.

Thanks!

[I just started working with device tree for the first time yesterday,
and I must say it's way better than the "old days" which required all
sorts of ugly to produce a kernel that could work on two slightly
different boards.]

-- 
Grant

^ permalink raw reply

* [PATCH] net: use 32-bit arithmetic while allocating net device
From: Alexey Dobriyan @ 2017-09-21 20:33 UTC (permalink / raw)
  To: davem; +Cc: netdev

Private part of allocation is never big enough to warrant size_t.

Space savings:

	add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-10 (-10)
	function                                     old     new   delta
	alloc_netdev_mqs                            1120    1110     -10

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 net/core/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7989,7 +7989,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 		unsigned int txqs, unsigned int rxqs)
 {
 	struct net_device *dev;
-	size_t alloc_size;
+	unsigned int alloc_size;
 	struct net_device *p;
 
 	BUG_ON(strlen(name) >= sizeof(dev->name));

^ permalink raw reply

* [PATCH 1/5] xfrm: make aead_len() return unsigned int
From: Alexey Dobriyan @ 2017-09-21 20:45 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, davem, netdev

Key lengths can't be negative.

Comparison with nla_len() is left signed just in case negative value
can sneak in there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h   |    2 +-
 net/xfrm/xfrm_user.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1764,7 +1764,7 @@ static inline int xfrm_acquire_is_on(struct net *net)
 }
 #endif
 
-static inline int aead_len(struct xfrm_algo_aead *alg)
+static inline unsigned int aead_len(struct xfrm_algo_aead *alg)
 {
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -84,7 +84,7 @@ static int verify_aead(struct nlattr **attrs)
 		return 0;
 
 	algp = nla_data(rt);
-	if (nla_len(rt) < aead_len(algp))
+	if (nla_len(rt) < (int)aead_len(algp))
 		return -EINVAL;
 
 	algp->alg_name[sizeof(algp->alg_name) - 1] = '\0';

^ permalink raw reply

* [PATCH 2/5] xfrm: make xfrm_alg_len() return unsigned int
From: Alexey Dobriyan @ 2017-09-21 20:46 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, davem, netdev
In-Reply-To: <20170921204543.GB13550@avx2>

Key lengths can't be negative.

Comparison with nla_len() is left signed just in case negative value
can sneak in there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h   |    2 +-
 net/xfrm/xfrm_user.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1769,7 +1769,7 @@ static inline unsigned int aead_len(struct xfrm_algo_aead *alg)
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
 
-static inline int xfrm_alg_len(const struct xfrm_algo *alg)
+static inline unsigned int xfrm_alg_len(const struct xfrm_algo *alg)
 {
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -42,7 +42,7 @@ static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type)
 		return 0;
 
 	algp = nla_data(rt);
-	if (nla_len(rt) < xfrm_alg_len(algp))
+	if (nla_len(rt) < (int)xfrm_alg_len(algp))
 		return -EINVAL;
 
 	switch (type) {

^ permalink raw reply

* [PATCH 3/5] xfrm: make xfrm_alg_auth_len() return unsigned int
From: Alexey Dobriyan @ 2017-09-21 20:47 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, davem, netdev
In-Reply-To: <20170921204543.GB13550@avx2>

Key lengths can't be negative.

Comparison with nla_len() is left signed just in case negative value
can sneak in there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h   |    2 +-
 net/xfrm/xfrm_user.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1774,7 +1774,7 @@ static inline unsigned int xfrm_alg_len(const struct xfrm_algo *alg)
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
 
-static inline int xfrm_alg_auth_len(const struct xfrm_algo_auth *alg)
+static inline unsigned int xfrm_alg_auth_len(const struct xfrm_algo_auth *alg)
 {
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -68,7 +68,7 @@ static int verify_auth_trunc(struct nlattr **attrs)
 		return 0;
 
 	algp = nla_data(rt);
-	if (nla_len(rt) < xfrm_alg_auth_len(algp))
+	if (nla_len(rt) < (int)xfrm_alg_auth_len(algp))
 		return -EINVAL;
 
 	algp->alg_name[sizeof(algp->alg_name) - 1] = '\0';

^ permalink raw reply

* [PATCH 4/5] xfrm: make xfrm_replay_state_esn_len() return unsigned int
From: Alexey Dobriyan @ 2017-09-21 20:47 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, davem, netdev
In-Reply-To: <20170921204543.GB13550@avx2>

Replay detection bitmaps can't have negative length.

Comparisons with nla_len() are left signed just in case negative value
can sneak in there.

Propagate unsignedness for code size savings:

	add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-38 (-38)
	function                                     old     new   delta
	xfrm_state_construct                        1802    1800      -2
	xfrm_update_ae_params                        295     289      -6
	xfrm_state_migrate                          1345    1339      -6
	xfrm_replay_notify_esn                       349     337     -12
	xfrm_replay_notify_bmp                       345     333     -12

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h   |    2 +-
 net/xfrm/xfrm_user.c |   10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1779,7 +1779,7 @@ static inline unsigned int xfrm_alg_auth_len(const struct xfrm_algo_auth *alg)
 	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
 
-static inline int xfrm_replay_state_esn_len(struct xfrm_replay_state_esn *replay_esn)
+static inline unsigned int xfrm_replay_state_esn_len(struct xfrm_replay_state_esn *replay_esn)
 {
 	return sizeof(*replay_esn) + replay_esn->bmp_len * sizeof(__u32);
 }
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -130,7 +130,7 @@ static inline int verify_replay(struct xfrm_usersa_info *p,
 		if (rs->bmp_len > XFRMA_REPLAY_ESN_MAX / sizeof(rs->bmp[0]) / 8)
 			return -EINVAL;
 
-		if (nla_len(rt) < xfrm_replay_state_esn_len(rs) &&
+		if (nla_len(rt) < (int)xfrm_replay_state_esn_len(rs) &&
 		    nla_len(rt) != sizeof(*rs))
 			return -EINVAL;
 	}
@@ -404,7 +404,7 @@ static inline int xfrm_replay_verify_len(struct xfrm_replay_state_esn *replay_es
 					 struct nlattr *rp)
 {
 	struct xfrm_replay_state_esn *up;
-	int ulen;
+	unsigned int ulen;
 
 	if (!replay_esn || !rp)
 		return 0;
@@ -414,7 +414,7 @@ static inline int xfrm_replay_verify_len(struct xfrm_replay_state_esn *replay_es
 
 	/* Check the overall length and the internal bitmap length to avoid
 	 * potential overflow. */
-	if (nla_len(rp) < ulen ||
+	if (nla_len(rp) < (int)ulen ||
 	    xfrm_replay_state_esn_len(replay_esn) != ulen ||
 	    replay_esn->bmp_len != up->bmp_len)
 		return -EINVAL;
@@ -430,14 +430,14 @@ static int xfrm_alloc_replay_state_esn(struct xfrm_replay_state_esn **replay_esn
 				       struct nlattr *rta)
 {
 	struct xfrm_replay_state_esn *p, *pp, *up;
-	int klen, ulen;
+	unsigned int klen, ulen;
 
 	if (!rta)
 		return 0;
 
 	up = nla_data(rta);
 	klen = xfrm_replay_state_esn_len(up);
-	ulen = nla_len(rta) >= klen ? klen : sizeof(*up);
+	ulen = nla_len(rta) >= (int)klen ? klen : sizeof(*up);
 
 	p = kzalloc(klen, GFP_KERNEL);
 	if (!p)

^ permalink raw reply

* [PATCH 5/5] xfrm: eradicate size_t
From: Alexey Dobriyan @ 2017-09-21 20:48 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, davem, netdev
In-Reply-To: <20170921204543.GB13550@avx2>

All netlink message sizes are a) unsigned, b) can't be >= 4GB in size
because netlink doesn't support >= 64KB messages in the first place.

All those size_t across the code are a scam especially across networking
which likes to work with small numbers like 1500 or 65536.

Propagate unsignedness and flip some "int" to "unsigned int" as well.

This is preparation to switching nlmsg_new() to "unsigned int".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 net/xfrm/xfrm_user.c |   44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -458,9 +458,9 @@ static int xfrm_alloc_replay_state_esn(struct xfrm_replay_state_esn **replay_esn
 	return 0;
 }
 
-static inline int xfrm_user_sec_ctx_size(struct xfrm_sec_ctx *xfrm_ctx)
+static inline unsigned int xfrm_user_sec_ctx_size(struct xfrm_sec_ctx *xfrm_ctx)
 {
-	int len = 0;
+	unsigned int len = 0;
 
 	if (xfrm_ctx) {
 		len += sizeof(struct xfrm_user_sec_ctx);
@@ -1031,7 +1031,7 @@ static inline int xfrm_nlmsg_multicast(struct net *net, struct sk_buff *skb,
 		return -1;
 }
 
-static inline size_t xfrm_spdinfo_msgsize(void)
+static inline unsigned int xfrm_spdinfo_msgsize(void)
 {
 	return NLMSG_ALIGN(4)
 	       + nla_total_size(sizeof(struct xfrmu_spdinfo))
@@ -1157,7 +1157,7 @@ static int xfrm_get_spdinfo(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return nlmsg_unicast(net->xfrm.nlsk, r_skb, sportid);
 }
 
-static inline size_t xfrm_sadinfo_msgsize(void)
+static inline unsigned int xfrm_sadinfo_msgsize(void)
 {
 	return NLMSG_ALIGN(4)
 	       + nla_total_size(sizeof(struct xfrmu_sadhinfo))
@@ -1633,7 +1633,7 @@ static inline int copy_to_user_sec_ctx(struct xfrm_policy *xp, struct sk_buff *s
 		return copy_sec_ctx(xp->security, skb);
 	return 0;
 }
-static inline size_t userpolicy_type_attrsize(void)
+static inline unsigned int userpolicy_type_attrsize(void)
 {
 #ifdef CONFIG_XFRM_SUB_POLICY
 	return nla_total_size(sizeof(struct xfrm_userpolicy_type));
@@ -1850,9 +1850,9 @@ static int xfrm_flush_sa(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return 0;
 }
 
-static inline size_t xfrm_aevent_msgsize(struct xfrm_state *x)
+static inline unsigned int xfrm_aevent_msgsize(struct xfrm_state *x)
 {
-	size_t replay_size = x->replay_esn ?
+	unsigned int replay_size = x->replay_esn ?
 			      xfrm_replay_state_esn_len(x->replay_esn) :
 			      sizeof(struct xfrm_replay_state);
 
@@ -2321,8 +2321,8 @@ static int copy_to_user_kmaddress(const struct xfrm_kmaddress *k, struct sk_buff
 	return nla_put(skb, XFRMA_KMADDRESS, sizeof(uk), &uk);
 }
 
-static inline size_t xfrm_migrate_msgsize(int num_migrate, int with_kma,
-					  int with_encp)
+static inline unsigned int xfrm_migrate_msgsize(int num_migrate, int with_kma,
+						int with_encp)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_userpolicy_id))
 	      + (with_kma ? nla_total_size(sizeof(struct xfrm_kmaddress)) : 0)
@@ -2566,7 +2566,7 @@ static void xfrm_netlink_rcv(struct sk_buff *skb)
 	mutex_unlock(&net->xfrm.xfrm_cfg_mutex);
 }
 
-static inline size_t xfrm_expire_msgsize(void)
+static inline unsigned int xfrm_expire_msgsize(void)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_user_expire))
 	       + nla_total_size(sizeof(struct xfrm_mark));
@@ -2654,9 +2654,9 @@ static int xfrm_notify_sa_flush(const struct km_event *c)
 	return xfrm_nlmsg_multicast(net, skb, 0, XFRMNLGRP_SA);
 }
 
-static inline size_t xfrm_sa_len(struct xfrm_state *x)
+static inline unsigned int xfrm_sa_len(struct xfrm_state *x)
 {
-	size_t l = 0;
+	unsigned int l = 0;
 	if (x->aead)
 		l += nla_total_size(aead_len(x->aead));
 	if (x->aalg) {
@@ -2701,8 +2701,9 @@ static int xfrm_notify_sa(struct xfrm_state *x, const struct km_event *c)
 	struct xfrm_usersa_id *id;
 	struct nlmsghdr *nlh;
 	struct sk_buff *skb;
-	int len = xfrm_sa_len(x);
-	int headlen, err;
+	unsigned int len = xfrm_sa_len(x);
+	unsigned int headlen;
+	int err;
 
 	headlen = sizeof(*p);
 	if (c->event == XFRM_MSG_DELSA) {
@@ -2776,8 +2777,8 @@ static int xfrm_send_state_notify(struct xfrm_state *x, const struct km_event *c
 
 }
 
-static inline size_t xfrm_acquire_msgsize(struct xfrm_state *x,
-					  struct xfrm_policy *xp)
+static inline unsigned int xfrm_acquire_msgsize(struct xfrm_state *x,
+						struct xfrm_policy *xp)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_user_acquire))
 	       + nla_total_size(sizeof(struct xfrm_user_tmpl) * xp->xfrm_nr)
@@ -2900,7 +2901,7 @@ static struct xfrm_policy *xfrm_compile_policy(struct sock *sk, int opt,
 	return xp;
 }
 
-static inline size_t xfrm_polexpire_msgsize(struct xfrm_policy *xp)
+static inline unsigned int xfrm_polexpire_msgsize(struct xfrm_policy *xp)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_user_polexpire))
 	       + nla_total_size(sizeof(struct xfrm_user_tmpl) * xp->xfrm_nr)
@@ -2957,13 +2958,14 @@ static int xfrm_exp_policy_notify(struct xfrm_policy *xp, int dir, const struct
 
 static int xfrm_notify_policy(struct xfrm_policy *xp, int dir, const struct km_event *c)
 {
-	int len = nla_total_size(sizeof(struct xfrm_user_tmpl) * xp->xfrm_nr);
+	unsigned int len = nla_total_size(sizeof(struct xfrm_user_tmpl) * xp->xfrm_nr);
 	struct net *net = xp_net(xp);
 	struct xfrm_userpolicy_info *p;
 	struct xfrm_userpolicy_id *id;
 	struct nlmsghdr *nlh;
 	struct sk_buff *skb;
-	int headlen, err;
+	unsigned int headlen;
+	int err;
 
 	headlen = sizeof(*p);
 	if (c->event == XFRM_MSG_DELPOLICY) {
@@ -3070,7 +3072,7 @@ static int xfrm_send_policy_notify(struct xfrm_policy *xp, int dir, const struct
 
 }
 
-static inline size_t xfrm_report_msgsize(void)
+static inline unsigned int xfrm_report_msgsize(void)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_user_report));
 }
@@ -3115,7 +3117,7 @@ static int xfrm_send_report(struct net *net, u8 proto,
 	return xfrm_nlmsg_multicast(net, skb, 0, XFRMNLGRP_REPORT);
 }
 
-static inline size_t xfrm_mapping_msgsize(void)
+static inline unsigned int xfrm_mapping_msgsize(void)
 {
 	return NLMSG_ALIGN(sizeof(struct xfrm_user_mapping));
 }

^ permalink raw reply

* pull-request: ieee802154 2017-09-20
From: Stefan Schmidt @ 2017-09-21 20:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-wpan, alex.aring, marcel, netdev

Hello Dave.

[Resend with netdev in cc]

Here comes a pull request for ieee802154 changes I have queued up for
this merge window.

Normally these have been coming through the bluetooth tree but as this
three have been falling through the cracks so far and I have to review
and ack all of them anyway I think it makes sense if I save the
bluetooth people some work and handle them directly.

Its the first pull request I send to you so please let me know if I did
something wrong or if you prefer a different format.

regards
Stefan Schmidt

The following changes since commit 8ca712c373a462cfa1b62272870b6c2c74aa83f9:

  Merge branch 'net-speedup-netns-create-delete-time' (2017-09-19 16:32:24 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next.git ieee802154-for-davem-2017-09-20

for you to fetch changes up to d5dd29e4dafef4baad7bf529ad73cafeb13e1aa8:

  ieee802154: atusb: Driver for Busware HUL dongle (2017-09-20 13:37:16 +0200)

----------------------------------------------------------------
Support for the hulusb hardware inside the atusb driver by Josef
Filzmaier and 802.15.4 MAC security compliance fix by Diogenes Pereira.
----------------------------------------------------------------
Diogenes Pereira (2):
      mac802154: replace hardcoded value with macro
      mac802154: Fix MAC header and payload encrypted

Josef Filzmaier (1):
      ieee802154: atusb: Driver for Busware HUL dongle

 drivers/net/ieee802154/atusb.c | 317 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------
 drivers/net/ieee802154/atusb.h |   8 ++++
 net/mac802154/llsec.c          |  14 ++++--
 3 files changed, 295 insertions(+), 44 deletions(-)

^ permalink raw reply

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
From: Willem de Bruijn @ 2017-09-21 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: Willem de Bruijn, Network Development, nixiaoming
In-Reply-To: <20170920.140327.2188227671204769350.davem@davemloft.net>

On Wed, Sep 20, 2017 at 5:03 PM, David Miller <davem@davemloft.net> wrote:
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Fri, 15 Sep 2017 10:07:46 -0400
>
>> On Thu, Sep 14, 2017 at 5:14 PM, Willem de Bruijn <willemb@google.com> wrote:
>>> Packet socket bind operations must hold the po->bind_lock. This keeps
>>> po->running consistent with whether the socket is actually on a ptype
>>> list to receive packets.
>>>
>>> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
>>> binds the fanout object to receive through packet_rcv_fanout.
>>>
>>> Make it hold the po->bind_lock when testing po->running and rebinding.
>>> Else, it can race with other rebind operations, such as that in
>>> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
>>> can result in a socket being added to a fanout group twice, causing
>>> use-after-free KASAN bug reports, among others.
>>>
>>> Reported independently by both trinity and syzkaller.
>>> Verified that the syzkaller reproducer passes after this patch.
>>>
>>
>> I forgot to add the Fixes tag, sorry.
>>
>> Fixes: dc99f600698d ("packet: Add fanout support.")
>
> Applied and queued up for stable as it fixes this race and I can't
> see any new problems it introduces.
>
> But boy is this one messy area.

Yeah, I've been staring at this code for a while now. But I don't see
an obvious way to simplify it. packet_notifier does not hold the socket
lock, so even if all of bind, set_ring and fanout_add do hold that,
bind_lock is still needed. packet_mmap may not take the socket lock,
so pg_vec_lock must remain to synchronize with packet_set_ring
even if for no other reason. I had a look at using rcu pointers for rx_ring
and tx_ring, to avoid taking that lock in the datapath and possibly updating
without the unbind/bind dance. But that update needs to be atomic with
purging the socket queue, so the socket must be properly quiesced.

> The scariest thing to me now is the save/restore sequence done by
> packet_set_ring(), for example.
>
>         spin_lock(&po->bind_lock);
>         was_running = po->running;
>         num = po->num;
>         if (was_running) {
>                 po->num = 0;
>                 __unregister_prot_hook(sk, false);
>         }
>         spin_unlock(&po->bind_lock);
>  ...
>         spin_lock(&po->bind_lock);
>         if (was_running) {
>                 po->num = num;
>                 register_prot_hook(sk);
>         }
>         spin_unlock(&po->bind_lock);
>
> The socket is also locked during this sequence but that doesn't
> prevent parallel changes to the running state.
>
> Since po->bind_lock is dropped, it's possible for another thread
> to grab bind_lock and bind it meanwhile.
>
> The above code seems to assume that can't happen, and that
> register_prot_hook() will always see po->running set to zero
> and rebind the socket.
>
> If the race happens we'll have weird state, because we did not
> rebind yet we modified po->num.

It appears that the only path that may try to bind without holding the
socket lock is packet_notifier. That skips register_prot_hook if !po->num.

> We seem to have a hierachy of sleeping and non-sleeping locks
> that do not work well together.

Given the number recent bugs that were fixed by locking the socket
inside a particular setsockopt case, I think that we should lock that
entire function, similar to other protocol families (after verifying that
all cases can indeed sleep).

Another issue that looks fragile is the test for po->fanout in
packet_do_bind before taking the socket lock.

^ permalink raw reply

* Re: [PATCH 2/2] ip_tunnel: add mpls over gre encapsulation
From: Francois Romieu @ 2017-09-21 21:25 UTC (permalink / raw)
  To: Amine Kherbouche; +Cc: netdev, xeb, roopa, equinox
In-Reply-To: <1505985924-12479-3-git-send-email-amine.kherbouche@6wind.com>

Amine Kherbouche <amine.kherbouche@6wind.com> :
[...]
> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
> index 36ea2ad..060ed07 100644
> --- a/net/mpls/af_mpls.c
> +++ b/net/mpls/af_mpls.c
[...]
> @@ -39,6 +40,40 @@ static int one = 1;
>  static int label_limit = (1 << 20) - 1;
>  static int ttl_max = 255;
>  
> +size_t ipgre_mpls_encap_hlen(struct ip_tunnel_encap *e)
> +{
> +	return sizeof(struct mpls_shim_hdr);
> +}
> +
> +int ipgre_mpls_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
> +			    u8 *protocol, struct flowi4 *fl4)
> +{
> +	return 0;
> +}
> +
> +static const struct ip_tunnel_encap_ops mpls_iptun_ops = {
> +	.encap_hlen = ipgre_mpls_encap_hlen,
> +	.build_header = ipgre_mpls_build_header,
> +};

Nit: af_mpls.c uses tab before '=' in such places.

> +
> +int ipgre_tunnel_encap_add_mpls_ops(void)
> +{
> +	int ret;
> +
> +	ret = ip_tunnel_encap_add_ops(&mpls_iptun_ops, TUNNEL_ENCAP_MPLS);

ip_tunnel_encap_add_ops is CONFIG_NET_IP_TUNNEL dependant.

Afaics CONFIG_MPLS does not enforce it.

[...]
> @@ -2486,6 +2521,7 @@ static int __init mpls_init(void)
>  		      0);
>  	rtnl_register(PF_MPLS, RTM_GETNETCONF, mpls_netconf_get_devconf,
>  		      mpls_netconf_dump_devconf, 0);
> +	ipgre_tunnel_encap_add_mpls_ops();
>  	err = 0;
>  out:
>  	return err;

ipgre_tunnel_encap_add_mpls_ops status return code is not checked.

-- 
Ueimor

^ permalink raw reply

* Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding performance vs Core/RSS number / HT on
From: Paweł Staszewski @ 2017-09-21 21:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Paolo Abeni, Jesper Dangaard Brouer,
	Linux Kernel Network Developers, Alexander Duyck
In-Reply-To: <4b1efff7-4f91-fd78-beb8-2c7ebcf18895@itcare.pl>



W dniu 2017-08-15 o 11:11, Paweł Staszewski pisze:
> diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
> index 
> 5e831de3103e2f7092c7fa15534def403bc62fb4..9472de846d5c0960996261cb2843032847fa4bf7 
> 100644
> --- a/net/8021q/vlan_netlink.c
> +++ b/net/8021q/vlan_netlink.c
> @@ -143,6 +143,7 @@ static int vlan_newlink(struct net *src_net, 
> struct net_device *dev,
>       vlan->vlan_proto = proto;
>       vlan->vlan_id     = nla_get_u16(data[IFLA_VLAN_ID]);
>       vlan->real_dev     = real_dev;
> +    dev->priv_flags |= (real_dev->priv_flags & IFF_XMIT_DST_RELEASE);
>       vlan->flags     = VLAN_FLAG_REORDER_HDR;
>         err = vlan_check_real_dev(real_dev, vlan->vlan_proto, 
> vlan->vlan_id); 

Any plans for this patch to go normal into the kernel ?

So far im using it for about 3 weeks on all my linux based routers - and 
no problems.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox