Netdev List
* [PATCH 1/1 net-next] NFC: llcp: use sizeof(*sdres) instead of nfc_llcp_sdp_tlv
From: Fabian Frederick @ 2014-10-15 19:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Fabian Frederick, Lauro Ramos Venancio, Aloisio Almeida Jr,
	Samuel Ortiz, David S. Miller, linux-wireless, netdev

 See Documentation/CodingStyle Chapter 14

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/nfc/llcp_commands.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/nfc/llcp_commands.c b/net/nfc/llcp_commands.c
index a3ad69a..a450075 100644
--- a/net/nfc/llcp_commands.c
+++ b/net/nfc/llcp_commands.c
@@ -120,7 +120,7 @@ struct nfc_llcp_sdp_tlv *nfc_llcp_build_sdres_tlv(u8 tid, u8 sap)
 	struct nfc_llcp_sdp_tlv *sdres;
 	u8 value[2];
 
-	sdres = kzalloc(sizeof(struct nfc_llcp_sdp_tlv), GFP_KERNEL);
+	sdres = kzalloc(sizeof(*sdres), GFP_KERNEL);
 	if (sdres == NULL)
 		return NULL;
 
@@ -149,7 +149,7 @@ struct nfc_llcp_sdp_tlv *nfc_llcp_build_sdreq_tlv(u8 tid, char *uri,
 
 	pr_debug("uri: %s, len: %zu\n", uri, uri_len);
 
-	sdreq = kzalloc(sizeof(struct nfc_llcp_sdp_tlv), GFP_KERNEL);
+	sdreq = kzalloc(sizeof(*sdreq), GFP_KERNEL);
 	if (sdreq == NULL)
 		return NULL;
 
-- 
1.9.3
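
The reasoning behind the CodingStyle recommendation can be shown with a
minimal user-space sketch. This is not the kernel code: struct sdp_tlv and
alloc_tlv() are hypothetical stand-ins for struct nfc_llcp_sdp_tlv and its
builders, and calloc() stands in for kzalloc(). Because sizeof(*p) is derived
from the pointer itself, the allocation size cannot go stale if the type of p
is changed later.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfc_llcp_sdp_tlv; fields are illustrative. */
struct sdp_tlv {
	unsigned char tid;
	unsigned char sap;
	size_t tlv_len;
};

/* calloc() plays the role of kzalloc(): zeroed memory, NULL on failure. */
struct sdp_tlv *alloc_tlv(void)
{
	struct sdp_tlv *p;

	/* sizeof(*p) follows the declared type of p automatically. */
	p = calloc(1, sizeof(*p));
	if (p == NULL)
		return NULL;

	/* Both spellings give the same size today; only one survives a type change. */
	assert(sizeof(*p) == sizeof(struct sdp_tlv));
	return p;
}
```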


* [PATCH 1/1 net-next] NFC: netlink: use sizeof(*iter) instead of class_dev_iter
From: Fabian Frederick @ 2014-10-15 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Fabian Frederick, Lauro Ramos Venancio, Aloisio Almeida Jr,
	Samuel Ortiz, David S. Miller, linux-wireless, netdev

Also fixes kzalloc on struct se_io_ctx
See Documentation/CodingStyle Chapter 14

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/nfc/netlink.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c
index 43cb1c1..b4ba68e 100644
--- a/net/nfc/netlink.c
+++ b/net/nfc/netlink.c
@@ -534,7 +534,7 @@ static int nfc_genl_dump_devices(struct sk_buff *skb,
 
 	if (!iter) {
 		first_call = true;
-		iter = kmalloc(sizeof(struct class_dev_iter), GFP_KERNEL);
+		iter = kmalloc(sizeof(*iter), GFP_KERNEL);
 		if (!iter)
 			return -ENOMEM;
 		cb->args[0] = (long) iter;
@@ -1242,7 +1242,7 @@ static int nfc_genl_dump_ses(struct sk_buff *skb,
 
 	if (!iter) {
 		first_call = true;
-		iter = kmalloc(sizeof(struct class_dev_iter), GFP_KERNEL);
+		iter = kmalloc(sizeof(*iter), GFP_KERNEL);
 		if (!iter)
 			return -ENOMEM;
 		cb->args[0] = (long) iter;
@@ -1360,7 +1360,7 @@ static int nfc_genl_se_io(struct sk_buff *skb, struct genl_info *info)
 	if (!apdu)
 		return -EINVAL;
 
-	ctx = kzalloc(sizeof(struct se_io_ctx), GFP_KERNEL);
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
 
-- 
1.9.3


* [PATCH 1/1 net-next] openvswitch: kerneldoc warning fix
From: Fabian Frederick @ 2014-10-15 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Fabian Frederick, Pravin Shelar, David S. Miller, dev, netdev

s/sock/gs

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/openvswitch/vport-geneve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c
index 910b3ef..106a9d8 100644
--- a/net/openvswitch/vport-geneve.c
+++ b/net/openvswitch/vport-geneve.c
@@ -30,7 +30,7 @@
 
 /**
  * struct geneve_port - Keeps track of open UDP ports
- * @sock: The socket created for this port number.
+ * @gs: The socket created for this port number.
  * @name: vport name.
  */
 struct geneve_port {
-- 
1.9.3


* [PATCH 1/1 net-next] openvswitch: use vport instead of p
From: Fabian Frederick @ 2014-10-15 19:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Fabian Frederick, Pravin Shelar, David S. Miller, dev, netdev

All functions used struct vport *vport except
ovs_vport_find_upcall_portid.

This fixes 1 kerneldoc warning

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/openvswitch/vport.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 53001b0..6015802 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -408,13 +408,13 @@ int ovs_vport_get_upcall_portids(const struct vport *vport,
  *
  * Returns the portid of the target socket.  Must be called with rcu_read_lock.
  */
-u32 ovs_vport_find_upcall_portid(const struct vport *p, struct sk_buff *skb)
+u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
 {
 	struct vport_portids *ids;
 	u32 ids_index;
 	u32 hash;
 
-	ids = rcu_dereference(p->upcall_portids);
+	ids = rcu_dereference(vport->upcall_portids);
 
 	if (ids->n_ids == 1 && ids->ids[0] == 0)
 		return 0;
-- 
1.9.3


* Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
From: Anatol Pomozov @ 2014-10-15 19:41 UTC (permalink / raw)
  To: Tom Gundersen
  Cc: One Thousand Gnomes, Takashi Iwai, Kay Sievers, Sreekanth Reddy,
	James Bottomley, Praveen Krishnamoorthy, hare,
	Nagalakshmi Nandigama, Wu Zhangjin, Tetsuo Handa,
	mpt-fusionlinux.pdl, Tim Gardner, Benjamin Poirier,
	Santosh Rastapur, Casey Leedom, Hariprasad S, Pierre Fersing,
	Arjan van de Ven, Abhijit Mahajan, systemd Mailing List
In-Reply-To: <CAG-2HqXdxHrx4OL4E0TnzA1Fb6zK6J+qowfD0RHCwviGUv-_ng@mail.gmail.com>

Hi

On Fri, Oct 10, 2014 at 3:45 PM, Tom Gundersen <teg@jklm.no> wrote:
> On Fri, Oct 10, 2014 at 11:54 PM, Anatol Pomozov
> <anatol.pomozov@gmail.com> wrote:
>> 1) Why not make the timeout configurable through a config file? There
>> is already udev.conf; you can put the config option there. Thus people with
>> modprobe issues can easily "fix" the problem. And then decrease the
>> default timeout back to 30 seconds. I agree that long module loading
>> (more than 30 secs) is abnormal and should be investigated by driver
>> authors.
>
> We can already configure this either on the udev or kernel
> commandline, is that not sufficient (I don't object to also adding it
> to the config file, just asking)?

I did not know that the udev timeout can be configured on the kernel
command line. And since other people ask about changing the timeout, they
most likely did not know about it either. Actually, looking at
http://www.freedesktop.org/software/systemd/man/kernel-command-line.html
I do not see where it mentions the udev timeout.

I think adding the configuration option to the right place (the udev
config file) and documenting it to make it more discoverable will solve
the topic starter's issue. Then anyone can easily set the timeout they
want. The default timeout can go back to 30 sec in this case.

>> 2) Could you add 'echo w > /proc/sysrq-trigger' to udev code right
>> before killing the "modprobe" thread? sysrq will print information
>> about stuck threads (including modprobe itself) this will make
>> debugging easier. e.g. dmesg here
>> https://bugs.archlinux.org/task/40454 says nothing where the threads
>> were stuck.
>
> Are the current warnings (in udev git) sufficient (should tell you
> which module is taking long, but still won't tell you which kernel
> thread of course)?

True, the module name should be enough. In this case, to debug the issue the
user needs to:
 - disable the failing udev rule (or blacklist the module?)
 - reboot, which will let the user get into a shell
 - modprobe the failing module
 - use sysrq-trigger to get more information about the stuck process

So it is more a matter of making debugging easier. Not critical, but it
would be useful imho. This feature could be made configurable via udev.conf.


* Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM
From: Alexander E. Patrakov @ 2014-10-15 19:46 UTC (permalink / raw)
  To: Anatol Pomozov, Tom Gundersen
  Cc: One Thousand Gnomes, Takashi Iwai, Kay Sievers, Sreekanth Reddy,
	James Bottomley, Praveen Krishnamoorthy, hare,
	Nagalakshmi Nandigama, Wu Zhangjin, Tetsuo Handa,
	mpt-fusionlinux.pdl, Tim Gardner, Benjamin Poirier,
	Santosh Rastapur, Casey Leedom, Hariprasad S, Pierre Fersing,
	Arjan van de Ven, Abhijit Mahajan, systemd Mailing List
In-Reply-To: <CAOMFOmW2r2SyRVLAdZTV_1wK+gRES-hQXHGUGnWA=kvv6P3KXQ@mail.gmail.com>

16.10.2014 01:41, Anatol Pomozov wrote:
> True. module name should be enough. In this case to debug the issue user needs:
>   - disable failing udev rule (or blacklist module?)
>   - reboot, it will let the user get into shell
>   - modprobe the failing module
>   - use sysrq-trigger to get more information about stuck process

Nitpick: this works only if the "stuck modprobe" bug is 100%
reproducible, which is not a given. So it is better to collect as much
information about the bug as possible when it is noticed by systemd.

-- 
Alexander E. Patrakov


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Krzysztof Kolasa @ 2014-10-15 20:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Cong Wang, David Miller, Eric Dumazet, netdev
In-Reply-To: <1413387152.17365.26.camel@edumazet-glaptop2.roam.corp.google.com>

On 15.10.2014 at 17:32, Eric Dumazet wrote:
> On Wed, 2014-10-15 at 13:40 +0200, Krzysztof Kolasa wrote:
>
>> on a 32bit system, the patch did not solve the problem :(
>> I have exactly the same problem as before the patch
>
> You mean Cong patch ?
Yes
>
>> I do not understand this, perhaps the problem is hidden somewhere else,
>> one thing is certain after revert commit 971f10eca1 everything works
>> correctly
>>
> Can you privately send me the vmlinux file ?
>
> It might be a compiler issue...
>
>
>
I sent the file


* Re: [PATCH net 0/3] SCTP fixes
From: Neil Horman @ 2014-10-15 20:20 UTC (permalink / raw)
  To: David Miller; +Cc: dborkman, linux-sctp, netdev
In-Reply-To: <20141015.002143.780696221161706737.davem@davemloft.net>

On Wed, Oct 15, 2014 at 12:21:43AM -0400, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Tue, 14 Oct 2014 23:06:32 -0400
> 
> > On Tue, Oct 14, 2014 at 12:46:57PM -0400, David Miller wrote:
> >> From: Daniel Borkmann <dborkman@redhat.com>
> >> Date: Thu,  9 Oct 2014 22:55:30 +0200
> >> 
> >> > Here are some SCTP fixes.
> >> > 
> >> > [ Note, immediate workaround would be to disable ASCONF (it
> >> >   is sysctl disabled by default). It is actually only used
> >> >   together with chunk authentication. ]
> >> 
> >> Series applied, thanks guys.
> > Why did you apply this Dave? There was ongoing discussion around it.
> 
> Sorry Neil, that wasn't my impression.
> 
> Worry not I can revert or apply relative fixes on top.
> :-)
> 
No, no worries, just trying to figure out how to inform admins of duplicate
ASCONF serial numbers.  A follow-on patch to add pr_debugs will be easy here.
Neil

> Thanks.


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Cong Wang @ 2014-10-15 20:36 UTC (permalink / raw)
  To: Krzysztof Kolasa; +Cc: Eric Dumazet, David Miller, Eric Dumazet, netdev
In-Reply-To: <543E4DD8.80203@winsoft.pl>

On Wed, Oct 15, 2014 at 3:35 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
>
> one-line patch not resolve problem, fix created by Cong Wang resolves
> problem !!!
>
> tested on 64bit system and working OK
>

So my patch fixes your problem on 64bit but not on 32bit,
is this correct?

I will send out the patch. And as Eric said, the problem on 32bit
might be a different issue, we can definitely fix it separately.

Thanks for testing!


* Re: [PATCH v2 1/1] net: fec: ptp: fix convergence issue to support LinuxPTP stack
From: David Miller @ 2014-10-15 20:40 UTC (permalink / raw)
  To: b38611; +Cc: richardcochran, netdev, bhutchings, b20596
In-Reply-To: <1413365412-22072-1-git-send-email-b38611@freescale.com>

From: Fugang Duan <b38611@freescale.com>
Date: Wed, 15 Oct 2014 17:30:12 +0800

> iMX6SX IEEE 1588 module has one hw issue in capturing the ATVR register.
> The current SW flow is:
> 		ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
> 		ts_counter_ns = ENET0->ATVR;
> The ATVR value read this way is not the expected one, which prevents the
> LinuxPTP stack from converging.
> 
> The ENET Block Guide/Chapter for the iMX6SX (PELE) addresses the issue:
> After setting ENET_ATCR[Capture], some clock cycles are needed before the
> counter value is captured in the register clock domain. The wait is at least
> 6 clock cycles of the slower of the register clock and the 1588 clock.
> So something like this is needed:
> 		ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
> 		wait();
> 		ts_counter_ns = ENET0->ATVR;
> 
> For iMX6SX, the 1588 ts_clk is fixed at 25MHz and the register clock is 66MHz,
> so the wait must be at least 240ns (40ns * 6). The patch adds a 1us delay
> before the CPU reads the ATVR register.
> 
> Changes V2:
> Modify the commit/comments log to describe the issue clearly.
> 
> Signed-off-by: Fugang Duan <B38611@freescale.com>

Applied, thanks.
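
The delay arithmetic in the commit message can be checked with a small
user-space sketch. The function names below are hypothetical and are not part
of the fec driver; only the two clock rates, the 6-cycle requirement and the
1us delay come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Period of one clock cycle in nanoseconds, rounded up. */
uint64_t cycle_ns(uint64_t hz)
{
	assert(hz > 0);
	return (1000000000ULL + hz - 1) / hz;
}

/* Minimum wait: `cycles` periods of the slower of the two clocks. */
uint64_t min_capture_wait_ns(uint64_t reg_clk_hz, uint64_t ts_clk_hz,
			     unsigned int cycles)
{
	uint64_t slower_hz = reg_clk_hz < ts_clk_hz ? reg_clk_hz : ts_clk_hz;

	return cycles * cycle_ns(slower_hz);
}
```

With a 66MHz register clock and a 25MHz timestamp clock, the slower clock has
a 40ns period, so six cycles take 240ns and a 1us (1000ns) delay leaves ample
margin.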


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Krzysztof Kolasa @ 2014-10-15 20:42 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, David Miller, Eric Dumazet, netdev
In-Reply-To: <CAHA+R7OEvTcZOm-OepWYvGP+pC_CJRoSen351nw8VJGdLTo0uA@mail.gmail.com>

On 15.10.2014 at 22:36, Cong Wang wrote:
> On Wed, Oct 15, 2014 at 3:35 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
>> one-line patch not resolve problem, fix created by Cong Wang resolves
>> problem !!!
>>
>> tested on 64bit system and working OK
>>
> So my patch fixes your problem on 64bit but not on 32bit,
> is this correct?
Yes


* Re: [PATCH] virtio_net: fix use after free
From: David Miller @ 2014-10-15 20:47 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, rusty, virtualization, netdev, jasowang
In-Reply-To: <1413378878-28118-1-git-send-email-mst@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 15 Oct 2014 16:23:28 +0300

> You used __netif_subqueue_stopped but that seems to use
> a slightly more expensive test_bit internally.

More expensive in what sense?  It should be roughly the same
as "x & y" sans the volatile.

Anyways I'm ambivalent and I want to see this bug fixed, so I'll
apply your patch.

Thanks!


* Re: [net] gianfar: Add FCS to rx buffer size (fix)
From: David Miller @ 2014-10-15 20:54 UTC (permalink / raw)
  To: claudiu.manoil; +Cc: netdev, kotnes
In-Reply-To: <1413389506-19172-1-git-send-email-claudiu.manoil@freescale.com>

From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Wed, 15 Oct 2014 19:11:46 +0300

> For each Rx frame the eTSEC writes its FCS (Frame Check Sequence)
> to the Rx buffer.
> 
> The eTSEC h/w manual states in the "Receive Buffer Descriptor Field
> Descriptions" table:
> "Data length is the number of octets written by the eTSEC into this BD's
> data buffer if L is cleared (the value is equal to MRBLR), or, if L is
> set, the length of the frame including *CRC*, FCB (if RCTRL[PRSDEP] > 00),
> preamble (if MACCFG2[PreAmRxEn]=1), time stamp (if RCTRL[TS] = 1) and
> any padding (RCTRL[PAL])."
> 
> Though the FCS bytes are removed by the driver before passing the skb
> to the net stack, the Rx buffer size computation does not currently
> take into account the FCS bytes (4 bytes).
> Because the Rx buffer size is multiple of 512 bytes, leaving out the
> FCS is not a problem for the default MTU of 1500, as the Rx buffer size
> is 1536 in this case.  However, for custom MTUs, where the difference
> between the MTU size and the Rx buffer size is less, this can be a
> problem as the computed Rx buffer size won't be enough to accommodate
> the FCS for a received frame that is big enough (close to MTU size).
> In such case the received frame is considered to be incomplete (L flag
> not set in the RxBD status) and silently dropped.
> 
> Note that the driver does not currently support S/G on Rx, so it has to
> compute its Rx buffer size based on the MTU of the device.
> 
> Reported-by: Kristian Otnes <kotnes@cisco.com>
> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>

Applied, thanks!
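
A simplified user-space model of the sizing problem described in the commit
message. The names and the exact rounding rule are assumptions made for
illustration (the real driver may also account for VLAN tags and FCB padding);
only the 512-byte granularity, the 4-byte FCS and the 1500 -> 1536 case come
from the text above.

```c
#include <assert.h>

#define ETH_HDR_BYTES  14u	/* Ethernet header length */
#define ETH_FCS_BYTES   4u	/* FCS size, as stated in the commit message */
#define BUF_STEP      512u	/* Rx buffer granularity from the commit message */

/* Round n up to the next multiple of BUF_STEP (assumed rounding rule). */
unsigned int round_up_step(unsigned int n)
{
	return (n + BUF_STEP - 1) / BUF_STEP * BUF_STEP;
}

/* Rx buffer size for a given MTU, with or without room for the FCS. */
unsigned int rx_buffer_size(unsigned int mtu, int count_fcs)
{
	unsigned int frame = mtu + ETH_HDR_BYTES;

	assert(mtu > 0);
	if (count_fcs)
		frame += ETH_FCS_BYTES;
	return round_up_step(frame);
}

/* Bytes the hardware writes for a full-size frame: header + payload + FCS. */
unsigned int bytes_written(unsigned int mtu)
{
	return mtu + ETH_HDR_BYTES + ETH_FCS_BYTES;
}
```

Under these assumptions an MTU of 1500 yields a 1536-byte buffer with or
without the FCS, while an MTU of 2034 yields 2048 bytes without it, which is
4 bytes short of the 2052 bytes written for a full-size frame.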


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Cong Wang @ 2014-10-15 21:22 UTC (permalink / raw)
  To: Krzysztof Kolasa; +Cc: Eric Dumazet, David Miller, Eric Dumazet, netdev
In-Reply-To: <543EDC2F.1090604@winsoft.pl>

On Wed, Oct 15, 2014 at 1:42 PM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
> On 15.10.2014 at 22:36, Cong Wang wrote:
>>
>> On Wed, Oct 15, 2014 at 3:35 AM, Krzysztof Kolasa <kkolasa@winsoft.pl>
>> wrote:
>>>
>>> one-line patch not resolve problem, fix created by Cong Wang resolves
>>> problem !!!
>>>
>>> tested on 64bit system and working OK
>>>
>> So my patch fixes your problem on 64bit but not on 32bit,
>> is this correct?
>
> Yes

Cool! Patches are coming.

While Eric is debugging it, would you mind trying a quick follow-up
fix on your 32bit system?

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index e13d778..12bd3f6 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1006,8 +1006,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
        skb->tstamp.tv64 = 0;

        /* Cleanup our debris for IP stacks */
-       memset(skb->cb, 0, max(sizeof(struct inet_skb_parm),
-                              sizeof(struct inet6_skb_parm)));
+       memset(TCPCB(skb), 0, sizeof(*TCPCB(skb)));

        err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Cong Wang @ 2014-10-15 21:25 UTC (permalink / raw)
  To: Krzysztof Kolasa; +Cc: Eric Dumazet, David Miller, Eric Dumazet, netdev
In-Reply-To: <CAHA+R7PK=oXswV0PAkKtAqaeJh+Rx7ug0aaLKoJzJovR98dJpg@mail.gmail.com>

On Wed, Oct 15, 2014 at 2:22 PM, Cong Wang <cwang@twopensource.com> wrote:
>
> Meanwhile Eric is debugging it, do you mind to try a followup quick
> fix on your 32bit system?
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index e13d778..12bd3f6 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1006,8 +1006,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>         skb->tstamp.tv64 = 0;
>
>         /* Cleanup our debris for IP stacks */
> -       memset(skb->cb, 0, max(sizeof(struct inet_skb_parm),
> -                              sizeof(struct inet6_skb_parm)));
> +       memset(TCPCB(skb), 0, sizeof(*TCPCB(skb)));


Of course, s/TCPCB/TCP_SKB_CB/. :)

>
>         err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);


* [Patch net 1/3] ipv4: call __ip_options_echo() in cookie_v4_check()
From: Cong Wang @ 2014-10-15 21:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, Cong Wang, Krzysztof Kolasa, Eric Dumazet, Cong Wang

From: Cong Wang <cwang@twopensource.com>

commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
missed that cookie_v4_check() still calls ip_options_echo() which uses
IPCB(). It should use TCP_SKB_CB() at the TCP layer, so call __ip_options_echo()
instead.

Fixes: commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/ipv4/syncookies.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 0431a8f..7e7401c 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -321,7 +321,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 		int opt_size = sizeof(struct ip_options_rcu) + opt->optlen;
 
 		ireq->opt = kmalloc(opt_size, GFP_ATOMIC);
-		if (ireq->opt != NULL && ip_options_echo(&ireq->opt->opt, skb)) {
+		if (ireq->opt != NULL && __ip_options_echo(&ireq->opt->opt, skb, opt)) {
 			kfree(ireq->opt);
 			ireq->opt = NULL;
 		}
-- 
1.8.3.1


* [Patch net 2/3] ipv4: share tcp_v4_save_options() with cookie_v4_check()
From: Cong Wang @ 2014-10-15 21:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, Cong Wang, Krzysztof Kolasa, Eric Dumazet, Cong Wang
In-Reply-To: <1413408802-21052-1-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <cwang@twopensource.com>

cookie_v4_check() allocates ip_options_rcu in the same way as
tcp_v4_save_options(), so we can share that function as a helper.

Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/tcp.h     | 20 ++++++++++++++++++++
 net/ipv4/syncookies.c | 10 +---------
 net/ipv4/tcp_ipv4.c   | 20 --------------------
 3 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 74efeda..869637a 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1666,4 +1666,24 @@ int tcpv4_offload_init(void);
 void tcp_v4_init(void);
 void tcp_init(void);
 
+/*
+ * Save and compile IPv4 options, return a pointer to it
+ */
+static inline struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb)
+{
+	const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
+	struct ip_options_rcu *dopt = NULL;
+
+	if (opt && opt->optlen) {
+		int opt_size = sizeof(*dopt) + opt->optlen;
+
+		dopt = kmalloc(opt_size, GFP_ATOMIC);
+		if (dopt && __ip_options_echo(&dopt->opt, skb, opt)) {
+			kfree(dopt);
+			dopt = NULL;
+		}
+	}
+	return dopt;
+}
+
 #endif	/* _TCP_H */
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 7e7401c..c68d0a1 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -317,15 +317,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	/* We throwed the options of the initial SYN away, so we hope
 	 * the ACK carries the same options again (see RFC1122 4.2.3.8)
 	 */
-	if (opt && opt->optlen) {
-		int opt_size = sizeof(struct ip_options_rcu) + opt->optlen;
-
-		ireq->opt = kmalloc(opt_size, GFP_ATOMIC);
-		if (ireq->opt != NULL && __ip_options_echo(&ireq->opt->opt, skb, opt)) {
-			kfree(ireq->opt);
-			ireq->opt = NULL;
-		}
-	}
+	ireq->opt = tcp_v4_save_options(skb);
 
 	if (security_inet_conn_request(sk, skb, req)) {
 		reqsk_free(req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 552e87e..6a2a7d6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -880,26 +880,6 @@ bool tcp_syn_flood_action(struct sock *sk,
 }
 EXPORT_SYMBOL(tcp_syn_flood_action);
 
-/*
- * Save and compile IPv4 options into the request_sock if needed.
- */
-static struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb)
-{
-	const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
-	struct ip_options_rcu *dopt = NULL;
-
-	if (opt && opt->optlen) {
-		int opt_size = sizeof(*dopt) + opt->optlen;
-
-		dopt = kmalloc(opt_size, GFP_ATOMIC);
-		if (dopt && __ip_options_echo(&dopt->opt, skb, opt)) {
-			kfree(dopt);
-			dopt = NULL;
-		}
-	}
-	return dopt;
-}
-
 #ifdef CONFIG_TCP_MD5SIG
 /*
  * RFC2385 MD5 checksumming requires a mapping of
-- 
1.8.3.1
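
The pattern the shared helper encapsulates (allocate a header plus a
variable-length tail, and free it again if filling it in fails) can be
sketched in user space. Everything below is a hypothetical stand-in:
struct saved_opts models struct ip_options_rcu, echo_opts() models
__ip_options_echo() only in its "nonzero means failure" convention, and
malloc()/free() replace kmalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OPT_LEN 40u	/* IPv4 options can occupy at most 40 bytes */

/* Stand-in for struct ip_options_rcu: fixed header plus trailing data. */
struct saved_opts {
	size_t optlen;
	unsigned char data[];	/* flexible array member */
};

/* Stand-in for __ip_options_echo(): returns 0 on success, nonzero on error. */
int echo_opts(struct saved_opts *dst, const unsigned char *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > MAX_OPT_LEN)
		return -1;
	dst->optlen = len;
	memcpy(dst->data, src, len);
	return 0;
}

/* Same shape as the helper in the patch: NULL when there is nothing to
 * save, or when allocation or echoing fails. */
struct saved_opts *save_opts(const unsigned char *src, size_t len)
{
	struct saved_opts *dopt = NULL;

	if (len) {
		dopt = malloc(sizeof(*dopt) + len);
		if (dopt && echo_opts(dopt, src, len)) {
			free(dopt);
			dopt = NULL;
		}
	}
	return dopt;
}
```

A caller then only has to handle a NULL result, which is what lets
cookie_v4_check() collapse to a single assignment in the diff above.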


* [Patch net 3/3] ipv4: clean up cookie_v4_check()
From: Cong Wang @ 2014-10-15 21:33 UTC (permalink / raw)
  To: netdev; +Cc: davem, Cong Wang, Krzysztof Kolasa, Eric Dumazet, Cong Wang
In-Reply-To: <1413408802-21052-1-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <cwang@twopensource.com>

We can retrieve opt from skb, no need to pass it as a parameter.
And opt should always be non-NULL, no need to check.

Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/tcp.h     | 5 ++---
 net/ipv4/syncookies.c | 6 +++---
 net/ipv4/tcp_ipv4.c   | 2 +-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 869637a..3a4bbbf 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -468,8 +468,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 /* From syncookies.c */
 int __cookie_v4_check(const struct iphdr *iph, const struct tcphdr *th,
 		      u32 cookie);
-struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
-			     struct ip_options *opt);
+struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb);
 #ifdef CONFIG_SYN_COOKIES
 
 /* Syncookies use a monotonic timer which increments every 60 seconds.
@@ -1674,7 +1673,7 @@ static inline struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb)
 	const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
 	struct ip_options_rcu *dopt = NULL;
 
-	if (opt && opt->optlen) {
+	if (opt->optlen) {
 		int opt_size = sizeof(*dopt) + opt->optlen;
 
 		dopt = kmalloc(opt_size, GFP_ATOMIC);
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index c68d0a1..d346303 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -255,9 +255,9 @@ bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
 }
 EXPORT_SYMBOL(cookie_check_timestamp);
 
-struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
-			     struct ip_options *opt)
+struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 {
+	struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
 	struct tcp_options_received tcp_opt;
 	struct inet_request_sock *ireq;
 	struct tcp_request_sock *treq;
@@ -336,7 +336,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	flowi4_init_output(&fl4, sk->sk_bound_dev_if, ireq->ir_mark,
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP,
 			   inet_sk_flowi_flags(sk),
-			   (opt && opt->srr) ? opt->faddr : ireq->ir_rmt_addr,
+			   opt->srr ? opt->faddr : ireq->ir_rmt_addr,
 			   ireq->ir_loc_addr, th->source, th->dest);
 	security_req_classify_flow(req, flowi4_to_flowi(&fl4));
 	rt = ip_route_output_key(sock_net(sk), &fl4);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 6a2a7d6..94d1a77 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1408,7 +1408,7 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb)
 
 #ifdef CONFIG_SYN_COOKIES
 	if (!th->syn)
-		sk = cookie_v4_check(sk, skb, &TCP_SKB_CB(skb)->header.h4.opt);
+		sk = cookie_v4_check(sk, skb);
 #endif
 	return sk;
 }
-- 
1.8.3.1


* Re: [PATCH linux v3 1/1] fs/proc: use a rb tree for the directory entries
From: Andrew Morton @ 2014-10-15 21:37 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: netdev, linux-kernel, davem, ebiederm, adobriyan, rui.xiang, viro,
	oleg, gorcunov, kirill.shutemov, grant.likely, tytso
In-Reply-To: <1412672559-5256-2-git-send-email-nicolas.dichtel@6wind.com>

On Tue,  7 Oct 2014 11:02:39 +0200 Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:

> The current implementation for the directories in /proc is using a single
> linked list. This is slow when handling directories with large numbers of
> entries (eg netdevice-related entries when lots of tunnels are opened).
> 
> This patch replaces this linked list by a red-black tree.
> 
> ...
>
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -166,6 +166,7 @@ void __init proc_root_init(void)
>  {
>  	int err;
>  
> +	proc_root.subdir = RB_ROOT;
>  	proc_init_inodecache();
>  	err = register_filesystem(&proc_fs_type);
>  	if (err)

This can be done at compile time, can't it?

--- a/fs/proc/root.c~fs-proc-use-a-rb-tree-for-the-directory-entries-fix
+++ a/fs/proc/root.c
@@ -166,7 +166,6 @@ void __init proc_root_init(void)
 {
 	int err;
 
-	proc_root.subdir = RB_ROOT;
 	proc_init_inodecache();
 	err = register_filesystem(&proc_fs_type);
 	if (err)
@@ -252,6 +251,7 @@ struct proc_dir_entry proc_root = {
 	.proc_iops	= &proc_root_inode_operations, 
 	.proc_fops	= &proc_root_operations,
 	.parent		= &proc_root,
+	.subdir		= RB_ROOT,
 	.name		= "/proc",
 };
 
_
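
A user-space sketch of the compile-time variant, for illustration only.
struct tree_root, TREE_ROOT_INIT and struct dir_entry are hypothetical
stand-ins for struct rb_root, RB_ROOT and struct proc_dir_entry; the point is
that a designated initializer can set the empty tree root, and even the
self-referencing parent pointer, without any code running at init time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rb_root and RB_ROOT. */
struct tree_root {
	void *node;
};
#define TREE_ROOT_INIT { NULL }

/* Hypothetical stand-in for struct proc_dir_entry. */
struct dir_entry {
	const char *name;
	struct dir_entry *parent;
	struct tree_root subdir;
};

/* Every field, including the pointer to itself, is fixed at compile time. */
struct dir_entry root_entry = {
	.parent = &root_entry,
	.subdir = TREE_ROOT_INIT,
	.name   = "/proc",
};
```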


* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Eric Dumazet @ 2014-10-15 21:39 UTC (permalink / raw)
  To: Cong Wang; +Cc: Krzysztof Kolasa, David Miller, Eric Dumazet, netdev
In-Reply-To: <CAHA+R7OEvTcZOm-OepWYvGP+pC_CJRoSen351nw8VJGdLTo0uA@mail.gmail.com>

On Wed, 2014-10-15 at 13:36 -0700, Cong Wang wrote:
> On Wed, Oct 15, 2014 at 3:35 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
> >
> > one-line patch not resolve problem, fix created by Cong Wang resolves
> > problem !!!
> >
> > tested on 64bit system and working OK
> >
> 
> So my patch fixes your problem on 64bit but not on 32bit,
> is this correct?
> 
> I will send out the patch. And as Eric said, the problem on 32bit
> might be a different issue, we can definitely fix it separately.
> 
> Thanks for testing!

I think more fixes are needed.

This is most probably an IPv6 issue.


* [PATCH] net: m_can: add CONFIG_HAS_IOMEM dependence
From: David Cohen @ 2014-10-15 21:41 UTC (permalink / raw)
  To: wg, mkl; +Cc: linux-can, netdev, linux-kernel, trivial, David Cohen

m_can uses io memory which makes it not compilable on architectures
without HAS_IOMEM such as UML:

drivers/built-in.o: In function `m_can_plat_probe':
m_can.c:(.text+0x218cc5): undefined reference to `devm_ioremap_resource'
m_can.c:(.text+0x218df9): undefined reference to `devm_ioremap'

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
---
 drivers/net/can/m_can/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/can/m_can/Kconfig b/drivers/net/can/m_can/Kconfig
index fca5482c09ac..04f20dd39007 100644
--- a/drivers/net/can/m_can/Kconfig
+++ b/drivers/net/can/m_can/Kconfig
@@ -1,4 +1,5 @@
 config CAN_M_CAN
+	depends on HAS_IOMEM
 	tristate "Bosch M_CAN devices"
 	---help---
 	  Say Y here if you want to support for Bosch M_CAN controller.
-- 
2.1.0



* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Eric Dumazet @ 2014-10-15 21:42 UTC (permalink / raw)
  To: Cong Wang; +Cc: Krzysztof Kolasa, David Miller, Eric Dumazet, netdev
In-Reply-To: <1413409168.17186.0.camel@edumazet-glaptop2.roam.corp.google.com>

On Wed, 2014-10-15 at 14:39 -0700, Eric Dumazet wrote:
> On Wed, 2014-10-15 at 13:36 -0700, Cong Wang wrote:
> > On Wed, Oct 15, 2014 at 3:35 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
> > >
> > > one-line patch not resolve problem, fix created by Cong Wang resolves
> > > problem !!!
> > >
> > > tested on 64bit system and working OK
> > >
> > 
> > So my patch fixes your problem on 64bit but not on 32bit,
> > is this correct?
> > 
> > I will send out the patch. And as Eric said, the problem on 32bit
> > might be a different issue, we can definitely fix it separately.
> > 
> > Thanks for testing!
> 
> I think more fixes are needed.
> 
> This is most probably an IPv6 issue.
> 

Yes, ipv6 uses inet6_iif() a lot. We need to use a different accessor.


* Re: Regarding tx-nocache-copy in the Sheevaplug
From: Benjamin Poirier @ 2014-10-15 21:57 UTC (permalink / raw)
  To: Lluís Batlle i Rossell
  Cc: linux-kernel, netdev, Carles Pagès, linux-arm-kernel
In-Reply-To: <20141013105246.GD1972@vicerveza.homeunix.net>

On 2014/10/13 12:52, Lluís Batlle i Rossell wrote:
> Hello,
> 
> on the 7th of January 2014 this patch was applied:
> https://lkml.org/lkml/2014/1/7/307
> 
> [PATCH v2] net: Do not enable tx-nocache-copy by default
>         
> In the Sheevaplug (ARM Feroceon 88FR131 from Marvell) this made packets to be
> sent corrupted. I think this machine has something special about the cache.
> 
> Enabling back this tx-nocache-copy (as it used to be before the patch) the
> transfers work fine again. I think that most people, encountering this problem,
> completely disable the tx offload instead of enabling back this setting.
> 
> Is this an ARM kernel problem regarding this platform?

This is odd, only x86 defines ARCH_HAS_NOCACHE_UACCESS. On arm,
skb_do_copy_data_nocache() should end up using __copy_from_user()
regardless of tx-nocache-copy.

^ permalink raw reply

* Re: feature suggestion: implement SO_PEERCRED on local AF_INET/AF_INET6 sockets (allow uid-based identification on localhost)
From: David Madore @ 2014-10-15 22:30 UTC (permalink / raw)
  To: Linux Kernel mailing-list, Linux network mailing-list; +Cc: Andy Lutomirski
In-Reply-To: <543E87AC.5090402@amacapital.net>

On Wed, Oct 15, 2014 at 07:41:48AM -0700, Andy Lutomirski wrote:
> On 10/15/2014 06:35 AM, David Madore wrote:
> > Given an AF_UNIX socket, the getsockopt(, SOL_SOCKET, SO_PEERCRED,,)
> > call allows one endpoint to authenticate the other endpoint's pid, uid
> > and gid.
> > 
> > The call is valid on AF_INET and AF_INET6 sockets but returns no data
> > (pid=0, uid=-1, gid=-1).  Obviously it is meaningless to try to get
> > such credentials from an INET/INET6 socket in general, but there is one
> > case where it would make sense: namely, when the endpoint is local
> > (i.e., when the socket is a connection to the same machine, e.g., when
> > connecting to 127.0.0.0/8 or ::1/128).
> 
> I will object to adding it as described, for the same reason that I
> object to anything that extends the current model of socket-based
> credential passing.  Ideally, credentials would *never* be implicitly
> captured by socket syscalls.  We live in the real world, and SO_****CRED
> exists, so I think the best we can do is to try to minimize its use.
> 
> I can elaborate further, or you can IIRC search the archives for
> SCM_IDENTITY, and you can also look at CVE-2013-1979 for a nasty example
> of why this model is broken.

From what I understand, what was broken is mainly that the credentials
were evaluated when the write() system call took place rather than
when socket() or bind() was called: this violates the Unix security
model (privilege control occurs when the file descriptor is created,
not when it is used).  By contrast, it conforms to Unix security
principles that credentials are checked implicitly when binding a
socket (this happens when permissions are being checked on the path
when binding or connecting on a Unix domain socket; and to allow
binding to secure ports in the INET domain; and so on).  It seems to
me that a suid program that is willing to create or bind a socket on
behalf of its caller without knowing exactly what it will be
connecting to should intrinsically be treated as a security
vulnerability, even when it is not obviously exploitable.

Also, to stay with real-world examples, identd exists and is used
for identification on local networks (e.g. localhost), so the capture
of credentials already takes place.  Unix programmers are aware of
this, and know that a privileged program should not bind a socket if
they don't want to leak privileges.  (Another example is the use of -m
owner in iptables.)

And, of course, if Solaris already has this feature, there is some
experience with it.  Has there been any documented vulnerability
relating to the fact that Solaris allows getpeerucred() to
authenticate locally connected AF_INET sockets?

Note that since the possibility of using SO_PEERCRED on AF_INET
sockets does not hitherto exist on Linux, we can be sure that nobody
uses it, so it's not like it might open vulnerabilities in existing
code.  If you think it's insecure, it can be documented as such (by
comparing it with identd): I still think it's better than having no
control at all when binding to localhost, which is the present
situation (causing, e.g., CVE-2014-2914).

Because SO_PEERCRED currently returns {pid=0,uid=-1,gid=-1} on
AF_INET, we might still return this value if there is any risk that
the endpoint would be unwilling to share its credentials: for example,
this value might be returned if the other endpoint is not ptraceable
by the caller - this would still cover the essential use case, which
is for unprivileged users to authenticate the connections from their
own processes.  Would this limitation assuage your worries about the
proposed feature?
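
For reference, here is a minimal sketch of the existing AF_UNIX
behaviour that the proposal would extend to local AF_INET/AF_INET6
sockets.  It only exercises what Linux already does today on a
socketpair(); the helper name unix_peer_cred_demo() is made up for
illustration and is not part of any existing API.

```c
#define _GNU_SOURCE		/* struct ucred is only exposed with _GNU_SOURCE */
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): create an AF_UNIX socket pair
 * and ask one end for the credentials of its peer via SO_PEERCRED.
 * Returns 0 and fills *out on success, -1 on failure (errno is set).
 */
static int unix_peer_cred_demo(struct ucred *out)
{
	int sv[2];
	socklen_t len = sizeof(*out);
	int ret;

	assert(out != NULL);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Credentials were captured when the pair was created, not per write(). */
	ret = getsockopt(sv[0], SOL_SOCKET, SO_PEERCRED, out, &len);

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Since both ends belong to the calling process here, the returned pid,
uid and gid should simply be the caller's own (effective) ids; the
proposal is about getting a comparably meaningful answer when the peer
is reached over a loopback TCP connection instead.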

The thing is, I don't see any other way the ssh port forwarding mess
can ever be improved.  Do you have another solution in mind?

Any attempt to have some kind of authentication of local sockets that
requires participation on the client (authenticatee)'s part is doomed:
if modifying the protocol and/or client code is an option, we might as
well use some form of crypto / TLS.  Or Unix-domain sockets.  But what
are we supposed to do when modifying the client (to make it send
credentials, use crypto or connect on AF_UNIX) is not an option?

-- 
     David A. Madore
   ( http://www.madore.org/~david/ )

^ permalink raw reply

* Re: Regarding tx-nocache-copy in the Sheevaplug
From: Eric Dumazet @ 2014-10-15 22:45 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: Lluís Batlle i Rossell, linux-kernel, netdev,
	Carles Pagès, linux-arm-kernel
In-Reply-To: <20141015215701.GA4109@f1.synalogic.ca>

On Wed, 2014-10-15 at 14:57 -0700, Benjamin Poirier wrote:
> On 2014/10/13 12:52, Lluís Batlle i Rossell wrote:
> > Hello,
> > 
> > on the 7th of January 2014 this patch was applied:
> > https://lkml.org/lkml/2014/1/7/307
> > 
> > [PATCH v2] net: Do not enable tx-nocache-copy by default
> >         
> > In the Sheevaplug (ARM Feroceon 88FR131 from Marvell) this made packets to be
> > sent corrupted. I think this machine has something special about the cache.
> > 
> > Enabling back this tx-nocache-copy (as it used to be before the patch) the
> > transfers work fine again. I think that most people, encountering this problem,
> > completely disable the tx offload instead of enabling back this setting.
> > 
> > Is this an ARM kernel problem regarding this platform?
> 
> This is odd, only x86 defines ARCH_HAS_NOCACHE_UACCESS. On arm,
> skb_do_copy_data_nocache() should end up using __copy_from_user()
> regardless of tx-nocache-copy.

 kmap_atomic()/kunmap_atomic() is missing, so we lack
__cpuc_flush_dcache_area() operations.

^ permalink raw reply

