* Re: [PATCH net] hyper-v: Add myself as additional MAINTAINER
From: gregkh @ 2017-01-05 18:29 UTC (permalink / raw)
To: KY Srinivasan
Cc: Stephen Hemminger, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Stephen Hemminger
In-Reply-To: <DM5PR03MB24904C998F63FDF80C4CBB16A0600@DM5PR03MB2490.namprd03.prod.outlook.com>
On Thu, Jan 05, 2017 at 05:43:04PM +0000, KY Srinivasan wrote:
>
>
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Thursday, January 5, 2017 9:36 AM
> > To: davem@davemloft.net; KY Srinivasan <kys@microsoft.com>
> > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > gregkh@linuxfoundation.org; Stephen Hemminger
> > <sthemmin@microsoft.com>
> > Subject: [PATCH net] hyper-v: Add myself as additional MAINTAINER
> >
> > Update the Hyper-V MAINTAINERS to include myself.
> >
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
>
> Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Thanks, will go queue this up now.
greg k-h
^ permalink raw reply
* Re: [PATCH] MIPS: NI 169445 board support
From: Joao Pinto @ 2017-01-05 18:33 UTC (permalink / raw)
To: Niklas Cassel, Nathan Sullivan, Ralf Baechle, linux-mips
Cc: davem, netdev, linux-kernel, Lars Persson, Joao Pinto
In-Reply-To: <5d5a087f-68ec-e633-0232-0248edf11ee0@axis.com>
Hi,
On 1/5/2017 at 6:28 PM, Niklas Cassel wrote:
> On 01/04/2017 05:38 PM, Nathan Sullivan wrote:
>> On Tue, Dec 20, 2016 at 05:34:34PM +0100, Ralf Baechle wrote:
>>> On Fri, Dec 02, 2016 at 09:42:09AM -0600, Nathan Sullivan wrote:
>>>> Date: Fri, 2 Dec 2016 09:42:09 -0600
>>>> From: Nathan Sullivan <nathan.sullivan@ni.com>
>>>> To: ralf@linux-mips.org, mark.rutland@arm.com, robh+dt@kernel.org
>>>> CC: linux-mips@linux-mips.org, devicetree@vger.kernel.org,
>>>> linux-kernel@vger.kernel.org, Nathan Sullivan <nathan.sullivan@ni.com>
>>>> Subject: [PATCH] MIPS: NI 169445 board support
>>>> Content-Type: text/plain
>>>>
>>>> Support the National Instruments 169445 board.
>>> Nathan,
>>>
>>> I assume you're going to repost the changes Rob asked for in
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.linux-2Dmips.org_patch_14641_-2326924&d=DgICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=s2fO0hii0OGNOv9qQy_HRXy-xAJUD1NNoEcc3io_kx0&m=5p7f9dIkvVVK4UFHimMpezq5NwIJfUpd08c-Zk4_c6c&s=_JwSwe4VFYtxV1tcYt6Z8r4hJX0xfoGhCixygUxlg5s&e= and resubmit?
>>>
>>> Thanks,
>>>
>>> Ralf
>> Hmm, I found the issue with the generic MIPS config and dwc_eth_qos. The NIC
>> driver attempts to cache align a descriptor ring using the ___cacheline_aligned
>> attribute on the descriptor struct, in combination with a "skip" feature in
>> hardware. However, the skip feature only has a three bit field, and the generic
>> MIPS config selects MIPS_L1_CACHE_SHIFT_7. So, the line size is 128, and with a
>> 64-bit bus, that means the NIC descriptor skip field would need to be set to
>> 14 to align the 16-byte descriptors...
>>
>> I guess it makes sense for a generic MIPS kernel to align everything for 128 byte
>> cache lines, and for me to fix the dwc_eth_qos driver to handle cases where the
>> line size is too big for the hardware skip feature, right?
>
> I don't know if you've been following the discussion regarding
> dwc_eth_qos on netdev, but Joao Pinto from Synopsys is
> planning on removing the driver (since the stmmac driver
> now supports the same version of the IP, together with older
> versions of the IP).
>
> Since device tree bindings are treated as an ABI,
> Joao has implemented a glue layer for stmmac that parses
> the dwc_eth_qos binding, but uses stmmac under the hood.
>
> You can use any of the bindings, but since the dwc_eth_qos
> binding will be marked as deprecated, you might want to
> consider moving to the stmmac binding.
A patch set to port dwc_eth_qos to stmmac is at this moment under review:
http://patchwork.ozlabs.org/patch/711428/
http://patchwork.ozlabs.org/patch/711438/
http://patchwork.ozlabs.org/patch/711439/
Niklas has tested it and it works well, so after the patches are upstreamed the
dwc_eth_qos driver will be removed, as agreed with Lars.
Thanks.
>
>>
>> Thanks,
>>
>> Nathan
>>
>>
>
^ permalink raw reply
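The skip-field arithmetic in the quoted discussion above can be checked with a small userspace sketch. The helper names and constants below are hypothetical, not taken from the dwc_eth_qos driver. It assumes, as the thread describes, 16-byte descriptors, a skip value counted in 64-bit bus words, and a three-bit skip field.

```c
#include <assert.h>

/* Hypothetical constants; the values follow the quoted discussion. */
enum {
	DESC_SIZE      = 16, /* bytes per descriptor */
	BUS_WIDTH      = 8,  /* 64-bit bus: skip counted in 8-byte words (assumed) */
	SKIP_FIELD_MAX = 7   /* largest value a three-bit field can hold */
};

/* Bus words to skip so that each descriptor starts on its own cache line. */
static unsigned int skip_words(unsigned int line_size)
{
	assert(line_size % BUS_WIDTH == 0);
	if (line_size <= DESC_SIZE)
		return 0;
	return (line_size - DESC_SIZE) / BUS_WIDTH;
}

/* Returns 1 when the required skip fits the hardware field, 0 otherwise. */
static int skip_fits(unsigned int line_size)
{
	return skip_words(line_size) <= SKIP_FIELD_MAX;
}
```

With a 128-byte line the required skip is (128 - 16) / 8 = 14, which does not fit in three bits. A 64-byte line needs 6, which does.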
* Re: [PATCH net-next] packet: fix panic in __packet_set_timestamp on tpacket_v3 in tx mode
From: Eric Dumazet @ 2017-01-05 18:27 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, sowmini.varadhan, willemb, netdev
In-Reply-To: <1483580068-13854-1-git-send-email-daniel@iogearbox.net>
On Thu, 2017-01-05 at 02:34 +0100, Daniel Borkmann wrote:
> When TX timestamping is in use with TPACKET_V3's TX ring, then we'll
> hit the BUG() in __packet_set_timestamp() when ring buffer slot is
> returned to user space via tpacket_destruct_skb(). This is due to v3
> being assumed as unreachable here, but since 7f953ab2ba46 ("af_packet:
> TX_RING support for TPACKET_V3") it's not anymore. Fix it by filling
> the timestamp back into the ring slot.
>
> Fixes: 7f953ab2ba46 ("af_packet: TX_RING support for TPACKET_V3")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
> net/packet/af_packet.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 7e39087..ddbda25 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -481,6 +481,9 @@ static __u32 __packet_set_timestamp(struct packet_sock *po, void *frame,
> h.h2->tp_nsec = ts.tv_nsec;
> break;
> case TPACKET_V3:
> + h.h3->tp_sec = ts.tv_sec;
> + h.h3->tp_nsec = ts.tv_nsec;
> + break;
> default:
> WARN(1, "TPACKET version not supported.\n");
> BUG();
Gosh. Can we also replace this BUG() with something less aggressive?
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index b9e1a13b4ba36a0bc7edf6a8c2c116c7d48c970c..0c0d268544787dcbef6601c5014e7d3836d16f96 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -476,9 +476,11 @@ static __u32 __packet_set_timestamp(struct packet_sock *po, void *frame,
h.h2->tp_nsec = ts.tv_nsec;
break;
case TPACKET_V3:
+ h.h3->tp_sec = ts.tv_sec;
+ h.h3->tp_nsec = ts.tv_nsec;
+ break;
default:
- WARN(1, "TPACKET version not supported.\n");
- BUG();
+ pr_err_once("TPACKET version %u not supported.\n", po->tp_version);
}
/* one flush is safe, as both fields always lie on the same cacheline */
^ permalink raw reply related
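The idea in the diff above (handle every known ring version, and report an unknown one once instead of crashing) can be sketched in userspace C. All names here are hypothetical stand-ins, not the af_packet structures. Storing microseconds for v1 is an assumption of this sketch, since the quoted hunk does not show that branch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the ring header versions. */
enum ring_version { RING_V1, RING_V2, RING_V3 };

struct slot_ts {
	uint32_t sec;
	uint32_t frac; /* usec for v1 (assumed), nsec for v2 and v3 */
};

/*
 * Store a timestamp in a ring slot.
 * Returns 0 on success and -1 for an unknown version; the unknown case
 * is logged only the first time it is seen and never aborts.
 */
static int slot_set_timestamp(int version, struct slot_ts *slot,
			      uint32_t sec, uint32_t nsec)
{
	static int warned;

	assert(slot != NULL);

	switch (version) {
	case RING_V1:
		slot->sec = sec;
		slot->frac = nsec / 1000;
		break;
	case RING_V2:
	case RING_V3:
		slot->sec = sec;
		slot->frac = nsec;
		break;
	default:
		if (!warned) {
			warned = 1;
			fprintf(stderr, "ring version %d not supported\n",
				version);
		}
		return -1;
	}

	return 0;
}
```

The caller decides what to do with the error return. The slot is left untouched when the version is unknown.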
* Re: [PATCH] MIPS: NI 169445 board support
From: Nathan Sullivan @ 2017-01-05 18:44 UTC (permalink / raw)
To: Joao Pinto
Cc: Niklas Cassel, Ralf Baechle, linux-mips, davem, netdev,
linux-kernel, Lars Persson
In-Reply-To: <8fd70ecb-36fc-7cc7-7795-cd4dccabf8b9@synopsys.com>
On Thu, Jan 05, 2017 at 06:33:53PM +0000, Joao Pinto wrote:
> Hi,
>
> On 1/5/2017 at 6:28 PM, Niklas Cassel wrote:
> > On 01/04/2017 05:38 PM, Nathan Sullivan wrote:
> >> On Tue, Dec 20, 2016 at 05:34:34PM +0100, Ralf Baechle wrote:
> >>> On Fri, Dec 02, 2016 at 09:42:09AM -0600, Nathan Sullivan wrote:
> >>>> Date: Fri, 2 Dec 2016 09:42:09 -0600
> >>>> From: Nathan Sullivan <nathan.sullivan@ni.com>
> >>>> To: ralf@linux-mips.org, mark.rutland@arm.com, robh+dt@kernel.org
> >>>> CC: linux-mips@linux-mips.org, devicetree@vger.kernel.org,
> >>>> linux-kernel@vger.kernel.org, Nathan Sullivan <nathan.sullivan@ni.com>
> >>>> Subject: [PATCH] MIPS: NI 169445 board support
> >>>> Content-Type: text/plain
> >>>>
> >>>> Support the National Instruments 169445 board.
> >>> Nathan,
> >>>
> >>> I assume you're going to repost the changes Rob asked for in
> >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.linux-2Dmips.org_patch_14641_-2326924&d=DgICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=s2fO0hii0OGNOv9qQy_HRXy-xAJUD1NNoEcc3io_kx0&m=5p7f9dIkvVVK4UFHimMpezq5NwIJfUpd08c-Zk4_c6c&s=_JwSwe4VFYtxV1tcYt6Z8r4hJX0xfoGhCixygUxlg5s&e= and resubmit?
> >>>
> >>> Thanks,
> >>>
> >>> Ralf
> >> Hmm, I found the issue with the generic MIPS config and dwc_eth_qos. The NIC
> >> driver attempts to cache align a descriptor ring using the ___cacheline_aligned
> >> attribute on the descriptor struct, in combination with a "skip" feature in
> >> hardware. However, the skip feature only has a three bit field, and the generic
> >> MIPS config selects MIPS_L1_CACHE_SHIFT_7. So, the line size is 128, and with a
> >> 64-bit bus, that means the NIC descriptor skip field would need to be set to
> >> 14 to align the 16-byte descriptors...
> >>
> >> I guess it makes sense for a generic MIPS kernel to align everything for 128 byte
> >> cache lines, and for me to fix the dwc_eth_qos driver to handle cases where the
> >> line size is too big for the hardware skip feature, right?
> >
> > I don't know if you've been following the discussion regarding
> > dwc_eth_qos on netdev, but Joao Pinto from Synopsys is
> > planning on removing the driver (since the stmmac driver
> > now supports the same version of the IP, together with older
> > versions of the IP).
> >
> > Since device tree bindings are treated as an ABI,
> > Joao has implemented a glue layer for stmmac that parses
> > the dwc_eth_qos binding, but uses stmmac under the hood.
> >
> > You can use any of the bindings, but since the dwc_eth_qos
> > binding will be marked as deprecated, you might want to
> > consider moving to the stmmac binding.
>
> A patch set to port dwc_eth_qos to stmmac is at this moment under review:
>
> http://patchwork.ozlabs.org/patch/711428/
> http://patchwork.ozlabs.org/patch/711438/
> http://patchwork.ozlabs.org/patch/711439/
>
> Niklas has tested it and it works well, so after the patches are upstreamed the
> dwc_eth_qos will be removed as agreed with Lars.
>
> Thanks.
>
Thanks for the heads up, I'll wait, adjust my bindings and retest then.
Nathan
> >
> >>
> >> Thanks,
> >>
> >> Nathan
> >>
> >>
> >
>
^ permalink raw reply
* Re: [PATCH] MIPS: NI 169445 board support
From: Joao Pinto @ 2017-01-05 18:45 UTC (permalink / raw)
To: Nathan Sullivan, Joao Pinto
Cc: Niklas Cassel, Ralf Baechle, linux-mips, davem, netdev,
linux-kernel, Lars Persson
In-Reply-To: <20170105184442.GA9424@nathan3500-linux-VM>
On 1/5/2017 at 6:44 PM, Nathan Sullivan wrote:
> On Thu, Jan 05, 2017 at 06:33:53PM +0000, Joao Pinto wrote:
>> Hi,
>>
>> On 1/5/2017 at 6:28 PM, Niklas Cassel wrote:
>>> On 01/04/2017 05:38 PM, Nathan Sullivan wrote:
>>>> On Tue, Dec 20, 2016 at 05:34:34PM +0100, Ralf Baechle wrote:
>>>>> On Fri, Dec 02, 2016 at 09:42:09AM -0600, Nathan Sullivan wrote:
>>>>>> Date: Fri, 2 Dec 2016 09:42:09 -0600
>>>>>> From: Nathan Sullivan <nathan.sullivan@ni.com>
>>>>>> To: ralf@linux-mips.org, mark.rutland@arm.com, robh+dt@kernel.org
>>>>>> CC: linux-mips@linux-mips.org, devicetree@vger.kernel.org,
>>>>>> linux-kernel@vger.kernel.org, Nathan Sullivan <nathan.sullivan@ni.com>
>>>>>> Subject: [PATCH] MIPS: NI 169445 board support
>>>>>> Content-Type: text/plain
>>>>>>
>>>>>> Support the National Instruments 169445 board.
>>>>> Nathan,
>>>>>
>>>>> I assume you're going to repost the changes Rob asked for in
>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.linux-2Dmips.org_patch_14641_-2326924&d=DgICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=s2fO0hii0OGNOv9qQy_HRXy-xAJUD1NNoEcc3io_kx0&m=5p7f9dIkvVVK4UFHimMpezq5NwIJfUpd08c-Zk4_c6c&s=_JwSwe4VFYtxV1tcYt6Z8r4hJX0xfoGhCixygUxlg5s&e= and resubmit?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ralf
>>>> Hmm, I found the issue with the generic MIPS config and dwc_eth_qos. The NIC
>>>> driver attempts to cache align a descriptor ring using the ___cacheline_aligned
>>>> attribute on the descriptor struct, in combination with a "skip" feature in
>>>> hardware. However, the skip feature only has a three bit field, and the generic
>>>> MIPS config selects MIPS_L1_CACHE_SHIFT_7. So, the line size is 128, and with a
>>>> 64-bit bus, that means the NIC descriptor skip field would need to be set to
>>>> 14 to align the 16-byte descriptors...
>>>>
>>>> I guess it makes sense for a generic MIPS kernel to align everything for 128 byte
>>>> cache lines, and for me to fix the dwc_eth_qos driver to handle cases where the
>>>> line size is too big for the hardware skip feature, right?
>>>
>>> I don't know if you've been following the discussion regarding
>>> dwc_eth_qos on netdev, but Joao Pinto from Synopsys is
>>> planning on removing the driver (since the stmmac driver
>>> now supports the same version of the IP, together with older
>>> versions of the IP).
>>>
>>> Since device tree bindings are treated as an ABI,
>>> Joao has implemented a glue layer for stmmac that parses
>>> the dwc_eth_qos binding, but uses stmmac under the hood.
>>>
>>> You can use any of the bindings, but since the dwc_eth_qos
>>> binding will be marked as deprecated, you might want to
>>> consider moving to the stmmac binding.
>>
>> A patch set to port dwc_eth_qos to stmmac is at this moment under review:
>>
>> http://patchwork.ozlabs.org/patch/711428/
>> http://patchwork.ozlabs.org/patch/711438/
>> http://patchwork.ozlabs.org/patch/711439/
>>
>> Niklas has tested it and it works well, so after the patches are upstreamed the
>> dwc_eth_qos will be removed as agreed with Lars.
>>
>> Thanks.
>>
>
> Thanks for the heads up, I'll wait, adjust my bindings and retest then.
Great! Thanks!
>
> Nathan
>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Nathan
>>>>
>>>>
>>>
>>
^ permalink raw reply
* Re: [PATCH v3 net-next 1/2] tools: psock_lib: tighten conditions checked in sock_setfilter
From: Shuah Khan @ 2017-01-05 18:46 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, daniel, willemb, davem, Shuah Khan
In-Reply-To: <20170105155407.GH16822@oracle.com>
On 01/05/2017 08:54 AM, Sowmini Varadhan wrote:
> On (01/04/17 16:26), Shuah Khan wrote:
>>
>> Could you please split this patch into two. Hardening part in one and
>> the cleanup in a separate patch. This way I can get the hardening fix
>> into 4.10 in my next Kselftest update. Cleanup patch can go in later.
>>
>> thanks,
>> -- Shuah
>
> I'm a little confused by the comments above.
>
> Dan's suggestion was that I could have used some other
> tool to generate the code, rather than hand-crafting it as I did.
> In his last message, he suggests that it may be ok to leave
> the hand-crafted version as is (for now), as well.
>
> To make it clear:
> the current v3 version *is* the "hardening" part. Dan's suggestion is
> that the hand-crafted version can be replaced by bpf_asm generated code
> later. That would be the "cleanup" part, which I was going to do in a
> later commit.
>
> Does that help?
>
> --Sowmini
>
Let's try this again. I want to see a separate patch for the
filter cleanup. I don't want that included in the non-udp packet
check. Please address the readability review comments from me and
Daniel when you send your next version.
thanks,
-- Shuah
^ permalink raw reply
* [PATCH v4 net-next] tools: psock_tpacket: block Rx until socket filter has been added and socket has been bound to loopback.
From: Sowmini Varadhan @ 2017-01-05 19:06 UTC (permalink / raw)
To: netdev, sowmini.varadhan; +Cc: daniel, willemb, davem
In-Reply-To: <cover.1483642577.git.sowmini.varadhan@oracle.com>
Packets from any/all interfaces may be queued up on the PF_PACKET socket
before it is bound to the loopback interface by psock_tpacket, and
when these are passed up by the kernel, they could interfere
with the Rx tests.
Avoid interference from spurious packets by blocking Rx until the
socket filter has been set up, and the socket has been bound to the
desired (lo) interface. The effective sequence is
socket(PF_PACKET, SOCK_RAW, 0);
set up ring
Invoke SO_ATTACH_FILTER
bind to sll_protocol set to ETH_P_ALL, sll_ifindex for lo
After this sequence, the only packets that will be passed up are
those received on loopback that pass the attached filter.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
---
v2: patch reworked based on comments from Willem de Bruijn
v4: dropped patch 1/2: leave it soft;
Send patch 2/2 to the owner of tools/testing/selftests/net/ listed in
MAINTAINERS, instead of the list generated by get_maintainer.pl
tools/testing/selftests/net/psock_tpacket.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/net/psock_tpacket.c b/tools/testing/selftests/net/psock_tpacket.c
index 4a1bc64..7f6cd9f 100644
--- a/tools/testing/selftests/net/psock_tpacket.c
+++ b/tools/testing/selftests/net/psock_tpacket.c
@@ -110,7 +110,7 @@ struct block_desc {
static int pfsocket(int ver)
{
- int ret, sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+ int ret, sock = socket(PF_PACKET, SOCK_RAW, 0);
if (sock == -1) {
perror("socket");
exit(1);
@@ -239,7 +239,6 @@ static void walk_v1_v2_rx(int sock, struct ring *ring)
bug_on(ring->type != PACKET_RX_RING);
pair_udp_open(udp_sock, PORT_BASE);
- pair_udp_setfilter(sock);
memset(&pfd, 0, sizeof(pfd));
pfd.fd = sock;
@@ -601,7 +600,6 @@ static void walk_v3_rx(int sock, struct ring *ring)
bug_on(ring->type != PACKET_RX_RING);
pair_udp_open(udp_sock, PORT_BASE);
- pair_udp_setfilter(sock);
memset(&pfd, 0, sizeof(pfd));
pfd.fd = sock;
@@ -741,6 +739,8 @@ static void bind_ring(int sock, struct ring *ring)
{
int ret;
+ pair_udp_setfilter(sock);
+
ring->ll.sll_family = PF_PACKET;
ring->ll.sll_protocol = htons(ETH_P_ALL);
ring->ll.sll_ifindex = if_nametoindex("lo");
--
1.7.1
^ permalink raw reply related
* RE: [PATCH net] hyper-v: Add myself as additional MAINTAINER
From: KY Srinivasan @ 2017-01-05 19:08 UTC (permalink / raw)
To: gregkh@linuxfoundation.org
Cc: Stephen Hemminger, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Stephen Hemminger
In-Reply-To: <20170105182911.GA16146@kroah.com>
> -----Original Message-----
> From: gregkh@linuxfoundation.org [mailto:gregkh@linuxfoundation.org]
> Sent: Thursday, January 5, 2017 10:29 AM
> To: KY Srinivasan <kys@microsoft.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>;
> davem@davemloft.net; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Stephen Hemminger <sthemmin@microsoft.com>
> Subject: Re: [PATCH net] hyper-v: Add myself as additional MAINTAINER
>
> On Thu, Jan 05, 2017 at 05:43:04PM +0000, KY Srinivasan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Sent: Thursday, January 5, 2017 9:36 AM
> > > To: davem@davemloft.net; KY Srinivasan <kys@microsoft.com>
> > > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > gregkh@linuxfoundation.org; Stephen Hemminger
> > > <sthemmin@microsoft.com>
> > > Subject: [PATCH net] hyper-v: Add myself as additional MAINTAINER
> > >
> > > Update the Hyper-V MAINTAINERS to include myself.
> > >
> > > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> >
> > Acked-by: K. Y. Srinivasan <kys@microsoft.com>
>
> Thanks, will go queue this up now.
Thanks Greg. On a different note, there are a bunch of Hyper-V specific
patches that have been submitted over the last month or so that have not
been committed. Should I resend them?
Regards,
K. Y
>
> greg k-h
^ permalink raw reply
* [PATCH net-next v2] net: dsa: b53: Utilize common helpers for u64/MAC
From: Florian Fainelli @ 2017-01-05 19:08 UTC (permalink / raw)
To: netdev; +Cc: davem, andrew, vivien.didelot, volodymyr.bendiuga,
Florian Fainelli
Utilize the two recently introduced functions, u64_to_ether_addr() and
ether_addr_to_u64(), instead of our own versions.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v2:
- include etherdevice.h in b53_priv.h to fix Kbuild reported
errors
drivers/net/dsa/b53/b53_common.c | 2 +-
drivers/net/dsa/b53/b53_priv.h | 24 +++---------------------
2 files changed, 4 insertions(+), 22 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 947adda3397d..d5370c227043 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1137,7 +1137,7 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
int ret;
/* Convert the array into a 64-bit MAC */
- mac = b53_mac_to_u64(addr);
+ mac = ether_addr_to_u64(addr);
/* Perform a read for the given MAC and VID */
b53_write48(dev, B53_ARLIO_PAGE, B53_MAC_ADDR_IDX, mac);
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index f192a673caba..1f4b07b77de2 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -22,6 +22,7 @@
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/phy.h>
+#include <linux/etherdevice.h>
#include <net/dsa.h>
#include "b53_regs.h"
@@ -325,25 +326,6 @@ struct b53_arl_entry {
u8 is_static:1;
};
-static inline void b53_mac_from_u64(u64 src, u8 *dst)
-{
- unsigned int i;
-
- for (i = 0; i < ETH_ALEN; i++)
- dst[ETH_ALEN - 1 - i] = (src >> (8 * i)) & 0xff;
-}
-
-static inline u64 b53_mac_to_u64(const u8 *src)
-{
- unsigned int i;
- u64 dst = 0;
-
- for (i = 0; i < ETH_ALEN; i++)
- dst |= (u64)src[ETH_ALEN - 1 - i] << (8 * i);
-
- return dst;
-}
-
static inline void b53_arl_to_entry(struct b53_arl_entry *ent,
u64 mac_vid, u32 fwd_entry)
{
@@ -352,14 +334,14 @@ static inline void b53_arl_to_entry(struct b53_arl_entry *ent,
ent->is_valid = !!(fwd_entry & ARLTBL_VALID);
ent->is_age = !!(fwd_entry & ARLTBL_AGE);
ent->is_static = !!(fwd_entry & ARLTBL_STATIC);
- b53_mac_from_u64(mac_vid, ent->mac);
+ u64_to_ether_addr(mac_vid, ent->mac);
ent->vid = mac_vid >> ARLTBL_VID_S;
}
static inline void b53_arl_from_entry(u64 *mac_vid, u32 *fwd_entry,
const struct b53_arl_entry *ent)
{
- *mac_vid = b53_mac_to_u64(ent->mac);
+ *mac_vid = ether_addr_to_u64(ent->mac);
*mac_vid |= (u64)(ent->vid & ARLTBL_VID_MASK) << ARLTBL_VID_S;
*fwd_entry = ent->port & ARLTBL_DATA_PORT_ID_MASK;
if (ent->is_valid)
--
2.9.3
^ permalink raw reply related
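For reference, the byte order used by the removed b53 helpers can be reproduced in userspace. The sketch below uses hypothetical names and plain C types. It mirrors the deleted b53_mac_to_u64()/b53_mac_from_u64() loops shown in the diff: the first byte of the address ends up in the most significant of the 48 used bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6 /* same value as ETH_ALEN */

/* Pack a 6-byte MAC address into the low 48 bits of a u64. */
static uint64_t mac_to_u64(const uint8_t *mac)
{
	uint64_t v = 0;
	size_t i;

	assert(mac != NULL);
	for (i = 0; i < MAC_LEN; i++)
		v = (v << 8) | mac[i];

	return v;
}

/* Unpack the low 48 bits of a u64 into a 6-byte MAC address. */
static void u64_to_mac(uint64_t v, uint8_t *mac)
{
	int i;

	assert(mac != NULL);
	for (i = MAC_LEN - 1; i >= 0; i--) {
		mac[i] = v & 0xff;
		v >>= 8;
	}
}
```

For example, the address 00:11:22:33:44:55 packs to 0x001122334455, and unpacking that value gives the same six bytes back.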
* [net-next PATCH v6 3/3] net: dummy: Introduce dummy virtual functions
From: Phil Sutter @ 2017-01-05 19:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20170105190913.29986-1-phil@nwl.cc>
The idea for this was born when testing VF support in iproute2 which was
impeded by hardware requirements. In fact, not every VF-capable hardware
driver implements all netdev ops, so testing the interface is still hard
to do even with a well-sorted hardware shelf.
To overcome this and allow for testing the user-kernel interface, this
patch allows turning dummy into a PF with a configurable number of VFs.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v5:
- Got rid of fake PCI parent hack altogether, implement ndo_get_vf_count
instead.
Changes since v4:
- Initialize pci_pdev.sriov at runtime - older gcc versions don't allow
initializing fields of anonymous unions at declaration time.
- Rebased onto current net-next/master.
Changes since v3:
- Changed type of vf_mac field from unsigned char to u8.
- Column-aligned structs' field names.
Changes since v2:
- Fixed oops on reboot (need to initialize parent device mutex).
- Got rid of potential mem leak noticed by Eric Dumazet.
- Dropped stray newline insertion.
Changes since v1:
- Fixed issues reported by kbuild test robot:
- pci_dev->sriov is only present if CONFIG_PCI_ATS is active.
- pci_bus_type does not exist if CONFIG_PCI is not defined.
---
drivers/net/dummy.c | 178 +++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 176 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 6421835f11b7e..8da0a97ff7cee 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -42,6 +42,25 @@
#define DRV_VERSION "1.0"
static int numdummies = 1;
+static int num_vfs;
+
+struct vf_data_storage {
+ u8 vf_mac[ETH_ALEN];
+ u16 pf_vlan; /* When set, guest VLAN config not allowed. */
+ u16 pf_qos;
+ __be16 vlan_proto;
+ u16 min_tx_rate;
+ u16 max_tx_rate;
+ u8 spoofchk_enabled;
+ bool rss_query_enabled;
+ u8 trusted;
+ int link_state;
+};
+
+struct dummy_priv {
+ int num_vfs;
+ struct vf_data_storage *vfinfo;
+};
/* fake multicast ability */
static void set_multicast_list(struct net_device *dev)
@@ -91,10 +110,25 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
static int dummy_dev_init(struct net_device *dev)
{
+ struct dummy_priv *priv = netdev_priv(dev);
+
dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
if (!dev->dstats)
return -ENOMEM;
+ priv->num_vfs = num_vfs;
+ priv->vfinfo = NULL;
+
+ if (!num_vfs)
+ return 0;
+
+ priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage),
+ GFP_KERNEL);
+ if (!priv->vfinfo) {
+ free_percpu(dev->dstats);
+ return -ENOMEM;
+ }
+
return 0;
}
@@ -112,6 +146,124 @@ static int dummy_change_carrier(struct net_device *dev, bool new_carrier)
return 0;
}
+static int dummy_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (!is_valid_ether_addr(mac) || (vf >= priv->num_vfs))
+ return -EINVAL;
+
+ memcpy(priv->vfinfo[vf].vf_mac, mac, ETH_ALEN);
+
+ return 0;
+}
+
+static int dummy_set_vf_vlan(struct net_device *dev, int vf,
+ u16 vlan, u8 qos, __be16 vlan_proto)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if ((vf >= priv->num_vfs) || (vlan > 4095) || (qos > 7))
+ return -EINVAL;
+
+ priv->vfinfo[vf].pf_vlan = vlan;
+ priv->vfinfo[vf].pf_qos = qos;
+ priv->vfinfo[vf].vlan_proto = vlan_proto;
+
+ return 0;
+}
+
+static int dummy_set_vf_rate(struct net_device *dev, int vf, int min, int max)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ priv->vfinfo[vf].min_tx_rate = min;
+ priv->vfinfo[vf].max_tx_rate = max;
+
+ return 0;
+}
+
+static int dummy_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ priv->vfinfo[vf].spoofchk_enabled = val;
+
+ return 0;
+}
+
+static int dummy_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ priv->vfinfo[vf].rss_query_enabled = val;
+
+ return 0;
+}
+
+static int dummy_set_vf_trust(struct net_device *dev, int vf, bool val)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ priv->vfinfo[vf].trusted = val;
+
+ return 0;
+}
+
+static int dummy_get_vf_config(struct net_device *dev,
+ int vf, struct ifla_vf_info *ivi)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ ivi->vf = vf;
+ memcpy(&ivi->mac, priv->vfinfo[vf].vf_mac, ETH_ALEN);
+ ivi->vlan = priv->vfinfo[vf].pf_vlan;
+ ivi->qos = priv->vfinfo[vf].pf_qos;
+ ivi->spoofchk = priv->vfinfo[vf].spoofchk_enabled;
+ ivi->linkstate = priv->vfinfo[vf].link_state;
+ ivi->min_tx_rate = priv->vfinfo[vf].min_tx_rate;
+ ivi->max_tx_rate = priv->vfinfo[vf].max_tx_rate;
+ ivi->rss_query_en = priv->vfinfo[vf].rss_query_enabled;
+ ivi->trusted = priv->vfinfo[vf].trusted;
+ ivi->vlan_proto = priv->vfinfo[vf].vlan_proto;
+
+ return 0;
+}
+
+static int dummy_set_vf_link_state(struct net_device *dev, int vf, int state)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ if (vf >= priv->num_vfs)
+ return -EINVAL;
+
+ priv->vfinfo[vf].link_state = state;
+
+ return 0;
+}
+
+static int dummy_get_vf_count(const struct net_device *dev)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ return priv->num_vfs;
+}
+
static const struct net_device_ops dummy_netdev_ops = {
.ndo_init = dummy_dev_init,
.ndo_uninit = dummy_dev_uninit,
@@ -121,6 +273,15 @@ static const struct net_device_ops dummy_netdev_ops = {
.ndo_set_mac_address = eth_mac_addr,
.ndo_get_stats64 = dummy_get_stats64,
.ndo_change_carrier = dummy_change_carrier,
+ .ndo_set_vf_mac = dummy_set_vf_mac,
+ .ndo_set_vf_vlan = dummy_set_vf_vlan,
+ .ndo_set_vf_rate = dummy_set_vf_rate,
+ .ndo_set_vf_spoofchk = dummy_set_vf_spoofchk,
+ .ndo_set_vf_trust = dummy_set_vf_trust,
+ .ndo_get_vf_config = dummy_get_vf_config,
+ .ndo_set_vf_link_state = dummy_set_vf_link_state,
+ .ndo_set_vf_rss_query_en = dummy_set_vf_rss_query_en,
+ .ndo_get_vf_count = dummy_get_vf_count,
};
static void dummy_get_drvinfo(struct net_device *dev,
@@ -134,6 +295,14 @@ static const struct ethtool_ops dummy_ethtool_ops = {
.get_drvinfo = dummy_get_drvinfo,
};
+static void dummy_free_netdev(struct net_device *dev)
+{
+ struct dummy_priv *priv = netdev_priv(dev);
+
+ kfree(priv->vfinfo);
+ free_netdev(dev);
+}
+
static void dummy_setup(struct net_device *dev)
{
ether_setup(dev);
@@ -141,7 +310,7 @@ static void dummy_setup(struct net_device *dev)
/* Initialize the device structure. */
dev->netdev_ops = &dummy_netdev_ops;
dev->ethtool_ops = &dummy_ethtool_ops;
- dev->destructor = free_netdev;
+ dev->destructor = dummy_free_netdev;
/* Fill in device structure with ethernet-generic values. */
dev->flags |= IFF_NOARP;
@@ -172,6 +341,7 @@ static int dummy_validate(struct nlattr *tb[], struct nlattr *data[])
static struct rtnl_link_ops dummy_link_ops __read_mostly = {
.kind = DRV_NAME,
+ .priv_size = sizeof(struct dummy_priv),
.setup = dummy_setup,
.validate = dummy_validate,
};
@@ -180,12 +350,16 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
module_param(numdummies, int, 0);
MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
+module_param(num_vfs, int, 0);
+MODULE_PARM_DESC(num_vfs, "Number of dummy VFs per dummy device");
+
static int __init dummy_init_one(void)
{
struct net_device *dev_dummy;
int err;
- dev_dummy = alloc_netdev(0, "dummy%d", NET_NAME_UNKNOWN, dummy_setup);
+ dev_dummy = alloc_netdev(sizeof(struct dummy_priv),
+ "dummy%d", NET_NAME_UNKNOWN, dummy_setup);
if (!dev_dummy)
return -ENOMEM;
--
2.11.0
^ permalink raw reply related
* [net-next PATCH v6 1/3] net: net_device_ops: Introduce ndo_get_vf_count
From: Phil Sutter @ 2017-01-05 19:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20170105190913.29986-1-phil@nwl.cc>
The idea is to allow drivers to implement this callback in order to
provide a custom way to return the number of virtual functions present
on the device.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v5:
- Introduced this patch.
---
include/linux/netdevice.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ecd78b3c9abad..a04a693f55065 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -964,6 +964,10 @@ struct netdev_xdp {
* with PF and querying it may introduce a theoretical security risk.
* int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting);
* int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
+ * int (*ndo_get_vf_count)(const struct net_device *dev);
+ * Return the number of VFs present on this device instead of having
+ * rtnetlink use pci_num_vf() on the PCI parent device.
+ *
* int (*ndo_setup_tc)(struct net_device *dev, u8 tc)
* Called to setup 'tc' number of traffic classes in the net device. This
* is always called from the stack with the rtnl lock held and netif tx
@@ -1218,6 +1222,7 @@ struct net_device_ops {
int (*ndo_set_vf_rss_query_en)(
struct net_device *dev,
int vf, bool setting);
+ int (*ndo_get_vf_count)(const struct net_device *dev);
int (*ndo_setup_tc)(struct net_device *dev,
u32 handle,
__be16 protocol,
--
2.11.0
^ permalink raw reply related
* [net-next PATCH v6 0/3] net: dummy: Introduce dummy virtual functions
From: Phil Sutter @ 2017-01-05 19:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev
This series adds VF support to dummy device driver after adding the
necessary infrastructure changes:
Patch 1 adds a netdevice callback for device-specific VF count
retrieval. Patch 2 then changes dev_num_vf() implementation to make use
of that new callback (if implemented), falling back to the old
behaviour. Patch 3 then implements VF support in dummy, without the fake
PCI parent device hack from v5.
Phil Sutter (3):
net: net_device_ops: Introduce ndo_get_vf_count
net: rtnetlink: Use a local dev_num_vf() implementation
net: dummy: Introduce dummy virtual functions
drivers/net/dummy.c | 178 +++++++++++++++++++++++++++++++++++++++++++++-
include/linux/netdevice.h | 5 ++
include/linux/pci.h | 2 -
net/core/rtnetlink.c | 37 ++++++----
4 files changed, 205 insertions(+), 17 deletions(-)
--
2.11.0
^ permalink raw reply
* [net-next PATCH v6 2/3] net: rtnetlink: Use a local dev_num_vf() implementation
From: Phil Sutter @ 2017-01-05 19:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20170105190913.29986-1-phil@nwl.cc>
Make dev_num_vf() no longer PCI device specific: use
ndo_get_vf_count() if implemented and only fall back to pci_num_vf()
as the old dev_num_vf() did.
Since this implementation no longer requires a parent device to be
present, don't pass the parent but the actual device to it and have it
check for parent existence only in the fallback case. This in turn
allows eliminating the parent existence checks in callers.
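The lookup order described above can be modelled in userspace roughly as
follows. This is only a sketch: every model_* name is a stand-in invented
for illustration and none of the structs match the real kernel
definitions; is_pci and pci_num_vf merely play the role of dev_is_pci()
and pci_num_vf().

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types. These are NOT the real
 * definitions; they only carry the fields this sketch needs.
 */
struct model_netdev;

struct model_netdev_ops {
	/* optional callback, in the role of ndo_get_vf_count */
	int (*get_vf_count)(const struct model_netdev *dev);
};

struct model_parent {
	int is_pci;	/* plays the role of dev_is_pci() */
	int pci_num_vf;	/* plays the role of pci_num_vf() */
};

struct model_netdev {
	const struct model_netdev_ops *ops;
	const struct model_parent *parent;	/* may be NULL */
};

/* Prefer the driver callback, then the PCI parent, else report 0. */
static int model_dev_num_vf(const struct model_netdev *dev)
{
	if (dev->ops && dev->ops->get_vf_count)
		return dev->ops->get_vf_count(dev);
	if (dev->parent && dev->parent->is_pci)
		return dev->parent->pci_num_vf;
	return 0;
}

/* Hypothetical driver callback that reports a fixed VF count. */
static int model_dummy_vf_count(const struct model_netdev *dev)
{
	(void)dev;
	return 4;
}
```

A device without a parent is handled by the final return, which is why
callers in this model need no parent check of their own.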
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v5:
- Introduced this patch.
---
include/linux/pci.h | 2 --
net/core/rtnetlink.c | 37 ++++++++++++++++++++++++-------------
2 files changed, 24 insertions(+), 15 deletions(-)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e2d1a124216a9..adbc859fe7c4c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -885,7 +885,6 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
void pci_sort_breadthfirst(void);
#define dev_is_pci(d) ((d)->bus == &pci_bus_type)
#define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
-#define dev_num_vf(d) ((dev_is_pci(d) ? pci_num_vf(to_pci_dev(d)) : 0))
/* Generic PCI functions exported to card drivers */
@@ -1630,7 +1629,6 @@ static inline int pci_get_new_domain_nr(void) { return -ENOSYS; }
#define dev_is_pci(d) (false)
#define dev_is_pf(d) (false)
-#define dev_num_vf(d) (0)
#endif /* CONFIG_PCI */
/* Include architecture-dependent settings and functions */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 18b5aae99becf..84294593e0306 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -833,13 +833,24 @@ static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
a->rx_nohandler = b->rx_nohandler;
}
+static int dev_num_vf(const struct net_device *dev)
+{
+ if (dev->netdev_ops->ndo_get_vf_count)
+ return dev->netdev_ops->ndo_get_vf_count(dev);
+#ifdef CONFIG_PCI
+ if (dev->dev.parent && dev_is_pci(dev->dev.parent))
+ return pci_num_vf(to_pci_dev(dev->dev.parent));
+#endif
+ return 0;
+}
+
/* All VF info */
static inline int rtnl_vfinfo_size(const struct net_device *dev,
u32 ext_filter_mask)
{
- if (dev->dev.parent && dev_is_pci(dev->dev.parent) &&
- (ext_filter_mask & RTEXT_FILTER_VF)) {
- int num_vfs = dev_num_vf(dev->dev.parent);
+ int num_vfs = dev_num_vf(dev);
+
+ if (num_vfs && (ext_filter_mask & RTEXT_FILTER_VF)) {
size_t size = nla_total_size(0);
size += num_vfs *
(nla_total_size(0) +
@@ -889,12 +900,12 @@ static size_t rtnl_port_size(const struct net_device *dev,
size_t port_self_size = nla_total_size(sizeof(struct nlattr))
+ port_size;
- if (!dev->netdev_ops->ndo_get_vf_port || !dev->dev.parent ||
+ if (!dev->netdev_ops->ndo_get_vf_port ||
!(ext_filter_mask & RTEXT_FILTER_VF))
return 0;
- if (dev_num_vf(dev->dev.parent))
+ if (dev_num_vf(dev))
return port_self_size + vf_ports_size +
- vf_port_size * dev_num_vf(dev->dev.parent);
+ vf_port_size * dev_num_vf(dev);
else
return port_self_size;
}
@@ -962,7 +973,7 @@ static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev)
if (!vf_ports)
return -EMSGSIZE;
- for (vf = 0; vf < dev_num_vf(dev->dev.parent); vf++) {
+ for (vf = 0; vf < dev_num_vf(dev); vf++) {
vf_port = nla_nest_start(skb, IFLA_VF_PORT);
if (!vf_port)
goto nla_put_failure;
@@ -1012,7 +1023,7 @@ static int rtnl_port_fill(struct sk_buff *skb, struct net_device *dev,
{
int err;
- if (!dev->netdev_ops->ndo_get_vf_port || !dev->dev.parent ||
+ if (!dev->netdev_ops->ndo_get_vf_port ||
!(ext_filter_mask & RTEXT_FILTER_VF))
return 0;
@@ -1020,7 +1031,7 @@ static int rtnl_port_fill(struct sk_buff *skb, struct net_device *dev,
if (err)
return err;
- if (dev_num_vf(dev->dev.parent)) {
+ if (dev_num_vf(dev)) {
err = rtnl_vf_ports_fill(skb, dev);
if (err)
return err;
@@ -1351,15 +1362,15 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
if (rtnl_fill_stats(skb, dev))
goto nla_put_failure;
- if (dev->dev.parent && (ext_filter_mask & RTEXT_FILTER_VF) &&
- nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent)))
+ if (ext_filter_mask & RTEXT_FILTER_VF &&
+ nla_put_u32(skb, IFLA_NUM_VF, dev_num_vf(dev)))
goto nla_put_failure;
- if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent &&
+ if (dev->netdev_ops->ndo_get_vf_config &&
ext_filter_mask & RTEXT_FILTER_VF) {
int i;
struct nlattr *vfinfo;
- int num_vfs = dev_num_vf(dev->dev.parent);
+ int num_vfs = dev_num_vf(dev);
vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST);
if (!vfinfo)
--
2.11.0
^ permalink raw reply related
* Re: [PATCH net-next] packet: fix panic in __packet_set_timestamp on tpacket_v3 in tx mode
From: Daniel Borkmann @ 2017-01-05 19:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, sowmini.varadhan, willemb, netdev
In-Reply-To: <1483640872.9712.1.camel@edumazet-glaptop3.roam.corp.google.com>
On 01/05/2017 07:27 PM, Eric Dumazet wrote:
> On Thu, 2017-01-05 at 02:34 +0100, Daniel Borkmann wrote:
[...]
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 7e39087..ddbda25 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -481,6 +481,9 @@ static __u32 __packet_set_timestamp(struct packet_sock *po, void *frame,
>> h.h2->tp_nsec = ts.tv_nsec;
>> break;
>> case TPACKET_V3:
>> + h.h3->tp_sec = ts.tv_sec;
>> + h.h3->tp_nsec = ts.tv_nsec;
>> + break;
>> default:
>> WARN(1, "TPACKET version not supported.\n");
>> BUG();
>
> Gosh. Can we also replace this BUG() with something less aggressive?
There are currently 5 of these WARN() + BUG() constructs and 1 BUG()-only
for the 'default' TPACKET version spread all over af_packet, so it probably
makes sense to make all of them less aggressive.
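To illustrate what "less aggressive" could look like, here is a userspace
sketch of a default branch that reports the problem once and returns an
error instead of aborting. All model_* names are made up for this sketch,
the version handling is simplified (the real V1 header stores
microseconds), and fprintf() only stands in for a print-once kernel
helper.

```c
#include <stdio.h>

enum model_tpacket_version { MODEL_V1, MODEL_V2, MODEL_V3 };

struct model_ts {
	long sec;
	long nsec;
};

/* Number of messages actually printed; exposed so callers can check it. */
static int model_warn_count;

/* Store the timestamp for known versions. For an unknown version, log
 * a message the first time only and return -1 rather than aborting.
 */
static int model_set_timestamp(unsigned int version, struct model_ts *slot,
			       struct model_ts now)
{
	static int warned;

	switch (version) {
	case MODEL_V1:
	case MODEL_V2:
	case MODEL_V3:
		*slot = now;
		return 0;
	default:
		if (!warned) {
			warned = 1;
			model_warn_count++;
			fprintf(stderr, "TPACKET version %u not supported.\n",
				version);
		}
		return -1;
	}
}
```

The caller then has to cope with the error return, which is the price of
not taking the whole machine down.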
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index b9e1a13b4ba36a0bc7edf6a8c2c116c7d48c970c..0c0d268544787dcbef6601c5014e7d3836d16f96 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -476,9 +476,11 @@ static __u32 __packet_set_timestamp(struct packet_sock *po, void *frame,
> h.h2->tp_nsec = ts.tv_nsec;
> break;
> case TPACKET_V3:
> + h.h3->tp_sec = ts.tv_sec;
> + h.h3->tp_nsec = ts.tv_nsec;
> + break;
> default:
> - WARN(1, "TPACKET version not supported.\n");
> - BUG();
> + pr_err_once("TPACKET version %u not supported.\n", po->tp_version);
> }
>
> /* one flush is safe, as both fields always lie on the same cacheline */
>
>
^ permalink raw reply
* Re: [PATCH] phy state machine: failsafe leave invalid RUNNING state
From: Florian Fainelli @ 2017-01-05 19:39 UTC (permalink / raw)
To: Zefir Kurtisi, netdev; +Cc: andrew
In-Reply-To: <327be2cf-8b89-3823-2951-b31a44697a2f@neratec.com>
On 01/05/2017 01:23 AM, Zefir Kurtisi wrote:
> On 01/04/2017 10:44 PM, Florian Fainelli wrote:
>> On 01/04/2017 08:10 AM, Zefir Kurtisi wrote:
>>> On 01/04/2017 04:30 PM, Florian Fainelli wrote:
>>>>
>>>>
>>>> On 01/04/2017 07:27 AM, Zefir Kurtisi wrote:
>>>>> On 01/04/2017 04:13 PM, Florian Fainelli wrote:
>>>>>>
>>>>>>
>>>>>> On 01/04/2017 07:04 AM, Zefir Kurtisi wrote:
>>>>>>> While in RUNNING state, phy_state_machine() checks for link changes by
>>>>>>> comparing phydev->link before and after calling phy_read_status().
>>>>>>> This works as long as it is guaranteed that phydev->link is never
>>>>>>> changed outside the phy_state_machine().
>>>>>>>
>>>>>>> If in some setups this happens, it causes the state machine to miss
>>>>>>> a link loss and remain RUNNING despite phydev->link being 0.
>>>>>>>
>>>>>>> This has been observed running a dsa setup with a process continuously
>>>>>>> polling the link states over ethtool each second (SNMPD RFC-1213
>>>>>>> agent). Disconnecting the link on a phy followed by an ETHTOOL_GSET
>>>>>>> causes dsa_slave_get_settings() / dsa_slave_get_link_ksettings() to
>>>>>>> call phy_read_status() and thereby modify the link status, which
>>>>>>> bricks the phy state machine.
>>>>>>
>>>>>> That's the interesting part of the analysis, how does this brick the PHY
>>>>>> state machine? Is the PHY driver changing the link status in the
>>>>>> read_status callback that it implements?
>>>>>>
>>>>> phydev->read_status points to genphy_read_status(), where the first call goes to
>>>>> genphy_update_link() which updates the link status.
>>>>>
>>>>> Thereafter phy_state_machine():RUNNING won't be able to detect the link loss
>>>>> anymore unless the link state changes again.
>>>>>
>>>>>
>>>>> I was trying to figure out if there is a rule that forbids changing phydev->link
>>>>> from outside the state machine, but found several places where it happens (either
>>>>> directly, or over genphy_read_status() or over genphy_update_link()).
>>>>>
>>>>> Curious how this did not show up before, since within the dsa setup it is very
>>>>> easy to trigger:
>>>>> a) physically disconnect link
>>>>> b) within one second run ethtool ethX
>>>>
>>>> You need to be more specific here about what "the dsa setup" is, drivers
>>>> involved, which ports of the switch you are seeing this with (user
>>>> facing, CPU port, DSA port?) etc.
>>>>
>>> I am working on top of LEDE and with that at kernel 4.4.21, but I checked the
>>> related source files and believe the effect should be reproducible with HEAD.
>>>
>>> The setup is as follows:
>>> mv88e6321:
>>> * ports 0+1 connected to fibre-optics transceivers at fixed 100 Mbps
>>> * port 4 is CPU port
>>> * custom phy driver (replacement for marvell.ko) only populated with
>>> * .config_init to
>>> * set fixed speed for ports 0+1 (when in FO mode)
>>> * run genphy_config_init() for all other modes (here: CPU port)
>>> * .config_aneg=genphy_config_aneg, .read_status=genphy_read_status
>>>
>>>
>>> To my understanding, the exact setup is irrelevant - to reproduce the issue it is
>>> enough to have a means of running genphy_update_link() (as done in e.g.
>>> mediatek/mtk_eth_soc.c, dsa/slave.c), or genphy_read_status() (as done in e.g.
>>> hisilicon/hns/hns_enet.c) or phy_read_status() (as done in e.g.
>>> ethernet/ti/netcp_ethss.c, ethernet/aeroflex/greth.c, etc.). In the observed
>>> drivers it is mostly implemented in the ETHTOOL_GSET execution path.
>>>
>>> Once you get the link state updated outside the phy state machine, it remains in
>>> invalid RUNNING. To prevent that invalid state, to my understanding upper layer
>>> drivers (Ethernet, dsa) must not modify link-states in any case (including calling
>>> the functions noted above), or we need the proposed fail-safe mechanism to prevent
>>> getting stuck.
>>
>> OK, I see the code path involved now, sorry -ENOCOFFEE when I initially
>> responded. Yes, clearly, we should not be mangling the PHY device's link
>> by calling genphy_read_status(). At first glance, none of the users
>> below should be doing what they are doing, but let's kick a separate
>> patch series to collect feedback from the driver writers.
>>
>> Thanks!
>>
> Ok, thanks for taking time.
>
> The kbuild test robot error is due to 'struct device dev' having been removed from
> phy_device struct since 4.4.21. Does it make sense to provide a v2 fixing that, or
> do you expect that this fail-safe mechanism is not needed once all Ethernet/dsa
> drivers are fixed?
I think there is value in identifying misbehaving drivers while we
fix them one after the other.
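For anyone following along, the failure mode discussed in this thread can
be reproduced with a small userspace model. It is a deliberately
simplified sketch: the model_* names are invented here, the real state
machine has more states, and the failsafe flag only approximates the idea
of the proposed check rather than the actual patch.

```c
enum model_phy_state { MODEL_RUNNING, MODEL_NOLINK };

struct model_phy {
	enum model_phy_state state;
	int link;	/* cached link, in the role of phydev->link */
	int hw_link;	/* what the hardware would report right now */
};

/* In the role of phy_read_status(): refresh the cached link value. */
static void model_read_status(struct model_phy *phy)
{
	phy->link = phy->hw_link;
}

/* One poll of the RUNNING state. Without the failsafe, a link loss is
 * only noticed if the cached value changes during this very call.
 */
static void model_poll(struct model_phy *phy, int failsafe)
{
	int old_link;

	if (phy->state != MODEL_RUNNING)
		return;

	old_link = phy->link;
	model_read_status(phy);

	if (old_link != phy->link)
		phy->state = phy->link ? MODEL_RUNNING : MODEL_NOLINK;
	else if (failsafe && !phy->link)
		phy->state = MODEL_NOLINK;	/* RUNNING without link is invalid */
}
```

If something else calls model_read_status() between the cable pull and
the next model_poll(), old_link already equals the new value and the
state stays RUNNING unless the failsafe is enabled.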
>
> I think it won't hurt to add the check simply to ensure that it got fixed and the
> issue is not popping up thereafter.
Agreed, can you resubmit against the latest net-next/master tree?
Thanks!
--
Florian
^ permalink raw reply
* Re: [for-next 07/10] IB/mlx5: Use blue flame register allocator in mlx5_ib
From: David Miller @ 2017-01-05 19:51 UTC (permalink / raw)
To: saeedm-VPRAkNaXOzVWk0Htik3J/w
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, leonro-VPRAkNaXOzVWk0Htik3J/w,
eli-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <1483480528-22622-8-git-send-email-saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
From: Saeed Mahameed <saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Date: Tue, 3 Jan 2017 23:55:25 +0200
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> index ddb4ca4..39505ac 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> @@ -5,7 +5,7 @@
> config MLX5_CORE
> tristate "Mellanox Technologies ConnectX-4 and Connect-IB core driver"
> depends on MAY_USE_DEVLINK
> - depends on PCI
> + depends on PCI && 64BIT
> default n
> ---help---
> Core driver for low level functionality of the ConnectX-4 and
This is a regression, I'm not applying this.
I don't care how hard it is, you have to keep the driver building properly
in non-64bit builds.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH net-next v2] tcp: provide timestamps for partial writes
From: David Miller @ 2017-01-05 19:56 UTC (permalink / raw)
To: soheil.kdev; +Cc: netdev, soheil, willemb, edumazet, ncardwell, kafai
In-Reply-To: <20170104161934.26849-1-soheil.kdev@gmail.com>
From: Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
Date: Wed, 4 Jan 2017 11:19:34 -0500
> From: Soheil Hassas Yeganeh <soheil@google.com>
>
> For TCP sockets, TX timestamps are only captured when the user data
> is successfully and fully written to the socket. In many cases,
> however, TCP writes can be partial for which no timestamp is
> collected.
>
> Collect timestamps whenever any user data is (fully or partially)
> copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
> instead of the local skb pointer since it can be set to NULL on
> the error path.
>
> Note that tcp_write_queue_tail can be NULL, even if bytes have been
> copied to the socket. This is because acknowledgements are being
> processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
> called tcp_write_queue_tail can be NULL. For such cases, this patch
> does not collect any timestamps (i.e., it is best-effort).
>
> This patch is written with suggestions from Willem de Bruijn and
> Eric Dumazet.
>
> Change-log V1 -> V2:
> - Use sockc.tsflags instead of sk->sk_tsflags.
> - Use the same code path for normal writes and errors.
>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH v1 1/2] bpf: add a longest prefix match trie map implementation
From: Daniel Borkmann @ 2017-01-05 20:01 UTC (permalink / raw)
To: Daniel Mack, ast; +Cc: dh.herrmann, netdev, davem
In-Reply-To: <586E7366.1010708@iogearbox.net>
On 01/05/2017 05:25 PM, Daniel Borkmann wrote:
> On 12/29/2016 06:28 PM, Daniel Mack wrote:
>> This trie implements a longest prefix match algorithm that can be used
>> to match IP addresses to a stored set of ranges.
>>
>> Internally, data is stored in an unbalanced trie of nodes that has a
>> maximum height of n, where n is the prefixlen the trie was created
>> with.
>>
>> Tries may be created with prefix lengths that are multiples of 8, in
>> the range from 8 to 2048. The key used for lookup and update operations
>> is a struct bpf_lpm_trie_key, and the value is a uint64_t.
>>
>> The code carries more information about the internal implementation.
>>
>> Signed-off-by: Daniel Mack <daniel@zonque.org>
>> Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
>
> Thanks for working on it, and sorry for late reply. In addition to
> Alexei's earlier comments on the cover letter, a few comments inline:
>
[...]
>> +static struct bpf_map *trie_alloc(union bpf_attr *attr)
>> +{
>> + struct lpm_trie *trie;
>> +
>> + /* check sanity of attributes */
>> + if (attr->max_entries == 0 || attr->map_flags ||
>> + attr->key_size < sizeof(struct bpf_lpm_trie_key) + 1 ||
>> + attr->key_size > sizeof(struct bpf_lpm_trie_key) + 256 ||
>> + attr->value_size != sizeof(u64))
>> + return ERR_PTR(-EINVAL);
One more question regarding the value size being u64 (perhaps I
missed it along the way): was this chosen for keeping stats? Why
not let the user choose a size as in other maps, so that custom
structs could also be stored there?
Thanks,
Daniel
^ permalink raw reply
* Re: [PATCH] tg3: Avoid NULL pointer dereference in tg3_get_nstats()
From: Michael Chan @ 2017-01-05 20:04 UTC (permalink / raw)
To: David Miller
Cc: wangyufen, Siva Reddy Kallam, prashant.sreedharan@broadcom.com,
michael.chan@broadcom.com, Netdev
In-Reply-To: <20170105.123337.2237827308340782208.davem@davemloft.net>
On Thu, Jan 5, 2017 at 9:33 AM, David Miller <davem@davemloft.net> wrote:
> From: Wang Yufen <wangyufen@huawei.com>
> Date: Thu, 5 Jan 2017 22:13:21 +0800
>
>> From: Yufen Wang <wangyufen@huawei.com>
>>
>> A possible NULL pointer dereference in tg3_get_stats64 while doing
>> tg3_free_consistent.
> ...
>> This patch avoids the NULL pointer dereference by using !tg3_flag(tp, INIT_COMPLETE)
>> instead of !tp->hw_stats.
>>
>> Signed-off-by: Yufen Wang <wangyufen@huawei.com>
>> ---
>> drivers/net/ethernet/broadcom/tg3.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
>> index 185e9e0..012f18d 100644
>> --- a/drivers/net/ethernet/broadcom/tg3.c
>> +++ b/drivers/net/ethernet/broadcom/tg3.c
>> @@ -14148,7 +14148,7 @@ static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev,
>> struct tg3 *tp = netdev_priv(dev);
>>
>> spin_lock_bh(&tp->lock);
>> - if (!tp->hw_stats) {
>> + if (!tg3_flag(tp, INIT_COMPLETE)) {
>
> The real issue is the manner and order in which the driver performs
> initialization actions relative to netif_device_{attach,detach}().
>
> That is what needs to be fixed here, instead of adding more and more
> ad-hoc tests to the various methods which can be invoked once the
> netif_device_attach() occurs.
Normally, ndo_get_stats64() should be under rtnl lock in the netlink
code path and we should be safe. We only free tp->hw_stats under rtnl
lock in the close path or ethtool path.
But it looks like ndo_get_stats() can be called without rtnl lock from
net-procfs.c. So it is possible that we'll read tp->hw_stats after it
has been freed. For example, if we are reading /proc/net/dev and
closing tg3 at the same time. David, is not taking rtnl_lock in
net-procfs.c by design?
^ permalink raw reply
* Re: [PATCH v1 1/2] bpf: add a longest prefix match trie map implementation
From: Daniel Mack @ 2017-01-05 20:04 UTC (permalink / raw)
To: Daniel Borkmann, ast; +Cc: dh.herrmann, netdev, davem
In-Reply-To: <586E7366.1010708@iogearbox.net>
Hi Daniel,
Thanks for your feedback! I agree on all points. Two questions below.
On 01/05/2017 05:25 PM, Daniel Borkmann wrote:
> On 12/29/2016 06:28 PM, Daniel Mack wrote:
>> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
>> new file mode 100644
>> index 0000000..8b6a61d
>> --- /dev/null
>> +++ b/kernel/bpf/lpm_trie.c
[..]
>> +static struct bpf_map *trie_alloc(union bpf_attr *attr)
>> +{
>> + struct lpm_trie *trie;
>> +
>> + /* check sanity of attributes */
>> + if (attr->max_entries == 0 || attr->map_flags ||
>> + attr->key_size < sizeof(struct bpf_lpm_trie_key) + 1 ||
>> + attr->key_size > sizeof(struct bpf_lpm_trie_key) + 256 ||
>> + attr->value_size != sizeof(u64))
>> + return ERR_PTR(-EINVAL);
>
> The correct attr->map_flags test here would need to be ...
>
> attr->map_flags != BPF_F_NO_PREALLOC
>
> ... since in this case we don't have any prealloc pool, and
> should that come one day that test could be relaxed again.
>
>> + trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN);
>> + if (!trie)
>> + return NULL;
>> +
>> + /* copy mandatory map attributes */
>> + trie->map.map_type = attr->map_type;
>> + trie->map.key_size = attr->key_size;
>> + trie->map.value_size = attr->value_size;
>> + trie->map.max_entries = attr->max_entries;
>
> You also need to fill in trie->map.pages as that is eventually
> used to charge memory against in bpf_map_charge_memlock(), right
> now that would remain as 0 meaning the map is not accounted for.
Hmm, okay. The nodes are, however, allocated dynamically at runtime in
this case. That means that we have to update trie->map.pages on each
allocation, right?
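One alternative to per-allocation updates, shown here purely as a sketch
and not as what the patch does, would be to charge the worst case up
front from max_entries. The helper name, the 4096-byte page size and the
size parameters are all assumptions made for illustration.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Worst-case page count for a trie header plus max_entries nodes.
 * Returns 0 if the byte count would overflow, in which case a caller
 * would be expected to reject the map.
 */
static uint64_t model_trie_pages(uint64_t trie_size, uint64_t node_size,
				 uint64_t max_entries)
{
	uint64_t cost;

	if (max_entries && node_size > (UINT64_MAX - trie_size) / max_entries)
		return 0;

	cost = trie_size + max_entries * node_size;

	/* round up to whole pages without overflowing on the addition */
	return cost / MODEL_PAGE_SIZE + (cost % MODEL_PAGE_SIZE != 0);
}
```

The trade-off is that a sparsely filled trie gets charged for memory it
never allocates.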
>> +static void trie_free(struct bpf_map *map)
>> +{
>> + struct lpm_trie_node __rcu **slot;
>> + struct lpm_trie_node *node;
>> + struct lpm_trie *trie =
>> + container_of(map, struct lpm_trie, map);
>> +
>> + spin_lock(&trie->lock);
>> +
>> + /*
>> + * Always start at the root and walk down to a node that has no
>> + * children. Then free that node, nullify its pointer in the parent,
>> + * then start over.
>> + */
>> +
>> + for (;;) {
>> + slot = &trie->root;
>> +
>> + for (;;) {
>> + node = rcu_dereference_protected(*slot,
>> + lockdep_is_held(&trie->lock));
>> + if (!node)
>> + goto out;
>> +
>> + if (node->child[0]) {
>
> rcu_access_pointer(node->child[0]) (at least to keep sparse happy?)
Done, but sparse does not actually complain here.
Thanks,
Daniel
^ permalink raw reply
* Re: [PATCH v4 net-next] tools: psock_tpacket: block Rx until socket filter has been added and socket has been bound to loopback.
From: David Miller @ 2017-01-05 20:04 UTC (permalink / raw)
To: sowmini.varadhan; +Cc: netdev, daniel, willemb
In-Reply-To: <6b77972e99e49096676e04854067a44226e926aa.1483642577.git.sowmini.varadhan@oracle.com>
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Thu, 5 Jan 2017 11:06:22 -0800
> Packets from any/all interfaces may be queued up on the PF_PACKET socket
> before it is bound to the loopback interface by psock_tpacket, and
> when these are passed up by the kernel, they could interfere
> with the Rx tests.
>
> Avoid interference from spurious packet by blocking Rx until the
> socket filter has been set up, and the packet has been bound to the
> desired (lo) interface. The effective sequence is
> socket(PF_PACKET, SOCK_RAW, 0);
> set up ring
> Invoke SO_ATTACH_FILTER
> bind to sll_protocol set to ETH_P_ALL, sll_ifindex for lo
> After this sequence, the only packets that will be passed up are
> those received on loopback that pass the attached filter.
>
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Applied, thanks.
^ permalink raw reply
* Re: [for-next 07/10] IB/mlx5: Use blue flame register allocator in mlx5_ib
From: David Miller @ 2017-01-05 20:07 UTC (permalink / raw)
To: eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: saeedm-VPRAkNaXOzVWk0Htik3J/w, dledford-H+wXaHxf7aLQT0dZR+AlfA,
netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
leonro-VPRAkNaXOzVWk0Htik3J/w, eli-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <CAL3tnx4H2HsGMH=caiobivvvs0AU9yrzCmVONOdUQ_j_JMfufA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
From: Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Date: Thu, 5 Jan 2017 14:03:18 -0600
> If necessary I can make sure it builds on 32 bits as well.
Please do.
^ permalink raw reply
* Re: [PATCH net] hyper-v: Add myself as additional MAINTAINER
From: gregkh @ 2017-01-05 20:09 UTC (permalink / raw)
To: KY Srinivasan
Cc: Stephen Hemminger, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Stephen Hemminger
In-Reply-To: <DM5PR03MB24902386946C55BCE617941CA0600@DM5PR03MB2490.namprd03.prod.outlook.com>
On Thu, Jan 05, 2017 at 07:08:23PM +0000, KY Srinivasan wrote:
>
>
> > -----Original Message-----
> > From: gregkh@linuxfoundation.org [mailto:gregkh@linuxfoundation.org]
> > Sent: Thursday, January 5, 2017 10:29 AM
> > To: KY Srinivasan <kys@microsoft.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>;
> > davem@davemloft.net; netdev@vger.kernel.org; linux-
> > kernel@vger.kernel.org; Stephen Hemminger <sthemmin@microsoft.com>
> > Subject: Re: [PATCH net] hyper-v: Add myself as additional MAINTAINER
> >
> > On Thu, Jan 05, 2017 at 05:43:04PM +0000, KY Srinivasan wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > > Sent: Thursday, January 5, 2017 9:36 AM
> > > > To: davem@davemloft.net; KY Srinivasan <kys@microsoft.com>
> > > > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > gregkh@linuxfoundation.org; Stephen Hemminger
> > > > <sthemmin@microsoft.com>
> > > > Subject: [PATCH net] hyper-v: Add myself as additional MAINTAINER
> > > >
> > > > Update the Hyper-V MAINTAINERS to include myself.
> > > >
> > > > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > >
> > > Acked-by: K. Y. Srinivasan <kys@microsoft.com>
> >
> > Thanks, will go queue this up now.
>
> Thanks Greg. On a different note, there are a bunch of Hyper-V specific
> patches that have been submitted over the last month or so that have not
> been committed. Should I resend them?
Nope, they are still in my mbox, I'm just going through stuff that has
to be in 4.10-final at the moment, give me another week or so to catch
up on all the new stuff for 4.11-rc1.
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH v1 1/2] bpf: add a longest prefix match trie map implementation
From: Daniel Mack @ 2017-01-05 20:14 UTC (permalink / raw)
To: Daniel Borkmann, ast; +Cc: dh.herrmann, netdev, davem
In-Reply-To: <586EA627.6020404@iogearbox.net>
Hi,
On 01/05/2017 09:01 PM, Daniel Borkmann wrote:
> On 01/05/2017 05:25 PM, Daniel Borkmann wrote:
>> On 12/29/2016 06:28 PM, Daniel Mack wrote:
> [...]
>>> +static struct bpf_map *trie_alloc(union bpf_attr *attr)
>>> +{
>>> + struct lpm_trie *trie;
>>> +
>>> + /* check sanity of attributes */
>>> + if (attr->max_entries == 0 || attr->map_flags ||
>>> + attr->key_size < sizeof(struct bpf_lpm_trie_key) + 1 ||
>>> + attr->key_size > sizeof(struct bpf_lpm_trie_key) + 256 ||
>>> + attr->value_size != sizeof(u64))
>>> + return ERR_PTR(-EINVAL);
>
> One more question regarding the value size being u64 (perhaps I
> missed it along the way): was this chosen for keeping stats? Why
> not let the user choose a size as in other maps, so that custom
> structs could also be stored there?
In my use case, the actual value of a node is in fact ignored, all that
matters is whether a node exists in a trie or not. The test code uses
u64 for its tests.
I can change it around so that the value size can be defined by
userspace, but ideally it would also support 0-byte lengths then. The
bpf map syscall handler should handle the latter just fine if I read the
code correctly?
Thanks,
Daniel
^ permalink raw reply
* Re: [PATCH] tg3: Avoid NULL pointer dereference in tg3_get_nstats()
From: David Miller @ 2017-01-05 20:17 UTC (permalink / raw)
To: michael.chan; +Cc: wangyufen, siva.kallam, prashant, mchan, netdev
In-Reply-To: <CACKFLimPkD6hxECA+ZhH+7BVmVSoJ1GfAMZyJ7S50CbhiuC0mA@mail.gmail.com>
From: Michael Chan <michael.chan@broadcom.com>
Date: Thu, 5 Jan 2017 12:04:13 -0800
> But it looks like ndo_get_stats() can be called without rtnl lock from
> net-procfs.c. So it is possible that we'll read tp->hw_stats after it
> has been freed. For example, if we are reading /proc/net/dev and
> closing tg3 at the same time. David, is not taking rtnl_lock in
> net-procfs.c by design?
Probably not, that dev_get_stats() call probably should be surrounded
by RTNL protection.
Doing a quick grep on dev_get_stats() shows other call sites, most of
which are using it to fetch slave device statistics from the get stats
method of the parent. Which should be ok.
It appears that the vlan procfs code in net/8021q/vlanproc.c has a
bug similar to the one in net/core/net-procfs.c.
Maybe net/core/net-sysfs.c has the same issue as well, and perhaps also
net/openvswitch/vport.c:ovs_vport_get_stats().
^ permalink raw reply