* Re: [PATCH 1/2] net: dsa: b53: Add compatible strings for the Cygnus-family BCM11360.
From: Florian Fainelli @ 2017-04-24 23:17 UTC (permalink / raw)
To: Arun Parameswaran, Eric Anholt, Vivien Didelot, Andrew Lunn,
netdev-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Mark Rutland,
devicetree-u79uwXL29TY76Z2rM5mHXA
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w, Ray Jui,
Scott Branden, Jon Mason
In-Reply-To: <1361654c-e4eb-e0a0-3397-b43235b5ff60-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
On 04/24/2017 04:15 PM, Arun Parameswaran wrote:
>
>
> On 17-04-24 04:07 PM, Florian Fainelli wrote:
>> On 04/24/2017 04:03 PM, Arun Parameswaran wrote:
>>> Hi Eric
>>>
>>> A comment on the Device ID.
>>>
>>>
>>> On 17-04-24 02:50 PM, Eric Anholt wrote:
>>>> Cygnus is a small family of SoCs, of which we currently have
>>>> devicetree for BCM11360 and BCM58300. The 11360's B53 is mostly the
>>>> same as 58xx, just requiring a tiny bit of setup that was previously
>>>> missing.
>>>>
>>>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>>>> ---
>>>> Documentation/devicetree/bindings/net/dsa/b53.txt | 3 +++
>>>> drivers/net/dsa/b53/b53_srab.c | 2 ++
>>>> 2 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt b/Documentation/devicetree/bindings/net/dsa/b53.txt
>>>> index d6c6e41648d4..49c93d3c0839 100644
>>>> --- a/Documentation/devicetree/bindings/net/dsa/b53.txt
>>>> +++ b/Documentation/devicetree/bindings/net/dsa/b53.txt
>>>> @@ -29,6 +29,9 @@ Required properties:
>>>> "brcm,bcm58625-srab"
>>>> "brcm,bcm88312-srab" and the mandatory "brcm,nsp-srab string
>>>>
>>>> + For the BCM11360 SoC, must be:
>>>> + "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab string
>>>> +
>>>> For the BCM63xx/33xx SoCs with an integrated switch, must be one of:
>>>> "brcm,bcm3384-switch"
>>>> "brcm,bcm6328-switch"
>>>> diff --git a/drivers/net/dsa/b53/b53_srab.c b/drivers/net/dsa/b53/b53_srab.c
>>>> index 8a62b6a69703..c37ffd1b6833 100644
>>>> --- a/drivers/net/dsa/b53/b53_srab.c
>>>> +++ b/drivers/net/dsa/b53/b53_srab.c
>>>> @@ -364,6 +364,7 @@ static const struct of_device_id b53_srab_of_match[] = {
>>>> { .compatible = "brcm,bcm53018-srab" },
>>>> { .compatible = "brcm,bcm53019-srab" },
>>>> { .compatible = "brcm,bcm5301x-srab" },
>>>> + { .compatible = "brcm,bcm11360-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> { .compatible = "brcm,bcm58522-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> { .compatible = "brcm,bcm58525-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> { .compatible = "brcm,bcm58535-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> @@ -371,6 +372,7 @@ static const struct of_device_id b53_srab_of_match[] = {
>>>> { .compatible = "brcm,bcm58623-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> { .compatible = "brcm,bcm58625-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> { .compatible = "brcm,bcm88312-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>>> + { .compatible = "brcm,cygnus-srab", .data = (void *)BCM58XX_DEVICE_ID },
>>> The device ID should be 0xd300. This is the actual value read from the switch regs.
>>> This also requires an entry in the 'b53_switch_chips' structure in b53_common.c.
>> These are not real ID numbers, these are values that indicate the
>> generation of the switch being embedded into the SoC. Within
>> b53_common.c we don't have to differentiate a Starfighter 2 embedded in
>> BCM11360, NSP, or 7445 or 7278, which is why using 58XX_DEVICE_ID should
>> be good enough.
> Ok. Thanks.
>
> I was under the impression, that these id's could be used in the b53_switch_detect()
> API to auto detect the switch. In that API, the switch ID is read from the
> Management page register.
For external switches that is the case, but for internal/integrated
switches, the ID is not always representative of the switch. This is why
the choice of a chip-type ID was used here while adding support for NSP
to the b53 driver.
--
Florian
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH net-next v3 0/5] virtio-net tx napi
From: Michael S. Tsirkin @ 2017-04-24 23:35 UTC (permalink / raw)
To: Willem de Bruijn
Cc: netdev, jasowang, virtualization, davem, Willem de Bruijn
In-Reply-To: <20170424174930.82623-1-willemdebruijn.kernel@gmail.com>
On Mon, Apr 24, 2017 at 01:49:25PM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Add napi for virtio-net transmit completion processing.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Changes:
> v2 -> v3:
> - convert __netif_tx_trylock to __netif_tx_lock on tx napi poll
> ensure that the handler always cleans, to avoid deadlock
> - unconditionally clean in start_xmit
> avoid adding an unnecessary "if (use_napi)" branch
> - remove virtqueue_disable_cb in patch 5/5
> a noop in the common event_idx based loop
> - document affinity_hint_set constraint
>
> v1 -> v2:
> - disable by default
> - disable unless affinity_hint_set
> because cache misses add up to a third higher cycle cost,
> e.g., in TCP_RR tests. This is not limited to the patch
> that enables tx completion cleaning in rx napi.
> - use trylock to avoid contention between tx and rx napi
> - keep interrupts masked during xmit_more (new patch 5/5)
> this improves cycles especially for multi UDP_STREAM, which
> does not benefit from cleaning tx completions on rx napi.
> - move free_old_xmit_skbs (new patch 3/5)
> to avoid forward declaration
>
> not changed:
> - deduplicate virnet_poll_tx and virtnet_poll_txclean
> they look similar, but have differ too much to make it
> worthwhile.
> - delay netif_wake_subqueue for more than 2 + MAX_SKB_FRAGS
> evaluated, but made no difference
> - patch 1/5
>
> RFC -> v1:
> - dropped vhost interrupt moderation patch:
> not needed and likely expensive at light load
> - remove tx napi weight
> - always clean all tx completions
> - use boolean to toggle tx-napi, instead
> - only clean tx in rx if tx-napi is enabled
> - then clean tx before rx
> - fix: add missing braces in virtnet_freeze_down
> - testing: add 4KB TCP_RR + UDP test results
>
> Based on previous patchsets by Jason Wang:
>
> [RFC V7 PATCH 0/7] enable tx interrupts for virtio-net
> http://lkml.iu.edu/hypermail/linux/kernel/1505.3/00245.html
>
>
> Before commit b0c39dbdc204 ("virtio_net: don't free buffers in xmit
> ring") the virtio-net driver would free transmitted packets on
> transmission of new packets in ndo_start_xmit and, to catch the edge
> case when no new packet is sent, also in a timer at 10HZ.
>
> A timer can cause long stalls. VIRTIO_F_NOTIFY_ON_EMPTY avoids stalls
> due to low free descriptor count. It does not address a stalls due to
> low socket SO_SNDBUF. Increasing timer frequency decreases that stall
> time, but increases interrupt rate and, thus, cycle count.
>
> Currently, with no timer, packets are freed only at ndo_start_xmit.
> Latency of consume_skb is now unbounded. To avoid a deadlock if a sock
> reaches SO_SNDBUF, packets are orphaned on tx. This breaks TCP small
> queues.
>
> Reenable TCP small queues by removing the orphan. Instead of using a
> timer, convert the driver to regular tx napi. This does not have the
> unresolved stall issue and does not have any frequency to tune.
>
> By keeping interrupts enabled by default, napi increases tx
> interrupt rate. VIRTIO_F_EVENT_IDX avoids sending an interrupt if
> one is already unacknowledged, so makes this more feasible today.
> Combine that with an optimization that brings interrupt rate
> back in line with the existing version for most workloads:
>
> Tx completion cleaning on rx interrupts elides most explicit tx
> interrupts by relying on the fact that many rx interrupts fire.
>
> Tested by running {1, 10, 100} {TCP, UDP} STREAM, RR, 4K_RR benchmarks
> from a guest to a server on the host, on an x86_64 Haswell. The guest
> runs 4 vCPUs pinned to 4 cores. vhost and the test server are
> pinned to a core each.
>
> All results are the median of 5 runs, with variance well < 10%.
> Used neper (github.com/google/neper) as test process.
>
> Napi increases single stream throughput, but increases cycle cost.
> The optimizations bring this down. The previous patchset saw a
> regression with UDP_STREAM, which does not benefit from cleaning tx
> interrupts in rx napi. This regression is now gone for 10x, 100x.
> Remaining difference is higher 1x TCP_STREAM, lower 1x UDP_STREAM.
>
> The latest results are with process, rx napi and tx napi affine to
> the same core. All numbers are lower than the previous patchset.
>
>
> upstream napi
> TCP_STREAM:
> 1x:
> Mbps 27816 39805
> Gcycles 274 285
>
> 10x:
> Mbps 42947 42531
> Gcycles 300 296
>
> 100x:
> Mbps 31830 28042
> Gcycles 279 269
>
> TCP_RR Latency (us):
> 1x:
> p50 21 21
> p99 27 27
> Gcycles 180 167
>
> 10x:
> p50 40 39
> p99 52 52
> Gcycles 214 211
>
> 100x:
> p50 281 241
> p99 411 337
> Gcycles 218 226
>
> TCP_RR 4K:
> 1x:
> p50 28 29
> p99 34 36
> Gcycles 177 167
>
> 10x:
> p50 70 71
> p99 85 134
> Gcycles 213 214
>
> 100x:
> p50 442 611
> p99 802 785
> Gcycles 237 216
>
> UDP_STREAM:
> 1x:
> Mbps 29468 26800
> Gcycles 284 293
>
> 10x:
> Mbps 29891 29978
> Gcycles 285 312
>
> 100x:
> Mbps 30269 30304
> Gcycles 318 316
>
> UDP_RR:
> 1x:
> p50 19 19
> p99 23 23
> Gcycles 180 173
>
> 10x:
> p50 35 40
> p99 54 64
> Gcycles 245 237
>
> 100x:
> p50 234 286
> p99 484 473
> Gcycles 224 214
>
> Note that GSO is enabled, so 4K RR still translates to one packet
> per request.
>
> Lower throughput at 100x vs 10x can be (at least in part)
> explained by looking at bytes per packet sent (nstat). It likely
> also explains the lower throughput of 1x for some variants.
>
> upstream:
>
> N=1 bytes/pkt=16581
> N=10 bytes/pkt=61513
> N=100 bytes/pkt=51558
>
> at_rx:
>
> N=1 bytes/pkt=65204
> N=10 bytes/pkt=65148
> N=100 bytes/pkt=56840
>
> Willem de Bruijn (5):
> virtio-net: napi helper functions
> virtio-net: transmit napi
> virtio-net: move free_old_xmit_skbs
> virtio-net: clean tx descriptors from rx napi
> virtio-net: keep tx interrupts disabled unless kick
>
> drivers/net/virtio_net.c | 193 ++++++++++++++++++++++++++++++++---------------
> 1 file changed, 132 insertions(+), 61 deletions(-)
>
> --
> 2.12.2.816.g2cccc81164-goog
^ permalink raw reply
* Fw: New Defects reported by Coverity Scan for Linux
From: Stephen Hemminger @ 2017-04-24 23:41 UTC (permalink / raw)
To: Jamal Hadi Salim, Cong Wang, Jiri Pirko; +Cc: netdev
This looks like a false positive
Date: Mon, 24 Apr 2017 14:40:46 -0700
From: scan-admin@coverity.com
To: stephen@networkplumber.org
Subject: New Defects reported by Coverity Scan for Linux
1 new defect(s) introduced to Linux found with Coverity Scan.
2 defect(s), reported by Coverity Scan earlier, were marked fixed in the recent build analyzed by Coverity Scan.
New defect(s) Reported-by: Coverity Scan
Showing 1 of 1 defect(s)
** CID 1405487: Resource leaks (RESOURCE_LEAK)
/net/sched/act_api.c: 635 in tcf_action_init_1()
________________________________________________________________________________________________________
*** CID 1405487: Resource leaks (RESOURCE_LEAK)
/net/sched/act_api.c: 635 in tcf_action_init_1()
629 * if it exists and is only bound to in a_o->init() then
630 * ACT_P_CREATED is not returned (a zero is).
631 */
632 if (err != ACT_P_CREATED)
633 module_put(a_o->owner);
634
>>> CID 1405487: Resource leaks (RESOURCE_LEAK)
>>> Variable "cookie" going out of scope leaks the storage it points to.
635 return a;
636
637 err_mod:
638 module_put(a_o->owner);
639 err_out:
640 if (cookie) {
^ permalink raw reply
* Re: [PATCH 1/2] net: dsa: b53: Add compatible strings for the Cygnus-family BCM11360.
From: Eric Anholt @ 2017-04-24 23:54 UTC (permalink / raw)
To: Scott Branden, Florian Fainelli, Vivien Didelot, Andrew Lunn,
netdev-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Mark Rutland,
devicetree-u79uwXL29TY76Z2rM5mHXA
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w, Ray Jui,
Scott Branden, Jon Mason
In-Reply-To: <a2f4f0c9-feea-291b-dae5-f4ed10f9c547-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 1263 bytes --]
Scott Branden <scott.branden-dY08KVG/lbpWk0Htik3J/w@public.gmane.org> writes:
> Minor comment in line
>
> On 17-04-24 02:50 PM, Eric Anholt wrote:
>> Cygnus is a small family of SoCs, of which we currently have
>> devicetree for BCM11360 and BCM58300. The 11360's B53 is mostly the
>> same as 58xx, just requiring a tiny bit of setup that was previously
>> missing.
>>
>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>> ---
>> Documentation/devicetree/bindings/net/dsa/b53.txt | 3 +++
>> drivers/net/dsa/b53/b53_srab.c | 2 ++
>> 2 files changed, 5 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt b/Documentation/devicetree/bindings/net/dsa/b53.txt
>> index d6c6e41648d4..49c93d3c0839 100644
>> --- a/Documentation/devicetree/bindings/net/dsa/b53.txt
>> +++ b/Documentation/devicetree/bindings/net/dsa/b53.txt
>> @@ -29,6 +29,9 @@ Required properties:
>> "brcm,bcm58625-srab"
>> "brcm,bcm88312-srab" and the mandatory "brcm,nsp-srab string
>>
>> + For the BCM11360 SoC, must be:
>> + "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab string
>> +
> place in alphabetical order in the doc?
Moved it above BCM5310x now. I hope that's what you meant.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply
* Re: [PATCH 1/2] net: dsa: b53: Add compatible strings for the Cygnus-family BCM11360.
From: Scott Branden @ 2017-04-24 23:57 UTC (permalink / raw)
To: Eric Anholt, Florian Fainelli, Vivien Didelot, Andrew Lunn,
netdev-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Mark Rutland,
devicetree-u79uwXL29TY76Z2rM5mHXA
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w, Ray Jui,
Scott Branden, Jon Mason
In-Reply-To: <8760hte4zw.fsf-omZaPlIz5HhaEpDpdNBo/KxOck334EZe@public.gmane.org>
On 17-04-24 04:54 PM, Eric Anholt wrote:
> Scott Branden <scott.branden-dY08KVG/lbpWk0Htik3J/w@public.gmane.org> writes:
>
>> Minor comment in line
>>
>> On 17-04-24 02:50 PM, Eric Anholt wrote:
>>> Cygnus is a small family of SoCs, of which we currently have
>>> devicetree for BCM11360 and BCM58300. The 11360's B53 is mostly the
>>> same as 58xx, just requiring a tiny bit of setup that was previously
>>> missing.
>>>
>>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>>> ---
>>> Documentation/devicetree/bindings/net/dsa/b53.txt | 3 +++
>>> drivers/net/dsa/b53/b53_srab.c | 2 ++
>>> 2 files changed, 5 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt b/Documentation/devicetree/bindings/net/dsa/b53.txt
>>> index d6c6e41648d4..49c93d3c0839 100644
>>> --- a/Documentation/devicetree/bindings/net/dsa/b53.txt
>>> +++ b/Documentation/devicetree/bindings/net/dsa/b53.txt
>>> @@ -29,6 +29,9 @@ Required properties:
>>> "brcm,bcm58625-srab"
>>> "brcm,bcm88312-srab" and the mandatory "brcm,nsp-srab string
>>>
>>> + For the BCM11360 SoC, must be:
>>> + "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab string
>>> +
>> place in alphabetical order in the doc?
>
> Moved it above BCM5310x now. I hope that's what you meant.
>
Yes, sorry for not being more clear.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Eric Anholt @ 2017-04-25 0:00 UTC (permalink / raw)
To: Andrew Lunn
Cc: Florian Fainelli, Vivien Didelot, netdev, Rob Herring,
Mark Rutland, devicetree, linux-arm-kernel, linux-kernel,
bcm-kernel-feedback-list, Ray Jui, Scott Branden, Jon Mason
In-Reply-To: <20170424220842.GA26241@lunn.ch>
[-- Attachment #1: Type: text/plain, Size: 568 bytes --]
Andrew Lunn <andrew@lunn.ch> writes:
>> + mdio: mdio@18002000 {
>> + compatible = "brcm,iproc-mdio";
>> + reg = <0x18002000 0x8>;
>> + #size-cells = <1>;
>> + #address-cells = <0>;
>> +
>> + gphy0: eth-gphy@0 {
>> + reg = <0>;
>> + max-speed = <1000>;
>> + };
>> +
>> + gphy1: eth-gphy@1 {
>> + reg = <1>;
>> + max-speed = <1000>;
>> + };
>> + };
>
> Hi Eric
>
> Do these max-speed properties do anything useful? Is the PHY capable
> of > 1Gbps?
Not sure where I had those copy-and-pasted from, but they don't seem
necessary. Dropped.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply
* Re: [PATCH 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Eric Anholt @ 2017-04-25 0:02 UTC (permalink / raw)
To: Florian Fainelli, Vivien Didelot, Andrew Lunn,
netdev-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Mark Rutland,
devicetree-u79uwXL29TY76Z2rM5mHXA
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w, Ray Jui,
Scott Branden, Jon Mason
In-Reply-To: <1759fb29-0678-0ec5-5398-16c4a3ba9660-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 613 bytes --]
Florian Fainelli <f.fainelli-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> On 04/24/2017 02:50 PM, Eric Anholt wrote:
>> Cygnus has a single amac controller connected to the B53 switch with 2
>> PHYs. On the BCM911360_EP platform, those two PHYs are connected to
>> the external ethernet jacks.
>>
>> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
>
> This looks fine, just a few nits on the label names:
Thanks. I've applied all of these (and Andrew's and Scott's
suggestions), and I'll send out a new version once the DT maintainers
have had a chance to look.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply
* Re: Fw: New Defects reported by Coverity Scan for Linux
From: Cong Wang @ 2017-04-25 0:09 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Jamal Hadi Salim, Jiri Pirko, Linux Kernel Network Developers
In-Reply-To: <20170424164127.5a2145e8@xeon-e3>
On Mon, Apr 24, 2017 at 4:41 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> This looks like a false positive
> ________________________________________________________________________________________________________
> *** CID 1405487: Resource leaks (RESOURCE_LEAK)
> /net/sched/act_api.c: 635 in tcf_action_init_1()
> 629 * if it exists and is only bound to in a_o->init() then
> 630 * ACT_P_CREATED is not returned (a zero is).
> 631 */
> 632 if (err != ACT_P_CREATED)
> 633 module_put(a_o->owner);
> 634
>>>> CID 1405487: Resource leaks (RESOURCE_LEAK)
>>>> Variable "cookie" going out of scope leaks the storage it points to.
> 635 return a;
I don't see how we could leak it either.
^ permalink raw reply
* Re: Fw: New Defects reported by Coverity Scan for Linux
From: Jamal Hadi Salim @ 2017-04-25 0:23 UTC (permalink / raw)
To: Cong Wang; +Cc: Stephen Hemminger, Jiri Pirko, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpUTxNdp-uC0pGkvW0uceOpouAQ1abrU-u4wvR9ftqXN7w@mail.gmail.com>
Yeah, not sure i see it either.
cheers,
jamal
On Mon, Apr 24, 2017 at 8:09 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Apr 24, 2017 at 4:41 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> This looks like a false positive
>> ________________________________________________________________________________________________________
>> *** CID 1405487: Resource leaks (RESOURCE_LEAK)
>> /net/sched/act_api.c: 635 in tcf_action_init_1()
>> 629 * if it exists and is only bound to in a_o->init() then
>> 630 * ACT_P_CREATED is not returned (a zero is).
>> 631 */
>> 632 if (err != ACT_P_CREATED)
>> 633 module_put(a_o->owner);
>> 634
>>>>> CID 1405487: Resource leaks (RESOURCE_LEAK)
>>>>> Variable "cookie" going out of scope leaks the storage it points to.
>> 635 return a;
>
> I don't see how we could leak it either.
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Jamal Hadi Salim @ 2017-04-25 0:58 UTC (permalink / raw)
To: David Miller, benjamin.lahaise; +Cc: netdev, bcrl, Jiri Pirko
In-Reply-To: <20170424.143257.1003081554810761464.davem@davemloft.net>
On 17-04-24 02:32 PM, David Miller wrote:
> From: Benjamin LaHaise <benjamin.lahaise@netronome.com>
>
> Series applied, but in the future:
>
> 1) Put the "v2", "v3", whatever in the inital [PATCH xxx] bracketed
> part of the subject line, otherwise it ends up in the GIT commit
> message header line and that's not desired.
>
> 2) Please cut it with these double signoffs, and add an appropriate
> entry to the email aliases file.
I know i should have spoken earlier, wanted to but got distracted - but
shouldnt the new rules have applied to this patch too? ;->
You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
The other is a u32 which allows only 20 bits to be set.
cheers,
jamal
^ permalink raw reply
* Re: [PATCH v2] net: core: Prevent from dereferencing null pointer when
From: Myungho Jung @ 2017-04-25 1:00 UTC (permalink / raw)
To: David Miller; +Cc: edumazet, netdev
In-Reply-To: <20170424.120235.438297438450788048.davem@davemloft.net>
On Mon, Apr 24, 2017 at 12:02:35PM -0400, David Miller wrote:
> From: Myungho Jung <mhjungk@gmail.com>
> Date: Thu, 20 Apr 2017 16:59:20 -0700
>
> > Added NULL check to make __dev_kfree_skb_irq consistent with kfree
> > family of functions.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
> >
> > Signed-off-by: Myungho Jung <mhjungk@gmail.com>
> > ---
> > Changes in v2:
> > - Correct category in subject
>
> This subject line is an incomplete sentence.
>
> This patch prevents dereferenccing a null pointer when "what"?
As it was reported on bugzilla, it would happen when plugging p54 usb device
to RPi2. But, i'm not 100% sure that this patch will resolve the issue because
the reporter didn't try my patch yet and I don't have the device to test.
And there might be some other places calling the function without checking
null pointer. The thing is that only the function don't check null among
kfree functions. So, I just hope this patch will prevent potential oops
issues.
Thanks,
Myungho
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Benjamin LaHaise @ 2017-04-25 1:05 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: David Miller, netdev, bcrl, Jiri Pirko
In-Reply-To: <d54cca4b-6212-bf7d-93e2-8314177df42c@mojatatu.com>
On Mon, Apr 24, 2017 at 08:58:18PM -0400, Jamal Hadi Salim wrote:
> On 17-04-24 02:32 PM, David Miller wrote:
> > From: Benjamin LaHaise <benjamin.lahaise@netronome.com>
>
> >
> > Series applied, but in the future:
> >
> > 1) Put the "v2", "v3", whatever in the inital [PATCH xxx] bracketed
> > part of the subject line, otherwise it ends up in the GIT commit
> > message header line and that's not desired.
> >
> > 2) Please cut it with these double signoffs, and add an appropriate
> > entry to the email aliases file.
>
> I know i should have spoken earlier, wanted to but got distracted - but
> shouldnt the new rules have applied to this patch too? ;->
>
> You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
> The other is a u32 which allows only 20 bits to be set.
What are the new rules for TLVs -- do you want a new type added for the
subset values that are limited to the number of bits actually used? A
pointed here would be helpful.
-ben
> cheers,
> jamal
^ permalink raw reply
* Re: [PATCH v2] net: core: Prevent from dereferencing null pointer when
From: Eric Dumazet @ 2017-04-25 1:10 UTC (permalink / raw)
To: Myungho Jung; +Cc: David Miller, netdev
In-Reply-To: <20170425010052.GA8717@fqdn.specialj.com>
On Mon, Apr 24, 2017 at 6:00 PM, Myungho Jung <mhjungk@gmail.com> wrote:
> On Mon, Apr 24, 2017 at 12:02:35PM -0400, David Miller wrote:
>> From: Myungho Jung <mhjungk@gmail.com>
>> Date: Thu, 20 Apr 2017 16:59:20 -0700
>>
>> > Added NULL check to make __dev_kfree_skb_irq consistent with kfree
>> > family of functions.
>> >
>> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
>> >
>> > Signed-off-by: Myungho Jung <mhjungk@gmail.com>
>> > ---
>> > Changes in v2:
>> > - Correct category in subject
>>
>> This subject line is an incomplete sentence.
>>
>> This patch prevents dereferenccing a null pointer when "what"?
>
> As it was reported on bugzilla, it would happen when plugging p54 usb device
> to RPi2. But, i'm not 100% sure that this patch will resolve the issue because
> the reporter didn't try my patch yet and I don't have the device to test.
>
> And there might be some other places calling the function without checking
> null pointer. The thing is that only the function don't check null among
> kfree functions. So, I just hope this patch will prevent potential oops
> issues.
Okay, but your patch title was : "net: core: Prevent from
dereferencing null pointer when"
"when" is usually followed by other words.
^ permalink raw reply
* EXTREMELY IMPORTANT
From: Ms. Katherine @ 2017-04-24 22:33 UTC (permalink / raw)
To: Recipients
Dear Beloved, Sorry for the inconvenience, I am getting in touch with you regarding an extremely important and urgent matter, Please I need your urgent assistance and idea to finish up a project (Orphanage Home) Due to my health condition, Everything is available to finish up the project, All I need is your idea and trust...............
Geachte Geliefde, Sorry voor het ongemak, ik kom met je in contact met betrekking tot een zeer belangrijke en dringende zaak. Gelieve mijn dringende hulp en idee nodig om een project af te ronden (Weeshuishuis) Vanwege mijn gezondheidstoestand is alles beschikbaar voor Voltooi het project, alles wat ik nodig heb is jouw idee en vertrouwen.
Please contact me for more details.
Contact Email: kathrynnmison@gmail.com
Thanks
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Jakub Kicinski @ 2017-04-25 1:20 UTC (permalink / raw)
To: Jamal Hadi Salim; +Cc: David Miller, benjamin.lahaise, netdev, bcrl, Jiri Pirko
In-Reply-To: <d54cca4b-6212-bf7d-93e2-8314177df42c@mojatatu.com>
On Mon, 24 Apr 2017 20:58:18 -0400, Jamal Hadi Salim wrote:
> On 17-04-24 02:32 PM, David Miller wrote:
> > From: Benjamin LaHaise <benjamin.lahaise@netronome.com>
>
> >
> > Series applied, but in the future:
> >
> > 1) Put the "v2", "v3", whatever in the inital [PATCH xxx] bracketed
> > part of the subject line, otherwise it ends up in the GIT commit
> > message header line and that's not desired.
> >
> > 2) Please cut it with these double signoffs, and add an appropriate
> > entry to the email aliases file.
>
> I know i should have spoken earlier, wanted to but got distracted - but
> shouldnt the new rules have applied to this patch too? ;->
>
> You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
> The other is a u32 which allows only 20 bits to be set.
I don't think we will ever reuse bits in a field which is called
MPLS_LABEL for anything else than an MPLS label. My understanding of
the conclusions from recent discussions was to either (a) make
attributes single purpose (i.e. separate u8 per flag); or (b) make
attributes u32/word size but validate the unused bits are zero. This
patch would fall under (a).
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Jamal Hadi Salim @ 2017-04-25 1:25 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: David Miller, netdev, bcrl, Jiri Pirko
In-Reply-To: <20170425010558.GC24425@nvt-d.home.kvack.org>
On 17-04-24 09:05 PM, Benjamin LaHaise wrote:
> On Mon, Apr 24, 2017 at 08:58:18PM -0400, Jamal Hadi Salim wrote:
>> On 17-04-24 02:32 PM, David Miller wrote:
>> You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
>> The other is a u32 which allows only 20 bits to be set.
>
> What are the new rules for TLVs -- do you want a new type added for the
> subset values that are limited to the number of bits actually used? A
> pointed here would be helpful.
>
The rules are to check for bits that are not used.
For example in the BOS TLV it would be required to
verify that the user did not set any other bit other than the
bos field. It is required to reject the whole thing if the user
set any other of the u8 bits.
cheers,
jamal
^ permalink raw reply
* MONTHLY UPDATE
From: IT Department @ 2017-04-24 23:17 UTC (permalink / raw)
To: netdev
Recently, we have detect some unusual activity on your account and as a result, all email users are urged to update their email account within 24 hours of receiving this e-mail, please click the link http://beam.to/6128 to confirm that your email account is up to date with the institution requirement.
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
^ permalink raw reply
* [PATCH net-next] selftests/net: Fix broken test case in psock_fanout
From: Mike Maloney @ 2017-04-25 1:29 UTC (permalink / raw)
To: netdev, davem; +Cc: Mike Maloney
From: Mike Maloney <maloney@google.com>
The error return falue form sock_fanout_open is -1, not zero. One test
case was checking for 0 instead of -1.
Tested: Built and tested in clean client.
Signed-off-by: Mike Maloney <maloney@google.com>
---
tools/testing/selftests/net/psock_fanout.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/psock_fanout.c b/tools/testing/selftests/net/psock_fanout.c
index b4b1d91fcea5..989f917068d1 100644
--- a/tools/testing/selftests/net/psock_fanout.c
+++ b/tools/testing/selftests/net/psock_fanout.c
@@ -305,7 +305,7 @@ static void test_unique_fanout_group_ids(void)
exit(1);
}
- if (sock_fanout_open(PACKET_FANOUT_CPU, first_group_id)) {
+ if (sock_fanout_open(PACKET_FANOUT_CPU, first_group_id) != -1) {
fprintf(stderr, "ERROR: joined group with wrong type.\n");
exit(1);
}
--
2.13.0.rc0.306.g87b477812d-goog
^ permalink raw reply related
* [PATCH net] netvsc: fix calculation of available send sections
From: Stephen Hemminger @ 2017-04-25 1:33 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
My change (introduced in 4.11) to use find_first_clear_bit
incorrectly assumed that the size argument was words, not bits.
The effect was only a small limited number of the available send
sections were being actually used. This can cause performance loss
with some workloads.
Since map_words is now used only during initialization, it can
be on stack instead of in per-device data.
Fixes: b58a185801da ("netvsc: simplify get next send section")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
drivers/net/hyperv/hyperv_net.h | 1 -
drivers/net/hyperv/netvsc.c | 9 ++++-----
2 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index f9f3dba7a588..db23cb36ae5c 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -751,7 +751,6 @@ struct netvsc_device {
u32 send_section_cnt;
u32 send_section_size;
unsigned long *send_section_map;
- int map_words;
/* Used for NetVSP initialization protocol */
struct completion channel_init_wait;
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 8dd0b8770328..15ef713d96c0 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -236,6 +236,7 @@ static int netvsc_init_buf(struct hv_device *device)
struct netvsc_device *net_device;
struct nvsp_message *init_packet;
struct net_device *ndev;
+ size_t map_words;
int node;
net_device = get_outbound_net_device(device);
@@ -401,11 +402,9 @@ static int netvsc_init_buf(struct hv_device *device)
net_device->send_section_size, net_device->send_section_cnt);
/* Setup state for managing the send buffer. */
- net_device->map_words = DIV_ROUND_UP(net_device->send_section_cnt,
- BITS_PER_LONG);
+ map_words = DIV_ROUND_UP(net_device->send_section_cnt, BITS_PER_LONG);
- net_device->send_section_map = kcalloc(net_device->map_words,
- sizeof(ulong), GFP_KERNEL);
+ net_device->send_section_map = kcalloc(map_words, sizeof(ulong), GFP_KERNEL);
if (net_device->send_section_map == NULL) {
ret = -ENOMEM;
goto cleanup;
@@ -683,7 +682,7 @@ static u32 netvsc_get_next_send_section(struct netvsc_device *net_device)
unsigned long *map_addr = net_device->send_section_map;
unsigned int i;
- for_each_clear_bit(i, map_addr, net_device->map_words) {
+ for_each_clear_bit(i, map_addr, net_device->send_section_cnt) {
if (sync_test_and_set_bit(i, map_addr) == 0)
return i;
}
--
2.11.0
^ permalink raw reply related
* Re: [PATCH v2] net: core: Prevent from dereferencing null pointer when
From: Myungho Jung @ 2017-04-25 1:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <CANn89iL1n67fkZm0ZL0zN0_AL_uWZDKZXY3mOrn0H96vZTX7zQ@mail.gmail.com>
On Mon, Apr 24, 2017 at 06:10:32PM -0700, Eric Dumazet wrote:
> On Mon, Apr 24, 2017 at 6:00 PM, Myungho Jung <mhjungk@gmail.com> wrote:
> > On Mon, Apr 24, 2017 at 12:02:35PM -0400, David Miller wrote:
> >> From: Myungho Jung <mhjungk@gmail.com>
> >> Date: Thu, 20 Apr 2017 16:59:20 -0700
> >>
> >> > Added NULL check to make __dev_kfree_skb_irq consistent with kfree
> >> > family of functions.
> >> >
> >> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
> >> >
> >> > Signed-off-by: Myungho Jung <mhjungk@gmail.com>
> >> > ---
> >> > Changes in v2:
> >> > - Correct category in subject
> >>
> >> This subject line is an incomplete sentence.
> >>
> >> This patch prevents dereferenccing a null pointer when "what"?
> >
> > As it was reported on bugzilla, it would happen when plugging p54 usb device
> > to RPi2. But, i'm not 100% sure that this patch will resolve the issue because
> > the reporter didn't try my patch yet and I don't have the device to test.
> >
> > And there might be some other places calling the function without checking
> > null pointer. The thing is that only the function don't check null among
> > kfree functions. So, I just hope this patch will prevent potential oops
> > issues.
>
> Okay, but your patch title was : "net: core: Prevent from
> dereferencing null pointer when"
>
> "when" is usually followed by other words.
Oh, sorry that was typo. I'll remove "when" from subject.
Thanks,
Myungho
^ permalink raw reply
* Re: [PATCH v2] net: core: Prevent from dereferencing null pointer when
From: David Miller @ 2017-04-25 1:44 UTC (permalink / raw)
To: mhjungk; +Cc: edumazet, netdev
In-Reply-To: <20170425010052.GA8717@fqdn.specialj.com>
From: Myungho Jung <mhjungk@gmail.com>
Date: Mon, 24 Apr 2017 18:00:52 -0700
> On Mon, Apr 24, 2017 at 12:02:35PM -0400, David Miller wrote:
>> From: Myungho Jung <mhjungk@gmail.com>
>> Date: Thu, 20 Apr 2017 16:59:20 -0700
>>
>> > Added NULL check to make __dev_kfree_skb_irq consistent with kfree
>> > family of functions.
>> >
>> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
>> >
>> > Signed-off-by: Myungho Jung <mhjungk@gmail.com>
>> > ---
>> > Changes in v2:
>> > - Correct category in subject
>>
>> This subject line is an incomplete sentence.
>>
>> This patch prevents dereferenccing a null pointer when "what"?
>
> As it was reported on bugzilla, it would happen when plugging p54 usb device
> to RPi2. But, i'm not 100% sure that this patch will resolve the issue because
> the reporter didn't try my patch yet and I don't have the device to test.
>
> And there might be some other places calling the function without checking
> null pointer. The thing is that only the function don't check null among
> kfree functions. So, I just hope this patch will prevent potential oops
> issues.
It doesn't check for a NULL pointer because it is almost exclusively
used in the interrupt paths where we know we have a non-NULL skb.
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Jamal Hadi Salim @ 2017-04-25 1:48 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: David Miller, benjamin.lahaise, netdev, bcrl, Jiri Pirko
In-Reply-To: <20170424182054.116d1a99@cakuba.netronome.com>
On 17-04-24 09:20 PM, Jakub Kicinski wrote:
> On Mon, 24 Apr 2017 20:58:18 -0400, Jamal Hadi Salim wrote:
>> On 17-04-24 02:32 PM, David Miller wrote:
>> You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
>> The other is a u32 which allows only 20 bits to be set.
>
> I don't think we will ever reuse bits in a field which is called
> MPLS_LABEL for anything else than an MPLS label.
That is true. So maybe bad example. It also helps the mpls disector
wont for example allow copying of more than 20 bits for a label.
cheers,
jamal
^ permalink raw reply
* Re: [PATCH 7/7] liquidio: use pcie_flr instead of duplicating it
From: Felix Manlunas @ 2017-04-25 1:53 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bjorn Helgaas, Giovanni Cabiddu, Salvatore Benedetto,
Mike Marciniszyn, Dennis Dalessandro, Derek Chickles,
Satanand Burla, Felix Manlunas, Raghu Vatsavayi, Jeff Kirsher,
linux-pci, qat-linux, linux-crypto, linux-rdma, netdev,
linux-kernel
In-Reply-To: <20170414191131.14286-8-hch@lst.de>
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 14 Apr 2017 21:11:31 +0200
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 15 +--------------
> 1 file changed, 1 insertion(+), 14 deletions(-)
>
> diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> index 9d5e03502c76..afdbf7fa016e 100644
> --- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> +++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> @@ -869,20 +869,7 @@ static void octeon_pci_flr(struct octeon_device *oct)
> pci_write_config_word(oct->pci_dev, PCI_COMMAND,
> PCI_COMMAND_INTX_DISABLE);
>
> - /* Wait for Transaction Pending bit clean */
> - msleep(100);
> - pcie_capability_read_word(oct->pci_dev, PCI_EXP_DEVSTA, &status);
> - if (status & PCI_EXP_DEVSTA_TRPND) {
> - dev_info(&oct->pci_dev->dev, "Function reset incomplete after 100ms, sleeping for 5 seconds\n");
> - ssleep(5);
> - pcie_capability_read_word(oct->pci_dev, PCI_EXP_DEVSTA,
> - &status);
> - if (status & PCI_EXP_DEVSTA_TRPND)
> - dev_info(&oct->pci_dev->dev, "Function reset still incomplete after 5s, reset anyway\n");
> - }
> - pcie_capability_set_word(oct->pci_dev, PCI_EXP_DEVCTL,
> - PCI_EXP_DEVCTL_BCR_FLR);
> - mdelay(100);
> + pcie_flr(oct->pci_dev);
>
> pci_cfg_access_unlock(oct->pci_dev);
>
> --
> 2.11.0
>
This patch works. I tested it on a LiquidIO NIC and the "next" branch of the
PCI git tree.
But the patch causes a gcc warning:
.../lio_vf_main.c: In function 'octeon_pci_flr':
.../lio_vf_main.c:862:6: warning: unused variable 'status' [-Wunused-variable]
Can you rework the patch to get rid of the warning? Thanks.
^ permalink raw reply
* Re: [PATCH net-next 0/2] flower: add MPLS matching support
From: Jamal Hadi Salim @ 2017-04-25 2:00 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: David Miller, benjamin.lahaise, netdev, bcrl, Jiri Pirko
In-Reply-To: <6ae82894-fc8b-2ad9-7dfd-cf74cd6042a1@mojatatu.com>
On 17-04-24 09:48 PM, Jamal Hadi Salim wrote:
> On 17-04-24 09:20 PM, Jakub Kicinski wrote:
>> On Mon, 24 Apr 2017 20:58:18 -0400, Jamal Hadi Salim wrote:
>>> On 17-04-24 02:32 PM, David Miller wrote:
>
>>> You have 3 TLVs, one of which is u8 that only allows use of 3 bits.
>>> The other is a u32 which allows only 20 bits to be set.
>>
>> I don't think we will ever reuse bits in a field which is called
>> MPLS_LABEL for anything else than an MPLS label.
>
> That is true. So maybe bad example. It also helps the mpls disector
> wont for example allow copying of more than 20 bits for a label.
Hrm. maybe I am wrong.
Lets say user sets all of the 8 bits in BOS,
what does setting
key_val->mpls_bos = nla_get_u8 do?
Same with the 20 bits for the label in the u32
or 3 bit bits in the u8 tc.
cheers,
jamal
^ permalink raw reply
* [PATCH net-next] bpf: map_get_next_key to return first key on NULL
From: Alexei Starovoitov @ 2017-04-25 2:00 UTC (permalink / raw)
To: David S . Miller; +Cc: Daniel Borkmann, Teng Qin, netdev, kernel-team
From: Teng Qin <qinteng@fb.com>
When iterating through a map, we need to find a key that does not exist
in the map so map_get_next_key will give us the first key of the map.
This often requires a lot of guessing in production systems.
This patch makes map_get_next_key return the first key when the key
pointer in the parameter is NULL.
Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/trace/events/bpf.h | 10 +++++++---
kernel/bpf/arraymap.c | 2 +-
kernel/bpf/hashtab.c | 9 +++++----
kernel/bpf/syscall.c | 20 ++++++++++++--------
tools/testing/selftests/bpf/test_maps.c | 29 +++++++++++++++++++++++++----
5 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
index c3a53fd47ff1..52c8425d144b 100644
--- a/include/trace/events/bpf.h
+++ b/include/trace/events/bpf.h
@@ -321,11 +321,14 @@ TRACE_EVENT(bpf_map_next_key,
__dynamic_array(u8, key, map->key_size)
__dynamic_array(u8, nxt, map->key_size)
__field(bool, key_trunc)
+ __field(bool, key_null)
__field(int, ufd)
),
TP_fast_assign(
- memcpy(__get_dynamic_array(key), key, map->key_size);
+ if (key)
+ memcpy(__get_dynamic_array(key), key, map->key_size);
+ __entry->key_null = !key;
memcpy(__get_dynamic_array(nxt), key_next, map->key_size);
__entry->type = map->map_type;
__entry->key_len = min(map->key_size, 16U);
@@ -336,8 +339,9 @@ TRACE_EVENT(bpf_map_next_key,
TP_printk("map type=%s ufd=%d key=[%s%s] next=[%s%s]",
__print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
__entry->ufd,
- __print_hex(__get_dynamic_array(key), __entry->key_len),
- __entry->key_trunc ? " ..." : "",
+ __entry->key_null ? "NULL" : __print_hex(__get_dynamic_array(key),
+ __entry->key_len),
+ __entry->key_trunc && !__entry->key_null ? " ..." : "",
__print_hex(__get_dynamic_array(nxt), __entry->key_len),
__entry->key_trunc ? " ..." : "")
);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index ec621df5a97a..5e00b2333c26 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -182,7 +182,7 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
static int array_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
{
struct bpf_array *array = container_of(map, struct bpf_array, map);
- u32 index = *(u32 *)key;
+ u32 index = key ? *(u32 *)key : U32_MAX;
u32 *next = (u32 *)next_key;
if (index >= array->map.max_entries) {
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index bc80c038e430..004334ea13ba 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -540,12 +540,15 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
struct hlist_nulls_head *head;
struct htab_elem *l, *next_l;
u32 hash, key_size;
- int i;
+ int i = 0;
WARN_ON_ONCE(!rcu_read_lock_held());
key_size = map->key_size;
+ if (!key)
+ goto find_first_elem;
+
hash = htab_map_hash(key, key_size);
head = select_bucket(htab, hash);
@@ -553,10 +556,8 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
/* lookup the key */
l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets);
- if (!l) {
- i = 0;
+ if (!l)
goto find_first_elem;
- }
/* key was found, get next key in the same bucket */
next_l = hlist_nulls_entry_safe(rcu_dereference_raw(hlist_nulls_next_rcu(&l->hash_node)),
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b89288e2b589..13642c73dca0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -536,14 +536,18 @@ static int map_get_next_key(union bpf_attr *attr)
if (IS_ERR(map))
return PTR_ERR(map);
- err = -ENOMEM;
- key = kmalloc(map->key_size, GFP_USER);
- if (!key)
- goto err_put;
-
- err = -EFAULT;
- if (copy_from_user(key, ukey, map->key_size) != 0)
- goto free_key;
+ if (ukey) {
+ err = -ENOMEM;
+ key = kmalloc(map->key_size, GFP_USER);
+ if (!key)
+ goto err_put;
+
+ err = -EFAULT;
+ if (copy_from_user(key, ukey, map->key_size) != 0)
+ goto free_key;
+ } else {
+ key = NULL;
+ }
err = -ENOMEM;
next_key = kmalloc(map->key_size, GFP_USER);
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 20f1871874df..a977c4f7b0ce 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -28,7 +28,7 @@ static int map_flags;
static void test_hashmap(int task, void *data)
{
- long long key, next_key, value;
+ long long key, next_key, first_key, value;
int fd;
fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(key), sizeof(value),
@@ -89,10 +89,13 @@ static void test_hashmap(int task, void *data)
assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT);
/* Iterate over two elements. */
+ assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 &&
+ (first_key == 1 || first_key == 2));
assert(bpf_map_get_next_key(fd, &key, &next_key) == 0 &&
- (next_key == 1 || next_key == 2));
+ (next_key == first_key));
assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 &&
- (next_key == 1 || next_key == 2));
+ (next_key == 1 || next_key == 2) &&
+ (next_key != first_key));
assert(bpf_map_get_next_key(fd, &next_key, &next_key) == -1 &&
errno == ENOENT);
@@ -105,6 +108,8 @@ static void test_hashmap(int task, void *data)
key = 0;
/* Check that map is empty. */
+ assert(bpf_map_get_next_key(fd, NULL, &next_key) == -1 &&
+ errno == ENOENT);
assert(bpf_map_get_next_key(fd, &key, &next_key) == -1 &&
errno == ENOENT);
@@ -133,7 +138,7 @@ static void test_hashmap_percpu(int task, void *data)
{
unsigned int nr_cpus = bpf_num_possible_cpus();
long long value[nr_cpus];
- long long key, next_key;
+ long long key, next_key, first_key;
int expected_key_mask = 0;
int fd, i;
@@ -193,7 +198,13 @@ static void test_hashmap_percpu(int task, void *data)
assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT);
/* Iterate over two elements. */
+ assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 &&
+ ((expected_key_mask & first_key) == first_key));
while (!bpf_map_get_next_key(fd, &key, &next_key)) {
+ if (first_key) {
+ assert(next_key == first_key);
+ first_key = 0;
+ }
assert((expected_key_mask & next_key) == next_key);
expected_key_mask &= ~next_key;
@@ -219,6 +230,8 @@ static void test_hashmap_percpu(int task, void *data)
key = 0;
/* Check that map is empty. */
+ assert(bpf_map_get_next_key(fd, NULL, &next_key) == -1 &&
+ errno == ENOENT);
assert(bpf_map_get_next_key(fd, &key, &next_key) == -1 &&
errno == ENOENT);
@@ -264,6 +277,8 @@ static void test_arraymap(int task, void *data)
assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == ENOENT);
/* Iterate over two elements. */
+ assert(bpf_map_get_next_key(fd, NULL, &next_key) == 0 &&
+ next_key == 0);
assert(bpf_map_get_next_key(fd, &key, &next_key) == 0 &&
next_key == 0);
assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 &&
@@ -319,6 +334,8 @@ static void test_arraymap_percpu(int task, void *data)
assert(bpf_map_lookup_elem(fd, &key, values) == -1 && errno == ENOENT);
/* Iterate over two elements. */
+ assert(bpf_map_get_next_key(fd, NULL, &next_key) == 0 &&
+ next_key == 0);
assert(bpf_map_get_next_key(fd, &key, &next_key) == 0 &&
next_key == 0);
assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 &&
@@ -400,6 +417,8 @@ static void test_map_large(void)
errno == E2BIG);
/* Iterate through all elements. */
+ assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
+ key.c = -1;
for (i = 0; i < MAP_SIZE; i++)
assert(bpf_map_get_next_key(fd, &key, &key) == 0);
assert(bpf_map_get_next_key(fd, &key, &key) == -1 && errno == ENOENT);
@@ -499,6 +518,7 @@ static void test_map_parallel(void)
errno == EEXIST);
/* Check that all elements were inserted. */
+ assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
key = -1;
for (i = 0; i < MAP_SIZE; i++)
assert(bpf_map_get_next_key(fd, &key, &key) == 0);
@@ -518,6 +538,7 @@ static void test_map_parallel(void)
/* Nothing should be left. */
key = -1;
+ assert(bpf_map_get_next_key(fd, NULL, &key) == -1 && errno == ENOENT);
assert(bpf_map_get_next_key(fd, &key, &key) == -1 && errno == ENOENT);
}
--
2.9.3
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox