* [PATCH] atm: idt77252: avoid accessing the data mapped to streaming DMA
From: Jia-Ju Bai @ 2020-08-02 9:33 UTC (permalink / raw)
To: 3chas3; +Cc: linux-atm-general, netdev, linux-kernel, Jia-Ju Bai
In queue_skb(), skb->data is mapped to streaming DMA on line 850:
dma_map_single(..., skb->data, ...);
Then skb->data is accessed on lines 862 and 863:
tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) |
(skb->data[2] << 8) | (skb->data[3] << 0);
and on lines 893 and 894:
tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) |
(skb->data[2] << 8) | (skb->data[3] << 0);
These accesses may cause data inconsistency between CPU cache and
hardware.
To fix this problem, the calculation result of skb->data is stored in a
local variable before DMA mapping, and then the driver accesses this
local variable instead of skb->data.
Signed-off-by: Jia-Ju Bai <baijiaju@tsinghua.edu.cn>
---
drivers/atm/idt77252.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index df51680e8931..65a3886f68c9 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -835,6 +835,7 @@ queue_skb(struct idt77252_dev *card, struct vc_map *vc,
unsigned long flags;
int error;
int aal;
+ u32 word4;
if (skb->len == 0) {
printk("%s: invalid skb->len (%d)\n", card->name, skb->len);
@@ -846,6 +847,8 @@ queue_skb(struct idt77252_dev *card, struct vc_map *vc,
tbd = &IDT77252_PRV_TBD(skb);
vcc = ATM_SKB(skb)->vcc;
+ word4 = (skb->data[0] << 24) | (skb->data[1] << 16) |
+ (skb->data[2] << 8) | (skb->data[3] << 0);
IDT77252_PRV_PADDR(skb) = dma_map_single(&card->pcidev->dev, skb->data,
skb->len, DMA_TO_DEVICE);
@@ -859,8 +862,7 @@ queue_skb(struct idt77252_dev *card, struct vc_map *vc,
tbd->word_1 = SAR_TBD_OAM | ATM_CELL_PAYLOAD | SAR_TBD_EPDU;
tbd->word_2 = IDT77252_PRV_PADDR(skb) + 4;
tbd->word_3 = 0x00000000;
- tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) |
- (skb->data[2] << 8) | (skb->data[3] << 0);
+ tbd->word_4 = word4;
if (test_bit(VCF_RSV, &vc->flags))
vc = card->vcs[0];
@@ -890,8 +892,7 @@ queue_skb(struct idt77252_dev *card, struct vc_map *vc,
tbd->word_2 = IDT77252_PRV_PADDR(skb) + 4;
tbd->word_3 = 0x00000000;
- tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) |
- (skb->data[2] << 8) | (skb->data[3] << 0);
+ tbd->word_4 = word4;
break;
case ATM_AAL5:
--
2.17.1
^ permalink raw reply related
* [PATCH] atm: eni: avoid accessing the data mapped to streaming DMA
From: Jia-Ju Bai @ 2020-08-02 9:16 UTC (permalink / raw)
To: 3chas3; +Cc: linux-atm-general, netdev, linux-kernel, Jia-Ju Bai
In do_tx(), skb->data is mapped to streaming DMA on line 1111:
paddr = dma_map_single(...,skb->data,DMA_TO_DEVICE);
Then skb->data is accessed on line 1153:
(skb->data[3] & 0xf)
This access may cause data inconsistency between CPU cache and hardware.
To fix this problem, skb->data[3] is assigned to a local variable before
DMA mapping, and then the driver accesses this local variable instead of
skb->data[3].
Signed-off-by: Jia-Ju Bai <baijiaju@tsinghua.edu.cn>
---
drivers/atm/eni.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 17d47ad03ab7..09f4e2f41363 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -1034,6 +1034,7 @@ static enum enq_res do_tx(struct sk_buff *skb)
u32 dma_rd,dma_wr;
u32 size; /* in words */
int aal5,dma_size,i,j;
+ unsigned char skb_data3;
DPRINTK(">do_tx\n");
NULLCHECK(skb);
@@ -1108,6 +1109,7 @@ DPRINTK("iovcnt = %d\n",skb_shinfo(skb)->nr_frags);
vcc->dev->number);
return enq_jam;
}
+ skb_data3 = skb->data[3];
paddr = dma_map_single(&eni_dev->pci_dev->dev,skb->data,skb->len,
DMA_TO_DEVICE);
ENI_PRV_PADDR(skb) = paddr;
@@ -1150,7 +1152,7 @@ DPRINTK("doing direct send\n"); /* @@@ well, this doesn't work anyway */
(size/(ATM_CELL_PAYLOAD/4)),tx->send+tx->tx_pos*4);
/*printk("dsc = 0x%08lx\n",(unsigned long) readl(tx->send+tx->tx_pos*4));*/
writel((vcc->vci << MID_SEG_VCI_SHIFT) |
- (aal5 ? 0 : (skb->data[3] & 0xf)) |
+ (aal5 ? 0 : (skb_data3 & 0xf)) |
(ATM_SKB(skb)->atm_options & ATM_ATMOPT_CLP ? MID_SEG_CLP : 0),
tx->send+((tx->tx_pos+1) & (tx->words-1))*4);
DPRINTK("size: %d, len:%d\n",size,skb->len);
--
2.17.1
^ permalink raw reply related
* [net-next PATCH] net: phy: mdio-mvusb: select MDIO_DEVRES in Kconfig
From: Bartosz Golaszewski @ 2020-08-02 7:49 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, Heiner Kallweit, Russell King,
David S . Miller, Jakub Kicinski
Cc: netdev, linux-kernel, Bartosz Golaszewski, kernel test robot
From: Bartosz Golaszewski <bgolaszewski@baylibre.com>
PHYLIB is not selected by the mvusb driver but it uses mdio devres
helpers. Explicitly select MDIO_DEVRES in this driver's Kconfig entry.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1814cff26739 ("net: phy: add a Kconfig option for mdio_devres")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
drivers/net/phy/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index e351d65533aa..7a756e0374fd 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -190,6 +190,7 @@ config MDIO_MSCC_MIIM
config MDIO_MVUSB
tristate "Marvell USB to MDIO Adapter"
depends on USB
+ select MDIO_DEVRES
help
A USB to MDIO converter present on development boards for
Marvell's Link Street family of Ethernet switches.
--
2.26.1
^ permalink raw reply related
* Re: possible deadlock in __dev_queue_xmit (3)
From: syzbot @ 2020-08-02 7:43 UTC (permalink / raw)
To: ap420073, davem, jhs, jiri, kuba, linux-kernel, netdev,
syzkaller-bugs, xiyou.wangcong
In-Reply-To: <000000000000ea90600598c9b089@google.com>
syzbot has bisected this issue to:
commit 1a33e10e4a95cb109ff1145098175df3113313ef
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun May 3 05:22:19 2020 +0000
net: partially revert dynamic lockdep key changes
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=10434970900000
start commit: d8b9faec Merge tag 'drm-fixes-2020-07-31' of git://anongit..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=12434970900000
console output: https://syzkaller.appspot.com/x/log.txt?x=14434970900000
kernel config: https://syzkaller.appspot.com/x/.config?x=c0cfcf935bcc94d2
dashboard link: https://syzkaller.appspot.com/bug?extid=3b165dac15094065651e
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=163e2c92900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=143f2bb8900000
Reported-by: syzbot+3b165dac15094065651e@syzkaller.appspotmail.com
Fixes: 1a33e10e4a95 ("net: partially revert dynamic lockdep key changes")
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
^ permalink raw reply
* Re: WARNING in hci_conn_timeout
From: Greg KH @ 2020-08-02 6:30 UTC (permalink / raw)
To: syzbot
Cc: coreteam, davem, devel, forest, johan.hedberg, kaber, kadlec,
kuba, linux-bluetooth, linux-kernel, marcel, netdev,
netfilter-devel, pablo, rvarsha016, syzkaller-bugs
In-Reply-To: <000000000000b54f9f05abd8cfbb@google.com>
On Sat, Aug 01, 2020 at 03:56:09PM -0700, syzbot wrote:
> syzbot has bisected this issue to:
>
> commit 3d30311c0e4d834c94e6a27d6242a942d6a76b85
> Author: Varsha Rao <rvarsha016@gmail.com>
> Date: Sun Oct 9 11:13:56 2016 +0000
>
> staging: vt6655: Removes unnecessary blank lines.
I doubt this is the real issue :(
^ permalink raw reply
* Re: [PATCH bpf-next] selftests/bpf: fix spurious test failures in core_retro selftest
From: Alexei Starovoitov @ 2020-08-02 6:20 UTC (permalink / raw)
To: John Fastabend
Cc: Andrii Nakryiko, bpf, Network Development, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Kernel Team
In-Reply-To: <5f2493377c396_54fa2b1d9fe285b478@john-XPS-13-9370.notmuch>
On Fri, Jul 31, 2020 at 2:55 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Andrii Nakryiko wrote:
> > core_retro selftest uses BPF program that's triggered on sys_enter
> > system-wide, but has no protection from some unrelated process doing syscall
> > while selftest is running. This leads to occasional test failures with
> > unexpected PIDs being returned. Fix that by filtering out all processes that
> > are not test_progs process.
> >
> > Fixes: fcda189a5133 ("selftests/bpf: Add test relying only on CO-RE and no recent kernel features")
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
Applied. Thanks
^ permalink raw reply
* Re: [PATCH V2 bpf-next] bpf: make __htab_lookup_and_delete_batch faster when map is almost empty
From: Yonghong Song @ 2020-08-02 5:23 UTC (permalink / raw)
To: Brian Vazquez, Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
David S . Miller
Cc: linux-kernel, netdev, bpf, Luigi Rizzo
In-Reply-To: <20200801180927.1003340-1-brianvv@google.com>
On 8/1/20 11:09 AM, Brian Vazquez wrote:
> While running some experiments it was observed that map_lookup_batch was 2x
> slower than get_next_key + lookup when the syscall overhead is minimal.
> This was because the map_lookup_batch implementation was more expensive
> traversing empty buckets, this can be really costly when the pre-allocated
> map is too big.
>
> This patch optimizes the case when the bucket is empty so we can move
> quickly to next bucket.
>
> The Benchmark was generated using the google/benchmark library[1]. When
> the benckmark is executed the number of iterations is governed by the
> amount of time the benckmarks takes, the number of iterations is at
> least 1 and not more than 1e9, until CPU time(of the entire binary, not
> just the part to measure), is greater than 0.5s. Time and CPU reported
> are the average of a single iteration over the iteration runs.
>
> The experiments to exercise the empty buckets are as follows:
>
> -The map was populated with a single entry to make sure that the syscall
> overhead is not helping the map_batch_lookup.
> -The size of the preallocated map was increased to show the effect of
> traversing empty buckets.
>
> To interpret the results, Benchmark is the name of the experiment where
> the first number correspond to the number of elements in the map, and
> the next one correspond to the size of the pre-allocated map. Time and
> CPU are average and correspond to the time elapsed per iteration and the
> system time consumtion per iteration.
thanks for explanation!
>
> Results:
>
> Using get_next_key + lookup:
>
> Benchmark Time(ns) CPU(ns) Iteration
> ---------------------------------------------------------------
> BM_DumpHashMap/1/1k 3593 3586 192680
> BM_DumpHashMap/1/4k 6004 5972 100000
> BM_DumpHashMap/1/16k 15755 15710 44341
> BM_DumpHashMap/1/64k 59525 59376 10000
>
> Using htab_lookup_batch before this patch:
> Benchmark Time(ns) CPU(ns) Iterations
> ---------------------------------------------------------------
> BM_DumpHashMap/1/1k 3933 3927 177978
> BM_DumpHashMap/1/4k 9192 9177 73951
> BM_DumpHashMap/1/16k 42011 41970 16789
> BM_DumpHashMap/1/64k 117895 117661 6135
>
> Using htab_lookup_batch with this patch:
> Benchmark Time(ns) CPU(ns) Iterations
> ---------------------------------------------------------------
> BM_DumpHashMap/1/1k 2809 2803 249212
> BM_DumpHashMap/1/4k 5318 5316 100000
> BM_DumpHashMap/1/16k 14925 14895 47448
> BM_DumpHashMap/1/64k 58870 58674 10000
>
> [1] https://github.com/google/benchmark.git
>
> Changelog:
>
> v1 -> v2:
> - Add more information about how to interpret the results
>
> Suggested-by: Luigi Rizzo <lrizzo@google.com>
> Cc: Yonghong Song <yhs@fb.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
> kernel/bpf/hashtab.c | 23 ++++++++---------------
> 1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 024276787055..b6d28bd6345b 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -1349,7 +1349,6 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> struct hlist_nulls_head *head;
> struct hlist_nulls_node *n;
> unsigned long flags = 0;
> - bool locked = false;
> struct htab_elem *l;
> struct bucket *b;
> int ret = 0;
> @@ -1408,19 +1407,19 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> dst_val = values;
> b = &htab->buckets[batch];
> head = &b->head;
> - /* do not grab the lock unless need it (bucket_cnt > 0). */
> - if (locked)
> - flags = htab_lock_bucket(htab, b);
>
> + l = hlist_nulls_entry_safe(rcu_dereference_raw(hlist_nulls_first_rcu(head)),
> + struct htab_elem, hash_node);
> + if (!l && (batch + 1 < htab->n_buckets)) {
> + batch++;
> + goto again_nocopy;
> + }
In this case, if batch + 1 == htab->n_buckets, we still go through
htab_lock_bucket/htab_unlock_bucket which is really not needed.
So since we are trying to optimize for performance, let us handle
the above case as well. We can do
if (!l) {
if (batch + 1 < htab->n_buckets) {
batch++;
goto again_nocopy;
}
bucket_cnt = 0;
goto done_bucket;
}
...
done_bucket:
rcu_read_unlock();
bpf_enable_instrumentation();
...
what do you think?
> +
> + flags = htab_lock_bucket(htab, b);
> bucket_cnt = 0;
> hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
> bucket_cnt++;
>
> - if (bucket_cnt && !locked) {
> - locked = true;
> - goto again_nocopy;
> - }
> -
> if (bucket_cnt > (max_count - total)) {
> if (total == 0)
> ret = -ENOSPC;
> @@ -1446,10 +1445,6 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> goto alloc;
> }
>
> - /* Next block is only safe to run if you have grabbed the lock */
> - if (!locked)
> - goto next_batch;
> -
> hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
> memcpy(dst_key, l->key, key_size);
>
> @@ -1492,7 +1487,6 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> }
>
> htab_unlock_bucket(htab, b, flags);
> - locked = false;
>
> while (node_to_free) {
> l = node_to_free;
> @@ -1500,7 +1494,6 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> bpf_lru_push_free(&htab->lru, &l->lru_node);
> }
>
> -next_batch:
> /* If we are not copying data, we can go to next bucket and avoid
> * unlocking the rcu.
> */
>
^ permalink raw reply
* [PATCH] appletalk: Fix atalk_proc_init() return path
From: Lukas Wunner @ 2020-08-02 5:06 UTC (permalink / raw)
To: David Miller, Jakub Kicinski
Cc: Vincent Duvert, Christopher KOBAYASHI, Doug Brown, Yue Haibing,
netdev
From: Vincent Duvert <vincent.ldev@duvert.net>
Add a missing return statement to atalk_proc_init so it doesn't return
-ENOMEM when successful. This allows the appletalk module to load
properly.
Fixes: e2bcd8b0ce6e ("appletalk: use remove_proc_subtree to simplify procfs code")
Link: https://www.downtowndougbrown.com/2020/08/hacking-up-a-fix-for-the-broken-appletalk-kernel-module-in-linux-5-1-and-newer/
Reported-by: Christopher KOBAYASHI <chris@disavowed.jp>
Reported-by: Doug Brown <doug@downtowndougbrown.com>
Signed-off-by: Vincent Duvert <vincent.ldev@duvert.net>
[lukas: add missing tags]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.1+
Cc: Yue Haibing <yuehaibing@huawei.com>
---
net/appletalk/atalk_proc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/appletalk/atalk_proc.c b/net/appletalk/atalk_proc.c
index 550c6ca..9c12412 100644
--- a/net/appletalk/atalk_proc.c
+++ b/net/appletalk/atalk_proc.c
@@ -229,6 +229,8 @@ int __init atalk_proc_init(void)
sizeof(struct aarp_iter_state), NULL))
goto out;
+ return 0;
+
out:
remove_proc_subtree("atalk", init_net.proc_net);
return -ENOMEM;
--
2.27.0
^ permalink raw reply related
* Re: [PATCH -next] virtio_net: Avoid loop in virtnet_poll
From: Michael S. Tsirkin @ 2020-08-02 4:40 UTC (permalink / raw)
To: Mao Wenan; +Cc: jasowang, davem, kuba, netdev
In-Reply-To: <1596339683-117617-1-git-send-email-wenan.mao@linux.alibaba.com>
On Sun, Aug 02, 2020 at 11:41:23AM +0800, Mao Wenan wrote:
> The loop may exist if vq->broken is true,
> virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split
> will return NULL, so virtnet_poll will reschedule napi to
> receive packet, it will lead cpu usage(si) to 100%.
>
> call trace as below:
> virtnet_poll
> virtnet_receive
> virtqueue_get_buf_ctx
> virtqueue_get_buf_ctx_packed
> virtqueue_get_buf_ctx_split
> virtqueue_napi_complete
> virtqueue_napi_schedule //it will reschedule napi
>
> Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com>
I think it's more a bug in virtqueue_poll : virtqueue_get_buf reports
NULL on broken, so virtqueue_poll should report false for consistency.
> ---
> drivers/net/virtio_net.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ba38765..a058da1 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -327,7 +327,8 @@ static void virtqueue_napi_complete(struct napi_struct *napi,
>
> opaque = virtqueue_enable_cb_prepare(vq);
> if (napi_complete_done(napi, processed)) {
> - if (unlikely(virtqueue_poll(vq, opaque)))
> + if (unlikely(virtqueue_poll(vq, opaque)) &&
> + unlikely(!virtqueue_is_broken(vq)))
> virtqueue_napi_schedule(napi, vq);
> } else {
> virtqueue_disable_cb(vq);
> --
> 1.8.3.1
^ permalink raw reply
* Re: [PATCH 1/2 v2 net-next] 8390: Miscellaneous cleanups
From: Jakub Kicinski @ 2020-08-02 4:32 UTC (permalink / raw)
To: Armin Wolf; +Cc: davem, netdev
In-Reply-To: <20200801205242.GA9549@mx-linux-amd>
On Sat, 1 Aug 2020 22:52:42 +0200 Armin Wolf wrote:
> Replace version string with MODULE_* macros.
>
> Include necessary libraries.
>
> Fix two minor coding-style issues.
>
> Signed-off-by: Armin Wolf <W_Armin@gmx.de>
This doesn't build but also
../drivers/net/ethernet/8390/lib8390.c:973:17: error: undefined identifier 'version'
In file included from ../include/linux/kernel.h:15,
from ../drivers/net/ethernet/8390/8390.c:9:
../drivers/net/ethernet/8390/lib8390.c: In function ‘ethdev_setup’:
../drivers/net/ethernet/8390/lib8390.c:973:17: error: ‘version’ undeclared (first use in this function)
973 | pr_info("%s", version);
| ^~~~~~~
../include/linux/printk.h:368:34: note: in definition of macro ‘pr_info’
368 | printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~~~~
../drivers/net/ethernet/8390/lib8390.c:973:17: note: each undeclared identifier is reported only once for each function it appears in
973 | pr_info("%s", version);
| ^~~~~~~
../include/linux/printk.h:368:34: note: in definition of macro ‘pr_info’
368 | printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~~~~
make[5]: *** [../scripts/Makefile.build:281: drivers/net/ethernet/8390/8390.o] Error 1
make[4]: *** [../scripts/Makefile.build:497: drivers/net/ethernet/8390] Error 2
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [../scripts/Makefile.build:497: drivers/net/ethernet] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [../scripts/Makefile.build:497: drivers/net] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/netdev/net-next/Makefile:1771: drivers] Error 2
make: *** [Makefile:185: __sub-make] Error 2
> diff --git a/drivers/net/ethernet/8390/8390.c b/drivers/net/ethernet/8390/8390.c
> index 0e0aa4016858..aabb637c1fbf 100644
> --- a/drivers/net/ethernet/8390/8390.c
> +++ b/drivers/net/ethernet/8390/8390.c
> @@ -1,11 +1,26 @@
> // SPDX-License-Identifier: GPL-2.0-only
> -/* 8390 core for usual drivers */
>
> -static const char version[] =
> - "8390.c:v1.10cvs 9/23/94 Donald Becker (becker@cesdis.gsfc.nasa.gov)\n";
> +#define DRV_NAME "8390"
> +#define DRV_DESCRIPTION "8390 core for usual drivers"
> +#define DRV_AUTHOR "Donald Becker (becker@cesdis.gsfc.nasa.gov)"
> +#define DRV_VERSION "1.10cvs"
> +#define DRV_RELDATE "9/23/1994"
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/export.h>
> +
> +#include <linux/netdevice.h>
> +#include <linux/etherdevice.h>
>
> #include "lib8390.c"
>
> +MODULE_AUTHOR(DRV_AUTHOR);
> +MODULE_DESCRIPTION(DRV_DESCRIPTION);
> +MODULE_VERSION(DRV_VERSION);
Please drop the driver version, it looks pretty meaningless and we are
moving away from driver versions in general.
> +MODULE_LICENSE("GPL");
Why move this up? It's common to have these at the end.
> +
> int ei_open(struct net_device *dev)
> {
> return __ei_open(dev);
> @@ -64,7 +79,7 @@ const struct net_device_ops ei_netdev_ops = {
> .ndo_get_stats = ei_get_stats,
> .ndo_set_rx_mode = ei_set_multicast_list,
> .ndo_validate_addr = eth_validate_addr,
> - .ndo_set_mac_address = eth_mac_addr,
> + .ndo_set_mac_address = eth_mac_addr,
> #ifdef CONFIG_NET_POLL_CONTROLLER
> .ndo_poll_controller = ei_poll,
> #endif
> @@ -74,6 +89,7 @@ EXPORT_SYMBOL(ei_netdev_ops);
> struct net_device *__alloc_ei_netdev(int size)
> {
> struct net_device *dev = ____alloc_ei_netdev(size);
> +
> if (dev)
> dev->netdev_ops = &ei_netdev_ops;
> return dev;
> @@ -100,4 +116,3 @@ static void __exit ns8390_module_exit(void)
> module_init(ns8390_module_init);
> module_exit(ns8390_module_exit);
> #endif /* MODULE */
> -MODULE_LICENSE("GPL");
> --
> 2.20.1
^ permalink raw reply
* Re: [PATCH net-next] cxgb4: Add support to flash firmware config image
From: Jakub Kicinski @ 2020-08-02 4:22 UTC (permalink / raw)
To: Rahul Lakkireddy; +Cc: Ganji Aravind, netdev, davem, vishal
In-Reply-To: <20200731211733.GA25665@chelsio.com>
On Sat, 1 Aug 2020 02:47:38 +0530 Rahul Lakkireddy wrote:
> I thought /lib/firmware is where firmware related files need to be
> placed and ethtool --flash needs to be used to program them to
> their respective locations on adapter's flash.
Our goal is to provide solid, common interfaces for Linux users to rely
on. Not give way to vendor specific "solutions" like uploading ini files
to perform device configuration.
> Note that we don't have devlink support in our driver yet. And,
> we're a bit confused here on why this already existing ethtool
> feature needs to be duplicated to devlink.
To be clear - I'm suggesting the creation of a more targeted APIs
to control the settings you have encoded _inside_ the ini file.
Not a new interface for an whole sale config upload.
Worst case scenario - if the settings are really device specific
you can try to use devlink device parameters.
^ permalink raw reply
* [PATCH bpf-next 1/2] bpf: change uapi for bpf iterator map elements
From: Yonghong Song @ 2020-08-02 4:21 UTC (permalink / raw)
To: bpf, netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20200802042126.2119783-1-yhs@fb.com>
Commit a5cbe05a6673 ("bpf: Implement bpf iterator for
map elements") added bpf iterator support for
map elements. The map element bpf iterator requires
info to identify a particular map. In the above
commit, the attr->link_create.target_fd is used
to carry map_fd and an enum bpf_iter_link_info
is added to uapi to specify the target_fd actually
representing a map_fd:
enum bpf_iter_link_info {
BPF_ITER_LINK_UNSPEC = 0,
BPF_ITER_LINK_MAP_FD = 1,
MAX_BPF_ITER_LINK_INFO,
};
This is an extensible approach as we can grow
enumerator for pid, cgroup_id, etc. and we can
unionize target_fd for pid, cgroup_id, etc.
But in the future, there are chances that
more complex customization may happen, e.g.,
for tasks, it could be filtered based on
both cgroup_id and user_id.
This patch changed the uapi to have fields
__aligned_u64 iter_info;
__u32 iter_info_len;
for additional iter_info for link_create.
The iter_info is defined as
union bpf_iter_link_info {
struct {
__u32 map_fd;
} map;
};
So future extension for additional customization
will be easier. The bpf_iter_link_info will be
passed to target callback to validate and generic
bpf_iter framework does not need to deal it any
more.
Signed-off-by: Yonghong Song <yhs@fb.com>
---
include/linux/bpf.h | 10 ++++---
include/uapi/linux/bpf.h | 15 +++++-----
kernel/bpf/bpf_iter.c | 52 +++++++++++++++-------------------
kernel/bpf/map_iter.c | 37 ++++++++++++++++++------
kernel/bpf/syscall.c | 2 +-
net/core/bpf_sk_storage.c | 37 ++++++++++++++++++------
tools/include/uapi/linux/bpf.h | 15 +++++-----
7 files changed, 104 insertions(+), 64 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cef4ef0d2b4e..55f694b63164 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1214,15 +1214,17 @@ struct bpf_iter_aux_info {
struct bpf_map *map;
};
-typedef int (*bpf_iter_check_target_t)(struct bpf_prog *prog,
- struct bpf_iter_aux_info *aux);
+typedef int (*bpf_iter_attach_target_t)(struct bpf_prog *prog,
+ union bpf_iter_link_info *linfo,
+ struct bpf_iter_aux_info *aux);
+typedef void (*bpf_iter_detach_target_t)(struct bpf_iter_aux_info *aux);
#define BPF_ITER_CTX_ARG_MAX 2
struct bpf_iter_reg {
const char *target;
- bpf_iter_check_target_t check_target;
+ bpf_iter_attach_target_t attach_target;
+ bpf_iter_detach_target_t detach_target;
u32 ctx_arg_info_size;
- enum bpf_iter_link_info req_linfo;
struct bpf_ctx_arg_aux ctx_arg_info[BPF_ITER_CTX_ARG_MAX];
const struct bpf_iter_seq_info *seq_info;
};
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b134e679e9db..0480f893facd 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -81,6 +81,12 @@ struct bpf_cgroup_storage_key {
__u32 attach_type; /* program attach type */
};
+union bpf_iter_link_info {
+ struct {
+ __u32 map_fd;
+ } map;
+};
+
/* BPF syscall commands, see bpf(2) man-page for details. */
enum bpf_cmd {
BPF_MAP_CREATE,
@@ -249,13 +255,6 @@ enum bpf_link_type {
MAX_BPF_LINK_TYPE,
};
-enum bpf_iter_link_info {
- BPF_ITER_LINK_UNSPEC = 0,
- BPF_ITER_LINK_MAP_FD = 1,
-
- MAX_BPF_ITER_LINK_INFO,
-};
-
/* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
*
* NONE(default): No further bpf programs allowed in the subtree.
@@ -623,6 +622,8 @@ union bpf_attr {
};
__u32 attach_type; /* attach type */
__u32 flags; /* extra flags */
+ __aligned_u64 iter_info; /* extra bpf_iter_link_info */
+ __u32 iter_info_len; /* iter_info length */
} link_create;
struct { /* struct used by BPF_LINK_UPDATE command */
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 363b9cafc2d8..20d62020807f 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -338,8 +338,8 @@ static void bpf_iter_link_release(struct bpf_link *link)
struct bpf_iter_link *iter_link =
container_of(link, struct bpf_iter_link, link);
- if (iter_link->aux.map)
- bpf_map_put_with_uref(iter_link->aux.map);
+ if (iter_link->tinfo->reg_info->detach_target)
+ iter_link->tinfo->reg_info->detach_target(&iter_link->aux);
}
static void bpf_iter_link_dealloc(struct bpf_link *link)
@@ -390,15 +390,29 @@ bool bpf_link_is_iter(struct bpf_link *link)
int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
+ union bpf_iter_link_info __user *ulinfo;
struct bpf_link_primer link_primer;
struct bpf_iter_target_info *tinfo;
- struct bpf_iter_aux_info aux = {};
+ union bpf_iter_link_info linfo;
struct bpf_iter_link *link;
- u32 prog_btf_id, target_fd;
+ u32 prog_btf_id, linfo_len;
bool existed = false;
- struct bpf_map *map;
int err;
+ memset(&linfo, 0, sizeof(union bpf_iter_link_info));
+
+ ulinfo = u64_to_user_ptr(attr->link_create.iter_info);
+ linfo_len = attr->link_create.iter_info_len;
+ if (ulinfo && linfo_len) {
+ err = bpf_check_uarg_tail_zero(ulinfo, sizeof(linfo),
+ linfo_len);
+ if (err)
+ return err;
+ linfo_len = min_t(u32, linfo_len, sizeof(linfo));
+ if (copy_from_user(&linfo, ulinfo, linfo_len))
+ return -EFAULT;
+ }
+
prog_btf_id = prog->aux->attach_btf_id;
mutex_lock(&targets_mutex);
list_for_each_entry(tinfo, &targets, list) {
@@ -411,13 +425,6 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
if (!existed)
return -ENOENT;
- /* Make sure user supplied flags are target expected. */
- target_fd = attr->link_create.target_fd;
- if (attr->link_create.flags != tinfo->reg_info->req_linfo)
- return -EINVAL;
- if (!attr->link_create.flags && target_fd)
- return -EINVAL;
-
link = kzalloc(sizeof(*link), GFP_USER | __GFP_NOWARN);
if (!link)
return -ENOMEM;
@@ -431,28 +438,15 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
return err;
}
- if (tinfo->reg_info->req_linfo == BPF_ITER_LINK_MAP_FD) {
- map = bpf_map_get_with_uref(target_fd);
- if (IS_ERR(map)) {
- err = PTR_ERR(map);
- goto cleanup_link;
- }
-
- aux.map = map;
- err = tinfo->reg_info->check_target(prog, &aux);
+ if (tinfo->reg_info->attach_target) {
+ err = tinfo->reg_info->attach_target(prog, &linfo, &link->aux);
if (err) {
- bpf_map_put_with_uref(map);
- goto cleanup_link;
+ bpf_link_cleanup(&link_primer);
+ return err;
}
-
- link->aux.map = map;
}
return bpf_link_settle(&link_primer);
-
-cleanup_link:
- bpf_link_cleanup(&link_primer);
- return err;
}
static void init_seq_meta(struct bpf_iter_priv_data *priv_data,
diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c
index fbe1f557cb88..131603fe1cbf 100644
--- a/kernel/bpf/map_iter.c
+++ b/kernel/bpf/map_iter.c
@@ -98,12 +98,21 @@ static struct bpf_iter_reg bpf_map_reg_info = {
.seq_info = &bpf_map_seq_info,
};
-static int bpf_iter_check_map(struct bpf_prog *prog,
- struct bpf_iter_aux_info *aux)
+static int bpf_iter_attach_map(struct bpf_prog *prog,
+ union bpf_iter_link_info *linfo,
+ struct bpf_iter_aux_info *aux)
{
u32 key_acc_size, value_acc_size, key_size, value_size;
- struct bpf_map *map = aux->map;
+ struct bpf_map *map;
bool is_percpu = false;
+ int err = -EINVAL;
+
+ if (!linfo->map.map_fd)
+ return -EINVAL;
+
+ map = bpf_map_get_with_uref(linfo->map.map_fd);
+ if (IS_ERR(map))
+ return PTR_ERR(map);
if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
@@ -112,7 +121,7 @@ static int bpf_iter_check_map(struct bpf_prog *prog,
else if (map->map_type != BPF_MAP_TYPE_HASH &&
map->map_type != BPF_MAP_TYPE_LRU_HASH &&
map->map_type != BPF_MAP_TYPE_ARRAY)
- return -EINVAL;
+ goto put_map;
key_acc_size = prog->aux->max_rdonly_access;
value_acc_size = prog->aux->max_rdwr_access;
@@ -122,10 +131,22 @@ static int bpf_iter_check_map(struct bpf_prog *prog,
else
value_size = round_up(map->value_size, 8) * num_possible_cpus();
- if (key_acc_size > key_size || value_acc_size > value_size)
- return -EACCES;
+ if (key_acc_size > key_size || value_acc_size > value_size) {
+ err = -EACCES;
+ goto put_map;
+ }
+ aux->map = map;
return 0;
+
+put_map:
+ bpf_map_put_with_uref(map);
+ return err;
+}
+
+static void bpf_iter_detach_map(struct bpf_iter_aux_info *aux)
+{
+ bpf_map_put_with_uref(aux->map);
}
DEFINE_BPF_ITER_FUNC(bpf_map_elem, struct bpf_iter_meta *meta,
@@ -133,8 +154,8 @@ DEFINE_BPF_ITER_FUNC(bpf_map_elem, struct bpf_iter_meta *meta,
static const struct bpf_iter_reg bpf_map_elem_reg_info = {
.target = "bpf_map_elem",
- .check_target = bpf_iter_check_map,
- .req_linfo = BPF_ITER_LINK_MAP_FD,
+ .attach_target = bpf_iter_attach_map,
+ .detach_target = bpf_iter_detach_map,
.ctx_arg_info_size = 2,
.ctx_arg_info = {
{ offsetof(struct bpf_iter__bpf_map_elem, key),
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2f343ce15747..86299a292214 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3883,7 +3883,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *
return -EINVAL;
}
-#define BPF_LINK_CREATE_LAST_FIELD link_create.flags
+#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
static int link_create(union bpf_attr *attr)
{
enum bpf_prog_type ptype;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index d3377c90a291..6c0bdb5a0b26 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -1384,18 +1384,39 @@ static int bpf_iter_init_sk_storage_map(void *priv_data,
return 0;
}
-static int bpf_iter_check_map(struct bpf_prog *prog,
- struct bpf_iter_aux_info *aux)
+static int bpf_iter_attach_map(struct bpf_prog *prog,
+ union bpf_iter_link_info *linfo,
+ struct bpf_iter_aux_info *aux)
{
- struct bpf_map *map = aux->map;
+ struct bpf_map *map;
+ int err = -EINVAL;
- if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
+ if (!linfo->map.map_fd)
return -EINVAL;
- if (prog->aux->max_rdonly_access > map->value_size)
- return -EACCES;
+ map = bpf_map_get_with_uref(linfo->map.map_fd);
+ if (IS_ERR(map))
+ return PTR_ERR(map);
+
+ if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
+ goto put_map;
+
+ if (prog->aux->max_rdonly_access > map->value_size) {
+ err = -EACCES;
+ goto put_map;
+ }
+ aux->map = map;
return 0;
+
+put_map:
+ bpf_map_put_with_uref(map);
+ return err;
+}
+
+static void bpf_iter_detach_map(struct bpf_iter_aux_info *aux)
+{
+ bpf_map_put_with_uref(aux->map);
}
static const struct seq_operations bpf_sk_storage_map_seq_ops = {
@@ -1414,8 +1435,8 @@ static const struct bpf_iter_seq_info iter_seq_info = {
static struct bpf_iter_reg bpf_sk_storage_map_reg_info = {
.target = "bpf_sk_storage_map",
- .check_target = bpf_iter_check_map,
- .req_linfo = BPF_ITER_LINK_MAP_FD,
+ .attach_target = bpf_iter_attach_map,
+ .detach_target = bpf_iter_detach_map,
.ctx_arg_info_size = 2,
.ctx_arg_info = {
{ offsetof(struct bpf_iter__bpf_sk_storage_map, sk),
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b134e679e9db..0480f893facd 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -81,6 +81,12 @@ struct bpf_cgroup_storage_key {
__u32 attach_type; /* program attach type */
};
+union bpf_iter_link_info {
+ struct {
+ __u32 map_fd;
+ } map;
+};
+
/* BPF syscall commands, see bpf(2) man-page for details. */
enum bpf_cmd {
BPF_MAP_CREATE,
@@ -249,13 +255,6 @@ enum bpf_link_type {
MAX_BPF_LINK_TYPE,
};
-enum bpf_iter_link_info {
- BPF_ITER_LINK_UNSPEC = 0,
- BPF_ITER_LINK_MAP_FD = 1,
-
- MAX_BPF_ITER_LINK_INFO,
-};
-
/* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
*
* NONE(default): No further bpf programs allowed in the subtree.
@@ -623,6 +622,8 @@ union bpf_attr {
};
__u32 attach_type; /* attach type */
__u32 flags; /* extra flags */
+ __aligned_u64 iter_info; /* extra bpf_iter_link_info */
+ __u32 iter_info_len; /* iter_info length */
} link_create;
struct { /* struct used by BPF_LINK_UPDATE command */
--
2.24.1
^ permalink raw reply related
* [PATCH bpf-next 0/2] bpf: change uapi for bpf iterator map elements
From: Yonghong Song @ 2020-08-02 4:21 UTC (permalink / raw)
To: bpf, netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
Andrii raised a concern that current uapi for bpf iterator map
element is a little restrictive and not suitable for future potential
complex customization. This is a valid suggestion, considering people
may indeed add more complex custimization to the iterator, e.g.,
cgroup_id + user_id, etc. for task or task_file. Another example might
be map_id plus additional control so that the bpf iterator may bail
out a bucket earlier if a bucket has too many elements which may hold
lock too long and impact other parts of systems.
Patch #1 modified uapi with kernel changes. Patch #2
adjusted libbpf api accordingly.
Yonghong Song (2):
bpf: change uapi for bpf iterator map elements
libbpf: support new uapi for map element bpf iterator
include/linux/bpf.h | 10 ++++---
include/uapi/linux/bpf.h | 15 +++++-----
kernel/bpf/bpf_iter.c | 52 +++++++++++++++-------------------
kernel/bpf/map_iter.c | 37 ++++++++++++++++++------
kernel/bpf/syscall.c | 2 +-
net/core/bpf_sk_storage.c | 37 ++++++++++++++++++------
tools/include/uapi/linux/bpf.h | 15 +++++-----
tools/lib/bpf/bpf.c | 4 ++-
tools/lib/bpf/bpf.h | 5 ++--
tools/lib/bpf/libbpf.c | 7 +++--
10 files changed, 115 insertions(+), 69 deletions(-)
--
2.24.1
^ permalink raw reply
* [PATCH bpf-next 2/2] libbpf: support new uapi for map element bpf iterator
From: Yonghong Song @ 2020-08-02 4:21 UTC (permalink / raw)
To: bpf, netdev; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20200802042126.2119783-1-yhs@fb.com>
Previous commit adjusted kernel uapi for map
element bpf iterator. This patch adjusted libbpf API
due to uapi change.
Signed-off-by: Yonghong Song <yhs@fb.com>
---
tools/lib/bpf/bpf.c | 4 +++-
tools/lib/bpf/bpf.h | 5 +++--
tools/lib/bpf/libbpf.c | 7 +++++--
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index eab14c97c15d..c75a84398d51 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -598,7 +598,9 @@ int bpf_link_create(int prog_fd, int target_fd,
attr.link_create.prog_fd = prog_fd;
attr.link_create.target_fd = target_fd;
attr.link_create.attach_type = attach_type;
- attr.link_create.flags = OPTS_GET(opts, flags, 0);
+ attr.link_create.iter_info =
+ ptr_to_u64(OPTS_GET(opts, iter_info, (void *)0));
+ attr.link_create.iter_info_len = OPTS_GET(opts, iter_info_len, 0);
return sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 28855fd5b5f4..c9895f191305 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -170,9 +170,10 @@ LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
struct bpf_link_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
- __u32 flags;
+ union bpf_iter_link_info *iter_info;
+ __u32 iter_info_len;
};
-#define bpf_link_create_opts__last_field flags
+#define bpf_link_create_opts__last_field iter_info_len
LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7be04e45d29c..dc8fabf9d30d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8298,6 +8298,7 @@ bpf_program__attach_iter(struct bpf_program *prog,
const struct bpf_iter_attach_opts *opts)
{
DECLARE_LIBBPF_OPTS(bpf_link_create_opts, link_create_opts);
+ union bpf_iter_link_info linfo;
char errmsg[STRERR_BUFSIZE];
struct bpf_link *link;
int prog_fd, link_fd;
@@ -8307,8 +8308,10 @@ bpf_program__attach_iter(struct bpf_program *prog,
return ERR_PTR(-EINVAL);
if (OPTS_HAS(opts, map_fd)) {
- target_fd = opts->map_fd;
- link_create_opts.flags = BPF_ITER_LINK_MAP_FD;
+ memset(&linfo, 0, sizeof(linfo));
+ linfo.map.map_fd = opts->map_fd;
+ link_create_opts.iter_info = &linfo;
+ link_create_opts.iter_info_len = sizeof(linfo);
}
prog_fd = bpf_program__fd(prog);
--
2.24.1
^ permalink raw reply related
* Re: [PATCH v2 bpf-next 0/5] BPF link force-detach support
From: Alexei Starovoitov @ 2020-08-02 3:48 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Kernel Team
In-Reply-To: <20200731182830.286260-1-andriin@fb.com>
On Fri, Jul 31, 2020 at 11:29 AM Andrii Nakryiko <andriin@fb.com> wrote:
>
> This patch set adds new BPF link operation, LINK_DETACH, allowing processes
> with BPF link FD to force-detach it from respective BPF hook, similarly how
> BPF link is auto-detached when such BPF hook (e.g., cgroup, net_device, netns,
> etc) is removed. This facility allows admin to forcefully undo BPF link
> attachment, while process that created BPF link in the first place is left
> intact.
>
> Once force-detached, BPF link stays valid in the kernel as long as there is at
> least one FD open against it. It goes into defunct state, just like
> auto-detached BPF link.
>
> bpftool also got `link detach` command to allow triggering this in
> non-programmatic fashion.
>
> v1->v2:
> - improve error reporting in `bpftool link detach` (Song).
Applied. Thanks
^ permalink raw reply
* [PATCH -next] virtio_net: Avoid loop in virtnet_poll
From: Mao Wenan @ 2020-08-02 3:41 UTC (permalink / raw)
To: mst, jasowang, davem, kuba; +Cc: netdev, Mao Wenan
The loop may exist if vq->broken is true,
virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split
will return NULL, so virtnet_poll will reschedule napi to
receive packet, it will lead cpu usage(si) to 100%.
call trace as below:
virtnet_poll
virtnet_receive
virtqueue_get_buf_ctx
virtqueue_get_buf_ctx_packed
virtqueue_get_buf_ctx_split
virtqueue_napi_complete
virtqueue_napi_schedule //it will reschedule napi
Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com>
---
drivers/net/virtio_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ba38765..a058da1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -327,7 +327,8 @@ static void virtqueue_napi_complete(struct napi_struct *napi,
opaque = virtqueue_enable_cb_prepare(vq);
if (napi_complete_done(napi, processed)) {
- if (unlikely(virtqueue_poll(vq, opaque)))
+ if (unlikely(virtqueue_poll(vq, opaque)) &&
+ unlikely(!virtqueue_is_broken(vq)))
virtqueue_napi_schedule(napi, vq);
} else {
virtqueue_disable_cb(vq);
--
1.8.3.1
^ permalink raw reply related
* Re: [PATCH bpf-next] tools build: propagate build failures from tools/build/Makefile.build
From: Alexei Starovoitov @ 2020-08-02 3:33 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Kernel Team, Jiri Olsa, Arnaldo Carvalho de Melo,
Masahiro Yamada
In-Reply-To: <20200731024244.872574-1-andriin@fb.com>
On Thu, Jul 30, 2020 at 7:44 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> The '&&' command seems to have a bad effect when $(cmd_$(1)) exits with
> non-zero effect: the command failure is masked (despite `set -e`) and all but
> the first command of $(dep-cmd) is executed (successfully, as they are mostly
> printfs), thus overall returning 0 in the end.
>
> This means in practice that despite compilation errors, tools's build Makefile
> will return success. We see this very reliably with libbpf's Makefile, which
> doesn't get compilation error propagated properly. This in turns causes issues
> with selftests build, as well as bpftool and other projects that rely on
> building libbpf.
>
> The fix is simple: don't use &&. Given `set -e`, we don't need to chain
> commands with &&. The shell will exit on first failure, giving desired
> behavior and propagating error properly.
>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Fixes: 275e2d95591e ("tools build: Move dependency copy into function")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Jiri, Arnaldo,
could you please review and ack?
^ permalink raw reply
* Re: [PATCH v9 bpf-next 10/14] bpf: Add d_path helper
From: Alexei Starovoitov @ 2020-08-02 3:13 UTC (permalink / raw)
To: Jiri Olsa
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev, bpf,
Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
John Fastabend, Wenbo Zhang, KP Singh, Brendan Gregg,
Florent Revest, Al Viro
In-Reply-To: <20200801170322.75218-11-jolsa@kernel.org>
On Sat, Aug 01, 2020 at 07:03:18PM +0200, Jiri Olsa wrote:
> Adding d_path helper function that returns full path for
> given 'struct path' object, which needs to be the kernel
> BTF 'path' object. The path is returned in buffer provided
> 'buf' of size 'sz' and is zero terminated.
>
> bpf_d_path(&file->f_path, buf, size);
>
> The helper calls directly d_path function, so there's only
> limited set of function it can be called from. Adding just
> very modest set for the start.
>
> Updating also bpf.h tools uapi header and adding 'path' to
> bpf_helpers_doc.py script.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> include/uapi/linux/bpf.h | 13 +++++++++
> kernel/trace/bpf_trace.c | 48 ++++++++++++++++++++++++++++++++++
> scripts/bpf_helpers_doc.py | 2 ++
> tools/include/uapi/linux/bpf.h | 13 +++++++++
> 4 files changed, 76 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index eb5e0c38eb2c..a356ea1357bf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3389,6 +3389,18 @@ union bpf_attr {
> * A non-negative value equal to or less than *size* on success,
> * or a negative error in case of failure.
> *
> + * int bpf_d_path(struct path *path, char *buf, u32 sz)
Please make it return 'long'. As you well ware the generated code will be better.
^ permalink raw reply
* Re: [PATCH v6 bpf-next 0/6] bpf: tailcalls in BPF subprograms
From: Alexei Starovoitov @ 2020-08-02 3:07 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: Daniel Borkmann, ast, bpf, netdev, bjorn.topel, magnus.karlsson
In-Reply-To: <20200801071357.GA19421@ranger.igk.intel.com>
On Sat, Aug 01, 2020 at 09:13:57AM +0200, Maciej Fijalkowski wrote:
> On Sat, Aug 01, 2020 at 03:03:19AM +0200, Daniel Borkmann wrote:
> > On 7/31/20 2:03 AM, Maciej Fijalkowski wrote:
> > > v5->v6:
> > > - propagate only those poke descriptors that individual subprogram is
> > > actually using (Daniel)
> > > - drop the cumbersome check if poke desc got filled in map_poke_run()
> > > - move poke->ip renaming in bpf_jit_add_poke_descriptor() from patch 4
> > > to patch 3 to provide bisectability (Daniel)
> >
> > I did a basic test with Cilium on K8s with this set, spawning a few Pods
> > and checking connectivity & whether we're not crashing since it has bit more
> > elaborate tail call use. So far so good. I was inclined to push the series
> > out, but there is one more issue I noticed and didn't notice earlier when
> > reviewing, and that is overall stack size:
> >
> > What happens when you create a single program that has nested BPF to BPF
> > calls e.g. either up to the maximum nesting or one call that is using up
> > the max stack size which is then doing another BPF to BPF call that contains
> > the tail call. In the tail call map, you have the same program in there.
> > This means we create a worst case stack from BPF size of max_stack_size *
> > max_tail_call_size, that is, 512*32. So that adds 16k worst case. For x86
> > we have a stack of arch/x86/include/asm/page_64_types.h:
> >
> > #define THREAD_SIZE_ORDER (2 + KASAN_STACK_ORDER)
> > #define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER)
> >
> > So we end up with 16k in a typical case. And this will cause kernel stack
> > overflow; I'm at least not seeing where we handle this situation in the
Not quite. The subprog is always 32 byte stack (from safety pov).
The real stack (when JITed) can be lower or zero.
So the max stack is (512 - 32) * 32 = 15360.
So there is no overflow, but may be a bit too close to comfort.
Imo the room is ok to land the set and the better enforcement can
be done as a follow up later, like below idea...
> > set. Hm, need to think more, but maybe this needs tracking of max stack
> > across tail calls to force an upper limit..
>
> My knee jerk reaction would be to decrement the allowed max tail calls,
> but not sure if it's an option and if it would help.
How about make the verifier use a lower bound for a function with a tail call ?
Something like 64 would work.
subprog_info[idx].stack_depth with tail_call will be >= 64.
Then the main function will be automatically limited to 512-64 and the worst
case stack = 14kbyte.
When the sub prog with tail call is not an empty body (malicious stack
abuser) then the lower bound won't affect anything.
A bit annoying that stack_depth will be used by JIT to actually allocate
that much. Some of it will not be used potentially, but I think it's fine.
It's much simpler solution than to keep two variables to track stack size.
Or may be check_max_stack_depth() can be a bit smarter and it can detect
that subprog is using tail_call without actually hacking stack_depth variable.
Essentially I'm proposing to tweak this formula:
depth += round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
and replace 1 with 64 for subprogs with tail_call.
^ permalink raw reply
* Re: [GIT] Networking
From: David Miller @ 2020-08-02 1:45 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
In-Reply-To: <CAHk-=whiwBCM-a=k8bd4_umR2Od6gf7d8Do3ryGAaFneNRGFng@mail.gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sat, 1 Aug 2020 16:45:49 -0700
> On Sat, Aug 1, 2020 at 2:36 PM David Miller <davem@davemloft.net> wrote:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
>
> How is this wrt an rc8 or a final?
Nothing scary in there, I think you can safely do a -final with those
networking changes.
> On a very much related note, I really wish you didn't send the
> networking fixes the day before a release is scheduled.
>
> If it's really quiet., send them on (say) Wed/Thu. And then on
> Saturday, send a note saying "no, important stuff", hold on. Or say
> "nothing new".
>
> Because right now the "last-minute network pull request" has become a
> pattern, and I have a very hard time judging whether I should delay a
> release for it.
Sorry about that, just the way things work during the week I can't
catch my breath until late Friday night or Saturday usually to review
what I have and send a pull request to you.
I'll shoot for more mid-week pulls in the future.
^ permalink raw reply
* [PATCH bpf-next 3/3] tools/resolve_btfids: use libbpf's btf__parse() API
From: Andrii Nakryiko @ 2020-08-02 1:32 UTC (permalink / raw)
To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20200802013219.864880-1-andriin@fb.com>
Instead of re-implementing generic BTF parsing logic, use libbpf's API.
Also add .gitignore for resolve_btfids's build artifacts.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
tools/bpf/resolve_btfids/.gitignore | 4 ++
tools/bpf/resolve_btfids/main.c | 58 +----------------------------
2 files changed, 5 insertions(+), 57 deletions(-)
create mode 100644 tools/bpf/resolve_btfids/.gitignore
diff --git a/tools/bpf/resolve_btfids/.gitignore b/tools/bpf/resolve_btfids/.gitignore
new file mode 100644
index 000000000000..a026df7dc280
--- /dev/null
+++ b/tools/bpf/resolve_btfids/.gitignore
@@ -0,0 +1,4 @@
+/FEATURE-DUMP.libbpf
+/bpf_helper_defs.h
+/fixdep
+/resolve_btfids
diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c
index 6956b6350cad..52d883325a23 100644
--- a/tools/bpf/resolve_btfids/main.c
+++ b/tools/bpf/resolve_btfids/main.c
@@ -403,62 +403,6 @@ static int symbols_collect(struct object *obj)
return 0;
}
-static struct btf *btf__parse_raw(const char *file)
-{
- struct btf *btf;
- struct stat st;
- __u8 *buf;
- FILE *f;
-
- if (stat(file, &st))
- return NULL;
-
- f = fopen(file, "rb");
- if (!f)
- return NULL;
-
- buf = malloc(st.st_size);
- if (!buf) {
- btf = ERR_PTR(-ENOMEM);
- goto exit_close;
- }
-
- if ((size_t) st.st_size != fread(buf, 1, st.st_size, f)) {
- btf = ERR_PTR(-EINVAL);
- goto exit_free;
- }
-
- btf = btf__new(buf, st.st_size);
-
-exit_free:
- free(buf);
-exit_close:
- fclose(f);
- return btf;
-}
-
-static bool is_btf_raw(const char *file)
-{
- __u16 magic = 0;
- int fd, nb_read;
-
- fd = open(file, O_RDONLY);
- if (fd < 0)
- return false;
-
- nb_read = read(fd, &magic, sizeof(magic));
- close(fd);
- return nb_read == sizeof(magic) && magic == BTF_MAGIC;
-}
-
-static struct btf *btf_open(const char *path)
-{
- if (is_btf_raw(path))
- return btf__parse_raw(path);
- else
- return btf__parse_elf(path, NULL);
-}
-
static int symbols_resolve(struct object *obj)
{
int nr_typedefs = obj->nr_typedefs;
@@ -469,7 +413,7 @@ static int symbols_resolve(struct object *obj)
struct btf *btf;
__u32 nr;
- btf = btf_open(obj->btf ?: obj->path);
+ btf = btf__parse(obj->btf ?: obj->path, NULL);
err = libbpf_get_error(btf);
if (err) {
pr_err("FAILED: load BTF from %s: %s",
--
2.24.1
^ permalink raw reply related
* [PATCH bpf-next 2/3] tools/bpftool: use libbpf's btf__parse() API for parsing BTF from file
From: Andrii Nakryiko @ 2020-08-02 1:32 UTC (permalink / raw)
To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20200802013219.864880-1-andriin@fb.com>
Use generic libbpf API to parse BTF data from file, instead of re-implementing
it in bpftool.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
tools/bpf/bpftool/btf.c | 54 +----------------------------------------
1 file changed, 1 insertion(+), 53 deletions(-)
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index fc9bc7a23db6..42d1df2c1939 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -422,54 +422,6 @@ static int dump_btf_c(const struct btf *btf,
return err;
}
-static struct btf *btf__parse_raw(const char *file)
-{
- struct btf *btf;
- struct stat st;
- __u8 *buf;
- FILE *f;
-
- if (stat(file, &st))
- return NULL;
-
- f = fopen(file, "rb");
- if (!f)
- return NULL;
-
- buf = malloc(st.st_size);
- if (!buf) {
- btf = ERR_PTR(-ENOMEM);
- goto exit_close;
- }
-
- if ((size_t) st.st_size != fread(buf, 1, st.st_size, f)) {
- btf = ERR_PTR(-EINVAL);
- goto exit_free;
- }
-
- btf = btf__new(buf, st.st_size);
-
-exit_free:
- free(buf);
-exit_close:
- fclose(f);
- return btf;
-}
-
-static bool is_btf_raw(const char *file)
-{
- __u16 magic = 0;
- int fd, nb_read;
-
- fd = open(file, O_RDONLY);
- if (fd < 0)
- return false;
-
- nb_read = read(fd, &magic, sizeof(magic));
- close(fd);
- return nb_read == sizeof(magic) && magic == BTF_MAGIC;
-}
-
static int do_dump(int argc, char **argv)
{
struct btf *btf = NULL;
@@ -547,11 +499,7 @@ static int do_dump(int argc, char **argv)
}
NEXT_ARG();
} else if (is_prefix(src, "file")) {
- if (is_btf_raw(*argv))
- btf = btf__parse_raw(*argv);
- else
- btf = btf__parse_elf(*argv, NULL);
-
+ btf = btf__parse(*argv, NULL);
if (IS_ERR(btf)) {
err = -PTR_ERR(btf);
btf = NULL;
--
2.24.1
^ permalink raw reply related
* [PATCH bpf-next 0/3] Add generic and raw BTF parsing APIs to libbpf
From: Andrii Nakryiko @ 2020-08-02 1:32 UTC (permalink / raw)
To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
It's pretty common for applications to want to parse raw (binary) BTF data
from file, as opposed to parsing it from ELF sections. It's also pretty common
for tools to not care whether given file is ELF or raw BTF format. This patch
series exposes internal raw BTF parsing API and adds generic variant of BTF
parsing, which will efficiently determine the format of a given fail and will
parse BTF appropriately.
Patches #2 and #3 removes re-implementations of such APIs from bpftool and
resolve_btfids tools.
Andrii Nakryiko (3):
libbpf: add btf__parse_raw() and generic btf__parse() APIs
tools/bpftool: use libbpf's btf__parse() API for parsing BTF from file
tools/resolve_btfids: use libbpf's btf__parse() API
tools/bpf/bpftool/btf.c | 54 +------------
tools/bpf/resolve_btfids/.gitignore | 4 +
tools/bpf/resolve_btfids/main.c | 58 +-------------
tools/lib/bpf/btf.c | 114 +++++++++++++++++++---------
tools/lib/bpf/btf.h | 5 +-
tools/lib/bpf/libbpf.map | 2 +
6 files changed, 89 insertions(+), 148 deletions(-)
create mode 100644 tools/bpf/resolve_btfids/.gitignore
--
2.24.1
^ permalink raw reply
* [PATCH bpf-next 1/3] libbpf: add btf__parse_raw() and generic btf__parse() APIs
From: Andrii Nakryiko @ 2020-08-02 1:32 UTC (permalink / raw)
To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko
In-Reply-To: <20200802013219.864880-1-andriin@fb.com>
Add public APIs to parse BTF from raw data file (e.g.,
/sys/kernel/btf/vmlinux), as well as generic btf__parse(), which will try to
determine correct format, currently either raw or ELF.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
tools/lib/bpf/btf.c | 114 ++++++++++++++++++++++++++-------------
tools/lib/bpf/btf.h | 5 +-
tools/lib/bpf/libbpf.map | 2 +
3 files changed, 83 insertions(+), 38 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index ded5b29965f9..856b09a04563 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -562,6 +562,83 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
return btf;
}
+struct btf *btf__parse_raw(const char *path)
+{
+ void *data = NULL;
+ struct btf *btf;
+ FILE *f = NULL;
+ __u16 magic;
+ int err = 0;
+ long sz;
+
+ f = fopen(path, "rb");
+ if (!f) {
+ err = -errno;
+ goto err_out;
+ }
+
+ /* check BTF magic */
+ if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) {
+ err = -EIO;
+ goto err_out;
+ }
+ if (magic != BTF_MAGIC) {
+ /* definitely not a raw BTF */
+ err = -EPROTO;
+ goto err_out;
+ }
+
+ /* get file size */
+ if (fseek(f, 0, SEEK_END)) {
+ err = -errno;
+ goto err_out;
+ }
+ sz = ftell(f);
+ if (sz < 0) {
+ err = -errno;
+ goto err_out;
+ }
+ /* rewind to the start */
+ if (fseek(f, 0, SEEK_SET)) {
+ err = -errno;
+ goto err_out;
+ }
+
+ /* pre-alloc memory and read all of BTF data */
+ data = malloc(sz);
+ if (!data) {
+ err = -ENOMEM;
+ goto err_out;
+ }
+ if (fread(data, 1, sz, f) < sz) {
+ err = -EIO;
+ goto err_out;
+ }
+
+ /* finally parse BTF data */
+ btf = btf__new(data, sz);
+
+err_out:
+ free(data);
+ if (f)
+ fclose(f);
+ return err ? ERR_PTR(err) : btf;
+}
+
+struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
+{
+ struct btf *btf;
+
+ if (btf_ext)
+ *btf_ext = NULL;
+
+ btf = btf__parse_raw(path);
+ if (!IS_ERR(btf) || PTR_ERR(btf) != -EPROTO)
+ return btf;
+
+ return btf__parse_elf(path, btf_ext);
+}
+
static int compare_vsi_off(const void *_a, const void *_b)
{
const struct btf_var_secinfo *a = _a;
@@ -2951,41 +3028,6 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
return 0;
}
-static struct btf *btf_load_raw(const char *path)
-{
- struct btf *btf;
- size_t read_cnt;
- struct stat st;
- void *data;
- FILE *f;
-
- if (stat(path, &st))
- return ERR_PTR(-errno);
-
- data = malloc(st.st_size);
- if (!data)
- return ERR_PTR(-ENOMEM);
-
- f = fopen(path, "rb");
- if (!f) {
- btf = ERR_PTR(-errno);
- goto cleanup;
- }
-
- read_cnt = fread(data, 1, st.st_size, f);
- fclose(f);
- if (read_cnt < st.st_size) {
- btf = ERR_PTR(-EBADF);
- goto cleanup;
- }
-
- btf = btf__new(data, read_cnt);
-
-cleanup:
- free(data);
- return btf;
-}
-
/*
* Probe few well-known locations for vmlinux kernel image and try to load BTF
* data out of it to use for target BTF.
@@ -3021,7 +3063,7 @@ struct btf *libbpf_find_kernel_btf(void)
continue;
if (locations[i].raw_btf)
- btf = btf_load_raw(path);
+ btf = btf__parse_raw(path);
else
btf = btf__parse_elf(path, NULL);
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 491c7b41ffdc..f4a1a1d2b9a3 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -64,8 +64,9 @@ struct btf_ext_header {
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
-LIBBPF_API struct btf *btf__parse_elf(const char *path,
- struct btf_ext **btf_ext);
+LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
+LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
+LIBBPF_API struct btf *btf__parse_raw(const char *path);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index ca49a6a7e5b2..edf6a38807ea 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -291,5 +291,7 @@ LIBBPF_0.1.0 {
bpf_program__is_sk_lookup;
bpf_program__set_autoload;
bpf_program__set_sk_lookup;
+ btf__parse;
+ btf__parse_raw;
btf__set_fd;
} LIBBPF_0.0.9;
--
2.24.1
^ permalink raw reply related
* Re: [PATCH v2] drivers/net/wan/lapbether: Use needed_headroom instead of hard_header_len
From: Xie He @ 2020-08-02 0:58 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S. Miller, Jakub Kicinski, Linux Kernel Network Developers,
LKML, Linux X25, Brian Norris
In-Reply-To: <CA+FuTSdJ1c0R2qmKtm9vWpKnMv=-B0yAaronGkqg=jYZBfqceA@mail.gmail.com>
On Sat, Aug 1, 2020 at 6:31 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> The kernel interface cannot be changed. If packet sockets used to pass
> the first byte up to userspace, they have to continue to do so.
>
> So I think you can limit the header_ops to only dev_hard_header.
Actually if we want to keep the kernel interface unchanged, we
shouldn't implement header_ops for dev_hard_header either, because
this changes the way the user space program sends DGRAM packets, too.
Before the change the userspace program needs to add the 1-byte header
before sending, and after the change the userspace program will let
the kernel add the header via dev_hard_header.
> Fixes should be small and targeted. Any larger refactoring is
> best addressed in a separate net-next patch.
I guess the best way for this fix patch would be just add a 0-byte
packet check before the driver reads skb->data[0].
Thanks! I'll add the check and re-send the patch.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox