* Re: [PATCH 1/2] rtnetlink: gate MAC address with an LSM hook
From: Paul Moore @ 2019-08-30 21:46 UTC
To: Michal Kubecek
Cc: netdev, Jeffrey Vander Stoep, David Miller, LSM List, selinux
In-Reply-To: <20190829074516.GM29594@unicorn.suse.cz>
On Thu, Aug 29, 2019 at 3:45 AM Michal Kubecek <mkubecek@suse.cz> wrote:
> On Tue, Aug 27, 2019 at 04:47:04PM -0400, Paul Moore wrote:
> >
> > I'm also not a big fan of inserting the hook in rtnl_fill_ifinfo(); as
> > presented it is way too specific for a LSM hook for me to be happy.
> > However, I do agree that giving the LSMs some control over netlink
> > messages makes sense. As others have pointed out, it's all a matter
> > of where to place the hook.
> >
> > If we only care about netlink messages which leverage nlattrs I
> > suppose one option that I haven't seen mentioned would be to place a
> > hook in nla_put(). While it is a bit of an odd place for a hook, it
> > would allow the LSM easy access to the skb and attribute type to make
> > decisions, and all of the callers should already be checking the
> > return code (although we would need to verify this). One notable
> > drawback (not the only one) is that the hook is going to get hit
> > multiple times for each message.
>
> For most messages, "multiple times" would mean tens, for many even
> hundreds of calls. For each, you would have to check corresponding
> socket (and possibly also genetlink header) to see which netlink based
> protocol it is and often even parse existing part of the message to get
> the context (because the same numeric attribute type can mean something
> completely different if it appears in a nested attribute).
>
> Also, nla_put() (or rather __nla_put()) is not used for all attributes,
> one may also use nla_reserve() and then compose the attribute data in
> place.
I never said it was a great idea, just an idea ;)
Honestly I'm just trying to spur some discussion on this so we can
hopefully arrive at a solution which allows a LSM to control kernel
generated netlink messages that we can all accept.
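To make the nesting point concrete, here is a small user-space sketch (not kernel code). It lays out attributes the way struct nlattr does, with a 16-bit length and a 16-bit type followed by a payload padded to 4 bytes, and walks one buffer recursively. The names attr_hdr, attr_put and attr_count, and the idea that one given type acts as the container, are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the kernel's struct nlattr: 16-bit length (header
 * included) followed by a 16-bit type; the payload is padded to 4 bytes. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

#define ATTR_ALIGN(n)	(((n) + 3u) & ~3u)
#define ATTR_HDRLEN	(sizeof(struct attr_hdr))

/* Append one attribute at offset off of a zeroed buffer; returns the new offset. */
static size_t attr_put(uint8_t *buf, size_t off, uint16_t type,
		       const void *data, uint16_t dlen)
{
	struct attr_hdr h = { .len = (uint16_t)(ATTR_HDRLEN + dlen), .type = type };

	memcpy(buf + off, &h, sizeof(h));
	if (dlen)
		memcpy(buf + off + ATTR_HDRLEN, data, dlen);
	return off + ATTR_ALIGN(h.len);
}

/* Count attributes of type `want`: counts[0] for top level, counts[1] for
 * anything nested. `nest_type` is a hypothetical container type whose
 * payload is itself a list of attributes. */
static void attr_count(const uint8_t *buf, size_t len, uint16_t want,
		       uint16_t nest_type, int depth, int counts[2])
{
	size_t off = 0;

	while (off + ATTR_HDRLEN <= len) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < ATTR_HDRLEN || off + h.len > len)
			break;	/* malformed, stop walking */
		if (h.type == want)
			counts[depth ? 1 : 0]++;
		if (h.type == nest_type)
			attr_count(buf + off + ATTR_HDRLEN, h.len - ATTR_HDRLEN,
				   want, nest_type, depth + 1, counts);
		off += ATTR_ALIGN(h.len);
	}
}
```

A hook that is handed only the numeric type would see the same value for the top-level and the nested attribute; telling them apart needs the kind of walk shown here, repeated on every call.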
--
paul moore
www.paul-moore.com
* RE: [PATCH] ncsi-netlink: support sending NC-SI commands over Netlink interface
From: Ben Wei @ 2019-08-30 21:46 UTC
To: Terry Duncan, sam@mendozajonas.com, davem@davemloft.net,
netdev@vger.kernel.org, openbmc@lists.ozlabs.org,
Justin.Lee1@Dell.com
In-Reply-To: <0da11d73-b3ab-53f6-f695-30857a743a7b@linux.intel.com>
> On 8/22/19 5:02 PM, Ben Wei wrote:
> > This patch extends ncsi-netlink command line utility to send NC-SI command to kernel driver
> > via NCSI_CMD_SEND_CMD command.
> >
> > New command line option -o (opcode) is used to specify NC-SI command and optional payload.
> >
>
> Thank you for posting this Ben.
> Something looks off on this next line but it looks fine in your pull
> request in the github.com/sammj/ncsi-netlink repo.
>
> > +static int send_cb(struct nl_msg *msg, void *arg) { #define
> > +ETHERNET_HEADER_SIZE 16
> > +
Yes I think my email client was not configured correctly for plain text so it's removing certain line breaks.
Hopefully I have this figured out now so my future patches won't have this issue.
>
> Do you have plans to upstream your yocto recipe for this repo?
Yes I sure can upstream the recipe file. I had to make local changes to build ncsi-netlink for my BMC platform.
Is there a group I may submit my recipe to?
Thanks,
-Ben
* Re: [PATCH v6 net-next 15/19] ionic: Add Tx and Rx handling
From: Shannon Nelson @ 2019-08-30 21:44 UTC
To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190829163319.0f4e8707@cakuba.netronome.com>
On 8/29/19 4:33 PM, Jakub Kicinski wrote:
> On Thu, 29 Aug 2019 11:27:16 -0700, Shannon Nelson wrote:
>> +static int ionic_tx_tso(struct ionic_queue *q, struct sk_buff *skb)
>> +{
>> + struct ionic_tx_stats *stats = q_to_tx_stats(q);
>> + struct ionic_desc_info *abort = q->head;
>> + struct device *dev = q->lif->ionic->dev;
>> + struct ionic_desc_info *rewind = abort;
>> + struct ionic_txq_sg_elem *elem;
>> + struct ionic_txq_desc *desc;
>> + unsigned int frag_left = 0;
>> + unsigned int offset = 0;
>> + unsigned int len_left;
>> + dma_addr_t desc_addr;
>> + unsigned int hdrlen;
>> + unsigned int nfrags;
>> + unsigned int seglen;
>> + u64 total_bytes = 0;
>> + u64 total_pkts = 0;
>> + unsigned int left;
>> + unsigned int len;
>> + unsigned int mss;
>> + skb_frag_t *frag;
>> + bool start, done;
>> + bool outer_csum;
>> + bool has_vlan;
>> + u16 desc_len;
>> + u8 desc_nsge;
>> + u16 vlan_tci;
>> + bool encap;
>> + int err;
>> +
>> + mss = skb_shinfo(skb)->gso_size;
>> + nfrags = skb_shinfo(skb)->nr_frags;
>> + len_left = skb->len - skb_headlen(skb);
>> + outer_csum = (skb_shinfo(skb)->gso_type & SKB_GSO_GRE_CSUM) ||
>> + (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM);
>> + has_vlan = !!skb_vlan_tag_present(skb);
>> + vlan_tci = skb_vlan_tag_get(skb);
>> + encap = skb->encapsulation;
>> +
>> + /* Preload inner-most TCP csum field with IP pseudo hdr
>> + * calculated with IP length set to zero. HW will later
>> + * add in length to each TCP segment resulting from the TSO.
>> + */
>> +
>> + if (encap)
>> + err = ionic_tx_tcp_inner_pseudo_csum(skb);
>> + else
>> + err = ionic_tx_tcp_pseudo_csum(skb);
>> + if (err)
>> + return err;
>> +
>> + if (encap)
>> + hdrlen = skb_inner_transport_header(skb) - skb->data +
>> + inner_tcp_hdrlen(skb);
>> + else
>> + hdrlen = skb_transport_offset(skb) + tcp_hdrlen(skb);
>> +
>> + seglen = hdrlen + mss;
>> + left = skb_headlen(skb);
>> +
>> + desc = ionic_tx_tso_next(q, &elem);
>> + start = true;
>> +
>> + /* Chop skb->data up into desc segments */
>> +
>> + while (left > 0) {
>> + len = min(seglen, left);
>> + frag_left = seglen - len;
>> + desc_addr = ionic_tx_map_single(q, skb->data + offset, len);
>> + if (dma_mapping_error(dev, desc_addr))
>> + goto err_out_abort;
>> + desc_len = len;
>> + desc_nsge = 0;
>> + left -= len;
>> + offset += len;
>> + if (nfrags > 0 && frag_left > 0)
>> + continue;
>> + done = (nfrags == 0 && left == 0);
>> + ionic_tx_tso_post(q, desc, skb,
>> + desc_addr, desc_nsge, desc_len,
>> + hdrlen, mss,
>> + outer_csum,
>> + vlan_tci, has_vlan,
>> + start, done);
>> + total_pkts++;
>> + total_bytes += start ? len : len + hdrlen;
>> + desc = ionic_tx_tso_next(q, &elem);
>> + start = false;
>> + seglen = mss;
>> + }
>> +
>> + /* Chop skb frags into desc segments */
>> +
>> + for (frag = skb_shinfo(skb)->frags; len_left; frag++) {
>> + offset = 0;
>> + left = skb_frag_size(frag);
>> + len_left -= left;
>> + nfrags--;
>> + stats->frags++;
>> +
>> + while (left > 0) {
>> + if (frag_left > 0) {
>> + len = min(frag_left, left);
>> + frag_left -= len;
>> + elem->addr =
>> + cpu_to_le64(ionic_tx_map_frag(q, frag,
>> + offset, len));
>> + if (dma_mapping_error(dev, elem->addr))
>> + goto err_out_abort;
>> + elem->len = cpu_to_le16(len);
>> + elem++;
>> + desc_nsge++;
>> + left -= len;
>> + offset += len;
>> + if (nfrags > 0 && frag_left > 0)
>> + continue;
>> + done = (nfrags == 0 && left == 0);
>> + ionic_tx_tso_post(q, desc, skb, desc_addr,
>> + desc_nsge, desc_len,
>> + hdrlen, mss, outer_csum,
>> + vlan_tci, has_vlan,
>> + start, done);
>> + total_pkts++;
>> + total_bytes += start ? len : len + hdrlen;
>> + desc = ionic_tx_tso_next(q, &elem);
>> + start = false;
>> + } else {
>> + len = min(mss, left);
>> + frag_left = mss - len;
>> + desc_addr = ionic_tx_map_frag(q, frag,
>> + offset, len);
>> + if (dma_mapping_error(dev, desc_addr))
>> + goto err_out_abort;
>> + desc_len = len;
>> + desc_nsge = 0;
>> + left -= len;
>> + offset += len;
>> + if (nfrags > 0 && frag_left > 0)
>> + continue;
>> + done = (nfrags == 0 && left == 0);
>> + ionic_tx_tso_post(q, desc, skb, desc_addr,
>> + desc_nsge, desc_len,
>> + hdrlen, mss, outer_csum,
>> + vlan_tci, has_vlan,
>> + start, done);
>> + total_pkts++;
>> + total_bytes += start ? len : len + hdrlen;
>> + desc = ionic_tx_tso_next(q, &elem);
>> + start = false;
>> + }
>> + }
>> + }
>> +
>> + stats->pkts += total_pkts;
>> + stats->bytes += total_bytes;
>> + stats->tso++;
>> +
>> + return 0;
>> +
>> +err_out_abort:
>> + while (rewind->desc != q->head->desc) {
>> + ionic_tx_clean(q, rewind, NULL, NULL);
>> + rewind = rewind->next;
>> + }
>> + q->head = abort;
>> +
>> + return -ENOMEM;
>> +}
> There's definitely a function for helping drivers which can't do full
> TSO slice up the packet, but I can't find it now 😫😫
>
> Eric would definitely know.
>
> Did you have a look? Would it be useful here?
Yes, obviously this could use some work for clarity and supportability,
and I think for performance as well. But since it works, I've been
concentrating on getting other parts of the driver working before coming
back to this. If there are some tools that can help clean this up, I
would be interested to see them.
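For reference, the packet and byte accounting that the function above appears to aim for can be written down on its own. This is only a user-space sketch of the arithmetic (each segment carries up to mss bytes of payload plus one copy of the headers); it is not the helper Jakub has in mind, and tso_account and struct tso_totals are names made up for the example.

```c
#include <stdint.h>

struct tso_totals {
	uint64_t pkts;	/* segments put on the wire */
	uint64_t bytes;	/* payload plus one header copy per segment */
};

/* skb_len is the full length including headers, hdrlen the header bytes
 * replicated into every segment, mss the payload limit per segment. */
static struct tso_totals tso_account(uint32_t skb_len, uint32_t hdrlen,
				     uint32_t mss)
{
	struct tso_totals t = { 0, 0 };
	uint32_t payload;

	if (mss == 0 || skb_len <= hdrlen)
		return t;	/* nothing to segment */

	payload = skb_len - hdrlen;
	while (payload > 0) {
		uint32_t seg = payload < mss ? payload : mss;

		t.pkts++;
		t.bytes += (uint64_t)hdrlen + seg;
		payload -= seg;
	}
	return t;
}
```

For a 3000-byte payload behind 54 bytes of headers and an mss of 1448, this gives three segments (1448, 1448 and 104 bytes of payload), so the wire total is the payload plus three header copies.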
sln
* Re: [PATCH -next] net: mlx5: Kconfig: Fix MLX5_CORE_EN dependencies
From: Saeed Mahameed @ 2019-08-30 21:43 UTC
To: davem@davemloft.net, maowenan@huawei.com, leon@kernel.org
Cc: kernel-janitors@vger.kernel.org, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190827031251.98881-1-maowenan@huawei.com>
On Tue, 2019-08-27 at 11:12 +0800, Mao Wenan wrote:
> When MLX5_CORE_EN=y and PCI_HYPERV_INTERFACE is not set, below errors
The issue happens when PCI_HYPERV_INTERFACE is a module and mlx5_core
is built-in.
> are found:
> drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function
> `mlx5e_nic_enable':
> en_main.c:(.text+0xb649): undefined reference to
> `mlx5e_hv_vhca_stats_create'
> drivers/net/ethernet/mellanox/mlx5/core/en_main.o: In function
> `mlx5e_nic_disable':
> en_main.c:(.text+0xb8c4): undefined reference to
> `mlx5e_hv_vhca_stats_destroy'
>
> This is because CONFIG_PCI_HYPERV_INTERFACE is newly introduced by
> commit 348dd93e40c1
> ("PCI: hv: Add a Hyper-V PCI interface driver for software
> backchannel interface").
> Fix this by making MLX5_CORE_EN imply PCI_HYPERV_INTERFACE.
>
The imply should be in MLX5_CORE, not MLX5_CORE_EN, since the
implementation also involves MLX5_CORE.
I will prepare a patch with these fixups.
Thanks,
Saeed.
* Re: [PATCH v6 net-next 16/19] ionic: Add netdev-event handling
From: Shannon Nelson @ 2019-08-30 21:36 UTC
To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190829163738.64e7fe42@cakuba.netronome.com>
On 8/29/19 4:37 PM, Jakub Kicinski wrote:
> On Thu, 29 Aug 2019 11:27:17 -0700, Shannon Nelson wrote:
>> When the netdev gets a new name from userland, pass that name
>> down to the NIC for internal tracking.
>>
>> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> There is a precedent in ACPI for telling the FW what OS is running but
> how is the interface name useful for the firmware I can't really tell.
It is so we can correlate the host's interface name with the internal
port data for internal logging.
sln
* Re: linux-next: build failure after merge of the net-next tree
From: David Miller @ 2019-08-30 21:35 UTC
To: sfr; +Cc: netdev, linux-next, linux-kernel, weifeng.voon, boon.leong.ong
In-Reply-To: <20190829200546.7b9af296@canb.auug.org.au>
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Thu, 29 Aug 2019 20:05:46 +1000
> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Thu, 29 Aug 2019 19:49:27 +1000
> Subject: [PATCH] net: stmmac: depend on COMMON_CLK
>
> Fixes: 190f73ab4c43 ("net: stmmac: setup higher frequency clk support for EHL & TGL")
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Applied.
* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Jakub Kicinski @ 2019-08-30 21:35 UTC
To: Yonghong Song
Cc: Alexei Starovoitov, bpf@vger.kernel.org, netdev@vger.kernel.org,
Brian Vazquez, Daniel Borkmann, Kernel Team, Quentin Monnet
In-Reply-To: <a3422ffd-e9f2-af77-a92d-81393a9f4fc7@fb.com>
On Fri, 30 Aug 2019 07:25:54 +0000, Yonghong Song wrote:
> On 8/29/19 11:39 AM, Jakub Kicinski wrote:
> > On Wed, 28 Aug 2019 23:45:02 -0700, Yonghong Song wrote:
> >> Brian Vazquez has proposed BPF_MAP_DUMP command to look up more than one
> >> map entry per syscall.
> >> https://lore.kernel.org/bpf/CABCgpaU3xxX6CMMxD+1knApivtc2jLBHysDXw-0E9bQEL0qC3A@mail.gmail.com/T/#t
> >>
> >> During discussion, we found more use cases can be supported in a similar
> >> map operation batching framework. For example, batched map lookup and delete,
> >> which can be really helpful for bcc.
> >> https://github.com/iovisor/bcc/blob/master/tools/tcptop.py#L233-L243
> >> https://github.com/iovisor/bcc/blob/master/tools/slabratetop.py#L129-L138
> >>
> >> Also, in bcc, we have API to delete all entries in a map.
> >> https://github.com/iovisor/bcc/blob/master/src/cc/api/BPFTable.h#L257-L264
> >>
> >> For map update, batched operations are also useful as sometimes applications need
> >> to populate initial maps with more than one entry. For example, the below
> >> example is from kernel/samples/bpf/xdp_redirect_cpu_user.c:
> >> https://github.com/torvalds/linux/blob/master/samples/bpf/xdp_redirect_cpu_user.c#L543-L550
> >>
> >> This patch addresses all the above use cases. To make uapi stable, it also
> >> covers other potential use cases. Four bpf syscall subcommands are introduced:
> >> BPF_MAP_LOOKUP_BATCH
> >> BPF_MAP_LOOKUP_AND_DELETE_BATCH
> >> BPF_MAP_UPDATE_BATCH
> >> BPF_MAP_DELETE_BATCH
> >>
> >> In userspace, application can iterate through the whole map one batch
> >> at a time, e.g., bpf_map_lookup_batch() in the below:
> >> p_key = NULL;
> >> p_next_key = &key;
> >> while (true) {
> >> err = bpf_map_lookup_batch(fd, p_key, &p_next_key, keys, values,
> >> &batch_size, elem_flags, flags);
> >> if (err) ...
> >> if (p_next_key) break; // done
> >> if (!p_key) p_key = p_next_key;
> >> }
> >> Please look at individual patches for details of new syscall subcommands
> >> and examples of user codes.
> >>
> >> The testing is also done in a qemu VM environment:
> >> measure_lookup: max_entries 1000000, batch 10, time 342ms
> >> measure_lookup: max_entries 1000000, batch 1000, time 295ms
> >> measure_lookup: max_entries 1000000, batch 1000000, time 270ms
> >> measure_lookup: max_entries 1000000, no batching, time 1346ms
> >> measure_lookup_delete: max_entries 1000000, batch 10, time 433ms
> >> measure_lookup_delete: max_entries 1000000, batch 1000, time 363ms
> >> measure_lookup_delete: max_entries 1000000, batch 1000000, time 357ms
> >> measure_lookup_delete: max_entries 1000000, not batch, time 1894ms
> >> measure_delete: max_entries 1000000, batch, time 220ms
> >> measure_delete: max_entries 1000000, not batch, time 1289ms
> >> For a 1M entry hash table, batch size of 10 can reduce cpu time
> >> by 70%. Please see patch "tools/bpf: measure map batching perf"
> >> for details of test codes.
> >
> > Hi Yonghong!
> >
> > great to see this, we have been looking at implementing some way to
> > speed up map walks as well.
> >
> > The direction we were looking in, after previous discussions [1],
> > however, was to provide a BPF program which can run the logic entirely
> > within the kernel.
> >
> > We have a rough PoC on the FW side (we can offload the program which
> > walks the map, which is pretty neat), but the kernel verifier side
> > hasn't really progressed. It will soon.
> >
> > The rough idea is that the user space provides two programs, "filter"
> > and "dumper":
> >
> > bpftool map exec id XYZ filter pinned /some/prog \
> > dumper pinned /some/other_prog
> >
> > Both programs get this context:
> >
> > struct map_op_ctx {
> > u64 key;
> > u64 value;
> > }
> >
> > We need a per-map implementation of the exec side, but roughly maps
> > would do:
> >
> > LIST_HEAD(deleted);
> >
> > for entry in map {
> > struct map_op_ctx {
> > .key = entry->key,
> > .value = entry->value,
> > };
> >
> > act = BPF_PROG_RUN(filter, &map_op_ctx);
> > if (act & ~ACT_BITS)
> > return -EINVAL;
> >
> > if (act & DELETE) {
> > map_unlink(entry);
> > list_add(entry, &deleted);
> > }
> > if (act & STOP)
> > break;
> > }
> >
> > synchronize_rcu();
> >
> > for entry in deleted {
> > struct map_op_ctx {
> > .key = entry->key,
> > .value = entry->value,
> > };
> >
> > BPF_PROG_RUN(dumper, &map_op_ctx);
> > map_free(entry);
> > }
> >
> > The filter program can't perform any map operations other than lookup,
> > otherwise we won't be able to guarantee that we'll walk the entire map
> > (if the filter program deletes some entries in an unfortunate order).
>
> Looks like you will provide a new program type and per-map
> implementation of above code. My patch set indeed avoided per-map
> implementation for all of lookup/delete/get-next-key...
Indeed, the simple batched ops are undeniably lower LoC.
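The pseudocode quoted above can be tried out in user space. The sketch below is a simulation under heavy assumptions, not the proposed kernel implementation: the map is a plain array, the two programs are C function pointers, and the RCU grace period is only a comment. ACT_DELETE, ACT_STOP, map_exec, expire_filter and sum_keys_dumper are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define ACT_DELETE	0x1u
#define ACT_STOP	0x2u
#define ACT_BITS	(ACT_DELETE | ACT_STOP)

struct map_op_ctx {
	uint64_t key;
	uint64_t value;
};

struct entry {
	struct map_op_ctx kv;
	int live;	/* 1 linked, -1 unlinked and waiting for the dumper, 0 freed */
};

typedef unsigned int (*filter_fn)(const struct map_op_ctx *ctx);
typedef void (*dumper_fn)(const struct map_op_ctx *ctx, void *priv);

/* Example "filter": treat small values as timed-out flows. */
static unsigned int expire_filter(const struct map_op_ctx *ctx)
{
	return ctx->value < 100 ? ACT_DELETE : 0;
}

/* Example "dumper": accumulate the keys of everything that was removed. */
static void sum_keys_dumper(const struct map_op_ctx *ctx, void *priv)
{
	*(uint64_t *)priv += ctx->key;
}

/* Walk the map once, unlink what the filter marks, then hand the unlinked
 * entries to the dumper. Returns the number of entries dumped, or -1 if
 * the filter returned an unknown action bit (nothing is removed then). */
static int map_exec(struct entry *map, size_t n, filter_fn filter,
		    dumper_fn dumper, void *priv)
{
	int dumped = 0, err = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int act;

		if (map[i].live != 1)
			continue;
		act = filter(&map[i].kv);
		if (act & ~ACT_BITS) {
			err = -1;
			break;
		}
		if (act & ACT_DELETE)
			map[i].live = -1;
		if (act & ACT_STOP)
			break;
	}

	/* A kernel version would wait for an RCU grace period here. */

	for (i = 0; i < n; i++) {
		if (map[i].live != -1)
			continue;
		if (err) {
			map[i].live = 1;	/* relink, nothing was dumped */
			continue;
		}
		dumper(&map[i].kv, priv);
		map[i].live = 0;
		dumped++;
	}
	return err ? err : dumped;
}
```

Only the entries the filter selected ever reach the dumper, which is the property that matters for the flow-pruning case discussed further down.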
> > If user space just wants a pure dump it can simply load a program which
> > dumps the entries into a perf ring.
>
> percpu perf ring is not really ideal for user space which simply
> wants to get some key/value pairs back. Some kind of generic non-per-cpu
> ring buffer might be better for such cases.
I don't think it had to be per-cpu, but I may be blissfully ignorant
about the perf ring details :) bpf_perf_event_output() takes flags,
which are effectively selecting the "output CPU", no?
> > I'm bringing this up because that mechanism should cover what is
> > achieved with this patch set and much more.
>
> The only case it did not cover is batched update. But that may not
> be super critical.
Right, my other concern (which admittedly is slightly pedantic) is the
potential loss of modifications. The lookup_and_delete() operation does
not guarantee that the result returned to user space is the "final"
value. We'd need to wait for an RCU grace period between lookup and dump.
The "map is definitely unused now"/RCU guarantee has come up in the
past.
> Your approach gives each element an action choice through another bpf
> program. This is indeed powerful. My use case is simpler than your use case
> below, hence the implementation.
Agreed, the simplicity is tempting.
> > In particular for networking workloads where old flows have to be
> > pruned from the map periodically it's far more efficient to communicate
> > to user space only the flows which timed out (the delete batching from
> > this set won't help at all).
>
> Maybe LRU map will help in this case? It is designed for such
> use cases.
LRU map would be perfect if it dumped the entry somewhere before it got
reused... Perhaps we need to attach a program to the LRU map that'd get
run when entries get reaped. That'd be cool 🤔 We could trivially reuse
the "dumper" prog type for this.
> > With a 2M entry map and this patch set we still won't be able to prune
> > once a second on one core.
> >
> > [1]
> > https://lore.kernel.org/netdev/20190813130921.10704-4-quentin.monnet@netronome.com/
* Re: [PATCH] ncsi-netlink: support sending NC-SI commands over Netlink interface
From: Terry Duncan @ 2019-08-30 21:25 UTC
To: Ben Wei, sam@mendozajonas.com, davem@davemloft.net,
netdev@vger.kernel.org, openbmc@lists.ozlabs.org,
Justin.Lee1@Dell.com
In-Reply-To: <CH2PR15MB36860EECD2EA6D63BEA70110A3A40@CH2PR15MB3686.namprd15.prod.outlook.com>
On 8/22/19 5:02 PM, Ben Wei wrote:
> This patch extends ncsi-netlink command line utility to send NC-SI command to kernel driver
> via NCSI_CMD_SEND_CMD command.
>
> New command line option -o (opcode) is used to specify NC-SI command and optional payload.
>
Thank you for posting this Ben.
Something looks off on this next line but it looks fine in your pull
request in the github.com/sammj/ncsi-netlink repo.
> +static int send_cb(struct nl_msg *msg, void *arg) { #define
> +ETHERNET_HEADER_SIZE 16
> +
Do you have plans to upstream your yocto recipe for this repo?
* Re: [PATCH v6 net-next 14/19] ionic: Add initial ethtool support
From: Shannon Nelson @ 2019-08-30 21:25 UTC
To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190829161029.0676d6f7@cakuba.netronome.com>
On 8/29/19 4:10 PM, Jakub Kicinski wrote:
> On Thu, 29 Aug 2019 11:27:15 -0700, Shannon Nelson wrote:
>> +static int ionic_get_module_eeprom(struct net_device *netdev,
>> + struct ethtool_eeprom *ee,
>> + u8 *data)
>> +{
>> + struct ionic_lif *lif = netdev_priv(netdev);
>> + struct ionic_dev *idev = &lif->ionic->idev;
>> + struct ionic_xcvr_status *xcvr;
>> + char tbuf[sizeof(xcvr->sprom)];
>> + int count = 10;
>> + u32 len;
>> +
>> + /* The NIC keeps the module prom up-to-date in the DMA space
>> + * so we can simply copy the module bytes into the data buffer.
>> + */
>> + xcvr = &idev->port_info->status.xcvr;
>> + len = min_t(u32, sizeof(xcvr->sprom), ee->len);
>> +
>> + do {
>> + memcpy(data, xcvr->sprom, len);
>> + memcpy(tbuf, xcvr->sprom, len);
>> +
>> + /* Let's make sure we got a consistent copy */
>> + if (!memcmp(data, tbuf, len))
>> + break;
>> +
>> + } while (--count);
> Should this return an error if the image was never consistent?
Sure, how about -EBUSY?
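A possible shape for that, as a user-space sketch rather than a driver patch: retry the double copy a bounded number of times and report -EBUSY when no two reads agree. copy_consistent and copy_volatile are hypothetical helpers, and the volatile source pointer is an addition in this sketch so the compiler cannot merge the two reads.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Byte-wise copy from memory that another agent may rewrite at any time. */
static void copy_volatile(unsigned char *dst,
			  const volatile unsigned char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src[i];
}

/* Take two back-to-back copies of src; if they match, treat dst as a
 * consistent snapshot. Returns 0 on success, -EBUSY if no pair of
 * copies agreed within `tries` attempts. */
static int copy_consistent(unsigned char *dst, unsigned char *scratch,
			   const volatile unsigned char *src, size_t len,
			   int tries)
{
	while (tries-- > 0) {
		copy_volatile(dst, src, len);
		copy_volatile(scratch, src, len);

		if (!memcmp(dst, scratch, len))
			return 0;
	}
	return -EBUSY;
}
```

The caller can then pass the error straight back instead of returning data that may be torn.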
sln
* Re: [PATCH net] tc-testing: don't hardcode 'ip' in nsPlugin.py
From: Nicolas Dichtel @ 2019-08-30 21:25 UTC
To: Davide Caratti, Hangbin Liu, Roman Mashak, Vlad Buslov,
David S. Miller, Lucas Bates, netdev
Cc: Marcelo Ricardo Leitner
In-Reply-To: <8ade839e21c5231d2d6b8690b39587f802642306.1567180765.git.dcaratti@redhat.com>
Le 30/08/2019 à 18:51, Davide Caratti a écrit :
> the following tdc test fails on Fedora:
>
> # ./tdc.py -e 2638
> -- ns/SubPlugin.__init__
> Test 2638: Add matchall and try to get it
> -----> prepare stage *** Could not execute: "$TC qdisc add dev $DEV1 clsact"
> -----> prepare stage *** Error message: "/bin/sh: ip: command not found"
> returncode 127; expected [0]
> -----> prepare stage *** Aborting test run.
>
> Let nsPlugin.py use the 'IP' variable introduced with commit 92c1a19e2fb9
> ("tc-tests: added path to ip command in tdc"), so that the path to 'ip' is
> correctly resolved to the value we have in tdc_config.py.
>
> # ./tdc.py -e 2638
> -- ns/SubPlugin.__init__
> Test 2638: Add matchall and try to get it
> All test results:
> 1..1
> ok 1 2638 - Add matchall and try to get it
>
> Fixes: 489ce2f42514 ("tc-testing: Restore original behaviour for namespaces in tdc")
> Reported-by: Hangbin Liu <liuhangbin@gmail.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
* Re: [PATCH net-next] net: Fail explicit bind to local reserved ports
From: David Miller @ 2019-08-30 21:22 UTC
To: subashab; +Cc: netdev, stranche
In-Reply-To: <1567049214-19804-1-git-send-email-subashab@codeaurora.org>
From: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Date: Wed, 28 Aug 2019 21:26:54 -0600
> Reserved ports may have some special use cases which are not suitable
> for use by general userspace applications. Currently, ports specified
> in ip_local_reserved_ports are skipped only in case of
> automatic port assignment.
>
> In some cases, it may be required to prevent the host from assigning
> the ports even in case of explicit binds. Consider the case of a
> transparent proxy where packets are being redirected. In case a socket
> matches this connection, packets from this application would be
> incorrectly sent to one of the endpoints.
>
> Add a boolean sysctl flag 'reserved_port_bind'. Default value is 1
> which preserves the existing behavior. Setting the value to 0 will
> prevent userspace applications from binding to these ports even when
> they are explicitly requested.
>
> Cc: Sean Tranchetti <stranche@codeaurora.org>
> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
I don't know how happy I am about this. Whatever sets up the transparent
proxy business can block any attempt to communicate over these ports.
Also, protocols like SCTP need the new handling too.
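For illustration only, the behaviour described in the commit message can be modelled in user space like this. The bitmap, struct port_policy and port_usable are assumptions of the sketch; only the reserved_port_bind flag and its default of 1 come from the patch description.

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT_WORDS	(65536 / 64)

struct port_policy {
	uint64_t reserved[PORT_WORDS];	/* one bit per port */
	bool reserved_port_bind;	/* proposed flag; true keeps today's behaviour */
};

static void reserve_port(struct port_policy *p, uint16_t port)
{
	p->reserved[port / 64] |= 1ULL << (port % 64);
}

static bool is_reserved(const struct port_policy *p, uint16_t port)
{
	return (p->reserved[port / 64] & (1ULL << (port % 64))) != 0;
}

/* explicit_bind is true when the caller named the port in bind(), false
 * when the stack is choosing an ephemeral port on its own. */
static bool port_usable(const struct port_policy *p, uint16_t port,
			bool explicit_bind)
{
	if (!is_reserved(p, port))
		return true;
	if (!explicit_bind)
		return false;	/* automatic assignment always skips reserved ports */
	return p->reserved_port_bind;
}
```

With the flag at its default, only automatic assignment avoids a reserved port; clearing it makes an explicit bind to that port fail as well.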
* Re: [PATCH] sky2: Disable MSI on yet another ASUS boards (P6Xxxx)
From: David Miller @ 2019-08-30 21:18 UTC
To: tiwai; +Cc: mlindner, stephen, netdev, linux-kernel, swm
In-Reply-To: <s5hsgpkmtsj.wl-tiwai@suse.de>
From: Takashi Iwai <tiwai@suse.de>
Date: Thu, 29 Aug 2019 07:20:44 +0200
> On Thu, 29 Aug 2019 01:09:37 +0200,
> David Miller wrote:
>>
>> From: Takashi Iwai <tiwai@suse.de>
>> Date: Wed, 28 Aug 2019 08:31:19 +0200
>>
>> > A similar workaround for the suspend/resume problem is needed for yet
>> > more ASUS machines, the P6X models. Like the previous fix, the BIOS
>> > doesn't provide the standard DMI_SYS_* entry, so again DMI_BOARD_*
>> > entries are used instead.
>> >
>> > Reported-and-tested-by: SteveM <swm@swm1.com>
>> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
>>
>> Applied, but this is getting suspicious.
>>
>> It looks like MSI generally is not restored properly on resume on these
>> boards, so maybe there simply needs to be a generic PCI quirk for that?
>
> Yes, I wondered that, too.
> But, e.g. HD-audio should use MSI on Intel platforms, and if the
> problem were generic, it must suffer from the same issue, and I
> haven't heard of such, so far. So it's likely specific to some
> limited devices, as it seems.
There must be some piece of MSI state on the sky2 chip that is restored by
most BIOS/chipsets but not this one.
Some part of PCI config space or something.
* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Stanislav Fomichev @ 2019-08-30 21:18 UTC
To: Yonghong Song
Cc: Jakub Kicinski, Brian Vazquez, Alexei Starovoitov,
bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann,
Kernel Team
In-Reply-To: <eda3c9e0-8ad6-e684-0aeb-d63b9ed60aa7@fb.com>
On 08/30, Yonghong Song wrote:
>
>
> On 8/30/19 1:15 PM, Stanislav Fomichev wrote:
> > On 08/29, Jakub Kicinski wrote:
> >> On Thu, 29 Aug 2019 16:13:59 -0700, Brian Vazquez wrote:
> >>>> We need a per-map implementation of the exec side, but roughly maps
> >>>> would do:
> >>>>
> >>>> LIST_HEAD(deleted);
> >>>>
> >>>> for entry in map {
> >>>> struct map_op_ctx {
> >>>> .key = entry->key,
> >>>> .value = entry->value,
> >>>> };
> >>>>
> >>>> act = BPF_PROG_RUN(filter, &map_op_ctx);
> >>>> if (act & ~ACT_BITS)
> >>>> return -EINVAL;
> >>>>
> >>>> if (act & DELETE) {
> >>>> map_unlink(entry);
> >>>> list_add(entry, &deleted);
> >>>> }
> >>>> if (act & STOP)
> >>>> break;
> >>>> }
> >>>>
> >>>> synchronize_rcu();
> >>>>
> >>>> for entry in deleted {
> >>>> struct map_op_ctx {
> >>>> .key = entry->key,
> >>>> .value = entry->value,
> >>>> };
> >>>>
> >>>> BPF_PROG_RUN(dumper, &map_op_ctx);
> >>>> map_free(entry);
> >>>> }
> >>>>
> >>> Hi Jakub,
> >>>
> >>> how would that approach support percpu maps?
> >>>
> >>> I'm thinking of a scenario where you want to do some calculations on
> >>> percpu maps and you are interested in the info on all the cpus not
> >>> just the one that is running the bpf program. Currently on a pcpu map
> >>> the bpf_map_lookup_elem helper only returns the pointer to the data of
> >>> the executing cpu.
> >>
> >> Right, we need to have the iteration outside of the bpf program itself,
> >> and pass the element in through the context. That way we can feed each
> >> per cpu entry into the program separately.
> > My 2 cents:
> >
> > I personally like Jakub's/Quentin's proposal more. So if I get to choose
> > between this series and Jakub's filter+dump in BPF, I'd pick filter+dump
> > (pending per-cpu issue which we actually care about).
> >
> > But if we can have both, I don't have any objections; this patch
> > series looks to me a lot like what Brian did, just extended to more
> > commands. If we are fine with the shortcomings raised about the
> > original series, then let's go with this version. Maybe we can also
> > look into addressing these independently.
> >
> > But if I pretend that we live in an ideal world, I'd just go with
> > whatever Jakub and Quentin are doing so we don't have to support
> > two APIs that essentially do the same (minus batching update, but
> > it looks like there is no clear use case for that yet; maybe).
> >
> > I guess you can hold off this series a bit and discuss it at LPC,
> > you have a talk dedicated to that :-) (and afaiu, you are all going)
>
> Absolutely. We will have a discussion on map batching and I signed
> on with that :-). One of goals for this patch set is for me to explore
> what uapi (attr and bpf subcommands) we should expose to users.
> Hopefully at that time we will get more clarity
> on Jakub's approach and we can discuss how to proceed.
Sounds good! Your series didn't have an RFC tag, so I wasn't
sure whether we've fully committed to that approach or not.
* Re: [PATCH netdev] net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
From: David Miller @ 2019-08-30 21:17 UTC
To: wens
Cc: peppe.cavallaro, alexandre.torgue, joabreu, heiko, wens,
linux-rockchip, linux-arm-kernel, netdev, linux-kernel
In-Reply-To: <20190829031724.20865-1-wens@kernel.org>
From: Chen-Yu Tsai <wens@kernel.org>
Date: Thu, 29 Aug 2019 11:17:24 +0800
> From: Chen-Yu Tsai <wens@csie.org>
>
> The devicetree binding lists the phy regulator as optional. As such, the
> driver should not bail out if it can't find a regulator. Instead it
> should just skip the remaining regulator related code and continue
> on normally.
>
> Skip the remainder of phy_power_on() if a regulator supply isn't
> available. This also gets rid of the bogus return code.
>
> Fixes: 2e12f536635f ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Applied and queued up for -stable.
> On a separate note, maybe we should add this file to the Rockchip
> entry in MAINTAINERS?
Yes, probably.
Thanks.
* Re: [PATCH] amd-xgbe: Fix error path in xgbe_mod_init()
From: David Miller @ 2019-08-30 21:15 UTC
To: yuehaibing; +Cc: thomas.lendacky, netdev, linux-kernel
In-Reply-To: <20190829024600.16052-1-yuehaibing@huawei.com>
From: YueHaibing <yuehaibing@huawei.com>
Date: Thu, 29 Aug 2019 10:46:00 +0800
> In xgbe_mod_init(), we should do cleanup if some error occurs
>
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Fixes: efbaa828330a ("amd-xgbe: Add support to handle device renaming")
> Fixes: 47f164deab22 ("amd-xgbe: Add PCI device support")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Applied.
* Re: [PATCH][V2] arcnet: capmode: remove redundant assignment to pointer pkt
From: David Miller @ 2019-08-30 21:15 UTC
To: colin.king; +Cc: m.grzeschik, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190828231450.22424-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Thu, 29 Aug 2019 00:14:50 +0100
> From: Colin Ian King <colin.king@canonical.com>
>
> Pointer pkt is being initialized with a value that is never read
> and pkt is being re-assigned a little later on. The assignment is
> redundant and hence can be removed.
>
> Addresses-Coverity: ("Unused value")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>
> V2: fix typo in patch description, pkg -> pkt
Applied to net-next.
* Re: [PATCH net v1] net/sched: cbs: Fix not adding cbs instance to list
From: David Miller @ 2019-08-30 21:12 UTC
To: vinicius.gomes; +Cc: netdev, jhs, xiyou.wangcong, jiri, olteanv
In-Reply-To: <20190828173615.4264-1-vinicius.gomes@intel.com>
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Wed, 28 Aug 2019 10:36:15 -0700
> When removing a cbs instance when offloading is enabled, the crash
> below can be observed. Also, the current code doesn't handle correctly
> the case when offload is disabled without removing the qdisc: if the
> link speed changes the credit calculations will be wrong.
I think it does handle that case correctly, because in the !offloaded
code path of cbs_change() it makes an explicit call to the function
cbs_set_port_rate().
And that is the only location where offload can be disabled on an
already existing instance.
If you agree, please fix your commit message to be more accurate on
this point.
Thank you.
* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Jakub Kicinski @ 2019-08-30 21:10 UTC (permalink / raw)
To: Yonghong Song
Cc: Stanislav Fomichev, Brian Vazquez, Alexei Starovoitov,
bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann,
Kernel Team, Quentin Monnet
In-Reply-To: <eda3c9e0-8ad6-e684-0aeb-d63b9ed60aa7@fb.com>
On Fri, 30 Aug 2019 20:55:33 +0000, Yonghong Song wrote:
> > I guess you can hold off this series a bit and discuss it at LPC,
> > you have a talk dedicated to that :-) (and afaiu, you are all going)
>
> Absolutely. We will have a discussion on map batching and I signed
> on with that :-)
FWIW unfortunately neither Quentin nor I will be able to make the LPC
this year :( But the idea had been floated a few times (I'd certainly
not claim more authorship here than Ed or Alexei), and is simple enough
to move forward over email (I hope).
* Re: [PATCH net] dev: Delay the free of the percpu refcount
From: Subash Abhinov Kasiviswanathan @ 2019-08-30 21:03 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, netdev, Sean Tranchetti
In-Reply-To: <959f4b3e-387d-a148-3281-aed26a6a7aa5@gmail.com>
> This looks bogus.
>
> Whatever layer tries to access dev refcnt after free_netdev() has been
> called is buggy.
>
> I would rather trap early and fix the root cause.
>
> Untested patch :
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index b5d28dadf964..8080f1305417 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3723,6 +3723,7 @@ void netdev_run_todo(void);
> */
> static inline void dev_put(struct net_device *dev)
> {
> + BUG_ON(!dev->pcpu_refcnt);
> this_cpu_dec(*dev->pcpu_refcnt);
> }
>
> @@ -3734,6 +3735,7 @@ static inline void dev_put(struct net_device *dev)
> */
> static inline void dev_hold(struct net_device *dev)
> {
> + BUG_ON(!dev->pcpu_refcnt);
> this_cpu_inc(*dev->pcpu_refcnt);
> }
Hello Eric
I am seeing a similar crash with your patch as well.
The NULL dev->pcpu_refcnt was caught by the BUG you added.
786.510217: <6> kernel BUG at include/linux/netdevice.h:3633!
786.510263: <2> pc : in_dev_finish_destroy+0xcc/0xd0
786.510267: <2> lr : in_dev_finish_destroy+0x2c/0xd0
786.511220: <2> Call trace:
786.511225: <2> in_dev_finish_destroy+0xcc/0xd0
786.511230: <2> in_dev_rcu_put+0x24/0x30
786.511237: <2> rcu_nocb_kthread+0x43c/0x468
786.511243: <2> kthread+0x118/0x128
786.511249: <2> ret_from_fork+0x10/0x1c
This seems to happen only when there is an allocation failure
in the IPv6 notifier callback.
I added some extra debug output to track whether the refcount
pointer is valid at each caller of dev_put()/dev_hold().
"refcnt valid" below means that the pointer dev->pcpu_refcnt is valid,
while "refcnt null" means that dev->pcpu_refcnt is NULL.
The last dev_put() happens after free_netdev(), so
dev->pcpu_refcnt is accessed when it is already NULL.
309.908501: <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
setup_net+0xa0/0x210 -> ops_init+0x88/0x110
309.908674: <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
register_netdevice+0x29c/0x5b0 -> netdev_register_kobject+0xd8/0x150
309.908696: <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
register_netdevice+0x29c/0x5b0 -> netdev_register_kobject+0x100/0x150
309.908717: <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
vti6_init_net+0x188/0x1c0 -> register_netdev+0x28/0x40
309.908763: <6> neighbour: dev_hold() ffffffe13c9df000 ip6_vti0 refcnt
valid inetdev_event+0x43c/0x528 -> inetdev_init+0x80/0x1e0
309.908835: <6> dev_hold() ffffffe13c9df000 ip6_vti0 refcnt valid
raw_notifier_call_chain+0x3c/0x68 -> inetdev_event+0x43c/0x528
309.908882: <6> neighbour: dev_hold() ffffffe13c9df000 ip6_vti0 refcnt
valid addrconf_notify+0x42c/0xe58 -> ipv6_add_dev+0xe4/0x588
309.908890: <6> IPv6: dev_hold() ffffffe13c9df000 ip6_vti0 refcnt
valid raw_notifier_call_chain+0x3c/0x68 -> addrconf_notify+0x42c/0xe58
309.908906: <6> stress-ng-clone: page allocation failure: order:0,
mode:0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
309.908910: <6> stress-ng-clone cpuset=foreground mems_allowed=0
309.908925: <2> Call trace:
309.908931: <2> dump_backtrace+0x0/0x158
309.908934: <2> show_stack+0x14/0x20
309.908939: <2> dump_stack+0xc4/0xfc
309.908944: <2> warn_alloc+0xf8/0x168
309.908947: <2> __alloc_pages_nodemask+0xff4/0x1018
309.908955: <2> new_slab+0x128/0x5b8
309.908958: <2> ___slab_alloc+0x4cc/0x5f8
309.908960: <2> kmem_cache_alloc_trace+0x2a4/0x2c0
309.908963: <2> ipv6_add_dev+0x220/0x588
309.908966: <2> addrconf_notify+0x42c/0xe58
309.908969: <2> raw_notifier_call_chain+0x3c/0x68
309.908972: <2> register_netdevice+0x3c4/0x5b0
309.908974: <2> register_netdev+0x28/0x40
309.908978: <2> vti6_init_net+0x188/0x1c0
309.908981: <2> ops_init+0x88/0x110
309.908983: <2> setup_net+0xa0/0x210
309.908986: <2> copy_net_ns+0xa8/0x130
309.908990: <2> create_new_namespaces+0x138/0x170
309.908993: <2> unshare_nsproxy_namespaces+0x68/0x90
309.908999: <2> ksys_unshare+0x17c/0x248
309.909001: <2> __arm64_sys_unshare+0x10/0x20
309.909004: <2> el0_svc_common+0xa0/0x158
309.909007: <2> el0_svc_handler+0x6c/0x88
309.909010: <2> el0_svc+0x8/0xc
309.909021: <6> neighbour: dev_put() ffffffe13c9df000 ip6_vti0 refcnt
valid addrconf_notify+0x42c/0xe58 -> ipv6_add_dev+0x400/0x588
309.909030: <6> IPv6: dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
raw_notifier_call_chain+0x3c/0x68 -> addrconf_notify+0x42c/0xe58
309.918097: <6> neighbour: dev_put() ffffffe13c9df000 ip6_vti0 refcnt
valid raw_notifier_call_chain+0x3c/0x68 -> inetdev_event+0x290/0x528
309.918249: <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
register_netdevice+0x3f8/0x5b0 -> rollback_registered_many+0x488/0x658
309.918318: <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
net_rx_queue_update_kobjects+0x1ec/0x238 -> kobject_put+0x7c/0xc0
309.918405: <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
netdev_queue_update_kobjects+0x1dc/0x228 -> kobject_put+0x7c/0xc0
309.918759: <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt valid
register_netdev+0x28/0x40 -> register_netdevice+0x3f8/0x5b0
309.918778: <6> free_netdev() ffffffe13c9df000 ip6_vti0 refcnt valid
ops_init+0x88/0x110 -> vti6_init_net+0x1ac/0x1c0
309.980671: <6> dev_put() ffffffe13c9df000 ip6_vti0 refcnt null
rcu_nocb_kthread+0x43c/0x468 -> in_dev_rcu_put+0x24/0x30
309.980838: <6> kernel BUG at include/linux/netdevice.h:3636!
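The ordering in the trace above (a hold taken during init, the device
freed on the error path, then a deferred put running afterwards) can be
modeled in userspace. This is only a rough sketch under stated
assumptions: struct sketch_dev, struct sketch_deferred and all
sketch_*() functions are hypothetical names invented for illustration,
a plain int stands in for the per-cpu counter, and the deferred
callback is run by hand rather than by RCU. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device; only the refcount pointer is modeled. */
struct sketch_dev {
	int *refcnt;	/* models dev->pcpu_refcnt; NULL once the device is freed */
};

/* Hypothetical deferred callback, standing in for an RCU callback that drops a reference. */
struct sketch_deferred {
	struct sketch_dev *dev;
	bool pending;
};

/* Allocate the counter; returns false on allocation failure. */
bool sketch_dev_init(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->refcnt = calloc(1, sizeof(*dev->refcnt));
	return dev->refcnt != NULL;
}

/* Take a reference; returns false instead of crashing if the counter is gone. */
bool sketch_dev_hold(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (!dev->refcnt)
		return false;
	++*dev->refcnt;
	return true;
}

/* Drop a reference; returns false if the counter was already released. */
bool sketch_dev_put(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (!dev->refcnt)
		return false;
	--*dev->refcnt;
	return true;
}

/* Models the free path: release the counter and clear the pointer. */
void sketch_dev_free(struct sketch_dev *dev)
{
	assert(dev != NULL);
	free(dev->refcnt);
	dev->refcnt = NULL;
}

/* Run the deferred callback if it is still pending; returns the result of the put. */
bool sketch_run_deferred(struct sketch_deferred *cb)
{
	assert(cb != NULL);
	if (!cb->pending)
		return true;
	cb->pending = false;
	return sketch_dev_put(cb->dev);
}
```

In this model, calling sketch_dev_free() while a deferred put is still
pending makes the later sketch_run_deferred() return false. That is the
same ordering that trips the BUG_ON in the trace.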
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [PATCH net-next v2 00/22] bnxt_en: health and error recovery.
From: David Miller @ 2019-08-30 21:02 UTC (permalink / raw)
To: michael.chan; +Cc: netdev, vasundhara-v.volam, ray.jui
In-Reply-To: <1567137305-5853-1-git-send-email-michael.chan@broadcom.com>
From: Michael Chan <michael.chan@broadcom.com>
Date: Thu, 29 Aug 2019 23:54:43 -0400
> This patchset implements adapter health and error recovery. The status
> is reported through several devlink reporters and the driver will
> initiate and complete the recovery process using the devlink infrastructure.
>
> v2: Added 4 patches at the beginning of the patchset to clean up error code
> handling related to firmware messages and to convert to use standard
> error codes.
>
> Removed the dropping of rtnl_lock in bnxt_close().
>
> Broke up the patches some more for better patch organization and
> future bisection.
The return value handling looks a lot better now, thanks for cleaning that
up.
Series applied, thanks Michael.
* Re: [PATCH v4 1/2] netfilter: Terminate rule eval if protocol=IPv6 and ipv6 module is disabled
From: Florian Westphal @ 2019-08-30 20:58 UTC (permalink / raw)
To: Leonardo Bras
Cc: netfilter-devel, coreteam, bridge, netdev, linux-kernel,
Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
Roopa Prabhu, Nikolay Aleksandrov, David S. Miller
In-Reply-To: <20190830181354.26279-2-leonardo@linux.ibm.com>
Leonardo Bras <leonardo@linux.ibm.com> wrote:
> If IPv6 is disabled on boot (ipv6.disable=1), but nft_fib_inet ends up
> dealing with an IPv6 packet, it causes a kernel panic in
> fib6_node_lookup_1(), crashing in bad_page_fault.
>
> The panic is caused by trying to dereference a very low address (0x38
> in ppc64le), due to ipv6.fib6_main_tbl = NULL.
> BUG: Kernel NULL pointer dereference at 0x00000038
>
> The kernel panic was reproduced in a host that disabled IPv6 on boot and
> has to process guest packets (coming from a bridge) using its ip6tables.
>
> Terminate rule evaluation when packet protocol is IPv6 but the ipv6 module
> is not loaded.
>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
Acked-by: Florian Westphal <fw@strlen.de>
* Re: [PATCH net-next] net/ncsi: add response handlers for PLDM over NC-SI
From: Ben Wei @ 2019-08-30 20:57 UTC (permalink / raw)
To: David Miller
Cc: sam@mendozajonas.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, openbmc@lists.ozlabs.org
In-Reply-To: <20190828.160032.599086044004802986.davem@davemloft.net>
> > This patch adds handlers for PLDM over NC-SI command response.
> >
> > This enables the NC-SI driver to recognize the packet type so the responses don't get dropped as an unknown packet type.
> >
> > PLDM over NC-SI are not handled in kernel driver for now, but can be passed back to user space via Netlink for further handling.
> >
> > Signed-off-by: Ben Wei <benwei@fb.com>
>
> I don't know why but patchwork puts part of your patch into the commit message, see:
>
> https://patchwork.ozlabs.org/patch/1154104/
>
> It's probably an encoding issue or similar.
>
> > +static int ncsi_rsp_handler_pldm(struct ncsi_request *nr) {
> > + return 0;
> > +}
> > +
> > static int ncsi_rsp_handler_netlink(struct ncsi_request *nr) {
>
> I know other functions in this file do it, but please put the opening
> curly braces of a function on a separate line.
>
> Thank you.
Thanks David. I think the issue is related to my email client setting. I fixed it and resubmitted a v2 version of this patch.
Thanks,
-Ben
* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Yonghong Song @ 2019-08-30 20:55 UTC (permalink / raw)
To: Stanislav Fomichev, Jakub Kicinski
Cc: Brian Vazquez, Alexei Starovoitov, bpf@vger.kernel.org,
netdev@vger.kernel.org, Daniel Borkmann, Kernel Team
In-Reply-To: <20190830201513.GA2101@mini-arch>
On 8/30/19 1:15 PM, Stanislav Fomichev wrote:
> On 08/29, Jakub Kicinski wrote:
>> On Thu, 29 Aug 2019 16:13:59 -0700, Brian Vazquez wrote:
>>>> We need a per-map implementation of the exec side, but roughly maps
>>>> would do:
>>>>
>>>> LIST_HEAD(deleted);
>>>>
>>>> for entry in map {
>>>> struct map_op_ctx {
>>>> .key = entry->key,
>>>> .value = entry->value,
>>>> };
>>>>
>>>> act = BPF_PROG_RUN(filter, &map_op_ctx);
>>>> if (act & ~ACT_BITS)
>>>> return -EINVAL;
>>>>
>>>> if (act & DELETE) {
>>>> map_unlink(entry);
>>>> list_add(entry, &deleted);
>>>> }
>>>> if (act & STOP)
>>>> break;
>>>> }
>>>>
>>>> synchronize_rcu();
>>>>
>>>> for entry in deleted {
>>>> struct map_op_ctx {
>>>> .key = entry->key,
>>>> .value = entry->value,
>>>> };
>>>>
>>>> BPF_PROG_RUN(dumper, &map_op_ctx);
>>>> map_free(entry);
>>>> }
>>>>
>>> Hi Jakub,
>>>
>>> how would that approach support percpu maps?
>>>
>>> I'm thinking of a scenario where you want to do some calculations on
>>> percpu maps and you are interested in the info on all the cpus not
>>> just the one that is running the bpf program. Currently on a pcpu map
>>> the bpf_map_lookup_elem helper only returns the pointer to the data of
>>> the executing cpu.
>>
>> Right, we need to have the iteration outside of the bpf program itself,
>> and pass the element in through the context. That way we can feed each
>> per cpu entry into the program separately.
> My 2 cents:
>
> I personally like Jakub's/Quentin's proposal more. So if I get to choose
> between this series and Jakub's filter+dump in BPF, I'd pick filter+dump
> (pending per-cpu issue which we actually care about).
>
> But if we can have both, I don't have any objections; this patch
> series looks to me a lot like what Brian did, just extended to more
> commands. If we are fine with the shortcomings raised about the
> original series, then let's go with this version. Maybe we can also
> look into addressing these independently.
>
> But if I pretend that we live in an ideal world, I'd just go with
> whatever Jakub and Quentin are doing so we don't have to support
> two APIs that essentially do the same (minus batching update, but
> it looks like there is no clear use case for that yet; maybe).
>
> I guess you can hold off this series a bit and discuss it at LPC,
> you have a talk dedicated to that :-) (and afaiu, you are all going)
Absolutely. We will have a discussion on map batching and I signed
on with that :-). One of the goals for this patch set is for me to explore
what uapi (attr and bpf subcommands) we should expose to users.
Hopefully at that time we will get more clarity
on Jakub's approach and we can discuss how to proceed.
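For readers following the thread, the filter+dump loop quoted above can
be approximated in userspace C. This is only a sketch under stated
assumptions: a singly linked list stands in for a real map, plain
function pointers stand in for the filter and dumper BPF programs, and
struct entry, map_push(), map_filter_dump(), filter_odd_keys() and
dumper_sum() are all hypothetical names. The quoted pseudocode returns
-EINVAL immediately on an unknown action bit; this sketch instead stops
iterating and still dumps and frees what it already unlinked, which is
a choice made here and not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Action bits returned by the filter; the names follow the quoted pseudocode. */
#define ACT_DELETE 0x1u
#define ACT_STOP   0x2u
#define ACT_BITS   (ACT_DELETE | ACT_STOP)

/* Hypothetical map entry: a singly linked list stands in for a real map. */
struct entry {
	int key;
	int value;
	struct entry *next;
};

/* Hypothetical callback types standing in for the filter and dumper programs. */
typedef unsigned int (*filter_fn)(int key, int value);
typedef void (*dumper_fn)(int key, int value, void *cookie);

/* Prepend a new entry; returns 0 on success, -1 on allocation failure. */
int map_push(struct entry **map, int key, int value)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	e->value = value;
	e->next = *map;
	*map = e;
	return 0;
}

/*
 * Pass 1: run the filter on each entry and unlink the ones marked ACT_DELETE.
 * Pass 2: hand each unlinked entry to the dumper, then free it.
 * Returns the number of removed entries, or -1 on an unknown action bit.
 */
int map_filter_dump(struct entry **map, filter_fn filter, dumper_fn dumper,
		    void *cookie)
{
	struct entry *deleted = NULL, **tail = &deleted, **link = map;
	int removed = 0, err = 0;

	assert(map && filter && dumper);
	while (*link) {
		struct entry *e = *link;
		unsigned int act = filter(e->key, e->value);

		if (act & ~ACT_BITS) {
			err = -1;	/* sketch choice: stop, but still drain below */
			break;
		}
		if (act & ACT_DELETE) {
			*link = e->next;	/* unlink from the map */
			e->next = NULL;
			*tail = e;		/* append to the deleted list */
			tail = &e->next;
			removed++;
		} else {
			link = &e->next;
		}
		if (act & ACT_STOP)
			break;
	}
	/* The proposal calls synchronize_rcu() here; this sketch has no readers. */
	while (deleted) {
		struct entry *e = deleted;

		deleted = e->next;
		dumper(e->key, e->value, cookie);
		free(e);
	}
	return err ? err : removed;
}

/* Example filter: delete odd keys, stop once key 4 has been seen. */
unsigned int filter_odd_keys(int key, int value)
{
	(void)value;
	return (key % 2 ? ACT_DELETE : 0u) | (key == 4 ? ACT_STOP : 0u);
}

/* Example dumper: add each dumped value to the int that cookie points to. */
void dumper_sum(int key, int value, void *cookie)
{
	(void)key;
	*(int *)cookie += value;
}
```

Because the iteration lives outside the filter and each element is
passed in as an argument, a per-cpu map could in principle feed every
per-cpu copy through the same loop. That is the point made in the
quoted reply to Brian.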
* Re: [PATCH v4 2/2] net: br_netfiler_hooks: Drops IPv6 packets if IPv6 module is not loaded
From: Florian Westphal @ 2019-08-30 20:55 UTC (permalink / raw)
To: Leonardo Bras
Cc: netfilter-devel, coreteam, bridge, netdev, linux-kernel,
Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
Roopa Prabhu, Nikolay Aleksandrov, David S. Miller
In-Reply-To: <20190830181354.26279-3-leonardo@linux.ibm.com>
Leonardo Bras <leonardo@linux.ibm.com> wrote:
> A kernel panic can happen if a host has disabled IPv6 on boot and has to
> process guest packets (coming from a bridge) using its ip6tables.
>
> IPv6 packets need to be dropped if the IPv6 module is not loaded.
>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
> net/bridge/br_netfilter_hooks.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
> index d3f9592f4ff8..5e8693730df1 100644
> --- a/net/bridge/br_netfilter_hooks.c
> +++ b/net/bridge/br_netfilter_hooks.c
> @@ -493,6 +493,8 @@ static unsigned int br_nf_pre_routing(void *priv,
> brnet = net_generic(state->net, brnf_net_id);
> if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
> is_pppoe_ipv6(skb, state->net)) {
> + if (!ipv6_mod_enabled())
> + return NF_DROP;
> if (!brnet->call_ip6tables &&
> !br_opt_get(br, BROPT_NF_CALL_IP6TABLES))
> return NF_ACCEPT;
No, that's too aggressive and turns the bridge into an ipv6 blackhole.
There are two solutions:
1. The above patch, but use NF_ACCEPT instead
2. keep the DROP, but move it below the call_ip6tables test,
so that users can tweak call-ip6tables to accept packets.
Perhaps it would be good to also add a pr_warn_once() that
says ipv6 was disabled on the command line and that
call-ip6tables isn't supported in this configuration.
I would go with option two.
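The check ordering of option two can be sketched in userspace C. This
is a model of the control flow only, under stated assumptions: struct
pkt_ctx, pre_routing_verdict() and the VERDICT_* values are hypothetical
names, the three booleans stand in for the kernel-side tests named in
the comments, and a one-time fprintf() stands in for pr_warn_once().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the hook's outcomes; not the kernel's NF_* values. */
enum verdict { VERDICT_DROP, VERDICT_ACCEPT, VERDICT_CONTINUE };

/* Hypothetical model of the state the hook consults. */
struct pkt_ctx {
	bool is_ipv6;		/* IS_IPV6() || is_vlan_ipv6() || is_pppoe_ipv6() */
	bool call_ip6tables;	/* brnet->call_ip6tables or BROPT_NF_CALL_IP6TABLES set */
	bool ipv6_mod_enabled;	/* false when booted with ipv6.disable=1 */
};

enum verdict pre_routing_verdict(const struct pkt_ctx *ctx)
{
	static bool warned;

	assert(ctx != NULL);
	if (!ctx->is_ipv6)
		return VERDICT_CONTINUE;	/* not an IPv6 frame, other paths apply */

	/* First: if call-ip6tables is off, the frame passes untouched. */
	if (!ctx->call_ip6tables)
		return VERDICT_ACCEPT;

	/* Only then: ip6tables was requested but IPv6 is disabled, so drop. */
	if (!ctx->ipv6_mod_enabled) {
		if (!warned) {
			warned = true;	/* models pr_warn_once() */
			fprintf(stderr,
				"ipv6 disabled on command line, call-ip6tables unsupported\n");
		}
		return VERDICT_DROP;
	}

	return VERDICT_CONTINUE;	/* proceed to normal IPv6 pre-routing */
}
```

With this ordering, a host booted with IPv6 disabled only drops bridged
IPv6 frames while call-ip6tables is enabled. Turning call-ip6tables off
lets them through again, which avoids the blackhole.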
^ permalink raw reply
* Re: [PATCH v3 net-next 00/15] ioc3-eth improvements
From: David Miller @ 2019-08-30 20:55 UTC (permalink / raw)
To: tbogendoerfer; +Cc: ralf, paul.burton, jhogan, linux-mips, linux-kernel, netdev
In-Reply-To: <20190830092539.24550-1-tbogendoerfer@suse.de>
From: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Date: Fri, 30 Aug 2019 11:25:23 +0200
> In my patch series for splitting out the serial code from ioc3-eth
> by using a MFD device there was one big patch for ioc3-eth.c,
> which wasn't really useful for reviews. This series contains the
> ioc3-eth changes split into smaller steps and a few more cleanups.
> Only the conversion to MFD will be done later in a different series.
>
> Changes in v3:
> - no need to check skb == NULL before passing it to dev_kfree_skb_any
> - free memory allocated with get_page(s) with free_page(s)
> - allocate rx ring with just GFP_KERNEL
> - add required alignment for rings in comments
>
> Changes in v2:
> - use net_err_ratelimited for printing various ioc3 errors
> - added missing clearing of rx buf valid flags into ioc3_alloc_rings
> - use __func__ for printing out of memory messages
Series applied, thanks.
It might be nice to use get_order() instead of hardcoding the page size
when "2" is passed into the page alloc/free calls. Just FYI...
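As a rough illustration of why a computed order is more robust than a
literal "2", the sketch below models the calculation in userspace: it
returns the smallest order such that (page size << order) covers the
requested size. sketch_get_order() and the SKETCH_PAGE_* macros are
hypothetical names, and the 4 KiB page size is an assumption made for
the example. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch: 4 KiB. Real systems may differ. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size; size must be > 0. */
unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

Under the 4 KiB assumption a 16 KiB buffer gives order 2, the same as
the hardcoded value. With a larger page size the same buffer would need
a smaller order, which a literal "2" cannot express. The thread does
not state the buffer size, so the 16 KiB figure is only an example.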