* Re: [PATCH net-next v2 0/2] net: phy: improve PHY suspend/resume
From: Andrew Lunn @ 2018-06-01 0:10 UTC (permalink / raw)
To: Heiner Kallweit; +Cc: Florian Fainelli, David Miller, netdev@vger.kernel.org
In-Reply-To: <f48b0978-7891-487b-d2b1-3f23b269578c@gmail.com>
> Configuring the different WoL options isn't handled by writing to
> the PHY registers but by writing to chip / MAC registers.
> Therefore phy_suspend() isn't able to figure out whether WoL is
> enabled or not. Only the parent has the full picture.
Hi Heiner
I think you need to look at your different runtime PM domains. If I
understand the code right, you runtime suspend if there is no
link. But for this to work correctly, your PHY needs to keep working.
You also cannot assume all accesses to the PHY go via the MAC. Some
calls will go direct to the PHY, and they can trigger MDIO bus
accesses. So I think you need two runtime PM domains: MAC and MDIO
bus. Maybe just the pll? An MDIO bus is a device, so it can have its
own PM callbacks. It is not clear what you need to resume in order to
make MDIO work.
It might also help if you do the phy_connect in .ndo_open and
disconnect in .ndo_stop. This is a common pattern in drivers. But some
also do it in probe and remove.
Andrew
* Re: [PATCH bpf-next] xsk: temporarily disable AF_XDP
From: Björn Töpel @ 2018-06-01 0:24 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Netdev
Cc: Björn Töpel, Karlsson, Magnus, Magnus Karlsson
In-Reply-To: <20180531001754.15923-1-bjorn.topel@gmail.com>
Den ons 30 maj 2018 kl 17:22 skrev Björn Töpel <bjorn.topel@gmail.com>:
>
> From: Björn Töpel <bjorn.topel@intel.com>
>
> Temporarily disable AF_XDP sockets, and hide uapi.
>
[...]
Alexei/Daniel,
Ignore this patch, please.
Thanks,
Björn
* Re: [PATCH net-next] net/ncsi: Avoid GFP_KERNEL in response handler
From: Samuel Mendoza-Jonas @ 2018-06-01 0:33 UTC (permalink / raw)
To: Eric Dumazet, netdev; +Cc: David S . Miller, linux-kernel, openbmc
In-Reply-To: <69fcb143-00a2-2ddf-e2d4-c692b650f292@gmail.com>
On Thu, 2018-05-31 at 04:50 -0400, Eric Dumazet wrote:
>
> On 05/31/2018 03:02 AM, Samuel Mendoza-Jonas wrote:
> > ncsi_rsp_handler_gc() allocates the filter arrays using GFP_KERNEL in
> > softirq context, causing the below backtrace. This allocation is only a
> > few dozen bytes during probing so allocate with GFP_ATOMIC instead.
> >
>
> Hi Samuel
>
> You forgot to add
>
> Fixes: 062b3e1b6d4f ("net/ncsi: Refactor MAC, VLAN filters")
>
> size = (rsp->uc_cnt + rsp->mc_cnt + rsp->mixed_cnt) * ETH_ALEN;
>
> -> seems to be able to reach more than few dozen bytes...
Hi Eric,
The NCSI spec (at least in the v1.1.0 version I'm looking at) sets the
total number of MAC address filters at 8, so we would be looking at a
maximum of 8 * ETH_ALEN = 48 bytes.
That said it shouldn't be too arduous to move the allocation to later in
the probe/configure cycle so if needed we could do that.
>
> Also, what prevents ncsi_rsp_handler_gc() from being called multiple times?
>
> nc->mac_filter.addrs & nc->vlan_filter.vids would be re-allocated and memory would leak.
>
Good point, we should put a check there just in case to see if it's
allocated. We should be safe though as ncsi_rsp_handler_gc() should only
be called via ncsi_probe_channel() which only happens through
ncsi_start_dev(), and addrs/vids is cleaned up in ncsi_remove_channel().
Rogue packets shouldn't hit the ncsi_rsp_handler_gc() handler without an
outstanding request, but it probably is safer to check regardless.
Regards,
Sam
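A minimal userspace sketch of the two points discussed in the mail above: the
size arithmetic (8 filters * ETH_ALEN = 48 bytes) and an "allocate only once"
guard against the leak. The names `filter_table`, `filter_bytes` and
`filter_alloc_once` are hypothetical stand-ins, not the actual net/ncsi code,
and calloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ETH_ALEN 6 /* bytes in an Ethernet MAC address */

/* Hypothetical stand-in for the per-channel MAC filter state. */
struct filter_table {
	unsigned char *addrs; /* NULL until the first successful allocation */
	size_t n_entries;
};

/* Bytes needed for the unicast + multicast + mixed filter counts. */
static size_t filter_bytes(unsigned uc_cnt, unsigned mc_cnt, unsigned mixed_cnt)
{
	return ((size_t)uc_cnt + mc_cnt + mixed_cnt) * ETH_ALEN;
}

/*
 * Allocate the table only if it has not been allocated yet, so that a
 * repeated response cannot leak the earlier buffer.
 * Returns 0 on success (or if already allocated), -1 on allocation failure.
 */
static int filter_alloc_once(struct filter_table *t, unsigned uc_cnt,
			     unsigned mc_cnt, unsigned mixed_cnt)
{
	size_t size;

	if (t->addrs)
		return 0; /* already set up: keep the existing buffer */

	size = filter_bytes(uc_cnt, mc_cnt, mixed_cnt);
	t->addrs = calloc(1, size ? size : 1);
	if (!t->addrs)
		return -1;
	t->n_entries = size / ETH_ALEN;
	return 0;
}
```

With a total of 8 filters, filter_bytes() gives 48, and a second call to
filter_alloc_once() on the same table leaves the first buffer in place.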
* Re: [PATCH bpf-next] bpf: prevent non-IPv4 socket to be added into sock hash
From: Eric Dumazet @ 2018-06-01 1:00 UTC (permalink / raw)
To: John Fastabend; +Cc: Wei Wang, netdev, Willem de Bruijn
In-Reply-To: <c6aae1fa-3ad1-9192-3e99-e177b1098f06@gmail.com>
On Thu, May 31, 2018 at 7:32 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
>
> Hi Wei,
>
> Thanks for the report and fix. It would be better to fix the
> root cause so that IPv6 works as intended.
>
> I'm testing the following now,
>
> Author: John Fastabend <john.fastabend@gmail.com>
> Date: Thu May 31 14:38:59 2018 -0700
>
> sockmap: fix crash when ipv6 sock is added by adding support for IPv6
>
> Apparently we had a testing escape and missed IPv6. This fixes a crash
> where we assign tcp_prot to IPv6 sockets instead of tcpv6_prot.
>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>
Hi John
In any case, please forward correct attribution for Wei's work, and
syzbot 'Reported-by'
Are you sure you are handling IPv4-mapped IPv6 sockets as well?
Thanks.
* Re: [PATCH net-next] net: phy: consider PHY_IGNORE_INTERRUPT in state machine PHY_NOLINK handling
From: David Miller @ 2018-06-01 1:26 UTC (permalink / raw)
To: hkallweit1; +Cc: f.fainelli, andrew, netdev
In-Reply-To: <0a4e472d-cb7f-ef1f-420c-1327fa41e8cd@gmail.com>
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 30 May 2018 22:13:20 +0200
> We can bail out immediately also in case of PHY_IGNORE_INTERRUPT because
> phy_mac_interrupt() informs us once the link is up.
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
When state is PHY_NOLINK, the phy_mac_interrupt() code paths
will change the state to PHY_CHANGELINK before queueing up
the state machine invocation.
So I can't even see how we can enter phy_state_machine with
->state == PHY_NOLINK if the mac interrupt paths are being
used properly.
Therefore it looks like the code as written is harmless.
Did you actually hit a problem with this test or is this
a change based purely upon code inspection?
* [PATCH net-next v2 0/2] qed: Fix issues in UFP feature commit 'cac6f691'.
From: Sudarsana Reddy Kalluru @ 2018-06-01 1:47 UTC (permalink / raw)
To: davem; +Cc: netdev, Ariel.Elior, Michal.Kalderon, Sudarsana Reddy Kalluru
From: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
This patch series fixes a couple of issues in the UFP feature commit,
cac6f691: Add support for Unified Fabric Port.
Changes from previous version:
------------------------------
v2: Added "Fixes:" tag.
Please consider applying it to "net-next".
Sudarsana Reddy Kalluru (2):
qed: Fix shared memory inconsistency between driver and the MFW.
qed: Fix use of incorrect shmem address.
drivers/net/ethernet/qlogic/qed/qed_hsi.h | 1 +
drivers/net/ethernet/qlogic/qed/qed_mcp.c | 5 +++--
2 files changed, 4 insertions(+), 2 deletions(-)
--
1.8.3.1
* [PATCH net-next v2 1/2] qed: Fix shared memory inconsistency between driver and the MFW.
From: Sudarsana Reddy Kalluru @ 2018-06-01 1:47 UTC (permalink / raw)
To: davem; +Cc: netdev, Ariel.Elior, Michal.Kalderon
In-Reply-To: <20180601014737.6164-1-sudarsana.kalluru@cavium.com>
The driver and the management firmware (MFW) define the structure they
share with different sizes. The additional field defined by the MFW is not
relevant to the current driver. Add a dummy field to the structure.
Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
---
drivers/net/ethernet/qlogic/qed/qed_hsi.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_hsi.h b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
index 8e1e6e1..beba930 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hsi.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
@@ -11996,6 +11996,7 @@ struct public_port {
#define EEE_REMOTE_TW_RX_MASK 0xffff0000
#define EEE_REMOTE_TW_RX_OFFSET 16
+ u32 reserved1;
u32 oem_cfg_port;
#define OEM_CFG_CHANNEL_TYPE_MASK 0x00000003
#define OEM_CFG_CHANNEL_TYPE_OFFSET 0
--
1.8.3.1
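Why a single missing field matters can be sketched in plain C: every member
behind the gap is read at the wrong offset. The three structs below are
hypothetical, much-reduced stand-ins for struct public_port (only the
`reserved1` and `oem_cfg_port` names come from the patch), so this shows the
effect and not the real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-side view, missing one field the firmware defines. */
struct port_drv_old {
	uint32_t eee_remote;
	uint32_t oem_cfg_port;
};

/* Hypothetical firmware-side layout with the extra 32-bit field. */
struct port_fw {
	uint32_t eee_remote;
	uint32_t extra; /* field the driver has no use for */
	uint32_t oem_cfg_port;
};

/* Driver-side view after adding a placeholder for the unused field. */
struct port_drv_new {
	uint32_t eee_remote;
	uint32_t reserved1;
	uint32_t oem_cfg_port;
};

/* Byte offset at which each view expects to find oem_cfg_port. */
static size_t off_old(void) { return offsetof(struct port_drv_old, oem_cfg_port); }
static size_t off_fw(void)  { return offsetof(struct port_fw, oem_cfg_port); }
static size_t off_new(void) { return offsetof(struct port_drv_new, oem_cfg_port); }
```

In this sketch the old driver view looks for oem_cfg_port at byte 4 while the
firmware writes it at byte 8; with the placeholder both sides agree.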
* [PATCH net-next v2 2/2] qed: Fix use of incorrect shmem address.
From: Sudarsana Reddy Kalluru @ 2018-06-01 1:47 UTC (permalink / raw)
To: davem; +Cc: netdev, Ariel.Elior, Michal.Kalderon
In-Reply-To: <20180601014737.6164-1-sudarsana.kalluru@cavium.com>
An incorrect shared memory address is used while deriving the values
for tc and pri_type. Use the shmem address corresponding to 'oem_cfg_func',
where the management firmware saves the tc/pri_type values.
Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
---
drivers/net/ethernet/qlogic/qed/qed_mcp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 2612e3e..6f9927d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -1514,9 +1514,10 @@ void qed_mcp_read_ufp_config(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
}
qed_mcp_get_shmem_func(p_hwfn, p_ptt, &shmem_info, MCP_PF_ID(p_hwfn));
- val = (port_cfg & OEM_CFG_FUNC_TC_MASK) >> OEM_CFG_FUNC_TC_OFFSET;
+ val = (shmem_info.oem_cfg_func & OEM_CFG_FUNC_TC_MASK) >>
+ OEM_CFG_FUNC_TC_OFFSET;
p_hwfn->ufp_info.tc = (u8)val;
- val = (port_cfg & OEM_CFG_FUNC_HOST_PRI_CTRL_MASK) >>
+ val = (shmem_info.oem_cfg_func & OEM_CFG_FUNC_HOST_PRI_CTRL_MASK) >>
OEM_CFG_FUNC_HOST_PRI_CTRL_OFFSET;
if (val == OEM_CFG_FUNC_HOST_PRI_CTRL_VNIC) {
p_hwfn->ufp_info.pri_type = QED_UFP_PRI_VNIC;
--
1.8.3.1
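The bug class fixed above is a correct mask-and-shift applied to the wrong
source word. A small sketch, assuming a made-up field layout: `EX_FUNC_TC_MASK`,
`EX_FUNC_TC_OFFSET` and `struct ex_shmem` are hypothetical and do not reflect
the real OEM_CFG_FUNC_* values in qed_hsi.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout: bits 7..4 of the word hold the traffic class. */
#define EX_FUNC_TC_MASK   0x000000f0u
#define EX_FUNC_TC_OFFSET 4

/* Hypothetical shared-memory snapshot with a per-port and a per-function word. */
struct ex_shmem {
	uint32_t port_cfg;
	uint32_t func_cfg;
};

/* Isolate the field with the mask, then shift it down to bit 0. */
static uint8_t extract_tc(uint32_t word)
{
	return (uint8_t)((word & EX_FUNC_TC_MASK) >> EX_FUNC_TC_OFFSET);
}

/* Read the traffic class from the per-function word, where it is stored. */
static uint8_t read_tc(const struct ex_shmem *s)
{
	return extract_tc(s->func_cfg);
}
```

Feeding `port_cfg` to extract_tc() still returns a plausible-looking number,
which is why this kind of mistake compiles and runs without any warning.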
* Re: [PATCH v2 net] mlx4_core: restore optimal ICM memory allocation
From: Qing Huang @ 2018-06-01 1:51 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller
Cc: netdev, Eric Dumazet, John Sperbeck, Tarick Bedeir,
Daniel Jurgens, Zhu Yanjun
In-Reply-To: <20180531125224.97098-1-edumazet@google.com>
On 5/31/2018 5:52 AM, Eric Dumazet wrote:
> Commit 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> brought two regressions caught in our regression suite.
>
> The big one is an additional cost of 256 bytes of overhead per 4096 bytes,
> or 6.25 % which is unacceptable since ICM can be pretty large.
>
> This comes from having to allocate one struct mlx4_icm_chunk (256 bytes)
> per MLX4_TABLE_CHUNK, which the buggy commit shrank to 4KB
> (instead of prior 256KB)
It would be great if you could share the test case that triggered the
KASAN report in your environment. Our QA has been running intensive tests
using 8KB or 4KB chunk size configurations for some time, and no one has
reported memory corruption issues so far, although we are not running the
latest upstream kernel.
IMO, it's worthwhile to find out the root cause and fix the problem.
>
> Note that mlx4_alloc_icm() is already able to try high order allocations
> and fallback to low-order allocations under high memory pressure.
>
> Most of these allocations happen right after boot time, when we get
> plenty of non fragmented memory, there is really no point being so
> pessimistic and break huge pages into order-0 ones just for fun.
>
> We only have to tweak gfp_mask a bit, to help falling back faster,
> without risking OOM killings.
Just FYI, out of memory wasn't our original concern. We didn't encounter
OOM killings.
>
> Second regression is a KASAN fault, that will need further investigation.
>
> Fixes: 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Tariq Toukan <tariqt@mellanox.com>
> Cc: John Sperbeck <jsperbeck@google.com>
> Cc: Tarick Bedeir <tarick@google.com>
> Cc: Qing Huang <qing.huang@oracle.com>
> Cc: Daniel Jurgens <danielj@mellanox.com>
> Cc: Zhu Yanjun <yanjun.zhu@oracle.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/icm.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
> index 685337d58276fc91baeeb64387c52985e1bc6dda..5342bd8a3d0bfaa9e76bb9b6943790606c97b181 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
> @@ -43,12 +43,13 @@
> #include "fw.h"
>
> /*
> - * We allocate in page size (default 4KB on many archs) chunks to avoid high
> - * order memory allocations in fragmented/high usage memory situation.
> + * We allocate in as big chunks as we can, up to a maximum of 256 KB
> + * per chunk. Note that the chunks are not necessarily in contiguous
> + * physical memory.
> */
> enum {
> - MLX4_ICM_ALLOC_SIZE = PAGE_SIZE,
> - MLX4_TABLE_CHUNK_SIZE = PAGE_SIZE,
> + MLX4_ICM_ALLOC_SIZE = 1 << 18,
> + MLX4_TABLE_CHUNK_SIZE = 1 << 18,
> };
>
> static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
> @@ -135,6 +136,7 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
> struct mlx4_icm *icm;
> struct mlx4_icm_chunk *chunk = NULL;
> int cur_order;
> + gfp_t mask;
> int ret;
>
> /* We use sg_set_buf for coherent allocs, which assumes low memory */
> @@ -178,13 +180,17 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
> while (1 << cur_order > npages)
> --cur_order;
>
> + mask = gfp_mask;
> + if (cur_order)
> + mask &= ~__GFP_DIRECT_RECLAIM;
> +
> if (coherent)
> ret = mlx4_alloc_icm_coherent(&dev->persist->pdev->dev,
> &chunk->mem[chunk->npages],
> - cur_order, gfp_mask);
> + cur_order, mask);
> else
> ret = mlx4_alloc_icm_pages(&chunk->mem[chunk->npages],
> - cur_order, gfp_mask,
> + cur_order, mask,
> dev->numa_node);
>
> if (ret) {
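The numbers and the two small mechanisms in the quoted patch (clamping the
order to the pages still needed, and dropping direct reclaim only for
high-order attempts) can be checked in userspace. This is a sketch under
assumptions: all `ex_`/`EX_` names are hypothetical, the flag bit is made up
and is not the kernel's __GFP_DIRECT_RECLAIM value, and the `order > 0` guard
is an addition for safety that the kernel loop does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up flag bit standing in for "may enter direct reclaim". */
#define EX_DIRECT_RECLAIM 0x400u

/* Per-chunk bookkeeping overhead in hundredths of a percent. */
static unsigned long ex_overhead_bp(unsigned long meta_bytes,
				    unsigned long chunk_bytes)
{
	return meta_bytes * 10000UL / chunk_bytes;
}

/* Lower the order until one allocation no longer exceeds npages. */
static int ex_clamp_order(int start_order, int npages)
{
	int order = start_order;

	while (order > 0 && (1 << order) > npages)
		--order;
	return order;
}

/* High-order attempts must fail fast; order-0 keeps the caller's flags. */
static unsigned ex_mask_for_order(unsigned gfp, int order)
{
	return order ? (gfp & ~EX_DIRECT_RECLAIM) : gfp;
}
```

ex_overhead_bp(256, 4096) is 625, i.e. the 6.25 % quoted above, while a
256 KB chunk (1 << 18) brings the same 256 bytes down to under 0.1 %.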
* [PATCH] net: ethernet: mlx4: Remove unnecessary parentheses
From: Varsha Rao @ 2018-06-01 2:00 UTC (permalink / raw)
To: Tariq Toukan, David S. Miller, Nicholas Mc Guire, Lukas Bulwahn,
netdev, linux-rdma, linux-kernel
Cc: Varsha Rao
This patch fixes the clang warning of extraneous parentheses, with the
following coccinelle script.
@@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
)
s
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
---
drivers/net/ethernet/mellanox/mlx4/port.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index 3ef3406ff4cb..10fcc22f4590 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -614,9 +614,9 @@ int __mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan,
int index_at_dup_port = -1;
for (i = MLX4_VLAN_REGULAR; i < MLX4_MAX_VLAN_NUM; i++) {
- if ((vlan == (MLX4_VLAN_MASK & be32_to_cpu(table->entries[i]))))
+ if (vlan == (MLX4_VLAN_MASK & be32_to_cpu(table->entries[i])))
index_at_port = i;
- if ((vlan == (MLX4_VLAN_MASK & be32_to_cpu(dup_table->entries[i]))))
+ if (vlan == (MLX4_VLAN_MASK & be32_to_cpu(dup_table->entries[i])))
index_at_dup_port = i;
}
/* check that same vlan is not in the tables at different indices */
--
2.17.0
* Re: [PATCH V4] mlx4_core: allocate ICM memory in page size chunks
From: Qing Huang @ 2018-06-01 2:04 UTC (permalink / raw)
To: Michal Hocko, Eric Dumazet
Cc: David Miller, tariqt, haakon.bugge, yanjun.zhu, netdev,
linux-rdma, linux-kernel, gi-oh.kim, santosh.shilimkar@oracle.com
In-Reply-To: <20180531091022.GL15278@dhcp22.suse.cz>
On 5/31/2018 2:10 AM, Michal Hocko wrote:
> On Thu 31-05-18 10:55:32, Michal Hocko wrote:
>> On Thu 31-05-18 04:35:31, Eric Dumazet wrote:
> [...]
>>> I merely copied/pasted from alloc_skb_with_frags() :/
>> I will have a look at it. Thanks!
> OK, so this is an example of an incremental development ;).
>
> __GFP_NORETRY was added by ed98df3361f0 ("net: use __GFP_NORETRY for
> high order allocations") to prevent the OOM killer. Yet this was
> not enough because fb05e7a89f50 ("net: don't wait for order-3 page
> allocation") didn't want an excessive reclaim for non-costly orders
> so it made it completely NOWAIT while it preserved __GFP_NORETRY in
> place which is now redundant. Should I send a patch?
>
Just curious, how about the GFP_ATOMIC flag? Would it work in a similar
fashion? We experimented with it a bit in the past but it seemed to cause
other issues in our tests. :-)
By the way, we didn't encounter any OOM killer events. It seemed that
mlx4_alloc_icm() triggered the slowpath.
We still had about 2GB free memory while it was highly fragmented.
#0 [ffff8801f308b380] remove_migration_pte at ffffffff811f0e0b
#1 [ffff8801f308b3e0] rmap_walk_file at ffffffff811cb890
#2 [ffff8801f308b440] rmap_walk at ffffffff811cbaf2
#3 [ffff8801f308b450] remove_migration_ptes at ffffffff811f0db0
#4 [ffff8801f308b490] __unmap_and_move at ffffffff811f2ea6
#5 [ffff8801f308b4e0] unmap_and_move at ffffffff811f2fc5
#6 [ffff8801f308b540] migrate_pages at ffffffff811f3219
#7 [ffff8801f308b5c0] compact_zone at ffffffff811b707e
#8 [ffff8801f308b650] compact_zone_order at ffffffff811b735d
#9 [ffff8801f308b6e0] try_to_compact_pages at ffffffff811b7485
#10 [ffff8801f308b770] __alloc_pages_direct_compact at ffffffff81195f96
#11 [ffff8801f308b7b0] __alloc_pages_slowpath at ffffffff811978a1
#12 [ffff8801f308b890] __alloc_pages_nodemask at ffffffff81197ec1
#13 [ffff8801f308b970] alloc_pages_current at ffffffff811e261f
#14 [ffff8801f308b9e0] mlx4_alloc_icm at ffffffffa01f39b2 [mlx4_core]
Thanks!
* [PATCH] net: wireless: brcmsmac: Remove unnecessary parentheses
From: Varsha Rao @ 2018-06-01 2:14 UTC (permalink / raw)
To: Nicholas Mc Guire, Lukas Bulwahn, Arend van Spriel, Franky Lin,
Hante Meuleman, Chi-Hsien Lin, Wright Feng, Kalle Valo,
David S. Miller, linux-wireless, brcm80211-dev-list.pdl,
brcm80211-dev-list, netdev, linux-kernel
Cc: Varsha Rao
This patch fixes the clang warning of extraneous parentheses, with the
following coccinelle script.
@@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
)
s
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
---
drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
index 3a13d176b221..35e3b101e5cf 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_cmn.c
@@ -159,7 +159,7 @@ u16 read_radio_reg(struct brcms_phy *pi, u16 addr)
{
u16 data;
- if ((addr == RADIO_IDCODE))
+ if (addr == RADIO_IDCODE)
return 0xffff;
switch (pi->pubpi.phy_type) {
--
2.17.0
* Re: [PATCH] rtnetlink: Remove VLA usage
From: David Miller @ 2018-06-01 2:49 UTC (permalink / raw)
To: keescook; +Cc: fw, dsahern, netdev, linux-kernel
In-Reply-To: <20180530222052.GA30622@beast>
From: Kees Cook <keescook@chromium.org>
Date: Wed, 30 May 2018 15:20:52 -0700
> In the quest to remove all stack VLA usage from the kernel[1], this
> allocates the maximum size expected for all possible types and adds
> sanity-checks at both registration and usage to make sure nothing gets
> out of sync. This matches the proposed VLA solution for nfnetlink[2]. The
> values chosen here were based on finding assignments for .maxtype and
> .slave_maxtype and manually counting the enums:
>
> slave_maxtype (max 33):
...
> maxtype (max 45):
...
>
> This additionally changes maxtype and slave_maxtype fields to unsigned,
> since they're only ever using positive values.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> [2] https://patchwork.kernel.org/patch/10439647/
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
Looks good, applied, thanks.
* Re: [PATCH net-next] virtio_net: fix error return code in virtnet_probe()
From: David Miller @ 2018-06-01 2:50 UTC (permalink / raw)
To: weiyongjun1
Cc: mst, jasowang, sridhar.samudrala, virtualization, netdev,
kernel-janitors
In-Reply-To: <1527732307-145609-1-git-send-email-weiyongjun1@huawei.com>
From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Thu, 31 May 2018 02:05:07 +0000
> Fix to return a negative error code from the failover create fail error
> handling case instead of 0, as done elsewhere in this function.
>
> Fixes: ba5e4426e80e ("virtio_net: Extend virtio to use VF datapath when available")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Applied.
* Re: [PATCH v2 net] ixgbe: fix parsing of TC actions for HW offload
From: David Miller @ 2018-06-01 3:01 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: ohlavaty, netdev, andrewx.bowers
In-Reply-To: <b67b75aeb43afd4812e378673151a629f1681eea.camel@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 31 May 2018 14:46:08 -0700
> On Thu, 2018-05-31 at 23:21 +0200, Ondřej Hlavatý wrote:
>> The previous code was optimistic, accepting the offload of whole
>> action
>> chain when there was a single known action (drop/redirect). This
>> results
>> in offloading a rule which should not be offloaded, because its
>> behavior
>> cannot be reproduced in the hardware.
>>
>> For example:
>>
>> $ tc filter add dev eno1 parent ffff: protocol ip \
>> u32 ht 800: order 1 match tcp src 42 FFFF \
>> action mirred egress mirror dev enp1s16 pipe \
>> drop
>>
>> The controller is unable to mirror the packet to a VF, but still
>> offloads the rule by dropping the packet.
>>
>> Change the approach of the function to a pessimistic one, rejecting
>> the
>> chain when an unknown action is found. This is better suited for
>> future
>> extensions.
>>
>> Note that both recognized actions always return TC_ACT_SHOT,
>> therefore
>> it is safe to ignore actions behind them.
>>
>> Signed-off-by: Ondřej Hlavatý <ohlavaty@redhat.com>
>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>
> Note- I am having our validation move to testing with GCC 8.1.1 or
> later so that we can catch warnings like Dave found in the future.
>
> Dave- Please go ahead and pick this up.
Ok, applied, thanks.
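The difference between the optimistic and the pessimistic approach described
in the quoted commit message can be shown with a toy model. The enum and both
functions are hypothetical and only mirror the logic of the description; they
are not the ixgbe code or the tc action API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy action kinds; only drop and redirect can be done in hardware here. */
enum ex_action { EX_ACT_DROP, EX_ACT_REDIRECT, EX_ACT_MIRROR };

/* Old behaviour: accept the chain if any supported action appears in it. */
static int ex_can_offload_optimistic(const enum ex_action *chain, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (chain[i] == EX_ACT_DROP || chain[i] == EX_ACT_REDIRECT)
			return 1;
	return 0;
}

/*
 * New behaviour: walk the chain in order and reject it at the first action
 * the hardware cannot reproduce. A drop or redirect ends processing of the
 * packet, so whatever follows it can be ignored.
 */
static int ex_can_offload_pessimistic(const enum ex_action *chain, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (chain[i]) {
		case EX_ACT_DROP:
		case EX_ACT_REDIRECT:
			return 1;
		default:
			return 0; /* unknown or unsupported: refuse offload */
		}
	}
	return 0; /* empty chain: nothing to offload */
}
```

For the "mirror, then drop" chain from the example, the optimistic check
accepts the rule and the pessimistic one refuses it.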
* Re: [net PATCH] net-sysfs: Fix memory leak in XPS configuration
From: David Miller @ 2018-06-01 3:03 UTC (permalink / raw)
To: alexander.h.duyck; +Cc: netdev
In-Reply-To: <20180531195922.7571.86674.stgit@ahduyck-green-test.jf.intel.com>
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Thu, 31 May 2018 15:59:46 -0400
> This patch reorders the error cases in showing the XPS configuration so
> that we hold off on memory allocation until after we have verified that we
> can support XPS on a given ring.
>
> Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Applied and queued up for -stable.
* Re: [PATCH bpf 1/2] bpf: fix alignment of netns_dev/netns_ino fields in bpf_{map,prog}_info
From: Dmitry V. Levin @ 2018-06-01 3:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eugene Syromiatnikov, netdev, linux-kernel, Martin KaFai Lau,
Daniel Borkmann, Alexei Starovoitov, David S. Miller, Jiri Olsa,
Ingo Molnar, Lawrence Brakmo, Andrey Ignatov, Jakub Kicinski,
John Fastabend
In-Reply-To: <20180530181857.GA6744@altlinux.org>
Hi,
Looks like the ABI bug in bpf_map_info and bpf_prog_info introduced
in 4.16 is going to slip into 4.17, causing extra pain to 32-bit
userspace. I'm adding Linus to this thread in hope it might help
to get a fix applied before 4.17 is released.
On Wed, May 30, 2018 at 09:18:58PM +0300, Dmitry V. Levin wrote:
> On Sun, May 27, 2018 at 01:28:42PM +0200, Eugene Syromiatnikov wrote:
> > Recent introduction of netns_dev/netns_ino to bpf_map_info/bpf_prog_info
> > has broken compat, as offsets of these fields are different in 32-bit
> > and 64-bit ABIs. One fix (other than implementing compat support in
> > syscall in order to handle this discrepancy) is to use __aligned_u64
> > instead of __u64 for these fields.
> >
> > Reported-by: Dmitry V. Levin <ldv@altlinux.org>
> > Fixes: 52775b33bb507 ("bpf: offload: report device information about
> > offloaded maps")
> > Fixes: 675fc275a3a2d ("bpf: offload: report device information for
> > offloaded programs")
> >
> > Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
>
> Reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
> Cc: <stable@vger.kernel.org> # v4.16+
>
> Thanks,
>
> > ---
> > include/uapi/linux/bpf.h | 8 ++++----
> > tools/include/uapi/linux/bpf.h | 8 ++++----
> > 2 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c5ec897..903010a 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1017,8 +1017,8 @@ struct bpf_prog_info {
> > __aligned_u64 map_ids;
> > char name[BPF_OBJ_NAME_LEN];
> > __u32 ifindex;
> > - __u64 netns_dev;
> > - __u64 netns_ino;
> > + __aligned_u64 netns_dev;
> > + __aligned_u64 netns_ino;
> > } __attribute__((aligned(8)));
> >
> > struct bpf_map_info {
> > @@ -1030,8 +1030,8 @@ struct bpf_map_info {
> > __u32 map_flags;
> > char name[BPF_OBJ_NAME_LEN];
> > __u32 ifindex;
> > - __u64 netns_dev;
> > - __u64 netns_ino;
> > + __aligned_u64 netns_dev;
> > + __aligned_u64 netns_ino;
> > } __attribute__((aligned(8)));
> >
> > /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index c5ec897..903010a 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -1017,8 +1017,8 @@ struct bpf_prog_info {
> > __aligned_u64 map_ids;
> > char name[BPF_OBJ_NAME_LEN];
> > __u32 ifindex;
> > - __u64 netns_dev;
> > - __u64 netns_ino;
> > + __aligned_u64 netns_dev;
> > + __aligned_u64 netns_ino;
> > } __attribute__((aligned(8)));
> >
> > struct bpf_map_info {
> > @@ -1030,8 +1030,8 @@ struct bpf_map_info {
> > __u32 map_flags;
> > char name[BPF_OBJ_NAME_LEN];
> > __u32 ifindex;
> > - __u64 netns_dev;
> > - __u64 netns_ino;
> > + __aligned_u64 netns_dev;
> > + __aligned_u64 netns_ino;
> > } __attribute__((aligned(8)));
> >
> > /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
--
ldv
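The offset mismatch described above can be reproduced on a 64-bit host by
modelling the i386 rule that a 64-bit integer needs only 4-byte alignment.
This is a sketch that assumes GCC or Clang (a typedef with the `aligned`
attribute may lower alignment there); `u64_a4`, `u64_a8` and both structs are
hypothetical, reduced stand-ins for the uapi types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer with 4-byte alignment, as a 32-bit x86 ABI would lay it out. */
typedef uint64_t u64_a4 __attribute__((aligned(4)));
/* 64-bit integer forced to 8-byte alignment, in the spirit of __aligned_u64. */
typedef uint64_t u64_a8 __attribute__((aligned(8)));

/* Reduced model of the struct tail before the fix, seen by 32-bit userspace. */
struct info_unaligned {
	uint32_t ifindex;
	u64_a4 netns_dev; /* lands directly after ifindex */
	u64_a4 netns_ino;
} __attribute__((aligned(8)));

/* Reduced model after the fix: identical layout on 32-bit and 64-bit. */
struct info_aligned {
	uint32_t ifindex;
	u64_a8 netns_dev; /* 4 bytes of padding are inserted before it */
	u64_a8 netns_ino;
} __attribute__((aligned(8)));
```

In the first struct netns_dev sits at offset 4, in the second at offset 8,
which is where a 64-bit kernel places it either way.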
* [RFC PATCH] qed: qed_rdma_modify_srq() can be static
From: kbuild test robot @ 2018-06-01 3:41 UTC (permalink / raw)
To: Yuval Bason
Cc: kbuild-all, yuval.bason, davem, netdev, jgg, dledford, linux-rdma,
Michal Kalderon, Ariel Elior
In-Reply-To: <20180530131137.4653-1-yuval.bason@cavium.com>
Fixes: 27c50d39911b ("qed: Add srq core support for RoCE and iWARP")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
---
qed_rdma.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index bd23659..f118328 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -1652,8 +1652,8 @@ static void *qed_rdma_get_rdma_ctx(struct qed_dev *cdev)
return QED_LEADING_HWFN(cdev);
}
-int qed_rdma_modify_srq(void *rdma_cxt,
- struct qed_rdma_modify_srq_in_params *in_params)
+static int qed_rdma_modify_srq(void *rdma_cxt,
+ struct qed_rdma_modify_srq_in_params *in_params)
{
struct rdma_srq_modify_ramrod_data *p_ramrod;
struct qed_hwfn *p_hwfn = rdma_cxt;
@@ -1688,8 +1688,8 @@ int qed_rdma_modify_srq(void *rdma_cxt,
return rc;
}
-int qed_rdma_destroy_srq(void *rdma_cxt,
- struct qed_rdma_destroy_srq_in_params *in_params)
+static int qed_rdma_destroy_srq(void *rdma_cxt,
+ struct qed_rdma_destroy_srq_in_params *in_params)
{
struct rdma_srq_destroy_ramrod_data *p_ramrod;
struct qed_hwfn *p_hwfn = rdma_cxt;
@@ -1731,9 +1731,9 @@ int qed_rdma_destroy_srq(void *rdma_cxt,
return rc;
}
-int qed_rdma_create_srq(void *rdma_cxt,
- struct qed_rdma_create_srq_in_params *in_params,
- struct qed_rdma_create_srq_out_params *out_params)
+static int qed_rdma_create_srq(void *rdma_cxt,
+ struct qed_rdma_create_srq_in_params *in_params,
+ struct qed_rdma_create_srq_out_params *out_params)
{
struct rdma_srq_create_ramrod_data *p_ramrod;
struct qed_hwfn *p_hwfn = rdma_cxt;
* Re: [PATCH net-next] qed: Add srq core support for RoCE and iWARP
From: kbuild test robot @ 2018-06-01 3:41 UTC (permalink / raw)
To: Yuval Bason
Cc: kbuild-all, yuval.bason, davem, netdev, jgg, dledford, linux-rdma,
Michal Kalderon, Ariel Elior
In-Reply-To: <20180530131137.4653-1-yuval.bason@cavium.com>
Hi Yuval,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Yuval-Bason/qed-Add-srq-core-support-for-RoCE-and-iWARP/20180601-073407
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__
sparse warnings: (new ones prefixed by >>)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:137:5: sparse: symbol 'qed_rdma_get_sb_id' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_rdma.c:448:32: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:448:32: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:452:36: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:452:36: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:459:27: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:459:27: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:471:19: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:471:19: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:544:30: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:709:5: sparse: symbol 'qed_rdma_stop' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_rdma.c:796:33: sparse: cast removes address space of expression
drivers/net/ethernet/qlogic/qed/qed_rdma.c:899:16: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:899:16: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:921:16: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:921:16: sparse: expression using sizeof(void)
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1063:31: sparse: incorrect type in assignment (different base types) @@ expected restricted __le16 [usertype] int_timeout @@ got unsignedrestricted __le16 [usertype] int_timeout @@
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1063:31: expected restricted __le16 [usertype] int_timeout
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1063:31: got unsigned short [unsigned] [usertype] int_timeout
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1165:21: sparse: incorrect type in assignment (different base types) @@ expected unsigned short [unsigned] [short] [usertype] <noident> @@ got unsigned] [short] [usertype] <noident> @@
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1165:21: expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1165:21: got restricted __le16 [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1166:21: sparse: incorrect type in assignment (different base types) @@ expected unsigned short [unsigned] [short] [usertype] <noident> @@ got unsigned] [short] [usertype] <noident> @@
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1166:21: expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1166:21: got restricted __le16 [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1167:21: sparse: incorrect type in assignment (different base types) @@ expected unsigned short [unsigned] [short] [usertype] <noident> @@ got unsigned] [short] [usertype] <noident> @@
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1167:21: expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1167:21: got restricted __le16 [usertype] <noident>
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1458:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1462:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1465:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1470:17: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1474:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1478:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1482:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1486:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1490:9: right side has type unsigned long long
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: sparse: invalid assignment: &=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: right side has type int
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: sparse: invalid assignment: |=
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: left side has type restricted __le16
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1493:9: right side has type unsigned long long
>> drivers/net/ethernet/qlogic/qed/qed_rdma.c:1679:29: sparse: incorrect type in assignment (different base types) @@ expected restricted __le32 [usertype] wqe_limit @@ got restricted __le32 [usertype] wqe_limit @@
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1679:29: expected restricted __le32 [usertype] wqe_limit
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1679:29: got restricted __le16 [usertype] <noident>
>> drivers/net/ethernet/qlogic/qed/qed_rdma.c:1655:5: sparse: symbol 'qed_rdma_modify_srq' was not declared. Should it be static?
>> drivers/net/ethernet/qlogic/qed/qed_rdma.c:1691:5: sparse: symbol 'qed_rdma_destroy_srq' was not declared. Should it be static?
>> drivers/net/ethernet/qlogic/qed/qed_rdma.c:1734:5: sparse: symbol 'qed_rdma_create_srq' was not declared. Should it be static?
Please review and possibly fold the followup patch.
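
For readers unfamiliar with this class of warning, here is a minimal userspace sketch of why the width mismatch at line 1679 matters. The destination field is __le32 according to sparse; whether the source value can exceed 16 bits is an assumption, since the parameter struct is not shown in this report. All sketch_* names are hypothetical stand-ins, not the kernel helpers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for cpu_to_le16()/cpu_to_le32(): return a value
 * whose in-memory bytes are little-endian on any host.
 */
static uint16_t sketch_cpu_to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

static uint32_t sketch_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff),
		(uint8_t)((v >> 16) & 0xff), (uint8_t)(v >> 24),
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Decode four bytes the way a little-endian consumer would. */
static uint32_t sketch_read_le32(const void *p)
{
	const uint8_t *b = p;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical ramrod with a 32-bit little-endian field. */
struct sketch_ramrod {
	uint32_t wqe_limit;	/* would be __le32 in the kernel struct */
};

/* Mirrors the flagged pattern: 16-bit conversion into a 32-bit field. */
static uint32_t sketch_store_with_le16(uint32_t limit)
{
	struct sketch_ramrod r;

	r.wqe_limit = sketch_cpu_to_le16((uint16_t)limit);
	return sketch_read_le32(&r.wqe_limit);
}

/* Conversion width matches the field width. */
static uint32_t sketch_store_with_le32(uint32_t limit)
{
	struct sketch_ramrod r;

	r.wqe_limit = sketch_cpu_to_le32(limit);
	return sketch_read_le32(&r.wqe_limit);
}
```

With the 16-bit helper, a limit above 0xffff is truncated, and on a big-endian host the bytes also land in the wrong half of the field. Using the 32-bit conversion for a 32-bit field avoids both problems, which is presumably what the followup patch should do.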
vim +1679 drivers/net/ethernet/qlogic/qed/qed_rdma.c
1425
1426 static int
1427 qed_rdma_register_tid(void *rdma_cxt,
1428 struct qed_rdma_register_tid_in_params *params)
1429 {
1430 struct qed_hwfn *p_hwfn = (struct qed_hwfn *)rdma_cxt;
1431 struct rdma_register_tid_ramrod_data *p_ramrod;
1432 struct qed_sp_init_data init_data;
1433 struct qed_spq_entry *p_ent;
1434 enum rdma_tid_type tid_type;
1435 u8 fw_return_code;
1436 int rc;
1437
1438 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "itid = %08x\n", params->itid);
1439
1440 /* Get SPQ entry */
1441 memset(&init_data, 0, sizeof(init_data));
1442 init_data.opaque_fid = p_hwfn->hw_info.opaque_fid;
1443 init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
1444
1445 rc = qed_sp_init_request(p_hwfn, &p_ent, RDMA_RAMROD_REGISTER_MR,
1446 p_hwfn->p_rdma_info->proto, &init_data);
1447 if (rc) {
1448 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "rc = %d\n", rc);
1449 return rc;
1450 }
1451
1452 if (p_hwfn->p_rdma_info->last_tid < params->itid)
1453 p_hwfn->p_rdma_info->last_tid = params->itid;
1454
1455 p_ramrod = &p_ent->ramrod.rdma_register_tid;
1456
1457 p_ramrod->flags = 0;
1458 SET_FIELD(p_ramrod->flags,
1459 RDMA_REGISTER_TID_RAMROD_DATA_TWO_LEVEL_PBL,
1460 params->pbl_two_level);
1461
1462 SET_FIELD(p_ramrod->flags,
1463 RDMA_REGISTER_TID_RAMROD_DATA_ZERO_BASED, params->zbva);
1464
1465 SET_FIELD(p_ramrod->flags,
1466 RDMA_REGISTER_TID_RAMROD_DATA_PHY_MR, params->phy_mr);
1467
1468 /* Don't initialize D/C field, as it may override other bits. */
1469 if (!(params->tid_type == QED_RDMA_TID_FMR) && !(params->dma_mr))
1470 SET_FIELD(p_ramrod->flags,
1471 RDMA_REGISTER_TID_RAMROD_DATA_PAGE_SIZE_LOG,
1472 params->page_size_log - 12);
1473
1474 SET_FIELD(p_ramrod->flags,
1475 RDMA_REGISTER_TID_RAMROD_DATA_REMOTE_READ,
1476 params->remote_read);
1477
1478 SET_FIELD(p_ramrod->flags,
1479 RDMA_REGISTER_TID_RAMROD_DATA_REMOTE_WRITE,
1480 params->remote_write);
1481
1482 SET_FIELD(p_ramrod->flags,
1483 RDMA_REGISTER_TID_RAMROD_DATA_REMOTE_ATOMIC,
1484 params->remote_atomic);
1485
1486 SET_FIELD(p_ramrod->flags,
1487 RDMA_REGISTER_TID_RAMROD_DATA_LOCAL_WRITE,
1488 params->local_write);
1489
> 1490 SET_FIELD(p_ramrod->flags,
1491 RDMA_REGISTER_TID_RAMROD_DATA_LOCAL_READ, params->local_read);
1492
> 1493 SET_FIELD(p_ramrod->flags,
1494 RDMA_REGISTER_TID_RAMROD_DATA_ENABLE_MW_BIND,
1495 params->mw_bind);
1496
1497 SET_FIELD(p_ramrod->flags1,
1498 RDMA_REGISTER_TID_RAMROD_DATA_PBL_PAGE_SIZE_LOG,
1499 params->pbl_page_size_log - 12);
1500
1501 SET_FIELD(p_ramrod->flags2,
1502 RDMA_REGISTER_TID_RAMROD_DATA_DMA_MR, params->dma_mr);
1503
1504 switch (params->tid_type) {
1505 case QED_RDMA_TID_REGISTERED_MR:
1506 tid_type = RDMA_TID_REGISTERED_MR;
1507 break;
1508 case QED_RDMA_TID_FMR:
1509 tid_type = RDMA_TID_FMR;
1510 break;
1511 case QED_RDMA_TID_MW_TYPE1:
1512 tid_type = RDMA_TID_MW_TYPE1;
1513 break;
1514 case QED_RDMA_TID_MW_TYPE2A:
1515 tid_type = RDMA_TID_MW_TYPE2A;
1516 break;
1517 default:
1518 rc = -EINVAL;
1519 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "rc = %d\n", rc);
1520 return rc;
1521 }
1522 SET_FIELD(p_ramrod->flags1,
1523 RDMA_REGISTER_TID_RAMROD_DATA_TID_TYPE, tid_type);
1524
1525 p_ramrod->itid = cpu_to_le32(params->itid);
1526 p_ramrod->key = params->key;
1527 p_ramrod->pd = cpu_to_le16(params->pd);
1528 p_ramrod->length_hi = (u8)(params->length >> 32);
1529 p_ramrod->length_lo = DMA_LO_LE(params->length);
1530 if (params->zbva) {
1531 /* Lower 32 bits of the registered MR address.
1532 * In case of zero based MR, will hold FBO
1533 */
1534 p_ramrod->va.hi = 0;
1535 p_ramrod->va.lo = cpu_to_le32(params->fbo);
1536 } else {
1537 DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
1538 }
1539 DMA_REGPAIR_LE(p_ramrod->pbl_base, params->pbl_ptr);
1540
1541 /* DIF */
1542 if (params->dif_enabled) {
1543 SET_FIELD(p_ramrod->flags2,
1544 RDMA_REGISTER_TID_RAMROD_DATA_DIF_ON_HOST_FLG, 1);
1545 DMA_REGPAIR_LE(p_ramrod->dif_error_addr,
1546 params->dif_error_addr);
1547 DMA_REGPAIR_LE(p_ramrod->dif_runt_addr, params->dif_runt_addr);
1548 }
1549
1550 rc = qed_spq_post(p_hwfn, p_ent, &fw_return_code);
1551 if (rc)
1552 return rc;
1553
1554 if (fw_return_code != RDMA_RETURN_OK) {
1555 DP_NOTICE(p_hwfn, "fw_return_code = %d\n", fw_return_code);
1556 return -EINVAL;
1557 }
1558
1559 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "Register TID, rc = %d\n", rc);
1560 return rc;
1561 }
1562
1563 static int qed_rdma_deregister_tid(void *rdma_cxt, u32 itid)
1564 {
1565 struct qed_hwfn *p_hwfn = (struct qed_hwfn *)rdma_cxt;
1566 struct rdma_deregister_tid_ramrod_data *p_ramrod;
1567 struct qed_sp_init_data init_data;
1568 struct qed_spq_entry *p_ent;
1569 struct qed_ptt *p_ptt;
1570 u8 fw_return_code;
1571 int rc;
1572
1573 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "itid = %08x\n", itid);
1574
1575 /* Get SPQ entry */
1576 memset(&init_data, 0, sizeof(init_data));
1577 init_data.opaque_fid = p_hwfn->hw_info.opaque_fid;
1578 init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
1579
1580 rc = qed_sp_init_request(p_hwfn, &p_ent, RDMA_RAMROD_DEREGISTER_MR,
1581 p_hwfn->p_rdma_info->proto, &init_data);
1582 if (rc) {
1583 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "rc = %d\n", rc);
1584 return rc;
1585 }
1586
1587 p_ramrod = &p_ent->ramrod.rdma_deregister_tid;
1588 p_ramrod->itid = cpu_to_le32(itid);
1589
1590 rc = qed_spq_post(p_hwfn, p_ent, &fw_return_code);
1591 if (rc) {
1592 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "rc = %d\n", rc);
1593 return rc;
1594 }
1595
1596 if (fw_return_code == RDMA_RETURN_DEREGISTER_MR_BAD_STATE_ERR) {
1597 DP_NOTICE(p_hwfn, "fw_return_code = %d\n", fw_return_code);
1598 return -EINVAL;
1599 } else if (fw_return_code == RDMA_RETURN_NIG_DRAIN_REQ) {
1600 /* Bit indicating that the TID is in use and a nig drain is
1601 * required before sending the ramrod again
1602 */
1603 p_ptt = qed_ptt_acquire(p_hwfn);
1604 if (!p_ptt) {
1605 rc = -EBUSY;
1606 DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
1607 "Failed to acquire PTT\n");
1608 return rc;
1609 }
1610
1611 rc = qed_mcp_drain(p_hwfn, p_ptt);
1612 if (rc) {
1613 qed_ptt_release(p_hwfn, p_ptt);
1614 DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
1615 "Drain failed\n");
1616 return rc;
1617 }
1618
1619 qed_ptt_release(p_hwfn, p_ptt);
1620
1621 /* Resend the ramrod */
1622 rc = qed_sp_init_request(p_hwfn, &p_ent,
1623 RDMA_RAMROD_DEREGISTER_MR,
1624 p_hwfn->p_rdma_info->proto,
1625 &init_data);
1626 if (rc) {
1627 DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
1628 "Failed to init sp-element\n");
1629 return rc;
1630 }
1631
1632 rc = qed_spq_post(p_hwfn, p_ent, &fw_return_code);
1633 if (rc) {
1634 DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
1635 "Ramrod failed\n");
1636 return rc;
1637 }
1638
1639 if (fw_return_code != RDMA_RETURN_OK) {
1640 DP_NOTICE(p_hwfn, "fw_return_code = %d\n",
1641 fw_return_code);
1642 return rc;
1643 }
1644 }
1645
1646 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "De-registered TID, rc = %d\n", rc);
1647 return rc;
1648 }
1649
1650 static void *qed_rdma_get_rdma_ctx(struct qed_dev *cdev)
1651 {
1652 return QED_LEADING_HWFN(cdev);
1653 }
1654
> 1655 int qed_rdma_modify_srq(void *rdma_cxt,
1656 struct qed_rdma_modify_srq_in_params *in_params)
1657 {
1658 struct rdma_srq_modify_ramrod_data *p_ramrod;
1659 struct qed_hwfn *p_hwfn = rdma_cxt;
1660 struct qed_sp_init_data init_data;
1661 struct qed_spq_entry *p_ent;
1662 u16 opaque_fid;
1663 int rc;
1664
1665 memset(&init_data, 0, sizeof(init_data));
1666 init_data.opaque_fid = p_hwfn->hw_info.opaque_fid;
1667 init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
1668
1669 rc = qed_sp_init_request(p_hwfn, &p_ent,
1670 RDMA_RAMROD_MODIFY_SRQ,
1671 p_hwfn->p_rdma_info->proto, &init_data);
1672 if (rc)
1673 return rc;
1674
1675 p_ramrod = &p_ent->ramrod.rdma_modify_srq;
1676 p_ramrod->srq_id.srq_idx = cpu_to_le16(in_params->srq_id);
1677 opaque_fid = p_hwfn->hw_info.opaque_fid;
1678 p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
> 1679 p_ramrod->wqe_limit = cpu_to_le16(in_params->wqe_limit);
1680
1681 rc = qed_spq_post(p_hwfn, p_ent, NULL);
1682 if (rc)
1683 return rc;
1684
1685 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "modified SRQ id = %x",
1686 in_params->srq_id);
1687
1688 return rc;
1689 }
1690
> 1691 int qed_rdma_destroy_srq(void *rdma_cxt,
1692 struct qed_rdma_destroy_srq_in_params *in_params)
1693 {
1694 struct rdma_srq_destroy_ramrod_data *p_ramrod;
1695 struct qed_hwfn *p_hwfn = rdma_cxt;
1696 struct qed_sp_init_data init_data;
1697 struct qed_spq_entry *p_ent;
1698 struct qed_bmap *bmap;
1699 u16 opaque_fid;
1700 int rc;
1701
1702 opaque_fid = p_hwfn->hw_info.opaque_fid;
1703
1704 memset(&init_data, 0, sizeof(init_data));
1705 init_data.opaque_fid = opaque_fid;
1706 init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
1707
1708 rc = qed_sp_init_request(p_hwfn, &p_ent,
1709 RDMA_RAMROD_DESTROY_SRQ,
1710 p_hwfn->p_rdma_info->proto, &init_data);
1711 if (rc)
1712 return rc;
1713
1714 p_ramrod = &p_ent->ramrod.rdma_destroy_srq;
1715 p_ramrod->srq_id.srq_idx = cpu_to_le16(in_params->srq_id);
1716 p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
1717
1718 rc = qed_spq_post(p_hwfn, p_ent, NULL);
1719 if (rc)
1720 return rc;
1721
1722 bmap = &p_hwfn->p_rdma_info->srq_map;
1723
1724 spin_lock_bh(&p_hwfn->p_rdma_info->lock);
1725 qed_bmap_release_id(p_hwfn, bmap, in_params->srq_id);
1726 spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
1727
1728 DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "SRQ destroyed Id = %x",
1729 in_params->srq_id);
1730
1731 return rc;
1732 }
1733
> 1734 int qed_rdma_create_srq(void *rdma_cxt,
1735 struct qed_rdma_create_srq_in_params *in_params,
1736 struct qed_rdma_create_srq_out_params *out_params)
1737 {
1738 struct rdma_srq_create_ramrod_data *p_ramrod;
1739 struct qed_hwfn *p_hwfn = rdma_cxt;
1740 struct qed_sp_init_data init_data;
1741 enum qed_cxt_elem_type elem_type;
1742 struct qed_spq_entry *p_ent;
1743 u16 opaque_fid, srq_id;
1744 struct qed_bmap *bmap;
1745 u32 returned_id;
1746 int rc;
1747
1748 bmap = &p_hwfn->p_rdma_info->srq_map;
1749 spin_lock_bh(&p_hwfn->p_rdma_info->lock);
1750 rc = qed_rdma_bmap_alloc_id(p_hwfn, bmap, &returned_id);
1751 spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
1752
1753 if (rc) {
1754 DP_NOTICE(p_hwfn, "failed to allocate srq id\n");
1755 return rc;
1756 }
1757
1758 elem_type = QED_ELEM_SRQ;
1759 rc = qed_cxt_dynamic_ilt_alloc(p_hwfn, elem_type, returned_id);
1760 if (rc)
1761 goto err;
1762 /* returned id is no greater than u16 */
1763 srq_id = (u16)returned_id;
1764 opaque_fid = p_hwfn->hw_info.opaque_fid;
1765
1766 memset(&init_data, 0, sizeof(init_data));
1767 opaque_fid = p_hwfn->hw_info.opaque_fid;
1768 init_data.opaque_fid = opaque_fid;
1769 init_data.comp_mode = QED_SPQ_MODE_EBLOCK;
1770
1771 rc = qed_sp_init_request(p_hwfn, &p_ent,
1772 RDMA_RAMROD_CREATE_SRQ,
1773 p_hwfn->p_rdma_info->proto, &init_data);
1774 if (rc)
1775 goto err;
1776
1777 p_ramrod = &p_ent->ramrod.rdma_create_srq;
1778 DMA_REGPAIR_LE(p_ramrod->pbl_base_addr, in_params->pbl_base_addr);
1779 p_ramrod->pages_in_srq_pbl = cpu_to_le16(in_params->num_pages);
1780 p_ramrod->pd_id = cpu_to_le16(in_params->pd_id);
1781 p_ramrod->srq_id.srq_idx = cpu_to_le16(srq_id);
1782 p_ramrod->srq_id.opaque_fid = cpu_to_le16(opaque_fid);
1783 p_ramrod->page_size = cpu_to_le16(in_params->page_size);
1784 DMA_REGPAIR_LE(p_ramrod->producers_addr, in_params->prod_pair_addr);
1785
1786 rc = qed_spq_post(p_hwfn, p_ent, NULL);
1787 if (rc)
1788 goto err;
1789
1790 out_params->srq_id = srq_id;
1791
1792 DP_VERBOSE(p_hwfn, QED_MSG_RDMA,
1793 "SRQ created Id = %x\n", out_params->srq_id);
1794
1795 return rc;
1796
1797 err:
1798 spin_lock_bh(&p_hwfn->p_rdma_info->lock);
1799 qed_bmap_release_id(p_hwfn, bmap, returned_id);
1800 spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
1801
1802 return rc;
1803 }
1804
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply
* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2018-06-01 3:59 UTC (permalink / raw)
To: David Miller, Networking
Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180531073855.29c23fce@canb.auug.org.au>
Hi all,
On Thu, 31 May 2018 07:38:55 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Tue, 29 May 2018 13:25:48 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >
> > After merging the net-next tree, today's linux-next build (x86_64
> > allmodconfig) failed like this:
> >
> > x86_64-linux-ld: unknown architecture of input file `net/bpfilter/bpfilter_umh.o' is incompatible with i386:x86-64 output
> >
> > Caused by commit
> >
> > d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
> >
> > In my builds, the host is PowerPC 64 LE ...
> >
> > I have reverted that commit along with
> >
> > 61a552eb487f ("bpfilter: fix build dependency")
> > 13405468f49d ("bpfilter: don't pass O_CREAT when opening console for debug")
> >
> > for today.
>
> I am still getting this failure (well, at least yesterday I did).
Still happened today. My guess is that bpfilter_umh needs to be built
with the kernel compiler (not the host compiler, since it is meant to
run on the same machine as the kernel, right?) but will require the
kernel architecture libc etc.
I replaced the reverts above with the following:
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Fri, 1 Jun 2018 13:33:28 +1000
Subject: [PATCH] net: bpfilter: mark as BROKEN for now
This does not build in a cross compile environment
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
net/bpfilter/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/bpfilter/Kconfig b/net/bpfilter/Kconfig
index a948b072c28f..ea4be72fdf6e 100644
--- a/net/bpfilter/Kconfig
+++ b/net/bpfilter/Kconfig
@@ -2,6 +2,7 @@ menuconfig BPFILTER
bool "BPF based packet filtering framework (BPFILTER)"
default n
depends on NET && BPF && INET
+ depends on BROKEN
help
This builds experimental bpfilter framework that is aiming to
provide netfilter compatible functionality via BPF
--
2.17.0
--
Cheers,
Stephen Rothwell
^ permalink raw reply related
* [RFC PATCH 00/18] Assorted rhashtable improvements
From: NeilBrown @ 2018-06-01 4:44 UTC (permalink / raw)
To: Thomas Graf, Herbert Xu; +Cc: netdev, linux-kernel
Hi,
the following is my current set of rhashtable improvements.
Some have been seen before, some have been improved,
others are new.
They include:
  working list-nulls support
  stability improvements for rhashtable_walk
  bit-spin-locks for simplicity and reduced cache footprint
    during modification
  optional per-cpu locks to improve scalability for modification
  various cleanups
If I get suitable acks I will send more focused subsets to Davem for
inclusion.
I had said previously that I thought there was a way to provide
stable walking of an rhl table in the face of concurrent
insert/delete. Having tried, I no longer think this can be
done without substantial impact to lookups and/or other operations.
The idea of attaching a marker to the list is incompatible with
the normal rules for working with rcu-protected lists ("attaching"
might be manageable. "moving" or "removing" is the problematic part).
The last patch is the one I'm least certain of. It seems like a good
idea to improve the chance of a walk avoiding any rehash, but it
cannot provide a solid guarantee without risking a denial-of-service.
My compromise is to guarantee no rehashes caused by shrinkage, and
discourage rehashes caused by growth. I'm not yet sure if that is
sufficiently valuable, but I thought I would include the patch in the
RFC anyway.
Thanks,
NeilBrown
---
NeilBrown (18):
rhashtable: silence RCU warning in rhashtable_test.
rhashtable: split rhashtable.h
rhashtable: remove nulls_base and related code.
rhashtable: detect when object movement might have invalidated a lookup
rhashtable: simplify INIT_RHT_NULLS_HEAD()
rhashtable: simplify nested_table_alloc() and rht_bucket_nested_insert()
rhashtable: use cmpxchg() to protect ->future_tbl.
rhashtable: clean up dereference of ->future_tbl.
rhashtable: use cmpxchg() in nested_table_alloc()
rhashtable: remove rhashtable_walk_peek()
rhashtable: further improve stability of rhashtable_walk
rhashtable: add rhashtable_walk_prev()
rhashtable: don't hold lock on first table throughout insertion.
rhashtable: allow rht_bucket_var to return NULL.
rhashtable: use bit_spin_locks to protect hash bucket.
rhashtable: allow percpu element counter
rhashtable: rename rht_for_each*continue as *from.
rhashtable: add rhashtable_walk_delay_rehash()
.clang-format | 8
MAINTAINERS | 2
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1
drivers/staging/lustre/lustre/fid/fid_request.c | 2
drivers/staging/lustre/lustre/fld/fld_request.c | 1
drivers/staging/lustre/lustre/include/lu_object.h | 1
include/linux/ipc.h | 2
include/linux/ipc_namespace.h | 2
include/linux/mroute_base.h | 2
include/linux/percpu_counter.h | 4
include/linux/rhashtable-types.h | 147 ++++++
include/linux/rhashtable.h | 537 +++++++++++----------
include/net/inet_frag.h | 2
include/net/netfilter/nf_flow_table.h | 2
include/net/sctp/structs.h | 2
include/net/seg6.h | 2
include/net/seg6_hmac.h | 2
ipc/msg.c | 1
ipc/sem.c | 1
ipc/shm.c | 1
ipc/util.c | 2
lib/rhashtable.c | 481 +++++++++++--------
lib/test_rhashtable.c | 22 +
net/bridge/br_fdb.c | 1
net/bridge/br_vlan.c | 1
net/bridge/br_vlan_tunnel.c | 1
net/ipv4/inet_fragment.c | 1
net/ipv4/ipmr.c | 2
net/ipv4/ipmr_base.c | 1
net/ipv6/ip6mr.c | 2
net/ipv6/seg6.c | 1
net/ipv6/seg6_hmac.c | 1
net/netfilter/nf_tables_api.c | 1
net/sctp/input.c | 1
net/sctp/socket.c | 1
35 files changed, 760 insertions(+), 481 deletions(-)
create mode 100644 include/linux/rhashtable-types.h
--
Signature
^ permalink raw reply
* [PATCH 01/18] rhashtable: silence RCU warning in rhashtable_test.
From: NeilBrown @ 2018-06-01 4:44 UTC (permalink / raw)
To: Thomas Graf, Herbert Xu; +Cc: netdev, linux-kernel
In-Reply-To: <152782754287.30340.4395718227884933670.stgit@noble>
print_ht in rhashtable_test calls rht_dereference() with neither
RCU protection nor the mutex. This triggers an RCU warning.
So take the mutex to silence the warning.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
---
lib/test_rhashtable.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index f4000c137dbe..bf92b7aa2a49 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -499,6 +499,8 @@ static unsigned int __init print_ht(struct rhltable *rhlt)
unsigned int i, cnt = 0;
ht = &rhlt->ht;
+ /* Take the mutex to avoid RCU warning */
+ mutex_lock(&ht->mutex);
tbl = rht_dereference(ht->tbl, ht);
for (i = 0; i < tbl->size; i++) {
struct rhash_head *pos, *next;
@@ -532,6 +534,7 @@ static unsigned int __init print_ht(struct rhltable *rhlt)
}
}
printk(KERN_ERR "\n---- ht: ----%s\n-------------\n", buff);
+ mutex_unlock(&ht->mutex);
return cnt;
}
^ permalink raw reply related
* [PATCH 02/18] rhashtable: split rhashtable.h
From: NeilBrown @ 2018-06-01 4:44 UTC (permalink / raw)
To: Thomas Graf, Herbert Xu; +Cc: netdev, linux-kernel
In-Reply-To: <152782754287.30340.4395718227884933670.stgit@noble>
Due to the use of rhashtables in net namespaces,
rhashtable.h is included in lots of the kernel,
so a small change can require a large recompilation.
This makes development painful.
This patch splits out rhashtable-types.h which just includes
the major type declarations, and does not include (non-trivial)
inline code. rhashtable.h is no longer included by anything
in the include/ directory.
Common include files only include rhashtable-types.h so a large
recompilation is only triggered when that changes.
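
As a rough illustration of the split (a single-file sketch, with hypothetical names standing in for the real headers), the types-only part is all that a widely included header needs in order to embed the structure, while the inline helpers stay in a header that only .c files pull in:

```c
/* ---- sketch_table_types.h (hypothetical): type declarations only ---- */
struct sketch_table {
	unsigned int size;	/* number of buckets */
	unsigned int nelems;	/* number of stored elements */
};

/* ---- sketch_namespace.h (hypothetical): a widely included header ----
 * Embedding the struct needs its full definition, but none of the
 * inline code, so only the types header would be included here.
 */
struct sketch_namespace {
	int id;
	struct sketch_table ids;
};

/* ---- sketch_table.h (hypothetical): inline helpers ----
 * Only the .c files that operate on the table include this part, so
 * editing a helper does not force everything else to rebuild.
 */
static inline int sketch_table_is_full(const struct sketch_table *t)
{
	return t->nelems >= t->size;
}

static inline int sketch_table_add(struct sketch_table *t)
{
	if (sketch_table_is_full(t))
		return -1;	/* no room left */
	t->nelems++;
	return 0;
}
```

In this sketch a change to sketch_table_add() would only rebuild users of the helper header, whereas a change to struct sketch_table would still rebuild everything that embeds it.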
Signed-off-by: NeilBrown <neilb@suse.com>
---
MAINTAINERS | 2
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 1
drivers/staging/lustre/lustre/fid/fid_request.c | 2
drivers/staging/lustre/lustre/fld/fld_request.c | 1
drivers/staging/lustre/lustre/include/lu_object.h | 1
include/linux/ipc.h | 2
include/linux/ipc_namespace.h | 2
include/linux/mroute_base.h | 2
include/linux/rhashtable-types.h | 139 +++++++++++++++++++++
include/linux/rhashtable.h | 127 -------------------
include/net/inet_frag.h | 2
include/net/netfilter/nf_flow_table.h | 2
include/net/sctp/structs.h | 2
include/net/seg6.h | 2
include/net/seg6_hmac.h | 2
ipc/msg.c | 1
ipc/sem.c | 1
ipc/shm.c | 1
ipc/util.c | 1
lib/rhashtable.c | 1
net/ipv4/inet_fragment.c | 1
net/ipv4/ipmr.c | 1
net/ipv4/ipmr_base.c | 1
net/ipv6/ip6mr.c | 1
net/ipv6/seg6.c | 1
net/ipv6/seg6_hmac.c | 1
net/netfilter/nf_tables_api.c | 1
net/sctp/input.c | 1
net/sctp/socket.c | 1
29 files changed, 169 insertions(+), 134 deletions(-)
create mode 100644 include/linux/rhashtable-types.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 66985d05bced..a3a4f44d3ce1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12012,7 +12012,9 @@ M: Herbert Xu <herbert@gondor.apana.org.au>
L: netdev@vger.kernel.org
S: Maintained
F: lib/rhashtable.c
+F: lib/test_rhashtable.c
F: include/linux/rhashtable.h
+F: include/linux/rhashtable-types.h
RICOH R5C592 MEMORYSTICK DRIVER
M: Maxim Levitsky <maximlevitsky@gmail.com>
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 688f95440af2..66a7a02d616a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -46,6 +46,7 @@
#include <linux/spinlock.h>
#include <linux/timer.h>
#include <linux/vmalloc.h>
+#include <linux/rhashtable.h>
#include <linux/etherdevice.h>
#include <linux/net_tstamp.h>
#include <linux/ptp_clock_kernel.h>
diff --git a/drivers/staging/lustre/lustre/fid/fid_request.c b/drivers/staging/lustre/lustre/fid/fid_request.c
index c674652af03a..a9af68178eff 100644
--- a/drivers/staging/lustre/lustre/fid/fid_request.c
+++ b/drivers/staging/lustre/lustre/fid/fid_request.c
@@ -40,7 +40,7 @@
#define DEBUG_SUBSYSTEM S_FID
#include <linux/module.h>
-
+#include <linux/rhashtable.h>
#include <obd.h>
#include <obd_class.h>
#include <obd_support.h>
diff --git a/drivers/staging/lustre/lustre/fld/fld_request.c b/drivers/staging/lustre/lustre/fld/fld_request.c
index 7b7ba93a4db6..292ff5756243 100644
--- a/drivers/staging/lustre/lustre/fld/fld_request.c
+++ b/drivers/staging/lustre/lustre/fld/fld_request.c
@@ -40,6 +40,7 @@
#define DEBUG_SUBSYSTEM S_FLD
#include <linux/module.h>
+#include <linux/rhashtable.h>
#include <asm/div64.h>
#include <obd.h>
diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index 1b5284d2416c..205463c47bda 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -37,6 +37,7 @@
#include <stdarg.h>
#include <linux/percpu_counter.h>
#include <linux/libcfs/libcfs.h>
+#include <linux/rhashtable-types.h>
#include <uapi/linux/lustre/lustre_idl.h>
#include <lu_ref.h>
diff --git a/include/linux/ipc.h b/include/linux/ipc.h
index 6cc2df7f7ac9..e1c9eea6015b 100644
--- a/include/linux/ipc.h
+++ b/include/linux/ipc.h
@@ -4,7 +4,7 @@
#include <linux/spinlock.h>
#include <linux/uidgid.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
#include <uapi/linux/ipc.h>
#include <linux/refcount.h>
diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index b5630c8eb2f3..6cea726612b7 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -9,7 +9,7 @@
#include <linux/nsproxy.h>
#include <linux/ns_common.h>
#include <linux/refcount.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
struct user_namespace;
diff --git a/include/linux/mroute_base.h b/include/linux/mroute_base.h
index d617fe45543e..fd673be398ff 100644
--- a/include/linux/mroute_base.h
+++ b/include/linux/mroute_base.h
@@ -2,7 +2,7 @@
#define __LINUX_MROUTE_BASE_H
#include <linux/netdevice.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
#include <linux/spinlock.h>
#include <net/net_namespace.h>
#include <net/sock.h>
diff --git a/include/linux/rhashtable-types.h b/include/linux/rhashtable-types.h
new file mode 100644
index 000000000000..9740063ff13b
--- /dev/null
+++ b/include/linux/rhashtable-types.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Resizable, Scalable, Concurrent Hash Table
+ *
+ * Simple structures that might be needed in include
+ * files.
+ */
+
+#ifndef _LINUX_RHASHTABLE_TYPES_H
+#define _LINUX_RHASHTABLE_TYPES_H
+
+#include <linux/atomic.h>
+#include <linux/compiler.h>
+#include <linux/mutex.h>
+#include <linux/workqueue.h>
+
+struct rhash_head {
+ struct rhash_head __rcu *next;
+};
+
+struct rhlist_head {
+ struct rhash_head rhead;
+ struct rhlist_head __rcu *next;
+};
+
+struct bucket_table;
+
+/**
+ * struct rhashtable_compare_arg - Key for the function rhashtable_compare
+ * @ht: Hash table
+ * @key: Key to compare against
+ */
+struct rhashtable_compare_arg {
+ struct rhashtable *ht;
+ const void *key;
+};
+
+typedef u32 (*rht_hashfn_t)(const void *data, u32 len, u32 seed);
+typedef u32 (*rht_obj_hashfn_t)(const void *data, u32 len, u32 seed);
+typedef int (*rht_obj_cmpfn_t)(struct rhashtable_compare_arg *arg,
+ const void *obj);
+
+/**
+ * struct rhashtable_params - Hash table construction parameters
+ * @nelem_hint: Hint on number of elements, should be 75% of desired size
+ * @key_len: Length of key
+ * @key_offset: Offset of key in struct to be hashed
+ * @head_offset: Offset of rhash_head in struct to be hashed
+ * @max_size: Maximum size while expanding
+ * @min_size: Minimum size while shrinking
+ * @locks_mul: Number of bucket locks to allocate per cpu (default: 32)
+ * @automatic_shrinking: Enable automatic shrinking of tables
+ * @nulls_base: Base value to generate nulls marker
+ * @hashfn: Hash function (default: jhash2 if !(key_len % 4), or jhash)
+ * @obj_hashfn: Function to hash object
+ * @obj_cmpfn: Function to compare key with object
+ */
+struct rhashtable_params {
+ u16 nelem_hint;
+ u16 key_len;
+ u16 key_offset;
+ u16 head_offset;
+ unsigned int max_size;
+ u16 min_size;
+ bool automatic_shrinking;
+ u8 locks_mul;
+ u32 nulls_base;
+ rht_hashfn_t hashfn;
+ rht_obj_hashfn_t obj_hashfn;
+ rht_obj_cmpfn_t obj_cmpfn;
+};
+
+/**
+ * struct rhashtable - Hash table handle
+ * @tbl: Bucket table
+ * @key_len: Key length for hashfn
+ * @max_elems: Maximum number of elements in table
+ * @p: Configuration parameters
+ * @rhlist: True if this is an rhltable
+ * @run_work: Deferred worker to expand/shrink asynchronously
+ * @mutex: Mutex to protect current/future table swapping
+ * @lock: Spin lock to protect walker list
+ * @nelems: Number of elements in table
+ */
+struct rhashtable {
+ struct bucket_table __rcu *tbl;
+ unsigned int key_len;
+ unsigned int max_elems;
+ struct rhashtable_params p;
+ bool rhlist;
+ struct work_struct run_work;
+ struct mutex mutex;
+ spinlock_t lock;
+ atomic_t nelems;
+};
+
+/**
+ * struct rhltable - Hash table with duplicate objects in a list
+ * @ht: Underlying rhtable
+ */
+struct rhltable {
+ struct rhashtable ht;
+};
+
+/**
+ * struct rhashtable_walker - Hash table walker
+ * @list: List entry on list of walkers
+ * @tbl: The table that we were walking over
+ */
+struct rhashtable_walker {
+ struct list_head list;
+ struct bucket_table *tbl;
+};
+
+/**
+ * struct rhashtable_iter - Hash table iterator
+ * @ht: Table to iterate through
+ * @p: Current pointer
+ * @list: Current hash list pointer
+ * @walker: Associated rhashtable walker
+ * @slot: Current slot
+ * @skip: Number of entries to skip in slot
+ */
+struct rhashtable_iter {
+ struct rhashtable *ht;
+ struct rhash_head *p;
+ struct rhlist_head *list;
+ struct rhashtable_walker walker;
+ unsigned int slot;
+ unsigned int skip;
+ bool end_of_table;
+};
+
+int rhashtable_init(struct rhashtable *ht,
+ const struct rhashtable_params *params);
+int rhltable_init(struct rhltable *hlt,
+ const struct rhashtable_params *params);
+
+#endif /* _LINUX_RHASHTABLE_TYPES_H */
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 4e1f535c2034..48754ab07cdf 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
/*
* Resizable, Scalable, Concurrent Hash Table
*
@@ -17,16 +18,14 @@
#ifndef _LINUX_RHASHTABLE_H
#define _LINUX_RHASHTABLE_H
-#include <linux/atomic.h>
-#include <linux/compiler.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/jhash.h>
#include <linux/list_nulls.h>
#include <linux/workqueue.h>
-#include <linux/mutex.h>
#include <linux/rculist.h>
+#include <linux/rhashtable-types.h>
/*
* The end of the chain is marked with a special nulls marks which has
* the following format:
@@ -64,15 +63,6 @@
*/
#define RHT_ELASTICITY 16u
-struct rhash_head {
- struct rhash_head __rcu *next;
-};
-
-struct rhlist_head {
- struct rhash_head rhead;
- struct rhlist_head __rcu *next;
-};
-
/**
* struct bucket_table - Table of hash buckets
* @size: Number of hash buckets
@@ -102,114 +92,6 @@ struct bucket_table {
struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
};
-/**
- * struct rhashtable_compare_arg - Key for the function rhashtable_compare
- * @ht: Hash table
- * @key: Key to compare against
- */
-struct rhashtable_compare_arg {
- struct rhashtable *ht;
- const void *key;
-};
-
-typedef u32 (*rht_hashfn_t)(const void *data, u32 len, u32 seed);
-typedef u32 (*rht_obj_hashfn_t)(const void *data, u32 len, u32 seed);
-typedef int (*rht_obj_cmpfn_t)(struct rhashtable_compare_arg *arg,
- const void *obj);
-
-struct rhashtable;
-
-/**
- * struct rhashtable_params - Hash table construction parameters
- * @nelem_hint: Hint on number of elements, should be 75% of desired size
- * @key_len: Length of key
- * @key_offset: Offset of key in struct to be hashed
- * @head_offset: Offset of rhash_head in struct to be hashed
- * @max_size: Maximum size while expanding
- * @min_size: Minimum size while shrinking
- * @locks_mul: Number of bucket locks to allocate per cpu (default: 32)
- * @automatic_shrinking: Enable automatic shrinking of tables
- * @nulls_base: Base value to generate nulls marker
- * @hashfn: Hash function (default: jhash2 if !(key_len % 4), or jhash)
- * @obj_hashfn: Function to hash object
- * @obj_cmpfn: Function to compare key with object
- */
-struct rhashtable_params {
- u16 nelem_hint;
- u16 key_len;
- u16 key_offset;
- u16 head_offset;
- unsigned int max_size;
- u16 min_size;
- bool automatic_shrinking;
- u8 locks_mul;
- u32 nulls_base;
- rht_hashfn_t hashfn;
- rht_obj_hashfn_t obj_hashfn;
- rht_obj_cmpfn_t obj_cmpfn;
-};
-
-/**
- * struct rhashtable - Hash table handle
- * @tbl: Bucket table
- * @key_len: Key length for hashfn
- * @max_elems: Maximum number of elements in table
- * @p: Configuration parameters
- * @rhlist: True if this is an rhltable
- * @run_work: Deferred worker to expand/shrink asynchronously
- * @mutex: Mutex to protect current/future table swapping
- * @lock: Spin lock to protect walker list
- * @nelems: Number of elements in table
- */
-struct rhashtable {
- struct bucket_table __rcu *tbl;
- unsigned int key_len;
- unsigned int max_elems;
- struct rhashtable_params p;
- bool rhlist;
- struct work_struct run_work;
- struct mutex mutex;
- spinlock_t lock;
- atomic_t nelems;
-};
-
-/**
- * struct rhltable - Hash table with duplicate objects in a list
- * @ht: Underlying rhtable
- */
-struct rhltable {
- struct rhashtable ht;
-};
-
-/**
- * struct rhashtable_walker - Hash table walker
- * @list: List entry on list of walkers
- * @tbl: The table that we were walking over
- */
-struct rhashtable_walker {
- struct list_head list;
- struct bucket_table *tbl;
-};
-
-/**
- * struct rhashtable_iter - Hash table iterator
- * @ht: Table to iterate through
- * @p: Current pointer
- * @list: Current hash list pointer
- * @walker: Associated rhashtable walker
- * @slot: Current slot
- * @skip: Number of entries to skip in slot
- */
-struct rhashtable_iter {
- struct rhashtable *ht;
- struct rhash_head *p;
- struct rhlist_head *list;
- struct rhashtable_walker walker;
- unsigned int slot;
- unsigned int skip;
- bool end_of_table;
-};
-
static inline unsigned long rht_marker(const struct rhashtable *ht, u32 hash)
{
return NULLS_MARKER(ht->p.nulls_base + hash);
@@ -376,11 +258,6 @@ static inline int lockdep_rht_bucket_is_held(const struct bucket_table *tbl,
}
#endif /* CONFIG_PROVE_LOCKING */
-int rhashtable_init(struct rhashtable *ht,
- const struct rhashtable_params *params);
-int rhltable_init(struct rhltable *hlt,
- const struct rhashtable_params *params);
-
void *rhashtable_insert_slow(struct rhashtable *ht, const void *key,
struct rhash_head *obj);
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index ed07e3786d98..f4272a29dc44 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -2,7 +2,7 @@
#ifndef __NET_FRAG_H__
#define __NET_FRAG_H__
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
struct netns_frags {
/* sysctls */
diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 833752dd0c58..3bb75491482f 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -4,7 +4,7 @@
#include <linux/in.h>
#include <linux/in6.h>
#include <linux/netdevice.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
#include <linux/rcupdate.h>
#include <net/dst.h>
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index a0ec462bc1a9..e5ac430c8717 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -48,7 +48,7 @@
#define __sctp_structs_h__
#include <linux/ktime.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
#include <linux/socket.h> /* linux/in.h needs this!! */
#include <linux/in.h> /* We get struct sockaddr_in. */
#include <linux/in6.h> /* We get struct in6_addr */
diff --git a/include/net/seg6.h b/include/net/seg6.h
index 099bad59dc90..85ee569c8306 100644
--- a/include/net/seg6.h
+++ b/include/net/seg6.h
@@ -18,7 +18,7 @@
#include <linux/ipv6.h>
#include <net/lwtunnel.h>
#include <linux/seg6.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
static inline void update_csum_diff4(struct sk_buff *skb, __be32 from,
__be32 to)
diff --git a/include/net/seg6_hmac.h b/include/net/seg6_hmac.h
index 69c3a106056b..7fda469e2758 100644
--- a/include/net/seg6_hmac.h
+++ b/include/net/seg6_hmac.h
@@ -22,7 +22,7 @@
#include <linux/route.h>
#include <net/seg6.h>
#include <linux/seg6_hmac.h>
-#include <linux/rhashtable.h>
+#include <linux/rhashtable-types.h>
#define SEG6_HMAC_MAX_DIGESTSIZE 160
#define SEG6_HMAC_RING_SIZE 256
diff --git a/ipc/msg.c b/ipc/msg.c
index 56fd1c73eedc..8fa2cac6830a 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -38,6 +38,7 @@
#include <linux/rwsem.h>
#include <linux/nsproxy.h>
#include <linux/ipc_namespace.h>
+#include <linux/rhashtable.h>
#include <asm/current.h>
#include <linux/uaccess.h>
diff --git a/ipc/sem.c b/ipc/sem.c
index 06be75d9217a..f0c4a0ed91c7 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -84,6 +84,7 @@
#include <linux/nsproxy.h>
#include <linux/ipc_namespace.h>
#include <linux/sched/wake_q.h>
+#include <linux/rhashtable.h>
#include <linux/uaccess.h>
#include "util.h"
diff --git a/ipc/shm.c b/ipc/shm.c
index d73269381ec7..5824a7f3253e 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -43,6 +43,7 @@
#include <linux/nsproxy.h>
#include <linux/mount.h>
#include <linux/ipc_namespace.h>
+#include <linux/rhashtable.h>
#include <linux/uaccess.h>
diff --git a/ipc/util.c b/ipc/util.c
index 4e81182fa0ac..fdffff41f65b 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -63,6 +63,7 @@
#include <linux/rwsem.h>
#include <linux/memory.h>
#include <linux/ipc_namespace.h>
+#include <linux/rhashtable.h>
#include <asm/unistd.h>
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 9427b5766134..c9fafea7dc6e 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -28,6 +28,7 @@
#include <linux/rhashtable.h>
#include <linux/err.h>
#include <linux/export.h>
+#include <linux/rhashtable.h>
#define HASH_DEFAULT_SIZE 64UL
#define HASH_MIN_SIZE 4U
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index c9e35b81d093..316518f87294 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -20,6 +20,7 @@
#include <linux/skbuff.h>
#include <linux/rtnetlink.h>
#include <linux/slab.h>
+#include <linux/rhashtable.h>
#include <net/sock.h>
#include <net/inet_frag.h>
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 2fb4de3f7f66..adbc3d3a560b 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -60,6 +60,7 @@
#include <linux/netfilter_ipv4.h>
#include <linux/compat.h>
#include <linux/export.h>
+#include <linux/rhashtable.h>
#include <net/ip_tunnels.h>
#include <net/checksum.h>
#include <net/netlink.h>
diff --git a/net/ipv4/ipmr_base.c b/net/ipv4/ipmr_base.c
index 30221701614c..4f39b27b5084 100644
--- a/net/ipv4/ipmr_base.c
+++ b/net/ipv4/ipmr_base.c
@@ -2,6 +2,7 @@
* Common logic shared by IPv4 [ipmr] and IPv6 [ip6mr] implementation
*/
+#include <linux/rhashtable.h>
#include <linux/mroute_base.h>
/* Sets everything common except 'dev', since that is done under locking */
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 298fd8b6ed17..dfb339f79bde 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -32,6 +32,7 @@
#include <linux/seq_file.h>
#include <linux/init.h>
#include <linux/compat.h>
+#include <linux/rhashtable.h>
#include <net/protocol.h>
#include <linux/skbuff.h>
#include <net/raw.h>
diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
index 7f5621d09571..e24a91c7892a 100644
--- a/net/ipv6/seg6.c
+++ b/net/ipv6/seg6.c
@@ -17,6 +17,7 @@
#include <linux/net.h>
#include <linux/in6.h>
#include <linux/slab.h>
+#include <linux/rhashtable.h>
#include <net/ipv6.h>
#include <net/protocol.h>
diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
index 33fb35cbfac1..b1791129a875 100644
--- a/net/ipv6/seg6_hmac.c
+++ b/net/ipv6/seg6_hmac.c
@@ -22,6 +22,7 @@
#include <linux/icmpv6.h>
#include <linux/mroute6.h>
#include <linux/slab.h>
+#include <linux/rhashtable.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv6.h>
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 91e80aa852d6..85077c2f3379 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -14,6 +14,7 @@
#include <linux/skbuff.h>
#include <linux/netlink.h>
#include <linux/vmalloc.h>
+#include <linux/rhashtable.h>
#include <linux/netfilter.h>
#include <linux/netfilter/nfnetlink.h>
#include <linux/netfilter/nf_tables.h>
diff --git a/net/sctp/input.c b/net/sctp/input.c
index ba8a6e6c36fa..9bbc5f92c941 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -56,6 +56,7 @@
#include <net/sctp/sm.h>
#include <net/sctp/checksum.h>
#include <net/net_namespace.h>
+#include <linux/rhashtable.h>
/* Forward declarations for internal helpers. */
static int sctp_rcv_ootb(struct sk_buff *);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index ae7e7c606f72..0adce7b22675 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -66,6 +66,7 @@
#include <linux/slab.h>
#include <linux/file.h>
#include <linux/compat.h>
+#include <linux/rhashtable.h>
#include <net/ip.h>
#include <net/icmp.h>
^ permalink raw reply related
* [PATCH 03/18] rhashtable: remove nulls_base and related code.
From: NeilBrown @ 2018-06-01 4:44 UTC (permalink / raw)
To: Thomas Graf, Herbert Xu; +Cc: netdev, linux-kernel
In-Reply-To: <152782754287.30340.4395718227884933670.stgit@noble>
This "feature" is unused, undocumented, and untested and so
doesn't really belong. Next patch will introduce support
to detect when a search gets diverted down a different chain,
which is the common purpose of nulls markers.
This patch actually fixes a bug too. The table resizing allows a
table to grow to 2^31 buckets, but the hash is truncated to 27 bits -
any growth beyond 2^27 is wasteful and ineffective.
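To make the arithmetic concrete, here is a small user-space sketch (not
kernel code) of the bucket index computation before and after this patch.
The names old_bucket_index() and new_bucket_index() are made up for
illustration; the shift of 5 corresponds to the removed
RHT_HASH_RESERVED_SPACE (RHT_BASE_BITS + 1).

```c
#include <assert.h>
#include <stdint.h>

/* RHT_BASE_BITS (4) plus one nulls bit, as in the code this patch removes. */
#define OLD_HASH_RESERVED_SPACE 5u

/* Bucket index as computed before this patch: only 27 hash bits survive. */
static uint32_t old_bucket_index(uint32_t hash, uint32_t size)
{
	assert(size && !(size & (size - 1)));	/* table sizes are powers of two */
	return (hash >> OLD_HASH_RESERVED_SPACE) & (size - 1);
}

/* Bucket index as computed after this patch: all hash bits are usable. */
static uint32_t new_bucket_index(uint32_t hash, uint32_t size)
{
	assert(size && !(size & (size - 1)));
	return hash & (size - 1);
}
```

With a table of 2^31 buckets, the old computation can never produce an
index above 2^27 - 1, so the upper buckets would stay empty.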
This patch results in NULLS_MARKER(0) being used for all chains,
and leaves the use of rht_is_a_nulls() to test for it.
Signed-off-by: NeilBrown <neilb@suse.com>
---
include/linux/rhashtable-types.h | 2 --
include/linux/rhashtable.h | 33 +++------------------------------
lib/rhashtable.c | 8 --------
lib/test_rhashtable.c | 5 +----
4 files changed, 4 insertions(+), 44 deletions(-)
diff --git a/include/linux/rhashtable-types.h b/include/linux/rhashtable-types.h
index 9740063ff13b..763d613ce2c2 100644
--- a/include/linux/rhashtable-types.h
+++ b/include/linux/rhashtable-types.h
@@ -50,7 +50,6 @@ typedef int (*rht_obj_cmpfn_t)(struct rhashtable_compare_arg *arg,
* @min_size: Minimum size while shrinking
* @locks_mul: Number of bucket locks to allocate per cpu (default: 32)
* @automatic_shrinking: Enable automatic shrinking of tables
- * @nulls_base: Base value to generate nulls marker
* @hashfn: Hash function (default: jhash2 if !(key_len % 4), or jhash)
* @obj_hashfn: Function to hash object
* @obj_cmpfn: Function to compare key with object
@@ -64,7 +63,6 @@ struct rhashtable_params {
u16 min_size;
bool automatic_shrinking;
u8 locks_mul;
- u32 nulls_base;
rht_hashfn_t hashfn;
rht_obj_hashfn_t obj_hashfn;
rht_obj_cmpfn_t obj_cmpfn;
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 48754ab07cdf..d9f719af7936 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -28,25 +28,8 @@
#include <linux/rhashtable-types.h>
/*
* The end of the chain is marked with a special nulls marks which has
- * the following format:
- *
- * +-------+-----------------------------------------------------+-+
- * | Base | Hash |1|
- * +-------+-----------------------------------------------------+-+
- *
- * Base (4 bits) : Reserved to distinguish between multiple tables.
- * Specified via &struct rhashtable_params.nulls_base.
- * Hash (27 bits): Full hash (unmasked) of first element added to bucket
- * 1 (1 bit) : Nulls marker (always set)
- *
- * The remaining bits of the next pointer remain unused for now.
+ * the least significant bit set.
*/
-#define RHT_BASE_BITS 4
-#define RHT_HASH_BITS 27
-#define RHT_BASE_SHIFT RHT_HASH_BITS
-
-/* Base bits plus 1 bit for nulls marker */
-#define RHT_HASH_RESERVED_SPACE (RHT_BASE_BITS + 1)
/* Maximum chain length before rehash
*
@@ -92,24 +75,14 @@ struct bucket_table {
struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
};
-static inline unsigned long rht_marker(const struct rhashtable *ht, u32 hash)
-{
- return NULLS_MARKER(ht->p.nulls_base + hash);
-}
-
#define INIT_RHT_NULLS_HEAD(ptr, ht, hash) \
- ((ptr) = (typeof(ptr)) rht_marker(ht, hash))
+ ((ptr) = (typeof(ptr)) NULLS_MARKER(0))
static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
{
return ((unsigned long) ptr & 1);
}
-static inline unsigned long rht_get_nulls_value(const struct rhash_head *ptr)
-{
- return ((unsigned long) ptr) >> 1;
-}
-
static inline void *rht_obj(const struct rhashtable *ht,
const struct rhash_head *he)
{
@@ -119,7 +92,7 @@ static inline void *rht_obj(const struct rhashtable *ht,
static inline unsigned int rht_bucket_index(const struct bucket_table *tbl,
unsigned int hash)
{
- return (hash >> RHT_HASH_RESERVED_SPACE) & (tbl->size - 1);
+ return hash & (tbl->size - 1);
}
static inline unsigned int rht_key_get_hash(struct rhashtable *ht,
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index c9fafea7dc6e..688693c919be 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -995,7 +995,6 @@ static u32 rhashtable_jhash2(const void *key, u32 length, u32 seed)
* .key_offset = offsetof(struct test_obj, key),
* .key_len = sizeof(int),
* .hashfn = jhash,
- * .nulls_base = (1U << RHT_BASE_SHIFT),
* };
*
* Configuration Example 2: Variable length keys
@@ -1029,9 +1028,6 @@ int rhashtable_init(struct rhashtable *ht,
(params->obj_hashfn && !params->obj_cmpfn))
return -EINVAL;
- if (params->nulls_base && params->nulls_base < (1U << RHT_BASE_SHIFT))
- return -EINVAL;
-
memset(ht, 0, sizeof(*ht));
mutex_init(&ht->mutex);
spin_lock_init(&ht->lock);
@@ -1096,10 +1092,6 @@ int rhltable_init(struct rhltable *hlt, const struct rhashtable_params *params)
{
int err;
- /* No rhlist NULLs marking for now. */
- if (params->nulls_base)
- return -EINVAL;
-
err = rhashtable_init(&hlt->ht, params);
hlt->ht.rhlist = true;
return err;
diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index bf92b7aa2a49..b428a9c7522a 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -83,7 +83,7 @@ static u32 my_hashfn(const void *data, u32 len, u32 seed)
{
const struct test_obj_rhl *obj = data;
- return (obj->value.id % 10) << RHT_HASH_RESERVED_SPACE;
+ return (obj->value.id % 10);
}
static int my_cmpfn(struct rhashtable_compare_arg *arg, const void *obj)
@@ -99,7 +99,6 @@ static struct rhashtable_params test_rht_params = {
.key_offset = offsetof(struct test_obj, value),
.key_len = sizeof(struct test_obj_val),
.hashfn = jhash,
- .nulls_base = (3U << RHT_BASE_SHIFT),
};
static struct rhashtable_params test_rht_params_dup = {
@@ -294,8 +293,6 @@ static int __init test_rhltable(unsigned int entries)
if (!obj_in_table)
goto out_free;
- /* nulls_base not supported in rhlist interface */
- test_rht_params.nulls_base = 0;
err = rhltable_init(&rhlt, &test_rht_params);
if (WARN_ON(err))
goto out_free;
^ permalink raw reply related
* [PATCH 04/18] rhashtable: detect when object movement might have invalidated a lookup
From: NeilBrown @ 2018-06-01 4:44 UTC (permalink / raw)
To: Thomas Graf, Herbert Xu; +Cc: netdev, linux-kernel
In-Reply-To: <152782754287.30340.4395718227884933670.stgit@noble>
Some users of rhashtable might need to change the key
of an object and move it to a different location in the table.
Other users might want to allocate objects using
SLAB_TYPESAFE_BY_RCU which can result in the same memory allocation
being used for a different (type-compatible) purpose and similarly
end up in a different hash-chain.
To support these, we store a unique NULLS_MARKER at the end of
each chain, and when a search fails to find a match, we check
if the NULLS marker found was the expected one. If not,
the search is repeated.
The unique NULLS_MARKER is derived from the address of the
head of the chain.
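As a rough illustration of the idea, the following single-threaded
user-space sketch (not the kernel implementation) terminates each chain
with a marker built from the address of its own head and retries a lookup
that ends on some other chain's marker. struct node, CHAIN_MARKER() and
the chain_*() helpers are hypothetical names; the sketch also assumes
that object addresses have bit 0 clear and that an odd integer can be
stored in a pointer, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node {
	struct node *next;
	int key;
};

/* Same value as RHT_NULLS_MARKER(): the head's address with bit 0 set. */
#define CHAIN_MARKER(headp) ((struct node *)((uintptr_t)(headp) | 1))

static int is_a_nulls(const struct node *p)
{
	return (uintptr_t)p & 1;
}

static void chain_init(struct node **head)
{
	*head = CHAIN_MARKER(head);
}

static void chain_insert_head(struct node **head, struct node *obj)
{
	assert(!is_a_nulls(obj));	/* real objects must have bit 0 clear */
	obj->next = *head;
	*head = obj;
}

/* Follow ->next from @n until the terminating marker and return it. */
static struct node *chain_end(struct node *n)
{
	while (!is_a_nulls(n))
		n = n->next;
	return n;
}

static struct node *chain_lookup(struct node **head, int key)
{
	struct node *n;

	do {
		for (n = *head; !is_a_nulls(n); n = n->next)
			if (n->key == key)
				return n;
		/* A foreign marker means the walk was diverted: retry. */
	} while (n != CHAIN_MARKER(head));
	return NULL;
}
```

A walk that starts in one chain and is diverted into another ends on a
marker that differs from CHAIN_MARKER() of the chain it started in, which
is what triggers the retry.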
If an object is removed and re-added to the same hash chain, we won't
notice by looking at the NULLS marker. In this case we must be sure
that it was not re-added *after* its original location, or a lookup may
incorrectly fail. The easiest solution is to ensure it is inserted at
the start of the chain. insert_slow() already does that,
insert_fast() does not. So this patch changes insert_fast to always
insert at the head of the chain.
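The following user-space sketch (single-threaded, with made-up helper
names) simulates a reader that has already examined object X and is about
to follow X->next while a writer removes X and re-adds it to the same
chain. It is only meant to show why the re-insertion position matters; it
assumes the same address-based chain marker as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node {
	struct node *next;
	int key;
};

/* Chain terminator: the head's own address with bit 0 set. */
#define CHAIN_MARKER(headp) ((struct node *)((uintptr_t)(headp) | 1))

static int is_a_nulls(const struct node *p)
{
	return (uintptr_t)p & 1;
}

static void insert_head(struct node **head, struct node *obj)
{
	assert(!is_a_nulls(obj));	/* real objects must have bit 0 clear */
	obj->next = *head;
	*head = obj;
}

static void insert_tail(struct node **head, struct node *obj)
{
	struct node **pprev = head;

	while (!is_a_nulls(*pprev))
		pprev = &(*pprev)->next;
	obj->next = *pprev;	/* keep the chain's marker at the end */
	*pprev = obj;
}

/* Precondition: @obj is currently linked into the chain at @head. */
static void unlink_obj(struct node **head, struct node *obj)
{
	struct node **pprev = head;

	while (*pprev != obj)
		pprev = &(*pprev)->next;
	*pprev = obj->next;
}

/* Continue a search that has already examined @from. */
static struct node *resume_lookup(struct node *from, int key)
{
	struct node *n;

	for (n = from->next; !is_a_nulls(n); n = n->next)
		if (n->key == key)
			return n;
	return NULL;
}
```

With the chain X -> T -> Z, moving X to the tail makes the resumed search
skip T and Z and still end on the expected marker, so the miss goes
unnoticed. Moving X to the head makes the resumed search walk the whole
chain again.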
Note that such a user must do their own double-checking of
the object found by rhashtable_lookup_fast() after ensuring
mutual exclusion with anything that might change the key, such as
successfully taking a new reference.
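One possible shape of that double-check, as a user-space sketch using C11
atomics rather than kernel primitives: struct obj, obj_get_unless_zero(),
obj_put() and obj_validate() are hypothetical names, and the sketch
assumes that holding a reference is what prevents the key from changing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
	atomic_int refs;	/* 0 means the object is being freed or reused */
	int key;
};

/* Take a reference only if the object is still live. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old > 0)
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	return false;
}

static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);
}

/*
 * @found is whatever the lockless lookup returned. Pin it first, then
 * re-check the key now that it can no longer change under us.
 */
static struct obj *obj_validate(struct obj *found, int key)
{
	if (!found || !obj_get_unless_zero(found))
		return NULL;
	if (found->key != key) {
		obj_put(found);
		return NULL;
	}
	return found;
}
```

A caller that gets NULL back from obj_validate() would typically repeat
the lookup.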
Signed-off-by: NeilBrown <neilb@suse.com>
---
include/linux/rhashtable.h | 35 +++++++++++++++++++++++------------
lib/rhashtable.c | 8 +++++---
2 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index d9f719af7936..25d839881ae5 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -75,8 +75,10 @@ struct bucket_table {
struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
};
+#define RHT_NULLS_MARKER(ptr) \
+ ((void *)NULLS_MARKER(((unsigned long) (ptr)) >> 1))
#define INIT_RHT_NULLS_HEAD(ptr, ht, hash) \
- ((ptr) = (typeof(ptr)) NULLS_MARKER(0))
+ ((ptr) = RHT_NULLS_MARKER(&(ptr)))
static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
{
@@ -471,6 +473,7 @@ static inline struct rhash_head *__rhashtable_lookup(
.ht = ht,
.key = key,
};
+ struct rhash_head __rcu * const *head;
struct bucket_table *tbl;
struct rhash_head *he;
unsigned int hash;
@@ -478,13 +481,19 @@ static inline struct rhash_head *__rhashtable_lookup(
tbl = rht_dereference_rcu(ht->tbl, ht);
restart:
hash = rht_key_hashfn(ht, tbl, key, params);
- rht_for_each_rcu(he, tbl, hash) {
- if (params.obj_cmpfn ?
- params.obj_cmpfn(&arg, rht_obj(ht, he)) :
- rhashtable_compare(&arg, rht_obj(ht, he)))
- continue;
- return he;
- }
+ head = rht_bucket(tbl, hash);
+ do {
+ rht_for_each_rcu_continue(he, *head, tbl, hash) {
+ if (params.obj_cmpfn ?
+ params.obj_cmpfn(&arg, rht_obj(ht, he)) :
+ rhashtable_compare(&arg, rht_obj(ht, he)))
+ continue;
+ return he;
+ }
+ /* An object might have been moved to a different hash chain,
+ * while we walk along it - better check and retry.
+ */
+ } while (he != RHT_NULLS_MARKER(head));
/* Ensure we see any new tables. */
smp_rmb();
@@ -580,6 +589,7 @@ static inline void *__rhashtable_insert_fast(
.ht = ht,
.key = key,
};
+ struct rhash_head __rcu **headp;
struct rhash_head __rcu **pprev;
struct bucket_table *tbl;
struct rhash_head *head;
@@ -603,12 +613,13 @@ static inline void *__rhashtable_insert_fast(
}
elasticity = RHT_ELASTICITY;
- pprev = rht_bucket_insert(ht, tbl, hash);
+ headp = rht_bucket_insert(ht, tbl, hash);
+ pprev = headp;
data = ERR_PTR(-ENOMEM);
if (!pprev)
goto out;
- rht_for_each_continue(head, *pprev, tbl, hash) {
+ rht_for_each_continue(head, *headp, tbl, hash) {
struct rhlist_head *plist;
struct rhlist_head *list;
@@ -648,7 +659,7 @@ static inline void *__rhashtable_insert_fast(
if (unlikely(rht_grow_above_100(ht, tbl)))
goto slow_path;
- head = rht_dereference_bucket(*pprev, tbl, hash);
+ head = rht_dereference_bucket(*headp, tbl, hash);
RCU_INIT_POINTER(obj->next, head);
if (rhlist) {
@@ -658,7 +669,7 @@ static inline void *__rhashtable_insert_fast(
RCU_INIT_POINTER(list->next, NULL);
}
- rcu_assign_pointer(*pprev, obj);
+ rcu_assign_pointer(*headp, obj);
atomic_inc(&ht->nelems);
if (rht_grow_above_75(ht, tbl))
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 688693c919be..69f05cf9e9e8 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1174,8 +1174,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
unsigned int hash)
{
const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
- static struct rhash_head __rcu *rhnull =
- (struct rhash_head __rcu *)NULLS_MARKER(0);
+ static struct rhash_head __rcu *rhnull;
unsigned int index = hash & ((1 << tbl->nest) - 1);
unsigned int size = tbl->size >> tbl->nest;
unsigned int subhash = hash;
@@ -1193,8 +1192,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
subhash >>= shift;
}
- if (!ntbl)
+ if (!ntbl) {
+ if (!rhnull)
+ INIT_RHT_NULLS_HEAD(rhnull, NULL, 0);
return &rhnull;
+ }
return &ntbl[subhash].bucket;
^ permalink raw reply related