* [PATCH 00/18] xfrm: Add compat layer
@ 2018-07-26 2:31 Dmitry Safonov
2018-07-26 2:31 ` [PATCH 07/18] netlink: Pass groups pointer to .bind() Dmitry Safonov
2018-07-26 8:49 ` [PATCH 00/18] xfrm: Add compat layer Florian Westphal
0 siblings, 2 replies; 13+ messages in thread
From: Dmitry Safonov @ 2018-07-26 2:31 UTC (permalink / raw)
To: linux-kernel
Cc: Dmitry Safonov, David S. Miller, Herbert Xu, Steffen Klassert,
Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel,
H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov,
Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86,
linux-efi, Andrew Morton, Greg Kroah-Hartman, Mauro
Due to some historical mistake, xfrm User ABI differ between native and
compatible applications. The difference is in structures paddings and in
the result in the size of netlink messages.
As it's already visible ABI, it cannot be adjusted by packing structures.
Possibility for compatible application to manage xfrm tunnels was
disabled by: the commmit 19d7df69fdb2 ("xfrm: Refuse to insert 32 bit
userspace socket policies on 64 bit systems") and the commit 74005991b78a
("xfrm: Do not parse 32bits compiled xfrm netlink msg on 64bits host").
By some wonderful reasons and brilliant architecture decisions for
creating userspace, on Arista switches we still use 32-bit userspace
with 64-bit kernel. There is slow movement to full 64-bit build, but
it's not yet here. As the switches need support for ipsec tunnels, the
local kernel has reverted mentioned patches that disable xfrm for
compat apps. On the top of that there is a bunch of disgraceful hacks
in userspace to work around the size check for netlink messages
and all that jazz.
It looks like, we're not the only desirable users of compatible xfrm,
there were a couple of attempts to make it work:
https://lkml.org/lkml/2017/1/20/733
https://patchwork.ozlabs.org/patch/44600/
http://netdev.vger.kernel.narkive.com/2Gesykj6/patch-net-next-xfrm-correctly-parse-netlink-msg-from-32bits-ip-command-on-64bits-host
All the discussions end in the conclusion that xfrm should have a full
compatible layer to correctly work with 32-bit applications on 64-bit
kernels:
https://lkml.org/lkml/2017/1/23/413
https://patchwork.ozlabs.org/patch/433279/
In some recent lkml discussion, Linus said that it's worth to fix this
problem and not giving people an excuse to stay on 32-bit kernel:
https://lkml.org/lkml/2018/2/13/752
So, here I add a compatible layer to xfrm.
As xfrm uses netlink notifications, kernel should send them in ABI
format that an application will parse. The proposed solution is
to save the ABI of bind() syscall. The realization detail is
to create kernel-hidden, non visible to userspace netlink groups
for compat applications.
The first two patches simplify ifdeffery, and while I've already submitted
them a while ago, I'm resending them for completeness:
https://lore.kernel.org/lkml/20180717005004.25984-1-dima@arista.com/T/#u
There is also an exhaustive selftest for ipsec tunnels and to check
that kernel parses correctly the structures those differ in size.
It doesn't depend on any library and compat version can be easy
build with: make CFLAGS=-m32 net/ipsec
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: netdev@vger.kernel.org
Dmitry Safonov (18):
x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
compat: Cleanup in_compat_syscall() callers
selftest/net/xfrm: Add test for ipsec tunnel
net/xfrm: Add _packed types for compat users
net/xfrm: Parse userspi_info{,_packed} depending on syscall
netlink: Do not subscribe to non-existent groups
netlink: Pass groups pointer to .bind()
xfrm: Add in-kernel groups for compat notifications
xfrm: Dump usersa_info in compat/native formats
xfrm: Send state notifications in compat format too
xfrm: Add compat support for xfrm_user_expire messages
xfrm: Add compat support for xfrm_userpolicy_info messages
xfrm: Add compat support for xfrm_user_acquire messages
xfrm: Add compat support for xfrm_user_polexpire messages
xfrm: Check compat acquire listeners in xfrm_is_alive()
xfrm: Notify compat listeners about policy flush
xfrm: Notify compat listeners about state flush
xfrm: Enable compat syscalls
MAINTAINERS | 1 +
arch/x86/include/asm/compat.h | 9 +-
arch/x86/include/asm/ftrace.h | 4 +-
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/sys_x86_64.c | 11 +-
arch/x86/mm/hugetlbpage.c | 4 +-
arch/x86/mm/mmap.c | 2 +-
drivers/firmware/efi/efivars.c | 16 +-
include/linux/compat.h | 4 +-
include/linux/netlink.h | 2 +-
include/net/xfrm.h | 14 -
kernel/audit.c | 2 +-
kernel/time/time.c | 2 +-
net/core/rtnetlink.c | 14 +-
net/core/sock_diag.c | 25 +-
net/netfilter/nfnetlink.c | 24 +-
net/netlink/af_netlink.c | 28 +-
net/netlink/af_netlink.h | 4 +-
net/netlink/genetlink.c | 26 +-
net/xfrm/xfrm_state.c | 5 -
net/xfrm/xfrm_user.c | 690 ++++++++---
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/ipsec.c | 1987 ++++++++++++++++++++++++++++++++
24 files changed, 2612 insertions(+), 268 deletions(-)
create mode 100644 tools/testing/selftests/net/ipsec.c
--
2.13.6
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH 07/18] netlink: Pass groups pointer to .bind() 2018-07-26 2:31 [PATCH 00/18] xfrm: Add compat layer Dmitry Safonov @ 2018-07-26 2:31 ` Dmitry Safonov 2018-07-26 8:49 ` [PATCH 00/18] xfrm: Add compat layer Florian Westphal 1 sibling, 0 replies; 13+ messages in thread From: Dmitry Safonov @ 2018-07-26 2:31 UTC (permalink / raw) To: linux-kernel Cc: Dmitry Safonov, David S. Miller, Herbert Xu, Steffen Klassert, Dmitry Safonov, netdev, Eric Paris, Florian Westphal, Jozsef Kadlecsik, Pablo Neira Ayuso, Paul Moore, coreteam, linux-audit, netfilter-devel Netlink messages sent by xfrm differ in size between 64-bit native and 32-bit compatible applications. To know which UABI to use to send the message from kernel, I'll use the type of bind() syscall. Xfrm will have hidden from userspace kernel-only groups for compatible applications. So, add pointer to groups to netlink_bind(). With later patches xfrm will set a proper compat group for netlink socket during bind(). Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Paris <eparis@redhat.com> Cc: Florian Westphal <fw@strlen.de> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Paul Moore <paul@paul-moore.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: coreteam@netfilter.org Cc: linux-audit@redhat.com Cc: netdev@vger.kernel.org Cc: netfilter-devel@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> --- include/linux/netlink.h | 2 +- kernel/audit.c | 2 +- net/core/rtnetlink.c | 14 ++++++-------- net/core/sock_diag.c | 25 ++++++++++++------------- net/netfilter/nfnetlink.c | 24 ++++++++++++++---------- net/netlink/af_netlink.c | 27 ++++++++++----------------- net/netlink/af_netlink.h | 4 ++-- net/netlink/genetlink.c | 26 ++++++++++++++++++-------- 8 files changed, 64 insertions(+), 60 deletions(-) diff --git a/include/linux/netlink.h b/include/linux/netlink.h index f3075d6c7e82..19202648e04a 100644 --- a/include/linux/netlink.h +++ b/include/linux/netlink.h @@ -46,7 +46,7 @@ struct netlink_kernel_cfg { unsigned int flags; void (*input)(struct sk_buff *skb); struct mutex *cb_mutex; - int (*bind)(struct net *net, int group); + int (*bind)(struct net *net, unsigned long *groups); void (*unbind)(struct net *net, int group); bool (*compare)(struct net *net, struct sock *sk); }; diff --git a/kernel/audit.c b/kernel/audit.c index e7478cb58079..87ca0214bcf2 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1523,7 +1523,7 @@ static void audit_receive(struct sk_buff *skb) } /* Run custom bind function on netlink socket group connect or bind requests. */ -static int audit_bind(struct net *net, int group) +static int audit_bind(struct net *net, unsigned long *groups) { if (!capable(CAP_AUDIT_READ)) return -EPERM; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index e3f743c141b3..0465e692ae32 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -4683,15 +4683,13 @@ static void rtnetlink_rcv(struct sk_buff *skb) netlink_rcv_skb(skb, &rtnetlink_rcv_msg); } -static int rtnetlink_bind(struct net *net, int group) +static int rtnetlink_bind(struct net *net, unsigned long *groups) { - switch (group) { - case RTNLGRP_IPV4_MROUTE_R: - case RTNLGRP_IPV6_MROUTE_R: - if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) - return -EPERM; - break; - } + unsigned long mroute_r; + + mroute_r = 1UL << RTNLGRP_IPV4_MROUTE_R | 1UL << RTNLGRP_IPV6_MROUTE_R; + if ((*groups & mroute_r) && !ns_capable(net->user_ns, CAP_NET_ADMIN)) + return -EPERM; return 0; } diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c index c37b5be7c5e4..befa6759f2ad 100644 --- a/net/core/sock_diag.c +++ b/net/core/sock_diag.c @@ -273,20 +273,19 @@ static void sock_diag_rcv(struct sk_buff *skb) mutex_unlock(&sock_diag_mutex); } -static int sock_diag_bind(struct net *net, int group) +static int sock_diag_bind(struct net *net, unsigned long *groups) { - switch (group) { - case SKNLGRP_INET_TCP_DESTROY: - case SKNLGRP_INET_UDP_DESTROY: - if (!sock_diag_handlers[AF_INET]) - sock_load_diag_module(AF_INET, 0); - break; - case SKNLGRP_INET6_TCP_DESTROY: - case SKNLGRP_INET6_UDP_DESTROY: - if (!sock_diag_handlers[AF_INET6]) - sock_load_diag_module(AF_INET6, 0); - break; - } + unsigned long inet_mask, inet6_mask; + + inet_mask = 1UL << SKNLGRP_INET_TCP_DESTROY; + inet_mask |= 1UL << SKNLGRP_INET_UDP_DESTROY; + inet6_mask = 1UL << SKNLGRP_INET6_TCP_DESTROY; + inet6_mask |= 1UL << SKNLGRP_INET6_UDP_DESTROY; + + if ((*groups & inet_mask) && !sock_diag_handlers[AF_INET]) + sock_load_diag_module(AF_INET, 0); + if ((*groups & inet6_mask) && !sock_diag_handlers[AF_INET6]) + sock_load_diag_module(AF_INET6, 0); return 0; } diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c index e1b6be29848d..6a8893df5285 100644 --- a/net/netfilter/nfnetlink.c +++ b/net/netfilter/nfnetlink.c @@ -556,21 +556,25 @@ static void nfnetlink_rcv(struct sk_buff *skb) } #ifdef CONFIG_MODULES -static int nfnetlink_bind(struct net *net, int group) +static int nfnetlink_bind(struct net *net, unsigned long *groups) { const struct nfnetlink_subsystem *ss; - int type; + unsigned long _groups = *groups; + int type, group_bit, group = -1; - if (group <= NFNLGRP_NONE || group > NFNLGRP_MAX) - return 0; + while ((group_bit = __builtin_ffsl(_groups))) { + group += group_bit; - type = nfnl_group2type[group]; + type = nfnl_group2type[group]; + rcu_read_lock(); + ss = nfnetlink_get_subsys(type << 8); + rcu_read_unlock(); + if (!ss) + request_module("nfnetlink-subsys-%d", type); + + _groups >>= group_bit; + } - rcu_read_lock(); - ss = nfnetlink_get_subsys(type << 8); - rcu_read_unlock(); - if (!ss) - request_module("nfnetlink-subsys-%d", type); return 0; } #endif diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index ac805caed2e2..1e11e706c683 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -668,7 +668,7 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol, struct module *module = NULL; struct mutex *cb_mutex; struct netlink_sock *nlk; - int (*bind)(struct net *net, int group); + int (*bind)(struct net *net, unsigned long *groups); void (*unbind)(struct net *net, int group); int err = 0; @@ -969,8 +969,7 @@ static int netlink_realloc_groups(struct sock *sk) return err; } -static void netlink_undo_bind(int group, long unsigned int groups, - struct sock *sk) +static void netlink_undo_bind(unsigned long groups, struct sock *sk) { struct netlink_sock *nlk = nlk_sk(sk); int undo; @@ -978,7 +977,7 @@ static void netlink_undo_bind(int group, long unsigned int groups, if (!nlk->netlink_unbind) return; - for (undo = 0; undo < group; undo++) + for (undo = 0; undo < nlk->ngroups; undo++) if (test_bit(undo, &groups)) nlk->netlink_unbind(sock_net(sk), undo + 1); } @@ -991,7 +990,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, struct netlink_sock *nlk = nlk_sk(sk); struct sockaddr_nl *nladdr = (struct sockaddr_nl *)addr; int err = 0; - long unsigned int groups = nladdr->nl_groups; + unsigned long groups = nladdr->nl_groups; bool bound; if (addr_len < sizeof(struct sockaddr_nl)) @@ -1021,17 +1020,9 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, netlink_lock_table(); if (nlk->netlink_bind && groups) { - int group; - - for (group = 0; group < nlk->ngroups; group++) { - if (!test_bit(group, &groups)) - continue; - err = nlk->netlink_bind(net, group + 1); - if (!err) - continue; - netlink_undo_bind(group, groups, sk); + err = nlk->netlink_bind(net, &groups); + if (err) goto unlock; - } } /* No need for barriers here as we return to user-space without @@ -1042,7 +1033,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, netlink_insert(sk, nladdr->nl_pid) : netlink_autobind(sock); if (err) { - netlink_undo_bind(nlk->ngroups, groups, sk); + netlink_undo_bind(groups, sk); goto unlock; } } @@ -1652,7 +1643,9 @@ static int netlink_setsockopt(struct socket *sock, int level, int optname, if (!val || val - 1 >= nlk->ngroups) return -EINVAL; if (optname == NETLINK_ADD_MEMBERSHIP && nlk->netlink_bind) { - err = nlk->netlink_bind(sock_net(sk), val); + unsigned long groups = 1UL << val; + + err = nlk->netlink_bind(sock_net(sk), &groups); if (err) return err; } diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h index 962de7b3c023..e765172abbb7 100644 --- a/net/netlink/af_netlink.h +++ b/net/netlink/af_netlink.h @@ -39,7 +39,7 @@ struct netlink_sock { struct mutex *cb_mutex; struct mutex cb_def_mutex; void (*netlink_rcv)(struct sk_buff *skb); - int (*netlink_bind)(struct net *net, int group); + int (*netlink_bind)(struct net *net, unsigned long *groups); void (*netlink_unbind)(struct net *net, int group); struct module *module; @@ -61,7 +61,7 @@ struct netlink_table { unsigned int groups; struct mutex *cb_mutex; struct module *module; - int (*bind)(struct net *net, int group); + int (*bind)(struct net *net, unsigned long *groups); void (*unbind)(struct net *net, int group); bool (*compare)(struct net *net, struct sock *sock); int registered; diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 25eeb6d2a75a..a86b105730cf 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -960,28 +960,38 @@ static struct genl_family genl_ctrl __ro_after_init = { .netnsok = true, }; -static int genl_bind(struct net *net, int group) +static int genl_bind(struct net *net, unsigned long *groups) { + unsigned long mcgrps; struct genl_family *f; - int err = -ENOENT; + int err = 0; unsigned int id; down_read(&cb_lock); idr_for_each_entry(&genl_fam_idr, f, id) { - if (group >= f->mcgrp_offset && - group < f->mcgrp_offset + f->n_mcgrps) { - int fam_grp = group - f->mcgrp_offset; + int fam_grp_bit, fam_grp = -1; + + mcgrps = (1UL << f->n_mcgrps) - 1; + mcgrps <<= f->mcgrp_offset; + mcgrps &= *groups; + + if (!mcgrps) + continue; + + while ((fam_grp_bit = __builtin_ffsl(mcgrps))) { + fam_grp += fam_grp_bit; if (!f->netnsok && net != &init_net) err = -ENOENT; else if (f->mcast_bind) err = f->mcast_bind(net, fam_grp); - else - err = 0; - break; + + if (err) + goto out; } } +out: up_read(&cb_lock); return err; -- 2.13.6 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-26 2:31 [PATCH 00/18] xfrm: Add compat layer Dmitry Safonov 2018-07-26 2:31 ` [PATCH 07/18] netlink: Pass groups pointer to .bind() Dmitry Safonov @ 2018-07-26 8:49 ` Florian Westphal 2018-07-27 7:37 ` Steffen Klassert 1 sibling, 1 reply; 13+ messages in thread From: Florian Westphal @ 2018-07-26 8:49 UTC (permalink / raw) To: Dmitry Safonov Cc: linux-kernel, David S. Miller, Herbert Xu, Steffen Klassert, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman, Mauro Dmitry Safonov <dima@arista.com> wrote: > So, here I add a compatible layer to xfrm. > As xfrm uses netlink notifications, kernel should send them in ABI > format that an application will parse. The proposed solution is > to save the ABI of bind() syscall. The realization detail is > to create kernel-hidden, non visible to userspace netlink groups > for compat applications. Why not use exisiting netlink support? Just add the 32bit skb to skb64->frag_list and let netlink find if tasks needs 64 or 32 one. It only needs this small fix to properly signal the end of a dump: https://marc.info/?l=linux-netdev&m=126625240303351&w=2 I had started a second attempt to make xfrm compat work, but its still in early stage. One link that might still have some value: https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=xfrm_config_compat_07&id=f64430e6d9e297f3990f485a4832e273751b9869 (compat structure definitions with BUILD_BUG_ON checking) My plan was to make xfrm compat work strictly as shrinker (64->32) and expander (32->64), i.e. no/little changes to exisiting code and pass all "expanded" skbs through existing xfrm rcv functions. Example to illustrate idea: https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=xfrm_config_compat_07&id=c622f067849b02170127b69471cb3481e4bc9e49 ... its supposed to take 64bit skb and create a 32bit one from it. Just for reference; I currently don't plan to work on this again. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-26 8:49 ` [PATCH 00/18] xfrm: Add compat layer Florian Westphal @ 2018-07-27 7:37 ` Steffen Klassert 2018-07-27 14:02 ` Dmitry Safonov 0 siblings, 1 reply; 13+ messages in thread From: Steffen Klassert @ 2018-07-27 7:37 UTC (permalink / raw) To: Florian Westphal Cc: Dmitry Safonov, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman, Mauro Carvalho Chehab <mcheh On Thu, Jul 26, 2018 at 10:49:59AM +0200, Florian Westphal wrote: > Dmitry Safonov <dima@arista.com> wrote: > > So, here I add a compatible layer to xfrm. > > As xfrm uses netlink notifications, kernel should send them in ABI > > format that an application will parse. The proposed solution is > > to save the ABI of bind() syscall. The realization detail is > > to create kernel-hidden, non visible to userspace netlink groups > > for compat applications. > > Why not use exisiting netlink support? > Just add the 32bit skb to skb64->frag_list and let > netlink find if tasks needs 64 or 32 one. > > It only needs this small fix to properly signal the end of a dump: > https://marc.info/?l=linux-netdev&m=126625240303351&w=2 > > I had started a second attempt to make xfrm compat work, > but its still in early stage. > > One link that might still have some value: > https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=xfrm_config_compat_07&id=f64430e6d9e297f3990f485a4832e273751b9869 > (compat structure definitions with BUILD_BUG_ON checking) > > My plan was to make xfrm compat work strictly as shrinker (64->32) > and expander (32->64), i.e. no/little changes to exisiting code and > pass all "expanded" skbs through existing xfrm rcv functions. I agree here with Florian. The code behind this ABI is already complicated. Please stay away from generic code a much as possible. Generic and compat code should be clearly separated. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 7:37 ` Steffen Klassert @ 2018-07-27 14:02 ` Dmitry Safonov 2018-07-27 14:19 ` Florian Westphal 0 siblings, 1 reply; 13+ messages in thread From: Dmitry Safonov @ 2018-07-27 14:02 UTC (permalink / raw) To: Steffen Klassert, Florian Westphal Cc: linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman, Mauro Carvalho Chehab, Shuah Khan <shua On Fri, 2018-07-27 at 09:37 +0200, Steffen Klassert wrote: > On Thu, Jul 26, 2018 at 10:49:59AM +0200, Florian Westphal wrote: > > Dmitry Safonov <dima@arista.com> wrote: > > > So, here I add a compatible layer to xfrm. > > > As xfrm uses netlink notifications, kernel should send them in > > > ABI > > > format that an application will parse. The proposed solution is > > > to save the ABI of bind() syscall. The realization detail is > > > to create kernel-hidden, non visible to userspace netlink groups > > > for compat applications. > > > > Why not use exisiting netlink support? > > Just add the 32bit skb to skb64->frag_list and let > > netlink find if tasks needs 64 or 32 one. > > > > It only needs this small fix to properly signal the end of a dump: > > https://marc.info/?l=linux-netdev&m=126625240303351&w=2 > > > > I had started a second attempt to make xfrm compat work, > > but its still in early stage. > > > > One link that might still have some value: > > https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=xfrm_confi > > g_compat_07&id=f64430e6d9e297f3990f485a4832e273751b9869 > > (compat structure definitions with BUILD_BUG_ON checking) > > > > My plan was to make xfrm compat work strictly as shrinker (64->32) > > and expander (32->64), i.e. no/little changes to exisiting code and > > pass all "expanded" skbs through existing xfrm rcv functions. > > I agree here with Florian. The code behind this ABI > is already complicated. Please stay away from generic > code a much as possible. Generic and compat code should > be clearly separated. Yeah, I tend to agree that it would be better to separate it. But: 1. It will double copy netlink messages, making it O(n) instead of O(1), where n - is number of bind()s.. Probably we don't care much. 2. The patches not-yet-done on the link have +500 added lines - as much as my working patches set, so probably it'll add more code. Probably, we don't care that much about amount of code added and additional copies than about separating compat layer from the main code. Will look into that. -- Thanks, Dmitry ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 14:02 ` Dmitry Safonov @ 2018-07-27 14:19 ` Florian Westphal 2018-07-27 14:51 ` Dmitry Safonov 0 siblings, 1 reply; 13+ messages in thread From: Florian Westphal @ 2018-07-27 14:19 UTC (permalink / raw) To: Dmitry Safonov Cc: Steffen Klassert, Florian Westphal, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman Dmitry Safonov <dima@arista.com> wrote: > 1. It will double copy netlink messages, making it O(n) instead of > O(1), where n - is number of bind()s.. Probably we don't care much. About those bind() patches, I don't understand why they are needed. Why can't you just add the compat skb to the native skb when doing the multicast call? skb_shinfo(skb)->frag_list = compat_skb; xfrm_nlmsg_multicast(net, skb, 0, ... ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 14:19 ` Florian Westphal @ 2018-07-27 14:51 ` Dmitry Safonov 2018-07-27 16:48 ` Nathan Harold 0 siblings, 1 reply; 13+ messages in thread From: Dmitry Safonov @ 2018-07-27 14:51 UTC (permalink / raw) To: Florian Westphal Cc: Steffen Klassert, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman, Mauro On Fri, 2018-07-27 at 16:19 +0200, Florian Westphal wrote: > Dmitry Safonov <dima@arista.com> wrote: > > 1. It will double copy netlink messages, making it O(n) instead of > > O(1), where n - is number of bind()s.. Probably we don't care much. > > About those bind() patches, I don't understand why they are needed. > > Why can't you just add the compat skb to the native skb when doing > the multicast call? > > skb_shinfo(skb)->frag_list = compat_skb; > xfrm_nlmsg_multicast(net, skb, 0, ... Oh yeah, sorry, I think I misread the patch - will try to add compat skb in the multicast call. -- Thanks, Dmitry ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 14:51 ` Dmitry Safonov @ 2018-07-27 16:48 ` Nathan Harold 2018-07-27 17:09 ` Andy Lutomirski 2018-07-28 16:26 ` Dmitry Safonov 0 siblings, 2 replies; 13+ messages in thread From: Nathan Harold @ 2018-07-27 16:48 UTC (permalink / raw) To: Dmitry Safonov Cc: Florian Westphal, Steffen Klassert, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman [-- Attachment #1: Type: text/plain, Size: 1805 bytes --] *We (Android) are very interested in removing the restriction for 32-bit userspace processes accessing xfrm netlink on 64-bit kernels. IPsec support is required to pass Android conformance tests, and any manufacturer wishing to ship 32-bit userspace with a recent kernel needs out-of-tree changes (removing the compat_task check) to do so.That said, it’s not difficult to work around alignment issues directly in userspace, so maybe we could just remove the check and make this the caller's responsibility? Here’s an example of the workaround currently in the Android tree:https://android.googlesource.com/platform/system/netd/+/refs/heads/master/server/XfrmController.h#257 <https://android.googlesource.com/platform/system/netd/+/refs/heads/master/server/XfrmController.h#257>We could also employ a (relatively simple) solution such as the one above in the uapi XFRM header itself, though it would require a caller to declare the target kernel ABI at compile time. Maybe that’s not unthinkable for an uncommon case?-Nathan* On Fri, Jul 27, 2018 at 7:51 AM, Dmitry Safonov <dima@arista.com> wrote: > On Fri, 2018-07-27 at 16:19 +0200, Florian Westphal wrote: > > Dmitry Safonov <dima@arista.com> wrote: > > > 1. It will double copy netlink messages, making it O(n) instead of > > > O(1), where n - is number of bind()s.. Probably we don't care much. > > > > About those bind() patches, I don't understand why they are needed. > > > > Why can't you just add the compat skb to the native skb when doing > > the multicast call? > > > > skb_shinfo(skb)->frag_list = compat_skb; > > xfrm_nlmsg_multicast(net, skb, 0, ... > > Oh yeah, sorry, I think I misread the patch - will try to add compat > skb in the multicast call. > > -- > Thanks, > Dmitry > [-- Attachment #2: Type: text/html, Size: 3972 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 16:48 ` Nathan Harold @ 2018-07-27 17:09 ` Andy Lutomirski 2018-07-28 16:26 ` Dmitry Safonov 1 sibling, 0 replies; 13+ messages in thread From: Andy Lutomirski @ 2018-07-27 17:09 UTC (permalink / raw) To: Nathan Harold Cc: Dmitry Safonov, Florian Westphal, Steffen Klassert, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, G [-- Attachment #1: Type: text/plain, Size: 2064 bytes --] > On Jul 27, 2018, at 9:48 AM, Nathan Harold <nharold@google.com> wrote: > > We (Android) are very interested in removing the restriction for 32-bit userspace processes accessing xfrm netlink on 64-bit kernels. IPsec support is required to pass Android conformance tests, and any manufacturer wishing to ship 32-bit userspace with a recent kernel needs out-of-tree changes (removing the compat_task check) to do so. > > That said, it’s not difficult to work around alignment issues directly in userspace, so maybe we could just remove the check and make this the caller's responsibility? Here’s an example of the workaround currently in the Android tree: > https://android.googlesource.com/platform/system/netd/+/refs/heads/master/server/XfrmController.h#257 > > We could also employ a (relatively simple) solution such as the one above in the uapi XFRM header itself, though it would require a caller to declare the target kernel ABI at compile time. Maybe that’s not unthinkable for an uncommon case? > Could there just be an XFRM2 that is entirely identical to XFRM for 64-bit userspace but makes the 32-bit structures match? If there are a grand total of two or so userspace implementations, that should cover most use cases. L > -Nathan > > >> On Fri, Jul 27, 2018 at 7:51 AM, Dmitry Safonov <dima@arista.com> wrote: >> On Fri, 2018-07-27 at 16:19 +0200, Florian Westphal wrote: >> > Dmitry Safonov <dima@arista.com> wrote: >> > > 1. It will double copy netlink messages, making it O(n) instead of >> > > O(1), where n - is number of bind()s.. Probably we don't care much. >> > >> > About those bind() patches, I don't understand why they are needed. >> > >> > Why can't you just add the compat skb to the native skb when doing >> > the multicast call? >> > >> > skb_shinfo(skb)->frag_list = compat_skb; >> > xfrm_nlmsg_multicast(net, skb, 0, ... >> >> Oh yeah, sorry, I think I misread the patch - will try to add compat >> skb in the multicast call. >> >> -- >> Thanks, >> Dmitry > [-- Attachment #2: Type: text/html, Size: 4724 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-27 16:48 ` Nathan Harold 2018-07-27 17:09 ` Andy Lutomirski @ 2018-07-28 16:26 ` Dmitry Safonov 2018-07-28 21:18 ` David Miller 1 sibling, 1 reply; 13+ messages in thread From: Dmitry Safonov @ 2018-07-28 16:26 UTC (permalink / raw) To: Nathan Harold Cc: Florian Westphal, Steffen Klassert, linux-kernel, David S. Miller, Herbert Xu, Dmitry Safonov, netdev, Andy Lutomirski, Ard Biesheuvel, H. Peter Anvin, Ingo Molnar, John Stultz, Kirill A. Shutemov, Oleg Nesterov, Stephen Boyd, Steven Rostedt, Thomas Gleixner, x86, linux-efi, Andrew Morton, Greg Kroah-Hartman On Fri, 2018-07-27 at 09:48 -0700, Nathan Harold wrote: > We (Android) are very interested in removing the restriction for 32- > bit userspace processes accessing xfrm netlink on 64-bit kernels. > IPsec support is required to pass Android conformance tests, and any > manufacturer wishing to ship 32-bit userspace with a recent kernel > needs out-of-tree changes (removing the compat_task check) to do so. Glad to hear - that justify my attempts more :) > That said, it’s not difficult to work around alignment issues > directly in userspace, so maybe we could just remove the check and > make this the caller's responsibility? Here’s an example of the > workaround currently in the Android tree: > https://android.googlesource.com/platform/system/netd/+/refs/heads/ma > ster/server/XfrmController.h#257 We've kinda same workarounds in our userspace.. But I don't think reverting the check makes much sense - it'll make broken compat ABI in stone. If you're fine with disgraceful hacks and just want to get rid of additional non-mainstream patch - you can make 64-bit syscalls from 32- bit task (hint: examples in x86 selftests). > We could also employ a (relatively simple) solution such as the one > above in the uapi XFRM header itself, though it would require a > caller to declare the target kernel ABI at compile time. Maybe that’s > not unthinkable for an uncommon case? Well, I think, I'll rework my patches set according to critics and separate compat xfrm layer. I've already a selftest to check that 32/64 bit xfrm works - so the most time-taking part is done. So, if you'll wait a week or two - you may help me to justify acception of mainstreaming those patches. > On Fri, Jul 27, 2018 at 7:51 AM, Dmitry Safonov <dima@arista.com> > wrote: > > On Fri, 2018-07-27 at 16:19 +0200, Florian Westphal wrote: > > > Dmitry Safonov <dima@arista.com> wrote: > > > > 1. It will double copy netlink messages, making it O(n) instead > > of > > > > O(1), where n - is number of bind()s.. Probably we don't care > > much. > > > > > > About those bind() patches, I don't understand why they are > > needed. > > > > > > Why can't you just add the compat skb to the native skb when > > doing > > > the multicast call? > > > > > > skb_shinfo(skb)->frag_list = compat_skb; > > > xfrm_nlmsg_multicast(net, skb, 0, ... > > > > Oh yeah, sorry, I think I misread the patch - will try to add > > compat > > skb in the multicast call. > > -- Thanks, Dmitry ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-28 16:26 ` Dmitry Safonov @ 2018-07-28 21:18 ` David Miller 2018-07-30 17:39 ` Dmitry Safonov 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2018-07-28 21:18 UTC (permalink / raw) To: dima Cc: nharold, fw, steffen.klassert, linux-kernel, herbert, 0x7f454c46, netdev, luto, ard.biesheuvel, hpa, mingo, john.stultz, kirill.shutemov, oleg, sboyd, rostedt, tglx, x86, linux-efi, akpm, gregkh, mchehab+samsung, shuah, linux-kselftest, eparis, kadlec, pablo, paul, coreteam, linux-audit, netfilter-devel, fan.du From: Dmitry Safonov <dima@arista.com> Date: Sat, 28 Jul 2018 17:26:55 +0100 > Well, I think, I'll rework my patches set according to critics and > separate compat xfrm layer. I've already a selftest to check that 32/64 > bit xfrm works - so the most time-taking part is done. The way you've done the compat structures using __packed is only going to work on x86, just FYI. The "32-bit alignment for 64-bit objects" thing x86 has is very much not universal amongst ABIs having 32-bit and 64-bit variants. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-28 21:18 ` David Miller @ 2018-07-30 17:39 ` Dmitry Safonov 2018-07-30 19:43 ` Florian Westphal 0 siblings, 1 reply; 13+ messages in thread From: Dmitry Safonov @ 2018-07-30 17:39 UTC (permalink / raw) To: David Miller Cc: nharold, fw, steffen.klassert, linux-kernel, herbert, 0x7f454c46, netdev, luto, ard.biesheuvel, hpa, mingo, john.stultz, kirill.shutemov, oleg, sboyd, rostedt, tglx, x86, linux-efi, akpm, gregkh, mchehab+samsung, shuah, linux-kselftest, eparis, kadlec, pablo, paul, coreteam, linux-audit, netfilter-devel, fan.du On Sat, 2018-07-28 at 14:18 -0700, David Miller wrote: > From: Dmitry Safonov <dima@arista.com> > Date: Sat, 28 Jul 2018 17:26:55 +0100 > > > Well, I think, I'll rework my patches set according to critics and > > separate compat xfrm layer. I've already a selftest to check that > 32/64 > > bit xfrm works - so the most time-taking part is done. > > The way you've done the compat structures using __packed is only > going > to work on x86, just FYI. Thanks for pointing, so I'll probably cover it under something like HAS_COMPAT_XFRM. (if there isn't any better idea). > The "32-bit alignment for 64-bit objects" thing x86 has is very much > not universal amongst ABIs having 32-bit and 64-bit variants. -- Thanks, Dmitry ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 00/18] xfrm: Add compat layer 2018-07-30 17:39 ` Dmitry Safonov @ 2018-07-30 19:43 ` Florian Westphal 0 siblings, 0 replies; 13+ messages in thread From: Florian Westphal @ 2018-07-30 19:43 UTC (permalink / raw) To: Dmitry Safonov Cc: David Miller, nharold, fw, steffen.klassert, linux-kernel, herbert, 0x7f454c46, netdev, luto, ard.biesheuvel, hpa, mingo, john.stultz, kirill.shutemov, oleg, sboyd, rostedt, tglx, x86, linux-efi, akpm, gregkh, mchehab+samsung, shuah, linux-kselftest, eparis, kadlec, pablo, paul, coreteam, linux-audit, netfilter-devel, fan.du Dmitry Safonov <dima@arista.com> wrote: > On Sat, 2018-07-28 at 14:18 -0700, David Miller wrote: > > From: Dmitry Safonov <dima@arista.com> > > Date: Sat, 28 Jul 2018 17:26:55 +0100 > > > > > Well, I think, I'll rework my patches set according to critics and > > > separate compat xfrm layer. I've already a selftest to check that > > 32/64 > > > bit xfrm works - so the most time-taking part is done. > > > > The way you've done the compat structures using __packed is only > > going > > to work on x86, just FYI. > > Thanks for pointing, so I'll probably cover it under something like > HAS_COMPAT_XFRM. > (if there isn't any better idea). You can do that, I suspect you can use CONFIG_COMPAT_FOR_U64_ALIGNMENT as AFAICR the only reason for the compat problem is different alignment requirements of 64bit integer types in the structs, not e.g. due to "long" size differences. Instead of __packed, you can use the "compat" data types, e.g. compat_u64 instead of u64: struct compat_xfrm_lifetime_cur { compat_u64 bytes, packets, add_time, use_time; }; /* same size on i386, but only 4 byte alignment required even on x86_64*/ You might be able to reuse https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=xfrm_config_compat_07&id=f64430e6d9e297f3990f485a4832e273751b9869 in your patch set. I can try to submit the first few patches (which are not related to compat, they just add const qualifiers) for inclusion later this week. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-07-30 19:43 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-07-26 2:31 [PATCH 00/18] xfrm: Add compat layer Dmitry Safonov 2018-07-26 2:31 ` [PATCH 07/18] netlink: Pass groups pointer to .bind() Dmitry Safonov 2018-07-26 8:49 ` [PATCH 00/18] xfrm: Add compat layer Florian Westphal 2018-07-27 7:37 ` Steffen Klassert 2018-07-27 14:02 ` Dmitry Safonov 2018-07-27 14:19 ` Florian Westphal 2018-07-27 14:51 ` Dmitry Safonov 2018-07-27 16:48 ` Nathan Harold 2018-07-27 17:09 ` Andy Lutomirski 2018-07-28 16:26 ` Dmitry Safonov 2018-07-28 21:18 ` David Miller 2018-07-30 17:39 ` Dmitry Safonov 2018-07-30 19:43 ` Florian Westphal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).