* [PATCH net-next] macsec: introduce default_async_crypto sysctl
@ 2023-08-17 15:07 Sabrina Dubroca
2023-08-19 1:46 ` Jakub Kicinski
0 siblings, 1 reply; 11+ messages in thread
From: Sabrina Dubroca @ 2023-08-17 15:07 UTC
To: netdev; +Cc: Sabrina Dubroca, Jonathan Corbet, linux-doc, Scott Dial
Commit ab046a5d4be4 ("net: macsec: preserve ingress frame ordering")
tried to solve an issue caused by MACsec's use of asynchronous crypto
operations, but introduced a large performance regression in cases
where async crypto isn't causing reordering of packets.
This patch introduces a per-netns sysctl that administrators can set
to allow new SAs to use async crypto, such as aesni. Existing SAs
won't be modified.
By setting default_async_crypto=1 and reconfiguring macsec, a single
netperf instance jumps from 1.4Gbps to 4.4Gbps.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
Documentation/admin-guide/sysctl/net.rst | 39 +++++++--
drivers/net/macsec.c | 101 ++++++++++++++++++++---
2 files changed, 119 insertions(+), 21 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 4877563241f3..ce47b612c517 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -34,14 +34,14 @@ Table : Subdirectories in /proc/sys/net
========= =================== = ========== ===================
Directory Content Directory Content
========= =================== = ========== ===================
- 802 E802 protocol mptcp Multipath TCP
- appletalk Appletalk protocol netfilter Network Filter
- ax25 AX25 netrom NET/ROM
- bridge Bridging rose X.25 PLP layer
- core General parameter tipc TIPC
- ethernet Ethernet protocol unix Unix domain sockets
- ipv4 IP version 4 x25 X.25 protocol
- ipv6 IP version 6
+ 802 E802 protocol macsec MACsec
+ appletalk Appletalk protocol mptcp Multipath TCP
+ ax25 AX25 netfilter Network Filter
+ bridge Bridging netrom NET/ROM
+ core General parameter rose X.25 PLP layer
+ ethernet Ethernet protocol tipc TIPC
+ ipv4 IP version 4 unix Unix domain sockets
+ ipv6 IP version 6 x25 X.25 protocol
========= =================== = ========== ===================
1. /proc/sys/net/core - Network core options
@@ -503,3 +503,26 @@ originally may have been issued in the correct sequential order.
If named_timeout is nonzero, failed topology updates will be placed on a defer
queue until another event arrives that clears the error, or until the timeout
expires. Value is in milliseconds.
+
+
+6. /proc/sys/net/macsec - Parameters for MACsec
+-----------------------------------------------
+
+default_async_crypto
+--------------------
+
+The software implementation of MACsec uses the kernel cryptography
+API, which provides both asynchronous and synchronous implementations
+of algorithms. The asynchronous implementations tend to provide better
+performance, but in some cases, can cause reordering of packets.
+
+This only affects newly created Security Associations. Existing SAs
+will be unchanged. Whether a MACsec device was created before or after
+this sysctl is set has no impact.
+
+Values:
+
+ - 0 - disable asynchronous cryptography
+ - 1 - allow asynchronous cryptography (if available)
+
+Default : 0 (only synchronous)
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index ae60817ec5c2..88743ce5839b 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -138,6 +138,15 @@ struct macsec_cb {
bool has_sci;
};
+static unsigned int macsec_net_id __read_mostly;
+
+struct macsec_net {
+#ifdef CONFIG_SYSCTL
+ struct ctl_table_header *ctl_hdr;
+#endif
+ u8 default_async;
+};
+
static struct macsec_rx_sa *macsec_rxsa_get(struct macsec_rx_sa __rcu *ptr)
{
struct macsec_rx_sa *sa = rcu_dereference_bh(ptr);
@@ -1325,14 +1334,14 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
return RX_HANDLER_PASS;
}
-static struct crypto_aead *macsec_alloc_tfm(char *key, int key_len, int icv_len)
+static struct crypto_aead *macsec_alloc_tfm(const struct net *net,
+ char *key, int key_len, int icv_len)
{
+ struct macsec_net *macsec_net = net_generic(net, macsec_net_id);
struct crypto_aead *tfm;
int ret;
- /* Pick a sync gcm(aes) cipher to ensure order is preserved. */
- tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
-
+ tfm = crypto_alloc_aead("gcm(aes)", 0, macsec_net->default_async ? 0 : CRYPTO_ALG_ASYNC);
if (IS_ERR(tfm))
return tfm;
@@ -1350,14 +1359,14 @@ static struct crypto_aead *macsec_alloc_tfm(char *key, int key_len, int icv_len)
return ERR_PTR(ret);
}
-static int init_rx_sa(struct macsec_rx_sa *rx_sa, char *sak, int key_len,
- int icv_len)
+static int init_rx_sa(const struct net *net, struct macsec_rx_sa *rx_sa,
+ char *sak, int key_len, int icv_len)
{
rx_sa->stats = alloc_percpu(struct macsec_rx_sa_stats);
if (!rx_sa->stats)
return -ENOMEM;
- rx_sa->key.tfm = macsec_alloc_tfm(sak, key_len, icv_len);
+ rx_sa->key.tfm = macsec_alloc_tfm(net, sak, key_len, icv_len);
if (IS_ERR(rx_sa->key.tfm)) {
free_percpu(rx_sa->stats);
return PTR_ERR(rx_sa->key.tfm);
@@ -1450,14 +1459,14 @@ static struct macsec_rx_sc *create_rx_sc(struct net_device *dev, sci_t sci,
return rx_sc;
}
-static int init_tx_sa(struct macsec_tx_sa *tx_sa, char *sak, int key_len,
- int icv_len)
+static int init_tx_sa(const struct net *net, struct macsec_tx_sa *tx_sa,
+ char *sak, int key_len, int icv_len)
{
tx_sa->stats = alloc_percpu(struct macsec_tx_sa_stats);
if (!tx_sa->stats)
return -ENOMEM;
- tx_sa->key.tfm = macsec_alloc_tfm(sak, key_len, icv_len);
+ tx_sa->key.tfm = macsec_alloc_tfm(net, sak, key_len, icv_len);
if (IS_ERR(tx_sa->key.tfm)) {
free_percpu(tx_sa->stats);
return PTR_ERR(tx_sa->key.tfm);
@@ -1795,7 +1804,7 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info)
return -ENOMEM;
}
- err = init_rx_sa(rx_sa, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]),
+ err = init_rx_sa(dev_net(dev), rx_sa, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]),
secy->key_len, secy->icv_len);
if (err < 0) {
kfree(rx_sa);
@@ -2038,7 +2047,7 @@ static int macsec_add_txsa(struct sk_buff *skb, struct genl_info *info)
return -ENOMEM;
}
- err = init_tx_sa(tx_sa, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]),
+ err = init_tx_sa(dev_net(dev), tx_sa, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]),
secy->key_len, secy->icv_len);
if (err < 0) {
kfree(tx_sa);
@@ -4168,7 +4177,7 @@ static int macsec_validate_attr(struct nlattr *tb[], struct nlattr *data[],
char dummy_key[DEFAULT_SAK_LEN] = { 0 };
struct crypto_aead *dummy_tfm;
- dummy_tfm = macsec_alloc_tfm(dummy_key,
+ dummy_tfm = macsec_alloc_tfm(&init_net, dummy_key,
DEFAULT_SAK_LEN,
icv_len);
if (IS_ERR(dummy_tfm))
@@ -4380,6 +4389,65 @@ static struct notifier_block macsec_notifier = {
.notifier_call = macsec_notify,
};
+#ifdef CONFIG_SYSCTL
+static struct ctl_table macsec_table[] = {
+ {
+ .procname = "default_async_crypto",
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ { },
+};
+
+static int __net_init macsec_init_net(struct net *net)
+{
+ struct ctl_table *table = macsec_table;
+ struct macsec_net *macsec_net;
+
+ if (!net_eq(net, &init_net)) {
+ table = kmemdup(table, sizeof(macsec_table), GFP_KERNEL);
+ if (!table)
+ return -ENOMEM;
+ }
+
+ macsec_net = net_generic(net, macsec_net_id);
+ table[0].data = &macsec_net->default_async;
+ macsec_net->default_async = 0;
+
+ macsec_net->ctl_hdr = register_net_sysctl(net, "net/macsec", table);
+ if (!macsec_net->ctl_hdr)
+ goto free;
+
+ return 0;
+
+free:
+ if (!net_eq(net, &init_net))
+ kfree(table);
+ return -ENOMEM;
+}
+
+static void __net_exit macsec_exit_net(struct net *net)
+{
+ struct macsec_net *macsec_net = net_generic(net, macsec_net_id);
+
+ unregister_net_sysctl_table(macsec_net->ctl_hdr);
+ if (!net_eq(net, &init_net))
+ kfree(macsec_net->ctl_hdr->ctl_table_arg);
+}
+#endif
+
+static struct pernet_operations macsec_net_ops __read_mostly = {
+#ifdef CONFIG_SYSCTL
+ .init = macsec_init_net,
+ .exit = macsec_exit_net,
+#endif
+ .id = &macsec_net_id,
+ .size = sizeof(struct macsec_net),
+};
+
static int __init macsec_init(void)
{
int err;
@@ -4397,8 +4465,14 @@ static int __init macsec_init(void)
if (err)
goto rtnl;
+ err = register_pernet_subsys(&macsec_net_ops);
+ if (err)
+ goto genl;
+
return 0;
+genl:
+ genl_unregister_family(&macsec_fam);
rtnl:
rtnl_link_unregister(&macsec_link_ops);
notifier:
@@ -4408,6 +4482,7 @@ static int __init macsec_init(void)
static void __exit macsec_exit(void)
{
+ unregister_pernet_subsys(&macsec_net_ops);
genl_unregister_family(&macsec_fam);
rtnl_link_unregister(&macsec_link_ops);
unregister_netdevice_notifier(&macsec_notifier);
--
2.40.1
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-17 15:07 [PATCH net-next] macsec: introduce default_async_crypto sysctl Sabrina Dubroca
@ 2023-08-19 1:46 ` Jakub Kicinski
2023-08-22 15:39 ` Sabrina Dubroca
0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2023-08-19 1:46 UTC
To: Sabrina Dubroca; +Cc: netdev, Jonathan Corbet, linux-doc, Scott Dial
On Thu, 17 Aug 2023 17:07:03 +0200 Sabrina Dubroca wrote:
> Commit ab046a5d4be4 ("net: macsec: preserve ingress frame ordering")
> tried to solve an issue caused by MACsec's use of asynchronous crypto
> operations, but introduced a large performance regression in cases
> where async crypto isn't causing reordering of packets.
>
> This patch introduces a per-netns sysctl that administrators can set
> to allow new SAs to use async crypto, such as aesni. Existing SAs
> won't be modified.
>
> By setting default_async_crypto=1 and reconfiguring macsec, a single
> netperf instance jumps from 1.4Gbps to 4.4Gbps.
Can we not fix the ordering problem?
Queue the packets locally if they get out of order?
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-19 1:46 ` Jakub Kicinski
@ 2023-08-22 15:39 ` Sabrina Dubroca
2023-08-22 15:59 ` Jakub Kicinski
2023-08-23 20:22 ` Scott Dial
0 siblings, 2 replies; 11+ messages in thread
From: Sabrina Dubroca @ 2023-08-22 15:39 UTC
To: Jakub Kicinski; +Cc: netdev, Jonathan Corbet, linux-doc, Scott Dial
2023-08-18, 18:46:48 -0700, Jakub Kicinski wrote:
> On Thu, 17 Aug 2023 17:07:03 +0200 Sabrina Dubroca wrote:
> > Commit ab046a5d4be4 ("net: macsec: preserve ingress frame ordering")
> > tried to solve an issue caused by MACsec's use of asynchronous crypto
> > operations, but introduced a large performance regression in cases
> > where async crypto isn't causing reordering of packets.
> >
> > This patch introduces a per-netns sysctl that administrators can set
> > to allow new SAs to use async crypto, such as aesni. Existing SAs
> > won't be modified.
> >
> > By setting default_async_crypto=1 and reconfiguring macsec, a single
> > netperf instance jumps from 1.4Gbps to 4.4Gbps.
>
> Can we not fix the ordering problem?
> Queue the packets locally if they get out of order?
Actually, looking into the crypto API side, I don't see how they can
get out of order since commit 81760ea6a95a ("crypto: cryptd - Add
helpers to check whether a tfm is queued"):
[...] ensure that no reordering is introduced because of requests
queued in cryptd with respect to requests being processed in
softirq context.
And cryptd_aead_queued() is used by AESNI (via simd_aead_decrypt()) to
decide whether to process the request synchronously or not.
So I really don't get what commit ab046a5d4be4 was trying to fix. I've
never been able to reproduce that issue, I guess commit 81760ea6a95a
explains why.
I'd suggest to revert commit ab046a5d4be4, but it feels wrong to
revert it without really understanding what problem Scott hit and why
81760ea6a95a didn't solve it.
What do you think?
--
Sabrina
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-22 15:39 ` Sabrina Dubroca
@ 2023-08-22 15:59 ` Jakub Kicinski
2023-08-23 20:22 ` Scott Dial
1 sibling, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2023-08-22 15:59 UTC
To: Sabrina Dubroca; +Cc: netdev, Jonathan Corbet, linux-doc, Scott Dial
On Tue, 22 Aug 2023 17:39:56 +0200 Sabrina Dubroca wrote:
> 2023-08-18, 18:46:48 -0700, Jakub Kicinski wrote:
> > Can we not fix the ordering problem?
> > Queue the packets locally if they get out of order?
>
> Actually, looking into the crypto API side, I don't see how they can
> get out of order since commit 81760ea6a95a ("crypto: cryptd - Add
> helpers to check whether a tfm is queued"):
>
> [...] ensure that no reordering is introduced because of requests
> queued in cryptd with respect to requests being processed in
> softirq context.
>
> And cryptd_aead_queued() is used by AESNI (via simd_aead_decrypt()) to
> decide whether to process the request synchronously or not.
>
> So I really don't get what commit ab046a5d4be4 was trying to fix. I've
> never been able to reproduce that issue, I guess commit 81760ea6a95a
> explains why.
>
> I'd suggest to revert commit ab046a5d4be4, but it feels wrong to
> revert it without really understanding what problem Scott hit and why
> 81760ea6a95a didn't solve it.
>
> What do you think?
Unless Scott can tell us what he was seeing I think we should revert.
The code looks fine to me as well...
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-22 15:39 ` Sabrina Dubroca
2023-08-22 15:59 ` Jakub Kicinski
@ 2023-08-23 20:22 ` Scott Dial
2023-08-24 13:01 ` Sabrina Dubroca
1 sibling, 1 reply; 11+ messages in thread
From: Scott Dial @ 2023-08-23 20:22 UTC
To: Sabrina Dubroca, Jakub Kicinski; +Cc: netdev, Jonathan Corbet, linux-doc
> 2023-08-18, 18:46:48 -0700, Jakub Kicinski wrote:
>> Can we not fix the ordering problem?
>> Queue the packets locally if they get out of order?
AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy
the decrypt gets stuck on the cryptd queue, but that queue is not
order-preserving. If the macsec driver maintained a queue for the netdev
that was order-preserving, then you could resolve the issue, but it adds
more complexity to the macsec driver, so I assume that's why the
maintainers have always desired to revert my patch instead of ensuring
packet order.
With respect to AES-NI's implementation of gcm(aes), it's unfortunate
that there is not a synchronous version that uses the FPU when available
and falls back to gcm_base(ctr(aes-aesni),ghash-generic) when it's not.
In that case, you would get the benefit of the FPU for the majority of
time when it's available. When I suggested this to linux-crypto, I was
told that relying on synchronous crypto in the macsec driver was wrong:
On 12 Aug 2020 10:45:00 +0000, Pascal Van Leeuwen wrote:
> Forcing the use of sync algorithms only would be detrimental to platforms
> that do not have CPU accelerated crypto, but do have HW acceleration
> for crypto external to the CPU. I understand it's much easier to implement,
> but that is just being lazy IMHO. For bulk crypto of relatively independent
> blocks (networking packets, disk sectors), ASYNC should always be preferred.
So, I abandoned my suggestion to add a fallback. The complexity of
implementing queueing in the macsec driver was beyond the time I had available, and the
regression in performance was not significant for my use case, but I
understand that others may have different requirements. I would
emphasize that benchmarking of network performance should be done by
looking at more than just the interface frame rate. For instance,
out-of-order delivery of packets can trigger TCP backoff. I was never
interested in how many packets the macsec driver could stuff onto the
wire, because the impact was my TCP socket stalling and my UDP streams
being garbled.
On 8/22/2023 11:39 AM, Sabrina Dubroca wrote:
> Actually, looking into the crypto API side, I don't see how they can
> get out of order since commit 81760ea6a95a ("crypto: cryptd - Add
> helpers to check whether a tfm is queued"):
>
> [...] ensure that no reordering is introduced because of requests
> queued in cryptd with respect to requests being processed in
> softirq context.
>
> And cryptd_aead_queued() is used by AESNI (via simd_aead_decrypt()) to
> decide whether to process the request synchronously or not.
I have not been following linux-crypto changes, but I would be surprised
if the request is not flagged with CRYPTO_TFM_REQ_MAY_BACKLOG, so it would
be queued. If that's not the case, then the attempt to decrypt would
return -EBUSY, which would translate to a packet error, since
macsec_decrypt MUST handle the skb during the softirq.
> So I really don't get what commit ab046a5d4be4 was trying to fix. I've
> never been able to reproduce that issue, I guess commit 81760ea6a95a
> explains why.
>
> I'd suggest to revert commit ab046a5d4be4, but it feels wrong to
> revert it without really understanding what problem Scott hit and why
> 81760ea6a95a didn't solve it.
I don't think that commit has any relevance to the issue. For instance
with AES-NI, you need to have competing load on the FPU such that
crypto_simd_usable() fails to be true. In the past, I replicated this
failure mode using two SuperMicro 5018D-FN4T servers directly connected
to each other, which is a Xeon-D 1541 w/ Intel 10GbE NIC (ixgbe driver).
From there, I would send /dev/urandom as UDP to the other host. I would
get about 1 out of 10k packets queued on cryptd with that setup. My real
world case was transporting MPEG TS video streams, each about 1k pps, so
that is a decode error in the video stream every 10 seconds.
--
Scott Dial
scott@scottdial.com
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-23 20:22 ` Scott Dial
@ 2023-08-24 13:01 ` Sabrina Dubroca
2023-08-24 17:08 ` Scott Dial
0 siblings, 1 reply; 11+ messages in thread
From: Sabrina Dubroca @ 2023-08-24 13:01 UTC
To: Scott Dial; +Cc: Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
2023-08-23, 16:22:31 -0400, Scott Dial wrote:
> > 2023-08-18, 18:46:48 -0700, Jakub Kicinski wrote:
> > > Can we not fix the ordering problem?
> > > Queue the packets locally if they get out of order?
>
> AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
> decrypt gets stuck on the cryptd queue, but that queue is not
> order-preserving.
It should be (per CPU [*]). The queue itself is a linked list, and if we
have requests on the queue we don't let new requests skip the queue.
[*] and if you have packets coming through multiple CPUs at the same
time, ordering won't be predictable anyway
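
The dispatch rule described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the toy_* names are hypothetical, the queue is a plain ring buffer standing in for one CPU's cryptd queue, and none of this is the kernel's actual code. A request is processed inline only when SIMD is usable and nothing is already queued; otherwise it goes to the back of the FIFO.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QCAP 16

/* Hypothetical stand-in for one CPU's cryptd queue: a FIFO of request ids. */
struct toy_queue {
	int items[TOY_QCAP];
	int head;
	int len;
};

/* Completion log, so the caller can inspect the order requests finished in. */
struct toy_log {
	int seq[TOY_QCAP];
	int n;
};

static void toy_complete(struct toy_log *log, int seq)
{
	assert(log->n < TOY_QCAP);
	log->seq[log->n++] = seq;
}

static void toy_enqueue(struct toy_queue *q, int seq)
{
	assert(q->len < TOY_QCAP);
	q->items[(q->head + q->len) % TOY_QCAP] = seq;
	q->len++;
}

/* Models the deferred worker: drain queued requests, oldest first. */
static void toy_drain(struct toy_queue *q, struct toy_log *log)
{
	while (q->len > 0) {
		toy_complete(log, q->items[q->head]);
		q->head = (q->head + 1) % TOY_QCAP;
		q->len--;
	}
}

/*
 * Dispatch rule from the discussion: process inline only when SIMD is
 * usable AND the queue is empty. A non-empty queue forces queueing even
 * if SIMD has become usable again, so nothing can overtake older requests.
 */
static void toy_submit(struct toy_queue *q, struct toy_log *log,
		       int seq, bool simd_usable)
{
	if (!simd_usable || q->len > 0)
		toy_enqueue(q, seq);
	else
		toy_complete(log, seq);
}

/* Scenario: request 2 hits a busy FPU; 3 and 4 arrive before the worker runs. */
int toy_demo(int *out, int cap)
{
	struct toy_queue q = { .head = 0, .len = 0 };
	struct toy_log log = { .n = 0 };
	int i;

	toy_submit(&q, &log, 1, true);
	toy_submit(&q, &log, 2, false);
	toy_submit(&q, &log, 3, true);
	toy_submit(&q, &log, 4, true);
	toy_drain(&q, &log);
	toy_submit(&q, &log, 5, true);

	for (i = 0; i < log.n && i < cap; i++)
		out[i] = log.seq[i];
	return i;
}
```

In this model the completion order stays 1, 2, 3, 4, 5. Removing the `q->len > 0` test from toy_submit() would let requests 3 and 4 complete ahead of 2, which is the kind of reordering being discussed.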
> I would emphasize
> that benchmarking of network performance should be done by looking at more
> than just the interface frame rate. For instance, out-of-order delivery of
> packets can trigger TCP backoff. I was never interested in how many packets
> the macsec driver could stuff onto the wire, because the impact was my TCP
> socket stalling and my UDP streams being garbled.
Sure. And for iperf3/TCP tests, I'm seeing much better performance out
of async crypto (or much lower CPU utilization for the same throughput
on UDP tests), even with the FPU busy. I decided to go the sysctl
route instead of reverting because I couldn't figure out how to
reproduce the problems you've hit, but I didn't want to just bring
them back for your setup.
> On 8/22/2023 11:39 AM, Sabrina Dubroca wrote:
> > Actually, looking into the crypto API side, I don't see how they can
> > get out of order since commit 81760ea6a95a ("crypto: cryptd - Add
> > helpers to check whether a tfm is queued"):
> >
> > [...] ensure that no reordering is introduced because of requests
> > queued in cryptd with respect to requests being processed in
> > softirq context.
> >
> > And cryptd_aead_queued() is used by AESNI (via simd_aead_decrypt()) to
> > decide whether to process the request synchronously or not.
>
> I have not been following linux-crypto changes, but I would be surprised if
> the request is not flagged with CRYPTO_TFM_REQ_MAY_BACKLOG, so it would be
macsec doesn't use CRYPTO_TFM_REQ_MAY_BACKLOG.
> queued. If that's not the case, then the attempt to decrypt would return
> -EBUSY, which would translate to a packet error, since macsec_decrypt MUST
> handle the skb during the softirq.
If we get more packets than we can process, we drop them. I think
that's fine.
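
As a rough illustration of the difference between backlogging and dropping, here is a toy bounded queue. It is a sketch under assumptions, not the kernel's crypto queue: the toy_* names, the queue limit, and the verdict enum are all made up for this example, and real kernel return codes are deliberately not modelled.

```c
#include <stdbool.h>

#define TOY_MAX_QLEN 2	/* hypothetical queue limit, chosen only for the example */

enum toy_verdict {
	TOY_QUEUED,	/* accepted, room was available */
	TOY_BACKLOGGED,	/* queue full, but the caller allowed backlogging */
	TOY_REJECTED,	/* queue full, no backlog allowed: the caller drops the packet */
};

struct toy_bounded_queue {
	int qlen;	/* requests currently held, including backlogged ones */
};

static enum toy_verdict toy_bounded_enqueue(struct toy_bounded_queue *q,
					    bool may_backlog)
{
	if (q->qlen >= TOY_MAX_QLEN) {
		if (!may_backlog)
			return TOY_REJECTED;
		q->qlen++;
		return TOY_BACKLOGGED;
	}
	q->qlen++;
	return TOY_QUEUED;
}

/* Offer `offered` requests without backlogging and count the drops. */
int toy_count_drops(int offered)
{
	struct toy_bounded_queue q = { .qlen = 0 };
	int drops = 0;

	for (int i = 0; i < offered; i++)
		if (toy_bounded_enqueue(&q, false) == TOY_REJECTED)
			drops++;
	return drops;
}
```

With the limit of two used here, offering five requests without backlogging drops three of them, while a caller that allows backlogging never sees a rejection.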
> > So I really don't get what commit ab046a5d4be4 was trying to fix. I've
> > never been able to reproduce that issue, I guess commit 81760ea6a95a
> > explains why.
> >
> > I'd suggest to revert commit ab046a5d4be4, but it feels wrong to
> > revert it without really understanding what problem Scott hit and why
> > 81760ea6a95a didn't solve it.
>
> I don't think that commit has any relevance to the issue.
It maintains the ordering of requests. If there are async requests
currently waiting to be processed, we don't let requests bypass the
queue until we've drained it.
To make sure, I ran some tests with numbered messages and a patched
kernel that forces queueing decryption every couple of requests, and I
didn't see any reordering.
--
Sabrina
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-24 13:01 ` Sabrina Dubroca
@ 2023-08-24 17:08 ` Scott Dial
2023-08-28 9:42 ` Sabrina Dubroca
0 siblings, 1 reply; 11+ messages in thread
From: Scott Dial @ 2023-08-24 17:08 UTC
To: Sabrina Dubroca; +Cc: Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
On 8/24/2023 9:01 AM, Sabrina Dubroca wrote:
> 2023-08-23, 16:22:31 -0400, Scott Dial wrote:
>> AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
>> decrypt gets stuck on the cryptd queue, but that queue is not
>> order-preserving.
>
> It should be (per CPU [*]). The queue itself is a linked list, and if we
> have requests on the queue we don't let new requests skip the queue.
My apologies, I'll be the first to admit that I have not tracked all of
the code changes to either the macsec driver or linux-crypto since I
first made the commit. This comment that requests are queued forced me
to review the code again and it appears that the queueing issue was
resolved in v5.2-rc1 with commit 1661131a0479, so I no longer believe we
need the CRYPTO_ALG_ASYNC since v5.2 and going forward.
So, I believe my patch should be reverted from the mainline kernel and
any releases that are still getting maintenance releases -- I believe
v5.4, v5.10, v5.15, and v6.1.
--
Scott Dial
scott@scottdial.com
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-24 17:08 ` Scott Dial
@ 2023-08-28 9:42 ` Sabrina Dubroca
2023-08-28 19:04 ` Scott Dial
0 siblings, 1 reply; 11+ messages in thread
From: Sabrina Dubroca @ 2023-08-28 9:42 UTC
To: Scott Dial; +Cc: Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
2023-08-24, 13:08:41 -0400, Scott Dial wrote:
> On 8/24/2023 9:01 AM, Sabrina Dubroca wrote:
> > 2023-08-23, 16:22:31 -0400, Scott Dial wrote:
> > > AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
> > > decrypt gets stuck on the cryptd queue, but that queue is not
> > > order-preserving.
> >
> > It should be (per CPU [*]). The queue itself is a linked list, and if we
> > have requests on the queue we don't let new requests skip the queue.
>
> My apologies, I'll be the first to admit that I have not tracked all of the
> code changes to either the macsec driver or linux-crypto since I first made
> the commit. This comment that requests are queued forced me to review the
> code again and it appears that the queueing issue was resolved in v5.2-rc1
> with commit 1661131a0479, so I no longer believe we need the
> CRYPTO_ALG_ASYNC since v5.2 and going forward.
Are you sure about this? 1661131a0479 pre-dates your patch by over a
year.
And AFAICT, that series only moved the existing FPU usable +
cryptd_aead_queued tests from AESNI's implementation of gcm(aes) to
common SIMD helpers.
--
Sabrina
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-28 9:42 ` Sabrina Dubroca
@ 2023-08-28 19:04 ` Scott Dial
2023-08-31 14:10 ` Sabrina Dubroca
0 siblings, 1 reply; 11+ messages in thread
From: Scott Dial @ 2023-08-28 19:04 UTC
To: Sabrina Dubroca; +Cc: Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
On 8/28/2023 5:42 AM, Sabrina Dubroca wrote:
> 2023-08-24, 13:08:41 -0400, Scott Dial wrote:
>> On 8/24/2023 9:01 AM, Sabrina Dubroca wrote:
>>> 2023-08-23, 16:22:31 -0400, Scott Dial wrote:
>>>> AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
>>>> decrypt gets stuck on the cryptd queue, but that queue is not
>>>> order-preserving.
>>>
>>> It should be (per CPU [*]). The queue itself is a linked list, and if we
>>> have requests on the queue we don't let new requests skip the queue.
>>
>> My apologies, I'll be the first to admit that I have not tracked all of the
>> code changes to either the macsec driver or linux-crypto since I first made
>> the commit. This comment that requests are queued forced me to review the
>> code again and it appears that the queueing issue was resolved in v5.2-rc1
>> with commit 1661131a0479, so I no longer believe we need the
>> CRYPTO_ALG_ASYNC since v5.2 and going forward.
>
> Are you sure about this? 1661131a0479 pre-dates your patch by over a
> year.
>
> And AFAICT, that series only moved the existing FPU usable +
> cryptd_aead_queued tests from AESNI's implementation of gcm(aes) to
> common SIMD helpers.
My original issue started with a RHEL7 system, so a backport of the
macsec driver to the 3.10 kernel. I recall building newer kernels and
reproducing the issue, but I don't have my test setup anymore nor any
meaningful notes that would indicate to me what kernels I tested. In any
case, I didn't bisect when the queuing behavior was changed, and maybe I
misread the code, and maybe my test setup was flawed in some other way.
1661131a0479 wasn't obviously just moving code to me, so I didn't trace
back further, but looking at the longterm maintenance 4.x kernels, I can
see that the AES-NI code has the same cryptd_aead_queued check, so I
think you are correct to say that you could revert my change on all of
the maintenance kernels to restore the performance of MACsec w/ AES-NI.
Whether that causes any ordering regressions for any other crypto
accelerations, I have no idea since it would require auditing a lot of
crypto code.
--
Scott Dial
scott@scottdial.com
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-28 19:04 ` Scott Dial
@ 2023-08-31 14:10 ` Sabrina Dubroca
2023-09-01 2:35 ` Herbert Xu
0 siblings, 1 reply; 11+ messages in thread
From: Sabrina Dubroca @ 2023-08-31 14:10 UTC
To: Scott Dial, Herbert Xu; +Cc: Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
2023-08-28, 15:04:51 -0400, Scott Dial wrote:
> On 8/28/2023 5:42 AM, Sabrina Dubroca wrote:
> > 2023-08-24, 13:08:41 -0400, Scott Dial wrote:
> > > On 8/24/2023 9:01 AM, Sabrina Dubroca wrote:
> > > > 2023-08-23, 16:22:31 -0400, Scott Dial wrote:
> > > > > AES-NI's implementation of gcm(aes) requires the FPU, so if it's busy the
> > > > > decrypt gets stuck on the cryptd queue, but that queue is not
> > > > > order-preserving.
> > > >
> > > > It should be (per CPU [*]). The queue itself is a linked list, and if we
> > > > have requests on the queue we don't let new requests skip the queue.
> > >
> > > My apologies, I'll be the first to admit that I have not tracked all of the
> > > code changes to either the macsec driver or linux-crypto since I first made
> > > the commit. This comment that requests are queued forced me to review the
> > > code again and it appears that the queueing issue was resolved in v5.2-rc1
> > > with commit 1661131a0479, so I no longer believe we need the
> > > CRYPTO_ALG_ASYNC since v5.2 and going forward.
> >
> > Are you sure about this? 1661131a0479 pre-dates your patch by over a
> > year.
> >
> > And AFAICT, that series only moved the existing FPU usable +
> > cryptd_aead_queued tests from AESNI's implementation of gcm(aes) to
> > common SIMD helpers.
>
> My original issue started with a RHEL7 system, so a backport of the macsec
> driver to the 3.10 kernel. I recall building newer kernels and reproducing
> the issue, but I don't have my test setup anymore nor any meaningful notes
> that would indicate to me what kernels I tested. In any case, I didn't
> bisect when the queuing behavior was changed, and maybe I misread the code,
> and maybe my test setup was flawed in some other way.
>
> 1661131a0479 wasn't obviously just moving code to me, so I didn't trace back
> further, but looking at the longterm maintenance 4.x kernels, I can see that
> the AES-NI code has the same cryptd_aead_queued check
Yes, that's more what I meant. The check exists before and after
commits 1661131a0479 and 149e12252fb3.
(and FWIW, RHEL7 doesn't have it, but that's not a concern for netdev)
> so I think you are
> correct to say that you could revert my change on all of the maintenance
> kernels to restore the performance of MACsec w/ AES-NI.
Ok, thanks.
> Whether that causes any ordering regressions for any other crypto
> accelerations, I have no idea since it would require auditing a lot of
> crypto code.
Herbert, can we expect ASYNC implementations of gcm(aes) to maintain
ordering of completions wrt requests? For AESNI, the use of
cryptd_aead_queued() makes sure of that, but I don't know if other
implementations under drivers/crypto would have the same
guarantee.
[context: we're considering reverting commit ab046a5d4be4 ("net:
macsec: preserve ingress frame ordering"), but Scott is concerned that
the issue he saw would happen with other types of acceleration]
--
Sabrina
* Re: [PATCH net-next] macsec: introduce default_async_crypto sysctl
2023-08-31 14:10 ` Sabrina Dubroca
@ 2023-09-01 2:35 ` Herbert Xu
0 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2023-09-01 2:35 UTC
To: Sabrina Dubroca
Cc: Scott Dial, Jakub Kicinski, netdev, Jonathan Corbet, linux-doc
On Thu, Aug 31, 2023 at 04:10:40PM +0200, Sabrina Dubroca wrote:
>
> Herbert, can we expect ASYNC implementations of gcm(aes) to maintain
> ordering of completions wrt requests? For AESNI, the use of
> cryptd_aead_queued() makes sure of that, but I don't know if other
> implementations under drivers/crypto would have the same
> guarantee.
Absolutely as otherwise IPsec would be seriously broken (it's even
worse than plain TCP because of the replay windows).
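
To see why a replay window makes reordering costly, consider a minimal sketch of a generic 64-entry sliding-window anti-replay check. It is not the kernel's xfrm or macsec implementation; the toy_replay names and the fixed window size are assumptions made for illustration. A packet that arrives more than a window behind the newest accepted sequence number is rejected even though it is authentic.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_REPLAY_WINDOW 64u

struct toy_replay {
	uint64_t top;		/* highest sequence number accepted so far */
	uint64_t bitmap;	/* bit i set => sequence number (top - i) was seen */
};

/* Returns true if seq is accepted, false if it is a replay or too old. */
static bool toy_replay_accept(struct toy_replay *st, uint64_t seq)
{
	if (seq == 0)
		return false;	/* sequence numbers start at 1 in this sketch */

	if (seq > st->top) {
		/* Newer than anything seen: slide the window forward. */
		uint64_t shift = seq - st->top;

		st->bitmap = (shift >= TOY_REPLAY_WINDOW) ? 0 : st->bitmap << shift;
		st->bitmap |= 1;
		st->top = seq;
		return true;
	}

	uint64_t offset = st->top - seq;

	if (offset >= TOY_REPLAY_WINDOW)
		return false;	/* fell out of the window: dropped */

	uint64_t bit = (uint64_t)1 << offset;

	if (st->bitmap & bit)
		return false;	/* already seen: replay */

	st->bitmap |= bit;
	return true;
}
```

Starting from a zeroed state, accepting 1 and then 100 moves the window to cover 37..100. A late packet 90 is still accepted once, while a late packet 36 is rejected as too old.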
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt