netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] netlink: handle EMSGSIZE errors in the core
@ 2024-03-01  1:28 Jakub Kicinski
  2024-03-01  1:28 ` [PATCH net-next 1/3] " Jakub Kicinski
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jakub Kicinski @ 2024-03-01  1:28 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, johannes, fw, pablo, idosch, jiri,
	Jakub Kicinski

Ido discovered some time back that we usually force NLMSG_DONE
to be delivered in a separate recv() syscall, even if it would
fit into the same skb as data messages. He made nexthop try
to fit DONE with data in commit 8743aeff5bc4 ("nexthop: Fix
infinite nexthop bucket dump when using maximum nexthop ID"),
and nobody has complained so far.

We have since also tried to follow the same pattern in new
genetlink families, but explaining to people, or even remembering
the correct handling ourselves is tedious.

Let the netlink socket layer consume -EMSGSIZE errors.
Practically speaking most families use this error code
as "dump needs more space", anyway.

Jakub Kicinski (3):
  netlink: handle EMSGSIZE errors in the core
  netdev: let netlink core handle -EMSGSIZE errors
  genetlink: fit NLMSG_DONE into same read() as families

 net/core/netdev-genl.c    | 15 +++------------
 net/core/page_pool_user.c |  2 --
 net/netlink/af_netlink.c  |  9 +++++++++
 net/netlink/genetlink.c   | 12 +++++++-----
 4 files changed, 19 insertions(+), 19 deletions(-)

-- 
2.43.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net-next 1/3] netlink: handle EMSGSIZE errors in the core
  2024-03-01  1:28 [PATCH net-next 0/3] netlink: handle EMSGSIZE errors in the core Jakub Kicinski
@ 2024-03-01  1:28 ` Jakub Kicinski
  2024-03-01  8:35   ` Eric Dumazet
  2024-03-01  1:28 ` [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors Jakub Kicinski
  2024-03-01  1:28 ` [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2024-03-01  1:28 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, johannes, fw, pablo, idosch, jiri,
	Jakub Kicinski, kuniyu

Eric points out that our current suggested way of handling
EMSGSIZE errors ((err == -EMSGSIZE) ? skb->len : err) will
break if we didn't fit even a single object into the buffer
provided by the user. This should not happen for well behaved
applications, but we can fix that, and free netlink families
from dealing with that completely by moving error handling
into the core.

Let's assume from now on that all EMSGSIZE errors in dumps are
because we run out of skb space. Families can now propagate
the error nla_put_*() etc generated and not worry about any
return value magic. If some family really wants to send EMSGSIZE
to user space, assuming it generates the same error on the next
dump iteration the skb->len should be 0, and user space should
still see the EMSGSIZE.

This should simplify families and prevent mistakes in return
values which lead to DONE being forced into a separate recv()
call as discovered by Ido some time ago.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: kuniyu@amazon.com
---
 net/netlink/af_netlink.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index ad7b645e3ae7..da846212fb9b 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2267,6 +2267,15 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
 		if (extra_mutex)
 			mutex_unlock(extra_mutex);
 
+		/* EMSGSIZE plus something already in the skb means
+		 * that there's more to dump but current skb has filled up.
+		 * If the callback really wants to return EMSGSIZE to user space
+		 * it needs to do so again, on the next cb->dump() call,
+		 * without putting data in the skb.
+		 */
+		if (nlk->dump_done_errno == -EMSGSIZE && skb->len)
+			nlk->dump_done_errno = skb->len;
+
 		cb->extack = NULL;
 	}
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors
  2024-03-01  1:28 [PATCH net-next 0/3] netlink: handle EMSGSIZE errors in the core Jakub Kicinski
  2024-03-01  1:28 ` [PATCH net-next 1/3] " Jakub Kicinski
@ 2024-03-01  1:28 ` Jakub Kicinski
  2024-03-01  8:37   ` Eric Dumazet
  2024-03-01  1:28 ` [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2024-03-01  1:28 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, johannes, fw, pablo, idosch, jiri,
	Jakub Kicinski, amritha.nambiar, sridhar.samudrala, hawk

Previous change added -EMSGSIZE handling to af_netlink, we don't
have to hide these errors any longer.

Theoretically the error handling changes from:
 if (err == -EMSGSIZE)
to
 if (err == -EMSGSIZE && skb->len)

everywhere, but in practice it doesn't matter.
All messages fit into NLMSG_GOODSIZE, so overflow of an empty
skb cannot happen.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: amritha.nambiar@intel.com
CC: sridhar.samudrala@intel.com
CC: hawk@kernel.org
---
 net/core/netdev-genl.c    | 15 +++------------
 net/core/page_pool_user.c |  2 --
 2 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index fd98936da3ae..918b109e0cf4 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -152,10 +152,7 @@ int netdev_nl_dev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	}
 	rtnl_unlock();
 
-	if (err != -EMSGSIZE)
-		return err;
-
-	return skb->len;
+	return err;
 }
 
 static int
@@ -287,10 +284,7 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	}
 	rtnl_unlock();
 
-	if (err != -EMSGSIZE)
-		return err;
-
-	return skb->len;
+	return err;
 }
 
 static int
@@ -463,10 +457,7 @@ int netdev_nl_queue_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	}
 	rtnl_unlock();
 
-	if (err != -EMSGSIZE)
-		return err;
-
-	return skb->len;
+	return err;
 }
 
 static int netdev_genl_netdevice_event(struct notifier_block *nb,
diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 278294aca66a..3a3277ba167b 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -103,8 +103,6 @@ netdev_nl_page_pool_get_dump(struct sk_buff *skb, struct netlink_callback *cb,
 	mutex_unlock(&page_pools_lock);
 	rtnl_unlock();
 
-	if (skb->len && err == -EMSGSIZE)
-		return skb->len;
 	return err;
 }
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families
  2024-03-01  1:28 [PATCH net-next 0/3] netlink: handle EMSGSIZE errors in the core Jakub Kicinski
  2024-03-01  1:28 ` [PATCH net-next 1/3] " Jakub Kicinski
  2024-03-01  1:28 ` [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors Jakub Kicinski
@ 2024-03-01  1:28 ` Jakub Kicinski
  2024-03-01  8:40   ` Eric Dumazet
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2024-03-01  1:28 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, johannes, fw, pablo, idosch, jiri,
	Jakub Kicinski

Make sure ctrl_fill_info() returns sensible error codes and
propagate them out to netlink core. Let netlink core decide
when to return skb->len and when to treat the exit as an
error. Netlink core does better job at it, if we always
return skb->len the core doesn't know when we're done
dumping and NLMSG_DONE ends up in a separate read().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: jiri@resnulli.us
---
 net/netlink/genetlink.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 50ec599a5cff..70379ecfb6ed 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1232,7 +1232,7 @@ static int ctrl_fill_info(const struct genl_family *family, u32 portid, u32 seq,
 
 	hdr = genlmsg_put(skb, portid, seq, &genl_ctrl, flags, cmd);
 	if (hdr == NULL)
-		return -1;
+		return -EMSGSIZE;
 
 	if (nla_put_string(skb, CTRL_ATTR_FAMILY_NAME, family->name) ||
 	    nla_put_u16(skb, CTRL_ATTR_FAMILY_ID, family->id) ||
@@ -1355,6 +1355,7 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
 	struct net *net = sock_net(skb->sk);
 	int fams_to_skip = cb->args[0];
 	unsigned int id;
+	int err;
 
 	idr_for_each_entry(&genl_fam_idr, rt, id) {
 		if (!rt->netnsok && !net_eq(net, &init_net))
@@ -1363,16 +1364,17 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
 		if (n++ < fams_to_skip)
 			continue;
 
-		if (ctrl_fill_info(rt, NETLINK_CB(cb->skb).portid,
-				   cb->nlh->nlmsg_seq, NLM_F_MULTI,
-				   skb, CTRL_CMD_NEWFAMILY) < 0) {
+		err = ctrl_fill_info(rt, NETLINK_CB(cb->skb).portid,
+				     cb->nlh->nlmsg_seq, NLM_F_MULTI,
+				     skb, CTRL_CMD_NEWFAMILY);
+		if (err) {
 			n--;
 			break;
 		}
 	}
 
 	cb->args[0] = n;
-	return skb->len;
+	return err;
 }
 
 static struct sk_buff *ctrl_build_family_msg(const struct genl_family *family,
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 1/3] netlink: handle EMSGSIZE errors in the core
  2024-03-01  1:28 ` [PATCH net-next 1/3] " Jakub Kicinski
@ 2024-03-01  8:35   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2024-03-01  8:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, pabeni, johannes, fw, pablo, idosch, jiri, kuniyu

On Fri, Mar 1, 2024 at 2:31 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Eric points out that our current suggested way of handling
> EMSGSIZE errors ((err == -EMSGSIZE) ? skb->len : err) will
> break if we didn't fit even a single object into the buffer
> provided by the user. This should not happen for well behaved
> applications, but we can fix that, and free netlink families
> from dealing with that completely by moving error handling
> into the core.
>
> Let's assume from now on that all EMSGSIZE errors in dumps are
> because we run out of skb space. Families can now propagate
> the error nla_put_*() etc generated and not worry about any
> return value magic. If some family really wants to send EMSGSIZE
> to user space, assuming it generates the same error on the next
> dump iteration the skb->len should be 0, and user space should
> still see the EMSGSIZE.
>
> This should simplify families and prevent mistakes in return
> values which lead to DONE being forced into a separate recv()
> call as discovered by Ido some time ago.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors
  2024-03-01  1:28 ` [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors Jakub Kicinski
@ 2024-03-01  8:37   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2024-03-01  8:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, pabeni, johannes, fw, pablo, idosch, jiri,
	amritha.nambiar, sridhar.samudrala, hawk

On Fri, Mar 1, 2024 at 2:31 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Previous change added -EMSGSIZE handling to af_netlink, we don't
> have to hide these errors any longer.
>
> Theoretically the error handling changes from:
>  if (err == -EMSGSIZE)
> to
>  if (err == -EMSGSIZE && skb->len)
>
> everywhere, but in practice it doesn't matter.
> All messages fit into NLMSG_GOODSIZE, so overflow of an empty
> skb cannot happen.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families
  2024-03-01  1:28 ` [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families Jakub Kicinski
@ 2024-03-01  8:40   ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2024-03-01  8:40 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, pabeni, johannes, fw, pablo, idosch, jiri

On Fri, Mar 1, 2024 at 2:31 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Make sure ctrl_fill_info() returns sensible error codes and
> propagate them out to netlink core. Let netlink core decide
> when to return skb->len and when to treat the exit as an
> error. Netlink core does better job at it, if we always
> return skb->len the core doesn't know when we're done
> dumping and NLMSG_DONE ends up in a separate read().
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jiri@resnulli.us
> ---
>  net/netlink/genetlink.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> index 50ec599a5cff..70379ecfb6ed 100644
> --- a/net/netlink/genetlink.c
> +++ b/net/netlink/genetlink.c
> @@ -1232,7 +1232,7 @@ static int ctrl_fill_info(const struct genl_family *family, u32 portid, u32 seq,
>
>         hdr = genlmsg_put(skb, portid, seq, &genl_ctrl, flags, cmd);
>         if (hdr == NULL)
> -               return -1;
> +               return -EMSGSIZE;
>
>         if (nla_put_string(skb, CTRL_ATTR_FAMILY_NAME, family->name) ||
>             nla_put_u16(skb, CTRL_ATTR_FAMILY_ID, family->id) ||
> @@ -1355,6 +1355,7 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
>         struct net *net = sock_net(skb->sk);
>         int fams_to_skip = cb->args[0];
>         unsigned int id;
> +       int err;

Don't we have to initialize err to 0 ?

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks !

>
>         idr_for_each_entry(&genl_fam_idr, rt, id) {
>                 if (!rt->netnsok && !net_eq(net, &init_net))
> @@ -1363,16 +1364,17 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
>                 if (n++ < fams_to_skip)
>                         continue;
>
> -               if (ctrl_fill_info(rt, NETLINK_CB(cb->skb).portid,
> -                                  cb->nlh->nlmsg_seq, NLM_F_MULTI,
> -                                  skb, CTRL_CMD_NEWFAMILY) < 0) {
> +               err = ctrl_fill_info(rt, NETLINK_CB(cb->skb).portid,
> +                                    cb->nlh->nlmsg_seq, NLM_F_MULTI,
> +                                    skb, CTRL_CMD_NEWFAMILY);
> +               if (err) {
>                         n--;
>                         break;
>                 }
>         }
>
>         cb->args[0] = n;
> -       return skb->len;
> +       return err;
>  }
>
>  static struct sk_buff *ctrl_build_family_msg(const struct genl_family *family,
> --
> 2.43.2
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-01  8:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-01  1:28 [PATCH net-next 0/3] netlink: handle EMSGSIZE errors in the core Jakub Kicinski
2024-03-01  1:28 ` [PATCH net-next 1/3] " Jakub Kicinski
2024-03-01  8:35   ` Eric Dumazet
2024-03-01  1:28 ` [PATCH net-next 2/3] netdev: let netlink core handle -EMSGSIZE errors Jakub Kicinski
2024-03-01  8:37   ` Eric Dumazet
2024-03-01  1:28 ` [PATCH net-next 3/3] genetlink: fit NLMSG_DONE into same read() as families Jakub Kicinski
2024-03-01  8:40   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).