public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, pabeni@redhat.com, davem@davemloft.net,
	edumazet@google.com, jacob.e.keller@intel.com, jhs@mojatatu.com,
	johannes@sipsolutions.net, andriy.shevchenko@linux.intel.com,
	amritha.nambiar@intel.com, sdf@google.com, horms@kernel.org
Subject: Re: [patch net-next v4 5/9] genetlink: introduce per-sock family private pointer storage
Date: Tue, 28 Nov 2023 17:05:48 +0100	[thread overview]
Message-ID: <ZWYP3H0wtaWxwneR@nanopsycho> (raw)
In-Reply-To: <20231128071116.1b6aed13@kernel.org>

Tue, Nov 28, 2023 at 04:11:16PM CET, kuba@kernel.org wrote:
>On Tue, 28 Nov 2023 09:25:21 +0100 Jiri Pirko wrote:
>> >Put the xarray pointer here directly. Plus a lock to protect the init.  
>> 
>> Okay, just to make this clear. You want me to have:
>> 	struct xarray __rcu		*family_privs;
>> 
>> in struct netlink_sock, correct?
>> 
>> 
>> Why I need a lock? If I read things correctly, skbs are processed in
>> serial over one sock. Since this is per-sock, should be safe.
>
>Okay, then add an assertion that the socket lock is held, at least.

No socket lock, but I assumed revcmsg could be called only in parallel.
But I guess that with multiple threads, this assumption is broken.
Okay, will add a spin lock.


>Also, is the socket lock not held yet when the filtering happens?

Nope.


>Maybe the __rcu annotation isn't necessary then either?
>
>> >The size of the per-family struct should be in family definition,
>> >allocation should happen on first get automatically. Family definition  
>> 
>> Yes, I thought about that. But I decided to do this lockless, allocating
>> new priv every time the user sets the filter and replace the xarray item
>> so it could be accessed in rcu read section during notification
>> processing.
>> 
>> What you suggest requires some lock to protect the memory being changed
>> during filter set and suring accessing in in notify. But okay,
>> if you insist.
>
>Not necessarily, you can have a helper which doesn't allocate, too.
>What I'm saying is that the common case for ops will be to access
>the state and allocate if it doesn't exist.
>
>How about genl_sk_family_priv() and genl_sk_has_family_priv() ?

My point is, with what you suggest, it will look something like this:

1) user does DEVLINK_CMD_NOTIFY_FILTER_SET
2) devlink calls into genetlink code for a sk_priv and inserts in xa_array
3) genetlink allocates sk_priv and returns back
4) devlink fills-up the sk_priv

5) user does DEVLINK_CMD_NOTIFY_FILTER_SET, again
6) devlink calls into genetlink code for a sk_priv
7) genetlink returns already exising sk_priv found in xa_array
8) devlink fills-up the sk_priv

Now the notification thread, sees sk_priv zeroed between 3) and 4)
and inconsistent during 4) and 8)

I originally solved that by rcu, DEVLINK_CMD_NOTIFY_FILTER_SET
code always allocates and flips the rcu pointer. Notification thread
always sees sk_priv consistent.

If you want to allocate sk_priv in genetlink code once, hard to use
the rcu mechanism and have to protect the sk_priv memory by a lock.

What am I missing?


>
>> >should also hold a callback to how the data is going to be freed.  
>> 
>> If it is alloceted automatically, why is it needed?
>
>Because priv may be a complex type which has member that need
>individual fields to be destroyed (in fullness of time we also
>need a constructor which can init things like list_head, but
>we can defer that).
>
>I'm guessing in your case the priv will look like this:
>
>struct devlink_sk_priv {
>	const char *nft_fltr_instance_name;
>};
>
>static void devlink_sk_priv_free(void *ptr)
>{
>	struct devlink_sk_priv *priv = ptr;
>
>	kfree(priv->nft_fltr_instance_name);
>}

If genetlink code does the allocation, it should know how to free.
Does not make sense to pass destructor to genetlink code to free memory
it actually allocated :/

If devlink does the allocation, this callback makes sense. I was
thinking about having it, but decided kfree is okay for now and
destructor could be always introduced if needed.

Quoting the patch description:
    Assume that kfree() is good for free of privs for now, as the only user
    introduced by the follow-up patch (devlink) will use kzalloc() for the
    allocation of the memory of the stored pointer. If later on
    this needs to be made custom, a callback is going to be needed.
    Until then (if ever), do this in a simple way.



  reply	other threads:[~2023-11-28 16:05 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-23 18:15 [patch net-next v4 0/9] devlink: introduce notifications filtering Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 1/9] devlink: use devl_is_registered() helper instead xa_get_mark() Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 2/9] devlink: introduce __devl_is_registered() helper and use it instead of xa_get_mark() Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 3/9] devlink: send notifications only if there are listeners Jiri Pirko
2023-11-27 11:01   ` Paolo Abeni
2023-11-27 12:04     ` Jiri Pirko
2023-11-27 15:00       ` Paolo Abeni
2023-11-28  7:39         ` Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 4/9] devlink: introduce a helper for netlink multicast send Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 5/9] genetlink: introduce per-sock family private pointer storage Jiri Pirko
2023-11-27 11:13   ` Paolo Abeni
2023-11-27 12:00     ` Jiri Pirko
2023-11-27 22:46   ` Jakub Kicinski
2023-11-28  8:25     ` Jiri Pirko
2023-11-28 15:11       ` Jakub Kicinski
2023-11-28 16:05         ` Jiri Pirko [this message]
2023-11-28 16:36           ` Jakub Kicinski
2023-11-29 13:59             ` Jiri Pirko
2023-11-29 15:01               ` Jakub Kicinski
2023-11-29 15:25                 ` Jiri Pirko
2023-11-28 12:30   ` Przemek Kitszel
2023-11-28 15:05     ` Jiri Pirko
2023-11-28 16:18     ` Andy Shevchenko
2023-11-28 19:59       ` Jacob Keller
2023-11-28 20:06         ` Andy Shevchenko
2023-11-29 23:29           ` Jacob Keller
2023-11-23 18:15 ` [patch net-next v4 6/9] netlink: introduce typedef for filter function Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 7/9] genetlink: introduce helpers to do filtered multicast Jiri Pirko
2023-11-23 18:15 ` [patch net-next v4 8/9] devlink: add a command to set notification filter and use it for multicasts Jiri Pirko
2023-11-27 12:30   ` Przemek Kitszel
2023-11-27 12:51     ` Jiri Pirko
2023-11-27 12:56       ` Jiri Pirko
2023-11-27 15:40   ` Przemek Kitszel
2023-11-28  8:26     ` Jiri Pirko
2023-12-04 16:24     ` Jiri Pirko
2023-12-04 16:47       ` Andy Shevchenko
2023-12-04 19:17       ` Keller, Jacob E
2023-12-05  7:47         ` Jiri Pirko
2023-12-05 15:58           ` andriy.shevchenko
2023-11-23 18:15 ` [patch net-next v4 9/9] devlink: extend multicast filtering by port index Jiri Pirko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZWYP3H0wtaWxwneR@nanopsycho \
    --to=jiri@resnulli.us \
    --cc=amritha.nambiar@intel.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jhs@mojatatu.com \
    --cc=johannes@sipsolutions.net \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox