All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gregory Price <gregory.price@memverge.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Gregory Price <gourry.memverge@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, corbet@lwn.net,
	akpm@linux-foundation.org, honggyu.kim@sk.com, rakie.kim@sk.com,
	hyeongtak.ji@sk.com, mhocko@kernel.org, vtavarespetr@micron.com,
	jgroves@micron.com, ravis.opensrc@micron.com,
	sthanneeru@micron.com, emirakhur@micron.com, Hasan.Maruf@amd.com,
	seungjun.ha@samsung.com, hannes@cmpxchg.org,
	dan.j.williams@intel.com
Subject: Re: [PATCH 1/3] mm/mempolicy: implement the sysfs-based weighted_interleave interface
Date: Wed, 17 Jan 2024 00:24:41 -0500	[thread overview]
Message-ID: <ZadkmWj3Rd483f68@memverge.com> (raw)
In-Reply-To: <87le8r1dzr.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Mon, Jan 15, 2024 at 11:18:00AM +0800, Huang, Ying wrote:
> Gregory Price <gourry.memverge@gmail.com> writes:
> 
> > +static struct iw_table default_iw_table;
> > +/*
> > + * iw_table is the sysfs-set interleave weight table, a value of 0
> > + * denotes that the default_iw_table value should be used.
> > + *
> > + * iw_table is RCU protected
> > + */
> > +static struct iw_table __rcu *iw_table;
> > +static DEFINE_MUTEX(iw_table_mtx);
> 
> I greped "mtx" in kernel/*.c and mm/*.c and found nothing.  To following
> the existing coding convention, better to name this as iw_table_mutex or
> iw_table_lock?
> 

ack.

> And, I think this is used to protect both iw_table and default_iw_table?
> If so, it deserves some comments.
> 

Right now default_iw_table cannot be updated, and so it is neither
protected nor requires protection.

I planned to add the protection comment in the next patch series, which
would implement the kernel-side interface for updating the default
weights during boot/hotplug.

We haven't had the discussion on how/when this should happen yet,
though, and there's some research to be done.  (i.e. when should DRAM
weights be set? should the entire table be reweighted on hotplug? etc)

> > +static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
> > +			  const char *buf, size_t count)
> > +{
> > +	struct iw_node_attr *node_attr;
> > +	struct iw_table __rcu *new;
> > +	struct iw_table __rcu *old;
> > +	u8 weight = 0;
> > +
> > +	node_attr = container_of(attr, struct iw_node_attr, kobj_attr);
> > +	if (count == 0 || sysfs_streq(buf, ""))
> > +		weight = 0;
> > +	else if (kstrtou8(buf, 0, &weight))
> > +		return -EINVAL;
> > +
> > +	new = kmalloc(sizeof(*new), GFP_KERNEL);
> > +	if (!new)
> > +		return -ENOMEM;
> > +
> > +	mutex_lock(&iw_table_mtx);
> > +	old = rcu_dereference_protected(iw_table,
> > +					lockdep_is_held(&iw_table_mtx));
> > +	/* If value is 0, revert to default weight */
> > +	weight = weight ? weight : default_iw_table.weights[node_attr->nid];
> 
> If we change the default weight in default_iw_table.weights[], how do we
> identify whether the weight has been customized by users via sysfs?  So,
> I suggest to use 0 in iw_table for not-customized weight.
> 
> And if so, we need to use RCU to access default_iw_table too.
>

Dumb simplification on my part, I'll walk this back and add the 

if (!weight) weight = default_iw_table[node]

logic back into the allocator paths accordinly.

> > +	memcpy(&new->weights, &old->weights, sizeof(new->weights));
> > +	new->weights[node_attr->nid] = weight;
> > +	rcu_assign_pointer(iw_table, new);
> > +	mutex_unlock(&iw_table_mtx);
> > +	kfree_rcu(old, rcu);
> 
> synchronize_rcu() should be OK here.  It's fast enough in this cold
> path.  This make it good to define iw_table as
> 
I'll take a look.

> u8 __rcu *iw_table;
> 
> Then, we only need to allocate nr_node_ids elements now.
> 

We need nr_possible_nodes to handle hotplug correctly.

I decided to simplify this down to MAX_NUMNODES *juuuuuust in case*
"true node hotplug" ever becomes a reality.  If that happens, then
only allocating space for possible nodes creates a much bigger
headache on hotplug.

For the sake of that simplification, it seemed better to just eat the
1KB.  If you really want me to do that, I will, but the MAX_NUMNODES
choice was an explicitly defensive choice.

> > +static int __init mempolicy_sysfs_init(void)
> > +{
> > +	/*
> > +	 * if sysfs is not enabled MPOL_WEIGHTED_INTERLEAVE defaults to
> > +	 * MPOL_INTERLEAVE behavior, but is still defined separately to
> > +	 * allow task-local weighted interleave and system-defaults to
> > +	 * operate as intended.
> > +	 *
> > +	 * In this scenario iw_table cannot (presently) change, so
> > +	 * there's no need to set up RCU / cleanup code.
> > +	 */
> > +	memset(&default_iw_table.weights, 1, sizeof(default_iw_table));
> 
> This depends on sizeof(default_iw_table.weights[0]) == 1, I think it's
> better to use explicit loop here to make the code more robust a little.
> 

oh hm, you're right.  rookie mistake on my part.

Thanks,
Gregory

  reply	other threads:[~2024-01-17  5:24 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-12 21:08 [PATCH 0/3] mm/mempolicy: weighted interleave mempolicy with sysfs extension Gregory Price
2024-01-12 21:08 ` [PATCH 1/3] mm/mempolicy: implement the sysfs-based weighted_interleave interface Gregory Price
2024-01-15  3:18   ` Huang, Ying
2024-01-17  5:24     ` Gregory Price [this message]
2024-01-17  6:58       ` Huang, Ying
2024-01-17 17:46         ` Gregory Price
2024-01-18  4:37           ` Huang, Ying
2024-01-12 21:08 ` [PATCH 2/3] mm/mempolicy: refactor a read-once mechanism into a function for re-use Gregory Price
2024-01-15  4:13   ` Huang, Ying
2024-01-17  5:26     ` Gregory Price
2024-01-12 21:08 ` [PATCH 3/3] mm/mempolicy: introduce MPOL_WEIGHTED_INTERLEAVE for weighted interleaving Gregory Price
2024-01-15  5:47   ` Huang, Ying
2024-01-17  5:34     ` Gregory Price
2024-01-18  1:28       ` Huang, Ying
2024-01-18  3:05   ` Huang, Ying
2024-01-18  4:06     ` Gregory Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZadkmWj3Rd483f68@memverge.com \
    --to=gregory.price@memverge.com \
    --cc=Hasan.Maruf@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=emirakhur@micron.com \
    --cc=gourry.memverge@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=honggyu.kim@sk.com \
    --cc=hyeongtak.ji@sk.com \
    --cc=jgroves@micron.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rakie.kim@sk.com \
    --cc=ravis.opensrc@micron.com \
    --cc=seungjun.ha@samsung.com \
    --cc=sthanneeru@micron.com \
    --cc=vtavarespetr@micron.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.