From: Gregory Price <gregory.price@memverge.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Gregory Price <gourry.memverge@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-api@vger.kernel.org, corbet@lwn.net,
akpm@linux-foundation.org, honggyu.kim@sk.com, rakie.kim@sk.com,
hyeongtak.ji@sk.com, mhocko@kernel.org, vtavarespetr@micron.com,
jgroves@micron.com, ravis.opensrc@micron.com,
sthanneeru@micron.com, emirakhur@micron.com, Hasan.Maruf@amd.com,
seungjun.ha@samsung.com, hannes@cmpxchg.org,
dan.j.williams@intel.com,
Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
Subject: Re: [PATCH v3 3/4] mm/mempolicy: introduce MPOL_WEIGHTED_INTERLEAVE for weighted interleaving
Date: Fri, 26 Jan 2024 10:57:51 -0500 [thread overview]
Message-ID: <ZbPWf9HbUNA1MELh@memverge.com> (raw)
In-Reply-To: <87y1cclgcm.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Fri, Jan 26, 2024 at 03:10:49PM +0800, Huang, Ying wrote:
> Gregory Price <gourry.memverge@gmail.com> writes:
>
> > + } else if (pol == current->mempolicy &&
> > + (pol->mode == MPOL_WEIGHTED_INTERLEAVE)) {
> > + if (pol->cur_il_weight)
> > + *policy = current->il_prev;
> > + else
> > + *policy = next_node_in(current->il_prev,
> > + pol->nodes);
>
> It appears that my previous comments about this is ignored.
>
> https://lore.kernel.org/linux-mm/875xzkv3x2.fsf@yhuang6-desk2.ccr.corp.intel.com/
>
> Please correct me if I am wrong.
>
The fix is in the following patch. I'd originally planned to squash the
atomic patch into this one, but decided against it because it probably
warranted isolated scrutiny.
@@ -973,8 +974,10 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
*policy = next_node_in(current->il_prev, pol->nodes);
} else if (pol == current->mempolicy &&
(pol->mode == MPOL_WEIGHTED_INTERLEAVE)) {
- if (pol->cur_il_weight)
- *policy = current->il_prev;
+ int cweight = atomic_read(&pol->cur_il_weight);
+
+ if (cweight & 0xFF)
+ *policy = cweight >> 8;
in this we return the node the weight applies to, otherwise we return
whatever is after il_prev.
I can pull this fix ahead.
> > + /* if now at 0, move to next node and set up that node's weight */
> > + if (unlikely(!policy->cur_il_weight)) {
> > + me->il_prev = node;
> > + next = next_node_in(node, policy->nodes);
> > + rcu_read_lock();
> > + table = rcu_dereference(iw_table);
> > + /* detect system-default values */
> > + weight = table ? table[next] : 1;
> > + policy->cur_il_weight = weight ? weight : 1;
> > + rcu_read_unlock();
> > + }
>
> It appears that the code could be more concise if we allow
> policy->cur_il_weight == 0. Duplicated code are in
> alloc_pages_bulk_array_weighted_interleave() too. Anyway, can we define
> some function to reduce duplicated code.
>
This is kind of complicated by the next patch, which places the node and
the weight into the same field to resolve the stale weight issue.
In that patch (cur_il_weight = 0) means "cur_il_weight invalid",
because the weight part can only be 0 when:
a) an error occuring during bulk allocation
b) a rebind event
I'll take some time to think about whether we can do away with
task->il_prev (as your next patch notes mentioned).
> > + /* Otherwise we adjust nr_pages down, and continue from there */
> > + rem_pages -= pol->cur_il_weight;
> > + pol->cur_il_weight = 0;
>
> This break the rule to keep pol->cur_il_weight != 0 except after initial
> setup. Is it OK?
>
The only way cur_il_weight can leave this function 0 at this point is if
an error occurs (specifically the failure to kmalloc immediately next).
If we don't clear cur_il_weight here, then we have a stale weight, and
the next allocation pass will over-allocate on the current node.
This semantic also changes a bit in the next patch, but is basically the
same. If il_weight is 0, then either an error occurred or a rebind
event occured.
> > + /* resume from this node w/ remaining weight */
> > + resume_node = prev_node;
> > + resume_weight = weight - (node_pages % weight);
>
> resume_weight = weight - delta; ?
>
ack
~Gregory
next prev parent reply other threads:[~2024-01-26 15:58 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-25 18:43 [PATCH v3 0/4] mm/mempolicy: weighted interleave mempolicy and sysfs extension Gregory Price
2024-01-25 18:43 ` [PATCH v3 1/4] mm/mempolicy: implement the sysfs-based weighted_interleave interface Gregory Price
2024-01-25 18:43 ` [PATCH v3 2/4] mm/mempolicy: refactor a read-once mechanism into a function for re-use Gregory Price
2024-01-25 18:43 ` [PATCH v3 3/4] mm/mempolicy: introduce MPOL_WEIGHTED_INTERLEAVE for weighted interleaving Gregory Price
2024-01-26 7:10 ` Huang, Ying
2024-01-26 15:57 ` Gregory Price [this message]
2024-01-25 18:43 ` [PATCH v3 4/4] mm/mempolicy: change cur_il_weight to atomic and carry the node with it Gregory Price
2024-01-26 7:40 ` Huang, Ying
2024-01-26 16:38 ` Gregory Price
2024-01-29 8:17 ` Huang, Ying
2024-01-29 15:48 ` Gregory Price
2024-01-29 18:11 ` Gregory Price
2024-01-30 3:15 ` Huang, Ying
2024-01-30 3:33 ` Gregory Price
2024-01-30 5:18 ` Huang, Ying
2024-01-30 16:01 ` Gregory Price
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZbPWf9HbUNA1MELh@memverge.com \
--to=gregory.price@memverge.com \
--cc=Hasan.Maruf@amd.com \
--cc=akpm@linux-foundation.org \
--cc=corbet@lwn.net \
--cc=dan.j.williams@intel.com \
--cc=emirakhur@micron.com \
--cc=gourry.memverge@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=honggyu.kim@sk.com \
--cc=hyeongtak.ji@sk.com \
--cc=jgroves@micron.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=rakie.kim@sk.com \
--cc=ravis.opensrc@micron.com \
--cc=seungjun.ha@samsung.com \
--cc=sthanneeru.opensrc@micron.com \
--cc=sthanneeru@micron.com \
--cc=vtavarespetr@micron.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.