From: Gregory Price <gregory.price@memverge.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Gregory Price <gourry.memverge@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-api@vger.kernel.org, corbet@lwn.net,
akpm@linux-foundation.org, honggyu.kim@sk.com, rakie.kim@sk.com,
hyeongtak.ji@sk.com, mhocko@kernel.org, vtavarespetr@micron.com,
jgroves@micron.com, ravis.opensrc@micron.com,
sthanneeru@micron.com, emirakhur@micron.com, Hasan.Maruf@amd.com,
seungjun.ha@samsung.com, hannes@cmpxchg.org,
dan.j.williams@intel.com,
Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
Subject: Re: [PATCH v2 3/3] mm/mempolicy: introduce MPOL_WEIGHTED_INTERLEAVE for weighted interleaving
Date: Mon, 22 Jan 2024 23:54:34 -0500 [thread overview]
Message-ID: <Za9GiqsZtcfKXc5m@memverge.com> (raw)
In-Reply-To: <87jzo0vjkk.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Tue, Jan 23, 2024 at 11:02:03AM +0800, Huang, Ying wrote:
> Gregory Price <gourry.memverge@gmail.com> writes:
>
> > + int prev_node = NUMA_NO_NODE;
>
> It appears that we should initialize prev_node with me->il_prev?
> Details are as below.
>
yeah good catch, was a rebase error from my tested code, where this is
the case. patching now.
> > + if (rem_pages <= pol->wil.cur_weight) {
> > + pol->wil.cur_weight -= rem_pages;
>
> If "pol->wil.cur_weight == 0" here, we need to change me->il_prev?
>
you are right, and also need to fetch the next cur_weight. Seems I
missed this specific case in my tests. (had this tested with a single
node but not 2, so it looked right).
Added to my test suite.
> We can replace "weight_nodes" with "i" and use a "for" loop?
>
> > + while (weight_nodes < nnodes) {
> > + node = next_node_in(prev_node, nodes);
>
> IIUC, "node" will not change in the loop, so all "weight" below will be
> the same value. To keep it simple, I think we can just copy weights
> from the global iw_table and consider the default value?
>
another rebase error here from my tested code, this should have been
node = prev_node;
while (...)
node = next_node_in(node, nodes);
I can change it to a for loop as suggested, but for more info on why I
did it this way, see the chunk below
> > + } else if (!delta_depleted) {
> > + /* if there was no delta, track last allocated node */
> > + resume_node = node;
> > + resume_weight = i < (nnodes - 1) ? weights[i+1] :
> > + weights[0];
^ this line acquires the weight of the *NEXT* node
another chunk prior to this does the same
thing. I suppose i can use next_node_in()
instead and just copy the entire weigh array
though, if that is preferable.
> > + }
>
> Can the above code be simplified as something like below?
>
> resume_node = prev_node;
> resume_weight = 0;
> for (...) {
> ...
> if (delta > weight) {
> node_pages += weight;
> delta -= weight;
> } else if (delta) {
> node_pages += delta;
> /* if delta depleted, resume from this node */
> if (delta < weight) {
> resume_node = prev_node;
> resume_weight = weight - delta;
> } else {
> resume_node = node;
> }
> delta = 0;
> }
> ...
> }
>
I'll take another look at it, but this logic is annoying because of the
corner case: me->il_prev can be NUMA_NO_NODE or an actual numa node.
If it's NUMA_NO_NODE, then the logic you have above will say "the next
node has no remaining weights assigned" and skip it on the next call to
weighted_interleave_nid or weighted_interleave_nodes.
This is incorrect - we want the weight of the first node to be
resume_weight, which is what this chunk does:
if (delta >= weight) {
/* if delta == weight, get next node weight */
resume_weight = i < (nnodes - 1) ? weights[i+1] : weights[0];
else if (delta) { /* delta < weight */
/* there's a remaining weight, use the that for resume weight */
resume_weight = weight - (node_pages % weight);
} else if (!delta_depleted) {
/* there was never a delta, track the last node and get the weight
* of the node AFTER that node, that's the resume weight */
resume_weight = i < (nnodes - 1) ? weights[i+1] : weights[0];
}
If il_prev is an actual node, and delta == 0, we want to return with
(il_prev = prev_node) but with the weight set to the weight of the
first node we're about to allocate from.
This is the reason for the annoying logic here: We have to come out of
this loop with the actual node and the actual weight.
I'll try to clean it up further and get my test suite to pass.
~Gregory
next prev parent reply other threads:[~2024-01-23 4:54 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-19 17:57 [PATCH v2 0/3] mm/mempolicy: weighted interleave mempolicy and sysfs extension Gregory Price
2024-01-19 17:57 ` [PATCH v2 1/3] mm/mempolicy: implement the sysfs-based weighted_interleave interface Gregory Price
2024-01-22 8:03 ` Huang, Ying
2024-01-22 16:58 ` Gregory Price
2024-01-19 17:57 ` [PATCH v2 2/3] mm/mempolicy: refactor a read-once mechanism into a function for re-use Gregory Price
2024-01-19 17:57 ` [PATCH v2 3/3] mm/mempolicy: introduce MPOL_WEIGHTED_INTERLEAVE for weighted interleaving Gregory Price
2024-01-23 3:02 ` Huang, Ying
2024-01-23 4:54 ` Gregory Price [this message]
2024-01-23 5:16 ` Gregory Price
2024-01-23 8:35 ` Huang, Ying
2024-01-23 21:27 ` Gregory Price
2024-01-24 1:51 ` Huang, Ying
2024-01-24 18:01 ` Gregory Price
2024-01-23 8:13 ` Huang, Ying
2024-01-23 8:40 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Za9GiqsZtcfKXc5m@memverge.com \
--to=gregory.price@memverge.com \
--cc=Hasan.Maruf@amd.com \
--cc=akpm@linux-foundation.org \
--cc=corbet@lwn.net \
--cc=dan.j.williams@intel.com \
--cc=emirakhur@micron.com \
--cc=gourry.memverge@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=honggyu.kim@sk.com \
--cc=hyeongtak.ji@sk.com \
--cc=jgroves@micron.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=rakie.kim@sk.com \
--cc=ravis.opensrc@micron.com \
--cc=seungjun.ha@samsung.com \
--cc=sthanneeru.opensrc@micron.com \
--cc=sthanneeru@micron.com \
--cc=vtavarespetr@micron.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.