From: Mateusz Guzik <mjguzik@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Dennis Zhou <dennis@kernel.org>,
linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
akpm@linux-foundation.org, shakeelb@google.com,
linux-mm@kvack.org
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
Date: Tue, 22 Aug 2023 16:24:56 +0200 [thread overview]
Message-ID: <CAGudoHHe5nzRTuj4G1fphD+JJ02TE5BnHEDwFm=-W6DoEj2qVQ@mail.gmail.com> (raw)
In-Reply-To: <20230822095154.7cr5ofogw552z3jk@quack3>
On 8/22/23, Jan Kara <jack@suse.cz> wrote:
> On Tue 22-08-23 00:29:49, Mateusz Guzik wrote:
>> On 8/21/23, Mateusz Guzik <mjguzik@gmail.com> wrote:
>> > True Fix(tm) is a longer story.
>> >
>> > Maybe let's sort out this patchset first, whichever way. :)
>> >
>>
>> So I found the discussion around the original patch with a perf
>> regression report.
>>
>> https://lore.kernel.org/linux-mm/20230608111408.s2minsenlcjow7q3@quack3/
>>
>> The reporter suggests dodging the problem by only allocating per-cpu
>> counters when the process is going multithreaded. Given that there is
>> still plenty of forever single-threaded procs out there I think that's
>> does sound like a great plan regardless of what happens with this
>> patchset.
>>
>> Almost all access is already done using dedicated routines, so this
>> should be an afternoon churn to sort out, unless I missed a
>> showstopper. (maybe there is no good place to stuff a flag/whatever
>> other indicator about the state of counters?)
>>
>> That said I'll look into it some time this or next week.
>
> Good, just let me know how it went, I also wanted to start looking into
> this to come up with some concrete patches :). What I had in mind was that
> we could use 'counters == NULL' as an indication that the counter is still
> in 'single counter mode'.
>
In the current state there are only pointers to counters in mm_struct
and there is no storage for them in task_struct. So I don't think
merely null-checking the per-cpu stuff is going to cut it -- where
should the single-threaded counters land?
Bonus problem, non-current can modify these counters and this needs to
be safe against current playing with them at the same time. (and it
would be a shame to require current to use atomic on them)
That said, my initial proposal adds a union:
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5e74ce4a28cd..ea70f0c08286 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -737,7 +737,11 @@ struct mm_struct {
unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for
/proc/PID/auxv */
- struct percpu_counter rss_stat[NR_MM_COUNTERS];
+ union {
+ struct percpu_counter rss_stat[NR_MM_COUNTERS];
+ u64 *rss_stat_single;
+ };
+ bool magic_flag_stuffed_elsewhere;
struct linux_binfmt *binfmt;
Then for single-threaded case an area is allocated for NR_MM_COUNTERS
countes * 2 -- first set updated without any synchro by current
thread. Second set only to be modified by others and protected with
mm->arg_lock. The lock protects remote access to the union to begin
with.
Transition to per-CPU operation sets the magic flag (there is plenty
of spare space in mm_struct, I'll find a good home for it without
growing the struct). It would be a one-way street -- a process which
gets a bunch of threads and goes back to one stays with per-CPU.
Then you get the true value of something by adding both counters.
arg_lock is sparingly used, so remote ops are not expected to contend
with anything. In fact their cost is going to go down since percpu
summation takes a spinlock which also disables interrupts.
Local ops should be about the same in cost as they are right now.
I might have missed some detail in the above description, but I think
the approach is decent.
--
Mateusz Guzik <mjguzik gmail.com>
next prev parent reply other threads:[~2023-08-22 14:25 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-21 20:28 [PATCH 0/2] execve scalability issues, part 1 Mateusz Guzik
2023-08-21 20:28 ` [PATCH 1/2] pcpcntr: add group allocation/free Mateusz Guzik
2023-08-22 13:37 ` Vegard Nossum
2023-08-22 14:06 ` Mateusz Guzik
2023-08-22 17:02 ` Dennis Zhou
2023-08-21 20:28 ` [PATCH 2/2] fork: group allocation of per-cpu counters for mm struct Mateusz Guzik
2023-08-21 21:20 ` Matthew Wilcox
2023-08-21 20:42 ` [PATCH 0/2] execve scalability issues, part 1 Matthew Wilcox
2023-08-21 20:44 ` [PATCH 1/7] mm: Make folios_put() the basis of release_pages() Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 2/7] mm: Convert free_unref_page_list() to use folios Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 3/7] mm: Add free_unref_folios() Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 4/7] mm: Use folios_put() in __folio_batch_release() Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 5/7] memcg: Add mem_cgroup_uncharge_batch() Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 6/7] mm: Remove use of folio list from folios_put() Matthew Wilcox (Oracle)
2023-08-21 20:44 ` [PATCH 7/7] mm: Use free_unref_folios() in put_pages_list() Matthew Wilcox (Oracle)
2023-08-21 21:07 ` [PATCH 0/2] execve scalability issues, part 1 Dennis Zhou
2023-08-21 21:39 ` Mateusz Guzik
2023-08-21 22:29 ` Mateusz Guzik
2023-08-22 9:51 ` Jan Kara
2023-08-22 14:24 ` Mateusz Guzik [this message]
2023-08-23 9:49 ` Jan Kara
2023-08-23 10:49 ` David Laight
2023-08-23 12:01 ` Mateusz Guzik
2023-08-23 12:13 ` Mateusz Guzik
2023-08-23 15:47 ` Jan Kara
2023-08-23 16:10 ` Mateusz Guzik
2023-08-23 16:41 ` Jan Kara
2023-08-23 17:12 ` Mateusz Guzik
2023-08-23 20:27 ` Dennis Zhou
2023-08-24 9:19 ` Jan Kara
2023-08-26 18:33 ` Mateusz Guzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGudoHHe5nzRTuj4G1fphD+JJ02TE5BnHEDwFm=-W6DoEj2qVQ@mail.gmail.com' \
--to=mjguzik@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=dennis@kernel.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=shakeelb@google.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).