cgroups.vger.kernel.org archive mirror
* Explanation for difference between memcg swap accounting and smaps_rollup
@ 2022-02-25 16:10 Benjamin Berg
From: Benjamin Berg @ 2022-02-25 16:10 UTC
  To: cgroups-u79uwXL29TY76Z2rM5mHXA; +Cc: Tejun Heo, Anita Zhang

Hi,

I am seeing memory.swap.current usage for the gnome-shell cgroup that
seems high when I compare it to smaps_rollup for the contained
processes. As I don't have an explanation, I thought I would ask here
(shared memory?).

What I am seeing is the following (full output below, captured after
running tail /dev/zero):

memory.swap.current:
  686MiB
"Swap" lines from /proc/$pid/smaps_rollup added up:
  435MiB

We should be moving launched applications out of the shell cgroup
before doing execve(), so I think we can rule out that as a possible
explanation.

I am mostly curious because we currently do swap-based kills using
systemd-oomd. If swap accounting for GNOME Shell is high, that
unfortunately makes it a more likely target.

Am I missing something obvious?

Benjamin

$ uname -r
5.16.8-200.fc35.x86_64
$ grep -H . org.gnome.Shell-r28gBBs99rhXz5zEmyOJwQ@public.gmane.org/memory.swap.current; \
  for p in $( cat org.gnome.Shell-r28gBBs99rhXz5zEmyOJwQ@public.gmane.org/cgroup.procs ); do \
      ls -l /proc/$p/exe; grep Swap /proc/$p/smaps_rollup; \
  done
org.gnome.Shell-r28gBBs99rhXz5zEmyOJwQ@public.gmane.org/memory.swap.current:712396800
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:00 /proc/2521/exe -> '/usr/bin/gnome-shell (deleted)'
Swap:             294528 kB
SwapPss:          244060 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3853/exe -> /usr/bin/Xwayland
Swap:              55580 kB
SwapPss:           46628 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3884/exe -> /usr/bin/ibus-daemon
Swap:               4104 kB
SwapPss:            4104 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3891/exe -> /usr/libexec/ibus-dconf
Swap:                800 kB
SwapPss:             796 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3892/exe -> /usr/libexec/ibus-extension-gtk3
Swap:              13020 kB
SwapPss:           11864 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3894/exe -> /usr/libexec/ibus-x11
Swap:              16284 kB
SwapPss:           16284 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/3931/exe -> /usr/libexec/ibus-engine-simple
Swap:                312 kB
SwapPss:             312 kB
lrwxrwxrwx. 1 benjamin benjamin 0 Feb 25 16:01 /proc/4086/exe -> /usr/bin/python3.10
Swap:              50640 kB
SwapPss:           49476 kB
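
For anyone who wants to redo the arithmetic, the listing above can be
tallied with a short script. This is only a sketch: the LISTING string
is copied from the output above with the exe lines and the cgroup path
prefix dropped, and the helper name tally() is made up for this example.

```python
import re

# Values copied from the listing above (exe lines and path prefix omitted).
LISTING = """\
memory.swap.current:712396800
Swap:             294528 kB
SwapPss:          244060 kB
Swap:              55580 kB
SwapPss:           46628 kB
Swap:               4104 kB
SwapPss:            4104 kB
Swap:                800 kB
SwapPss:             796 kB
Swap:              13020 kB
SwapPss:           11864 kB
Swap:              16284 kB
SwapPss:           16284 kB
Swap:                312 kB
SwapPss:             312 kB
Swap:              50640 kB
SwapPss:           49476 kB
"""


def tally(listing):
    """Return (memory.swap.current in bytes, per-field kB totals)."""
    totals = {"Swap": 0, "SwapPss": 0}
    cgroup_bytes = None
    for line in listing.splitlines():
        m = re.match(r"(Swap|SwapPss):\s+(\d+) kB$", line.strip())
        if m:
            totals[m.group(1)] += int(m.group(2))
            continue
        m = re.search(r"memory\.swap\.current:(\d+)$", line)
        if m:
            cgroup_bytes = int(m.group(1))
    return cgroup_bytes, totals


cgroup_bytes, totals = tally(LISTING)
print("memory.swap.current: %.1f MiB" % (cgroup_bytes / 2**20))
for field, kb in totals.items():
    print("%s total: %d kB (%.1f MiB)" % (field, kb, kb / 1024))
# Swap that the cgroup is charged for but no process reports in smaps.
print("unexplained: %.1f MiB" % ((cgroup_bytes - totals["Swap"] * 1024) / 2**20))
```

For this listing the Swap lines add up to 435268 kB and the SwapPss
lines to 373524 kB, against 712396800 bytes in memory.swap.current.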


* Re: Explanation for difference between memcg swap accounting and smaps_rollup
@ 2022-03-01 16:53   ` Johannes Weiner
From: Johannes Weiner @ 2022-03-01 16:53 UTC
  To: Benjamin Berg; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Tejun Heo, Anita Zhang

Hi Benjamin,

On Fri, Feb 25, 2022 at 05:10:05PM +0100, Benjamin Berg wrote:
> Hi,
> 
> I am seeing memory.swap.current usage for the gnome-shell cgroup that
> seems high when I compare it to smaps_rollup for the contained
> processes. As I don't have an explanation, I thought I would ask here
> (shared memory?).
> 
> What I am seeing is the following (full output below, captured after
> running tail /dev/zero):
> 
> memory.swap.current:
>   686MiB
> "Swap" lines from /proc/$pid/smaps_rollup added up:
>   435MiB
> 
> We should be moving launched applications out of the shell cgroup
> before doing execve(), so I think we can rule out that as a possible
> explanation.
>
> I am mostly curious because we currently do swap-based kills using
> systemd-oomd. If swap accounting for GNOME Shell is high, that
> unfortunately makes it a more likely target.

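Shared memory is one option. For example, when you access tmpfs files
with open(), read(), write() and close() instead of mmap(), the pages
are charged to the cgroup but are not mapped into any process, so they
never show up in smaps.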
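
To make the tmpfs case concrete, here is a minimal sketch of a process
creating tmpfs pages without ever mapping them. The helper name
fill_file_without_mmap() is invented for this example, and treating
/dev/shm as a tmpfs mount is an assumption about the system.

```python
import os
import tempfile


def fill_file_without_mmap(directory, size):
    """Create a file of `size` zero bytes using only write(), never mmap()."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        chunk = b"\0" * min(size, 1 << 20)
        remaining = size
        while remaining > 0:
            # os.write() may write fewer bytes than requested, so loop.
            remaining -= os.write(fd, chunk[:remaining])
    finally:
        os.close(fd)
    return path
```

Calling fill_file_without_mmap("/dev/shm", 64 << 20) from a process in
the cgroup would, under that assumption, create 64 MiB of shmem pages
that can later be swapped out while no smaps file accounts for them.
Remove the file afterwards to release the memory.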

Another option is swapcache. When swap space is plentiful, the kernel
holds on to the swap copies of pages even after they've been swapped
back in. This way, the next time they need to get "swapped out", no IO
is required; the kernel can just drop the in-memory copy. From an
smaps POV, swapped-in pages are Rss, not Swap. But their swap copies
still contribute to memory.swap.current, hence the discrepancy.
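
A rough way to see how much of a cgroup's swap charge is backed by
swapcache, without a debugger, is the swapcached entry in the cgroup v2
memory.stat file (present on recent kernels). The sketch below only
parses text; the function names and the sample memory.stat values are
made up for illustration, and the subtraction is an estimate rather
than an exact accounting.

```python
def parse_flat_keyed(text):
    """Parse 'key value' lines as used by cgroup v2 files like memory.stat."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def swap_breakdown(swap_current, memory_stat_text):
    """Split memory.swap.current (bytes) into swapcached and other bytes."""
    cached = parse_flat_keyed(memory_stat_text).get("swapcached", 0)
    return {"swapcached": cached, "other": swap_current - cached}


# Hypothetical memory.stat excerpt; these numbers are not from the thread.
SAMPLE_STAT = "anon 1073741824\nfile 268435456\nswapcached 209715200\n"
print(swap_breakdown(712396800, SAMPLE_STAT))
```

On a live system the two inputs would come from reading
memory.swap.current and memory.stat in the cgroup's directory.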

In terms of OOM killing, the kernel will stop keeping swap copies
around when more than half of swap space is used. That should give
plenty of headroom toward the OOM killing thresholds.
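
Whether a machine is past that halfway point can be checked from
/proc/meminfo. This sketch applies the "more than half used" rule as
stated above to the SwapTotal and SwapFree fields; the function names
are invented here and the kernel's internal check may differ in detail.

```python
def parse_meminfo_kb(text):
    """Parse /proc/meminfo-style 'Name:   value kB' lines into a dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_more_than_half_used(meminfo_text):
    """True if less than half of the total swap space is still free."""
    info = parse_meminfo_kb(meminfo_text)
    return info["SwapFree"] * 2 < info["SwapTotal"]
```

Usage would be swap_more_than_half_used(open("/proc/meminfo").read()).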

If you want to poke around on your machine, here is a drgn script that
tallies up the cache-only swap entries:

---
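#!/usr/bin/drgn
# Count swap slots whose only remaining user is the swap cache.

MAX_SWAPFILES = 25      # upper bound on swap_info[] slots; depends on kernel config
SWAP_HAS_CACHE = 0x40   # swap_map flag: slot has a page in the swap cache

swapcache = 0
for i in range(MAX_SWAPFILES):
    si = prog['swap_info'][i]
    if si:  # skip unused (NULL) swap_info slots
        for offset in range(si.max.value_()):
            # Exactly SWAP_HAS_CACHE means no other references: cache-only.
            if si.swap_map[offset].value_() == SWAP_HAS_CACHE:
                swapcache += 1
# Assumes 4 KiB pages.
print("Cache-only swap space: %.2fM" % (swapcache * 4 / 1024.0))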
