public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] bcachefs-tools: Use sysinfo(2) directly to implement si_meminfo()
@ 2023-12-09 12:49 Chris Webb
  2023-12-09 12:50 ` [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker Chris Webb
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Webb @ 2023-12-09 12:49 UTC (permalink / raw)
  To: linux-bcachefs

Use a single sysinfo(2) call to fill out struct sysinfo instead of
multiple libc sysconf(3) requests, which will only make sysinfo(2) calls
internally anyway. This also enables us to access other struct sysinfo
fields, not just the three filled-out previously.

As we provide our own definition of struct sysinfo in include/linux/mm.h
to match the kernel, which is not guaranteed to align with the definition
libc provides in <sys/sysinfo.h>, use syscall(SYS_sysinfo, ...) directly
instead of the libc wrapper.

Signed-off-by: Chris Webb <chris@arachsys.com>
---
 include/linux/mm.h |  8 +++++++-
 linux/shrinker.c   | 15 ---------------
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4bf80ba..744a14c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2,6 +2,7 @@
 #ifndef _TOOLS_LINUX_MM_H
 #define _TOOLS_LINUX_MM_H
 
+#include <sys/syscall.h>
 #include <linux/types.h>
 
 struct sysinfo {
@@ -20,6 +21,11 @@ struct sysinfo {
 	__u32 mem_unit;			/* Memory unit size in bytes */
 };
 
-extern void si_meminfo(struct sysinfo * val);
+
+
+static inline void si_meminfo(struct sysinfo *val)
+{
+	BUG_ON(syscall(SYS_sysinfo, val));
+}
 
 #endif /* _TOOLS_LINUX_MM_H */
diff --git a/linux/shrinker.c b/linux/shrinker.c
index 91f633b..7658fb7 100644
--- a/linux/shrinker.c
+++ b/linux/shrinker.c
@@ -33,21 +33,6 @@ void unregister_shrinker(struct shrinker *shrinker)
 	mutex_unlock(&shrinker_lock);
 }
 
-struct meminfo {
-	u64		total;
-	u64		available;
-};
-
-void si_meminfo(struct sysinfo *val)
-{
-	long page_size = sysconf(_SC_PAGESIZE);
-	memset(val, 0, sizeof(*val));
-	val->mem_unit = 1;
-
-	val->totalram = sysconf(_SC_PHYS_PAGES) * page_size;
-	val->freeram  = sysconf(_SC_AVPHYS_PAGES) * page_size;
-}
-
 static void run_shrinkers_allocation_failed(gfp_t gfp_mask)
 {
 	struct shrinker *shrinker;

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker
  2023-12-09 12:49 [PATCH 1/2] bcachefs-tools: Use sysinfo(2) directly to implement si_meminfo() Chris Webb
@ 2023-12-09 12:50 ` Chris Webb
  2023-12-09 13:10   ` Chris Webb
  2023-12-10  4:28   ` Kent Overstreet
  0 siblings, 2 replies; 4+ messages in thread
From: Chris Webb @ 2023-12-09 12:50 UTC (permalink / raw)
  To: linux-bcachefs

Before 326d7c1, the shrinker used freeram and totalram from a struct
sysinfo (constructed from /proc/meminfo) to target 25% free physical
memory. As well as the slowness of repeatedly reading /proc/meminfo,
this was a problem as freeram rises when the system starts to swap.
We don't want swapping to reduce our estimate of memory pressure.

To work around this, in 326d7c1 the shrinker started to use the total
allocated heap from a glibc-specific interface mallinfo2(), aiming to
shrink such that our heap is less than 80% of physical memory, unless
overall free memory is less than 6% so that becomes the determining factor.

Unfortunately, a sign error in the calculation means this heuristic
never worked. It would shrink aggressively when the process was small,
and not at all when the process grew beyond 80% of physical RAM. Only the
fallback test ensuring the free physical RAM doesn't fall below 6% would
actually kick in under memory pressure. It also breaks portability to
anything other than recent glibc.

Later, in 2440469 the mallinfo2() was replaced with the older mallinfo()
to improve compatibility with older glibc. This is even more problematic:
it's still not portable but also struct mallinfo has (signed) int fields
which overflow for large processes on 32-bit machines with a 3G/1G split.

Rather than trying to use libc-specific debug interfaces and our own heap
to inform the shrinker, use the information about free and total swap
we already have from sysinfo(2) to explicitly compensate for swapping
in our estimate of free physical memory. Target free memory of 6% of
physical RAM adjusted for zero swap use when calculating the pressure
on the shrinker, based on the effective behaviour of 326d7c1 in practice
given the sign error.

As well as fixing portability to non-glibc systems, this loosens the
assumption that we are the only process using significant memory when
setting the shrinker target. It wouldn't be unreasonable to run two
fsck jobs against independent devices on a large RAM machine and want to
balance physical RAM between them.

Signed-off-by: Chris Webb <chris@arachsys.com>
---
 linux/shrinker.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/linux/shrinker.c b/linux/shrinker.c
index 7658fb7..8a24565 100644
--- a/linux/shrinker.c
+++ b/linux/shrinker.c
@@ -54,7 +54,6 @@ void run_shrinkers(gfp_t gfp_mask, bool allocation_failed)
 {
 	struct shrinker *shrinker;
 	struct sysinfo info;
-	struct mallinfo malloc_info = mallinfo();
 	s64 want_shrink;
 
 	if (!(gfp_mask & GFP_KERNEL))
@@ -71,16 +70,11 @@ void run_shrinkers(gfp_t gfp_mask, bool allocation_failed)
 
 	si_meminfo(&info);
 
-	if (info.totalram && info.totalram >> 4 < info.freeram) {
-		/* freeram goes up when system swaps, use malloced data instead */
-		want_shrink = -malloc_info.arena + (info.totalram / 10 * 8);
-
-		if (want_shrink <= 0)
-			return;
-	} else {
-		/* We want to play nice with other apps keep 6% avaliable, free 3% */
-		want_shrink = (info.totalram >> 5);
-	}
+	/* Aim for 6% of physical RAM free without anything in swap */
+	want_shrink = (info.totalram << 4) - info.freeram
+			+ info.totalswap - info.freeswap;
+	if (want_shrink <= 0)
+		return;
 
 	mutex_lock(&shrinker_lock);
 	list_for_each_entry(shrinker, &shrinker_list, list) {

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker
  2023-12-09 12:50 ` [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker Chris Webb
@ 2023-12-09 13:10   ` Chris Webb
  2023-12-10  4:28   ` Kent Overstreet
  1 sibling, 0 replies; 4+ messages in thread
From: Chris Webb @ 2023-12-09 13:10 UTC (permalink / raw)
  To: linux-bcachefs

Chris Webb <chris@arachsys.com> wrote:

> +	/* Aim for 6% of physical RAM free without anything in swap */
> +	want_shrink = (info.totalram << 4) - info.freeram
> +			+ info.totalswap - info.freeswap;
> +	if (want_shrink <= 0)
> +		return;

(info.totalram >> 4) here, sorry, of course!

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker
  2023-12-09 12:50 ` [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker Chris Webb
  2023-12-09 13:10   ` Chris Webb
@ 2023-12-10  4:28   ` Kent Overstreet
  1 sibling, 0 replies; 4+ messages in thread
From: Kent Overstreet @ 2023-12-10  4:28 UTC (permalink / raw)
  To: Chris Webb; +Cc: linux-bcachefs

On Sat, Dec 09, 2023 at 12:50:31PM +0000, Chris Webb wrote:
> Before 326d7c1, the shrinker used freeram and totalram from a struct
> sysinfo (constructed from /proc/meminfo) to target 25% free physical
> memory. As well as the slowness of repeatedly reading /proc/meminfo,
> this was a problem as freeram rises when the system starts to swap.
> We don't want swapping to reduce our estimate of memory pressure.
> 
> To work around this, in 326d7c1 the shrinker started to use the total
> allocated heap from a glibc-specific interface mallinfo2(), aiming to
> shrink such that our heap is less than 80% of physical memory, unless
> overall free memory is less than 6% so that becomes the determining factor.
> 
> Unfortunately, a sign error in the calculation means this heuristic
> never worked. It would shrink aggressively when the process was small,
> and not at all when the process grew beyond 80% of physical RAM. Only the
> fallback test ensuring the free physical RAM doesn't fall below 6% would
> actually kick in under memory pressure. It also breaks portability to
> anything other than recent glibc.
> 
> Later, in 2440469 the mallinfo2() was replaced with the older mallinfo()
> to improve compatibility with older glibc. This is even more problematic:
> it's still not portable but also struct mallinfo has (signed) int fields
> which overflow for large processes on 32-bit machines with a 3G/1G split.
> 
> Rather than trying to use libc-specific debug interfaces and our own heap
> to inform the shrinker, use the information about free and total swap
> we already have from sysinfo(2) to explicitly compensate for swapping
> in our estimate of free physical memory. Target free memory of 6% of
> physical RAM adjusted for zero swap use when calculating the pressure
> on the shrinker, based on the effective behaviour of 326d7c1 in practice
> given the sign error.
> 
> As well as fixing portability to non-glibc systems, this loosens the
> assumption that we are the only process using significant memory when
> setting the shrinker target. It wouldn't be unreasonable to run two
> fsck jobs against independent devices on a large RAM machine and want to
> balance physical RAM between them.
> 
> Signed-off-by: Chris Webb <chris@arachsys.com>

Nice, applied.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-12-10  4:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-09 12:49 [PATCH 1/2] bcachefs-tools: Use sysinfo(2) directly to implement si_meminfo() Chris Webb
2023-12-09 12:50 ` [PATCH 2/2] bcachefs-tools: Avoid glibc-specific mallinfo() in shrinker Chris Webb
2023-12-09 13:10   ` Chris Webb
2023-12-10  4:28   ` Kent Overstreet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox