* [PATCH v4] vmscan: add trace events for lru_gen
From: Jaewon Kim @ 2023-09-26 4:22 UTC (permalink / raw)
To: rostedt, tjmercier, yuzhao, kaleshsingh, akpm, vbabka, hannes, sj
Cc: linux-kernel, linux-trace-kernel, linux-mm, jaewon31.kim,
Jaewon Kim
As with the legacy LRU, lru_gen needs trace events for debugging.

This commit introduces two trace events:
trace_mm_vmscan_lru_gen_scan
trace_mm_vmscan_lru_gen_evict

Each event is similar to the following legacy events:
trace_mm_vmscan_lru_isolate,
trace_mm_vmscan_lru_shrink_[in]active

Here's an example:
mm_vmscan_lru_gen_scan: classzone=2 order=0 nr_requested=4096 nr_scanned=64 nr_skipped=0 nr_taken=64 lru=anon
mm_vmscan_lru_gen_evict: nid=0 nr_reclaimed=64 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate_anon=0 nr_activate_file=0 nr_ref_keep=0 nr_unmap_fail=0 priority=2 flags=RECLAIM_WB_ANON|RECLAIM_WB_ASYNC
mm_vmscan_lru_gen_scan: classzone=1 order=0 nr_requested=4096 nr_scanned=64 nr_skipped=0 nr_taken=64 lru=file
mm_vmscan_lru_gen_evict: nid=0 nr_reclaimed=64 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate_anon=0 nr_activate_file=0 nr_ref_keep=0 nr_unmap_fail=0 priority=12 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
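For reference, a minimal, untested sketch of hooking the new scan event
from in-tree code with the standard tracepoint probe API (the probe
signature mirrors TP_PROTO() with a leading void *data; the function and
module names here are made up for illustration):

    #include <linux/module.h>
    #include <trace/events/vmscan.h>

    /* Called on every mm_vmscan_lru_gen_scan hit once registered. */
    static void probe_lru_gen_scan(void *data, int highest_zoneidx, int order,
                                   unsigned long nr_requested,
                                   unsigned long nr_scanned,
                                   unsigned long nr_skipped,
                                   unsigned long nr_taken, int lru)
    {
            pr_info("lru_gen scan: scanned=%lu taken=%lu skipped=%lu\n",
                    nr_scanned, nr_taken, nr_skipped);
    }

    static int __init lru_gen_probe_init(void)
    {
            return register_trace_mm_vmscan_lru_gen_scan(probe_lru_gen_scan, NULL);
    }

    static void __exit lru_gen_probe_exit(void)
    {
            unregister_trace_mm_vmscan_lru_gen_scan(probe_lru_gen_scan, NULL);
            tracepoint_synchronize_unregister();
    }

    module_init(lru_gen_probe_init);
    module_exit(lru_gen_probe_exit);
    MODULE_LICENSE("GPL");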
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
---
v4: wrap with #ifdef CONFIG_LRU_GEN
v3: change printk format
v2: use condition and make it aligned
v1: introduce trace events
---
include/trace/events/mmflags.h | 9 ++++
include/trace/events/vmscan.h | 96 ++++++++++++++++++++++++++++++++++
mm/vmscan.c | 20 +++++--
3 files changed, 120 insertions(+), 5 deletions(-)
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 1478b9dd05fa..6dfe85bd4e81 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -274,6 +274,12 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY, "softdirty" ) \
EM (LRU_ACTIVE_FILE, "active_file") \
EMe(LRU_UNEVICTABLE, "unevictable")
+#ifdef CONFIG_LRU_GEN
+#define LRU_GEN_NAMES \
+ EM (LRU_GEN_ANON, "anon") \
+ EMe(LRU_GEN_FILE, "file")
+#endif
+
/*
* First define the enums in the above macros to be exported to userspace
* via TRACE_DEFINE_ENUM().
@@ -288,6 +294,9 @@ COMPACTION_PRIORITY
/* COMPACTION_FEEDBACK are defines not enums. Not needed here. */
ZONE_TYPE
LRU_NAMES
+#ifdef CONFIG_LRU_GEN
+LRU_GEN_NAMES
+#endif
/*
* Now redefine the EM() and EMe() macros to map the enums to the strings
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index d2123dd960d5..2080ef742f89 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -327,6 +327,102 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
__print_symbolic(__entry->lru, LRU_NAMES))
);
+#ifdef CONFIG_LRU_GEN
+TRACE_EVENT_CONDITION(mm_vmscan_lru_gen_scan,
+ TP_PROTO(int highest_zoneidx,
+ int order,
+ unsigned long nr_requested,
+ unsigned long nr_scanned,
+ unsigned long nr_skipped,
+ unsigned long nr_taken,
+ int lru),
+
+ TP_ARGS(highest_zoneidx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, lru),
+
+ TP_CONDITION(nr_scanned),
+
+ TP_STRUCT__entry(
+ __field(int, highest_zoneidx)
+ __field(int, order)
+ __field(unsigned long, nr_requested)
+ __field(unsigned long, nr_scanned)
+ __field(unsigned long, nr_skipped)
+ __field(unsigned long, nr_taken)
+ __field(int, lru)
+ ),
+
+ TP_fast_assign(
+ __entry->highest_zoneidx = highest_zoneidx;
+ __entry->order = order;
+ __entry->nr_requested = nr_requested;
+ __entry->nr_scanned = nr_scanned;
+ __entry->nr_skipped = nr_skipped;
+ __entry->nr_taken = nr_taken;
+ __entry->lru = lru;
+ ),
+
+ /*
+ * classzone is the previous name of highest_zoneidx.
+ * It is kept to avoid breaking the tracepoint ABI.
+ */
+ TP_printk("classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s",
+ __entry->highest_zoneidx,
+ __entry->order,
+ __entry->nr_requested,
+ __entry->nr_scanned,
+ __entry->nr_skipped,
+ __entry->nr_taken,
+ __print_symbolic(__entry->lru, LRU_GEN_NAMES))
+);
+
+TRACE_EVENT(mm_vmscan_lru_gen_evict,
+
+ TP_PROTO(int nid, unsigned long nr_reclaimed,
+ struct reclaim_stat *stat, int priority, int file),
+
+ TP_ARGS(nid, nr_reclaimed, stat, priority, file),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, nr_reclaimed)
+ __field(unsigned long, nr_dirty)
+ __field(unsigned long, nr_writeback)
+ __field(unsigned long, nr_congested)
+ __field(unsigned long, nr_immediate)
+ __field(unsigned int, nr_activate0)
+ __field(unsigned int, nr_activate1)
+ __field(unsigned long, nr_ref_keep)
+ __field(unsigned long, nr_unmap_fail)
+ __field(int, nid)
+ __field(int, priority)
+ __field(int, reclaim_flags)
+ ),
+
+ TP_fast_assign(
+ __entry->nid = nid;
+ __entry->nr_reclaimed = nr_reclaimed;
+ __entry->nr_dirty = stat->nr_dirty;
+ __entry->nr_writeback = stat->nr_writeback;
+ __entry->nr_congested = stat->nr_congested;
+ __entry->nr_immediate = stat->nr_immediate;
+ __entry->nr_activate0 = stat->nr_activate[0];
+ __entry->nr_activate1 = stat->nr_activate[1];
+ __entry->nr_ref_keep = stat->nr_ref_keep;
+ __entry->nr_unmap_fail = stat->nr_unmap_fail;
+ __entry->priority = priority;
+ __entry->reclaim_flags = trace_reclaim_flags(file);
+ ),
+
+ TP_printk("nid=%d nr_reclaimed=%lu nr_dirty=%lu nr_writeback=%lu nr_congested=%lu nr_immediate=%lu nr_activate_anon=%u nr_activate_file=%u nr_ref_keep=%lu nr_unmap_fail=%lu priority=%d flags=%s",
+ __entry->nid, __entry->nr_reclaimed,
+ __entry->nr_dirty, __entry->nr_writeback,
+ __entry->nr_congested, __entry->nr_immediate,
+ __entry->nr_activate0, __entry->nr_activate1,
+ __entry->nr_ref_keep, __entry->nr_unmap_fail,
+ __entry->priority,
+ show_reclaim_flags(__entry->reclaim_flags))
+);
+#endif
+
TRACE_EVENT(mm_vmscan_write_folio,
TP_PROTO(struct folio *folio),
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6f13394b112e..0c8b48bcb461 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5005,6 +5005,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
int sorted = 0;
int scanned = 0;
int isolated = 0;
+ int skipped = 0;
int remaining = MAX_LRU_BATCH;
struct lru_gen_folio *lrugen = &lruvec->lrugen;
struct mem_cgroup *memcg = lruvec_memcg(lruvec);
@@ -5018,7 +5019,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
for (i = MAX_NR_ZONES; i > 0; i--) {
LIST_HEAD(moved);
- int skipped = 0;
+ int skipped_zone = 0;
int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
struct list_head *head = &lrugen->folios[gen][type][zone];
@@ -5040,16 +5041,17 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
isolated += delta;
} else {
list_move(&folio->lru, &moved);
- skipped += delta;
+ skipped_zone += delta;
}
- if (!--remaining || max(isolated, skipped) >= MIN_LRU_BATCH)
+ if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
break;
}
- if (skipped) {
+ if (skipped_zone) {
list_splice(&moved, head);
- __count_zid_vm_events(PGSCAN_SKIP, zone, skipped);
+ __count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone);
+ skipped += skipped_zone;
}
if (!remaining || isolated >= MIN_LRU_BATCH)
@@ -5065,6 +5067,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
__count_memcg_events(memcg, PGREFILL, sorted);
__count_vm_events(PGSCAN_ANON + type, isolated);
+#ifdef CONFIG_LRU_GEN
+ trace_mm_vmscan_lru_gen_scan(sc->reclaim_idx, sc->order, MAX_LRU_BATCH,
+ scanned, skipped, isolated, type);
+#endif
/*
* There might not be eligible folios due to reclaim_idx. Check the
* remaining to prevent livelock if it's not making progress.
@@ -5194,6 +5200,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
retry:
reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
sc->nr_reclaimed += reclaimed;
+#ifdef CONFIG_LRU_GEN
+ trace_mm_vmscan_lru_gen_evict(pgdat->node_id, reclaimed, &stat,
+ sc->priority, type);
+#endif
list_for_each_entry_safe_reverse(folio, next, &list, lru) {
if (!folio_evictable(folio)) {
--
2.17.1
* Re: [PATCH v4] vmscan: add trace events for lru_gen
From: Yu Zhao @ 2023-09-26 4:42 UTC (permalink / raw)
To: Jaewon Kim
Cc: rostedt, tjmercier, kaleshsingh, akpm, vbabka, hannes, sj,
linux-kernel, linux-trace-kernel, linux-mm, jaewon31.kim
On Mon, Sep 25, 2023 at 10:20 PM Jaewon Kim <jaewon31.kim@samsung.com> wrote:
>
> As with the legacy LRU, lru_gen needs trace events for debugging.
>
> This commit introduces two trace events:
> trace_mm_vmscan_lru_gen_scan
> trace_mm_vmscan_lru_gen_evict
>
> Each event is similar to the following legacy events:
> trace_mm_vmscan_lru_isolate,
> trace_mm_vmscan_lru_shrink_[in]active
We should just reuse trace_mm_vmscan_lru_isolate and
trace_mm_vmscan_lru_shrink_inactive instead of adding new tracepoints.
To reuse trace_mm_vmscan_lru_isolate, we'd just need to append two new
names to LRU_NAMES.
The naming of trace_mm_vmscan_lru_shrink_inactive might seem confusing
but it's how MGLRU maintains the compatibility, e.g., the existing
active/inactive counters in /proc/vmstat.
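Roughly, an untested sketch of the mmflags.h side (note the caveat in the
comment about the MGLRU type values overlapping the existing enum lru_list
values):

    /*
     * Sketch: turn the current last entry into EM() and append the two
     * MGLRU names.  Caveat: LRU_GEN_ANON/LRU_GEN_FILE are 0 and 1, the
     * same values as LRU_INACTIVE_ANON/LRU_ACTIVE_ANON, so
     * __print_symbolic() resolves a value to its first match; passing
     * the existing enum lru_list values instead sidesteps that.
     */
    #define LRU_NAMES                                   \
            EM (LRU_INACTIVE_ANON, "inactive_anon")     \
            EM (LRU_ACTIVE_ANON, "active_anon")         \
            EM (LRU_INACTIVE_FILE, "inactive_file")     \
            EM (LRU_ACTIVE_FILE, "active_file")         \
            EM (LRU_UNEVICTABLE, "unevictable")         \
            EM (LRU_GEN_ANON, "anon")                   \
            EMe(LRU_GEN_FILE, "file")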
> @@ -5065,6 +5067,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> __count_memcg_events(memcg, PGREFILL, sorted);
> __count_vm_events(PGSCAN_ANON + type, isolated);
>
> +#ifdef CONFIG_LRU_GEN
> + trace_mm_vmscan_lru_gen_scan(sc->reclaim_idx, sc->order, MAX_LRU_BATCH,
> + scanned, skipped, isolated, type);
> +#endif
These functions are already within CONFIG_LRU_GEN.
> /*
> * There might not be eligible folios due to reclaim_idx. Check the
> * remaining to prevent livelock if it's not making progress.
> @@ -5194,6 +5200,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
> retry:
> reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
> sc->nr_reclaimed += reclaimed;
> +#ifdef CONFIG_LRU_GEN
> + trace_mm_vmscan_lru_gen_evict(pgdat->node_id, reclaimed, &stat,
> + sc->priority, type);
> +#endif
Ditto.
* RE: [PATCH v4] vmscan: add trace events for lru_gen
From: 김재원 @ 2023-09-26 5:10 UTC (permalink / raw)
To: Yu Zhao
Cc: rostedt@goodmis.org, tjmercier@google.com, kaleshsingh@google.com,
akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
sj@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org,
jaewon31.kim@gmail.com
>We should just reuse trace_mm_vmscan_lru_isolate and
>trace_mm_vmscan_lru_shrink_inactive instead of adding new tracepoints.
>
>To reuse trace_mm_vmscan_lru_isolate, we'd just need to append two new
>names to LRU_NAMES.
>
>The naming of trace_mm_vmscan_lru_shrink_inactive might seem confusing
>but it's how MGLRU maintains the compatibility, e.g., the existing
>active/inactive counters in /proc/vmstat.
Hello

Actually I had tried to reuse them, but some values were not that compatible.
Let me try that way again.
* RE: [PATCH v4] vmscan: add trace events for lru_gen
From: 김재원 @ 2023-09-26 7:33 UTC (permalink / raw)
To: 김재원, Yu Zhao
Cc: rostedt@goodmis.org, tjmercier@google.com, kaleshsingh@google.com,
akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
sj@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org,
jaewon31.kim@gmail.com
>>We should just reuse trace_mm_vmscan_lru_isolate and
>>trace_mm_vmscan_lru_shrink_inactive instead of adding new tracepoints.
>>
>>To reuse trace_mm_vmscan_lru_isolate, we'd just need to append two new
>>names to LRU_NAMES.
>>
>>The naming of trace_mm_vmscan_lru_shrink_inactive might seem confusing
>>but it's how MGLRU maintains the compatibility, e.g., the existing
>>active/inactive counters in /proc/vmstat.
>
>
>Hello
>
>Actually I had tried to reuse them, but some values were not that compatible.
>Let me try that way again.
>
>>
Hello Yu Zhao
Could you look into what I tried below? I reused the legacy trace events as you recommended.

For the nr_scanned of trace_mm_vmscan_lru_shrink_inactive, I just used the scanned count returned from isolate_folios.
I thought this was right, as scan_folios also uses its isolated count:
 __count_vm_events(PGSCAN_ANON + type, isolated);
But I guess the scanned in scan_folios is actually the one used in shrink_inactive_list.

I tested this on both 0 and 7 of /sys/kernel/mm/lru_gen/enabled.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4e44f1c97c1..b61a0156559c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4328,6 +4328,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
int sorted = 0;
int scanned = 0;
int isolated = 0;
+ int skipped = 0;
int remaining = MAX_LRU_BATCH;
struct lru_gen_folio *lrugen = &lruvec->lrugen;
struct mem_cgroup *memcg = lruvec_memcg(lruvec);
@@ -4341,7 +4342,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
for (i = MAX_NR_ZONES; i > 0; i--) {
LIST_HEAD(moved);
- int skipped = 0;
+ int skipped_zone = 0;
int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
struct list_head *head = &lrugen->folios[gen][type][zone];
@@ -4363,16 +4364,17 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
isolated += delta;
} else {
list_move(&folio->lru, &moved);
- skipped += delta;
+ skipped_zone += delta;
}
- if (!--remaining || max(isolated, skipped) >= MIN_LRU_BATCH)
+ if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
break;
}
- if (skipped) {
+ if (skipped_zone) {
list_splice(&moved, head);
- __count_zid_vm_events(PGSCAN_SKIP, zone, skipped);
+ __count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone);
+ skipped += skipped_zone;
}
if (!remaining || isolated >= MIN_LRU_BATCH)
@@ -4387,6 +4389,9 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
__count_memcg_events(memcg, item, isolated);
__count_memcg_events(memcg, PGREFILL, sorted);
__count_vm_events(PGSCAN_ANON + type, isolated);
+ trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, MAX_LRU_BATCH,
+ scanned, skipped, isolated,
+ type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
/*
* There might not be eligible folios due to reclaim_idx. Check the
@@ -4517,6 +4522,9 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
retry:
reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
sc->nr_reclaimed += reclaimed;
+ trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
+ scanned, reclaimed, &stat, sc->priority,
+ type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
list_for_each_entry_safe_reverse(folio, next, &list, lru) {
if (!folio_evictable(folio)) {
* RE: [PATCH v4] vmscan: add trace events for lru_gen
From: 김재원 @ 2023-09-26 14:15 UTC (permalink / raw)
To: Yu Zhao
Cc: rostedt@goodmis.org, tjmercier@google.com, kaleshsingh@google.com,
akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
sj@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org,
jaewon31.kim@gmail.com
>For the nr_scanned of trace_mm_vmscan_lru_shrink_inactive, I just used the scanned count returned from isolate_folios.
>I thought this was right, as scan_folios also uses its isolated count:
> __count_vm_events(PGSCAN_ANON + type, isolated);
>But I guess the scanned in scan_folios is actually the one used in shrink_inactive_list.
Please ignore the nr_scanned concern above; I just misread the code.

This is an example; I think it works well:
mm_vmscan_lru_isolate: isolate_mode=0 classzone=2 order=0 nr_requested=4096 nr_scanned=64 nr_skipped=0 nr_taken=64 lru=inactive_file
mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=64 nr_reclaimed=63 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate_anon=0 nr_activate_file=1 nr_ref_keep=0 nr_unmap_fail=0 priority=2 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
* Re: [PATCH v4] vmscan: add trace events for lru_gen
From: Jaewon Kim @ 2023-10-01 23:41 UTC (permalink / raw)
To: jaewon31.kim
Cc: Yu Zhao, rostedt@goodmis.org, tjmercier@google.com,
kaleshsingh@google.com, akpm@linux-foundation.org, vbabka@suse.cz,
hannes@cmpxchg.org, sj@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org
Hello Yu Zhao
Could you give me your comments? I am waiting for your opinion on the
reuse method, and I'm planning to resend it as a complete patch.
Thank you.
* Re: [PATCH v4] vmscan: add trace events for lru_gen
From: Yu Zhao @ 2023-10-02 3:26 UTC (permalink / raw)
To: Jaewon Kim
Cc: jaewon31.kim, rostedt@goodmis.org, tjmercier@google.com,
kaleshsingh@google.com, akpm@linux-foundation.org, vbabka@suse.cz,
hannes@cmpxchg.org, sj@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org
On Sun, Oct 1, 2023 at 5:41 PM Jaewon Kim <jaewon31.kim@gmail.com> wrote:
>
> Hello Yu Zhao
>
> Could you give me your comments? I am waiting for your opinion on the
> reuse method, and I'm planning to resend it as a complete patch.
> Thank you.
Acked-by: Yu Zhao <yuzhao@google.com>