* [bug, 4.8] /proc/meminfo: counter values are very wrong
@ 2016-08-04 5:10 Dave Chinner
2016-08-04 12:24 ` Mel Gorman
0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2016-08-04 5:10 UTC (permalink / raw)
To: linux-kernel
Hi folks,
I just noticed a whacky memory usage profile when running some basic
IO tests on a current 4.8 tree. It looked like there was a massive
memory leak from my monitoring graphs - doing buffered IO was
causing huge amounts of memory to be considered used, but the cache
size was not increasing.
Looking at /proc/meminfo:
$ cat /proc/meminfo
MemTotal: 16395408 kB
MemFree: 79424 kB
MemAvailable: 2497240 kB
Buffers: 4372 kB
Cached: 558744 kB
SwapCached: 48 kB
Active: 2127212 kB
Inactive: 100400 kB
Active(anon): 25348 kB
Inactive(anon): 79424 kB
Active(file): 2101864 kB
Inactive(file): 20976 kB
Unevictable: 13612980 kB <<<<<<<<<
Mlocked: 3516 kB
SwapTotal: 497976 kB
SwapFree: 497188 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 38784 kB
Mapped: 15880 kB
Shmem: 8808 kB
Slab: 460408 kB
SReclaimable: 428496 kB
SUnreclaim: 31912 kB
KernelStack: 6112 kB
PageTables: 6740 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8695680 kB
Committed_AS: 177456 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 14204 kB
DirectMap2M: 16762880 kB
It seems that whatever was happening was causing unevictable memory
to pile up. But when I look at the per-node stats, the memory is all
accounted as active file pages:
$ cat /sys/bus/node/devices/node0/meminfo
Node 0 MemTotal: 4029052 kB
Node 0 MemFree: 33276 kB
Node 0 MemUsed: 3995776 kB
Node 0 Active: 3283280 kB
Node 0 Inactive: 580668 kB
Node 0 Active(anon): 3564 kB
Node 0 Inactive(anon): 4716 kB
Node 0 Active(file): 3279716 kB <<<<<<<<
Node 0 Inactive(file): 575952 kB
Node 0 Unevictable: 1648 kB
Node 0 Mlocked: 1648 kB
Node 0 Dirty: 8 kB
Node 0 Writeback: 0 kB
Node 0 FilePages: 78796 kB
Node 0 Mapped: 5540 kB
Node 0 AnonPages: 8020 kB
Node 0 Shmem: 256 kB
Node 0 KernelStack: 2352 kB
Node 0 PageTables: 1976 kB
Node 0 NFS_Unstable: 0 kB
Node 0 Bounce: 0 kB
Node 0 WritebackTmp: 0 kB
Node 0 Slab: 109012 kB
Node 0 SReclaimable: 99156 kB
Node 0 SUnreclaim: 9856 kB
Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0
$ cat /sys/bus/node/devices/node1/meminfo
Node 1 MemTotal: 4127912 kB
Node 1 MemFree: 13888 kB
Node 1 MemUsed: 4114024 kB
Node 1 Active: 3455400 kB
Node 1 Inactive: 522156 kB
Node 1 Active(anon): 5556 kB
Node 1 Inactive(anon): 6784 kB
Node 1 Active(file): 3449844 kB
Node 1 Inactive(file): 515372 kB
Node 1 Unevictable: 52 kB
Node 1 Mlocked: 52 kB
Node 1 Dirty: 16 kB
Node 1 Writeback: 0 kB
Node 1 FilePages: 155684 kB
Node 1 Mapped: 2216 kB
Node 1 AnonPages: 12320 kB
Node 1 Shmem: 16 kB
Node 1 KernelStack: 720 kB
Node 1 PageTables: 1120 kB
Node 1 NFS_Unstable: 0 kB
Node 1 Bounce: 0 kB
Node 1 WritebackTmp: 0 kB
Node 1 Slab: 117340 kB
Node 1 SReclaimable: 111472 kB
Node 1 SUnreclaim: 5868 kB
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 1 HugePages_Surp: 0
$ cat /sys/bus/node/devices/node2/meminfo
Node 2 MemTotal: 4127912 kB
Node 2 MemFree: 21308 kB
Node 2 MemUsed: 4106604 kB
Node 2 Active: 3453056 kB
Node 2 Inactive: 517824 kB
Node 2 Active(anon): 3224 kB
Node 2 Inactive(anon): 4356 kB
Node 2 Active(file): 3449832 kB
Node 2 Inactive(file): 513468 kB
Node 2 Unevictable: 556 kB
Node 2 Mlocked: 556 kB
Node 2 Dirty: 0 kB
Node 2 Writeback: 0 kB
Node 2 FilePages: 150120 kB
Node 2 Mapped: 1840 kB
Node 2 AnonPages: 7476 kB
Node 2 Shmem: 232 kB
Node 2 KernelStack: 1184 kB
Node 2 PageTables: 1360 kB
Node 2 NFS_Unstable: 0 kB
Node 2 Bounce: 0 kB
Node 2 WritebackTmp: 0 kB
Node 2 Slab: 114288 kB
Node 2 SReclaimable: 107616 kB
Node 2 SUnreclaim: 6672 kB
Node 2 HugePages_Total: 0
Node 2 HugePages_Free: 0
Node 2 HugePages_Surp: 0
$ cat /sys/bus/node/devices/node3/meminfo
Node 3 MemTotal: 4110532 kB
Node 3 MemFree: 10224 kB
Node 3 MemUsed: 4100308 kB
Node 3 Active: 3442224 kB
Node 3 Inactive: 506564 kB
Node 3 Active(anon): 8636 kB
Node 3 Inactive(anon): 9492 kB
Node 3 Active(file): 3433588 kB
Node 3 Inactive(file): 497072 kB
Node 3 Unevictable: 1260 kB
Node 3 Mlocked: 1260 kB
Node 3 Dirty: 0 kB
Node 3 Writeback: 0 kB
Node 3 FilePages: 178564 kB
Node 3 Mapped: 6284 kB
Node 3 AnonPages: 10968 kB
Node 3 Shmem: 8304 kB
Node 3 KernelStack: 1856 kB
Node 3 PageTables: 2284 kB
Node 3 NFS_Unstable: 0 kB
Node 3 Bounce: 0 kB
Node 3 WritebackTmp: 0 kB
Node 3 Slab: 119736 kB
Node 3 SReclaimable: 110252 kB
Node 3 SUnreclaim: 9484 kB
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 3 HugePages_Surp: 0
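[Editorial cross-check, not part of the original report: the per-node and global numbers are not merely inconsistent, they look *shifted*. Summing Active(file) over the four node dumps above reproduces the bogus global Unevictable figure exactly. A minimal sketch, using values copied from the dumps:]

```python
# Active(file) for node0..node3, in kB, copied from the per-node dumps above.
active_file_per_node = [3279716, 3449844, 3449832, 3433588]

total = sum(active_file_per_node)
print(total)  # 13612980

# /proc/meminfo from the same run reported:
#   Active(file):  2101864 kB
#   Unevictable:  13612980 kB   <- exactly the per-node Active(file) sum
assert total == 13612980
```

So the pages are being counted somewhere; they are just surfacing under the wrong global label, which smells like a counter-indexing problem rather than a leak.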
So clearly there's an accounting problem here. I think there may be
multiple problems, however. The workload is simple:
$ time for i in `seq 0 1 100`; do
> sudo rm -f /mnt/scratch/testfile
> sudo xfs_io -f -c "pwrite 0 512m -b 32k" -c "pwrite 0 511m -b 32k" /mnt/scratch/testfile &> /dev/null
> done
It's just writing 512MB to a file twice, then unlinking it. Then
doing it again. On unlink, the page cache is invalidated, which means
all 512MB of cached pages should be freed and removed from the
page cache. According to the per-node counters, that is not
happening and there are gigabytes of invalidated pages still sitting
on the active LRUs.
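[A small helper, not from the thread, for watching which counters move while the write/unlink loop runs. `parse_meminfo()` works on any meminfo-style text; on a live system you would feed it `open('/proc/meminfo').read()` before and after the workload.]

```python
def parse_meminfo(text):
    """Map counter names to values in kB (hugepage counts are unitless)."""
    counters = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if name and fields:
            counters[name.strip()] = int(fields[0])
    return counters

def diff_meminfo(before, after):
    """Counters that changed between two snapshots, as name -> delta (kB)."""
    return {k: after[k] - before[k]
            for k in before if k in after and after[k] != before[k]}

# Tiny fabricated snapshots for illustration:
before = parse_meminfo("MemFree: 1000 kB\nActive(file): 200 kB\n")
after  = parse_meminfo("MemFree: 400 kB\nActive(file): 800 kB\n")
print(diff_meminfo(before, after))  # {'MemFree': -600, 'Active(file)': 600}
```

After the unlink loop, a healthy kernel should show the file LRU deltas returning to roughly zero; here they only ever grow.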
Something is broken....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-04  5:10 [bug, 4.8] /proc/meminfo: counter values are very wrong Dave Chinner
@ 2016-08-04 12:24 ` Mel Gorman
  2016-08-04 12:34   ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2016-08-04 12:24 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-kernel

On Thu, Aug 04, 2016 at 03:10:51PM +1000, Dave Chinner wrote:
> Hi folks,
>
> I just noticed a whacky memory usage profile when running some basic
> IO tests on a current 4.8 tree. It looked like there was a massive
> memory leak from my monitoring graphs - doing buffered IO was
> causing huge amounts of memory to be considered used, but the cache
> size was not increasing.
>
> Looking at /proc/meminfo:
>
> $ cat /proc/meminfo
> MemTotal: 16395408 kB
> MemFree: 79424 kB
> MemAvailable: 2497240 kB
> Buffers: 4372 kB
> Cached: 558744 kB
> SwapCached: 48 kB
> Active: 2127212 kB
> Inactive: 100400 kB
> Active(anon): 25348 kB
> Inactive(anon): 79424 kB
> Active(file): 2101864 kB
> Inactive(file): 20976 kB
> Unevictable: 13612980 kB <<<<<<<<<

This? Very quickly done, no boot testing

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 09e18fdf61e5..b9a8c813e5e6 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -46,7 +46,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	cached = 0;

 	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
-		pages[lru] = global_page_state(NR_LRU_BASE + lru);
+		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);

 	available = si_mem_available();
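[Editorial note: the one-liner swaps global_page_state(), which reads the per-zone vmstat array, for global_node_page_state(), which reads the per-node array where the LRU counters live after the 4.8 node-LRU rework. A toy model of the failure mode follows; the arrays, indices, and numbering below are invented for illustration, only the indexing mistake is the point.]

```python
# Toy model (NOT kernel code): two stat arrays with different layouts.
# LRU-relative indices are only meaningful for the node array; using them
# against the zone array returns whichever unrelated counter happens to
# sit at that slot -- which is how Active(file) pages can surface under
# a label like "Unevictable".
NR_ACTIVE_FILE = 3  # node-side index (invented numbering)

node_stats = {3: 2101864, 4: 13612980}  # LRU counters live here now
zone_stats = {3: 13612980, 4: 3516}     # unrelated counters in these slots

def global_node_page_state(item):
    """Correct lookup: node array indexed with a node-side offset."""
    return node_stats[item]

def global_page_state(item):
    """The bug: zone array indexed with a node-side offset."""
    return zone_stats[item]

print(global_node_page_state(NR_ACTIVE_FILE))  # 2101864 - the intended counter
print(global_page_state(NR_ACTIVE_FILE))       # 13612980 - a different counter
```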
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-04 12:24 ` Mel Gorman
@ 2016-08-04 12:34 ` Mel Gorman
  2016-08-04 23:11   ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2016-08-04 12:34 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-kernel

On Thu, Aug 04, 2016 at 01:24:09PM +0100, Mel Gorman wrote:
> On Thu, Aug 04, 2016 at 03:10:51PM +1000, Dave Chinner wrote:
> > Hi folks,
> >
> > I just noticed a whacky memory usage profile when running some basic
> > IO tests on a current 4.8 tree. It looked like there was a massive
> > memory leak from my monitoring graphs - doing buffered IO was
> > causing huge amounts of memory to be considered used, but the cache
> > size was not increasing.
> >
> > Looking at /proc/meminfo:
> >
> > $ cat /proc/meminfo
> > MemTotal: 16395408 kB
> > MemFree: 79424 kB
> > MemAvailable: 2497240 kB
> > Buffers: 4372 kB
> > Cached: 558744 kB
> > SwapCached: 48 kB
> > Active: 2127212 kB
> > Inactive: 100400 kB
> > Active(anon): 25348 kB
> > Inactive(anon): 79424 kB
> > Active(file): 2101864 kB
> > Inactive(file): 20976 kB
> > Unevictable: 13612980 kB <<<<<<<<<
>
> This? Very quickly done, no boot testing
>

Or better yet, this. This is aimed at the proc reporting only.
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 09e18fdf61e5..b9a8c813e5e6 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -46,7 +46,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	cached = 0;

 	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
-		pages[lru] = global_page_state(NR_LRU_BASE + lru);
+		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);

 	available = si_mem_available();

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fb975cec3518..baa97da3687d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4064,7 +4064,7 @@ long si_mem_available(void)
 	int lru;

 	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
-		pages[lru] = global_page_state(NR_LRU_BASE + lru);
+		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);

 	for_each_zone(zone)
 		wmark_low += zone->watermark[WMARK_LOW];

-- 
Mel Gorman
SUSE Labs
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-04 12:34 ` Mel Gorman
@ 2016-08-04 23:11 ` Dave Chinner
  2016-08-05 10:54   ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2016-08-04 23:11 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-kernel

On Thu, Aug 04, 2016 at 01:34:58PM +0100, Mel Gorman wrote:
> On Thu, Aug 04, 2016 at 01:24:09PM +0100, Mel Gorman wrote:
> > On Thu, Aug 04, 2016 at 03:10:51PM +1000, Dave Chinner wrote:
> > > Hi folks,
> > >
> > > I just noticed a whacky memory usage profile when running some basic
> > > IO tests on a current 4.8 tree. It looked like there was a massive
> > > memory leak from my monitoring graphs - doing buffered IO was
> > > causing huge amounts of memory to be considered used, but the cache
> > > size was not increasing.
> > >
> > > Looking at /proc/meminfo:
> > >
> > > $ cat /proc/meminfo
> > > MemTotal: 16395408 kB
> > > MemFree: 79424 kB
> > > MemAvailable: 2497240 kB
> > > Buffers: 4372 kB
> > > Cached: 558744 kB
> > > SwapCached: 48 kB
> > > Active: 2127212 kB
> > > Inactive: 100400 kB
> > > Active(anon): 25348 kB
> > > Inactive(anon): 79424 kB
> > > Active(file): 2101864 kB
> > > Inactive(file): 20976 kB
> > > Unevictable: 13612980 kB <<<<<<<<<
> >
> > This? Very quickly done, no boot testing
>
> Or better yet, this. This is aimed at the proc reporting only.
>
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 09e18fdf61e5..b9a8c813e5e6 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -46,7 +46,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>  	cached = 0;
>
>  	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
> -		pages[lru] = global_page_state(NR_LRU_BASE + lru);
> +		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
>
>  	available = si_mem_available();
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index fb975cec3518..baa97da3687d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4064,7 +4064,7 @@ long si_mem_available(void)
>  	int lru;
>
>  	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
> -		pages[lru] = global_page_state(NR_LRU_BASE + lru);
> +		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
>
>  	for_each_zone(zone)
>  		wmark_low += zone->watermark[WMARK_LOW];

OK, that makes the /proc accounting match the /sys per-node
accounting, but the output still looks wrong. I remove files with
cached pages from the filesystem (i.e. invalidate and free them),
yet they are apparently still accounted as being on the
active/inactive LRU.
Reboot, then run dbench for a minute:

$ sudo mkfs.xfs -f /dev/pmem1
meta-data=/dev/pmem1        isize=512    agcount=4, agsize=524288 blks
         =                  sectsz=4096  attr=2, projid32bit=1
         =                  crc=1        finobt=1, sparse=0
data     =                  bsize=4096   blocks=2097152, imaxpct=25
         =                  sunit=0      swidth=0 blks
naming   =version 2         bsize=4096   ascii-ci=0 ftype=1
log      =internal log      bsize=4096   blocks=2560, version=2
         =                  sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none              extsz=4096   blocks=0, rtextents=0
$ sudo mount /dev/pmem1 /mnt/scratch
$ sudo dbench -t 60 -D /mnt/scratch/ 16
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 60 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 12 secs
13 of 16 processes prepared for launch 0 sec
16 of 16 processes prepared for launch 0 sec
releasing clients
  16     21754  2147.77 MB/sec  warmup   1 sec  latency 17.381 ms
  16     53021  2426.23 MB/sec  warmup   2 sec  latency 7.557 ms
  16     79871  2389.22 MB/sec  warmup   3 sec  latency 13.746 ms
  16    110047  2441.00 MB/sec  warmup   4 sec  latency 7.685 ms
  16    141388  2495.74 MB/sec  warmup   5 sec  latency 10.877 ms
  16    172085  2519.72 MB/sec  warmup   6 sec  latency 14.059 ms
  16    205794  2574.78 MB/sec  warmup   7 sec  latency 8.034 ms
  16    245632  2682.26 MB/sec  warmup   8 sec  latency 6.851 ms
  16    286074  2771.92 MB/sec  warmup   9 sec  latency 6.495 ms
  16    317664  2767.22 MB/sec  warmup  10 sec  latency 10.318 ms
  16    345883  2736.86 MB/sec  warmup  11 sec  latency 14.107 ms
  16    407209  2491.90 MB/sec  execute  1 sec  latency 14.000 ms
  16    434075  2416.98 MB/sec  execute  2 sec  latency 14.067 ms
  16    460773  2405.59 MB/sec  execute  3 sec  latency 10.984 ms
  16    489597  2438.34 MB/sec  execute  4 sec  latency 12.759 ms
  16    516526  2432.47 MB/sec  execute  5 sec  latency 11.068 ms
  16    550227  2511.25 MB/sec  execute  6 sec  latency 8.156 ms
  16    592814  2677.03 MB/sec  execute  7 sec  latency 5.571 ms
  16    633506  2781.02 MB/sec  execute  8 sec  latency 6.116 ms
  16    668733  2808.95 MB/sec  execute  9 sec  latency 8.357 ms
  16    706375  2853.15 MB/sec  execute 10 sec  latency 14.133 ms
  16    738346  2843.96 MB/sec  execute 11 sec  latency 9.305 ms
  16    766890  2812.90 MB/sec  execute 12 sec  latency 9.677 ms
  16    795720  2788.20 MB/sec  execute 13 sec  latency 11.019 ms
  16    827221  2783.46 MB/sec  execute 14 sec  latency 9.281 ms
  16    857404  2774.13 MB/sec  execute 15 sec  latency 7.055 ms
  16    886224  2756.10 MB/sec  execute 16 sec  latency 9.445 ms
  16    915254  2744.52 MB/sec  execute 17 sec  latency 10.447 ms
  16    949089  2757.18 MB/sec  execute 18 sec  latency 18.721 ms
  16    982041  2763.18 MB/sec  execute 19 sec  latency 12.401 ms
  16   1013375  2762.66 MB/sec  execute 20 sec  latency 10.355 ms
  16   1045294  2763.22 MB/sec  execute 21 sec  latency 11.896 ms
  16   1083431  2786.87 MB/sec  execute 22 sec  latency 8.539 ms
  16   1121150  2807.17 MB/sec  execute 23 sec  latency 11.574 ms
  16   1152589  2803.24 MB/sec  execute 24 sec  latency 9.669 ms
  16   1189106  2816.73 MB/sec  execute 25 sec  latency 10.104 ms
  16   1222171  2819.00 MB/sec  execute 26 sec  latency 15.502 ms
  16   1253029  2813.85 MB/sec  execute 27 sec  latency 10.030 ms
  16   1283695  2808.15 MB/sec  execute 28 sec  latency 14.520 ms
  16   1310353  2791.60 MB/sec  execute 29 sec  latency 10.559 ms
  16   1339147  2781.06 MB/sec  execute 30 sec  latency 10.048 ms
  16   1367932  2772.66 MB/sec  execute 31 sec  latency 8.283 ms
  16   1397614  2766.76 MB/sec  execute 32 sec  latency 18.411 ms
  16   1424115  2753.46 MB/sec  execute 33 sec  latency 10.346 ms
  16   1450474  2739.71 MB/sec  execute 34 sec  latency 14.018 ms
  16   1477712  2729.13 MB/sec  execute 35 sec  latency 13.977 ms
  16   1504889  2719.33 MB/sec  execute 36 sec  latency 12.919 ms
  16   1541168  2731.19 MB/sec  execute 37 sec  latency 7.316 ms
  16   1576119  2738.48 MB/sec  execute 38 sec  latency 7.054 ms
  16   1607758  2738.84 MB/sec  execute 39 sec  latency 8.155 ms
  16   1640821  2742.17 MB/sec  execute 40 sec  latency 6.686 ms
  16   1672161  2741.02 MB/sec  execute 41 sec  latency 10.458 ms
  16   1701658  2736.87 MB/sec  execute 42 sec  latency 13.877 ms
  16   1731166  2732.70 MB/sec  execute 43 sec  latency 12.307 ms
  16   1759926  2726.88 MB/sec  execute 44 sec  latency 9.745 ms
  16   1785855  2716.11 MB/sec  execute 45 sec  latency 12.085 ms
  16   1815083  2712.64 MB/sec  execute 46 sec  latency 8.048 ms
  16   1843967  2708.13 MB/sec  execute 47 sec  latency 13.302 ms
  16   1871199  2701.24 MB/sec  execute 48 sec  latency 10.160 ms
  16   1898231  2694.15 MB/sec  execute 49 sec  latency 16.625 ms
  16   1929620  2694.67 MB/sec  execute 50 sec  latency 10.693 ms
  16   1968670  2708.80 MB/sec  execute 51 sec  latency 20.115 ms
  16   2001738  2711.94 MB/sec  execute 52 sec  latency 7.594 ms
  16   2038135  2720.41 MB/sec  execute 53 sec  latency 12.177 ms
  16   2069687  2721.12 MB/sec  execute 54 sec  latency 8.222 ms
  16   2097545  2715.63 MB/sec  execute 55 sec  latency 14.536 ms
  16   2122502  2705.52 MB/sec  execute 56 sec  latency 14.738 ms
  16   2148142  2697.19 MB/sec  execute 57 sec  latency 14.473 ms
  16   2178085  2695.15 MB/sec  execute 58 sec  latency 8.586 ms
  16   2210608  2697.11 MB/sec  execute 59 sec  latency 10.013 ms
  16  cleanup 60 sec
   0  cleanup 60 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    5153313     0.048    15.558
 Close        3785702     0.003     8.334
 Rename        218238     0.041     9.738
 Unlink       1040545     0.336    18.710
 Deltree          136    11.999    65.636
 Mkdir             68     0.004     0.007
 Qpathinfo    4670973     0.007     8.336
 Qfileinfo     818833     0.001     8.544
 Qfsinfo       856506     0.005     4.494
 Sfileinfo     419850     0.013     8.224
 Find         1806064     0.018    13.301
 WriteX       2570636     0.032    16.620
 ReadX        8078906     0.005    12.453
 LockX          16782     0.004     1.633
 UnlockX        16782     0.002     0.179
 Flush         361228     0.171    20.110

Throughput 2697.11 MB/sec  16 clients  16 procs  max_latency=20.115 ms

$ cat /proc/meminfo
MemTotal: 16395372 kB
MemFree: 74092 kB
MemAvailable: 16090152 kB
Buffers: 59348 kB
Cached: 142660 kB
SwapCached: 0 kB
Active: 3098852 kB
Inactive: 12669128 kB
Active(anon): 21580 kB
Inactive(anon): 25288 kB
Active(file): 3077272 kB
Inactive(file): 12643840 kB
Unevictable: 3516 kB
Mlocked: 3516 kB
SwapTotal: 497976 kB
SwapFree: 497976 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 39248 kB
Mapped: 36596 kB
Shmem: 8968 kB
Slab: 462224 kB
SReclaimable: 428468 kB
SUnreclaim: 33756 kB
KernelStack: 6256 kB
PageTables: 6852 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8695660 kB
Committed_AS: 187528 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 14204 kB
DirectMap2M: 16762880 kB
$

dbench completely removes the files it uses during the test.
Apparently I have 15GB of cached file LRU pages in the system now,
which should be impossible because the filesystem is only 8GB in
size. It also doesn't match the "Cached" accounting, so free
considers all these LRU pages as used, not free:

$ free
              total        used        free      shared  buff/cache   available
Mem:       16395372    15652956       78264        8968      664152    16094336
Swap:        497976           0      497976
$

So there's still something screwed up....

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
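[Editorial cross-check on the figures above: the 15GB claim is plain arithmetic on the post-dbench meminfo. Values are copied from that dump; the 8GB filesystem size comes from the 2097152 4k blocks in the mkfs output.]

```python
# kB figures from the post-dbench /proc/meminfo above.
active_file   = 3077272
inactive_file = 12643840
cached        = 142660

file_lru_kb = active_file + inactive_file
print(file_lru_kb)                    # 15721112 kB
print(round(file_lru_kb / 2**20, 1))  # ~15.0 GiB

# The test filesystem is 2097152 blocks of 4 kB = 8 GiB, so the file
# LRUs cannot legitimately hold this much page cache, and "Cached"
# disagrees with the LRU counters by two orders of magnitude.
fs_size_kb = 2097152 * 4
assert file_lru_kb > fs_size_kb
assert file_lru_kb > 100 * cached
```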
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-04 23:11 ` Dave Chinner
@ 2016-08-05 10:54 ` Mel Gorman
  2016-08-05 11:59   ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2016-08-05 10:54 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-kernel

On Fri, Aug 05, 2016 at 09:11:10AM +1000, Dave Chinner wrote:
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index fb975cec3518..baa97da3687d 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4064,7 +4064,7 @@ long si_mem_available(void)
> >  	int lru;
> >
> >  	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
> > -		pages[lru] = global_page_state(NR_LRU_BASE + lru);
> > +		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
> >
> >  	for_each_zone(zone)
> >  		wmark_low += zone->watermark[WMARK_LOW];
>
> OK, that makes the /proc accounting match the /sys per-node
> accounting, but the output still looks wrong. I remove files with
> cached pages from the filesystem (i.e. invalidate and free them),
> yet they are apparently still accounted as being on the
> active/inactive LRU.
>
> Reboot, then run dbench for a minute:
>
> $ sudo mkfs.xfs -f /dev/pmem1
> meta-data=/dev/pmem1        isize=512    agcount=4, agsize=524288 blks
>          =                  sectsz=4096  attr=2, projid32bit=1
>          =                  crc=1        finobt=1, sparse=0
> data     =                  bsize=4096   blocks=2097152, imaxpct=25
>          =                  sunit=0      swidth=0 blks
> naming   =version 2         bsize=4096   ascii-ci=0 ftype=1
> log      =internal log      bsize=4096   blocks=2560, version=2
>          =                  sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none              extsz=4096   blocks=0, rtextents=0
> $ sudo mount /dev/pmem1 /mnt/scratch
> $ sudo dbench -t 60 -D /mnt/scratch/ 16
> dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
>

Is there any chance this is related to pmem1?
I tried reproducing this with the patch applied and free -m over time
looks like this:

                total       used       free     shared    buffers     cached
# while [ 1 ]; do free -m | grep Mem; sleep 5; done
Mem:            15878        259      15618          1         12        123
Mem:            15878        274      15603          1         16        131
Mem:            15878        612      15266          1         17        463
Mem:            15878        617      15261          1         18        470
Mem:            15878        613      15265          1         19        463
Mem:            15878        614      15264          1         19        464
Mem:            15878        647      15231          1         20        498
Mem:            15878        616      15262          1         21        465
Mem:            15878        642      15236          1         22        491
Mem:            15878        618      15260          1         23        465
Mem:            15878        619      15259          1         24        466
Mem:            15878        620      15258          1         25        464
Mem:            15878        620      15257          1         26        466
Mem:            15878        622      15256          1         27        464
Mem:            15878        622      15255          1         27        466
Mem:            15878        285      15592          1         28        132
Mem:            15878        285      15592          1         28        132

Used memory before and after the dbench run was roughly the same.

-- 
Mel Gorman
SUSE Labs
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-05 10:54 ` Mel Gorman
@ 2016-08-05 11:59 ` Dave Chinner
  2016-08-05 12:07   ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2016-08-05 11:59 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-kernel

On Fri, Aug 05, 2016 at 11:54:17AM +0100, Mel Gorman wrote:
> On Fri, Aug 05, 2016 at 09:11:10AM +1000, Dave Chinner wrote:
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index fb975cec3518..baa97da3687d 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -4064,7 +4064,7 @@ long si_mem_available(void)
> > >  	int lru;
> > >
> > >  	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
> > > -		pages[lru] = global_page_state(NR_LRU_BASE + lru);
> > > +		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
> > >
> > >  	for_each_zone(zone)
> > >  		wmark_low += zone->watermark[WMARK_LOW];
> >
> > OK, that makes the /proc accounting match the /sys per-node
> > accounting, but the output still looks wrong. I remove files with
> > cached pages from the filesystem (i.e. invalidate and free them),
> > yet they are apparently still accounted as being on the
> > active/inactive LRU.
> >
> > Reboot, then run dbench for a minute:
> >
> > $ sudo mkfs.xfs -f /dev/pmem1
> > meta-data=/dev/pmem1        isize=512    agcount=4, agsize=524288 blks
> >          =                  sectsz=4096  attr=2, projid32bit=1
> >          =                  crc=1        finobt=1, sparse=0
> > data     =                  bsize=4096   blocks=2097152, imaxpct=25
> >          =                  sunit=0      swidth=0 blks
> > naming   =version 2         bsize=4096   ascii-ci=0 ftype=1
> > log      =internal log      bsize=4096   blocks=2560, version=2
> >          =                  sectsz=4096  sunit=1 blks, lazy-count=1
> > realtime =none              extsz=4096   blocks=0, rtextents=0
> > $ sudo mount /dev/pmem1 /mnt/scratch
> > $ sudo dbench -t 60 -D /mnt/scratch/ 16
> > dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
>
> Is there any chance this is related to pmem1?

Nope, just reproduced it on /dev/vdc to confirm.

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: [bug, 4.8] /proc/meminfo: counter values are very wrong
  2016-08-05 11:59 ` Dave Chinner
@ 2016-08-05 12:07 ` Dave Chinner
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2016-08-05 12:07 UTC (permalink / raw)
To: Mel Gorman; +Cc: linux-kernel

On Fri, Aug 05, 2016 at 09:59:35PM +1000, Dave Chinner wrote:
> On Fri, Aug 05, 2016 at 11:54:17AM +0100, Mel Gorman wrote:
> > On Fri, Aug 05, 2016 at 09:11:10AM +1000, Dave Chinner wrote:
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index fb975cec3518..baa97da3687d 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -4064,7 +4064,7 @@ long si_mem_available(void)
> > > >  	int lru;
> > > >
> > > >  	for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++)
> > > > -		pages[lru] = global_page_state(NR_LRU_BASE + lru);
> > > > +		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
> > > >
> > > >  	for_each_zone(zone)
> > > >  		wmark_low += zone->watermark[WMARK_LOW];
> > >
> > > OK, that makes the /proc accounting match the /sys per-node
> > > accounting, but the output still looks wrong. I remove files with
> > > cached pages from the filesystem (i.e. invalidate and free them),
> > > yet they are apparently still accounted as being on the
> > > active/inactive LRU.
> > >
> > > Reboot, then run dbench for a minute:
> > >
> > > $ sudo mkfs.xfs -f /dev/pmem1
> > > meta-data=/dev/pmem1        isize=512    agcount=4, agsize=524288 blks
> > >          =                  sectsz=4096  attr=2, projid32bit=1
> > >          =                  crc=1        finobt=1, sparse=0
> > > data     =                  bsize=4096   blocks=2097152, imaxpct=25
> > >          =                  sunit=0      swidth=0 blks
> > > naming   =version 2         bsize=4096   ascii-ci=0 ftype=1
> > > log      =internal log      bsize=4096   blocks=2560, version=2
> > >          =                  sectsz=4096  sunit=1 blks, lazy-count=1
> > > realtime =none              extsz=4096   blocks=0, rtextents=0
> > > $ sudo mount /dev/pmem1 /mnt/scratch
> > > $ sudo dbench -t 60 -D /mnt/scratch/ 16
> > > dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
> >
> > Is there any chance this is related to pmem1?
>
> Nope, just reproduced it on /dev/vdc to confirm.

And same thing on a different machine, using iscsi storage.

Also, I just noticed that unmounting the filesystem doesn't clean it
up - the inactive/active LRU usage is still there.

Ah, maybe this is a result of all the block layer request flag
changes and that's screwing up the page accounting. I'll get back to
you on that - not going to try to test it now as it's almost bed
time.

Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com
end of thread, other threads:[~2016-08-05 12:07 UTC | newest]

Thread overview: 7+ messages
2016-08-04  5:10 [bug, 4.8] /proc/meminfo: counter values are very wrong Dave Chinner
2016-08-04 12:24 ` Mel Gorman
2016-08-04 12:34   ` Mel Gorman
2016-08-04 23:11     ` Dave Chinner
2016-08-05 10:54       ` Mel Gorman
2016-08-05 11:59         ` Dave Chinner
2016-08-05 12:07           ` Dave Chinner