linux-api.vger.kernel.org archive mirror
* Re: [PATCH 0/4] enhance shmem process and swap accounting
       [not found] <1424958666-18241-1-git-send-email-vbabka@suse.cz>
@ 2015-02-27 10:36 ` Michael Kerrisk
       [not found]   ` <CAHO5Pa0xmquUbzkZvow_PxRGZpA7MVEPFcRL2LPXv7hU41uxDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
       [not found] ` <1424958666-18241-2-git-send-email-vbabka@suse.cz>
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 6+ messages in thread
From: Michael Kerrisk @ 2015-02-27 10:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton, linux-doc,
	Hugh Dickins, Michal Hocko, Kirill A. Shutemov, Cyrill Gorcunov,
	Randy Dunlap, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Oleg Nesterov, Linux API

[CC += linux-api@]

Hello Vlastimil,

Since this is a kernel-user-space API change, please CC linux-api@.
The kernel source file Documentation/SubmitChecklist notes that all
Linux kernel patches that change userspace interfaces should be CCed
to linux-api@vger.kernel.org, so that the various parties who are
interested in API changes are informed. For further information, see
https://www.kernel.org/doc/man-pages/linux-api-ml.html

Cheers,

Michael


On Thu, Feb 26, 2015 at 2:51 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> This series is based on Jerome Marchand's [1] so let me quote the first
> paragraph from there:
>
> There are several shortcomings with the accounting of shared memory
> (sysV shm, shared anonymous mapping, mapping to a tmpfs file). The
> values in /proc/<pid>/status and statm don't allow one to distinguish
> between shmem memory and a shared mapping to a regular file, even
> though their implications for memory usage are quite different: at
> reclaim, a file mapping can be dropped or written back to disk, while
> shmem needs a place in swap. As for shmem pages that are swapped out or
> in swap cache, they aren't accounted at all.
>
> My original motivation is that a customer found it (IMHO rightfully)
> confusing that e.g. top output for process swap usage is unreliable with
> respect to swapped out shmem pages, which are not accounted for.
>
> The fundamental difference between private anonymous and shmem pages is that
> the latter have their PTEs converted to pte_none, and not swapents. As such,
> they are not accounted to the number of swapents visible e.g. in
> /proc/pid/status VmSwap row. It might be theoretically possible to use
> swapents when swapping out shmem (without extra cost, as one has to change
> all mappers anyway), and on swap in only convert the swapent for the
> faulting process, leaving swapents in other processes until they also fault
> (so again no extra cost). But I don't know how many assumptions this would
> break, and it would be too disruptive a change for a relatively small benefit.
>
> Instead, my approach is to document the limitation of VmSwap, and provide means
> to determine the swap usage for shmem areas for those who are interested and
> willing to pay the price, using /proc/pid/smaps. Outside of ipcs, I don't
> think it is currently possible to determine the usage at all.  The
> previous patchset [1] did introduce new shmem-specific fields into smaps
> output, and functions to determine the values. I take a simpler approach,
> noting that smaps output already has a "Swap: X kB" line, where currently X ==
> 0 always for shmem areas. I think we can just consider this a bug and provide
> the proper value by consulting the radix tree, as e.g. mincore_page() does. In the
> patch changelog I explain why this is also not perfect (and cannot be without
> swapents), but still arguably much better than showing a 0.
>
> The last two patches are adapted from Jerome's patchset and provide a VmRSS
> breakdown to VmAnon, VmFile and VmShm in /proc/pid/status. Hugh noted that
> this is a welcome addition, and I agree that it might help e.g. with debugging
> process memory usage, at the non-zero but still rather low cost of an extra
> per-mm counter and some page flag checks. I updated these patches to 4.0-rc1,
> made them respect !CONFIG_SHMEM so that tiny systems don't pay the cost, and
> optimized the page flag checking somewhat.
>
> [1] http://lwn.net/Articles/611966/
>
> Jerome Marchand (2):
>   mm, shmem: Add shmem resident memory accounting
>   mm, procfs: Display VmAnon, VmFile and VmShm in /proc/pid/status
>
> Vlastimil Babka (2):
>   mm, documentation: clarify /proc/pid/status VmSwap limitations
>   mm, proc: account for shmem swap in /proc/pid/smaps
>
>  Documentation/filesystems/proc.txt | 15 +++++++++++++--
>  arch/s390/mm/pgtable.c             |  5 +----
>  fs/proc/task_mmu.c                 | 35 +++++++++++++++++++++++++++++++++--
>  include/linux/mm.h                 | 28 ++++++++++++++++++++++++++++
>  include/linux/mm_types.h           |  9 ++++++---
>  kernel/events/uprobes.c            |  2 +-
>  mm/memory.c                        | 30 ++++++++++--------------------
>  mm/oom_kill.c                      |  5 +++--
>  mm/rmap.c                          | 15 ++++-----------
>  9 files changed, 99 insertions(+), 45 deletions(-)
>
> --
> 2.1.4
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
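
A minimal userspace sketch of the smaps-based approach described in the cover
letter above: it sums every "Swap:" line of an smaps-formatted stream. The
helper name sum_smaps_kb() is hypothetical and not part of the patch set. The
sketch assumes the "Key: <n> kB" line layout that smaps uses; on a kernel
without patch 2/4, shmem mappings would simply contribute 0 to the total.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum every "<key> <n> kB" line of an smaps-formatted stream.
 * Returns the total in kB. Lines that do not start with key are skipped,
 * so "SwapPss:" is not mistaken for "Swap:".
 * Hypothetical helper for illustration; not part of the patch set.
 */
static unsigned long sum_smaps_kb(FILE *f, const char *key)
{
    char line[512];
    size_t keylen = strlen(key);
    unsigned long total = 0, val;

    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, key, keylen) != 0)
            continue;
        /* %lu skips the padding between the key and the number */
        if (sscanf(line + keylen, "%lu", &val) == 1)
            total += val;
    }
    return total;
}
```

To try it on a live process, open /proc/<pid>/smaps with fopen() and pass the
stream with the key "Swap:". Comparing the result with VmSwap from
/proc/<pid>/status should give a rough idea of how much of the reported swap
belongs to shmem mappings.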



-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/


* Re: [PATCH 1/4] mm, documentation: clarify /proc/pid/status VmSwap limitations
       [not found] ` <1424958666-18241-2-git-send-email-vbabka@suse.cz>
@ 2015-02-27 10:37   ` Michael Kerrisk
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Kerrisk @ 2015-02-27 10:37 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton, linux-doc,
	Hugh Dickins, Michal Hocko, Kirill A. Shutemov, Cyrill Gorcunov,
	Randy Dunlap, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Oleg Nesterov, Linux API

[CC += linux-api@]

On Thu, Feb 26, 2015 at 2:51 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> The documentation for /proc/pid/status does not mention that the value of
> VmSwap counts only swapped out anonymous private pages and not shmem. This is
> not obvious, so document this limitation.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> I've noticed that proc(5) manpage is currently missing the VmSwap field
> altogether.
>
>  Documentation/filesystems/proc.txt | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index a07ba61..d4f56ec 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -231,6 +231,8 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
>   VmLib                       size of shared library code
>   VmPTE                       size of page table entries
>   VmSwap                      size of swap usage (the number of referred swapents)
> +                             by anonymous private data (shmem swap usage is not
> +                             included)
>   Threads                     number of threads
>   SigQ                        number of signals queued/max. number for queue
>   SigPnd                      bitmap of pending signals for the thread
> --
> 2.1.4

* Re: [PATCH 2/4] mm, procfs: account for shmem swap in /proc/pid/smaps
       [not found] ` <1424958666-18241-3-git-send-email-vbabka@suse.cz>
@ 2015-02-27 10:38   ` Michael Kerrisk
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Kerrisk @ 2015-02-27 10:38 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton, linux-doc,
	Hugh Dickins, Michal Hocko, Kirill A. Shutemov, Cyrill Gorcunov,
	Randy Dunlap, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Oleg Nesterov, Linux API

[CC += linux-api@]

On Thu, Feb 26, 2015 at 2:51 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> Currently, /proc/pid/smaps will always show "Swap: 0 kB" for shmem-backed
> mappings, even if the mapped portion does contain pages that were swapped out.
> This is because, unlike private anonymous mappings, shmem does not change the
> pte to a swap entry but to pte_none when swapping the page out. In the smaps
> page walk, such a page thus looks like it was never faulted in.
>
> This patch changes smaps_pte_entry() to determine the swap status for such
> pte_none entries for shmem mappings, similarly to how mincore_page() does it.
> Swapped out pages are thus accounted for.
>
> The accounting is arguably still not as precise as for private anonymous
> mappings, since now we will also count pages that the process in question never
> accessed, but that another process populated and then let become
> swapped out. I believe it is still less confusing and subtle than not showing
> any swap usage by shmem mappings at all. Also, swapped out pages only become a
> performance issue for future accesses, and we cannot predict those for either
> kind of mapping.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  Documentation/filesystems/proc.txt |  3 ++-
>  fs/proc/task_mmu.c                 | 20 ++++++++++++++++++++
>  2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index d4f56ec..8b30543 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -437,7 +437,8 @@ indicates the amount of memory currently marked as referenced or accessed.
>  a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
>  and a page is modified, the file page is replaced by a private anonymous copy.
>  "Swap" shows how much would-be-anonymous memory is also used, but out on
> -swap.
> +swap. For shmem mappings, "Swap" shows how much of the mapped portion of the
> +underlying shmem object is on swap.
>
>  "VmFlags" field deserves a separate description. This member represents the kernel
>  flags associated with the particular virtual memory area in two letter encoded
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 956b75d..0410309 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -13,6 +13,7 @@
>  #include <linux/swap.h>
>  #include <linux/swapops.h>
>  #include <linux/mmu_notifier.h>
> +#include <linux/shmem_fs.h>
>
>  #include <asm/elf.h>
>  #include <asm/uaccess.h>
> @@ -496,6 +497,25 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
>                         mss->swap += PAGE_SIZE;
>                 else if (is_migration_entry(swpent))
>                         page = migration_entry_to_page(swpent);
> +       } else if (IS_ENABLED(CONFIG_SHMEM) && IS_ENABLED(CONFIG_SWAP) &&
> +                                       pte_none(*pte) && vma->vm_file) {
> +               struct address_space *mapping =
> +                       file_inode(vma->vm_file)->i_mapping;
> +
> +               /*
> +                * shmem does not use swap pte's so we have to consult
> +                * the radix tree to account for swap
> +                */
> +               if (shmem_mapping(mapping)) {
> +                       page = find_get_entry(mapping, pgoff);
> +                       if (page) {
> +                               if (radix_tree_exceptional_entry(page))
> +                                       mss->swap += PAGE_SIZE;
> +                               else
> +                                       page_cache_release(page);
> +                       }
> +                       page = NULL;
> +               }
>         }
>
>         if (!page)
> --
> 2.1.4
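
The accounting rule from the changelog above can be sketched as a small
userspace model. This is not kernel code: the toy_ names, the slot states and
the fixed page size are all assumptions made for illustration. For every page
of the mapping with an empty PTE, the model looks at the backing shmem object
and counts the page as swap if the object's slot holds a swap entry.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL    /* assumed page size for the model */

/* State of one page-cache slot of a shmem object (hypothetical model). */
enum toy_slot { SLOT_EMPTY, SLOT_RESIDENT, SLOT_SWAPPED };

/*
 * Model of the smaps accounting in the patch: a present PTE is never swap;
 * an empty PTE (pte_none) makes us consult the backing object's slot, and
 * only a swapped-out slot adds to the swap total.
 */
static unsigned long toy_shmem_swap_bytes(const int *pte_present,
                                          const enum toy_slot *slots,
                                          size_t npages)
{
    unsigned long swap = 0;
    size_t i;

    for (i = 0; i < npages; i++) {
        if (pte_present[i])
            continue;       /* mapped page, accounted as resident */
        if (slots[i] == SLOT_SWAPPED)
            swap += TOY_PAGE_SIZE;
    }
    return swap;
}
```

In this model, a process that never touched the mapping (all PTEs empty)
reports the same swap total as the process that populated the pages, which is
the imprecision the changelog acknowledges.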

* Re: [PATCH 3/4] mm, shmem: Add shmem resident memory accounting
       [not found] ` <1424958666-18241-4-git-send-email-vbabka@suse.cz>
@ 2015-02-27 10:38   ` Michael Kerrisk
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Kerrisk @ 2015-02-27 10:38 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton, linux-doc,
	Hugh Dickins, Michal Hocko, Kirill A. Shutemov, Cyrill Gorcunov,
	Randy Dunlap, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Oleg Nesterov, Linux API

[CC += linux-api@]

On Thu, Feb 26, 2015 at 2:51 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> From: Jerome Marchand <jmarchan@redhat.com>
>
> Currently looking at /proc/<pid>/status or statm, there is no way to
> distinguish shmem pages from pages mapped to a regular file (shmem
> pages are mapped to /dev/zero), even though their implication in
> actual memory use is quite different.
> This patch adds MM_SHMEMPAGES counter to mm_rss_stat to account for
> shmem pages instead of MM_FILEPAGES.
>
> [vbabka@suse.cz: port to 4.0, add #ifdefs, mm_counter_file() variant]
> Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  arch/s390/mm/pgtable.c   |  5 +----
>  fs/proc/task_mmu.c       |  4 +++-
>  include/linux/mm.h       | 28 ++++++++++++++++++++++++++++
>  include/linux/mm_types.h |  9 ++++++---
>  kernel/events/uprobes.c  |  2 +-
>  mm/memory.c              | 30 ++++++++++--------------------
>  mm/oom_kill.c            |  5 +++--
>  mm/rmap.c                | 15 ++++-----------
>  8 files changed, 56 insertions(+), 42 deletions(-)
>
> diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
> index b2c1542..5bffd5d 100644
> --- a/arch/s390/mm/pgtable.c
> +++ b/arch/s390/mm/pgtable.c
> @@ -617,10 +617,7 @@ static void gmap_zap_swap_entry(swp_entry_t entry, struct mm_struct *mm)
>         else if (is_migration_entry(entry)) {
>                 struct page *page = migration_entry_to_page(entry);
>
> -               if (PageAnon(page))
> -                       dec_mm_counter(mm, MM_ANONPAGES);
> -               else
> -                       dec_mm_counter(mm, MM_FILEPAGES);
> +               dec_mm_counter(mm, mm_counter(page));
>         }
>         free_swap_and_cache(entry);
>  }
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 0410309..d70334c 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -81,7 +81,8 @@ unsigned long task_statm(struct mm_struct *mm,
>                          unsigned long *shared, unsigned long *text,
>                          unsigned long *data, unsigned long *resident)
>  {
> -       *shared = get_mm_counter(mm, MM_FILEPAGES);
> +       *shared = get_mm_counter(mm, MM_FILEPAGES) +
> +               get_mm_counter(mm, MM_SHMEMPAGES);
>         *text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
>                                                                 >> PAGE_SHIFT;
>         *data = mm->total_vm - mm->shared_vm;
> @@ -501,6 +502,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
>                                         pte_none(*pte) && vma->vm_file) {
>                 struct address_space *mapping =
>                         file_inode(vma->vm_file)->i_mapping;
> +               pgoff_t pgoff = linear_page_index(vma, addr);
>
>                 /*
>                  * shmem does not use swap pte's so we have to consult
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 47a9392..adfbb5b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1364,6 +1364,16 @@ static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
>         return (unsigned long)val;
>  }
>
> +/* A wrapper for the CONFIG_SHMEM dependent counter */
> +static inline unsigned long get_mm_counter_shmem(struct mm_struct *mm)
> +{
> +#ifdef CONFIG_SHMEM
> +       return get_mm_counter(mm, MM_SHMEMPAGES);
> +#else
> +       return 0;
> +#endif
> +}
> +
>  static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
>  {
>         atomic_long_add(value, &mm->rss_stat.count[member]);
> @@ -1379,9 +1389,27 @@ static inline void dec_mm_counter(struct mm_struct *mm, int member)
>         atomic_long_dec(&mm->rss_stat.count[member]);
>  }
>
> +/* Optimized variant when page is already known not to be PageAnon */
> +static inline int mm_counter_file(struct page *page)
> +{
> +#ifdef CONFIG_SHMEM
> +       if (PageSwapBacked(page))
> +               return MM_SHMEMPAGES;
> +#endif
> +       return MM_FILEPAGES;
> +}
> +
> +static inline int mm_counter(struct page *page)
> +{
> +       if (PageAnon(page))
> +               return MM_ANONPAGES;
> +       return mm_counter_file(page);
> +}
> +
>  static inline unsigned long get_mm_rss(struct mm_struct *mm)
>  {
>         return get_mm_counter(mm, MM_FILEPAGES) +
> +               get_mm_counter_shmem(mm) +
>                 get_mm_counter(mm, MM_ANONPAGES);
>  }
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 199a03a..d3c2372 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -327,9 +327,12 @@ struct core_state {
>  };
>
>  enum {
> -       MM_FILEPAGES,
> -       MM_ANONPAGES,
> -       MM_SWAPENTS,
> +       MM_FILEPAGES,   /* Resident file mapping pages */
> +       MM_ANONPAGES,   /* Resident anonymous pages */
> +       MM_SWAPENTS,    /* Anonymous swap entries */
> +#ifdef CONFIG_SHMEM
> +       MM_SHMEMPAGES,  /* Resident shared memory pages */
> +#endif
>         NR_MM_COUNTERS
>  };
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index cb346f2..0a08fdd 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -188,7 +188,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
>         lru_cache_add_active_or_unevictable(kpage, vma);
>
>         if (!PageAnon(page)) {
> -               dec_mm_counter(mm, MM_FILEPAGES);
> +               dec_mm_counter(mm, mm_counter_file(page));
>                 inc_mm_counter(mm, MM_ANONPAGES);
>         }
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 8068893..f145d9e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -832,10 +832,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>                 } else if (is_migration_entry(entry)) {
>                         page = migration_entry_to_page(entry);
>
> -                       if (PageAnon(page))
> -                               rss[MM_ANONPAGES]++;
> -                       else
> -                               rss[MM_FILEPAGES]++;
> +                       rss[mm_counter(page)]++;
>
>                         if (is_write_migration_entry(entry) &&
>                                         is_cow_mapping(vm_flags)) {
> @@ -874,10 +871,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>         if (page) {
>                 get_page(page);
>                 page_dup_rmap(page);
> -               if (PageAnon(page))
> -                       rss[MM_ANONPAGES]++;
> -               else
> -                       rss[MM_FILEPAGES]++;
> +               rss[mm_counter(page)]++;
>         }
>
>  out_set_pte:
> @@ -1113,9 +1107,8 @@ again:
>                         tlb_remove_tlb_entry(tlb, pte, addr);
>                         if (unlikely(!page))
>                                 continue;
> -                       if (PageAnon(page))
> -                               rss[MM_ANONPAGES]--;
> -                       else {
> +
> +                       if (!PageAnon(page)) {
>                                 if (pte_dirty(ptent)) {
>                                         force_flush = 1;
>                                         set_page_dirty(page);
> @@ -1123,8 +1116,8 @@ again:
>                                 if (pte_young(ptent) &&
>                                     likely(!(vma->vm_flags & VM_SEQ_READ)))
>                                         mark_page_accessed(page);
> -                               rss[MM_FILEPAGES]--;
>                         }
> +                       rss[mm_counter(page)]--;
>                         page_remove_rmap(page);
>                         if (unlikely(page_mapcount(page) < 0))
>                                 print_bad_pte(vma, addr, ptent, page);
> @@ -1146,11 +1139,7 @@ again:
>                         struct page *page;
>
>                         page = migration_entry_to_page(entry);
> -
> -                       if (PageAnon(page))
> -                               rss[MM_ANONPAGES]--;
> -                       else
> -                               rss[MM_FILEPAGES]--;
> +                       rss[mm_counter(page)]--;
>                 }
>                 if (unlikely(!free_swap_and_cache(entry)))
>                         print_bad_pte(vma, addr, ptent, NULL);
> @@ -1460,7 +1449,7 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>
>         /* Ok, finally just insert the thing.. */
>         get_page(page);
> -       inc_mm_counter_fast(mm, MM_FILEPAGES);
> +       inc_mm_counter_fast(mm, mm_counter_file(page));
>         page_add_file_rmap(page);
>         set_pte_at(mm, addr, pte, mk_pte(page, prot));
>
> @@ -2174,7 +2163,8 @@ gotten:
>         if (likely(pte_same(*page_table, orig_pte))) {
>                 if (old_page) {
>                         if (!PageAnon(old_page)) {
> -                               dec_mm_counter_fast(mm, MM_FILEPAGES);
> +                               dec_mm_counter_fast(mm,
> +                                               mm_counter_file(old_page));
>                                 inc_mm_counter_fast(mm, MM_ANONPAGES);
>                         }
>                 } else
> @@ -2703,7 +2693,7 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>                 inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
>                 page_add_new_anon_rmap(page, vma, address);
>         } else {
> -               inc_mm_counter_fast(vma->vm_mm, MM_FILEPAGES);
> +               inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
>                 page_add_file_rmap(page);
>         }
>         set_pte_at(vma->vm_mm, address, pte, entry);
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 642f38c..a5ee3a2 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -573,10 +573,11 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
>         /* mm cannot safely be dereferenced after task_unlock(victim) */
>         mm = victim->mm;
>         mark_tsk_oom_victim(victim);
> -       pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
> +       pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
>                 task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
>                 K(get_mm_counter(victim->mm, MM_ANONPAGES)),
> -               K(get_mm_counter(victim->mm, MM_FILEPAGES)));
> +               K(get_mm_counter(victim->mm, MM_FILEPAGES)),
> +               K(get_mm_counter_shmem(victim->mm)));
>         task_unlock(victim);
>
>         /*
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5e3e090..e3c4392 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1216,12 +1216,8 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>         update_hiwater_rss(mm);
>
>         if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> -               if (!PageHuge(page)) {
> -                       if (PageAnon(page))
> -                               dec_mm_counter(mm, MM_ANONPAGES);
> -                       else
> -                               dec_mm_counter(mm, MM_FILEPAGES);
> -               }
> +               if (!PageHuge(page))
> +                       dec_mm_counter(mm, mm_counter(page));
>                 set_pte_at(mm, address, pte,
>                            swp_entry_to_pte(make_hwpoison_entry(page)));
>         } else if (pte_unused(pteval)) {
> @@ -1230,10 +1226,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                  * interest anymore. Simply discard the pte, vmscan
>                  * will take care of the rest.
>                  */
> -               if (PageAnon(page))
> -                       dec_mm_counter(mm, MM_ANONPAGES);
> -               else
> -                       dec_mm_counter(mm, MM_FILEPAGES);
> +               dec_mm_counter(mm, mm_counter(page));
>         } else if (PageAnon(page)) {
>                 swp_entry_t entry = { .val = page_private(page) };
>                 pte_t swp_pte;
> @@ -1276,7 +1269,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                 entry = make_migration_entry(page, pte_write(pteval));
>                 set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
>         } else
> -               dec_mm_counter(mm, MM_FILEPAGES);
> +               dec_mm_counter(mm, mm_counter_file(page));
>
>         page_remove_rmap(page);
>         page_cache_release(page);
> --
> 2.1.4
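
The classification that mm_counter() and mm_counter_file() perform in the
patch above can be modelled in userspace as follows. This is a sketch only:
the toy_ types are hypothetical, the two booleans stand in for PageAnon() and
PageSwapBacked(), and the model assumes CONFIG_SHMEM is enabled.

```c
#include <stdbool.h>

/* Mirrors the counter order in the patched enum, with CONFIG_SHMEM set. */
enum toy_counter {
    TOY_FILEPAGES,      /* resident file mapping pages */
    TOY_ANONPAGES,      /* resident anonymous pages */
    TOY_SWAPENTS,       /* anonymous swap entries */
    TOY_SHMEMPAGES      /* resident shared memory pages */
};

struct toy_page {
    bool anon;          /* stands in for PageAnon() */
    bool swap_backed;   /* stands in for PageSwapBacked() */
};

/* Pick the RSS counter a page is charged to, as mm_counter() does. */
static enum toy_counter toy_mm_counter(const struct toy_page *page)
{
    if (page->anon)
        return TOY_ANONPAGES;
    if (page->swap_backed)
        return TOY_SHMEMPAGES;  /* swap-backed but not anon: shmem */
    return TOY_FILEPAGES;
}
```

The order of the checks matters in the model: anonymous pages are also
swap-backed, so the anonymous test has to come first.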

* Re: [PATCH 4/4] mm, procfs: Display VmAnon, VmFile and VmShm in /proc/pid/status
       [not found] ` <1424958666-18241-5-git-send-email-vbabka@suse.cz>
@ 2015-02-27 10:38   ` Michael Kerrisk
  0 siblings, 0 replies; 6+ messages in thread
From: Michael Kerrisk @ 2015-02-27 10:38 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton, linux-doc,
	Hugh Dickins, Michal Hocko, Kirill A. Shutemov, Cyrill Gorcunov,
	Randy Dunlap, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Oleg Nesterov, Linux API

[CC += linux-api@]

On Thu, Feb 26, 2015 at 2:51 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> From: Jerome Marchand <jmarchan@redhat.com>
>
> It's currently inconvenient to retrieve MM_ANONPAGES value from status
> and statm files and there is no way to separate MM_FILEPAGES and
> MM_SHMEMPAGES. Add VmAnon, VmFile and VmShm lines in /proc/<pid>/status
> to solve these issues.
>
> Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  Documentation/filesystems/proc.txt | 10 +++++++++-
>  fs/proc/task_mmu.c                 | 13 +++++++++++--
>  2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index 8b30543..c777adb 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -168,6 +168,9 @@ read the file /proc/PID/status:
>    VmLck:         0 kB
>    VmHWM:       476 kB
>    VmRSS:       476 kB
> +  VmAnon:      352 kB
> +  VmFile:      120 kB
> +  VmShm:         4 kB
>    VmData:      156 kB
>    VmStk:        88 kB
>    VmExe:        68 kB
> @@ -224,7 +227,12 @@ Table 1-2: Contents of the status files (as of 2.6.30-rc7)
>   VmSize                      total program size
>   VmLck                       locked memory size
>   VmHWM                       peak resident set size ("high water mark")
> - VmRSS                       size of memory portions
> + VmRSS                       size of memory portions. It contains the three
> +                             following parts (VmRSS = VmAnon + VmFile + VmShm)
> + VmAnon                      size of resident anonymous memory
> + VmFile                      size of resident file mappings
> + VmShm                       size of resident shmem memory (includes SysV shm,
> +                             mapping of tmpfs and shared anonymous mappings)
>   VmData                      size of data, stack, and text segments
>   VmStk                       size of data, stack, and text segments
>   VmExe                       size of text segment
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index d70334c..a77a3ac 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -22,7 +22,7 @@
>
>  void task_mem(struct seq_file *m, struct mm_struct *mm)
>  {
> -       unsigned long data, text, lib, swap, ptes, pmds;
> +       unsigned long data, text, lib, swap, ptes, pmds, anon, file, shmem;
>         unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;
>
>         /*
> @@ -39,6 +39,9 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
>         if (hiwater_rss < mm->hiwater_rss)
>                 hiwater_rss = mm->hiwater_rss;
>
> +       anon = get_mm_counter(mm, MM_ANONPAGES);
> +       file = get_mm_counter(mm, MM_FILEPAGES);
> +       shmem = get_mm_counter_shmem(mm);
>         data = mm->total_vm - mm->shared_vm - mm->stack_vm;
>         text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK)) >> 10;
>         lib = (mm->exec_vm << (PAGE_SHIFT-10)) - text;
> @@ -52,6 +55,9 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
>                 "VmPin:\t%8lu kB\n"
>                 "VmHWM:\t%8lu kB\n"
>                 "VmRSS:\t%8lu kB\n"
> +               "VmAnon:\t%8lu kB\n"
> +               "VmFile:\t%8lu kB\n"
> +               "VmShm:\t%8lu kB\n"
>                 "VmData:\t%8lu kB\n"
>                 "VmStk:\t%8lu kB\n"
>                 "VmExe:\t%8lu kB\n"
> @@ -65,6 +71,9 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
>                 mm->pinned_vm << (PAGE_SHIFT-10),
>                 hiwater_rss << (PAGE_SHIFT-10),
>                 total_rss << (PAGE_SHIFT-10),
> +               anon << (PAGE_SHIFT-10),
> +               file << (PAGE_SHIFT-10),
> +               shmem << (PAGE_SHIFT-10),
>                 data << (PAGE_SHIFT-10),
>                 mm->stack_vm << (PAGE_SHIFT-10), text, lib,
>                 ptes >> 10,
> @@ -82,7 +91,7 @@ unsigned long task_statm(struct mm_struct *mm,
>                          unsigned long *data, unsigned long *resident)
>  {
>         *shared = get_mm_counter(mm, MM_FILEPAGES) +
> -               get_mm_counter(mm, MM_SHMEMPAGES);
> +               get_mm_counter_shmem(mm);
>         *text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
>                                                                 >> PAGE_SHIFT;
>         *data = mm->total_vm - mm->shared_vm;
> --
> 2.1.4
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
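
For anyone consuming the new fields from user space, here is a minimal
sketch of how the lines added above could be read back. It assumes the
"Key:\t%8lu kB" layout and the VmAnon/VmFile/VmShm names exactly as
proposed in this patch; a given kernel may not expose them, or may use
different names, so a missing key is treated as an ordinary result.
The helper names pages_to_kb() and status_field_kb() are made up for
illustration and are not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Convert a page count to kB the same way task_mem() does:
 * pages << (PAGE_SHIFT - 10). User space has no PAGE_SHIFT macro, so
 * the shift is passed in (12 for 4 kB pages). Assumes page_shift >= 10.
 */
static unsigned long pages_to_kb(unsigned long pages, unsigned int page_shift)
{
	return pages << (page_shift - 10);
}

/*
 * Look up a "<key>:\t<value> kB" line in a buffer holding the text of
 * /proc/<pid>/status. Returns 0 and stores the value in *out on
 * success, or -1 if the key is absent or has no numeric value.
 */
static int status_field_kb(const char *text, const char *key,
			   unsigned long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Require the ':' so "VmShm" does not match a longer key. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			const char *num = line + klen + 1;
			char *end;
			unsigned long v = strtoul(num, &end, 10);

			if (end == num)
				return -1;	/* no digits after the key */
			*out = v;
			return 0;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read /proc/self/status into a NUL-terminated buffer and
call status_field_kb(buf, "VmShm", &kb); a return of -1 simply means
the running kernel does not report that field.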



-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/



* Re: [PATCH 0/4] enhance shmem process and swap accounting
       [not found]   ` <CAHO5Pa0xmquUbzkZvow_PxRGZpA7MVEPFcRL2LPXv7hU41uxDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-02-27 10:52     ` Vlastimil Babka
  0 siblings, 0 replies; 6+ messages in thread
From: Vlastimil Babka @ 2015-02-27 10:52 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-mm, Jerome Marchand, Linux Kernel, Andrew Morton,
	linux-doc, Hugh Dickins, Michal Hocko,
	Kirill A. Shutemov, Cyrill Gorcunov, Randy Dunlap,
	linux-s390, Martin Schwidefsky,
	Heiko Carstens, Peter Zijlstra, Paul Mackerras,
	Arnaldo Carvalho de Melo, Oleg Nesterov, Linux API

On 02/27/2015 11:36 AM, Michael Kerrisk wrote:
> [CC += linux-api@]
> 
> Hello Vlastimil,
> 
> Since this is a kernel-user-space API change, please CC linux-api@.
> The kernel source file Documentation/SubmitChecklist notes that all
> Linux kernel patches that change userspace interfaces should be CCed
> to linux-api@vger.kernel.org, so that the various parties who are
> interested in API changes are informed. For further information, see
> https://www.kernel.org/doc/man-pages/linux-api-ml.html

Yes, I meant to do that but forgot in the end; what a shame. Sorry for the trouble.


Thread overview: 6+ messages
     [not found] <1424958666-18241-1-git-send-email-vbabka@suse.cz>
2015-02-27 10:36 ` [PATCH 0/4] enhance shmem process and swap accounting Michael Kerrisk
     [not found]   ` <CAHO5Pa0xmquUbzkZvow_PxRGZpA7MVEPFcRL2LPXv7hU41uxDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-02-27 10:52     ` Vlastimil Babka
     [not found] ` <1424958666-18241-2-git-send-email-vbabka@suse.cz>
2015-02-27 10:37   ` [PATCH 1/4] mm, documentation: clarify /proc/pid/status VmSwap limitations Michael Kerrisk
     [not found] ` <1424958666-18241-3-git-send-email-vbabka@suse.cz>
2015-02-27 10:38   ` [PATCH 2/4] mm, procfs: account for shmem swap in /proc/pid/smaps Michael Kerrisk
     [not found] ` <1424958666-18241-4-git-send-email-vbabka@suse.cz>
2015-02-27 10:38   ` [PATCH 3/4] mm, shmem: Add shmem resident memory accounting Michael Kerrisk
     [not found] ` <1424958666-18241-5-git-send-email-vbabka@suse.cz>
2015-02-27 10:38   ` [PATCH 4/4] mm, procfs: Display VmAnon, VmFile and VmShm in /proc/pid/status Michael Kerrisk
