Linux Container Development
From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Alexey Dobriyan
	<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Dave Hansen
	<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Subject: Re: [RFC v14][PATCH 53/54] Detect resource leaks for whole-container checkpoint
Date: Thu, 7 May 2009 21:56:22 -0700
Message-ID: <20090508045622.GA31731@linux.vnet.ibm.com>
In-Reply-To: <4A025F7D.3050403-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>

Oren Laadan [orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org] wrote:
| 
| 
| Sukadev Bhattiprolu wrote:
| > Oren Laadan [orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org] wrote:
| > | Add a 'users' count to objhash items, and, for a !CHECKPOINT_SUBTREE
| > | checkpoint, return an error code if the actual objects' counts are
| > | higher, indicating leaks (references to the objects from a task not
| > | being checkpointed).  Of course, by this time most of the checkpoint
| > | image has been written out to disk, so this is purely advisory.  But
| > | then, it's probably naive to argue that anything more than an advisory
| > | 'this went wrong' error code is useful.
| > | 
| > | The comparison of the objhash user counts to object refcounts as a
| > | basis for checking for leaks comes from Alexey's OpenVZ-based c/r
| > | patchset.
| > | 
| > | Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
| > | ---
| > |  checkpoint/checkpoint.c    |    8 +++
| > |  checkpoint/memory.c        |    2 +
| > |  checkpoint/objhash.c       |  108 +++++++++++++++++++++++++++++++++++++++----
| > |  include/linux/checkpoint.h |    2 +
| > |  4 files changed, 110 insertions(+), 10 deletions(-)
| > | 
| > | diff --git a/checkpoint/checkpoint.c b/checkpoint/checkpoint.c
| > | index 4319976..32a0a8e 100644
| > | --- a/checkpoint/checkpoint.c
| > | +++ b/checkpoint/checkpoint.c
| > | @@ -498,6 +498,14 @@ int do_checkpoint(struct ckpt_ctx *ctx, pid_t pid)
| > |  	if (ret < 0)
| > |  		goto out;
| > | 
| > | +	if (!(ctx->flags & CHECKPOINT_SUBTREE)) {
| > | +		/* verify that all objects are contained (no leaks) */
| > | +		if (!ckpt_obj_contained(ctx)) {
| > | +			ret = -EBUSY;
| > | +			goto out;
| > | +		}
| > | +	}
| > | +
| > |  	/* on success, return (unique) checkpoint identifier */
| > |  	ctx->crid = atomic_inc_return(&ctx_count);
| > |  	ret = ctx->crid;
| > | diff --git a/checkpoint/memory.c b/checkpoint/memory.c
| > | index 7637c1e..5ae2b41 100644
| > | --- a/checkpoint/memory.c
| > | +++ b/checkpoint/memory.c
| > | @@ -687,6 +687,8 @@ static int do_checkpoint_mm(struct ckpt_ctx *ctx, struct mm_struct *mm)
| > |  			ret = exe_objref;
| > |  			goto out;
| > |  		}
| > | +		/* account for all references through vma/exe_file */
| > | +		ckpt_obj_users_inc(ctx, mm->exe_file, mm->num_exe_file_vmas);
| > 
| > Do we really need to add num_exe_file_vmas here ?
| > 
| > A quick look at all callers for added_exe_file_vma() seems to show that
| > those callers also do a get_file().
| 
| Each vma whose file is the same as mm->exe_file causes the refcount
| of that file to increase by 2: once by vma->vm_file, and once via
| added_exe_file_vma(). The c/r code calls ckpt_obj_checkpoint() only
| once, thus once one obj_file_grab() for that file. The code above
| accounts for the missing count.

If the executable is shared between a parent and child (as in fork()/dup_mm()),
do we still need to account for the added_exe_file_vma() in the child
process?

i.e., I can trace a call to added_exe_file_vma() when loading/mmapping a binary,
but I can't trace a call to added_exe_file_vma() during fork()/dup_mm().

Here is how I account for the 16 in obj->users :-)

	Parent:
		do_checkpoint_mm: +2	= 2	(first time/obj_new())
		num_exe_file_vmas: +2	= 4

		filemap_checkpoint: +1	= 5	(text section)
		filemap_checkpoint: +1	= 6	(data section)

	Child:
		do_checkpoint_mm: +1	= 7
		num_exe_file_vmas: +2	= 9

		filemap_checkpoint: +1	= 10	(text section)
		filemap_checkpoint: +1	= 11	(data section)

	Grand child:

		do_checkpoint_mm: +1	= 12
		num_exe_file_vmas: +2	= 14

		filemap_checkpoint: +1	= 15	(text section)
		filemap_checkpoint: +1	= 16	(data section)

Even if we were to drop the num_exe_file_vmas for the child and
grandchild, obj->users would be 12, still off by 2 from file->f_count :-(

As of now, I can account for 9 of the 10 found in file->f_count.


	Parent:
		load_aout/do_mmap(): +2	= 2	(text)
		load_aout/do_mmap(): +2	= 4	(data)

	Child:
		dup_mm()/dup_mmap(): +1	= 5	(text)
		dup_mm()/dup_mmap(): +1	= 6	(data)

	Grand Child:
		dup_mm()/dup_mmap(): +1	= 7	(text)
		dup_mm()/dup_mmap(): +1	= 8	(data)

	Checkpoint/Objhash:

		obj_new/obj_file_grab: +1 = 9

Another question is regarding the obj->users = 2 in obj_new():

	- one of these references is for the get_file() in obj_file_grab(),
	  called near the end of obj_new(), right?

	- where can I find the other get_file()?

(again with reference to the file the three processes are executing, ptree2)

Thanks,

Sukadev
