From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
Alexey Dobriyan
<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Dave Hansen
<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Subject: Re: [RFC v14][PATCH 53/54] Detect resource leaks for whole-containercheckpoint
Date: Thu, 7 May 2009 21:56:22 -0700 [thread overview]
Message-ID: <20090508045622.GA31731@linux.vnet.ibm.com> (raw)
In-Reply-To: <4A025F7D.3050403-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Oren Laadan [orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org] wrote:
|
|
| Sukadev Bhattiprolu wrote:
| > Oren Laadan [orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org] wrote:
| > | Add a 'users' count to objhash items, and, for a !CHECKPOINT_SUBTREE
| > | checkpoint, return an error code if the actual objects' counts are
| > | higher, indicating leaks (references to the objects from a task not
| > | being checkpointed). Of course, by this time most of the checkpoint
| > | image has been written out to disk, so this is purely advisory. But
| > | then, it's probably naive to argue that anything more than an advisory
| > | 'this went wrong' error code is useful.
| > |
| > | The comparison of the objhash user counts to object refcounts as a
| > | basis for checking for leaks comes from Alexey's OpenVZ-based c/r
| > | patchset.
| > |
| > | Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
| > | ---
| > | checkpoint/checkpoint.c | 8 +++
| > | checkpoint/memory.c | 2 +
| > | checkpoint/objhash.c | 108 +++++++++++++++++++++++++++++++++++++++----
| > | include/linux/checkpoint.h | 2 +
| > | 4 files changed, 110 insertions(+), 10 deletions(-)
| > |
| > | diff --git a/checkpoint/checkpoint.c b/checkpoint/checkpoint.c
| > | index 4319976..32a0a8e 100644
| > | --- a/checkpoint/checkpoint.c
| > | +++ b/checkpoint/checkpoint.c
| > | @@ -498,6 +498,14 @@ int do_checkpoint(struct ckpt_ctx *ctx, pid_t pid)
| > | if (ret < 0)
| > | goto out;
| > |
| > | + if (!(ctx->flags & CHECKPOINT_SUBTREE)) {
| > | + /* verify that all objects are contained (no leaks) */
| > | + if (!ckpt_obj_contained(ctx)) {
| > | + ret = -EBUSY;
| > | + goto out;
| > | + }
| > | + }
| > | +
| > | /* on success, return (unique) checkpoint identifier */
| > | ctx->crid = atomic_inc_return(&ctx_count);
| > | ret = ctx->crid;
| > | diff --git a/checkpoint/memory.c b/checkpoint/memory.c
| > | index 7637c1e..5ae2b41 100644
| > | --- a/checkpoint/memory.c
| > | +++ b/checkpoint/memory.c
| > | @@ -687,6 +687,8 @@ static int do_checkpoint_mm(struct ckpt_ctx *ctx, struct mm_struct *mm)
| > | ret = exe_objref;
| > | goto out;
| > | }
| > | + /* account for all references through vma/exe_file */
| > | + ckpt_obj_users_inc(ctx, mm->exe_file, mm->num_exe_file_vmas);
| >
| > Do we really need to add num_exe_file_vmas here ?
| >
| > A quick look at all callers for added_exe_file_vma() seems to show that
| > those callers also do a get_file().
|
| Each vma whose file is the same as mm->exe_file causes the refcount
| of that file to increase by 2: once by vma->vm_file, and once via
| added_exe_file_vma(). The c/r code calls ckpt_obj_checkpoint() only
| once, thus once one obj_file_grab() for that file. The code above
| accounts for the missing count.
If the executable is shared between a parent and child (as in fork()/dup_mm)
do we still need to account for the 'added_exe_file_vma()' in the child
process ?
i.e I can trace a call to added_exe_file_vma() when loading/mmaping a biniary.
But I can't trace a call to added_exe_file_vma() during fork()/dup_mm()).
Here is how I can account for the 16 in the obj->users :-)
Parent:
do_checkpoint_mm: +2 = 2 (first time/obj_new())
num_exe_vmas: +2 = 4
filemap_checkpoint: +1 = 5 (text section)
filemap_checkpoint: +1 = 6 (data section)
Child:
do_checkpoint_mm: +1 = 7
num_exe_file_vmas: +2 = 9
filemap_checkpoint: +1 = 10 (text section)
filemap_checkpoint: +1 = 11 (data section)
Grand child:
do_checkpoint_mm: +1 = 12
num_exe_file_vmas: +2 = 14
filemap_checkpoint: +1 = 15 (text section)
filemap_checkpoint: +1 = 16 (data section)
Even if we were to drop the num_exe_file_vmas for the child and
grand-child, we would be off by 2 :-(
As of now, I can account for 9 of the 10 found in file->f_count.
Parent:
load_a.out/do_mmap: +2 = 2 (text)
load_aout/do_mmap(): +2 = 4 (data)
Child:
dup_mm()/dup_mmap(): +1 = 5 (text)
dup_mm()/dup_mmap(): +1 = 6 (data)
Grand Child:
dup_mm()/dup_mmap(): +1 = 7 (text)
dup_mm()/dup_mmap(): +1 = 8 (data)
Checkpoint/Objhash:
obj_new/obj_file_grab: +1 = 9
Another question is regarding the obj->users = 2 in obj_new():
- one of this reference is for the get_file() in obj_file_grab()
called near the end of obj_new() right ?
- where can I find the other get_file() ?
(again with reference to the file the three process are executing, ptree2)
Thanks,
Sukadev
next prev parent reply other threads:[~2009-05-08 4:56 UTC|newest]
Thread overview: 107+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-28 23:23 [RFC v14][PATCH 00/54] Kernel based checkpoint/restart Oren Laadan
[not found] ` <1240961064-13991-1-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-28 23:23 ` [RFC v14][PATCH 01/54] Create syscalls: sys_checkpoint, sys_restart Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 02/54] Checkpoint/restart: initial documentation Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 03/54] Make file_pos_read/write() public Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 04/54] General infrastructure for checkpoint restart Oren Laadan
[not found] ` <1240961064-13991-5-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 0:58 ` Serge E. Hallyn
[not found] ` <20090429005826.GA23583-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-04-29 17:49 ` Oren Laadan
[not found] ` <49F8932D.4040506-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 18:15 ` Serge E. Hallyn
2009-04-29 17:12 ` Serge E. Hallyn
2009-05-06 20:39 ` Sukadev Bhattiprolu
[not found] ` <20090506203955.GA6003-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2009-05-06 20:57 ` Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 05/54] x86 support for checkpoint/restart Oren Laadan
[not found] ` <1240961064-13991-6-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-01 15:12 ` Dave Hansen
2009-04-28 23:23 ` [RFC v14][PATCH 06/54] Introduce method 'checkpoint' in struct vm_operations_struct Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 07/54] cr: extend arch_setup_additional_pages() Oren Laadan
[not found] ` <1240961064-13991-8-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-01 15:13 ` Dave Hansen
2009-05-01 15:42 ` Serge E. Hallyn
[not found] ` <20090501154220.GA26771-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-05-01 15:57 ` Dave Hansen
2009-05-01 16:18 ` Serge E. Hallyn
[not found] ` <20090501161813.GA27516-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-05-04 7:25 ` Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 08/54] Dump memory address space Oren Laadan
[not found] ` <1240961064-13991-9-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 4:11 ` Serge E. Hallyn
[not found] ` <20090429041128.GA28018-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-04-29 6:42 ` Guenter Roeck
[not found] ` <20090429064241.GA17482-gvzKVTG1yJJBDgjK7y7TUQ@public.gmane.org>
2009-04-29 20:00 ` Oren Laadan
2009-04-30 4:54 ` Matt Helsley
2009-05-01 15:25 ` Dave Hansen
2009-05-01 15:27 ` Dave Hansen
2009-05-04 7:58 ` Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 09/54] Restore " Oren Laadan
[not found] ` <1240961064-13991-10-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-01 15:28 ` Dave Hansen
2009-04-28 23:23 ` [RFC v14][PATCH 10/54] Infrastructure for shared objects Oren Laadan
[not found] ` <1240961064-13991-11-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 1:03 ` Serge E. Hallyn
2009-04-29 16:21 ` Serge E. Hallyn
2009-04-28 23:23 ` [RFC v14][PATCH 11/54] Introduce 'checkpoint' method in 'struct file_operations' Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 12/54] Dump open file descriptors Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 13/54] add generic checkpoint f_op to ext fses Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 14/54] Restore open file descriptors Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 15/54] Record 'struct file' object instead of the file name for VMAs Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 16/54] External checkpoint of a task other than ourself Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 17/54] c/r of restart-blocks: export functionality used in next patch Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 18/54] c/r of restart-blocks Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 19/54] Checkpoint multiple processes Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 20/54] Restart " Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 21/54] Define subtree flag and unpriv_allowed sysctl Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 22/54] Checkpoint open pipes Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 23/54] Restore " Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 24/54] Prepare to support shared memory Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 25/54] Dump anonymous- and file-mapped- " Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 26/54] Restore " Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 27/54] s390: Expose a constant for the number of words representing the CRs Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 28/54] c/r: Add CKPT_COPY() macro (v4) Oren Laadan
2009-04-28 23:23 ` [RFC v14][PATCH 29/54] s390: define s390-specific checkpoint-restart code (v7) Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 30/54] powerpc: provide APIs for validating and updating DABR Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 31/54] powerpc: checkpoint/restart implementation Oren Laadan
[not found] ` <1240961064-13991-32-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 6:54 ` Nathan Lynch
[not found] ` <m34ow8ueyk.fsf-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
2009-04-29 15:49 ` Serge E. Hallyn
2009-04-29 18:05 ` Oren Laadan
[not found] ` <49F896E8.7020802-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 20:55 ` Nathan Lynch
2009-04-29 18:18 ` Oren Laadan
[not found] ` <49F899E1.2030207-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-29 20:33 ` Nathan Lynch
2009-04-28 23:24 ` [RFC v14][PATCH 32/54] powerpc: wire up checkpoint and restart syscalls Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 33/54] powerpc: enable checkpoint support in Kconfig Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 34/54] Export fs/exec.c:exec_mmap() Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 35/54] Support for share memory address spaces Oren Laadan
[not found] ` <1240961064-13991-36-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-20 17:55 ` Dave Hansen
2009-05-20 18:23 ` Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 36/54] Make ckpt_may_checkpoint_task() check each namespace individually Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 37/54] c/r: Add UTS support (v6) Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 38/54] Stub implementation of IPC namespace c/r Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 39/54] deferqueue: generic queue to defer work Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 40/54] ipc: allow allocation of an ipc object with desired identifier Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 41/54] ipc: helpers to save and restore kern_ipc_perm structures Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 42/54] ipc namespace: save and restore ipc namespace basics Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 43/54] sysvipc-shm: checkpoint Oren Laadan
[not found] ` <1240961064-13991-44-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-15 19:20 ` Serge E. Hallyn
2009-04-28 23:24 ` [RFC v14][PATCH 44/54] sysvipc-shm: restart Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 45/54] sysvipc-shm: export interface from ipc/shm.c to delete ipc shm Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 46/54] sysvipc-shm: correctly handle deleted (active) ipc shared memory Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 47/54] sysvipc-msg: make 'struct msg_msgseg' visible in ipc/util.h Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 48/54] sysvipc-msq: checkpoint Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 49/54] sysvipc-msq: restart Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 50/54] sysvipc-sem: export interface from ipc/sem.c to cleanup ipc sem Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 51/54] sysvipc-sem: checkpoint Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 52/54] sysvipc-sem: restart Oren Laadan
2009-04-28 23:24 ` [RFC v14][PATCH 53/54] Detect resource leaks for whole-container checkpoint Oren Laadan
[not found] ` <1240961064-13991-54-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-01 17:26 ` Dave Hansen
2009-05-07 3:50 ` Sukadev Bhattiprolu
[not found] ` <20090507035026.GB6003-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2009-05-07 4:11 ` Oren Laadan
[not found] ` <4A025F7D.3050403-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-07 6:13 ` [RFC v14][PATCH 53/54] Detect resource leaks for whole-containercheckpoint Sukadev Bhattiprolu
[not found] ` <20090507061321.GA13725-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2009-05-07 6:24 ` Sukadev Bhattiprolu
2009-05-07 21:45 ` Matt Helsley
[not found] ` <20090507214501.GA29671-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-05-08 13:44 ` Oren Laadan
2009-05-08 4:56 ` Sukadev Bhattiprolu [this message]
[not found] ` <20090508045622.GA31731-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2009-05-08 8:12 ` [RFC v14][PATCH 53/54] Detect resource leaks forwhole-containercheckpoint Matt Helsley
2009-04-28 23:24 ` [RFC v14][PATCH 54/54] Report failures during checkpoint as an object in the output stream Oren Laadan
2009-04-29 8:18 ` [RFC v14][PATCH 00/54] Kernel based checkpoint/restart Louis Rilling
[not found] ` <20090429081815.GA1813-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-04-29 22:47 ` Oren Laadan
[not found] ` <49F8D8FC.8010400-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-04-30 9:41 ` Louis Rilling
[not found] ` <20090430094106.GC13896-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-05-04 8:03 ` Matthieu Fertré
[not found] ` <49FEA136.2040406-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
2009-05-04 9:06 ` Oren Laadan
[not found] ` <49FEB01B.208-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-05-04 9:17 ` Matthieu Fertré
2009-05-04 13:01 ` Serge E. Hallyn
[not found] ` <20090504130108.GA21521-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-05-04 20:13 ` Oren Laadan
2009-05-05 8:20 ` Louis Rilling
[not found] ` <20090505082057.GA11377-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-05-05 13:49 ` Serge E. Hallyn
[not found] ` <20090505134920.GB10136-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-05-05 14:26 ` Louis Rilling
2009-05-04 19:13 ` Oren Laadan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090508045622.GA31731@linux.vnet.ibm.com \
--to=sukadev-23vcf4htsmix0ybbhkvfkdbpr1lh4cv8@public.gmane.org \
--cc=adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
--cc=orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.