From: "Nikita V. Youshchenko" <yoush-/llMDZXAvAOHXe+LvDLADg@public.gmane.org>
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: leo-n4oKp6kCDthKyFCjRbgQbg@public.gmane.org
Subject: Scheduling in atomic while restoring shm
Date: Wed, 24 Feb 2010 19:02:18 +0300
Message-ID: <201002241902.19623@zigzag.lvk.cs.msu.su>

Hi

While experimenting with the checkpoint-restart code (a version several
commits before v0.19), we ran into a "scheduling in atomic" issue.

It is still present in v0.19; the code below is taken from that version.

   247          down_write(&shm_ids->rw_mutex);
   248
   249          /* we are the sole owners/users of this ipc_ns, it can't go away */
   250          perms = ipc_lock(shm_ids, h->perms.id);
   251          BUG_ON(IS_ERR(perms));  /* ipc_ns is private to us */
   252
   253          shp = container_of(perms, struct shmid_kernel, shm_perm);
   254          file = shp->shm_file;
   255          get_file(file);
   256
   257          ret = load_ipc_shm_hdr(ctx, h, shp);
   258          if (ret < 0)
   259                  goto mutex;
   260
   261          /* deposit in objhash and read contents in */
   262          ret = ckpt_obj_insert(ctx, file, h->objref, CKPT_OBJ_FILE);
   263          if (ret < 0)
   264                  goto mutex;
   265          ret = restore_memory_contents(ctx, file->f_dentry->d_inode);
   266   mutex:
   267          fput(file);
   268          if (ret < 0) {
   269                  ckpt_debug("shm: need to remove (%d)\n", ret);
   270                  do_shm_rmid(ns, perms);
   271          } else
   272                  ipc_unlock(perms);
   273          up_write(&shm_ids->rw_mutex);

So restore_ipc_shm() calls ipc_lock() and then restore_memory_contents().
ipc_lock() takes a spinlock, and that spinlock is still held when
restore_memory_contents() reads the checkpoint data. The read goes through
vfs_read() and ends up in schedule() further down the call chain.

Looks like a bug.
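
To make the ordering problem concrete, here is a small userspace model of
it. This is only a sketch, not kernel code and not a proposed patch: the
names atomic_depth, model_spin_lock(), model_spin_unlock(),
model_blocking_read(), restore_buggy_order() and restore_unlocked_read()
are all made up for illustration. A counter stands in for the "in atomic
context" state, and any blocking call made while it is non-zero is counted
as a violation. The second function shows one possible reordering (pin the
file, drop the lock, then read); whether that is safe for the real shm
code depends on details this model ignores.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for "atomic context": non-zero means a spinlock
 * is held and sleeping is not allowed. */
static int atomic_depth;
static int sleep_violations;

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Stand-in for any call that may block, e.g. a read from a pipe. */
static void model_blocking_read(void)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "model: scheduling while atomic (depth %d)\n",
			atomic_depth);
	}
}

/* Same order as the quoted code: lock, blocking read, unlock. */
static int restore_buggy_order(void)
{
	sleep_violations = 0;
	model_spin_lock();	/* models ipc_lock() */
	model_blocking_read();	/* models restore_memory_contents() */
	model_spin_unlock();	/* models ipc_unlock() */
	return sleep_violations;
}

/* One possible reordering: take a reference, unlock, then read. */
static int restore_unlocked_read(void)
{
	int file_refs = 0;

	sleep_violations = 0;
	model_spin_lock();
	file_refs++;		/* models get_file(): pins the file */
	model_spin_unlock();
	model_blocking_read();	/* no spinlock held here */
	file_refs--;		/* models fput() */
	assert(file_refs == 0);
	return sleep_violations;
}
```

Calling restore_buggy_order() reports one violation, while
restore_unlocked_read() reports none, because the blocking call happens
only after the modeled spinlock has been dropped.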

Here is a backtrace:

[  145.795810] BUG: scheduling while atomic: multitask/433/0x00000003
[  145.796661] Modules linked in:
[  145.796992] Pid: 433, comm: multitask Not tainted 2.6.33-rc5 #2
[  145.797520] Call Trace:
[  145.797833]  [<c11e096b>] ? schedule+0x80/0x627
[  145.798266]  [<c11e1f6b>] ? _raw_spin_unlock_irqrestore+0x1f/0x29
[  145.798823]  [<c1110c54>] ? debug_check_no_obj_freed+0x11d/0x175
[  145.799451]  [<c11e219d>] ? _raw_spin_lock_irqsave+0x11/0x2a
[  145.800244]  [<c1036623>] ? prepare_to_wait+0x14/0x54
[  145.800872]  [<c108171e>] ? pipe_wait+0x4a/0x61
[  145.801442]  [<c10364a4>] ? autoremove_wake_function+0x0/0x2d
[  145.802113]  [<c1081e39>] ? pipe_read+0x2c4/0x327
[  145.802641]  [<c107b8e5>] ? do_sync_read+0x9c/0xe0
[  145.803176]  [<c110a3b2>] ? radix_tree_insert+0x135/0x16d
[  145.803762]  [<c11e1f42>] ? _raw_spin_unlock_irq+0x1e/0x28
[  145.804561]  [<c1058e97>] ? add_to_page_cache_locked+0xc2/0xca
[  145.805191]  [<c10e60f2>] ? security_file_permission+0xc/0xd
[  145.805798]  [<c107b849>] ? do_sync_read+0x0/0xe0
[  145.806292]  [<c107c127>] ? vfs_read+0x73/0xa1
[  145.806783]  [<c10fd87c>] ? ckpt_kread+0x6e/0xc6
[  145.807297]  [<c1104c54>] ? restore_read_page+0x1a/0x49
[  145.807857]  [<c1104ec0>] ? restore_memory_contents+0x23d/0x2f7
[  145.808727]  [<c10e0231>] ? restore_ipc_shm+0x296/0x32d
[  145.809302]  [<c10df9e9>] ? restore_ipc_any+0xa5/0x119
[  145.809865]  [<c10dfb06>] ? restore_ipc_ns+0xa9/0x112
[  145.810406]  [<c10dff9b>] ? restore_ipc_shm+0x0/0x32d
[  145.810962]  [<c10fe1cc>] ? restore_obj+0x98/0x116
[  145.811483]  [<c10ffe71>] ? ckpt_read_obj_dispatch+0x220/0x246
[  145.812238]  [<c10ffead>] ? ckpt_read_obj+0x16/0xe8
[  145.812857]  [<c107b522>] ? fsnotify_access+0x5a/0x61
[  145.813406]  [<c1100001>] ? ckpt_read_obj_type+0x16/0x70
[  145.813975]  [<c1039a6c>] ? restore_ns+0x18/0x12b
[  145.814483]  [<c10fe1cc>] ? restore_obj+0x98/0x116
[  145.815011]  [<c10ffe71>] ? ckpt_read_obj_dispatch+0x220/0x246
[  145.815636]  [<c10ffead>] ? ckpt_read_obj+0x16/0xe8
[  145.816429]  [<c1100001>] ? ckpt_read_obj_type+0x16/0x70
[  145.817030]  [<c1102abc>] ? restore_task+0x512/0x9fc
[  145.817574]  [<c11011dd>] ? do_restart+0xff4/0x12f3
[  145.818114]  [<c10364a4>] ? autoremove_wake_function+0x0/0x2d
[  145.818735]  [<c10fd1a5>] ? do_sys_restart+0x66/0x77
[  145.819271]  [<c1002795>] ? ptregs_restart+0x15/0x1c
[  145.819816]  [<c1002690>] ? sysenter_do_call+0x12/0x26


Another related bug: if load_ipc_shm_hdr() fails at line 257, control
is transferred to the mutex: label with a negative ret value, and
ipc_unlock() is not called on this path.
