From: Pierre Peiffer <pierre.peiffer@bull.net>
To: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org
Subject: Re: [PATCH 2.6.24-rc8-mm1 14/15] (RFC) IPC/semaphores: prepare semundo code to work on another task than current
Date: Fri, 01 Feb 2008 13:09:48 +0100 [thread overview]
Message-ID: <47A30C0C.8050905@bull.net> (raw)
In-Reply-To: <20080131180125.GA5617@sergelap.ibm.com>
Serge E. Hallyn wrote:
> Quoting Pierre Peiffer (pierre.peiffer@bull.net):
>>
>> Serge E. Hallyn wrote:
>>> Quoting pierre.peiffer@bull.net (pierre.peiffer@bull.net):
>>>> From: Pierre Peiffer <pierre.peiffer@bull.net>
>>>>
>>>> In order to modify the semundo-list of a task from procfs, we must be able to
>>>> work on any target task.
>>>> But all the existing code playing with the semundo-list, currently works
>>>> only on the 'current' task, and does not allow to specify any target task.
>>>>
>>>> This patch changes all these routines to allow them to work on a specified
>>>> task, passed in parameter, instead of current.
>>>>
>>>> This is mainly a preparation for the semundo_write() operation, on the
>>>> /proc/<pid>/semundo file, as provided in the next patch.
>>>>
>>>> Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
>>>> ---
>>>>
>>>> ipc/sem.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++----------------
>>>> 1 file changed, 68 insertions(+), 22 deletions(-)
>>>>
>>>> Index: b/ipc/sem.c
>>>> ===================================================================
>>>> --- a/ipc/sem.c
>>>> +++ b/ipc/sem.c
>>>> @@ -1017,8 +1017,9 @@ asmlinkage long sys_semctl (int semid, i
>>>> }
>>>>
>>>> /* If the task doesn't already have a undo_list, then allocate one
>>>> - * here. We guarantee there is only one thread using this undo list,
>>>> - * and current is THE ONE
>>>> + * here.
>>>> + * The target task (tsk) is current in the general case, except when
>>>> + * accessed from the procfs (ie when writting to /proc/<pid>/semundo)
>>>> *
>>>> * If this allocation and assignment succeeds, but later
>>>> * portions of this code fail, there is no need to free the sem_undo_list.
>>>> @@ -1026,22 +1027,60 @@ asmlinkage long sys_semctl (int semid, i
>>>> * at exit time.
>>>> *
>>>> * This can block, so callers must hold no locks.
>>>> + *
>>>> + * Note: task_lock is used to synchronize 1. several possible concurrent
>>>> + * creations and 2. the free of the undo_list (done when the task using it
>>>> + * exits). In the second case, we check the PF_EXITING flag to not create
>>>> + * an undo_list for a task which has exited.
>>>> + * If there already is an undo_list for this task, there is no need
>>>> + * to held the task-lock to retrieve it, as the pointer can not change
>>>> + * afterwards.
>>>> */
>>>> -static inline int get_undo_list(struct sem_undo_list **undo_listp)
>>>> +static inline int get_undo_list(struct task_struct *tsk,
>>>> + struct sem_undo_list **ulp)
>>>> {
>>>> - struct sem_undo_list *undo_list;
>>>> + if (tsk->sysvsem.undo_list == NULL) {
>>>> + struct sem_undo_list *undo_list;
>>> Hmm, this is weird. If there was no undo_list and
>>> tsk!=current, you set the refcnt to 2. But if there was an
>>> undo list and tsk!=current, where do you inc the refcnt?
>>>
>> I inc it outside this function, as I don't call get_undo_list() if there is an
>> undo_list.
>> This appears most clearly in the next patch, in semundo_open() for example.
>
> Ok, so however unlikely, there is a flow that could cause you a problem:
> T2 calls semundo_open() for T1. T1 does not yet have a semundolist.
> T2.semundo_open() calls get_undo_list, just then T1 creats its own
> semundo_list. T2 comes to top of get_undo_list() and see
> tsk->sysvsem.undo_list != NULL, simply returns a pointer to the
> undo_list. Now you never increment the count.
>
Right.
And yesterday, with more testing in the corners, I've found another issue: if I
use /proc/self/semundo, I don't have tsk != current and the refcnt is wrong too.
Thanks for finding this !
P.
>>>> - undo_list = current->sysvsem.undo_list;
>>>> - if (!undo_list) {
>>>> - undo_list = kzalloc(sizeof(*undo_list), GFP_KERNEL);
>>>> + /* we must alloc a new one */
>>>> + undo_list = kmalloc(sizeof(*undo_list), GFP_KERNEL);
>>>> if (undo_list == NULL)
>>>> return -ENOMEM;
>>>> +
>>>> + task_lock(tsk);
>>>> +
>>>> + /* check again if there is an undo_list for this task */
>>>> + if (tsk->sysvsem.undo_list) {
>>>> + if (tsk != current)
>>>> + atomic_inc(&tsk->sysvsem.undo_list->refcnt);
>>>> + task_unlock(tsk);
>>>> + kfree(undo_list);
>>>> + goto out;
>>>> + }
>>>> +
>>>> spin_lock_init(&undo_list->lock);
>>>> - atomic_set(&undo_list->refcnt, 1);
>>>> - undo_list->ns = get_ipc_ns(current->nsproxy->ipc_ns);
>>>> - current->sysvsem.undo_list = undo_list;
>>>> + /*
>>>> + * If tsk is not current (meaning that current is creating
>>>> + * a semundo_list for a target task through procfs), and if
>>>> + * it's not being exited then refcnt must be 2: the target
>>>> + * task tsk + current.
>>>> + */
>>>> + if (tsk == current)
>>>> + atomic_set(&undo_list->refcnt, 1);
>>>> + else if (!(tsk->flags & PF_EXITING))
>>>> + atomic_set(&undo_list->refcnt, 2);
>>>> + else {
>>>> + task_unlock(tsk);
>>>> + kfree(undo_list);
>>>> + return -EINVAL;
>>>> + }
>>>> + undo_list->ns = get_ipc_ns(tsk->nsproxy->ipc_ns);
>>>> + undo_list->proc_list = NULL;
>>>> + tsk->sysvsem.undo_list = undo_list;
>>>> + task_unlock(tsk);
>>>> }
>>>> - *undo_listp = undo_list;
>>>> +out:
>>>> + *ulp = tsk->sysvsem.undo_list;
>>>> return 0;
>>>> }
>>>>
>>>> @@ -1065,17 +1104,12 @@ static struct sem_undo *lookup_undo(stru
>>>> return un;
>>>> }
>>>>
>>>> -static struct sem_undo *find_undo(struct ipc_namespace *ns, int semid)
>>>> +static struct sem_undo *find_undo(struct sem_undo_list *ulp, int semid)
>>>> {
>>>> struct sem_array *sma;
>>>> - struct sem_undo_list *ulp;
>>>> struct sem_undo *un, *new;
>>>> + struct ipc_namespace *ns;
>>>> int nsems;
>>>> - int error;
>>>> -
>>>> - error = get_undo_list(&ulp);
>>>> - if (error)
>>>> - return ERR_PTR(error);
>>>>
>>>> spin_lock(&ulp->lock);
>>>> un = lookup_undo(ulp, semid);
>>>> @@ -1083,6 +1117,8 @@ static struct sem_undo *find_undo(struct
>>>> if (likely(un!=NULL))
>>>> goto out;
>>>>
>>>> + ns = ulp->ns;
>>>> +
>>>> /* no undo structure around - allocate one. */
>>>> sma = sem_lock_check(ns, semid);
>>>> if (IS_ERR(sma))
>>>> @@ -1133,6 +1169,7 @@ asmlinkage long sys_semtimedop(int semid
>>>> struct sem_array *sma;
>>>> struct sembuf fast_sops[SEMOPM_FAST];
>>>> struct sembuf* sops = fast_sops, *sop;
>>>> + struct sem_undo_list *ulp;
>>>> struct sem_undo *un;
>>>> int undos = 0, alter = 0, max;
>>>> struct sem_queue queue;
>>>> @@ -1177,9 +1214,13 @@ asmlinkage long sys_semtimedop(int semid
>>>> alter = 1;
>>>> }
>>>>
>>>> + error = get_undo_list(current, &ulp);
>>>> + if (error)
>>>> + goto out_free;
>>>> +
>>>> retry_undos:
>>>> if (undos) {
>>>> - un = find_undo(ns, semid);
>>>> + un = find_undo(ulp, semid);
>>>> if (IS_ERR(un)) {
>>>> error = PTR_ERR(un);
>>>> goto out_free;
>>>> @@ -1305,7 +1346,7 @@ int copy_semundo(unsigned long clone_fla
>>>> int error;
>>>>
>>>> if (clone_flags & CLONE_SYSVSEM) {
>>>> - error = get_undo_list(&undo_list);
>>>> + error = get_undo_list(current, &undo_list);
>>>> if (error)
>>>> return error;
>>>> atomic_inc(&undo_list->refcnt);
>>>> @@ -1405,10 +1446,15 @@ next_entry:
>>>> kfree(undo_list);
>>>> }
>>>>
>>>> -/* called from do_exit() */
>>>> +/* exit_sem: called from do_exit()
>>>> + * task_lock is used to synchronize with get_undo_list()
>>> Ok I had to think about this again. I'd like the comment
>>> here to point out that the task_lock here acts as a barrier
>>> between the prior setting of PF_EXITING and the undo_list
>>> being freed here, so that get_undo_list() will either see
>>> PF_EXITING is NOT in the tsk->flags, in which case it will
>>> insert the undo_list before the task_lock() is grabbed here,
>>> and with count=2, so that it gets correctly put here in
>>> exit_sem, or it will see PF_EXITING set and cancel the
>>> undo_list it was creating.
>>>
>> Yep, I will add this to clarify this point.
>>
>> Thanks Serge.
>>
>> P.
>>
>>>> + */
>>>> void exit_sem(struct task_struct *tsk)
>>>> {
>>>> - struct sem_undo_list *ul = tsk->sysvsem.undo_list;
>>>> + struct sem_undo_list *ul;
>>>> + task_lock(tsk);
>>>> + ul = tsk->sysvsem.undo_list;
>>>> + task_unlock(tsk);
>>>> if (ul) {
>>>> rcu_assign_pointer(tsk->sysvsem.undo_list, NULL);
>>>> synchronize_rcu();
>>>>
>>>> --
>>>> Pierre Peiffer
>>>> _______________________________________________
>>>> Containers mailing list
>>>> Containers@lists.linux-foundation.org
>>>> https://lists.linux-foundation.org/mailman/listinfo/containers
>>>
>> --
>> Pierre Peiffer
>
>
--
Pierre Peiffer
next prev parent reply other threads:[~2008-02-01 12:09 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-29 16:02 [PATCH 2.6.24-rc8-mm1 00/15] IPC: code rewrite + new functionalities pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 01/15] IPC/semaphores: code factorisation pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 02/15] IPC/shared memory: introduce shmctl_down pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 03/15] IPC/message queues: introduce msgctl_down pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 04/15] IPC/semaphores: move the rwmutex handling inside semctl_down pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 05/15] IPC/semaphores: remove one unused parameter from semctl_down() pierre.peiffer
2008-01-31 8:32 ` Nadia Derbey
2008-01-31 10:18 ` Pierre Peiffer
2008-01-31 11:30 ` Nadia Derbey
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 06/15] IPC: get rid of the use *_setbuf structure pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 07/15] IPC: introduce ipc_update_perm() pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 08/15] IPC: consolidate all xxxctl_down() functions pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 09/15] (RFC) IPC: new kernel API to change an ID pierre.peiffer
2008-01-29 21:06 ` Alexey Dobriyan
2008-01-30 9:52 ` Pierre Peiffer
2008-01-31 9:00 ` Pierre Peiffer
2008-01-31 9:54 ` Kirill Korotaev
2008-01-31 11:57 ` Pierre Peiffer
2008-01-31 13:11 ` Kirill Korotaev
2008-01-31 16:10 ` Cedric Le Goater
2008-02-04 13:41 ` Kirill Korotaev
2008-02-04 14:06 ` [Devel] " Pavel Emelyanov
2008-02-04 15:00 ` Daniel Lezcano
2008-02-04 15:16 ` Pavel Emelyanov
2008-02-05 9:51 ` Oren Laadan
2008-02-05 18:00 ` Dave Hansen
2008-02-05 18:42 ` Serge E. Hallyn
2008-02-06 2:07 ` Oren Laadan
2008-02-06 5:00 ` Serge E. Hallyn
2008-02-08 10:12 ` Pierre Peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 10/15] (RFC) IPC: new IPC_SETID command to modify " pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 11/15] (RFC) IPC: new IPC_SETALL command to modify all settings pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 12/15] (RFC) IPC/semaphores: make use of RCU to free the sem_undo_list pierre.peiffer
2008-01-30 21:26 ` Serge E. Hallyn
2008-01-31 9:52 ` Pierre Peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 13/15] (RFC) IPC/semaphores: per <pid> semundo file in procfs pierre.peiffer
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 14/15] (RFC) IPC/semaphores: prepare semundo code to work on another task than current pierre.peiffer
2008-01-30 21:44 ` Serge E. Hallyn
2008-01-31 9:48 ` Pierre Peiffer
2008-01-31 18:01 ` Serge E. Hallyn
2008-02-01 12:09 ` Pierre Peiffer [this message]
2008-01-29 16:02 ` [PATCH 2.6.24-rc8-mm1 15/15] (RFC) IPC/semaphores: add write() operation to semundo file in procfs pierre.peiffer
2008-02-02 18:23 ` [PATCH 2.6.24-rc8-mm1 00/15] IPC: code rewrite + new functionalities Pavel Machek
2008-02-04 13:52 ` Pierre Peiffer
2008-02-04 15:44 ` Benjamin Thery
2008-02-04 19:51 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47A30C0C.8050905@bull.net \
--to=pierre.peiffer@bull.net \
--cc=containers@lists.linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=serue@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox