From: nagarathnam muthusamy <nagarathnam.muthusamy@oracle.com>
To: prakash sangappa <prakash.sangappa@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Oleg Nesterov <oleg@redhat.com>,
	Linux API <linux-api@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Eugene Syromiatnikov <esyr@redhat.com>
Subject: Re: [PATCH v4] pidns: introduce syscall translate_pid
Date: Wed, 01 Nov 2017 09:59:55 -0700
Message-ID: <59F9FD8B.8090607@oracle.com>
In-Reply-To: <59E689F5.2080706@oracle.com>

I believe all the questions raised in this thread have been answered.
Are there any outstanding questions?

Thanks,
Nagarathnam.

On 10/17/2017 3:53 PM, prakash sangappa wrote:
>
> On 10/17/2017 3:40 PM, Andy Lutomirski wrote:
>> On Tue, Oct 17, 2017 at 3:35 PM, prakash sangappa
>> <prakash.sangappa@oracle.com> wrote:
>>> On 10/17/2017 3:02 PM, Andy Lutomirski wrote:
>>>> On Tue, Oct 17, 2017 at 8:38 AM, Prakash Sangappa
>>>> <prakash.sangappa@oracle.com> wrote:
>>>>>
>>>>> On 10/16/17 5:52 PM, Andy Lutomirski wrote:
>>>>>> On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa
>>>>>> <prakash.sangappa@oracle.com> wrote:
>>>>>>>
>>>>>>> On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/16/2017 02:36 PM, Andrew Morton wrote:
>>>>>>>>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov
>>>>>>>>> <khlebnikov@yandex-team.ru> wrote:
>>>>>>>>>
>>>>>>>>>>>>> pid_t translate_pid(pid_t pid, int source, int target);
>>>>>>>>>>>>>
>>>>>>>>>>>>> This syscall converts a pid from the source pid-ns into a pid in
>>>>>>>>>>>>> the target pid-ns. If the pid is unreachable from the target
>>>>>>>>>>>>> pid-ns, it returns zero.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pid namespaces are referred to by file descriptors opened on the
>>>>>>>>>>>>> proc files /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children.
>>>>>>>>>>>>> A negative argument refers to the current pid namespace, same as
>>>>>>>>>>>>> the file /proc/self/ns/pid.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The kernel exposes virtual pids in /proc/[pid]/status:NSpid, but
>>>>>>>>>>>>> backward translation requires scanning all tasks. Pids can also
>>>>>>>>>>>>> be translated by sending them through a unix socket between
>>>>>>>>>>>>> namespaces, but this method is slow and insecure because the
>>>>>>>>>>>>> other side is exposed inside the pid namespace.
>>>>>>>>>> Andrew asked why we might need this.
>>>>>>>>>>
>>>>>>>>>> Such conversion is required for interaction between processes
>>>>>>>>>> across pid namespaces, for example to identify a process in a
>>>>>>>>>> container by its pid file when looking from outside.
>>>>>>>>>>
>>>>>>>>>> Two years ago I solved this in a project of mine with monstrous
>>>>>>>>>> code which forks a couple of times just to convert a pid; luckily
>>>>>>>>>> for me, performance wasn't important.
>>>>>>>>> That's a single user who needed this a single time, and found a
>>>>>>>>> userspace-based solution anyway.  This is not exactly compelling!
>>>>>>>>>
>>>>>>>>> Is there a stronger case to be made?  How does this change benefit
>>>>>>>>> our users?  Sell it to us!
>>>>>>>> Oracle Database is planning to use pid namespaces for sandboxing
>>>>>>>> database instances, and they need an API similar to translate_pid to
>>>>>>>> effectively translate process IDs from other pid namespaces. Prakash
>>>>>>>> (cc'ed on this mail) can provide more details on this use case.
>>>>>>>
>>>>>>> As Nagarathnam indicated, Oracle Database will be using pid
>>>>>>> namespaces and needs a direct method of converting pids of processes
>>>>>>> in the pid namespace hierarchy. In this use case multiple nested PID
>>>>>>> namespaces will be used.  The currently available mechanisms are not
>>>>>>> very efficient for this use case. For example, as Konstantin
>>>>>>> described, using /proc/<pid>/status would require the application to
>>>>>>> scan the status files of all pids to determine the pid of a given
>>>>>>> process in a child namespace.
>>>>>>>
>>>>>>> Use of the SCM_CREDENTIALS socket message is another way, which would
>>>>>>> require every process starting inside a pid namespace to send this
>>>>>>> message, and the receiving process in the target namespace would have
>>>>>>> to save the converted pid and reference it. This mechanism becomes
>>>>>>> cumbersome, especially if the application has to deal with multiple
>>>>>>> nested pid namespaces. Also, the Database needs to be able to convert
>>>>>>> a thread's global pid (gettid()). Passing the thread's pid (gettid())
>>>>>>> in an SCM_CREDENTIALS message requires CAP_SYS_ADMIN, which is an
>>>>>>> issue.
>>>>>>>
>>>>>>> So having a direct method, like the API that Konstantin is proposing,
>>>>>>> will work best for the Database, since the pid of a process in any of
>>>>>>> the nested pid namespaces can be converted as and when required. I
>>>>>>> think with the proposed API, the application should be able to
>>>>>>> convert the pid of a process or the tid (gettid()) of a thread as
>>>>>>> well.
>>>>>>>
>>>>>> Can you explain what Oracle's database is planning to do with this
>>>>>> information?
>>>>>
>>>>> The Database uses the PID to programmatically find out if the
>>>>> process/thread is alive (kill 0), to send signals to processes
>>>>> requesting them to dump status/debug information, and to kill the
>>>>> processes in case of a shutdown abort of the instance.
>>>> What I'm wondering is: how does the caller of kill() end up
>>>> controlling a task whose pid it doesn't know in its own namespace?
>>>
>>> I was generally describing how the DB would use the PID of a process.
>>> The above description was for the case when no namespaces are used.
>>>
>>> With the use of namespaces, the DB would convert the PIDs of processes
>>> inside its child namespaces to PIDs in its own namespace and use those
>>> to issue kill().
>> Seems vaguely sensible.
>>
>> If I were designing this type of system, I'd have a manager process in
>> each namespace running as PID 1, though -- PID 1 is special and needs
>> to understand what's going on anyway.  Then PID 1 would do the kill()
>> calls and wouldn't need translate_pid().
>
> Yes, this has been tried out with the prototype use of PID namespaces
> in the DB. It works, but it would be slow, as the manager would have to
> exchange messages with the controlling processes, which would be in the
> parent namespace. The DB could use the API to convert the pid.
>
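
Illustrative sketch, not part of the original thread: the quoted text
says that the kernel exposes per-namespace pids in the NSpid field of
/proc/[pid]/status, and that going from a namespace-local pid back to
an outer pid means scanning every task. The forward lookup for one task
might look roughly like the C below. The helper names parse_nspid_line()
and read_nspid() are hypothetical, and the code assumes a kernel new
enough to have the NSpid field (Linux 4.1 or later).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Parse one line of /proc/[pid]/status. If it is the "NSpid:" line,
 * store up to max ids in out (outermost visible namespace first) and
 * return how many were stored. Return 0 for any other line.
 */
size_t parse_nspid_line(const char *line, long *out, size_t max)
{
    static const char key[] = "NSpid:";
    size_t n = 0;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;

    const char *p = line + sizeof(key) - 1;
    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading tabs/spaces */
        if (end == p)
            break;                      /* no more numbers on the line */
        out[n++] = v;
        p = end;
    }
    return n;
}

/*
 * Read the NSpid ids of one task, addressed by its pid in the caller's
 * namespace. Returns the number of ids found, or 0 if the task does not
 * exist or the field is missing.
 */
size_t read_nspid(pid_t pid, long *out, size_t max)
{
    char path[64], line[256];
    size_t n = 0;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof(line), f)) {
        n = parse_nspid_line(line, out, max);
        if (n)
            break;
    }
    fclose(f);
    return n;
}
```

To translate in the reverse direction with only this interface, a
caller would have to run read_nspid() over every numeric entry in /proc
and compare the last id of each result against the namespace-local pid
it is looking for, which is the cost the thread objects to.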

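Illustrative sketch, not part of the original thread: the "kill 0"
liveness check mentioned in the quoted text can be written as below.
The function name pid_alive() is hypothetical. The pid passed in must
already be valid in the caller's own pid namespace, which is exactly
the value the Database would need a translation step to obtain for
processes in child namespaces.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Probe whether a process exists without delivering a signal.
 * Returns 1 if it exists, 0 if it does not, -1 on invalid input or
 * an unexpected error.
 */
int pid_alive(pid_t pid)
{
    if (pid <= 0)
        return -1;      /* 0 and negative values address process groups */
    if (kill(pid, 0) == 0)
        return 1;       /* exists and the caller may signal it */
    if (errno == EPERM)
        return 1;       /* exists, but the caller lacks permission */
    if (errno == ESRCH)
        return 0;       /* no such process in this namespace */
    return -1;
}
```

Signal 0 asks the kernel to perform only the existence and permission
checks, so the probe has no effect on the target process.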