From: Prakash Sangappa <prakash.sangappa@oracle.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
Oleg Nesterov <oleg@redhat.com>,
Linux API <linux-api@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Serge Hallyn <serge.hallyn@ubuntu.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Eugene Syromiatnikov <esyr@redhat.com>
Subject: Re: [PATCH v4] pidns: introduce syscall translate_pid
Date: Tue, 17 Oct 2017 08:38:42 -0700
Message-ID: <a41bbfdf-6af5-6b29-36bf-1ed677b6ca75@oracle.com>
In-Reply-To: <CALCETrUg0xrkWnsQhq5L9RpDunrD8w7C3EjxeOPPrQv2h1KMEA@mail.gmail.com>

On 10/16/17 5:52 PM, Andy Lutomirski wrote:
> On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa
> <prakash.sangappa@oracle.com> wrote:
>>
>> On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote:
>>>
>>>
>>> On 10/16/2017 02:36 PM, Andrew Morton wrote:
>>>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov
>>>> <khlebnikov@yandex-team.ru> wrote:
>>>>
>>>>>>>> pid_t translate_pid(pid_t pid, int source, int target);
>>>>>>>>
>>>>>>>> This syscall converts pid from source pid-ns into pid in target
>>>>>>>> pid-ns.
>>>>>>>> If pid is unreachable from target pid-ns it returns zero.
>>>>>>>>
>>>>>>>> Pid-namespaces are referred file descriptors opened to proc files
>>>>>>>> /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children. Negative
>>>>>>>> argument refers to current pid namespace, same as file
>>>>>>>> /proc/self/ns/pid.
>>>>>>>>
>>>>>>>> Kernel expose virtual pids in /proc/[pid]/status:NSpid, but backward
>>>>>>>> translation requires scanning all tasks. Also pids could be
>>>>>>>> translated by sending them through unix socket between namespaces,
>>>>>>>> this method is slow and insecure because other side is exposed
>>>>>>>> inside pid namespace.
>>>>> Andrew asked why we might need this.
>>>>>
>>>>> Such conversion is required for interaction between processes across
>>>>> pid-namespaces. For example to identify process in container by pid
>>>>> file looking from outside.
>>>>>
>>>>> Two years ago I've solved this in project of mine with monstrous code
>>>>> which forks couple times just to convert pid, lucky for me performance
>>>>> wasn't important.
>>>> That's a single user who needed this a single time, and found a
>>>> userspace-based solution anyway. This is not exactly compelling!
>>>>
>>>> Is there a stronger case to be made? How does this change benefit our
>>>> users? Sell it to us!
>>> Oracle database is planning to use pid namespace for sandboxing database
>>> instances and they need an API similar to translate_pid to effectively
>>> translate process IDs from other pid namespaces. Prakash (cced in mail) can
>>> provide more details on this usecase.
>>
>> As Nagarathnam indicated, Oracle Database will be using pid namespaces and
>> needs a direct method of converting pids of processes in the pid namespace
>> hierarchy. In this use case multiple nested PID namespaces will be used.
>> The currently available mechanism are not very efficient for this use
>> case. For ex. as Konstantin described, using /proc/<pid>/status would
>> require the application to scan all the pid's status files to determine
>> the pid of given process in a child namespace.
>>
>> Use of SCM_CREDENTIALS's socket message is another way, which would require
>> every process starting inside a pid namespace to send this message and the
>> receiving process in the target namespace would have to save the converted
>> pid and reference it. This mechanism becomes cumbersome especially if the
>> application has to deal with multiple nested pid namespaces. Also, the
>> Database needs to be able to convert a thread's global pid(gettid()).
>> Passing the thread's pid(gettid()) in SCM_CREDENTIALS message requires
>> CAP_SYS_ADMIN, which is an issue.
>>
>> So having a direct method, like the API that Konstantin is proposing, will
>> work best for the Database since pid of a process in any of the nested pid
>> namespaces can be converted as and when required. I think with the
>> proposed API, the application should be able to convert pid of a process
>> or tid(gettid()) of a thread as well.
>>
>
> Can you explain what Oracle's database is planning to do with this information?

The database uses the PID to check programmatically whether a
process/thread is alive (kill with signal 0), to send signals to
processes requesting that they dump status/debug information, and to
kill the processes in case of a shutdown abort of the instance.

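For illustration, the liveness check could look roughly like the
sketch below. This is only a minimal example, not the database's
actual code; the helper name pid_is_alive() is made up here. It
assumes the pid passed in is already valid in the caller's own pid
namespace, which is exactly the part that needs a translation step
when the target lives in a nested namespace.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: probe a process with signal 0.
 * kill(pid, 0) delivers no signal; it only performs the existence
 * and permission checks.
 *
 * Returns 1 if the pid exists in the caller's pid namespace,
 *         0 if no such process exists (ESRCH),
 *        -1 for pid <= 0 or any other error.
 */
static int pid_is_alive(pid_t pid)
{
    /* 0 and negative values address process groups, not one process. */
    if (pid <= 0)
        return -1;

    if (kill(pid, 0) == 0)
        return 1;

    /* EPERM: the process exists, but we are not allowed to signal it. */
    if (errno == EPERM)
        return 1;

    if (errno == ESRCH)
        return 0;

    return -1;
}
```

Note that a zombie that has not been reaped yet still counts as
existing for this check, so a return value of 1 does not by itself
mean the process is still doing work.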
-Prakash.