From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Alexey Dobriyan <adobriyan@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Cyrill Gorcunov <gorcunov@openvz.org>,
David Howells <dhowells@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Peter Zijlstra <peterz@infradead.org>,
Sasha Levin <levinsasha928@gmail.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/5] introduce proc_inode->pid_entry
Date: Sun, 10 Aug 2014 21:23:24 +0200 [thread overview]
Message-ID: <20140810192324.GA31721@redhat.com> (raw)
In-Reply-To: <87d2caod0g.fsf@x220.int.ebiederm.org>
[-- Attachment #1: Type: text/plain, Size: 2335 bytes --]
On 08/08, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
> > 6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
> > differs.
> >
> > Eric, et al, what do you think? At least something like 1-3 looks like a
> > good cleanup imho. And afaics we can do more cleanups on top.
>
>
> I see where you are getting annoyed.
>
> Taking a quick look at task_mmu.c It looks like the tgid vs pid logic
> to decided which stack or stacks to display is simply incorrect.
Yes, probably, but please forget this for now. Because,
> Given where you are starting I think tack_mmu.c code that decides
> when/which stack deserves a serious audit.
Yes. And more. At least this code needs more cleanups. ->task should
die or at least we should avoid get/put_task_struct, ->pid can die too,
hold_task_mempolicy() doesn't look correct (at least the "prevent changing
our mempolicy while show_numa_maps" comment and down_write() in
do_set_mempolicy(). I am going to try to cleanup this a bit after I change
task_nommu.c to avoid mm_access() in m_start().
But this is off-topic,
> At a practical level moving pid_entry into the proc inode is ugly
> especially for the hack that is is_tgid_pid_entry.
Well, I am not going to argue with maintainer, mostly simply because
I do not understand this code enough ;)
But let me say that I disagree. We already have ->fd, and ->sysctl*.
I see nothing wrong if *id_base_stuff files have more info for free.
And imo proc_inode->sysctl* is much worse. Simply because they are
not private to fs/proc/proc_sysctl.c.
Could you please look into the attached mbox? I am just curious if we
can do something like this in a clean way. In particular, please look at
"Note:" comment in 3/3. Perhaps proc_sys_init() can do proc_get_inode(),
initialize/instantiate it...
To clarify, of course it is not that I want to shrink sizeof(proc_inode).
Just to me it doesn't look clean that proc_evict_inode() has to do
sysctl_head_put(), grab_header() assumes that ->sysctl == NULL means
sysctl_table_root.*, etc.
> That test could be implemented more easily by looking at the parent
> directories inode operations and seeing if they are
> proc_root_inode_operations.
Hmm. Looking at inode? How?
Oleg.
[-- Attachment #2: U.m --]
[-- Type: text/plain, Size: 4885 bytes --]
>From a4c94461ae18b2d6cc2e8a9cfb042d0b6cc46e86 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sat, 9 Aug 2014 17:25:53 +0200
Subject: [PATCH 1/3] proc: introduce proc_sys_evict_inode()
Preparation. Shift the sysctl_head_put() code from proc_evict_inode()
into the new trivial helper, make sysctl_head_put() static.
---
fs/proc/inode.c | 8 ++------
fs/proc/internal.h | 4 ++--
fs/proc/proc_sysctl.c | 13 ++++++++++++-
3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 0adbc02..9f6715a 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -31,7 +31,6 @@
static void proc_evict_inode(struct inode *inode)
{
struct proc_dir_entry *de;
- struct ctl_table_header *head;
const struct proc_ns_operations *ns_ops;
void *ns;
@@ -45,11 +44,8 @@ static void proc_evict_inode(struct inode *inode)
de = PROC_I(inode)->pde;
if (de)
pde_put(de);
- head = PROC_I(inode)->sysctl;
- if (head) {
- RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
- sysctl_head_put(head);
- }
+
+ proc_sys_evict_inode(inode);
/* Release any associated namespace */
ns_ops = PROC_I(inode)->ns.ns_ops;
ns = PROC_I(inode)->ns.ns;
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 78784cd..7ce3262 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -238,10 +238,10 @@ extern int proc_setup_self(struct super_block *);
*/
#ifdef CONFIG_PROC_SYSCTL
extern int proc_sys_init(void);
-extern void sysctl_head_put(struct ctl_table_header *);
+extern void proc_sys_evict_inode(struct inode *inode);
#else
static inline void proc_sys_init(void) { }
-static inline void sysctl_head_put(struct ctl_table_header *head) { }
+static inline void proc_sys_evict_inode(struct inode *inode) { }
#endif
/*
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 7129046..2fa67e7 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -256,7 +256,7 @@ static void sysctl_head_get(struct ctl_table_header *head)
spin_unlock(&sysctl_lock);
}
-void sysctl_head_put(struct ctl_table_header *head)
+static void sysctl_head_put(struct ctl_table_header *head)
{
spin_lock(&sysctl_lock);
if (!--head->count)
@@ -264,6 +264,17 @@ void sysctl_head_put(struct ctl_table_header *head)
spin_unlock(&sysctl_lock);
}
+void proc_sys_evict_inode(struct inode *inode)
+{
+ struct ctl_table_header *head;
+
+ head = PROC_I(inode)->sysctl;
+ if (head) {
+ RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
+ sysctl_head_put(head);
+ }
+}
+
static struct ctl_table_header *sysctl_head_grab(struct ctl_table_header *head)
{
BUG_ON(!head);
--
1.5.5.1
>From e0b42f4770cd76069e4e70e48e4e7bfa84fab4bf Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sat, 9 Aug 2014 17:37:31 +0200
Subject: [PATCH 2/3] proc: change proc_sys_evict_inode() to check inode->i_op
Change proc_sys_evict_inode() to verify that this inode is really
a /proc/sys inode before using ->sysctl.
---
fs/proc/proc_sysctl.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 2fa67e7..863aaee 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -268,6 +268,10 @@ void proc_sys_evict_inode(struct inode *inode)
{
struct ctl_table_header *head;
+ if (inode->i_op != &proc_sys_inode_operations &&
+ inode->i_op != &proc_sys_dir_operations)
+ return;
+
head = PROC_I(inode)->sysctl;
if (head) {
RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
--
1.5.5.1
>From 9c8e98fc45f091d5d2a4bfc6dcd86b67275160a1 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sun, 10 Aug 2014 17:37:25 +0200
Subject: [PATCH 3/3] proc: place proc_inode->sysctl* and proc_inode->fd into a union
->sysctl* and ->fd members can share the memory, add a union.
Note: unfortunately proc_alloc_inode() still has to initialize ->sysctl*
members. And the only (afaics) reason is that proc_mkdir("sys") relies
on ->sysctl == NULL which means sysctl_table_root in grab_header(). It
would be nice to initialize these members after proc_get_inode("sys")
somehow and remove the special "NULL means global root" case in proc_sys
paths somehow, but I do not see how.
---
fs/proc/internal.h | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7ce3262..8d35ac0 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -60,11 +60,15 @@ union proc_op {
struct proc_inode {
struct pid *pid;
- int fd;
- union proc_op op;
struct proc_dir_entry *pde;
- struct ctl_table_header *sysctl;
- struct ctl_table *sysctl_entry;
+ union proc_op op;
+ union {
+ struct {
+ struct ctl_table_header *sysctl;
+ struct ctl_table *sysctl_entry;
+ };
+ int fd;
+ };
struct proc_ns ns;
struct inode vfs_inode;
};
--
1.5.5.1
prev parent reply other threads:[~2014-08-10 19:26 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
2014-08-08 20:05 ` Cyrill Gorcunov
2014-08-09 14:28 ` Oleg Nesterov
2014-08-10 7:16 ` Cyrill Gorcunov
2014-08-08 18:57 ` [RFC PATCH 2/5] proc: unify proc_tgid_stat() and proc_tid_stat() Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 3/5] proc: kill *_tid_*maps* stuff Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 4/5] proc: introduce pid_entry_name() Oleg Nesterov
2014-08-08 18:58 ` [RFC PATCH 5/5] proc: unify proc_pid_*maps* stuff Oleg Nesterov
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
2014-08-08 22:11 ` Eric W. Biederman
2014-08-10 19:23 ` Oleg Nesterov [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140810192324.GA31721@redhat.com \
--to=oleg@redhat.com \
--cc=adobriyan@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=dhowells@redhat.com \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=kirill@shutemov.name \
--cc=levinsasha928@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).