* [RFC PATCH 0/5] introduce proc_inode->pid_entry
@ 2014-08-08 18:57 Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
` (5 more replies)
0 siblings, 6 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:57 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
Hello,
Obviously not for inclusion. The patches are horrible, break task_nommu.c,
untested, etc. Only to explain what I mean and discuss the intent, at least.
On top of recent /proc/pid/*maps* cleanups I sent.
To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
differs.
Eric, et al, what do you think? At least something like 1-3 looks like a
good cleanup imho. And afaics we can do more cleanups on top.
Oleg.
fs/proc/array.c | 17 +---
fs/proc/base.c | 37 +++++++--
fs/proc/inode.c | 1 +
fs/proc/internal.h | 16 ++--
fs/proc/task_mmu.c | 237 +++++++++++++---------------------------------------
5 files changed, 98 insertions(+), 210 deletions(-)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry()
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
@ 2014-08-08 18:57 ` Oleg Nesterov
2014-08-08 20:05 ` Cyrill Gorcunov
2014-08-08 18:57 ` [RFC PATCH 2/5] proc: unify proc_tgid_stat() and proc_tid_stat() Oleg Nesterov
` (4 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:57 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
Add the new "struct pid_entry *" member into struct proc_inode, and
the new helper, is_tgid_pid_entry(), it can be used by *base_stuff
to figure out whether we need tgid or tid info.
Note: this obviously blows struct proc_inode, but I think it is
simple to put ->fd, ->sysctl/sysctl_entry, and the new member in
the union, the only complication is that proc_evict_inode() should
be carefull with ->sysctl.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/base.c | 16 +++++++++++++++-
fs/proc/inode.c | 1 +
fs/proc/internal.h | 4 ++++
3 files changed, 20 insertions(+), 1 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9aef9ac..ea11bf1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2124,6 +2124,9 @@ static int proc_pident_instantiate(struct inode *dir,
goto out;
ei = PROC_I(inode);
+ ei->pid_entry = p;
+ ei->op = p->op;
+
inode->i_mode = p->mode;
if (S_ISDIR(inode->i_mode))
set_nlink(inode, 2); /* Use getattr to fix if necessary */
@@ -2131,7 +2134,7 @@ static int proc_pident_instantiate(struct inode *dir,
inode->i_op = p->iop;
if (p->fop)
inode->i_fop = p->fop;
- ei->op = p->op;
+
d_set_d_op(dentry, &pid_dentry_operations);
d_add(dentry, inode);
/* Close the race of the process dying before we return the dentry */
@@ -2978,6 +2981,17 @@ static const struct pid_entry tid_base_stuff[] = {
#endif
};
+bool is_tgid_pid_entry(const struct pid_entry *p)
+{
+ if (p >= tgid_base_stuff &&
+ p < tgid_base_stuff + ARRAY_SIZE(tgid_base_stuff))
+ return true;
+ if (p >= tid_base_stuff &&
+ p < tid_base_stuff + ARRAY_SIZE(tid_base_stuff))
+ return false;
+ BUG();
+}
+
static int proc_tid_base_readdir(struct file *file, struct dir_context *ctx)
{
return proc_pident_readdir(file, ctx,
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 0adbc02..45ad621 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -73,6 +73,7 @@ static struct inode *proc_alloc_inode(struct super_block *sb)
ei->pde = NULL;
ei->sysctl = NULL;
ei->sysctl_entry = NULL;
+ ei->pid_entry = NULL;
ei->ns.ns = NULL;
ei->ns.ns_ops = NULL;
inode = &ei->vfs_inode;
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 78784cd..2e310b6 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -15,6 +15,7 @@
#include <linux/atomic.h>
#include <linux/binfmts.h>
+struct pid_entry;
struct ctl_table_header;
struct mempolicy;
@@ -65,6 +66,7 @@ struct proc_inode {
struct proc_dir_entry *pde;
struct ctl_table_header *sysctl;
struct ctl_table *sysctl_entry;
+ const struct pid_entry *pid_entry;
struct proc_ns ns;
struct inode vfs_inode;
};
@@ -147,6 +149,8 @@ out:
*/
extern const struct file_operations proc_tid_children_operations;
+extern bool is_tgid_pid_entry(const struct pid_entry *p);
+
extern int proc_tid_stat(struct seq_file *, struct pid_namespace *,
struct pid *, struct task_struct *);
extern int proc_tgid_stat(struct seq_file *, struct pid_namespace *,
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH 2/5] proc: unify proc_tgid_stat() and proc_tid_stat()
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
@ 2014-08-08 18:57 ` Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 3/5] proc: kill *_tid_*maps* stuff Oleg Nesterov
` (3 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:57 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
Now that we have is_tgid_pid_entry() we do not need 2 different helpers
for ONE("stat"), we can kill proc_tgid_stat() and proc_tid_stat() and
turn do_task_stat() into proc_pid_stat().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/array.c | 17 +++--------------
fs/proc/base.c | 4 ++--
fs/proc/internal.h | 4 +---
3 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 64db2bc..d096bd6 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -377,9 +377,10 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
return 0;
}
-static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
- struct pid *pid, struct task_struct *task, int whole)
+int proc_pid_stat(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
{
+ bool whole = is_tgid_pid_entry(PROC_I(m->private)->pid_entry);
unsigned long vsize, eip, esp, wchan = ~0UL;
int priority, nice;
int tty_pgrp = -1, tty_nr = 0;
@@ -550,18 +551,6 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
return 0;
}
-int proc_tid_stat(struct seq_file *m, struct pid_namespace *ns,
- struct pid *pid, struct task_struct *task)
-{
- return do_task_stat(m, ns, pid, task, 0);
-}
-
-int proc_tgid_stat(struct seq_file *m, struct pid_namespace *ns,
- struct pid *pid, struct task_struct *task)
-{
- return do_task_stat(m, ns, pid, task, 1);
-}
-
int proc_pid_statm(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ea11bf1..9d47fe0 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2581,7 +2581,7 @@ static const struct pid_entry tgid_base_stuff[] = {
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
INF("cmdline", S_IRUGO, proc_pid_cmdline),
- ONE("stat", S_IRUGO, proc_tgid_stat),
+ ONE("stat", S_IRUGO, proc_pid_stat),
ONE("statm", S_IRUGO, proc_pid_statm),
REG("maps", S_IRUGO, proc_pid_maps_operations),
#ifdef CONFIG_NUMA
@@ -2917,7 +2917,7 @@ static const struct pid_entry tid_base_stuff[] = {
INF("syscall", S_IRUSR, proc_pid_syscall),
#endif
INF("cmdline", S_IRUGO, proc_pid_cmdline),
- ONE("stat", S_IRUGO, proc_tid_stat),
+ ONE("stat", S_IRUGO, proc_pid_stat),
ONE("statm", S_IRUGO, proc_pid_statm),
REG("maps", S_IRUGO, proc_tid_maps_operations),
#ifdef CONFIG_CHECKPOINT_RESTORE
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2e310b6..c7bb0e3 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -151,10 +151,8 @@ extern const struct file_operations proc_tid_children_operations;
extern bool is_tgid_pid_entry(const struct pid_entry *p);
-extern int proc_tid_stat(struct seq_file *, struct pid_namespace *,
+extern int proc_pid_stat(struct seq_file *, struct pid_namespace *,
struct pid *, struct task_struct *);
-extern int proc_tgid_stat(struct seq_file *, struct pid_namespace *,
- struct pid *, struct task_struct *);
extern int proc_pid_status(struct seq_file *, struct pid_namespace *,
struct pid *, struct task_struct *);
extern int proc_pid_statm(struct seq_file *, struct pid_namespace *,
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH 3/5] proc: kill *_tid_*maps* stuff
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 2/5] proc: unify proc_tgid_stat() and proc_tid_stat() Oleg Nesterov
@ 2014-08-08 18:57 ` Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 4/5] proc: introduce pid_entry_name() Oleg Nesterov
` (2 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:57 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
Kill *_tid_*maps* data/functions. proc_maps_open() can use
is_tgid_pid_entry() and initialize priv->is_tgid for ->show()
methods.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/base.c | 6 +-
fs/proc/internal.h | 4 +-
fs/proc/task_mmu.c | 140 ++++++++--------------------------------------------
3 files changed, 25 insertions(+), 125 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9d47fe0..368f6fe 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2919,12 +2919,12 @@ static const struct pid_entry tid_base_stuff[] = {
INF("cmdline", S_IRUGO, proc_pid_cmdline),
ONE("stat", S_IRUGO, proc_pid_stat),
ONE("statm", S_IRUGO, proc_pid_statm),
- REG("maps", S_IRUGO, proc_tid_maps_operations),
+ REG("maps", S_IRUGO, proc_pid_maps_operations),
#ifdef CONFIG_CHECKPOINT_RESTORE
REG("children", S_IRUGO, proc_tid_children_operations),
#endif
#ifdef CONFIG_NUMA
- REG("numa_maps", S_IRUGO, proc_tid_numa_maps_operations),
+ REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
#endif
REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
LNK("cwd", proc_cwd_link),
@@ -2934,7 +2934,7 @@ static const struct pid_entry tid_base_stuff[] = {
REG("mountinfo", S_IRUGO, proc_mountinfo_operations),
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
- REG("smaps", S_IRUGO, proc_tid_smaps_operations),
+ REG("smaps", S_IRUGO, proc_pid_smaps_operations),
REG("pagemap", S_IRUSR, proc_pagemap_operations),
#endif
#ifdef CONFIG_SECURITY
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index c7bb0e3..46500af 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -268,6 +268,7 @@ extern int proc_remount(struct super_block *, int *, char *);
*/
struct proc_maps_private {
struct pid *pid;
+ bool is_tgid;
struct task_struct *task;
struct mm_struct *mm;
#ifdef CONFIG_MMU
@@ -281,11 +282,8 @@ struct proc_maps_private {
struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode);
extern const struct file_operations proc_pid_maps_operations;
-extern const struct file_operations proc_tid_maps_operations;
extern const struct file_operations proc_pid_numa_maps_operations;
-extern const struct file_operations proc_tid_numa_maps_operations;
extern const struct file_operations proc_pid_smaps_operations;
-extern const struct file_operations proc_tid_smaps_operations;
extern const struct file_operations proc_clear_refs_operations;
extern const struct file_operations proc_pagemap_operations;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4ec6b72..7618ef8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -231,6 +231,7 @@ static int proc_maps_open(struct inode *inode, struct file *file,
return -ENOMEM;
priv->pid = proc_pid(inode);
+ priv->is_tgid = is_tgid_pid_entry(PROC_I(inode)->pid_entry);
priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
if (IS_ERR(priv->mm)) {
int err = PTR_ERR(priv->mm);
@@ -252,16 +253,7 @@ static int proc_map_release(struct inode *inode, struct file *file)
return seq_release_private(inode, file);
}
-
-static int do_maps_open(struct inode *inode, struct file *file,
- const struct seq_operations *ops)
-{
- return proc_maps_open(inode, file, ops,
- sizeof(struct proc_maps_private));
-}
-
-static void
-show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
+static void show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
{
struct mm_struct *mm = vma->vm_mm;
struct file *file = vma->vm_file;
@@ -331,15 +323,15 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
goto done;
}
- tid = vm_is_stack(task, vma, is_pid);
-
+ tid = vm_is_stack(task, vma, priv->is_tgid);
if (tid != 0) {
/*
* Thread stack in /proc/PID/task/TID/maps or
* the main process stack.
*/
- if (!is_pid || (vma->vm_start <= mm->start_stack &&
- vma->vm_end >= mm->start_stack)) {
+ if (!priv->is_tgid ||
+ (vma->vm_start <= mm->start_stack &&
+ vma->vm_end >= mm->start_stack)) {
name = "[stack]";
} else {
/* Thread stack in /proc/PID/maps */
@@ -357,23 +349,13 @@ done:
seq_putc(m, '\n');
}
-static int show_map(struct seq_file *m, void *v, int is_pid)
+static int show_pid_map(struct seq_file *m, void *v)
{
- show_map_vma(m, v, is_pid);
+ show_map_vma(m, v);
m_cache_vma(m, v);
return 0;
}
-static int show_pid_map(struct seq_file *m, void *v)
-{
- return show_map(m, v, 1);
-}
-
-static int show_tid_map(struct seq_file *m, void *v)
-{
- return show_map(m, v, 0);
-}
-
static const struct seq_operations proc_pid_maps_op = {
.start = m_start,
.next = m_next,
@@ -381,21 +363,10 @@ static const struct seq_operations proc_pid_maps_op = {
.show = show_pid_map
};
-static const struct seq_operations proc_tid_maps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_tid_map
-};
-
static int pid_maps_open(struct inode *inode, struct file *file)
{
- return do_maps_open(inode, file, &proc_pid_maps_op);
-}
-
-static int tid_maps_open(struct inode *inode, struct file *file)
-{
- return do_maps_open(inode, file, &proc_tid_maps_op);
+ return proc_maps_open(inode, file, &proc_pid_maps_op,
+ sizeof(struct proc_maps_private));
}
const struct file_operations proc_pid_maps_operations = {
@@ -405,13 +376,6 @@ const struct file_operations proc_pid_maps_operations = {
.release = proc_map_release,
};
-const struct file_operations proc_tid_maps_operations = {
- .open = tid_maps_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = proc_map_release,
-};
-
/*
* Proportional Set Size(PSS): my share of RSS.
*
@@ -584,7 +548,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
seq_putc(m, '\n');
}
-static int show_smap(struct seq_file *m, void *v, int is_pid)
+static int show_pid_smap(struct seq_file *m, void *v)
{
struct vm_area_struct *vma = v;
struct mem_size_stats mss;
@@ -600,7 +564,7 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
if (vma->vm_mm && !is_vm_hugetlb_page(vma))
walk_page_range(vma->vm_start, vma->vm_end, &smaps_walk);
- show_map_vma(m, vma, is_pid);
+ show_map_vma(m, vma);
seq_printf(m,
"Size: %8lu kB\n"
@@ -642,16 +606,6 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
return 0;
}
-static int show_pid_smap(struct seq_file *m, void *v)
-{
- return show_smap(m, v, 1);
-}
-
-static int show_tid_smap(struct seq_file *m, void *v)
-{
- return show_smap(m, v, 0);
-}
-
static const struct seq_operations proc_pid_smaps_op = {
.start = m_start,
.next = m_next,
@@ -659,21 +613,10 @@ static const struct seq_operations proc_pid_smaps_op = {
.show = show_pid_smap
};
-static const struct seq_operations proc_tid_smaps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_tid_smap
-};
-
static int pid_smaps_open(struct inode *inode, struct file *file)
{
- return do_maps_open(inode, file, &proc_pid_smaps_op);
-}
-
-static int tid_smaps_open(struct inode *inode, struct file *file)
-{
- return do_maps_open(inode, file, &proc_tid_smaps_op);
+ return proc_maps_open(inode, file, &proc_pid_smaps_op,
+ sizeof(struct proc_maps_private));
}
const struct file_operations proc_pid_smaps_operations = {
@@ -683,13 +626,6 @@ const struct file_operations proc_pid_smaps_operations = {
.release = proc_map_release,
};
-const struct file_operations proc_tid_smaps_operations = {
- .open = tid_smaps_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = proc_map_release,
-};
-
/*
* We do not want to have constant page-shift bits sitting in
* pagemap entries and are about to reuse them some time soon.
@@ -1382,7 +1318,7 @@ static int gather_hugetbl_stats(pte_t *pte, unsigned long hmask,
/*
* Display pages allocated per node and memory policy via /proc.
*/
-static int show_numa_map(struct seq_file *m, void *v, int is_pid)
+static int show_pid_numa_map(struct seq_file *m, void *v)
{
struct numa_maps_private *numa_priv = m->private;
struct proc_maps_private *proc_priv = &numa_priv->proc_maps;
@@ -1421,14 +1357,15 @@ static int show_numa_map(struct seq_file *m, void *v, int is_pid)
} else if (vma->vm_start <= mm->brk && vma->vm_end >= mm->start_brk) {
seq_puts(m, " heap");
} else {
- pid_t tid = vm_is_stack(task, vma, is_pid);
+ pid_t tid = vm_is_stack(task, vma, proc_priv->is_tgid);
if (tid != 0) {
/*
* Thread stack in /proc/PID/task/TID/maps or
* the main process stack.
*/
- if (!is_pid || (vma->vm_start <= mm->start_stack &&
- vma->vm_end >= mm->start_stack))
+ if (!proc_priv->is_tgid ||
+ (vma->vm_start <= mm->start_stack &&
+ vma->vm_end >= mm->start_stack))
seq_puts(m, " stack");
else
seq_printf(m, " stack:%d", tid);
@@ -1473,16 +1410,6 @@ out:
return 0;
}
-static int show_pid_numa_map(struct seq_file *m, void *v)
-{
- return show_numa_map(m, v, 1);
-}
-
-static int show_tid_numa_map(struct seq_file *m, void *v)
-{
- return show_numa_map(m, v, 0);
-}
-
static const struct seq_operations proc_pid_numa_maps_op = {
.start = m_start,
.next = m_next,
@@ -1490,28 +1417,10 @@ static const struct seq_operations proc_pid_numa_maps_op = {
.show = show_pid_numa_map,
};
-static const struct seq_operations proc_tid_numa_maps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_tid_numa_map,
-};
-
-static int numa_maps_open(struct inode *inode, struct file *file,
- const struct seq_operations *ops)
-{
- return proc_maps_open(inode, file, ops,
- sizeof(struct numa_maps_private));
-}
-
static int pid_numa_maps_open(struct inode *inode, struct file *file)
{
- return numa_maps_open(inode, file, &proc_pid_numa_maps_op);
-}
-
-static int tid_numa_maps_open(struct inode *inode, struct file *file)
-{
- return numa_maps_open(inode, file, &proc_tid_numa_maps_op);
+ return proc_maps_open(inode, file, &proc_pid_numa_maps_op,
+ sizeof(struct numa_maps_private));
}
const struct file_operations proc_pid_numa_maps_operations = {
@@ -1520,11 +1429,4 @@ const struct file_operations proc_pid_numa_maps_operations = {
.llseek = seq_lseek,
.release = proc_map_release,
};
-
-const struct file_operations proc_tid_numa_maps_operations = {
- .open = tid_numa_maps_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = proc_map_release,
-};
#endif /* CONFIG_NUMA */
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH 4/5] proc: introduce pid_entry_name()
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
` (2 preceding siblings ...)
2014-08-08 18:57 ` [RFC PATCH 3/5] proc: kill *_tid_*maps* stuff Oleg Nesterov
@ 2014-08-08 18:57 ` Oleg Nesterov
2014-08-08 18:58 ` [RFC PATCH 5/5] proc: unify proc_pid_*maps* stuff Oleg Nesterov
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
5 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:57 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
Yes, this is ugly. I think we should simply export struct pid_entry,
or make is_tgid_pid_entry/pid_entry_name "struct inode *". Or something
else. Lets discuss this later.
See the next patch which explains why.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/base.c | 7 +++++++
fs/proc/internal.h | 1 +
2 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 368f6fe..81d372c 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2992,6 +2992,13 @@ bool is_tgid_pid_entry(const struct pid_entry *p)
BUG();
}
+// THIS IS UGLY, JUST TO DISCUSS RFC
+const char *pid_entry_name(const struct pid_entry *p)
+{
+ is_tgid_pid_entry(p); // trigger BUG_ON() if not valid
+ return p->name;
+}
+
static int proc_tid_base_readdir(struct file *file, struct dir_context *ctx)
{
return proc_pident_readdir(file, ctx,
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 46500af..c162db2 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -150,6 +150,7 @@ out:
extern const struct file_operations proc_tid_children_operations;
extern bool is_tgid_pid_entry(const struct pid_entry *p);
+extern const char *pid_entry_name(const struct pid_entry *p);
extern int proc_pid_stat(struct seq_file *, struct pid_namespace *,
struct pid *, struct task_struct *);
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC PATCH 5/5] proc: unify proc_pid_*maps* stuff
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
` (3 preceding siblings ...)
2014-08-08 18:57 ` [RFC PATCH 4/5] proc: introduce pid_entry_name() Oleg Nesterov
@ 2014-08-08 18:58 ` Oleg Nesterov
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
5 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-08 18:58 UTC (permalink / raw)
To: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Eric W. Biederman,
Kirill A. Shutemov, Peter Zijlstra, Sasha Levin
Cc: linux-fsdevel, linux-kernel
A single seq_operations/file_operations pair can handle maps, smaps,
and numa_maps files. Just the generic ->open() needs to look at
pid_entry_name() and initialize priv->show() accordingly.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/base.c | 8 ++--
fs/proc/internal.h | 3 +-
fs/proc/task_mmu.c | 123 +++++++++++++++++++++-------------------------------
3 files changed, 54 insertions(+), 80 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 81d372c..4018a86 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2585,7 +2585,7 @@ static const struct pid_entry tgid_base_stuff[] = {
ONE("statm", S_IRUGO, proc_pid_statm),
REG("maps", S_IRUGO, proc_pid_maps_operations),
#ifdef CONFIG_NUMA
- REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
+ REG("numa_maps", S_IRUGO, proc_pid_maps_operations),
#endif
REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
LNK("cwd", proc_cwd_link),
@@ -2596,7 +2596,7 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("mountstats", S_IRUSR, proc_mountstats_operations),
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
- REG("smaps", S_IRUGO, proc_pid_smaps_operations),
+ REG("smaps", S_IRUGO, proc_pid_maps_operations),
REG("pagemap", S_IRUSR, proc_pagemap_operations),
#endif
#ifdef CONFIG_SECURITY
@@ -2924,7 +2924,7 @@ static const struct pid_entry tid_base_stuff[] = {
REG("children", S_IRUGO, proc_tid_children_operations),
#endif
#ifdef CONFIG_NUMA
- REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
+ REG("numa_maps", S_IRUGO, proc_pid_maps_operations),
#endif
REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
LNK("cwd", proc_cwd_link),
@@ -2934,7 +2934,7 @@ static const struct pid_entry tid_base_stuff[] = {
REG("mountinfo", S_IRUGO, proc_mountinfo_operations),
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
- REG("smaps", S_IRUGO, proc_pid_smaps_operations),
+ REG("smaps", S_IRUGO, proc_pid_maps_operations),
REG("pagemap", S_IRUSR, proc_pagemap_operations),
#endif
#ifdef CONFIG_SECURITY
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index c162db2..17091c9 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -270,6 +270,7 @@ extern int proc_remount(struct super_block *, int *, char *);
struct proc_maps_private {
struct pid *pid;
bool is_tgid;
+ int (*show)(struct seq_file *m, void *v);
struct task_struct *task;
struct mm_struct *mm;
#ifdef CONFIG_MMU
@@ -283,8 +284,6 @@ struct proc_maps_private {
struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode);
extern const struct file_operations proc_pid_maps_operations;
-extern const struct file_operations proc_pid_numa_maps_operations;
-extern const struct file_operations proc_pid_smaps_operations;
extern const struct file_operations proc_clear_refs_operations;
extern const struct file_operations proc_pagemap_operations;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 7618ef8..4ef991a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -222,26 +222,6 @@ static void m_stop(struct seq_file *m, void *v)
}
}
-static int proc_maps_open(struct inode *inode, struct file *file,
- const struct seq_operations *ops, int psize)
-{
- struct proc_maps_private *priv = __seq_open_private(file, ops, psize);
-
- if (!priv)
- return -ENOMEM;
-
- priv->pid = proc_pid(inode);
- priv->is_tgid = is_tgid_pid_entry(PROC_I(inode)->pid_entry);
- priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
- if (IS_ERR(priv->mm)) {
- int err = PTR_ERR(priv->mm);
- seq_release_private(inode, file);
- return err;
- }
-
- return 0;
-}
-
static int proc_map_release(struct inode *inode, struct file *file)
{
struct seq_file *seq = file->private_data;
@@ -352,30 +332,9 @@ done:
static int show_pid_map(struct seq_file *m, void *v)
{
show_map_vma(m, v);
- m_cache_vma(m, v);
return 0;
}
-static const struct seq_operations proc_pid_maps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_pid_map
-};
-
-static int pid_maps_open(struct inode *inode, struct file *file)
-{
- return proc_maps_open(inode, file, &proc_pid_maps_op,
- sizeof(struct proc_maps_private));
-}
-
-const struct file_operations proc_pid_maps_operations = {
- .open = pid_maps_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = proc_map_release,
-};
-
/*
* Proportional Set Size(PSS): my share of RSS.
*
@@ -602,30 +561,9 @@ static int show_pid_smap(struct seq_file *m, void *v)
mss.nonlinear >> 10);
show_smap_vma_flags(m, vma);
- m_cache_vma(m, vma);
return 0;
}
-static const struct seq_operations proc_pid_smaps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_pid_smap
-};
-
-static int pid_smaps_open(struct inode *inode, struct file *file)
-{
- return proc_maps_open(inode, file, &proc_pid_smaps_op,
- sizeof(struct proc_maps_private));
-}
-
-const struct file_operations proc_pid_smaps_operations = {
- .open = pid_smaps_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = proc_map_release,
-};
-
/*
* We do not want to have constant page-shift bits sitting in
* pagemap entries and are about to reuse them some time soon.
@@ -1406,27 +1344,64 @@ static int show_pid_numa_map(struct seq_file *m, void *v)
seq_printf(m, " N%d=%lu", nid, md->node[nid]);
out:
seq_putc(m, '\n');
- m_cache_vma(m, vma);
return 0;
}
+#endif /* CONFIG_NUMA */
-static const struct seq_operations proc_pid_numa_maps_op = {
- .start = m_start,
- .next = m_next,
- .stop = m_stop,
- .show = show_pid_numa_map,
+static int m_show(struct seq_file *m, void *v)
+{
+ struct proc_maps_private *priv = m->private;
+ priv->show(m, v);
+ m_cache_vma(m, v);
+ return 0;
+}
+
+static const struct seq_operations proc_pid_maps_op = {
+ .start = m_start,
+ .next = m_next,
+ .stop = m_stop,
+ .show = m_show,
};
-static int pid_numa_maps_open(struct inode *inode, struct file *file)
+static int proc_maps_open(struct inode *inode, struct file *file)
{
- return proc_maps_open(inode, file, &proc_pid_numa_maps_op,
- sizeof(struct numa_maps_private));
+ const char *name = pid_entry_name(PROC_I(inode)->pid_entry);
+ int psize = sizeof(struct proc_maps_private);
+ int (*show)(struct seq_file *m, void *v);
+ struct proc_maps_private *priv;
+
+ if (strcmp(name, "maps") == 0)
+ show = show_pid_map;
+ else if (strcmp(name, "smaps") == 0)
+ show = show_pid_smap;
+ else if (strcmp(name, "numa_maps") == 0) {
+ show = show_pid_numa_map;
+ psize = sizeof(struct numa_maps_private);
+ }
+ else
+ BUG();
+
+ priv = __seq_open_private(file, &proc_pid_maps_op, psize);
+ if (!priv)
+ return -ENOMEM;
+
+ priv->show = show;
+ priv->pid = proc_pid(inode);
+ priv->is_tgid = is_tgid_pid_entry(PROC_I(inode)->pid_entry);
+ priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
+ if (IS_ERR(priv->mm)) {
+ int err = PTR_ERR(priv->mm);
+ seq_release_private(inode, file);
+ return err;
+ }
+
+ return 0;
}
-const struct file_operations proc_pid_numa_maps_operations = {
- .open = pid_numa_maps_open,
+const struct file_operations proc_pid_maps_operations = {
+ .open = proc_maps_open,
.read = seq_read,
.llseek = seq_lseek,
.release = proc_map_release,
};
-#endif /* CONFIG_NUMA */
+
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry()
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
@ 2014-08-08 20:05 ` Cyrill Gorcunov
2014-08-09 14:28 ` Oleg Nesterov
0 siblings, 1 reply; 12+ messages in thread
From: Cyrill Gorcunov @ 2014-08-08 20:05 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, David Howells,
David S. Miller, Eric W. Biederman, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
On Fri, Aug 08, 2014 at 08:57:47PM +0200, Oleg Nesterov wrote:
...
> +++ b/fs/proc/inode.c
> @@ -73,6 +73,7 @@ static struct inode *proc_alloc_inode(struct super_block *sb)
> ei->pde = NULL;
> ei->sysctl = NULL;
> ei->sysctl_entry = NULL;
> + ei->pid_entry = NULL;
> ei->ns.ns = NULL;
> ei->ns.ns_ops = NULL;
> inode = &ei->vfs_inode;
Hi Oleg, sorry for not responding to previous emails, will try to review this
things tomorrow (from a glance looks quite good!). Btw, this moment strike my
eyes -- why don't we use kmem_cache_zalloc here but do assign nils again and
again, maybe worth to address as well?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 0/5] introduce proc_inode->pid_entry
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
` (4 preceding siblings ...)
2014-08-08 18:58 ` [RFC PATCH 5/5] proc: unify proc_pid_*maps* stuff Oleg Nesterov
@ 2014-08-08 22:03 ` Eric W. Biederman
2014-08-08 22:11 ` Eric W. Biederman
2014-08-10 19:23 ` Oleg Nesterov
5 siblings, 2 replies; 12+ messages in thread
From: Eric W. Biederman @ 2014-08-08 22:03 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
Oleg Nesterov <oleg@redhat.com> writes:
> Hello,
>
> Obviously not for inclusion. The patches are horrible, break task_nommu.c,
> untested, etc. Only to explain what I mean and discuss the intent, at least.
> On top of recent /proc/pid/*maps* cleanups I sent.
>
> To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
> 6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
> differs.
>
> Eric, et al, what do you think? At least something like 1-3 looks like a
> good cleanup imho. And afaics we can do more cleanups on top.
I see where you are getting annoyed.
Taking a quick look at task_mmu.c It looks like the tgid vs pid logic
to decided which stack or stacks to display is simply incorrect.
tgid vs pid is all about do we perform the per thread group rollups or
not. Because we have /proc/<tid>/ directories that need the rollups
but are per thread.
At a practical level moving pid_entry into the proc inode is ugly
especially for the hack that is is_tgid_pid_entry.
That test could be implemented more easily by looking at the parent
directories inode operations and seeing if they are
proc_root_inode_operations.
Similarly you can get the names out of the dentry, although comparing
on the dentry name feels like a real hack.
Given where you are starting I think tack_mmu.c code that decides
when/which stack deserves a serious audit.
Eric
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 0/5] introduce proc_inode->pid_entry
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
@ 2014-08-08 22:11 ` Eric W. Biederman
2014-08-10 19:23 ` Oleg Nesterov
1 sibling, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2014-08-08 22:11 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
ebiederm@xmission.com (Eric W. Biederman) writes:
> Oleg Nesterov <oleg@redhat.com> writes:
>
>> Hello,
>>
>> Obviously not for inclusion. The patches are horrible, break task_nommu.c,
>> untested, etc. Only to explain what I mean and discuss the intent, at least.
>> On top of recent /proc/pid/*maps* cleanups I sent.
>>
>> To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
>> 6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
>> differs.
>>
>> Eric, et al, what do you think? At least something like 1-3 looks like a
>> good cleanup imho. And afaics we can do more cleanups on top.
>
>
> I see where you are getting annoyed.
>
> Taking a quick look at task_mmu.c It looks like the tgid vs pid logic
> to decided which stack or stacks to display is simply incorrect.
>
> tgid vs pid is all about do we perform the per thread group rollups or
> not. Because we have /proc/<tid>/ directories that need the rollups
> but are per thread.
>
> At a practical level moving pid_entry into the proc inode is ugly
> especially for the hack that is is_tgid_pid_entry.
>
> That test could be implemented more easily by looking at the parent
> directories inode operations and seeing if they are
> proc_root_inode_operations.
>
> Similarly you can get the names out of the dentry, although comparing
> on the dentry name feels like a real hack.
>
> Given where you are starting I think tack_mmu.c code that decides
> when/which stack deserves a serious audit.
On a slightly larger scale it is probably about time to step back
and look at /proc and see what structural cleanups can be done.
Since last time I was looking deeply a bunch of work has been done in
the automount area, and it would be ever so nice if /proc/<pid>/net
because an automount.
Similarly /proc/sys/ really needs to become at least a symlink into
/proc/<pid>/sys and quite possibly an automount itself.
And arguably it would be nice to split /proc/<pid> from /proc generic,
so we would actually have a pure proc.
Shrug. At least if those are ideas worth thinking about and some of them
have the potential to cleanup some nasty interactions with the vfs.
Eric
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry()
2014-08-08 20:05 ` Cyrill Gorcunov
@ 2014-08-09 14:28 ` Oleg Nesterov
2014-08-10 7:16 ` Cyrill Gorcunov
0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-09 14:28 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, David Howells,
David S. Miller, Eric W. Biederman, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
On 08/09, Cyrill Gorcunov wrote:
>
> On Fri, Aug 08, 2014 at 08:57:47PM +0200, Oleg Nesterov wrote:
> ...
> > +++ b/fs/proc/inode.c
> > @@ -73,6 +73,7 @@ static struct inode *proc_alloc_inode(struct super_block *sb)
> > ei->pde = NULL;
> > ei->sysctl = NULL;
> > ei->sysctl_entry = NULL;
> > + ei->pid_entry = NULL;
> > ei->ns.ns = NULL;
> > ei->ns.ns_ops = NULL;
> > inode = &ei->vfs_inode;
>
> Hi Oleg, sorry for not responding to previous emails, will try to review this
> things tomorrow (from a glance looks quite good!). Btw, this moment strike my
> eyes -- why don't we use kmem_cache_zalloc here but do assign nils again and
> again, maybe worth to address as well?
Probably because this doesn't make sense to nullify proc_inode->vfs_inode,
it will be reinitialized at least by inode_init_always() anyway.
Oleg.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry()
2014-08-09 14:28 ` Oleg Nesterov
@ 2014-08-10 7:16 ` Cyrill Gorcunov
0 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2014-08-10 7:16 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, David Howells,
David S. Miller, Eric W. Biederman, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
On Sat, Aug 09, 2014 at 04:28:50PM +0200, Oleg Nesterov wrote:
> >
> > Hi Oleg, sorry for not responding to previous emails, will try to review this
> > things tomorrow (from a glance looks quite good!). Btw, this moment strike my
> > eyes -- why don't we use kmem_cache_zalloc here but do assign nils again and
> > again, maybe worth to address as well?
>
> Probably because this doesn't make sense to nullify proc_inode->vfs_inode,
> it will be reinitialized at least by inode_init_always() anyway.
Yeah, thanks. I managed to fogeth that we call inode_init_always after inode
get allocated. As to series -- looks like a good start, and I think Eric
is a way more confident in this area than me so i've nothing to add up
to his comments.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC PATCH 0/5] introduce proc_inode->pid_entry
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
2014-08-08 22:11 ` Eric W. Biederman
@ 2014-08-10 19:23 ` Oleg Nesterov
1 sibling, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2014-08-10 19:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Alexander Viro, Alexey Dobriyan, Andrew Morton, Cyrill Gorcunov,
David Howells, David S. Miller, Kirill A. Shutemov,
Peter Zijlstra, Sasha Levin, linux-fsdevel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2335 bytes --]
On 08/08, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
> > 6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
> > differs.
> >
> > Eric, et al, what do you think? At least something like 1-3 looks like a
> > good cleanup imho. And afaics we can do more cleanups on top.
>
>
> I see where you are getting annoyed.
>
> Taking a quick look at task_mmu.c It looks like the tgid vs pid logic
> to decided which stack or stacks to display is simply incorrect.
Yes, probably, but please forget this for now. Because,
> Given where you are starting I think tack_mmu.c code that decides
> when/which stack deserves a serious audit.
Yes. And more. At least this code needs more cleanups. ->task should
die or at least we should avoid get/put_task_struct, ->pid can die too,
hold_task_mempolicy() doesn't look correct (at least the "prevent changing
our mempolicy while show_numa_maps" comment and down_write() in
do_set_mempolicy(). I am going to try to cleanup this a bit after I change
task_nommu.c to avoid mm_access() in m_start().
But this is off-topic,
> At a practical level moving pid_entry into the proc inode is ugly
> especially for the hack that is is_tgid_pid_entry.
Well, I am not going to argue with maintainer, mostly simply because
I do not understand this code enough ;)
But let me say that I disagree. We already have ->fd, and ->sysctl*.
I see nothing wrong if *id_base_stuff files have more info for free.
And imo proc_inode->sysctl* is much worse. Simply because they are
not private to fs/proc/proc_sysctl.c.
Could you please look into the attached mbox? I am just curious if we
can do something like this in a clean way. In particular, please look at
"Note:" comment in 3/3. Perhaps proc_sys_init() can do proc_get_inode(),
initialize/instantiate it...
To clarify, of course it is not that I want to shrink sizeof(proc_inode).
Just to me it doesn't look clean that proc_evict_inode() has to do
sysctl_head_put(), grab_header() assumes that ->sysctl == NULL means
sysctl_table_root.*, etc.
> That test could be implemented more easily by looking at the parent
> directories inode operations and seeing if they are
> proc_root_inode_operations.
Hmm. Looking at inode? How?
Oleg.
[-- Attachment #2: U.m --]
[-- Type: text/plain, Size: 4885 bytes --]
>From a4c94461ae18b2d6cc2e8a9cfb042d0b6cc46e86 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sat, 9 Aug 2014 17:25:53 +0200
Subject: [PATCH 1/3] proc: introduce proc_sys_evict_inode()
Preparation. Shift the sysctl_head_put() code from proc_evict_inode()
into the new trivial helper, make sysctl_head_put() static.
---
fs/proc/inode.c | 8 ++------
fs/proc/internal.h | 4 ++--
fs/proc/proc_sysctl.c | 13 ++++++++++++-
3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 0adbc02..9f6715a 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -31,7 +31,6 @@
static void proc_evict_inode(struct inode *inode)
{
struct proc_dir_entry *de;
- struct ctl_table_header *head;
const struct proc_ns_operations *ns_ops;
void *ns;
@@ -45,11 +44,8 @@ static void proc_evict_inode(struct inode *inode)
de = PROC_I(inode)->pde;
if (de)
pde_put(de);
- head = PROC_I(inode)->sysctl;
- if (head) {
- RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
- sysctl_head_put(head);
- }
+
+ proc_sys_evict_inode(inode);
/* Release any associated namespace */
ns_ops = PROC_I(inode)->ns.ns_ops;
ns = PROC_I(inode)->ns.ns;
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 78784cd..7ce3262 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -238,10 +238,10 @@ extern int proc_setup_self(struct super_block *);
*/
#ifdef CONFIG_PROC_SYSCTL
extern int proc_sys_init(void);
-extern void sysctl_head_put(struct ctl_table_header *);
+extern void proc_sys_evict_inode(struct inode *inode);
#else
static inline void proc_sys_init(void) { }
-static inline void sysctl_head_put(struct ctl_table_header *head) { }
+static inline void proc_sys_evict_inode(struct inode *inode) { }
#endif
/*
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 7129046..2fa67e7 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -256,7 +256,7 @@ static void sysctl_head_get(struct ctl_table_header *head)
spin_unlock(&sysctl_lock);
}
-void sysctl_head_put(struct ctl_table_header *head)
+static void sysctl_head_put(struct ctl_table_header *head)
{
spin_lock(&sysctl_lock);
if (!--head->count)
@@ -264,6 +264,17 @@ void sysctl_head_put(struct ctl_table_header *head)
spin_unlock(&sysctl_lock);
}
+void proc_sys_evict_inode(struct inode *inode)
+{
+ struct ctl_table_header *head;
+
+ head = PROC_I(inode)->sysctl;
+ if (head) {
+ RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
+ sysctl_head_put(head);
+ }
+}
+
static struct ctl_table_header *sysctl_head_grab(struct ctl_table_header *head)
{
BUG_ON(!head);
--
1.5.5.1
>From e0b42f4770cd76069e4e70e48e4e7bfa84fab4bf Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sat, 9 Aug 2014 17:37:31 +0200
Subject: [PATCH 2/3] proc: change proc_sys_evict_inode() to check inode->i_op
Change proc_sys_evict_inode() to verify that this inode is really
a /proc/sys inode before using ->sysctl.
---
fs/proc/proc_sysctl.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 2fa67e7..863aaee 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -268,6 +268,10 @@ void proc_sys_evict_inode(struct inode *inode)
{
struct ctl_table_header *head;
+ if (inode->i_op != &proc_sys_inode_operations &&
+ inode->i_op != &proc_sys_dir_operations)
+ return;
+
head = PROC_I(inode)->sysctl;
if (head) {
RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
--
1.5.5.1
>From 9c8e98fc45f091d5d2a4bfc6dcd86b67275160a1 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Sun, 10 Aug 2014 17:37:25 +0200
Subject: [PATCH 3/3] proc: place proc_inode->sysctl* and proc_inode->fd into a union
->sysctl* and ->fd members can share the memory, add a union.
Note: unfortunately proc_alloc_inode() still has to initialize ->sysctl*
members. And the only (afaics) reason is that proc_mkdir("sys") relies
on ->sysctl == NULL which means sysctl_table_root in grab_header(). It
would be nice to initialize these members after proc_get_inode("sys")
somehow and remove the special "NULL means global root" case in proc_sys
paths somehow, but I do not see how.
---
fs/proc/internal.h | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7ce3262..8d35ac0 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -60,11 +60,15 @@ union proc_op {
struct proc_inode {
struct pid *pid;
- int fd;
- union proc_op op;
struct proc_dir_entry *pde;
- struct ctl_table_header *sysctl;
- struct ctl_table *sysctl_entry;
+ union proc_op op;
+ union {
+ struct {
+ struct ctl_table_header *sysctl;
+ struct ctl_table *sysctl_entry;
+ };
+ int fd;
+ };
struct proc_ns ns;
struct inode vfs_inode;
};
--
1.5.5.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-08-10 19:25 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-08 18:57 [RFC PATCH 0/5] introduce proc_inode->pid_entry Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 1/5] proc: intoduce proc_inode->pid_entry and is_tgid_pid_entry() Oleg Nesterov
2014-08-08 20:05 ` Cyrill Gorcunov
2014-08-09 14:28 ` Oleg Nesterov
2014-08-10 7:16 ` Cyrill Gorcunov
2014-08-08 18:57 ` [RFC PATCH 2/5] proc: unify proc_tgid_stat() and proc_tid_stat() Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 3/5] proc: kill *_tid_*maps* stuff Oleg Nesterov
2014-08-08 18:57 ` [RFC PATCH 4/5] proc: introduce pid_entry_name() Oleg Nesterov
2014-08-08 18:58 ` [RFC PATCH 5/5] proc: unify proc_pid_*maps* stuff Oleg Nesterov
2014-08-08 22:03 ` [RFC PATCH 0/5] introduce proc_inode->pid_entry Eric W. Biederman
2014-08-08 22:11 ` Eric W. Biederman
2014-08-10 19:23 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).