From: Satoru Takeuchi <satoru.takeuchi@gmail.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: David Sterba <dsterba@suse.com>
Subject: [PATCH] fix inconsistent device between /proc/pid/maps and stat
Date: Tue, 21 Mar 2017 10:53:09 +0900 [thread overview]
Message-ID: <871strcs56.wl-satoru.takeuchi@gmail.com> (raw)
There have been some discussions about inconsistent device between /proc/pid/maps and stat(2).
http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
https://www.spinics.net/lists/linux-btrfs/msg09044.html
https://patchwork.kernel.org/patch/2825842/
https://patchwork.kernel.org/patch/2831525/
It's important since it breaks user programs like lsof(8). There was a patche by Mark to fix this problem.
However, unfortunately, that patch is not merged so far.
NOTE:
1) This patch is inspired by the Mark's one and I tweak it as follows.
- move a flag for this fix from sb->s_flags to kernel internal sb->s_iflags
- change that flag's name from MS_STAT_FOR_DEV to SB_I_LOGICALVOL, to make its meaning clearer
2) This patch is for Chris's for-linus-4.11 (commit e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ("")). It should modify to apply to 4.11-rcX because of statx patches changes
struct inode_operations->getattr() interface.
For more information about this problem, please refer to the patchat the end of this mail.
---
Subject: [PATCH] fix inconsistent devie between /proc/pid/maps and stat
/proc/pid/maps returns each device as follows.
===
dev = inode->i_sb->s_dev;
===
However, stat(2) returns it as follows.
===
stat->dev = BTRFS_I(inode)->root->anon_dev;
===
It results in device mismatch and this inconsistency breaks users programs like lsof(8) as follows.
Without this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof 19292 root txt REG 0,44 163136 257 /home/sat/mnt/lsof
lsof 19292 root mem REG 0,42 257 /home/sat/mnt/lsof (path dev=0,44)
lsof 19294 root txt REG 0,44 163136 257 /home/sat/mnt/lsof
lsof 19294 root mem REG 0,42 257 /home/sat/mnt/lsof (path dev=0,44)
===
With this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof 752 root txt REG 0,44 163224 263 /home/vagrant/mnt/lsof
lsof 754 root txt REG 0,44 163224 263 /home/vagrant/mnt/lsof
===
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
CC: Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
fs/btrfs/super.c | 1 +
fs/proc/generic.c | 13 +++++++++++++
fs/proc/internal.h | 1 +
fs/proc/nommu.c | 2 +-
fs/proc/task_mmu.c | 2 +-
fs/proc/task_nommu.c | 2 +-
include/linux/fs.h | 1 +
7 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index da687dc79cce..2ccfdb107ba0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1133,6 +1133,7 @@ static int btrfs_fill_super(struct super_block *sb,
#endif
sb->s_flags |= MS_I_VERSION;
sb->s_iflags |= SB_I_CGROUPWB;
+ sb->s_iflags |= SB_I_LOGICALVOL;
err = open_ctree(sb, fs_devices, (char *)data);
if (err) {
btrfs_err(fs_info, "open_ctree failed");
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index f6a01f09f79d..d38cd77b297c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -646,3 +646,16 @@ void *PDE_DATA(const struct inode *inode)
return __PDE_DATA(inode);
}
EXPORT_SYMBOL(PDE_DATA);
+
+dev_t proc_get_map_dev(struct dentry *dentry)
+{
+ struct inode *inode = dentry->d_inode;
+ struct kstat kstat;
+
+ if (inode->i_sb->s_iflags & SB_I_LOGICALVOL &&
+ inode->i_op->getattr &&
+ inode->i_op->getattr(NULL, dentry, &kstat) == 0)
+ return kstat.dev;
+
+ return inode->i_sb->s_dev;
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2de5194ba378..bf0f11fc209b 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -190,6 +190,7 @@ static inline struct proc_dir_entry *pde_get(struct proc_dir_entry *pde)
return pde;
}
extern void pde_put(struct proc_dir_entry *);
+dev_t proc_get_map_dev(struct dentry *);
static inline bool is_empty_pde(const struct proc_dir_entry *pde)
{
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index 75634379f82e..e9c29864a50e 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -46,7 +46,7 @@ static int nommu_region_show(struct seq_file *m, struct vm_region *region)
if (file) {
struct inode *inode = file_inode(region->vm_file);
- dev = inode->i_sb->s_dev;
+ dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
ino = inode->i_ino;
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8f96a49178d0..3529adf0faa8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -292,7 +292,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
if (file) {
struct inode *inode = file_inode(vma->vm_file);
- dev = inode->i_sb->s_dev;
+ dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
ino = inode->i_ino;
pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
}
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 37175621e890..4f17f91225c3 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -156,7 +156,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
if (file) {
struct inode *inode = file_inode(vma->vm_file);
- dev = inode->i_sb->s_dev;
+ dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
ino = inode->i_ino;
pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2ba074328894..eec62a5bfd4d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1262,6 +1262,7 @@ struct mm_struct;
#define SB_I_CGROUPWB 0x00000001 /* cgroup-aware writeback enabled */
#define SB_I_NOEXEC 0x00000002 /* Ignore executables on this fs */
#define SB_I_NODEV 0x00000004 /* Ignore devices on this fs */
+#define SB_I_LOGICALVOL 0x00000008 /* this fs supports logical volumes */
/* sb->s_iflags to limit user namespace mounts */
#define SB_I_USERNS_VISIBLE 0x00000010 /* fstype already mounted */
--
2.11.0
next reply other threads:[~2017-03-21 2:02 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-21 1:53 Satoru Takeuchi [this message]
2017-03-24 11:58 ` [PATCH] fix inconsistent device between /proc/pid/maps and stat David Sterba
2017-03-24 13:32 ` Satoru Takeuchi
-- strict thread matches above, loose matches on Subject: below --
2017-03-24 17:29 Hirotaka Yamamoto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871strcs56.wl-satoru.takeuchi@gmail.com \
--to=satoru.takeuchi@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=mfasheh@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.