All of lore.kernel.org
 help / color / mirror / Atom feed
From: Satoru Takeuchi <satoru.takeuchi@gmail.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: David Sterba <dsterba@suse.com>
Subject: [PATCH] fix inconsistent device between /proc/pid/maps and stat
Date: Tue, 21 Mar 2017 10:53:09 +0900	[thread overview]
Message-ID: <871strcs56.wl-satoru.takeuchi@gmail.com> (raw)

There have been some discussions about inconsistent device between /proc/pid/maps and stat(2).

http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
https://www.spinics.net/lists/linux-btrfs/msg09044.html
https://patchwork.kernel.org/patch/2825842/
https://patchwork.kernel.org/patch/2831525/

It's important since it breaks user programs like lsof(8). There was a patche by Mark to fix this problem.
However, unfortunately, that patch is not merged so far.

  NOTE:
  1) This patch is inspired by the Mark's one and I tweak it as follows.
     - move a flag for this fix from sb->s_flags to kernel internal sb->s_iflags
     - change that flag's name from MS_STAT_FOR_DEV to SB_I_LOGICALVOL, to make its meaning clearer
  2) This patch is for Chris's for-linus-4.11 (commit e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ("")). It should modify to apply to 4.11-rcX because of statx patches changes
     struct inode_operations->getattr() interface.

For more information about this problem, please refer to the patchat the end of this mail.

---
Subject: [PATCH] fix inconsistent devie between /proc/pid/maps and stat

/proc/pid/maps returns each device as follows.

===
dev = inode->i_sb->s_dev;
===

However, stat(2) returns it as follows.

===
stat->dev = BTRFS_I(inode)->root->anon_dev;
===

It results in device mismatch and this inconsistency breaks users programs like lsof(8) as follows.

Without this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof      19292                  root  txt       REG               0,44   163136        257 /home/sat/mnt/lsof
lsof      19292                  root  mem       REG               0,42                 257 /home/sat/mnt/lsof (path dev=0,44)
lsof      19294                  root  txt       REG               0,44   163136        257 /home/sat/mnt/lsof
lsof      19294                  root  mem       REG               0,42                 257 /home/sat/mnt/lsof (path dev=0,44)
===

With this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof      752                 root  txt       REG               0,44   163224        263 /home/vagrant/mnt/lsof
lsof      754                 root  txt       REG               0,44   163224        263 /home/vagrant/mnt/lsof
===

Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
CC: Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/btrfs/super.c     |  1 +
 fs/proc/generic.c    | 13 +++++++++++++
 fs/proc/internal.h   |  1 +
 fs/proc/nommu.c      |  2 +-
 fs/proc/task_mmu.c   |  2 +-
 fs/proc/task_nommu.c |  2 +-
 include/linux/fs.h   |  1 +
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index da687dc79cce..2ccfdb107ba0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1133,6 +1133,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #endif
 	sb->s_flags |= MS_I_VERSION;
 	sb->s_iflags |= SB_I_CGROUPWB;
+	sb->s_iflags |= SB_I_LOGICALVOL;
 	err = open_ctree(sb, fs_devices, (char *)data);
 	if (err) {
 		btrfs_err(fs_info, "open_ctree failed");
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index f6a01f09f79d..d38cd77b297c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -646,3 +646,16 @@ void *PDE_DATA(const struct inode *inode)
 	return __PDE_DATA(inode);
 }
 EXPORT_SYMBOL(PDE_DATA);
+
+dev_t proc_get_map_dev(struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+	struct kstat kstat;
+
+	if (inode->i_sb->s_iflags & SB_I_LOGICALVOL &&
+	    inode->i_op->getattr &&
+	    inode->i_op->getattr(NULL, dentry, &kstat) == 0)
+		return kstat.dev;
+
+	return inode->i_sb->s_dev;
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2de5194ba378..bf0f11fc209b 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -190,6 +190,7 @@ static inline struct proc_dir_entry *pde_get(struct proc_dir_entry *pde)
 	return pde;
 }
 extern void pde_put(struct proc_dir_entry *);
+dev_t proc_get_map_dev(struct dentry *);
 
 static inline bool is_empty_pde(const struct proc_dir_entry *pde)
 {
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index 75634379f82e..e9c29864a50e 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -46,7 +46,7 @@ static int nommu_region_show(struct seq_file *m, struct vm_region *region)
 
 	if (file) {
 		struct inode *inode = file_inode(region->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 	}
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8f96a49178d0..3529adf0faa8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -292,7 +292,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 		pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
 	}
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 37175621e890..4f17f91225c3 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -156,7 +156,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 		pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2ba074328894..eec62a5bfd4d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1262,6 +1262,7 @@ struct mm_struct;
 #define SB_I_CGROUPWB	0x00000001	/* cgroup-aware writeback enabled */
 #define SB_I_NOEXEC	0x00000002	/* Ignore executables on this fs */
 #define SB_I_NODEV	0x00000004	/* Ignore devices on this fs */
+#define SB_I_LOGICALVOL	0x00000008	/* this fs supports logical volumes */
 
 /* sb->s_iflags to limit user namespace mounts */
 #define SB_I_USERNS_VISIBLE		0x00000010 /* fstype already mounted */
-- 
2.11.0


             reply	other threads:[~2017-03-21  2:02 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-21  1:53 Satoru Takeuchi [this message]
2017-03-24 11:58 ` [PATCH] fix inconsistent device between /proc/pid/maps and stat David Sterba
2017-03-24 13:32   ` Satoru Takeuchi
  -- strict thread matches above, loose matches on Subject: below --
2017-03-24 17:29 Hirotaka Yamamoto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871strcs56.wl-satoru.takeuchi@gmail.com \
    --to=satoru.takeuchi@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=mfasheh@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.