From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Matej Kupljen <matej.kupljen@gmail.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, hughd@google.com
Subject: Re: tmpfs inode leakage when opening file with O_TMP_FILE
Date: Thu, 14 Feb 2019 16:26:31 -0800
Message-ID: <20190215002631.GB6474@magnolia>
In-Reply-To: <20190214154402.5d204ef2aa109502761ab7a0@linux-foundation.org>
[cc the shmem maintainer and the mm list]
On Thu, Feb 14, 2019 at 03:44:02PM -0800, Andrew Morton wrote:
> (cc linux-fsdevel)
>
> On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@gmail.com> wrote:
>
> > Hi,
> >
> > it seems that when opening file on file system that is mounted on
> > tmpfs with the O_TMPFILE flag and using linkat call after that, it
> > uses 2 inodes instead of 1.
> >
> > This is simple test case:
> >
> > #include <sys/types.h>
> > #include <sys/stat.h>
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <string.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <linux/limits.h>
> > #include <errno.h>
> >
> > #define TEST_STRING "Testing\n"
> >
> > #define TMP_PATH "/tmp/ping/"
> > #define TMP_FILE "file.txt"
> >
> >
> > int main(int argc, char* argv[])
> > {
> >     char path[PATH_MAX];
> >     int fd;
> >     int rc;
> >
> >     fd = open(TMP_PATH, __O_TMPFILE | O_RDWR,
> >               S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
> >               S_IROTH | S_IWOTH);
> >
> >     rc = write(fd, TEST_STRING, strlen(TEST_STRING));
> >
> >     snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
> >     linkat(AT_FDCWD, path, AT_FDCWD, TMP_PATH TMP_FILE, AT_SYMLINK_FOLLOW);
> >     close(fd);
> >
> >     return 0;
> > }
> >
> > I have checked inodes with the "df -i" tool. The first inode is used when
> > the call to open is executed and the second one when the call to
> > linkat is executed.
> > It is not decreased when close is executed.
> >
> > I have also tested this on an ext4 mounted fs and there only one inode is used.
> >
> > I tested this on:
> > $ cat /etc/lsb-release
> > DISTRIB_ID=Ubuntu
> > DISTRIB_RELEASE=18.04
> > DISTRIB_CODENAME=bionic
> > DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
> >
> > $ uname -a
> > Linux Orion 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC
> > 2018 x86_64 x86_64 x86_64 GNU/Linux
Heh, tmpfs and its weird behavior where each new link counts as a new
inode because "each new link needs a new dentry, pinning lowmem, and
tmpfs dentries cannot be pruned until they are unlinked."
It seems to have this behavior on 5.0-rc6 too:
$ /bin/df -i /tmp ; ./c ; /bin/df -i /tmp
Filesystem  Inodes IUsed   IFree IUse% Mounted on
tmp        1019110    17 1019093    1% /tmp
Filesystem  Inodes IUsed   IFree IUse% Mounted on
tmp        1019110    19 1019091    1% /tmp
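For anyone who would rather take this measurement from inside a test program than by eyeballing df output, here is a minimal sketch using statvfs(3), which exposes total and free inode counts for a mounted filesystem. The helper name inodes_in_use() is made up for this example, and the assumption is that "total minus free" is the same quantity as the IUsed column above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/*
 * Return the number of inodes in use on the filesystem holding `path`
 * (f_files minus f_ffree), or -1 if statvfs() fails.
 * inodes_in_use() is a hypothetical helper, not an existing API.
 */
long long inodes_in_use(const char *path)
{
	struct statvfs sv;

	assert(path != NULL);
	if (statvfs(path, &sv) != 0)
		return -1;
	return (long long)sv.f_files - (long long)sv.f_ffree;
}
```

Calling it on /tmp before and after running the test case should, under that assumption, show the same delta as the two df runs.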
Probably because shmem_tmpfile -> shmem_get_inode -> shmem_reserve_inode
which decrements ifree when we create the tmpfile, and then the
d_tmpfile decrements i_nlink to zero. Now we have iused=1, nlink=0,
assuming iused=itotal-ifree like usual.
Then the linkat call does:
shmem_link -> shmem_reserve_inode
which decrements ifree again and increments i_nlink to 1. Now we have
iused=2, nlink=1.
The program exits, which closes the file. /tmp/ping/file.txt still
exists and we haven't evicted inodes yet, so nothing much happens.
But then I added in rm -rf /tmp/ping/file.txt to see what happens.
shmem_unlink contains this:
	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
		shmem_free_inode(inode->i_sb);
So shmem_unlink *doesn't* give the ifree back (nlink is only 1 here) but
does drop the nlink, so our state is now iused=2, nlink=0.
Now we evict the inode, which increments ifree, so iused=1 and the inode
goes away. Oops, we just leaked an ifree.
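The walk-through above can be replayed as a tiny user-space model. This is only a sketch of the bookkeeping as described in this mail, not kernel code: the struct, enum and function names are invented, iused starts at 0 to count only the slots this one file takes, and the S_ISDIR part of the shmem_unlink test is left out because the file is a regular file.

```c
#include <assert.h>

/* Steps of the sequence described above; names are made up for the sketch. */
enum { STEP_TMPFILE, STEP_LINK, STEP_UNLINK, STEP_EVICT, STEP_COUNT };

struct trace_state {
	long iused;		/* slots taken so far (itotal - ifree) */
	unsigned int nlink;	/* link count of the inode */
};

/* Record (iused, nlink) after each step of open/linkat/unlink/evict. */
void trace_tmpfile_link(struct trace_state out[STEP_COUNT])
{
	long iused = 0;
	unsigned int nlink;

	/* open(O_TMPFILE): one slot reserved, d_tmpfile leaves nlink at 0 */
	iused++;
	nlink = 0;
	out[STEP_TMPFILE] = (struct trace_state){ iused, nlink };

	/* linkat(): a second slot is reserved, nlink goes to 1 */
	iused++;
	nlink++;
	out[STEP_LINK] = (struct trace_state){ iused, nlink };

	/* unlink(): a slot is returned only if nlink > 1, which it is not */
	assert(nlink > 0);
	if (nlink > 1)
		iused--;
	nlink--;
	out[STEP_UNLINK] = (struct trace_state){ iused, nlink };

	/* eviction: exactly one slot is returned for the inode itself */
	iused--;
	out[STEP_EVICT] = (struct trace_state){ iused, nlink };
}
```

The recorded states come out as (1,0), (2,1), (2,0), (1,0): the inode is gone but one slot is still counted as used.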
I /think/ the proper fix is to change shmem_link to decrement ifree only
if the inode has nonzero nlink, e.g.
	/*
	 * No ordinary (disk based) filesystem counts links as inodes;
	 * but each new link needs a new dentry, pinning lowmem, and
	 * tmpfs dentries cannot be pruned until they are unlinked.  If
	 * we're linking an O_TMPFILE file into the tmpfs we can skip
	 * this because there's still only one link to the inode.
	 */
	if (inode->i_nlink > 0) {
		ret = shmem_reserve_inode(inode->i_sb);
		if (ret)
			goto out;
	}
Says me who was crawling around poking at O_TMPFILE behavior all morning.
Not sure if that's right; what happens to the old dentry?
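As a sanity check on the counting only, here is a toy user-space model of the accounting with and without the proposed i_nlink check. Everything in it (struct toy_fs, the toy_* helpers, the starting value of 100 free slots) is invented for illustration; it assumes the counters behave as described in this mail, ignores directories, and says nothing about the dentry question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space counters; all names here are invented for the sketch. */
struct toy_fs {
	long ifree;		/* free inode slots */
	unsigned int nlink;	/* link count of the one inode we track */
	bool skip_unlinked;	/* true: apply the proposed i_nlink > 0 check */
};

static void toy_create(struct toy_fs *fs, bool tmpfile)
{
	fs->ifree--;			/* slot for the inode itself */
	fs->nlink = tmpfile ? 0 : 1;	/* O_TMPFILE starts with no links */
}

static void toy_link(struct toy_fs *fs)
{
	/* Proposed change: no extra slot when linking an nlink == 0 inode. */
	if (!fs->skip_unlinked || fs->nlink > 0)
		fs->ifree--;
	fs->nlink++;
}

static void toy_unlink(struct toy_fs *fs)
{
	assert(fs->nlink > 0);
	if (fs->nlink > 1)		/* same nlink test as in shmem_unlink */
		fs->ifree++;
	fs->nlink--;
}

static void toy_evict(struct toy_fs *fs)
{
	fs->ifree++;			/* slot for the inode is returned */
}

/* O_TMPFILE, linkat, unlink, evict: returns slots still missing. */
long toy_leak_tmpfile(bool skip_unlinked)
{
	struct toy_fs fs = { .ifree = 100, .skip_unlinked = skip_unlinked };

	toy_create(&fs, true);
	toy_link(&fs);
	toy_unlink(&fs);
	toy_evict(&fs);
	return 100 - fs.ifree;
}

/* Regular create, one extra hard link, two unlinks, evict. */
long toy_leak_hardlink(bool skip_unlinked)
{
	struct toy_fs fs = { .ifree = 100, .skip_unlinked = skip_unlinked };

	toy_create(&fs, false);
	toy_link(&fs);
	toy_unlink(&fs);
	toy_unlink(&fs);
	toy_evict(&fs);
	return 100 - fs.ifree;
}
```

In this model toy_leak_tmpfile(false) leaves one slot missing and toy_leak_tmpfile(true) leaves none, while the ordinary hard-link sequence balances either way.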
--D
> > If you need any more information, please let me know.
> >
> > And please CC me when replying, I am not subscribed to the list.
> >
> > Thanks and BR,
> > Matej