From: Khalid Aziz <khalid.aziz@oracle.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: akpm@linux-foundation.org, willy@infradead.org,
aneesh.kumar@linux.ibm.com, arnd@arndb.de, 21cnbao@gmail.com,
corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com,
ebiederm@xmission.com, hagen@jauu.net, jack@suse.cz,
keescook@chromium.org, kirill@shutemov.name, kucharsk@gmail.com,
linkinjeon@kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
longpeng2@huawei.com, luto@kernel.org, markhemm@googlemail.com,
pcc@google.com, rppt@kernel.org, sieberf@amazon.com,
sjpark@amazon.de, surenb@google.com, tst@schoebel-theuer.de,
yzaikin@google.com
Subject: Re: [PATCH v2 4/9] mm/mshare: Add a read operation for msharefs files
Date: Thu, 30 Jun 2022 16:27:01 -0600
Message-ID: <4bbaa753-f145-4971-2b51-c909b946ae63@oracle.com>
In-Reply-To: <Yr4VVuCzCp50cu0O@magnolia>

On 6/30/22 15:27, Darrick J. Wong wrote:
> On Wed, Jun 29, 2022 at 04:53:55PM -0600, Khalid Aziz wrote:
>> When a new file is created under msharefs, allocate a new mm_struct
>> that will hold the VMAs for the mshare region. Also allocate a structure
>> that defines the mshare region and add a read operation to the file
>> that returns this information about the mshare region. Currently
>> this information is returned as a struct:
>>
>> 	struct mshare_info {
>> 		unsigned long start;
>> 		unsigned long size;
>> 	};
>>
>> This gives the start address for mshare region and its size.
>>
>> Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
>> ---
>> include/uapi/linux/mman.h | 5 +++
>> mm/mshare.c | 64 ++++++++++++++++++++++++++++++++++++++-
>> 2 files changed, 68 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
>> index f55bc680b5b0..56fe446e24b1 100644
>> --- a/include/uapi/linux/mman.h
>> +++ b/include/uapi/linux/mman.h
>> @@ -41,4 +41,9 @@
>> #define MAP_HUGE_2GB HUGETLB_FLAG_ENCODE_2GB
>> #define MAP_HUGE_16GB HUGETLB_FLAG_ENCODE_16GB
>>
>> +struct mshare_info {
>> +	unsigned long start;
>> +	unsigned long size;
>
> You might want to make these explicitly u64, since this is userspace
> ABI and you never know when someone will want to do something crazy like
> run 32-bit programs with mshare files.
>
> Also you might want to add some padding fields for flags, future
> expansion, etc.
That sounds like a good idea. I will queue it up for the next version of
the patch series.
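
As a rough sketch of what the suggested layout could look like, the block below uses fixed-width fields plus spare room for later additions. The struct name `mshare_info_v2` and the `flags` and `reserved` fields are hypothetical and do not appear in the posted patch; a real UAPI header would presumably use `__u64` from <linux/types.h> rather than `uint64_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fixed-width layout. "flags" and "reserved" are illustrative
 * names only; they are not fields from the posted patch.
 */
struct mshare_info_v2 {
	uint64_t start;       /* start address of the mshare region */
	uint64_t size;        /* size of the region in bytes */
	uint64_t flags;       /* assumed: room for future flag bits */
	uint64_t reserved[3]; /* assumed: padding for future expansion */
};

/* Every member is 64 bits wide, so 32-bit and 64-bit callers agree. */
static_assert(sizeof(struct mshare_info_v2) == 48, "unexpected struct size");
static_assert(offsetof(struct mshare_info_v2, size) == 8, "unexpected offset");
```

Because no member depends on the width of `unsigned long`, a 32-bit program reading the file would see the same byte layout as a 64-bit one.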
>
>> +};
>> +
>> #endif /* _UAPI_LINUX_MMAN_H */
>> diff --git a/mm/mshare.c b/mm/mshare.c
>> index 2d5924d39221..d238b68b0576 100644
>> --- a/mm/mshare.c
>> +++ b/mm/mshare.c
>> @@ -22,8 +22,14 @@
>> #include <uapi/linux/magic.h>
>> #include <uapi/linux/limits.h>
>> #include <uapi/linux/mman.h>
>> +#include <linux/sched/mm.h>
>>
>> static struct super_block *msharefs_sb;
>> +struct mshare_data {
>> +	struct mm_struct *mm;
>> +	refcount_t refcnt;
>> +	struct mshare_info *minfo;
>> +};
>>
>> static const struct inode_operations msharefs_dir_inode_ops;
>> static const struct inode_operations msharefs_file_inode_ops;
>> @@ -34,8 +40,29 @@ msharefs_open(struct inode *inode, struct file *file)
>> return simple_open(inode, file);
>> }
>>
>> +static ssize_t
>> +msharefs_read(struct kiocb *iocb, struct iov_iter *iov)
>> +{
>> +	struct mshare_data *info = iocb->ki_filp->private_data;
>> +	size_t ret;
>> +	struct mshare_info m_info;
>> +
>> +	if (info->minfo != NULL) {
>> +		m_info.start = info->minfo->start;
>> +		m_info.size = info->minfo->size;
>> +	} else {
>> +		m_info.start = 0;
>> +		m_info.size = 0;
>
> Hmmm, read()ing out the shared mapping information. Heh.
>
> When does this case happen? Is it before anybody mmaps this file into
> an address space?
>
The read can happen either before or after the first mmap, which is what
establishes the start address and size, so I have to account for both
cases.
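
From the userspace side, handling both cases might look like the sketch below. It assumes the struct layout from the quoted patch; the helper names `mshare_read_info()` and `mshare_region_is_set()` are hypothetical, and treating a nonzero size as "region established" is an assumption based on the zeroed values returned in the else branch above.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Mirrors the struct in the quoted patch. */
struct mshare_info {
	unsigned long start;
	unsigned long size;
};

/*
 * Hypothetical helper: read one struct mshare_info from fd.
 * Returns 0 on success, -1 with errno set on error or short read.
 */
static int mshare_read_info(int fd, struct mshare_info *out)
{
	ssize_t n = read(fd, out, sizeof(*out));

	if (n < 0)
		return -1;
	if ((size_t)n != sizeof(*out)) {
		errno = EIO; /* short read: not a complete record */
		return -1;
	}
	return 0;
}

/*
 * Hypothetical helper: the patch reports start = 0 and size = 0 until the
 * region exists, so a nonzero size is taken here to mean "established".
 */
static bool mshare_region_is_set(const struct mshare_info *info)
{
	return info->size != 0;
}
```

A caller would open the msharefs file, call `mshare_read_info()`, and only use `start` and `size` once `mshare_region_is_set()` returns true.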
>> +	}
>> +	ret = copy_to_iter(&m_info, sizeof(m_info), iov);
>> +	if (!ret)
>> +		return -EFAULT;
>> +	return ret;
>> +}
>> +
>> static const struct file_operations msharefs_file_operations = {
>>  	.open = msharefs_open,
>> +	.read_iter = msharefs_read,
>>  	.llseek = no_llseek,
>> };
>>
>> @@ -73,12 +100,43 @@ static struct dentry
>> return ERR_PTR(-ENOMEM);
>> }
>>
>> +static int
>> +msharefs_fill_mm(struct inode *inode)
>> +{
>> +	struct mm_struct *mm;
>> +	struct mshare_data *info = NULL;
>> +	int retval = 0;
>> +
>> +	mm = mm_alloc();
>> +	if (!mm) {
>> +		retval = -ENOMEM;
>> +		goto err_free;
>> +	}
>> +
>> +	info = kzalloc(sizeof(*info), GFP_KERNEL);
>> +	if (!info) {
>> +		retval = -ENOMEM;
>> +		goto err_free;
>> +	}
>> +	info->mm = mm;
>> +	info->minfo = NULL;
>> +	refcount_set(&info->refcnt, 1);
>> +	inode->i_private = info;
>> +
>> +	return 0;
>> +
>> +err_free:
>> +	if (mm)
>> +		mmput(mm);
>> +	kfree(info);
>> +	return retval;
>> +}
>> +
>> static struct inode
>> *msharefs_get_inode(struct super_block *sb, const struct inode *dir,
>> umode_t mode)
>> {
>> struct inode *inode = new_inode(sb);
>> -
>> if (inode) {
>> inode->i_ino = get_next_ino();
>> inode_init_owner(&init_user_ns, inode, dir, mode);
>> @@ -89,6 +147,10 @@ static struct inode
>> case S_IFREG:
>> inode->i_op = &msharefs_file_inode_ops;
>> inode->i_fop = &msharefs_file_operations;
>> + if (msharefs_fill_mm(inode) != 0) {
>> + discard_new_inode(inode);
>> + inode = ERR_PTR(-ENOMEM);
>
> Is it intentional to clobber the msharefs_fill_mm return value and
> replace it with ENOMEM?
ENOMEM sounded like the right value to return from msharefs_get_inode() on
failure. On the other hand, there is little reason not to simply pass along
the return value from msharefs_fill_mm(). I can change that.
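
The propagation pattern can be illustrated with a small userspace model. This is not the kernel code: `err_ptr()`, `ptr_err()` and `is_err()` are stand-ins written here to mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and `fake_inode`, `fake_fill_mm()` and `fake_get_inode()` are hypothetical names used only for this sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct fake_inode {
	void *i_private;
};

/* Hypothetical stand-in for msharefs_fill_mm(): returns err as given. */
static int fake_fill_mm(struct fake_inode *inode, int err)
{
	(void)inode;
	return err;
}

/* Model of the S_IFREG branch: pass the helper's error code through. */
static struct fake_inode *fake_get_inode(int fill_err)
{
	struct fake_inode *inode = calloc(1, sizeof(*inode));
	int retval;

	if (!inode)
		return err_ptr(-ENOMEM); /* choice made for this model only */

	retval = fake_fill_mm(inode, fill_err);
	if (retval) {
		free(inode);            /* stands in for discard_new_inode() */
		return err_ptr(retval); /* propagate instead of forcing -ENOMEM */
	}
	return inode;
}
```

With this shape, a caller that checks `is_err()` sees whatever code the helper reported, so a future failure mode other than -ENOMEM would not be hidden.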
Thanks for the review.
--
Khalid