In-Reply-To: <20200516012055.126205-1-minchan@kernel.org>
From: Suren Baghdasaryan
Date: Mon, 18 May 2020 12:22:56 -0700
Subject: Re: [PATCH] mm: use only pidfd for process_madvise syscall
To: Minchan Kim
Cc: Andrew Morton, LKML, Christian Brauner, linux-mm,
 linux-api@vger.kernel.org, Oleksandr Natalenko, Tim Murray,
 Daniel Colascione, Sandeep Patil, Sonny Rao, Brian Geffon, Michal Hocko,
 Johannes Weiner, Shakeel Butt, John Dias, Joel Fernandes, Jann Horn,
 alexander.h.duyck@linux.intel.com, SeongJae Park, David Rientjes,
 Arjun Roy, Kirill Tkhai

On Fri, May 15, 2020 at 6:21 PM Minchan Kim wrote:
>
> Based on discussion[1], people didn't feel we need to support both
> pid and pidfd for every new coming API[2] so this patch keeps only
> pidfd. This patch also changes flags's type with "unsigned int".
> So finally, the API is as follows,
>
>       ssize_t process_madvise(int pidfd, const struct iovec *iovec,
>                 unsigned long vlen, int advice, unsigned int flags);
>
>     DESCRIPTION
>       The process_madvise() system call is used to give advice or directions
>       to the kernel about the address ranges from external process as well as
>       local process. It provides the advice to address ranges of process
>       described by iovec and vlen. The goal of such advice is to improve system
>       or application performance.
>
>       The pidfd selects the process referred to by the PID file descriptor
>       specified in pidfd. (See pidfd_open(2) for further information)
>
>       The pointer iovec points to an array of iovec structures, defined in
>       <sys/uio.h> as:
>
>         struct iovec {
>             void *iov_base;         /* starting address */
>             size_t iov_len;         /* number of bytes to be advised */
>         };
>
>       The iovec describes address ranges beginning at address(iov_base)
>       and with size length of bytes(iov_len).
>
>       The vlen represents the number of elements in iovec.
>
>       The advice is indicated in the advice argument, which is one of the
>       following at this moment if the target process specified by idtype and

There is no idtype parameter anymore, so maybe just "if the target
process is external"?

>       id is external.
>
>         MADV_COLD
>         MADV_PAGEOUT
>         MADV_MERGEABLE
>         MADV_UNMERGEABLE
>
>       Permission to provide a hint to external process is governed by a
>       ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2).
>
>       The process_madvise supports every advice madvise(2) has if target
>       process is in same thread group with calling process so user could
>       use process_madvise(2) to extend existing madvise(2) to support
>       vector address ranges.
>
>     RETURN VALUE
>       On success, process_madvise() returns the number of bytes advised.
>       This return value may be less than the total number of requested
>       bytes, if an error occurred. The caller should check return value
>       to determine whether a partial advice occurred.
>
> [1] https://lore.kernel.org/linux-mm/20200509124817.xmrvsrq3mla6b76k@wittgenstein/
> [2] https://lore.kernel.org/linux-mm/9d849087-3359-c4ab-fbec-859e8186c509@virtuozzo.com/
> Signed-off-by: Minchan Kim
> ---
>  mm/madvise.c | 42 +++++++++++++-----------------------------
>  1 file changed, 13 insertions(+), 29 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d3fbbe52d230..35c9b220146a 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1229,8 +1229,8 @@ static int process_madvise_vec(struct task_struct *target_task,
>  	return ret;
>  }
>
> -static ssize_t do_process_madvise(int which, pid_t upid, struct iov_iter *iter,
> -					int behavior, unsigned long flags)
> +static ssize_t do_process_madvise(int pidfd, struct iov_iter *iter,
> +				int behavior, unsigned int flags)
>  {
>  	ssize_t ret;
>  	struct pid *pid;
> @@ -1241,26 +1241,12 @@ static ssize_t do_process_madvise(int which, pid_t upid, struct iov_iter *iter,
>  	if (flags != 0)
>  		return -EINVAL;
>
> -	switch (which) {
> -	case P_PID:
> -		if (upid <= 0)
> -			return -EINVAL;
> -
> -		pid = find_get_pid(upid);
> -		if (!pid)
> -			return -ESRCH;
> -		break;
> -	case P_PIDFD:
> -		if (upid < 0)
> -			return -EINVAL;
> -
> -		pid = pidfd_get_pid(upid);
> -		if (IS_ERR(pid))
> -			return PTR_ERR(pid);
> -		break;
> -	default:
> +	if (pidfd < 0)
>  		return -EINVAL;
> -	}
> +
> +	pid = pidfd_get_pid(pidfd);
> +	if (IS_ERR(pid))
> +		return PTR_ERR(pid);
>
>  	task = get_pid_task(pid, PIDTYPE_PID);
>  	if (!task) {
> @@ -1292,9 +1278,8 @@ static ssize_t do_process_madvise(int which, pid_t upid, struct iov_iter *iter,
>  	return ret;
>  }
>
> -SYSCALL_DEFINE6(process_madvise, int, which, pid_t, upid,
> -		const struct iovec __user *, vec, unsigned long, vlen,
> -		int, behavior, unsigned long, flags)
> +SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> +		unsigned long, vlen, int, behavior, unsigned int, flags)
>  {
>  	ssize_t ret;
>  	struct iovec iovstack[UIO_FASTIOV];
> @@ -1303,19 +1288,18 @@ SYSCALL_DEFINE6(process_madvise, int, which, pid_t, upid,
>
>  	ret = import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack), &iov, &iter);
>  	if (ret >= 0) {
> -		ret = do_process_madvise(which, upid, &iter, behavior, flags);
> +		ret = do_process_madvise(pidfd, &iter, behavior, flags);
>  		kfree(iov);
>  	}
>  	return ret;
>  }
>
>  #ifdef CONFIG_COMPAT
> -COMPAT_SYSCALL_DEFINE6(process_madvise, compat_int_t, which,
> -			compat_pid_t, upid,
> +COMPAT_SYSCALL_DEFINE5(process_madvise, compat_int_t, pidfd,
>  			const struct compat_iovec __user *, vec,
>  			compat_ulong_t, vlen,
>  			compat_int_t, behavior,
> -			compat_ulong_t, flags)
> +			compat_int_t, flags)
>
>  {
>  	ssize_t ret;
> @@ -1326,7 +1310,7 @@ COMPAT_SYSCALL_DEFINE6(process_madvise, compat_int_t, which,
>  	ret = compat_import_iovec(READ, vec, vlen, ARRAY_SIZE(iovstack),
>  				&iov, &iter);
>  	if (ret >= 0) {
> -		ret = do_process_madvise(which, upid, &iter, behavior, flags);
> +		ret = do_process_madvise(pidfd, &iter, behavior, flags);
>  		kfree(iov);
>  	}
>  	return ret;
> --
> 2.26.2.761.g0e0b3e54be-goog
>

Reviewed-by: Suren Baghdasaryan
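
For reference, a rough userspace sketch of how a caller might use the
pidfd-only signature quoted above and apply the RETURN VALUE rule (a
result smaller than the requested byte total means partial advice).
This is an illustration under assumptions, not part of the patch: the
names process_madvise_raw(), iov_total_bytes() and advice_is_partial()
are hypothetical, and the raw syscall is only attempted when the
installed headers define SYS_process_madvise; otherwise the wrapper
fails with ENOSYS.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical wrapper matching the five-argument signature in the quoted
 * description. It issues the raw syscall only if the headers provide
 * SYS_process_madvise; otherwise it reports ENOSYS.
 */
ssize_t process_madvise_raw(int pidfd, const struct iovec *iov,
                            unsigned long vlen, int advice,
                            unsigned int flags)
{
#ifdef SYS_process_madvise
    return (ssize_t)syscall(SYS_process_madvise, pidfd, iov, vlen,
                            advice, flags);
#else
    (void)pidfd; (void)iov; (void)vlen; (void)advice; (void)flags;
    errno = ENOSYS;
    return -1;
#endif
}

/* Sum of iov_len over all vlen elements: the total bytes requested. */
size_t iov_total_bytes(const struct iovec *iov, size_t vlen)
{
    size_t total = 0;

    assert(iov != NULL || vlen == 0);
    for (size_t i = 0; i < vlen; i++)
        total += iov[i].iov_len;
    return total;
}

/*
 * Returns 1 if a non-negative return value covers fewer bytes than were
 * requested (partial advice), 0 otherwise. A negative return is a plain
 * failure, not partial advice, so it also yields 0.
 */
int advice_is_partial(ssize_t ret, const struct iovec *iov, size_t vlen)
{
    if (ret < 0)
        return 0;
    return (size_t)ret < iov_total_bytes(iov, vlen);
}
```

A caller would obtain the pidfd (for example via pidfd_open(2)), fill an
iovec array with the target ranges, call process_madvise_raw() with
flags set to 0, and then pass the result to advice_is_partial() to
decide whether to retry the remaining ranges.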