Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 20/34] xen: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-08-02  5:48 UTC (permalink / raw)
  To: Juergen Gross, john.hubbard, Andrew Morton
  Cc: devel, Dave Chinner, Christoph Hellwig, Dan Williams, Ira Weiny,
	x86, linux-mm, Dave Hansen, amd-gfx, dri-devel, intel-gfx,
	linux-arm-kernel, linux-rpi-kernel, devel, xen-devel,
	Boris Ostrovsky, rds-devel, Jérôme Glisse, Jan Kara,
	ceph-devel, kvm, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, LKML, linux-media, linux-nfs, linux-rdma,
	linux-xfs, netdev, sparclinux, Jason Gunthorpe
In-Reply-To: <4471e9dc-a315-42c1-0c3c-55ba4eeeb106@suse.com>

On 8/1/19 9:36 PM, Juergen Gross wrote:
> On 02.08.19 04:19, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
...
>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>> index 2f5ce7230a43..29e461dbee2d 100644
>> --- a/drivers/xen/privcmd.c
>> +++ b/drivers/xen/privcmd.c
>> @@ -611,15 +611,10 @@ static int lock_pages(
>>   static void unlock_pages(struct page *pages[], unsigned int nr_pages)
>>   {
>> -    unsigned int i;
>> -
>>       if (!pages)
>>           return;
>> -    for (i = 0; i < nr_pages; i++) {
>> -        if (pages[i])
>> -            put_page(pages[i]);
>> -    }
>> +    put_user_pages(pages, nr_pages);
> 
> You are not handling the case where pages[i] is NULL here. Or am I
> missing a pending patch to put_user_pages() here?
> 

Hi Juergen,

You are correct--this no longer handles the cases where pages[i]
is NULL. It's intentional, though possibly wrong. :)

I see that I should have added my standard blurb to this
commit description. I missed this one, but some of the other patches
have it. It makes the following, possibly incorrect claim:

"This changes the release code slightly, because each page slot in the
page_list[] array is no longer checked for NULL. However, that check
was wrong anyway, because the get_user_pages() pattern of usage here
never allowed for NULL entries within a range of pinned pages."

The way I've seen these page arrays used with get_user_pages(),
things are either done single page, or with a contiguous range. So
unless I'm missing a case where someone is either

a) releasing individual pages within a range (and thus likely messing
up their count of pages they have), or

b) allocating two gup ranges within the same pages[] array, with a
gap between the allocations,

...then it should be correct. If so, then I'll add the above blurb
to this patch's commit description.

If that's not the case (both here, and in 3 or 4 other patches in this
series, then as you said, I should add NULL checks to put_user_pages()
and put_user_pages_dirty_lock().


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* Re: Slowness forming TIPC cluster with explicit node addresses
From: Chris Packham @ 2019-08-02  5:11 UTC (permalink / raw)
  To: jon.maloy@ericsson.com, tipc-discussion@lists.sourceforge.net
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1564347861.9737.25.camel@alliedtelesis.co.nz>

On Mon, 2019-07-29 at 09:04 +1200, Chris Packham wrote:
> On Fri, 2019-07-26 at 13:31 +0000, Jon Maloy wrote:
> > 
> > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org>
> > > On
> > > Behalf Of Chris Packham
> > > Sent: 25-Jul-19 19:37
> > > To: tipc-discussion@lists.sourceforge.net
> > > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: Slowness forming TIPC cluster with explicit node
> > > addresses
> > > 
> > > Hi,
> > > 
> > > I'm having problems forming a TIPC cluster between 2 nodes.
> > > 
> > > This is the basic steps I'm going through on each node.
> > > 
> > > modprobe tipc
> > > ip link set eth2 up
> > > tipc node set addr 1.1.5 # or 1.1.6
> > > tipc bearer enable media eth dev eth0
> > eth2, I assume...
> > 
> Yes sorry I keep switching between between Ethernet ports for testing
> so I hand edited the email.
> 
> > 
> > > 
> > > 
> > > 
> > > Then to confirm if the cluster is formed I use tipc link list
> > > 
> > > [root@node-5 ~]# tipc link list
> > > broadcast-link: up
> > > ...
> > > 
> > > Looking at tcpdump the two nodes are sending packets
> > > 
> > > 22:30:05.782320 TIPC v2.0 1.1.5 > 0.0.0, headerlength 60 bytes,
> > > MessageSize
> > > 76 bytes, Neighbor Detection Protocol internal, messageType Link
> > > request
> > > 22:30:05.863555 TIPC v2.0 1.1.6 > 0.0.0, headerlength 60 bytes,
> > > MessageSize
> > > 76 bytes, Neighbor Detection Protocol internal, messageType Link
> > > request
> > > 
> > > Eventually (after a few minutes) the link does come up
> > > 
> > > [root@node-6 ~]# tipc link list
> > > broadcast-link: up
> > > 1001006:eth2-1001005:eth2: up
> > > 
> > > [root@node-5 ~]# tipc link list
> > > broadcast-link: up
> > > 1001005:eth2-1001006:eth2: up
> > > 
> > > When I remove the "tipc node set addr" things seem to kick into
> > > life straight
> > > away
> > > 
> > > [root@node-5 ~]# tipc link list
> > > broadcast-link: up
> > > 0050b61bd2aa:eth2-0050b61e6dfa:eth2: up
> > > 
> > > So there appears to be some difference in behaviour between
> > > having
> > > an
> > > explicit node address and using the default. Unfortunately our
> > > application
> > > relies on setting the node addresses.
> > I do this many times a day, without any problems. If there would be
> > any time difference, I would expect the 'auto configurable' version
> > to be slower, because it involves a DAD step.
> > Are you sure you don't have any other nodes running in your system?
> > 
> > ///jon
> > 
> Nope the two nodes are connected back to back. Does the number of
> Ethernet interfaces make a difference? As you can see I've got 3 on
> each node. One is completely disconnected, one is for booting over
> TFTP
>  (only used by U-boot) and the other is the USB Ethernet I'm using
> for
> testing.
> 

So I can still reproduce this on nodes that only have one network
interface and are the only things connected.

I did find one thing that helps

diff --git a/net/tipc/discover.c b/net/tipc/discover.c
index c138d68e8a69..49921dad404a 100644
--- a/net/tipc/discover.c
+++ b/net/tipc/discover.c
@@ -358,10 +358,10 @@ int tipc_disc_create(struct net *net, struct
tipc_bearer *b,
        tipc_disc_init_msg(net, d->skb, DSC_REQ_MSG, b);
 
        /* Do we need an address trial period first ? */
-       if (!tipc_own_addr(net)) {
+//     if (!tipc_own_addr(net)) {
                tn->addr_trial_end = jiffies + msecs_to_jiffies(1000);
                msg_set_type(buf_msg(d->skb), DSC_TRIAL_MSG);
-       }
+//     }
        memcpy(&d->dest, dest, sizeof(*dest));
        d->net = net;
        d->bearer_id = b->identity;

I think because with pre-configured addresses the duplicate address
detection is skipped the shorter init phase is skipped. Would is make
sense to unconditionally do the trial step? Or is there some better way
to get things to transition with pre-assigned addresses.

^ permalink raw reply related

* Re: [v2,0/2] tools: bpftool: add net attach/detach command to attach XDP prog
From: Daniel T. Lee @ 2019-08-02  5:07 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190801162129.0a775fde@cakuba.netronome.com>

Thank you for letting me know.
I will add to next version of patch.

And, thank you for the detailed review. :)

On Fri, Aug 2, 2019 at 8:21 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Thu,  1 Aug 2019 17:11:31 +0900, Daniel T. Lee wrote:
> > Currently, bpftool net only supports dumping progs attached on the
> > interface. To attach XDP prog on interface, user must use other tool
> > (eg. iproute2). By this patch, with `bpftool net attach/detach`, user
> > can attach/detach XDP prog on interface.
> >
> >     $ ./bpftool prog
> >     ...
> >     208: xdp  name xdp_prog1  tag ad822e38b629553f  gpl
> >       loaded_at 2019-07-28T18:03:11+0900  uid 0
> >     ...
> >     $ ./bpftool net attach id 208 xdpdrv enp6s0np1
> >     $ ./bpftool net
> >     xdp:
> >     enp6s0np1(5) driver id 208
> >     ...
> >     $ ./bpftool net detach xdpdrv enp6s0np1
> >     $ ./bpftool net
> >     xdp:
> >     ...
> >
> > While this patch only contains support for XDP, through `net
> > attach/detach`, bpftool can further support other prog attach types.
> >
> > XDP attach/detach tested on Mellanox ConnectX-4 and Netronome Agilio.
>
> Please provide documentation for man pages, and bash completions.

^ permalink raw reply

* Re: [v2,1/2] tools: bpftool: add net attach command to attach XDP on interface
From: Daniel T. Lee @ 2019-08-02  5:02 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190801163638.71700f6d@cakuba.netronome.com>

On Fri, Aug 2, 2019 at 8:36 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Thu,  1 Aug 2019 17:11:32 +0900, Daniel T. Lee wrote:
> > By this commit, using `bpftool net attach`, user can attach XDP prog on
> > interface. New type of enum 'net_attach_type' has been made, as stated at
> > cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> >
> > BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> >
> > Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> > ---
> > Changes in v2:
> >   - command 'load' changed to 'attach' for the consistency
> >   - 'NET_ATTACH_TYPE_XDP_DRIVE' changed to 'NET_ATTACH_TYPE_XDP_DRIVER'
> >
> >  tools/bpf/bpftool/net.c | 107 +++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 106 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> > index 67e99c56bc88..f3b57660b303 100644
> > --- a/tools/bpf/bpftool/net.c
> > +++ b/tools/bpf/bpftool/net.c
> > @@ -55,6 +55,35 @@ struct bpf_attach_info {
> >       __u32 flow_dissector_id;
> >  };
> >
> > +enum net_attach_type {
> > +     NET_ATTACH_TYPE_XDP,
> > +     NET_ATTACH_TYPE_XDP_GENERIC,
> > +     NET_ATTACH_TYPE_XDP_DRIVER,
> > +     NET_ATTACH_TYPE_XDP_OFFLOAD,
> > +     __MAX_NET_ATTACH_TYPE
> > +};
> > +
> > +static const char * const attach_type_strings[] = {
> > +     [NET_ATTACH_TYPE_XDP] = "xdp",
> > +     [NET_ATTACH_TYPE_XDP_GENERIC] = "xdpgeneric",
> > +     [NET_ATTACH_TYPE_XDP_DRIVER] = "xdpdrv",
> > +     [NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload",
> > +     [__MAX_NET_ATTACH_TYPE] = NULL,
>
> Not sure if the terminator is necessary,
> ARRAY_SIZE(attach_type_strings) should suffice?

Yes, ARRAY_SIZE is fine though. But I was just trying to make below
'parse_attach_type' consistent with 'parse_attach_type' from the 'prog.c'.
At 'prog.c', It has same terminator at 'attach_type_strings'.

Should I change it or keep it?

> > +};
> > +
> > +static enum net_attach_type parse_attach_type(const char *str)
> > +{
> > +     enum net_attach_type type;
> > +
> > +     for (type = 0; type < __MAX_NET_ATTACH_TYPE; type++) {
> > +             if (attach_type_strings[type] &&
> > +                is_prefix(str, attach_type_strings[type]))
> > +                     return type;
> > +     }
> > +
> > +     return __MAX_NET_ATTACH_TYPE;
> > +}
> > +
> >  static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
> >  {
> >       struct bpf_netdev_t *netinfo = cookie;
> > @@ -223,6 +252,77 @@ static int query_flow_dissector(struct bpf_attach_info *attach_info)
> >       return 0;
> >  }
> >
> > +static int parse_attach_args(int argc, char **argv, int *progfd,
> > +                          enum net_attach_type *attach_type, int *ifindex)
> > +{
> > +     if (!REQ_ARGS(3))
> > +             return -EINVAL;
> > +
> > +     *progfd = prog_parse_fd(&argc, &argv);
> > +     if (*progfd < 0)
> > +             return *progfd;
> > +
> > +     *attach_type = parse_attach_type(*argv);
> > +     if (*attach_type == __MAX_NET_ATTACH_TYPE) {
> > +             p_err("invalid net attach/detach type");
> > +             return -EINVAL;
>
> You should close the progfd on error paths.

I will add 'close(*progfd);' at next version of patch.

>
> > +     }
>
> Hm. I'm not too sure about the ordering of arguments, type should
> probably be right after attach.
>
> If we ever add tc attach support or some other hook, that's more
> fundamental part of the command than the program. So I think:
>
> bpftool net attach xdp id xyz dev ethN

I think it is more reasonable than current format.
I'll change the argument order as attach type comes in first.

> > +     NEXT_ARG();
> > +     if (!REQ_ARGS(1))
> > +             return -EINVAL;
>
> Error message needed here.
>

Actually it provides error message like:
Error: 'xdp' needs at least 1 arguments, 0 found

are you suggesting that any additional error message is necessary?

> > +     *ifindex = if_nametoindex(*argv);
> > +     if (!*ifindex) {
> > +             p_err("Invalid ifname");
>
> "ifname" is not mentioned in help, it'd be best to keep this error
> message consistent with bpftool prog load.

I will change it to a 'devname', since the word is mentioned in help.

> > +             return -EINVAL;
> > +     }
>
> Please require the dev keyword before the interface name.
> That'll make it feel closer to prog load syntax.

If adding the dev keyword before interface name, will it be too long to type in?
and also `bpftool prog` use extra keyword (such as dev) when it is
optional keyword.

       bpftool prog dump jited  PROG [{ file FILE | opcodes | linum }]
       bpftool prog pin   PROG FILE
       bpftool prog { load | loadall } OBJ  PATH \

as you can see here, FILE uses optional keyword 'file' when the
argument is optional.

       bpftool prog { load | loadall } OBJ  PATH \
                         [type TYPE] [dev NAME] \
                         [map { idx IDX | name NAME } MAP]\
                         [pinmaps MAP_DIR]

Yes, bpftool prog load has dev keyword with it,

but first, like previous, the argument is optional so i think it is
unnecessary to use optional keyword 'dev'.
and secondly, 'bpftool net attach' isn't really related to 'bpftool prog load'.

At previous version patch, I was using word 'load' instead of
'attach', since XDP program is not
considered as 'BPF_PROG_ATTACH', so it might give a confusion. However
by the last patch discussion,
word 'load' has been replaced to 'attach'.

Keeping the consistency is very important, but I was just wandering
about making command
similar to 'bpftool prog load' syntax.

>
> > +     return 0;
> > +}
> > +
> > +static int do_attach_detach_xdp(int *progfd, enum net_attach_type *attach_type,
> > +                             int *ifindex)
> > +{
> > +     __u32 flags;
> > +     int err;
> > +
> > +     flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
>
> Please add this as an option so that user can decide whether overwrite
> is allowed or not.

Adding force flag to bpftool seems necessary.
I will add an optional argument for this.

> > +     if (*attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> > +             flags |= XDP_FLAGS_SKB_MODE;
> > +     if (*attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> > +             flags |= XDP_FLAGS_DRV_MODE;
> > +     if (*attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> > +             flags |= XDP_FLAGS_HW_MODE;
> > +
> > +     err = bpf_set_link_xdp_fd(*ifindex, *progfd, flags);
> > +
> > +     return err;
>
> no need for the err variable here.

My apologies, but I'm not sure why err variable isn't needed at here.
AFAIK, 'bpf_set_link_xdp_fd' from libbpf returns the netlink_recv result,
and in order to propagate error, err variable is necessary, I guess?

> > +}
> > +
> > +static int do_attach(int argc, char **argv)
> > +{
> > +     enum net_attach_type attach_type;
> > +     int err, progfd, ifindex;
> > +
> > +     err = parse_attach_args(argc, argv, &progfd, &attach_type, &ifindex);
> > +     if (err)
> > +             return err;
>
> Probably not the best idea to move this out into a helper.

Again, just trying to make consistent with 'prog.c'.

But clearly it has differences with do_attach/detach from 'prog.c'.
From it, it uses the same parse logic 'parse_attach_detach_args' since
the two command 'bpftool prog attach/detach' uses the same argument format.

However, in here, 'bpftool net' attach and detach requires different number of
argument, so function for parse argument has been defined separately.
The situation is little bit different, but keeping argument parse logic as an
helper, I think it's better in terms of consistency.


About the moving parse logic to a helper, I was trying to keep command
entry (do_attach)
as simple as possible. Parse all the argument in command entry will
make function longer
and might make harder to understand what it does.

And I'm not pretty sure that argument parse logic will stays the same
after other attachment
type comes in. What I mean is, the argument count or type might be
added and to fulfill
all that specific cases, the code might grow larger.

So for the consistency, simplicity and extensibility, I prefer to keep
it as a helper.

> > +     if (is_prefix("xdp", attach_type_strings[attach_type]))
> > +             err = do_attach_detach_xdp(&progfd, &attach_type, &ifindex);
>
> Hm. We either need an error to be reported if it's not xdp or since we
> only accept XDP now perhaps the if() is superfluous?

Well, if the attach_type isn't xdp, the error will be occurred from
the argument parse,
Will it be necessary to reinforce with error logic to make it more secure?

> > +     if (err < 0) {
> > +             p_err("link set %s failed", attach_type_strings[attach_type]);
>
> "link set"?  So you are familiar with iproute2 syntax! :)

Well at first, to avoid confusion about using a command for end-user,
I referenced iproute2 and tried to keep similar message form with it.

But through the discussion about iproute2 and bpftool, it would be better not to
use word such as 'link set'. Maybe "interface %s attach failed" would be better
for clear understanding.

I will fix it at next version of patch.

> > +             return -1;
> > +     }
> > +
> > +     if (json_output)
> > +             jsonw_null(json_wtr);
> > +
> > +     return 0;
> > +}
> > +
> >  static int do_show(int argc, char **argv)
> >  {
> >       struct bpf_attach_info attach_info = {};
> > @@ -305,13 +405,17 @@ static int do_help(int argc, char **argv)
> >
> >       fprintf(stderr,
> >               "Usage: %s %s { show | list } [dev <devname>]\n"
> > +             "       %s %s attach PROG LOAD_TYPE <devname>\n"
> >               "       %s %s help\n"
> > +             "\n"
> > +             "       " HELP_SPEC_PROGRAM "\n"
> > +             "       LOAD_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload }\n"
>
> ATTACH_TYPE now?
>
> Perhaps a new line before the "Note"?

My bad.
Will change it right away.

> >               "Note: Only xdp and tc attachments are supported now.\n"
> >               "      For progs attached to cgroups, use \"bpftool cgroup\"\n"
> >               "      to dump program attachments. For program types\n"
> >               "      sk_{filter,skb,msg,reuseport} and lwt/seg6, please\n"
> >               "      consult iproute2.\n",
> > -             bin_name, argv[-2], bin_name, argv[-2]);
> > +             bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
> >
> >       return 0;
> >  }
> > @@ -319,6 +423,7 @@ static int do_help(int argc, char **argv)
> >  static const struct cmd cmds[] = {
> >       { "show",       do_show },
> >       { "list",       do_show },
> > +     { "attach",     do_attach },
> >       { "help",       do_help },
> >       { 0 }
> >  };
>

Thanks for the detailed review! :)

^ permalink raw reply

* Re: [PATCH 20/34] xen: convert put_page() to put_user_page*()
From: Juergen Gross @ 2019-08-02  4:36 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: devel, Dave Chinner, Christoph Hellwig, Dan Williams, Ira Weiny,
	x86, linux-mm, Dave Hansen, amd-gfx, dri-devel, intel-gfx,
	linux-arm-kernel, linux-rpi-kernel, devel, xen-devel,
	John Hubbard, Boris Ostrovsky, rds-devel, Jérôme Glisse,
	Jan Kara, ceph-devel, kvm, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, LKML, linux-media, linux-nfs, linux-rdma,
	linux-xfs, netdev, sparclinux, Jason Gunthorpe
In-Reply-To: <20190802022005.5117-21-jhubbard@nvidia.com>

On 02.08.19 04:19, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>   drivers/xen/gntdev.c  | 5 +----
>   drivers/xen/privcmd.c | 7 +------
>   2 files changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> index 4c339c7e66e5..2586b3df2bb6 100644
> --- a/drivers/xen/gntdev.c
> +++ b/drivers/xen/gntdev.c
> @@ -864,10 +864,7 @@ static int gntdev_get_page(struct gntdev_copy_batch *batch, void __user *virt,
>   
>   static void gntdev_put_pages(struct gntdev_copy_batch *batch)
>   {
> -	unsigned int i;
> -
> -	for (i = 0; i < batch->nr_pages; i++)
> -		put_page(batch->pages[i]);
> +	put_user_pages(batch->pages, batch->nr_pages);
>   	batch->nr_pages = 0;
>   }
>   
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index 2f5ce7230a43..29e461dbee2d 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -611,15 +611,10 @@ static int lock_pages(
>   
>   static void unlock_pages(struct page *pages[], unsigned int nr_pages)
>   {
> -	unsigned int i;
> -
>   	if (!pages)
>   		return;
>   
> -	for (i = 0; i < nr_pages; i++) {
> -		if (pages[i])
> -			put_page(pages[i]);
> -	}
> +	put_user_pages(pages, nr_pages);

You are not handling the case where pages[i] is NULL here. Or am I
missing a pending patch to put_user_pages() here?


Juergen

^ permalink raw reply

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: David Ahern @ 2019-08-02  4:16 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Stefano Brivio, Marcelo Ricardo Leitner, David S . Miller
In-Reply-To: <20190802041358.GT18865@dhcp-12-139.nay.redhat.com>

On 8/1/19 10:13 PM, Hangbin Liu wrote:
> On Thu, Aug 01, 2019 at 01:51:25PM -0600, David Ahern wrote:
>> On 8/1/19 2:29 AM, Hangbin Liu wrote:
>>> Jianlin reported a bug that for IPv4, ip route get from src_addr would fail
>>> if src_addr is not an address on local system.
>>>
>>> \# ip route get 1.1.1.1 from 2.2.2.2
>>> RTNETLINK answers: Invalid argument
>>
>> so this is a forwarding lookup in which case iif should be set. Based on
> 
> with out setting iif in userspace, the kernel set iif to lo by default.

right, it presumes locally generated traffic.
> 
>> the above 'route get' inet_rtm_getroute is doing a lookup as if it is
>> locally generated traffic.
> 
> yeah... but what about the IPv6 part. That cause a different behavior in
> userspace.

just one of many, many annoying differences between v4 and v6. We could
try to catalog it.

^ permalink raw reply

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: Hangbin Liu @ 2019-08-02  4:13 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, Stefano Brivio, Marcelo Ricardo Leitner, David S . Miller
In-Reply-To: <f44d9f26-046d-38a2-13aa-d25b92419d11@gmail.com>

On Thu, Aug 01, 2019 at 01:51:25PM -0600, David Ahern wrote:
> On 8/1/19 2:29 AM, Hangbin Liu wrote:
> > Jianlin reported a bug that for IPv4, ip route get from src_addr would fail
> > if src_addr is not an address on local system.
> > 
> > \# ip route get 1.1.1.1 from 2.2.2.2
> > RTNETLINK answers: Invalid argument
> 
> so this is a forwarding lookup in which case iif should be set. Based on

with out setting iif in userspace, the kernel set iif to lo by default.

> the above 'route get' inet_rtm_getroute is doing a lookup as if it is
> locally generated traffic.

yeah... but what about the IPv6 part. That cause a different behavior in
userspace.

Thanks
Hangbin

^ permalink raw reply

* Re: [PATCH net-next 00/15] net: Add functional tests for L3 and L4
From: David Ahern @ 2019-08-02  4:11 UTC (permalink / raw)
  To: Alexei Starovoitov, David Ahern; +Cc: davem, netdev
In-Reply-To: <20190802001900.uyuryet2lrr3hgsq@ast-mbp.dhcp.thefacebook.com>

On 8/1/19 6:19 PM, Alexei Starovoitov wrote:
> Do you really need 'sleep 1' everywhere?
> It makes them so slow to run...
> What happens if you just remove it ? Tests will fail? Why?

yes, the sleep 1 is needed. A server process is getting launched into
the background. It's status is not relevant; the client's is the key.
The server needs time to initialize. And, yes I tried forking and it
gets hung up with bash and capturing the output for verbose mode.

^ permalink raw reply

* Re: [PATCH net-next 00/15] net: Add functional tests for L3 and L4
From: David Ahern @ 2019-08-02  4:04 UTC (permalink / raw)
  To: Alexei Starovoitov, David Ahern; +Cc: davem, netdev
In-Reply-To: <20190802001900.uyuryet2lrr3hgsq@ast-mbp.dhcp.thefacebook.com>

On 8/1/19 6:19 PM, Alexei Starovoitov wrote:
> On Thu, Aug 01, 2019 at 11:56:33AM -0700, David Ahern wrote:
>> From: David Ahern <dsahern@gmail.com>
>>
>> This is a port the functional test cases created during the development
>> of the VRF feature. It covers various permutations of icmp, tcp and udp
>> for IPv4 and IPv6 including negative tests.
> 
> Thanks a lot for doing this!
> 
> Is there expected output ?

My tests are quiet by default with a verbose option and pause on fail.

$ fcnal-test.sh -h
usage: fcnal-test.sh OPTS

	-4          IPv4 tests only
	-6          IPv6 tests only
	-t <test>   Test name/set to run
	-p          Pause on fail
	-P          Pause after each test
	-v          Be verbose


> All tests suppose to pass on the latest net-next?

they do for me. Buster image. 5.3-rc1 kernel.

> 
> I'm seeing:
> ./fcnal-test.sh
> ...
> SYSCTL: net.ipv4.raw_l3mdev_accept=0
> TEST: ping out - ns-B IP                                                      [ OK ]
> TEST: ping out, device bind - ns-B IP                                         [ OK ]
> TEST: ping out, address bind - ns-B IP                                        [FAIL]
> TEST: ping out - ns-B loopback IP                                             [ OK ]
> TEST: ping out, device bind - ns-B loopback IP                                [ OK ]
> TEST: ping out, address bind - ns-B loopback IP                               [FAIL]
> TEST: ping in - ns-A IP                                                       [ OK ]
> TEST: ping in - ns-A loopback IP                                              [ OK ]
> TEST: ping local - ns-A IP                                                    [ OK ]
> TEST: ping local - ns-A loopback IP                                           [ OK ]
> TEST: ping local - loopback                                                   [ OK ]
> TEST: ping local, device bind - ns-A IP                                       [ OK ]
> TEST: ping local, device bind - ns-A loopback IP                              [FAIL]
> TEST: ping local, device bind - loopback                                      [FAIL]

...
TEST: ping out - ns-B IP                                          [ OK ]
TEST: ping out, device bind - ns-B IP                             [ OK ]
TEST: ping out, address bind - ns-B IP                            [ OK ]
TEST: ping out - ns-B loopback IP                                 [ OK ]
TEST: ping out, device bind - ns-B loopback IP                    [ OK ]
TEST: ping out, address bind - ns-B loopback IP                   [ OK ]
TEST: ping in - ns-A IP                                           [ OK ]
TEST: ping in - ns-A loopback IP                                  [ OK ]
TEST: ping local - ns-A IP                                        [ OK ]
TEST: ping local - ns-A loopback IP                               [ OK ]
TEST: ping local - loopback                                       [ OK ]
TEST: ping local, device bind - ns-A IP                           [ OK ]
TEST: ping local, device bind - ns-A loopback IP                  [ OK ]
TEST: ping local, device bind - loopback                          [ OK ]
TEST: ping out, blocked by rule - ns-B loopback IP                [ OK ]
TEST: ping in, blocked by rule - ns-A loopback IP                 [ OK ]
TEST: ping out, blocked by route - ns-B loopback IP               [ OK ]
TEST: ping in, blocked by route - ns-A loopback IP                [ OK ]
TEST: ping out, unreachable default route - ns-B loopback IP      [ OK ]
...

> 
> with -v I see:
> COMMAND: ip netns exec ns-A ping -c1 -w1 -I 172.16.2.1 172.16.1.2
> ping: unknown iface 172.16.2.1
> TEST: ping out, address bind - ns-B IP                                        [FAIL]

With ping from iputils-ping -I can be an address or a device.

$ man ping

       -I interface
           interface is either an address, or an interface name. If
interface is an address, it
           sets source address to specified interface address. If
interface in an interface name,
           it sets source interface to specified interface. NOTE: For
IPv6, when doing ping to a
           link-local scope address, link specification (by the
'%'-notation in destination, or
           by this option) can be used but it is no longer required.

^ permalink raw reply

* [PATCH net-next] net: can: Fix compiling warning
From: Mao Wenan @ 2019-08-02  3:36 UTC (permalink / raw)
  To: socketcan, davem, netdev; +Cc: linux-kernel, kernel-janitors, Mao Wenan

There are two warings in net/can, fix them by setting bcm_sock_no_ioctlcmd
and raw_sock_no_ioctlcmd as static.

net/can/bcm.c:1683:5: warning: symbol 'bcm_sock_no_ioctlcmd' was not declared. Should it be static?
net/can/raw.c:840:5: warning: symbol 'raw_sock_no_ioctlcmd' was not declared. Should it be static?

Fixes: 473d924d7d46 ("can: fix ioctl function removal")

Signed-off-by: Mao Wenan <maowenan@huawei.com>
---
 net/can/bcm.c | 2 +-
 net/can/raw.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index bf1d0bbecec8..b8a32b4ac368 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1680,7 +1680,7 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
diff --git a/net/can/raw.c b/net/can/raw.c
index da386f1fa815..a01848ff9b12 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -837,7 +837,7 @@ static int raw_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: Jakub Kicinski @ 2019-08-02  2:56 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
In-Reply-To: <43995d3c-dff5-3000-317c-09b119c61565@ucloud.cn>

On Fri, 2 Aug 2019 10:47:26 +0800, wenxu wrote:
> > After all the same device may have both a TC block and a NFT block.  
> 
> Only one subsystem can be used for the same device for both indr-dev and hw-dev
> the flow_block_cb_is_busy avoid the situation you mentioned.

AFAIU that's a temporary limitation until drivers learn to manage
multiple tables.

^ permalink raw reply

* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: wenxu @ 2019-08-02  2:47 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
In-Reply-To: <20190801161129.25fee619@cakuba.netronome.com>


On 8/2/2019 7:11 AM, Jakub Kicinski wrote:
> On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> The new flow-indr-block can't get the tcf_block
>> directly. It provide a callback list to find the flow_block immediately
>> when the device register and contain a ingress block.
>>
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> First of all thanks for splitting the series up into more patches, 
> it is easier to follow the logic now!
>
>> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>>  
>>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>>  	indr_dev->dev = dev;
>> +	flow_get_default_block(indr_dev);
>>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>>  				   flow_indr_setup_block_ht_params)) {
>>  		kfree(indr_dev);
> I wonder if it's still practical to keep the block information in the
> indr_dev structure at all. The way this used to work was:
>
>
> [hash table of devices]     --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  | .
>                  |          --------------   \__ [cb, cb_priv, cb_ident]
>                  |                               [cb, cb_priv, cb_ident]
>                  |          --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  |.
> -----------------           --------------   \__ [cb, cb_priv, cb_ident]
>                                                  [cb, cb_priv, cb_ident]
>
>
> In the example above we have two tunnels tun0 and tun1, each one has a
> indr_dev structure allocated, and for each one of them two drivers
> registered for callbacks (hence the callbacks list has two entries).
>
> We used to cache the TC block in the indr_dev structure, but now that
> there are multiple subsytems using the indr_dev we either have to have
> a list of cached blocks (with entries for each subsystem) or just always
> iterate over the subsystems :(
>
> After all the same device may have both a TC block and a NFT block.

Only one subsystem can be used for the same device for both indr-dev and hw-dev

the flow_block_cb_is_busy avoid the situation you mentioned.

>
> I think always iterating would be easier:
>
> The indr_dev struct would no longer have the block pointer, instead
> when new driver registers for the callback instead of:
>
> 	if (indr_dev->ing_cmd_cb)
> 		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
> 				     indr_block_cb->cb, indr_block_cb->cb_priv,
> 				     FLOW_BLOCK_BIND);
>
> We'd have something like the loop in flow_get_default_block():
>
> 	for each (subsystem)
> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>
> And then per-subsystem logic would actually call the cb. Or:
>
> 	for each (subsystem)
> 		block = get_default_block(indir_dev)
> 		indr_dev->ing_cmd_cb(...)
>
> I hope this makes sense.
>
>
>

^ permalink raw reply

* Re: [PATCH 00/34] put_user_pages(): miscellaneous call sites
From: John Hubbard @ 2019-08-02  2:39 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel
In-Reply-To: <20190802021653.4882-1-jhubbard@nvidia.com>

On 8/1/19 7:16 PM, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Hi,
> 
> These are best characterized as miscellaneous conversions: many (not all)
> call sites that don't involve biovec or iov_iter, nor mm/. It also leaves
> out a few call sites that require some more work. These are mostly pretty
> simple ones.
> 
> It's probably best to send all of these via Andrew's -mm tree, assuming
> that there are no significant merge conflicts with ongoing work in other
> trees (which I doubt, given that these are small changes).
> 

In case anyone is wondering, this truncated series is due to a script failure:
git-send-email chokes when it hits email addresses whose names have a
comma in them, as happened here with patch 0003.  

Please disregard this set and reply to the other thread.

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* memory leak in tipc_group_create_member
From: syzbot @ 2019-08-02  2:38 UTC (permalink / raw)
  To: davem, jon.maloy, linux-kernel, netdev, syzkaller-bugs,
	tipc-discussion, ying.xue

Hello,

syzbot found the following crash on:

HEAD commit:    a9815a4f Merge branch 'x86-urgent-for-linus' of git://git...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12a6dbf0600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=37c48fb52e3789e6
dashboard link: https://syzkaller.appspot.com/bug?extid=f95d90c454864b3b5bc9
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13be3ecc600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11c992b4600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com

executing program
BUG: memory leak
unreferenced object 0xffff888122bca200 (size 128):
   comm "syz-executor232", pid 7065, jiffies 4294943817 (age 8.880s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 18 a2 bc 22 81 88 ff ff  ..........."....
   backtrace:
     [<000000005bada299>] kmemleak_alloc_recursive  
include/linux/kmemleak.h:43 [inline]
     [<000000005bada299>] slab_post_alloc_hook mm/slab.h:522 [inline]
     [<000000005bada299>] slab_alloc mm/slab.c:3319 [inline]
     [<000000005bada299>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
     [<00000000e7bcdc9f>] kmalloc include/linux/slab.h:552 [inline]
     [<00000000e7bcdc9f>] kzalloc include/linux/slab.h:748 [inline]
     [<00000000e7bcdc9f>] tipc_group_create_member+0x3c/0x190  
net/tipc/group.c:306
     [<0000000005f56f40>] tipc_group_add_member+0x34/0x40  
net/tipc/group.c:327
     [<0000000044406683>] tipc_nametbl_build_group+0x9b/0xf0  
net/tipc/name_table.c:600
     [<000000009f71e803>] tipc_sk_join net/tipc/socket.c:2901 [inline]
     [<000000009f71e803>] tipc_setsockopt+0x170/0x490 net/tipc/socket.c:3006
     [<000000007f61cbc2>] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
     [<00000000cc630372>] __do_sys_setsockopt net/socket.c:2100 [inline]
     [<00000000cc630372>] __se_sys_setsockopt net/socket.c:2097 [inline]
     [<00000000cc630372>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2097
     [<00000000ec30be33>] do_syscall_64+0x76/0x1a0  
arch/x86/entry/common.c:296
     [<00000000271be3e6>] entry_SYSCALL_64_after_hwframe+0x44/0xa9



---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* [PATCH 00/34] put_user_pages(): miscellaneous call sites
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

Hi,

These are best characterized as miscellaneous conversions: many (not all)
call sites that don't involve biovec or iov_iter, nor mm/. It also leaves
out a few call sites that require some more work. These are mostly pretty
simple ones.

It's probably best to send all of these via Andrew's -mm tree, assuming
that there are no significant merge conflicts with ongoing work in other
trees (which I doubt, given that these are small changes).

These patches apply to the latest linux.git. Patch #1 is also already in
Andrew's tree, but given the broad non-linux-mm Cc list, I thought it
would be more convenient to just include that patch here, so that people
can use linux.git as the base--even though these are probably destined
for linux-mm.

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions"). That commit
has an extensive description of the problem and the planned steps to
solve it, but the highlites are:

1) Provide put_user_page*() routines, intended to be used
for releasing pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to
invoke put_user_page*(), instead of put_page(). This involves dozens of
call sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
implement tracking of these pages. This tracking will be separate from
the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
special handling (especially in writeback paths) when the pages are
backed by a filesystem.

And a few references, also from that commit:

[1] https://lwn.net/Articles/774411/ : "DMA and get_user_pages()"
[2] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"


Ira Weiny (1):
  fs/binfmt_elf: convert put_page() to put_user_page*()

John Hubbard (33):
  mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
  net/rds: convert put_page() to put_user_page*()
  net/ceph: convert put_page() to put_user_page*()
  x86/kvm: convert put_page() to put_user_page*()
  drm/etnaviv: convert release_pages() to put_user_pages()
  drm/i915: convert put_page() to put_user_page*()
  drm/radeon: convert put_page() to put_user_page*()
  media/ivtv: convert put_page() to put_user_page*()
  media/v4l2-core/mm: convert put_page() to put_user_page*()
  genwqe: convert put_page() to put_user_page*()
  scif: convert put_page() to put_user_page*()
  vmci: convert put_page() to put_user_page*()
  rapidio: convert put_page() to put_user_page*()
  oradax: convert put_page() to put_user_page*()
  staging/vc04_services: convert put_page() to put_user_page*()
  drivers/tee: convert put_page() to put_user_page*()
  vfio: convert put_page() to put_user_page*()
  fbdev/pvr2fb: convert put_page() to put_user_page*()
  fsl_hypervisor: convert put_page() to put_user_page*()
  xen: convert put_page() to put_user_page*()
  fs/exec.c: convert put_page() to put_user_page*()
  orangefs: convert put_page() to put_user_page*()
  uprobes: convert put_page() to put_user_page*()
  futex: convert put_page() to put_user_page*()
  mm/frame_vector.c: convert put_page() to put_user_page*()
  mm/gup_benchmark.c: convert put_page() to put_user_page*()
  mm/memory.c: convert put_page() to put_user_page*()
  mm/madvise.c: convert put_page() to put_user_page*()
  mm/process_vm_access.c: convert put_page() to put_user_page*()
  crypt: convert put_page() to put_user_page*()
  nfs: convert put_page() to put_user_page*()
  goldfish_pipe: convert put_page() to put_user_page*()
  kernel/events/core.c: convert put_page() to put_user_page*()

 arch/x86/kvm/svm.c                            |   4 +-
 crypto/af_alg.c                               |   7 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c         |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   9 +-
 drivers/gpu/drm/radeon/radeon_ttm.c           |   2 +-
 drivers/infiniband/core/umem.c                |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c       |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c    |   5 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c      |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c           |  10 +-
 drivers/media/pci/ivtv/ivtv-udma.c            |  14 +--
 drivers/media/pci/ivtv/ivtv-yuv.c             |  10 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c     |   3 +-
 drivers/misc/genwqe/card_utils.c              |  17 +--
 drivers/misc/mic/scif/scif_rma.c              |  17 ++-
 drivers/misc/vmw_vmci/vmci_context.c          |   2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c       |  11 +-
 drivers/platform/goldfish/goldfish_pipe.c     |   9 +-
 drivers/rapidio/devices/rio_mport_cdev.c      |   9 +-
 drivers/sbus/char/oradax.c                    |   2 +-
 .../interface/vchiq_arm/vchiq_2835_arm.c      |  10 +-
 drivers/tee/tee_shm.c                         |  10 +-
 drivers/vfio/vfio_iommu_type1.c               |   8 +-
 drivers/video/fbdev/pvr2fb.c                  |   3 +-
 drivers/virt/fsl_hypervisor.c                 |   7 +-
 drivers/xen/gntdev.c                          |   5 +-
 drivers/xen/privcmd.c                         |   7 +-
 fs/binfmt_elf.c                               |   2 +-
 fs/binfmt_elf_fdpic.c                         |   2 +-
 fs/exec.c                                     |   2 +-
 fs/nfs/direct.c                               |   4 +-
 fs/orangefs/orangefs-bufmap.c                 |   7 +-
 include/linux/mm.h                            |   5 +-
 kernel/events/core.c                          |   2 +-
 kernel/events/uprobes.c                       |   6 +-
 kernel/futex.c                                |  10 +-
 mm/frame_vector.c                             |   4 +-
 mm/gup.c                                      | 115 ++++++++----------
 mm/gup_benchmark.c                            |   2 +-
 mm/madvise.c                                  |   2 +-
 mm/memory.c                                   |   2 +-
 mm/process_vm_access.c                        |  18 +--
 net/ceph/pagevec.c                            |   8 +-
 net/rds/info.c                                |   5 +-
 net/rds/message.c                             |   2 +-
 net/rds/rdma.c                                |  15 ++-
 virt/kvm/kvm_main.c                           |   4 +-
 47 files changed, 151 insertions(+), 266 deletions(-)

-- 
2.22.0


^ permalink raw reply

* [PATCH 02/34] net/rds: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Santosh Shilimkar, David S . Miller
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: rds-devel@oss.oracle.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 net/rds/info.c    |  5 ++---
 net/rds/message.c |  2 +-
 net/rds/rdma.c    | 15 +++++++--------
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/net/rds/info.c b/net/rds/info.c
index 03f6fd56d237..ca6af2889adf 100644
--- a/net/rds/info.c
+++ b/net/rds/info.c
@@ -162,7 +162,6 @@ int rds_info_getsockopt(struct socket *sock, int optname, char __user *optval,
 	struct rds_info_lengths lens;
 	unsigned long nr_pages = 0;
 	unsigned long start;
-	unsigned long i;
 	rds_info_func func;
 	struct page **pages = NULL;
 	int ret;
@@ -235,8 +234,8 @@ int rds_info_getsockopt(struct socket *sock, int optname, char __user *optval,
 		ret = -EFAULT;
 
 out:
-	for (i = 0; pages && i < nr_pages; i++)
-		put_page(pages[i]);
+	if (pages)
+		put_user_pages(pages, nr_pages);
 	kfree(pages);
 
 	return ret;
diff --git a/net/rds/message.c b/net/rds/message.c
index 50f13f1d4ae0..d7b0d266c437 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -404,7 +404,7 @@ static int rds_message_zcopy_from_user(struct rds_message *rm, struct iov_iter *
 			int i;
 
 			for (i = 0; i < rm->data.op_nents; i++)
-				put_page(sg_page(&rm->data.op_sg[i]));
+				put_user_page(sg_page(&rm->data.op_sg[i]));
 			mmp = &rm->data.op_mmp_znotifier->z_mmp;
 			mm_unaccount_pinned_pages(mmp);
 			ret = -EFAULT;
diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index 916f5ec373d8..6762e8696b99 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -162,8 +162,7 @@ static int rds_pin_pages(unsigned long user_addr, unsigned int nr_pages,
 				  pages);
 
 	if (ret >= 0 && ret < nr_pages) {
-		while (ret--)
-			put_page(pages[ret]);
+		put_user_pages(pages, ret);
 		ret = -EFAULT;
 	}
 
@@ -276,7 +275,7 @@ static int __rds_rdma_map(struct rds_sock *rs, struct rds_get_mr_args *args,
 
 	if (IS_ERR(trans_private)) {
 		for (i = 0 ; i < nents; i++)
-			put_page(sg_page(&sg[i]));
+			put_user_page(sg_page(&sg[i]));
 		kfree(sg);
 		ret = PTR_ERR(trans_private);
 		goto out;
@@ -464,9 +463,10 @@ void rds_rdma_free_op(struct rm_rdma_op *ro)
 		 * to local memory */
 		if (!ro->op_write) {
 			WARN_ON(!page->mapping && irqs_disabled());
-			set_page_dirty(page);
+			put_user_pages_dirty_lock(&page, 1, true);
+		} else {
+			put_user_page(page);
 		}
-		put_page(page);
 	}
 
 	kfree(ro->op_notifier);
@@ -481,8 +481,7 @@ void rds_atomic_free_op(struct rm_atomic_op *ao)
 	/* Mark page dirty if it was possibly modified, which
 	 * is the case for a RDMA_READ which copies from remote
 	 * to local memory */
-	set_page_dirty(page);
-	put_page(page);
+	put_user_pages_dirty_lock(&page, 1, true);
 
 	kfree(ao->op_notifier);
 	ao->op_notifier = NULL;
@@ -867,7 +866,7 @@ int rds_cmsg_atomic(struct rds_sock *rs, struct rds_message *rm,
 	return ret;
 err:
 	if (page)
-		put_page(page);
+		put_user_page(page);
 	rm->atomic.op_active = 0;
 	kfree(rm->atomic.op_notifier);
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 01/34] mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Matthew Wilcox, Christoph Hellwig
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

Provide more capable variation of put_user_pages_dirty_lock(),
and delete put_user_pages_dirty(). This is based on the
following:

1. Lots of call sites become simpler if a bool is passed
into put_user_page*(), instead of making the call site
choose which put_user_page*() variant to call.

2. Christoph Hellwig's observation that set_page_dirty_lock()
is usually correct, and set_page_dirty() is usually a
bug, or at least questionable, within a put_user_page*()
calling chain.

This leads to the following API choices:

    * put_user_pages_dirty_lock(page, npages, make_dirty)

    * There is no put_user_pages_dirty(). You have to
      hand code that, in the rare case that it's
      required.

Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/infiniband/core/umem.c             |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c        |  10 +-
 include/linux/mm.h                         |   5 +-
 mm/gup.c                                   | 115 +++++++++------------
 7 files changed, 58 insertions(+), 92 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 08da840ed7ee..965cf9dea71a 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -54,10 +54,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 
 	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
 		page = sg_page_iter_page(&sg_iter);
-		if (umem->writable && dirty)
-			put_user_pages_dirty_lock(&page, 1);
-		else
-			put_user_page(page);
+		put_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
 	}
 
 	sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index b89a9b9aef7a..469acb961fbd 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -118,10 +118,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 			     size_t npages, bool dirty)
 {
-	if (dirty)
-		put_user_pages_dirty_lock(p, npages);
-	else
-		put_user_pages(p, npages);
+	put_user_pages_dirty_lock(p, npages, dirty);
 
 	if (mm) { /* during close after signal, mm can be NULL */
 		atomic64_sub(npages, &mm->pinned_vm);
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index bfbfbb7e0ff4..6bf764e41891 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -40,10 +40,7 @@
 static void __qib_release_user_pages(struct page **p, size_t num_pages,
 				     int dirty)
 {
-	if (dirty)
-		put_user_pages_dirty_lock(p, num_pages);
-	else
-		put_user_pages(p, num_pages);
+	put_user_pages_dirty_lock(p, num_pages, dirty);
 }
 
 /**
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 0b0237d41613..62e6ffa9ad78 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -75,10 +75,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
 		for_each_sg(chunk->page_list, sg, chunk->nents, i) {
 			page = sg_page(sg);
 			pa = sg_phys(sg);
-			if (dirty)
-				put_user_pages_dirty_lock(&page, 1);
-			else
-				put_user_page(page);
+			put_user_pages_dirty_lock(&page, 1, dirty);
 			usnic_dbg("pa: %pa\n", &pa);
 		}
 		kfree(chunk);
diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
index 67171c82b0c4..ab83a9cec562 100644
--- a/drivers/infiniband/sw/siw/siw_mem.c
+++ b/drivers/infiniband/sw/siw/siw_mem.c
@@ -63,15 +63,7 @@ struct siw_mem *siw_mem_id2obj(struct siw_device *sdev, int stag_index)
 static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages,
 			   bool dirty)
 {
-	struct page **p = chunk->plist;
-
-	while (num_pages--) {
-		if (!PageDirty(*p) && dirty)
-			put_user_pages_dirty_lock(p, 1);
-		else
-			put_user_page(*p);
-		p++;
-	}
+	put_user_pages_dirty_lock(chunk->plist, num_pages, dirty);
 }
 
 void siw_umem_release(struct siw_umem *umem, bool dirty)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0334ca97c584..9759b6a24420 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1057,8 +1057,9 @@ static inline void put_user_page(struct page *page)
 	put_page(page);
 }
 
-void put_user_pages_dirty(struct page **pages, unsigned long npages);
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
+			       bool make_dirty);
+
 void put_user_pages(struct page **pages, unsigned long npages);
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
diff --git a/mm/gup.c b/mm/gup.c
index 98f13ab37bac..7fefd7ab02c4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,85 +29,70 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-typedef int (*set_dirty_func_t)(struct page *page);
-
-static void __put_user_pages_dirty(struct page **pages,
-				   unsigned long npages,
-				   set_dirty_func_t sdf)
-{
-	unsigned long index;
-
-	for (index = 0; index < npages; index++) {
-		struct page *page = compound_head(pages[index]);
-
-		/*
-		 * Checking PageDirty at this point may race with
-		 * clear_page_dirty_for_io(), but that's OK. Two key cases:
-		 *
-		 * 1) This code sees the page as already dirty, so it skips
-		 * the call to sdf(). That could happen because
-		 * clear_page_dirty_for_io() called page_mkclean(),
-		 * followed by set_page_dirty(). However, now the page is
-		 * going to get written back, which meets the original
-		 * intention of setting it dirty, so all is well:
-		 * clear_page_dirty_for_io() goes on to call
-		 * TestClearPageDirty(), and write the page back.
-		 *
-		 * 2) This code sees the page as clean, so it calls sdf().
-		 * The page stays dirty, despite being written back, so it
-		 * gets written back again in the next writeback cycle.
-		 * This is harmless.
-		 */
-		if (!PageDirty(page))
-			sdf(page);
-
-		put_user_page(page);
-	}
-}
-
 /**
- * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
- * @pages:  array of pages to be marked dirty and released.
+ * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
+ * @pages:  array of pages to be maybe marked dirty, and definitely released.
  * @npages: number of pages in the @pages array.
+ * @make_dirty: whether to mark the pages dirty
  *
  * "gup-pinned page" refers to a page that has had one of the get_user_pages()
  * variants called on that page.
  *
  * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
+ * compound page) dirty, if @make_dirty is true, and if the page was previously
+ * listed as clean. In any case, releases all pages using put_user_page(),
+ * possibly via put_user_pages(), for the non-dirty case.
  *
  * Please see the put_user_page() documentation for details.
  *
- * set_page_dirty(), which does not lock the page, is used here.
- * Therefore, it is the caller's responsibility to ensure that this is
- * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
+ * required, then the caller should a) verify that this is really correct,
+ * because _lock() is usually required, and b) hand code it:
+ * set_page_dirty_lock(), put_user_page().
  *
  */
-void put_user_pages_dirty(struct page **pages, unsigned long npages)
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
+			       bool make_dirty)
 {
-	__put_user_pages_dirty(pages, npages, set_page_dirty);
-}
-EXPORT_SYMBOL(put_user_pages_dirty);
+	unsigned long index;
 
-/**
- * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
- * @pages:  array of pages to be marked dirty and released.
- * @npages: number of pages in the @pages array.
- *
- * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
- *
- * Please see the put_user_page() documentation for details.
- *
- * This is just like put_user_pages_dirty(), except that it invokes
- * set_page_dirty_lock(), instead of set_page_dirty().
- *
- */
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
-{
-	__put_user_pages_dirty(pages, npages, set_page_dirty_lock);
+	/*
+	 * TODO: this can be optimized for huge pages: if a series of pages is
+	 * physically contiguous and part of the same compound page, then a
+	 * single operation to the head page should suffice.
+	 */
+
+	if (!make_dirty) {
+		put_user_pages(pages, npages);
+		return;
+	}
+
+	for (index = 0; index < npages; index++) {
+		struct page *page = compound_head(pages[index]);
+		/*
+		 * Checking PageDirty at this point may race with
+		 * clear_page_dirty_for_io(), but that's OK. Two key
+		 * cases:
+		 *
+		 * 1) This code sees the page as already dirty, so it
+		 * skips the call to set_page_dirty(). That could happen
+		 * because clear_page_dirty_for_io() called
+		 * page_mkclean(), followed by set_page_dirty().
+		 * However, now the page is going to get written back,
+		 * which meets the original intention of setting it
+		 * dirty, so all is well: clear_page_dirty_for_io() goes
+		 * on to call TestClearPageDirty(), and write the page
+		 * back.
+		 *
+		 * 2) This code sees the page as clean, so it calls
+		 * set_page_dirty(). The page stays dirty, despite being
+		 * written back, so it gets written back again in the
+		 * next writeback cycle. This is harmless.
+		 */
+		if (!PageDirty(page))
+			set_page_dirty_lock(page);
+		put_user_page(page);
+	}
 }
 EXPORT_SYMBOL(put_user_pages_dirty_lock);
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 04/34] x86/kvm: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Joerg Roedel, Paolo Bonzini,
	Radim Krčmář, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/x86/kvm/svm.c  | 4 ++--
 virt/kvm/kvm_main.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7eafc6907861..ff93c923ed36 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1827,7 +1827,7 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
 
 err:
 	if (npinned > 0)
-		release_pages(pages, npinned);
+		put_user_pages(pages, npinned);
 
 	kvfree(pages);
 	return NULL;
@@ -1838,7 +1838,7 @@ static void sev_unpin_memory(struct kvm *kvm, struct page **pages,
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
 
-	release_pages(pages, npages);
+	put_user_pages(pages, npages);
 	kvfree(pages);
 	sev->pages_locked -= npages;
 }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 887f3b0c2b60..4b6a596ea8e9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1499,7 +1499,7 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 
 		if (__get_user_pages_fast(addr, 1, 1, &wpage) == 1) {
 			*writable = true;
-			put_page(page);
+			put_user_page(page);
 			page = wpage;
 		}
 	}
@@ -1831,7 +1831,7 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
-		put_page(pfn_to_page(pfn));
+		put_user_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 06/34] drm/i915: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Note that this effectively changes the code's behavior in
i915_gem_userptr_put_pages(): it now calls set_page_dirty_lock(),
instead of set_page_dirty(). This is probably more accurate.

As Christophe Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [1]

[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 528b61678334..c18008d3cc2a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -527,7 +527,7 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 	}
 	mutex_unlock(&obj->mm.lock);
 
-	release_pages(pvec, pinned);
+	put_user_pages(pvec, pinned);
 	kvfree(pvec);
 
 	i915_gem_object_put(obj);
@@ -640,7 +640,7 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 		__i915_gem_userptr_set_active(obj, true);
 
 	if (IS_ERR(pages))
-		release_pages(pvec, pinned);
+		put_user_pages(pvec, pinned);
 	kvfree(pvec);
 
 	return PTR_ERR_OR_ZERO(pages);
@@ -663,11 +663,8 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 	i915_gem_gtt_finish_pages(obj, pages);
 
 	for_each_sgt_page(page, sgt_iter, pages) {
-		if (obj->mm.dirty)
-			set_page_dirty(page);
-
 		mark_page_accessed(page);
-		put_page(page);
+		put_user_pages_dirty_lock(&page, 1, obj->mm.dirty);
 	}
 	obj->mm.dirty = false;
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 10/34] genwqe: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Frank Haverkamp, Guilherme G. Piccoli,
	Arnd Bergmann, Greg Kroah-Hartman
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

This changes the release code slightly, because each page slot in the
page_list[] array is no longer checked for NULL. However, that check
was wrong anyway, because the get_user_pages() pattern of usage here
never allowed for NULL entries within a range of pinned pages.

Cc: Frank Haverkamp <haver@linux.vnet.ibm.com>
Cc: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/genwqe/card_utils.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/genwqe/card_utils.c b/drivers/misc/genwqe/card_utils.c
index 2e1c4d2905e8..2a888f31d2c5 100644
--- a/drivers/misc/genwqe/card_utils.c
+++ b/drivers/misc/genwqe/card_utils.c
@@ -517,24 +517,13 @@ int genwqe_free_sync_sgl(struct genwqe_dev *cd, struct genwqe_sgl *sgl)
 /**
  * genwqe_free_user_pages() - Give pinned pages back
  *
- * Documentation of get_user_pages is in mm/gup.c:
- *
- * If the page is written to, set_page_dirty (or set_page_dirty_lock,
- * as appropriate) must be called after the page is finished with, and
- * before put_page is called.
+ * The pages may have been written to, so we call put_user_pages_dirty_lock(),
+ * rather than put_user_pages().
  */
 static int genwqe_free_user_pages(struct page **page_list,
 			unsigned int nr_pages, int dirty)
 {
-	unsigned int i;
-
-	for (i = 0; i < nr_pages; i++) {
-		if (page_list[i] != NULL) {
-			if (dirty)
-				set_page_dirty_lock(page_list[i]);
-			put_page(page_list[i]);
-		}
-	}
+	put_user_pages_dirty_lock(page_list, nr_pages, dirty);
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 11/34] scif: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Sudeep Dutt, Ashutosh Dixit,
	Arnd Bergmann, Joerg Roedel, Robin Murphy, Zhen Lei
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/mic/scif/scif_rma.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
index 01e27682ea30..d84ed9466920 100644
--- a/drivers/misc/mic/scif/scif_rma.c
+++ b/drivers/misc/mic/scif/scif_rma.c
@@ -113,13 +113,14 @@ static int scif_destroy_pinned_pages(struct scif_pinned_pages *pin)
 	int writeable = pin->prot & SCIF_PROT_WRITE;
 	int kernel = SCIF_MAP_KERNEL & pin->map_flags;
 
-	for (j = 0; j < pin->nr_pages; j++) {
-		if (pin->pages[j] && !kernel) {
+	if (kernel) {
+		for (j = 0; j < pin->nr_pages; j++) {
 			if (writeable)
-				SetPageDirty(pin->pages[j]);
+				set_page_dirty_lock(pin->pages[j]);
 			put_page(pin->pages[j]);
 		}
-	}
+	} else
+		put_user_pages_dirty_lock(pin->pages, pin->nr_pages, writeable);
 
 	scif_free(pin->pages,
 		  pin->nr_pages * sizeof(*pin->pages));
@@ -1385,11 +1386,9 @@ int __scif_pin_pages(void *addr, size_t len, int *out_prot,
 				if (ulimit)
 					__scif_dec_pinned_vm_lock(mm, nr_pages);
 				/* Roll back any pinned pages */
-				for (i = 0; i < pinned_pages->nr_pages; i++) {
-					if (pinned_pages->pages[i])
-						put_page(
-						pinned_pages->pages[i]);
-				}
+				put_user_pages(pinned_pages->pages,
+					       pinned_pages->nr_pages);
+
 				prot &= ~SCIF_PROT_WRITE;
 				try_upgrade = false;
 				goto retry;
-- 
2.22.0


^ permalink raw reply related

* [PATCH 12/34] vmci: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Arnd Bergmann, Al Viro,
	Gustavo A. R. Silva, Kees Cook
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Note that this effectively changes the code's behavior in
qp_release_pages(): it now ultimately calls set_page_dirty_lock(),
instead of set_page_dirty(). This is probably more accurate.

As Christophe Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [1]

[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/vmw_vmci/vmci_context.c    |  2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c | 11 ++---------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c
index 16695366ec92..9daa52ee63b7 100644
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -587,7 +587,7 @@ void vmci_ctx_unset_notify(struct vmci_ctx *context)
 
 	if (notify_page) {
 		kunmap(notify_page);
-		put_page(notify_page);
+		put_user_page(notify_page);
 	}
 }
 
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index 8531ae781195..e5434551d0ef 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -626,15 +626,8 @@ static void qp_release_queue_mutex(struct vmci_queue *queue)
 static void qp_release_pages(struct page **pages,
 			     u64 num_pages, bool dirty)
 {
-	int i;
-
-	for (i = 0; i < num_pages; i++) {
-		if (dirty)
-			set_page_dirty(pages[i]);
-
-		put_page(pages[i]);
-		pages[i] = NULL;
-	}
+	put_user_pages_dirty_lock(pages, num_pages, dirty);
+	memset(pages, 0, num_pages * sizeof(struct page *));
 }
 
 /*
-- 
2.22.0


^ permalink raw reply related

* [PATCH 07/34] drm/radeon: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Alex Deucher, Christian König,
	David Zhou, David Airlie
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index fb3696bc616d..4c9943fa10df 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -540,7 +540,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	kfree(ttm->sg);
 
 release_pages:
-	release_pages(ttm->pages, pinned);
+	put_user_pages(ttm->pages, pinned);
 	return r;
 }
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH 14/34] oradax: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, David S . Miller, Jonathan Helman,
	Rob Gardner, Andy Shevchenko, Jonathan Corbet, Wei Yongjun,
	Mauro Carvalho Chehab
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: David S. Miller <davem@davemloft.net>
Cc: Jonathan Helman <jonathan.helman@oracle.com>
Cc: Rob Gardner <rob.gardner@oracle.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: sparclinux@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/sbus/char/oradax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/sbus/char/oradax.c b/drivers/sbus/char/oradax.c
index 8af216287a84..029e619992fc 100644
--- a/drivers/sbus/char/oradax.c
+++ b/drivers/sbus/char/oradax.c
@@ -412,7 +412,7 @@ static void dax_unlock_pages(struct dax_ctx *ctx, int ccb_index, int nelem)
 				dax_dbg("freeing page %p", p);
 				if (j == OUT)
 					set_page_dirty(p);
-				put_page(p);
+				put_user_page(p);
 				ctx->pages[i][j] = NULL;
 			}
 		}
-- 
2.22.0


^ permalink raw reply related

* [PATCH 08/34] media/ivtv: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-02  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Andy Walls, Mauro Carvalho Chehab
In-Reply-To: <20190802022005.5117-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Andy Walls <awalls@md.metrocast.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/media/pci/ivtv/ivtv-udma.c | 14 ++++----------
 drivers/media/pci/ivtv/ivtv-yuv.c  | 10 +++-------
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/media/pci/ivtv/ivtv-udma.c b/drivers/media/pci/ivtv/ivtv-udma.c
index 5f8883031c9c..7c7f33c2412b 100644
--- a/drivers/media/pci/ivtv/ivtv-udma.c
+++ b/drivers/media/pci/ivtv/ivtv-udma.c
@@ -92,7 +92,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 {
 	struct ivtv_dma_page_info user_dma;
 	struct ivtv_user_dma *dma = &itv->udma;
-	int i, err;
+	int err;
 
 	IVTV_DEBUG_DMA("ivtv_udma_setup, dst: 0x%08x\n", (unsigned int)ivtv_dest_addr);
 
@@ -119,8 +119,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 		IVTV_DEBUG_WARN("failed to map user pages, returned %d instead of %d\n",
 			   err, user_dma.page_count);
 		if (err >= 0) {
-			for (i = 0; i < err; i++)
-				put_page(dma->map[i]);
+			put_user_pages(dma->map, err);
 			return -EINVAL;
 		}
 		return err;
@@ -130,9 +129,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 
 	/* Fill SG List with new values */
 	if (ivtv_udma_fill_sg_list(dma, &user_dma, 0) < 0) {
-		for (i = 0; i < dma->page_count; i++) {
-			put_page(dma->map[i]);
-		}
+		put_user_pages(dma->map, dma->page_count);
 		dma->page_count = 0;
 		return -ENOMEM;
 	}
@@ -153,7 +150,6 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 void ivtv_udma_unmap(struct ivtv *itv)
 {
 	struct ivtv_user_dma *dma = &itv->udma;
-	int i;
 
 	IVTV_DEBUG_INFO("ivtv_unmap_user_dma\n");
 
@@ -170,9 +166,7 @@ void ivtv_udma_unmap(struct ivtv *itv)
 	ivtv_udma_sync_for_cpu(itv);
 
 	/* Release User Pages */
-	for (i = 0; i < dma->page_count; i++) {
-		put_page(dma->map[i]);
-	}
+	put_user_pages(dma->map, dma->page_count);
 	dma->page_count = 0;
 }
 
diff --git a/drivers/media/pci/ivtv/ivtv-yuv.c b/drivers/media/pci/ivtv/ivtv-yuv.c
index cd2fe2d444c0..9465a7d450b6 100644
--- a/drivers/media/pci/ivtv/ivtv-yuv.c
+++ b/drivers/media/pci/ivtv/ivtv-yuv.c
@@ -81,8 +81,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 				 uv_pages, uv_dma.page_count);
 
 			if (uv_pages >= 0) {
-				for (i = 0; i < uv_pages; i++)
-					put_page(dma->map[y_pages + i]);
+				put_user_pages(&dma->map[y_pages], uv_pages);
 				rc = -EFAULT;
 			} else {
 				rc = uv_pages;
@@ -93,8 +92,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 				 y_pages, y_dma.page_count);
 		}
 		if (y_pages >= 0) {
-			for (i = 0; i < y_pages; i++)
-				put_page(dma->map[i]);
+			put_user_pages(dma->map, y_pages);
 			/*
 			 * Inherit the -EFAULT from rc's
 			 * initialization, but allow it to be
@@ -112,9 +110,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 	/* Fill & map SG List */
 	if (ivtv_udma_fill_sg_list (dma, &uv_dma, ivtv_udma_fill_sg_list (dma, &y_dma, 0)) < 0) {
 		IVTV_DEBUG_WARN("could not allocate bounce buffers for highmem userspace buffers\n");
-		for (i = 0; i < dma->page_count; i++) {
-			put_page(dma->map[i]);
-		}
+		put_user_pages(dma->map, dma->page_count);
 		dma->page_count = 0;
 		return -ENOMEM;
 	}
-- 
2.22.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox