Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH v3 7/7] [wip] virtio-gpu: add page flip support
From: Daniel Vetter @ 2016-05-30 14:43 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: open list:VIRTIO GPU DRIVER, dri-devel, Daniel Vetter, open list
In-Reply-To: <1464610010.5179.34.camel@redhat.com>

On Mon, May 30, 2016 at 02:06:50PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > But I'll take you up on the implied offer to help out and test ;-)
> > 
> > git://people.freedesktop.org/~danvet/drm stuff
> 
> Tried that branch.
> 
> > Would be really awesome if you could test this on virtio. Note that the
> > new nonblocking helpers require that your atomic backend gets the drm
> > event handling right. So if there's a bug in that logic then you'll see
> > lots of dmesg noise about waits timing out (after 10s of waiting). From a
> > quick inspection it should work though.
> 
> No timeouts.  Yay!
> 
> But it seems crtcs can be (temporarely) disabled now, so we might have
> to pick up the crtc from old_state in virtio_gpu_plane_atomic_update to
> figure which virtual output needs to be turned off.  Ran into this last
> week already.  Happened with multihead setups only, but the same patch
> fixes this one too ;)
> 
> https://lists.freedesktop.org/archives/dri-devel/2016-May/108772.html

Hm, smells more like virtio isn't too happy with the default ordering of
the commit operation. The default is:

- Disable any crtc/encoders that need to be disabled/change.
- Bash new plane setup into hw.
- Enable all crtcs/encoders that need to be enabled/have changed.

There's two problems:
- some hw gets real grumpy if you bash in plane state without the crtc
  state yet matching.
- if you do runtime pm nothing is enabled and the writes get lost at best,
  or hang your interconnect at worst.

That's why you can overwrite atomic_commit_tail, and use something more
sensible. See for example what it looks like for rockchip. I have a gut
feeling that should also take care of your troubles. Using the old crtc is
definitely not what you want.

Another option is that virtio isn't happy about bashing in plane state for
disabled crtc. Again helpers have you covered, look at the active_only
parameter for drm_atomic_helper_commit_planes().

Aside, if you wonder why these defaults: They match what the crtc helpers
are doing, but they are a bit surprising indeed.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH] Add virtio gpu driver.
From: Daniel Vetter @ 2016-05-30 14:47 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: virtio-dev, Michael S. Tsirkin, open list:ABI/API, open list,
	open list:DRM DRIVERS, open list:VIRTIO CORE, NET...,
	Daniel Vetter, Dave Airlie
In-Reply-To: <1464616236.5179.41.camel@redhat.com>

On Mon, May 30, 2016 at 03:50:36PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > - add a small core function to registerr HOT_X/HOT_Y for a (cursor) plane,
> >   e.g. drm_plane_register_hotspot(). That should allocate the properties
> >   (if they don't exist yet) and then attach those props to the cursor. We
> >   don't want those props everywhere, but only on drivers that support/need
> >   them, aka virtual hw.
> 
> Hmm, why is this special to virtual hw?
> 
> >  	if (crtc->cursor) {
> > -		ret = drm_mode_cursor_universal(crtc, req, file_priv);
> > +		if (drm_core_check_feature(DRIVER_ATOMIC))
> > +			ret = drm_mode_cursor_atomic(crtc, req, file_priv);
> > +		else
> > +			ret = drm_mode_cursor_universal(crtc, req, file_priv);
> >  		goto out;
> 
> >   drm_mode_cursor_atomic would simply be a fusing of
> >   drm_mode_cursor_universal + drm_atomic_helper_update_plane (dump all the
> >   intermediate variables and store directly in the plane state), with the
> >   addition of also storing hot_x/y into the plane state.
> 
> Hmm, that'll either make drm_mode_cursor_atomic a big cut+pasted
> function, or need quite some refactoring to move common code into
> functions callable from both drm_mode_cursor_atomic
> +drm_mode_cursor_universal ...
> 
> Why attach the hotspot to the plane?  Wouldn't it make more sense to
> make it a framebuffer property?

We don't have properties on the framebuffer. I guess you /could/ just add
it internally to struct drm_framebuffer, and not bother exposing to
userspace. I guess that would be a lot simpler, but it also means that
atomic userspace can't use hotspots before we add properties to fbs. And
doing that is a bit tricky since drm_framebuffer objects are meant to be
invariant - this assumption is deeply in-grained into the code all over
the place, everything just compares pointers when semantically it means to
compare the entire fb (including backing storage pointer/offsets and
everything).

So would be a bit more work to wire up for atomic userspace, but indeed a
lot less work to implement. I'm totally happy if you go with that tradeoff
;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH] virtio-gpu: fix output lookup
From: Daniel Vetter @ 2016-05-30 15:19 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: open list:VIRTIO GPU DRIVER, open list,
	open list:VIRTIO GPU DRIVER, dri-devel
In-Reply-To: <1464609806-22013-1-git-send-email-kraxel@redhat.com>

On Mon, May 30, 2016 at 02:03:26PM +0200, Gerd Hoffmann wrote:
> Needed for multihead setups where we can have disabled
> outputs and therefore plane->crtc can be NULL.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

See my reply in the other thread, but I think you have a more fundamental
issue here. I dropped two suggestions worth trying, both need your own
atomic_commit_tail (on top of my branch at least):
- set active_only to true in commit_planes()
- move commit_planes after commit_modeset_enables().

Cheers, Daniel
> ---
>  drivers/gpu/drm/virtio/virtgpu_plane.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
> index 70b44a2..e50674b 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
> @@ -63,11 +63,17 @@ static void virtio_gpu_plane_atomic_update(struct drm_plane *plane,
>  {
>  	struct drm_device *dev = plane->dev;
>  	struct virtio_gpu_device *vgdev = dev->dev_private;
> -	struct virtio_gpu_output *output = drm_crtc_to_virtio_gpu_output(plane->crtc);
> +	struct virtio_gpu_output *output = NULL;
>  	struct virtio_gpu_framebuffer *vgfb;
>  	struct virtio_gpu_object *bo;
>  	uint32_t handle;
>  
> +	if (plane->state->crtc)
> +		output = drm_crtc_to_virtio_gpu_output(plane->state->crtc);
> +	if (old_state->crtc)
> +		output = drm_crtc_to_virtio_gpu_output(old_state->crtc);
> +	WARN_ON(!output);
> +
>  	if (plane->state->fb) {
>  		vgfb = to_virtio_gpu_framebuffer(plane->state->fb);
>  		bo = gem_to_virtio_gpu_obj(vgfb->obj);
> -- 
> 1.8.3.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH V2 1/2] vhost_net: stop polling socket during rx processing
From: Michael S. Tsirkin @ 2016-05-30 15:47 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <1464590874-39539-2-git-send-email-jasowang@redhat.com>

On Mon, May 30, 2016 at 02:47:53AM -0400, Jason Wang wrote:
> We don't stop rx polling socket during rx processing, this will lead
> unnecessary wakeups from under layer net devices (E.g
> sock_def_readable() form tun). Rx will be slowed down in this
> way. This patch avoids this by stop polling socket during rx
> processing. A small drawback is that this introduces some overheads in
> light load case because of the extra start/stop polling, but single
> netperf TCP_RR does not notice any change. In a super heavy load case,
> e.g using pktgen to inject packet to guest, we get about ~8.8%
> improvement on pps:
> 
> before: ~1240000 pkt/s
> after:  ~1350000 pkt/s
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/net.c | 56 +++++++++++++++++++++++++++--------------------------
>  1 file changed, 29 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 10ff494..e91603b 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -301,6 +301,32 @@ static bool vhost_can_busy_poll(struct vhost_dev *dev,
>  	       !vhost_has_work(dev);
>  }
>  
> +static void vhost_net_disable_vq(struct vhost_net *n,
> +				 struct vhost_virtqueue *vq)
> +{
> +	struct vhost_net_virtqueue *nvq =
> +		container_of(vq, struct vhost_net_virtqueue, vq);
> +	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
> +	if (!vq->private_data)
> +		return;
> +	vhost_poll_stop(poll);
> +}
> +
> +static int vhost_net_enable_vq(struct vhost_net *n,
> +				struct vhost_virtqueue *vq)
> +{
> +	struct vhost_net_virtqueue *nvq =
> +		container_of(vq, struct vhost_net_virtqueue, vq);
> +	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
> +	struct socket *sock;
> +
> +	sock = vq->private_data;
> +	if (!sock)
> +		return 0;
> +
> +	return vhost_poll_start(poll, sock->file);
> +}
> +
>  static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>  				    struct vhost_virtqueue *vq,
>  				    struct iovec iov[], unsigned int iov_size,

BTW we might want to rename these functions, name no longer
reflects function ...


> @@ -627,6 +653,7 @@ static void handle_rx(struct vhost_net *net)
>  	if (!sock)
>  		goto out;
>  	vhost_disable_notify(&net->dev, vq);
> +	vhost_net_disable_vq(net, vq);
>  
>  	vhost_hlen = nvq->vhost_hlen;
>  	sock_hlen = nvq->sock_hlen;
> @@ -715,9 +742,10 @@ static void handle_rx(struct vhost_net *net)
>  		total_len += vhost_len;
>  		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
>  			vhost_poll_queue(&vq->poll);
> -			break;
> +			goto out;
>  		}
>  	}
> +	vhost_net_enable_vq(net, vq);

OK so if sock is readable but RX VQ is empty, this will
immediately schedule another round of handle_rx and so ad
infinitum,

Looks like a bug.


>  out:
>  	mutex_unlock(&vq->mutex);
>  }
> @@ -796,32 +824,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
>  	return 0;
>  }
>  
> -static void vhost_net_disable_vq(struct vhost_net *n,
> -				 struct vhost_virtqueue *vq)
> -{
> -	struct vhost_net_virtqueue *nvq =
> -		container_of(vq, struct vhost_net_virtqueue, vq);
> -	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
> -	if (!vq->private_data)
> -		return;
> -	vhost_poll_stop(poll);
> -}
> -
> -static int vhost_net_enable_vq(struct vhost_net *n,
> -				struct vhost_virtqueue *vq)
> -{
> -	struct vhost_net_virtqueue *nvq =
> -		container_of(vq, struct vhost_net_virtqueue, vq);
> -	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
> -	struct socket *sock;
> -
> -	sock = vq->private_data;
> -	if (!sock)
> -		return 0;
> -
> -	return vhost_poll_start(poll, sock->file);
> -}
> -
>  static struct socket *vhost_net_stop_vq(struct vhost_net *n,
>  					struct vhost_virtqueue *vq)
>  {
> -- 
> 1.8.3.1

^ permalink raw reply

* Re: [PATCH V2 2/2] vhost_net: conditionally enable tx polling
From: Michael S. Tsirkin @ 2016-05-30 15:55 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <1464590874-39539-3-git-send-email-jasowang@redhat.com>

On Mon, May 30, 2016 at 02:47:54AM -0400, Jason Wang wrote:
> We always poll tx for socket, this is sub optimal since:
> 
> - it will be only used when we exceed the sndbuf of the socket.
> - since we use two independent polls for tx and vq, this will slightly
>   increase the waitqueue traversing time and more important, vhost
>   could not benefit from commit
>   9e641bdcfa4ef4d6e2fbaa59c1be0ad5d1551fd5 ("net-tun: restructure
>   tun_do_read for better sleep/wakeup efficiency") even if we've
>   stopped rx polling during handle_rx since tx poll were still left in
>   the waitqueue.

Why is this an issue?
sock_def_write_space only wakes up when queue is half empty,
not on each packet.
        if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)

I suspect the issue is with your previous patch,
it now pokes at the spinlock on data path
where it used not to.

Is that right?


> 
> Fix this by conditionally enable tx polling only when -EAGAIN were
> met.
> 
> Test shows about 8% improvement on guest rx pps.
> 
> Before: ~1350000
> After:  ~1460000
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/net.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index e91603b..5a05fa0 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -378,6 +378,7 @@ static void handle_tx(struct vhost_net *net)
>  		goto out;
>  
>  	vhost_disable_notify(&net->dev, vq);
> +	vhost_net_disable_vq(net, vq);
>  
>  	hdr_size = nvq->vhost_hlen;
>  	zcopy = nvq->ubufs;
> @@ -459,6 +460,8 @@ static void handle_tx(struct vhost_net *net)
>  					% UIO_MAXIOV;
>  			}
>  			vhost_discard_vq_desc(vq, 1);
> +			if (err == -EAGAIN)
> +				vhost_net_enable_vq(net, vq);
>  			break;
>  		}
>  		if (err != len)
> -- 
> 1.8.3.1

^ permalink raw reply

* Re: PATCH v6v2 02/12] mm: migrate: support non-lru movable page migration
From: Minchan Kim @ 2016-05-30 16:25 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Rik van Riel, Sergey Senozhatsky, Rafael Aquini, Jonathan Corbet,
	Hugh Dickins, linux-kernel, dri-devel, virtualization,
	John Einar Reitan, linux-mm, Mel Gorman, Andrew Morton,
	Joonsoo Kim, Gioh Kim
In-Reply-To: <2bc277c4-4257-c6cb-2e37-ee5de985410b@suse.cz>

On Mon, May 30, 2016 at 11:36:07AM +0200, Vlastimil Babka wrote:
> On 05/30/2016 03:39 AM, Minchan Kim wrote:
> >After isolation, VM calls migratepage of driver with isolated page.
> >The function of migratepage is to move content of the old page to new page
> >and set up fields of struct page newpage. Keep in mind that you should
> >clear PG_movable of oldpage via __ClearPageMovable under page_lock if you
> >migrated the oldpage successfully and returns 0.
> 
> This "clear PG_movable" is one of the reasons I was confused about
> what __ClearPageMovable() really does. There's no actual
> "PG_movable" page flag and the function doesn't clear even the
> actual mapping flag :) Also same thing in the Documentation/ part.
> 
> Something like "... you should indicate to the VM that the oldpage
> is no longer movable via __ClearPageMovable() ..."?

It's better. I will change it.

> 
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -81,6 +81,39 @@ static inline bool migrate_async_suitable(int migratetype)
> >
> > #ifdef CONFIG_COMPACTION
> >
> >+int PageMovable(struct page *page)
> >+{
> >+	struct address_space *mapping;
> >+
> >+	VM_BUG_ON_PAGE(!PageLocked(page), page);
> >+	if (!__PageMovable(page))
> >+		return 0;
> >+
> >+	mapping = page_mapping(page);
> >+	if (mapping && mapping->a_ops && mapping->a_ops->isolate_page)
> >+		return 1;
> >+
> >+	return 0;
> >+}
> >+EXPORT_SYMBOL(PageMovable);
> >+
> >+void __SetPageMovable(struct page *page, struct address_space *mapping)
> >+{
> >+	VM_BUG_ON_PAGE(!PageLocked(page), page);
> >+	VM_BUG_ON_PAGE((unsigned long)mapping & PAGE_MAPPING_MOVABLE, page);
> >+	page->mapping = (void *)((unsigned long)mapping | PAGE_MAPPING_MOVABLE);
> >+}
> >+EXPORT_SYMBOL(__SetPageMovable);
> >+
> >+void __ClearPageMovable(struct page *page)
> >+{
> >+	VM_BUG_ON_PAGE(!PageLocked(page), page);
> >+	VM_BUG_ON_PAGE(!PageMovable(page), page);
> >+	page->mapping = (void *)((unsigned long)page->mapping &
> >+				PAGE_MAPPING_MOVABLE);
> >+}
> >+EXPORT_SYMBOL(__ClearPageMovable);
> 
> The second confusing thing is that the function is named
> __ClearPageMovable(), but what it really clears is the mapping
> pointer,
> which is not at all the opposite of what __SetPageMovable() does.
> 
> I know it's explained in the documentation, but it also deserves a
> comment here so it doesn't confuse everyone who looks at it.
> Even better would be a less confusing name for the function, but I
> can't offer one right now.

To me, __ClearPageMovable naming is suitable for user POV.
It effectively makes the page unmovable. The confusion is just caused
by the implementation and I don't prefer exported API depends on the
implementation. So I want to add just comment.

I didn't add comment above the function because I don't want to export
internal implementation to the user. I think they don't need to know it.

index a7df2ae71f2a..d1d2063b4fd9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -108,6 +108,11 @@ void __ClearPageMovable(struct page *page)
 {
        VM_BUG_ON_PAGE(!PageLocked(page), page);
        VM_BUG_ON_PAGE(!PageMovable(page), page);
+       /*
+        * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE
+        * flag so that VM can catch up released page by driver after isolation.
+        * With it, VM migration doesn't try to put it back.
+        */
        page->mapping = (void *)((unsigned long)page->mapping &
                                PAGE_MAPPING_MOVABLE);

> 
> >diff --git a/mm/util.c b/mm/util.c
> >index 917e0e3d0f8e..b756ee36f7f0 100644
> >--- a/mm/util.c
> >+++ b/mm/util.c
> >@@ -399,10 +399,12 @@ struct address_space *page_mapping(struct page *page)
> > 	}
> >
> > 	mapping = page->mapping;
> 
> I'd probably use READ_ONCE() here to be safe. Not all callers are
> under page lock?

I don't understand. Yeah, all caller are not under page lock but at least,
new user of movable pages should call it under page_lock.
Yeah, I will write the rule down in document.
In this case, what kinds of problem do you see?

> 
> >-	if ((unsigned long)mapping & PAGE_MAPPING_FLAGS)
> >+	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
> > 		return NULL;
> >-	return mapping;
> >+
> >+	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
> > }
> >+EXPORT_SYMBOL(page_mapping);
> >
> > /* Slow path of page_mapcount() for compound pages */
> > int __page_mapcount(struct page *page)
> >
> 

^ permalink raw reply related

* [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration
From: Minchan Kim @ 2016-05-31  0:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rik van Riel, Sergey Senozhatsky, Rafael Aquini, Jonathan Corbet,
	Hugh Dickins, linux-kernel, dri-devel, virtualization,
	John Einar Reitan, linux-mm, Gioh Kim, Mel Gorman, Joonsoo Kim,
	Vlastimil Babka
In-Reply-To: <20160530013926.GB8683@bbox>

Per Vlastimi's review comment.

Thanks for the detail review, Vlastimi!
If you have another concern, feel free to say.
After I resolve all thing, I will send v7 rebased on recent mmotm.

From b14aaeeeeb2d3ac0702c7b2eec36409d74406d43 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Fri, 8 Apr 2016 10:34:49 +0900
Subject: [PATCH] mm: migrate: support non-lru movable page migration

We have allowed migration for only LRU pages until now and it was
enough to make high-order pages. But recently, embedded system(e.g.,
webOS, android) uses lots of non-movable pages(e.g., zram, GPU memory)
so we have seen several reports about troubles of small high-order
allocation. For fixing the problem, there were several efforts
(e,g,. enhance compaction algorithm, SLUB fallback to 0-order page,
reserved memory, vmalloc and so on) but if there are lots of
non-movable pages in system, their solutions are void in the long run.

So, this patch is to support facility to change non-movable pages
with movable. For the feature, this patch introduces functions related
to migration to address_space_operations as well as some page flags.

If a driver want to make own pages movable, it should define three functions
which are function pointers of struct address_space_operations.

1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);

What VM expects on isolate_page function of driver is to return *true*
if driver isolates page successfully. On returing true, VM marks the page
as PG_isolated so concurrent isolation in several CPUs skip the page
for isolation. If a driver cannot isolate the page, it should return *false*.

Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in that fields.

2. int (*migratepage) (struct address_space *mapping,
		struct page *newpage, struct page *oldpage, enum migrate_mode);

After isolation, VM calls migratepage of driver with isolated page.
The function of migratepage is to move content of the old page to new page
and set up fields of struct page newpage. Keep in mind that you should
indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
under page_lock if you migrated the oldpage successfully and returns 0.
If driver cannot migrate the page at the moment, driver can return -EAGAIN.
On -EAGAIN, VM will retry page migration in a short time because VM interprets
-EAGAIN as "temporal migration failure". On returning any error except -EAGAIN,
VM will give up the page migration without retrying in this time.

Driver shouldn't touch page.lru field VM using in the functions.

3. void (*putback_page)(struct page *);

If migration fails on isolated page, VM should return the isolated page
to the driver so VM calls driver's putback_page with migration failed page.
In this function, driver should put the isolated page back to the own data
structure.

4. non-lru movable page flags

There are two page flags for supporting non-lru movable page.

* PG_movable

Driver should use the below function to make page movable under page_lock.

	void __SetPageMovable(struct page *page, struct address_space *mapping)

It needs argument of address_space for registering migration family functions
which will be called by VM. Exactly speaking, PG_movable is not a real flag of
struct page. Rather than, VM reuses page->mapping's lower bits to represent it.

	#define PAGE_MAPPING_MOVABLE 0x2
	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;

so driver shouldn't access page->mapping directly. Instead, driver should
use page_mapping which mask off the low two bits of page->mapping so it can get
right struct address_space.

For testing of non-lru movable page, VM supports __PageMovable function.
However, it doesn't guarantee to identify non-lru movable page because
page->mapping field is unified with other variables in struct page.
As well, if driver releases the page after isolation by VM, page->mapping
doesn't have stable value although it has PAGE_MAPPING_MOVABLE
(Look at __ClearPageMovable). But __PageMovable is cheap to catch whether
page is LRU or non-lru movable once the page has been isolated. Because
LRU pages never can have PAGE_MAPPING_MOVABLE in page->mapping. It is also
good for just peeking to test non-lru movable pages before more expensive
checking with lock_page in pfn scanning to select victim.

For guaranteeing non-lru movable page, VM provides PageMovable function.
Unlike __PageMovable, PageMovable functions validates page->mapping and
mapping->a_ops->isolate_page under lock_page. The lock_page prevents sudden
destroying of page->mapping.

Driver using __SetPageMovable should clear the flag via __ClearMovablePage
under page_lock before the releasing the page.

* PG_isolated

To prevent concurrent isolation among several CPUs, VM marks isolated page
as PG_isolated under lock_page. So if a CPU encounters PG_isolated non-lru
movable page, it can skip it. Driver doesn't need to manipulate the flag
because VM will set/clear it automatically. Keep in mind that if driver
sees PG_isolated page, it means the page have been isolated by VM so it
shouldn't touch page.lru field.
PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
for own purpose.

Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: John Einar Reitan <john.reitan@foss.arm.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/filesystems/Locking |   4 +
 Documentation/filesystems/vfs.txt |  11 +++
 Documentation/vm/page_migration   | 107 ++++++++++++++++++++-
 include/linux/compaction.h        |  17 ++++
 include/linux/fs.h                |   2 +
 include/linux/ksm.h               |   3 +-
 include/linux/migrate.h           |   2 +
 include/linux/mm.h                |   1 +
 include/linux/page-flags.h        |  33 +++++--
 mm/compaction.c                   |  85 +++++++++++++----
 mm/ksm.c                          |   4 +-
 mm/migrate.c                      | 192 ++++++++++++++++++++++++++++++++++----
 mm/page_alloc.c                   |   2 +-
 mm/util.c                         |   6 +-
 14 files changed, 417 insertions(+), 52 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index af7c030a0368..3991a976cf43 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -195,7 +195,9 @@ unlocks and drops the reference.
 	int (*releasepage) (struct page *, int);
 	void (*freepage)(struct page *);
 	int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
+	bool (*isolate_page) (struct page *, isolate_mode_t);
 	int (*migratepage)(struct address_space *, struct page *, struct page *);
+	void (*putback_page) (struct page *);
 	int (*launder_page)(struct page *);
 	int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
 	int (*error_remove_page)(struct address_space *, struct page *);
@@ -219,7 +221,9 @@ invalidatepage:		yes
 releasepage:		yes
 freepage:		yes
 direct_IO:
+isolate_page:		yes
 migratepage:		yes (both)
+putback_page:		yes
 launder_page:		yes
 is_partially_uptodate:	yes
 error_remove_page:	yes
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 19366fef2652..9d4ae317fdcb 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -591,9 +591,14 @@ struct address_space_operations {
 	int (*releasepage) (struct page *, int);
 	void (*freepage)(struct page *);
 	ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
+	/* isolate a page for migration */
+	bool (*isolate_page) (struct page *, isolate_mode_t);
 	/* migrate the contents of a page to the specified target */
 	int (*migratepage) (struct page *, struct page *);
+	/* put migration-failed page back to right list */
+	void (*putback_page) (struct page *);
 	int (*launder_page) (struct page *);
+
 	int (*is_partially_uptodate) (struct page *, unsigned long,
 					unsigned long);
 	void (*is_dirty_writeback) (struct page *, bool *, bool *);
@@ -739,6 +744,10 @@ struct address_space_operations {
         and transfer data directly between the storage and the
         application's address space.
 
+  isolate_page: Called by the VM when isolating a movable non-lru page.
+	If page is successfully isolated, VM marks the page as PG_isolated
+	via __SetPageIsolated.
+
   migrate_page:  This is used to compact the physical memory usage.
         If the VM wants to relocate a page (maybe off a memory card
         that is signalling imminent failure) it will pass a new page
@@ -746,6 +755,8 @@ struct address_space_operations {
 	transfer any private data across and update any references
         that it has to the page.
 
+  putback_page: Called by the VM when isolated page's migration fails.
+
   launder_page: Called before freeing a page - it writes back the dirty page. To
   	prevent redirtying the page, it is kept locked during the whole
 	operation.
diff --git a/Documentation/vm/page_migration b/Documentation/vm/page_migration
index fea5c0864170..18d37c7ac50b 100644
--- a/Documentation/vm/page_migration
+++ b/Documentation/vm/page_migration
@@ -142,5 +142,110 @@ is increased so that the page cannot be freed while page migration occurs.
 20. The new page is moved to the LRU and can be scanned by the swapper
     etc again.
 
-Christoph Lameter, May 8, 2006.
+C. Non-LRU page migration
+-------------------------
+
+Although original migration aimed for reducing the latency of memory access
+for NUMA, compaction who want to create high-order page is also main customer.
+
+Current problem of the implementation is that it is designed to migrate only
+*LRU* pages. However, there are potential non-lru pages which can be migrated
+in drivers, for example, zsmalloc, virtio-balloon pages.
+
+For virtio-balloon pages, some parts of migration code path have been hooked
+up and added virtio-balloon specific functions to intercept migration logics.
+It's too specific to a driver so other drivers who want to make their pages
+movable would have to add own specific hooks in migration path.
+
+To overclome the problem, VM supports non-LRU page migration which provides
+generic functions for non-LRU movable pages without driver specific hooks
+migration path.
+
+If a driver want to make own pages movable, it should define three functions
+which are function pointers of struct address_space_operations.
+
+1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);
+
+What VM expects on isolate_page function of driver is to return *true*
+if driver isolates page successfully. On returing true, VM marks the page
+as PG_isolated so concurrent isolation in several CPUs skip the page
+for isolation. If a driver cannot isolate the page, it should return *false*.
+
+Once page is successfully isolated, VM uses page.lru fields so driver
+shouldn't expect to preserve values in that fields.
+
+2. int (*migratepage) (struct address_space *mapping,
+		struct page *newpage, struct page *oldpage, enum migrate_mode);
+
+After isolation, VM calls migratepage of driver with isolated page.
+The function of migratepage is to move content of the old page to new page
+and set up fields of struct page newpage. Keep in mind that you should
+indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
+under page_lock if you migrated the oldpage successfully and returns 0.
+If driver cannot migrate the page at the moment, driver can return -EAGAIN.
+On -EAGAIN, VM will retry page migration in a short time because VM interprets
+-EAGAIN as "temporal migration failure". On returning any error except -EAGAIN,
+VM will give up the page migration without retrying in this time.
+
+Driver shouldn't touch page.lru field VM using in the functions.
+
+3. void (*putback_page)(struct page *);
+
+If migration fails on isolated page, VM should return the isolated page
+to the driver so VM calls driver's putback_page with migration failed page.
+In this function, driver should put the isolated page back to the own data
+structure.
 
+4. non-lru movable page flags
+
+There are two page flags for supporting non-lru movable page.
+
+* PG_movable
+
+Driver should use the below function to make page movable under page_lock.
+
+	void __SetPageMovable(struct page *page, struct address_space *mapping)
+
+It needs argument of address_space for registering migration family functions
+which will be called by VM. Exactly speaking, PG_movable is not a real flag of
+struct page. Rather than, VM reuses page->mapping's lower bits to represent it.
+
+	#define PAGE_MAPPING_MOVABLE 0x2
+	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;
+
+so driver shouldn't access page->mapping directly. Instead, driver should
+use page_mapping which mask off the low two bits of page->mapping under
+page lock so it can get right struct address_space.
+
+For testing of non-lru movable page, VM supports __PageMovable function.
+However, it doesn't guarantee to identify non-lru movable page because
+page->mapping field is unified with other variables in struct page.
+As well, if driver releases the page after isolation by VM, page->mapping
+doesn't have stable value although it has PAGE_MAPPING_MOVABLE
+(Look at __ClearPageMovable). But __PageMovable is cheap to catch whether
+page is LRU or non-lru movable once the page has been isolated. Because
+LRU pages never can have PAGE_MAPPING_MOVABLE in page->mapping. It is also
+good for just peeking to test non-lru movable pages before more expensive
+checking with lock_page in pfn scanning to select victim.
+
+For guaranteeing non-lru movable page, VM provides PageMovable function.
+Unlike __PageMovable, PageMovable functions validates page->mapping and
+mapping->a_ops->isolate_page under lock_page. The lock_page prevents sudden
+destroying of page->mapping.
+
+Driver using __SetPageMovable should clear the flag via __ClearMovablePage
+under page_lock before the releasing the page.
+
+* PG_isolated
+
+To prevent concurrent isolation among several CPUs, VM marks isolated page
+as PG_isolated under lock_page. So if a CPU encounters PG_isolated non-lru
+movable page, it can skip it. Driver doesn't need to manipulate the flag
+because VM will set/clear it automatically. Keep in mind that if driver
+sees PG_isolated page, it means the page have been isolated by VM so it
+shouldn't touch page.lru field.
+PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
+for own purpose.
+
+Christoph Lameter, May 8, 2006.
+Minchan Kim, Mar 28, 2016.
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index a58c852a268f..c6b47c861cea 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -54,6 +54,9 @@ enum compact_result {
 struct alloc_context; /* in mm/internal.h */
 
 #ifdef CONFIG_COMPACTION
+extern int PageMovable(struct page *page);
+extern void __SetPageMovable(struct page *page, struct address_space *mapping);
+extern void __ClearPageMovable(struct page *page);
 extern int sysctl_compact_memory;
 extern int sysctl_compaction_handler(struct ctl_table *table, int write,
 			void __user *buffer, size_t *length, loff_t *ppos);
@@ -151,6 +154,19 @@ extern void kcompactd_stop(int nid);
 extern void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx);
 
 #else
+static inline int PageMovable(struct page *page)
+{
+	return 0;
+}
+static inline void __SetPageMovable(struct page *page,
+			struct address_space *mapping)
+{
+}
+
+static inline void __ClearPageMovable(struct page *page)
+{
+}
+
 static inline enum compact_result try_to_compact_pages(gfp_t gfp_mask,
 			unsigned int order, int alloc_flags,
 			const struct alloc_context *ac,
@@ -212,6 +228,7 @@ static inline void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_i
 #endif /* CONFIG_COMPACTION */
 
 #if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
+struct node;
 extern int compaction_register_node(struct node *node);
 extern void compaction_unregister_node(struct node *node);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0cfdf2aec8f7..39ef97414033 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -402,6 +402,8 @@ struct address_space_operations {
 	 */
 	int (*migratepage) (struct address_space *,
 			struct page *, struct page *, enum migrate_mode);
+	bool (*isolate_page)(struct page *, isolate_mode_t);
+	void (*putback_page)(struct page *);
 	int (*launder_page) (struct page *);
 	int (*is_partially_uptodate) (struct page *, unsigned long,
 					unsigned long);
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 7ae216a39c9e..481c8c4627ca 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -43,8 +43,7 @@ static inline struct stable_node *page_stable_node(struct page *page)
 static inline void set_page_stable_node(struct page *page,
 					struct stable_node *stable_node)
 {
-	page->mapping = (void *)stable_node +
-				(PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
+	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_KSM);
 }
 
 /*
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 9b50325e4ddf..404fbfefeb33 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -37,6 +37,8 @@ extern int migrate_page(struct address_space *,
 			struct page *, struct page *, enum migrate_mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
+extern bool isolate_movable_page(struct page *page, isolate_mode_t mode);
+extern void putback_movable_page(struct page *page);
 
 extern int migrate_prep(void);
 extern int migrate_prep_local(void);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a00ec816233a..33eaec57e997 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1035,6 +1035,7 @@ static inline pgoff_t page_file_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
+struct address_space *page_mapping(struct page *page);
 
 /*
  * Return true only if the page has been allocated with
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a32445f930..f36dbb3a3060 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -129,6 +129,9 @@ enum pageflags {
 
 	/* Compound pages. Stored in first tail page's flags */
 	PG_double_map = PG_private_2,
+
+	/* non-lru isolated movable page */
+	PG_isolated = PG_reclaim,
 };
 
 #ifndef __GENERATING_BOUNDS_H
@@ -357,29 +360,37 @@ PAGEFLAG(Idle, idle, PF_ANY)
  * with the PAGE_MAPPING_ANON bit set to distinguish it.  See rmap.h.
  *
  * On an anonymous page in a VM_MERGEABLE area, if CONFIG_KSM is enabled,
- * the PAGE_MAPPING_KSM bit may be set along with the PAGE_MAPPING_ANON bit;
- * and then page->mapping points, not to an anon_vma, but to a private
+ * the PAGE_MAPPING_MOVABLE bit may be set along with the PAGE_MAPPING_ANON
+ * bit; and then page->mapping points, not to an anon_vma, but to a private
  * structure which KSM associates with that merged page.  See ksm.h.
  *
- * PAGE_MAPPING_KSM without PAGE_MAPPING_ANON is currently never used.
+ * PAGE_MAPPING_KSM without PAGE_MAPPING_ANON is used for non-lru movable
+ * page and then page->mapping points a struct address_space.
  *
  * Please note that, confusingly, "page_mapping" refers to the inode
  * address_space which maps the page from disk; whereas "page_mapped"
  * refers to user virtual address space into which the page is mapped.
  */
-#define PAGE_MAPPING_ANON	1
-#define PAGE_MAPPING_KSM	2
-#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_KSM)
+#define PAGE_MAPPING_ANON	0x1
+#define PAGE_MAPPING_MOVABLE	0x2
+#define PAGE_MAPPING_KSM	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
+#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
 
-static __always_inline int PageAnonHead(struct page *page)
+static __always_inline int PageMappingFlags(struct page *page)
 {
-	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
 static __always_inline int PageAnon(struct page *page)
 {
 	page = compound_head(page);
-	return PageAnonHead(page);
+	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline int __PageMovable(struct page *page)
+{
+	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+				PAGE_MAPPING_MOVABLE;
 }
 
 #ifdef CONFIG_KSM
@@ -393,7 +404,7 @@ static __always_inline int PageKsm(struct page *page)
 {
 	page = compound_head(page);
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
-				(PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
+				PAGE_MAPPING_KSM;
 }
 #else
 TESTPAGEFLAG_FALSE(Ksm)
@@ -641,6 +652,8 @@ static inline void __ClearPageBalloon(struct page *page)
 	atomic_set(&page->_mapcount, -1);
 }
 
+__PAGEFLAG(Isolated, isolated, PF_ANY);
+
 /*
  * If network-based swap is enabled, sl*b must keep track of whether pages
  * were allocated from pfmemalloc reserves.
diff --git a/mm/compaction.c b/mm/compaction.c
index 1427366ad673..a680b52e190b 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -81,6 +81,44 @@ static inline bool migrate_async_suitable(int migratetype)
 
 #ifdef CONFIG_COMPACTION
 
+int PageMovable(struct page *page)
+{
+	struct address_space *mapping;
+
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	if (!__PageMovable(page))
+		return 0;
+
+	mapping = page_mapping(page);
+	if (mapping && mapping->a_ops && mapping->a_ops->isolate_page)
+		return 1;
+
+	return 0;
+}
+EXPORT_SYMBOL(PageMovable);
+
+void __SetPageMovable(struct page *page, struct address_space *mapping)
+{
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE((unsigned long)mapping & PAGE_MAPPING_MOVABLE, page);
+	page->mapping = (void *)((unsigned long)mapping | PAGE_MAPPING_MOVABLE);
+}
+EXPORT_SYMBOL(__SetPageMovable);
+
+void __ClearPageMovable(struct page *page)
+{
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(!PageMovable(page), page);
+	/*
+	 * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE
+	 * flag so that VM can catch up released page by driver after isolation.
+	 * With it, VM migration doesn't try to put it back.
+	 */
+	page->mapping = (void *)((unsigned long)page->mapping &
+				PAGE_MAPPING_MOVABLE);
+}
+EXPORT_SYMBOL(__ClearPageMovable);
+
 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_DEFER_SHIFT 6
 
@@ -735,21 +773,6 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		}
 
 		/*
-		 * Check may be lockless but that's ok as we recheck later.
-		 * It's possible to migrate LRU pages and balloon pages
-		 * Skip any other type of page
-		 */
-		is_lru = PageLRU(page);
-		if (!is_lru) {
-			if (unlikely(balloon_page_movable(page))) {
-				if (balloon_page_isolate(page)) {
-					/* Successfully isolated */
-					goto isolate_success;
-				}
-			}
-		}
-
-		/*
 		 * Regardless of being on LRU, compound pages such as THP and
 		 * hugetlbfs are not to be compacted. We can potentially save
 		 * a lot of iterations if we skip them at once. The check is
@@ -765,8 +788,38 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			goto isolate_fail;
 		}
 
-		if (!is_lru)
+		/*
+		 * Check may be lockless but that's ok as we recheck later.
+		 * It's possible to migrate LRU and non-lru movable pages.
+		 * Skip any other type of page
+		 */
+		is_lru = PageLRU(page);
+		if (!is_lru) {
+			if (unlikely(balloon_page_movable(page))) {
+				if (balloon_page_isolate(page)) {
+					/* Successfully isolated */
+					goto isolate_success;
+				}
+			}
+
+			/*
+			 * __PageMovable can return false positive so we need
+			 * to verify it under page_lock.
+			 */
+			if (unlikely(__PageMovable(page)) &&
+					!PageIsolated(page)) {
+				if (locked) {
+					spin_unlock_irqrestore(&zone->lru_lock,
+									flags);
+					locked = false;
+				}
+
+				if (isolate_movable_page(page, isolate_mode))
+					goto isolate_success;
+			}
+
 			goto isolate_fail;
+		}
 
 		/*
 		 * Migration will fail if an anonymous page is pinned in memory,
diff --git a/mm/ksm.c b/mm/ksm.c
index 4786b4150f62..35b8aef867a9 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -532,8 +532,8 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
 	void *expected_mapping;
 	unsigned long kpfn;
 
-	expected_mapping = (void *)stable_node +
-				(PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
+	expected_mapping = (void *)((unsigned long)stable_node |
+					PAGE_MAPPING_KSM);
 again:
 	kpfn = READ_ONCE(stable_node->kpfn);
 	page = pfn_to_page(kpfn);
diff --git a/mm/migrate.c b/mm/migrate.c
index 2666f28b5236..60abcf379b51 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -31,6 +31,7 @@
 #include <linux/vmalloc.h>
 #include <linux/security.h>
 #include <linux/backing-dev.h>
+#include <linux/compaction.h>
 #include <linux/syscalls.h>
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
@@ -73,6 +74,81 @@ int migrate_prep_local(void)
 	return 0;
 }
 
+bool isolate_movable_page(struct page *page, isolate_mode_t mode)
+{
+	struct address_space *mapping;
+
+	/*
+	 * Avoid burning cycles with pages that are yet under __free_pages(),
+	 * or just got freed under us.
+	 *
+	 * In case we 'win' a race for a movable page being freed under us and
+	 * raise its refcount preventing __free_pages() from doing its job
+	 * the put_page() at the end of this block will take care of
+	 * release this page, thus avoiding a nasty leakage.
+	 */
+	if (unlikely(!get_page_unless_zero(page)))
+		goto out;
+
+	/*
+	 * Check PageMovable before holding a PG_lock because page's owner
+	 * assumes anybody doesn't touch PG_lock of newly allocated page
+	 * so unconditionally grapping the lock ruins page's owner side.
+	 */
+	if (unlikely(!__PageMovable(page)))
+		goto out_putpage;
+	/*
+	 * As movable pages are not isolated from LRU lists, concurrent
+	 * compaction threads can race against page migration functions
+	 * as well as race against the releasing a page.
+	 *
+	 * In order to avoid having an already isolated movable page
+	 * being (wrongly) re-isolated while it is under migration,
+	 * or to avoid attempting to isolate pages being released,
+	 * lets be sure we have the page lock
+	 * before proceeding with the movable page isolation steps.
+	 */
+	if (unlikely(!trylock_page(page)))
+		goto out_putpage;
+
+	if (!PageMovable(page) || PageIsolated(page))
+		goto out_no_isolated;
+
+	mapping = page_mapping(page);
+	VM_BUG_ON_PAGE(!mapping, page);
+
+	if (!mapping->a_ops->isolate_page(page, mode))
+		goto out_no_isolated;
+
+	/* Driver shouldn't use PG_isolated bit of page->flags */
+	WARN_ON_ONCE(PageIsolated(page));
+	__SetPageIsolated(page);
+	unlock_page(page);
+
+	return true;
+
+out_no_isolated:
+	unlock_page(page);
+out_putpage:
+	put_page(page);
+out:
+	return false;
+}
+
+/* It should be called on page which is PG_movable */
+void putback_movable_page(struct page *page)
+{
+	struct address_space *mapping;
+
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(!PageMovable(page), page);
+	VM_BUG_ON_PAGE(!PageIsolated(page), page);
+
+	mapping = page_mapping(page);
+	mapping->a_ops->putback_page(page);
+	__ClearPageIsolated(page);
+}
+
 /*
  * Put previously isolated pages back onto the appropriate lists
  * from where they were once taken off for compaction/migration.
@@ -94,10 +170,25 @@ void putback_movable_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (unlikely(isolated_balloon_page(page)))
+		if (unlikely(isolated_balloon_page(page))) {
 			balloon_page_putback(page);
-		else
+		/*
+		 * We isolated non-lru movable page so here we can use
+		 * __PageMovable because LRU page's mapping cannot have
+		 * PAGE_MAPPING_MOVABLE.
+		 */
+		} else if (unlikely(__PageMovable(page))) {
+			VM_BUG_ON_PAGE(!PageIsolated(page), page);
+			lock_page(page);
+			if (PageMovable(page))
+				putback_movable_page(page);
+			else
+				__ClearPageIsolated(page);
+			unlock_page(page);
+			put_page(page);
+		} else {
 			putback_lru_page(page);
+		}
 	}
 }
 
@@ -592,7 +683,7 @@ void migrate_page_copy(struct page *newpage, struct page *page)
  ***********************************************************/
 
 /*
- * Common logic to directly migrate a single page suitable for
+ * Common logic to directly migrate a single LRU page suitable for
  * pages that do not use PagePrivate/PagePrivate2.
  *
  * Pages are locked upon entry and exit.
@@ -755,33 +846,72 @@ static int move_to_new_page(struct page *newpage, struct page *page,
 				enum migrate_mode mode)
 {
 	struct address_space *mapping;
-	int rc;
+	int rc = -EAGAIN;
+	bool is_lru = !__PageMovable(page);
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
 
 	mapping = page_mapping(page);
-	if (!mapping)
-		rc = migrate_page(mapping, newpage, page, mode);
-	else if (mapping->a_ops->migratepage)
+
+	if (likely(is_lru)) {
+		if (!mapping)
+			rc = migrate_page(mapping, newpage, page, mode);
+		else if (mapping->a_ops->migratepage)
+			/*
+			 * Most pages have a mapping and most filesystems
+			 * provide a migratepage callback. Anonymous pages
+			 * are part of swap space which also has its own
+			 * migratepage callback. This is the most common path
+			 * for page migration.
+			 */
+			rc = mapping->a_ops->migratepage(mapping, newpage,
+							page, mode);
+		else
+			rc = fallback_migrate_page(mapping, newpage,
+							page, mode);
+	} else {
 		/*
-		 * Most pages have a mapping and most filesystems provide a
-		 * migratepage callback. Anonymous pages are part of swap
-		 * space which also has its own migratepage callback. This
-		 * is the most common path for page migration.
+		 * In case of non-lru page, it could be released after
+		 * isolation step. In that case, we shouldn't try migration.
 		 */
-		rc = mapping->a_ops->migratepage(mapping, newpage, page, mode);
-	else
-		rc = fallback_migrate_page(mapping, newpage, page, mode);
+		VM_BUG_ON_PAGE(!PageIsolated(page), page);
+		if (!PageMovable(page)) {
+			rc = MIGRATEPAGE_SUCCESS;
+			__ClearPageIsolated(page);
+			goto out;
+		}
+
+		rc = mapping->a_ops->migratepage(mapping, newpage,
+						page, mode);
+		WARN_ON_ONCE(rc == MIGRATEPAGE_SUCCESS &&
+			!PageIsolated(page));
+	}
 
 	/*
 	 * When successful, old pagecache page->mapping must be cleared before
 	 * page is freed; but stats require that PageAnon be left as PageAnon.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		if (!PageAnon(page))
+		if (__PageMovable(page)) {
+			VM_BUG_ON_PAGE(!PageIsolated(page), page);
+
+			/*
+			 * We clear PG_movable under page_lock so any compactor
+			 * cannot try to migrate this page.
+			 */
+			__ClearPageIsolated(page);
+		}
+
+		/*
+		 * Anonymous and movable page->mapping will be cleard by
+		 * free_pages_prepare so don't reset it here for keeping
+		 * the type to work PageAnon, for example.
+		 */
+		if (!PageMappingFlags(page))
 			page->mapping = NULL;
 	}
+out:
 	return rc;
 }
 
@@ -791,6 +921,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
 	struct anon_vma *anon_vma = NULL;
+	bool is_lru = !__PageMovable(page);
 
 	if (!trylock_page(page)) {
 		if (!force || mode == MIGRATE_ASYNC)
@@ -871,6 +1002,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		goto out_unlock_both;
 	}
 
+	if (unlikely(!is_lru)) {
+		rc = move_to_new_page(newpage, page, mode);
+		goto out_unlock_both;
+	}
+
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -920,7 +1056,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	 * list in here.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		if (unlikely(__is_movable_balloon_page(newpage)))
+		if (unlikely(__is_movable_balloon_page(newpage) ||
+				__PageMovable(newpage)))
 			put_page(newpage);
 		else
 			putback_lru_page(newpage);
@@ -961,6 +1098,12 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		/* page was freed from under us. So we are done. */
 		ClearPageActive(page);
 		ClearPageUnevictable(page);
+		if (unlikely(__PageMovable(page))) {
+			lock_page(page);
+			if (!PageMovable(page))
+				__ClearPageIsolated(page);
+			unlock_page(page);
+		}
 		if (put_new_page)
 			put_new_page(newpage, private);
 		else
@@ -1010,8 +1153,21 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 				num_poisoned_pages_inc();
 		}
 	} else {
-		if (rc != -EAGAIN)
-			putback_lru_page(page);
+		if (rc != -EAGAIN) {
+			if (likely(!__PageMovable(page))) {
+				putback_lru_page(page);
+				goto put_new;
+			}
+
+			lock_page(page);
+			if (PageMovable(page))
+				putback_movable_page(page);
+			else
+				__ClearPageIsolated(page);
+			unlock_page(page);
+			put_page(page);
+		}
+put_new:
 		if (put_new_page)
 			put_new_page(newpage, private);
 		else
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d27e8b968ac3..4d0ba8f6cfee 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1014,7 +1014,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 			(page + i)->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		}
 	}
-	if (PageAnonHead(page))
+	if (PageMappingFlags(page))
 		page->mapping = NULL;
 	if (check_free)
 		bad += free_pages_check(page);
diff --git a/mm/util.c b/mm/util.c
index 917e0e3d0f8e..b756ee36f7f0 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -399,10 +399,12 @@ struct address_space *page_mapping(struct page *page)
 	}
 
 	mapping = page->mapping;
-	if ((unsigned long)mapping & PAGE_MAPPING_FLAGS)
+	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
-	return mapping;
+
+	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
+EXPORT_SYMBOL(page_mapping);
 
 /* Slow path of page_mapcount() for compound pages */
 int __page_mapcount(struct page *page)
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH V2 1/2] vhost_net: stop polling socket during rx processing
From: Jason Wang @ 2016-05-31  3:14 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20160530184211-mutt-send-email-mst@redhat.com>



On 2016年05月30日 23:47, Michael S. Tsirkin wrote:
> On Mon, May 30, 2016 at 02:47:53AM -0400, Jason Wang wrote:
>> We don't stop rx polling socket during rx processing, this will lead
>> unnecessary wakeups from under layer net devices (E.g
>> sock_def_readable() form tun). Rx will be slowed down in this
>> way. This patch avoids this by stop polling socket during rx
>> processing. A small drawback is that this introduces some overheads in
>> light load case because of the extra start/stop polling, but single
>> netperf TCP_RR does not notice any change. In a super heavy load case,
>> e.g using pktgen to inject packet to guest, we get about ~8.8%
>> improvement on pps:
>>
>> before: ~1240000 pkt/s
>> after:  ~1350000 pkt/s
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/net.c | 56 +++++++++++++++++++++++++++--------------------------
>>   1 file changed, 29 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 10ff494..e91603b 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -301,6 +301,32 @@ static bool vhost_can_busy_poll(struct vhost_dev *dev,
>>   	       !vhost_has_work(dev);
>>   }
>>   
>> +static void vhost_net_disable_vq(struct vhost_net *n,
>> +				 struct vhost_virtqueue *vq)
>> +{
>> +	struct vhost_net_virtqueue *nvq =
>> +		container_of(vq, struct vhost_net_virtqueue, vq);
>> +	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
>> +	if (!vq->private_data)
>> +		return;
>> +	vhost_poll_stop(poll);
>> +}
>> +
>> +static int vhost_net_enable_vq(struct vhost_net *n,
>> +				struct vhost_virtqueue *vq)
>> +{
>> +	struct vhost_net_virtqueue *nvq =
>> +		container_of(vq, struct vhost_net_virtqueue, vq);
>> +	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
>> +	struct socket *sock;
>> +
>> +	sock = vq->private_data;
>> +	if (!sock)
>> +		return 0;
>> +
>> +	return vhost_poll_start(poll, sock->file);
>> +}
>> +
>>   static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>>   				    struct vhost_virtqueue *vq,
>>   				    struct iovec iov[], unsigned int iov_size,
> BTW we might want to rename these functions, name no longer
> reflects function ...

Do you mean adding something reflect busy polling in the name? Then the 
name may be too long or have suggestion on the name?

>
>
>> @@ -627,6 +653,7 @@ static void handle_rx(struct vhost_net *net)
>>   	if (!sock)
>>   		goto out;
>>   	vhost_disable_notify(&net->dev, vq);
>> +	vhost_net_disable_vq(net, vq);
>>   
>>   	vhost_hlen = nvq->vhost_hlen;
>>   	sock_hlen = nvq->sock_hlen;
>> @@ -715,9 +742,10 @@ static void handle_rx(struct vhost_net *net)
>>   		total_len += vhost_len;
>>   		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
>>   			vhost_poll_queue(&vq->poll);
>> -			break;
>> +			goto out;
>>   		}
>>   	}
>> +	vhost_net_enable_vq(net, vq);
> OK so if sock is readable but RX VQ is empty, this will
> immediately schedule another round of handle_rx and so ad
> infinitum,
>
> Looks like a bug.

Yes it is, will change the above headcount check to:

                 /* OK, now we need to know about added descriptors. */
                 if (!headcount) {
                         if (unlikely(vhost_enable_notify(&net->dev, vq))) {
                                 /* They have slipped one in as we were
                                  * doing that: check again. */
vhost_disable_notify(&net->dev, vq);
                 continue;
                         }
                         /* Nothing new?  Wait for eventfd to tell us
                          * they refilled. */
                         goto out;
                 }


>
>
>>   out:
>>   	mutex_unlock(&vq->mutex);
>>   }
>> @@ -796,32 +824,6 @@ static int vhost_net_open(struct inode *inode, struct file *f)
>>   	return 0;
>>   }
>>   
>> -static void vhost_net_disable_vq(struct vhost_net *n,
>> -				 struct vhost_virtqueue *vq)
>> -{
>> -	struct vhost_net_virtqueue *nvq =
>> -		container_of(vq, struct vhost_net_virtqueue, vq);
>> -	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
>> -	if (!vq->private_data)
>> -		return;
>> -	vhost_poll_stop(poll);
>> -}
>> -
>> -static int vhost_net_enable_vq(struct vhost_net *n,
>> -				struct vhost_virtqueue *vq)
>> -{
>> -	struct vhost_net_virtqueue *nvq =
>> -		container_of(vq, struct vhost_net_virtqueue, vq);
>> -	struct vhost_poll *poll = n->poll + (nvq - n->vqs);
>> -	struct socket *sock;
>> -
>> -	sock = vq->private_data;
>> -	if (!sock)
>> -		return 0;
>> -
>> -	return vhost_poll_start(poll, sock->file);
>> -}
>> -
>>   static struct socket *vhost_net_stop_vq(struct vhost_net *n,
>>   					struct vhost_virtqueue *vq)
>>   {
>> -- 
>> 1.8.3.1
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH V2 2/2] vhost_net: conditionally enable tx polling
From: Jason Wang @ 2016-05-31  3:23 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20160530155521.GA5427@redhat.com>



On 2016年05月30日 23:55, Michael S. Tsirkin wrote:
> On Mon, May 30, 2016 at 02:47:54AM -0400, Jason Wang wrote:
>> We always poll tx for socket, this is sub optimal since:
>>
>> - it will be only used when we exceed the sndbuf of the socket.
>> - since we use two independent polls for tx and vq, this will slightly
>>    increase the waitqueue traversing time and more important, vhost
>>    could not benefit from commit
>>    9e641bdcfa4ef4d6e2fbaa59c1be0ad5d1551fd5 ("net-tun: restructure
>>    tun_do_read for better sleep/wakeup efficiency") even if we've
>>    stopped rx polling during handle_rx since tx poll were still left in
>>    the waitqueue.
> Why is this an issue?
> sock_def_write_space only wakes up when queue is half empty,
> not on each packet.
>          if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)
>
> I suspect the issue is with your previous patch,
> it now pokes at the spinlock on data path
> where it used not to.
>
> Is that right?

The problem is not tx wake up but still rx wake up. Patch 1 removes rx 
poll, but still left tx poll. So in sock_def_readable(), 
skwq_has_sleeper() returns true, we still need to traverse waitqueue and 
touch spinlocks. With this patch, unless a heavy tx load, tx poll were 
disabled, sock_def_readable() can return finish very soon.

>
>
>> Fix this by conditionally enable tx polling only when -EAGAIN were
>> met.
>>
>> Test shows about 8% improvement on guest rx pps.
>>
>> Before: ~1350000
>> After:  ~1460000
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/net.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index e91603b..5a05fa0 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -378,6 +378,7 @@ static void handle_tx(struct vhost_net *net)
>>   		goto out;
>>   
>>   	vhost_disable_notify(&net->dev, vq);
>> +	vhost_net_disable_vq(net, vq);
>>   
>>   	hdr_size = nvq->vhost_hlen;
>>   	zcopy = nvq->ubufs;
>> @@ -459,6 +460,8 @@ static void handle_tx(struct vhost_net *net)
>>   					% UIO_MAXIOV;
>>   			}
>>   			vhost_discard_vq_desc(vq, 1);
>> +			if (err == -EAGAIN)
>> +				vhost_net_enable_vq(net, vq);
>>   			break;
>>   		}
>>   		if (err != len)
>> -- 
>> 1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH v3 7/7] [wip] virtio-gpu: add page flip support
From: Gerd Hoffmann @ 2016-05-31  6:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: open list, dri-devel, open list:VIRTIO GPU DRIVER
In-Reply-To: <20160530144325.GJ27098@phenom.ffwll.local>

  Hi,

> > https://lists.freedesktop.org/archives/dri-devel/2016-May/108772.html
> 
> Hm, smells more like virtio isn't too happy with the default ordering of
> the commit operation. The default is:
> 
> - Disable any crtc/encoders that need to be disabled/change.
> - Bash new plane setup into hw.
> - Enable all crtcs/encoders that need to be enabled/have changed.
> 
> There's two problems:
> - some hw gets real grumpy if you bash in plane state without the crtc
>   state yet matching.
> - if you do runtime pm nothing is enabled and the writes get lost at best,
>   or hang your interconnect at worst.
> 
> That's why you can overwrite atomic_commit_tail, and use something more
> sensible. See for example what it looks like for rockchip. I have a gut
> feeling that should also take care of your troubles. Using the old crtc is
> definitely not what you want.

> Another option is that virtio isn't happy about bashing in plane state for
> disabled crtc. Again helpers have you covered, look at the active_only
> parameter for drm_atomic_helper_commit_planes().

virtio-gpu is a bit simplified compared to real hardware, so there isn't
really separate plane/crtc state.

Right now the virtual outputs are linked to drm_crtc.  To apply any
changes I need to lookup the crtc to figure which virtual output should
be updated.

So, setting active_only should make sure I have a valid crtc pointer on
plane updates, right?  It probably also skips the disable + enable crtc
steps on commit?  What happens when outputs are disabled?

Maybe it makes sense to link our virtual outputs to (primary) planes not
crtcs?

cheers,
  Gerd

^ permalink raw reply

* Re: [PATCH] Add virtio gpu driver.
From: Gerd Hoffmann @ 2016-05-31  6:29 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: virtio-dev, Michael S. Tsirkin, open list:ABI/API, open list,
	open list:DRM DRIVERS, open list:VIRTIO CORE, NET..., Dave Airlie
In-Reply-To: <20160530144755.GK27098@phenom.ffwll.local>

  Hi,

> > Why attach the hotspot to the plane?  Wouldn't it make more sense to
> > make it a framebuffer property?
> 
> We don't have properties on the framebuffer. I guess you /could/ just add
> it internally to struct drm_framebuffer, and not bother exposing to
> userspace. I guess that would be a lot simpler,

Yes.  I can simply stick the hotspot into drm_framebuffer in
drm_mode_cursor_universal() and pick up the values in the driver's plane
update function.

>  but it also means that
> atomic userspace can't use hotspots before we add properties to fbs. And
> doing that is a bit tricky since drm_framebuffer objects are meant to be
> invariant - this assumption is deeply in-grained into the code all over
> the place, everything just compares pointers when semantically it means to
> compare the entire fb (including backing storage pointer/offsets and
> everything).

Hmm, the hotspot location for a given cursor image is invariant too, so
I don't think that would be a big issue.

I'd expect userspace define a bunch of cursors, then switch between them
(instead of defining a single cursor, then constantly updating it).

cheers,
  Gerd

^ permalink raw reply

* Re: [PATCH v3 7/7] [wip] virtio-gpu: add page flip support
From: Daniel Vetter @ 2016-05-31  6:37 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: open list, dri-devel, open list:VIRTIO GPU DRIVER
In-Reply-To: <1464675500.5978.15.camel@redhat.com>

On Tue, May 31, 2016 at 8:18 AM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>> > https://lists.freedesktop.org/archives/dri-devel/2016-May/108772.html
>>
>> Hm, smells more like virtio isn't too happy with the default ordering of
>> the commit operation. The default is:
>>
>> - Disable any crtc/encoders that need to be disabled/change.
>> - Bash new plane setup into hw.
>> - Enable all crtcs/encoders that need to be enabled/have changed.
>>
>> There's two problems:
>> - some hw gets real grumpy if you bash in plane state without the crtc
>>   state yet matching.
>> - if you do runtime pm nothing is enabled and the writes get lost at best,
>>   or hang your interconnect at worst.
>>
>> That's why you can overwrite atomic_commit_tail, and use something more
>> sensible. See for example what it looks like for rockchip. I have a gut
>> feeling that should also take care of your troubles. Using the old crtc is
>> definitely not what you want.
>
>> Another option is that virtio isn't happy about bashing in plane state for
>> disabled crtc. Again helpers have you covered, look at the active_only
>> parameter for drm_atomic_helper_commit_planes().
>
> virtio-gpu is a bit simplified compared to real hardware, so there isn't
> really separate plane/crtc state.
>
> Right now the virtual outputs are linked to drm_crtc.  To apply any
> changes I need to lookup the crtc to figure which virtual output should
> be updated.
>
> So, setting active_only should make sure I have a valid crtc pointer on
> plane updates, right?  It probably also skips the disable + enable crtc
> steps on commit?  What happens when outputs are disabled?
>
> Maybe it makes sense to link our virtual outputs to (primary) planes not
> crtcs?

Nah, I just misunderstood your patch. If it's all about finding the
corresponding crtc, then you're all good. I thought there was some
other reason (like the virtual hw getting upset about certain things).
I'll pick up your patch into my nonblocking atomic branch to make sure
virtio isn't accidentally broken. btw can you pls drop an ack or r-b
onto my virtio conversion? I already added your tested-by.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH] Add virtio gpu driver.
From: Daniel Vetter @ 2016-05-31  6:55 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: virtio-dev, Michael S. Tsirkin, open list:ABI/API, open list,
	open list:DRM DRIVERS, open list:VIRTIO CORE, NET..., Dave Airlie
In-Reply-To: <1464676160.5978.24.camel@redhat.com>

On Tue, May 31, 2016 at 8:29 AM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>>  but it also means that
>> atomic userspace can't use hotspots before we add properties to fbs. And
>> doing that is a bit tricky since drm_framebuffer objects are meant to be
>> invariant - this assumption is deeply in-grained into the code all over
>> the place, everything just compares pointers when semantically it means to
>> compare the entire fb (including backing storage pointer/offsets and
>> everything).
>
> Hmm, the hotspot location for a given cursor image is invariant too, so
> I don't think that would be a big issue.
>
> I'd expect userspace define a bunch of cursors, then switch between them
> (instead of defining a single cursor, then constantly updating it).

Agreed, conceptually it would be a really nice fit to put the hotspot
into the invariant drm_framebuffer. I just meant to say that you need
to add a bit of new uapi to make it happen, since we can't reuse our
existing get/set_prop functions (since those only change props after
the object is created). And we also can't use our existing ioctls to
enumerate properties of an object (since chicken-egg: you want the
list of properties before creating a new framebuffer, so can't use
that framebuffer to enumerate them).

But really not a big concern.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH] Add virtio gpu driver.
From: Pekka Paalanen @ 2016-05-31  7:00 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: virtio-dev, Michael S. Tsirkin, open list:ABI/API, open list,
	open list:DRM DRIVERS, open list:VIRTIO CORE, NET...,
	Daniel Vetter, Dave Airlie
In-Reply-To: <1464676160.5978.24.camel@redhat.com>


[-- Attachment #1.1: Type: text/plain, Size: 1967 bytes --]

On Tue, 31 May 2016 08:29:20 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

>   Hi,
> 
> > > Why attach the hotspot to the plane?  Wouldn't it make more sense to
> > > make it a framebuffer property?  
> > 
> > We don't have properties on the framebuffer. I guess you /could/ just add
> > it internally to struct drm_framebuffer, and not bother exposing to
> > userspace. I guess that would be a lot simpler,  
> 
> Yes.  I can simply stick the hotspot into drm_framebuffer in
> drm_mode_cursor_universal() and pick up the values in the driver's plane
> update function.
> 
> >  but it also means that
> > atomic userspace can't use hotspots before we add properties to fbs. And
> > doing that is a bit tricky since drm_framebuffer objects are meant to be
> > invariant - this assumption is deeply in-grained into the code all over
> > the place, everything just compares pointers when semantically it means to
> > compare the entire fb (including backing storage pointer/offsets and
> > everything).  
> 
> Hmm, the hotspot location for a given cursor image is invariant too, so
> I don't think that would be a big issue.
> 
> I'd expect userspace define a bunch of cursors, then switch between them
> (instead of defining a single cursor, then constantly updating it).

Except updating a single cursor (well, two alternating buffers) is
exactly what Weston does, since there is no "set of cursors". On
Wayland, a cursor is just a regular surface like any other with
arbitrary content from a client, except it happens to be associated
with a pointer device.

Furthermore, in Weston a cursor plane is not special in any way. *Any*
client surface can go on the cursor plane if it fits. Universal planes,
and all that.

That's one existing userspace. I suppose that is sub-optimal for
virtual drivers, isn't it? But what else could Weston do without having
separate paths for "normal DRM" vs. "virtual DRM"?


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] virtio-gpu: fix output lookup
From: Daniel Vetter @ 2016-05-31  7:30 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: open list:VIRTIO GPU DRIVER, open list,
	open list:VIRTIO GPU DRIVER, dri-devel
In-Reply-To: <1464609806-22013-1-git-send-email-kraxel@redhat.com>

On Mon, May 30, 2016 at 02:03:26PM +0200, Gerd Hoffmann wrote:
> Needed for multihead setups where we can have disabled
> outputs and therefore plane->crtc can be NULL.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Applied to drm-misc, thanks.
-Daniel

> ---
>  drivers/gpu/drm/virtio/virtgpu_plane.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
> index 70b44a2..e50674b 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
> @@ -63,11 +63,17 @@ static void virtio_gpu_plane_atomic_update(struct drm_plane *plane,
>  {
>  	struct drm_device *dev = plane->dev;
>  	struct virtio_gpu_device *vgdev = dev->dev_private;
> -	struct virtio_gpu_output *output = drm_crtc_to_virtio_gpu_output(plane->crtc);
> +	struct virtio_gpu_output *output = NULL;
>  	struct virtio_gpu_framebuffer *vgfb;
>  	struct virtio_gpu_object *bo;
>  	uint32_t handle;
>  
> +	if (plane->state->crtc)
> +		output = drm_crtc_to_virtio_gpu_output(plane->state->crtc);
> +	if (old_state->crtc)
> +		output = drm_crtc_to_virtio_gpu_output(old_state->crtc);
> +	WARN_ON(!output);
> +
>  	if (plane->state->fb) {
>  		vgfb = to_virtio_gpu_framebuffer(plane->state->fb);
>  		bo = gem_to_virtio_gpu_obj(vgfb->obj);
> -- 
> 1.8.3.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply

* Re: [PATCH v3 7/7] [wip] virtio-gpu: add page flip support
From: Gerd Hoffmann @ 2016-05-31  7:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: open list, dri-devel, open list:VIRTIO GPU DRIVER
In-Reply-To: <CAKMK7uFkB9SX8pNdANSK87M7wr1WzPYkP2t2CHoJZeeM62eCpA@mail.gmail.com>

  Hi,

> > Right now the virtual outputs are linked to drm_crtc.  To apply any
> > changes I need to lookup the crtc to figure which virtual output should
> > be updated.

> > So, setting active_only should make sure I have a valid crtc pointer on
> > plane updates, right?  It probably also skips the disable + enable crtc
> > steps on commit?  What happens when outputs are disabled?

> Nah, I just misunderstood your patch. If it's all about finding the
> corresponding crtc, then you're all good.

Yes, it's all about finding the crtc.

> I thought there was some
> other reason (like the virtual hw getting upset about certain things).

virtio wouldn't be upset.

It's a pointless exercise though to first disable the output, just to
re-enable it the next moment with the new page-flipped framebuffer.  So
I guess I should look at the active_only thing nevertheless.

> btw can you pls drop an ack or r-b
> onto my virtio conversion? I already added your tested-by.

Grr, mail is not in my dri-devel folder.  Guess that is the <censored>
"avoid-duplicates" mailman option at work.

Feel free to just add the r-b too.  Or I'll send it for the next version
of the series.

cheers,
  Gerd

^ permalink raw reply

* Re: [PATCH v3 7/7] [wip] virtio-gpu: add page flip support
From: Daniel Vetter @ 2016-05-31  7:39 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: open list, dri-devel, open list:VIRTIO GPU DRIVER
In-Reply-To: <1464680085.5978.40.camel@redhat.com>

On Tue, May 31, 2016 at 9:34 AM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>   Hi,
>
>> > Right now the virtual outputs are linked to drm_crtc.  To apply any
>> > changes I need to lookup the crtc to figure which virtual output should
>> > be updated.
>
>> > So, setting active_only should make sure I have a valid crtc pointer on
>> > plane updates, right?  It probably also skips the disable + enable crtc
>> > steps on commit?  What happens when outputs are disabled?
>
>> Nah, I just misunderstood your patch. If it's all about finding the
>> corresponding crtc, then you're all good.
>
> Yes, it's all about finding the crtc.
>
>> I thought there was some
>> other reason (like the virtual hw getting upset about certain things).
>
> virtio wouldn't be upset.
>
> It's a pointless exercise though to first disable the output, just to
> re-enable it the next moment with the new page-flipped framebuffer.  So
> I guess I should look at the active_only thing nevertheless.

It's still possible that the plane can get disabled without it getting
enabled. Userspace is allowed to do this. But since this suprises a
lot of driver writers there's a special atomic_plane_disable hook you
can use for the disable path. Instead of hand-rolling that check in
your own code.

>> btw can you pls drop an ack or r-b
>> onto my virtio conversion? I already added your tested-by.
>
> Grr, mail is not in my dri-devel folder.  Guess that is the <censored>
> "avoid-duplicates" mailman option at work.
>
> Feel free to just add the r-b too.  Or I'll send it for the next version
> of the series.

Added your r-b locally so it won't get lost on the next round.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply

* Re: PATCH v6v2 02/12] mm: migrate: support non-lru movable page migration
From: Vlastimil Babka @ 2016-05-31  7:51 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Rik van Riel, Sergey Senozhatsky, Rafael Aquini, Jonathan Corbet,
	Hugh Dickins, linux-kernel, dri-devel, virtualization,
	John Einar Reitan, linux-mm, Mel Gorman, Andrew Morton,
	Joonsoo Kim, Gioh Kim
In-Reply-To: <20160530162523.GA18314@bbox>

On 05/30/2016 06:25 PM, Minchan Kim wrote:
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -81,6 +81,39 @@ static inline bool migrate_async_suitable(int migratetype)
>>>
>>> #ifdef CONFIG_COMPACTION
>>>
>>> +int PageMovable(struct page *page)
>>> +{
>>> +	struct address_space *mapping;
>>> +
>>> +	VM_BUG_ON_PAGE(!PageLocked(page), page);
>>> +	if (!__PageMovable(page))
>>> +		return 0;
>>> +
>>> +	mapping = page_mapping(page);
>>> +	if (mapping && mapping->a_ops && mapping->a_ops->isolate_page)
>>> +		return 1;
>>> +
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL(PageMovable);
>>> +
>>> +void __SetPageMovable(struct page *page, struct address_space *mapping)
>>> +{
>>> +	VM_BUG_ON_PAGE(!PageLocked(page), page);
>>> +	VM_BUG_ON_PAGE((unsigned long)mapping & PAGE_MAPPING_MOVABLE, page);
>>> +	page->mapping = (void *)((unsigned long)mapping | PAGE_MAPPING_MOVABLE);
>>> +}
>>> +EXPORT_SYMBOL(__SetPageMovable);
>>> +
>>> +void __ClearPageMovable(struct page *page)
>>> +{
>>> +	VM_BUG_ON_PAGE(!PageLocked(page), page);
>>> +	VM_BUG_ON_PAGE(!PageMovable(page), page);
>>> +	page->mapping = (void *)((unsigned long)page->mapping &
>>> +				PAGE_MAPPING_MOVABLE);
>>> +}
>>> +EXPORT_SYMBOL(__ClearPageMovable);
>>
>> The second confusing thing is that the function is named
>> __ClearPageMovable(), but what it really clears is the mapping
>> pointer,
>> which is not at all the opposite of what __SetPageMovable() does.
>>
>> I know it's explained in the documentation, but it also deserves a
>> comment here so it doesn't confuse everyone who looks at it.
>> Even better would be a less confusing name for the function, but I
>> can't offer one right now.
>
> To me, __ClearPageMovable naming is suitable for user POV.
> It effectively makes the page unmovable. The confusion is just caused
> by the implementation and I don't prefer exported API depends on the
> implementation. So I want to add just comment.
>
> I didn't add comment above the function because I don't want to export
> internal implementation to the user. I think they don't need to know it.
>
> index a7df2ae71f2a..d1d2063b4fd9 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -108,6 +108,11 @@ void __ClearPageMovable(struct page *page)
>  {
>         VM_BUG_ON_PAGE(!PageLocked(page), page);
>         VM_BUG_ON_PAGE(!PageMovable(page), page);
> +       /*
> +        * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE
> +        * flag so that VM can catch up released page by driver after isolation.
> +        * With it, VM migration doesn't try to put it back.
> +        */
>         page->mapping = (void *)((unsigned long)page->mapping &
>                                 PAGE_MAPPING_MOVABLE);

OK, that's fine!

>>
>>> diff --git a/mm/util.c b/mm/util.c
>>> index 917e0e3d0f8e..b756ee36f7f0 100644
>>> --- a/mm/util.c
>>> +++ b/mm/util.c
>>> @@ -399,10 +399,12 @@ struct address_space *page_mapping(struct page *page)
>>> 	}
>>>
>>> 	mapping = page->mapping;
>>
>> I'd probably use READ_ONCE() here to be safe. Not all callers are
>> under page lock?
>
> I don't understand. Yeah, all caller are not under page lock but at least,
> new user of movable pages should call it under page_lock.
> Yeah, I will write the rule down in document.
> In this case, what kinds of problem do you see?

After more thinking, probably none. It wouldn't prevent any extra races. 
Sorry for the noise.

^ permalink raw reply

* Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration
From: Vlastimil Babka @ 2016-05-31  7:52 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: Rik van Riel, Sergey Senozhatsky, Rafael Aquini, Jonathan Corbet,
	Hugh Dickins, linux-kernel, dri-devel, virtualization,
	John Einar Reitan, linux-mm, Mel Gorman, Joonsoo Kim, Gioh Kim
In-Reply-To: <20160531000117.GB18314@bbox>

On 05/31/2016 02:01 AM, Minchan Kim wrote:
> Per Vlastimi's review comment.
>
> Thanks for the detail review, Vlastimi!
> If you have another concern, feel free to say.

I don't for now :)

[...]

> Cc: Rik van Riel <riel@redhat.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Rafael Aquini <aquini@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: John Einar Reitan <john.reitan@foss.arm.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

^ permalink raw reply

* [PATCH 1/5] virtio-gpu: fix output lookup
From: Gerd Hoffmann @ 2016-05-31 10:53 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, open list, Daniel Vetter,
	open list:VIRTIO GPU DRIVER
In-Reply-To: <1464691994-5704-1-git-send-email-kraxel@redhat.com>

Needed for multihead setups where we can have disabled
outputs and therefore plane->crtc can be NULL.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/virtio/virtgpu_plane.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index 70b44a2..e50674b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -63,11 +63,17 @@ static void virtio_gpu_plane_atomic_update(struct drm_plane *plane,
 {
 	struct drm_device *dev = plane->dev;
 	struct virtio_gpu_device *vgdev = dev->dev_private;
-	struct virtio_gpu_output *output = drm_crtc_to_virtio_gpu_output(plane->crtc);
+	struct virtio_gpu_output *output = NULL;
 	struct virtio_gpu_framebuffer *vgfb;
 	struct virtio_gpu_object *bo;
 	uint32_t handle;
 
+	if (plane->state->crtc)
+		output = drm_crtc_to_virtio_gpu_output(plane->state->crtc);
+	if (old_state->crtc)
+		output = drm_crtc_to_virtio_gpu_output(old_state->crtc);
+	WARN_ON(!output);
+
 	if (plane->state->fb) {
 		vgfb = to_virtio_gpu_framebuffer(plane->state->fb);
 		bo = gem_to_virtio_gpu_obj(vgfb->obj);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 2/5] virtio-gpu: add atomic_commit function
From: Gerd Hoffmann @ 2016-05-31 10:53 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, open list, Daniel Vetter,
	open list:VIRTIO GPU DRIVER
In-Reply-To: <1464691994-5704-1-git-send-email-kraxel@redhat.com>

Do modesets first, then call drm_atomic_helper_commit_planes with
active_only = true.  That way the outputs doesn't get disabled
temporarly on atomic commits.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 22 +++++++++++++++++++++-
 drivers/gpu/drm/virtio/virtgpu_drv.h     |  1 +
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
index d4305da..c5d33e0 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -466,10 +466,30 @@ virtio_gpu_user_framebuffer_create(struct drm_device *dev,
 	return &virtio_gpu_fb->base;
 }
 
+static int vgdev_atomic_commit(struct drm_device *dev,
+			       struct drm_atomic_state *state,
+			       bool nonblock)
+{
+	if (nonblock)
+		return -EBUSY;
+
+	drm_atomic_helper_swap_state(dev, state);
+	drm_atomic_helper_wait_for_fences(dev, state);
+
+	drm_atomic_helper_commit_modeset_disables(dev, state);
+	drm_atomic_helper_commit_modeset_enables(dev, state);
+	drm_atomic_helper_commit_planes(dev, state, true);
+
+	drm_atomic_helper_wait_for_vblanks(dev, state);
+	drm_atomic_helper_cleanup_planes(dev, state);
+	drm_atomic_state_free(state);
+	return 0;
+}
+
 static const struct drm_mode_config_funcs virtio_gpu_mode_funcs = {
 	.fb_create = virtio_gpu_user_framebuffer_create,
 	.atomic_check = drm_atomic_helper_check,
-	.atomic_commit = drm_atomic_helper_commit,
+	.atomic_commit = vgdev_atomic_commit,
 };
 
 int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 0a54f43..ecd1c43 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -33,6 +33,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_gem.h>
+#include <drm/drm_atomic.h>
 #include <drm/drm_crtc_helper.h>
 #include <ttm/ttm_bo_api.h>
 #include <ttm/ttm_bo_driver.h>
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 3/5] virtio-gpu: switch to atomic cursor interfaces
From: Gerd Hoffmann @ 2016-05-31 10:53 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, open list, Daniel Vetter,
	open list:VIRTIO GPU DRIVER
In-Reply-To: <1464691994-5704-1-git-send-email-kraxel@redhat.com>

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/virtio/virtgpu_display.c | 102 ++++--------------------------
 drivers/gpu/drm/virtio/virtgpu_drv.h     |   1 +
 drivers/gpu/drm/virtio/virtgpu_plane.c   | 105 ++++++++++++++++++++++++++++---
 3 files changed, 109 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
index c5d33e0..d6f5d1e 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -29,8 +29,8 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
 
-#define XRES_MIN   320
-#define YRES_MIN   200
+#define XRES_MIN    32
+#define YRES_MIN    32
 
 #define XRES_DEF  1024
 #define YRES_DEF   768
@@ -38,86 +38,6 @@
 #define XRES_MAX  8192
 #define YRES_MAX  8192
 
-static void
-virtio_gpu_hide_cursor(struct virtio_gpu_device *vgdev,
-		       struct virtio_gpu_output *output)
-{
-	output->cursor.hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR);
-	output->cursor.resource_id = 0;
-	virtio_gpu_cursor_ping(vgdev, output);
-}
-
-static int virtio_gpu_crtc_cursor_set(struct drm_crtc *crtc,
-				      struct drm_file *file_priv,
-				      uint32_t handle,
-				      uint32_t width,
-				      uint32_t height,
-				      int32_t hot_x, int32_t hot_y)
-{
-	struct virtio_gpu_device *vgdev = crtc->dev->dev_private;
-	struct virtio_gpu_output *output =
-		container_of(crtc, struct virtio_gpu_output, crtc);
-	struct drm_gem_object *gobj = NULL;
-	struct virtio_gpu_object *qobj = NULL;
-	struct virtio_gpu_fence *fence = NULL;
-	int ret = 0;
-
-	if (handle == 0) {
-		virtio_gpu_hide_cursor(vgdev, output);
-		return 0;
-	}
-
-	/* lookup the cursor */
-	gobj = drm_gem_object_lookup(file_priv, handle);
-	if (gobj == NULL)
-		return -ENOENT;
-
-	qobj = gem_to_virtio_gpu_obj(gobj);
-
-	if (!qobj->hw_res_handle) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	virtio_gpu_cmd_transfer_to_host_2d(vgdev, qobj->hw_res_handle, 0,
-					   cpu_to_le32(64),
-					   cpu_to_le32(64),
-					   0, 0, &fence);
-	ret = virtio_gpu_object_reserve(qobj, false);
-	if (!ret) {
-		reservation_object_add_excl_fence(qobj->tbo.resv,
-						  &fence->f);
-		fence_put(&fence->f);
-		virtio_gpu_object_unreserve(qobj);
-		virtio_gpu_object_wait(qobj, false);
-	}
-
-	output->cursor.hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR);
-	output->cursor.resource_id = cpu_to_le32(qobj->hw_res_handle);
-	output->cursor.hot_x = cpu_to_le32(hot_x);
-	output->cursor.hot_y = cpu_to_le32(hot_y);
-	virtio_gpu_cursor_ping(vgdev, output);
-	ret = 0;
-
-out:
-	drm_gem_object_unreference_unlocked(gobj);
-	return ret;
-}
-
-static int virtio_gpu_crtc_cursor_move(struct drm_crtc *crtc,
-				    int x, int y)
-{
-	struct virtio_gpu_device *vgdev = crtc->dev->dev_private;
-	struct virtio_gpu_output *output =
-		container_of(crtc, struct virtio_gpu_output, crtc);
-
-	output->cursor.hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_MOVE_CURSOR);
-	output->cursor.pos.x = cpu_to_le32(x);
-	output->cursor.pos.y = cpu_to_le32(y);
-	virtio_gpu_cursor_ping(vgdev, output);
-	return 0;
-}
-
 static int virtio_gpu_page_flip(struct drm_crtc *crtc,
 				struct drm_framebuffer *fb,
 				struct drm_pending_vblank_event *event,
@@ -164,8 +84,6 @@ static int virtio_gpu_page_flip(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_funcs virtio_gpu_crtc_funcs = {
-	.cursor_set2            = virtio_gpu_crtc_cursor_set,
-	.cursor_move            = virtio_gpu_crtc_cursor_move,
 	.set_config             = drm_atomic_helper_set_config,
 	.destroy                = drm_crtc_cleanup,
 
@@ -406,7 +324,7 @@ static int vgdev_output_init(struct virtio_gpu_device *vgdev, int index)
 	struct drm_connector *connector = &output->conn;
 	struct drm_encoder *encoder = &output->enc;
 	struct drm_crtc *crtc = &output->crtc;
-	struct drm_plane *plane;
+	struct drm_plane *primary, *cursor;
 
 	output->index = index;
 	if (index == 0) {
@@ -415,13 +333,17 @@ static int vgdev_output_init(struct virtio_gpu_device *vgdev, int index)
 		output->info.r.height = cpu_to_le32(YRES_DEF);
 	}
 
-	plane = virtio_gpu_plane_init(vgdev, index);
-	if (IS_ERR(plane))
-		return PTR_ERR(plane);
-	drm_crtc_init_with_planes(dev, crtc, plane, NULL,
+	primary = virtio_gpu_plane_init(vgdev, DRM_PLANE_TYPE_PRIMARY, index);
+	if (IS_ERR(primary))
+		return PTR_ERR(primary);
+	cursor = virtio_gpu_plane_init(vgdev, DRM_PLANE_TYPE_CURSOR, index);
+	if (IS_ERR(cursor))
+		return PTR_ERR(cursor);
+	drm_crtc_init_with_planes(dev, crtc, primary, cursor,
 				  &virtio_gpu_crtc_funcs, NULL);
 	drm_crtc_helper_add(crtc, &virtio_gpu_crtc_helper_funcs);
-	plane->crtc = crtc;
+	primary->crtc = crtc;
+	cursor->crtc = crtc;
 
 	drm_connector_init(dev, connector, &virtio_gpu_connector_funcs,
 			   DRM_MODE_CONNECTOR_VIRTUAL);
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index ecd1c43..acf556a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -336,6 +336,7 @@ void virtio_gpu_modeset_fini(struct virtio_gpu_device *vgdev);
 
 /* virtio_gpu_plane.c */
 struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev,
+					enum drm_plane_type type,
 					int index);
 
 /* virtio_gpu_ttm.c */
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index e50674b..4780354 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -38,6 +38,10 @@ static const uint32_t virtio_gpu_formats[] = {
 	DRM_FORMAT_ABGR8888,
 };
 
+static const uint32_t virtio_gpu_cursor_formats[] = {
+	DRM_FORMAT_ARGB8888,
+};
+
 static void virtio_gpu_plane_destroy(struct drm_plane *plane)
 {
 	kfree(plane);
@@ -58,8 +62,8 @@ static int virtio_gpu_plane_atomic_check(struct drm_plane *plane,
 	return 0;
 }
 
-static void virtio_gpu_plane_atomic_update(struct drm_plane *plane,
-					   struct drm_plane_state *old_state)
+static void virtio_gpu_primary_plane_update(struct drm_plane *plane,
+					    struct drm_plane_state *old_state)
 {
 	struct drm_device *dev = plane->dev;
 	struct virtio_gpu_device *vgdev = dev->dev_private;
@@ -104,32 +108,115 @@ static void virtio_gpu_plane_atomic_update(struct drm_plane *plane,
 				      plane->state->crtc_h);
 }
 
+static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
+					   struct drm_plane_state *old_state)
+{
+	struct drm_device *dev = plane->dev;
+	struct virtio_gpu_device *vgdev = dev->dev_private;
+	struct virtio_gpu_output *output = NULL;
+	struct virtio_gpu_framebuffer *vgfb;
+	struct virtio_gpu_fence *fence = NULL;
+	struct virtio_gpu_object *bo = NULL;
+	uint32_t handle;
+	int ret = 0;
 
-static const struct drm_plane_helper_funcs virtio_gpu_plane_helper_funcs = {
+	if (plane->state->crtc)
+		output = drm_crtc_to_virtio_gpu_output(plane->state->crtc);
+	if (old_state->crtc)
+		output = drm_crtc_to_virtio_gpu_output(old_state->crtc);
+	WARN_ON(!output);
+
+	if (plane->state->fb) {
+		vgfb = to_virtio_gpu_framebuffer(plane->state->fb);
+		bo = gem_to_virtio_gpu_obj(vgfb->obj);
+		handle = bo->hw_res_handle;
+	} else {
+		handle = 0;
+	}
+
+	if (bo && bo->dumb && (plane->state->fb != old_state->fb)) {
+		/* new cursor -- update & wait */
+		virtio_gpu_cmd_transfer_to_host_2d
+			(vgdev, handle, 0,
+			 cpu_to_le32(plane->state->crtc_w),
+			 cpu_to_le32(plane->state->crtc_h),
+			 0, 0, &fence);
+		ret = virtio_gpu_object_reserve(bo, false);
+		if (!ret) {
+			reservation_object_add_excl_fence(bo->tbo.resv,
+							  &fence->f);
+			fence_put(&fence->f);
+			fence = NULL;
+			virtio_gpu_object_unreserve(bo);
+			virtio_gpu_object_wait(bo, false);
+		}
+	}
+
+	if (plane->state->fb != old_state->fb) {
+		DRM_DEBUG("update, handle %d, +%d+%d\n", handle,
+			  plane->state->crtc_x,
+			  plane->state->crtc_y);
+		output->cursor.hdr.type =
+			cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR);
+		output->cursor.resource_id = cpu_to_le32(handle);
+#if 0
+		/* TODO */
+		output->cursor.hot_x = cpu_to_le32(hot_x);
+		output->cursor.hot_y = cpu_to_le32(hot_y);
+#endif
+	} else {
+		DRM_DEBUG("move +%d+%d\n",
+			  plane->state->crtc_x,
+			  plane->state->crtc_y);
+		output->cursor.hdr.type =
+			cpu_to_le32(VIRTIO_GPU_CMD_MOVE_CURSOR);
+	}
+	output->cursor.pos.x = cpu_to_le32(plane->state->crtc_x);
+	output->cursor.pos.y = cpu_to_le32(plane->state->crtc_y);
+	virtio_gpu_cursor_ping(vgdev, output);
+}
+
+static const struct drm_plane_helper_funcs virtio_gpu_primary_helper_funcs = {
+	.atomic_check		= virtio_gpu_plane_atomic_check,
+	.atomic_update		= virtio_gpu_primary_plane_update,
+};
+
+static const struct drm_plane_helper_funcs virtio_gpu_cursor_helper_funcs = {
 	.atomic_check		= virtio_gpu_plane_atomic_check,
-	.atomic_update		= virtio_gpu_plane_atomic_update,
+	.atomic_update		= virtio_gpu_cursor_plane_update,
 };
 
 struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev,
+					enum drm_plane_type type,
 					int index)
 {
 	struct drm_device *dev = vgdev->ddev;
+	const struct drm_plane_helper_funcs *funcs;
 	struct drm_plane *plane;
-	int ret;
+	const uint32_t *formats;
+	int ret, nformats;
 
 	plane = kzalloc(sizeof(*plane), GFP_KERNEL);
 	if (!plane)
 		return ERR_PTR(-ENOMEM);
 
+	if (type == DRM_PLANE_TYPE_CURSOR) {
+		formats = virtio_gpu_cursor_formats;
+		nformats = ARRAY_SIZE(virtio_gpu_cursor_formats);
+		funcs = &virtio_gpu_cursor_helper_funcs;
+	} else {
+		formats = virtio_gpu_formats;
+		nformats = ARRAY_SIZE(virtio_gpu_formats);
+		funcs = &virtio_gpu_primary_helper_funcs;
+	}
 	ret = drm_universal_plane_init(dev, plane, 1 << index,
 				       &virtio_gpu_plane_funcs,
-				       virtio_gpu_formats,
-				       ARRAY_SIZE(virtio_gpu_formats),
-				       DRM_PLANE_TYPE_PRIMARY, NULL);
+				       formats, nformats,
+				       type, NULL);
 	if (ret)
 		goto err_plane_init;
 
-	drm_plane_helper_add(plane, &virtio_gpu_plane_helper_funcs);
+	drm_plane_helper_add(plane, funcs);
 	return plane;
 
 err_plane_init:
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 5/5] virtio-gpu: pick up hotspot from framebuffer
From: Gerd Hoffmann @ 2016-05-31 10:53 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, open list, Daniel Vetter,
	open list:VIRTIO GPU DRIVER
In-Reply-To: <1464691994-5704-1-git-send-email-kraxel@redhat.com>

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/virtio/virtgpu_plane.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index 4780354..b7778a7 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -153,17 +153,23 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
 	}
 
 	if (plane->state->fb != old_state->fb) {
-		DRM_DEBUG("update, handle %d, +%d+%d\n", handle,
+		DRM_DEBUG("update, handle %d, pos +%d+%d, hot %d,%d\n", handle,
 			  plane->state->crtc_x,
-			  plane->state->crtc_y);
+			  plane->state->crtc_y,
+			  plane->state->fb ? plane->state->fb->hot_x : 0,
+			  plane->state->fb ? plane->state->fb->hot_y : 0);
 		output->cursor.hdr.type =
 			cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR);
 		output->cursor.resource_id = cpu_to_le32(handle);
-#if 0
-		/* TODO */
-		output->cursor.hot_x = cpu_to_le32(hot_x);
-		output->cursor.hot_y = cpu_to_le32(hot_y);
-#endif
+		if (plane->state->fb) {
+			output->cursor.hot_x =
+				cpu_to_le32(plane->state->fb->hot_x);
+			output->cursor.hot_y =
+				cpu_to_le32(plane->state->fb->hot_y);
+		} else {
+			output->cursor.hot_x = cpu_to_le32(0);
+			output->cursor.hot_y = cpu_to_le32(0);
+		}
 	} else {
 		DRM_DEBUG("move +%d+%d\n",
 			  plane->state->crtc_x,
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH] virtio-gpu: use src not crtc
From: Gerd Hoffmann @ 2016-05-31 12:52 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, open list, Marc-André Lureau,
	open list:VIRTIO GPU DRIVER

Pick up the correct source rectangle from framebuffer.
Without this multihead setups are not working correctly.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/virtio/virtgpu_plane.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index b7778a7..925ca25 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -85,27 +85,32 @@ static void virtio_gpu_primary_plane_update(struct drm_plane *plane,
 		if (bo->dumb) {
 			virtio_gpu_cmd_transfer_to_host_2d
 				(vgdev, handle, 0,
-				 cpu_to_le32(plane->state->crtc_w),
-				 cpu_to_le32(plane->state->crtc_h),
-				 plane->state->crtc_x, plane->state->crtc_y, NULL);
+				 cpu_to_le32(plane->state->src_w >> 16),
+				 cpu_to_le32(plane->state->src_h >> 16),
+				 plane->state->src_x >> 16,
+				 plane->state->src_y >> 16, NULL);
 		}
 	} else {
 		handle = 0;
 	}
 
-	DRM_DEBUG("handle 0x%x, crtc %dx%d+%d+%d\n", handle,
+	DRM_DEBUG("handle 0x%x, crtc %dx%d+%d+%d, src %dx%d+%d+%d\n", handle,
 		  plane->state->crtc_w, plane->state->crtc_h,
-		  plane->state->crtc_x, plane->state->crtc_y);
+		  plane->state->crtc_x, plane->state->crtc_y,
+		  plane->state->src_w >> 16,
+		  plane->state->src_h >> 16,
+		  plane->state->src_x >> 16,
+		  plane->state->src_y >> 16);
 	virtio_gpu_cmd_set_scanout(vgdev, output->index, handle,
-				   plane->state->crtc_w,
-				   plane->state->crtc_h,
-				   plane->state->crtc_x,
-				   plane->state->crtc_y);
+				   plane->state->src_w >> 16,
+				   plane->state->src_h >> 16,
+				   plane->state->src_x >> 16,
+				   plane->state->src_y >> 16);
 	virtio_gpu_cmd_resource_flush(vgdev, handle,
-				      plane->state->crtc_x,
-				      plane->state->crtc_y,
-				      plane->state->crtc_w,
-				      plane->state->crtc_h);
+				      plane->state->src_x >> 16,
+				      plane->state->src_y >> 16,
+				      plane->state->src_w >> 16,
+				      plane->state->src_h >> 16);
 }
 
 static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH v6v3 02/12] mm: migrate: support non-lru movable page migration
From: Minchan Kim @ 2016-05-31 23:05 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Rik van Riel, Sergey Senozhatsky, Rafael Aquini, Jonathan Corbet,
	Hugh Dickins, linux-kernel, dri-devel, virtualization,
	John Einar Reitan, linux-mm, Mel Gorman, Andrew Morton,
	Joonsoo Kim, Gioh Kim
In-Reply-To: <29ae65ef-a4c9-1d45-ce56-70b29d3eacfd@suse.cz>

On Tue, May 31, 2016 at 09:52:48AM +0200, Vlastimil Babka wrote:
> On 05/31/2016 02:01 AM, Minchan Kim wrote:
> >Per Vlastimi's review comment.
> >
> >Thanks for the detail review, Vlastimi!
> >If you have another concern, feel free to say.
> 
> I don't for now :)
> 
> [...]
> 
> >Cc: Rik van Riel <riel@redhat.com>
> >Cc: Vlastimil Babka <vbabka@suse.cz>
> >Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >Cc: Mel Gorman <mgorman@suse.de>
> >Cc: Hugh Dickins <hughd@google.com>
> >Cc: Rafael Aquini <aquini@redhat.com>
> >Cc: virtualization@lists.linux-foundation.org
> >Cc: Jonathan Corbet <corbet@lwn.net>
> >Cc: John Einar Reitan <john.reitan@foss.arm.com>
> >Cc: dri-devel@lists.freedesktop.org
> >Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> >Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
> >Signed-off-by: Minchan Kim <minchan@kernel.org>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks for the review, Vlastimil!

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox