* Re: [PATCH] virtio: 9p: correctly pass physical address to userspace for high pages
From: Rusty Russell @ 2012-10-18 2:19 UTC
To: Will Deacon, linux-kernel
Cc: Will Deacon, Sasha Levin, Marc Zyngier, lf-virt, Andrew Morton,
Eric Van Hensbergen
Will Deacon <will.deacon@arm.com> writes:
> When using a virtio transport, the 9p net device allocates pages to back
> the descriptors inserted into the virtqueue. These allocations may be
> performed from atomic context (under the channel lock) and can therefore
> return high mappings which aren't suitable for virt_to_phys.
I had not appreciated that subtlety about GFP_ATOMIC :(
This isn't just 9p, the console, block, scsi and net devices also use
GFP_ATOMIC.
> @@ -165,7 +166,8 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
>  	/* Use a single buffer which doesn't continue */
>  	head = vq->free_head;
>  	vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
> -	vq->vring.desc[head].addr = virt_to_phys(desc);
> +	vq->vring.desc[head].addr = page_to_phys(kmap_to_page(desc)) +
> +		((unsigned long)desc & ~PAGE_MASK);
>  	vq->vring.desc[head].len = i * sizeof(struct vring_desc);
Gah, virt_to_phys_harder()?
What's the performance effect? If it's negligible, why doesn't
virt_to_phys() just do this for us?
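For readers following the hunk, here is a minimal userspace sketch of why a linear-map translation goes wrong for a temporarily mapped high page, and why "find the page, then add the in-page offset" does not. It is a toy model only: every TOY_* constant, the slot table, and both function names are assumptions made up for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* All constants are illustrative, not taken from any real architecture. */
#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_SIZE    (1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK    (~(TOY_PAGE_SIZE - 1))
#define TOY_PAGE_OFFSET  0xC0000000UL   /* start of the linear (lowmem) map */
#define TOY_KMAP_BASE    0xFF000000UL   /* start of the temporary-mapping window */
#define TOY_KMAP_SLOTS   4

/* Hypothetical page frame number backing each temporary-mapping slot. */
static const uint64_t toy_kmap_pfn[TOY_KMAP_SLOTS] = {
    0x90000, 0x90001, 0xA1234, 0xB0000
};

/* Linear-map translation: only meaningful for lowmem addresses. */
static uint64_t toy_virt_to_phys(unsigned long vaddr)
{
    return (uint64_t)vaddr - TOY_PAGE_OFFSET;
}

/* Two-step translation with the same shape as the hunk: page base + offset. */
static uint64_t toy_addr_to_phys(unsigned long vaddr)
{
    unsigned long offset = vaddr & ~TOY_PAGE_MASK;  /* low 12 bits */
    uint64_t page_phys;

    if (vaddr >= TOY_KMAP_BASE) {
        /* High mapping: the backing page must be looked up per slot. */
        unsigned long slot = (vaddr - TOY_KMAP_BASE) >> TOY_PAGE_SHIFT;

        assert(slot < TOY_KMAP_SLOTS);
        page_phys = toy_kmap_pfn[slot] << TOY_PAGE_SHIFT;
    } else {
        /* Lowmem: the linear formula is already correct. */
        assert(vaddr >= TOY_PAGE_OFFSET);
        page_phys = toy_virt_to_phys(vaddr & TOY_PAGE_MASK);
    }
    return page_phys + offset;
}
```

In this model the two functions agree for any lowmem address, while for an address inside the mapping window the linear formula silently returns an unrelated physical address. The extra branch and table lookup also hint at the overhead being asked about, though the sketch says nothing about its real cost.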
We do have an alternate solution: masking out __GFP_HIGHMEM from the
kmalloc of desc. If it fails, we will fall back to laying out the
virtio request directly inside the ring; if it doesn't fit, we'll wait
for the device to consume more buffers.
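A rough sketch of that alternative, written as plain userspace C so it can be compiled on its own. The flag values, the enum, and both helper names are hypothetical; the real flags live in the kernel's gfp headers and the real fallback logic lives in the virtio ring code, which this does not reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, chosen only so the masking is visible. */
#define TOY_GFP_WAIT     0x01u
#define TOY_GFP_HIGH     0x02u
#define TOY_GFP_HIGHMEM  0x04u

enum toy_layout { TOY_INDIRECT, TOY_DIRECT, TOY_WAIT_FOR_DEVICE };

/* Strip the highmem bit so the descriptor table stays in the linear map. */
static unsigned int toy_lowmem_only(unsigned int gfp)
{
    return gfp & ~TOY_GFP_HIGHMEM;
}

/*
 * Decide where a request needing `needed` descriptors goes:
 * an indirect table occupies one ring slot; otherwise the request is
 * laid out directly in the ring if it fits, or the caller has to wait.
 */
static enum toy_layout toy_choose_layout(bool indirect_alloc_ok,
                                         unsigned int needed,
                                         unsigned int free_slots)
{
    assert(needed > 0);
    if (indirect_alloc_ok && free_slots >= 1)
        return TOY_INDIRECT;
    if (needed <= free_slots)
        return TOY_DIRECT;
    return TOY_WAIT_FOR_DEVICE;
}
```

The point of the design is that a failed lowmem allocation is not fatal: the request degrades to the direct layout, and only when the ring is too full does the submitter have to wait for the device.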
> @@ -325,7 +326,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
>  	int count = nr_pages;
>  	while (nr_pages) {
>  		s = rest_of_page(data);
> -		pages[index++] = virt_to_page(data);
> +		pages[index++] = kmap_to_page(data);
>  		data += s;
>  		nr_pages--;
>  	}
This seems like a separate bug fix.
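The loop in that hunk walks a buffer one page at a time, recording the page behind each step. A minimal userspace sketch of the same walk follows, assuming a 4096-byte page and driving the loop by byte length rather than by a page count as the original does; the toy_* names are made up for illustration and the page lookup is replaced by simple rounding down to the page base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096UL  /* assumed page size for the sketch */

/* Bytes from addr to the end of the page that contains it. */
static size_t toy_rest_of_page(uintptr_t addr)
{
    return TOY_PAGE_SIZE - (addr % TOY_PAGE_SIZE);
}

/*
 * Record the page-aligned base of every page touched by [addr, addr + len).
 * Returns the number of entries written, never more than max_pages.
 */
static size_t toy_collect_pages(uintptr_t addr, size_t len,
                                uintptr_t *pages, size_t max_pages)
{
    size_t n = 0;

    while (len > 0 && n < max_pages) {
        size_t step = toy_rest_of_page(addr);

        if (step > len)
            step = len;                              /* final, partial page */
        pages[n++] = addr - (addr % TOY_PAGE_SIZE);  /* stand-in for the page lookup */
        addr += step;
        len -= step;
    }
    return n;
}
```

Only the first step can start mid-page; every later step begins on a page boundary, which is why the per-page lookup is the only part of the loop that depends on how the address is mapped.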
Cheers,
Rusty.
* Re: [PATCH] virtio: 9p: correctly pass physical address to userspace for high pages
From: Will Deacon @ 2012-10-18 9:42 UTC
To: Rusty Russell
Cc: linux-kernel@vger.kernel.org, Sasha Levin, Marc Zyngier, lf-virt,
Andrew Morton, Eric Van Hensbergen
Hi Rusty,
On Thu, Oct 18, 2012 at 03:19:06AM +0100, Rusty Russell wrote:
> Will Deacon <will.deacon@arm.com> writes:
> > When using a virtio transport, the 9p net device allocates pages to back
> > the descriptors inserted into the virtqueue. These allocations may be
> > performed from atomic context (under the channel lock) and can therefore
> > return high mappings which aren't suitable for virt_to_phys.
>
> I had not appreciated that subtlety about GFP_ATOMIC :(
Yeah, it's unfortunate for poor old userspace.
> This isn't just 9p, the console, block, scsi and net devices also use
> GFP_ATOMIC.
Ok, I'll split this patch in two since I think that only 9p has the
zero-copy stuff, which is why an extra fix is needed there for creating the
scatterlist correctly.
> > @@ -165,7 +166,8 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
> >  	/* Use a single buffer which doesn't continue */
> >  	head = vq->free_head;
> >  	vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
> > -	vq->vring.desc[head].addr = virt_to_phys(desc);
> > +	vq->vring.desc[head].addr = page_to_phys(kmap_to_page(desc)) +
> > +		((unsigned long)desc & ~PAGE_MASK);
> >  	vq->vring.desc[head].len = i * sizeof(struct vring_desc);
>
> Gah, virt_to_phys_harder()?
Tell me about it...
> What's the performance effect? If it's negligible, why doesn't
> virt_to_phys() just do this for us?
I've not measured it, but even when you don't have CONFIG_HIGHMEM, there's
going to be an overhead here because we go around the houses to get the page
and then add the offset on afterwards. I doubt it's something we want to
plumb directly into virt_to_phys (also, kmap_to_page may call virt_to_phys via
the __pa macro so we'd get stuck).
> We do have an alternate solution: masking out __GFP_HIGHMEM from the
> kmalloc of desc. If it fails, we will fall back to laying out the
> virtio request directly inside the ring; if it doesn't fit, we'll wait
> for the device to consume more buffers.
Hmm, that will probably work for the vring but the zero-copy code for 9p may
just give us an address from userspace if I'm understanding it correctly. In
that case, we really have to do the translation as below (which is actually
much cleaner because everything is page-aligned).
> > @@ -325,7 +326,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
> >  	int count = nr_pages;
> >  	while (nr_pages) {
> >  		s = rest_of_page(data);
> > -		pages[index++] = virt_to_page(data);
> > +		pages[index++] = kmap_to_page(data);
> >  		data += s;
> >  		nr_pages--;
> >  	}
So what do you reckon? How about I leave this hunk as a separate patch and
have a play masking out __GFP_HIGHMEM for the vring descriptor?
Cheers,
Will
* Re: [PATCH] virtio: 9p: correctly pass physical address to userspace for high pages
From: Rusty Russell @ 2012-10-18 23:39 UTC
To: Will Deacon
Cc: Marc Zyngier, linux-kernel@vger.kernel.org, lf-virt,
Eric Van Hensbergen, Sasha Levin, Andrew Morton
Will Deacon <will.deacon@arm.com> writes:
> On Thu, Oct 18, 2012 at 03:19:06AM +0100, Rusty Russell wrote:
>> We do have an alternate solution: masking out __GFP_HIGHMEM from the
>> kmalloc of desc. If it fails, we will fall back to laying out the
>> virtio request directly inside the ring; if it doesn't fit, we'll wait
>> for the device to consume more buffers.
>
> Hmm, that will probably work for the vring but the zero-copy code for 9p may
> just give us an address from userspace if I'm understanding it correctly. In
> that case, we really have to do the translation as below (which is actually
> much cleaner because everything is page-aligned).
>
>> > @@ -325,7 +326,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
>> >  	int count = nr_pages;
>> >  	while (nr_pages) {
>> >  		s = rest_of_page(data);
>> > -		pages[index++] = virt_to_page(data);
>> > +		pages[index++] = kmap_to_page(data);
>> >  		data += s;
>> >  		nr_pages--;
>> >  	}
>
> So what do you reckon? How about I leave this hunk as a separate patch and
> have a play masking out __GFP_HIGHMEM for the vring descriptor?
Yes, I think so. A scathing comment would be nice, too...
Thanks,
Rusty.