virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
@ 2013-12-04 20:43 Richard Yao
  2013-12-04 20:49 ` Alexander Graf
  2013-12-06 11:14 ` Will Deacon
  0 siblings, 2 replies; 5+ messages in thread
From: Richard Yao @ 2013-12-04 20:43 UTC (permalink / raw)
  To: virtualization
  Cc: Eric Van Hensbergen, Will Deacon, kernel, Aneesh Kumar K.V,
	Ron Minnich, v9fs-developer, Brian Behlendorf

The 9p-virtio transport does zero copy on things larger than 1024 bytes
in size. It accomplishes this by returning the physical addresses of
pages to the virtio-pci device. At present, the translation is usually a
bit shift.

However, that approach produces an invalid page address when we
read/write to vmalloc buffers, such as those used for Linux kernle
modules. This causes QEMU to die printing:

qemu-system-x86_64: virtio: trying to map MMIO memory

This patch enables 9p-virtio to correctly handle this case. This not
only enables us to load Linux kernel modules off virtfs, but also
enables ZFS file-based vdevs on virtfs to be used without killing QEMU.

Also, special thanks to both Avi Kivity and Alexander Graf for their
interpretation of QEMU backtraces. Without their guidence, tracking down
this bug would have taken much longer.

Signed-off-by: Richard Yao <ryao@gentoo.org>
---
 net/9p/trans_virtio.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 9c5a1aa..5d1d04b 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
 		int count = nr_pages;
 		while (nr_pages) {
 			s = rest_of_page(data);
-			pages[index++] = kmap_to_page(data);
+			if (is_vmalloc_or_module_addr(data))
+				pages[index++] = vmalloc_to_page(data);
+			else
+				pages[index++] = kmap_to_page(data);
 			data += s;
 			nr_pages--;
 		}
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
  2013-12-04 20:43 [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers Richard Yao
@ 2013-12-04 20:49 ` Alexander Graf
  2013-12-06 11:14 ` Will Deacon
  1 sibling, 0 replies; 5+ messages in thread
From: Alexander Graf @ 2013-12-04 20:49 UTC (permalink / raw)
  To: Richard Yao
  Cc: Eric Van Hensbergen, Will Deacon, virtualization, kernel,
	Aneesh Kumar K.V, Ron Minnich, v9fs-developer, Brian Behlendorf


On 04.12.2013, at 21:43, Richard Yao <ryao@gentoo.org> wrote:

> The 9p-virtio transport does zero copy on things larger than 1024 bytes
> in size. It accomplishes this by returning the physical addresses of
> pages to the virtio-pci device. At present, the translation is usually a
> bit shift.
> 
> However, that approach produces an invalid page address when we
> read/write to vmalloc buffers, such as those used for Linux kernle
> modules. This causes QEMU to die printing:
> 
> qemu-system-x86_64: virtio: trying to map MMIO memory
> 
> This patch enables 9p-virtio to correctly handle this case. This not
> only enables us to load Linux kernel modules off virtfs, but also
> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
> 
> Also, special thanks to both Avi Kivity and Alexander Graf for their
> interpretation of QEMU backtraces. Without their guidence, tracking down
> this bug would have taken much longer.
> 
> Signed-off-by: Richard Yao <ryao@gentoo.org>

Looks good and clean to me, but I'm not expert on the mm subsystem.

Acked-by: Alexander Graf <agraf@suse.de>


Alex

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
  2013-12-04 20:43 [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers Richard Yao
  2013-12-04 20:49 ` Alexander Graf
@ 2013-12-06 11:14 ` Will Deacon
  2013-12-06 14:38   ` Richard Yao
  1 sibling, 1 reply; 5+ messages in thread
From: Will Deacon @ 2013-12-06 11:14 UTC (permalink / raw)
  To: Richard Yao
  Cc: Eric Van Hensbergen, virtualization@lists.linux-foundation.org,
	kernel@gentoo.org, Aneesh Kumar K.V, Ron Minnich,
	v9fs-developer@lists.sourceforge.net, Brian Behlendorf

On Wed, Dec 04, 2013 at 08:43:18PM +0000, Richard Yao wrote:
> The 9p-virtio transport does zero copy on things larger than 1024 bytes
> in size. It accomplishes this by returning the physical addresses of
> pages to the virtio-pci device. At present, the translation is usually a
> bit shift.
> 
> However, that approach produces an invalid page address when we
> read/write to vmalloc buffers, such as those used for Linux kernle
> modules. This causes QEMU to die printing:
> 
> qemu-system-x86_64: virtio: trying to map MMIO memory
> 
> This patch enables 9p-virtio to correctly handle this case. This not
> only enables us to load Linux kernel modules off virtfs, but also
> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
> 
> Also, special thanks to both Avi Kivity and Alexander Graf for their
> interpretation of QEMU backtraces. Without their guidence, tracking down
> this bug would have taken much longer.
> 
> Signed-off-by: Richard Yao <ryao@gentoo.org>
> ---
>  net/9p/trans_virtio.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index 9c5a1aa..5d1d04b 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
>  		int count = nr_pages;
>  		while (nr_pages) {
>  			s = rest_of_page(data);
> -			pages[index++] = kmap_to_page(data);
> +			if (is_vmalloc_or_module_addr(data))

Can this really end up being a module address?

Will

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
  2013-12-06 11:14 ` Will Deacon
@ 2013-12-06 14:38   ` Richard Yao
  2013-12-06 16:29     ` Will Deacon
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Yao @ 2013-12-06 14:38 UTC (permalink / raw)
  To: Will Deacon
  Cc: Eric Van Hensbergen, virtualization@lists.linux-foundation.org,
	kernel@gentoo.org, Aneesh Kumar K.V, Ron Minnich,
	v9fs-developer@lists.sourceforge.net, Brian Behlendorf


[-- Attachment #1.1: Type: text/plain, Size: 2556 bytes --]

On 12/06/2013 06:14 AM, Will Deacon wrote:
> On Wed, Dec 04, 2013 at 08:43:18PM +0000, Richard Yao wrote:
>> The 9p-virtio transport does zero copy on things larger than 1024 bytes
>> in size. It accomplishes this by returning the physical addresses of
>> pages to the virtio-pci device. At present, the translation is usually a
>> bit shift.
>>
>> However, that approach produces an invalid page address when we
>> read/write to vmalloc buffers, such as those used for Linux kernle
>> modules. This causes QEMU to die printing:
>>
>> qemu-system-x86_64: virtio: trying to map MMIO memory
>>
>> This patch enables 9p-virtio to correctly handle this case. This not
>> only enables us to load Linux kernel modules off virtfs, but also
>> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
>>
>> Also, special thanks to both Avi Kivity and Alexander Graf for their
>> interpretation of QEMU backtraces. Without their guidence, tracking down
>> this bug would have taken much longer.
>>
>> Signed-off-by: Richard Yao <ryao@gentoo.org>
>> ---
>>  net/9p/trans_virtio.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
>> index 9c5a1aa..5d1d04b 100644
>> --- a/net/9p/trans_virtio.c
>> +++ b/net/9p/trans_virtio.c
>> @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
>>  		int count = nr_pages;
>>  		while (nr_pages) {
>>  			s = rest_of_page(data);
>> -			pages[index++] = kmap_to_page(data);
>> +			if (is_vmalloc_or_module_addr(data))
> 
> Can this really end up being a module address?

Yes. Here is the stacktrace to prove it:

[<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510
[<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0
[<ffffffff814839dd>] p9_client_read+0x15d/0x240
[<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0
[<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20
[<ffffffff811c84e7>] v9fs_file_read+0x37/0x70
[<ffffffff8114e3fb>] vfs_read+0x9b/0x160
[<ffffffff81153571>] kernel_read+0x41/0x60
[<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180
[<ffffffff810cc420>] SyS_finit_module+0x70/0xd0
[<ffffffff814a08fd>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

This is easily reproducible by trying to load a module off virtfs. QEMU
will print the message that I cited in the commit message and then kill
itself.

P.S. I omitted the CC list the first time I sent this, so I am resending
with the correct CC list.


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 901 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
  2013-12-06 14:38   ` Richard Yao
@ 2013-12-06 16:29     ` Will Deacon
  0 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2013-12-06 16:29 UTC (permalink / raw)
  To: Richard Yao
  Cc: Eric Van Hensbergen, virtualization@lists.linux-foundation.org,
	kernel@gentoo.org, Aneesh Kumar K.V, Ron Minnich,
	v9fs-developer@lists.sourceforge.net, Brian Behlendorf

On Fri, Dec 06, 2013 at 02:38:28PM +0000, Richard Yao wrote:
> On 12/06/2013 06:14 AM, Will Deacon wrote:
> > On Wed, Dec 04, 2013 at 08:43:18PM +0000, Richard Yao wrote:
> >> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> >> index 9c5a1aa..5d1d04b 100644
> >> --- a/net/9p/trans_virtio.c
> >> +++ b/net/9p/trans_virtio.c
> >> @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
> >>  		int count = nr_pages;
> >>  		while (nr_pages) {
> >>  			s = rest_of_page(data);
> >> -			pages[index++] = kmap_to_page(data);
> >> +			if (is_vmalloc_or_module_addr(data))
> > 
> > Can this really end up being a module address?
> 
> Yes. Here is the stacktrace to prove it:
> 
> [<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510
> [<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0
> [<ffffffff814839dd>] p9_client_read+0x15d/0x240
> [<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0
> [<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20
> [<ffffffff811c84e7>] v9fs_file_read+0x37/0x70
> [<ffffffff8114e3fb>] vfs_read+0x9b/0x160
> [<ffffffff81153571>] kernel_read+0x41/0x60
> [<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180
> [<ffffffff810cc420>] SyS_finit_module+0x70/0xd0
> [<ffffffff814a08fd>] system_call_fastpath+0x1a/0x1f
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> This is easily reproducible by trying to load a module off virtfs. QEMU
> will print the message that I cited in the commit message and then kill
> itself.

Ok, thanks for confirming!

In which case, this patch looks fine to me:

  Reviewed-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-12-06 16:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-04 20:43 [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers Richard Yao
2013-12-04 20:49 ` Alexander Graf
2013-12-06 11:14 ` Will Deacon
2013-12-06 14:38   ` Richard Yao
2013-12-06 16:29     ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).