Unhandled fault: page domain fault (0x81b) at 0x00e41008

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
@ 2016-01-22 17:37 Mason
  2016-01-22 17:48 ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-22 17:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

I'm hitting
Unhandled fault: page domain fault (0x81b) at 0x00e41008

which is related to CPU_SW_DOMAIN_PAN
commit a5e090acbf545c0a3b04080f8a488b17ec41fe02

I see that __copy_from_user() is now wrapped in
uaccess_save_and_enable ... uaccess_restore

I'm not using __copy_from_user() because I'm implementing block
copies with specific access size.

Can I just wrap my block copy functions in
uaccess_save_and_enable ... uaccess_restore
like __copy_from_user?

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 17:37 Unhandled fault: page domain fault (0x81b) at 0x00e41008 Mason
@ 2016-01-22 17:48 ` Russell King - ARM Linux
  2016-01-22 18:59   ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-22 17:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 22, 2016 at 06:37:43PM +0100, Mason wrote:
> Hello,
> 
> I'm hitting
> Unhandled fault: page domain fault (0x81b) at 0x00e41008
> 
> which is related to CPU_SW_DOMAIN_PAN
> commit a5e090acbf545c0a3b04080f8a488b17ec41fe02
> 
> 
> I see that __copy_from_user() is now wrapped in
> uaccess_save_and_enable ... uaccess_restore
> 
> I'm not using __copy_from_user() because I'm implementing block
> copies with specific access size.
> 
> Can I just wrap my block copy functions in
> uaccess_save_and_enable ... uaccess_restore
> like __copy_from_user?

No, you _must_ use the correct functions to access userspace.
Userspace accesses are marked in a special way that allows the kernel
to fix up non-present pages.  Normal accesses may appear to work but
will eventually oops the kernel when the page is unmapped or is marked
read-only and you try to write to it.

Please don't think of using __copy_from_user() et.al. either - those
are there for code which knows what it's doing and has pre-validated
the accesses.

Drivers and platform code should use copy_from_user()/copy_to_user()
to block-copy data to/from userspace, and get_user()/put_user() to
copy individual bytes, shorts and int/longs.  (It doesn't matter
who you are, that's the official guidance.)

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 17:48 ` Russell King - ARM Linux
@ 2016-01-22 18:59   ` Mason
  2016-01-22 19:34     ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-22 18:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/01/2016 18:48, Russell King - ARM Linux wrote:

> On Fri, Jan 22, 2016 at 06:37:43PM +0100, Mason wrote:
>
>> I'm hitting
>> Unhandled fault: page domain fault (0x81b) at 0x00e41008
>>
>> which is related to CPU_SW_DOMAIN_PAN
>> commit a5e090acbf545c0a3b04080f8a488b17ec41fe02
>>
>>
>> I see that __copy_from_user() is now wrapped in
>> uaccess_save_and_enable ... uaccess_restore
>>
>> I'm not using __copy_from_user() because I'm implementing block
>> copies with specific access size.
>>
>> Can I just wrap my block copy functions in
>> uaccess_save_and_enable ... uaccess_restore
>> like __copy_from_user?
> 
> No, you _must_ use the correct functions to access userspace.
> Userspace accesses are marked in a special way that allows the kernel
> to fix up non-present pages.

Do you mean calling might_fault() ?

> Normal accesses may appear to work but
> will eventually oops the kernel when the page is unmapped or is marked
> read-only and you try to write to it.

I'll have to check again how the code was before I made
my silly changes, but it's been in production for years,
and we've never had any problem in that module...
(But I suppose something broken can appear to work for
months or years.)

> Please don't think of using __copy_from_user() et.al. either - those
> are there for code which knows what it's doing and has pre-validated
> the accesses.

I do call access_ok() before doing the copy.

> Drivers and platform code should use copy_from_user()/copy_to_user()
> to block-copy data to/from userspace, and get_user()/put_user() to
> copy individual bytes, shorts and int/longs.  (It doesn't matter
> who you are, that's the official guidance.)

The problem is that the kernel module's API is already set
in stone, and it requires block copies with specific access
sizes, e.g. block_copy8, block_copy16, block_copy32.

So copy_to/from_user is out, AFAICT.

Marc Zyngier suggested wrapping put/get_user in a loop,
but it looks like performance is going to suck for large
copies (500-2000 KB)

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 18:59   ` Mason
@ 2016-01-22 19:34     ` Russell King - ARM Linux
  2016-01-22 23:15       ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-22 19:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 22, 2016 at 07:59:12PM +0100, Mason wrote:
> On 22/01/2016 18:48, Russell King - ARM Linux wrote:
> > No, you _must_ use the correct functions to access userspace.
> > Userspace accesses are marked in a special way that allows the kernel
> > to fix up non-present pages.
> 
> Do you mean calling might_fault() ?

No, I explicitly listed the functions you should be using in my
original reply.

> > Normal accesses may appear to work but
> > will eventually oops the kernel when the page is unmapped or is marked
> > read-only and you try to write to it.
> 
> I'll have to check again how the code was before I made
> my silly changes, but it's been in production for years,
> and we've never had any problem in that module...
> (But I suppose something broken can appear to work for
> months or years.)
> 
> > Please don't think of using __copy_from_user() et.al. either - those
> > are there for code which knows what it's doing and has pre-validated
> > the accesses.
> 
> I do call access_ok() before doing the copy.

It's possible to use access_ok() + __copy_from_user(), but that's
really frowned upon because it's _very_ easy to get it wrong - and
then you have a security bug.

> > Drivers and platform code should use copy_from_user()/copy_to_user()
> > to block-copy data to/from userspace, and get_user()/put_user() to
> > copy individual bytes, shorts and int/longs.  (It doesn't matter
> > who you are, that's the official guidance.)
> 
> The problem is that the kernel module's API is already set
> in stone, and it requires block copies with specific access
> sizes, e.g. block_copy8, block_copy16, block_copy32.

Rather than making these statements, you need to explain what, how
and why.

What do these "block_copy8, block_copy16, block_copy32" functions
do?  Does the "8", "16" and "32" refer to the access size?  Why do
they need to make accesses to userspace using these specific sizes?
What causes this restriction?

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 19:34     ` Russell King - ARM Linux
@ 2016-01-22 23:15       ` Mason
  2016-01-22 23:57         ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-22 23:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/01/2016 20:34, Russell King - ARM Linux wrote:

> It's possible to use access_ok() + __copy_from_user(), but that's
> really frowned upon because it's _very_ easy to get it wrong - and
> then you have a security bug.

I was following advice from LDD3.

>>> Drivers and platform code should use copy_from_user()/copy_to_user()
>>> to block-copy data to/from userspace, and get_user()/put_user() to
>>> copy individual bytes, shorts and int/longs.  (It doesn't matter
>>> who you are, that's the official guidance.)
>>
>> The problem is that the kernel module's API is already set
>> in stone, and it requires block copies with specific access
>> sizes, e.g. block_copy8, block_copy16, block_copy32.
> 
> Rather than making these statements, you need to explain what, how
> and why.
> 
> What do these "block_copy8, block_copy16, block_copy32" functions
> do?  Does the "8", "16" and "32" refer to the access size?  Why do
> they need to make accesses to userspace using these specific sizes?
> What causes this restriction?

Interfaces are somewhat arbitrary. The problem statement
was as follows.

Implement functions for copying a range of addresses
FROM user-space, TO physical addresses,
(and also the other way around)
in access size of 8, 16, 32 bits.

So, a little over a decade ago, someone decided that these
functions would have the following prototype:

int read_data8  (u8  *user_addr, u8  *phys_addr, int count)
int read_data16 (u16 *user_addr, u16 *phys_addr, int count)
int read_data32 (u32 *user_addr, u32 *phys_addr, int count)

int write_data8 (u8  *user_addr, u8  *phys_addr, int count)
int write_data16(u16 *user_addr, u16 *phys_addr, int count)
int write_data32(u32 *user_addr, u32 *phys_addr, int count)


IIUC what you're saying, the only 100% correct solution
would be something like this:

(Note: the following code is simplified, as count may be
larger than vmalloc space, so the operation needs to be
"chunked" or "tiled".)

int read_data8 (u8 *user_addr, u8 *phys_addr, int count)
{
  int i, err = 0;

  /* map phys_addr into kernel VA */
  void *va = ioremap(phys_addr, count);
  if (va == NULL) return some_error;

  for (i = 0; i < count; ++i) {
    u8 val = readb(va + i);
    err = put_user(val, user_addr + i);
    if (err) break;
  }

  iounmap(va);
  return err;
}

Is this what you are suggesting?

I expect this to be quite slow.

The problem is that one important user of the API is a
program used to copy the contents of files to "remote"
RAM, i.e. RAM not managed by Linux, to pass that
information to a secure processor, which then copies
it to the DSPs. And this operation is on the critical
path (at boot-time) and must be as fast as possible.

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 23:15       ` Mason
@ 2016-01-22 23:57         ` Russell King - ARM Linux
  2016-01-23 11:14           ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-22 23:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 23, 2016 at 12:15:29AM +0100, Mason wrote:
> On 22/01/2016 20:34, Russell King - ARM Linux wrote:
> > What do these "block_copy8, block_copy16, block_copy32" functions
> > do?  Does the "8", "16" and "32" refer to the access size?  Why do
> > they need to make accesses to userspace using these specific sizes?
> > What causes this restriction?
> 
> Interfaces are somewhat arbitrary. The problem statement
> was as follows.
> 
> Implement functions for copying a range of addresses
> FROM user-space, TO physical addresses,
> (and also the other way around)
> in access size of 8, 16, 32 bits.
> 
> So, a little over a decade ago, someone decided that these
> functions would have the following prototype:
> 
> int read_data8  (u8  *user_addr, u8  *phys_addr, int count)
> int read_data16 (u16 *user_addr, u16 *phys_addr, int count)
> int read_data32 (u32 *user_addr, u32 *phys_addr, int count)
> 
> int write_data8 (u8  *user_addr, u8  *phys_addr, int count)
> int write_data16(u16 *user_addr, u16 *phys_addr, int count)
> int write_data32(u32 *user_addr, u32 *phys_addr, int count)

Of course, physical addresses are _integers_ and not pointers... (I can't
help but say that because every time I see that mistake, I'm duty bound
to educate to prevent anyone thinking this kind of thing is a good idea.)

> (Note: the following code is simplified, as count may be
> larger than vmalloc space, so the operation needs to be
> "chunked" or "tiled".)
> 
> int read_data8 (u8 *user_addr, u8 *phys_addr, int count)
> {
>   int i, err = 0;
> 
>   /* map phys_addr into kernel VA */
>   void *va = ioremap(phys_addr, count);
>   if (va == NULL) return some_error;
> 
>   for (i = 0; i < count; ++i) {
>     u8 val = readb(va + i);
>     err = put_user(val, user_addr + i);
>     if (err) break;
>   }
> 
>   iounmap(va);
>   return err;
> }
> 
> Is this what you are suggesting?
> 
> I expect this to be quite slow.

That probably will be quite slow.  How about this instead:

int read_data8(u8 __user *user_addr, phys_addr_t phys_addr, size_t bytes)
{
	const size_t batch_size = PAGE_SIZE;
	void __user *user_pos = user_addr;
	void __iomem *va;
	size_t i, j;
	u8 *buf;
	int err;

	va = ioremap(phys_addr, bytes);
	buf = kmalloc(batch_size);
	if (!va || !buf) {
		iounmap(va);
		kfree(buf);
		return -ENOMEM;
	}

	for (i = 0; i < bytes; i += batch_size) {
		size_t len = bytes - i;
		if (len > batch_size)
			len = batch_size;

		for (j = 0; j < len; j += sizeof(*buf))
			buf[j / sizeof(*buf)] = readb_relaxed(va + i + j);

		if (copy_to_user(user_pos, buf, len)) {
			err = -EFAULT;
			break;
		}

		user_pos += len;
	}

	iounmap(va);
	kfree(buf);
	return err;
}

You can change the batch size by altering the "batch_size" variable,
though I suspect you'll find that the above may be reasonably fast.

You should only need to change the "u8" data types and the iomem
accessor for your other read functions.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently@9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-22 23:57         ` Russell King - ARM Linux
@ 2016-01-23 11:14           ` Mason
  2016-01-23 11:34             ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-23 11:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 23/01/2016 00:57, Russell King - ARM Linux wrote:
> On Sat, Jan 23, 2016 at 12:15:29AM +0100, Mason wrote:
>> On 22/01/2016 20:34, Russell King - ARM Linux wrote:
>>> What do these "block_copy8, block_copy16, block_copy32" functions
>>> do?  Does the "8", "16" and "32" refer to the access size?  Why do
>>> they need to make accesses to userspace using these specific sizes?
>>> What causes this restriction?
>>
>> Interfaces are somewhat arbitrary. The problem statement
>> was as follows.
>>
>> Implement functions for copying a range of addresses
>> FROM user-space, TO physical addresses,
>> (and also the other way around)
>> in access size of 8, 16, 32 bits.
>>
>> So, a little over a decade ago, someone decided that these
>> functions would have the following prototype:
>>
>> int read_data8  (u8  *user_addr, u8  *phys_addr, int count)
>> int read_data16 (u16 *user_addr, u16 *phys_addr, int count)
>> int read_data32 (u32 *user_addr, u32 *phys_addr, int count)
>>
>> int write_data8 (u8  *user_addr, u8  *phys_addr, int count)
>> int write_data16(u16 *user_addr, u16 *phys_addr, int count)
>> int write_data32(u32 *user_addr, u32 *phys_addr, int count)
> 
> Of course, physical addresses are _integers_ and not pointers... (I can't
> help but say that because every time I see that mistake, I'm duty bound
> to educate to prevent anyone thinking this kind of thing is a good idea.)
> 
>> (Note: the following code is simplified, as count may be
>> larger than vmalloc space, so the operation needs to be
>> "chunked" or "tiled".)
>>
>> int read_data8 (u8 *user_addr, u8 *phys_addr, int count)
>> {
>>   int i, err = 0;
>>
>>   /* map phys_addr into kernel VA */
>>   void *va = ioremap(phys_addr, count);
>>   if (va == NULL) return some_error;
>>
>>   for (i = 0; i < count; ++i) {
>>     u8 val = readb(va + i);
>>     err = put_user(val, user_addr + i);
>>     if (err) break;
>>   }
>>
>>   iounmap(va);
>>   return err;
>> }
>>
>> Is this what you are suggesting?
>>
>> I expect this to be quite slow.
> 
> That probably will be quite slow.  How about this instead:
> 
> int read_data8(u8 __user *user_addr, phys_addr_t phys_addr, size_t bytes)
> {
> 	const size_t batch_size = PAGE_SIZE;
> 	void __user *user_pos = user_addr;
> 	void __iomem *va;
> 	size_t i, j;
> 	u8 *buf;
> 	int err;
> 
> 	va = ioremap(phys_addr, bytes);
> 	buf = kmalloc(batch_size);
> 	if (!va || !buf) {
> 		iounmap(va);
> 		kfree(buf);
> 		return -ENOMEM;
> 	}
> 
> 	for (i = 0; i < bytes; i += batch_size) {
> 		size_t len = bytes - i;
> 		if (len > batch_size)
> 			len = batch_size;
> 
> 		for (j = 0; j < len; j += sizeof(*buf))
> 			buf[j / sizeof(*buf)] = readb_relaxed(va + i + j);
> 
> 		if (copy_to_user(user_pos, buf, len)) {
> 			err = -EFAULT;
> 			break;
> 		}
> 
> 		user_pos += len;
> 	}
> 
> 	iounmap(va);
> 	kfree(buf);
> 	return err;
> }
> 
> You can change the batch size by altering the "batch_size" variable,
> though I suspect you'll find that the above may be reasonably fast.
> 
> You should only need to change the "u8" data types and the iomem
> accessor for your other read functions.

Russell,

Thanks for taking the time to walk me through it. I really
appreciate it. I'll benchmark on Monday.

One more thing: for the important use-case I described
(copying a file to remote RAM) the existing code was using
write_data8! Obvious optimization was to use write_data32
but this still means two unnecessary copies:

1) copying file contents to user-space temp buffer
2) copying user-space to temp kernel buffer
3) finally move data to remote RAM

I had a different approach which avoids the 2 copies:
In user-space, mmap the physical address using /dev/mem
and just use the read syscall to copy from filesystem
into remote RAM.

I don't need to worry about access size in that specific
instance (since I'm copying to RAM).

What do you think about that?

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-23 11:14           ` Mason
@ 2016-01-23 11:34             ` Russell King - ARM Linux
  2016-01-23 20:53               ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-23 11:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 23, 2016 at 12:14:38PM +0100, Mason wrote:
> One more thing: for the important use-case I described
> (copying a file to remote RAM) the existing code was using
> write_data8! Obvious optimization was to use write_data32
> but this still means two unnecessary copies:
> 
> 1) copying file contents to user-space temp buffer
> 2) copying user-space to temp kernel buffer
> 3) finally move data to remote RAM
> 
> I had a different approach which avoids the 2 copies:
> In user-space, mmap the physical address using /dev/mem
> and just use the read syscall to copy from filesystem
> into remote RAM.
> 
> I don't need to worry about access size in that specific
> instance (since I'm copying to RAM).
> 
> What do you think about that?

Well, there's an issue here I didn't cover previously, and that is that
the kernel has a firmware interface - a method by which drivers can
request firmware for their devices from userspace.  This is the accepted
way to upload firmware from userspace for any device.  It's more
complex, but you ought to look at it as a proper solution to your
problem.  Look at linux/firmware.h for information on this, and an
example usage can be seen in drivers/dma/imx-sdma.c - others can be
found by grepping for that header.

However, if you're going to stick with your approach, there's a way to
avoid the need to copy into userspace and then copy back out.  It's
called sendfile().  This can be used to send part (or all) of one file
to another file descriptor.  It's intended use was to accelerate
server applications such as apache, but it seems to be a good fit
for what you want.

If you arrange for your module to provide an anonymous file descriptor,
you can use that as the destination file descriptor for uploading the
contents of your file to using sendfile().  The anonymous file
descriptor's f_ops->write method should be called as if userspace
were writing the firmware into this file descriptor - but without the
data being copied into and back out of userspace.

As the file descriptor will be writable, you can also use lseek() and
write() to invoke the f_ops->write method for any file offset, so you
can also use it to upload at any offset and length via standard APIs.

It seems to me that such a solution has a very nice elegance to it.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-23 11:34             ` Russell King - ARM Linux
@ 2016-01-23 20:53               ` Mason
  2016-01-23 22:46                 ` Mason
  2016-01-23 23:59                 ` Russell King - ARM Linux
  0 siblings, 2 replies; 15+ messages in thread
From: Mason @ 2016-01-23 20:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 23/01/2016 12:34, Russell King - ARM Linux wrote:

> [snip firmware interface blurb -- I'll take a look]
> 
> However, if you're going to stick with your approach, there's a way to
> avoid the need to copy into userspace and then copy back out.  It's
> called sendfile().  This can be used to send part (or all) of one file
> to another file descriptor.  It's intended use was to accelerate
> server applications such as apache, but it seems to be a good fit
> for what you want.
> 
> If you arrange for your module to provide an anonymous file descriptor,
> you can use that as the destination file descriptor for uploading the
> contents of your file to using sendfile().  The anonymous file
> descriptor's f_ops->write method should be called as if userspace
> were writing the firmware into this file descriptor - but without the
> data being copied into and back out of userspace.
> 
> As the file descriptor will be writable, you can also use lseek() and
> write() to invoke the f_ops->write method for any file offset, so you
> can also use it to upload at any offset and length via standard APIs.
> 
> It seems to me that such a solution has a very nice elegance to it.

True.

But I would consider mmaping /dev/mem even simpler: it doesn't
require writing a single line of kernel code, and it works as
expected. The gist of the code is:

  fd = open("/path/to/file", O_RDONLY);
  mem_fd = open("/dev/mem", O_WRONLY | O_SYNC);
  va = mmap(NULL, len, PROT_WRITE, MAP_SHARED, mem_fd, phys_addr);
  read(fd, va, len);

And that's it. If I understand correctly, the mem driver
will copy the file contents directly to "remote" RAM.

Is there something wrong with this solution, in your opinion?

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-23 20:53               ` Mason
@ 2016-01-23 22:46                 ` Mason
  2016-01-23 23:59                 ` Russell King - ARM Linux
  1 sibling, 0 replies; 15+ messages in thread
From: Mason @ 2016-01-23 22:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 23/01/2016 21:53, Mason wrote:

> But I would consider mmaping /dev/mem even simpler: it doesn't
> require writing a single line of kernel code, and it works as
> expected. The gist of the code is:
> 
>   fd = open("/path/to/file", O_RDONLY);
>   mem_fd = open("/dev/mem", O_WRONLY | O_SYNC);
>   va = mmap(NULL, len, PROT_WRITE, MAP_SHARED, mem_fd, phys_addr);
>   read(fd, va, len);
> 
> And that's it. If I understand correctly, the mem driver
> will copy the file contents directly to "remote" RAM.
> 
> Is there something wrong with this solution, in your opinion?

For my own reference, Arnd had several comments:

> /dev/mem is deprecated and often gets disabled because it is a
> security hole. Your driver can implement its own mmap function
> if needed.
> 
> On the user space side, you have to be careful to use aligned
> accesses on device memory, you can't just memcpy into a buffer
> from mmap()
>
> /dev/mem maps memory as MT_DEVICE, which has stricter rules
> [than ordinary RAM]. If you want an mmap of this memory, you
> should really implement your own with the correct attributes.
> 
> read may or may not do unaligned accesses, depending on a lot of
> factors like compiler optimization flags when building the kernel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-23 20:53               ` Mason
  2016-01-23 22:46                 ` Mason
@ 2016-01-23 23:59                 ` Russell King - ARM Linux
  2016-01-24 13:27                   ` Mason
  1 sibling, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-23 23:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 23, 2016 at 09:53:30PM +0100, Mason wrote:
> But I would consider mmaping /dev/mem even simpler: it doesn't
> require writing a single line of kernel code, and it works as
> expected. The gist of the code is:
> 
>   fd = open("/path/to/file", O_RDONLY);
>   mem_fd = open("/dev/mem", O_WRONLY | O_SYNC);
>   va = mmap(NULL, len, PROT_WRITE, MAP_SHARED, mem_fd, phys_addr);
>   read(fd, va, len);
> 
> And that's it. If I understand correctly, the mem driver
> will copy the file contents directly to "remote" RAM.
> 
> Is there something wrong with this solution, in your opinion?

I'll quote what you said in previous mails in this thread, so
there's no misunderstanding.  Let's call this exhibit A.  You said:

> I said:
> > Drivers and platform code should use copy_from_user()/copy_to_user()
> > to block-copy data to/from userspace, and get_user()/put_user() to
> > copy individual bytes, shorts and int/longs.  (It doesn't matter
> > who you are, that's the official guidance.)
> 
> The problem is that the kernel module's API is already set
> in stone, and it requires block copies with specific access
> sizes, e.g. block_copy8, block_copy16, block_copy32.

Okay, so in this email, we established that there was a requirement
to use specific access sizes, which turned out to be 8-bit, 16-bit
and 32-bit.  You discounted using copy_from_user()/copy_to_user().
You confirmed this in a following email, which I'll call exhibit B:

> Implement functions for copying a range of addresses
> FROM user-space, TO physical addresses,
> (and also the other way around)
> in access size of 8, 16, 32 bits.

Do you agree that you said these things?  (I'll assume you will agree,
because you're going to look _real_ stupid if you don't, because it's
all in the public archives.)

Okay, now let's consider what you've just said.  For the avoidance of
any doubt, I'll re-quote the exact bit which I'm referring to:

> But I would consider mmaping /dev/mem even simpler: it doesn't
> require writing a single line of kernel code, and it works as
> expected. The gist of the code is:
> 
>   fd = open("/path/to/file", O_RDONLY);
>   mem_fd = open("/dev/mem", O_WRONLY | O_SYNC);
>   va = mmap(NULL, len, PROT_WRITE, MAP_SHARED, mem_fd, phys_addr);
>   read(fd, va, len);

Okay, now, read() uses copy_to_user() to copy the data from kernel
space into userspace - and in this instance, that happens to be the
mapping you've set up against /dev/mem, which corresponds to the
physical addresses.  In other words, it's writing to the same
"phys_addr" as your block_copy*() functions, except it's writing
using copy_to_user().  You have stated (as we can see above) that
your solution using read() "works as expected".  That's great,
however...

In exhibit A and B, you've repeatedly stated why you are unable to
use the copy_*_user() functions for this purpose, making the claims
that specific access sizes are required.  You have just proven that
you are wrong, and that my _very_ _first_ reply in this thread was
correct.  Rather than trying it out (and finding that it _would_
have worked) you've instead argued against it, wasting my time.

I am far from impressed.

I suggest you read Aesop's Fables, specifically The Boy Who Cried
Wolf.  It seems rather appropriate here, and illustrates what can
happen if this behaviour continues.

Please, smarten up your act.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-23 23:59                 ` Russell King - ARM Linux
@ 2016-01-24 13:27                   ` Mason
  2016-01-27 10:36                     ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-24 13:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/01/2016 00:59, Russell King - ARM Linux wrote:

> I'll quote what you said in previous mails in this thread,
> so there's no misunderstanding.

Russell,

Our "software stack" provides the kernel API under discussion:

  {read,write}_data{8,16,32}

These 6 functions must be implemented, because they are part
of the API we provide to customers. As the "page domain fault"
underscores, my own implementation is incorrect. I am grateful
for the implementation you suggested up-thread, and will test
its performance on Monday.

Our "software stack" also provides several user-space tools,
some of which use the above kernel API. One of those tools is
the one I described, to copy files to remote RAM.

The current code uses write_data8, but the implementation is
irrelevant. Any implementation is acceptable, as long as the
tool works as expected. So I can take advantage of the tool's
specificities to optimize.

Anyway, thanks again for the guidance and advice, and please
give a little more credit.

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-24 13:27                   ` Mason
@ 2016-01-27 10:36                     ` Mason
  2016-01-27 10:48                       ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Mason @ 2016-01-27 10:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/01/2016 14:27, Mason wrote:

> Our "software stack" provides the kernel API under discussion:
> 
>   {read,write}_data{8,16,32}
> 
> These 6 functions must be implemented, because they are part
> of the API we provide to customers. As the "page domain fault"
> underscores, my own implementation is incorrect. I am grateful
> for the implementation you suggested up-thread, and will test
> its performance on Monday.

For the record, I've now changed the implementation as follows.
I'll benchmark performance as soon as I fix the other bug in
the module.

#define DEFINE_BLOCK_READ(N)							\
static int read##N(void __user *dest, void *buf, void __iomem *io, size_t len)	\
{										\
	size_t i; u##N *temp = buf;						\
	for (i = 0; i < len; i += N/8) temp[i] = RD##N(io + i);			\
	return copy_to_user(dest, temp, len) ? -EFAULT : 0;			\
}

#define RD8	readb_relaxed
#define RD16	readw_relaxed
#define RD32	readl_relaxed

DEFINE_BLOCK_READ(8)
DEFINE_BLOCK_READ(16)
DEFINE_BLOCK_READ(32)

#define DEFINE_BLOCK_WRITE(N)							\
static int write##N(void __user *src, void *buf, void __iomem *io, size_t len)	\
{										\
	size_t i; u##N *temp = buf;						\
	if (copy_from_user(temp, src, len)) return -EFAULT;			\
	for (i = 0; i < len; i += N/8) WR##N(temp[i], io + i);			\
	return 0;								\
}

#define WR8	writeb_relaxed
#define WR16	writew_relaxed
#define WR32	writel_relaxed

DEFINE_BLOCK_WRITE(8)
DEFINE_BLOCK_WRITE(16)
DEFINE_BLOCK_WRITE(32)

#define TILE_SIZE (16u << 10)
typedef int fun_t(void __user *ua, void *buf, void __iomem *io, size_t len);

static int block_copy(void __user *ua, phys_addr_t pa, size_t bytes, fun_t *fun)
{
	int err = 0;
	size_t pos = 0;
	void *buf = kmalloc(TILE_SIZE, GFP_KERNEL);
	if (buf == NULL) err = -ENOMEM;

	while (pos < bytes && !err)
	{
		size_t tile = min(bytes-pos, TILE_SIZE);
		void __iomem *va = ioremap(pa + pos, tile);

		err = va ? fun(ua + pos, buf, va, tile) : -EFAULT;
		iounmap(va);
		pos += tile;
	}

	kfree(buf);
	return err;
}


and then the ioctl dispatcher calls e.g.
	block_copy(user_addr, phys_addr, count*4, read32);


IIUC, Arnd mentioned that there might be an issue using readl_relaxed
on a memory region with a big-endian kernel.


The access_ok() macro could be hoisted out of the inner functions,
into block_copy() => that would save about 350 bytes. (I'm not sure
it is worth it.)

   text	   data	    bss	    dec	    hex	filename
  18111	    188	  15036	  33335	   8237	kllad.o

   text	   data	    bss	    dec	    hex	filename
  17759	    188	  15036	  32983	   80d7	kllad.o


Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-27 10:36                     ` Mason
@ 2016-01-27 10:48                       ` Russell King - ARM Linux
  2016-01-27 12:04                         ` Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2016-01-27 10:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 27, 2016 at 11:36:51AM +0100, Mason wrote:
> On 24/01/2016 14:27, Mason wrote:
> 
> > Our "software stack" provides the kernel API under discussion:
> > 
> >   {read,write}_data{8,16,32}
> > 
> > These 6 functions must be implemented, because they are part
> > of the API we provide to customers. As the "page domain fault"
> > underscores, my own implementation is incorrect. I am grateful
> > for the implementation you suggested up-thread, and will test
> > its performance on Monday.
> 
> For the record, I've now changed the implementation as follows.
> I'll benchmark performance as soon as I fix the other bug in
> the module.
> 
> #define DEFINE_BLOCK_READ(N)							\
> static int read##N(void __user *dest, void *buf, void __iomem *io, size_t len)	\
> {										\
> 	size_t i; u##N *temp = buf;						\
> 	for (i = 0; i < len; i += N/8) temp[i] = RD##N(io + i);			\
> 	return copy_to_user(dest, temp, len) ? -EFAULT : 0;			\
> }
> 
> #define RD8	readb_relaxed
> #define RD16	readw_relaxed
> #define RD32	readl_relaxed
> 
> DEFINE_BLOCK_READ(8)
> DEFINE_BLOCK_READ(16)
> DEFINE_BLOCK_READ(32)
> 
> #define DEFINE_BLOCK_WRITE(N)							\
> static int write##N(void __user *src, void *buf, void __iomem *io, size_t len)	\
> {										\
> 	size_t i; u##N *temp = buf;						\
> 	if (copy_from_user(temp, src, len)) return -EFAULT;			\
> 	for (i = 0; i < len; i += N/8) WR##N(temp[i], io + i);			\
> 	return 0;								\
> }
> 
> #define WR8	writeb_relaxed
> #define WR16	writew_relaxed
> #define WR32	writel_relaxed
> 
> DEFINE_BLOCK_WRITE(8)
> DEFINE_BLOCK_WRITE(16)
> DEFINE_BLOCK_WRITE(32)
> 
> #define TILE_SIZE (16u << 10)
> typedef int fun_t(void __user *ua, void *buf, void __iomem *io, size_t len);
> 
> static int block_copy(void __user *ua, phys_addr_t pa, size_t bytes, fun_t *fun)
> {
> 	int err = 0;
> 	size_t pos = 0;
> 	void *buf = kmalloc(TILE_SIZE, GFP_KERNEL);
> 	if (buf == NULL) err = -ENOMEM;
> 
> 	while (pos < bytes && !err)
> 	{
> 		size_t tile = min(bytes-pos, TILE_SIZE);
> 		void __iomem *va = ioremap(pa + pos, tile);
> 
> 		err = va ? fun(ua + pos, buf, va, tile) : -EFAULT;
> 		iounmap(va);
> 		pos += tile;
> 	}
> 
> 	kfree(buf);
> 	return err;
> }
> 
> 
> and then the ioctl dispatcher calls e.g.
> 	block_copy(user_addr, phys_addr, count*4, read32);
> 
> 
> IIUC, Arnd mentioned that there might be an issue using readl_relaxed
> on a memory region with a big-endian kernel.

I think you're confused there, or Arnd's comment was incorrect.

In any case, I'm even more pissed off with you.  Let me quote from
your earlier emails:

> The problem is that the kernel module's API is already set
> in stone, and it requires block copies with specific access
> sizes, e.g. block_copy8, block_copy16, block_copy32.

and:

> So, a little over a decade ago, someone decided that these
> functions would have the following prototype:
> 
> int read_data8  (u8  *user_addr, u8  *phys_addr, int count)
> int read_data16 (u16 *user_addr, u16 *phys_addr, int count)
> int read_data32 (u32 *user_addr, u32 *phys_addr, int count)
> 
> int write_data8 (u8  *user_addr, u8  *phys_addr, int count)
> int write_data16(u16 *user_addr, u16 *phys_addr, int count)
> int write_data32(u32 *user_addr, u32 *phys_addr, int count)

However, you've done away with those prototypes, and instead come up
with something that looks like:

int block_copy(void __user *ua, phys_addr_t pa, size_t bytes, fun_t *fun)

?

So, let me get this straight.  You demand that the API can't be changed,
and I provide you with a solution which results in very little change to
the API.

However, rather than testing the version I carefully created for you,
you've decided that you're not going to, instead coming up with your
own solution which breaks all your previous demands.

This is totally rediculous.  Why should I waste any more time with _any_
of your questions?

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Unhandled fault: page domain fault (0x81b) at 0x00e41008
  2016-01-27 10:48                       ` Russell King - ARM Linux
@ 2016-01-27 12:04                         ` Mason
  0 siblings, 0 replies; 15+ messages in thread
From: Mason @ 2016-01-27 12:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 27/01/2016 11:48, Russell King - ARM Linux wrote:

> Mason wrote:
> 
>> IIUC, Arnd mentioned that there might be an issue using readl_relaxed
>> on a memory region with a big-endian kernel.
> 
> I think you're confused there, or Arnd's comment was incorrect.

I have certainly misunderstood Arnd's comment.

> In any case, I'm even more pissed off with you.  Let me quote from
> your earlier emails:
> 
>> The problem is that the kernel module's API is already set
>> in stone, and it requires block copies with specific access
>> sizes, e.g. block_copy8, block_copy16, block_copy32.
> 
> and:
> 
>> So, a little over a decade ago, someone decided that these
>> functions would have the following prototype:
>>
>> int read_data8  (u8  *user_addr, u8  *phys_addr, int count)
>> int read_data16 (u16 *user_addr, u16 *phys_addr, int count)
>> int read_data32 (u32 *user_addr, u32 *phys_addr, int count)
>>
>> int write_data8 (u8  *user_addr, u8  *phys_addr, int count)
>> int write_data16(u16 *user_addr, u16 *phys_addr, int count)
>> int write_data32(u32 *user_addr, u32 *phys_addr, int count)
> 
> However, you've done away with those prototypes, and instead come up
> with something that looks like:
> 
> int block_copy(void __user *ua, phys_addr_t pa, size_t bytes, fun_t *fun)
> 
> ?
> 
> So, let me get this straight.  You demand that the API can't be changed,
> and I provide you with a solution which results in very little change to
> the API.

I don't understand what is unclear in my description.

The user-space API to call into the kernel module is, and has always been:

	int read_data8  (u8 *user_addr,  u32 phys_addr, int count)
	int read_data16 (u16 *user_addr, u32 phys_addr, int count)
	int read_data32 (u32 *user_addr, u32 phys_addr, int count)

	int write_data8 (u8 *user_addr,  u32 phys_addr, int count)
	int write_data16(u16 *user_addr, u32 phys_addr, int count)
	int write_data32(u32 *user_addr, u32 phys_addr, int count)

A user-space shim shoves the parameters inside a struct and calls into
the kernel module via ioctl. The ioctl dispatcher then unpacks the
parameters and calls block_copy:

	case DIRECT_IOCTL_READ_DATA32:
	{
		struct data32 s;
		if (copy_from_user(&s, (void *)arg, sizeof s)) break;
		rc = block_copy(s.data, s.byte_address, s.count*4, read32);
		break;
	}

> However, rather than testing the version I carefully created for you,
> you've decided that you're not going to, instead coming up with your
> own solution which breaks all your previous demands.

Why do you say that my solution breaks my previous demands?

Note that block_copy is directly inspired from your read_data8 example.
The prototypes are nearly identical:

int block_copy(void __user *ua, phys_addr_t pa, size_t bytes, fun_t *fun)
int read_data8(u8 __user *user_addr, phys_addr_t phys_addr, size_t bytes)

I just added the function pointer to be able to generalize the function
for 16-bit and 32-bit accesses.

By the way, thanks again for providing read_data8, it really helped me
get a better grasp of the process.

I don't understand what it is you found so blatantly outrageous in my
message? Perhaps I need to improve my writing skills.

> This is totally rediculous.  Why should I waste any more time with _any_
> of your questions?

All I can say is that I am grateful for the help you've provided so far,
and I will try to be clearer in my problem descriptions in the future.

Regards.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-01-27 12:04 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-01-22 17:37 Unhandled fault: page domain fault (0x81b) at 0x00e41008 Mason
2016-01-22 17:48 ` Russell King - ARM Linux
2016-01-22 18:59   ` Mason
2016-01-22 19:34     ` Russell King - ARM Linux
2016-01-22 23:15       ` Mason
2016-01-22 23:57         ` Russell King - ARM Linux
2016-01-23 11:14           ` Mason
2016-01-23 11:34             ` Russell King - ARM Linux
2016-01-23 20:53               ` Mason
2016-01-23 22:46                 ` Mason
2016-01-23 23:59                 ` Russell King - ARM Linux
2016-01-24 13:27                   ` Mason
2016-01-27 10:36                     ` Mason
2016-01-27 10:48                       ` Russell King - ARM Linux
2016-01-27 12:04                         ` Mason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).