* [RFC] prevent "dd if=/dev/mem" crash
@ 2003-10-17 22:10 Bjorn Helgaas
  2003-10-17 22:23 ` Matt Mackall
  2003-10-17 22:50 ` Andrew Morton
  0 siblings, 2 replies; 24+ messages in thread
From: Bjorn Helgaas @ 2003-10-17 22:10 UTC
  To: linux-ia64, linux-kernel

This is the generic part of a change to prevent "dd if=/dev/mem"
from causing a machine check on ia64.

read_mem() and write_mem() already check the requested address
against "high_memory", but that is only a complete check if
everything from 0 to high_memory is valid, readable/writable
memory.  Obviously that's not the case for architectures with
discontiguous memory, like ia64.

Old behavior:

    # dd if=/dev/mem of=/dev/null
    <unrecoverable machine check>

New behavior (this system has a hole from 0-16MB, then memory
from 16MB-1GB):

    # dd if=/dev/mem of=/dev/null
    0+0 records in
    0+0 records out
    0 bytes transferred in 0.000282 seconds (0 bytes/sec)

    # dd if=/dev/mem of=/dev/null bs=1M skip=16 
    1004+10 records in
    1004+10 records out
    1056964608 bytes transferred in 1.629262 seconds (648738280 bytes/sec)
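
As a rough user-space illustration of the behavior shown above (not
part of the patch): the check can be modeled as a lookup in a table of
valid physical ranges that rejects addresses inside a hole and clamps
the count to the end of the containing range.  The names mem_range,
example_map and range_is_valid are hypothetical, and this is not the
ia64 efi_valid_mem_range() implementation, which is not shown in this
message.

```c
#include <assert.h>
#include <stddef.h>

#define MB (1024UL * 1024UL)

struct mem_range {
	unsigned long start;	/* first valid physical address */
	unsigned long end;	/* one past the last valid address */
};

/* Hypothetical map matching the example system: hole 0-16MB, RAM 16MB-1GB. */
static const struct mem_range example_map[] = {
	{ 16 * MB, 1024 * MB },
};

/*
 * Return 1 and clamp *count if addr lies inside a valid range.
 * Return 0 if addr falls in a hole or past the end of memory.
 */
static int range_is_valid(unsigned long addr, size_t *count)
{
	size_t i;

	assert(count != NULL);
	for (i = 0; i < sizeof(example_map) / sizeof(example_map[0]); i++) {
		const struct mem_range *r = &example_map[i];

		if (addr < r->start || addr >= r->end)
			continue;
		if (*count > r->end - addr)
			*count = r->end - addr;
		return 1;
	}
	return 0;
}
```

With this table, a request at address 0 is rejected, which is why the
first dd sees end-of-file immediately, while a request starting at
16MB is accepted and clamped at the 1GB boundary.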

I expect there are probably different opinions about the idea
that "dd if=/dev/mem" exits without doing anything.  Sparc and
68K have nearby code that bit-buckets writes and returns zeroes
for reads of page zero.  We could do that, too, but it seems like
kind of a hack, and holes on ia64 can be BIG (on the order of
256GB for one box).
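
For comparison, the zero-fill alternative mentioned above could look
roughly like the sketch below.  It is only an illustration under
assumed names (HOLE_END, zero_fill_hole) and an assumed single hole
starting at address 0; it is not the sparc or m68k code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MB (1024UL * 1024UL)

/* Hypothetical layout: a hole below HOLE_END, then ordinary memory. */
#define HOLE_END (16 * MB)

/*
 * Zero the part of buf that corresponds to addresses inside the hole
 * and return how many bytes were zero-filled.  The caller would copy
 * any remaining bytes from real memory.
 */
static size_t zero_fill_hole(unsigned long addr, char *buf, size_t count)
{
	size_t n;

	assert(buf != NULL);
	if (addr >= HOLE_END)
		return 0;
	n = HOLE_END - addr;
	if (n > count)
		n = count;
	memset(buf, 0, n);
	return n;
}
```

With a hole on the order of 256GB, a reader walking /dev/mem from
offset 0 would receive that many zero bytes before reaching real
memory, which is the cost weighed above.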

So flame away :-)

The patch below is mangled so it won't apply easily.  If this
seems a reasonable approach, I'll submit the ia64 piece first,
then repost this.

Bjorn

===== drivers/char/mem.c 1.44 vs edited =====
--- 1.44/ drivers/char/mem.c	Sun Sep 21 15:50:34 2003
+++ edited/ drivers/char/mem.c	Fri Oct 17 15:37:47 2003
@@ -79,6 +79,24 @@
 #endif
 }
 
+static inline int valid_mem_range(unsigned long addr, size_t *count)
+{
+#if defined(CONFIG_IA64)
+	return efi_valid_mem_range(addr, count);
+#else
+	unsigned long end_mem;
+
+	end_mem = __pa(high_memory);
+	if (addr >= end_mem)
+		return 0;
+
+	if (*count > end_mem - addr)
+		*count = end_mem - addr;
+
+	return 1;
+#endif
+}
+
 static ssize_t do_write_mem(struct file * file, void *p, unsigned long realp,
 			    const char * buf, size_t count, loff_t *ppos)
 {
@@ -113,14 +131,10 @@
 			size_t count, loff_t *ppos)
 {
 	unsigned long p = *ppos;
-	unsigned long end_mem;
 	ssize_t read;
 
-	end_mem = __pa(high_memory);
-	if (p >= end_mem)
+	if (!valid_mem_range(p, &count))
 		return 0;
-	if (count > end_mem - p)
-		count = end_mem - p;
 	read = 0;
 #if defined(__sparc__) || (defined(__mc68000__) && defined(CONFIG_MMU))
 	/* we don't have page 0 mapped on sparc and m68k.. */
@@ -149,13 +163,9 @@
 			 size_t count, loff_t *ppos)
 {
 	unsigned long p = *ppos;
-	unsigned long end_mem;
 
-	end_mem = __pa(high_memory);
-	if (p >= end_mem)
+	if (!valid_mem_range(p, &count))
 		return 0;
-	if (count > end_mem - p)
-		count = end_mem - p;
 	return do_write_mem(file, __va(p), p, buf, count, ppos);
 }
 
===== include/linux/efi.h 1.3 vs edited =====
--- 1.3/ include/linux/efi.h	Thu Aug  7 14:01:48 2003
+++ edited/ include/linux/efi.h	Thu Oct 16 16:54:52 2003
@@ -266,6 +266,7 @@
 extern u64 efi_get_iobase (void);
 extern u32 efi_mem_type (unsigned long phys_addr);
 extern u64 efi_mem_attributes (unsigned long phys_addr);
+extern int efi_valid_mem_range (unsigned long phys_addr, unsigned long *count);
 
 /*
  * Variable Attributes


* fielding PCI bus timeouts - was: prevent "dd if=/dev/mem" crash
@ 2003-10-20 15:58 Rich Altmaier
  0 siblings, 0 replies; 24+ messages in thread
From: Rich Altmaier @ 2003-10-20 15:58 UTC
  To: linux-ia64

Just a note to mention experience with handling hardware failures.

For the case of user mappings to IO buses there are important
classes of apps, usually realtime or data acquisition, that
benefit from nice error handling of bus timeouts at user level.

These apps tend to be using old or prototype hardware, which can
fail (cause a bus timeout) during "normal" operation.
Or at least the user view is that the machine should not crash
due to "one flaky board".   Hence there is merit in being
able to translate a PIO-read bus timeout to say a SIGBUS.
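
A minimal user-level sketch of what such an application might do,
assuming the kernel did deliver SIGBUS synchronously for a failed
PIO read (that delivery is the suggestion here, not a statement about
existing behavior).  The names probe_read and on_sigbus are
hypothetical.  It covers only the synchronous read case, not
asynchronous write failures.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf probe_env;

/* Jump back into probe_read() instead of retrying the faulting load. */
static void on_sigbus(int signo)
{
	(void)signo;
	siglongjmp(probe_env, 1);
}

/*
 * Read one word from addr.  Returns 0 and stores the value in *out on
 * success, or -1 if the handler could not be installed or the access
 * raised SIGBUS.
 */
static int probe_read(const volatile unsigned int *addr, unsigned int *out)
{
	struct sigaction sa, old;
	int rc = -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) != 0)
		return -1;

	/* Save the signal mask so it is restored after siglongjmp(). */
	if (sigsetjmp(probe_env, 1) == 0) {
		*out = *addr;
		rc = 0;
	}

	sigaction(SIGBUS, &old, NULL);
	return rc;
}
```

An application would call probe_read() on a mapped device register
and treat a -1 return as "board did not respond" rather than crashing.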

More interesting is the case of PIO-write failure, as the writes
can be asynchronous: by the time the hardware recognizes
a failure, the CPU's store instruction has graduated and the
CPU has moved on, perhaps gone through a context switch or
even exited the user process.  In this case something more
than a SIGBUS is needed (IRIX has several options to deal with
this).

On the thread about dd if=/dev/mem, I don't know of any legitimate
reason that user code needs to successfully recover from reading
nonexistent phys memory.  I would suggest the principle that
bad user code should not cause a crash, and bad user code that
does a lot of reads would be thought harmless by many dangerous
users.  So some kind of error to the user process is probably
reasonable, perhaps SIGBUS.
Silently returning 0 doesn't sound right: if there were any
legitimate reason for this code in the first place, it probably
relates to some search of the physical address space, perhaps
a diagnostic.

FYI, Rich


end of thread, newest message ~2003-10-23 21:12 UTC

Thread overview: 24+ messages
2003-10-17 22:10 [RFC] prevent "dd if=/dev/mem" crash Bjorn Helgaas
2003-10-17 22:23 ` Matt Mackall
2003-10-17 22:50 ` Andrew Morton
2003-10-17 23:25   ` Bjorn Helgaas
2003-10-17 23:55     ` Andrew Morton
2003-10-18  0:15       ` William Lee Irwin III
2003-10-18  0:21       ` David Mosberger
2003-10-18  0:49         ` Andrew Morton
2003-10-18  1:31           ` Matt Chapman
2003-10-18  1:41             ` Andrew Morton
2003-10-18  1:48           ` David Mosberger
2003-10-18  2:01             ` Andrew Morton
2003-10-18  2:01             ` Matt Chapman
2003-10-19 11:25           ` Eric W. Biederman
2003-10-19 19:01             ` William Lee Irwin III
2003-10-20 15:17         ` Bjorn Helgaas
2003-10-20 18:48           ` David Mosberger
2003-10-20 15:58         ` fielding PCI bus timeouts - was: " Rich Altmaier
2003-10-20 17:42       ` [RFC] " Bjorn Helgaas
2003-10-23 21:05         ` Bjorn Helgaas
2003-10-19 18:17   ` Pavel Machek
2003-10-23  8:33     ` Martin Pool
2003-10-23  9:31       ` Zoltan Menyhart
