public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated)
@ 2001-03-27 21:29 Richard Jerrell
  2001-03-27 21:18 ` Rik van Riel
  2001-03-27 21:51 ` [PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated) Linus Torvalds
  0 siblings, 2 replies; 57+ messages in thread
From: Richard Jerrell @ 2001-03-27 21:29 UTC (permalink / raw)
  To: Stephen Tweedie; +Cc: linux-kernel, Rik van Riel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1901 bytes --]

> fork and exit are very hot paths in the kernel, and this patch can force
> a page cache lookup on a large number of pte which wouldn't be looked
> up before.

True, but I don't know how large of a performance hit the system takes.

> Given that the leak is, as you say, temporary, and that the leak will
> be recovered as soon as we start swapping again, do we really want to
> pollute the fast path for the sake of a bit more speed during
> swapping?

It isn't speed of swapping that is the biggest problem.  The problem is
that if you run a memory intensive task, exit after being placed on an
lru, and run it again, there won't be enough memory to execute because all
the memory you used previously is now sitting in the swap cache.  That
isn't to say that without being patched the speed isn't poor.  After all,
we'd be paging out a dead processes pages.

But you are right, this fix is slow and that can be improved.  So,
hopefully this patch is satisfactory in respect to speed and fixing the
leak.  And will also remove the panic which is possible with the other
patches (can't do a lookup_swap_cache with a spinlock held).

Instead of removing the swap cache page at process exit and possibly
expending time doing disk IO as you have pointed out, we check during
refill_inactive_scan and page_launder for a page that is 

1)  in the swap cache 
2)  is not locked 
3)  is only being referenced by the swap cache, us, and possibly by
    buffers
4)  has no one else referencing the swap cell

If that is true, we can safely remove that page without writing it to
disk.  In addition, the number of swap cache pages are included in the
amount returned from vm_enough_memory to get rid of the temporary leak.

So, the exit path remains unchanged, reclaiming a page is faster for when
the page is no longer being mapped, and the lazy reclaiming for multiply
referenced pages remains intact.

Rich

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2555 bytes --]

diff -rNu linux-2.4.1/include/linux/swap.h linux-2.4.1-paging-fix/include/linux/swap.h
--- linux-2.4.1/include/linux/swap.h	Tue Jan 30 02:24:56 2001
+++ linux-2.4.1-paging-fix/include/linux/swap.h	Tue Mar 27 10:43:22 2001
@@ -69,6 +69,7 @@
 FASTCALL(unsigned int nr_free_buffer_pages(void));
 extern int nr_active_pages;
 extern int nr_inactive_dirty_pages;
+extern int nr_swap_cache_pages;
 extern atomic_t nr_async_pages;
 extern struct address_space swapper_space;
 extern atomic_t page_cache_size;
diff -rNu linux-2.4.1/mm/mmap.c linux-2.4.1-paging-fix/mm/mmap.c
--- linux-2.4.1/mm/mmap.c	Mon Jan 29 11:10:41 2001
+++ linux-2.4.1-paging-fix/mm/mmap.c	Tue Mar 27 10:38:17 2001
@@ -63,6 +63,7 @@
 	free += atomic_read(&page_cache_size);
 	free += nr_free_pages();
 	free += nr_swap_pages;
+	free += nr_swap_cache_pages;
 	return free > pages;
 }
 
diff -rNu linux-2.4.1/mm/swap_state.c linux-2.4.1-paging-fix/mm/swap_state.c
--- linux-2.4.1/mm/swap_state.c	Fri Dec 29 18:04:27 2000
+++ linux-2.4.1-paging-fix/mm/swap_state.c	Tue Mar 27 10:41:04 2001
@@ -17,6 +17,8 @@
 
 #include <asm/pgtable.h>
 
+int nr_swap_cache_pages = 0;
+
 static int swap_writepage(struct page *page)
 {
 	rw_swap_page(WRITE, page, 0);
@@ -58,6 +60,7 @@
 #ifdef SWAP_CACHE_INFO
 	swap_cache_add_total++;
 #endif
+	nr_swap_cache_pages++;
 	if (!PageLocked(page))
 		BUG();
 	if (PageTestandSetSwapCache(page))
@@ -96,6 +99,7 @@
 #ifdef SWAP_CACHE_INFO
 	swap_cache_del_total++;
 #endif
+	nr_swap_cache_pages--;
 	remove_from_swap_cache(page);
 	swap_free(entry);
 }
diff -rNu linux-2.4.1/mm/vmscan.c linux-2.4.1-paging-fix/mm/vmscan.c
--- linux-2.4.1/mm/vmscan.c	Mon Jan 15 15:36:49 2001
+++ linux-2.4.1-paging-fix/mm/vmscan.c	Tue Mar 27 12:31:04 2001
@@ -466,6 +466,17 @@
 			continue;
 		}
 
+		if (PageSwapCache(page)) {
+			page_cache_get(page);
+			if(!is_page_shared(page)) {
+				delete_from_swap_cache_nolock(page);
+				UnlockPage(page);
+				page_cache_release(page);
+				continue;
+			}
+			page_cache_release(page);
+		}
+
 		/*
 		 * Dirty swap-cache page? Write it out if
 		 * last copy..
@@ -650,6 +661,17 @@
 			list_del(page_lru);
 			nr_active_pages--;
 			continue;
+		}
+
+		if (PageSwapCache(page)) {
+			page_cache_get(page);
+			if(!is_page_shared(page) && !TryLockPage(page)) {
+				delete_from_swap_cache_nolock(page);
+				UnlockPage(page);
+				page_cache_release(page);
+				continue;
+			}
+			page_cache_release(page);
 		}
 
 		/* Do aging on the pages. */

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 14:15 Jesse Pollard
  2001-03-28 14:53 ` Russell King
  0 siblings, 1 reply; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 14:15 UTC (permalink / raw)
  To: kaos, jesse; +Cc: linux-kernel

Keith Owens <kaos@ocs.com.au>
> 
> On Wed, 28 Mar 2001 06:08:15 -0600, 
> Jesse Pollard <jesse@cats-chateau.net> wrote:
> >Sure - very simple. If the execute bit is set on a file, don't allow
> >ANY write to the file. This does modify the permission bits slightly
> >but I don't think it is an unreasonable thing to have.
> 
> man strip
> man objcopy
> man ld

Thought of theses already (well, at least ld...)

strip - not used that much (most executables still have their symbol table
	but could be handled by removing the execute bit, stripping, then
	putting it back. Or just use the ld option -s.
objcopy - copies object files. Object files are not marked executable...
ld	- on other UNIX systems (Cray/IRIX), I think the output file
	(-o) specified is first deleted. Whenever I can cause a link
	error, the output is not marked executable. If the GNU ld doesn't
	delete it first, then it most likely should.

I was expecting shell scripts to be the complaint first... :-)

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 14:40 Jesse Pollard
  2001-03-28 15:08 ` Russell King
  2001-03-29 12:05 ` Walter Hofmann
  0 siblings, 2 replies; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 14:40 UTC (permalink / raw)
  To: snwahofm, Jesse Pollard; +Cc: Shawn Starr, linux-kernel

> 
> 
> 
> On Wed, 28 Mar 2001, Jesse Pollard wrote:
> 
> > >Any idea?
> > 
> > Sure - very simple. If the execute bit is set on a file, don't allow
> > ANY write to the file. This does modify the permission bits slightly
> > but I don't think it is an unreasonable thing to have.
> 
> And how exactly does this help?
> 
> fchmod (fd, 0666);
> fwrite (fd, ...);
> fchmod (fd, 0777);

By itself it doesn't - but if you also don't have user/group/world rw and
don't own the file, you can't do anything to it.

It's only there to reduce accidents. If you want to go full out,
remove the symbols from the file.

Now, if ELF were to be modified, I'd just add a segment checksum
for each segment, then put the checksum in the ELF header as well as
in the/a segment header just to make things harder. At exec time a checksum
verify could (expensive) be done on each segment. A reduced level could be
done only on the data segment or text segment. This would at least force
the virus to completly read the file to regenerate the checksum.

That change would even allow for signature checks of the checksum if the
signature was stored somewhere else (system binaries/setuid binaries...).
But only in a high risk environment. This could even be used for a scanner
to detect ANY change to binaries (and fast too - signature check of checksums
wouldn't require reading the entire file).

In any case, the problem is limited to one user, even if nothing is done.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 14:43 Jesse Pollard
  0 siblings, 0 replies; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 14:43 UTC (permalink / raw)
  To: rmk, Jesse Pollard; +Cc: linux-kernel

---------  Received message begins Here  ---------

> 
> On Wed, Mar 28, 2001 at 06:08:15AM -0600, Jesse Pollard wrote:
> > Sure - very simple. If the execute bit is set on a file, don't allow
> > ANY write to the file. This does modify the permission bits slightly
> > but I don't think it is an unreasonable thing to have.
> 
> Even easier method - remove the write permission bits from all executable
> files, and don't do the unsafe thing of running email/web browsers/other
> user-type stuff as user root.
> 
> If it still worries you that root can write to files without the 'w' bit
> set, modify the capabilities of the system to prevent it (there is a bit
> that can be set which will remove this ability for all new processes).

How about just adding MLS ... :-)

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 15:31 Jesse Pollard
  0 siblings, 0 replies; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 15:31 UTC (permalink / raw)
  To: sean, Jesse Pollard; +Cc: Shawn Starr, Matti Aarnio, linux-kernel

Sean Hunter <sean@dev.sportingbet.com>:
> On Wed, Mar 28, 2001 at 06:08:15AM -0600, Jesse Pollard wrote:
> > Sure - very simple. If the execute bit is set on a file, don't allow
> > ANY write to the file. This does modify the permission bits slightly
> > but I don't think it is an unreasonable thing to have.
> > 
> 
> Are we not then in the somewhat zen-like state of having an "rm" which can't
> "rm" itself without needing to be made non-executable so that it can't execute?

We've been in that state for a long time... (carefull updating that libc.so
file... can't overwrite/delete without having some REAL problems show up.)

It just calls for some carefull activity. If rm is being replaced, first
rename it; then put new one in place; chmod old; delete old. It is directly
comparable to the libc.so update procedure.

I should have left off the "very simple" remark.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 15:43 Jesse Pollard
  0 siblings, 0 replies; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 15:43 UTC (permalink / raw)
  To: rmk, Jesse Pollard; +Cc: kaos, jesse, linux-kernel

Russell King <rmk@arm.linux.org.uk>:
> On Wed, Mar 28, 2001 at 08:15:57AM -0600, Jesse Pollard wrote:
> > objcopy - copies object files. Object files are not marked executable...
> 
> objcopy copies executable files as well - check the kernel makefiles
> for examples.

At the time it's copying, the input doesn't need to be executable. That
appears to be a byproduct of a library link.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-28 15:51 Jesse Pollard
  2001-03-28 15:54 ` rmk
  0 siblings, 1 reply; 57+ messages in thread
From: Jesse Pollard @ 2001-03-28 15:51 UTC (permalink / raw)
  To: rmk, Jesse Pollard; +Cc: snwahofm, Jesse Pollard, Shawn Starr, linux-kernel

Russell King <rmk@arm.linux.org.uk>
> 
> On Wed, Mar 28, 2001 at 08:40:42AM -0600, Jesse Pollard wrote:
> > Now, if ELF were to be modified, I'd just add a segment checksum
> > for each segment, then put the checksum in the ELF header as well as
> > in the/a segment header just to make things harder. At exec time a checksum
> > verify could (expensive) be done on each segment. A reduced level could be
> > done only on the data segment or text segment. This would at least force
> > the virus to completly read the file to regenerate the checksum.
> 
> Checksums don't help that much - virus writers would treat it as "part
> of the set of alterations that need to be made" and then the checksum
> becomes zero protection.
> 
 [ snip of good stuff ]
> Therefore, if you follow good easy system administration techniques, then
> you end up minimising the risk of getting:
> 
> 1. viruses
> 2. trojans
> 3. malicious users
> 
> cracking your system.  If you don't follow these techniques, then you're
> asking for lots of trouble, and no amount of checksumming/signing/etc
> will ever save you.

Absolutely true. The only help the checksumming etc stuff is good for is
detecting the fact afterward by external comparison.

I like MLS for the ability to catch ATTEMPTS to make unauthorized
modification.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread
* Re: Disturbing news..
@ 2001-03-29 17:10 Jesse Pollard
  0 siblings, 0 replies; 57+ messages in thread
From: Jesse Pollard @ 2001-03-29 17:10 UTC (permalink / raw)
  To: snwahofm, Jesse Pollard; +Cc: Shawn Starr, linux-kernel

Walter Hofmann <snwahofm@mi.uni-erlangen.de>:
> On Wed, 28 Mar 2001, Jesse Pollard wrote:
[snip]
> > Now, if ELF were to be modified, I'd just add a segment checksum
> > for each segment, then put the checksum in the ELF header as well as
> > in the/a segment header just to make things harder. At exec time a checksum
> > verify could (expensive) be done on each segment. A reduced level could be
> > done only on the data segment or text segment. This would at least force
> > the virus to completly read the file to regenerate the checksum.
> 
> So? The virus will just redo the checksum. Sooner or later their will be a
> routine to do this in libbfd and this all reduces to a single additional
> line of code. 

true.

> > That change would even allow for signature checks of the checksum if the
> > signature was stored somewhere else (system binaries/setuid binaries...).
> > But only in a high risk environment. This could even be used for a scanner
> > to detect ANY change to binaries (and fast too - signature check of checksums
> > wouldn't require reading the entire file).
> 
> One sane way to do this is to store the sig on a ro medium and make the
> kernel check the sig of every binary before it is run.

Only for trusted binaries. (extreme paranoia now).
 
> HOWEVER, this means no compilers will work, and you have to delete all
> script languages like perl or python (or make all of them check the
> signature).

Compilers should work normally, the link phase is what would generate
the checksums, though if each object file contained a checksum for the
segment then the interpreters/dynamic loaders would have the choice.

The only applications I see as really needing to check such signatures
are those using PAM. These should do it anyway. The dynamic linking programs
should do so only if they are configured to do so.

> Useless again, IMO.
> 
> > In any case, the problem is limited to one user, even if nothing is done.
> 
> Your best bet is to educate your users.

User eduation is a reasonable substitute as long as they can be directed
to follow the rules.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2001-04-21  0:43 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-03-27 21:29 [PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated) Richard Jerrell
2001-03-27 21:18 ` Rik van Riel
2001-03-27 23:10   ` Richard Jerrell
2001-03-27 22:57     ` Rik van Riel
2001-03-28  0:53   ` Ideas for the oom problem james
2001-03-28  0:52     ` Rik van Riel
2001-03-28  1:14       ` Doug Ledford
2001-03-28  3:21         ` Rik van Riel
2001-03-28  3:41           ` Doug Ledford
2001-03-28  3:53             ` Rik van Riel
2001-03-28  1:39       ` james
2001-03-28  5:52     ` Jonathan Morton
2001-03-28  6:16       ` Disturbing news Shawn Starr
2001-03-28  6:33         ` Disturbing news.. Idea Shawn Starr
2001-04-21  0:43           ` Serious Latency problems : 2.4.4-pre5 Shawn Starr
2001-03-28  7:19         ` Disturbing news Matti Aarnio
2001-03-28  7:27           ` Shawn Starr
2001-03-28 12:08             ` Jesse Pollard
2001-03-28  5:50               ` Ben Ford
2001-03-28 12:50               ` Walter Hofmann
2001-03-28 14:04                 ` Simon Williams
2001-03-28 15:04                   ` Olivier Galibert
2001-03-28 15:49                     ` Simon Williams
2001-03-28 11:57                       ` Ben Ford
2001-03-29  8:02                         ` Helge Hafting
2001-03-28 17:51                       ` Olivier Galibert
2001-03-28 12:53               ` Keith Owens
2001-03-28 13:00               ` Russell King
2001-03-28 14:10               ` Sean Hunter
2001-03-28 15:36                 ` john slee
2001-03-28 16:18                   ` Jonathan Lundell
2001-04-02 23:10               ` Dr. Kelsey Hudson
2001-03-28 17:29             ` Horst von Brand
2001-03-28 10:00         ` Helge Hafting
2001-03-28 13:25         ` Alexander Viro
2001-03-28 14:32           ` Romano Giannetti
2001-03-28 14:57             ` Bill Rugolsky Jr.
2001-03-28 14:57             ` Alexander Viro
2001-03-28 16:14               ` Romano Giannetti
2001-03-28 14:38     ` Ideas for the oom problem Hacksaw
2001-03-28 15:56       ` Andreas Rogge
2001-03-28 23:33         ` Hacksaw
2001-03-28 23:47           ` Tim Haynes
2001-03-29  0:12             ` Hacksaw
2001-03-27 21:51 ` [PATCH] mm/memory.c, 2.4.1 : memory leak with swap cache (updated) Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2001-03-28 14:15 Disturbing news Jesse Pollard
2001-03-28 14:53 ` Russell King
2001-03-28 14:40 Jesse Pollard
2001-03-28 15:08 ` Russell King
2001-03-29 12:05 ` Walter Hofmann
2001-03-28 14:43 Jesse Pollard
2001-03-28 15:31 Jesse Pollard
2001-03-28 15:43 Jesse Pollard
2001-03-28 15:51 Jesse Pollard
2001-03-28 15:54 ` rmk
2001-03-28 21:19   ` Gerhard Mack
2001-03-29 17:10 Jesse Pollard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox