Re: VM subsystem bug in 2.4.0 ?

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Christoph Rohland <cr@sap.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
	Rik van Riel <riel@conectiva.com.br>,
	"Sergey E. Volkov" <sve@raiden.bancorp.ru>,
	linux-kernel@vger.kernel.org
Subject: Re: VM subsystem bug in 2.4.0 ?
Date: 10 Jan 2001 08:33:47 +0100	[thread overview]
Message-ID: <m3itnneu90.fsf@linux.local> (raw)
In-Reply-To: <Pine.LNX.4.10.10101091447540.2633-100000@penguin.transmeta.com>

Hi Linus,

Linus Torvalds <torvalds@transmeta.com> writes:

> I'd really like an opinion on whether this is truly legal or not? After
> all, it does change the behaviour to mean "pages are locked only if they
> have been mapped into virtual memory". Which is not what it used to mean.
> 
> Arguably the new semantics are perfectly valid semantics on their
> own, but I'm not sure they are acceptable.

I just checked SuS and they do not list SHM_LOCK as command at all.

> In contrast, the PG_realdirty approach would give the old behaviour of
> truly locked-down shm segments, with not significantly different
> complexity behaviour.
>
> What do other UNIXes do for shm_lock()?
> 
> The Linux man-page explicitly states for SHM_LOCK that
> 
> 	The user must fault in any pages that are required to be present
> 	after locking is enabled.
> 
> which kind of implies to me that the VM_LOCKED implementation is ok.

Yes.

> HOWEVER, looking at the HP-UX man-page, for example, certainly implies
> that the PG_realdirty approach is the correct one. 

Yes.

> The IRIX man-pages in contrast say
> 
> 				Locking occurs per address space;
>         multiple processes or sprocs mapping the area at different
>         addresses each need to issue the lock (this is primarily an
>         issue with the per-process page tables).
> 
> which again implies that they've done something akin to a VM_LOCKED
> implementation.

So Irix does something quite different. For Irix SHM_LOCK is a special
version of mlock...

> Does anybody have any better pointers, ideas, or opinions?

I think the VM_LOCKED approach is the best: 

- SuS does not specify anything, the different vendors do different
  things. So people using SHM_LOCK have to be aware that the details
  differ.
- Technically this is the fastest approach for attached segments: We
  do not scan the relevent vmas at all and by doing so we keep the
  overhead lowest. And I do not see a reason to use SHM_LOCK besides
  performance.

BTW I also have a patch appended which bumps the page count. Works
also, is also small, but we will have a higher soft fault rate with
that.

Greetings 
                Christoph

diff -uNr 2.4.0/ipc/shm.c c/ipc/shm.c
--- 2.4.0/ipc/shm.c	Mon Jan  8 11:24:39 2001
+++ c/ipc/shm.c	Tue Jan  9 17:48:55 2001
@@ -121,6 +121,7 @@
 {
 	shm_tot -= (shp->shm_segsz + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	shm_rmid (shp->id);
+	shmem_lock(shp->shm_file, 0);
 	fput (shp->shm_file);
 	kfree (shp);
 }
@@ -467,10 +468,10 @@
 		if(err)
 			goto out_unlock;
 		if(cmd==SHM_LOCK) {
-			shp->shm_file->f_dentry->d_inode->u.shmem_i.locked = 1;
+			shmem_lock(shp->shm_file, 1);
 			shp->shm_flags |= SHM_LOCKED;
 		} else {
-			shp->shm_file->f_dentry->d_inode->u.shmem_i.locked = 0;
+			shmem_lock(shp->shm_file, 0);
 			shp->shm_flags &= ~SHM_LOCKED;
 		}
 		shm_unlock(shmid);
diff -uNr 2.4.0/mm/shmem.c c/mm/shmem.c
--- 2.4.0/mm/shmem.c	Mon Jan  8 11:24:39 2001
+++ c/mm/shmem.c	Tue Jan  9 18:04:16 2001
@@ -310,6 +310,8 @@
 	}
 	/* We have the page */
 	SetPageUptodate (page);
+	if (info->locked)
+		page_cache_get(page);
 
 cached_page:
 	UnlockPage (page);
@@ -399,6 +401,32 @@
 	spin_unlock (&sb->u.shmem_sb.stat_lock);
 	buf->f_namelen = 255;
 	return 0;
+}
+
+void shmem_lock(struct file * file, int lock)
+{
+	struct inode * inode = file->f_dentry->d_inode;
+	struct shmem_inode_info * info = &inode->u.shmem_i;
+	struct page * page;
+	unsigned long idx, size;
+
+	if (info->locked == lock)
+		return;
+	down(&inode->i_sem);
+	info->locked = lock;
+	size = (inode->i_size + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	for (idx = 0; idx < size; idx++) {
+		page = find_lock_page(inode->i_mapping, idx);
+		if (!page)
+			continue;
+		if (!lock) {
+			/* release the extra count and our reference */
+			page_cache_release(page);
+			page_cache_release(page);
+		}
+		UnlockPage(page);
+	}
+	up(&inode->i_sem);
 }
 
 /*

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

next prev parent reply	other threads:[~2001-01-10  7:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-01-08  8:46 VM subsystem bug in 2.4.0 ? Sergey E. Volkov
2001-01-08 18:00 ` Rik van Riel
2001-01-08 18:10   ` Linus Torvalds
2001-01-08 18:30     ` Rik van Riel
2001-01-09  7:52       ` Christoph Rohland
2001-01-09 14:09       ` Stephen C. Tweedie
2001-01-09 14:53         ` Christoph Rohland
2001-01-09 15:31           ` Stephen C. Tweedie
2001-01-09 15:45             ` Christoph Rohland
2001-01-09 16:05               ` Stephen C. Tweedie
2001-01-09 16:17                 ` Christoph Rohland
2001-01-09 18:37                   ` Linus Torvalds
2001-01-09 16:45                 ` Daniel Phillips
2001-01-17  8:33                   ` Rik van Riel
2001-01-18  8:23                     ` Christoph Rohland
2001-01-25 22:47                       ` Daniel Phillips
2001-01-09 18:36             ` Linus Torvalds
2001-01-09 18:23         ` Linus Torvalds
2001-01-09 22:20           ` Christoph Rohland
2001-01-09 22:59             ` Linus Torvalds
2001-01-10  7:33               ` Christoph Rohland [this message]
2001-01-10 15:50               ` Tim Wright

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m3itnneu90.fsf@linux.local \
    --to=cr@sap.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@conectiva.com.br \
    --cc=sct@redhat.com \
    --cc=sve@raiden.bancorp.ru \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.