* [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6
@ 2000-10-27 20:21 Christoph Hellwig
[not found] ` <200010272123.OAA21478@penguin.transmeta.com>
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2000-10-27 20:21 UTC (permalink / raw)
To: linux-kernel
Ok, forgot to Cc linux-kernel ...
Please Cc linus on reply.
----- Forwarded message from Christoph Hellwig <hch@caldera.de> -----
Date: Fri, 27 Oct 2000 22:03:54 +0200
From: Christoph Hellwig <hch@caldera.de>
To: Linus Torvalds <torvalds@transmeta.com>
Subject: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6
X-Mailer: Mutt 1.0i
Hi Linus,
Stephen Tweedies last kiobuf patchset contained a lot bu fixes
besides new features. These bug-fixes are not yet merged in 2.4.0.
This patch contains forward-ports of the follwoing fixes
(quote from his 00README):
01-mapfix.diff
map_user_kiobuf() retries failed maps to cover a race in which
the swapper steals a page before the kiobuf has grabbed and
locked it.
02-iocount.diff
Kanoj Sarcar's fixes to allow kiobufs to work properly over
fork(), even on threaded applications.
04-eiofix.diff
Fix to return -EIO instead of 0 if a raw I/O read or write
encounters an error in the first block.
06-enxio.diff
Return ENXIO on read/write at or beyond the end of the device
for raw I/O
Please apply.
Christoph
--
Always remember that you are unique. Just like everyone else.
diff -uNr --exclude-from=dontdiff linux.orig/drivers/char/raw.c linux/drivers/char/raw.c
--- linux.orig/drivers/char/raw.c Thu Oct 19 13:21:24 2000
+++ linux/drivers/char/raw.c Tue Oct 24 13:25:47 2000
@@ -277,8 +277,11 @@
if ((*offp & sector_mask) || (size & sector_mask))
return -EINVAL;
- if ((*offp >> sector_bits) > limit)
+ if ((*offp >> sector_bits) > limit) {
+ if (size)
+ return -ENXIO;
return 0;
+ }
/*
* We'll just use one kiobuf
diff -uNr --exclude-from=dontdiff linux.orig/fs/buffer.c linux/fs/buffer.c
--- linux.orig/fs/buffer.c Tue Oct 24 13:15:49 2000
+++ linux/fs/buffer.c Tue Oct 24 13:26:31 2000
@@ -1924,6 +1924,8 @@
spin_unlock(&unused_list_lock);
+ if (!iosize)
+ return -EIO;
return iosize;
}
diff -uNr --exclude-from=dontdiff linux.orig/include/linux/mm.h linux/include/linux/mm.h
--- linux.orig/include/linux/mm.h Tue Oct 24 13:15:56 2000
+++ linux/include/linux/mm.h Tue Oct 24 14:41:46 2000
@@ -157,8 +157,9 @@
wait_queue_head_t wait;
struct page **pprev_hash;
struct buffer_head * buffers;
- void *virtual; /* non-NULL if kmapped */
+ void *virtual; /* non-NULL if kmapped */
struct zone_struct *zone;
+ atomic_t rawcount; /* count of raw io in progress */
} mem_map_t;
#define get_page(p) atomic_inc(&(p)->count)
diff -uNr --exclude-from=dontdiff linux.orig/mm/memory.c linux/mm/memory.c
--- linux.orig/mm/memory.c Tue Oct 24 13:15:58 2000
+++ linux/mm/memory.c Tue Oct 24 16:09:22 2000
@@ -138,6 +138,30 @@
check_pgt_cache();
}
+/*
+ * Establish a new mapping:
+ * - flush the old one
+ * - update the page tables
+ * - inform the TLB about the new one
+ */
+static inline void establish_pte(struct vm_area_struct * vma, unsigned long address,
+ pte_t *page_table, pte_t entry)
+{
+ flush_tlb_page(vma, address);
+ set_pte(page_table, entry);
+ update_mmu_cache(vma, address, entry);
+}
+
+static inline void break_cow(struct vm_area_struct * vma, struct page * old_page,
+ struct page * new_page, unsigned long address,
+ pte_t *page_table)
+{
+ copy_cow_page(old_page,new_page,address);
+ flush_page_to_ram(new_page);
+ flush_cache_page(vma, address);
+ establish_pte(vma, address, page_table, pte_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot))));
+}
+
#define PTE_TABLE_MASK ((PTRS_PER_PTE-1) * sizeof(pte_t))
#define PMD_TABLE_MASK ((PTRS_PER_PMD-1) * sizeof(pmd_t))
@@ -227,6 +251,22 @@
/* If it's a COW mapping, write protect it both in the parent and the child */
if (cow) {
+ /* Rawio in progress? */
+ if (atomic_read(&ptepage->rawcount)) {
+ /*
+ * If pte is dirty, its a private page,
+ * rawio was initiated by a clone.
+ * For dmain operation, need to break
+ * cow.
+ */
+ if (pte_dirty(pte)) {
+ struct page * new_page = alloc_page(GFP_HIGHUSER);
+ if (!new_page)
+ goto nomem;
+ break_cow(vma, ptepage, new_page, address, dst_pte);
+ goto cont_copy_pte_range;
+ }
+ }
ptep_clear_wrprotect(src_pte);
pte = *src_pte;
}
@@ -382,9 +422,12 @@
/*
- * Do a quick page-table lookup for a single page.
+ * Do a quick page-table lookup for a single page. We have already verified
+ * access type, and done a fault in. But, kswapd might have stolen the page
+ * in the meantime. Return an indication of whether we should retry the fault
+ * in. Writability test is superfluous but conservative.
*/
-static struct page * follow_page(unsigned long address)
+static struct page * follow_page(unsigned long address, int writeacc, int * ret)
{
pgd_t *pgd;
pmd_t *pmd;
@@ -393,10 +436,15 @@
pmd = pmd_offset(pgd, address);
if (pmd) {
pte_t * pte = pte_offset(pmd, address);
- if (pte && pte_present(*pte))
+ if (pte && pte_present(*pte)) {
+ if (writeacc && !pte_write(*pte))
+ goto retry;
return pte_page(*pte);
+ }
}
-
+
+retry:
+ *ret = 1;
return NULL;
}
@@ -428,7 +476,8 @@
struct page * map;
int i;
int datain = (rw == READ);
-
+ int failed;
+
/* Make sure the iobuf is not already mapped somewhere. */
if (iobuf->nr_pages)
return -EINVAL;
@@ -467,23 +516,28 @@
}
if (((datain) && (!(vma->vm_flags & VM_WRITE))) ||
(!(vma->vm_flags & VM_READ))) {
- err = -EACCES;
goto out_unlock;
}
}
+
+faultin:
if (handle_mm_fault(current->mm, vma, ptr, datain) <= 0)
goto out_unlock;
spin_lock(&mm->page_table_lock);
- map = follow_page(ptr);
- if (!map) {
+ map = follow_page(ptr, datain, &failed);
+ if (failed) {
+ /*
+ * Page got stolen before we could lock it down.
+ * Retry.
+ */
spin_unlock(&mm->page_table_lock);
- dprintk (KERN_ERR "Missing page in map_user_kiobuf\n");
- goto out_unlock;
+ goto faultin;
}
map = get_page_map(map);
- if (map)
+ if (map) {
atomic_inc(&map->count);
- else
+ atomic_inc(&map->rawcount);
+ } else
printk (KERN_INFO "Mapped page missing [%d]\n", i);
spin_unlock(&mm->page_table_lock);
iobuf->maplist[i] = map;
@@ -519,6 +573,7 @@
if (map) {
if (iobuf->locked)
UnlockPage(map);
+ atomic_dec(&map->rawcount);
__free_page(map);
}
}
@@ -771,28 +826,6 @@
} while (from && (from < end));
flush_tlb_range(current->mm, beg, end);
return error;
-}
-
-/*
- * Establish a new mapping:
- * - flush the old one
- * - update the page tables
- * - inform the TLB about the new one
- */
-static inline void establish_pte(struct vm_area_struct * vma, unsigned long address, pte_t *page_table, pte_t entry)
-{
- flush_tlb_page(vma, address);
- set_pte(page_table, entry);
- update_mmu_cache(vma, address, entry);
-}
-
-static inline void break_cow(struct vm_area_struct * vma, struct page * old_page, struct page * new_page, unsigned long address,
- pte_t *page_table)
-{
- copy_cow_page(old_page,new_page,address);
- flush_page_to_ram(new_page);
- flush_cache_page(vma, address);
- establish_pte(vma, address, page_table, pte_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot))));
}
/*
diff -uNr --exclude-from=dontdiff linux.orig/mm/page_alloc.c linux/mm/page_alloc.c
--- linux.orig/mm/page_alloc.c Tue Oct 24 13:15:58 2000
+++ linux/mm/page_alloc.c Tue Oct 24 13:36:44 2000
@@ -98,6 +98,8 @@
BUG();
if (PageInactiveClean(page))
BUG();
+ if (atomic_read(&page->rawcount))
+ BUG();
page->flags &= ~(1<<PG_referenced);
page->age = PAGE_AGE_START;
@@ -819,6 +821,7 @@
*/
for (p = lmem_map; p < lmem_map + totalpages; p++) {
set_page_count(p, 0);
+ atomic_set(&(p)->rawcount, 0);
SetPageReserved(p);
init_waitqueue_head(&p->wait);
memlist_init(&p->list);
----- End forwarded message -----
--
Always remember that you are unique. Just like everyone else.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 12+ messages in thread[parent not found: <200010272123.OAA21478@penguin.transmeta.com>]
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 [not found] ` <200010272123.OAA21478@penguin.transmeta.com> @ 2000-10-30 11:45 ` Christoph Hellwig 2000-10-30 17:19 ` Jeff Garzik 2000-10-31 2:08 ` Andrea Arcangeli 0 siblings, 2 replies; 12+ messages in thread From: Christoph Hellwig @ 2000-10-30 11:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel On Fri, Oct 27, 2000 at 02:23:04PM -0700, Linus Torvalds wrote: > > [...] > > That solution, btw, might be as simple as just saying: > > - raw IO is based on physical pages, and the COW mapping crated by > fork() may cause the changes to be visibile to either child or parent > or both, depending on usage patterns to the page in question. For > repeatable behaviour, do not have outstanding direct IO in progress > over a fork(). > > Ie, just _document_ it. It's not _wrong_, it can just be surprising (but > it is actually entirely straightforward and sane if you just look at it > the right way). Ok, here is an updated patch witout that change, but instead with a little piece of kiobuf documentation that does document this and other things related to kiobufs. Christoph -- Always remember that you are unique. Just like everyone else. --- linux.orig/drivers/char/raw.c Thu Oct 19 13:21:24 2000 +++ linux/drivers/char/raw.c Sun Oct 29 20:55:43 2000 @@ -277,8 +277,11 @@ if ((*offp & sector_mask) || (size & sector_mask)) return -EINVAL; - if ((*offp >> sector_bits) > limit) + if ((*offp >> sector_bits) > limit) { + if (size) + return -ENXIO; return 0; + } /* * We'll just use one kiobuf --- linux.orig/fs/buffer.c Fri Oct 27 12:28:40 2000 +++ linux/fs/buffer.c Sun Oct 29 20:55:43 2000 @@ -1924,6 +1924,8 @@ spin_unlock(&unused_list_lock); + if (!iosize) + return -EIO; return iosize; } --- linux.orig/mm/memory.c Fri Oct 27 12:28:42 2000 +++ linux/mm/memory.c Sun Oct 29 20:56:09 2000 @@ -382,9 +382,12 @@ /* - * Do a quick page-table lookup for a single page. + * Do a quick page-table lookup for a single page. We have already verified + * access type, and done a fault in. But, kswapd might have stolen the page + * in the meantime. Return an indication of whether we should retry the fault + * in. Writability test is superfluous but conservative. */ -static struct page * follow_page(unsigned long address) +static struct page * follow_page(unsigned long address, int writeacc, int * ret) { pgd_t *pgd; pmd_t *pmd; @@ -393,10 +396,15 @@ pmd = pmd_offset(pgd, address); if (pmd) { pte_t * pte = pte_offset(pmd, address); - if (pte && pte_present(*pte)) + if (pte && pte_present(*pte)) { + if (writeacc && !pte_write(*pte)) + goto retry; return pte_page(*pte); + } } - + +retry: + *ret = 1; return NULL; } @@ -428,7 +436,8 @@ struct page * map; int i; int datain = (rw == READ); - + int failed; + /* Make sure the iobuf is not already mapped somewhere. */ if (iobuf->nr_pages) return -EINVAL; @@ -467,18 +476,22 @@ } if (((datain) && (!(vma->vm_flags & VM_WRITE))) || (!(vma->vm_flags & VM_READ))) { - err = -EACCES; goto out_unlock; } } + +faultin: if (handle_mm_fault(current->mm, vma, ptr, datain) <= 0) goto out_unlock; spin_lock(&mm->page_table_lock); - map = follow_page(ptr); - if (!map) { + map = follow_page(ptr, datain, &failed); + if (failed) { + /* + * Page got stolen before we could lock it down. + * Retry. + */ spin_unlock(&mm->page_table_lock); - dprintk (KERN_ERR "Missing page in map_user_kiobuf\n"); - goto out_unlock; + goto faultin; } map = get_page_map(map); if (map) diff -uNr linux.orig/Documentation/kiobuf.txt linux/Documentation/kiobuf.txt --- linux.orig/Documentation/kiobuf.txt Thu Jan 1 01:00:00 1970 +++ linux/Documentation/kiobuf.txt Sun Oct 29 21:38:20 2000 @@ -0,0 +1,100 @@ + Abstract Kernel IO Buffers + Under Linux + + Christoph Hellwig <hch@caldera.de> + + +This document describes the kiobuf concept used in the Linux Kernel +IO/memory subsystem. It describes it's usages, functions working +with kernel IO buffers and show some examples for kiobuf usage. + + +The main reason for implementing kernel IO buffers (by Stephen Tweedie) +was the lack of raw devices support in Linux kernels <= 2.2. Raw devices +are the character devices that AT&T derived UNIX version implement to +allow character based uncached access to mass storage devices. In +Linux kernels <= 2.2 all blockdevice IO goes either through the buffer- +or pagecache, so that applications like databases cannot get full +control over their data. + +The solution in Linux 2.3 an higher is that the new raw devices driver +locks down the virtual memory it gets passed by the ->read and ->write +methods and does physical page io on them, bypassing the caches. +NOTE: the physical memory referenced by kiobufs does - unlike nearly +everything else in the Linux memory managment - not have reasonable COW +semenantics. So don't even try to fork when doing rawio or using +user-space memory in kiobufs in an other way. + + +To use iobufs in this way you need to allocate one or more kiobufs (an +array of kiobufs is called kiovec - do not confuse those with BSD iovecs). + + err = alloc_kiovec (count, iovec); + +This allocates the memory for the wanted number of kiobufs (and adds them +to a cache) and initalizes some variables - in an OO-language this would be +the constructor. Then you force the virtual memory to faulted in and locked +in physical memory and reference it by the kiobuf. (NOTE: this must be done +for each iobuf, not for the whole iovec). + + err = map_user_kiobuf (rw, iobuf, address, len); + +After that you request IO against the wanted device. For the case of +raw devices where IO should be requested against a blockdevice, there +is a function in fs/buffer.c that does exactly this. (the parameter +'blocks' is an array of the block numbers the IO should be requested +against) + + err = brw_kiovec (rw, count, iovec, dev, blocks, sector_size); + +After the IO for this iobuf is done, unmap the virtual memory. + + unmap_kiobuf (iobuf); + +And when we are finished with the iovev, free it. + + free_kiovec (count, iovec); + + +Locking down user memory and doing mass storage device IO with it is not +the only purpose of kiobufs. Another use for kiobufs is allowing +user-space mmaping dma memory, e.g in sound drivers. To do so you +need to lock-down kernel virtual memory and refernece it using kiobufs. +The code that does exactly this is not yet in the kernel - get Stephen +Tweedie's kiobuf patchset if you want to use this. + + +In the long term it looks like all blockdev IO will be done using +kiobufs. In the SGI XFS tree there is code that allows passing kiovecs +to the individual low-level block drivers. There are lots of advantages +of doing it this way: the page cache doesn't need to fit the outstanding +io into lots of bufferheads, passing each bufferhead to ll_rw_block() +where the elevator merges some of them together for better device usage +and submits them to the drivers. Instead the cache locks down the pages +and submits the kiovec to the low-level driver. The lowlevel driver knows +better how the request should be splitted for dmaing or whatever. On the +other hand software RAID or LVM get more complicated: instead of just +doing block-remapping they must split the kiobufs and - in case of LVM - +find ways to do efficient IO on continguos areas. + + + +References: + + Linux Kernel Sourcecode + (fs/buffer.c, fs/iobufs.c, mm/memory.c, drivers/char/raw.c) + + SGI XFS for Linux + (http://oss.sgi.com/projects/linux-xfs/) + + Stephen Tweedies kiobuf patchset + (ftp://ftp.linux.org.uk/pub/linux/sct/fs/raw-io/) + + Linux MM mailinglist + (http://humbolt.geo.uu.nl/Linux-MM/linux-mm.html) + + +Thanks to Arjan van de Ven, Daniel Phillips and Marcelo Tosatti for +proofreading this document and giving usefull hints. + +$Id: kiobuf.txt,v 1.2 2000/10/29 20:37:54 hch Exp hch $ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 11:45 ` Christoph Hellwig @ 2000-10-30 17:19 ` Jeff Garzik 2000-10-30 18:17 ` Christoph Hellwig 2000-10-31 2:08 ` Andrea Arcangeli 1 sibling, 1 reply; 12+ messages in thread From: Jeff Garzik @ 2000-10-30 17:19 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, sct Christoph Hellwig wrote: > +Locking down user memory and doing mass storage device IO with it is not > +the only purpose of kiobufs. Another use for kiobufs is allowing > +user-space mmaping dma memory, e.g in sound drivers. To do so you > +need to lock-down kernel virtual memory and refernece it using kiobufs. > +The code that does exactly this is not yet in the kernel - get Stephen > +Tweedie's kiobuf patchset if you want to use this. Take a look at drivers/sound/via82cxxx_audio.c. How can that mmap be improved by using kiobufs? It seems like there is less overhead to mmap(2) DMA memory the way I do it currently -- without kiobufs... Honestly interested, Jeff -- Jeff Garzik | "Mind if I drive?" -Sam Building 1024 | "Not if you don't mind me clawing at the MandrakeSoft | dash and shrieking like a cheerleader." | -Max - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 17:19 ` Jeff Garzik @ 2000-10-30 18:17 ` Christoph Hellwig 2000-10-30 18:56 ` Jeff Garzik 0 siblings, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2000-10-30 18:17 UTC (permalink / raw) To: Jeff Garzik; +Cc: Linus Torvalds, linux-kernel, sct On Mon, Oct 30, 2000 at 12:19:21PM -0500, Jeff Garzik wrote: > Christoph Hellwig wrote: > > +Locking down user memory and doing mass storage device IO with it is not > > +the only purpose of kiobufs. Another use for kiobufs is allowing > > +user-space mmaping dma memory, e.g in sound drivers. To do so you > > +need to lock-down kernel virtual memory and refernece it using kiobufs. > > +The code that does exactly this is not yet in the kernel - get Stephen > > +Tweedie's kiobuf patchset if you want to use this. > > Take a look at drivers/sound/via82cxxx_audio.c. How can that mmap be > improved by using kiobufs? I think so - but you need Stephen's kvmap patch, that is in the same patchset the forward-ported fixes are (at ftp://ftp.linux.org.uk/pub/linux/sct/fs/raw-io/) An very nice example is included. Christoph -- Always remember that you are unique. Just like everyone else. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 18:17 ` Christoph Hellwig @ 2000-10-30 18:56 ` Jeff Garzik 2000-10-30 19:44 ` Christoph Hellwig 2000-11-01 13:32 ` Stephen C. Tweedie 0 siblings, 2 replies; 12+ messages in thread From: Jeff Garzik @ 2000-10-30 18:56 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, sct Christoph Hellwig wrote: > On Mon, Oct 30, 2000 at 12:19:21PM -0500, Jeff Garzik wrote: > > Take a look at drivers/sound/via82cxxx_audio.c. How can that mmap be > > improved by using kiobufs? > > I think so - but you need Stephen's kvmap patch, that is in the same > patchset the forward-ported fixes are > (at ftp://ftp.linux.org.uk/pub/linux/sct/fs/raw-io/) > > An very nice example is included. Seen it, re-read my question... I keep seeing "audio drivers' mmap" used a specific example of a place that would benefit from kiobufs. The current via audio mmap looks quite a bit like mmap_kiobuf and its support code... except without all the kiobuf overhead. My question from above is: how can the via audio mmap in test10-preXX be improved by using kiobufs? I am not a kiobuf expert, but AFAICS a non-kiobuf implementation is better for audio drivers. (and the via audio mmap implementation is what some other audio drivers are about to start using...) I can clearly see that many applications will find kiobufs quite useful (learned another from alan just now...), but I do not see that audio drivers can benefit from kiobufs at all. Corrections on this fact are requested, as I am hacking audio drivers right now and want to make sure I pick the best course of action for the long term. Regards, Jeff -- Jeff Garzik | "Mind if I drive?" -Sam Building 1024 | "Not if you don't mind me clawing at the MandrakeSoft | dash and shrieking like a cheerleader." | -Max - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 18:56 ` Jeff Garzik @ 2000-10-30 19:44 ` Christoph Hellwig 2000-10-30 20:08 ` Jeff Garzik 2000-11-01 13:32 ` Stephen C. Tweedie 1 sibling, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2000-10-30 19:44 UTC (permalink / raw) To: Jeff Garzik; +Cc: Linus Torvalds, linux-kernel, sct On Mon, Oct 30, 2000 at 01:56:07PM -0500, Jeff Garzik wrote: > My question from above is: how can the via audio mmap in test10-preXX > be improved by using kiobufs? I am not a kiobuf expert, but AFAICS a > non-kiobuf implementation is better for audio drivers. (and the via > audio mmap implementation is what some other audio drivers are about to > start using...) I think the biggest advantage is that you actually get the list of pages when you perform the mmap instead of doing virt_to_page on every ->nopage. That should speed up the operations on the mmap'ed are a bit. The other strong argument for the kiobuf solution is code-sharing. Instead of having every (sound) driver playing with the vm, there is one central place when you use kvmaps. Christoph -- Always remember that you are unique. Just like everyone else. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 19:44 ` Christoph Hellwig @ 2000-10-30 20:08 ` Jeff Garzik 2000-10-30 20:32 ` Christoph Hellwig 0 siblings, 1 reply; 12+ messages in thread From: Jeff Garzik @ 2000-10-30 20:08 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, sct [-- Attachment #1: Type: text/plain, Size: 1872 bytes --] Christoph Hellwig wrote: > > On Mon, Oct 30, 2000 at 01:56:07PM -0500, Jeff Garzik wrote: > > My question from above is: how can the via audio mmap in test10-preXX > > be improved by using kiobufs? I am not a kiobuf expert, but AFAICS a > > non-kiobuf implementation is better for audio drivers. (and the via > > audio mmap implementation is what some other audio drivers are about to > > start using...) > > I think the biggest advantage is that you actually get the list of pages > when you perform the mmap instead of doing virt_to_page on every ->nopage. > That should speed up the operations on the mmap'ed are a bit. nopage() is only called when the page is not mapped for the current process. So it doesn't get called very often. Easy enough to call virt_to_page at alloc instead of nopage time, though. Patch attached :) > The other strong argument for the kiobuf solution is code-sharing. Instead > of having every (sound) driver playing with the vm, there is one central > place when you use kvmaps. Actually, I wonder if its even possible for mmap_kiobuf to support audio -- full duplex requires that both record and playback buffer(s), theoretically two separate sets of kiobufs, to be presented as one space (with playback always presented before record). Even if you can do that with mmap_kiobuf, some audio hardware doesn't support scatter-gather, so each set of kiobufs must be physically contiguous for each channel. <thinks> audio drivers' write(2) should be kiobuf'd, but only for hardware which supports scatter-gather. I can't think of any other cases where kiobuf would apply to audio. Jeff -- Jeff Garzik | "Mind if I drive?" -Sam Building 1024 | "Not if you don't mind me clawing at the MandrakeSoft | dash and shrieking like a cheerleader." | -Max [-- Attachment #2: via.patch --] [-- Type: text/plain, Size: 1381 bytes --] Index: drivers/sound/via82cxxx_audio.c =================================================================== RCS file: /cvsroot/gkernel/linux_2_4/drivers/sound/via82cxxx_audio.c,v retrieving revision 1.1.1.6.4.1 diff -u -r1.1.1.6.4.1 via82cxxx_audio.c --- drivers/sound/via82cxxx_audio.c 2000/10/27 08:21:41 1.1.1.6.4.1 +++ drivers/sound/via82cxxx_audio.c 2000/10/30 19:57:21 @@ -226,6 +226,7 @@ struct via_sgd_data { dma_addr_t handle; void *cpuaddr; + struct page *page; }; @@ -626,6 +627,7 @@ if (!chan->sgbuf[i].cpuaddr) goto err_out_nomem; + chan->sgbuf[i].page = virt_to_page (chan->sgbuf[i].cpuaddr); if (i < (VIA_DMA_BUFFERS - 1)) chan->sgtable[i].count = cpu_to_le32 (VIA_DMA_BUF_SIZE | VIA_FLAG); @@ -722,6 +724,7 @@ chan->sgbuf[i].handle); chan->sgbuf[i].cpuaddr = NULL; chan->sgbuf[i].handle = 0; + chan->sgbuf[i].page = NULL; } if (chan->sgtable) { @@ -1717,9 +1720,11 @@ } else if (!wr) chan = &card->ch_in; + assert (chan->sgbuf[pgoff].cpuaddr != NULL); + assert (chan->sgbuf[pgoff].page != NULL); assert ((((unsigned long)chan->sgbuf[pgoff].cpuaddr) % PAGE_SIZE) == 0); - dmapage = virt_to_page (chan->sgbuf[pgoff].cpuaddr); + dmapage = chan->sgbuf[pgoff].page; DPRINTK ("EXIT, returning page %p for cpuaddr %lXh\n", dmapage, (unsigned long) chan->sgbuf[pgoff].cpuaddr); get_page (dmapage); ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 20:08 ` Jeff Garzik @ 2000-10-30 20:32 ` Christoph Hellwig 2000-10-30 21:51 ` Jeff Garzik 0 siblings, 1 reply; 12+ messages in thread From: Christoph Hellwig @ 2000-10-30 20:32 UTC (permalink / raw) To: Jeff Garzik; +Cc: Linus Torvalds, linux-kernel, sct On Mon, Oct 30, 2000 at 03:08:31PM -0500, Jeff Garzik wrote: > Actually, I wonder if its even possible for mmap_kiobuf to support audio > -- full duplex requires that both record and playback buffer(s), > theoretically two separate sets of kiobufs, to be presented as one space > (with playback always presented before record). kvmaps take kiovecs, which are multiple kiobufs ... Christoph -- Always remember that you are unique. Just like everyone else. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 20:32 ` Christoph Hellwig @ 2000-10-30 21:51 ` Jeff Garzik 0 siblings, 0 replies; 12+ messages in thread From: Jeff Garzik @ 2000-10-30 21:51 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, sct Christoph Hellwig wrote: > > On Mon, Oct 30, 2000 at 03:08:31PM -0500, Jeff Garzik wrote: > > Actually, I wonder if its even possible for mmap_kiobuf to support audio > > -- full duplex requires that both record and playback buffer(s), > > theoretically two separate sets of kiobufs, to be presented as one space > > (with playback always presented before record). > > kvmaps take kiovecs, which are multiple kiobufs ... s/sets of kiobufs/kiovecs/ in my message and re-read :) <thinks> Ok kiobuf mmap in OSS audio is possible, but at that point using kiobufs is still 100% overhead, because you still have to allocate and manage DMA buffers separately due to read(2) and write(2). -- Jeff Garzik | "Mind if I drive?" -Sam Building 1024 | "Not if you don't mind me clawing at the MandrakeSoft | dash and shrieking like a cheerleader." | -Max - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 18:56 ` Jeff Garzik 2000-10-30 19:44 ` Christoph Hellwig @ 2000-11-01 13:32 ` Stephen C. Tweedie 1 sibling, 0 replies; 12+ messages in thread From: Stephen C. Tweedie @ 2000-11-01 13:32 UTC (permalink / raw) To: Jeff Garzik; +Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, sct Hi, On Mon, Oct 30, 2000 at 01:56:07PM -0500, Jeff Garzik wrote: > > Seen it, re-read my question... > > I keep seeing "audio drivers' mmap" used a specific example of a place > that would benefit from kiobufs. The current via audio mmap looks quite > a bit like mmap_kiobuf and its support code... except without all the > kiobuf overhead. > > My question from above is: how can the via audio mmap in test10-preXX > be improved by using kiobufs? I am not a kiobuf expert, but AFAICS a > non-kiobuf implementation is better for audio drivers. Code reuse. You may not need every single thing that the kvmap api gives you --- for example, you may not need the per-mmap refcounting, because you might be associating your dma buffer with the file descriptor, not the mmap region --- but if you implement the same nopage code in every single sound driver, then you end up with a lot of duplication and you increase (enormously) the number of places you have to touch if anything ever changes in the vma management code. Cheers, Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-30 11:45 ` Christoph Hellwig 2000-10-30 17:19 ` Jeff Garzik @ 2000-10-31 2:08 ` Andrea Arcangeli 2000-11-01 11:16 ` Christoph Hellwig 1 sibling, 1 reply; 12+ messages in thread From: Andrea Arcangeli @ 2000-10-31 2:08 UTC (permalink / raw) To: Linus Torvalds, linux-kernel On Mon, Oct 30, 2000 at 12:45:13PM +0100, Christoph Hellwig wrote: > @@ -393,10 +396,15 @@ > pmd = pmd_offset(pgd, address); > if (pmd) { > pte_t * pte = pte_offset(pmd, address); > - if (pte && pte_present(*pte)) > + if (pte && pte_present(*pte)) { > + if (writeacc && !pte_write(*pte)) > + goto retry; > return pte_page(*pte); > + } > } It should also make sure the pte is dirty before starting the read-from-disk I/O and then things will currently break in the swapout because the page is not locked (see discussion of last week). The fix for that problem proposed by SCT and Linus is that the page (not pte) will be marked dirty during swapout and written back to disk _only_ once reference count is 1 (btw I now noticed invalidate_inode_pages+MAP_SHARED will mess with that fix and it will trigger a BUG() in free_pages). > + > +faultin: > if (handle_mm_fault(current->mm, vma, ptr, datain) <= 0) > goto out_unlock; > spin_lock(&mm->page_table_lock); > - map = follow_page(ptr); > - if (!map) { > + map = follow_page(ptr, datain, &failed); > + if (failed) { > + /* > + * Page got stolen before we could lock it down. > + * Retry. > + */ > spin_unlock(&mm->page_table_lock); > - dprintk (KERN_ERR "Missing page in map_user_kiobuf\n"); > - goto out_unlock; > + goto faultin; This is suboptimal (walks the pagetables twice if the page is just mapped). It should be a follow page first and handle_mm_fault only if follow page failed. Andrea - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 2000-10-31 2:08 ` Andrea Arcangeli @ 2000-11-01 11:16 ` Christoph Hellwig 0 siblings, 0 replies; 12+ messages in thread From: Christoph Hellwig @ 2000-11-01 11:16 UTC (permalink / raw) To: linux-kernel Andrea Arcangeli <andrea@suse.de> wrote: >> + map = follow_page(ptr, datain, &failed); >> + if (failed) { >> + /* >> + * Page got stolen before we could lock it down. >> + * Retry. >> + */ >> spin_unlock(&mm->page_table_lock); >> - dprintk (KERN_ERR "Missing page in map_user_kiobuf\n"); >> - goto out_unlock; >> + goto faultin; > This is suboptimal (walks the pagetables twice if the page is just mapped). It > should be a follow page first and handle_mm_fault only if follow page failed. I did only forward-port the fixes from Stpehen's 2.3.99pre2 patchset because no one else seemed to be interested. If someone with more vm-experience (e.g. you) gets interested because of this patch: fine. Christoph -- Always remember that you are unique. Just like everyone else. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2000-11-01 13:35 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2000-10-27 20:21 [PATCH] kiobuf/rawio fixes for 2.4.0-test10-pre6 Christoph Hellwig
[not found] ` <200010272123.OAA21478@penguin.transmeta.com>
2000-10-30 11:45 ` Christoph Hellwig
2000-10-30 17:19 ` Jeff Garzik
2000-10-30 18:17 ` Christoph Hellwig
2000-10-30 18:56 ` Jeff Garzik
2000-10-30 19:44 ` Christoph Hellwig
2000-10-30 20:08 ` Jeff Garzik
2000-10-30 20:32 ` Christoph Hellwig
2000-10-30 21:51 ` Jeff Garzik
2000-11-01 13:32 ` Stephen C. Tweedie
2000-10-31 2:08 ` Andrea Arcangeli
2000-11-01 11:16 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox