* Re: Driver design question @ 2006-09-14 11:29 Takashi Iwai 2006-09-15 14:47 ` Lee Revell 0 siblings, 1 reply; 35+ messages in thread From: Takashi Iwai @ 2006-09-14 11:29 UTC (permalink / raw) To: rlrevell, alsa-devel, tiwai >>> Lee Revell <rlrevell@joe-job.com> 09/12/06 10:27 PM >>> > I have a device here that can play PCM, but it's a weird implementation. > Only 2 periods of 0x2000 words per buffer are supported, and there is no > DMA - the driver must poll a status register, and when the chip is > ready, deliver all 0x2000 words using outw() then send an end xfer > command. > > I think I need to use an intermediate buffer, and implement copy/silence > callbacks that write to this. Then I plan to use a tasklet or workqueue > to do the actual xfer of PCM data to the hardware. A timer callback > will periodically check whether at least 0x2000 words are in the buffer > and if so, schedule the tasklet to drain it. You can implement it in two ways, at least. One is to use copy and silent callbacks. This is pretty straightforward, doesn't need extra buffer, but it cannot use the native mmap. (alsa-lib provides the mmap emulation mode, but it seems not so stable.) Also, the buffer and period sizes are very restricitve (in your case, 0x2000 x 2 words), so many apps might not work well. An exampleof this type is isa/sb/emu8000_pcm.c (and some of gus pcm, IIRC). Another is to use an intermediate buffer as you suggsted. In this case, you do _not_ need copy and silent callbacks. These callbacks are the operations to copy and silent the hardware buffers. WIth an intermediate buffer, the copy and silent are done on this buffer, i.e. on normal RAM. What you need instead is the background-running copy operation from this intermediate buffer to the hardware via outw. This copy task could be done in the interrupt handler (or timer in your case) if it's minimum. If not, it'd be better to take a workq since it can sleep and much preemptive. > I presume this implementation cannot support mmap() as there's no random > access to the hardware buffer. Is this correct? Mmap works in practice. The driver simply shares the intermediate buffer with user-space. > Do I call snd_pcm_period_elapsed() from the timer, immediately before > scheduling the tasklet, or from the tasklet, after the data has been > transferred to the hw? Yes, by calling snd_pcm_period_elapsed(), the actual hwptr is updated. Then schedule workq to run copy-task from the buffer to h/w for the appropriate size. > In general, does this seem like a reasonable implementation? I don't see other possibilities. BTW, I'll be less communicative (and don't read MLs) for a while until the next Tuesday since I'm in a conference. Takashi ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-14 11:29 Driver design question Takashi Iwai @ 2006-09-15 14:47 ` Lee Revell 2006-09-19 15:15 ` Takashi Iwai 0 siblings, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-09-15 14:47 UTC (permalink / raw) To: Takashi Iwai; +Cc: tiwai, alsa-devel On Thu, 2006-09-14 at 12:29 +0100, Takashi Iwai wrote: > >>> Lee Revell <rlrevell@joe-job.com> 09/12/06 10:27 PM >>> > > I have a device here that can play PCM, but it's a weird implementation. > > Only 2 periods of 0x2000 words per buffer are supported, and there is no > > DMA - the driver must poll a status register, and when the chip is > > ready, deliver all 0x2000 words using outw() then send an end xfer > > command. > > > > I think I need to use an intermediate buffer, and implement copy/silence > > callbacks that write to this. Then I plan to use a tasklet or workqueue > > to do the actual xfer of PCM data to the hardware. A timer callback > > will periodically check whether at least 0x2000 words are in the buffer > > and if so, schedule the tasklet to drain it. > > You can implement it in two ways, at least. > > One is to use copy and silent callbacks. This is pretty straightforward, > doesn't need extra buffer, but it cannot use the native mmap. (alsa-lib > provides the mmap emulation mode, but it seems not so stable.) > Also, the buffer and period sizes are very restricitve (in your case, 0x2000 > x 2 words), so many apps might not work well. > An exampleof this type is isa/sb/emu8000_pcm.c (and some of gus pcm, > IIRC). > > Another is to use an intermediate buffer as you suggsted. > In this case, you do _not_ need copy and silent callbacks. These callbacks > are the operations to copy and silent the hardware buffers. WIth an > intermediate buffer, the copy and silent are done on this buffer, i.e. on > normal RAM. What you need instead is the background-running copy > operation from this intermediate buffer to the hardware via outw. > > This copy task could be done in the interrupt handler (or timer in your case) > if it's minimum. If not, it'd be better to take a workq since it can sleep and > much preemptive. Thanks very much for the information. So if I use an intermediate buffer but no copy and silence callbacks, I must copy the data to the hardware in a workqueue (I have no interrupt ability and the copy to hardware must be able to sleep). Where do I copy the data from, and how do I know when 0x2000 words are available? By directly accessing runtime->dma_area and the software pointer? IOW how do I get the information that would be passed to the copy callback? Lee ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-15 14:47 ` Lee Revell @ 2006-09-19 15:15 ` Takashi Iwai 2006-09-25 19:54 ` Lee Revell 2006-09-27 13:58 ` Lee Revell 0 siblings, 2 replies; 35+ messages in thread From: Takashi Iwai @ 2006-09-19 15:15 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Fri, 15 Sep 2006 10:47:59 -0400, Lee Revell wrote: > > On Thu, 2006-09-14 at 12:29 +0100, Takashi Iwai wrote: > > >>> Lee Revell <rlrevell@joe-job.com> 09/12/06 10:27 PM >>> > > > I have a device here that can play PCM, but it's a weird implementation. > > > Only 2 periods of 0x2000 words per buffer are supported, and there is no > > > DMA - the driver must poll a status register, and when the chip is > > > ready, deliver all 0x2000 words using outw() then send an end xfer > > > command. > > > > > > I think I need to use an intermediate buffer, and implement copy/silence > > > callbacks that write to this. Then I plan to use a tasklet or workqueue > > > to do the actual xfer of PCM data to the hardware. A timer callback > > > will periodically check whether at least 0x2000 words are in the buffer > > > and if so, schedule the tasklet to drain it. > > > > You can implement it in two ways, at least. > > > > One is to use copy and silent callbacks. This is pretty straightforward, > > doesn't need extra buffer, but it cannot use the native mmap. (alsa-lib > > provides the mmap emulation mode, but it seems not so stable.) > > Also, the buffer and period sizes are very restricitve (in your case, 0x2000 > > x 2 words), so many apps might not work well. > > An exampleof this type is isa/sb/emu8000_pcm.c (and some of gus pcm, > > IIRC). > > > > Another is to use an intermediate buffer as you suggsted. > > In this case, you do _not_ need copy and silent callbacks. These callbacks > > are the operations to copy and silent the hardware buffers. WIth an > > intermediate buffer, the copy and silent are done on this buffer, i.e. on > > normal RAM. What you need instead is the background-running copy > > operation from this intermediate buffer to the hardware via outw. > > > > This copy task could be done in the interrupt handler (or timer in your case) > > if it's minimum. If not, it'd be better to take a workq since it can sleep and > > much preemptive. > > Thanks very much for the information. > > So if I use an intermediate buffer but no copy and silence callbacks, I > must copy the data to the hardware in a workqueue (I have no interrupt > ability and the copy to hardware must be able to sleep). Where do I > copy the data from, and how do I know when 0x2000 words are available? The timing must be triggered via a ceratin irq. In your case, you'd need to take a timer as the irq source, I guess. > By directly accessing runtime->dma_area and the software pointer? IOW > how do I get the information that would be passed to the copy callback? You can allocate runtime->dma_area via normal memory by calling preallocator with SNDRV_DMA_TYPE_CONTINUOUS, and the rest is just like other drivers. In this case, runtime->dma_area points the intermediate buffer. Alternatively, you can use vmalloc() for allocating buffers, as used in usbaudio.c. The workq task is a pseudo-DMA engine that runs in background. When it's woken up, it feeds data from intermediate buffer to hardware as much as possible. The available data can be found in If all data are fed, it sleeps again (or exit the task). The timer wakes up or schedule the new workq task. Maybe you can utilize pcm-indirect.h for a framework. It contains the basic stuff for handling the intermediate and hardware buffers. You'll need to define the following: 1. ack pcm op, which invokes the transfer function 2. set up struct snd_pcm_indirect fields in prepare callback 3. call transfer in the trigger-start The implementation for 1 depends on how long it takes to copy the data chunk. If writel can be done relatively fast, it's OK to do it in the irq context. Then, call snd_pcm_indirect_playback_transfer() in ack callback with own copy function. The copy function does the data transfer for a given amount of data. If the transfer would take time, you'll need a workq, as mentioned. In such a case, ack callback would simply do wakeup() or schedule_work() for the workq task that calls snd_pcm_indirect_playback_transfer(). For 2, setup of snd_pcm_direct, you'll need at least to define hw_buffer_size and sw_buffer_size fields (in bytes unit). Other fields can be zero. Then, call the same transfer function as ack calls in the trigger-start. This will fill the h/w buffer before actually starting the stream. The pointer callback requires the calculation of the current h/w position. This is translated to the intermediate buffer position by calling snd_pcm_indirect_playback_pointer(). This helper also calls data-transfer (ack op) appropriately to fill the empty h/w buffer space. The example is found in emu10k1/emupcm.c. Takashi ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-19 15:15 ` Takashi Iwai @ 2006-09-25 19:54 ` Lee Revell 2006-09-27 17:18 ` Takashi Iwai 2006-09-27 13:58 ` Lee Revell 1 sibling, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-09-25 19:54 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Tue, 2006-09-19 at 17:15 +0200, Takashi Iwai wrote: > The workq task is a pseudo-DMA engine that runs in background. > When it's woken up, it feeds data from intermediate buffer to hardware > as much as possible. The available data can be found in > > If all data are fed, it sleeps again (or exit > the task). The timer wakes up or schedule the new workq task. > I've attached our code, which actually uses a combination of the two (copy callback and an intermediate buffer). It's certainly not the most efficient implementation but it works, at least with aplay. However if we play an MP3 using madplay, the sound is OK for the first minute or so after which ticks or glitches are heard. It sounds as if old parts of the buffer are being replayed. The difference in the applications seems to be that madplay writes in chunks of 0x480 bytes (IOW, arbitrary) while aplay always uses 0x4000 (which matches the hardware buffer size). Here is the code - can you see anything wrong with it? It's derived from the ALSA dummy driver and still uses the dummy driver's hardware pointer and timer scheme for calling pcm_period_elapsed() - could this be the problem? My attempts to track the hardware pointer more accurately and call pcm_period_elapsed from the copy callback have not been successful. Most of the work is done in the workqueue dream_handle_pcm_queue() and the copy callback. /* * Dream soundcard * Copyright (c) by Jaroslav Kysela <perex@suse.cz> * Copyright (c) by Lee Revell <rlrevell@joe-job.com> * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * */ #include <linux/init.h> #include <linux/pci.h> #include <linux/jiffies.h> #include <linux/slab.h> #include <linux/time.h> #include <linux/wait.h> #include <linux/moduleparam.h> #include <sound/dream.h> #include <sound/dream_codec.h> MODULE_AUTHOR("Lee Revell <rlrevell@joe-job.com>"); MODULE_DESCRIPTION("Dream soundcard (/dev/null)"); MODULE_LICENSE("GPL"); MODULE_SUPPORTED_DEVICE("{{ALSA,Dream soundcard}}"); static int index[SNDRV_CARDS] = SNDRV_DEFAULT_IDX; /* Index 0-MAX */ static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; /* ID for this card */ static int enable[SNDRV_CARDS] = {1, 1, [2 ... (SNDRV_CARDS - 1)] = 0}; static int pcm_devs[SNDRV_CARDS] = {[0 ... (SNDRV_CARDS - 1)] = 1}; static int pcm_substreams[SNDRV_CARDS] = {[0 ... (SNDRV_CARDS - 1)] = 8}; /* Are we running on real Alchemy board? If so create local bus driver as well as probing PCI */ static int alchemy = 0; module_param_array(index, int, NULL, 0444); MODULE_PARM_DESC(index, "Index value for dream soundcard."); module_param_array(id, charp, NULL, 0444); MODULE_PARM_DESC(id, "ID string for dream soundcard."); module_param_array(enable, bool, NULL, 0444); MODULE_PARM_DESC(enable, "Enable this dream soundcard."); module_param_array(pcm_devs, int, NULL, 0444); MODULE_PARM_DESC(pcm_devs, "PCM devices # (0-4) for dream driver."); module_param_array(pcm_substreams, int, NULL, 0444); MODULE_PARM_DESC(pcm_substreams, "PCM substreams # (1-16) for dream driver."); module_param(alchemy, int, 0); MODULE_PARM_DESC(alchemy, "Device is on PCI rather than local bus"); #ifdef LEO_PCM_REPLAY #define DPCM_DBG_SIZE 0x1000L #define DPCM_MAX_DBG_SIZE DPCM_DBG_SIZE-128 static unsigned char dpcm_dbg[DPCM_DBG_SIZE]; #endif int dream_setup_firmware(struct snd_card_dream *dream) { const char *fwfile; const struct firmware *fw; int err; fwfile = "PCI9708S.BIN"; #if 0 err = DrmcInit(1, 0, 1); printk("dream codec init: %i\n", err); #endif if ((err = dream_switch_uart_mode(dream, &dream->midi1, 0)) != 0) { printk("dream_switch_uart_mode failed for first port: %i\n", err); return -1; } if ((err = dream_load_boot(dream, &dream->midi1)) < 0) return -1; if ((err = dream_switch_uart_mode(dream, &dream->midi2, 0)) != 0) { printk("dream_switch_uart_mode failed for first port: %i\n", err); return -1; } if ((err = dream_load_boot(dream, &dream->midi2)) < 0) return -1; if (request_firmware(&fw, fwfile, &dream->pci->dev)) { printk(KERN_ERR "Dreamchip: cannot load firmware %s\n", fwfile); return -ENOENT; } else { printk(KERN_ERR "Dreamchip: loaded firmware %s size %i\n", fwfile, (int)fw->size); } err = dream_dbg_upload_xmem(dream, &dream->midi1, DREAM_FIRMWARE_ADDR, (unsigned short *) fw->data, fw->size); printk("dream_dbg_upload_xmem returned %i\n", err); return 0; } static int snd_dream_playback_copy(snd_pcm_substream_t * substream, int channel, snd_pcm_uframes_t pos, void __user *src, snd_pcm_uframes_t count) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; int retval; if(((unsigned int)(dpcm->ptr - dpcm_dbg)) < DPCM_MAX_DBG_SIZE) dpcm->ptr += sprintf(dpcm->ptr, "TS: %10.0lu snd_dream_playback_copy: pos 0x%x src 0x%p count 0x%x (0x%x bytes)\n", jiffies, (unsigned int) pos, src, (unsigned int) count, frames_to_bytes(runtime, count)); retval=0; do { unsigned int len, new_size; unsigned char * new_data; /* allocate necessary number of buffers and add them to buffers' list */ len = frames_to_bytes(runtime, count); new_size = len + dpcm->b.left + 2; new_data = (unsigned char *) kmalloc(new_size, GFP_KERNEL); if(new_data == NULL) { retval = -ENOMEM; break; } if(dpcm->b.data) { memcpy(new_data, & dpcm->b.data[dpcm->b.offset], dpcm->b.left); kfree(dpcm->b.data); } if ((retval = copy_from_user(& new_data[dpcm->b.left], src, len))) { printk("copy_from_user failed: %i\n", retval); retval = -EFAULT; break; } else { /* printk("copy_from_user 0x%x bytes to 0x%p from 0x%p succeeded\n", len, & new_data[dpcm->b.left], src);*/ retval=0; } dpcm->b.left += len; dpcm->b.size = dpcm->b.left; dpcm->b.offset= 0; dpcm->b.data = new_data; } while(0); if(retval<0) return(retval); /* Convert to words for queue handler */ dpcm->sw_words_played += frames_to_bytes(runtime, count) / 2; dpcm->sw_chunks_played++; schedule_work(&dpcm->pcm_work); return 0; } static int snd_dream_capture_copy(snd_pcm_substream_t * substream, int channel, snd_pcm_uframes_t pos, void __user *src, snd_pcm_uframes_t count) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; dream_pcm_dma_xfer(dream, dpcm, src, frames_to_bytes(runtime, count), 1); return 0; } static snd_card_t *snd_dream_cards[SNDRV_CARDS] = SNDRV_DEFAULT_PTR; static void snd_card_dream_pcm_timer_start(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; dpcm->timer.expires = 1 + jiffies; add_timer(&dpcm->timer); } static void snd_card_dream_pcm_timer_stop(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; del_timer(&dpcm->timer); } static int snd_card_dream_playback_trigger(snd_pcm_substream_t * substream, int cmd) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); if (cmd == SNDRV_PCM_TRIGGER_START) { // dbg(dream, "trigger: PCM playback start\n"); snd_card_dream_pcm_timer_start(substream); } else if (cmd == SNDRV_PCM_TRIGGER_STOP) { // dbg(dream, "trigger: PCM playback stop\n"); snd_card_dream_pcm_timer_stop(substream); /* Send PCM_CLOSE here - ? */ dpcm->stopping = 1; schedule_work(&dpcm->pcm_work); } else { return -EINVAL; } return 0; } static int snd_card_dream_capture_trigger(snd_pcm_substream_t * substream, int cmd) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); if (cmd == SNDRV_PCM_TRIGGER_START) { dbg(dream, "trigger: PCM capture start\n"); snd_card_dream_pcm_timer_start(substream); /* Send PCM_START here */ dream_pcm_start(dream, dpcm->channel); } else if (cmd == SNDRV_PCM_TRIGGER_STOP) { dbg(dream, "trigger: PCM capture stop\n"); snd_card_dream_pcm_timer_stop(substream); /* Send PCM_CLOSE here - ? */ dream_pcm_close(dream, dpcm->channel); } else { return -EINVAL; } return 0; } static int snd_card_dream_pcm_prepare(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; unsigned int bps; bps = runtime->rate * runtime->channels; bps *= snd_pcm_format_width(runtime->format); bps /= 8; if (bps <= 0) return -EINVAL; dpcm->pcm_bps = bps; dpcm->pcm_jiffie = bps / HZ; dpcm->pcm_size = snd_pcm_lib_buffer_bytes(substream); dpcm->pcm_count = snd_pcm_lib_period_bytes(substream); dpcm->pcm_irq_pos = 0; dpcm->pcm_buf_pos = 0; // dbg(dream, "PCM prepare: DMA buffer 0x%x bytes at 0x%p\n", runtime->dma_bytes, runtime->dma_area); return 0; } static int snd_card_dream_playback_prepare(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); printk("dream_pcm_open: channel 0x%x eight_bit 0x%x stereo 0x%x rate 0x%x\n", dpcm->channel, dpcm->eight_bit, dpcm->stereo, dpcm->rate); dream_pcm_open(dream, dpcm->channel, dpcm->eight_bit, dpcm->stereo, dpcm->rate); // dbg(dream, "prepare: PCM playback\n"); return snd_card_dream_pcm_prepare(substream); } static int snd_card_dream_capture_prepare(snd_pcm_substream_t * substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); dbg(dream, "prepare: PCM capture\n"); return snd_card_dream_pcm_prepare(substream); } static void snd_card_dream_pcm_timer_function(unsigned long data) { struct snd_card_dream_pcm *dpcm = (struct snd_card_dream_pcm *)data; struct snd_card_dream *dream = snd_pcm_substream_chip(dpcm->substream); dpcm->timer.expires = 1 + jiffies; add_timer(&dpcm->timer); spin_lock_irq(&dpcm->lock); dpcm->pcm_irq_pos += dpcm->pcm_jiffie; dpcm->pcm_buf_pos += dpcm->pcm_jiffie; dpcm->pcm_buf_pos %= dpcm->pcm_size; if (dpcm->pcm_irq_pos >= dpcm->pcm_count) { dpcm->pcm_irq_pos %= dpcm->pcm_count; // dbg(dream, "timer: PCM period elapsed\n"); snd_pcm_period_elapsed(dpcm->substream); } spin_unlock_irq(&dpcm->lock); } static snd_pcm_uframes_t snd_card_dream_playback_pointer(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; #ifdef LEO_PCM_REPLAY #else printk("pointer returned 0x%x\n", bytes_to_frames(runtime, dpcm->pcm_buf_pos)); #endif return bytes_to_frames(runtime, dpcm->pcm_buf_pos); } static snd_pcm_uframes_t snd_card_dream_capture_pointer(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; return bytes_to_frames(runtime, dpcm->pcm_buf_pos); } static snd_pcm_hardware_t snd_card_dream_playback = { .info = (SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_INTERLEAVED), .formats = USE_PLAYBACK_FORMATS, .rates = USE_RATE, .rate_min = USE_RATE_MIN, .rate_max = USE_RATE_MAX, .channels_min = USE_CHANNELS_MIN, .channels_max = USE_CHANNELS_MAX, .buffer_bytes_max = MAX_BUFFER_SIZE, .period_bytes_min = MAX_PERIOD_SIZE, .period_bytes_max = MAX_PERIOD_SIZE, .periods_min = USE_PERIODS_MIN, .periods_max = USE_PERIODS_MAX, .fifo_size = 0, }; static snd_pcm_hardware_t snd_card_dream_capture = { .info = (SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_INTERLEAVED), .formats = USE_CAPTURE_FORMATS, .rates = USE_RATE, .rate_min = USE_RATE_MIN, .rate_max = USE_RATE_MAX, .channels_min = USE_CHANNELS_MIN, .channels_max = USE_CHANNELS_MAX, .buffer_bytes_max = MAX_BUFFER_SIZE, .period_bytes_min = 64, .period_bytes_max = MAX_BUFFER_SIZE, .periods_min = USE_PERIODS_MIN, .periods_max = USE_PERIODS_MAX, .fifo_size = 0, }; static void snd_card_dream_runtime_free(snd_pcm_runtime_t *runtime) { struct snd_card_dream_pcm *dpcm = runtime->private_data; kfree(dpcm); } static int snd_card_dream_hw_params(snd_pcm_substream_t * substream, snd_pcm_hw_params_t * hw_params) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; int err; /* W_OPEN: channels 0 if mono, 1 if stereo */ dpcm->stereo = params_channels(hw_params) == 2; /* W_OPEN: format 0 if 16bit, 1 if 8bit */ dpcm->eight_bit = params_format(hw_params) == SNDRV_PCM_FORMAT_U8; dpcm->rate = params_rate(hw_params); printk("hw_params: stereo %s, format %s\n, rate %i", dpcm->stereo ? "yes" : "no", dpcm->eight_bit ? "8bit" : "16bit", dpcm->rate); if (( err = snd_pcm_lib_malloc_pages(substream, params_buffer_bytes(hw_params))) < 0) return err; return 0; } static int snd_card_dream_hw_free(snd_pcm_substream_t * substream) { return snd_pcm_lib_free_pages(substream); } static void dream_handle_pcm_queue(void * data) { struct snd_card_dream_pcm *dpcm = (struct snd_card_dream_pcm *) data; int res, sample; int samples = 0x2000; if (! dpcm->running) { printk("dream_handle_pcm_queue: invoked before trigger!\n"); /* Send PCM_START */ dream_pcm_start(dpcm->dream, dpcm->channel); dpcm->running = 1; } if (dpcm->stopping) { /* Play buffer of any size, but at least 1 and never 0x2000 */ if(dpcm->b.left) samples = (dpcm->b.left < DREAM_PCM_HALF_BUFFER_LEN_BYTES) ? dpcm->b.left/2 : DREAM_PCM_HALF_BUFFER_LEN-1; else samples = 1; } else if(dpcm->b.left >= DREAM_PCM_HALF_BUFFER_LEN_BYTES) samples = DREAM_PCM_HALF_BUFFER_LEN; else samples = 0; /* Don't play partial half buffer unless is stopping */ if(((unsigned int)(dpcm->ptr - dpcm_dbg)) < DPCM_MAX_DBG_SIZE) dpcm->ptr += sprintf(dpcm->ptr, "TS: %10.0lu %s pcm_queue: app 0x%x words hw 0x%x samples 0x%x off 0x%x left 0x%x\n", jiffies, dpcm->stopping ? "STOP" : "RUN ", dpcm->sw_words_played, dpcm->hw_words_played, samples, dpcm->b.offset, dpcm->b.left); if(samples > 0) { res = dream_get_mpu_data_poll_status_byte(dpcm->dream, &dpcm->dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); if(res >= 0) { unsigned int play_size = samples*2; unsigned short * pcm_word = (unsigned short *) & dpcm->b.data[dpcm->b.offset]; /* Write buffer to Dream chip */ for (sample = 0; sample < samples; sample++, pcm_word++) mpu_write16(dpcm->dream, &dpcm->dream->midi2, cpu_to_le16(*pcm_word), 0); /* TBD: don't swap for 8 bits WAV */ dream_pcm_end_xfer(dpcm->dream, dpcm->channel); /* move offset */ dpcm->b.offset += play_size; dpcm->b.left -= play_size; dpcm->hw_words_played += samples; } else printk("dream_handle_pcm_queue: polling status byte returned %i\n", res); } /* if(samples > 0) */ if (dpcm->stopping) { int l = strlen(dpcm_dbg), ii=0; unsigned char buf[120]; while(ii < l) { strncpy(buf, & dpcm_dbg[ii], 100); buf[100] = 0; printk(buf); ii += 100; } mdelay(100); dream_pcm_close(dpcm->dream, dpcm->channel); dpcm->running = 0; if(dpcm->b.data) kfree(dpcm->b.data); printk("handle_queue: stopping. %i chunks\n", dpcm->sw_chunks_played); } return; } static int snd_card_dream_playback_open(snd_pcm_substream_t * substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm; int err; dpcm = kcalloc(1, sizeof(*dpcm), GFP_KERNEL); if (dpcm == NULL) return -ENOMEM; #ifdef LEO_PCM_REPLAY memset(dpcm, 0, sizeof(*dpcm)); dpcm->channel = 0; /* not necessary because of previous memset, but just in case */ dpcm->ptr = dpcm_dbg; #endif dpcm->dream = dream; init_timer(&dpcm->timer); dpcm->timer.data = (unsigned long) dpcm; dpcm->timer.function = snd_card_dream_pcm_timer_function; spin_lock_init(&dpcm->lock); dpcm->substream = substream; runtime->private_data = dpcm; runtime->private_free = snd_card_dream_runtime_free; runtime->hw = snd_card_dream_playback; if ((err = add_playback_constraints(runtime)) < 0) { kfree(dpcm); return err; } /* initialize the workqueue that will be triggered from copy callback * to write PCM samples to the hardware */ INIT_WORK(&dpcm->pcm_work, dream_handle_pcm_queue, (void *)dpcm); return 0; } static int snd_card_dream_capture_open(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm; int err; dpcm = kcalloc(1, sizeof(*dpcm), GFP_KERNEL); if (dpcm == NULL) return -ENOMEM; init_timer(&dpcm->timer); dpcm->timer.data = (unsigned long) dpcm; dpcm->timer.function = snd_card_dream_pcm_timer_function; spin_lock_init(&dpcm->lock); dpcm->substream = substream; runtime->private_data = dpcm; runtime->private_free = snd_card_dream_runtime_free; runtime->hw = snd_card_dream_capture; if ((err = add_capture_constraints(runtime)) < 0) { kfree(dpcm); return err; } return 0; } static int snd_card_dream_playback_close(snd_pcm_substream_t * substream) { return 0; } static int snd_card_dream_capture_close(snd_pcm_substream_t * substream) { return 0; } static snd_pcm_ops_t snd_card_dream_playback_ops = { .open = snd_card_dream_playback_open, .close = snd_card_dream_playback_close, .ioctl = snd_pcm_lib_ioctl, .hw_params = snd_card_dream_hw_params, .hw_free = snd_card_dream_hw_free, .prepare = snd_card_dream_playback_prepare, .trigger = snd_card_dream_playback_trigger, .pointer = snd_card_dream_playback_pointer, .copy = snd_dream_playback_copy, }; static snd_pcm_ops_t snd_card_dream_capture_ops = { .open = snd_card_dream_capture_open, .close = snd_card_dream_capture_close, .ioctl = snd_pcm_lib_ioctl, .hw_params = snd_card_dream_hw_params, .hw_free = snd_card_dream_hw_free, .prepare = snd_card_dream_capture_prepare, .trigger = snd_card_dream_capture_trigger, .pointer = snd_card_dream_capture_pointer, .copy = snd_dream_capture_copy, }; static int __init snd_card_dream_pcm(struct snd_card_dream *dream, int device, int substreams) { snd_pcm_t *pcm; int err; if ((err = snd_pcm_new(dream->card, "Dream PCM", device, substreams, substreams, &pcm)) < 0) return err; pcm->private_data = dream; snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, &snd_card_dream_playback_ops); snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, &snd_card_dream_capture_ops); pcm->info_flags = 0; strcpy(pcm->name, "Dream PCM"); snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_CONTINUOUS, snd_dma_continuous_data(GFP_KERNEL), 0, 64*1024); dream->pcm = pcm; return 0; } static int snd_dream_hwdep_open(snd_hwdep_t * hw, struct file *file) { return 0; } static int snd_dream_hwdep_release(snd_hwdep_t * hw, struct file *file) { return 0; } irqreturn_t snd_dream_interrupt(int irq, void *dev_id, struct pt_regs *regs) { struct snd_card_dream *dream = dev_id; printk("interrupt - dream is 0x%p\n", dream); return IRQ_HANDLED; } static int snd_dream_hwdep_ioctl(snd_hwdep_t * hw, struct file *file, unsigned int cmd, unsigned long arg) { struct snd_card_dream *dream = hw->private_data; struct snd_dream_midi_msg *midi_msg; struct snd_dream_codec_msg * codec_msg; //unsigned int addr; void __user *argp = (void __user *)arg; int res, sample, i; int samples = 0x2000; signed short * buffer; switch (cmd) { case SNDRV_DREAM_IOCTL_MIDI_MSG: midi_msg = (struct snd_dream_midi_msg *)kmalloc(sizeof(*midi_msg), GFP_KERNEL); if (midi_msg == NULL) return -ENOMEM; if (copy_from_user(midi_msg, argp, sizeof(*midi_msg))) { kfree(midi_msg); return -EFAULT; } if(midi_msg->dbg_prn) printk("MIDI message chip %i mpu_port %i offset %i direction %i 16bit %i data 0x%x\n", midi_msg->chip, midi_msg->mpu_port, midi_msg->offset, midi_msg->direction, midi_msg->size, midi_msg->data); if (midi_msg->size == SND_DREAM_MSG_8BIT) { if (midi_msg->direction == SND_DREAM_WRITE) { mpu_write(dream, (midi_msg->mpu_port ? &dream->midi2 : &dream->midi1), midi_msg->data, midi_msg->offset, midi_msg->dbg_prn); } else { midi_msg->data = mpu_read(dream, (midi_msg->mpu_port ? &dream->midi2 : &dream->midi1), midi_msg->offset, midi_msg->dbg_prn); } } else { /* 16 bit */ if (midi_msg->direction == SND_DREAM_WRITE) { mpu_write16(dream, (midi_msg->mpu_port ? &dream->midi2 : &dream->midi1), midi_msg->data, midi_msg->dbg_prn); } else { midi_msg->data = mpu_read16(dream, (midi_msg->mpu_port ? &dream->midi2 : &dream->midi1), midi_msg->offset, midi_msg->dbg_prn); } } if (copy_to_user(argp, midi_msg, sizeof(*midi_msg))) { kfree(midi_msg); return -EFAULT; } kfree(midi_msg); return 0; case SNDRV_DREAM_IOCTL_REQUEST_IRQ: res = request_irq(dream->pci->irq, snd_dream_interrupt, SA_INTERRUPT|SA_SHIRQ, "Dreamchip", (void *)dream); printk("IOCTL_REQUEST_IRQ returned %i\n", res); return res; case SNDRV_DREAM_IOCTL_FW_LOAD: res = dream_setup_firmware(dream); printk("IOCTL_FW_LOAD returned %i\n", res); return res; /* send random data to PCM for testing */ case SNDRV_DREAM_IOCTL_PCM_TEST: printk(KERN_ERR "IOCTL_PCM_TEST: start\n"); /* allocate a buffer and fill it with random data */ buffer = (signed short *) kmalloc((samples * sizeof(signed short)), GFP_KERNEL); if (buffer == NULL) return -ENOMEM; get_random_bytes(buffer, (samples * sizeof(signed short))); printk(KERN_ERR "IOCTL_PCM_TEST: filled buffer with random data\n"); /* 9708 channel 0, 16 bit, stereo, 48000Hz */ dream_pcm_open(dream, 0, 0, 1, 48000); dream_pcm_set_vol_fl(dream, 0, 0xFFFF); dream_pcm_set_vol_fr(dream, 0, 0xFFFF); dream_pcm_start(dream, 0); /* Send first half buffer */ res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); printk(KERN_ERR "IOCTL_PCM_TEST: iter 0 polling status byte returned %i\n", res); for (sample = 0; sample < samples; sample++) mpu_write16(dream, &dream->midi2, buffer[sample], 0); dream_pcm_end_xfer(dream, 0); /* Send 10 more half buffers */ for (i = 0; i < 10; i++) { res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); printk(KERN_ERR "IOCTL_PCM_TEST: iter %i polling status byte returned %i\n", i+1, res); for (sample = 0; sample < samples; sample++) mpu_write16(dream, &dream->midi2, buffer[sample], 0); dream_pcm_end_xfer(dream, 0); } /* Finally send incomplete half buffer */ res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); printk(KERN_ERR "IOCTL_PCM_TEST: iter 0 polling status byte returned %i\n", res); for (sample = 0; sample < samples-1; sample++) mpu_write16(dream, &dream->midi2, buffer[sample], 0); dream_pcm_end_xfer(dream, 0); res = dream_pcm_close(dream, 0); printk(KERN_ERR "IOCTL_PCM_TEST returned %i\n", res); return res; case SNDRV_DREAM_IOCTL_CODEC_MSG: codec_msg = (struct snd_dream_codec_msg *)kmalloc(sizeof(*codec_msg), GFP_KERNEL); if (codec_msg == NULL) return -ENOMEM; if (copy_from_user(codec_msg, argp, sizeof(*codec_msg))) { kfree(codec_msg); return -EFAULT; } res = DrmcIoctlProc(codec_msg); if (copy_to_user(argp, codec_msg, sizeof(*codec_msg))) { kfree(codec_msg); return -EFAULT; } kfree(codec_msg); return res; } return -ENOTTY; } int __devinit snd_dream_hwdep_new(struct snd_card_dream *dream, int device, snd_hwdep_t ** rhwdep) { snd_hwdep_t *hw; int err; if (rhwdep) *rhwdep = NULL; if ((err = snd_hwdep_new(dream->card, "Dreamchip", device, &hw)) < 0) return err; strcpy(hw->name, "Dreamchip"); hw->iface = SNDRV_HWDEP_IFACE_DREAMCHIP; hw->ops.open = snd_dream_hwdep_open; hw->ops.ioctl = snd_dream_hwdep_ioctl; hw->ops.release = snd_dream_hwdep_release; hw->private_data = dream; if (rhwdep) *rhwdep = hw; return 0; } static int snd_dream_free(struct snd_card_dream *dream) { /* release the i/o ports & memory */ pci_release_regions(dream->pci); /* disable the PCI entry */ pci_disable_device(dream->pci); /* release the data */ kfree(dream); return 0; } static int snd_dream_dev_free(snd_device_t *device) { struct snd_card_dream *dream = device->device_data; return snd_dream_free(dream); } static int __devinit snd_dream_pci_create(snd_card_t * card, struct pci_dev * pci, struct snd_card_dream ** rdream) { struct snd_card_dream *dream; int err, len; static snd_device_ops_t ops = { .dev_free = snd_dream_dev_free, }; /* enable PCI device */ if ((err = pci_enable_device(pci)) < 0) return err; dream = kmalloc(sizeof(*dream), GFP_KERNEL); if (dream == NULL) { pci_disable_device(pci); return -ENOMEM; } memset(dream, 0, sizeof(*dream)); dream->card = card; dream->pci = pci; if ((err = pci_request_regions(pci, "Dreamchip")) < 0) { kfree(dream); pci_disable_device(pci); return err; } printk("Dreamchip: alchemy is %i\n", alchemy); /* Get the PCI BAR */ dream->port = pci_resource_start(pci, 0); printk("Dreamchip: base port 0x%x\n", dream->port); len = pci_resource_len(pci, 0); printk("Dreamchip: base port len %i\n", len); #if 0 if (request_irq(pci->irq, snd_dream_interrupt, SA_INTERRUPT|SA_SHIRQ, "Dreamchip", (void *)dream)) { kfree(dream); pci_disable_device(pci); return -EBUSY; } dream->irq = pci->irq; #endif if ((err = snd_device_new(card, SNDRV_DEV_LOWLEVEL, dream, &ops)) < 0) { snd_dream_free(dream); return err; } snd_card_set_dev(card, &pci->dev); *rdream = dream; return 0; } static int __devinit snd_card_dream_pci_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; snd_card_t *card; struct snd_card_dream *dream; int idx, err; if (dev >= SNDRV_CARDS) return -ENODEV; if (!enable[dev]) { dev++; return -ENOENT; } card = snd_card_new(index[dev], id[dev], THIS_MODULE, 0); if (card == NULL) return -ENOMEM; strcpy(card->driver, "Dream"); strcpy(card->shortname, "Dream"); sprintf(card->longname, "Dream %i", dev + 1); if ((err = snd_dream_pci_create(card, pci, &dream)) < 0) goto __nodev; spin_lock_init(&dream->dream_lock); spin_lock_init(&dream->log_lock); for (idx = 0; idx < MAX_PCM_DEVICES && idx < pcm_devs[dev]; idx++) { if (pcm_substreams[dev] < 1) pcm_substreams[dev] = 1; if (pcm_substreams[dev] > MAX_PCM_SUBSTREAMS) pcm_substreams[dev] = MAX_PCM_SUBSTREAMS; if ((err = snd_card_dream_pcm(dream, idx, pcm_substreams[dev])) < 0) goto __nodev; } if ((err = snd_card_dream_new_mixer(dream)) < 0) goto __nodev; snd_dream_proc_init(dream); snd_dream_midi(dream); if ((err = snd_dream_hwdep_new(dream, 0, NULL)) < 0) goto __nodev; /* Simulation mode should not be set by default for PCI board bringup */ dream->simulation_mode = 0; /* Set the register IO log buffer index to 0 */ dream->log_idx = 0; /* Load firmware */ /* if ((err = dream_setup_firmware(dream)) < 0) goto __nodev; */ #if 0 if((err = DrmcInit(1, 0, 0)) != 0) goto __nodev; #endif if ((err = snd_card_register(card)) == 0) { snd_dream_cards[dev] = card; return 0; } pci_set_drvdata(pci, card); dev++; __nodev: snd_card_free(card); return err; } static int __devinit snd_card_dream_lb_probe(int dev) { snd_card_t *card; struct snd_card_dream *dream; int idx, err; if (!enable[dev]) return -ENODEV; card = snd_card_new(index[dev], id[dev], THIS_MODULE, sizeof(struct snd_card_dream)); if (card == NULL) return -ENOMEM; dream = (struct snd_card_dream *)card->private_data; dream->card = card; spin_lock_init(&dream->dream_lock); spin_lock_init(&dream->log_lock); for (idx = 0; idx < MAX_PCM_DEVICES && idx < pcm_devs[dev]; idx++) { if (pcm_substreams[dev] < 1) pcm_substreams[dev] = 1; if (pcm_substreams[dev] > MAX_PCM_SUBSTREAMS) pcm_substreams[dev] = MAX_PCM_SUBSTREAMS; if ((err = snd_card_dream_pcm(dream, idx, pcm_substreams[dev])) < 0) goto __nodev; } if ((err = snd_card_dream_new_mixer(dream)) < 0) goto __nodev; snd_dream_proc_init(dream); snd_dream_midi(dream); if ((err = snd_dream_hwdep_new(dream, 0, NULL)) < 0) goto __nodev; /* Simulation mode should not be set by default for PCI board bringup */ dream->simulation_mode = 0; /* Set the register IO log buffer index to 0 */ dream->log_idx = 0; /* We are a local bus device, not PCI */ dream->alchemy = 1; dream->mmio_base = ioremap(SAM9708_MMIO_BASE, 0x20); printk("MMIO base 0x%p", dream->mmio_base); /* Load firmware */ /* if ((err = dream_setup_firmware(dream)) < 0) goto __nodev; */ #if 0 if((err = DrmcInit(1, 0, 0)) != 0) goto __nodev; #endif strcpy(card->driver, "Dream"); strcpy(card->shortname, "Dream"); sprintf(card->longname, "Dream %i", dev + 1); if ((err = snd_card_register(card)) == 0) { snd_dream_cards[dev] = card; return 0; } dev++; __nodev: snd_card_free(card); return err; } struct pci_device_id snd_dream_ids[] = { { 0x1438, 0x9707, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dream PCI board */ { 0, } }; static void __devexit snd_card_dream_pci_remove(struct pci_dev *pci) { snd_card_free(pci_get_drvdata(pci)); pci_set_drvdata(pci, NULL); } static struct pci_driver driver = { .name = "Dreamchip", .id_table = snd_dream_ids, .probe = snd_card_dream_pci_probe, .remove = __devexit_p(snd_card_dream_pci_remove), }; static int __init alsa_card_dream_init(void) { int dev, cards; pci_register_driver(&driver); if (!alchemy) return 0; for (dev = cards = 1; dev < SNDRV_CARDS && enable[dev]; dev++) { if (snd_card_dream_lb_probe(dev) < 0) { #ifdef MODULE dbg(dream, "Dream soundcard #%i not found or device busy\n", dev + 1); #endif break; } cards++; } if (!cards) { #ifdef MODULE dbg(dream, "Dream soundcard not found or device busy\n"); #endif return -ENODEV; } return 0; } static void __exit alsa_card_dream_exit(void) { int idx; pci_unregister_driver(&driver); for (idx = 0; idx < SNDRV_CARDS; idx++) snd_card_free(snd_dream_cards[idx]); } module_init(alsa_card_dream_init) module_exit(alsa_card_dream_exit) ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-25 19:54 ` Lee Revell @ 2006-09-27 17:18 ` Takashi Iwai 2006-09-27 17:38 ` Lee Revell 2006-09-30 2:03 ` Lee Revell 0 siblings, 2 replies; 35+ messages in thread From: Takashi Iwai @ 2006-09-27 17:18 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Mon, 25 Sep 2006 15:54:22 -0400, Lee Revell wrote: > > On Tue, 2006-09-19 at 17:15 +0200, Takashi Iwai wrote: > > The workq task is a pseudo-DMA engine that runs in background. > > When it's woken up, it feeds data from intermediate buffer to hardware > > as much as possible. The available data can be found in > > > > If all data are fed, it sleeps again (or exit > > the task). The timer wakes up or schedule the new workq task. > > > > I've attached our code, which actually uses a combination of the two > (copy callback and an intermediate buffer). It's certainly not the most > efficient implementation but it works, at least with aplay. However if > we play an MP3 using madplay, the sound is OK for the first minute or so > after which ticks or glitches are heard. It sounds as if old parts of > the buffer are being replayed. > > The difference in the applications seems to be that madplay writes in > chunks of 0x480 bytes (IOW, arbitrary) while aplay always uses 0x4000 > (which matches the hardware buffer size). > > Here is the code - can you see anything wrong with it? It's derived > from the ALSA dummy driver and still uses the dummy driver's hardware > pointer and timer scheme for calling pcm_period_elapsed() - could this > be the problem? My attempts to track the hardware pointer more > accurately and call pcm_period_elapsed from the copy callback have not > been successful. > > Most of the work is done in the workqueue dream_handle_pcm_queue() and > the copy callback. Well, when you have an intermediate buffer (allocated via snd_pcm_lib_malloc(), you don't need copy and silent callbacks. The data is written on the intermediate buffer. Then, the workq copies the data again from this buffer to the hardware in the background. As I mentioned, the helpers in pcm-indirect.h might make things eaiser for such an implementation. In your case, a pseudo code would look like below. ... #include <sound/pcm-indirect.h> ... struct dream { ... unsigned int playback_hw_ptr; /* current position */ struct snd_pcm_indirect playback_rec; struct work_struct playback_work; ... }; /* transfer the given size of data to the hardware */ static void dream_playback_copy(struct snd_pcm_substream *substream, struct snd_pcm_indirect *rec, size_t bytes) { struct dream *chip = snd_pcm_substream_chip(substream); unsigned char *src = substream->runtime->dma_area + rec->sw_data; while (bytes--) do_write_data(chip, *src++); } /* work to transfer the data for playback */ static void dream_work_transfer(void *data) { struct dream *chip = data; struct snd_pcm_substream *substream = chip->playback_substream; if (!substream) return; snd_pcm_indirect_playback_transfer(substream, &chip->playback_rec, dream_playback_copy); } /* ack callback - just schedule work */ static int dream_playback_ack(struct snd_pcm_substream *substream) { struct dream *chip = snd_pcm_substream_chip(substream); schedule_work(&chip->playback_work); return 0; } /* cancel work */ static void dream_playback_cancel(struct snd_pcm_substream *substream) { cancel_delayed_work(&chip->pb_work); } /* prepare callback - * initialize pcm_indirect record, too */ static int dream_playback_prepare(struct snd_pcm_substream *substream) { struct dream *chip = snd_pcm_substream_chip(substream); memset(&chip->playback_rec, 0, sizeof(chip->playback_rec)); chip->playback_rec.hw_buffer_size = HARDWARE_BUFFER_SIZE; /* byte size */ chip->playback_rec.sw_buffer_size = snd_pcm_lib_buffer_bytes(substream); chip->playback_hw_ptr = 0; .... return 0; } /* trigger callback */ static int dream_playback_trigger(struct snd_pcm_substream *substream, int cmd) { struct dream *chip = snd_pcm_substream_chip(substream); switch (cmd) { case SNDRV_PCM_TRIGGER_START: dream_playback_ack(substream); /* fill the data */ start_timer(); break; case SNDRV_PCM_TRIGGER_STOP: stop_timer(); dream_playback_cancel(substream); break; default: return -EINVAL; } return 0; } /* pointer callback - * playback_hw_ptr is the current byte position in the ring buffer, * which is updated in the timer handler */ static snd_pcm_uframes_t dream_playback_pointer(struct snd_pcm_substream *substream) { struct dream *chip = snd_pcm_substream_chip(substream); return snd_pcm_indirect_playback_pointer(substream, &chip->playback_rec, chip->playback_hw_ptr); } /* open, close, hw_params and hw_free are as usual */ static struct snd_pcm_ops dream_ops = { .open = dream_playback_open, .close = dream_playback_close, .ioctl = snd_pcm_lib_ioctl, .hw_params = dream_playback_hw_params, .hw_free = dream_playback_hw_free, .prepare = dream_playback_prepare, .pointer = dream_playback_pointer, .trigger = dream_playback_trigger, .pointer = dream_playback_pointer, .ack = dream_playback_ack, }; If the data transfer doesn't take much time, you can implement without workqueue but calling dream_playback_copy() from the ack callback directly. Takashi ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-27 17:18 ` Takashi Iwai @ 2006-09-27 17:38 ` Lee Revell 2006-09-30 2:03 ` Lee Revell 1 sibling, 0 replies; 35+ messages in thread From: Lee Revell @ 2006-09-27 17:38 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Wed, 2006-09-27 at 19:18 +0200, Takashi Iwai wrote: > > Well, when you have an intermediate buffer (allocated via > snd_pcm_lib_malloc(), you don't need copy and silent callbacks. The > data is written on the intermediate buffer. Then, the workq copies > the data again from this buffer to the hardware in the background. > > As I mentioned, the helpers in pcm-indirect.h might make things eaiser > for such an implementation. In your case, a pseudo code would look > like below. > Takashi-san, Thanks very much! I'll give these recommendations a try. I am very glad that the ALSA driver API can handle such a weird hardware PCM implementation. Keep up the good work! Lee ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-27 17:18 ` Takashi Iwai 2006-09-27 17:38 ` Lee Revell @ 2006-09-30 2:03 ` Lee Revell 2006-10-03 15:35 ` Lee Revell 2006-10-04 9:07 ` Takashi Iwai 1 sibling, 2 replies; 35+ messages in thread From: Lee Revell @ 2006-09-30 2:03 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Wed, 2006-09-27 at 19:18 +0200, Takashi Iwai wrote: > At Mon, 25 Sep 2006 15:54:22 -0400, > Lee Revell wrote: > > Here is the code - can you see anything wrong with it? It's derived > > from the ALSA dummy driver and still uses the dummy driver's hardware > > pointer and timer scheme for calling pcm_period_elapsed() - could this > > be the problem? My attempts to track the hardware pointer more > > accurately and call pcm_period_elapsed from the copy callback have not > > been successful. > > > > Most of the work is done in the workqueue dream_handle_pcm_queue() and > > the copy callback. > > Well, when you have an intermediate buffer (allocated via > snd_pcm_lib_malloc(), you don't need copy and silent callbacks. The > data is written on the intermediate buffer. Then, the workq copies > the data again from this buffer to the hardware in the background. > > As I mentioned, the helpers in pcm-indirect.h might make things eaiser > for such an implementation. In your case, a pseudo code would look > like below. OK, I've implemented this, but it's still not quite working. I've noticed that with the PCM indirect implementation, the pointer callback invokes the ack callback every time: pcm-indirect.h: 90 */ 91 static inline snd_pcm_uframes_t 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, 93 snd_pcm_indirect_t *rec, unsigned int ptr) 94 { [ ... ] 103 if (substream->ops->ack) 104 substream->ops->ack(substream); But, this cannot work with our hardware, if the ack callback always transfers 0x2000 words from the hardware, because the pointer callback is invoked many times per period. What are the exact rules/semantics for when the ack callback is invoked? Also, how should I know when to invoke snd_pcm_period_elapsed? It only seems to work if I call it from a timer - if I move it to the ack or copy callback, playback hangs after the first chunk of audio is played. Lee ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-30 2:03 ` Lee Revell @ 2006-10-03 15:35 ` Lee Revell 2006-10-04 9:17 ` Takashi Iwai 2006-10-04 9:07 ` Takashi Iwai 1 sibling, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-10-03 15:35 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Fri, 2006-09-29 at 22:03 -0400, Lee Revell wrote: > pcm-indirect.h: > > 90 */ > 91 static inline snd_pcm_uframes_t > 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, > 93 snd_pcm_indirect_t *rec, > unsigned int ptr) > 94 { > > [ ... ] > > 103 if (substream->ops->ack) > 104 substream->ops->ack(substream); > > But, this cannot work with our hardware, if the ack callback always > transfers 0x2000 words from the hardware, because the pointer callback > is invoked many times per period. If the ack callback is called many times per period, but our hardware only supports writing in chunks of 0x2000 words, how can we copy to the hardware from the ack callback as you recommend without using an intermediate buffer? Is it OK to continue to use the scheme from the dummy driver for calling snd_pcm_period_elapsed()? I tried calling snd_pcm_period_elapsed() from the ack callback or from the workqueue whenever the pointer crosses the period boundary, but playback just hangs after the first period. Here is the (still not working) PCM implementation based on your pseudo-code: /* transfer the given size of data to the hardware */ static void dream_playback_copy(snd_pcm_substream_t *substream, snd_pcm_indirect_t *rec, size_t bytes) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; unsigned char *src = runtime->dma_area + rec->sw_data; unsigned short * pcm_word = (unsigned short *) src; int words = bytes / 2; int res; if (! dpcm->running) { printk("dream_playback_copy: not running!\n"); /* Send PCM_START */ dream_pcm_start(dream, dpcm->channel); dpcm->running = 1; } printk("in dream_playback_copy: 0x%x bytes\n", (int) bytes); dream->playback_hw_ptr += bytes; dream->playback_hw_ptr %= snd_pcm_lib_buffer_bytes(substream); if (dpcm->stopping) words--; res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); if (res < 0) { printk("dream_playback_copy: polling status byte returned %i\n", res); return; } while (words--) mpu_write16(dream, &dream->midi2, cpu_to_le16(*pcm_word++), 0); /* TBD: don't swap for 8 bits WAV */ dream_pcm_end_xfer(dpcm->dream, dpcm->channel); if (dpcm->stopping) { mdelay(100); dream_pcm_close(dream, dpcm->channel); dpcm->running = 0; } } static void dream_work_transfer(void *data) { struct snd_card_dream *dream = data; snd_pcm_substream_t *substream = dream->playback_substream; if (!substream) return; snd_pcm_indirect_playback_transfer(substream, &dream->playback_rec, dream_playback_copy); } /* ack callback - just schedule work */ static int dream_playback_ack(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); schedule_work(&dream->playback_work); return 0; } /* cancel work */ static void dream_playback_cancel(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); cancel_delayed_work(&dream->playback_work); } /* trigger callback */ static int dream_playback_trigger(snd_pcm_substream_t *substream, int cmd) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; switch (cmd) { case SNDRV_PCM_TRIGGER_START: dream_playback_ack(substream); /* fill the data */ snd_card_dream_pcm_timer_start(substream); break; case SNDRV_PCM_TRIGGER_STOP: snd_card_dream_pcm_timer_stop(substream); dpcm->stopping = 1; dream_playback_ack(substream); /* send final incomplete 1/2 bufffer and close */ //dream_playback_cancel(substream); break; default: return -EINVAL; } return 0; } /* pointer callback - * playback_hw_ptr is the current byte position in the ring buffer, * which is updated in the timer handler */ static snd_pcm_uframes_t dream_playback_pointer(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_uframes_t pointer; pointer = snd_pcm_indirect_playback_pointer(substream, &dream->playback_rec, dream->playback_hw_ptr); printk("dream_playback_pointer: 0x%x\n", (unsigned int) pointer); return pointer; } #endif Lee ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-03 15:35 ` Lee Revell @ 2006-10-04 9:17 ` Takashi Iwai 2006-10-19 22:12 ` Lee Revell 0 siblings, 1 reply; 35+ messages in thread From: Takashi Iwai @ 2006-10-04 9:17 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Tue, 03 Oct 2006 11:35:22 -0400, Lee Revell wrote: > > On Fri, 2006-09-29 at 22:03 -0400, Lee Revell wrote: > > pcm-indirect.h: > > > > 90 */ > > 91 static inline snd_pcm_uframes_t > > 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, > > 93 snd_pcm_indirect_t *rec, > > unsigned int ptr) > > 94 { > > > > [ ... ] > > > > 103 if (substream->ops->ack) > > 104 substream->ops->ack(substream); > > > > But, this cannot work with our hardware, if the ack callback always > > transfers 0x2000 words from the hardware, because the pointer callback > > is invoked many times per period. > > If the ack callback is called many times per period, but our hardware > only supports writing in chunks of 0x2000 words, how can we copy to the > hardware from the ack callback as you recommend without using an > intermediate buffer? I thought you don't have to write 0x2000 words at once but in a loop? Look at dream_playback_copy(). It has bytes argument. That specifies the size to copy. Setting snd_pcm_indirect.hw_buffer_size = 0x4000 should restrict the max size to be written to 0x2000 words. Even if you have to call xfer_end() at each 0x2000 words, you can implement a counter and call xfer_end(), such as static void dream_playback_copy(snd_pcm_substream_t *substream, snd_pcm_indirect_t *rec, size_t bytes) { ... int words = bytes / 2; for (; words > 0; words--) { mpu_write16(dream, *data++); playback_chunk_left--; if (!playback_chunk_left) { xfer_end(); playback_chunk_left = 0x2000; } } } Takashi ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-04 9:17 ` Takashi Iwai @ 2006-10-19 22:12 ` Lee Revell 2006-10-20 12:55 ` Takashi Iwai 0 siblings, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-10-19 22:12 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Wed, 2006-10-04 at 11:17 +0200, Takashi Iwai wrote: > At Tue, 03 Oct 2006 11:35:22 -0400, > Lee Revell wrote: > > > > On Fri, 2006-09-29 at 22:03 -0400, Lee Revell wrote: > > > pcm-indirect.h: > > > > > > 90 */ > > > 91 static inline snd_pcm_uframes_t > > > 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, > > > 93 snd_pcm_indirect_t *rec, > > > unsigned int ptr) > > > 94 { > > > > > > [ ... ] > > > > > > 103 if (substream->ops->ack) > > > 104 substream->ops->ack(substream); > > > > > > But, this cannot work with our hardware, if the ack callback always > > > transfers 0x2000 words from the hardware, because the pointer callback > > > is invoked many times per period. > > > > If the ack callback is called many times per period, but our hardware > > only supports writing in chunks of 0x2000 words, how can we copy to the > > hardware from the ack callback as you recommend without using an > > intermediate buffer? > > I thought you don't have to write 0x2000 words at once but in a loop? > Look at dream_playback_copy(). It has bytes argument. That specifies > the size to copy. Setting snd_pcm_indirect.hw_buffer_size = 0x4000 > should restrict the max size to be written to 0x2000 words. > Is there any way to force the upper layers to only ever transfer 0x2000 words - no more, no less? I think the hardware gets unhappy if they are not all transferred quickly. Anyway, I still can't make it work with the PCM indirect API. The best I've managed to produce is choppy sound that seems to underrun every period. Do you see what could be wrong with my code? The first version I posted, using an intermediate buffer with copy callback, is the only one that ever worked, though it only works with aplay. Here's my current implementation, which incorporates your suggestions: /* transfer the given size of data to the hardware */ static void dream_playback_copy(snd_pcm_substream_t *substream, snd_pcm_indirect_t *rec, size_t bytes) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; unsigned char *src = runtime->dma_area + rec->sw_data; unsigned short * pcm_word = (unsigned short *) src; int words = bytes / 2; int res; if (! dpcm->running) { printk("dream_playback_copy: not running!\n"); /* Send PCM_START */ dream_pcm_start(dream, dpcm->channel); dpcm->running = 1; } printk("in dream_playback_copy: 0x%x bytes\n", (int) bytes); if (dpcm->stopping) words--; res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); if (res < 0) { printk("dream_playback_copy: polling status byte returned %i\n", res); return; } for (; words > 0; words--) { mpu_write16(dream, &dream->midi2, cpu_to_le16(*pcm_word++), 0); /* TBD: don't swap for 8 bits WAV */ dpcm->playback_chunk_left--; if (! dpcm->playback_chunk_left) { dpcm->playback_chunk_left = 0x2000; dream_pcm_end_xfer(dpcm->dream, dpcm->channel); } } if (dpcm->stopping) { mdelay(100); dream_pcm_close(dream, dpcm->channel); dpcm->running = 0; } } static void dream_work_transfer(void *data) { struct snd_card_dream *dream = data; snd_pcm_substream_t *substream = dream->playback_substream; if (!substream) return; snd_pcm_indirect_playback_transfer(substream, &dream->playback_rec, dream_playback_copy); } /* ack callback - just schedule work */ static int dream_playback_ack(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); schedule_work(&dream->playback_work); return 0; } /* trigger callback */ static int dream_playback_trigger(snd_pcm_substream_t *substream, int cmd) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; switch (cmd) { case SNDRV_PCM_TRIGGER_START: dream_playback_ack(substream); /* fill the data */ snd_card_dream_pcm_timer_start(substream); break; case SNDRV_PCM_TRIGGER_STOP: dpcm->stopping = 1; dream_playback_ack(substream); /* send final incomplete 1/2 bufffer and close */ snd_card_dream_pcm_timer_stop(substream); //dream_playback_cancel(substream); break; default: return -EINVAL; } return 0; } static snd_pcm_uframes_t dream_playback_pointer(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_uframes_t pointer; pointer = snd_pcm_indirect_playback_pointer(substream, &dream->playback_rec, dream->playback_hw_ptr); //printk("dream_playback_pointer: 0x%x\n", (unsigned int) pointer); return pointer; } static int snd_card_dream_pcm_prepare(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); dpcm->frames_per_period = runtime->rate / HZ; dpcm->frac_per_period = runtime->rate % HZ; dpcm->playback_chunk_left = 0x2000; memset(&dream->playback_rec, 0, sizeof(snd_pcm_indirect_t)); dream->playback_rec.hw_buffer_size = 0x4000; /* byte size */ dream->playback_rec.sw_buffer_size = snd_pcm_lib_buffer_bytes(substream); dream->playback_hw_ptr = 0; dream->playback_hw_ptr_frac = 0; return 0; } static int snd_card_dream_playback_prepare(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); printk("dream_pcm_open: channel 0x%x eight_bit 0x%x stereo 0x%x rate 0x%x\n", dpcm->channel, dpcm->eight_bit, dpcm->stereo, dpcm->rate); dream_pcm_open(dream, dpcm->channel, dpcm->eight_bit, dpcm->stereo, dpcm->rate); // dbg(dream, "prepare: PCM playback\n"); return snd_card_dream_pcm_prepare(substream); } static void snd_card_dream_pcm_timer_function(unsigned long data) { struct snd_card_dream_pcm *dpcm = (struct snd_card_dream_pcm *)data; struct snd_card_dream *dream = snd_pcm_substream_chip(dpcm->substream); snd_pcm_runtime_t *runtime = dpcm->substream->runtime; unsigned int size; dream->playback_hw_ptr += dpcm->frames_per_period; dream->playback_hw_ptr_frac += dpcm->frac_per_period; if (dream->playback_hw_ptr_frac >= HZ) { dream->playback_hw_ptr++; dream->playback_hw_ptr_frac -= HZ; } dream->playback_hw_ptr %= runtime->buffer_size; if (dream->playback_hw_ptr < dream->playback_last_hw_ptr) size = runtime->buffer_size + dream->playback_hw_ptr - dream->playback_last_hw_ptr; else size = dream->playback_hw_ptr - dream->playback_last_hw_ptr; dream->playback_last_hw_ptr = dream->playback_hw_ptr; dream->playback_size += size; if (dream->playback_size >= runtime->period_size) { dream->playback_size %= runtime->period_size; snd_pcm_period_elapsed(dpcm->substream); } dpcm->timer.expires = 1 + jiffies; add_timer(&dpcm->timer); } static snd_pcm_ops_t dream_playback_ops = { .open = snd_card_dream_playback_open, .close = snd_card_dream_playback_close, .ioctl = snd_pcm_lib_ioctl, .hw_params = snd_card_dream_hw_params, .hw_free = snd_card_dream_hw_free, .prepare = snd_card_dream_playback_prepare, .pointer = dream_playback_pointer, .trigger = dream_playback_trigger, .ack = dream_playback_ack, }; ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-19 22:12 ` Lee Revell @ 2006-10-20 12:55 ` Takashi Iwai 2006-10-20 20:12 ` Lee Revell 0 siblings, 1 reply; 35+ messages in thread From: Takashi Iwai @ 2006-10-20 12:55 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Thu, 19 Oct 2006 18:12:30 -0400, Lee Revell wrote: > > On Wed, 2006-10-04 at 11:17 +0200, Takashi Iwai wrote: > > At Tue, 03 Oct 2006 11:35:22 -0400, > > Lee Revell wrote: > > > > > > On Fri, 2006-09-29 at 22:03 -0400, Lee Revell wrote: > > > > pcm-indirect.h: > > > > > > > > 90 */ > > > > 91 static inline snd_pcm_uframes_t > > > > 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, > > > > 93 snd_pcm_indirect_t *rec, > > > > unsigned int ptr) > > > > 94 { > > > > > > > > [ ... ] > > > > > > > > 103 if (substream->ops->ack) > > > > 104 substream->ops->ack(substream); > > > > > > > > But, this cannot work with our hardware, if the ack callback always > > > > transfers 0x2000 words from the hardware, because the pointer callback > > > > is invoked many times per period. > > > > > > If the ack callback is called many times per period, but our hardware > > > only supports writing in chunks of 0x2000 words, how can we copy to the > > > hardware from the ack callback as you recommend without using an > > > intermediate buffer? > > > > I thought you don't have to write 0x2000 words at once but in a loop? > > Look at dream_playback_copy(). It has bytes argument. That specifies > > the size to copy. Setting snd_pcm_indirect.hw_buffer_size = 0x4000 > > should restrict the max size to be written to 0x2000 words. > > > > Is there any way to force the upper layers to only ever transfer 0x2000 > words - no more, no less? I think the hardware gets unhappy if they are > not all transferred quickly. It's easy to fix this. An untested patch is below. > Anyway, I still can't make it work with the PCM indirect API. The best > I've managed to produce is choppy sound that seems to underrun every > period. Do you see what could be wrong with my code? You should give a fixed period size (0x2000), i.e. set both period_bytes_min and period_bytes_max = 0x4000 (and periods_min = 2) in snd_pcm_hardware. In this way, the ack is called at least at each time 0x2000 words are processed. BTW, what does dream_get_mpu_data_poll_status_byte() do? Does it check whether you can another push 0x2000 words? If so, you should check this before claling snd_pcm_indirect_playback_transfer() in dream_work_transfer(). static void dream_work_transfer(void *data) { struct snd_card_dream *dream = data; snd_pcm_substream_t *substream = dream->playback_substream; struct snd_card_dream_pcm *dpcm = substream->runtime->private_data; if (!substream) return; if (dpcm->running && dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100) < 0) return; snd_pcm_indirect_playback_transfer(substream, &dream->playback_rec, dream_playback_copy); } Possibly, the block transfer patch might be not needed if you add this check. Also, to avoid the restriction of period_size, you'll need the implementation of asynchronous background transfer, e.g. by polling at each timer interrupt (suppose the polling is atomic and fast) then scheduling the transfer work immediately if the space is available. static void snd_card_dream_pcm_timer_function(unsigned long data) { ... dream->playback_size += size; if (dream->playback_size >= runtime->period_size) { dream->playback_size %= runtime->period_size; snd_pcm_period_elapsed(dpcm->substream); } else if (dpcm->running) { if (!dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100)) /* async transfer */ schedule_work(&dream->playback_work); } dpcm->timer.expires = 1 + jiffies; add_timer(&dpcm->timer); } The polling isn't necessary at each timer interrupt, but can be each 0x2000 words, instead, though. Takashi diff -r d7fe584f7395 include/pcm-indirect.h --- a/include/pcm-indirect.h Thu Oct 19 20:35:56 2006 +0200 +++ b/include/pcm-indirect.h Fri Oct 20 12:27:41 2006 +0200 @@ -27,6 +27,7 @@ struct snd_pcm_indirect { struct snd_pcm_indirect { unsigned int hw_buffer_size; /* Byte size of hardware buffer */ unsigned int hw_queue_size; /* Max queue size of hw buffer (0 = buffer size) */ + unsigned int hw_xfer_block; /* transfer alignment (0 = unaligned) */ unsigned int hw_data; /* Offset to next dst (or src) in hw ring buffer */ unsigned int hw_io; /* Ring buffer hw pointer */ int hw_ready; /* Bytes ready for play (or captured) in hw ring buffer */ @@ -72,6 +73,11 @@ snd_pcm_indirect_playback_transfer(struc bytes = sw_to_end; if (! bytes) break; + if (rec->hw_xfer_block) { + if (bytes < rec->hw_xfer_block) + break; + bytes = rec->hw_xfer_block; + } copy(substream, rec, bytes); rec->hw_data += bytes; if (rec->hw_data == rec->hw_buffer_size) @@ -137,6 +143,11 @@ snd_pcm_indirect_capture_transfer(struct bytes = sw_to_end; if (! bytes) break; + if (rec->hw_xfer_block) { + if (bytes < rec->hw_xfer_block) + break; + bytes = rec->hw_xfer_block; + } copy(substream, rec, bytes); rec->hw_data += bytes; if ((int)rec->hw_data == rec->hw_buffer_size) ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-20 12:55 ` Takashi Iwai @ 2006-10-20 20:12 ` Lee Revell 2006-10-23 13:09 ` Takashi Iwai 0 siblings, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-10-20 20:12 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Fri, 2006-10-20 at 14:55 +0200, Takashi Iwai wrote: > > Anyway, I still can't make it work with the PCM indirect API. The best > > I've managed to produce is choppy sound that seems to underrun every > > period. Do you see what could be wrong with my code? > > You should give a fixed period size (0x2000), i.e. set both > period_bytes_min and period_bytes_max = 0x4000 (and periods_min = 2) in > snd_pcm_hardware. In this way, the ack is called at least at each > time 0x2000 words are processed. > I am doing this already. > BTW, what does dream_get_mpu_data_poll_status_byte() do? > Does it check whether you can another push 0x2000 words? > Yes, exactly. > If so, you should check this before claling > snd_pcm_indirect_playback_transfer() in dream_work_transfer(). > OK, I've made this change. > > Possibly, the block transfer patch might be not needed if you add this > check. > OK, I feel I am getting closer. But it seems that only the first period is played. Upon further inspection it looks like snd_pcm_indirect_playback_transfer only causes dream_playback_copy to be invoked the first time: ~ # aplay /usr/share/sounds/test/phone.wav Playing WAVE '/usr/share/sounds/test/phone.wav': Signed 16 bit Little Endian, Rate 44100 Hz, Stereo hw_params: stereo yes, format 16bit, rate 44100 dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 dream_playback_ack dream_playback_copy: not running! dream_pcm_start: channel is 0 dream_work_transfer: polling status byte returned 0 calling snd_pcm_indirect_playback_transfer in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 dream_playback_ack dream_playback_ack dream_work_transfer: polling status byte returned 0 calling snd_pcm_indirect_playback_transfer dream_playback_ack playback size is 0x1005 dream_playback_ack playback size is 0x100a dream_playback_ack playback size is 0x100f dream_playback_ack playback size is 0x1015 dream_playback_ack dream_playback_ack (eventually underruns and errors are reported) Here is the code: /* transfer the given size of data to the hardware */ static void dream_playback_copy(snd_pcm_substream_t *substream, snd_pcm_indirect_t *rec, size_t bytes) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; unsigned char *src = runtime->dma_area + rec->sw_data; unsigned short * pcm_word = (unsigned short *) src; int words = bytes / 2; printk("in dream_playback_copy: 0x%x bytes DMA area 0x%p sw_data 0x%i\n", (int) bytes, runtime->dma_area, rec->sw_data); if (dpcm->stopping) words--; while (words--) { mpu_write16(dream, &dream->midi2, cpu_to_le16(*pcm_word++), 0); /* TBD: don't swap for 8 bits WAV */ dpcm->playback_chunk_left--; if (! dpcm->playback_chunk_left) { dpcm->playback_chunk_left = 0x2000; dream_pcm_end_xfer(dpcm->dream, dpcm->channel); } } if (dpcm->stopping) { mdelay(100); dream_pcm_close(dream, dpcm->channel); dpcm->running = 0; } } static void dream_work_transfer(void *data) { struct snd_card_dream *dream = data; int res; snd_pcm_substream_t *substream = dream->playback_substream; snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; if (!substream || !runtime) return; if (! dpcm->running) { printk("dream_playback_copy: not running!\n"); /* Send PCM_START */ dream_pcm_start(dream, dpcm->channel); dpcm->running = 1; } res = dream_get_mpu_data_poll_status_byte(dream, &dream->midi2, MPU401_IT_MASK, MPU401_IT_VALUE, 100); printk("dream_work_transfer: polling status byte returned %i\n", res); if (res < 0) return; printk("calling snd_pcm_indirect_playback_transfer\n", res); snd_pcm_indirect_playback_transfer(substream, &dream->playback_rec, dream_playback_copy); } /* ack callback - just schedule work */ static int dream_playback_ack(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); printk("dream_playback_ack\n"); schedule_work(&dream->playback_work); return 0; } /* cancel work */ static void dream_playback_cancel(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); cancel_delayed_work(&dream->playback_work); } /* trigger callback */ static int dream_playback_trigger(snd_pcm_substream_t *substream, int cmd) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; switch (cmd) { case SNDRV_PCM_TRIGGER_START: dream_playback_ack(substream); /* fill the data */ snd_card_dream_pcm_timer_start(substream); break; case SNDRV_PCM_TRIGGER_STOP: dpcm->stopping = 1; dream_playback_ack(substream); /* send final incomplete 1/2 bufffer and close */ snd_card_dream_pcm_timer_stop(substream); //dream_playback_cancel(substream); break; default: return -EINVAL; } return 0; } static snd_pcm_uframes_t dream_playback_pointer(snd_pcm_substream_t *substream) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_uframes_t pointer; pointer = snd_pcm_indirect_playback_pointer(substream, &dream->playback_rec, dream->playback_hw_ptr); //printk("dream_playback_pointer: 0x%x\n", (unsigned int) pointer); return pointer; } #endif static int snd_card_dream_pcm_prepare(snd_pcm_substream_t * substream) { snd_pcm_runtime_t *runtime = substream->runtime; struct snd_card_dream_pcm *dpcm = runtime->private_data; struct snd_card_dream *dream = snd_pcm_substream_chip(substream); dpcm->frames_per_period = runtime->rate / HZ; dpcm->frac_per_period = runtime->rate % HZ; dpcm->playback_chunk_left = 0x2000; memset(&dream->playback_rec, 0, sizeof(snd_pcm_indirect_t)); dream->playback_rec.hw_buffer_size = 0x4000; /* byte size */ dream->playback_rec.sw_buffer_size = snd_pcm_lib_buffer_bytes(substream); dream->playback_hw_ptr = 0; dream->playback_hw_ptr_frac = 0; return 0; } static void snd_card_dream_pcm_timer_function(unsigned long data) { struct snd_card_dream_pcm *dpcm = (struct snd_card_dream_pcm *)data; dpcm->timer.expires = 1 + jiffies; add_timer(&dpcm->timer); #ifdef RLR_PCM_REPLAY if (!dpcm->substream) { printk("substream null in timer func\n"); return; } struct snd_card_dream *dream = snd_pcm_substream_chip(dpcm->substream); snd_pcm_runtime_t *runtime = dpcm->substream->runtime; if (!runtime) { printk("runtime null in timer func\n"); return; } unsigned int size; dream->playback_hw_ptr += dpcm->frames_per_period; dream->playback_hw_ptr_frac += dpcm->frac_per_period; if (dream->playback_hw_ptr_frac >= HZ) { dream->playback_hw_ptr++; dream->playback_hw_ptr_frac -= HZ; } dream->playback_hw_ptr %= runtime->buffer_size; if (dream->playback_hw_ptr < dream->playback_last_hw_ptr) size = runtime->buffer_size + dream->playback_hw_ptr - dream->playback_last_hw_ptr; else size = dream->playback_hw_ptr - dream->playback_last_hw_ptr; dream->playback_last_hw_ptr = dream->playback_hw_ptr; dream->playback_size += size; if (dream->playback_size >= runtime->period_size) { printk("playback size is 0x%x\n", dream->playback_size); dream->playback_size %= runtime->period_size; snd_pcm_period_elapsed(dpcm->substream); } } static snd_pcm_ops_t dream_playback_ops = { .open = snd_card_dream_playback_open, .close = snd_card_dream_playback_close, .ioctl = snd_pcm_lib_ioctl, .hw_params = snd_card_dream_hw_params, .hw_free = snd_card_dream_hw_free, .prepare = snd_card_dream_playback_prepare, .pointer = dream_playback_pointer, .trigger = dream_playback_trigger, .ack = dream_playback_ack, }; ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-20 20:12 ` Lee Revell @ 2006-10-23 13:09 ` Takashi Iwai 2006-10-23 17:46 ` Lee Revell 0 siblings, 1 reply; 35+ messages in thread From: Takashi Iwai @ 2006-10-23 13:09 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel [-- Attachment #1: Type: text/plain, Size: 3500 bytes --] At Fri, 20 Oct 2006 16:12:48 -0400, Lee Revell wrote: > > On Fri, 2006-10-20 at 14:55 +0200, Takashi Iwai wrote: > > > Anyway, I still can't make it work with the PCM indirect API. The best > > > I've managed to produce is choppy sound that seems to underrun every > > > period. Do you see what could be wrong with my code? > > > > You should give a fixed period size (0x2000), i.e. set both > > period_bytes_min and period_bytes_max = 0x4000 (and periods_min = 2) in > > snd_pcm_hardware. In this way, the ack is called at least at each > > time 0x2000 words are processed. > > > > I am doing this already. > > > BTW, what does dream_get_mpu_data_poll_status_byte() do? > > Does it check whether you can another push 0x2000 words? > > > > Yes, exactly. > > > If so, you should check this before claling > > snd_pcm_indirect_playback_transfer() in dream_work_transfer(). > > > > OK, I've made this change. > > > > > Possibly, the block transfer patch might be not needed if you add this > > check. > > > > OK, I feel I am getting closer. But it seems that only the first period > is played. Upon further inspection it looks like > snd_pcm_indirect_playback_transfer only causes dream_playback_copy to be > invoked the first time: > > ~ # aplay /usr/share/sounds/test/phone.wav > Playing WAVE '/usr/share/sounds/test/phone.wav': Signed 16 bit Little > Endian, Rate 44100 Hz, Stereo > hw_params: stereo yes, format 16bit, rate 44100 > dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 > dream_playback_ack > dream_playback_copy: not running! > dream_pcm_start: channel is 0 > dream_work_transfer: polling status byte returned 0 > calling snd_pcm_indirect_playback_transfer > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > dream_playback_ack > dream_playback_ack > dream_work_transfer: polling status byte returned 0 > calling snd_pcm_indirect_playback_transfer > dream_playback_ack > playback size is 0x1005 > dream_playback_ack > playback size is 0x100a > dream_playback_ack > playback size is 0x100f > dream_playback_ack > playback size is 0x1015 > dream_playback_ack > dream_playback_ack The problem looks like the calculation of dream->playback_hw_ptr passed to snd_pcm_indirect_playback_pointer(). This must be a _byte_ offset in the "hardware" ring buffer. That is, it's between 0 and (rec->hw_buffer_size - 1). You set hw_buffer_size = 0x4000, but calculate playback_hw_ptr as [0, runtime->buffer_size] and in frames. In such a case, you can set up in the following way: hw_buffer_size = sw_buffer_size = snd_pcm_lib_buffer_bytes(); hw_queue_size = 0x4000; and in pointer callback, snd_pcm_indirect_playback_pointer(substream, &dream->playback_rec, frames_to_bytes(runtime, dream->playback_hw_ptr); Well, but... This may still result in clicking noises because this handles as if periods=1. If we know that more than 0x4000 bytes can be pushed to the hardware, we can increase hw_queue_size accordingly, too. Alternatively, a better way is to let copy callback return the actually written size. Then the block transfer behavior can be implemented in the copy callback side (so you don't need block-transfer patch any more, too). The first patch is the change to API and its user in the current tree to check the return value of copy callback. The second patch is the change/fix to dream.c you attached. I suppose that dream_get_mpu_data_poll_status_bytes() can be called in the interrupt context, too. Takashi [-- Attachment #2: pcm-indirect-copy-check.diff --] [-- Type: application/octet-stream, Size: 6191 bytes --] diff -r d7fe584f7395 include/pcm-indirect.h --- a/include/pcm-indirect.h Thu Oct 19 20:35:56 2006 +0200 +++ b/include/pcm-indirect.h Fri Oct 20 12:51:29 2006 +0200 @@ -37,8 +37,9 @@ struct snd_pcm_indirect { snd_pcm_uframes_t appl_ptr; /* Last seen appl_ptr */ }; -typedef void (*snd_pcm_indirect_copy_t)(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes); +typedef ssize_t (*snd_pcm_indirect_copy_t)(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, + size_t bytes); /* * helper function for playback ack callback @@ -64,6 +65,7 @@ snd_pcm_indirect_playback_transfer(struc unsigned int hw_to_end = rec->hw_buffer_size - rec->hw_data; unsigned int sw_to_end = rec->sw_buffer_size - rec->sw_data; unsigned int bytes = qsize - rec->hw_ready; + ssize_t res; if (rec->sw_ready < (int)bytes) bytes = rec->sw_ready; if (hw_to_end < bytes) @@ -72,15 +74,17 @@ snd_pcm_indirect_playback_transfer(struc bytes = sw_to_end; if (! bytes) break; - copy(substream, rec, bytes); - rec->hw_data += bytes; + res = copy(substream, rec, bytes); + if (res <= 0) + break; + rec->hw_data += res; if (rec->hw_data == rec->hw_buffer_size) rec->hw_data = 0; - rec->sw_data += bytes; + rec->sw_data += res; if (rec->sw_data == rec->sw_buffer_size) rec->sw_data = 0; - rec->hw_ready += bytes; - rec->sw_ready -= bytes; + rec->hw_ready += res; + rec->sw_ready -= res; } } @@ -129,6 +133,7 @@ snd_pcm_indirect_capture_transfer(struct size_t hw_to_end = rec->hw_buffer_size - rec->hw_data; size_t sw_to_end = rec->sw_buffer_size - rec->sw_data; size_t bytes = rec->sw_buffer_size - rec->sw_ready; + ssize_t res; if (rec->hw_ready < (int)bytes) bytes = rec->hw_ready; if (hw_to_end < bytes) @@ -137,15 +142,17 @@ snd_pcm_indirect_capture_transfer(struct bytes = sw_to_end; if (! bytes) break; - copy(substream, rec, bytes); - rec->hw_data += bytes; + res = copy(substream, rec, bytes); + if (res <= 0) + break; + rec->hw_data += res; if ((int)rec->hw_data == rec->hw_buffer_size) rec->hw_data = 0; - rec->sw_data += bytes; + rec->sw_data += res; if (rec->sw_data == rec->sw_buffer_size) rec->sw_data = 0; - rec->hw_ready -= bytes; - rec->sw_ready += bytes; + rec->hw_ready -= res; + rec->sw_ready += res; } } diff -r d7fe584f7395 pci/cs46xx/cs46xx_lib.c --- a/pci/cs46xx/cs46xx_lib.c Thu Oct 19 20:35:56 2006 +0200 +++ b/pci/cs46xx/cs46xx_lib.c Fri Oct 20 12:52:41 2006 +0200 @@ -686,12 +686,14 @@ static void snd_cs46xx_set_capture_sampl * PCM part */ -static void snd_cs46xx_pb_trans_copy(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes) +static ssize_t snd_cs46xx_pb_trans_copy(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, + size_t bytes) { struct snd_pcm_runtime *runtime = substream->runtime; struct snd_cs46xx_pcm * cpcm = runtime->private_data; memcpy(cpcm->hw_buf.area + rec->hw_data, runtime->dma_area + rec->sw_data, bytes); + return bytes; } static int snd_cs46xx_playback_transfer(struct snd_pcm_substream *substream) @@ -702,13 +704,15 @@ static int snd_cs46xx_playback_transfer( return 0; } -static void snd_cs46xx_cp_trans_copy(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes) +static ssize_t snd_cs46xx_cp_trans_copy(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, + size_t bytes) { struct snd_cs46xx *chip = snd_pcm_substream_chip(substream); struct snd_pcm_runtime *runtime = substream->runtime; memcpy(runtime->dma_area + rec->sw_data, chip->capt.hw_buf.area + rec->hw_data, bytes); + return bytes; } static int snd_cs46xx_capture_transfer(struct snd_pcm_substream *substream) diff -r d7fe584f7395 pci/emu10k1/emupcm.c --- a/pci/emu10k1/emupcm.c Thu Oct 19 20:35:56 2006 +0200 +++ b/pci/emu10k1/emupcm.c Fri Oct 20 12:52:01 2006 +0200 @@ -1557,8 +1557,8 @@ static void snd_emu10k1_fx8010_playback_ } } -static void fx8010_pb_trans_copy(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes) +static ssize_t fx8010_pb_trans_copy(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, size_t bytes) { struct snd_emu10k1 *emu = snd_pcm_substream_chip(substream); struct snd_emu10k1_fx8010_pcm *pcm = &emu->fx8010.pcm[substream->number]; @@ -1584,6 +1584,7 @@ static void fx8010_pb_trans_copy(struct tram_pos -= frames; pcm->tram_pos = tram_pos; pcm->tram_shift = tram_shift; + return bytes; } static int snd_emu10k1_fx8010_playback_transfer(struct snd_pcm_substream *substream) diff -r d7fe584f7395 pci/rme32.c --- a/pci/rme32.c Thu Oct 19 20:35:56 2006 +0200 +++ b/pci/rme32.c Fri Oct 20 12:53:19 2006 +0200 @@ -1158,12 +1158,14 @@ snd_rme32_capture_pointer(struct snd_pcm /* ack and pointer callbacks for fullduplex mode */ -static void snd_rme32_pb_trans_copy(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes) +static ssize_t snd_rme32_pb_trans_copy(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, + size_t bytes) { struct rme32 *rme32 = snd_pcm_substream_chip(substream); memcpy_toio(rme32->iobase + RME32_IO_DATA_BUFFER + rec->hw_data, substream->runtime->dma_area + rec->sw_data, bytes); + return bytes; } static int snd_rme32_playback_fd_ack(struct snd_pcm_substream *substream) @@ -1183,13 +1185,15 @@ static int snd_rme32_playback_fd_ack(str return 0; } -static void snd_rme32_cp_trans_copy(struct snd_pcm_substream *substream, - struct snd_pcm_indirect *rec, size_t bytes) +static ssize_t snd_rme32_cp_trans_copy(struct snd_pcm_substream *substream, + struct snd_pcm_indirect *rec, + size_t bytes) { struct rme32 *rme32 = snd_pcm_substream_chip(substream); memcpy_fromio(substream->runtime->dma_area + rec->sw_data, rme32->iobase + RME32_IO_DATA_BUFFER + rec->hw_data, bytes); + return bytes; } static int snd_rme32_capture_fd_ack(struct snd_pcm_substream *substream) [-- Attachment #3: dream.patch --] [-- Type: application/octet-stream, Size: 3850 bytes --] --- dream-old.c 2006-10-23 14:48:57.000000000 +0200 +++ dream.c 2006-10-23 15:04:01.000000000 +0200 @@ -1,5 +1,8 @@ +#define TRANSFER_WORDS 0x2000 +#define TRANSFER_BYTES (TRANSFER_WORDS * 2) + /* transfer the given size of data to the hardware */ -static void dream_playback_copy(snd_pcm_substream_t *substream, +static ssize_t dream_playback_copy(snd_pcm_substream_t *substream, snd_pcm_indirect_t *rec, size_t bytes) { struct snd_card_dream *dream = snd_pcm_substream_chip(substream); @@ -8,10 +11,23 @@ unsigned char *src = runtime->dma_area + rec->sw_data; unsigned short * pcm_word = (unsigned short *) src; - int words = bytes / 2; + int words; + int res; + + if (bytes < TRANSFER_BYTES) + return 0; /* no enough data for a block transfer */ + /* check h/w buffer availability */ + res = dream_get_mpu_data_poll_status_byte(dream, + &dream->midi2, + MPU401_IT_MASK, + MPU401_IT_VALUE, 100); + + if (res < 0) + return 0; printk("in dream_playback_copy: 0x%x bytes DMA area 0x%p sw_data 0x%i\n", (int) bytes, runtime->dma_area, rec->sw_data); + words = TRANSFER_WORDS; /* block transfer */ if (dpcm->stopping) words--; @@ -29,6 +45,7 @@ dream_pcm_close(dream, dpcm->channel); dpcm->running = 0; } + return TRANSFER_BYTES; } static void dream_work_transfer(void *data) @@ -49,16 +66,6 @@ dpcm->running = 1; } - res = dream_get_mpu_data_poll_status_byte(dream, - &dream->midi2, - MPU401_IT_MASK, - MPU401_IT_VALUE, 100); - - printk("dream_work_transfer: polling status byte returned %i\n", res); - - if (res < 0) - return; - printk("calling snd_pcm_indirect_playback_transfer\n", res); snd_pcm_indirect_playback_transfer(substream, &dream->playback_rec, dream_playback_copy); } @@ -110,7 +117,9 @@ struct snd_card_dream *dream = snd_pcm_substream_chip(substream); snd_pcm_uframes_t pointer; - pointer = snd_pcm_indirect_playback_pointer(substream, &dream->playback_rec, dream->playback_hw_ptr); + pointer = snd_pcm_indirect_playback_pointer(substream, + &dream->playback_rec, + frames_to_bytes(dream->playback_hw_ptr)); //printk("dream_playback_pointer: 0x%x\n", (unsigned int) pointer); return pointer; } @@ -127,8 +136,12 @@ dpcm->playback_chunk_left = 0x2000; memset(&dream->playback_rec, 0, sizeof(snd_pcm_indirect_t)); - dream->playback_rec.hw_buffer_size = 0x4000; /* byte size */ + dream->playback_rec.hw_buffer_size = dream->playback_rec.sw_buffer_size = snd_pcm_lib_buffer_bytes(substream); + if (dream->playback_rec.hw_buffer_size % TRANSFER_BYTES) { + printk("INVALID BUFFER SIZE %d\n", dream->playback_rec.hw_buffer_size); + return -EINVAL; + } dream->playback_hw_ptr = 0; dream->playback_hw_ptr_frac = 0; return 0; @@ -172,7 +185,12 @@ printk("playback size is 0x%x\n", dream->playback_size); dream->playback_size %= runtime->period_size; snd_pcm_period_elapsed(dpcm->substream); - } + } else if (dream_get_mpu_data_poll_status_byte(dream, + &dream->midi2, + MPU401_IT_MASK, + MPU401_IT_VALUE, 100) >= 0) + /* transfer more data if available */ + dream_playback_ack(substream); } static snd_pcm_ops_t dream_playback_ops = { [-- Attachment #4: Type: text/plain, Size: 373 bytes --] ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 [-- Attachment #5: Type: text/plain, Size: 161 bytes --] _______________________________________________ Alsa-devel mailing list Alsa-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/alsa-devel ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-23 13:09 ` Takashi Iwai @ 2006-10-23 17:46 ` Lee Revell 2006-10-24 15:01 ` Takashi Iwai 0 siblings, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-10-23 17:46 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Mon, 2006-10-23 at 15:09 +0200, Takashi Iwai wrote: > At Fri, 20 Oct 2006 16:12:48 -0400, > Lee Revell wrote: > > ~ # aplay /usr/share/sounds/test/phone.wav > > Playing WAVE '/usr/share/sounds/test/phone.wav': Signed 16 bit Little > > Endian, Rate 44100 Hz, Stereo > > hw_params: stereo yes, format 16bit, rate 44100 > > dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 > > dream_playback_ack > > dream_playback_copy: not running! > > dream_pcm_start: channel is 0 > > dream_work_transfer: polling status byte returned 0 > > calling snd_pcm_indirect_playback_transfer > > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > > dream_playback_ack > > dream_playback_ack > > dream_work_transfer: polling status byte returned 0 > > calling snd_pcm_indirect_playback_transfer > > dream_playback_ack > > playback size is 0x1005 > > dream_playback_ack > > playback size is 0x100a > > dream_playback_ack > > playback size is 0x100f > > dream_playback_ack > > playback size is 0x1015 > > dream_playback_ack > > dream_playback_ack > > The problem looks like the calculation of dream->playback_hw_ptr > passed to snd_pcm_indirect_playback_pointer(). This must be a _byte_ > offset in the "hardware" ring buffer. That is, it's between 0 and > (rec->hw_buffer_size - 1). You set hw_buffer_size = 0x4000, but > calculate playback_hw_ptr as [0, runtime->buffer_size] and in frames. > With your patches I still get a similar result: snd_pcm_indirect_playback_transfer is only called once or twice and the first chunk of sound repeats endlessly. ~ # aplay /usr/share/sounds/test/phone.wav Playing WAVE '/uhw_params: stereo yes, format 16bit, rate 44100 sr/share/sounds/dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 test/phone.wav' dream_playback_ack dream_playback_copy: not running! dream_pcm_start: channel is 0 : Signed 16 bit calling snd_pcm_indirect_playback_transfer in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 Little Endian, Rdream_playback_copy: polling status byte returned 0 ate 44100 Hz, Stereo dream_playback_copy done dream_playback_ack dream_playback_ack dream_playback_ack calling snd_pcm_indirect_playback_transfer in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x16384 dream_playback_ack dream_playback_pointer: 0x134 At this point playback hangs. I just can't understand why the version with the copy callback worked perfectly, but this PCM indirect stuff has been nothing but trouble. Did I mention I am using ALSA 1.0.9b? Does that version have a buggy PCM indirect implementation? Is it really necessary to make up a fake pointer like this? Can't I just advance the pointer by 0x2000 words after writing each chunk? Here is my userspace PCM playback implementation that works *perfectly*. I just need a way to make ALSA do the same thing when a .wav file is played. dream_pcm_open(chip, port, channel, audio8bits, stereo, sample_rate); dream_pcm_set_vol_fl(chip, port, channel, 0xFFFF); dream_pcm_set_vol_fr(chip, port, channel, 0xFFFF); dream_pcm_start(chip, port, channel); bptr = & buffer[DREAM_MS_WAV_HEADER_SIZE/2]; rest = bufsize-DREAM_MS_WAV_HEADER_SIZE/2; if(audio8bits) /* Endianness doesn't make sence for 8bits WAV files */ endianness = DREAM_BIG_ENDIAN; for (i = 0; rest > 0; i++, bptr+=size) { retval = dream_get_mpu_data_poll_status_byte(chip, port, MPU401_IT_MASK, MPU401_IT_VALUE, 100, & tout); if(retval < 0) pRes += sprintf(pRes, "%d:%d[%u];", i, retval, tout/1000); size = AKA_DIAG_MIN(rest, samples); /* Send more half buffer of dummy samples pausing after each */ for (sample = 0; sample < size; sample++) dream_mpu_oper16(MPU_WRITE_DATA16, port, endianness, &bptr[sample], 0); dream_pcm_end_xfer(chip, port, channel); rest -= size; if(rest == 0) { if(size == samples) /* last half buffer was full */ { rest = 2; /* send 2 additional words */ bptr -= 2; } } } rsp = dream_pcm_close(chip, port, channel); Lee ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-23 17:46 ` Lee Revell @ 2006-10-24 15:01 ` Takashi Iwai 2006-10-24 15:30 ` Lee Revell 2006-10-24 23:54 ` Lee Revell 0 siblings, 2 replies; 35+ messages in thread From: Takashi Iwai @ 2006-10-24 15:01 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Mon, 23 Oct 2006 13:46:13 -0400, Lee Revell wrote: > > On Mon, 2006-10-23 at 15:09 +0200, Takashi Iwai wrote: > > At Fri, 20 Oct 2006 16:12:48 -0400, > > Lee Revell wrote: > > > ~ # aplay /usr/share/sounds/test/phone.wav > > > Playing WAVE '/usr/share/sounds/test/phone.wav': Signed 16 bit Little > > > Endian, Rate 44100 Hz, Stereo > > > hw_params: stereo yes, format 16bit, rate 44100 > > > dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 > > > dream_playback_ack > > > dream_playback_copy: not running! > > > dream_pcm_start: channel is 0 > > > dream_work_transfer: polling status byte returned 0 > > > calling snd_pcm_indirect_playback_transfer > > > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > > > dream_playback_ack > > > dream_playback_ack > > > dream_work_transfer: polling status byte returned 0 > > > calling snd_pcm_indirect_playback_transfer > > > dream_playback_ack > > > playback size is 0x1005 > > > dream_playback_ack > > > playback size is 0x100a > > > dream_playback_ack > > > playback size is 0x100f > > > dream_playback_ack > > > playback size is 0x1015 > > > dream_playback_ack > > > dream_playback_ack > > > > The problem looks like the calculation of dream->playback_hw_ptr > > passed to snd_pcm_indirect_playback_pointer(). This must be a _byte_ > > offset in the "hardware" ring buffer. That is, it's between 0 and > > (rec->hw_buffer_size - 1). You set hw_buffer_size = 0x4000, but > > calculate playback_hw_ptr as [0, runtime->buffer_size] and in frames. > > > > With your patches I still get a similar result: > snd_pcm_indirect_playback_transfer is only called once or twice and the > first chunk of sound repeats endlessly. > > ~ # aplay /usr/share/sounds/test/phone.wav > Playing WAVE '/uhw_params: stereo yes, format 16bit, rate 44100 > sr/share/sounds/dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 > rate 0xac44 > test/phone.wav' dream_playback_ack > dream_playback_copy: not running! > dream_pcm_start: channel is 0 > : Signed 16 bit calling snd_pcm_indirect_playback_transfer > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > Little Endian, Rdream_playback_copy: polling status byte returned 0 > ate 44100 Hz, Stereo > dream_playback_copy done > dream_playback_ack > dream_playback_ack > dream_playback_ack > calling snd_pcm_indirect_playback_transfer > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x16384 > dream_playback_ack > dream_playback_pointer: 0x134 > > At this point playback hangs. Hm, that's bad. Is dream_get_mpu_data_poll_status_byte() function atomic? > I just can't understand why the version with the copy callback worked > perfectly, but this PCM indirect stuff has been nothing but trouble. Yep. > Did I mention I am using ALSA 1.0.9b? Does that version have a buggy > PCM indirect implementation? I don't think so. The pcm-indirect stuff is all inline functions. > Is it really necessary to make up a fake pointer like this? The only reason for using a background transfer is the damn mmap mode. Since mmap mode requires a DMA transfer without manual write operations, you must implement a pseudo DMA engine to transfer the data chunk in background. > Can't I > just advance the pointer by 0x2000 words after writing each chunk? Which pointer? Note that "writing" to the hardware doesn't mean that the samples are "played". The hw_ptr is supposed to be the position currently played. Or, at least, it's the position where the last period was processed. The timer function is necessary to achieve this pseudo hw_ptr. Otherwise the driver has no way to tell apps the current position. Many apps would be confused if you simply pass the written position because it's not the right position (although the difference is not so obvious in your case because the available buffer/period size is small). > Here is my userspace PCM playback implementation that works *perfectly*. ... unless you use mmap mode, dmix, whatever. The mmap emulation doesn't work for codes like dmix at all. Takashi ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-24 15:01 ` Takashi Iwai @ 2006-10-24 15:30 ` Lee Revell 2006-10-24 23:54 ` Lee Revell 1 sibling, 0 replies; 35+ messages in thread From: Lee Revell @ 2006-10-24 15:30 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Tue, 2006-10-24 at 17:01 +0200, Takashi Iwai wrote: > At Mon, 23 Oct 2006 13:46:13 -0400, > Lee Revell wrote: > > > > On Mon, 2006-10-23 at 15:09 +0200, Takashi Iwai wrote: > > > At Fri, 20 Oct 2006 16:12:48 -0400, > > > Lee Revell wrote: > > > > ~ # aplay /usr/share/sounds/test/phone.wav > > > > Playing WAVE '/usr/share/sounds/test/phone.wav': Signed 16 bit Little > > > > Endian, Rate 44100 Hz, Stereo > > > > hw_params: stereo yes, format 16bit, rate 44100 > > > > dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 rate 0xac44 > > > > dream_playback_ack > > > > dream_playback_copy: not running! > > > > dream_pcm_start: channel is 0 > > > > dream_work_transfer: polling status byte returned 0 > > > > calling snd_pcm_indirect_playback_transfer > > > > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > > > > dream_playback_ack > > > > dream_playback_ack > > > > dream_work_transfer: polling status byte returned 0 > > > > calling snd_pcm_indirect_playback_transfer > > > > dream_playback_ack > > > > playback size is 0x1005 > > > > dream_playback_ack > > > > playback size is 0x100a > > > > dream_playback_ack > > > > playback size is 0x100f > > > > dream_playback_ack > > > > playback size is 0x1015 > > > > dream_playback_ack > > > > dream_playback_ack > > > > > > The problem looks like the calculation of dream->playback_hw_ptr > > > passed to snd_pcm_indirect_playback_pointer(). This must be a _byte_ > > > offset in the "hardware" ring buffer. That is, it's between 0 and > > > (rec->hw_buffer_size - 1). You set hw_buffer_size = 0x4000, but > > > calculate playback_hw_ptr as [0, runtime->buffer_size] and in frames. > > > > > > > With your patches I still get a similar result: > > snd_pcm_indirect_playback_transfer is only called once or twice and the > > first chunk of sound repeats endlessly. > > > > ~ # aplay /usr/share/sounds/test/phone.wav > > Playing WAVE '/uhw_params: stereo yes, format 16bit, rate 44100 > > sr/share/sounds/dream_pcm_open: channel 0x0 eight_bit 0x0 stereo 0x1 > > rate 0xac44 > > test/phone.wav' dream_playback_ack > > dream_playback_copy: not running! > > dream_pcm_start: channel is 0 > > : Signed 16 bit calling snd_pcm_indirect_playback_transfer > > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x0 > > Little Endian, Rdream_playback_copy: polling status byte returned 0 > > ate 44100 Hz, Stereo > > dream_playback_copy done > > dream_playback_ack > > dream_playback_ack > > dream_playback_ack > > calling snd_pcm_indirect_playback_transfer > > in dream_playback_copy: 0x4000 bytes DMA area 0xcdfb0000 sw_data 0x16384 > > dream_playback_ack > > dream_playback_pointer: 0x134 > > > > At this point playback hangs. > > Hm, that's bad. Is dream_get_mpu_data_poll_status_byte() function > atomic? > It won't sleep, but it could spin for milliseconds. I think it's probably not optimal to call from interrupt context... > > Is it really necessary to make up a fake pointer like this? > > The only reason for using a background transfer is the damn mmap > mode. Since mmap mode requires a DMA transfer without manual write > operations, you must implement a pseudo DMA engine to transfer the > data chunk in background. > I think it's not required to support mmap - read/write only is acceptable. Lee ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-10-24 15:01 ` Takashi Iwai 2006-10-24 15:30 ` Lee Revell @ 2006-10-24 23:54 ` Lee Revell 1 sibling, 0 replies; 35+ messages in thread From: Lee Revell @ 2006-10-24 23:54 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Tue, 2006-10-24 at 17:01 +0200, Takashi Iwai wrote: > At Mon, 23 Oct 2006 13:46:13 -0400, > Lee Revell wrote: > > I just can't understand why the version with the copy callback worked > > perfectly, but this PCM indirect stuff has been nothing but trouble. > > Yep. Takashi-san, Thanks very much for your help. I've actually revisited our original driver, and after fixing a few bugs (a few kernel stack overflows and some debug output that was disturbing the timing) I've got the original design using copy callback working perfectly. Thanks again, Lee ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-30 2:03 ` Lee Revell 2006-10-03 15:35 ` Lee Revell @ 2006-10-04 9:07 ` Takashi Iwai 1 sibling, 0 replies; 35+ messages in thread From: Takashi Iwai @ 2006-10-04 9:07 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Fri, 29 Sep 2006 22:03:09 -0400, Lee Revell wrote: > > On Wed, 2006-09-27 at 19:18 +0200, Takashi Iwai wrote: > > At Mon, 25 Sep 2006 15:54:22 -0400, > > Lee Revell wrote: > > > Here is the code - can you see anything wrong with it? It's derived > > > from the ALSA dummy driver and still uses the dummy driver's hardware > > > pointer and timer scheme for calling pcm_period_elapsed() - could this > > > be the problem? My attempts to track the hardware pointer more > > > accurately and call pcm_period_elapsed from the copy callback have not > > > been successful. > > > > > > Most of the work is done in the workqueue dream_handle_pcm_queue() and > > > the copy callback. > > > > Well, when you have an intermediate buffer (allocated via > > snd_pcm_lib_malloc(), you don't need copy and silent callbacks. The > > data is written on the intermediate buffer. Then, the workq copies > > the data again from this buffer to the hardware in the background. > > > > As I mentioned, the helpers in pcm-indirect.h might make things eaiser > > for such an implementation. In your case, a pseudo code would look > > like below. > > OK, I've implemented this, but it's still not quite working. > > I've noticed that with the PCM indirect implementation, the pointer > callback invokes the ack callback every time: > > pcm-indirect.h: > > 90 */ > 91 static inline snd_pcm_uframes_t > 92 snd_pcm_indirect_playback_pointer(snd_pcm_substream_t *substream, > 93 snd_pcm_indirect_t *rec, unsigned int ptr) > 94 { > > [ ... ] > > 103 if (substream->ops->ack) > 104 substream->ops->ack(substream); > > But, this cannot work with our hardware, if the ack callback always > transfers 0x2000 words from the hardware, because the pointer callback > is invoked many times per period. Actually it's not always 0x2000 words in this case. The ack calls snd_pcm_indirect_playback_transfer() and this passes the size of data to be copied as the argument. > What are the exact rules/semantics for when the ack callback is invoked? > > Also, how should I know when to invoke snd_pcm_period_elapsed? It only > seems to work if I call it from a timer - if I move it to the ack or > copy callback, playback hangs after the first chunk of audio is played. The invokation of snd_pcm_period_elapsed() must be in an interrupt handler (in your case a timer handler). The pcm_indirect framework doesn't provide the irq mechanism but only the transfer mechanism. When snd_pcm_period_elapsed() is called, the ack is called from the PCM core layer so you don't need any other tasks. Takashi ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-19 15:15 ` Takashi Iwai 2006-09-25 19:54 ` Lee Revell @ 2006-09-27 13:58 ` Lee Revell 2006-09-27 16:52 ` Takashi Iwai 1 sibling, 1 reply; 35+ messages in thread From: Lee Revell @ 2006-09-27 13:58 UTC (permalink / raw) To: Takashi Iwai; +Cc: alsa-devel On Tue, 2006-09-19 at 17:15 +0200, Takashi Iwai wrote: > The workq task is a pseudo-DMA engine that runs in background. > When it's woken up, it feeds data from intermediate buffer to hardware > as much as possible. The available data can be found in > > If all data are fed, it sleeps again (or exit > the task). The timer wakes up or schedule the new workq task. I think something is missing - the available data can be found where? Lee ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver design question 2006-09-27 13:58 ` Lee Revell @ 2006-09-27 16:52 ` Takashi Iwai 0 siblings, 0 replies; 35+ messages in thread From: Takashi Iwai @ 2006-09-27 16:52 UTC (permalink / raw) To: Lee Revell; +Cc: alsa-devel At Wed, 27 Sep 2006 09:58:53 -0400, Lee Revell wrote: > > On Tue, 2006-09-19 at 17:15 +0200, Takashi Iwai wrote: > > The workq task is a pseudo-DMA engine that runs in background. > > When it's woken up, it feeds data from intermediate buffer to hardware > > as much as possible. The available data can be found in > > > > If all data are fed, it sleeps again (or exit > > the task). The timer wakes up or schedule the new workq task. > > I think something is missing - the available data can be found where? We track the following two pointers (assume the playback direction): - hw_ptr points the current transferring position in the ring buffer - appl_ptr points the current position the data is filled The available data size is calculated as (appl_ptr - hw_ptr). The appl_ptr is the data size given by the application, i.e. it's incremented at each time the app writes data chunk. So, the interest for the driver is basically only hw_ptr. The only access point that the driver provides hw_ptr is the pointer callback. It returns a value between 0 and buffer_size-1, and the PCM core layer expands to a linear position in 0 - pcm->boundary_size. In case the hardware doesn't provide the exact DMA position... it's a bit hard. If the hardware generates an irq at each period boundary, the driver simply updates the hw_ptr at each interrupt (before calling snd_pcm_period_elapsed()) such as hw_ptr += frames_per_period; hw_ptr %= buffer_size; then pass this value in pointer callback. If the hardware doesn't generate irq, we have to rely on the system timer just like your code. In that case, the driver must keep tracking the pointer by itself, such as frames_per_period = rate / HZ; frac_per_period = rate % HZ; hw_ptr += frames_per_period; frac += frac_per_period; if (frac >= HZ) { hw_ptr++; frac -= HZ; } hw_ptr %= buffer_size; Takashi ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV ^ permalink raw reply [flat|nested] 35+ messages in thread
* Driver Design Question
@ 2013-10-22 7:02 Johannes Thumshirn
2013-10-23 3:10 ` Guenter Roeck
0 siblings, 1 reply; 35+ messages in thread
From: Johannes Thumshirn @ 2013-10-22 7:02 UTC (permalink / raw)
To: linux-kernel; +Cc: Samuel Ortiz, Lee Jones, Michael Buesch, Johannes Thumshirn
Hi List,
I have a design question concerning a device driver. The device in question is
somewhere in between drivers/mfd/timberdale and drivers/ssb. It is mapped
connected via PCI and on PCI Bar 0 there is a table describing which
"sub-devices" are contained in the FPGA as well as where their Memory and IRQ
resources are.
Unlike the timberdale driver, there is no static configuration of the FPGA's
sub-devices, but their number and kind is variable. But luckily we have unique
device-ids for every sub-device, so it is possible to do a PCI/USB like
enumeration.
In my understanding the MFD API, which timberdale uses, isn't tailored to this
Plug'n'Play like behavior. Whereas the I think (virtual) bus concept used by
SSB is much more suited for this kind of devices. But would it be wise to add a
bus only suited to devices manufactured by one vendor, when there is already a
API for such kinds of multi function devices?
Long story short, which would be the preferred way to implement such a driver? At
the point I currently reached I could go in both directions.
I'd appreciate any advice I can get on this topic.
Thanks in advance,
Johannes
P.S.: MFD and SSB maintainers are on CC as I'd really like to hear their opinion
^ permalink raw reply [flat|nested] 35+ messages in thread* Re: Driver Design Question 2013-10-22 7:02 Driver Design Question Johannes Thumshirn @ 2013-10-23 3:10 ` Guenter Roeck 2013-10-23 7:29 ` Johannes Thumshirn 0 siblings, 1 reply; 35+ messages in thread From: Guenter Roeck @ 2013-10-23 3:10 UTC (permalink / raw) To: Johannes Thumshirn, linux-kernel Cc: Samuel Ortiz, Lee Jones, Michael Buesch, Greg Kroah-Hartman On 10/22/2013 12:02 AM, Johannes Thumshirn wrote: > Hi List, > > I have a design question concerning a device driver. The device in question is > somewhere in between drivers/mfd/timberdale and drivers/ssb. It is mapped > connected via PCI and on PCI Bar 0 there is a table describing which > "sub-devices" are contained in the FPGA as well as where their Memory and IRQ > resources are. > > Unlike the timberdale driver, there is no static configuration of the FPGA's > sub-devices, but their number and kind is variable. But luckily we have unique > device-ids for every sub-device, so it is possible to do a PCI/USB like > enumeration. > > In my understanding the MFD API, which timberdale uses, isn't tailored to this > Plug'n'Play like behavior. Whereas the I think (virtual) bus concept used by Not sure if that is true. There is no requirement to declare mfd cells statically. As long as the devices don't change after mfd probe, an mfd based solution would at least be implementable. However, adding support for new sub-devices might be an issue, as you would have to update the mfd driver to add support for each new device (to create its mfd cell and platform data), in addition to writing the actual driver. > SSB is much more suited for this kind of devices. But would it be wise to add a > bus only suited to devices manufactured by one vendor, when there is already a > API for such kinds of multi function devices? > Assuming you refer to mfd, isn't that a contradiction ? You just stated that mfd doesn't exactly meet your requirements. There is also an API for adding a new bus, and it is used quite widely. Question for me would be if the additional overhead for adding a bus outweighs its benefits. > Long story short, which would be the preferred way to implement such a driver? At > the point I currently reached I could go in both directions. > > I'd appreciate any advice I can get on this topic. > I'm adding Greg KH to the thread. Maybe he has some useful advice as the driver core maintainer. I have struggled with the question if to add a bus myself, so maybe I can get something useful out of it ;). Thanks, Guenter > Thanks in advance, > > Johannes > > P.S.: MFD and SSB maintainers are on CC as I'd really like to hear their opinion > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Driver Design Question 2013-10-23 3:10 ` Guenter Roeck @ 2013-10-23 7:29 ` Johannes Thumshirn 0 siblings, 0 replies; 35+ messages in thread From: Johannes Thumshirn @ 2013-10-23 7:29 UTC (permalink / raw) To: Guenter Roeck Cc: Johannes Thumshirn, linux-kernel, Samuel Ortiz, Lee Jones, Michael Buesch, Greg Kroah-Hartman On Tue, Oct 22, 2013 at 08:10:00PM -0700, Guenter Roeck wrote: > On 10/22/2013 12:02 AM, Johannes Thumshirn wrote: > >Hi List, > > > >I have a design question concerning a device driver. The device in question is > >somewhere in between drivers/mfd/timberdale and drivers/ssb. It is mapped > >connected via PCI and on PCI Bar 0 there is a table describing which > >"sub-devices" are contained in the FPGA as well as where their Memory and IRQ > >resources are. > > > >Unlike the timberdale driver, there is no static configuration of the FPGA's > >sub-devices, but their number and kind is variable. But luckily we have unique > >device-ids for every sub-device, so it is possible to do a PCI/USB like > >enumeration. > > > >In my understanding the MFD API, which timberdale uses, isn't tailored to this > >Plug'n'Play like behavior. Whereas the I think (virtual) bus concept used by > > Not sure if that is true. There is no requirement to declare mfd cells > statically. As long as the devices don't change after mfd probe, an mfd based > solution would at least be implementable. Yes it would be implementable, but would it be a good idea to do so? > > However, adding support for new sub-devices might be an issue, as you would > have to update the mfd driver to add support for each new device (to create its > mfd cell and platform data), in addition to writing the actual driver. > Adding sub-devices could probably be done without changing th mfd driver, as I have a device-id, which is part o the article number, so I could generate the string for the modalias in a generic way. > >SSB is much more suited for this kind of devices. But would it be wise to add a > >bus only suited to devices manufactured by one vendor, when there is already a > >API for such kinds of multi function devices? > > > Assuming you refer to mfd, isn't that a contradiction ? You just stated that mfd > doesn't exactly meet your requirements. There is also an API for adding a new bus, > and it is used quite widely. Well my requirements probably can be met using the mfd framework, but I'm not shure if it would be a good design decission then. Or maybe adding a bus would be a better decission. The thing is, I don't want to make me a maintainence hell, but have something I ideally don't need to touch a 2nd time once it is done right. > > Question for me would be if the additional overhead for adding a bus outweighs its > benefits. > > >Long story short, which would be the preferred way to implement such a driver? At > >the point I currently reached I could go in both directions. > > > >I'd appreciate any advice I can get on this topic. > > > > I'm adding Greg KH to the thread. Maybe he has some useful advice as the driver core > maintainer. I have struggled with the question if to add a bus myself, so maybe I can > get something useful out of it ;). This really would be great. If you can get an answer as well, we'd have a win win situation ;). Thanks a lot for your comments. Johannes ^ permalink raw reply [flat|nested] 35+ messages in thread
* Driver design question @ 2006-09-12 20:27 Lee Revell 0 siblings, 0 replies; 35+ messages in thread From: Lee Revell @ 2006-09-12 20:27 UTC (permalink / raw) To: alsa-devel; +Cc: Takashi Iwai I have a device here that can play PCM, but it's a weird implementation. Only 2 periods of 0x2000 words per buffer are supported, and there is no DMA - the driver must poll a status register, and when the chip is ready, deliver all 0x2000 words using outw() then send an end xfer command. I think I need to use an intermediate buffer, and implement copy/silence callbacks that write to this. Then I plan to use a tasklet or workqueue to do the actual xfer of PCM data to the hardware. A timer callback will periodically check whether at least 0x2000 words are in the buffer and if so, schedule the tasklet to drain it. I presume this implementation cannot support mmap() as there's no random access to the hardware buffer. Is this correct? Do I call snd_pcm_period_elapsed() from the timer, immediately before scheduling the tasklet, or from the tasklet, after the data has been transferred to the hw? In general, does this seem like a reasonable implementation? Lee ------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question
@ 2005-05-19 6:24 Jean Delvare
2005-05-19 6:24 ` Philip Edelbrock
` (9 more replies)
0 siblings, 10 replies; 35+ messages in thread
From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw)
To: lm-sensors
Hi all,
Reading some code today, I noticed that most of our drivers, if not all,
read the current values and the limit values when updated. Why that? I
would have read the current values only. We set the limits ourself
(either at init time or through procfs/sysfs) so they are unlikely to
have changed. Looks like an overhead we could easily get rid of. Is
there something obvious I am missing?
Thanks.
--
Jean Delvare
http://www.ensicaen.ismra.fr/~delvare/
^ permalink raw reply [flat|nested] 35+ messages in thread* driver design question 2005-05-19 6:24 driver " Jean Delvare @ 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Jean Delvare ` (8 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Philip Edelbrock @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors I think we should try to read and report what we know. If we don't read the limits, we probably shouldn't report them either (i.e. make them write-only). Reducing overhead by introducing assumptions in the code (like that limits won't change), is a bad idea, imho. At least, I think this is more important at the driver level than the user-space apps. Are we having problems with performance and overhead issues? Phil Jean Delvare wrote: >Hi all, > >Reading some code today, I noticed that most of our drivers, if not all, >read the current values and the limit values when updated. Why that? I >would have read the current values only. We set the limits ourself >(either at init time or through procfs/sysfs) so they are unlikely to >have changed. Looks like an overhead we could easily get rid of. Is >there something obvious I am missing? > >Thanks. > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock @ 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock ` (7 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors > >Reading some code today, I noticed that most of our drivers, if not > >all, read the current values and the limit values when updated. Why > >that? I would have read the current values only. We set the limits > >ourself(either at init time or through procfs/sysfs) so they are > >unlikely to have changed. Looks like an overhead we could easily get > >rid of. Is there something obvious I am missing? > > I think we should try to read and report what we know. If we don't > read the limits, we probably shouldn't report them either (i.e. make > them write-only). Reducing overhead by introducing assumptions in the > code (like that limits won't change), is a bad idea, imho. I think I understand your point. You want us to play it safe and you must be right. However, what could make the limits change? The chip (refering to ADM1025 here, since it's the one I'm working on right now) doesn't change them on his own unless it is power-cycled (a case we don't handle anyway, and I believe we are right to do so). The only way the limits can be changed is through our driver, and in this case we can store it on-the-fly instead of reading it back on every update. One may object that this doesn't allow us to check wether a round-trip from userspace to the registers and back again is OK (we assume that the right value was written to the register). However, we are never sure the register value is right, even if we read the values back. The two conversions formula can be bad but compensate each other, and it's transparent from user's point of view. Anyway, a compromise is to read the limits back once (at the time we write them) so we don't loose this is-the-conversion-OK information. Another compromise is to read the limits but not on every update. This would however require to have two "last update" values recorded into the driver. Not very clean IMHO. Is there any way values can be written to the driver's register without telling us? > At least, I think this is more important at the driver level than > the user-space apps. Not sure I get you there, is there a similar problematic with user-space apps? > Are we having problems with performance and overhead issues? No (at least none I am aware of). But I don't like the idea of our driver doing something that could be avoided with no drawback. Please realize that for each update, which can occur typically as often as once every 1.5 second, 1/3 of the work is for the current values, which is legitimate, and 2/3 of the work is for limits, which 100% of the time AFAIK did not change. Multiplying the amount of work by three is more than just overhead, isn't it? Thus my question, and I ask it again, maybe more clearly than the first time. How could the chip's register be changed without using our driver? -- Jean Delvare http://www.ensicaen.ismra.fr/~delvare/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Jean Delvare @ 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Jean Delvare ` (6 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Philip Edelbrock @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors BTW- I'm glad you're looking for optimizations and clean-ups. I appreciate your efforts. I think this is a topic which is really has valid points either way. As you suggest, I think I am more leaning towards playing it safe. :') I'll make a few comments and clarifications inline, and perhaps Mark and some others can weigh in. Jean Delvare wrote: >Is there any way values can be written to the driver's register without >telling us? > I think any other communication with the chip which isn't through our driver (Bios? or some other code?) might change the values, although it is unlikely. It's also possible that a driver may be misdetected or forced. It would be odd if the driver gave misleading 'correct' limit readings in these cases. I don't think it is worth the risl In a nutshell, I think lower level drivers which talk to hardware shouldn't do any internal caching or optimizing which isn't nessesary. I.e., when values are asked for, it should talk directly with the hardware to ask for them and not make any assumptions. Now, for user-space apps, it could be more justifiable to cache or assume values. At least, I'd rather see those sorts of hacks done in user-space than kernel driver space. For this paricular issue (if we are trying to optimize speed or limit transaction numbers over the I2C bus), it doesn't buy anything so it is a moot point. > >No (at least none I am aware of). But I don't like the idea of our >driver doing something that could be avoided with no drawback. Please >realize that for each update, which can occur typically as often as once >every 1.5 second, 1/3 of the work is for the current values, which is >legitimate, and 2/3 of the work is for limits, which 100% of the time >AFAIK did not change. Multiplying the amount of work by three is more >than just overhead, isn't it? > >Thus my question, and I ask it again, maybe more clearly than the first >time. How could the chip's register be changed without using our driver? > > Well, I guess it is a matter of ideals. I've thought of kernel hardware drivers as being an inbetween for the user-space apps and the actual hardware with absolutely as little tainting of the data as possible. Even if the values in the hardware's registers are unlikely to change or change infrequently, we still should pass values straight through and not make any assumptions on the data in code. As long as overhead and processor loading isn't an issue, I'd suggest we not cache or code in any value/limit assumptions into the code. I2C transactions are slow, but they cause little (if any) CPU loading or resource hogging. Even as far as percentage usage of the I2C bus, it is small. I can't see any possible significant performance gains by caching things like limits. Also, there are at least some potential drawbacks (at least marginally more than simply 'no drawbacks'), so I see some additional risk with no gain. BTW- The worst offender of reading values over and over with little chance of change is the eeprom driver. For each device, if I remember correctly, each byte read takes two I2C transactions for 256 bytes! So, if you have 4 dimms (with an eeprom on each), you are doing over 2 thousand transactions. Still, though, this take little time and almost no CPU overhead, so it doesn't seem to hurt anything. Anyways, that's my $0.02. :') Phil ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (2 preceding siblings ...) 2005-05-19 6:24 ` Philip Edelbrock @ 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock ` (5 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors > >Is there any way values can be written to the driver's register > >without telling us? > > I think any other communication with the chip which isn't through our > driver (Bios? or some other code?) might change the values, although > it is unlikely. No other code should. If they do, they are bad, not us. As for the BIOS, is there any BIOS code running after boot time? I would have said no, but you seem to think the other way. Of course, one can use i2cset to write some arbitrary value to any register, and break the driver if it is designed my way, while it woudln't if designed your way. But someone doing so has to know what he/she does, isn't it? I2cset is meant for developing/debugging purposes, not regular use. > It's also possible that a driver may be misdetected or forced. > It would be odd if the driver gave misleading 'correct' > limit readings in these cases. I don't think it is worth the risk. As I was proposing, we still can read the limit back once, at the time we write it at first (and any later time we change it) so if it really is wrong (such as the register is read only) we'll see it exactly the same way we would with the current policy. As far as I know, the chips only seldom, if ever, change their R/W registers on their own, except at init time (none of our business) and reset (this could happen, I admit, but shouldn't after module init). So the only two errors I can think of if the chip isn't what it is supposed to be are writing a limit to a read only register (which can be handled as described right above) and hitting some reset register or bit, which is likely to show the same way (reading the value back probably won't be what was expected) and will have other side effects the user will probably notice. All these are problems caused by the driver being used with wrong hardware, which will not work anyway. It looks like we "optimize" our drivers for the very very few cases were the things go bad. I don't pretend we must ignore problems and remove error checking from our code ;) but still I believe we should optimize the working, not the faulty. One thing I'm strongly thinking of is disabling the limit readings *unless* the module is loaded with any force parameter (still mainly thinking of my ADM1025/NE1619 driver). This makes the secure path back in case a user is forcing the driver, but makes the driver faster for regular users. > In a nutshell, I think lower level drivers which talk to hardware > shouldn't do any internal caching or optimizing which isn't nessesary. And I don't quite agree. We are already caching since we limit the frequency at which new data can be obtained. Also, note that I don't advocate in favor of caching too much. For example, one could think of caching VID readings since it's unlikely to change after boot time, still I think it could in some cases (I guess Speedstep technology and the like could want to change that value) so we have to "sample" it continuously. I just want to limit what I consider useless re-readings, (mainly limits). > I.e., when values are asked for, it should talk directly with the > hardware to ask for them and not make any assumptions. I understand your point of view. I think it's motivated by the fact that you are a developer (I am too BTW ;)) and you must love the idea of watching whatever happens to the internal registers of every chipset in your computer. I love playing with them too, don't doubt it :) But what about the end-user? He/she wants an efficient driver. He/she event just wants temperature, etc. readings. The fact that a driver is required for that, and how the driver does it, is far beyond his/her interest. Again, I don't mean we can do anything that would make the driver faster, at any cost. I don't want speed. I want efficiency. > Now, for user-space apps, it could be more justifiable to cache or > assume values. At least, I'd rather see those sorts of hacks done in > user-space than kernel driver space. OK, maybe I see what you mean now, but I think you're missing something. At the time being, the user-space applications cannot cache a given set of values and refresh only the rest, because most of our drivers, if not all, have a single update function. If *one* value is queried and the data isn't up-to-date, *all* values are refreshed. This is the reason why I want to limit what is refreshed. (Next paragraph is me thinking while writing, not necessary very clear, feel free to plain skip or ask for better explanations) We could think of splitting all values into groups for separate refreshes, but it ain't that obvious. The groups would be "vertical" (understand: current values form one group, limits form one group...) from the driver's point of view, while they would be "horizontal" (understand: in0, in0_min and in0_max form one group...) from user-space (because for example reading /proc/dev/sys/.../in0 needs these three values). Or we can have individual refreshed (that is, each group only has one value in it), but considering you need one timestamp per group, it will be really painful. What's more, I think most applications will read all values at a time, which justifies there is a single update function in our drivers. This is how I came to the idea of not reading the limits as being the better optimization one could think of. > For this paricular issue (if we are trying to optimize speed or > limit transaction numbers over the I2C bus), it doesn't buy > anything so it is a moot point. I guess the I2C bus isn't overloaded by our drivers, for sure ;) On the other hand, speed (I prefer the term of quickness or reactivity) could be discussed. For example, running "sensors" on my desktop system takes about 500ms (three busses, one chip on each) when an update is needed. OK, 500ms isn't that long, but think of it. It doesn't do that many things in 500ms. It could be faster, isn't it? If it ain't much of an issue for the sensors program, I think it is for apps that monitor the values continuously (such as gkrellm). My proposal could lower the time to say 200ms, which is far better IMHO. > >Thus my question, and I ask it again, maybe more clearly than the > >first time. How could the chip's register be changed without using > >our driver? > > Well, I guess it is a matter of ideals. I've thought of kernel > hardware drivers as being an inbetween for the user-space apps and the > actual hardware with absolutely as little tainting of the data as > possible. We already do many transformations, even between the registers and the internal vars. > Even if the values in the hardware's registers are unlikely > to change or change infrequently, we still should pass values straight > through and not make any assumptions on the data in code. My point isn't about values that are unlikely to change (see my opinion about VID above) but about values that *should* not change. It's slightly different (in the spirit if not in the facts). > As long as overhead and processor loading isn't an issue, I'd suggest > we not cache or code in any value/limit assumptions into the code. > I2C transactions are slow, but they cause little (if any) CPU loading > or resource hogging. Even as far as percentage usage of the I2C bus, > it is small. I can't see any possible significant performance gains > by caching things like limits. Also, there are at least some > potential drawbacks (at least marginally more than simply 'no > drawbacks'), so I see some additional risk with no gain. Hardly anything I can say against this, you're of course right. It's more about having a "beautiful" driver than anything else. > BTW- The worst offender of reading values over and over with little > chance of change is the eeprom driver. For each device, if I > remember correctly, each byte read takes two I2C transactions for > 256 bytes! So, if you have 4 dimms (with an eeprom on each), you > are doing over 2 thousand transactions. Still, though, this > take little time and almost no CPU overhead, so it doesn't seem > to hurt anything. You're definitely right. I guess this is the reason why the update frequency is limited to 1 each 5 minutes as opposed to 1 each 1 or 2 seconds for the rest of our drivers. Refreshing the two eeproms on my laptops takes 5 full seconds (and I was complaining about the 500ms on the other system...) The worst part of the story is that over 50% of the data we retrieve isn't even used. Anyway it's a bit different, the EEPROM driver isn't a sensor-like driver. It isn't meant to be refreshed frequently. All data in the EEPROM are the same (update-frequency-wise), so that a user-space app can cache the whole thing and never use the driver again, while it's different with usual chipsets. While it's useful for our tests, few people want to use that driver (I even wonder if we should tell the users to modprobe that module at the end of sensors-detect). > Anyways, that's my $0.02. :') I add my 0.02 Eur (not much more ;)). Long way still before we can afford a drink ;) Thanks a lot for the clarifications, I really appreciate when we can exchange our views like that. (My state of mind after this step: I think I'll use "my" policy in the ADM1025 as a test. It's not widely used AFAIK so we don't risk much. If we have feedback of any kind, we'll could either revert the driver to the usual policy, of extend the experiment to other drivers. Or I can just be quiet and leave the ADM1025 driver as the other ones, if you or anyone doesn't want me to experiment. What is certain after our discussion is that I won't change all drivers to my policy right now ;)) -- Jean Delvare http://www.ensicaen.ismra.fr/~delvare/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (3 preceding siblings ...) 2005-05-19 6:24 ` Jean Delvare @ 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Mark M. Hoffman ` (4 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Philip Edelbrock @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors Good discussion. Let me toss a few very quick comments: Jean Delvare wrote: > And I don't quite agree. We are already caching since we limit the > >frequency at which new data can be obtained. [...] > Quick note: The refresh frequency limit was introduced because the hardware would give bad results if you polled it too fast. So, depending on what the datasheet for the chip says, we adjust the max polling frequency to match. So this was to ensure accurate results from the hardware, not as a performace optimization. > Hardly anything I can say against this, you're of course right. It's > >more about having a "beautiful" driver than anything else. > > My idea of the ideal driver is that it is almost transparent to the hardware. The less caching, the better. Now, your point about reducing a significant amount of latency is a good one. That's a reasonable reason for such an optimization. I'm not sure if it is a strong enough reason, though. I think we'd need to do a little math to figure out exactly how much it 'costs' to read limits and what it would save us to poll them less frequently. Phil ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (4 preceding siblings ...) 2005-05-19 6:24 ` Philip Edelbrock @ 2005-05-19 6:24 ` Mark M. Hoffman 2005-05-19 6:24 ` Philip Pokorny ` (3 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Mark M. Hoffman @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors * Philip Edelbrock <phil@edgedesign.us> [2003-07-29 09:50:45 -0700]: > > Jean Delvare wrote: > > >And I don't quite agree. We are already caching since we limit the > >frequency at which new data can be obtained. [...] > > > Quick note: The refresh frequency limit was introduced because the > hardware would give bad results if you polled it too fast. So, > depending on what the datasheet for the chip says, we adjust the max > polling frequency to match. So this was to ensure accurate results from > the hardware, not as a performace optimization. Take a look at the ADM1026 driver (by Phil P.) as a counter-example. The update routine is apropos to this thread. He uses two different refresh frequencies - the slower one for config data. I thought this was a good idea when I first saw it. > My idea of the ideal driver is that it is almost transparent to the > hardware. The less caching, the better. Much as I try, I can't jump off the fence here. OOH, it seems wasteful to keep re-reading what are almost certainly non-volatile h/w registers. OTOH, what are we optimizing here? - a special purpose slow bus that has nothing better to do (I2C); or an insignificant amount of ISA port IO. > Now, your point about reducing a significant amount of latency is a good > one. That's a reasonable reason for such an optimization. I'm not sure > if it is a strong enough reason, though. I think we'd need to do a > little math to figure out exactly how much it 'costs' to read limits and > what it would save us to poll them less frequently. It depends on the chip of course - but I think the ADM1026 driver might be a good model to follow for others which are deemed worth optimizing. Regards, -- Mark M. Hoffman mhoffman@lightlink.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (5 preceding siblings ...) 2005-05-19 6:24 ` Mark M. Hoffman @ 2005-05-19 6:24 ` Philip Pokorny 2005-05-19 6:24 ` Jean Delvare ` (2 subsequent siblings) 9 siblings, 0 replies; 35+ messages in thread From: Philip Pokorny @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors Mark M. Hoffman wrote: > * Philip Edelbrock <phil@edgedesign.us> [2003-07-29 09:50:45 -0700]: > >>Jean Delvare wrote: >>My idea of the ideal driver is that it is almost transparent to the >>hardware. The less caching, the better. > > Much as I try, I can't jump off the fence here. OOH, it seems wasteful to > keep re-reading what are almost certainly non-volatile h/w registers. OTOH, > what are we optimizing here? - a special purpose slow bus that has nothing > better to do (I2C); or an insignificant amount of ISA port IO. Don't under-estimate the delays associated with doing ISA port I/O. It can take *thousands* of clock cycles to perform just one port I/O instruction. The kernel even uses I/O to port 0x80 as a short timing delay. :v) -- Philip Pokorny, Director of Engineering Tel: 415-358-2635 Fax: 415-358-2646 Toll Free: 888-PENGUIN PENGUIN COMPUTING, INC. www.penguincomputing.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (6 preceding siblings ...) 2005-05-19 6:24 ` Philip Pokorny @ 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Mark D. Studebaker 2005-05-19 6:24 ` Philip Pokorny 9 siblings, 0 replies; 35+ messages in thread From: Jean Delvare @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors > > Quick note: The refresh frequency limit was introduced because the > > hardware would give bad results if you polled it too fast. So, > > depending on what the datasheet for the chip says, we adjust the > > max polling frequency to match. So this was to ensure accurate > > results from the hardware, not as a performace optimization. > Take a look at the ADM1026 driver (by Phil P.) as a counter-example. > The update routine is apropos to this thread. He uses two different > refresh frequencies - the slower one for config data. I thought this > was a good idea when I first saw it. Exactly what I had in mind. I didn't know someone else already did that :) After some more thinking however, I'm not sure it's what we really want. We are talking (correct me if I'm wrong there) about data that should not change *at all*, not data that should not change *often*. So sampling their value every 5 minutes doesn't make much sense. If they change and we are not aware of the change, it's likely to cause trouble that will solve only up to 5 minutes later. The user will probably get mad about this (by the time he/she tries to figure out what's wrong, the problem will solve by itself). So I think we should opt for one of the following policy: 1* We refresh these values as often as the other ones (what most of our drivers do) and don't care about the performance cost. Play it safe, Phil E. said. 2* We say these values should *NOT* change, we don't sample them, and if something goes wrong, obviously it's not our fault (though the user will certainly complain to us). We can enable a check (that is, sampling the should-not-change data as in 1* and emit a warning if we see a change in the values) in debug mode (or additional parameter?) to catch the problem. One more thought though: remember that these things are very unlikely to happen. We have the adm1026 driver which as a different policy, now the adm1025 has its own one too. These are good tests. If we have no bad feedback about these, it must mean it's safe to change our global policy to either of those. > > Now, your point about reducing a significant amount of latency is a > > good one. That's a reasonable reason for such an optimization. I'm > > not sure if it is a strong enough reason, though. I think we'd need > > to do a little math to figure out exactly how much it 'costs' to > > read limits and what it would save us to poll them less frequently. > > It depends on the chip of course - but I think the ADM1026 driver > might be a good model to follow for others which are deemed worth > optimizing. What about trying some real-life tests (that's what it's all about after all)? I could try the different policies on the w83781d driver and measure how much time module loads and data refreshs take. -- Jean Delvare http://www.ensicaen.ismra.fr/~delvare/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (7 preceding siblings ...) 2005-05-19 6:24 ` Jean Delvare @ 2005-05-19 6:24 ` Mark D. Studebaker 2005-05-19 6:24 ` Philip Pokorny 9 siblings, 0 replies; 35+ messages in thread From: Mark D. Studebaker @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors Doing some catching up. Good discussion, I agree with most of it. Some thoughts: - ACPI can indeed go poking around behind our backs. - Experimenting using one (small) driver is indeed a good thing. Note that adm1021 has my enhancements for returning an alarm if a read fails. - Reading limits less often than other things is probably good. - If people have a problem with eeprom performance they can rmmod it. Philip Pokorny wrote: > Jean Delvare wrote: > >>>> Quick note: The refresh frequency limit was introduced because the >>>> hardware would give bad results if you polled it too fast. So, >>>> depending on what the datasheet for the chip says, we adjust the >>>> max polling frequency to match. So this was to ensure accurate >>>> results from the hardware, not as a performace optimization. >>>> >> >>> Take a look at the ADM1026 driver (by Phil P.) as a counter-example. >>> The update routine is apropos to this thread. He uses two different >>> refresh frequencies - the slower one for config data. I thought this >>> was a good idea when I first saw it. >>> >> >> Exactly what I had in mind. I didn't know someone else already did that >> :) >> >> After some more thinking however, I'm not sure it's what we really want. >> We are talking (correct me if I'm wrong there) about data that should >> not change *at all*, not data that should not change *often*. So >> sampling their value every 5 minutes doesn't make much sense. If they >> change and we are not aware of the change, it's likely to cause trouble >> that will solve only up to 5 minutes later. The user will probably get >> mad about this (by the time he/she tries to figure out what's wrong, the >> problem will solve by itself). So I think we should opt for one of the >> following policy: > > > > That's one of the reasons that I still poll the data that I don't expect > to be different. I made sure to update the cache immediately when I > change a configuration parameter so that future reads will return the > correct data. But then at some time later, The driver will read the > actual hardware so you can always be sure that after some period of time > what you see from the driver is what's actually on the chip. > > This turned out to be important because it helped me find a programming > bug in my driver. I wasn't bit shifting and masking one of the > parameters correctly in and out of the chip. The result was that some > settings didn't get changed correctly. The cache was what I "wanted" > but the chip was set to something different. If I hadn't included the > re-read of the chip, I don't think I would have found this bug. > > A cache is just a cache. It's not the real thing. You must include a > way to update the cache to make it consistent with the "real thing". > That might be some other parameter/attribute that you write to to > trigger a cache update on demand. Or it can be a timeout as implemented > in the adm1026 driver. > > With ACPI and IPMI poking at these monitoring chips, I think it's > dangerous for us to assume that we know everything that's going on with > the sensor. Witness Margit's experience where we programmed the limits > incorrectly and the BIOS shut down his machine. Or where we set off > alarm beepers because the limits are wrong. > >> 1* We refresh these values as often as the other ones (what most of our >> drivers do) and don't care about the performance cost. Play it safe, >> Phil E. said. > > > > That's reasonable for simple chips with few registers. But consider > that just about every chip has twice as many limit values (min and max) > as current readings. That means that only 35% of the data is changing > rapidly. That's a significant performance optimization to reduce 65% of > reads. > > >> 2* We say these values should *NOT* change, we don't sample them, and if >> something goes wrong, obviously it's not our fault (though the user will >> certainly complain to us). We can enable a check (that is, sampling the >> should-not-change data as in 1* and emit a warning if we see a change in >> the values) in debug mode (or additional parameter?) to catch the >> problem. > > > > As I said above, I think we should just read the bits once in a while. > Adding checks for unexpected changes could add significantly to the code. > >> One more thought though: remember that these things are very unlikely to >> happen. We have the adm1026 driver which as a different policy, now the >> adm1025 has its own one too. These are good tests. If we have no bad >> feedback about these, it must mean it's safe to change our global policy >> to either of those. >> >>>> Now, your point about reducing a significant amount of latency is a >>>> good one. That's a reasonable reason for such an optimization. I'm >>>> not sure if it is a strong enough reason, though. I think we'd need >>>> to do a little math to figure out exactly how much it 'costs' to >>>> read limits and what it would save us to poll them less frequently. >>>> >>> It depends on the chip of course - but I think the ADM1026 driver >>> might be a good model to follow for others which are deemed worth >>> optimizing. > > >> >> What about trying some real-life tests (that's what it's all about after >> all)? I could try the different policies on the w83781d driver and >> measure how much time module loads and data refreshs take. > > > > Real world test data is always best. > > :v) > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* driver design question 2005-05-19 6:24 driver " Jean Delvare ` (8 preceding siblings ...) 2005-05-19 6:24 ` Mark D. Studebaker @ 2005-05-19 6:24 ` Philip Pokorny 9 siblings, 0 replies; 35+ messages in thread From: Philip Pokorny @ 2005-05-19 6:24 UTC (permalink / raw) To: lm-sensors Jean Delvare wrote: >>>Quick note: The refresh frequency limit was introduced because the >>>hardware would give bad results if you polled it too fast. So, >>>depending on what the datasheet for the chip says, we adjust the >>>max polling frequency to match. So this was to ensure accurate >>>results from the hardware, not as a performace optimization. >>> > >>Take a look at the ADM1026 driver (by Phil P.) as a counter-example. >>The update routine is apropos to this thread. He uses two different >>refresh frequencies - the slower one for config data. I thought this >>was a good idea when I first saw it. >> > > Exactly what I had in mind. I didn't know someone else already did that > :) > > After some more thinking however, I'm not sure it's what we really want. > We are talking (correct me if I'm wrong there) about data that should > not change *at all*, not data that should not change *often*. So > sampling their value every 5 minutes doesn't make much sense. If they > change and we are not aware of the change, it's likely to cause trouble > that will solve only up to 5 minutes later. The user will probably get > mad about this (by the time he/she tries to figure out what's wrong, the > problem will solve by itself). So I think we should opt for one of the > following policy: That's one of the reasons that I still poll the data that I don't expect to be different. I made sure to update the cache immediately when I change a configuration parameter so that future reads will return the correct data. But then at some time later, The driver will read the actual hardware so you can always be sure that after some period of time what you see from the driver is what's actually on the chip. This turned out to be important because it helped me find a programming bug in my driver. I wasn't bit shifting and masking one of the parameters correctly in and out of the chip. The result was that some settings didn't get changed correctly. The cache was what I "wanted" but the chip was set to something different. If I hadn't included the re-read of the chip, I don't think I would have found this bug. A cache is just a cache. It's not the real thing. You must include a way to update the cache to make it consistent with the "real thing". That might be some other parameter/attribute that you write to to trigger a cache update on demand. Or it can be a timeout as implemented in the adm1026 driver. With ACPI and IPMI poking at these monitoring chips, I think it's dangerous for us to assume that we know everything that's going on with the sensor. Witness Margit's experience where we programmed the limits incorrectly and the BIOS shut down his machine. Or where we set off alarm beepers because the limits are wrong. > 1* We refresh these values as often as the other ones (what most of our > drivers do) and don't care about the performance cost. Play it safe, > Phil E. said. That's reasonable for simple chips with few registers. But consider that just about every chip has twice as many limit values (min and max) as current readings. That means that only 35% of the data is changing rapidly. That's a significant performance optimization to reduce 65% of reads. > 2* We say these values should *NOT* change, we don't sample them, and if > something goes wrong, obviously it's not our fault (though the user will > certainly complain to us). We can enable a check (that is, sampling the > should-not-change data as in 1* and emit a warning if we see a change in > the values) in debug mode (or additional parameter?) to catch the > problem. As I said above, I think we should just read the bits once in a while. Adding checks for unexpected changes could add significantly to the code. > One more thought though: remember that these things are very unlikely to > happen. We have the adm1026 driver which as a different policy, now the > adm1025 has its own one too. These are good tests. If we have no bad > feedback about these, it must mean it's safe to change our global policy > to either of those. > >>>Now, your point about reducing a significant amount of latency is a >>>good one. That's a reasonable reason for such an optimization. I'm >>>not sure if it is a strong enough reason, though. I think we'd need >>>to do a little math to figure out exactly how much it 'costs' to >>>read limits and what it would save us to poll them less frequently. >>> >>It depends on the chip of course - but I think the ADM1026 driver >>might be a good model to follow for others which are deemed worth >>optimizing. > > What about trying some real-life tests (that's what it's all about after > all)? I could try the different policies on the w83781d driver and > measure how much time module loads and data refreshs take. Real world test data is always best. :v) ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2013-10-23 7:29 UTC | newest] Thread overview: 35+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-09-14 11:29 Driver design question Takashi Iwai 2006-09-15 14:47 ` Lee Revell 2006-09-19 15:15 ` Takashi Iwai 2006-09-25 19:54 ` Lee Revell 2006-09-27 17:18 ` Takashi Iwai 2006-09-27 17:38 ` Lee Revell 2006-09-30 2:03 ` Lee Revell 2006-10-03 15:35 ` Lee Revell 2006-10-04 9:17 ` Takashi Iwai 2006-10-19 22:12 ` Lee Revell 2006-10-20 12:55 ` Takashi Iwai 2006-10-20 20:12 ` Lee Revell 2006-10-23 13:09 ` Takashi Iwai 2006-10-23 17:46 ` Lee Revell 2006-10-24 15:01 ` Takashi Iwai 2006-10-24 15:30 ` Lee Revell 2006-10-24 23:54 ` Lee Revell 2006-10-04 9:07 ` Takashi Iwai 2006-09-27 13:58 ` Lee Revell 2006-09-27 16:52 ` Takashi Iwai -- strict thread matches above, loose matches on Subject: below -- 2013-10-22 7:02 Driver Design Question Johannes Thumshirn 2013-10-23 3:10 ` Guenter Roeck 2013-10-23 7:29 ` Johannes Thumshirn 2006-09-12 20:27 Driver design question Lee Revell 2005-05-19 6:24 driver " Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Philip Edelbrock 2005-05-19 6:24 ` Mark M. Hoffman 2005-05-19 6:24 ` Philip Pokorny 2005-05-19 6:24 ` Jean Delvare 2005-05-19 6:24 ` Mark D. Studebaker 2005-05-19 6:24 ` Philip Pokorny
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.