* Race in gspca main or missing lock in stv06xx subdriver?
@ 2009-09-14 15:17 James Blanford
2009-09-15 10:41 ` Jean-Francois Moine
0 siblings, 1 reply; 7+ messages in thread
From: James Blanford @ 2009-09-14 15:17 UTC (permalink / raw)
To: linux-media, moinejf
Howdy folks,
I have my old quickcam express webcam working, with HDCS1000
sensor, 046d:840. It's clearly throwing away every other frame. What
seems to be happening is, while the last packet of the previous frame
is being analyzed by the subdriver, the first packet of the next frame
is assigned to the current frame buffer. By the time that packet is
analysed and sent back to the main driver, it's frame buffer has been
completely filled and marked as "DONE." The entire frame is then
marked for "DISCARD." This does _not_ happen with all cams using this
subdriver.
Here's a little patch, supplied only to help illustrate the problem,
that allows for the full, expected frame rate of the webcam. What it
does is wait until the very last moment to assign a frame buffer to any
packet, but the last. I also threw in a few printks so I can see where
failure takes place without wading through a swamp of debug output.
--- gspca/gspca.c.orig 2009-09-11 22:43:54.000000000 -0400
+++ gspca/gspca.c 2009-09-11 23:04:34.000000000 -0400
@@ -114,8 +114,10 @@ struct gspca_frame *gspca_get_i_frame(st
i = gspca_dev->fr_queue[i];
frame = &gspca_dev->frame[i];
if ((frame->v4l2_buf.flags & BUF_ALL_FLAGS)
- != V4L2_BUF_FLAG_QUEUED)
- return NULL;
+ != V4L2_BUF_FLAG_QUEUED){
+ printk(KERN_DEBUG "Shout NULL NULL NULL\n");
+ return frame;
+ }
return frame;
}
EXPORT_SYMBOL(gspca_get_i_frame);
@@ -146,6 +148,7 @@ static void fill_frame(struct gspca_dev
/* check the availability of the frame buffer */
frame = gspca_get_i_frame(gspca_dev);
if (!frame) {
+ printk(KERN_DEBUG "get_i_frame fails\n");
gspca_dev->last_packet_type = DISCARD_PACKET;
break;
}
@@ -268,9 +271,12 @@ struct gspca_frame *gspca_frame_add(stru
/* when start of a new frame, if the current frame buffer
* is not queued, discard the whole frame */
if (packet_type == FIRST_PACKET) {
+ i = gspca_dev->fr_i;
+ frame = &gspca_dev->frame[i];
if ((frame->v4l2_buf.flags & BUF_ALL_FLAGS)
!= V4L2_BUF_FLAG_QUEUED) {
gspca_dev->last_packet_type = DISCARD_PACKET;
+ printk(KERN_DEBUG "Frame marked for discard\n");
return frame;
}
frame->data_end = frame->data;
@@ -285,8 +291,11 @@ struct gspca_frame *gspca_frame_add(stru
/* append the packet to the frame buffer */
if (len > 0) {
+ i = gspca_dev->fr_i;
+ frame = &gspca_dev->frame[i];
if (frame->data_end - frame->data + len
> frame->v4l2_buf.length) {
+ printk(KERN_DEBUG "Frame overflow\n");
PDEBUG(D_ERR|D_PACK, "frame overflow %zd > %d",
frame->data_end - frame->data + len,
frame->v4l2_buf.length);
It works, at least until there is any disruption to the stream, such as
an exposure change, and then something gets out of sync and it starts
throwing out every other frame again. It shows that the driver
framework and USB bus is capable of handling the full frame rate.
I'll keep looking for an actual solution, but there is a lot I don't
understand. Any suggestions or ideas would be appreciated. Several
questions come to mind. Why bother assigning a frame buffer with
get_i_frame, before it's needed? What purpose has frame_wait, when
it's not called until the frame is completed and the buffer is marked
as "DONE." Why are there five, fr_i fr_q fr_o fr_queue index , buffer
indexing counters? I'm sure I don't understand all the different tasks
this driver has to handle and all the different hardware it has to deal
with. But I would be surprised if my cam is the only one this is
happening with.
Thanks,
- Jim
--
There are two kinds of people. The innocent and the living.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-09-14 15:17 Race in gspca main or missing lock in stv06xx subdriver? James Blanford
@ 2009-09-15 10:41 ` Jean-Francois Moine
2009-10-01 13:23 ` Erik Andrén
2009-10-04 15:36 ` Hans de Goede
0 siblings, 2 replies; 7+ messages in thread
From: Jean-Francois Moine @ 2009-09-15 10:41 UTC (permalink / raw)
To: James Blanford; +Cc: linux-media
On Mon, 14 Sep 2009 11:17:57 -0400
James Blanford <jhblanford@gmail.com> wrote:
> Howdy folks,
>
> I have my old quickcam express webcam working, with HDCS1000
> sensor, 046d:840. It's clearly throwing away every other frame. What
> seems to be happening is, while the last packet of the previous frame
> is being analyzed by the subdriver, the first packet of the next frame
> is assigned to the current frame buffer. By the time that packet is
> analysed and sent back to the main driver, it's frame buffer has been
> completely filled and marked as "DONE." The entire frame is then
> marked for "DISCARD." This does _not_ happen with all cams using this
> subdriver.
>
> Here's a little patch, supplied only to help illustrate the problem,
> that allows for the full, expected frame rate of the webcam. What it
> does is wait until the very last moment to assign a frame buffer to
> any packet, but the last. I also threw in a few printks so I can see
> where failure takes place without wading through a swamp of debug
> output.
[snip]
> It works, at least until there is any disruption to the stream, such
> as an exposure change, and then something gets out of sync and it
> starts throwing out every other frame again. It shows that the driver
> framework and USB bus is capable of handling the full frame rate.
>
> I'll keep looking for an actual solution, but there is a lot I don't
> understand. Any suggestions or ideas would be appreciated. Several
> questions come to mind. Why bother assigning a frame buffer with
> get_i_frame, before it's needed? What purpose has frame_wait, when
> it's not called until the frame is completed and the buffer is marked
> as "DONE." Why are there five, fr_i fr_q fr_o fr_queue index , buffer
> indexing counters? I'm sure I don't understand all the different
> tasks this driver has to handle and all the different hardware it has
> to deal with. But I would be surprised if my cam is the only one
> this is happening with.
Hi James,
I think you may have found a big problem, and this one should exist in
all drivers!
As I understood, you say that the URB completion routine (isoc_irq) may
be run many times at the same time.
With gspca, the problem is critical: the frame queue is managed without
any lock thanks to a circular buffer list (the pointers fr_i, fr_q and
fr_o). This mechanism works well when there are only one producer
(interrupt) and one consumer (application) (and to answer the question,
get_i_frame can be called anywere in the interrupt function because the
application is not running). Then, if there may be many producers...
For other drivers, the problem remains: the image fragments could be
received out of order.
How is this possible? Looking at some kernel documents, I found that
the URB completion routine is called from the bottom-half entity (thus,
not under hardware interrupt). A bottom-half may be a tasklet or a soft
irq. There may be only one active tasklet at a time, while there may be
many softirqs running (on the interrupt CPU). It seems that we are in
the last case, and I could not find any mean to change that.
Then, to fix this problem, I see only one solution: have a private
tasklet to do the video streaming, as this is done for some bulk
transfer...
Cheers.
--
Ken ar c'hentañ | ** Breizh ha Linux atav! **
Jef | http://moinejf.free.fr/
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-09-15 10:41 ` Jean-Francois Moine
@ 2009-10-01 13:23 ` Erik Andrén
2009-10-03 7:28 ` Jean-Francois Moine
2009-10-04 15:36 ` Hans de Goede
1 sibling, 1 reply; 7+ messages in thread
From: Erik Andrén @ 2009-10-01 13:23 UTC (permalink / raw)
To: Jean-Francois Moine; +Cc: James Blanford, linux-media
2009/9/15 Jean-Francois Moine <moinejf@free.fr>:
> On Mon, 14 Sep 2009 11:17:57 -0400
> James Blanford <jhblanford@gmail.com> wrote:
>
>> Howdy folks,
>>
>> I have my old quickcam express webcam working, with HDCS1000
>> sensor, 046d:840. It's clearly throwing away every other frame. What
>> seems to be happening is, while the last packet of the previous frame
>> is being analyzed by the subdriver, the first packet of the next frame
>> is assigned to the current frame buffer. By the time that packet is
>> analysed and sent back to the main driver, it's frame buffer has been
>> completely filled and marked as "DONE." The entire frame is then
>> marked for "DISCARD." This does _not_ happen with all cams using this
>> subdriver.
>>
>> Here's a little patch, supplied only to help illustrate the problem,
>> that allows for the full, expected frame rate of the webcam. What it
>> does is wait until the very last moment to assign a frame buffer to
>> any packet, but the last. I also threw in a few printks so I can see
>> where failure takes place without wading through a swamp of debug
>> output.
> [snip]
>> It works, at least until there is any disruption to the stream, such
>> as an exposure change, and then something gets out of sync and it
>> starts throwing out every other frame again. It shows that the driver
>> framework and USB bus is capable of handling the full frame rate.
>>
>> I'll keep looking for an actual solution, but there is a lot I don't
>> understand. Any suggestions or ideas would be appreciated. Several
>> questions come to mind. Why bother assigning a frame buffer with
>> get_i_frame, before it's needed? What purpose has frame_wait, when
>> it's not called until the frame is completed and the buffer is marked
>> as "DONE." Why are there five, fr_i fr_q fr_o fr_queue index , buffer
>> indexing counters? I'm sure I don't understand all the different
>> tasks this driver has to handle and all the different hardware it has
>> to deal with. But I would be surprised if my cam is the only one
>> this is happening with.
>
> Hi James,
>
> I think you may have found a big problem, and this one should exist in
> all drivers!
>
> As I understood, you say that the URB completion routine (isoc_irq) may
> be run many times at the same time.
>
> With gspca, the problem is critical: the frame queue is managed without
> any lock thanks to a circular buffer list (the pointers fr_i, fr_q and
> fr_o). This mechanism works well when there are only one producer
> (interrupt) and one consumer (application) (and to answer the question,
> get_i_frame can be called anywere in the interrupt function because the
> application is not running). Then, if there may be many producers...
>
> For other drivers, the problem remains: the image fragments could be
> received out of order.
>
> How is this possible? Looking at some kernel documents, I found that
> the URB completion routine is called from the bottom-half entity (thus,
> not under hardware interrupt). A bottom-half may be a tasklet or a soft
> irq. There may be only one active tasklet at a time, while there may be
> many softirqs running (on the interrupt CPU). It seems that we are in
> the last case, and I could not find any mean to change that.
>
> Then, to fix this problem, I see only one solution: have a private
> tasklet to do the video streaming, as this is done for some bulk
> transfer...
>
> Cheers.
Hi Jean-Francois,
Are you currently working on anything addressing this issue or do we
need some further discussion?
Best regards,
Erik
>
> --
> Ken ar c'hentań | ** Breizh ha Linux atav! **
> Jef | http://moinejf.free.fr/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-10-01 13:23 ` Erik Andrén
@ 2009-10-03 7:28 ` Jean-Francois Moine
2009-10-04 10:45 ` Erik Andrén
0 siblings, 1 reply; 7+ messages in thread
From: Jean-Francois Moine @ 2009-10-03 7:28 UTC (permalink / raw)
To: Erik Andrén; +Cc: James Blanford, linux-media
On Thu, 1 Oct 2009 15:23:29 +0200
Erik Andrén <erik.andren@gmail.com> wrote:
> 2009/9/15 Jean-Francois Moine <moinejf@free.fr>:
[snip]
> > I think you may have found a big problem, and this one should exist
> > in all drivers!
> >
> > As I understood, you say that the URB completion routine (isoc_irq)
> > may be run many times at the same time.
[snip]
> > Then, to fix this problem, I see only one solution: have a private
> > tasklet to do the video streaming, as this is done for some bulk
> > transfer...
[snip]
> Are you currently working on anything addressing this issue or do we
> need some further discussion?
Hi Erik,
No, I am not working on this problem: I cannot reproduce it (easy test:
have a static variable which is incremented in the irq function -
isoc_irq() in gspca.c - and warn when it is non null at entry).
May you (or anyone) check it?
Then, the simplest solution is not a tasklet, but to use only one URB
(change the '#define DEF_NURBS' to 1 instead of 3 in gspca.c).
Best regards.
--
Ken ar c'hentañ | ** Breizh ha Linux atav! **
Jef | http://moinejf.free.fr/
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-10-03 7:28 ` Jean-Francois Moine
@ 2009-10-04 10:45 ` Erik Andrén
0 siblings, 0 replies; 7+ messages in thread
From: Erik Andrén @ 2009-10-04 10:45 UTC (permalink / raw)
To: Jean-Francois Moine; +Cc: James Blanford, linux-media
2009/10/3 Jean-Francois Moine <moinejf@free.fr>:
> On Thu, 1 Oct 2009 15:23:29 +0200
> Erik Andrén <erik.andren@gmail.com> wrote:
>
>> 2009/9/15 Jean-Francois Moine <moinejf@free.fr>:
> [snip]
>> > I think you may have found a big problem, and this one should exist
>> > in all drivers!
>> >
>> > As I understood, you say that the URB completion routine (isoc_irq)
>> > may be run many times at the same time.
> [snip]
>> > Then, to fix this problem, I see only one solution: have a private
>> > tasklet to do the video streaming, as this is done for some bulk
>> > transfer...
> [snip]
>> Are you currently working on anything addressing this issue or do we
>> need some further discussion?
>
> Hi Erik,
>
> No, I am not working on this problem: I cannot reproduce it (easy test:
> have a static variable which is incremented in the irq function -
> isoc_irq() in gspca.c - and warn when it is non null at entry).
>
> May you (or anyone) check it?
>
> Then, the simplest solution is not a tasklet, but to use only one URB
> (change the '#define DEF_NURBS' to 1 instead of 3 in gspca.c).
>
But this would result in reduced performance for all gspca subdrivers, no?
Best regards,
Erik
> Best regards.
>
> --
> Ken ar c'hentañ | ** Breizh ha Linux atav! **
> Jef | http://moinejf.free.fr/
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-09-15 10:41 ` Jean-Francois Moine
2009-10-01 13:23 ` Erik Andrén
@ 2009-10-04 15:36 ` Hans de Goede
2009-10-04 16:41 ` Jean-Francois Moine
1 sibling, 1 reply; 7+ messages in thread
From: Hans de Goede @ 2009-10-04 15:36 UTC (permalink / raw)
To: Jean-Francois Moine; +Cc: James Blanford, linux-media
Hi all,
Sorry for jumping into this discussion a bit late, I was swamped
with other stuff and did not have time to look into this before.
On 09/15/2009 12:41 PM, Jean-Francois Moine wrote:
> On Mon, 14 Sep 2009 11:17:57 -0400
> James Blanford<jhblanford@gmail.com> wrote:
>
>> Howdy folks,
>>
>> I have my old quickcam express webcam working, with HDCS1000
>> sensor, 046d:840. It's clearly throwing away every other frame. What
>> seems to be happening is, while the last packet of the previous frame
>> is being analyzed by the subdriver, the first packet of the next frame
>> is assigned to the current frame buffer. By the time that packet is
>> analysed and sent back to the main driver, it's frame buffer has been
>> completely filled and marked as "DONE." The entire frame is then
>> marked for "DISCARD." This does _not_ happen with all cams using this
>> subdriver.
>>
<snip>
>
> Hi James,
>
> I think you may have found a big problem, and this one should exist in
> all drivers!
>
> As I understood, you say that the URB completion routine (isoc_irq) may
> be run many times at the same time.
>
That was what I understood too, and I could not believe it, so I thought
James had made some sort of mistake with his analyses, because it
we can have multiple URB completion routines run at the same time, no
matter how good out locking it, we've already lost, as then:
> For other drivers, the problem remains: the image fragments could be
> received out of order.
>
As Jean-Francois very correctly pointed out, but no worries. The handling of
isoc irq's is serialised elsewhere in the kernel, the issue of what James is
seeing is much simpler.
When we call gspca_frame_add, it returns a pointer to the frame passed in,
unless we call it with LAST_PACKET, when it will return a pointer to a
new frame in to whoch store the frame data for the next frame. So whenever
calling:
gspca_frame_add(gspca_dev, LAST_PACKET, frame, data, len);
we should do this as:
frame = gspca_frame_add(gspca_dev, LAST_PACKET, frame, data, len);
So that any further data got from of the pkt we are handling in pkt_scan, goes
to the next frame.
We are not doing this in stv06xx.c pkt_scan method, which the cause of what
James is seeing. So I started checking all drivers, and we are not doing this
either in ov519.c when handling an ov518 bridge. So now the framerate of my
3 ov518 test cams has just doubled. Thanks James!
I'll send a patch with the fix in a separate mail.
Regards,
Hans
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Race in gspca main or missing lock in stv06xx subdriver?
2009-10-04 15:36 ` Hans de Goede
@ 2009-10-04 16:41 ` Jean-Francois Moine
0 siblings, 0 replies; 7+ messages in thread
From: Jean-Francois Moine @ 2009-10-04 16:41 UTC (permalink / raw)
To: Hans de Goede; +Cc: James Blanford, linux-media, Erik Andrén
On Sun, 04 Oct 2009 17:36:55 +0200
Hans de Goede <hdegoede@redhat.com> wrote:
> As Jean-Francois very correctly pointed out, but no worries. The
> handling of isoc irq's is serialised elsewhere in the kernel, the
> issue of what James is seeing is much simpler.
>
> When we call gspca_frame_add, it returns a pointer to the frame
> passed in, unless we call it with LAST_PACKET, when it will return a
> pointer to a new frame in to whoch store the frame data for the next
> frame. So whenever calling:
> gspca_frame_add(gspca_dev, LAST_PACKET, frame, data, len);
> we should do this as:
> frame = gspca_frame_add(gspca_dev, LAST_PACKET, frame, data, len);
>
> So that any further data got from of the pkt we are handling in
> pkt_scan, goes to the next frame.
>
> We are not doing this in stv06xx.c pkt_scan method, which the cause
> of what James is seeing. So I started checking all drivers, and we
> are not doing this either in ov519.c when handling an ov518 bridge.
> So now the framerate of my 3 ov518 test cams has just doubled. Thanks
> James!
>
> I'll send a patch with the fix in a separate mail.
Hello Hans,
Thank you for the patch. Yes, it was simple and I should have seen it
before! I think this problem could be avoided if the frame pointer is
not given to the pkt_scan function (if the subdriver needs it, it is
returned by gspca_get_i_frame).
Best regards.
--
Ken ar c'hentañ | ** Breizh ha Linux atav! **
Jef | http://moinejf.free.fr/
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-10-04 16:41 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-14 15:17 Race in gspca main or missing lock in stv06xx subdriver? James Blanford
2009-09-15 10:41 ` Jean-Francois Moine
2009-10-01 13:23 ` Erik Andrén
2009-10-03 7:28 ` Jean-Francois Moine
2009-10-04 10:45 ` Erik Andrén
2009-10-04 15:36 ` Hans de Goede
2009-10-04 16:41 ` Jean-Francois Moine
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).