* Need VIDIOC_CROPCAP clarification @ 2008-05-26 21:26 Hans Verkuil 2008-05-27 1:16 ` Andy Walls 0 siblings, 1 reply; 15+ messages in thread From: Hans Verkuil @ 2008-05-26 21:26 UTC (permalink / raw) To: v4l; +Cc: Michael Schimek Hi all, How should the pixelaspect field of the v4l2_cropcap struct be filled? Looking at existing drivers it can be anything from 0/0, 1/1, 54/59 for PAL/SECAM and 11/10 for NTSC or the horizontal number of samples/the horizontal number of pixels. However, it is my understanding that the last one as used in bttv is the correct interpretation. Meaning that if the horizontal unit used for cropping is equal to a pixel (this is the case for most drivers), then pixelaspect should be 1/1. If the horizontal unit is different from a pixel, then it should be: (total number of horizontal units) / (horizontal pixels) So given a crop coordinate X, the corresponding coordinate in pixels would be: X * pixelaspect.denominator / pixelaspect.numerator This is what bttv does and I'm pretty sure that's when this ioctl was introduced. Assuming this is correct, then the Spec needs to be fixed in several places (and drivers too, for that matter): - all references to the term 'pixel aspect' are incorrect: it has nothing to do with the pixel aspect, it is about the ratio between the horizontal sampling frequency and the 'pixel frequency'. - the description of 'bounds' is wrong: "Width and height are defined in pixels, the driver writer is free to choose origin and units of the coordinate system in the analog domain." This is contradictory: the width units are up to the driver so the unit for the width is not necessarily a pixel. The way the cropping is setup implies that the height and Y coordinates are ALWAYS in line (aka pixel) units. It cannot be anything else since that's the way analog video works. You can't sample the height of half a line. - pixelaspect: has nothing to do with the pixel aspect. So the references to PAL/SECAM and NTSC are irrelevant. 
Comments? Hans -- video4linux-list mailing list Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe https://www.redhat.com/mailman/listinfo/video4linux-list ^ permalink raw reply [flat|nested] 15+ messages in thread
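The conversion proposed in the message above can be sketched in C. This only illustrates that one interpretation (pixelaspect as horizontal crop units per pixel), which later replies in this thread dispute. `struct fract` is a local stand-in for `struct v4l2_fract`, and `crop_x_to_pixels` is a hypothetical helper, not part of the V4L2 API.

```c
#include <assert.h>

/* Local stand-in for struct v4l2_fract (assumption: same two fields). */
struct fract {
    unsigned int numerator;
    unsigned int denominator;
};

/*
 * Convert a horizontal crop coordinate to pixels under the interpretation
 * proposed above: pixelaspect = horizontal crop units / horizontal pixels.
 * Hypothetical helper; the result is truncated toward zero.
 */
static long crop_x_to_pixels(long x, struct fract pixelaspect)
{
    /* A driver reporting 0/0 gives nothing to convert with. */
    assert(pixelaspect.numerator != 0);
    return x * (long)pixelaspect.denominator / (long)pixelaspect.numerator;
}
```

With pixelaspect 1/1 the coordinate is returned unchanged, which matches the case where the crop unit already equals a pixel.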
* Re: Need VIDIOC_CROPCAP clarification 2008-05-26 21:26 Need VIDIOC_CROPCAP clarification Hans Verkuil @ 2008-05-27 1:16 ` Andy Walls 2008-05-27 6:53 ` Hans Verkuil 0 siblings, 1 reply; 15+ messages in thread From: Andy Walls @ 2008-05-27 1:16 UTC (permalink / raw) To: Hans Verkuil; +Cc: v4l, Michael Schimek On Mon, 2008-05-26 at 23:26 +0200, Hans Verkuil wrote: > Hi all, > > How should the pixelaspect field of the v4l2_cropcap struct be filled? > Looking at existing drivers it can be anything from 0/0, 1/1, 54/59 for > PAL/SECAM and 11/10 for NTSC or the horizontal number of samples/the > horizontal number of pixels. > > However, it is my understanding that the last one as used in bttv is the > correct interpretation. Meaning that if the horizontal unit used for > cropping is equal to a pixel (this is the case for most drivers), then > pixelaspect should be 1/1. If the horizontal unit is different from a > pixel, then it should be: > > (total number of horizontal units) / (horizontal pixels) > > So given a crop coordinate X, the corresponding coordinate in pixels > would be: > > X * pixelaspect.denominator / pixelaspect.numerator > > This is what bttv does and I'm pretty sure that's when this ioctl was > introduced. The definition of "defrect" for VIDIOC_CROPCAP in the spec seems to support this interpretation of "pixelaspect". > Assuming this is correct, then the Spec needs to be fixed in several > places (and drivers too, for that matter): > - all references to the term 'pixel aspect' are incorrect: it has > nothing to do with the pixel aspect, it is about the ratio between the > horizontal sampling frequency and the 'pixel frequency'. Well, wouldn't changing the luma signal sampling rate (in time), but not the number of luma samples per pixel, effectively stretch or shrink the "real world" as it is displayed on a horizontal line, thus affecting the apparent aspect of a pixel when compared to the vertical dimension? 
Thus, when representing real world features, the pixels can have an apparent aspect? From the ITU-R BT.601-4 (which has been superseded by BT.601-6) that I found on the 'net: http://inst.eecs.berkeley.edu/~cs150/Documents/ITU601.PDF BT.601 defines a luma sampling freq of 13.5 MHz, yielding 858 samples per NTSC line and 720 samples per active region of a line. The 11/10 ratio mentioned for NTSC maps to 704/640, which is clearly a ratio of active digital pixels to digital pixels scaled for a display. So I'm not sure of the relationship to BT.601-4's 720 pixels. That's where this informal document may help: http://www.arachnotron.nl/videocap/doc/Karl_cap_v1_en.pdf He somewhat explains Display Aspect Ratio (DAR) and Pixel Aspect Ratio (PAR), where 704 comes from instead of 720, and works some examples with numbers on pages 7-9. He even talks about the BT8xx chips on page 10. Here's some more clarifying/confusing information on pixel aspect: http://lurkertech.com/lg/pixelaspect/ BTW, 'pixel frequency', as you are calling it, is twice the maximum "spatial frequency" that is displayable on a line. The 'pixel frequency', or 'pixels/line' is the Nyquist rate for the highest displayable spatial frequency on the line, the highest supported spatial cycles/line before aliasing makes features unresolvable. You need to see a light pixel and a dark pixel to have one spatial cycle. (IIRC, if you know the focal length and field of view, the highest spatial frequency tells you what is the smallest object length you can hope to resolve.) > - the description of 'bounds' is wrong: "Width and height are defined in > pixels, the driver writer is free to choose origin and units of the > coordinate system in the analog domain." This is contradictory: the > width units are up to the driver so the unit for the width is not > necessarily a pixel. The way the cropping is setup implies that the > height and Y coordinates are ALWAYS in line (aka pixel) units. 
It > cannot be anything else since that's the way analog video works. You > can't sample the height of half a line. > - pixelaspect: has nothing to do with the pixel aspect. So the > references to PAL/SECAM and NTSC are irrelevant. As the "Karl_cap_v1_en.pdf" points out on page 7, you need to know the pixel aspect assumed by the digitization to do a proper conversion from a source digital format to a target digital format and a crop is part of that conversion. I think PAR has only an indirect relationship with analog video standards. PAR has more to do with display devices, encoding and recording standards, and digitization standards. All of these have been influenced by the analog standards, so certain PARs can get tied to certain analog standards. I think I've added more confusion than clarification. Oh well... -Andy > Comments? > > Hans > -- video4linux-list mailing list Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe https://www.redhat.com/mailman/listinfo/video4linux-list ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Need VIDIOC_CROPCAP clarification 2008-05-27 1:16 ` Andy Walls @ 2008-05-27 6:53 ` Hans Verkuil 2008-05-27 7:00 ` Hans Verkuil 2008-05-27 23:24 ` Andy Walls 0 siblings, 2 replies; 15+ messages in thread From: Hans Verkuil @ 2008-05-27 6:53 UTC (permalink / raw) To: Andy Walls; +Cc: v4l, Michael Schimek On Tuesday 27 May 2008 03:16:16 Andy Walls wrote: > On Mon, 2008-05-26 at 23:26 +0200, Hans Verkuil wrote: > > Hi all, > > > > How should the pixelaspect field of the v4l2_cropcap struct be > > filled? Looking at existing drivers it can be anything from 0/0, > > 1/1, 54/59 for PAL/SECAM and 11/10 for NTSC or the horizontal > > number of samples/the horizontal number of pixels. > > > > However, it is my understanding that the last one as used in bttv > > is the correct interpretation. Meaning that if the horizontal unit > > used for cropping is equal to a pixel (this is the case for most > > drivers), then pixelaspect should be 1/1. If the horizontal unit is > > different from a pixel, then it should be: > > > > (total number of horizontal units) / (horizontal pixels) > > > > So given a crop coordinate X, the corresponding coordinate in > > pixels would be: > > > > X * pixelaspect.denominator / pixelaspect.numerator > > > > This is what bttv does and I'm pretty sure that's when this ioctl > > was introduced. > > The definition of "defrect" for VIDIOC_CROPCAP in the spec seems to > support this interpretation of "pixelaspect". > > > Assuming this is correct, then the Spec needs to be fixed in > > several places (and drivers too, for that matter): > > > > - all references to the term 'pixel aspect' are incorrect: it has > > nothing to do with the pixel aspect, it is about the ratio between > > the horizontal sampling frequency and the 'pixel frequency'. 
> > Well, wouldn't changing the luma signal sampling rate (in time), but > not the number of luma samples per pixel, effectively stretch or > shrink the "real world" as it is displayed on a horizontal line, thus > affecting the apparent aspect of a pixel when compared to the > vertical dimension? Thus, when representing real world features, the > pixels can have an apparent aspect? Yes, but then you would no longer be compliant to BT.601. And in the case of CROPCAP this field still has nothing to do with the pixel aspect. > >From the ITU-R BT.601-4 (which has been superseded by BT.601-6) that > > I > > found on the 'net: > > http://inst.eecs.berkeley.edu/~cs150/Documents/ITU601.PDF > > > BT.601 defines a luma sampling freq of 13.5 MHz, yielding 858 samples > per NTSC line and 720 samples per active regions of a line. The > 11/10 ratio mentioned for NTSC maps to 704/640, which is clearly a > ratio of active digital pixels to digital pixels scaled for a > display. So I'm not sure of the relationship to BT.601-4's 720 > pixels. > > That's where this informal document may help: > > http://www.arachnotron.nl/videocap/doc/Karl_cap_v1_en.pdf Nice document. Explains it well. It also clearly shows that the 720 width should include overscan (which it does for ivtv/cx18), so that the actual video part (704x576 for PAL/SECAM) does have a pixel aspect of 54/59 (y/x). I.e. the width of a PAL/SECAM pixel is slighly larger than its height. In other words, if the capture device follows BT.601, then the pixel aspect is fixed depending on the TV standard. And it has nothing to do with what CROPCAP calls 'pixelaspect'. > He somewhat explains Display Aspect Ratio (DAR) and Pixel Aspect > Ratio (PAR), where 704 comes from instead of 720, and works some > examples with numbers on pages 7-9. He even talks about the BT8xx > chips on page 10. 
> > Here's some more clarifying/confusing information on pixel aspect: > > http://lurkertech.com/lg/pixelaspect/ > > > > > BTW, 'pixel frequency', as you are calling it, is twice the maximum > "spatial frequency" that is displayable on a line. The 'pixel > frequency', or 'pixels/line' is the Nyquist rate for the highest > displayable spatial frequency on the line, the highest supported > spatial cycles/line before aliasing makes features unresolvable. You > need to see a light pixel and a dark pixel to have one spatial cycle. Unfortunately term from my side. > (IIRC, If you know the focal length and field of view, the highest > spatial frequency tells you what is the smallest object length you > can hope to resolve.) > > > - the description of 'bounds' is wrong: "Width and height are > > defined in pixels, the driver writer is free to choose origin and > > units of the coordinate system in the analog domain." This is > > contradictory: the width units are up to the driver so the unit for > > the width is not necessarily a pixel. The way the cropping is setup > > implies that the height and Y coordinates are ALWAYS in line (aka > > pixel) units. It cannot be anything else since that's the way > > analog video works. You can't sample the height of half a line. > > > > - pixelaspect: has nothing to do with the pixel aspect. So the > > references to PAL/SECAM and NTSC are irrelevant. > > As the "Karl_cap_v1_en.pdf" points out on page 7, you need to know > the pixel aspect assumed by the digitization to do a proper > conversion from a source digital format to a target digital format > and a crop is part of that conversion. > > I think PAR has only an indirect relationship with analog video > standards. PAR has more to do with display devices, encoding and > recording standards, and digitization standards. All of these have > been influenced by the analog standards, so certain PARs can get tied > to certain analog standards. 
> > > > I think I've added more confusion than clarification. Oh well... The problem as I see it is that there doesn't seem to be a V4L API at the moment that returns the true pixel aspect. So any application basically has to hope the device follows BT.601 and assume the corresponding pixel aspect ratios. But this can become problematic with webcams etc. that do not follow BT.601. Regards, Hans -- video4linux-list mailing list Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe https://www.redhat.com/mailman/listinfo/video4linux-list ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Need VIDIOC_CROPCAP clarification 2008-05-27 6:53 ` Hans Verkuil @ 2008-05-27 7:00 ` Hans Verkuil 2008-05-27 23:14 ` Andy Walls 2008-06-06 22:29 ` Michael Schimek 2008-05-27 23:24 ` Andy Walls 1 sibling, 2 replies; 15+ messages in thread From: Hans Verkuil @ 2008-05-27 7:00 UTC (permalink / raw) To: video4linux-list; +Cc: Michael Schimek Here's an old article I found detailing the design of pixelaspect. It makes me wonder whether what bttv does is wrong after all and pixelaspect really is a pixel aspect. http://www.spinics.net/lists/vfl/msg02653.html Regards, Hans
* Re: Need VIDIOC_CROPCAP clarification 2008-05-27 7:00 ` Hans Verkuil @ 2008-05-27 23:14 ` Andy Walls 2008-06-06 22:29 ` Michael Schimek 1 sibling, 0 replies; 15+ messages in thread From: Andy Walls @ 2008-05-27 23:14 UTC (permalink / raw) To: Hans Verkuil; +Cc: video4linux-list, Michael Schimek On Tue, 2008-05-27 at 09:00 +0200, Hans Verkuil wrote: > Here's an old article I found detailing the design of pixelaspect, it > makes me wonder if what bttv does isn't wrong and pixelaspect is really > a pixel aspect. > > http://www.spinics.net/lists/vfl/msg02653.html > > Regards, > > Hans > Yet another good reference on pixel aspect conversions from one digital video scheme to another: http://lipas.uwasa.fi/~f76998/video/conversion/ Also, I think I've determined the rationale for defining 12 3/11 MHz as the sampling rate of NTSC for NTSC square pixel displays, since this page challenges one to mail in a guess at the rationale: http://lurkertech.com/lg/video-systems/#sqnonsq I'll email the author, but once one realizes that 12 3/11 MHz is 1080/88 MHz and that it's related to the fc = 63/88 * 5 MHz FCC rule for the NTSC chroma subcarrier, things begin to unwind from there. -Andy
* Re: Need VIDIOC_CROPCAP clarification 2008-05-27 7:00 ` Hans Verkuil 2008-05-27 23:14 ` Andy Walls @ 2008-06-06 22:29 ` Michael Schimek 2008-06-07 1:33 ` Daniel Glöckner ` (2 more replies) 1 sibling, 3 replies; 15+ messages in thread From: Michael Schimek @ 2008-06-06 22:29 UTC (permalink / raw) To: Hans Verkuil; +Cc: video4linux-list On Tue, 2008-05-27 at 09:00 +0200, Hans Verkuil wrote: > Here's an old article I found detailing the design of pixelaspect, it > makes me wonder if what bttv does isn't wrong and pixelaspect is really > a pixel aspect. After all the time I spent on the stuff it better be right. ;) Well as the spec says, pixelaspect is the aspect ratio (y/x) of pixels sampled by the driver. For example in PAL mode bttv samples luminance pixels at four times the PAL-BG color carrier frequency: 35468950 / 2 Hz. The line frequency is 15625 Hz so one gets 1135.0064 pixels/line. That's what v4l2_crop counts, pixels as sampled by the hardware. The industry standard sampling rate to get PAL square pixels is 14.75 MHz, giving 944 samples/line. Therefore cropcap.pixelaspect is 1135 / 944. In human terms the pixels are too narrow. On a square pixel display like a computer monitor the unscaled image looks stretched. Square pixel PAL images have 576 * 4 / 3 = 768 pixels per line. To produce 768 square pixels the bttv driver must sample 768 * 1135 / 944 = 923.4 pixels and scale down to 768. (The driver actually captures 924x576 pixels since times immemorial, so cropcap.defrect is 924x576.) As Daniel wrote, another way to view this is that the active portion of the video is about 52 µs wide. At 12.27 MHz (for NTSC square pixels) one would sample or "crop" 638 pixels over this period. At 13.5 MHz (BT.601) 702 pixels, at 14.75 MHz (for PAL square pixels) 767 pixels, and at 17.73 MHz (bttv) 922.2 pixels. The pixel aspect is (actual sampling rate) / (square pixel sampling rate). 
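The sampling-rate arithmetic in the preceding paragraphs can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and the 52 µs active line duration is the approximate figure quoted above.

```c
#include <assert.h>

/* Samples per scan line for a sampling rate and a line frequency, both in Hz. */
static double samples_per_line(double sample_rate_hz, double line_freq_hz)
{
    assert(line_freq_hz > 0.0);
    return sample_rate_hz / line_freq_hz;
}

/* Samples that fall within an active line period of the given length (seconds). */
static double active_samples(double sample_rate_hz, double active_s)
{
    return sample_rate_hz * active_s;
}

/* Pixel aspect (y/x) as defined above: actual rate / square-pixel rate. */
static double pixel_aspect(double sample_rate_hz, double square_rate_hz)
{
    assert(square_rate_hz > 0.0);
    return sample_rate_hz / square_rate_hz;
}

/* Round a non-negative value to the nearest integer without needing libm. */
static long round_nonneg(double v)
{
    return (long)(v + 0.5);
}
```

For bttv in PAL mode, `samples_per_line(35468950.0 / 2.0, 15625.0)` comes out at roughly 1135 samples per line, and the 14.75 MHz square-pixel rate gives 944, which is where the 1135 / 944 ratio comes from.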
The examples in the spec are: 54 / 59 = 13.5 MHz / 14.75 MHz (PAL BT.601) and 11 / 10 = 13.5 MHz / 12 3/11 MHz (NTSC BT.601). Of course there's no guarantee defrect will cover exactly 52 µs. An older chip may always capture 720 pixels at 13.5 MHz with no support for scaling or cropping whatsoever. But since all video capture drivers are supposed to support VIDIOC_CROPCAP, apps can still determine the pixel aspect and display captured images correctly. > [cropcap] does not take into account anamorphic 16:9 transmissions. It's true cropcap assumes the pixel aspect (or sampling rate) will never change. PAL/NTSC/SECAM has a 4:3 picture aspect. Apps must find out by other means, perhaps WSS, if a 16:9 signal is transmitted instead, and ask the driver to scale the images accordingly or do that themselves. But for a webcam with an anamorphic lens it would be perfectly correct to return e.g. defrect=640x480 and pixelaspect=3/4. > The height of defrect should correspond to the active picture area. > In case of 625-line PAL/SECAM it should represent 576 lines. > It follows that > width = defrect.height * 4/3 > * v4l2_cropcap.pixelaspect.numerator > / v4l2_cropcap.pixelaspect.denominator; > covers 52µs of a 64µs PAL/SECAM line. > 52µs equals 702 BT.601 pixels. Not quite. Let's say defrect is 720x576 and pixelaspect is 54/59 (PAL/SECAM BT.601). If you want to capture exactly what the driver samples (no scaling) just call VIDIOC_S_FMT with cropcap.defrect as the image size. From our hypothetical BT.601 driver you'd get 720x576 images (no square pixels) with a black bar at the left and right edge because BT.601 overscans the picture: 720 / 13.5 MHz = 53.3 µs. To get the same area of the picture with square pixels you request: image width = defrect.width / pixelaspect; image height = defrect.height; Now the images will be 720 / 54 * 59 = 786 square pixels wide. That's more than 768 because you're still overscanning. 
What you really need is: image width = 768; image height = 576; crop width = round (image width * pixelaspect); crop height = defrect.height; crop left = defrect.left + (defrect.width - crop width) / 2; crop top = defrect.top; Now the images will be 768 pixels wide, scaled up from 768 * 54 / 59 = 703 sampled pixels, which cover 703 / 13.5 MHz = 52.0 µs centered over the active picture. With the same code the bttv driver with defrect 924x576 and pixelaspect 1135/944 would give you 768x576 square pixel images covering 923 / 17.73 MHz = 52.0 µs centered. But a driver can return any defrect.height and the picture aspect is not necessarily 4:3. Imagine a webcam with a 1280x720 sensor. image width = 768; image height = 576; crop width = round (image width * pixelaspect * defrect.height / image height); crop height = defrect.height; crop left = defrect.left + (defrect.width - crop width) / 2; crop top = defrect.top; Now let's say defrect is 720x480 and pixelaspect is 11/10 (NTSC BT.601). Result: The driver scales up from 704x480 to 768x576 square pixels covering 704 / 13.5 MHz = 52.1 µs centered. Let's say defrect is 1280x720 and pixelaspect is 1/1 (16:9 camera). Result: It scales images down from 960x720 to 768x576, cutting off 160 pixels left and right. Let's say defrect is 640x480 and pixelaspect is 3/4 (camera with anamorphic lens). Result: It scales images up from 480x480 to 768x576, cutting off 80 pixels left and right. The relation between picture aspect and pixel aspect is: picture aspect = defrect.width / defrect.height / pixelaspect; E.g. 16/9 = 640 / 480 / (3/4). > The defrect.left+defrect.width/2 should be the center of the active > picture area. That's required by the spec, also in the vertical direction. (Well, duh. What else would drivers capture by default.) > Many people use 480 lines instead of 486 lines for the active region in NTSC > and if there are inconsistencies in drivers, application may degrade the > picture by scaling. 
Therefore it would be nice if at least analog vertical > resolution was mapped 1:1 to cropping regions per standard. > Not doing so would make sense only if there was a tv standard where the > image is drawn column-wise. Horizontally the bttv driver's v4l2_crop counts luminance samples starting at 0H, which is an obvious choice to me. Don't know about saa7134. Vertically the bttv and saa7134 driver count frame lines. Field lines would be admissible too, but considering these devices can capture interlaced images it makes sense to return defrect.height 480 and 576. An odd cropping height is not possible though. The vertical origin is given by counting ITU-R line numbers as in the VBI API, which simplifies things quite a bit. Specifically these drivers count ITU-R line numbers of the first field times two, so bttv's defrect.top is 23 * 2. It may be nice if other drivers followed this convention, but apps cannot blindly rely on that. (They can check the driver name if exact cropping is important.) The cropping units are undefined by the spec because samples, microseconds or scan lines depend on the video standard and make no sense for a webcam. Michael -- video4linux-list mailing list Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe https://www.redhat.com/mailman/listinfo/video4linux-list ^ permalink raw reply [flat|nested] 15+ messages in thread
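The general crop calculation from the message above (the variant that scales by defrect.height / image height) can be written as a small C function. This is a sketch under assumptions: `struct fract` and `struct rect` are local stand-ins for `struct v4l2_fract` and `struct v4l2_rect`, and `centered_crop` is a hypothetical name. A real application would pass the result to VIDIOC_S_CROP and accept whatever adjustment the driver makes.

```c
#include <assert.h>

/* Local stand-ins for struct v4l2_fract and struct v4l2_rect (assumptions). */
struct fract { unsigned int numerator, denominator; };
struct rect { int left, top; unsigned int width, height; };

/*
 * Crop rectangle, in driver units, for a square-pixel image of
 * image_width x image_height, centered horizontally over defrect:
 *   crop width = round(image width * pixelaspect
 *                      * defrect.height / image height)
 */
static struct rect centered_crop(struct rect defrect, struct fract pa,
                                 unsigned int image_width,
                                 unsigned int image_height)
{
    struct rect crop;
    unsigned long long num, den;

    assert(pa.denominator != 0 && image_height != 0);
    num = (unsigned long long)image_width * pa.numerator * defrect.height;
    den = (unsigned long long)pa.denominator * image_height;

    crop.width = (unsigned int)((num + den / 2) / den); /* round to nearest */
    crop.height = defrect.height;
    /* Goes below defrect.left if the crop is wider than defrect. */
    crop.left = defrect.left + ((int)defrect.width - (int)crop.width) / 2;
    crop.top = defrect.top;
    return crop;
}
```

Fed with the defrect and pixelaspect pairs from the message and a 768x576 target, this gives crop widths of 703 (PAL BT.601), 923 (bttv), 704 (NTSC BT.601), 960 (16:9 camera) and 480 (anamorphic lens).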
* Re: Need VIDIOC_CROPCAP clarification 2008-06-06 22:29 ` Michael Schimek @ 2008-06-07 1:33 ` Daniel Glöckner 2008-06-08 12:27 ` Michael Schimek 2008-06-07 2:28 ` Andy Walls 2008-06-11 18:49 ` Hans Verkuil 2 siblings, 1 reply; 15+ messages in thread From: Daniel Glöckner @ 2008-06-07 1:33 UTC (permalink / raw) To: Michael Schimek; +Cc: video4linux-list On Sat, Jun 07, 2008 at 12:29:43AM +0200, Michael Schimek wrote: > Well as the spec says, pixelaspect is the aspect ratio (y/x) of pixels > sampled by the driver. [...] > That's what v4l2_crop counts, pixels as sampled by the hardware. There is no need for the crop values to be related to sampled pixels as long as one can calculate the DAR of a crop region. The pixelaspect value may define virtual pixels that appear nowhere in hard- and software. The region is associated with real pixels when the user requests a width and height in VIDIOC_S_FMT. > As Daniel wrote, another way to view this is that the active portion of > the video is about 52 µs wide. Actually it's defined in BT.1700 to be exactly 52µs. For NTSC SMPTE 170M says 52.85555...µs. > Of course there's no guarantee defrect will cover exactly 52 µs. An > older chip may always capture 720 pixels at 13.5 MHz with no support for > scaling or cropping whatsoever. But since all video capture drivers are > supposed to support VIDIOC_CROPCAP, apps can still determine the pixel > aspect and display captured images correctly. Drivers may modify the region requested by the user even if it is the defrect. So my request to make defrect cover the standardized active area makes still sense for chips with a fixed area. Then applications know how to clip in software to the active region. Spec says: "The driver first adjusts the requested dimensions against hardware limits, i. e. the bounds given by the capture/output window, and it rounds to the closest possible values of horizontal and vertical offset, width and height." 
> > [cropcap] does not take into account anamorphic 16:9 transmissions. > > It's true cropcap assumes the pixel aspect (or sampling rate) will never > change. PAL/NTSC/SECAM has a 4:3 picture aspect. Apps must find out by > other means, perhaps WSS, if a 16:9 signal is transmitted instead, and > ask the driver to scale the images accordingly or do that themselves. Ack! SMPTE 170M says 4:3 for NTSC and BT.1700 for NTSC and SECAM. No word in BT.1700 about PAL aspect ratio. > > The height of defrect should correspond to the active picture area. > > In case of 625-line PAL/SECAM it should represent 576 lines. > > It follows that > > width = defrect.height * 4/3 > > * v4l2_cropcap.pixelaspect.numerator > > / v4l2_cropcap.pixelaspect.denominator; > > covers 52µs of a 64µs PAL/SECAM line. > > 52µs equals 702 BT.601 pixels. > > Not quite. Let's say defrect is 720x576 and pixelaspect is 54/59 > (PAL/SECAM BT.601). I wrote "should" because that's what I think drivers would do in a perfect world. See my above point about the driver modifying the requested area. > If you want to capture exactly what the driver samples (no scaling) just > call VIDIOC_S_FMT width cropcap.defrect as the image size. Nooo! Don't use cropcap values in VIDIOC_S_FMT. Spec says: "the driver writer is free to choose origin and units of the coordinate system in the analog domain." There is a problem with wanting to capture what is sampled without scaling. A capture card may perform scaling by modifying the sampling frequency. Then there is no sampling without scaling. A corresponding driver may use nanoseconds for horizontal crop coordinates even if it can't capture 52000 pixels per line. The cx88 chips have a Sample Rate Conversion register that allows exactly this, although it is supposed to be used only for a handful of sample rates because there are only four luma notches/chroma bandpasses to choose from. VIDIOCGCAP did return the maximum resolution. 
In v4l2 applications can call _S_FMT/_TRY_FMT with huge width and height values and let the driver reduce these to the supported maximum. > Now the images will be 720 / 54 * 59 = 786 square pixels wide. That's > more than 768 because you're still overscanning. What you really need > is: > > image width = 768; > image height = 576; And that's where I wanted defrect to tell us the region of interest and not the hardware limitations. > Let's say defrect is 1280x720 and pixelaspect is 1/1 (16:9 camera). > Result: It scales images down from 960x720 to 768x576, cutting off 160 > pixels left and right. In my perfect world applications apply that scaling logic only when VIDIOC_G_STD returns a known standard. Webcams that don't conform to a tv standard may return whatever is useful for defrect on that hardware. Applications can then still compute the display aspect ratio of a crop region. > > The defrect.left+defrect.width/2 should be the center of the active > > picture area. > > That's required by the spec, also in the vertical direction. (Well, duh. > What else would drivers capture by default.) Is it? Spec says: "this could be for example a 640 × 480 rectangle for NTSC, a 768 × 576 rectangle for PAL and SECAM centered over the active picture area." Doesn't sound like a requirement. If you want to make it one, I'll vote for you. > Vertically the bttv and saa7134 driver count frame lines. Field lines > would be admissible too, but considering these devices can capture > interlaced images it makes sense to return defrect.height 480 and 576. It makes sense as well to return defrect.height=486 for NTSC. > An odd cropping height is not possible though. Why? Applications are still allowed to vertically scale the picture, so there may be (or is enforced by the driver) an even number of lines in the end. The spec says the field order could be confused if the vertical offset was odd but then who knows which line belongs to which field after scaling? 
If you want odd vertical offsets, ask the driver to interleave the fields. > The vertical origin is given by counting ITU-R line numbers as in the > VBI API, which simplifies things quite a bit. Specifically these drivers > count ITU-R line numbers of the first field times two, so bttv's > defrect.top is 23 * 2. > > It may be nice if other drivers followed this convention, but apps > cannot blindly rely on that. (They can check the driver name if exact > cropping is important.) The cropping units are undefined by the spec > because samples, microseconds or scan lines depend on the video standard > and make no sense for a webcam. Strange, I had the feeling you wanted to pass cropping units to VIDIOC_S_FMT... Webcams usually have rows of pixels that can be counted. Spec could be modified to have vertical units = scan lines for analog tv standards and pixel rows for discrete image sensors. Hopefully nobody invents image sensors with an irregular pixel distribution. Daniel -- video4linux-list mailing list Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe https://www.redhat.com/mailman/listinfo/video4linux-list ^ permalink raw reply [flat|nested] 15+ messages in thread
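Computing the display aspect of a crop region, as mentioned in the message above, follows from the relation given earlier in the thread (picture aspect = width / height / pixelaspect). A minimal C sketch, assuming `struct fract` as a local stand-in for `struct v4l2_fract`; `region_aspect` and `gcd_ul` are hypothetical helpers:

```c
#include <assert.h>

/* Local stand-in for struct v4l2_fract (assumption: same two fields). */
struct fract { unsigned int numerator, denominator; };

/* Greatest common divisor, used to reduce the resulting fraction. */
static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/*
 * Picture aspect of a region as a reduced fraction:
 *   aspect = width / height / pixelaspect
 *          = (width * pa.denominator) / (height * pa.numerator)
 */
static struct fract region_aspect(unsigned int width, unsigned int height,
                                  struct fract pa)
{
    unsigned long n = (unsigned long)width * pa.denominator;
    unsigned long d = (unsigned long)height * pa.numerator;
    unsigned long g;
    struct fract out;

    assert(d != 0); /* zero height or a 0/x pixelaspect cannot be evaluated */
    g = gcd_ul(n, d);
    out.numerator = (unsigned int)(n / g);
    out.denominator = (unsigned int)(d / g);
    return out;
}
```

The anamorphic webcam example (640x480 with pixelaspect 3/4) reduces to 16/9, and a 704x480 region with pixelaspect 11/10 reduces to 4/3.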
* Re: Need VIDIOC_CROPCAP clarification 2008-06-07 1:33 ` Daniel Glöckner @ 2008-06-08 12:27 ` Michael Schimek 2008-06-08 16:55 ` Daniel Glöckner 0 siblings, 1 reply; 15+ messages in thread From: Michael Schimek @ 2008-06-08 12:27 UTC (permalink / raw) To: Daniel Glöckner; +Cc: video4linux-list On Sat, 2008-06-07 at 03:33 +0200, Daniel Glöckner wrote: > There is no need for the crop values to be related to sampled pixels > as long as one can calculate the DAR of a crop region. > The pixelaspect value may define virtual pixels that appear nowhere in > hard- and software. I'd hope v4l2_cropcap tells me what the hardware is actually capable of, i.e. the maximum physical resolution, not interpolated pixels. > Drivers may modify the region requested by the user even if it is the > defrect. So my request to make defrect cover the standardized active > area makes still sense for chips with a fixed area. Then applications > know how to clip in software to the active region. You think defrect should return the size and position of the active picture area in hardware units, not what the closest area is the hardware can actually capture? Well that's a possibility, but for once many drivers do not support VIDIOC_G/S_CROP, implying defrect is what they capture. > > Not quite. Let's say defrect is 720x576 and pixelaspect is 54/59 > > (PAL/SECAM BT.601). > > I wrote "should" because that's what I think drivers would do in a > perfect world. See my above point about the driver modifying the > requested area. Sorry it was late. You're right of course. > > If you want to capture exactly what the driver samples (no scaling) just > > call VIDIOC_S_FMT width cropcap.defrect as the image size. > > Nooo! Don't use cropcap values in VIDIOC_S_FMT. > Spec says: > "the driver writer is free to choose origin and units of the coordinate > system in the analog domain." 
Well the idea was that all drivers support at least 1:1 scaling, defrect
is what apps usually want to capture and what the hardware can actually
capture, so defrect.width and .height is a valid image size and
presumably the highest resolution image apps can get of the active
picture area.

> There is a problem with wanting to capture what is sampled without scaling.
> A capture card may perform scaling by modifying the sampling frequency.

Which one for example? I've only ever seen cards with discrete sampling
frequencies.

> Then there is no sampling without scaling.

But there is still a maximum sampling frequency which limits the number
of "pixels" that can be cropped.

> A corresponding driver may
> use nanoseconds for horizontal crop coordinates even if it can't capture
> 52000 pixels per line.
>
> The cx88 chips have a Sample Rate Conversion register that allows exactly
> this, although it is supposed to be used only for a handful of sample
> rates because there are only four luma notches/chroma bandpasses to
> choose from.

I'm not convinced that selecting the horizontal cropping start and end
with nanosecond precision is more useful than knowing how many vertical
lines the hardware can actually distinguish.

> VIDIOCGCAP did return the maximum resolution. In v4l2 applications can
> call _S_FMT/_TRY_FMT with huge width and height values and let the
> driver reduce these to the supported maximum.

As I understand it VIDIOCGCAP returns the minimum and maximum scaled
image size.

> > > The defrect.left+defrect.width/2 should be the center of the active
> > > picture area.
> >
> > That's required by the spec, also in the vertical direction. (Well, duh.
> > What else would drivers capture by default.)
>
> Is it? Spec says:
> "this could be for example a 640 × 480 rectangle for NTSC, a 768 × 576
> rectangle for PAL and SECAM centered over the active picture area."
>
> Doesn't sound like a requirement.
> If you want to make it one, I'll vote for you.

I expressed that badly. What I meant to say was "defrect shall provide
co-ordinates which cover all of the picture, for example x by y pixels
which are centered over the picture by virtue of covering all of it."

I really want to replace "the picture" by "exactly what the respective
video standard defines as the active picture area" but in reality the
term may not apply, and many drivers do not or cannot aim that
accurately.

> > Vertically the bttv and saa7134 driver count frame lines. Field lines
> > would be admissible too, but considering these devices can capture
> > interlaced images it makes sense to return defrect.height 480 and 576.
>
> It makes sense as well to return defrect.height=486 for NTSC.

That's right but I didn't pick the numbers. By default bttv always
captured 480 lines and quite a few apps may depend on that. I guess most
drivers capture 480 lines in NTSC-M mode, and most apps request 480
scaled lines.

> > An odd cropping height is not possible though.
>
> Why?
> Applications are still allowed to vertically scale the picture, so there
> may be (or is enforced by the driver) an even number of lines in the end.
>
> The spec says the field order could be confused if the vertical offset was
> odd but then who knows which line belongs to which field after scaling?
> If you want odd vertical offsets, ask the driver to interleave the fields.

I'm not sure what you are arguing for. Frame lines reflect the vertical
resolution of the hardware. Properly, drivers should scale the fields
separately, which does not affect their order. The bt8x8 in particular
cannot begin scaling on an even frame line and end on an odd frame line,
of the opposite field, as an odd cropping height would suggest. An odd
vertical offset may swap the fields, if the source is interlaced, which
is an unnecessary complication on top of the already intricate enum
v4l2_field.

> > It may be nice if other drivers followed this convention, but apps
> > cannot blindly rely on that. (They can check the driver name if exact
> > cropping is important.) The cropping units are undefined by the spec
> > because samples, microseconds or scan lines depend on the video standard
> > and make no sense for a webcam.
>
> Strange, I had the feeling you wanted to pass cropping units to
> VIDIOC_S_FMT...

With known cropping units one can select an absolute position. Without
that knowledge one can still move and resize the crop window relative to
the default and pick scale factors. What's the problem?

> Webcams usually have rows of pixels that can be counted.
> Spec could be modified to have vertical units = scan lines for analog
> tv standards and pixel rows for discrete image sensors.

If that covers all hardware and can be required without breaking drivers
I won't object.

Michael
* Re: Need VIDIOC_CROPCAP clarification
  2008-06-08 12:27 ` Michael Schimek
@ 2008-06-08 16:55 ` Daniel Glöckner
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Glöckner @ 2008-06-08 16:55 UTC (permalink / raw)
  To: Michael Schimek; +Cc: video4linux-list

On Sun, Jun 08, 2008 at 02:27:16PM +0200, Michael Schimek wrote:
> I'd hope v4l2_cropcap tells me what the hardware is actually capable of,
> i.e. the maximum physical resolution, not interpolated pixels.

How many cards do support upscaling?
Defining crop units that way would make it impossible to make use of the
higher resolution in cropping after upscaling. And it conflicts with the
idea of using tv frame lines for vertical units if the device can only
capture one field.

> You think defrect should return the size and position of the active
> picture area in hardware units, not what the closest area is the
> hardware can actually capture?

Not necessarily in hardware units, but yes.
Using hardware units would suggest that the hardware is capable of
cropping to that size.

> Well that's a possibility, but for once many drivers do not support
> VIDIOC_G/S_CROP, implying defrect is what they capture.

You mean drivers that support VIDIOC_CROPCAP without VIDIOC_G/S_CROP?
I see two possibilities:

  1. add VIDIOC_G/S_CROP dummies
  2. add to the spec that drivers without VIDIOC_G/S_CROP capture bounds

> Well the idea was that all drivers support at least 1:1 scaling, defrect
> is what apps usually want to capture and what the hardware can actually
> capture, so defrect.width and .height is a valid image size and
> presumably the highest resolution image apps can get of the active
> picture area.

That makes me think about my old FAST MovieMachine Pro. It does not have
enough on-board RAM to capture both fields at full resolution. It can
downscale horizontally and vertically by dropping pixels/lines in a
Bresenham way. So it's either 448x576 or 704x372 maximum. Using 13.5 MHz
sample rate pixels horizontally and tv lines vertically as crop units
will make defrect too big to be a valid image size. What do you suggest?

> That's right but I didn't pick the numbers. By default bttv always
> captured 480 lines and quite a few apps may depend on that. I guess most
> drivers capture 480 lines in NTSC-M mode, and most apps request 480
> scaled lines.

Ah, those beautiful numbers divisible by 16 needed for MPEG...

A workaround could be to decouple the startup crop region from defrect.
Then applications that don't know about cropping get 480 lines on open
while those that do can work with 486 lines.

This is a problem for horizontal cropping as well, when the active
region is 702 pixels wide and scaling is impossible.

> The bt8x8 in particular
> cannot begin scaling on an even frame line and end on an odd frame line,
> of the opposite field, as an odd cropping height would suggest.

On bt8x8 you set the scaling factor and tell the hardware where to save
each line. You can't prevent the chip from using one more line for
scaling but you can set a scaling factor that maps 517 lines to 512 and
then capture 512 lines.

> With known cropping units one can select an absolute position. Without
> that knowledge one can still move and resize the crop window relative to
> the default and pick scale factors.

So with unknown units one can still select the absolute position if
defrect is standardized.

The ability to capture without scaling could be implemented by other
means. One could for example pass v4l2_pix_format.width=0 to get the
unscaled width for the current crop region. Width and height should then
be tried independently to make it work with my old capture card.

  Daniel

P.S.: GMX has -all in its SPF TXT RR. You should use their mail server.
* Re: Need VIDIOC_CROPCAP clarification
  2008-06-06 22:29 ` Michael Schimek
  2008-06-07  1:33   ` Daniel Glöckner
@ 2008-06-07  2:28   ` Andy Walls
  2008-06-08 12:27     ` Michael Schimek
  2008-06-11 18:49   ` Hans Verkuil
  2 siblings, 1 reply; 15+ messages in thread
From: Andy Walls @ 2008-06-07  2:28 UTC (permalink / raw)
  To: mschimek; +Cc: video4linux-list

On Sat, 2008-06-07 at 00:29 +0200, Michael Schimek wrote:

> As Daniel wrote, another way to view this is that the active portion of
> the video is about 52 µs wide. At 12.27 MHz (for NTSC square pixels) one
> would sample or "crop" 638 pixels over this period.

Actually, the NTSC line frequency is

	Fh = 4.5 MHz/286 ~= 15.73426 kHz

so

	1/Fh ~= 63.5556 usec

and thus the active part of an NTSC line is actually

	(1/Fh - 10.9 us) ~= 52.65556 usec

At the NTSC 12 3/11 MHz square pixel sampling rate that's actually

	(1/Fh - 10.9 us) * 12 3/11 MHz
	  = (286/4.5 MHz - 10.9 us) * 12 3/11 MHz
	  = 646.22727 samples

That is slightly above, and close to, the VGA screen width of 640
pixels, with ~3 pixels of active video lost on each of the left and
right edges.

-Andy
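The arithmetic in the message above can be checked with exact fractions. What follows is only a sketch in plain Python, not a V4L2 call: the function name `active_samples` and the constant names are made up for illustration, and the 10.9 µs blanking interval is simply the figure used in the message.

```python
from fractions import Fraction

def active_samples(line_freq_hz, blanking_s, sample_rate_hz):
    """Number of samples that fall in the active part of one line."""
    line_period_s = 1 / Fraction(line_freq_hz)
    return (line_period_s - blanking_s) * sample_rate_hz

# Values quoted in the message (names are illustrative only).
NTSC_LINE_FREQ_HZ = Fraction(4_500_000, 286)       # about 15.73426 kHz
NTSC_SQUARE_PIXEL_HZ = Fraction(135_000_000, 11)   # 12 3/11 MHz
BLANKING_S = Fraction(109, 10_000_000)             # 10.9 us

samples = active_samples(NTSC_LINE_FREQ_HZ, BLANKING_S, NTSC_SQUARE_PIXEL_HZ)
print(float(samples))  # close to the 646.22727 samples given above
```

Keeping the rates as fractions avoids rounding 12 3/11 MHz to 12.27 MHz, which is where the difference from the rounder figure of 638 pixels over "about 52 µs" comes from.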
* Re: Need VIDIOC_CROPCAP clarification
  2008-06-07  2:28 ` Andy Walls
@ 2008-06-08 12:27 ` Michael Schimek
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Schimek @ 2008-06-08 12:27 UTC (permalink / raw)
  To: Andy Walls; +Cc: video4linux-list

On Fri, 2008-06-06 at 22:28 -0400, Andy Walls wrote:
> On Sat, 2008-06-07 at 00:29 +0200, Michael Schimek wrote:
>
> > As Daniel wrote, another way to view this is that the active portion of
> > the video is about 52 µs wide. At 12.27 MHz (for NTSC square pixels) one
> > would sample or "crop" 638 pixels over this period.
>
> Actually since the NTSC line frequency is
>
> Fh = 4.5 MHz/286 ~= 15.73426 kHz

I know. Do I have to give three decimal places for every single video
standard to explain the basic idea? ;-)

Michael
* Re: Need VIDIOC_CROPCAP clarification
  2008-06-06 22:29 ` Michael Schimek
  2008-06-07  1:33   ` Daniel Glöckner
  2008-06-07  2:28   ` Andy Walls
@ 2008-06-11 18:49   ` Hans Verkuil
  2008-06-11 20:15     ` Daniel Glöckner
  2 siblings, 1 reply; 15+ messages in thread
From: Hans Verkuil @ 2008-06-11 18:49 UTC (permalink / raw)
  To: mschimek; +Cc: video4linux-list

On Saturday 07 June 2008 00:29:43 Michael Schimek wrote:
> On Tue, 2008-05-27 at 09:00 +0200, Hans Verkuil wrote:
> > Here's an old article I found detailing the design of pixelaspect,
> > it makes me wonder if what bttv does isn't wrong and pixelaspect is
> > really a pixel aspect.
>
> After all the time I spent on the stuff it better be right. ;)
>
> Well as the spec says, pixelaspect is the aspect ratio (y/x) of
> pixels sampled by the driver. For example in PAL mode bttv samples
> luminance pixels at four times the PAL-BG color carrier frequency:
> 35468950 / 2 Hz. The line frequency is 15625 Hz so one gets 1135.0064
> pixels/line. That's what v4l2_crop counts, pixels as sampled by the
> hardware.

I'm sorry, but it remains a headache-inducing description in the spec.

Put yourself in the place of an application writer. You want to know:

1) how do I scale the image?
2) how do I crop the image?
3) what is the pixel aspect of the pixels that I capture?

I'll try to answer this based on my understanding of the spec.

1) This one is fairly easy: using G/S_FMT you can select the new width
and height. The max width and height are specified by CROPCAP,
bounds.width and bounds.height. The default initial width and height
are defrect.width and defrect.height.

Note: bounds and defrect are very strange in that width and height have
pixels as units and top and left have their own units (although in
practice they also use pixels as the unit). This is not at all obvious
from the spec! Also, is there any reason why we shouldn't use pixels
as the top/left unit as well? See more about this below.

2) You change c.width and c.height (pixel units) to crop to part of the
image, and you change c.left and c.top ('analog units') to move the
origin. There doesn't seem to be a way to know what the maximum left
and top values are (bounds only specifies the minimum). This would be
possible if left and top also used pixel units; then bounds would really
specify the max frame you can capture.

Note that there also does not seem to be any information about the
relationship between the top/left units and the width/height pixels, so
how can you ever hope to reliably crop a 300 by 200 pixel window that's
10 pixels to the left and 10 pixels down from the default capture
window?

Suppose you have a black border on the left hand side that's 6 pixels
wide. If you want to crop that, how would you do that? If c.left is in
pixels, then you just set that to 6 (and subtract 6 from c.width).
Since c.left has its own units, how do I know what to put in c.left?

Note that examples 1-12 and 1-13 in the spec clearly assume that the
crop units are pixels! And I think all drivers we have do the same.

3) CROPCAP returns the pixelaspect of the pixels you capture when you
use defrect.width/height as the width and height with S_FMT and
defrect.width/height with S_CROP. Non-standard cropping and scaling
means that you will have to calculate the new pixelaspect by taking
that into account. This also does not take things like anamorphic
widescreen into account, you have to detect that yourself and adjust
accordingly.

Does all this make any sense?

It's really this sentence that makes things so hard: 'the driver writer
is free to choose origin and units of the coordinate system in the
analog domain.' If that was replaced by: 'the driver writer is free to
choose the origin of the coordinate system.' then it would make a lot
more sense.

Regards,

	Hans

> The industry standard sampling rate to get PAL square pixels is 14.75
> MHz, giving 944 samples/line. Therefore cropcap.pixelaspect is 1135 /
> 944.
> In human terms the pixels are too narrow. On a square pixel
> display like a computer monitor the unscaled image looks stretched.
>
> Square pixel PAL images have 576 * 4 / 3 = 768 pixels per line. To
> produce 768 square pixels the bttv driver must sample 768 * 1135 /
> 944 = 923.4 pixels and scale down to 768. (The driver actually
> captures 924x576 pixels since times immemorial, so cropcap.defrect is
> 924x576.)
>
> As Daniel wrote, another way to view this is that the active portion
> of the video is about 52 µs wide. At 12.27 MHz (for NTSC square
> pixels) one would sample or "crop" 638 pixels over this period. At
> 13.5 MHz (BT.601) 702 pixels, at 14.75 MHz (for PAL square pixels)
> 767 pixels, and at 17.73 MHz (bttv) 922.2 pixels.
>
> The pixel aspect is (actual sampling rate) / (square pixel sampling
> rate). The examples in the spec are: 54 / 59 = 13.5 MHz / 14.75 MHz
> (PAL BT.601) and 11 / 10 = 13.5 MHz / 12 3/11 MHz (NTSC BT.601).
>
> Of course there's no guarantee defrect will cover exactly 52 µs. An
> older chip may always capture 720 pixels at 13.5 MHz with no support
> for scaling or cropping whatsoever. But since all video capture
> drivers are supposed to support VIDIOC_CROPCAP, apps can still
> determine the pixel aspect and display captured images correctly.
>
> > [cropcap] does not take into account anamorphic 16:9 transmissions.
>
> It's true cropcap assumes the pixel aspect (or sampling rate) will
> never change. PAL/NTSC/SECAM has a 4:3 picture aspect. Apps must find
> out by other means, perhaps WSS, if a 16:9 signal is transmitted
> instead, and ask the driver to scale the images accordingly or do
> that themselves.
>
> But for a webcam with an anamorphic lens it would be perfectly
> correct to return e.g. defrect=640x480 and pixelaspect=3/4.
>
> > The height of defrect should correspond to the active picture area.
> > In case of 625-line PAL/SECAM it should represent 576 lines.
> > It follows that
> >   width = defrect.height * 4/3
> >           * v4l2_cropcap.pixelaspect.numerator
> >           / v4l2_cropcap.pixelaspect.denominator;
> > covers 52µs of a 64µs PAL/SECAM line.
> > 52µs equals 702 BT.601 pixels.
>
> Not quite. Let's say defrect is 720x576 and pixelaspect is 54/59
> (PAL/SECAM BT.601).
>
> If you want to capture exactly what the driver samples (no scaling)
> just call VIDIOC_S_FMT with cropcap.defrect as the image size. From
> our hypothetical BT.601 driver you'd get 720x576 images (no square
> pixels) with a black bar at the left and right edge because BT.601
> overscans the picture: 720 / 13.5 MHz = 53.3 µs.
>
> To get the same area of the picture with square pixels you request:
>
>   image width = defrect.width / pixelaspect;
>   image height = defrect.height;
>
> Now the images will be 720 / 54 * 59 = 786 square pixels wide. That's
> more than 768 because you're still overscanning. What you really need
> is:
>
>   image width = 768;
>   image height = 576;
>
>   crop width = round (image width * pixelaspect);
>   crop height = defrect.height;
>   crop left = defrect.left + (defrect.width - crop width) / 2;
>   crop top = defrect.top;
>
> Now the images will be 768 pixels wide, scaled up from 768 * 54 / 59
> = 703 sampled pixels, which cover 703 / 13.5 MHz = 52.0 µs centered
> over the active picture. With the same code the bttv driver with
> defrect 924x576 and pixelaspect 1135/944 would give you 768x576
> square pixel images covering 923 / 17.73 MHz = 52.0 µs centered.
>
> But a driver can return any defrect.height and the picture aspect is
> not necessarily 4:3. Imagine a webcam with a 1280x720 sensor.
>
>   image width = 768;
>   image height = 576;
>
>   crop width = round (image width * pixelaspect
>                       * defrect.height / image height);
>   crop height = defrect.height;
>   crop left = defrect.left + (defrect.width - crop width) / 2;
>   crop top = defrect.top;
>
> Now let's say defrect is 720x480 and pixelaspect is 11/10 (NTSC
> BT.601). Result: The driver scales up from 704x480 to 768x576 square
> pixels covering 704 / 13.5 MHz = 52.1 µs centered.
>
> Let's say defrect is 1280x720 and pixelaspect is 1/1 (16:9 camera).
> Result: It scales images down from 960x720 to 768x576, cutting off
> 160 pixels left and right.
>
> Let's say defrect is 640x480 and pixelaspect is 3/4 (camera with
> anamorphic lens). Result: It scales images up from 480x480 to
> 768x576, cutting off 80 pixels left and right.
>
> The relation between picture aspect and pixel aspect is:
>   picture aspect = defrect.width / defrect.height / pixelaspect;
> E.g. 16/9 = 640 / 480 / (3/4).
>
> > The defrect.left+defrect.width/2 should be the center of the active
> > picture area.
>
> That's required by the spec, also in the vertical direction. (Well,
> duh. What else would drivers capture by default.)
>
> > Many people use 480 lines instead of 486 lines for the active
> > region in NTSC and if there are inconsistencies in drivers,
> > application may degrade the picture by scaling. Therefore it would
> > be nice if at least analog vertical resolution was mapped 1:1 to
> > cropping regions per standard. Not doing so would make sense only
> > if there was a tv standard where the image is drawn column-wise.
>
> Horizontally the bttv driver's v4l2_crop counts luminance samples
> starting at 0H, which is an obvious choice to me. Don't know about
> saa7134.
>
> Vertically the bttv and saa7134 driver count frame lines. Field lines
> would be admissible too, but considering these devices can capture
> interlaced images it makes sense to return defrect.height 480 and
> 576. An odd cropping height is not possible though.
>
> The vertical origin is given by counting ITU-R line numbers as in the
> VBI API, which simplifies things quite a bit. Specifically these
> drivers count ITU-R line numbers of the first field times two, so
> bttv's defrect.top is 23 * 2.
>
> It may be nice if other drivers followed this convention, but apps
> cannot blindly rely on that. (They can check the driver name if exact
> cropping is important.) The cropping units are undefined by the spec
> because samples, microseconds or scan lines depend on the video
> standard and make no sense for a webcam.
>
> Michael
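The crop recipe quoted in the message above can be tried against the four example devices it mentions. The sketch below is plain Python arithmetic and makes several assumptions that are not in the original: rectangles are `(left, top, width, height)` tuples, the origin of every defrect is taken as (0, 0), the function names `centered_crop` and `picture_aspect` are invented, and the halving of the leftover width uses integer division.

```python
from fractions import Fraction

def centered_crop(defrect, pixelaspect, image_w, image_h):
    """Crop rectangle that yields image_w x image_h square pixels,
    centered horizontally in defrect (quoted recipe, general form)."""
    left, top, width, height = defrect
    crop_w = round(image_w * pixelaspect * Fraction(height, image_h))
    crop_left = left + (width - crop_w) // 2   # assumption: integer halving
    return (crop_left, top, crop_w, height)

def picture_aspect(defrect, pixelaspect):
    """picture aspect = defrect.width / defrect.height / pixelaspect."""
    _, _, width, height = defrect
    return Fraction(width, height) / pixelaspect

# Example devices from the message; origins (0, 0) are hypothetical.
examples = {
    "PAL BT.601":      ((0, 0, 720, 576),  Fraction(54, 59)),
    "bttv PAL":        ((0, 0, 924, 576),  Fraction(1135, 944)),
    "NTSC BT.601":     ((0, 0, 720, 480),  Fraction(11, 10)),
    "16:9 camera":     ((0, 0, 1280, 720), Fraction(1, 1)),
    "anamorphic lens": ((0, 0, 640, 480),  Fraction(3, 4)),
}

for name, (rect, aspect) in examples.items():
    print(name, centered_crop(rect, aspect, 768, 576))
```

The crop widths this prints (703, 923, 704, 960 and 480) match the figures given in the message, and the left offsets of the last two rectangles correspond to the 160 and 80 pixels said to be cut off on each side.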
* Re: Need VIDIOC_CROPCAP clarification
  2008-06-11 18:49 ` Hans Verkuil
@ 2008-06-11 20:15 ` Daniel Glöckner
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Glöckner @ 2008-06-11 20:15 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: video4linux-list, mschimek

On Wed, Jun 11, 2008 at 08:49:36PM +0200, Hans Verkuil wrote:
> Note: bounds and defrect are very strange in that width and height have
> pixels as units and top and left have their own units (although in
> practice it also uses pixels as the unit). This is not at all obvious
> from the spec! Also, is there any reason why we shouldn't uses pixels
> as well as the top/left unit? See more about this below.

I don't think it was supposed to be like that.
All four members use the same units.

> Note that examples 1-12 and 1-13 in the spec clearly assume that the
> crop units are pixels! And I think all drivers we have do the same.

Michael's mail from 2002 does that as well and as he said a few days ago,
the cropping units should be pixels at maximum unscaled resolution.

> 3) CROPCAP returns the pixelaspect of the pixels you capture when you
> use defrect.width/height as the width and height with S_FMT and
> defrect.width/height with S_CROP. Non-standard cropping and scaling
> means that you will have to calculate the new pixelaspect by taking
> that into account. This also does not take things like anamorphic
> widescreen into account, you have to detect that yourself and adjust
> accordingly.

I read that as well in the 2002 mail.

> It's really this sentence that makes things so hard: 'the driver writer
> is free to choose origin and units of the coordinate system in the
> analog domain.' If that was replaced by: 'the driver writer is free to
> choose the origin of the coordinate system.' then it would make a lot
> more sense.

And the units are pixels at the highest resolution without upscaling
(regardless of the possible cropping granularity).

The current standard does not allow one to derive the position and size
of an image in the tv signal from cropping values. The only thing known
is that defrect is centered over the active area and at least as large
as it, if and only if the hardware permits cropping such an area.

  Daniel
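If the reading proposed in this message holds and all four rectangle members are in pixels, the black-border case raised earlier in the thread becomes a one-line adjustment. The sketch below only illustrates that assumption, which the thread does not settle; the `(left, top, width, height)` tuple layout, the `trim_left` name and the 720x576 default rectangle are hypothetical.

```python
def trim_left(rect, border):
    """Drop `border` units from the left edge of (left, top, width, height).

    Only meaningful if left and width share one unit (assumed: pixels).
    """
    left, top, width, height = rect
    if not 0 <= border < width:
        raise ValueError("border must be smaller than the rectangle width")
    return (left + border, top, width - border, height)

defrect = (0, 0, 720, 576)    # hypothetical default rectangle
crop = trim_left(defrect, 6)  # remove a 6-pixel black border on the left
print(crop)
```

With driver-specific units for `left`, the same adjustment would first need a conversion factor, which is the information reported as missing from the spec.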
* Re: Need VIDIOC_CROPCAP clarification
  2008-05-27  6:53 ` Hans Verkuil
  2008-05-27  7:00   ` Hans Verkuil
@ 2008-05-27 23:24   ` Andy Walls
  2008-05-28  2:19     ` Daniel Glöckner
  1 sibling, 1 reply; 15+ messages in thread
From: Andy Walls @ 2008-05-27 23:24 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: v4l, Michael Schimek

On Tue, 2008-05-27 at 08:53 +0200, Hans Verkuil wrote:
> On Tuesday 27 May 2008 03:16:16 Andy Walls wrote:
> > On Mon, 2008-05-26 at 23:26 +0200, Hans Verkuil wrote:
> > > Hi all,
> > >
> > > How should the pixelaspect field of the v4l2_cropcap struct be
> > > filled? Looking at existing drivers it can be anything from 0/0,
> > > 1/1, 54/59 for PAL/SECAM and 11/10 for NTSC or the horizontal
> > > number of samples/the horizontal number of pixels.
> > >
> > > However, it is my understanding that the last one as used in bttv
> > > is the correct interpretation. Meaning that if the horizontal unit
> > > used for cropping is equal to a pixel (this is the case for most
> > > drivers), then pixelaspect should be 1/1. If the horizontal unit is
> > > different from a pixel, then it should be:
> > >
> > > (total number of horizontal units) / (horizontal pixels)
> > >
> > > So given a crop coordinate X, the corresponding coordinate in
> > > pixels would be:
> > >
> > > X * pixelaspect.denominator / pixelaspect.numerator
> > >
> > > This is what bttv does and I'm pretty sure that's when this ioctl
> > > was introduced.
> >
> > The definition of "defrect" for VIDIOC_CROPCAP in the spec seems to
> > support this interpretation of "pixelaspect".
> >
> > > Assuming this is correct, then the Spec needs to be fixed in
> > > several places (and drivers too, for that matter):
> > >
> > > - all references to the term 'pixel aspect' are incorrect: it has
> > > nothing to do with the pixel aspect, it is about the ratio between
> > > the horizontal sampling frequency and the 'pixel frequency'.
> >
> > Well, wouldn't changing the luma signal sampling rate (in time), but
> > not the number of luma samples per pixel, effectively stretch or
> > shrink the "real world" as it is displayed on a horizontal line, thus
> > affecting the apparent aspect of a pixel when compared to the
> > vertical dimension? Thus, when representing real world features, the
> > pixels can have an apparent aspect?
>
> Yes, but then you would no longer be compliant to BT.601. And in the
> case of CROPCAP this field still has nothing to do with the pixel
> aspect.
>
> > From the ITU-R BT.601-4 (which has been superseded by BT.601-6) that I
> > found on the 'net:
> >
> > http://inst.eecs.berkeley.edu/~cs150/Documents/ITU601.PDF
> >
> > BT.601 defines a luma sampling freq of 13.5 MHz, yielding 858 samples
> > per NTSC line and 720 samples per active regions of a line. The
> > 11/10 ratio mentioned for NTSC maps to 704/640, which is clearly a
> > ratio of active digital pixels to digital pixels scaled for a
> > display. So I'm not sure of the relationship to BT.601-4's 720
> > pixels.
> >
> > That's where this informal document may help:
> >
> > http://www.arachnotron.nl/videocap/doc/Karl_cap_v1_en.pdf
>
> Nice document. Explains it well. It also clearly shows that the 720
> width should include overscan (which it does for ivtv/cx18), so that
> the actual video part (704x576 for PAL/SECAM) does have a pixel aspect
> of 54/59 (y/x). I.e. the width of a PAL/SECAM pixel is slightly larger
> than its height.
>
> In other words, if the capture device follows BT.601, then the pixel
> aspect is fixed depending on the TV standard. And it has nothing to do
> with what CROPCAP calls 'pixelaspect'.

Well, not all recording formats follow BT.601 is what I gathered. But
suffice it to say, for the NTSC, PAL, & SECAM analog standards, there is
a small set of possible pixel aspects in use for digitization. There is
(I think) one sampling rate for each for a "square pixel", and then one
or more other sampling rates for non-square pixels.

> > He somewhat explains Display Aspect Ratio (DAR) and Pixel Aspect
> > Ratio (PAR), where 704 comes from instead of 720, and works some
> > examples with numbers on pages 7-9. He even talks about the BT8xx
> > chips on page 10.
> >
> > Here's some more clarifying/confusing information on pixel aspect:
> >
> > http://lurkertech.com/lg/pixelaspect/
> >
> > BTW, 'pixel frequency', as you are calling it, is twice the maximum
> > "spatial frequency" that is displayable on a line. The 'pixel
> > frequency', or 'pixels/line' is the Nyquist rate for the highest
> > displayable spatial frequency on the line, the highest supported
> > spatial cycles/line before aliasing makes features unresolvable. You
> > need to see a light pixel and a dark pixel to have one spatial cycle.
>
> Unfortunate term from my side.
>
> > (IIRC, If you know the focal length and field of view, the highest
> > spatial frequency tells you what is the smallest object length you
> > can hope to resolve.)
> >
> > > - the description of 'bounds' is wrong: "Width and height are
> > > defined in pixels, the driver writer is free to choose origin and
> > > units of the coordinate system in the analog domain." This is
> > > contradictory: the width units are up to the driver so the unit for
> > > the width is not necessarily a pixel. The way the cropping is setup
> > > implies that the height and Y coordinates are ALWAYS in line (aka
> > > pixel) units. It cannot be anything else since that's the way
> > > analog video works. You can't sample the height of half a line.
> > >
> > > - pixelaspect: has nothing to do with the pixel aspect. So the
> > > references to PAL/SECAM and NTSC are irrelevant.
> >
> > As the "Karl_cap_v1_en.pdf" points out on page 7, you need to know
> > the pixel aspect assumed by the digitization to do a proper
> > conversion from a source digital format to a target digital format
> > and a crop is part of that conversion.
> >
> > I think PAR has only an indirect relationship with analog video
> > standards. PAR has more to do with display devices, encoding and
> > recording standards, and digitization standards. All of these have
> > been influenced by the analog standards, so certain PARs can get tied
> > to certain analog standards.
> >
> > I think I've added more confusion than clarification. Oh well...
>
> The problem as I see it is that there doesn't seem to be a V4L API at
> the moment that returns the true pixel aspect. So any application
> basically has to hope the device follows BT.601 and assume the
> corresponding pixel aspect ratios. But this can become problematic with
> webcams etc. that do not follow BT.601.

So the only way to do the right thing is to "know" the only pixel
aspect supported by the device, or have the driver report the correct
one that is in use.

> Regards,
>
> Hans
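The ratios discussed in this message can be reproduced from sampling rates alone, using the definition given elsewhere in this thread: pixel aspect = (actual sampling rate) / (square pixel sampling rate). The sketch below is plain Python arithmetic under that definition; the function and constant names are invented for illustration.

```python
from fractions import Fraction

def pixelaspect(sampling_rate_hz, square_pixel_rate_hz):
    """Pixel aspect as a ratio of sampling rates (definition assumed
    from elsewhere in the thread, not taken from the V4L2 spec text)."""
    return Fraction(sampling_rate_hz) / Fraction(square_pixel_rate_hz)

BT601_HZ = Fraction(13_500_000)                    # BT.601 luma rate
PAL_SQUARE_PIXEL_HZ = Fraction(14_750_000)         # 14.75 MHz
NTSC_SQUARE_PIXEL_HZ = Fraction(135_000_000, 11)   # 12 3/11 MHz

pal = pixelaspect(BT601_HZ, PAL_SQUARE_PIXEL_HZ)
ntsc = pixelaspect(BT601_HZ, NTSC_SQUARE_PIXEL_HZ)
print(pal, ntsc)
```

The NTSC result reduces to the same fraction as 704/640, the mapping noted in the message, which suggests the two readings of the number need not conflict for a BT.601 device.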
* Re: Need VIDIOC_CROPCAP clarification
  2008-05-27 23:24 ` Andy Walls
@ 2008-05-28  2:19 ` Daniel Glöckner
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Glöckner @ 2008-05-28  2:19 UTC (permalink / raw)
  To: Andy Walls; +Cc: v4l, Michael Schimek

Just looking at the intended usage of struct v4l2_cropcap and ignoring
most of the descriptions in the standard, I would explain it this way:

	struct v4l2_crop x = {
		type,
		{
			0, 0,
			v4l2_cropcap.pixelaspect.numerator,
			v4l2_cropcap.pixelaspect.denominator
		}
	};

defines a square region on (or outside) the screen.
This does not take into account anamorphic 16:9 transmissions.

There is no information in v4l2_cropcap on how to map these values to
pixels. The mapping has to be done with VIDIOC_S_FMT.

v4l2_cropcap tells us how to calculate the DAR for a crop region.
v4l2_format defines how to calculate the PAR from the DAR.

The height of defrect should correspond to the active picture area.
In case of 625-line PAL/SECAM it should represent 576 lines.
It follows that

	width = defrect.height * 4/3
	        * v4l2_cropcap.pixelaspect.numerator
	        / v4l2_cropcap.pixelaspect.denominator;

covers 52µs of a 64µs PAL/SECAM line.
52µs equals 702 BT.601 pixels.

The defrect.left+defrect.width/2 should be the center of the active
picture area. This is 36.5µs after 0H (start of horizontal sync) for
PAL/SECAM according to BT.1700.

These microsecond calculations can of course only be done if v4l2_std_id
is a known standard. If it is unknown, the application only knows that
defrect looks good and how to scale the image to get the aspect ratio
right.

All of this is how I think it should work, not necessarily how it is
standardized.

Many people use 480 lines instead of 486 lines for the active region in
NTSC and if there are inconsistencies in drivers, applications may
degrade the picture by scaling. Therefore it would be nice if at least
analog vertical resolution was mapped 1:1 to cropping regions per
standard. Not doing so would make sense only if there was a tv standard
where the image is drawn column-wise.

  Daniel
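The width formula in this message can be compared with the 52 µs figure directly. The sketch below is plain Python arithmetic; it assumes a defrect height of 576 and the 54/59 pixelaspect quoted in the thread for PAL/SECAM BT.601, and the function names are invented.

```python
from fractions import Fraction

def active_width(defrect_height, pixelaspect, picture_aspect=Fraction(4, 3)):
    """width = defrect.height * 4/3 * numerator / denominator."""
    return defrect_height * picture_aspect * pixelaspect

def samples_in(duration_us, sample_rate_mhz):
    """Samples taken in duration_us microseconds at sample_rate_mhz MHz."""
    return Fraction(duration_us) * Fraction(sample_rate_mhz)

width_units = active_width(576, Fraction(54, 59))   # assumed PAL BT.601 values
bt601_samples = samples_in(52, Fraction(27, 2))     # 52 us at 13.5 MHz

print(float(width_units), float(bt601_samples))
```

The two results agree to within one sample, so under these assumptions the formula and the "52 µs equals 702 BT.601 pixels" statement describe nearly the same span.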