* New USB-Statistics API
@ 2009-12-08 9:16 Julian Scheel
2009-12-08 12:24 ` New DVB(!!)-Statistics API Julian Scheel
2009-12-08 13:22 ` New DVB-Statistics API Mauro Carvalho Chehab
0 siblings, 2 replies; 12+ messages in thread
From: Julian Scheel @ 2009-12-08 9:16 UTC (permalink / raw)
To: linux-media
Hello everyone,

after the last thread asking about signal statistics details
degenerated into a discussion about the technical possibilities for
implementing an entirely new API, which has led to nothing so far, I
wanted to open a new thread to move this forward. Maybe some more
people can give their votes for the different options.

Manu has proposed a new API for fetching enhanced statistics. It
uses a new IOCTL to fetch the statistical data directly from the
frontend in one go. This proposal has so far been rejected by Mauro,
who wants to use the S2API get/set calls instead.

Now, to give everyone a quick overview of the advantages and
disadvantages of both approaches, I will try to summarize them:

1st approach: Introduce new IOCTL

Pros:
- Allows a quick fetch of the entire dataset, which ensures that:
1. all values are fetched in one go (as far as possible) from the
frontend, so that they can be treated as one unit and evaluated in
conjunction
2. the requested values reach the caller within an almost constant
timeframe, as the ioctl is executed directly by the driver
- It does not interfere with the existing statistics API, which has to
be kept alive as it has been in use for a long time already. (Unifying
its data is not the goal of this new API.)

Cons:
- Forces application developers to interact with two APIs: the S2API
for tuning on the one hand and the statistics API for reading signal
values on the other.

2nd approach: Set up S2API calls for reading statistics

Pros:
- Continued unification of the linuxtv API, allowing all calls to be
made through one API -> easy for application developers

Cons:
- Due to the key/value pairs used for the S2API getters, the statistical
values can't be read as a single block, so we lose the guarantee that
all of the values can be treated as one unit expressing the signal's
state at a concrete point in time.
- Due to the general architecture of the S2API, the delay between
requesting and receiving the actual data could become too big to allow
realtime interpretation of the data (as needed for applications
like satellite finders, etc.)
I hope this summary is technically correct on all points; if not,
I'd be thankful for objective corrections. I am not going to articulate
my personal opinion in this mail, but I will do so in another mail in
reply to this one, so that this mail can be seen as a neutral summary of
the current situation.

So now it's everyone's turn to think about the options and give an
opinion. In the end I am quite sure that all of us would benefit
greatly from an improved statistics API.
Cheers,
Julian
* Re: New DVB(!!)-Statistics API
2009-12-08 9:16 New USB-Statistics API Julian Scheel
@ 2009-12-08 12:24 ` Julian Scheel
2009-12-08 13:22 ` New DVB-Statistics API Mauro Carvalho Chehab
1 sibling, 0 replies; 12+ messages in thread
From: Julian Scheel @ 2009-12-08 12:24 UTC (permalink / raw)
To: linux-media
On 08.12.09 10:16, Julian Scheel wrote:
> Hello everyone,
>
> after the last thread asking about signal statistics details
> degenerated into a discussion about the technical possibilities for
> implementing an entirely new API, which has led to nothing so far, I
> wanted to open a new thread to move this forward. Maybe some more
> people can give their votes for the different options.
>
> Manu has proposed a new API for fetching enhanced statistics.
> It uses a new IOCTL to fetch the statistical data directly from the
> frontend in one go. This proposal has so far been rejected by Mauro,
> who wants to use the S2API get/set calls instead.
>
> Now, to give everyone a quick overview of the advantages and
> disadvantages of both approaches, I will try to summarize them:
>
> 1st approach: Introduce new IOCTL
>
> Pros:
> - Allows a quick fetch of the entire dataset, which ensures that:
> 1. all values are fetched in one go (as far as possible) from the
> frontend, so that they can be treated as one unit and evaluated in
> conjunction
> 2. the requested values reach the caller within an almost constant
> timeframe, as the ioctl is executed directly by the driver
> - It does not interfere with the existing statistics API, which has to
> be kept alive as it has been in use for a long time already. (Unifying
> its data is not the goal of this new API.)
>
> Cons:
> - Forces application developers to interact with two APIs: the
> S2API for tuning on the one hand and the statistics API for reading
> signal values on the other.
>
> 2nd approach: Set up S2API calls for reading statistics
>
> Pros:
> - Continued unification of the linuxtv API, allowing all calls to be
> made through one API -> easy for application developers
>
> Cons:
> - Due to the key/value pairs used for the S2API getters, the statistical
> values can't be read as a single block, so we lose the guarantee
> that all of the values can be treated as one unit expressing the
> signal's state at a concrete point in time.
> - Due to the general architecture of the S2API, the delay between
> requesting and receiving the actual data could become too big to allow
> realtime interpretation of the data (as needed for applications
> like satellite finders, etc.)
>
> I hope this summary is technically correct on all points; if not,
> I'd be thankful for objective corrections. I am not going to
> articulate my personal opinion in this mail, but I will do so in
> another mail in reply to this one, so that this mail can be seen as a
> neutral summary of the current situation.
>
> So now it's everyone's turn to think about the options and give an
> opinion. In the end I am quite sure that all of us would benefit
> greatly from an improved statistics API.
>
> Cheers,
> Julian
The above mail of course has nothing to do with USB-Statistics. I must
have been slightly confused when typing the message title! Sorry for that!
* Re: New DVB-Statistics API
2009-12-08 9:16 New USB-Statistics API Julian Scheel
2009-12-08 12:24 ` New DVB(!!)-Statistics API Julian Scheel
@ 2009-12-08 13:22 ` Mauro Carvalho Chehab
2009-12-08 21:46 ` Manu Abraham
2010-02-03 9:49 ` New DVB-Statistics API - please vote Mauro Carvalho Chehab
1 sibling, 2 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2009-12-08 13:22 UTC (permalink / raw)
To: Julian Scheel; +Cc: linux-media
Hi Julian,
Let me add some corrections to your technical analysis.
Julian Scheel wrote:
> Hello everyone,
>
> after the last thread asking about signal statistics details
> degenerated into a discussion about the technical possibilities for
> implementing an entirely new API, which has led to nothing so far, I wanted
> to open a new thread to move this forward. Maybe some more people can
> give their votes for the different options.
>
> Manu has proposed a new API for fetching enhanced statistics. It
> uses a new IOCTL to fetch the statistical data directly from the
> frontend in one go. This proposal has so far been rejected by Mauro,
> who wants to use the S2API get/set calls instead.
>
> Now, to give everyone a quick overview of the advantages and
> disadvantages of both approaches, I will try to summarize them:
>
> 1st approach: Introduce new IOCTL
>
> Pros:
> - Allows a quick fetch of the entire dataset, which ensures that:
> 1. all values are fetched in one go (as far as possible) from the
> frontend, so that they can be treated as one unit and evaluated in
> conjunction
> 2. the requested values reach the caller within an almost constant
> timeframe, as the ioctl is executed directly by the driver
> - It does not interfere with the existing statistics API, which has to
> be kept alive as it has been in use for a long time already. (Unifying
> its data is not the goal of this new API.)
>
> Cons:
> - Forces application developers to interact with two APIs: the S2API
> for tuning on the one hand and the statistics API for reading signal
> values on the other.
>
> 2nd approach: Set up S2API calls for reading statistics
>
> Pros:
> - Continued unification of the linuxtv API, allowing all calls to be
> made through one API -> easy for application developers
- Scaling values can be retrieved/negotiated (if we implement the set
mode) before requesting the stats, using the same API;
- API can be easily extended to support other statistics that may be needed
by newer DTV standards.
>
> Cons:
> - Due to the key/value pairs used for the S2API getters, the statistical
> values can't be read as a single block, so we lose the guarantee that
> all of the values can be treated as one unit expressing the signal's
> state at a concrete point in time.
You missed the point here. The proposed patch groups all S2API
pairs into a _single_ call into the driver:
> + for (i = 0; i < tvps->num; i++)
> + need_get_ops += dtv_property_prepare_get_stats(fe,
> + tvp + i, inode, file);
> +
> + if (!fe->dtv_property_cache.need_stats) {
> + need_get_ops++;
> + } else {
> + if (fe->ops.get_stats) {
> + err = fe->ops.get_stats(fe);
> + if (err < 0)
> + return err;
> + }
> + }
dtv_property_prepare_get_stats() generates a bitmap field (need_stats) that
describes all value pairs that userspace is expecting. After that,
a single call is made to the get_stats() callback.
All the driver needs to do is fill the values in dtv_property_cache. If the driver
fills more values than requested by the user, the extra values will simply be discarded.
In order to reduce latency, the driver may opt not to read the register values for
data that isn't requested by the user, as I did in the cx24123 driver.
Those values will all be returned to userspace at the same time by a single copy_to_user()
operation.
> - Due to the general architecture of the S2API, the delay between
> requesting and receiving the actual data could become too big to allow
> realtime interpretation of the data (as needed for applications
> like satellite finders, etc.)
Not true. As pointed out in the previous answer, the difference between a new ioctl
and S2API is basically the code in dtv_property_prepare_get_stats() and
dtv_property_process_get(). This is pure CPU code operating on a contiguous struct
that will likely be in the L3 cache, inside the CPU chip, so this code will run
really quickly.
As current CPUs run on the order of teraflops (CPU clocks are in the gigahertz
range, and CPUs can handle multiple instructions per clock cycle), the added delay
is on the order of nanoseconds.
On the other hand, a simple read of a value from an I2C device is on the order
of milliseconds, since the I2C serial bus is slow (typically operating at
100 kbit/s).
So, the delay is determined by the number of I2C calls in the code.
With the new ioctl proposal, as you need to read all data over I2C (even the data
that userspace doesn't need), you'll have two situations:
- if you use an S2API call to request all the data provided by the ioctl approach,
the delay will be the same;
- if you use an S2API call to request fewer stats, the S2API delay will
be shorter.
For example, with cx24123, the S2API delay will be 6 times shorter than the
ioctl if you request just the signal strength - as just one read is needed
to get the signal strength, while you need to do 6 reads to get all 3 stats.
So, if you want to do some realtime usage where delay is the determining factor, a call
via S2API containing just the values you need will be better than the new
ioctl call.
The only con I can think of is that the S2API payload for a complete retrieval
of all stats will be a little bigger.
Cheers,
Mauro.
* Re: New DVB-Statistics API
2009-12-08 13:22 ` New DVB-Statistics API Mauro Carvalho Chehab
@ 2009-12-08 21:46 ` Manu Abraham
2009-12-08 23:43 ` Mauro Carvalho Chehab
2010-02-03 9:49 ` New DVB-Statistics API - please vote Mauro Carvalho Chehab
1 sibling, 1 reply; 12+ messages in thread
From: Manu Abraham @ 2009-12-08 21:46 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: Julian Scheel, linux-media
On Tue, Dec 8, 2009 at 5:22 PM, Mauro Carvalho Chehab
<mchehab@redhat.com> wrote:
> Hi Julian,
>
> Let me add some corrections to your technical analysis.
>
> Julian Scheel wrote:
>> Hello everyone,
>>
>> after the last thread asking about signal statistics details
>> degenerated into a discussion about the technical possibilities for
>> implementing an entirely new API, which has led to nothing so far, I wanted
>> to open a new thread to move this forward. Maybe some more people can
>> give their votes for the different options.
>>
>> Manu has proposed a new API for fetching enhanced statistics. It
>> uses a new IOCTL to fetch the statistical data directly from the
>> frontend in one go. This proposal has so far been rejected by Mauro,
>> who wants to use the S2API get/set calls instead.
>>
>> Now, to give everyone a quick overview of the advantages and
>> disadvantages of both approaches, I will try to summarize them:
>>
>> 1st approach: Introduce new IOCTL
>>
>> Pros:
>> - Allows a quick fetch of the entire dataset, which ensures that:
>> 1. all values are fetched in one go (as far as possible) from the
>> frontend, so that they can be treated as one unit and evaluated in
>> conjunction
>> 2. the requested values reach the caller within an almost constant
>> timeframe, as the ioctl is executed directly by the driver
>> - It does not interfere with the existing statistics API, which has to
>> be kept alive as it has been in use for a long time already. (Unifying
>> its data is not the goal of this new API.)
>>
>> Cons:
>> - Forces application developers to interact with two APIs: the S2API
>> for tuning on the one hand and the statistics API for reading signal
>> values on the other.
>>
>> 2nd approach: Set up S2API calls for reading statistics
>>
>> Pros:
>> - Continued unification of the linuxtv API, allowing all calls to be
>> made through one API -> easy for application developers
>
> - Scaling values can be retrieved/negotiated (if we implement the set
> mode) before requesting the stats, using the same API;
>
> - API can be easily extended to support other statistics that may be needed
> by newer DTV standards.
>
>>
>> Cons:
>> - Due to the key/value pairs used for the S2API getters, the statistical
>> values can't be read as a single block, so we lose the guarantee that
>> all of the values can be treated as one unit expressing the signal's
>> state at a concrete point in time.
>
> You missed the point here. The proposed patch groups all S2API
> pairs into a _single_ call into the driver:
>
>> + for (i = 0; i < tvps->num; i++)
>> + need_get_ops += dtv_property_prepare_get_stats(fe,
>> + tvp + i, inode, file);
>> +
>> + if (!fe->dtv_property_cache.need_stats) {
>> + need_get_ops++;
>> + } else {
>> + if (fe->ops.get_stats) {
>> + err = fe->ops.get_stats(fe);
>> + if (err < 0)
>> + return err;
>> + }
>> + }
>
> dtv_property_prepare_get_stats() generates a bitmap field (need_stats) that
> describes all value pairs that userspace is expecting. After that,
> a single call is made to the get_stats() callback.
>
> All the driver needs to do is fill the values in dtv_property_cache. If the driver
> fills more values than requested by the user, the extra values will simply be discarded.
>
> In order to reduce latency, the driver may opt not to read the register values for
> data that isn't requested by the user, as I did in the cx24123 driver.
>
> Those values will all be returned to userspace at the same time by a single copy_to_user()
> operation.
>
>> - Due to the general architecture of the S2API, the delay between
>> requesting and receiving the actual data could become too big to allow
>> realtime interpretation of the data (as needed for applications
>> like satellite finders, etc.)
>
> Not true. As pointed out in the previous answer, the difference between a new ioctl
> and S2API is basically the code in dtv_property_prepare_get_stats() and
> dtv_property_process_get(). This is pure CPU code operating on a contiguous struct
> that will likely be in the L3 cache, inside the CPU chip, so this code will run
> really quickly.
AFAIK, cache eviction occurs with syscalls: content in the
caches near the CPU cores is pushed out to the outer levels, resulting in
cache misses. Not all CPUs are equipped with L3 caches. Continuous
syscalls can also result in TLB misses, from what I understand, which
is expensive.
These are the numbers Intel lists for a Pentium M:

To             Cycles
Register       <= 1
L1d            ~3
L2             ~14
Main Memory    ~240
> As current CPUs run on the order of teraflops (CPU clocks are in the gigahertz
> range, and CPUs can handle multiple instructions per clock cycle), the added delay
> is on the order of nanoseconds.
Consider STBs, where DVB is generally deployed, rather than the small
segment of home users running a stack on a generic PC.
> On the other hand, a simple read of a value from an I2C device is on the order
> of milliseconds, since the I2C serial bus is slow (typically operating at
> 100 kbit/s).
>
> So, the delay is determined by the number of I2C calls in the code.
Hardware I/O is the most expensive operation involved. The number of
I2C calls alone is not grounds for comparison; see the latter part of
this mail.
> With the new ioctl proposal, as you need to read all data over I2C (even the data
> that userspace doesn't need), you'll have two situations:
> - if you use an S2API call to request all the data provided by the ioctl approach,
> the delay will be the same;
> - if you use an S2API call to request fewer stats, the S2API delay will
> be shorter.
>
> For example, with cx24123, the S2API delay will be 6 times shorter than the
> ioctl if you request just the signal strength - as just one read is needed
> to get the signal strength, while you need to do 6 reads to get all 3 stats.
>
> So, if you want to do some realtime usage where delay is the determining factor, a call
> via S2API containing just the values you need will be better than the new
> ioctl call.
Case #1: The ioctl approach
+struct fecap_statistics {
+ struct fecap_quality quality;
+ struct fecap_strength strength;
+ struct fecap_error error;
+ enum fecap_unc_params unc;
+};
+/* FE_SIGNAL_STATS
+ * This system call provides a snapshot of all the receiver system
+ * at any given point of time. System signal statistics are always
+ * computed with respect to time and is best obtained the nearest
+ * to each of the individual parameters in a time domain.
+ * Signal statistics are assumed, "at any given instance of time".
+ * It is not possible to get a snapshot at the exact single instance
+ * and hence we look at the nearest instance, in the time domain.
+ * The statistics are described by the FE_STATISTICS_CAPS ioctl,
+ * ie. based on the device capabilities.
+ */
+#define FE_SIGNAL_STATS _IOR('o', 86, struct fesignal_stat)
+ case FE_SIGNAL_STATS:
+ if (fe->ops.read_stats)
+ err = fe->ops.read_stats(fe, (struct fesignal_stat *) parg);
+ break;
+static int stb0899_read_stats(struct dvb_frontend *fe, struct fesignal_stat *stats)
+{
+}
This is as simple as it can get.
Now Case #2: based on S2API
struct dtv_property {
__u32 cmd;
__u32 reserved[3];
union {
__u32 data;
struct {
__u8 data[32];
__u32 len;
__u32 reserved1[3];
void *reserved2;
} buffer;
} u;
int result;
} __attribute__ ((packed));
struct dtv_properties {
__u32 num;
struct dtv_property *props;
};
#define FE_GET_PROPERTY _IOR('o', 83, struct dtv_properties)
#define _DTV_CMD(n, s, b) \
[n] = { \
.name = #n, \
.cmd = n, \
.set = s,\
.buffer = b \
}
static int dtv_property_process_get(struct dvb_frontend *fe,
struct dtv_property *tvp,
struct inode *inode, struct file *file)
{
int r = 0;
dtv_property_dump(tvp);
/* Allow the frontend to validate incoming properties */
if (fe->ops.get_property)
r = fe->ops.get_property(fe, tvp);
if (r < 0)
return r;
switch(tvp->cmd) {
case DTV_FREQUENCY:
tvp->u.data = fe->dtv_property_cache.frequency;
break;
case DTV_MODULATION:
tvp->u.data = fe->dtv_property_cache.modulation;
break;
case DTV_BANDWIDTH_HZ:
tvp->u.data = fe->dtv_property_cache.bandwidth_hz;
break;
case DTV_INVERSION:
tvp->u.data = fe->dtv_property_cache.inversion;
break;
case DTV_SYMBOL_RATE:
tvp->u.data = fe->dtv_property_cache.symbol_rate;
break;
case DTV_INNER_FEC:
tvp->u.data = fe->dtv_property_cache.fec_inner;
break;
case DTV_PILOT:
tvp->u.data = fe->dtv_property_cache.pilot;
break;
case DTV_ROLLOFF:
tvp->u.data = fe->dtv_property_cache.rolloff;
break;
case DTV_DELIVERY_SYSTEM:
tvp->u.data = fe->dtv_property_cache.delivery_system;
break;
case DTV_VOLTAGE:
tvp->u.data = fe->dtv_property_cache.voltage;
break;
case DTV_TONE:
tvp->u.data = fe->dtv_property_cache.sectone;
break;
case DTV_API_VERSION:
tvp->u.data = (DVB_API_VERSION << 8) | DVB_API_VERSION_MINOR;
break;
case DTV_CODE_RATE_HP:
tvp->u.data = fe->dtv_property_cache.code_rate_HP;
break;
case DTV_CODE_RATE_LP:
tvp->u.data = fe->dtv_property_cache.code_rate_LP;
break;
case DTV_GUARD_INTERVAL:
tvp->u.data = fe->dtv_property_cache.guard_interval;
break;
case DTV_TRANSMISSION_MODE:
tvp->u.data = fe->dtv_property_cache.transmission_mode;
break;
case DTV_HIERARCHY:
tvp->u.data = fe->dtv_property_cache.hierarchy;
break;
/* ISDB-T Support here */
case DTV_ISDBT_PARTIAL_RECEPTION:
tvp->u.data = fe->dtv_property_cache.isdbt_partial_reception;
break;
case DTV_ISDBT_SOUND_BROADCASTING:
tvp->u.data = fe->dtv_property_cache.isdbt_sb_mode;
break;
case DTV_ISDBT_SB_SUBCHANNEL_ID:
tvp->u.data = fe->dtv_property_cache.isdbt_sb_subchannel;
break;
case DTV_ISDBT_SB_SEGMENT_IDX:
tvp->u.data = fe->dtv_property_cache.isdbt_sb_segment_idx;
break;
case DTV_ISDBT_SB_SEGMENT_COUNT:
tvp->u.data = fe->dtv_property_cache.isdbt_sb_segment_count;
break;
case DTV_ISDBT_LAYER_ENABLED:
tvp->u.data = fe->dtv_property_cache.isdbt_layer_enabled;
break;
case DTV_ISDBT_LAYERA_FEC:
tvp->u.data = fe->dtv_property_cache.layer[0].fec;
break;
case DTV_ISDBT_LAYERA_MODULATION:
tvp->u.data = fe->dtv_property_cache.layer[0].modulation;
break;
case DTV_ISDBT_LAYERA_SEGMENT_COUNT:
tvp->u.data = fe->dtv_property_cache.layer[0].segment_count;
break;
case DTV_ISDBT_LAYERA_TIME_INTERLEAVING:
tvp->u.data = fe->dtv_property_cache.layer[0].interleaving;
break;
case DTV_ISDBT_LAYERB_FEC:
tvp->u.data = fe->dtv_property_cache.layer[1].fec;
break;
case DTV_ISDBT_LAYERB_MODULATION:
tvp->u.data = fe->dtv_property_cache.layer[1].modulation;
break;
case DTV_ISDBT_LAYERB_SEGMENT_COUNT:
tvp->u.data = fe->dtv_property_cache.layer[1].segment_count;
break;
case DTV_ISDBT_LAYERB_TIME_INTERLEAVING:
tvp->u.data = fe->dtv_property_cache.layer[1].interleaving;
break;
case DTV_ISDBT_LAYERC_FEC:
tvp->u.data = fe->dtv_property_cache.layer[2].fec;
break;
case DTV_ISDBT_LAYERC_MODULATION:
tvp->u.data = fe->dtv_property_cache.layer[2].modulation;
break;
case DTV_ISDBT_LAYERC_SEGMENT_COUNT:
tvp->u.data = fe->dtv_property_cache.layer[2].segment_count;
break;
case DTV_ISDBT_LAYERC_TIME_INTERLEAVING:
tvp->u.data = fe->dtv_property_cache.layer[2].interleaving;
break;
case DTV_ISDBS_TS_ID:
tvp->u.data = fe->dtv_property_cache.isdbs_ts_id;
break;
default:
r = -1;
}
return r;
}
static int dvb_frontend_ioctl_properties(struct inode *inode, struct file *file,
unsigned int cmd, void *parg)
{
struct dvb_device *dvbdev = file->private_data;
struct dvb_frontend *fe = dvbdev->priv;
int err = 0;
struct dtv_properties *tvps = NULL;
struct dtv_property *tvp = NULL;
int i;
dprintk("%s\n", __func__);
if(cmd == FE_SET_PROPERTY) {
tvps = (struct dtv_properties __user *)parg;
dprintk("%s() properties.num = %d\n", __func__, tvps->num);
dprintk("%s() properties.props = %p\n", __func__, tvps->props);
/* Put an arbitrary limit on the number of messages that can
* be sent at once */
if ((tvps->num == 0) || (tvps->num > DTV_IOCTL_MAX_MSGS))
return -EINVAL;
tvp = (struct dtv_property *) kmalloc(tvps->num *
sizeof(struct dtv_property), GFP_KERNEL);
if (!tvp) {
err = -ENOMEM;
goto out;
}
if (copy_from_user(tvp, tvps->props, tvps->num * sizeof(struct
dtv_property))) {
err = -EFAULT;
goto out;
}
for (i = 0; i < tvps->num; i++) {
(tvp + i)->result = dtv_property_process_set(fe, tvp + i, inode, file);
err |= (tvp + i)->result;
}
if(fe->dtv_property_cache.state == DTV_TUNE)
dprintk("%s() Property cache is full, tuning\n", __func__);
} else
if(cmd == FE_GET_PROPERTY) {
tvps = (struct dtv_properties __user *)parg;
dprintk("%s() properties.num = %d\n", __func__, tvps->num);
dprintk("%s() properties.props = %p\n", __func__, tvps->props);
/* Put an arbitrary limit on the number of messages that can
* be sent at once */
if ((tvps->num == 0) || (tvps->num > DTV_IOCTL_MAX_MSGS))
return -EINVAL;
tvp = (struct dtv_property *) kmalloc(tvps->num *
sizeof(struct dtv_property), GFP_KERNEL);
if (!tvp) {
err = -ENOMEM;
goto out;
}
if (copy_from_user(tvp, tvps->props, tvps->num * sizeof(struct
dtv_property))) {
err = -EFAULT;
goto out;
}
for (i = 0; i < tvps->num; i++) {
(tvp + i)->result = dtv_property_process_get(fe, tvp + i, inode, file);
err |= (tvp + i)->result;
}
if (copy_to_user(tvps->props, tvp, tvps->num * sizeof(struct dtv_property))) {
err = -EFAULT;
goto out;
}
} else
err = -EOPNOTSUPP;
out:
kfree(tvp);
return err;
}
+static int dtv_property_prepare_get_stats(struct dvb_frontend *fe,
+ struct dtv_property *tvp,
+ struct inode *inode, struct file *file)
+{
+ switch (tvp->cmd) {
+ case DTV_FE_QUALITY:
+ fe->dtv_property_cache.need_stats |= FE_NEED_QUALITY;
+ break;
+ case DTV_FE_QUALITY_UNIT:
+ fe->dtv_property_cache.need_stats |= FE_NEED_QUALITY_UNIT;
+ break;
+ case DTV_FE_STRENGTH:
+ fe->dtv_property_cache.need_stats |= FE_NEED_STRENGTH;
+ break;
+ case DTV_FE_STRENGTH_UNIT:
+ fe->dtv_property_cache.need_stats |= FE_NEED_STRENGTH_UNIT;
+ break;
+ case DTV_FE_ERROR:
+ fe->dtv_property_cache.need_stats |= FE_NEED_ERROR;
+ break;
+ case DTV_FE_ERROR_UNIT:
+ fe->dtv_property_cache.need_stats |= FE_NEED_ERROR_UNIT;
+ break;
+ case DTV_FE_SIGNAL:
+ fe->dtv_property_cache.need_stats |= FE_NEED_SIGNAL;
+ break;
+ case DTV_FE_SIGNAL_UNIT:
+ fe->dtv_property_cache.need_stats |= FE_NEED_SIGNAL_UNIT;
+ break;
+ case DTV_FE_UNC:
+ fe->dtv_property_cache.need_stats |= FE_NEED_SIGNAL;
+ break;
+ case DTV_FE_UNC_UNIT:
+ fe->dtv_property_cache.need_stats |= FE_NEED_SIGNAL_UNIT;
+ break;
+ default:
+ return 1;
+ };
+
+ return 0;
+}
+
static int dtv_property_process_get(struct dvb_frontend *fe,
struct dtv_property *tvp,
- struct inode *inode, struct file *file)
+ struct inode *inode, struct file *file,
+ int need_get_ops)
{
int r = 0;
dtv_property_dump(tvp);
/* Allow the frontend to validate incoming properties */
- if (fe->ops.get_property)
+ if (fe->ops.get_property && need_get_ops)
r = fe->ops.get_property(fe, tvp);
if (r < 0)
@@ -1329,6 +1382,38 @@ static int dtv_property_process_get(stru
case DTV_ISDBS_TS_ID:
tvp->u.data = fe->dtv_property_cache.isdbs_ts_id;
break;
+
+ /* Quality measures */
+ case DTV_FE_QUALITY:
+ tvp->u.data = fe->dtv_property_cache.quality;
+ break;
+ case DTV_FE_QUALITY_UNIT:
+ tvp->u.data = fe->dtv_property_cache.quality_unit;
+ break;
+ case DTV_FE_STRENGTH:
+ tvp->u.data = fe->dtv_property_cache.strength;
+ break;
+ case DTV_FE_STRENGTH_UNIT:
+ tvp->u.data = fe->dtv_property_cache.strength_unit;
+ break;
+ case DTV_FE_ERROR:
+ tvp->u.data = fe->dtv_property_cache.error;
+ break;
+ case DTV_FE_ERROR_UNIT:
+ tvp->u.data = fe->dtv_property_cache.error_unit;
+ break;
+ case DTV_FE_SIGNAL:
+ tvp->u.data = fe->dtv_property_cache.signal;
+ break;
+ case DTV_FE_SIGNAL_UNIT:
+ tvp->u.data = fe->dtv_property_cache.signal_unit;
+ break;
+ case DTV_FE_UNC:
+ tvp->u.data = fe->dtv_property_cache.signal;
+ break;
+ case DTV_FE_UNC_UNIT:
+ tvp->u.data = fe->dtv_property_cache.signal_unit;
+ break;
default:
r = -1;
}
@@ -1527,7 +1612,7 @@ static int dvb_frontend_ioctl_properties
{
struct dvb_device *dvbdev = file->private_data;
struct dvb_frontend *fe = dvbdev->priv;
- int err = 0;
+ int err = 0, need_get_ops;
struct dtv_properties *tvps = NULL;
struct dtv_property *tvp = NULL;
@@ -1591,8 +1676,29 @@ static int dvb_frontend_ioctl_properties
goto out;
}
+ /*
+ * Do all get operations at once, instead of handling them
+ * individually
+ */
+ need_get_ops = 0;
+ fe->dtv_property_cache.need_stats = 0;
+ for (i = 0; i < tvps->num; i++)
+ need_get_ops += dtv_property_prepare_get_stats(fe,
+ tvp + i, inode, file);
+
+ if (!fe->dtv_property_cache.need_stats) {
+ need_get_ops++;
+ } else {
+ if (fe->ops.get_stats) {
+ err = fe->ops.get_stats(fe);
+ if (err < 0)
+ return err;
+ }
+ }
+
for (i = 0; i < tvps->num; i++) {
- (tvp + i)->result = dtv_property_process_get(fe, tvp + i, inode, file);
+ (tvp + i)->result = dtv_property_process_get(fe,
+ tvp + i, inode, file, need_get_ops);
err |= (tvp + i)->result;
}
Now that we can see the actual code flow, we can see that the S2API
approach requires an unnecessarily large tokenizer/serializer, involving
multiple function calls. Additionally, you would need to clear the
cache, similarly to:
static int dvb_frontend_clear_cache(struct dvb_frontend *fe)
{
int i;
memset(&(fe->dtv_property_cache), 0,
sizeof(struct dtv_frontend_properties));
fe->dtv_property_cache.state = DTV_CLEAR;
fe->dtv_property_cache.delivery_system = SYS_UNDEFINED;
fe->dtv_property_cache.inversion = INVERSION_AUTO;
fe->dtv_property_cache.fec_inner = FEC_AUTO;
fe->dtv_property_cache.transmission_mode = TRANSMISSION_MODE_AUTO;
fe->dtv_property_cache.bandwidth_hz = BANDWIDTH_AUTO;
fe->dtv_property_cache.guard_interval = GUARD_INTERVAL_AUTO;
fe->dtv_property_cache.hierarchy = HIERARCHY_AUTO;
fe->dtv_property_cache.symbol_rate = QAM_AUTO;
fe->dtv_property_cache.code_rate_HP = FEC_AUTO;
fe->dtv_property_cache.code_rate_LP = FEC_AUTO;
fe->dtv_property_cache.isdbt_partial_reception = -1;
fe->dtv_property_cache.isdbt_sb_mode = -1;
fe->dtv_property_cache.isdbt_sb_subchannel = -1;
fe->dtv_property_cache.isdbt_sb_segment_idx = -1;
fe->dtv_property_cache.isdbt_sb_segment_count = -1;
fe->dtv_property_cache.isdbt_layer_enabled = 0x7;
for (i = 0; i < 3; i++) {
fe->dtv_property_cache.layer[i].fec = FEC_AUTO;
fe->dtv_property_cache.layer[i].modulation = QAM_AUTO;
fe->dtv_property_cache.layer[i].interleaving = -1;
fe->dtv_property_cache.layer[i].segment_count = -1;
}
return 0;
}
Let's assume for the time being that you don't need the clear, to keep
things simple.
Aspect #1: Comparing Case #1 and Case #2, I guess anyone will agree
that Case #2 has much more code overhead in terms of the serialization
and the serialization set/unset.
Aspect #2: Comparing the data payload between Case #1 and Case #2, I
guess anyone can look at the size of the data structures and conclude
without a second thought that Case #2 has many times the payload size
of Case #1.
Also, there is no guarantee that all these function calls are going to
give you "in-sync" statistics data for realtime applications.
Now: We are talking about a small system such as a STB where this is
going to run.
Signal statistics are polled continuously in many applications to
check for LNB drift etc (some of the good applications do that).
Statistics are not just used alone to display a signal graph alone.
These STB's have very small CPU's in the order of 400MHz etc, but not
as one sees in a regular PC in the order of a few GHz and teraflop
operations. There are still smaller applications, what I am pointing
out here is a large majority of the small CPU user segment.
Another example is a headless IPTV headend. Small CPUs on it, too.
Another example of continuous polling is a gyro-based setup where a
rotor controls the antenna position, the position being derived
from the RF signal in question. Syscall rates are considerably higher
in this case.
As you can see, the serializer/tokenizer approach introduces an
unwanted/redundant additional load.
Other than that, we don't have so many parameters that we need
a serializer; it is just 4 parameters, 4 x 4 bytes.
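To put rough numbers on that, here is a sketch of the two payload shapes. The field names and per-entry sizes are illustrative, chosen to reproduce the 16-byte vs >300-byte figures used in this thread; they are not taken from an actual header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Case #1 sketch: one ioctl fills a flat struct in a single call. */
struct fe_signal_stats {
	uint32_t strength;	/* relative signal strength */
	uint32_t error;		/* error counter (e.g. BER) */
	uint32_t quality;	/* relative signal quality */
	uint32_t units;		/* packed unit descriptors */
};

/* Case #2 sketch: one S2API-style key/value entry per datum; each
 * entry carries a command, reserved words and a large value union,
 * padded here to a representative size. */
struct kv_property {
	uint32_t cmd;		/* which statistic is wanted */
	uint32_t reserved[3];
	uint8_t  data[56];	/* value union + padding */
	int32_t  result;
};

size_t ioctl_payload(void)
{
	return sizeof(struct fe_signal_stats);
}

size_t s2api_payload(unsigned nprops)
{
	return nprops * sizeof(struct kv_property);
}
```

With these assumed sizes, fetching all four statistics moves 16 bytes through the flat struct versus roughly 300 bytes through four key/value entries, which is the asymmetry being pointed at.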
Now, let me get back to your cx24123 example.
+static int cx24123_get_stats(struct dvb_frontend* fe)
+{
+ struct cx24123_state *state = fe->demodulator_priv;
+ struct dtv_frontend_properties *prop = &fe->dtv_property_cache;
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_STRENGTH) {
+ /* larger = better */
+ prop->strength = cx24123_readreg(state, 0x3b) << 8;
+ dprintk("Signal strength = %d\n", prop->strength);
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_STRENGTH;
+ }
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_STRENGTH_UNIT) {
+ /* no unit is defined for this strength value */
+ prop->strength_unit = FE_SCALE_UNKNOWN;
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_STRENGTH_UNIT;
+ }
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_ERROR) {
+ /* The true bit error rate is this value divided by
+ the window size (set as 256 * 255) */
+ prop->error = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
+ (cx24123_readreg(state, 0x1d) << 8 |
+ cx24123_readreg(state, 0x1e));
+
+ dprintk("BER = %d\n", prop->error);
+
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_ERROR;
+ }
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_ERROR_UNIT) {
+ /* the error counter is a bit error rate */
+ prop->error_unit = FE_ERROR_BER;
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_ERROR_UNIT;
+ }
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_QUALITY) {
+ /* Inverted raw Es/N0 count, totally bogus but better than the
+ BER threshold. */
+ prop->quality = 65535 - (((u16)cx24123_readreg(state, 0x18) << 8) |
+ (u16)cx24123_readreg(state, 0x19));
+
+ dprintk("read S/N index = %d\n", prop->quality);
+
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_QUALITY;
+ }
+
+ if (fe->dtv_property_cache.need_stats & FE_NEED_QUALITY_UNIT) {
+ /* quality is reported as an Es/N0-derived index */
+ prop->quality_unit = FE_QUALITY_EsNo;
+ fe->dtv_property_cache.need_stats &= ~FE_NEED_QUALITY_UNIT;
+ }
+
+ /* Check if userspace requested a parameter that we can't handle*/
+ if (fe->dtv_property_cache.need_stats)
+ return -EINVAL;
+
+ return 0;
+}
+
/*
* Configured to return the measurement of errors in blocks,
* because no UCBLOCKS value is available, so this value doubles up
@@ -897,43 +957,30 @@ static int cx24123_read_status(struct dv
*/
static int cx24123_read_ber(struct dvb_frontend *fe, u32 *ber)
{
- struct cx24123_state *state = fe->demodulator_priv;
+ fe->dtv_property_cache.need_stats = FE_NEED_ERROR;
+ cx24123_get_stats(fe);
- /* The true bit error rate is this value divided by
- the window size (set as 256 * 255) */
- *ber = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
- (cx24123_readreg(state, 0x1d) << 8 |
- cx24123_readreg(state, 0x1e));
-
- dprintk("BER = %d\n", *ber);
-
+ *ber = fe->dtv_property_cache.error;
return 0;
}
static int cx24123_read_signal_strength(struct dvb_frontend *fe,
u16 *signal_strength)
{
- struct cx24123_state *state = fe->demodulator_priv;
-
- /* larger = better */
- *signal_strength = cx24123_readreg(state, 0x3b) << 8;
-
- dprintk("Signal strength = %d\n", *signal_strength);
-
+ fe->dtv_property_cache.need_stats = FE_NEED_STRENGTH;
+ cx24123_get_stats(fe);
+ *signal_strength = fe->dtv_property_cache.strength;
return 0;
}
+
static int cx24123_read_snr(struct dvb_frontend *fe, u16 *snr)
{
- struct cx24123_state *state = fe->demodulator_priv;
-
/* Inverted raw Es/N0 count, totally bogus but better than the
- BER threshold. */
- *snr = 65535 - (((u16)cx24123_readreg(state, 0x18) << 8) |
- (u16)cx24123_readreg(state, 0x19));
-
- dprintk("read S/N index = %d\n", *snr);
-
+ BER threshold. */
+ fe->dtv_property_cache.need_stats = FE_NEED_QUALITY;
+ cx24123_get_stats(fe);
+ *snr = fe->dtv_property_cache.quality;
return 0;
}
Now, in any of your use cases, you are in fact using the same number
of I2C calls to get a snapshot of statistics at any point in time as
in the ioctl approach. So you don't get any benefit from the s2api
approach as far as I2C I/O time is concerned.
The only additional aspect you draw in by using a serializer (with
s2api) spread out over multiple function calls, in such a fast-call
use case, is that you add ambiguity about the time frame in which the
completed operation is presented back to the user.
So eventually, we need to consider using timing related information
sent back to the user to compensate for an ambiguous latency involved,
which makes things even more complex, ugly and unwieldy for fast and
small applications.
> The only cons I can think is that the S2API payload for a complete retrival of all
> stats will be a little bigger.
Yeah, this is another one of the side effects of using s2api: 16 bytes
in comparison to > 300 bytes.
Regards,
Manu
* Re: New DVB-Statistics API
2009-12-08 21:46 ` Manu Abraham
@ 2009-12-08 23:43 ` Mauro Carvalho Chehab
2009-12-09 11:42 ` Manu Abraham
0 siblings, 1 reply; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2009-12-08 23:43 UTC (permalink / raw)
To: Manu Abraham; +Cc: Julian Scheel, linux-media
Manu Abraham wrote:
>> Not true. As pointed at the previous answer, the difference between a new ioctl
>> and S2API is basically the code at dtv_property_prepare_get_stats() and
>> dtv_property_process_get(). This is a pure code that uses a continuous struct
>> that will likely be at L3 cache, inside the CPU chip. So, this code will run
>> really quickly.
>
>
>
> AFAIK, cache eviction occurs with syscalls: where content in the
> caches near the CPU cores is pushed to the outer cores, resulting in
> cache misses. Not all CPU's are equipped with L3 caches. Continuous
> syscalls can result in TLB cache misses from what i understand, which
> is expensive.
It is very likely that the contents of struct fe will go into the cache during the
syscall. I was conservative when I talked about L3. Depending on the cache sizes,
I wouldn't doubt that the needed fields of the fe struct will go into the L1 cache.
>> As current CPU's runs at the order of Teraflops (as the CPU clocks are at gigahertz
>> order, and CPU's can handle multiple instructions per clock cycle), the added delay
>> is in de order of nanosseconds.
>
>
> Consider STB's where DVB is generally deployed rather than the small
> segment of home users running a stack on a generic PC.
Even with an STB, let's assume a very slow CPU that runs at 100 MHz. So, the clock
period is 10 nanoseconds. Assuming that this CPU doesn't have a good pipeline, being
capable of handling only one instruction per cycle, you'll have one instruction executed
every 10 nanoseconds (as a reference, a Pentium 1 running at 133 MHz is faster than this).
An I2C I/O operation is of the order of 10^-3 seconds. Assuming that an additional delay of 10%
(10^-4 s) would badly affect realtime capability (which is very doubtful), this means that
the CPU can run up to 10,000 (!!!) instructions to generate such a delay. If you compile that code
and check the number of extra instructions, I bet it will be short enough not to cause any
practical effect.
So, even on such bad hardware, which is at least 20x slower than a netbook running at 1 GHz,
what determines the delay is the amount of I/O you're doing, and not the amount of extra
code.
> Hardware I/O is the most expensive operation involved.
True. That's what I said.
> Case #1: the ioctl approach
<code stripped>
>
> Now Case #2: based on s2api
<code stripped>
> Now that we can see the actual code flow, we can see the s2api
> approach requires an unnecessary large tokenizer/serializer, involving
> multiple function calls.
Are you seeing 10,000 assembler instructions or so there? If not, the size of the code is
not relevant.
Also: the gcc optimizer transforms switches into a binary tree. So, if you have 64
cases in a switch, it will require about 6 comparisons (log2(64)) to get a match.
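The comparison count follows from a balanced decision tree over the case labels. A tiny model of that estimate (this mirrors the ceil(log2 n) reasoning only; gcc's actual lowering may also emit jump tables for dense label sets):

```c
#include <assert.h>

/* Worst-case comparisons for a binary search over n sorted case
 * labels: ceil(log2(n)). */
static unsigned switch_cmps(unsigned n)
{
	unsigned cmps = 0;

	while (n > 1) {
		n = (n + 1) / 2;	/* halve the candidate set */
		cmps++;
	}
	return cmps;
}
```

So an 8-case switch costs 3 comparisons in the worst case, and a 64-case one costs 6.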
For example, taking a quick look at the code you've presented, let's just calculate
the number of operations in the dtv_property_process_get() routine, without
the debug stuff:
static int dtv_property_process_get() {
CMP (if fe->ops.get_property)
CMP (if r < 0) <==== This if only happens if the first one is executed. On my patch, it is not executed
(the code you posted is the one before my patch)
SWITCH (7 CMP's) <==== due to binary tree optimization done by gcc
MOV
}
So, that entire code (about 200 lines) has, in fact,
9 comparisons and one move instruction.
At dtv_property_prepare_get_stats(), the code is even cheaper: just a switch with 8
elements (log2(8) = 3), so 3 comparisons, and one move instruction.
The additional cost on dvb_frontend_ioctl_properties is:
2 MOVs
One loop calling dtv_property_prepare_get_stats() - up to 4 times to retrieve
all quality values
one INC
one CMP and function CALL (the same cost exists also with the struct)
One loop calling dtv_property_get_stats() - up to 4 times to retrieve
all quality values
So, if I've calculated it right, we're talking about 2+1+16+1+2+1+40 = 63 instructions.
So, the real executed code is very cheap. On that very slow CPU, the delay will be
about 63 x 10 nanoseconds = 630 nanoseconds of delay (220 ns before the
call and 410 ns after it).
This means a delay of 0.063% of the milliseconds-order time required for just one
I2C read (and, in general, you need more than one single read to get the stats).
Even if the code doubled in size, the latency wouldn't be affected.
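The arithmetic above can be written down as a sketch. The constants are the estimates from this message, not measurements:

```c
#include <assert.h>

#define CYCLE_NS	10ULL		/* 100 MHz CPU, assumed 1 instruction/cycle */
#define EXTRA_INSNS	63ULL		/* estimated extra S2API serializer instructions */
#define I2C_READ_NS	1000000ULL	/* one I2C read, order of 1 ms */

/* Extra CPU time spent in the S2API serializer path. */
static unsigned long long overhead_ns(void)
{
	return EXTRA_INSNS * CYCLE_NS;
}

/* The same overhead expressed as parts per million of one I2C read. */
static unsigned long long overhead_ppm(void)
{
	return overhead_ns() * 1000000ULL / I2C_READ_NS;
}
```

That yields 630 ns of overhead, i.e. 630 ppm (0.063%) of a single millisecond-order I2C read.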
> Aspect #1. Comparing Case #1 and Case #2, i guess anyone will agree
> that case 2 has very much code overhead in terms of the serialization
> and serialization set/unset.
It has some overhead, but if you are concerned about the realtime case,
just run kernel perf tools. You'll see my point. If you have less I/O
to do (to retrieve just the data you need), Case #2 is much faster than Case #1.
> Aspect #2, Comparing the data payload between Case #1 and Case #2, i
> guess anyone can see the size of the data structure and come to a
> conclusion without a second thought that case #2 has many many times
> the payload size as in Case #1
True. The payload is bigger, but not that big. The payload is the data that
goes from the kernel API to the driver and vice versa (payload being defined
as the amount of data used to pass information from one layer to
another).
> Also there is no guarantee that all these function calls are going to
> give you "in-sync" statistics data for real time applications.
Why not? The guarantee is equal on both cases, since it will depend on how
the callback is written. The extra delay of 410 nanoseconds won't be
enough to cause any change on it.
> Now: We are talking about a small system such as a STB where this is
> going to run.
>
> Signal statistics are polled continuously in many applications to
> check for LNB drift etc (some of the good applications do that).
> Statistics are not just used alone to display a signal graph alone.
> These STB's have very small CPU's in the order of 400MHz etc,
Good! 4 times faster than my example, if the CPU can run one
instruction per cycle.
So, the delay after the call is 102.5 ns on this CPU.
> but not
> as one sees in a regular PC in the order of a few GHz and teraflop
> operations. There are still smaller applications, what I am pointing
> out here is a large majority of the small CPU user segment.
>
> Another example is a headless IPTV headend. Small CPU's on it too..
>
> Another example of a continuous polling is a gyro based setup where a
> rotor controls the antenna position, the position of which is based
> from the RF signal in question. Syscalls rates are considerably higher
> in this case.
Very likely, you won't need to get all 4 stats in this example. So, S2API
will be faster, since a single millisecond-order read is about 9756x slower than 102.5 ns.
> As you can see, the serializer/tokenizer approach introduces an
> unwanted/redundant additional load.
> Other than that, we don't have numerous parameters that we are in need
> of a serializer to handle 4 parameters. 4 x 4bytes.
>
>
> Now, let me get back to your cx24123 example.
>
> +static int cx24123_get_stats(struct dvb_frontend* fe)
> +{
> + struct cx24123_state *state = fe->demodulator_priv;
> + struct dtv_frontend_properties *prop = &fe->dtv_property_cache;
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_STRENGTH) {
> + /* larger = better */
> + prop->strength = cx24123_readreg(state, 0x3b) << 8;
> + dprintk("Signal strength = %d\n", prop->strength);
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_STRENGTH;
> + }
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_STRENGTH_UNIT) {
> + /* no unit is defined for this strength value */
> + prop->strength_unit = FE_SCALE_UNKNOWN;
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_STRENGTH_UNIT;
> + }
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_ERROR) {
> + /* The true bit error rate is this value divided by
> + the window size (set as 256 * 255) */
> + prop->error = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
> + (cx24123_readreg(state, 0x1d) << 8 |
> + cx24123_readreg(state, 0x1e));
> +
> + dprintk("BER = %d\n", prop->error);
> +
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_ERROR;
> + }
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_ERROR_UNIT) {
> + /* the error counter is a bit error rate */
> + prop->error_unit = FE_ERROR_BER;
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_ERROR_UNIT;
> + }
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_QUALITY) {
> + /* Inverted raw Es/N0 count, totally bogus but better than the
> + BER threshold. */
> + prop->quality = 65535 - (((u16)cx24123_readreg(state, 0x18) << 8) |
> + (u16)cx24123_readreg(state, 0x19));
> +
> + dprintk("read S/N index = %d\n", prop->quality);
> +
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_QUALITY;
> + }
> +
> + if (fe->dtv_property_cache.need_stats & FE_NEED_QUALITY_UNIT) {
> + /* quality is reported as an Es/N0-derived index */
> + prop->quality_unit = FE_QUALITY_EsNo;
> + fe->dtv_property_cache.need_stats &= ~FE_NEED_QUALITY_UNIT;
> + }
> +
> + /* Check if userspace requested a parameter that we can't handle*/
> + if (fe->dtv_property_cache.need_stats)
> + return -EINVAL;
> +
> + return 0;
> +}
> +
>
> /*
> * Configured to return the measurement of errors in blocks,
> * because no UCBLOCKS value is available, so this value doubles up
> @@ -897,43 +957,30 @@ static int cx24123_read_status(struct dv
> */
> static int cx24123_read_ber(struct dvb_frontend *fe, u32 *ber)
> {
> - struct cx24123_state *state = fe->demodulator_priv;
> + fe->dtv_property_cache.need_stats = FE_NEED_ERROR;
> + cx24123_get_stats(fe);
>
> - /* The true bit error rate is this value divided by
> - the window size (set as 256 * 255) */
> - *ber = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
> - (cx24123_readreg(state, 0x1d) << 8 |
> - cx24123_readreg(state, 0x1e));
> -
> - dprintk("BER = %d\n", *ber);
> -
> + *ber = fe->dtv_property_cache.error;
> return 0;
> }
>
> static int cx24123_read_signal_strength(struct dvb_frontend *fe,
> u16 *signal_strength)
> {
> - struct cx24123_state *state = fe->demodulator_priv;
> -
> - /* larger = better */
> - *signal_strength = cx24123_readreg(state, 0x3b) << 8;
> -
> - dprintk("Signal strength = %d\n", *signal_strength);
> -
> + fe->dtv_property_cache.need_stats = FE_NEED_STRENGTH;
> + cx24123_get_stats(fe);
> + *signal_strength = fe->dtv_property_cache.strength;
> return 0;
> }
>
> +
> static int cx24123_read_snr(struct dvb_frontend *fe, u16 *snr)
> {
> - struct cx24123_state *state = fe->demodulator_priv;
> -
> /* Inverted raw Es/N0 count, totally bogus but better than the
> - BER threshold. */
> - *snr = 65535 - (((u16)cx24123_readreg(state, 0x18) << 8) |
> - (u16)cx24123_readreg(state, 0x19));
> -
> - dprintk("read S/N index = %d\n", *snr);
> -
> + BER threshold. */
> + fe->dtv_property_cache.need_stats = FE_NEED_QUALITY;
> + cx24123_get_stats(fe);
> + *snr = fe->dtv_property_cache.quality;
> return 0;
> }
>
>
> Now, in any of your use cases, you are in fact using the same number
> of I2C calls to get a snapshot of statistics in any event of time, as
> in the case of the ioctl approach. So you don't get any benefit in
> using the s2api approach for I2C operation I/O time periods.
Not true.
If userspace requires only strength,
fe->dtv_property_cache.need_stats will be equal to FE_NEED_STRENGTH
and only one hardware I/O operation will occur:
prop->strength = cx24123_readreg(state, 0x3b) << 8;
If userspace requires both strength and ber,
fe->dtv_property_cache.need_stats = FE_NEED_ERROR | FE_NEED_STRENGTH, and
the code will run the following I/O:
prop->strength = cx24123_readreg(state, 0x3b) << 8;
prop->error = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
(cx24123_readreg(state, 0x1d) << 8 |
cx24123_readreg(state, 0x1e));
If the user requires all 3 stats, then we'll have:
prop->strength = cx24123_readreg(state, 0x3b) << 8;
prop->error = ((cx24123_readreg(state, 0x1c) & 0x3f) << 16) |
(cx24123_readreg(state, 0x1d) << 8 |
cx24123_readreg(state, 0x1e));
prop->quality = 65535 - (((u16)cx24123_readreg(state, 0x18) << 8) |
(u16)cx24123_readreg(state, 0x19));
Only in the last case, the delay will be equal to the one you would get with
Case #1 (since the delay will be determined by the 100 KHz I2C clock, and
not by the 400 MHz CPU clock).
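The saving can be made explicit by counting the I2C reads a cx24123-style handler issues per request mask. The flag values are illustrative; the per-stat read counts mirror the patch earlier in the thread:

```c
#include <assert.h>

#define FE_NEED_STRENGTH	0x01
#define FE_NEED_ERROR		0x02
#define FE_NEED_QUALITY		0x04

/* Count hardware reads for a given need_stats mask. */
static unsigned i2c_reads_for(unsigned need_stats)
{
	unsigned reads = 0;

	if (need_stats & FE_NEED_STRENGTH)
		reads += 1;	/* reg 0x3b */
	if (need_stats & FE_NEED_ERROR)
		reads += 3;	/* regs 0x1c..0x1e */
	if (need_stats & FE_NEED_QUALITY)
		reads += 2;	/* regs 0x18, 0x19 */
	return reads;
}
```

Requesting only strength costs one read; only when all stats are requested does the I/O cost match an always-read-everything ioctl.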
> The only additional aspect that you draw in using a serializer (when
> using s2api) spread out over multiple function calls, in such a fast
> call use case is that, you add in the ambiguity of the time frame in
> which the completed operation is presented back to the user.
There's only one function call: the call to cx24123_get_stats().
The other "calls" at the dvb core are merged into a single function by
the gcc optimizer (as they are static and are called only once inside the code).
> So eventually, we need to consider using timing related information
> sent back to the user to compensate for an ambiguous latency involved,
> which makes things even more complex, ugly and unwieldy for fast and
> small applications.
>
>> The only cons I can think is that the S2API payload for a complete retrival of all
>> stats will be a little bigger.
>
> Yeah, this is another one of the side effects of using s2api, 16 bytes
> in comparison to > 300 bytes
It seems that you didn't get the calculation right here. There are two payloads involved:
1) the callback to the driver:
Case #1:
static int cx24123_get_stats(struct dvb_frontend* fe)
Case #2:
static int stb0899_read_stats(struct dvb_frontend *fe, struct fesignal_stat *stats)
payload of case #2 is 4 bytes bigger.
2) the userspace->kernelspace payload.
Case #1: The size of the S2API structs. It will range from 24 to 84 bytes (depending on what
you want to get, from one to 4 different value pairs).
Case #2: The size of the ioctl struct: about 30 bytes (if I summed the sizes of all structs correctly).
The payload of S2API is generally bigger, except if just one parameter is required.
The size of the S2API cache struct doesn't matter here, as it is part of "struct fe", so
it is present anyway.
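A sketch of the byte counts above. The 4-byte header and ~20 bytes per key/value pair are inferred from the 24..84 range quoted here, not taken from a real kernel header:

```c
#include <assert.h>

/* Userspace->kernel payload for an S2API get of n value pairs,
 * assuming a 4-byte count header plus 20 bytes per pair. */
static unsigned s2api_get_payload(unsigned npairs)
{
	return 4 + 20 * npairs;
}

/* The assumed flat-struct ioctl payload, for comparison. */
static unsigned ioctl_get_payload(void)
{
	return 30;
}
```

With one pair, S2API (24 bytes) is actually smaller than the ~30-byte struct; from two pairs on it is bigger, which matches the conclusion above.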
Cheers,
Mauro.
* Re: New DVB-Statistics API
2009-12-08 23:43 ` Mauro Carvalho Chehab
@ 2009-12-09 11:42 ` Manu Abraham
2009-12-09 13:02 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 12+ messages in thread
From: Manu Abraham @ 2009-12-09 11:42 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: Julian Scheel, linux-media
On Wed, Dec 9, 2009 at 3:43 AM, Mauro Carvalho Chehab
<mchehab@redhat.com> wrote:
> Manu Abraham wrote:
>
>>> Not true. As pointed at the previous answer, the difference between a new ioctl
>>> and S2API is basically the code at dtv_property_prepare_get_stats() and
>>> dtv_property_process_get(). This is a pure code that uses a continuous struct
>>> that will likely be at L3 cache, inside the CPU chip. So, this code will run
>>> really quickly.
>>
>>
>>
>> AFAIK, cache eviction occurs with syscalls: where content in the
>> caches near the CPU cores is pushed to the outer cores, resulting in
>> cache misses. Not all CPU's are equipped with L3 caches. Continuous
>> syscalls can result in TLB cache misses from what i understand, which
>> is expensive.
>
> It is very likely that the contents of struct fe to go into the cache during the
> syscall. I was conservative when I talked about L3. Depending on the cache sizes,
> I won't doubt that the needed fields of the fe struct will go to L1 cache.
Ah, so the data structure, which is there in the ioctl approach as well,
is "less likely" to get cache hits since there are fewer calls.
>>> As current CPU's runs at the order of Teraflops (as the CPU clocks are at gigahertz
>>> order, and CPU's can handle multiple instructions per clock cycle), the added delay
>>> is in de order of nanosseconds.
>>
>>
>> Consider STB's where DVB is generally deployed rather than the small
>> segment of home users running a stack on a generic PC.
>
> Even with STB, let's assume a very slow cpu that runs at 100 Megabytes/second. So, the clock
> speed is 10 nanoseconds. Assuming that this CPU doesn't have a good pipeline, being
> capable of handling only one instruction per second, you'll have one instruction at executed
> at each 10 nanoseconds (as a reference, a Pentium 1, running at 133 Mbps is faster than this).
Incorrect.
A CPU doesn't execute one instruction per clock cycle. The clock cycles
required to execute an instruction vary from 2 to 12 cycles,
depending on the CPU.
> An I/O operation at i2c is of the order of 10^-3. Assuming that an additional delay of 10%
> (10 ^ -4) will deadly affect realtime capability (with it is very doubtful), this means that
> the CPU can run up to 10,000 (!!!) instructions to generate such delay. If you compile that code
> and check the number or extra instructions I bet it will be shorter enough to not cause any
> practical effect.
>
> So, even on such bad hardware that is at least 20x slower than a netbook running at 1Gbps,
> what determines the delay is the amount of I/O you're doing, and not the number of extra
> code.
The I/O overhead required to read 4 registers from hardware is the
same whether you use the ioctl approach or s2api.
> > Hardware I/O is the most expensive operation involved.
>
> True. That's what I said.
>
>> Case #1: the ioctl approach
> <code stripped>
>>
>> Now Case #2: based on s2api
> <code stripped>
>
>> Now that we can see the actual code flow, we can see the s2api
>> approach requires an unnecessary large tokenizer/serializer, involving
>> multiple function calls.
>
> Are you seeing there 10.000 assembler instructions or so? If not, the size of the code is
> not relevant.
>
> Also: gcc optimizer transforms switches into a binary tree. So, if you have 64
> cases on switch, it will require 7 comparations (log2(64)) to get a match.
>
> For example, a quick look at the code you've presented, let's just calculate
> the number of operations on the dtv_property_proccess_get() routine, without
> debug stuff:
>
> static int dtv_property_process_get() {
> CMP (if fe->ops.get_property)
> CMP (if r < 0) <==== This if only happens if the first one is executed. On my patch, it is not executed
> (the code you posted is the one before my patch)
> SWITCH (7 CMP's) <==== due to binary tree optimization done by gcc
> MOV
> }
>
> So, that entire code (that has about 200 lines) has, in fact
> 9 comparations and one move instruction.
>
> At dtv_property_prepare_get_stats(), the code is even cheaper: just a switch with 8
> elements (log2(8) = 3), so 3 comparations, and one move instruction.
>
> The additional cost on dvb_frontend_ioctl_properties is:
> 2 MOVs
> One loop calling dtv_property_prepare_get_stats() - up to 4 times to retrieve
> all quality values
> one INC
> one CMP and function CALL (the same cost exists also with the struct)
> One loop calling dtv_property_get_stats() - up to 4 times to retrieve
> all quality values
>
> So, if I've calculated it right, we're talking about 2+1+16+1+2+1+40 = 63 instructions.
>
> 2) the userspace->kernelspace payload.
>
> Case #1: The size of S2API structs. It will range from 24 to 84 (depending on what
> you want to get, from one to 4 different value pairs).
>
> Case #2: The size of the ioctl struct: about 30 bytes (If I summed the size of all structs correctly).
>
> payload of S2API is generally bigger, except if just one parameter is required.
>
> The size of the S2API cache struct doesn't matter here, as it is part of "struct fe", so
> it is present anyway.
Eventually, as you have pointed out yourself, the data struct will be
in the cache all the time for the ioctl approach. The only new
addition to the existing API in the ioctl case is a CALL instruction,
as compared to the numerous instructions you have pointed out for
the s2api approach.
Regards,
Manu
* Re: New DVB-Statistics API
2009-12-09 11:42 ` Manu Abraham
@ 2009-12-09 13:02 ` Mauro Carvalho Chehab
2009-12-09 22:12 ` Julian Scheel
0 siblings, 1 reply; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2009-12-09 13:02 UTC (permalink / raw)
To: Manu Abraham; +Cc: Julian Scheel, linux-media
Manu Abraham wrote:
> On Wed, Dec 9, 2009 at 3:43 AM, Mauro Carvalho Chehab
> <mchehab@redhat.com> wrote:
>> Even with STB, let's assume a very slow cpu that runs at 100 Megabytes/second. So, the clock
>> speed is 10 nanoseconds. Assuming that this CPU doesn't have a good pipeline, being
>> capable of handling only one instruction per second, you'll have one instruction at executed
>> at each 10 nanoseconds (as a reference, a Pentium 1, running at 133 Mbps is faster than this).
>
> Incorrect.
> A CPU doesn't execute instruction per clock cycle. Clock cycles
> required to execute an instruction do vary from 2 cycles 12 cycles
> varying from CPU to CPU.
See the description of an old Pentium MMX processor (the successor of the i586, running at up to 200 MHz):
http://www.intel.com/design/archives/processors/mmx/docs/243185.htm
Thanks to its superscalar architecture, it runs 2 instructions per clock cycle (IPC).
Newer processors can run more instructions per clock cycle. For example, any Pentium-4 processor
can do 3 IPC:
http://www.intel.com/support/processors/pentium4/sb/CS-017371.htm
>> So, even on such bad hardware that is at least 20x slower than a netbook running at 1Gbps,
>> what determines the delay is the amount of I/O you're doing, and not the number of extra
>> code.
>
>
> The I/O overhead required to read 4 registers from hardware is the
> same whether you use the ioctl approach or s2api.
It seems you got my point. What determines the delay is the number of I/Os, and not the
number of instructions.
> Eventually, as you have pointed out yourself, The data struct will be
> in the cache all the time for the ioctl approach. The only new
> addition to the existing API in the ioctl case is a CALL instruction
> as compared to the numerous instructions in comparison to that you
> have pointed out as with the s2api approach.
True, but, as shown, the additional delay introduced by the code is less than 0.01%, even on
a processor that has half the speed of a 12-year-old very slow CPU (a Pentium MMX @ 100 MHz
is capable of 2 IPC; my calculation assumed 1 IPC).
So, what will affect the delay is the number of I/O you need to do.
To get all the data that the ioctl approach's struct has, the delay for S2API will be equal.
To get less data, S2API will have a smaller delay.
Cheers,
Mauro.
* Re: New DVB-Statistics API
2009-12-09 13:02 ` Mauro Carvalho Chehab
@ 2009-12-09 22:12 ` Julian Scheel
2009-12-09 23:38 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 12+ messages in thread
From: Julian Scheel @ 2009-12-09 22:12 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: Manu Abraham, linux-media
Am 09.12.09 14:02, schrieb Mauro Carvalho Chehab:
> Manu Abraham wrote:
>
>> On Wed, Dec 9, 2009 at 3:43 AM, Mauro Carvalho Chehab
>> <mchehab@redhat.com> wrote:
>>
>
>>> Even with STB, let's assume a very slow cpu that runs at 100 Megabytes/second. So, the clock
>>> speed is 10 nanoseconds. Assuming that this CPU doesn't have a good pipeline, being
>>> capable of handling only one instruction per second, you'll have one instruction at executed
>>> at each 10 nanoseconds (as a reference, a Pentium 1, running at 133 Mbps is faster than this).
>>>
>> Incorrect.
>> A CPU doesn't execute instruction per clock cycle. Clock cycles
>> required to execute an instruction do vary from 2 cycles 12 cycles
>> varying from CPU to CPU.
>>
> See the description of an old Pentium MMX processor (the sucessor of i586, running up to 200 MHz):
> http://www.intel.com/design/archives/processors/mmx/docs/243185.htm
>
> Thanks to superscalar architecture, it runs 2 instructions per clock cycle (IPC).
>
> Newer processors can run more instructions per clock cycle. For example, any Pentium-4 processor,
> can do 3 IPC:
> http://www.intel.com/support/processors/pentium4/sb/CS-017371.htm
>
I don't think you can just take the average IPC rates into account for
this. When doing a syscall, the processor's TLB will be flushed,
which will force the CPU to go through the whole instruction pipeline before
the first post-syscall instruction is actually executed. This will introduce
a delay for each syscall you make. I'm not exactly sure about the length
of the delay, but I think it should be something like 2 clock cycles.
>>> So, even on such bad hardware that is at least 20x slower than a netbook running at 1Gbps,
>>> what determines the delay is the amount of I/O you're doing, and not the number of extra
>>> code.
>>>
>>
>> The I/O overhead required to read 4 registers from hardware is the
>> same whether you use the ioctl approach or s2api.
>>
> It seems you got my point. What will determinate the delay is the number of I/O's, and not the
> amount of instructions.
>
The number of hardware I/Os is constant in both cases, so we do not
need to discuss it as a pro/con for either of the proposals.
>> Eventually, as you have pointed out yourself, The data struct will be
>> in the cache all the time for the ioctl approach. The only new
>> addition to the existing API in the ioctl case is a CALL instruction
>> as compared to the numerous instructions in comparison to that you
>> have pointed out as with the s2api approach.
>>
> True, but, as shown, the additional delay introduced by the code is less than 0.01%, even on
> a processor that has half of the speed of a 12-year old very slow CPU (a Pentium MMX @ 100 MHz
> is capable of 2 IPC. My calculus assumed 1 IPC).
>
> So, what will affect the delay is the number of I/O you need to do.
>
> To get all data that the ioctl approach struct has, the delay for S2API will be equal.
> To get less data, S2API will have a small delay.
>
Imho the S2API would be slower when reading all the data the ioctl fetches,
due to the way the instructions would be handled.
Correct me if I'm wrong about any of this.
Cheers,
Julian
* Re: New DVB-Statistics API
2009-12-09 22:12 ` Julian Scheel
@ 2009-12-09 23:38 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2009-12-09 23:38 UTC (permalink / raw)
To: Julian Scheel; +Cc: Manu Abraham, linux-media
Julian Scheel wrote:
> Am 09.12.09 14:02, schrieb Mauro Carvalho Chehab:
>> Manu Abraham wrote:
> I don't think you can just take the average IPC rates into account for
> this. When doing a syscall the processors TLB cache will be cleared,
> which will force the CPU to go to the whole instruction pipeline before
> the first syscall instruction is actually executed. This will introduce
> a delay for each syscall you make. I'm not exactly sure about the length
> of the delay, but I think it should be something like 2 clock cycles.
True, but this delay is common to both S2API and struct.
>> To get all the data that the ioctl approach's struct has, the delay for
>> S2API will be equal.
>> To get less data, S2API will have a smaller delay.
>>
> Imho the S2API would be slower when reading all the data the ioctl fetches,
> due to the way the instructions would be handled.
>
> Correct me if I'm wrong about any of this.
I'm not sure I understood your question.
In both cases, just one function call will go to the driver, with one struct
(struct fe, in the case of S2API) or two structs (struct fe and the
stats-specific struct(s)) in the case of a new ioctl(s).
As drivers are free to implement any logic, the driver can implement exactly
the same logic behind both API calls. So, the delay to get the info will be
equal in both cases.
In a practical case, this will take at least a few milliseconds to retrieve all
the data. It may take even more since, on some drivers, you may need to wait for
some condition to happen before starting to measure, in order to be sure that
you'll be getting atomic and accurate values.
After the function returns, with a struct you can just return, while with S2API
you'll need to put the data into the proper payload fields, but this adds
a delay on the order of nanoseconds.
Let's say that, to get all the data, the routine needs 10 milliseconds.
The difference between the new ioctl and S2API will be about 0.00063 milliseconds
on a Pentium MMX. On a machine with a 1 GHz clock and 2 IPC, the difference will
be 0.0000315 milliseconds.
Considering that the Linux kernel is preemptive, and an interrupt or the scheduler
could be called during the 10 millisecond window, I doubt you'll be able to
notice that difference in any practical use case.
On the other hand, if you need, for example, just the strength of the signal at the AGC:
if you call via struct, you'll still be consuming the same 10 ms, while with S2API
you can do it in, let's say, 1 ms (the real numbers will depend, of course, on how
much hardware I/O is needed and on how long you need to wait there to
get a stable value).
So, if you want to do things like moving a rotor, S2API will give better results.
If you want all the stats, it will give the same result as a new ioctl.
While I don't think that 0.0000315 milliseconds is enough to cause any trouble,
with a simple change like the one below this time can be reduced to 0.0000105 milliseconds
(the patch was simplified to change just quality, but, of course, the same
approach needs to be applied to the other fields to get this result):
Index: master/linux/drivers/media/dvb/dvb-core/dvb_frontend.c
===================================================================
--- master.orig/linux/drivers/media/dvb/dvb-core/dvb_frontend.c
+++ master/linux/drivers/media/dvb/dvb-core/dvb_frontend.c
@@ -1220,6 +1220,7 @@ static int dtv_property_prepare_get_stat
switch (tvp->cmd) {
case DTV_FE_QUALITY:
fe->dtv_property_cache.need_stats |= FE_NEED_QUALITY;
+ fe->dtv_property_cache.quality = &tvp->u.data;
break;
case DTV_FE_QUALITY_UNIT:
fe->dtv_property_cache.need_stats |= FE_NEED_QUALITY_UNIT;
@@ -1384,9 +1385,6 @@ static int dtv_property_process_get(stru
break;
/* Quality measures */
- case DTV_FE_QUALITY:
- tvp->u.data = fe->dtv_property_cache.quality;
- break;
case DTV_FE_QUALITY_UNIT:
tvp->u.data = fe->dtv_property_cache.quality_unit;
break;
@@ -1696,10 +1697,12 @@ static int dvb_frontend_ioctl_properties
}
}
- for (i = 0; i < tvps->num; i++) {
- (tvp + i)->result = dtv_property_process_get(fe,
- tvp + i, inode, file, need_get_ops);
- err |= (tvp + i)->result;
+ if (need_get_ops) {
+ for (i = 0; i < tvps->num; i++) {
+ (tvp + i)->result = dtv_property_process_get(fe,
+ tvp + i, inode, file, need_get_ops);
+ err |= (tvp + i)->result;
+ }
}
if (copy_to_user(tvps->props, tvp, tvps->num * sizeof(struct dtv_property))) {
Index: master/linux/drivers/media/dvb/dvb-core/dvb_frontend.h
===================================================================
--- master.orig/linux/drivers/media/dvb/dvb-core/dvb_frontend.h
+++ master/linux/drivers/media/dvb/dvb-core/dvb_frontend.h
@@ -374,7 +374,7 @@ struct dtv_frontend_properties {
#define FE_NEED_SIGNAL_UNIT (1 << 7)
int need_stats;
- u32 quality;
+ u32 *quality;
u32 strength;
u32 error;
u32 unc;
and, in the drivers, instead of doing:
prop->quality = <some calculus>
doing:
*prop->quality = <some calculus>
Cheers,
Mauro.
* Re: New DVB-Statistics API - please vote
2009-12-08 13:22 ` New DVB-Statistics API Mauro Carvalho Chehab
2009-12-08 21:46 ` Manu Abraham
@ 2010-02-03 9:49 ` Mauro Carvalho Chehab
2010-02-03 10:40 ` Hans Verkuil
1 sibling, 1 reply; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2010-02-03 9:49 UTC (permalink / raw)
To: Julian Scheel; +Cc: linux-media
Mauro Carvalho Chehab wrote:
>> after the last thread, which asked about signal statistics details and
>> degenerated into a discussion about the technical possibilities for
>> implementing an entirely new API, which led to nothing so far, I wanted
>> to open a new thread to bring this forward. Maybe some more people can
>> give their votes for the different options
Only Manu and I expressed opinions on this thread. I'm not sure why
nobody else commented. Maybe all the interested people just decided
to take a long vacation and aren't reading their email ;)
Seriously, from what I understand, this is an API improvement and we need
to take a decision on it. So, your opinions are important.
---
Let me draw a summary of this subject, trying to be impartial.
The original proposal was made by Manu. My proposal is derived from Manu's
original one; both provide the same features.
The main difference is that Manu's proposal uses a struct to get the
statistics, while my proposal uses DVBS2API.
With both API proposals, all values are fetched at the same time by the driver.
The DVBS2API adds a small delay to fill the fields, but the extra delay is
insignificant compared with the I/O delays needed to retrieve the
values from the hardware.
Thanks to the usage of DVBS2API, it is possible to retrieve a subset of statistics.
When obtaining a subset, the DVBS2API latency is considerably lower, as less
data needs to be transferred from the hardware.
The DVBS2API also offers the possibility of expanding the statistics group, since
it is not as rigid as a struct.
One criterion that should be kept in mind is that, according to Linux kernel rules,
any userspace API is stable. This means that applications compiled against an
older API version should keep running with the latest kernel. So, whatever is decided,
the statistics API should always maintain backward compatibility.
---
At the end of the year, I did some work on an ISDB-T driver for Siano, and
I realized that the proposed struct won't fit ISDB-T well. The
reason is that ISDB-T transmissions use up to 3 hierarchical layers.
Each layer may have different OFDM parameters, so the devices (at least, this is the
case for Siano) have a group of statistics per layer.
I'm afraid that newer standards may also bring different ways of presenting statistics, and
the current proposal won't fit them well.
So, in my opinion, if any struct-based approach is chosen, we'll have a hard time
maintaining it, as it won't fit all cases, and we'll need to add tricks to extend
the struct.
So, my vote is for the DVBS2API approach, where a new group of statistics can easily be added
in an elegant way, without re-discussing the API or finding a way to extend
the size of a struct without breaking backward compatibility.
Cheers,
Mauro
* Re: New DVB-Statistics API - please vote
2010-02-03 9:49 ` New DVB-Statistics API - please vote Mauro Carvalho Chehab
@ 2010-02-03 10:40 ` Hans Verkuil
2010-02-03 12:37 ` Manu Abraham
0 siblings, 1 reply; 12+ messages in thread
From: Hans Verkuil @ 2010-02-03 10:40 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: Julian Scheel, linux-media
> Mauro Carvalho Chehab wrote:
>
>>> after the last thread, which asked about signal statistics details and
>>> degenerated into a discussion about the technical possibilities for
>>> implementing an entirely new API, which led to nothing so far, I wanted
>>> to open a new thread to bring this forward. Maybe some more people can
>>> give their votes for the different options
>
> Only Manu and I expressed opinions on this thread. I'm not sure why
> nobody else commented. Maybe all the interested people just decided
> to take a long vacation and aren't reading their email ;)
>
> Seriously, from what I understand, this is an API improvement and we need
> to take a decision on it. So, your opinions are important.
>
> ---
>
> Let me draw a summary of this subject, trying to be impartial.
>
> The original proposal was made by Manu. My proposal is derived from Manu's
> original one; both provide the same features.
>
> The main difference is that Manu's proposal uses a struct to get the
> statistics, while my proposal uses DVBS2API.
>
> With both API proposals, all values are fetched at the same time by the driver.
> The DVBS2API adds a small delay to fill the fields, but the extra delay is
> insignificant compared with the I/O delays needed to retrieve the
> values from the hardware.
>
> Thanks to the usage of DVBS2API, it is possible to retrieve a subset of
> statistics. When obtaining a subset, the DVBS2API latency is considerably
> lower, as less data needs to be transferred from the hardware.
>
> The DVBS2API also offers the possibility of expanding the statistics group,
> since it is not as rigid as a struct.
>
> One criterion that should be kept in mind is that, according to Linux kernel
> rules, any userspace API is stable. This means that applications compiled
> against an older API version should keep running with the latest kernel.
> So, whatever is decided, the statistics API should always maintain backward
> compatibility.
>
> ---
>
> At the end of the year, I did some work on an ISDB-T driver for Siano, and
> I realized that the proposed struct won't fit ISDB-T well. The reason is
> that ISDB-T transmissions use up to 3 hierarchical layers.
> Each layer may have different OFDM parameters, so the devices (at least,
> this is the case for Siano) have a group of statistics per layer.
>
> I'm afraid that newer standards may also bring different ways of presenting
> statistics, and the current proposal won't fit them well.
>
> So, in my opinion, if any struct-based approach is chosen, we'll have a
> hard time maintaining it, as it won't fit all cases, and we'll need to add
> some tricks to extend the struct.
>
> So, my vote is for the DVBS2API approach, where a new group of statistics
> can easily be added in an elegant way, without re-discussing the API or
> finding a way to extend the size of a struct without breaking backward
> compatibility.
From a purely technical standpoint the DVBS2API is by definition more
flexible and future-proof. The V4L API has taken the same approach with
controls (basically exactly the same mechanism). Should it be necessary in
the future to set multiple properties atomically, then the same technique
as V4L can be used (see VIDIOC_S_EXT_CTRLS).
The alternative is to make structs with lots of reserved fields. It
depends on how predictable the API is expected to be. Sometimes you can be
reasonably certain that there won't be too many additions in the future
and then using reserved fields is perfectly OK.
Just my 5 cents based on my V4L experience.
Regards,
Hans
--
Hans Verkuil - video4linux developer - sponsored by TANDBERG Telecom
* Re: New DVB-Statistics API - please vote
2010-02-03 10:40 ` Hans Verkuil
@ 2010-02-03 12:37 ` Manu Abraham
0 siblings, 0 replies; 12+ messages in thread
From: Manu Abraham @ 2010-02-03 12:37 UTC (permalink / raw)
To: Hans Verkuil; +Cc: Mauro Carvalho Chehab, Julian Scheel, linux-media
On Wed, Feb 3, 2010 at 2:40 PM, Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
>> Mauro Carvalho Chehab wrote:
>>
>>>> after the last thread, which asked about signal statistics details and
>>>> degenerated into a discussion about the technical possibilities for
>>>> implementing an entirely new API, which led to nothing so far, I wanted
>>>> to open a new thread to bring this forward. Maybe some more people can
>>>> give their votes for the different options
>>
>> Only Manu and I expressed opinions on this thread. I'm not sure why
>> nobody else commented. Maybe all the interested people just decided
>> to take a long vacation and aren't reading their email ;)
>>
>> Seriously, from what I understand, this is an API improvement and we need
>> to take a decision on it. So, your opinions are important.
>>
>> ---
>>
>> Let me draw a summary of this subject, trying to be impartial.
>>
>> The original proposal was made by Manu. My proposal is derived from Manu's
>> original one; both provide the same features.
>>
>> The main difference is that Manu's proposal uses a struct to get the
>> statistics, while my proposal uses DVBS2API.
>>
>> With both API proposals, all values are fetched at the same time by the driver.
>> The DVBS2API adds a small delay to fill the fields, but the extra delay is
>> insignificant compared with the I/O delays needed to retrieve the
>> values from the hardware.
>>
>> Thanks to the usage of DVBS2API, it is possible to retrieve a subset of
>> statistics. When obtaining a subset, the DVBS2API latency is considerably
>> lower, as less data needs to be transferred from the hardware.
>>
>> The DVBS2API also offers the possibility of expanding the statistics group,
>> since it is not as rigid as a struct.
>>
>> One criterion that should be kept in mind is that, according to Linux kernel
>> rules, any userspace API is stable. This means that applications compiled
>> against an older API version should keep running with the latest kernel.
>> So, whatever is decided, the statistics API should always maintain backward
>> compatibility.
>>
>> ---
>>
>> At the end of the year, I did some work on an ISDB-T driver for Siano, and
>> I realized that the proposed struct won't fit ISDB-T well. The reason is
>> that ISDB-T transmissions use up to 3 hierarchical layers.
>> Each layer may have different OFDM parameters, so the devices (at least,
>> this is the case for Siano) have a group of statistics per layer.
>>
>> I'm afraid that newer standards may also bring different ways of presenting
>> statistics, and the current proposal won't fit them well.
>>
>> So, in my opinion, if any struct-based approach is chosen, we'll have a
>> hard time maintaining it, as it won't fit all cases, and we'll need to add
>> some tricks to extend the struct.
>>
>> So, my vote is for the DVBS2API approach, where a new group of statistics
>> can easily be added in an elegant way, without re-discussing the API or
>> finding a way to extend the size of a struct without breaking backward
>> compatibility.
>
> From a purely technical standpoint the DVBS2API is by definition more
> flexible and future-proof. The V4L API has taken the same approach with
> controls (basically exactly the same mechanism). Should it be necessary in
> the future to set multiple properties atomically, then the same technique
> as V4L can be used (see VIDIOC_S_EXT_CTRLS).
>
> The alternative is to make structs with lots of reserved fields. It
> depends on how predictable the API is expected to be. Sometimes you can be
> reasonably certain that there won't be too many additions in the future
> and then using reserved fields is perfectly OK.
>
> Just my 5 cents based on my V4L experience.
In fact, I don't see the advantage of using a specific get/set, since
what I proposed just reads back basic data types, such as a u32, for any
instance, and those are "standard" for any digital communication
system. It makes no sense to go for a get/set approach. For example,
there cannot be multiple properties for a quantity such as BER in any
digital system; there are just different units of measure, such as
kilograms, pounds, grams, etc. Which is what the whole patch is meant
to handle in a performance-critical manner.
The whole patch is a kind of get/set and hence it doesn't need a
get/set at the micro level.
To put the reverse argument, as an example, I can state that weight is
not measured in centimeters or inches or feet, to put it in layman's
terms. If you say that it needs to be expressed as such, then I can
only see that there is something conceptually wrong in your reasoning.
My 2 cents, for those who don't understand the issue.
Regards,
Manu
2009-12-08 9:16 New USB-Statistics API Julian Scheel
2009-12-08 12:24 ` New DVB(!!)-Statistics API Julian Scheel
2009-12-08 13:22 ` New DVB-Statistics API Mauro Carvalho Chehab
2009-12-08 21:46 ` Manu Abraham
2009-12-08 23:43 ` Mauro Carvalho Chehab
2009-12-09 11:42 ` Manu Abraham
2009-12-09 13:02 ` Mauro Carvalho Chehab
2009-12-09 22:12 ` Julian Scheel
2009-12-09 23:38 ` Mauro Carvalho Chehab
2010-02-03 9:49 ` New DVB-Statistics API - please vote Mauro Carvalho Chehab
2010-02-03 10:40 ` Hans Verkuil
2010-02-03 12:37 ` Manu Abraham