linux-rt-users.vger.kernel.org archive mirror
* [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one
@ 2010-03-04  3:21 Xianwei Zeng
  2010-03-04  3:54 ` Karthik Singaram Lakshmanan
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Xianwei Zeng @ 2010-03-04  3:21 UTC (permalink / raw)
  To: linux-rt-users; +Cc: tglx, mingo, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2695 bytes --]

Hi,

# Sorry, this was rejected by the mailing list server; I have changed
# the format and am resending it.

I am using the linux-2.6.29.6-rt24 kernel on an ARM11MPCore (SMP, 4
cores) system.

In this kernel, a SCHED_FIFO task that does the following appears to
block other real-time processes of higher rt priority on the same CPU
core (the test program is also attached):

static void child_yielder(void)
{
        struct sched_param sp;

        memset(&sp, 0, sizeof sp);
        sp.sched_priority = 10;     /* Arbitrary rt priority */

        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
                perror("sched_setscheduler()");
                exit(1);
        }

        while (1) {
                sched_yield();
        }
}

In other words, no other task can be scheduled, including the per-CPU
keventd kernel thread, which has the highest rt priority (keventd is
SCHED_FIFO with rt_priority = 1). But real-time tasks with lower
rt_priority can still get scheduled. This seems strange to me.

I checked sched_rt.c in my kernel version (the latest kernel is almost
the same in this part) and tried to understand how a real-time task is
enqueued, dequeued and picked:

  * enqueue a real-time task
    - task->prio is used to find the list in rt_prio_array and add task to it;
    - Set the bit in rt_prio_array->bitmap by task->prio;

  * dequeue a real-time task
    - Remove task from the list in rt_prio_array
    - Clear the bit in rt_prio_array->bitmap by task->prio;

  * pick up next real-time task
    - Call sched_find_first_bit(array->bitmap) to find the list
    - Pick the task in the list head

  * yield a real-time task
    - Instead of doing dequeue followed by enqueue, calls
      requeue_task_rt() which moves the task from its current place to
      the list tail.

In all of the above operations, task->prio is used to find the bit in
the runqueue bitmap. Except under priority inheritance, task->prio is
equal to task->normal_prio, which is calculated by the function
normal_prio(). For a real-time task, normal_prio is:

   normal_prio = MAX_RT_PRIO - 1 - task->rt_priority;

So a real-time task with a higher rt_priority always sits __behind__ one
with a lower rt_priority in the runqueue bitmap, and
sched_find_first_bit() picks the lower rt_priority task to run.

That is why a SCHED_FIFO task can block SCHED_FIFO tasks of higher
rt_priority, while real-time tasks of lower rt_priority can still be
scheduled in my test.

But I am confused about the following:

  * Does the real-time scheduler work as designed?
  * Or am I doing the wrong thing in my test?
  * Why is rt_priority not used to enqueue and dequeue real-time tasks
    to/from the runqueue list?

Could somebody take a look at my questions? Thanks.

--
Best Regards,
Zeng Xianwei

[-- Attachment #2: rt-sched.c --]
[-- Type: text/x-csrc, Size: 1386 bytes --]

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <sched.h>
#include <sys/types.h>

#define NUM_TASKS   5
#define PARENT_PRIO 10   /* Parent's rt priority */

static void child_yielder(int rt_prio);

int main (int argc, char **argv)
{
	struct sched_param sp;
	pid_t child;
	int i = 0;

	memset(&sp, 0, sizeof sp);
	sp.sched_priority = PARENT_PRIO;
	if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
		perror("sched_setscheduler()");
		exit(1);
	}

	for (i = 1; i <= NUM_TASKS; i++) {
		child = fork();
		switch (child) {
		case 0: /* Child */
			/* Child has lower rt priority than parent */
			child_yielder(PARENT_PRIO + i);
			break;
		case -1: /* Error */
			perror("fork()");
			kill(0, SIGTERM);
			break;
		default: /* Parent */
			printf("Parent: created child pid %d\n", child);
			break;
		}
	}

	printf("-- Parent END --\n");
	
	/* Exit and leave child processes running */
	/* kill(0, SIGTERM); */
	return 0;
}

static void child_yielder(int rt_prio)
{
	struct sched_param sp;

	printf("Child running: pid = %d\n", getpid());

	memset(&sp, 0, sizeof sp);
	sp.sched_priority = rt_prio;

	if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
		perror("sched_setscheduler()");
		exit(1);
	}

	while (1) {
		sched_yield();
	}
}


* Re: [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one
  2010-03-04  3:21 [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one Xianwei Zeng
@ 2010-03-04  3:54 ` Karthik Singaram Lakshmanan
  2010-03-04  4:24 ` Robert Hancock
       [not found] ` <1ca41c0f1003031952u7a9ea0w5adf4f0438b8fe0c@mail.gmail.com>
  2 siblings, 0 replies; 5+ messages in thread
From: Karthik Singaram Lakshmanan @ 2010-03-04  3:54 UTC (permalink / raw)
  To: Xianwei Zeng; +Cc: Linux RT

Hi,

If I am not mistaken, a higher value of sched_priority denotes higher
priority. Therefore, looking at

    normal_prio = MAX_RT_PRIO - 1 - task->rt_priority;

you can see that a higher task->rt_priority translates to a lower value
of normal_prio, thereby occupying an earlier bit.

Hope that helps
- Karthik
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one
  2010-03-04  3:21 [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one Xianwei Zeng
  2010-03-04  3:54 ` Karthik Singaram Lakshmanan
@ 2010-03-04  4:24 ` Robert Hancock
       [not found] ` <1ca41c0f1003031952u7a9ea0w5adf4f0438b8fe0c@mail.gmail.com>
  2 siblings, 0 replies; 5+ messages in thread
From: Robert Hancock @ 2010-03-04  4:24 UTC (permalink / raw)
  To: Xianwei Zeng; +Cc: linux-rt-users, tglx, mingo, linux-kernel

On 03/03/2010 09:21 PM, Xianwei Zeng wrote:
> Hi,
>
> # Sorry, this was rejected by the mailing list server; I have changed
> # the format and am resending it.
>
> I am using the linux-2.6.29.6-rt24 kernel on an ARM11MPCore (SMP, 4
> cores) system.
>
> In this kernel, a SCHED_FIFO task that does the following appears to
> block other real-time processes of higher rt priority on the same CPU
> core (the test program is also attached):
>
> static void child_yielder(void)
> {
>         struct sched_param sp;
>
>         memset(&sp, 0, sizeof sp);
>         sp.sched_priority = 10;     /* Arbitrary rt priority */
>
>         if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
>                 perror("sched_setscheduler()");
>                 exit(1);
>         }
>
>         while (1) {
>                 sched_yield();
>         }
> }
>
> In other words, no other task can be scheduled, including the per-CPU
> keventd kernel thread, which has the highest rt priority (keventd is
> SCHED_FIFO with rt_priority = 1). But real-time tasks with lower
> rt_priority can still get scheduled. This seems strange to me.
>
> I checked sched_rt.c in my kernel version (the latest kernel is almost
> the same in this part) and tried to understand how a real-time task is
> enqueued, dequeued and picked:
>
>    * enqueue a real-time task
>      - task->prio is used to find the list in rt_prio_array and add task to it;
>      - Set the bit in rt_prio_array->bitmap by task->prio;
>
>    * dequeue a real-time task
>      - Remove task from the list in rt_prio_array
>      - Clear the bit in rt_prio_array->bitmap by task->prio;
>
>    * pick up next real-time task
>      - Call sched_find_first_bit(array->bitmap) to find the list
>      - Pick the task in the list head
>
>    * yield a real-time task
>      - Instead of doing dequeue followed by enqueue, calls
>        requeue_task_rt() which moves the task from its current place to
>        the list tail.
>
> In all of the above operations, task->prio is used to find the bit in
> the runqueue bitmap. Except under priority inheritance, task->prio is
> equal to task->normal_prio, which is calculated by the function
> normal_prio(). For a real-time task, normal_prio is:
>
>     normal_prio = MAX_RT_PRIO - 1 - task->rt_priority;
>
> So a real-time task with a higher rt_priority always sits __behind__ one
> with a lower rt_priority in the runqueue bitmap, and
> sched_find_first_bit() picks the lower rt_priority task to run.
>
> That is why a SCHED_FIFO task can block SCHED_FIFO tasks of higher
> rt_priority, while real-time tasks of lower rt_priority can still be
> scheduled in my test.
>
> But I am confused about the following:
>
>    * Does the real-time scheduler work as designed?
>    * Or am I doing the wrong thing in my test?
>    * Why is rt_priority not used to enqueue and dequeue real-time tasks
>      to/from the runqueue list?
>
> Could somebody take a look at my questions? Thanks.

Your code has this:

			/* Child has lower rt priority than parent */
			child_yielder(PARENT_PRIO + i);

You may be confused about how the sched_setscheduler realtime priority
values work: higher numbers are HIGHER priority, not lower. This is the
opposite of the kernel's internal priority values, where lower numbers
mean higher priority; as I recall, the values get converted in the
kernel. You've spawned off a task that's higher priority than the
current one. sched_yield in a realtime process does nothing if no
process of the same or higher priority is available to run, so
effectively it just spins calling sched_yield and hogging the CPU.


* Re: [rt sched] SCHED_FIFO task of lower rt_priority blocks higher one
       [not found] ` <1ca41c0f1003031952u7a9ea0w5adf4f0438b8fe0c@mail.gmail.com>
       [not found]   ` <0016e646086875cfde0480f186a0@google.com>
@ 2010-03-04  5:18   ` Xianwei Zeng
  1 sibling, 0 replies; 5+ messages in thread
From: Xianwei Zeng @ 2010-03-04  5:18 UTC (permalink / raw)
  To: Karthik Singaram Lakshmanan, hancockrwd; +Cc: linux-rt-users

Hi,

Thanks for your replies.

I made a mistake about how priorities work inside the kernel versus in
the sched_setscheduler() system call.
I thought both were the same: a numerically lower priority value means
a higher scheduling priority.
In fact, they are opposite:

  * Kernel: a numerically lower value is a higher priority
  * sched_setscheduler() and related system calls: a numerically lower
    value is a lower priority

As the "man sched_get_priority_max" output says:

       Processes with numerically higher priority values are scheduled
       before processes with numerically lower priority values. Thus,
       the value returned by sched_get_priority_max() will be greater
       than the value returned by sched_get_priority_min().

Thanks.

On Thu, Mar 4, 2010 at 12:52 PM, Karthik Singaram Lakshmanan
<karthiksingaram@gmail.com> wrote:
> Hi,
>     If I am not mistaken, higher value of sched_priority denotes higher
> priority.
>     Therefore, looking at
>         normal_prio = MAX_RT_PRIO - 1 - task->rt_priority;
>         you can see that a higher task->rt_priority will translate to a
> lower value of normal_prio, thereby occupying an earlier bit.


-- 
Best Regards,
Zeng Xianwei

