* Run queue corruption issue
From: Jerrin Shaji George @ 2016-05-17 22:55 UTC
To: kernelnewbies
Hi All,
I wanted help with a piece of code that I have been working on.
Please see -
https://gist.github.com/jerrinsg/333e584d1f65dc95b9f13b61dcebdaa7
I have written two functions, migrate_to and migrate_back. migrate_to is used
to remove a process from the run queue, and migrate_back is used to insert this
process back into the run queue.
The gist is taken from a larger project, where we are working on building
a mechanism to support thread migration across heterogeneous processors.
migrate_to_call() will be called by a thread which wants to remove itself from
the run queue (hence, it will pass the current task struct as the migration
argument). Once the other processor completes execution of the assigned task, it
will interrupt the main processor, which runs an interrupt handler, which in
turn calls the migrate_back_call() function. It passes the task struct of the
process that was removed from the run queue earlier to this function.
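For readers without access to the gist, the dequeue/enqueue pair described above can be modelled in userspace. This is only a sketch under stated assumptions: it is not the gist code and not the kernel scheduler, and every name in it (struct toy_rq, struct toy_task, toy_rq_init, toy_migrate_to, toy_migrate_back) is hypothetical. It shows one invariant such a pair has to preserve: a task is linked into the queue at most once, and that state is checked and changed only while the queue lock is held.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct toy_task {
    struct toy_task *prev, *next;
    int on_rq;                  /* 1 while linked into a run queue */
};

struct toy_rq {
    pthread_mutex_t lock;       /* protects the list, nr_running and on_rq */
    struct toy_task head;       /* sentinel node of a circular list */
    int nr_running;
};

void toy_rq_init(struct toy_rq *rq)
{
    pthread_mutex_init(&rq->lock, NULL);
    rq->head.prev = rq->head.next = &rq->head;
    rq->head.on_rq = 0;
    rq->nr_running = 0;
}

/* Insert p at the tail. Returns 0 on success, -1 if p is already queued. */
int toy_migrate_back(struct toy_rq *rq, struct toy_task *p)
{
    int ret = -1;

    pthread_mutex_lock(&rq->lock);
    if (!p->on_rq) {
        p->prev = rq->head.prev;
        p->next = &rq->head;
        rq->head.prev->next = p;
        rq->head.prev = p;
        p->on_rq = 1;
        rq->nr_running++;
        ret = 0;
    }
    pthread_mutex_unlock(&rq->lock);
    return ret;
}

/* Unlink p. Returns 0 on success, -1 if p is not currently queued. */
int toy_migrate_to(struct toy_rq *rq, struct toy_task *p)
{
    int ret = -1;

    pthread_mutex_lock(&rq->lock);
    if (p->on_rq) {
        p->prev->next = p->next;
        p->next->prev = p->prev;
        p->prev = p->next = NULL;
        p->on_rq = 0;
        rq->nr_running--;
        ret = 0;
    }
    pthread_mutex_unlock(&rq->lock);
    return ret;
}
```

In this model a repeated insert or a repeated remove is rejected instead of relinking the node, so the list and the counter stay consistent however many times the pair is called in a loop. A real scheduler has more state to keep in step, so this only illustrates the invariant.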
This mechanism works fine the first few times, but when this process is repeated
many times in a loop, I am seeing a run queue corruption:
https://gist.github.com/jerrinsg/0ab09cd435d8d2cb6ae692c7e6f4f26b
Is there anything wrong in the process dequeue or enqueue function that I have
written? Please help!
Kernel used: Linux 3.13
Thanks,
Jerrin
* Run queue corruption issue
From: Greg KH @ 2016-05-17 23:20 UTC
To: kernelnewbies
On Tue, May 17, 2016 at 06:55:07PM -0400, Jerrin Shaji George wrote:
> Hi All,
>
> I wanted help with a piece of code that I have been working on.
>
> Please see -
>
> https://gist.github.com/jerrinsg/333e584d1f65dc95b9f13b61dcebdaa7
>
> I have written two functions, migrate_to and migrate_back. migrate_to is used
> to remove a process from the run queue, and migrate_back is used to insert this
> process back into the run queue.
>
> The gist is taken from a larger project, where we are working on building
> a mechanism to support thread migration across heterogeneous processors.
> migrate_to_call() will be called by a thread which wants to remove itself from
> the run queue (hence, it will pass the current task struct as the migration
> argument). Once the other processor completes execution of the assigned task, it
> will interrupt the main processor, which runs an interrupt handler, which in
> turn calls the migrate_back_call() function. It passes the task struct of the
> process that was removed from the run queue earlier to this function.
>
> This mechanism works fine the first few times, but when this process is repeated
> many times in a loop, I am seeing a run queue corruption:
> https://gist.github.com/jerrinsg/0ab09cd435d8d2cb6ae692c7e6f4f26b
>
> Is there anything wrong in the process dequeue or enqueue function that I have
> written? Please help!
volatile doesn't mean what you think it does, please don't use it in the
kernel.
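As background on the volatile remark: volatile only stops the compiler from caching or eliding accesses to that one object. It gives no atomicity and no ordering against other memory accesses, so it cannot publish data between two execution contexts on its own. The userspace sketch below uses C11 atomics to show what a publishing flag needs. The names (struct mailbox, mailbox_init, mailbox_publish, mailbox_try_consume) are hypothetical and are not taken from the gist. Kernel code would use the kernel's own primitives instead (locks, completions, or explicit memory barriers), as the kernel's "volatile considered harmful" document explains.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-shot mailbox: plain data published through an atomic flag. */
struct mailbox {
    int payload;            /* ordinary data, made visible by 'ready' */
    atomic_bool ready;      /* set once the payload has been written */
};

void mailbox_init(struct mailbox *m)
{
    m->payload = 0;
    atomic_init(&m->ready, false);
}

/* Producer: write the data first, then set the flag with release ordering
 * so the payload write cannot be reordered after the flag store. */
void mailbox_publish(struct mailbox *m, int value)
{
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer: read the flag with acquire ordering; only if it is set is the
 * payload guaranteed to be the value the producer wrote. */
bool mailbox_try_consume(struct mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}
```

The release/acquire pair is what orders the payload write before the flag becomes visible. A plain volatile flag does not give that guarantee.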
And why are you using "raw_spin_lock()"?
> Kernel used: Linux 3.13
Wow that's obsolete and buggy, why use such an old thing?
greg k-h
* Run queue corruption issue
From: Jerrin Shaji George @ 2016-05-18 5:29 UTC
To: kernelnewbies
Hi Greg,
Thanks for your response.
On Tue, May 17, 2016 at 7:20 PM, Greg KH <greg@kroah.com> wrote:
> On Tue, May 17, 2016 at 06:55:07PM -0400, Jerrin Shaji George wrote:
>> Hi All,
>>
>> I wanted help with a piece of code that I have been working on.
>>
>> Please see -
>>
>> https://gist.github.com/jerrinsg/333e584d1f65dc95b9f13b61dcebdaa7
>>
>> I have written two functions, migrate_to and migrate_back. migrate_to is used
>> to remove a process from the run queue, and migrate_back is used to insert this
>> process back into the run queue.
>>
>> The gist is taken from a larger project, where we are working on building
>> a mechanism to support thread migration across heterogeneous processors.
>> migrate_to_call() will be called by a thread which wants to remove itself from
>> the run queue (hence, it will pass the current task struct as the migration
>> argument). Once the other processor completes execution of the assigned task, it
>> will interrupt the main processor, which runs an interrupt handler, which in
>> turn calls the migrate_back_call() function. It passes the task struct of the
>> process that was removed from the run queue earlier to this function.
>>
>> This mechanism works fine the first few times, but when this process is repeated
>> many times in a loop, I am seeing a run queue corruption:
>> https://gist.github.com/jerrinsg/0ab09cd435d8d2cb6ae692c7e6f4f26b
>>
>> Is there anything wrong in the process dequeue or enqueue function that I have
>> written? Please help!
>
> volatile doesn't mean what you think it does, please don't use it in the
> kernel.
>
This flag was to be used for synchronization. I will change this.
> And why are you using "raw_spin_lock()"?
I used this after seeing it used in sched/core.c. Can you please let me know if I
should instead use a different function to lock the run queue?
>> Kernel used: Linux 3.13
>
> Wow that's obsolete and buggy, why use such an old thing?
This is the codebase that I inherited. Once I get the basic prototype working, I
will be working to port it to a newer version of the kernel.
>
> greg k-h
* Run queue corruption issue
From: Greg KH @ 2016-05-18 6:16 UTC
To: kernelnewbies
On Wed, May 18, 2016 at 01:29:55AM -0400, Jerrin Shaji George wrote:
> Hi Greg,
>
> Thanks for your response.
>
> On Tue, May 17, 2016 at 7:20 PM, Greg KH <greg@kroah.com> wrote:
> > On Tue, May 17, 2016 at 06:55:07PM -0400, Jerrin Shaji George wrote:
> >> Hi All,
> >>
> >> I wanted help with a piece of code that I have been working on.
> >>
> >> Please see -
> >>
> >> https://gist.github.com/jerrinsg/333e584d1f65dc95b9f13b61dcebdaa7
> >>
> >> I have written two functions, migrate_to and migrate_back. migrate_to is used
> >> to remove a process from the run queue, and migrate_back is used to insert this
> >> process back into the run queue.
> >>
> >> The gist is taken from a larger project, where we are working on building
> >> a mechanism to support thread migration across heterogeneous processors.
> >> migrate_to_call() will be called by a thread which wants to remove itself from
> >> the run queue (hence, it will pass the current task struct as the migration
> >> argument). Once the other processor completes execution of the assigned task, it
> >> will interrupt the main processor, which runs an interrupt handler, which in
> >> turn calls the migrate_back_call() function. It passes the task struct of the
> >> process that was removed from the run queue earlier to this function.
> >>
> >> This mechanism works fine the first few times, but when this process is repeated
> >> many times in a loop, I am seeing a run queue corruption:
> >> https://gist.github.com/jerrinsg/0ab09cd435d8d2cb6ae692c7e6f4f26b
> >>
> >> Is there anything wrong in the process dequeue or enqueue function that I have
> >> written? Please help!
> >
> > volatile doesn't mean what you think it does, please don't use it in the
> > kernel.
> >
>
> This flag was to be used for synchronization. I will change this.
>
> > And why are you using "raw_spin_lock()"?
>
> I used this after seeing it used in sched/core.c. Can you please let me know if I
> should instead use a different function to lock the run queue?
Ah, I don't know; I don't mess with the scheduler, thankfully :)
> >> Kernel used: Linux 3.13
> >
> > Wow that's obsolete and buggy, why use such an old thing?
>
> This is the codebase that I inherited. Once I get the basic prototype working, I
> will be working to port it to a newer version of the kernel.
Try porting it to a modern kernel and then posting your real patch for
review, that would make things a bit more obvious and probably show your
bug better.
good luck,
greg k-h