- Jan 11, 2009
-
-
Li Zefan authored
Impact: avoid accessing NULL tg.css->cgroup

In commit 0a0db8f5, I removed checking NULL tg.css->cgroup, but I realized I was wrong when I found reading /proc/sched_debug can race with cgroup_create().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Dec 01, 2008
-
-
Arun R Bharadwaj authored
Impact: extend information in /proc/sched_debug

This patch adds uid information in sched_debug for CONFIG_USER_SCHED

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Nov 16, 2008
-
-
Ingo Molnar authored
Luis Henriques reported that with CONFIG_PREEMPT=y + CONFIG_PREEMPT_DEBUG=y + CONFIG_SCHED_DEBUG=y + CONFIG_LATENCYTOP=y enabled, the following warning triggers when using latencytop:

> [ 775.663239] BUG: using smp_processor_id() in preemptible [00000000] code: latencytop/6585
> [ 775.663303] caller is native_sched_clock+0x3a/0x80
> [ 775.663314] Pid: 6585, comm: latencytop Tainted: G W 2.6.28-rc4-00355-g9c7c354 #1
> [ 775.663322] Call Trace:
> [ 775.663343] [<ffffffff803a94e4>] debug_smp_processor_id+0xe4/0xf0
> [ 775.663356] [<ffffffff80213f7a>] native_sched_clock+0x3a/0x80
> [ 775.663368] [<ffffffff80213e19>] sched_clock+0x9/0x10
> [ 775.663381] [<ffffffff8024550d>] proc_sched_show_task+0x8bd/0x10e0
> [ 775.663395] [<ffffffff8034466e>] sched_show+0x3e/0x80
> [ 775.663408] [<ffffffff8031039b>] seq_read+0xdb/0x350
> [ 775.663421] [<ffffffff80368776>] ? security_file_permission+0x16/0x20
> [ 775.663435] [<ffffffff802f4198>] vfs_read+0xc8/0x170
> [ 775.663447] [<ffffffff802f4335>] sys_read+0x55/0x90
> [ 775.663460] [<ffffffff8020c67a>] system_call_fastpath+0x16/0x1b
> ...

This breakage was caused by me via:

  7cbaef9c: sched: optimize sched_clock() a bit

Change the calls to cpu_clock().

Reported-by: Luis Henriques <henrix@sapo.pt>
-
- Nov 11, 2008
-
-
Bharata B Rao authored
Impact: extend /proc/sched_debug info

Since the statistics of a group entity aren't exported directly from the kernel, it is difficult to obtain some of the group statistics. For example, the current method to obtain the exec time of a group entity is not always accurate: one has to read the exec times of all the tasks (/proc/<pid>/sched) in the group and add them up. This method fails (or becomes difficult) if we want to collect stats of a group over a duration in which tasks get created and terminated.

This patch makes it easier to obtain group stats by directly including them in /proc/sched_debug. Stats like group exec time would help user programs (like LTP) to accurately measure the group fairness.

An example output of group stats from /proc/sched_debug:

cfs_rq[3]:/3/a/1
  .exec_clock : 89.598007
  .MIN_vruntime : 0.000001
  .min_vruntime : 256300.970506
  .max_vruntime : 0.000001
  .spread : 0.000000
  .spread0 : -25373.372248
  .nr_running : 0
  .load : 0
  .yld_exp_empty : 0
  .yld_act_empty : 0
  .yld_both_empty : 0
  .yld_count : 4474
  .sched_switch : 0
  .sched_count : 40507
  .sched_goidle : 12686
  .ttwu_count : 15114
  .ttwu_local : 11950
  .bkl_count : 67
  .nr_spread_over : 0
  .shares : 0
  .se->exec_start : 113676.727170
  .se->vruntime : 1592.612714
  .se->sum_exec_runtime : 89.598007
  .se->wait_start : 0.000000
  .se->sleep_start : 0.000000
  .se->block_start : 0.000000
  .se->sleep_max : 0.000000
  .se->block_max : 0.000000
  .se->exec_max : 1.000282
  .se->slice_max : 1.999750
  .se->wait_max : 54.981093
  .se->wait_sum : 217.610521
  .se->wait_count : 50
  .se->load.weight : 2

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
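
As a rough illustration of how a user program might consume these group stats, the sketch below parses text laid out like the example output into one dictionary per cfs_rq section. The names parse_cfs_rq_stats and group_exec_time, and the embedded two-CPU sample, are assumptions made for illustration. The real file contains other sections and its layout differs between kernel versions. Adding up exec_clock across CPUs to get a group total is likewise an assumption of this sketch, not something the patch states.

```python
import re

# Hypothetical sample in the layout of the example output (values shortened).
SAMPLE = """\
cfs_rq[0]:/3/a/1
  .exec_clock : 10.500000
  .nr_running : 1
cfs_rq[3]:/3/a/1
  .exec_clock : 89.598007
  .spread0 : -25373.372248
  .se->wait_count : 50
"""

HEADER = re.compile(r"^cfs_rq\[(\d+)\]:(\S+)$")
FIELD = re.compile(r"^\.(\S+)\s*:\s*(-?\d+(?:\.\d+)?)$")


def parse_cfs_rq_stats(text):
    """Return {(cpu, group_path): {field_name: float}} for each cfs_rq section."""
    stats = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = stats.setdefault(key, {})
        elif current is not None:
            field = FIELD.match(line)
            if field:
                current[field.group(1)] = float(field.group(2))
            else:
                # Anything else is taken to start a different section.
                current = None
    return stats


def group_exec_time(stats, group_path):
    """Add up exec_clock over every CPU's cfs_rq for one group path."""
    return sum(
        fields.get("exec_clock", 0.0)
        for (_cpu, path), fields in stats.items()
        if path == group_path
    )
```

A caller would read /proc/sched_debug once per sampling interval and compare group_exec_time() between samples, which avoids walking /proc/<pid>/sched for tasks that may already have exited.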
-
- Nov 10, 2008
-
-
Peter Zijlstra authored
Impact: clean up and fix debug info printout

While looking over the sched_debug code I noticed that we printed the rq schedstats for every cfs_rq; amend this.

Also change nr_spread_over into an int, and fix a little buglet in min_vruntime printing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Nov 04, 2008
-
-
Li Zefan authored
Impact: cleanup

cfs->tg is initialized in init_tg_cfs_entry() with tg != NULL, and will never be invalidated to NULL. And the underlying cgroup of a valid task_group is always valid.

Same for rt->tg.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Oct 30, 2008
-
-
Li Zefan authored
Impact: change /proc/sched_debug from rw-r--r-- to r--r--r--

/proc/sched_debug is read-only.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Oct 10, 2008
-
-
Lai Jiangshan authored
lock_task_sighand() makes sure task->sighand is protected, so we do not need rcu_read_lock(). [ exec() will take task->sighand->siglock before changing task->sighand! ]

Code that uses rcu_read_lock() _just_ to protect lock_task_sighand() appears only in procfs (and some code in procfs uses lock_task_sighand() without such redundant protection).

Other subsystems may put lock_task_sighand() inside an rcu_read_lock() critical region, but those rcu_read_lock() calls are there to protect "for_each_process()", "find_task_by_vpid()" etc., not lock_task_sighand().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
[ok from Oleg]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
-
- Jun 27, 2008
-
-
Peter Zijlstra authored
show all the schedstats in /debug/sched_debug as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Peter Zijlstra authored
Try again..

Initial commit: 18d95a28
Revert: 6363ca57

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Jun 20, 2008
-
-
Peter Zijlstra authored
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Daniel K." <dk@uw.no>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- May 29, 2008
-
-
Ingo Molnar authored
Yanmin Zhang reported:

Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1. It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito.

With bisect, I located the following patch:

| 18d95a28 is first bad commit
| commit 18d95a28
| Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Date: Sat Apr 19 19:45:00 2008 +0200
|
| sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior.

Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- May 05, 2008
-
-
Peter Zijlstra authored
this replaces the rq->clock stuff (and possibly cpu_clock()).

 - architectures that have an 'imperfect' hardware clock can set CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

 - the 'jiffie' window might be superfluous when we update tick_gtod before the __update_sched_clock() call in sched_clock_tick()

 - cpu_clock() might be implemented as:

     sched_clock_cpu(smp_processor_id())

   if the accuracy proves good enough - how far can TSC drift in a single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- May 01, 2008
-
-
Roman Zippel authored
Rename div64_64 to div64_u64 to make it consistent with the other divide functions, so it clearly includes the type of the divide. Move its definition to math64.h as currently no architecture overrides the generic implementation. They can still override it of course, but the duplicated declarations are avoided.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 29, 2008
-
-
Denis V. Lunev authored
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data are set up before gluing the PDE to the main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 19, 2008
-
-
Ingo Molnar authored
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Peter Zijlstra authored
Add some extra debug output so we can get a better overview of the full hierarchy.

We print the cgroup path after each cfs_rq, so we can see what group we're looking at.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
it's unused.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Mar 19, 2008
-
-
Ingo Molnar authored
improve affine wakeups. Maintain the 'overlap' metric based on CFS's sum_exec_runtime - which means the amount of time a task executes after it wakes up some other task.

Use the 'overlap' for the wakeup decisions: if the 'overlap' is short, it means there's strong workload coupling between this task and the woken up task. If the 'overlap' is large then the workload is decoupled and the scheduler will move them to separate CPUs more easily.

( Also slightly move the preempt_check within try_to_wake_up() - this has no effect on functionality but allows 'early wakeups' (for still-on-rq tasks) to be correctly accounted as well.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Jan 25, 2008
-
-
Arjan van de Ven authored
Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
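
The bookkeeping described above can be sketched in a few lines. This is a minimal user-space model, not the kernel code: the field names mirror the wait_max, wait_sum and wait_count entries that show up in /proc/sched_debug, while the WaitStats class and its record() method are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class WaitStats:
    """Running totals of scheduling delay (hypothetical model)."""

    wait_max: int = 0    # longest single delay seen so far
    wait_sum: int = 0    # accumulated delay
    wait_count: int = 0  # number of delays recorded

    def record(self, delay):
        """Account one delay, given in any consistent time unit."""
        self.wait_max = max(self.wait_max, delay)
        self.wait_sum += delay
        self.wait_count += 1

    @property
    def average(self):
        """Mean delay, derived on demand from the sum and the count."""
        if self.wait_count == 0:
            return 0.0
        return self.wait_sum / self.wait_count
```

Keeping only the sum and the count means the average costs nothing extra to maintain, and a tool can compute it over any interval by subtracting two snapshots.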
-
Guillaume Chazarain authored
We monitor clock overflows, let's also monitor clock underflows.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Dec 30, 2007
-
-
Ingo Molnar authored
Meelis Roos reported these warnings on sparc64:

  CC kernel/sched.o
  In file included from kernel/sched.c:879:
  kernel/sched_debug.c: In function 'nsec_high':
  kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

the debug check in do_div() is over-eager here, because the long long is always positive in these places. Mark this by casting them to unsigned long long.

no change in code output:

   text    data     bss     dec     hex filename
  51471    6582     376   58429    e43d sched.o.before
  51471    6582     376   58429    e43d sched.o.after

md5:
  7f7729c111f185bf3ccea4d542abc049 sched.o.before.asm
  7f7729c111f185bf3ccea4d542abc049 sched.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Nov 28, 2007
-
-
Ingo Molnar authored
clean up overlong line in kernel/sched_debug.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Nov 26, 2007
-
-
Ingo Molnar authored
bump version of kernel/sched_debug.c and remove CFS version information from it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Nov 09, 2007
-
-
Peter Zijlstra authored
we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally.

no functionality changed.

[ mingo@elte.hu: some fixlets. ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Oct 25, 2007
-
-
Peter Zijlstra authored
Lockdep noticed that this lock can also be taken from hardirq context, and can thus not unconditionally disable/enable irqs.

WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
 [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
 [show_trace+18/32] show_trace+0x12/0x20
 [dump_stack+22/32] dump_stack+0x16/0x20
 [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
 [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
 [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
 [show_state_filter+326/368] show_state_filter+0x146/0x170
 [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
 [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
 [handle_sysrq+40/64] handle_sysrq+0x28/0x40
 [kbd_event+1045/1680] kbd_event+0x415/0x690
 [input_pass_event+206/208] input_pass_event+0xce/0xd0
 [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
 [input_event+95/112] input_event+0x5f/0x70
 [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
 [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
 [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
 [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
 [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
 [do_IRQ+64/128] do_IRQ+0x40/0x80
 [common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- Oct 18, 2007
-
-
Ken Chen authored
schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in a production system is quite high, largely due to the number of events it collects and also due to its large memory footprint.

Most of the fields probably don't need to be full 64-bit on a 64-bit arch. Rolling over 4 billion events will most likely take a long time, and user space tools can be made to accommodate that. I'm proposing that the kernel cut back most of the variable widths on 64-bit systems. (Note: the following patch doesn't affect 32-bit systems.)

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
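
One way a user space tool could accommodate the rollover is to take the difference between two samples modulo the counter width. The sketch below assumes an unsigned 32-bit counter that wraps at most once between samples; counter_delta and COUNTER_BITS are hypothetical names for the example, not part of any existing tool.

```python
COUNTER_BITS = 32                   # assumed width of the narrowed counters
COUNTER_MODULUS = 1 << COUNTER_BITS


def counter_delta(previous, current, modulus=COUNTER_MODULUS):
    """Events between two samples of a wrapping unsigned counter.

    Only correct if the counter wrapped at most once in between, so the
    sampling interval has to be short relative to the event rate.
    """
    return (current - previous) % modulus
```

For example, counter_delta(4294967290, 5) gives 11: six events up to the wrap at 2**32, then five more.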
-
- Oct 15, 2007
-
-
Arjan van de Ven authored
In general, struct file_operations are const in the kernel, to not have false cacheline sharing and to catch bugs at compiletime with accidental writes to them.

The new scheduler code introduces a new non-const one; fix this up.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
add new migration statistics when SCHED_DEBUG and SCHEDSTATS are enabled. Available in /proc/<PID>/sched.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
increase width of debug line - in preparation of more debugging info.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Dhaval Giani authored
Add tunables in sysfs to modify a user's cpu share.

A directory is created in sysfs for each new user in the system.

  /sys/kernel/uids/<uid>/cpu_share

Reading this file returns the cpu shares granted for the user. Writing into this file modifies the cpu share for the user. Only an administrator is allowed to modify a user's cpu share.

Ex:
  # cd /sys/kernel/uids/
  # cat 512/cpu_share
  1024
  # echo 2048 > 512/cpu_share
  # cat 512/cpu_share
  2048
  #

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
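
The shell session in this commit message could also be driven from a script. The sketch below is only an illustration: the helper names are hypothetical, the directory layout is taken from the message, and the interface may not exist on the kernel you run, so the base directory is a parameter. Writing requires administrator privileges, as the message notes.

```python
from pathlib import Path

DEFAULT_BASE = "/sys/kernel/uids"  # layout as described in the commit message


def cpu_share_path(uid, base=DEFAULT_BASE):
    """Path of the cpu_share file for one user id."""
    return Path(base) / str(uid) / "cpu_share"


def read_cpu_share(uid, base=DEFAULT_BASE):
    """Return the cpu shares currently granted to the user."""
    return int(cpu_share_path(uid, base).read_text().strip())


def write_cpu_share(uid, shares, base=DEFAULT_BASE):
    """Set the user's cpu shares (needs administrator rights on a real system)."""
    cpu_share_path(uid, base).write_text(f"{shares}\n")
```

Passing a scratch directory as base lets the helpers be exercised without touching sysfs.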
-
Ingo Molnar authored
cleanup: rename task_grp to task_group. No need to save two characters and 'grp' is annoying to read.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Srivatsa Vaddagiri authored
Fix coding style issues reported by Randy Dunlap and others

Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Peter Zijlstra authored
speed up and simplify vslice calculations.

[ From: Mike Galbraith <efault@gmx.de>: build fix ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
rename all 'cnt' fields and variables to the less yucky 'count' name.

yuckage noticed by Andrew Morton.

no change in code, other than the /proc/sched_debug bkl_count string got a bit larger:

   text    data     bss     dec     hex filename
  38236    3506      24   41766    a326 sched.o.before
  38240    3506      24   41770    a32a sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Peter Zijlstra authored
debug feature: check how well we schedule within a reasonable vruntime 'spread' range. (note that CPU overload can increase the spread, so this is not a hard condition, but normal loads should be within the spread.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
-
Ingo Molnar authored
more width for parameter printouts in /proc/sched_debug.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Ingo Molnar authored
print the current value of all tunables in /proc/sched_debug output.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
S.Çağlar Onur authored
build fix for the SCHED_DEBUG && !SCHEDSTATS case.

Signed-off-by: S.Ceglar Onur <caglar@pardus.org.tr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-
Ingo Molnar authored
add per task and per rq BKL usage statistics.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
-