10940 Commits

aef1b9cef7 Merge commit 'v2.6.37' into perf/core
Merge reason: Add the final .37 tree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:22:10 +01:00
6bf4123760 sched: Change wait_for_completion_*_timeout() to return a signed long
wait_for_completion_*_timeout() can return:

   0: if the wait timed out
 -ve: if the wait was interrupted
 +ve: if the completion was completed.

As they currently return an 'unsigned long', the last two cases
are not easily distinguished, which can easily result in buggy
code, as is the case for the recently added
wait_for_completion_interruptible_timeout() call in
net/sunrpc/cache.c.

So change them both to return 'long'.  As MAX_SCHEDULE_TIMEOUT
is LONG_MAX, a large +ve return value should never overflow.
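
A minimal sketch of the intended calling pattern after this change
(hypothetical caller, not part of this patch):

	long ret;

	ret = wait_for_completion_interruptible_timeout(&done, HZ);
	if (ret < 0)		/* interrupted by a signal */
		return ret;
	if (ret == 0)		/* wait timed out */
		return -ETIMEDOUT;
	/* ret > 0: completed, 'ret' jiffies were left on the timeout */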

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: J.  Bruce Fields <bfields@fieldses.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110105125016.64ccab0e@notabene.brown>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:15:50 +01:00
27066fd484 Merge commit 'v2.6.37' into sched/core
Merge reason: Merge the final .37 tree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:14:46 +01:00
101e5f77bf sched, autogroup: Fix reference leak
The cgroup exit mess also uncovered a struct autogroup reference leak.
copy_process() was simply freeing vs putting the signal_struct,
stranding a reference.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
LKML-Reference: <1293784350.6839.2.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 15:10:36 +01:00
4f8219875a sched, autogroup: Fix potential access to freed memory
Oleg pointed out that the /proc interface kref_get() usage may race with
the final put during autogroup_move_group().  A signal->autogroup
assignment may be in flight when the /proc interface dereferences it,
leaving it taking a reference to an already dead group.
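
An illustrative sketch of how such a race can be closed: take the
reference under the task's sighand lock so the ->autogroup assignment
cannot slip in between the load and the kref_get() (sketch, not
necessarily the exact patch):

	static inline struct autogroup *autogroup_task_get(struct task_struct *p)
	{
		struct autogroup *ag;
		unsigned long flags;

		if (!lock_task_sighand(p, &flags))
			return autogroup_kref_get(&autogroup_default);

		ag = autogroup_kref_get(p->signal->autogroup);
		unlock_task_sighand(p, &flags);
		return ag;
	}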

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1292508592.5940.28.camel@maggy.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 15:10:34 +01:00
6706125e29 sched: Remove redundant CONFIG_CGROUP_SCHED ifdef
CONFIG_[FAIR|RT]_GROUP_SCHED always means CONFIG_CGROUP_SCHED

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1293803938-8157-1-git-send-email-yong.zhang0@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 09:42:03 +01:00
25e41933b5 perf: Clean up power events by introducing new, more generic ones
Add these new power trace events:

 power:cpu_idle
 power:cpu_frequency
 power:machine_suspend

The old C-state/idle accounting events:

  power:power_start
  power:power_end

now have a replacement (we are still keeping the old
tracepoints for compatibility):

  power:cpu_idle

and

  power:power_frequency

is replaced with:

  power:cpu_frequency

power:machine_suspend is newly introduced.

Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.

The type= field got removed from both; it was never
used, and the type is distinguished by the event itself.

The perf timechart userspace tool is adjusted in a separate patch.
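
For illustration, an idle driver would emit the new event roughly like
this (hypothetical driver code; do_enter_idle() is a placeholder, and
PWR_EVENT_EXIT marks leaving the state):

	trace_cpu_idle(1, smp_processor_id());	/* entering state C1 */
	do_enter_idle();			/* hypothetical idle entry */
	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());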

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
2011-01-04 08:16:54 +01:00
61a0d49c33 perf: Do not export power_frequency, but power_start event
power_frequency moved to drivers/cpufreq/cpufreq.c, which has
to be compiled in, so there is no need to export it.

intel_idle can be a module though...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Jean Pihet <j-pihet@ti.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-2-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
2011-01-04 08:16:54 +01:00
cc22219699 Merge commit 'v2.6.37-rc8' into perf/core
Merge reason: pick up latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 08:08:54 +01:00
551423748a watchdog: Improve initialisation error message and documentation
The error message 'NMI watchdog failed to create perf event...'
does not make it clear that this is a fatal error for the
watchdog.  It also currently prints the error value as a
pointer, rather than extracting the error code with PTR_ERR().
Fix that.

Add a note to the description of the 'nowatchdog' kernel
parameter to associate it with this message.

Reported-by: Cesare Leonardi <celeonar@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: 599368@bugs.debian.org
Cc: 608138@bugs.debian.org
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: <stable@kernel.org> # .37.x and later
LKML-Reference: <1294009362.3167.126.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-03 05:25:52 +01:00
4ef9e11d68 fix freeing user_struct in user cache
When racing on adding into the user cache, the new user_struct
allocated from the mm slab is freed without putting the user namespace.

Since a reference to the user namespace was already taken with a get,
a matching put has to be issued.
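
A sketch of the fixed error path in alloc_uid() (illustrative):

	if (up) {
		/* Somebody else beat us to it: drop the namespace
		 * reference taken for 'new' before freeing it. */
		put_user_ns(new->user_ns);
		kmem_cache_free(uid_cachep, new);
	}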

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-29 11:31:38 -08:00
6f7f41851c Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ring_buffer: Off-by-one and duplicate events in ring_buffer_read_page
2010-12-28 15:54:24 -08:00
a4790c9457 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: print out alloc information with KERN_DEBUG instead of KERN_INFO
  kthread_work: make lockdep happy
2010-12-24 12:59:09 -08:00
a2867e08c8 PM / Wakeup: Replace pm_check_wakeup_events() with pm_wakeup_pending()
To avoid confusion with the meaning and return value of
pm_check_wakeup_events(), replace it with pm_wakeup_pending(), which
works the other way around (i.e. returns true when the system-wide
power transition should be aborted).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-12-24 15:02:42 +01:00
5262a47502 PM / Hibernate: When failed, in_suspend should be reset
When hibernation fails due to an error in swsusp_write() called by
hibernate(), power_down() is skipped and hibernate() returns.  When
hibernate() is called again (probably after fixing things up so that
swsusp_write() won't fail again), in_suspend should be 0 before the
"in_suspend = 1" assignment in create_image() runs.  However, because
hibernate() did not reset "in_suspend" after the failure, it is
already 1.

This patch fixes this inconsistency of the "in_suspend" value.
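
A sketch of the shape of the fix in hibernate() (illustrative):

	error = swsusp_write(flags);
	swsusp_free();
	if (!error)
		power_down();
	in_suspend = 0;		/* reset so the next attempt starts clean */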

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-12-24 15:02:40 +01:00
5729c63a51 PM / Hibernate: hibernation_ops->leave should be checked too
Because hibernate() calls hibernation_ops->leave() without checking
whether hibernation_ops->leave is NULL or not, hibernation_set_ops()
should WARN_ON if hibernation_ops->leave is NULL.

This patch adds one more condition to check hibernation_ops->leave.
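
A sketch of the extended sanity check in hibernation_set_ops()
(illustrative; the full set of checked callbacks may differ):

	if (ops && !(ops->begin && ops->end && ops->pre_snapshot
	    && ops->prepare && ops->finish && ops->enter
	    && ops->pre_restore && ops->restore_cleanup
	    && ops->leave)) {		/* ->leave is now checked too */
		WARN_ON(1);
		return;
	}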

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-12-24 15:02:40 +01:00
8cfe400ca5 Freezer: Fix a race during freezing of TASK_STOPPED tasks
After calling freeze_task(), try_to_freeze_tasks() sees whether the
task is stopped or traced and, if so, considers it frozen;
however, nothing guarantees that either the task being frozen sees
TIF_FREEZE or the freezer sees the TASK_STOPPED -> TASK_RUNNING
transition.  The task being frozen may wake up without seeing
TIF_FREEZE while the freezer fails to notice the transition and
believes the task is still stopped.

This patch fixes the race by making freeze_task() always go through
fake_signal_wake_up() for applicable tasks.  The function goes through
the target task's scheduler lock and thus guarantees that either the
target sees TIF_FREEZE or try_to_freeze_task() sees TASK_RUNNING.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-12-24 15:02:40 +01:00
133f1128b2 PM: Use proper ccflag flag in kernel/power/Makefile
Use the ccflags-$(...) form instead of EXTRA_CFLAGS, because
EXTRA_CFLAGS is deprecated and should now be replaced, according to
Documentation/kbuild/makefiles.txt.
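
The change amounts to something like the following (sketch; the exact
flag value depends on the Makefile in question):

	# before (deprecated):
	ifeq ($(CONFIG_PM_DEBUG),y)
	EXTRA_CFLAGS	+=	-DDEBUG
	endif

	# after:
	ccflags-$(CONFIG_PM_DEBUG)	:= -DDEBUG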

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2010-12-24 15:02:40 +01:00
e1e3592735 ring_buffer: Off-by-one and duplicate events in ring_buffer_read_page
Fix two related problems in the event-copying loop of
ring_buffer_read_page.

The loop condition for copying events is off-by-one.
"len" is the remaining space in the caller-supplied page.
"size" is the size of the next event (or two events).
If len == size, then there is just enough space for the next event.

size was set to rb_event_ts_length, which may include the size of two
events if the first event is a time-extend, in order to assure time-
extends are kept together with the event after it. However,
rb_advance_reader always advances by one event. This would result in the
event after any time-extend being duplicated. Instead, get the size of
a single event for the memcpy, but use rb_event_ts_length for the loop
condition.
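
The corrected loop has roughly this shape (illustrative sketch; helper
names as in kernel/trace/ring_buffer.c):

	size = rb_event_ts_length(event);	/* may cover two events */
	do {
		unsigned int elen = rb_event_length(event); /* one event */

		memcpy(bpage->data + pos, rpage->data + rpos, elen);
		len -= elen;
		pos += elen;

		rb_advance_reader(cpu_buffer);	/* advances ONE event */
		rpos = reader->read;
		event = rb_reader_event(cpu_buffer);
		size = rb_event_ts_length(event);
	} while (len >= size);	/* '>=', not '>': len == size still fits */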

Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1293064704-8101-1-git-send-email-dhsharp@google.com>
LKML-Reference: <AANLkTin7nLrRPc9qGjdjHbeVDDWiJjAiYyb-L=gH85bx@mail.gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-12-23 12:09:30 -05:00
94462ad3b1 module: Move RO/NX module protection to after ftrace module update
The commit:

84e1c6bb38eb318e456558b610396d9f1afaabf0
x86: Add RO/NX protection for loadable kernel modules

Broke the function tracer with this output:

------------[ cut here ]------------
WARNING: at kernel/trace/ftrace.c:1014 ftrace_bug+0x114/0x171()
Hardware name: Precision WorkStation 470
Modules linked in: i2c_core(+)
Pid: 86, comm: modprobe Not tainted 2.6.37-rc2+ #68
Call Trace:
 [<ffffffff8104e957>] warn_slowpath_common+0x85/0x9d
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff8104e989>] warn_slowpath_null+0x1a/0x1c
 [<ffffffff810a9dfe>] ftrace_bug+0x114/0x171
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff810aa0db>] ftrace_process_locs+0x1ae/0x274
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff810aa29e>] ftrace_module_notify+0x39/0x44
 [<ffffffff814405cf>] notifier_call_chain+0x37/0x63
 [<ffffffff8106e054>] __blocking_notifier_call_chain+0x46/0x5b
 [<ffffffff8106e07d>] blocking_notifier_call_chain+0x14/0x16
 [<ffffffff8107ffde>] sys_init_module+0x73/0x1f3
 [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b
---[ end trace 2aff4f4ca53ec746 ]---
ftrace faulted on writing [<ffffffffa00026db>]
__process_new_adapter+0x7/0x34 [i2c_core]

The cause was that the module text was set to read only before ftrace
could convert the calls to mcount to nops. Thus, the conversions failed
due to not being able to write to the text locations.

The simple fix is to move setting the module to read only after the
module notifiers are called (where ftrace sets the module mcounts to nops).

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-12-23 09:56:00 -05:00
394f4528c5 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-12-23 12:57:04 +01:00
26e20a108c Merge commit 'v2.6.37-rc7' into x86/security 2010-12-23 09:48:41 +01:00
4be2c95d1f taskstats: pad taskstats netlink response for alignment issues on ia64
The taskstats structure is internally aligned on 8 byte boundaries but the
layout of the aggregate reply, with two NLA headers and the pid (each 4
bytes), actually force the entire structure to be unaligned.  This causes
the kernel to issue unaligned access warnings on some architectures like
ia64.  Unfortunately, some software out there doesn't properly unroll the
NLA packet and assumes that the start of the taskstats structure will
always be 20 bytes from the start of the netlink payload.  Aligning the
start of the taskstats structure breaks this software, which we don't
want.  So, for now the alignment only happens on architectures that
require it and those users will have to update to fixed versions of those
packages.  Space is reserved in the packet only when needed.  This ifdef
should be removed in several years e.g.  2012 once we can be confident
that fixed versions are installed on most systems.  We add the padding
before the aggregate since the aggregate is already a defined type.

Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
previously addressed the alignment issues by padding out the pid field.
This was supposed to be a compatible change but the circumstances
described above mean that it wasn't.  This patch backs out that change,
since it was a hack, and introduces a new NULL attribute type to provide
the padding.  Padding the response with 4 bytes avoids allocating an
aligned taskstats structure and copying it back.  Since the structure
weighs in at 328 bytes, it's too big to do it on the stack.
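
One way to insert the padding (sketch; the guard name is illustrative,
and the attribute carries no payload: its 4-byte header is the padding):

#ifdef TASKSTATS_NEEDS_PADDING
	if (!nla_reserve(skb, TASKSTATS_TYPE_NULL, 0))
		goto err;
#endif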

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: Brian Rogers <brian@xyzw.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Guillaume Chazarain <guichaz@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-22 19:43:34 -08:00
b5776c4a6d Fix rounding in clocks_calc_mult_shift()
Russell King reports:
| On the ARM dev boards, we have a 32-bit counter running at 24MHz.  Calling
| clocks_calc_mult_shift(&mult, &shift, 24MHz, NSEC_PER_SEC, 60) gives
| us a multiplier of 2796202666 and a shift of 26.
|
| Over a large counter delta, this produces an error - lets take a count
| from 362976315 to 4280663372:
|
| (4280663372-362976315) * 2796202666 / 2^26 - (4280663372-362976315) * (1000/24)
|  => -38.91872422891230269990
|
| Can we do better?
|
| (4280663372-362976315) * 2796202667 / 2^26 - (4280663372-362976315) * (1000/24)
| 19.45936211449532822051
|
| which is about twice as good as the 2796202666 multiplier.
|
| Looking at the equivalent divisions obtained, 2796202666 / 2^26 gives
| 41.66666665673255920410ns per tick, whereas 2796202667 / 2^26 gives
| 41.66666667163372039794ns.  The actual value wanted is 1000/24 =
| 41.66666666666666666666ns.

Fix this by ensuring we round to nearest when calculating the
multiplier.
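
The fix adds half the divisor before the division, turning truncation
into round-to-nearest (sketch of the inner computation):

	tmp = (u64) to << sft;
	tmp += from / 2;	/* round to nearest rather than truncate */
	do_div(tmp, from);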

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Tested-by: Eric Miao <eric.y.miao@gmail.com>
Tested-by: Olof Johansson <olof@lixom.net>
Tested-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-22 22:44:43 +00:00
43b210139a hrtimer: fix a typo in comment
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-12-22 19:01:47 +01:00
4b7bd36470 Merge branch 'master' into for-next
Conflicts:
	MAINTAINERS
	arch/arm/mach-omap2/pm24xx.c
	drivers/scsi/bfa/bfa_fcpim.c

Needed to update to apply fixes for which the old branch was too
outdated.
2010-12-22 18:57:02 +01:00
9fb67204d7 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-12-22 17:12:08 +01:00
6c529a266b Merge commit 'v2.6.37-rc7' into perf/core
Merge reason: Pick up the latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-22 11:53:23 +01:00
4f32e9b1f8 kthread_work: make lockdep happy
The spinlock in kthread_worker and the wait_queue_head in kthread_work
should both be visible to lockdep, so change the interface to make it
suitable for CONFIG_LOCKDEP.

tj: comment update
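
The usual idiom is to give every initialization site its own static
lock_class_key via a macro wrapper, along these lines (sketch):

	#define init_kthread_worker(worker)				\
		do {							\
			static struct lock_class_key __key;		\
			__init_kthread_worker((worker),			\
					"("#worker")->lock", &__key);	\
		} while (0)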

Reported-by: Nicolas <nicolas.mailhot@laposte.net>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Andy Walls <awalls@md.metrocast.net>
Tested-by: Andy Walls <awalls@md.metrocast.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-22 10:27:53 +01:00
c8efcc2589 workqueue: allow chained queueing during destruction
Currently, destroy_workqueue() makes the workqueue deny all new
queueing by setting WQ_DYING and flushes the workqueue once before
proceeding with destruction; however, there are cases where work items
queue more related work items.  Currently, such users need to
explicitly flush the workqueue multiple times depending on the
possible depth of such chained queueing.

This patch updates the queueing path such that a work item can queue
further work items on the same workqueue even when WQ_DYING is set.
The flush on destruction is automatically retried until the workqueue
is empty.  This guarantees that the workqueue is empty on destruction
while allowing chained queueing.

The flush retry logic whines if it takes too many retries to drain the
workqueue.
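
As an illustration, a self-requeueing work item like the following can
now be torn down with a single destroy_workqueue() call (hypothetical
example):

	static struct workqueue_struct *wq;
	static atomic_t todo = ATOMIC_INIT(8);

	static void chain_fn(struct work_struct *work)
	{
		if (atomic_dec_return(&todo) > 0)
			queue_work(wq, work);	/* chained queueing */
	}
	static DECLARE_WORK(chain_work, chain_fn);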

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
2010-12-20 19:32:04 +01:00
de5e9d5820 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Remove debugging check
2010-12-20 09:05:26 -08:00
050c6c9b89 sched: Remove debugging check
Linus reported that the new warning introduced by commit f26f9aff6aaf
"Sched: fix skip_clock_update optimization" triggers. The need_resched
flag can be set by other CPUs asynchronously so this debug check is
bogus - remove it.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <AANLkTinJ8hAG1TpyC+CSYPR47p48+1=E7fiC45hMXT_1@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 23:24:27 +01:00
55ec86f848 Merge branches 'x86-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32: Make sure we can map all of lowmem if we need to
  x86, vt-d: Handle previous faults after enabling fault handling
  x86: Enable the intr-remap fault handling after local APIC setup
  x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
  x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
  x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem()
  bootmem: Add alloc_bootmem_align()
  x86, gcc-4.6: Use gcc -m options when building vdso
  x86: HPET: Chose a paranoid safe value for the ETIME check
  x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Fix off by one in perf_swevent_init()
  perf: Fix duplicate events with multiple-pmu vs software events
  ftrace: Have recordmcount honor endianness in fn_ELF_R_INFO
  scripts/tags.sh: Add magic for trace-events
  tracing: Fix panic when lseek() called on "trace" opened for writing
2010-12-19 10:44:54 -08:00
21228e4557 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix the irqtime code for 32bit
  sched: Fix the irqtime code to deal with u64 wraps
  nohz: Fix get_next_timer_interrupt() vs cpu hotplug
  Sched: fix skip_clock_update optimization
  sched: Cure more NO_HZ load average woes
2010-12-19 10:37:37 -08:00
19e5eebb8e sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight
Mike Galbraith reported poor interactivity[*] when the new shares distribution
code was combined with autogroups.

The root cause turns out to be a mis-ordering of accounting accrued execution
time and shares updates.  Since update_curr() is issued hierarchically,
updating the parent entity weights to reflect child enqueue/dequeue results in
the parent's unaccounted execution time then being accrued (vs vruntime) at the
new weight as opposed to the weight present at accumulation.

While this doesn't have much effect on processes with timeslices that cross a
tick, it is particularly problematic for an interactive process (e.g. Xorg)
which incurs many (tiny) timeslices.  In this scenario almost all updates are
at dequeue which can result in significant fairness perturbation (especially if
it is the only thread, resulting in potential {tg->shares, MIN_SHARES}
transitions).

Correct this by ensuring unaccounted time is accumulated prior to manipulating
an entity's weight.

[*] http://xkcd.com/619/ is perversely Nostradamian here.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20101216031038.159704378@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:36:30 +01:00
43365bd7ff sched: Move periodic share updates to entity_tick()
Long running entities that do not block (dequeue) require periodic updates to
maintain accurate share values.  (Note: group entities with several threads are
quite likely to be non-blocking in many circumstances).

By virtue of being long-running, however, we will see entity ticks
(otherwise the required update occurs in dequeue/put and we are done).
Thus we can move the detection (and associated work) for these updates
into the periodic path.

This restores the 'atomicity' of update_curr() with respect to accounting.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101216031038.067028969@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:36:22 +01:00
ca680888d5 Merge commit 'v2.6.37-rc6' into sched/core
Merge reason: Update to the latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:35:14 +01:00
46bdfe6a50 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86: avoid high BIOS area when allocating address space
  x86: avoid E820 regions when allocating address space
  x86: avoid low BIOS area when allocating address space
  resources: add arch hook for preventing allocation in reserved areas
  Revert "resources: support allocating space within a region from the top down"
  Revert "PCI: allocate bus resources from the top down"
  Revert "x86/PCI: allocate space from the end of a region, not the beginning"
  Revert "x86: allocate space within a region top-down"
  Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
  PCI: Update MCP55 quirk to not affect non HyperTransport variants
2010-12-18 10:13:24 -08:00
20b876918c irq_work: Use per cpu atomics instead of regular atomics
The irq work queue is a per-cpu object, and it is sufficient for
synchronization if per-cpu atomics are used.  Doing so simplifies
the code and reduces its overhead.

Before:

christoph@linux-2.6$ size kernel/irq_work.o
   text	   data	    bss	    dec	    hex	filename
    451	      8	      1	    460	    1cc	kernel/irq_work.o

After:

christoph@linux-2.6$ size kernel/irq_work.o 
   text	   data	    bss	    dec	    hex	filename
    438	      8	      1	    447	    1bf	kernel/irq_work.o
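
The conversion replaces an explicit per-cpu address calculation plus a
generic atomic with a single per-cpu atomic, e.g. (illustrative):

	/* before */
	oldlist = cmpxchg(&__get_cpu_var(irq_work_list), NULL, entry);

	/* after */
	oldlist = this_cpu_cmpxchg(irq_work_list, NULL, entry);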

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Christoph Lameter <cl@linux.com>
2010-12-18 15:54:48 +01:00
b52573d279 rcu: reduce __call_rcu()-induced contention on rcu_node structures
When the current __call_rcu() function was written, the expedited
APIs did not exist.  The __call_rcu() implementation therefore went
to great lengths to detect the end of old grace periods and to start
new ones, all in the name of reducing grace-period latency.  Now the
expedited APIs do exist, and the usage of __call_rcu() has increased
considerably.  This commit therefore causes __call_rcu() to avoid
worrying about grace periods unless there are a large number of
RCU callbacks stacked up on the current CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:49 -08:00
0209f6490b rcu: limit rcu_node leaf-level fanout
Some recent benchmarks have indicated possible lock contention on the
leaf-level rcu_node locks.  This commit therefore limits the number of
CPUs per leaf-level rcu_node structure to 16, in other words, there
can be at most 16 rcu_data structures fanning into a given rcu_node
structure.  Prior to this, the limit was 32 on 32-bit systems and 64 on
64-bit systems.

Note that the fanout of non-leaf rcu_node structures is unchanged.  The
organization of accesses to the rcu_node tree is such that references
to non-leaf rcu_node structures are much less frequent than to the
leaf structures.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:20 -08:00
121dfc4b3e rcu: fine-tune grace-period begin/end checks
Use the CPU's bit in rnp->qsmask to determine whether or not the CPU
should try to report a quiescent state.  Handle overflow in the check
for rdp->gpnum having fallen behind.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:20 -08:00
5ff8e6f053 rcu: Keep gpnum and completed fields synchronized
When a CPU that was in an extended quiescent state wakes
up and catches up with grace periods that remote CPUs
completed on its behalf, we update the completed field
but not gpnum, which keeps a stale value of a backward
grace period ID.

Later, note_new_gpnum() will interpret the shift between
the local CPU and the node grace period ID as some new grace
period to handle, and will then start to hunt for a quiescent
state.

But if every grace period has already been completed, this
interpretation becomes broken, and we'll be stuck in clusters
of spurious softirqs because rcu_report_qs_rdp() will make
this broken state run into an infinite loop.

The solution, as suggested by Lai Jiangshan, is to ensure that
the gpnum and completed fields are well synchronized when we catch
up with grace periods completed on our behalf by other CPUs.
This way we won't start noting spurious new grace periods.
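
The synchronization is essentially (sketch):

	rdp->completed = rnp->completed;
	if (ULONG_CMP_LT(rdp->gpnum, rdp->completed))
		rdp->gpnum = rdp->completed;	/* don't leave gpnum stale */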

Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:19 -08:00
20377f32dc rcu: Stop chasing QS if another CPU did it for us
When a CPU is idle and other CPUs handled its extended
quiescent state to complete grace periods on its behalf,
it will catch up with the completed grace period numbers
when it wakes up.

But at this point there might be no more grace periods to
complete; still, the woken CPU keeps its stale qs_pending
value and will then continue to chase quiescent states even
though that's not needed anymore.

This results in clusters of spurious softirqs until a new
real grace period is started, because if we continue to
chase quiescent states after having completed every grace
period, rcu_report_qs_rdp() is puzzled and makes that
state run into infinite loops.

As suggested by Lai Jiangshan, just reset qs_pending if
someone completed every grace periods on our behalf.

Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:18 -08:00
e27fc9641e rcu: increase synchronize_sched_expedited() batching
The fix in commit #6a0cc49 requires more than three concurrent instances
of synchronize_sched_expedited() before batching is possible.  This
patch uses a ticket-counter-like approach that is also not unrelated to
Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
when there are only two concurrent instances of synchronize_sched_expedited().

This commit builds on Tejun's original posting, which may be found at
http://lkml.org/lkml/2010/11/9/204, adding memory barriers, avoiding
overflow of signed integers (other than via atomic_t), and fixing the
detection of batching.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-12-17 12:34:08 -08:00
fcb119183c resources: add arch hook for preventing allocation in reserved areas
This adds arch_remove_reservations(), which an arch can implement if it
needs to protect part of the address space from allocation.

Sometimes that can be done by just putting a region in the resource tree,
but there are cases where that doesn't work well.  For example, x86 BIOS
E820 reservations are not related to devices, so they may overlap part of,
all of, or more than a device resource, so they may not end up at the
correct spot in the resource tree.
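
The hook follows the usual weak-default pattern: the core supplies an
empty weak definition that an architecture may override (sketch):

	/* kernel/resource.c: default, a no-op */
	void __weak arch_remove_reservations(struct resource *avail)
	{
	}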

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-17 10:01:09 -08:00
c0f5ac5426 Revert "resources: support allocating space within a region from the top down"
This reverts commit e7f8567db9a7f6b3151b0b275e245c1cef0d9c70.

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-17 10:00:59 -08:00
cd85fc58cd taskstats: Use this_cpu_ops
Use this_cpu_inc_return in one place and avoid ugly __raw_get_cpu in
another.

V3->V4:
	- Fix off by one.

V4-V4f:
	- Use &listener_array

Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-17 15:18:05 +01:00
275c8b9328 Merge branch 'this_cpu_ops' into for-2.6.38 2010-12-17 15:16:46 +01:00
909ea96468 core: Replace __get_cpu_var with __this_cpu_read if not used for an address.
__get_cpu_var() can be replaced with this_cpu_read and will then use a
single read instruction with implied address calculation to access the
correct per cpu instance.

However, the address of a per cpu variable passed to __this_cpu_read()
cannot be determined (since it's an implied address conversion through
segment prefixes).  Therefore apply this only to uses of __get_cpu_var
where the address of the variable is not used.
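
For example (sketch; the variable names are illustrative):

	/* OK to convert: only the value is needed */
	val = __get_cpu_var(counter);		/* before */
	val = __this_cpu_read(counter);		/* after  */

	/* not converted: the address escapes, keep __get_cpu_var() */
	list_add(&work->entry, &__get_cpu_var(pending_list));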

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Hugh Dickins <hughd@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-17 15:07:19 +01:00