It seems to hurt performance in real life. Yes, the inode will be used
later, but the conditional doesn't seem to predict all that well
(negative dentries are not uncommon) and it looks like the cost of
prefetching is simply higher than depending on the cache doing the right
thing.
As usual.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The compiler, at least for ix86 and m68k, validly warns that the
comparison:
next <= (loff_t)-1
is always true (and it's always true also for x86-64 and probably all
other arches - as long as pgoff_t isn't wider than loff_t). The
intention appears to be to avoid wrapping of "next", so rather than
eliminating the pointless comparison, fix the loop to indeed get exited
when "next" would otherwise wrap.
On m68k the following warning is observed:
fs/fscache/page.c: In function '__fscache_uncache_all_inode_pages':
fs/fscache/page.c:979: warning: comparison is always false due to limited range of data type
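As an illustration of the idea (a minimal sketch, not the fscache code itself), the loop below uses a hypothetical 8-bit index type in place of pgoff_t and a made-up helper name, visit_from(). It terminates when incrementing the index wraps it to zero, rather than relying on a comparison that can never be false:

```c
#include <stdint.h>

/* Hypothetical narrow stand-in for pgoff_t, chosen so the wrap is easy to see. */
typedef uint8_t small_index_t;

/*
 * Visit every index from 'start' up to the largest representable value
 * exactly once and return how many were visited.  A test such as
 * "next <= (small_index_t)-1" is always true, so the loop stops when
 * ++next wraps back to zero instead.
 */
static unsigned int visit_from(small_index_t start)
{
	small_index_t next = start;
	unsigned int visited = 0;

	do {
		visited++;		/* process index 'next' here */
	} while (++next);		/* yields 0 once 'next' wraps */

	return visited;
}
```

Starting at 250 the loop visits 250..255 and then exits, because the increment of 255 wraps to 0.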
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
"entity_key()" is only used in "__enqueue_entity()" and
its only function is to subtract the group's min_vruntime from a
task's vruntime.
Before this patch an rbtree enqueue decision was made by
comparing two tasks in the style:
"if (entity_key(cfs_rq, se) < entity_key(cfs_rq, entry))"
which would be
"if (se->vruntime-cfs_rq->min_vruntime < entry->vruntime-cfs_rq->min_vruntime)"
or (after cancelling cfs_rq->min_vruntime on both sides)
"if (se->vruntime < entry->vruntime)"
which is
"if (entity_before(se, entry))"
So we do not need "entity_key()".
If "entity_before()" is inline we will also save one subtraction (only one,
because "entity_key(cfs_rq, se)" was cached in "key")
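A minimal sketch of why the two comparisons agree, using a simplified stand-in type rather than the real struct sched_entity. The signed-difference form of entity_before() is assumed here to match the scheduler's definition, and keys_agree() is a hypothetical helper that exists only for this illustration:

```c
#include <stdint.h>

/* Simplified stand-in for struct sched_entity. */
struct entity {
	uint64_t vruntime;
};

/* Key relative to the queue's min_vruntime, as entity_key() computed it. */
static inline int64_t entity_key(uint64_t min_vruntime, const struct entity *se)
{
	return (int64_t)(se->vruntime - min_vruntime);
}

/* Wraparound-safe "a runs before b" test on the raw vruntimes. */
static inline int entity_before(const struct entity *a, const struct entity *b)
{
	return (int64_t)(a->vruntime - b->vruntime) < 0;
}

/* Hypothetical helper: do both forms order the two entities the same way? */
static int keys_agree(uint64_t min_vruntime, uint64_t va, uint64_t vb)
{
	struct entity a = { va }, b = { vb };
	int by_key = entity_key(min_vruntime, &a) < entity_key(min_vruntime, &b);

	return by_key == entity_before(&a, &b);
}
```

The agreement holds as long as both vruntimes stay within half the 64-bit range of min_vruntime, which includes the case where one of them has already wrapped past zero.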
Signed-off-by: Stephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-ns12mnd2h5w8rb9agd8hnsfk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clean up cfs/rt runqueue initialization by moving group scheduling
related code into the corresponding functions.
Also, keep group scheduling as an add-on, so that things are only done
additionally, i.e. remove the init_*_rq() calls from init_tg_*_entry().
(This removes a redundant initialization during sched_init()).
In case of group scheduling rt_rq->highest_prio.curr is now initialized
twice, but adding another #ifdef seems not worth it.
Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1310661163-16606-1-git-send-email-schnhrr@cs.tu-berlin.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reorder root_domain to remove 8 bytes of alignment padding on 64-bit
builds. This shrinks the size from 1736 to 1728 bytes, therefore using
one fewer cacheline.
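The effect can be sketched with two hypothetical structs (these are not the root_domain fields). On typical 64-bit ABIs, where a 64-bit member is 8-byte aligned, a 32-bit member placed between two 64-bit members is followed by 4 bytes of padding; grouping the 32-bit members together removes it:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with interleaved member sizes. */
struct padded {
	uint64_t a;
	uint32_t b;	/* typically followed by 4 bytes of padding */
	uint64_t c;
	uint32_t d;	/* typically followed by 4 bytes of tail padding */
};

/* Same members, reordered so the two 32-bit fields share one 8-byte slot. */
struct reordered {
	uint64_t a;
	uint64_t c;
	uint32_t b;
	uint32_t d;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t bytes_saved(void)
{
	return sizeof(struct padded) - sizeof(struct reordered);
}
```

On x86-64 this reports 8 bytes saved; on ABIs that align 64-bit members to 4 bytes it can report 0.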
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1310726492.1977.5.camel@castor.rsk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If a task group is to be created and alloc_fair_sched_group() fails,
then the rt_bandwidth of the corresponding task group is not yet
initialized. The caller, sched_create_group(), starts a clean up
procedure which calls free_rt_sched_group() which unconditionally
destroys the not yet initialized rt_bandwidth.
This crashes or hangs the system in lock_hrtimer_base(): UP systems
dereference a NULL pointer, while SMP systems loop endlessly on a
condition that cannot become true.
This patch simply avoids the destruction of rt_bandwidth when the
initialization code path was not reached.
(This was discovered by accident with a custom kernel modification.)
Signed-off-by: Bianca Lutz <sowilo@cs.tu-berlin.de>
Signed-off-by: Jan Schoenherr <schnhrr@cs.tu-berlin.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1310580816-10861-7-git-send-email-schnhrr@cs.tu-berlin.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The last reference to cpu_cfs_rq() was removed with commit 88ec22d3
("sched: Remove the cfs_rq dependency from set_task_cpu()"). Thus,
remove this function, too.
Signed-off-by: Jan Schoenherr <schnhrr@cs.tu-berlin.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1310580816-10861-3-git-send-email-schnhrr@cs.tu-berlin.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use for_each_leaf_cfs_rq() instead of list_for_each_entry_rcu(). This
ensures that load_balance_fair() only iterates over those task_groups
that actually have tasks on busiest, and that we iterate bottom-up,
trying to move light groups before the heavier ones.
No idea if it will actually work out to be beneficial in practice, does
anybody have a cgroup workload that might show a difference one way or
the other?
[ Also move update_h_load to sched_fair.c, losing #ifdef-ery ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/1310557009.2586.28.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In dequeue_task_fair() we bail on dequeue when we encounter a parenting entity
with additional weight. However, we perform a double shares update on this
entity as we continue the shares update traversal from this point, despite
dequeue_entity() having already updated its queuing cfs_rq.
Avoid this by starting from the parent when we resume.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110707053059.797714697@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While looking at check_preempt_wakeup() I realized that we are
potentially updating the wrong entity in the fair-group scheduling
case. In this case the current task's cfs_rq may not be the same as
the one used for the comparison between the waking task and the
existing task's vruntime.
This potentially results in us using a stale vruntime in the
pre-emption decision, providing a small false preference for the
previous task. The effects of this are bounded since we always
perform a hierarchical update on the tick.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CAPM31R+2Ke2urUZKao5W92_LupdR4AYEv-EZWiJ3tG=tEes2cw@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simple test-case:
	#include <assert.h>
	#include <signal.h>
	#include <unistd.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>

	int main(void)
	{
		int pid, status;

		pid = fork();
		if (!pid) {
			/* child: wait for signals, must never return normally */
			pause();
			assert(0);
			return 0x23;
		}

		assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
		assert(wait(&status) == pid);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);

		kill(pid, SIGCONT);	// <--- also clears STOP_DEQUEUED

		assert(ptrace(PTRACE_CONT, pid, 0,0) == 0);
		assert(wait(&status) == pid);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGCONT);

		/* inject SIGSTOP; the tracee is expected to stop again */
		assert(ptrace(PTRACE_CONT, pid, 0, SIGSTOP) == 0);
		assert(wait(&status) == pid);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);

		kill(pid, SIGKILL);
		return 0;
	}
Without the patch it hangs. After the patch SIGSTOP "injected" by the
tracer is not ignored and stops the tracee.
Note also that if this test-case uses, say, SIGWINCH instead of SIGCONT,
everything works without the patch. This can't be right, and this is
confusing.
The problem is that SIGSTOP (or any other sig_kernel_stop() signal) has
no effect without JOBCTL_STOP_DEQUEUED. This means it is simply ignored
after PTRACE_CONT unless JOBCTL_STOP_DEQUEUED was set "by accident", say
it wasn't cleared after initial SIGSTOP sent by PTRACE_ATTACH.
At first glance we could change ptrace_signal() to add STOP_DEQUEUED
after return from ptrace_stop(), but this is not right in the case where
the tracer does not change the reported SIGSTOP and SIGCONT comes in between.
This is even more wrong with PT_SEIZED, SIGCONT adds JOBCTL_TRAP_NOTIFY
which will be "lost" during the TRAP_STOP | TRAP_NOTIFY report.
So let's add STOP_DEQUEUED _before_ we report the signal. It has no effect
unless sig_kernel_stop() == T after the tracer resumes us, and in the
latter case the pending STOP_DEQUEUED means no SIGCONT in between, we
should stop.
Note also that if SIGCONT was sent, PT_SEIZED tracee will correctly
report PTRACE_EVENT_STOP/SIGTRAP and thus the tracer can notice the fact
SIGSTOP was cancelled.
Also, move the current->ptrace check from ptrace_signal() to its caller,
get_signal_to_deliver(); this looks more natural.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
At http://www.mail-archive.com/linux-mmc@vger.kernel.org/msg08371.html
(thread: "mmc: sdio: reset card during power_restore") we found and
fixed a bug where mmc's runtime power management functions were not being
called. We have now also made improvements to the SDIO powerup routine
which could possibly mask this kind of issue in future.
Add debug messages to the runtime PM hooks so that it is easy to verify
if and when runtime PM is happening.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
In the case where a driver returns -ENOSYS from its suspend handler
to indicate that the device should be powered down over suspend, the
remove routine of the driver was not being called, leading to lots of
confusion during resume.
The problem is that runtime PM is disabled during this process, so
when we reach mmc_sdio_remove, the runtime PM functions called there
(validly) return errors, and this was causing us to skip the remove
function.
Fix this by ignoring the error value of pm_runtime_get_sync(), which
can return valid errors. This also matches the behaviour of
pci_device_remove().
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Fix clock rate setting in the mxs-mmc driver. Previously, if div2 was 0
then the value for TIMING_CLOCK_RATE would have been 255 instead of 0.
The limits for div1 (TIMING_CLOCK_DIVIDE) and div2 (TIMING_CLOCK_RATE+1)
were also not correctly defined.
Can easily be reproduced on mx23evk: the default clock for high speed sdio
cards is 50 MHz. With an SSP_CLK of 28.8 MHz (default), this resulted in
an actual clock rate of about 56 kHz. Tested on mx23evk.
Signed-off-by: Koen Beel <koen.beel@barco.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Currently the tmio-mmc driver contains a recursive runtime PM method
invocation, which leads to a deadlock on a mutex. Avoid it by taking
care not to request DMA too early.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
A recent commit "mmc: tmio: Share register access functions" has swapped
arguments of a macro and broken DMA with TMIO MMC. This patch fixes the
arguments back.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch uses runtime PM to allow the system to power down the MMC
controller when the MMC clock is switched off.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch uses runtime PM to allow the system to power down the MMC
controller when the MMC clock is switched off.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Calling mmc_request_done() under a spinlock with interrupts disabled
leads to a recursive spin-lock on the request retry path and to
scheduling in atomic context. This patch fixes both of these problems
by moving mmc_request_done() to the scheduler workqueue.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524. Lowering the SD
base clock frequency from 200MHz to 50MHz fixes this issue. This
solution was suggested by Koji Matsumuro, Ricoh Company, Ltd.
This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Cc: <stable@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
In the case of an I/O error, the DMA will have been cleaned up in
the MMC interrupt and the request structure pointer will be null.
In that case, it is essential to check if the DMA is over before
dereferencing host->mrq->data.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are a few places with the same functionality. This patch creates
two functions omap_hsmmc_set_bus_width() and omap_hsmmc_set_bus_mode()
to do the job.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are two pieces of code which are similar, but not the same.
Each of them contains a bug.
The SYSCTL register should be read before writing to it in
omap_hsmmc_context_restore() to retain the state of the reserved bits.
Before setting the clock divisor and DTO bits, the value from the SYSCTL
register should be masked properly. We were lucky to have no problems
with the DTO bits. So make sure we clear the DTO bits properly in
omap_hsmmc_set_ios().
Additionally get rid of msleep(1). The actual time is rarely higher
than 30us on OMAP 3630.
The resulting pieces of code are refactored into the
omap_hsmmc_set_clock() function.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There is similar code in two functions which enable the clock. Refactor
this code to omap_hsmmc_start_clock(). Re-use omap_hsmmc_stop_clock() in
omap_hsmmc_context_restore() as well.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
There are two places where the same calculations are done.
Let's split them into a separate function.
In addition, simplify by using the DIV_ROUND_UP kernel macro.
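For reference, a small sketch of rounding-up division. The DIV_ROUND_UP definition below matches the kernel macro; calc_divisor() and its parameter names are hypothetical and are not taken from the driver:

```c
/* Integer division that rounds up instead of truncating. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: smallest divisor for which source_hz / divisor
 * does not exceed target_hz.  Assumes target_hz is non-zero and that
 * source_hz + target_hz does not overflow an unsigned int.
 */
static unsigned int calc_divisor(unsigned int source_hz, unsigned int target_hz)
{
	return DIV_ROUND_UP(source_hz, target_hz);
}
```

With plain truncating division a 96 MHz source and a 52 MHz target would give a divisor of 1 and overclock the card; rounding up gives 2.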
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Move the min and max frequency constants to the definition block in
the source file.
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
CERR and BADA were in the wrong place and there are only
32 not 35.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Reviewed-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
We already check for ongoing async transfers when handling discard
requests, but not in mmc_blk_issue_flush(). This patch fixes that
omission.
Tested with an SDHCI controller and eMMC4.41.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Per Forlin <per.forlin@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Documentation about the background and the design of mmc non-blocking.
Host driver guidelines to minimize request preparation overhead.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch adds support for the CSR panel built by XAT.
Signed-off-by: Ice Chien <ice.chien@accupoint.com.tw>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The following symbols are not referenced outside this file, so
there's no need for them to be in the global namespace.
pcmidi_sustained_note_release
init_sustain_timers
stop_sustain_timers
pcmidi_handle_report
pcmidi_setup_extra_keys
pcmidi_snd_initialise
pcmidi_snd_terminate
Make them static.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Some gcc versions warn about prototypes without "inline" when the declaration
includes the "inline" keyword. The fix generates a false error message
"marked inline, but without a definition" with sparse below 0.4.2.
Signed-off-by: Chris Friesen <chris.friesen@genband.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
If overlapping networks with different interfaces were added to
the set, the type did not handle them properly. Example:
ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1
Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.
In the patch the algorithm is fixed in order to correctly handle
overlapping networks.
Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add a builtin test for the parse_events function, which is
responsible for parsing/processing the "-e" option of the
stat/top/record commands.
This new test will run within the builtin test command suite
(perf test).
One or several tests were added for each type of event.
More tests could be added easily if needed.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-3-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the option parameter out of the parse_events function and
add a new parse_events_option function instead.
The option parameter is used only to carry the "struct perf_evlist"
pointer for chaining new events. Removing it enables us
to call parse_events from other places without using the
option parameter.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.
With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.
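A minimal sketch of the difference, using a hypothetical two-field struct that mirrors only the leading "type" and "size" members of perf_event_attr. The helper names are made up for this example and are not the perf functions. Swapping the 8 bytes as one 64-bit value exchanges the two fields, while swapping each __u32 on its own recovers them:

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the first two members of perf_event_attr. */
struct attr_head {
	uint32_t type;
	uint32_t size;
};

static uint32_t bswap32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x >> 8) & 0x0000ff00u) | (x >> 24);
}

static uint64_t bswap64(uint64_t x)
{
	return ((uint64_t)bswap32((uint32_t)x) << 32) |
	       bswap32((uint32_t)(x >> 32));
}

/* Correct: each 32-bit field is byte-swapped individually. */
static struct attr_head swap_fields(struct attr_head h)
{
	h.type = bswap32(h.type);
	h.size = bswap32(h.size);
	return h;
}

/* Incorrect for this layout: treats both fields as a single 64-bit value. */
static struct attr_head swap_as_u64(struct attr_head h)
{
	uint64_t v;

	memcpy(&v, &h, sizeof(v));
	v = bswap64(v);
	memcpy(&h, &v, sizeof(v));
	return h;
}
```

Feeding both helpers a header written in the opposite byte order shows that only swap_fields() returns the original type and size; swap_as_u64() returns them in each other's place.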
-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing events
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused a segfault, for example:
# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays need to follow
the PERF_COUNT_HW_CACHE_*MAX enums. Use those MAXs as the sizes
of those arrays, so that a possible omission in the future will not lead to
a segfault.
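The sizing idea can be sketched as follows. The enum, table and lookup function are hypothetical stand-ins, not the perf definitions. Because the array is declared with the MAX enumerator as its size, an entry that someone forgets to add is a NULL slot that the lookup can skip, instead of a read past the end of the array:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cache identifiers; CACHE_MAX must stay last. */
enum cache_id {
	CACHE_L1D,
	CACHE_L1I,
	CACHE_LL,
	CACHE_NODE,
	CACHE_MAX
};

/* Sized by CACHE_MAX, so every enumerator has a slot. */
static const char *const cache_names[CACHE_MAX] = {
	[CACHE_L1D] = "L1-dcache",
	[CACHE_L1I] = "L1-icache",
	[CACHE_LL]  = "LLC",
	/* CACHE_NODE left out on purpose to simulate the omission */
};

/* Return the matching cache_id, or -1 if the name is unknown. */
static int lookup_cache(const char *name)
{
	for (int i = 0; i < CACHE_MAX; i++) {
		if (cache_names[i] && strcmp(cache_names[i], name) == 0)
			return i;
	}
	return -1;
}
```

An unknown name such as "krava" simply falls through to -1, even though one slot of the table was never filled in.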
Adding read/write/prefetch as allowed operations for node cache
event.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The method for calculating the # of outstanding buffers gives
incorrect results when vq->upend_idx wraps around zero.
Fix that.
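A minimal sketch of a wrap-safe count for two indices that advance around a fixed-size ring. RING_SIZE and outstanding() are hypothetical names, and this is not necessarily the exact expression used in the patch:

```c
/* Hypothetical ring length; both indices are always in [0, RING_SIZE). */
#define RING_SIZE 1024u

/*
 * Number of entries between done_idx and upend_idx.  Adding RING_SIZE
 * before subtracting keeps the intermediate value non-negative when
 * upend_idx has already wrapped around zero and done_idx has not.
 */
static unsigned int outstanding(unsigned int upend_idx, unsigned int done_idx)
{
	return (upend_idx + RING_SIZE - done_idx) % RING_SIZE;
}
```

A plain "upend_idx - done_idx" would give a huge or negative value for upend_idx = 2 and done_idx = 1020, whereas the modular form gives 6.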
Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The non-debug variant of mutex_destroy is a no-op, currently
implemented as a macro which does nothing. This approach fails
to check the type of the parameter, so an error would only show
when debugging gets enabled. Using an inline function instead
offers type checking for earlier bug catching.
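The difference can be sketched with a simplified stand-in for struct mutex. The _macro and _inline suffixes, and demo(), are hypothetical and exist only to show both forms side by side:

```c
/* Simplified stand-in for the real struct mutex. */
struct mutex {
	int placeholder;
};

/* Macro no-op: the argument is discarded, so any expression compiles. */
#define mutex_destroy_macro(lock) do { } while (0)

/* Inline no-op: generates no code, but the argument must be a struct mutex *. */
static inline void mutex_destroy_inline(struct mutex *lock)
{
	(void)lock;
}

static int demo(void)
{
	struct mutex m = { 0 };
	int not_a_mutex = 0;

	mutex_destroy_macro(&not_a_mutex);	/* wrong type, yet accepted */
	mutex_destroy_inline(&m);		/* type-checked by the compiler */
	/* mutex_destroy_inline(&not_a_mutex) would draw a diagnostic. */

	(void)not_a_mutex;
	return 0;
}
```

Both forms compile to nothing in a non-debug build; only the inline function lets the compiler reject a wrong argument type there.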
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110716174200.41002352@endymion.delvare
Signed-off-by: Ingo Molnar <mingo@elte.hu>