The cifs.idmap keytype always allocates memory to hold the payload from
userspace. In the common case where we're translating a SID to a UID or
GID, we're allocating memory to hold something that's less than or equal
to the size of a pointer.
When the payload is the same size as a pointer or smaller, just store
it in the payload.value union member instead. That saves us an extra
allocation on the sid_to_id upcall.
Note that we have to take extra care to check the datalen when we
go to dereference the .data pointer in the union, but the callers
now check that as a matter of course anyway.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The cifs.idmap handling code currently causes the kernel to cache the
data from userspace twice. It first looks in a rbtree to see if there is
a matching entry for the given id. If there isn't then it calls
request_key which then checks its cache and then calls out to userland
if it doesn't have one. If the userland program establishes a mapping
and downcalls with that info, it then gets cached in the keyring and in
this rbtree.
Aside from the double memory usage and the performance penalty in doing
all of these extra copies, there are some nasty bugs in here too. The
code declares four rbtrees and spinlocks to protect them, but only seems
to use two of them. The upshot is that the same tree is used to hold
(eg) uid:sid and sid:uid mappings. The comparitors aren't equipped to
deal with that.
I think we'd be best off to remove a layer of caching in this code. If
this was originally done for performance reasons, then that really seems
like a premature optimization.
This patch does that -- it removes the rbtrees and the locks that
protect them and simply has the code do a request_key call on each call
into sid_to_id and id_to_sid. This greatly simplifies this code and
should roughly halve the memory utilization from using the idmapping
code.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
commit c702418f8a2f ("mm: vmscan: do not keep kswapd looping forever due
to individual uncompactable zones") removed zone watermark checks from
the compaction code in kswapd but left in the zone congestion clearing,
which now happens unconditionally on higher order reclaim.
This messes up the reclaim throttling logic for zones with
dirty/writeback pages, where zones should only lose their congestion
status when their watermarks have been restored.
Remove the clearing from the zone compaction section entirely. The
preliminary zone check and the reclaim loop in kswapd will clear it if
the zone is considered balanced.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The direct-IO write path already had the i_size checks in mm/filemap.c,
but it turns out the read path did not, and removing the block size
checks in fs/block_dev.c (commit bbec0270bdd8: "blkdev_max_block: make
private to fs/buffer.c") removed the magic "shrink IO to past the end of
the device" code there.
Fix it by truncating the IO to the size of the block device, like the
write path already does.
NOTE! I suspect the write path would be *much* better off doing it this
way in fs/block_dev.c, rather than hidden deep in mm/filemap.c. The
mm/filemap.c code is extremely hard to follow, and has various
conditionals on the target being a block device (ie the flag passed in
to 'generic_write_checks()', along with a conditional update of the
inode timestamp etc).
It is also quite possible that we should treat this whole block device
size as a "s_maxbytes" issue, and try to make the logic even more
generic. However, in the meantime this is the fairly minimal targeted
fix.
Noted by Milan Broz thanks to a regression test for the cryptsetup
reencrypt tool.
Reported-and-tested-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
. UAPI fixes, from David Howels
. Separate perf tests into multiple objects, one per test, from Jiri Olsa.
. Fixes to /proc/pid/maps parsing, preparatory to supporting data maps,
from Namhyung Kim
. Fix compile error for non-NEWT builds, from Namhyung Kim
. Implement ui_progress for GTK, from Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJQqO+jAAoJENZQFvNTUqpATGIP/juZF47Al1+4So5KfdOTNPKD
X9H23T31HrcK4z0Yk4upbi0qWW1do7RkoJOdwVbvMndYK8Mh0QEmZcmcJg7hzhJq
v86/qGAbnWUi8H5g8VVhGCFRBbkdN2U6tKzBe3KzmgQnjq48/2EYsi4AlnfoyWoX
u1VuBd1K3dQAvl4TqypxXeB84HJQ/nMtDtT6YMuMGfAPP4aCvCjEX365rgux8bGW
nCA/ERppfvfj7QG/Z3Da/nI/qlHYPYjyZBuaj/X5qvF3GzPAcVQ5Qs5JlkQ9daiO
RS3rR7eDfx7DTidTobIagfxgj0gYoY3/1gZVrLLAB1ByUJcpehbBgKxN6PvEvJFh
FXfrXrZ8AtnxMBBN0Wnkqwm0z40RGe65bu0JqU8YroTVUUH+ZvO6U0WvDH4KVcpL
iMJTWIpcpsgPWFEieW6rArSdYuuvAivrNxm2Sl1RNguOlxftHT4ZCTvxVPEwrOl9
PWnoJhc5YxTfdaoJ8xHQaD7TyYj6RtycZw9hhBYD5mdDwpzBZdccJ2vgMlkBgeuL
VOHCG0NxJVH6myexhfjrFqbJljAo024/nkqwTm3JHJr3WJv0qqOIFbmb8SA2kCpP
QXnLCrfUk/MXdU69paWGMpFyG6+iuSDhyecy0iX/+xqV3EGOdVxVM1x/mxfBSC2p
xU/aCDZvX0+YYv5797UK
=9k43
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- UAPI fixes, from David Howels
- Separate perf tests into multiple objects, one per test, from Jiri Olsa.
- Fixes to /proc/pid/maps parsing, preparatory to supporting data maps,
from Namhyung Kim
- Fix compile error for non-NEWT builds, from Namhyung Kim
- Implement ui_progress for GTK, from Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"Two stragglers:
1) The new code that adds new flushing semantics to GRO can cause SKB
pointer list corruption, manage the lists differently to avoid the
OOPS. Fix from Eric Dumazet.
2) When TCP fast open does a retransmit of data in a SYN-ACK or
similar, we update retransmit state that we shouldn't triggering a
WARN_ON later. Fix from Yuchung Cheng."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: gro: fix possible panic in skb_gro_receive()
tcp: bug fix Fast Open client retransmission
As the current LEDs code breaks other platform, remove it.
It shall be replaced by a generic "MMIO LEDs" driver.
Reported-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iEYEABECAAYFAlDAZ8MACgkQGxsu9jQV9nZP1ACfaQv/G9U6T9iNSkY+IuOUY/h2
QMMAnjo0VIlSPx2z/Jr+ZRFfGWtbTtxT
=Jthr
-----END PGP SIGNATURE-----
Merge tag 'sunxi-fixes-for-3.8' of git://github.com/mripard/linux into next/soc
From Maxime Ripard:
Fixes in sunXi related drivers for 3.8
* tag 'sunxi-fixes-for-3.8' of git://github.com/mripard/linux:
irqchip: irq-sunxi: Add terminating entry for sunxi_irq_dt_ids
clocksource: sunxi_timer: Add terminating entry for sunxi_timer_dt_ids
* acpi-general:
pnpacpi: fix incorrect TEST_ALPHA() test
ACPI / video: ignore BIOS initial backlight value for HP Folio 13-2000
ACPI : do not use Lid and Sleep button for S5 wakeup
All devices behind Haswell LPSS (Low Power Subsystem) should be represented
as platform devices so add them to the acpi_platform_device_ids list.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a document that describes how to take advantage of ACPI enumeration for
buses like platform, I2C and SPI. In addition to that we document how to
translate ACPI GpioIo and GpioInt resources to be useful in Linux device
drivers.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
TEST_ALPHA() is broken and always returns 0.
[akpm@linux-foundation.org: return false for '@' as well, per Bjorn]
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To handle error trigger table correctly, memory region must be
removed from request region. We had a series of patches to do this
culminating in:
commit b4e008dc5
ACPI, APEI, EINJ, Refine the fix of resource conflict
but when ACPI5 support was added, we missed updating this area. So
when using EINJ table on an ACPI5 enabled machine, we get following error:
APEI: Can not request [mem 0x526b80000-0x526b80007] for APEI EINJ
Trigger registers
Fix this by checking for the acpi5 case and using the same code
that was added earlier.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
commit 2e71a6f8084e (net: gro: selective flush of packets) added
a bug for skbs using frag_list. This part of the GRO stack is rarely
used, as it needs skb not using a page fragment for their skb->head.
Most drivers do use a page fragment, but some of them use GFP_KERNEL
allocations for the initial fill of their RX ring buffer.
napi_gro_flush() overwrite skb->prev that was used for these skb to
point to the last skb in frag_list.
Fix this using a separate field in struct napi_gro_cb to point to the
last fragment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If SYN-ACK partially acks SYN-data, the client retransmits the
remaining data by tcp_retransmit_skb(). This increments lost recovery
state variables like tp->retrans_out in Open state. If loss recovery
happens before the retransmission is acked, it triggers the WARN_ON
check in tcp_fastretrans_alert(). For example: the client sends
SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends
another 4 data packets and get 3 dupacks.
Since the retransmission is not caused by network drop it should not
update the recovery state variables. Further the server may return a
smaller MSS than the cached MSS used for SYN-data, so the retranmission
needs a loop. Otherwise some data will not be retransmitted until timeout
or other loss recovery events.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
by using cifs_invalidate_mapping rather than invalidate_remote_inode
in cifs_oplock_break - this invalidates all inode pages and resets
fscache cookies.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
Extracting a part of the SDHCI card tasklet into a .card_event()
implementation allows SDHCI hosts to use generic card-detection
services, e.g. the GPIO slot function.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
The slot-gpio API provides a generic card-detection handler. To support a
wider range of hosts it has to call the host's card-event callback, if
implemented. Also increase the debounce interval to 200ms to match the
SDHCI driver.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Some hosts need to perform additional actions upon card insertion or
ejection. Add a host operation to be called from card detection handlers.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
'sc' is used only when CONFIG_PM_RUNTIME is defined. Hence define it
conditionally.
Silences the following warning:
drivers/mmc/host/sdhci-s3c.c: In function ‘sdhci_s3c_notify_change’:
drivers/mmc/host/sdhci-s3c.c:378:20: warning: unused variable ‘sc’ [-Wunused-variable]
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
We don't need to permit a write to the area locked with a read lock
by any process including the process that issues the write.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
Commit b9a50f74905a ("ARM: 7450/1: dcache: select DCACHE_WORD_ACCESS for
little-endian ARMv6+ CPUs") added support for word-at-time path
comparisons, relying on the ability to perform unaligned loads with
negligible performance impact in hardware.
For nommu configurations without MPU support, this is unpredictable and
so we should fall back to the byte-by-byte routines.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Recent ARMv7 toolchains assume that unaligned memory accesses will not
fault and will instead be handled by the processor.
For the nommu case (without an MPU), memory will be treated as
strongly-ordered and therefore unaligned accesses may fault regardless
of the SCTLR.A setting.
This patch passes -mno-unaligned-access to GCC when compiling for nommu
targets, preventing the generation of unaligned memory access in the
kernel.
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch keeps disabled the strict alignment CP15 bit for
all armv6 and armv7 processor without the mmu. This behaviour
is now same as in the mmu case.
Signed-off-by: Armando Visconti <armando.visconti@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is what is done for the regular interrupts in kernel/irqs/proc.c
already, before calling arch_show_interrupts(). Not doing so for the
IPIs causes the column headers not to match with the content whenever
some CPUs are offline.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
6e20a0a429bd4dc07d6de16d9c247270e04e4aa0
(gpio: pcf857x: enable gpio_to_irq() support)
added gpio_to_irq() support on pcf857x driver,
but it used pdata->irq.
This patch modifies driver to use client->irq instead of it.
It modifies kzm9g board platform settings,
and device probe information too.
This patch is tested on kzm9g board
Reported-by: Christian Engelmayer <christian.engelmayer@frequentis.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This fixes a regression in 3.7-rc, which has since gone into stable.
Commit 00442ad04a5e ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().
Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a5e,
alloc_pages_vma() has kept refcount in balance, so now no problem.
Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a zone meets its high watermark and is compactable in case of
higher order allocations, it contributes to the percentage of the node's
memory that is considered balanced.
This requirement, that a node be only partially balanced, came about
when kswapd was desparately trying to balance tiny zones when all bigger
zones in the node had plenty of free memory. Arguably, the same should
apply to compaction: if a significant part of the node is balanced
enough to run compaction, do not get hung up on that tiny zone that
might never get in shape.
When the compaction logic in kswapd is reached, we know that at least
25% of the node's memory is balanced properly for compaction (see
zone_balanced and pgdat_balanced). Remove the individual zone checks
that restart the kswapd cycle.
Otherwise, we may observe more endless looping in kswapd where the
compaction code loops back to reclaim because of a single zone and
reclaim does nothing because the node is considered balanced overall.
See for example
https://bugzilla.redhat.com/show_bug.cgi?id=866988
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-and-tested-by: Thorsten Leemhuis <fedora@leemhuis.info>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: John Ellson <john.ellson@comcast.net>
Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 0bf380bc70ec ("mm: compaction: check pfn_valid when entering a
new MAX_ORDER_NR_PAGES block during isolation for migration") added a
check for pfn_valid() when isolating pages for migration as the scanner
does not necessarily start pageblock-aligned.
Since commit c89511ab2f8f ("mm: compaction: Restart compaction from near
where it left off"), the free scanner has the same problem. This patch
makes sure that the pfn range passed to isolate_freepages_block() is
within the same block so that pfn_valid() checks are unnecessary.
In answer to Henrik's wondering why others have not reported this:
reproducing this requires a large enough hole with the right aligment to
have compaction walk into a PFN range with no memmap. Size and
alignment depends in the memory model - 4M for FLATMEM and 128M for
SPARSEMEM on x86. It needs a "lucky" machine.
Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Ricoh SDHCI controllers support Highspeed clocks as evident from
the ricoh_mmc_probe_slot() settings. Hence, SDHCI_CAN_DO_HISPD needs
to be set to enable SDIO client drivers to set/enable high speed clock
settings
Signed-off-by: Madhvapathi Sriram <Madhvapathi.Sriram@csr.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This commit taken from Rabeeh's Cubox kernel and re-worked for DT;
Sebastian Hasselbrath is believed to be the original author.
Some Cuboxes require a GPIO for card detection; this implements the
optional GPIO support for card detection. This GPIO is logic 0 for
card inserted.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Ball <cjb@laptop.org>
We need to use the two-stage initialization for sdhci-pltfm if we're
going to do anything extra at initialization time.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Ball <cjb@laptop.org>
Use devm_clk_get() rather than clk_get() to make cleanup paths more simple.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Ball <cjb@laptop.org>
A-003500: False ADMA Error might be reported when ADMA is used for
multiple block read command with Stop at Block Gap. If PROCTL[SABGREQ]
is set when the particular block's data is received by the System side
logic before entire block (with CRC) data is received by the SD side
logic, and also if ADMA descriptor line is fetched at the same time,
then DMA engine might report false ADMA error. eSDHC might not be able
to Continue (PROCTL[CREQ]=1) after Stop at Block Gap.
This issue will impact the eSDHC IP VVN2.3.
Signed-off-by: Haijun Zhang <Haijun.Zhang@freescale.com>
Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
ctype is using 1-bit buswidth mode by default.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
To ensure the stable clock need to enable before set the
DW_MMC_CARD_NEED_INIT flag. If set DW_MMC_CARD_NEED_INIT flag,
wait for 80-clock before first command after power-up.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
If "caps2" host capabilities does not indicate support for MMC
HS200, don't allow clock speeds >52MHz. Currently, for MMC, the
clock speed is set to the lesser of the max speed the eMMC module
supports (card->ext_csd.hs_max_dtr) or the max base clock of the
host controller (host->f_max based on BASE_CLK_FREQ in the host
CAPS register). This means that a host controller that doesn't
support HS200 mode but has a base clock of 100MHz and an eMMC module
that supports HS200 speeds will end up using a 100MHz clock.
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Use managed device resource functions for easy handling.
This makes driver simpler in the routine of error and exit.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
MMCIF only uses one clock, all ARM and SuperH platforms register MMCIF
clock lookup entries with no connection ID, hence it can be dropped in
the driver too.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Chris Ball <cjb@laptop.org>