memory allocations
We add these hooks to keep key threads from blocking in the memory
allocation path.
-android_vh_free_unref_page_bypass ----We create a memory pool for the
key threads. This hook determines whether a page should be freed to the
pool or to the buddy freelist. It works with an existing hook,
`android_vh_alloc_pages_reclaim_bypass`, which takes pages out of the
pool.
-android_vh_kvmalloc_node_use_vmalloc ----For key threads, we prefer
not to enter direct reclaim, so we clear the __GFP_DIRECT_RECLAIM flag.
For less important threads, we prefer to use vmalloc.
-android_vh_should_alloc_pages_retry ----Before key threads run into
direct reclaim, we want to retry with a lower watermark.
-android_vh_unreserve_highatomic_bypass ----We want to keep more
highatomic pages when unreserving them, to avoid highatomic allocation
failures.
-android_vh_rmqueue_bulk_bypass ----We found that when key threads run
into rmqueue_bulk, they sometimes spend several milliseconds spinning on
zone->lock or filling per-cpu pages. We use this hook to take pages from
the memory pool mentioned above, rather than grabbing zone->lock and
filling a batch of pages into the per-cpu lists.
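
The pool behaviour described above can be sketched in userspace C. This
is only an illustration, not the vendor implementation: `struct
key_pool`, `POOL_CAPACITY`, `pool_free_bypass()` and
`pool_alloc_bypass()` are hypothetical names, and the real hooks work
on kernel `struct page` and zone state that are not modelled here. The
free path keeps a page in the pool while there is room, and the
allocation path hands pooled pages only to key threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_CAPACITY 4 /* hypothetical pool size, for illustration only */

/* Stand-in for the kernel's struct page; not the real definition. */
struct page {
	int id;
};

/* Hypothetical pool reserved for key threads. */
struct key_pool {
	struct page *slots[POOL_CAPACITY];
	size_t count;
};

/*
 * Free path: return true if the page was kept in the pool (the normal
 * free is bypassed), false if the caller should free it to the buddy
 * freelist as usual.
 */
static bool pool_free_bypass(struct key_pool *pool, struct page *page)
{
	assert(pool && page);
	if (pool->count >= POOL_CAPACITY)
		return false;
	pool->slots[pool->count++] = page;
	return true;
}

/*
 * Allocation path: hand a pooled page to a key thread, or return NULL
 * so the caller falls back to the regular allocator.
 */
static struct page *pool_alloc_bypass(struct key_pool *pool, bool is_key_thread)
{
	assert(pool);
	if (!is_key_thread || pool->count == 0)
		return NULL;
	return pool->slots[--pool->count];
}
```

In this sketch a full pool simply declines the page, so the pool never
grows past its capacity and ordinary threads never drain it.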
Bug: 288216516
Change-Id: I1656032d6819ca627723341987b6094775bc345f
Signed-off-by: Oven <liyangouwen1@oppo.com>
Since we can't control the gfp_flags of every kvmalloc_node call site,
we add a vendor hook in kvmalloc_node to tune the reclaim behavior for
some really high-order allocations.
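
A rough userspace sketch of the kind of policy such a hook could apply
is shown below. Everything in it is an assumption made for
illustration: the `SKETCH_GFP_*` bit values, the `large_threshold`
parameter, the `is_key_thread` argument and the
`sketch_tune_kvmalloc()` helper are hypothetical and are not the
kernel's definitions or the vendor's actual handler. For large
requests, a key thread has the direct-reclaim bit cleared from its
flags, while any other thread is steered to vmalloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only; not taken from the kernel headers. */
#define SKETCH_GFP_IO             0x40u
#define SKETCH_GFP_DIRECT_RECLAIM 0x400u

/* Result of the hypothetical tuning step. */
struct kvmalloc_decision {
	gfp_t kmalloc_flags; /* flags to use for the kmalloc attempt */
	bool use_vmalloc;    /* skip kmalloc and go straight to vmalloc */
};

/*
 * Hypothetical policy: small requests are left untouched. For large
 * requests, key threads avoid direct reclaim and other threads use
 * vmalloc instead of a high-order kmalloc.
 */
static struct kvmalloc_decision
sketch_tune_kvmalloc(size_t size, gfp_t flags, bool is_key_thread,
		     size_t large_threshold)
{
	struct kvmalloc_decision d = {
		.kmalloc_flags = flags,
		.use_vmalloc = false,
	};

	assert(size > 0);
	if (size <= large_threshold)
		return d;

	if (is_key_thread)
		d.kmalloc_flags &= ~SKETCH_GFP_DIRECT_RECLAIM;
	else
		d.use_vmalloc = true;

	return d;
}
```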
Bug: 300857012
Change-Id: I5f0c4c2921d204289911704e3a205f6a1dc50d04
Signed-off-by: liwei <liwei1234@oppo.com>
Merge 6.1.53 into android14-6.1-lts
Changes in 6.1.53
Revert "bridge: Add extack warning when enabling STP in netns."
Partially revert "drm/amd/display: Fix possible underflow for displays with large vblank"
scsi: ufs: Try harder to change the power mode
Revert "Revert drm/amd/display: Enable Freesync Video Mode by default"
ARM: dts: imx: Set default tuning step for imx7d usdhc
ALSA: hda/realtek: Enable 4 amplifiers instead of 2 on a HP platform
powerpc/boot: Disable power10 features after BOOTAFLAGS assignment
media: uapi: HEVC: Add num_delta_pocs_of_ref_rps_idx field
Revert "MIPS: unhide PATA_PLATFORM"
phy: qcom-snps-femto-v2: use qcom_snps_hsphy_suspend/resume error code
media: amphion: use dev_err_probe
media: pulse8-cec: handle possible ping error
media: pci: cx23885: fix error handling for cx23885 ATSC boards
9p: virtio: fix unlikely null pointer deref in handle_rerror
9p: virtio: make sure 'offs' is initialized in zc_request
ksmbd: fix out of bounds in smb3_decrypt_req()
ksmbd: validate session id and tree id in compound request
ksmbd: no response from compound read
ksmbd: fix out of bounds in init_smb2_rsp_hdr()
ASoC: da7219: Flush pending AAD IRQ when suspending
ASoC: da7219: Check for failure reading AAD IRQ events
ASoC: nau8821: Add DMI quirk mechanism for active-high jack-detect
ethernet: atheros: fix return value check in atl1c_tso_csum()
m68k: Fix invalid .section syntax
s390/dasd: use correct number of retries for ERP requests
s390/dasd: fix hanging device after request requeue
fs/nls: make load_nls() take a const parameter
ASoC: rt5682-sdw: fix for JD event handling in ClockStop Mode0
ASoc: codecs: ES8316: Fix DMIC config
ASoC: rt711: fix for JD event handling in ClockStop Mode0
ASoC: rt711-sdca: fix for JD event handling in ClockStop Mode0
ASoC: atmel: Fix the 8K sample parameter in I2SC master
ALSA: usb-audio: Add quirk for Microsoft Modern Wireless Headset
platform/x86: intel: hid: Always call BTNL ACPI method
platform/x86/intel/hid: Add HP Dragonfly G2 to VGBS DMI quirks
platform/x86: think-lmi: Use kfree_sensitive instead of kfree
platform/x86: asus-wmi: Fix setting RGB mode on some TUF laptops
platform/x86: huawei-wmi: Silence ambient light sensor
drm/amd/smu: use AverageGfxclkFrequency* to replace previous GFX Curr Clock
drm/amd/display: Guard DCN31 PHYD32CLK logic against chip family
drm/amd/display: Exit idle optimizations before attempt to access PHY
ovl: Always reevaluate the file signature for IMA
ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer()
ALSA: usb-audio: Update for native DSD support quirks
staging: fbtft: ili9341: use macro FBTFT_REGISTER_SPI_DRIVER
security: keys: perform capable check only on privileged operations
kprobes: Prohibit probing on CFI preamble symbol
clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
vmbus_testing: fix wrong python syntax for integer value comparison
Revert "wifi: ath6k: silence false positive -Wno-dangling-pointer warning on GCC 12"
net: dsa: microchip: KSZ9477 register regmap alignment to 32 bit boundaries
net: annotate data-races around sk->sk_{rcv|snd}timeo
net: usb: qmi_wwan: add Quectel EM05GV2
wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
powerpc/powermac: Use early_* IO variants in via_calibrate_decr()
idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM
platform/x86/amd/pmf: Fix unsigned comparison with less than zero
scsi: lpfc: Remove reftag check in DIF paths
scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
net: hns3: restore user pause configure when disable autoneg
drm/amdgpu: Match against exact bootloader status
wifi: cfg80211: remove links only on AP
wifi: mac80211: Use active_links instead of valid_links in Tx
netlabel: fix shift wrapping bug in netlbl_catmap_setlong()
bnx2x: fix page fault following EEH recovery
cifs: fix sockaddr comparison in iface_cmp
cifs: fix max_credits implementation
sctp: handle invalid error codes without calling BUG()
scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity
scsi: storvsc: Always set no_report_opcodes
scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback path
LoongArch: Let pmd_present() return true when splitting pmd
LoongArch: Fix the write_fcsr() macro
ALSA: seq: oss: Fix racy open/close of MIDI devices
net: sfp: handle 100G/25G active optical cables in sfp_parse_support
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
platform/mellanox: Fix mlxbf-tmfifo not handling all virtio CONSOLE notifications
of: property: Simplify of_link_to_phandle()
cpufreq: intel_pstate: set stale CPU frequency to minimum
crypto: rsa-pkcs1pad - Use helper to set reqsize
tpm: Enable hwrng only for Pluton on AMD CPUs
KVM: x86/mmu: Use kstrtobool() instead of strtobool()
KVM: x86/mmu: Add "never" option to allow sticky disabling of nx_huge_pages
net: Avoid address overwrite in kernel_connect
drm/amd/display: ensure async flips are only accepted for fast updates
udf: Check consistency of Space Bitmap Descriptor
udf: Handle error when adding extent to a file
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
Revert "PCI: tegra194: Enable support for 256 Byte payload"
Revert "net: macsec: preserve ingress frame ordering"
tools lib subcmd: Add install target
tools lib subcmd: Make install_headers clearer
tools lib subcmd: Add dependency test to install_headers
tools/resolve_btfids: Use pkg-config to locate libelf
tools/resolve_btfids: Install subcmd headers
tools/resolve_btfids: Alter how HOSTCC is forced
tools/resolve_btfids: Compile resolve_btfids as host program
tools/resolve_btfids: Tidy HOST_OVERRIDES
tools/resolve_btfids: Pass HOSTCFLAGS as EXTRA_CFLAGS to prepare targets
tools/resolve_btfids: Fix setting HOSTCFLAGS
reiserfs: Check the return value from __getblk()
eventfd: prevent underflow for eventfd semaphores
fs: Fix error checking for d_hash_and_lookup()
iomap: Remove large folio handling in iomap_invalidate_folio()
tmpfs: verify {g,u}id mount options correctly
selftests/harness: Actually report SKIP for signal tests
vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
ARM: ptrace: Restore syscall restart tracing
ARM: ptrace: Restore syscall skipping for tracers
refscale: Fix uninitalized use of wait_queue_head_t
OPP: Fix passing 0 to PTR_ERR in _opp_attach_genpd()
selftests/resctrl: Add resctrl.h into build deps
selftests/resctrl: Don't leak buffer in fill_cache()
selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
selftests/resctrl: Close perf value read fd on errors
arm64/ptrace: Clean up error handling path in sve_set_common()
sched/psi: Select KERNFS as needed
x86/decompressor: Don't rely on upper 32 bits of GPRs being preserved
arm64/sme: Don't use streaming mode to probe the maximum SME VL
arm64/fpsimd: Only provide the length to cpufeature for xCR registers
sched/rt: Fix sysctl_sched_rr_timeslice intial value
perf/imx_ddr: don't enable counter0 if none of 4 counters are used
selftests/futex: Order calls to futex_lock_pi
s390/pkey: fix/harmonize internal keyblob headers
s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_GENSECK2 IOCTL
s390/pkey: fix PKEY_TYPE_EP11_AES handling for sysfs attributes
s390/paes: fix PKEY_TYPE_EP11_AES handling for secure keyblobs
irqchip/loongson-eiointc: Fix return value checking of eiointc_index
ACPI: x86: s2idle: Post-increment variables when getting constraints
ACPI: x86: s2idle: Fix a logic error parsing AMD constraints table
thermal/of: Fix potential uninitialized value access
cpufreq: amd-pstate-ut: Remove module parameter access
cpufreq: amd-pstate-ut: Fix kernel panic when loading the driver
x86/efistub: Fix PCI ROM preservation in mixed mode
cpufreq: powernow-k8: Use related_cpus instead of cpus in driver.exit()
selftests/bpf: Fix bpf_nf failure upon test rerun
bpftool: use a local copy of perf_event to fix accessing :: Bpf_cookie
bpftool: Define a local bpf_perf_link to fix accessing its fields
bpftool: Use a local copy of BPF_LINK_TYPE_PERF_EVENT in pid_iter.bpf.c
bpftool: Use a local bpf_perf_event_value to fix accessing its fields
libbpf: Fix realloc API handling in zero-sized edge cases
bpf: Clear the probe_addr for uprobe
bpf: Fix an error in verifying a field in a union
crypto: qat - change value of default idle filter
tcp: tcp_enter_quickack_mode() should be static
hwrng: nomadik - keep clock enabled while hwrng is registered
hwrng: pic32 - use devm_clk_get_enabled
regmap: rbtree: Use alloc_flags for memory allocations
wifi: rtw89: debug: Fix error handling in rtw89_debug_priv_btc_manual_set()
wifi: mt76: mt7921: fix non-PSC channel scan fail
udp: re-score reuseport groups when connected sockets are present
bpf: reject unhashed sockets in bpf_sk_assign
wifi: mt76: testmode: add nla_policy for MT76_TM_ATTR_TX_LENGTH
spi: tegra20-sflash: fix to check return value of platform_get_irq() in tegra_sflash_probe()
can: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case of OOM
wifi: mt76: mt7915: fix power-limits while chan_switch
wifi: mwifiex: Fix OOB and integer underflow when rx packets
wifi: mwifiex: fix error recovery in PCIE buffer descriptor management
selftests/bpf: fix static assert compilation issue for test_cls_*.c
kbuild: rust_is_available: remove -v option
kbuild: rust_is_available: fix version check when CC has multiple arguments
kbuild: rust_is_available: add check for `bindgen` invocation
kbuild: rust_is_available: fix confusion when a version appears in the path
crypto: stm32 - Properly handle pm_runtime_get failing
crypto: api - Use work queue in crypto_destroy_instance
Bluetooth: nokia: fix value check in nokia_bluetooth_serdev_probe()
Bluetooth: Fix potential use-after-free when clear keys
Bluetooth: hci_sync: Don't double print name in add/remove adv_monitor
Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_add_adv_monitor()
net: tcp: fix unexcepted socket die when snd_wnd is 0
selftests/bpf: Fix repeat option when kfunc_call verification fails
selftests/bpf: Clean up fmod_ret in bench_rename test script
net-memcg: Fix scope of sockmem pressure indicators
ice: ice_aq_check_events: fix off-by-one check when filling buffer
crypto: caam - fix unchecked return value error
hwrng: iproc-rng200 - Implement suspend and resume calls
lwt: Fix return values of BPF xmit ops
lwt: Check LWTUNNEL_XMIT_CONTINUE strictly
fs: ocfs2: namei: check return value of ocfs2_add_entry()
net: annotate data-races around sk->sk_lingertime
wifi: mwifiex: fix memory leak in mwifiex_histogram_read()
wifi: mwifiex: Fix missed return in oob checks failed path
ARM: dts: Add .dts files missing from the build
samples/bpf: fix bio latency check with tracepoint
samples/bpf: fix broken map lookup probe
wifi: ath9k: fix races between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx
wifi: ath9k: protect WMI command response buffer replacement with a lock
wifi: nl80211/cfg80211: add forgotten nla_policy for BSS color attribute
mac80211: make ieee80211_tx_info padding explicit
wifi: mwifiex: avoid possible NULL skb pointer dereference
Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave()
arm64: mm: use ptep_clear() instead of pte_clear() in clear_flush()
wifi: ath9k: use IS_ERR() with debugfs_create_dir()
ice: avoid executing commands on other ports when driving sync
net: arcnet: Do not call kfree_skb() under local_irq_disable()
mlxsw: i2c: Fix chunk size setting in output mailbox buffer
mlxsw: i2c: Limit single transaction buffer size
mlxsw: core_hwmon: Adjust module label names based on MTCAP sensor counter
hwmon: (tmp513) Fix the channel number in tmp51x_is_visible()
octeontx2-pf: Refactor schedular queue alloc/free calls
octeontx2-pf: Fix PFC TX scheduler free
cteonxt2-pf: Fix backpressure config for multiple PFC priorities to work simultaneously
sfc: Check firmware supports Ethernet PTP filter
net/sched: sch_hfsc: Ensure inner classes have fsc curve
netrom: Deny concurrent connect().
drm/bridge: tc358764: Fix debug print parameter order
ASoC: cs43130: Fix numerator/denominator mixup
quota: factor out dquot_write_dquot()
quota: rename dquot_active() to inode_quota_active()
quota: add new helper dquot_active()
quota: fix dqput() to follow the guarantees dquot_srcu should provide
drm/amd/display: Do not set drr on pipe commit
drm/hyperv: Fix a compilation issue because of not including screen_info.h
ASoC: stac9766: fix build errors with REGMAP_AC97
soc: qcom: ocmem: Add OCMEM hardware version print
soc: qcom: ocmem: Fix NUM_PORTS & NUM_MACROS macros
arm64: dts: qcom: sm6350: Fix ZAP region
arm64: dts: qcom: sm8250: correct dynamic power coefficients
arm64: dts: qcom: msm8916-l8150: correct light sensor VDDIO supply
arm64: dts: qcom: sm8250-edo: Add gpio line names for TLMM
arm64: dts: qcom: sm8250-edo: Add GPIO line names for PMIC GPIOs
arm64: dts: qcom: sm8250-edo: Rectify gpio-keys
arm64: dts: qcom: sc8280xp-crd: Correct vreg_misc_3p3 GPIO
arm64: dts: qcom: sc8280xp: Add missing SCM interconnect
arm64: dts: qcom: msm8996: Add missing interrupt to the USB2 controller
arm64: dts: qcom: sdm845-tama: Set serial indices and stdout-path
arm64: dts: qcom: sm8350: Fix CPU idle state residency times
arm64: dts: qcom: sm8350: Add missing LMH interrupts to cpufreq
arm64: dts: qcom: sm8350: Use proper CPU compatibles
arm64: dts: qcom: pm8350: fix thermal zone name
arm64: dts: qcom: pm8350b: fix thermal zone name
arm64: dts: qcom: pmr735b: fix thermal zone name
arm64: dts: qcom: pmk8350: fix ADC-TM compatible string
arm64: dts: qcom: sm8250: Mark PCIe hosts as DMA coherent
ARM: dts: stm32: Rename mdio0 to mdio
ARM: dts: stm32: YAML validation fails for Argon Boards
ARM: dts: stm32: adopt generic iio bindings for adc channels on emstamp-argon
ARM: dts: stm32: Add missing detach mailbox for emtrion emSBC-Argon
ARM: dts: stm32: YAML validation fails for Odyssey Boards
ARM: dts: stm32: Add missing detach mailbox for Odyssey SoM
ARM: dts: stm32: Update to generic ADC channel binding on DHSOM systems
ARM: dts: stm32: Add missing detach mailbox for DHCOM SoM
firmware: ti_sci: Use system_state to determine polling
drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar()
ARM: dts: BCM53573: Drop nonexistent #usb-cells
ARM: dts: BCM53573: Add cells sizes to PCIe node
ARM: dts: BCM53573: Use updated "spi-gpio" binding properties
arm64: tegra: Fix HSUART for Jetson AGX Orin
arm64: dts: qcom: sm8250-sony-xperia: correct GPIO keys wakeup again
arm64: dts: qcom: pm6150l: Add missing short interrupt
arm64: dts: qcom: pm660l: Add missing short interrupt
arm64: dts: qcom: pmi8994: Add missing OVP interrupt
arm64: tegra: Fix HSUART for Smaug
drm/etnaviv: fix dumping of active MMU context
block: cleanup queue_wc_store
block: don't allow enabling a cache on devices that don't support it
x86/mm: Fix PAT bit missing from page protection modify mask
drm/bridge: anx7625: Use common macros for DP power sequencing commands
drm/bridge: anx7625: Use common macros for HDCP capabilities
ARM: dts: samsung: s3c6410-mini6410: correct ethernet reg addresses (split)
ARM: dts: s5pv210: add dummy 5V regulator for backlight on SMDKv210
ARM: dts: samsung: s5pv210-smdkv210: correct ethernet reg addresses (split)
drm: adv7511: Fix low refresh rate register for ADV7533/5
ARM: dts: BCM53573: Fix Ethernet info for Luxul devices
arm64: dts: qcom: sdm845: Add missing RPMh power domain to GCC
arm64: dts: qcom: sdm845: Fix the min frequency of "ice_core_clk"
arm64: dts: qcom: msm8996-gemini: fix touchscreen VIO supply
drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl'
md: Factor out is_md_suspended helper
md: Change active_io to percpu
md: restore 'noio_flag' for the last mddev_resume()
md/raid10: factor out dereference_rdev_and_rrdev()
md/raid10: use dereference_rdev_and_rrdev() to get devices
md/md-bitmap: remove unnecessary local variable in backlog_store()
md/md-bitmap: hold 'reconfig_mutex' in backlog_store()
drm/msm: Update dev core dump to not print backwards
drm/tegra: dpaux: Fix incorrect return value of platform_get_irq
of: unittest: fix null pointer dereferencing in of_unittest_find_node_by_name()
arm64: dts: qcom: sm8150: Fix the I2C7 interrupt
ARM: dts: BCM53573: Fix Tenda AC9 switch CPU port
drm/armada: Fix off-by-one error in armada_overlay_get_property()
drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()
drm/panel: simple: Add missing connector type and pixel format for AUO T215HVN01
ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
drm: xlnx: zynqmp_dpsub: Add missing check for dma_set_mask
soc: qcom: smem: Fix incompatible types in comparison
drm/msm/mdp5: Don't leak some plane state
firmware: meson_sm: fix to avoid potential NULL pointer dereference
drm/msm/dpu: fix the irq index in dpu_encoder_phys_wb_wait_for_commit_done
smackfs: Prevent underflow in smk_set_cipso()
drm/amd/pm: fix variable dereferenced issue in amdgpu_device_attr_create()
drm/msm/a2xx: Call adreno_gpu_init() earlier
audit: fix possible soft lockup in __audit_inode_child()
block/mq-deadline: use correct way to throttling write requests
io_uring: fix drain stalls by invalid SQE
drm/mediatek: dp: Add missing error checks in mtk_dp_parse_capabilities
bus: ti-sysc: Fix build warning for 64-bit build
drm/mediatek: Remove freeing not dynamic allocated memory
ARM: dts: qcom: ipq4019: correct SDHCI XO clock
drm/mediatek: Fix potential memory leak if vmap() fail
arm64: dts: qcom: apq8016-sbc: Fix ov5640 regulator supply names
arm64: dts: qcom: msm8998: Drop bus clock reference from MMSS SMMU
arm64: dts: qcom: msm8998: Add missing power domain to MMSS SMMU
arm64: dts: qcom: msm8996: Fix dsi1 interrupts
arm64: dts: qcom: sc8280xp-x13s: Unreserve NC pins
bus: ti-sysc: Fix cast to enum warning
md/raid5-cache: fix a deadlock in r5l_exit_log()
md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()
firmware: cs_dsp: Fix new control name check
md: add error_handlers for raid0 and linear
md/raid0: Factor out helper for mapping and submitting a bio
md/raid0: Fix performance regression for large sequential writes
md: raid0: account for split bio in iostat accounting
ASoC: SOF: amd: clear dsp to host interrupt status
of: overlay: Call of_changeset_init() early
of: unittest: Fix overlay type in apply/revert check
ALSA: ac97: Fix possible error value of *rac97
ipmi:ssif: Add check for kstrdup
ipmi:ssif: Fix a memory leak when scanning for an adapter
clk: qcom: gpucc-sm6350: Introduce index-based clk lookup
clk: qcom: gpucc-sm6350: Fix clock source names
clk: qcom: gcc-sc8280xp: Add EMAC GDSCs
clk: qcom: gcc-sc8280xp: Add missing GDSC flags
dt-bindings: clock: qcom,gcc-sc8280xp: Add missing GDSCs
clk: qcom: gcc-sc8280xp: Add missing GDSCs
clk: rockchip: rk3568: Fix PLL rate setting for 78.75MHz
PCI: apple: Initialize pcie->nvecs before use
PCI: qcom-ep: Switch MHI bus master clock off during L1SS
drivers: clk: keystone: Fix parameter judgment in _of_pll_clk_init()
PCI/DOE: Fix destroy_work_on_stack() race
clk: sunxi-ng: Modify mismatched function name
clk: qcom: gcc-sc7180: Fix up gcc_sdcc2_apps_clk_src
EDAC/igen6: Fix the issue of no error events
ext4: correct grp validation in ext4_mb_good_group
ext4: avoid potential data overflow in next_linear_group
clk: qcom: gcc-sm8250: Fix gcc_sdcc2_apps_clk_src
kvm/vfio: Prepare for accepting vfio device fd
kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add()
clk: qcom: reset: Use the correct type of sleep/delay based on length
clk: qcom: gcc-sm6350: Fix gcc_sdcc2_apps_clk_src
PCI: microchip: Correct the DED and SEC interrupt bit offsets
PCI: Mark NVIDIA T4 GPUs to avoid bus reset
pinctrl: mcp23s08: check return value of devm_kasprintf()
PCI: Allow drivers to request exclusive config regions
PCI: Add locking to RMW PCI Express Capability Register accessors
PCI: pciehp: Use RMW accessors for changing LNKCTL
PCI/ASPM: Use RMW accessors for changing LNKCTL
clk: qcom: gcc-sm8450: Use floor ops for SDCC RCGs
clk: imx: pllv4: Fix SPLL2 MULT range
clk: imx: imx8ulp: update SPLL2 type
clk: imx8mp: fix sai4 clock
clk: imx: composite-8m: fix clock pauses when set_rate would be a no-op
powerpc/radix: Move some functions into #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
vfio/type1: fix cap_migration information leak
nvdimm: Fix memleak of pmu attr_groups in unregister_nvdimm_pmu()
nvdimm: Fix dereference after free in register_nvdimm_pmu()
powerpc/fadump: reset dump area size if fadump memory reserve fails
powerpc/perf: Convert fsl_emb notifier to state machine callbacks
drm/amdgpu: Use RMW accessors for changing LNKCTL
drm/radeon: Use RMW accessors for changing LNKCTL
net/mlx5: Use RMW accessors for changing LNKCTL
wifi: ath11k: Use RMW accessors for changing LNKCTL
wifi: ath10k: Use RMW accessors for changing LNKCTL
NFSv4.2: Rework scratch handling for READ_PLUS
NFSv4.2: Fix READ_PLUS smatch warnings
NFSv4.2: Fix up READ_PLUS alignment
NFSv4.2: Fix READ_PLUS size calculations
powerpc: Don't include lppaca.h in paca.h
powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
nfs/blocklayout: Use the passed in gfp flags
powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
powerpc/mpc5xxx: Add missing fwnode_handle_put()
powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
ext4: fix unttached inode after power cut with orphan file feature enabled
jfs: validate max amount of blocks before allocation.
fs: lockd: avoid possible wrong NULL parameter
NFSD: da_addr_body field missing in some GETDEVICEINFO replies
NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN
NFSv4.2: fix handling of COPY ERR_OFFLOAD_NO_REQ
pNFS: Fix assignment of xprtdata.cred
cgroup/cpuset: Inherit parent's load balance state in v2
RDMA/qedr: Remove a duplicate assignment in irdma_query_ah()
media: ov5640: fix low resolution image abnormal issue
media: ad5820: Drop unsupported ad5823 from i2c_ and of_device_id tables
media: i2c: tvp5150: check return value of devm_kasprintf()
media: v4l2-core: Fix a potential resource leak in v4l2_fwnode_parse_link()
iommu/amd/iommu_v2: Fix pasid_state refcount dec hit 0 warning on pasid unbind
iommu: rockchip: Fix directory table address encoding
drivers: usb: smsusb: fix error handling code in smsusb_init_device
media: dib7000p: Fix potential division by zero
media: dvb-usb: m920x: Fix a potential memory leak in m920x_i2c_xfer()
media: cx24120: Add retval check for cx24120_message_send()
RDMA/siw: Fabricate a GID on tun and loopback devices
scsi: hisi_sas: Fix warnings detected by sparse
scsi: hisi_sas: Fix normally completed I/O analysed as failed
dt-bindings: extcon: maxim,max77843: restrict connector properties
media: amphion: reinit vpu if reqbufs output 0
media: amphion: add helper function to get id name
media: mtk-jpeg: Fix use after free bug due to uncanceled work
media: rkvdec: increase max supported height for H.264
media: amphion: fix CHECKED_RETURN issues reported by coverity
media: amphion: fix REVERSE_INULL issues reported by coverity
media: amphion: fix UNINIT issues reported by coverity
media: amphion: fix UNUSED_VALUE issue reported by coverity
media: amphion: ensure the bitops don't cross boundaries
media: mediatek: vcodec: Return NULL if no vdec_fb is found
media: mediatek: vcodec: fix potential double free
media: mediatek: vcodec: fix resource leaks in vdec_msg_queue_init()
usb: phy: mxs: fix getting wrong state with mxs_phy_is_otg_host()
scsi: RDMA/srp: Fix residual handling
scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param()
scsi: iscsi: Add length check for nlattr payload
scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()
scsi: be2iscsi: Add length check when parsing nlattrs
scsi: qla4xxx: Add length check when parsing nlattrs
iio: accel: adxl313: Fix adxl313_i2c_id[] table
serial: sprd: Assign sprd_port after initialized to avoid wrong access
serial: sprd: Fix DMA buffer leak issue
x86/APM: drop the duplicate APM_MINOR_DEV macro
RDMA/rxe: Split rxe_run_task() into two subroutines
RDMA/rxe: Fix incomplete state save in rxe_requester
scsi: qedf: Do not touch __user pointer in qedf_dbg_stop_io_on_error_cmd_read() directly
scsi: qedf: Do not touch __user pointer in qedf_dbg_debug_cmd_read() directly
scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directly
RDMA/irdma: Replace one-element array with flexible-array member
coresight: tmc: Explicit type conversions to prevent integer overflow
interconnect: qcom: qcm2290: Enable sync state
dma-buf/sync_file: Fix docs syntax
driver core: test_async: fix an error code
driver core: Call dma_cleanup() on the test_remove path
kernfs: add stub helper for kernfs_generic_poll()
extcon: cht_wc: add POWER_SUPPLY dependency
iommu/mediatek: Remove unused "mapping" member from mtk_iommu_data
iommu/mediatek: Fix two IOMMU share pagetable issue
iommu/sprd: Add missing force_aperture
RDMA/hns: Fix port active speed
RDMA/hns: Fix incorrect post-send with direct wqe of wr-list
RDMA/hns: Fix inaccurate error label name in init instance
RDMA/hns: Fix CQ and QP cache affinity
IB/uverbs: Fix an potential error pointer dereference
fsi: aspeed: Reset master errors after CFAM reset
iommu/qcom: Disable and reset context bank before programming
iommu/vt-d: Fix to flush cache of PASID directory table
platform/x86: dell-sysman: Fix reference leak
media: cec: core: add adap_nb_transmit_canceled() callback
media: cec: core: add adap_unconfigured() callback
media: go7007: Remove redundant if statement
media: venus: hfi_venus: Only consider sys_idle_indicator on V1
docs: ABI: fix spelling/grammar in SBEFIFO timeout interface
USB: gadget: core: Add missing kerneldoc for vbus_work
USB: gadget: f_mass_storage: Fix unused variable warning
drivers: base: Free devm resources when unregistering a device
HID: input: Support devices sending Eraser without Invert
media: ov5640: Enable MIPI interface in ov5640_set_power_mipi()
media: ov5640: Fix initial RESETB state and annotate timings
media: i2c: ov2680: Set V4L2_CTRL_FLAG_MODIFY_LAYOUT on flips
media: ov2680: Remove auto-gain and auto-exposure controls
media: ov2680: Fix ov2680_bayer_order()
media: ov2680: Fix vflip / hflip set functions
media: ov2680: Remove VIDEO_V4L2_SUBDEV_API ifdef-s
media: ov2680: Don't take the lock for try_fmt calls
media: ov2680: Add ov2680_fill_format() helper function
media: ov2680: Fix ov2680_set_fmt() which == V4L2_SUBDEV_FORMAT_TRY not working
media: ov2680: Fix regulators being left enabled on ov2680_power_on() errors
media: i2c: rdacm21: Fix uninitialized value
f2fs: fix to avoid mmap vs set_compress_option case
f2fs: judge whether discard_unit is section only when have CONFIG_BLK_DEV_ZONED
f2fs: Only lfs mode is allowed with zoned block device feature
Revert "f2fs: fix to do sanity check on extent cache correctly"
cgroup:namespace: Remove unused cgroup_namespaces_init()
coresight: trbe: Fix TRBE potential sleep in atomic context
RDMA/irdma: Prevent zero-length STAG registration
scsi: core: Use 32-bit hostnum in scsi_host_lookup()
scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
interconnect: qcom: sm8450: Enable sync_state
interconnect: qcom: bcm-voter: Improve enable_mask handling
interconnect: qcom: bcm-voter: Use enable_maks for keepalive voting
serial: tegra: handle clk prepare error in tegra_uart_hw_init()
amba: bus: fix refcount leak
Revert "IB/isert: Fix incorrect release of isert connection"
RDMA/siw: Balance the reference of cep->kref in the error path
RDMA/siw: Correct wrong debug message
RDMA/efa: Fix wrong resources deallocation order
HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()
HID: uclogic: Correct devm device reference for hidinput input_dev name
HID: multitouch: Correct devm device reference for hidinput input_dev name
platform/x86/amd/pmf: Fix a missing cleanup path
tick/rcu: Fix false positive "softirq work is pending" messages
x86/speculation: Mark all Skylake CPUs as vulnerable to GDS
tracing: Remove extra space at the end of hwlat_detector/mode
tracing: Fix race issue between cpu buffer write and swap
mtd: rawnand: brcmnand: Fix mtd oobsize
dmaengine: idxd: Modify the dependence of attribute pasid_enabled
phy/rockchip: inno-hdmi: use correct vco_div_5 macro on rk3328
phy/rockchip: inno-hdmi: round fractal pixclock in rk3328 recalc_rate
phy/rockchip: inno-hdmi: do not power on rk3328 post pll on reg write
rpmsg: glink: Add check for kstrdup
leds: pwm: Fix error code in led_pwm_create_fwnode()
leds: multicolor: Use rounded division when calculating color components
leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false
leds: trigger: tty: Do not use LED_ON/OFF constants, use led_blink_set_oneshot instead
mtd: spi-nor: Check bus width while setting QE bit
mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume()
um: Fix hostaudio build errors
dmaengine: ste_dma40: Add missing IRQ check in d40_probe
Drivers: hv: vmbus: Don't dereference ACPI root object handle
cpufreq: Fix the race condition while updating the transition_task of policy
virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU
netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c
netfilter: nft_exthdr: Fix non-linear header modification
netfilter: xt_u32: validate user space input
netfilter: xt_sctp: validate the flag_info count
skbuff: skb_segment, Call zero copy functions before using skbuff frags
igb: set max size RX buffer when store bad packet is enabled
PM / devfreq: Fix leak in devfreq_dev_release()
ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl
rcu: dump vmalloc memory info safely
printk: ringbuffer: Fix truncating buffer size min_t cast
scsi: core: Fix the scsi_set_resid() documentation
mm/vmalloc: add a safer version of find_vm_area() for debug
cpu/hotplug: Prevent self deadlock on CPU hot-unplug
media: i2c: ccs: Check rules is non-NULL
media: i2c: Add a camera sensor top level menu
PCI: rockchip: Use 64-bit mask on MSI 64-bit PCI address
ipmi_si: fix a memleak in try_smi_init()
ARM: OMAP2+: Fix -Warray-bounds warning in _pwrdm_state_switch()
XArray: Do not return sibling entries from xa_load()
io_uring: break iopolling on signal
backlight/gpio_backlight: Compare against struct fb_info.device
backlight/bd6107: Compare against struct fb_info.device
backlight/lv5207lp: Compare against struct fb_info.device
drm/amd/display: register edp_backlight_control() for DCN301
xtensa: PMU: fix base address for the newer hardware
LoongArch: mm: Add p?d_leaf() definitions
i3c: master: svc: fix probe failure when no i3c device exist
arm64: csum: Fix OoB access in IP checksum code for negative lengths
ALSA: hda/cirrus: Fix broken audio on hardware with two CS42L42 codecs.
media: dvb: symbol fixup for dvb_attach()
media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interrupts
Revert "scsi: qla2xxx: Fix buffer overrun"
scsi: mpt3sas: Perform additional retries if doorbell read returns 0
PCI: Free released resource after coalescing
PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernation
PCI/PM: Only read PCI_PM_CTRL register when available
ntb: Drop packets when qp link is down
ntb: Clean up tx tail index on link down
ntb: Fix calculation ntb_transport_tx_free_entry()
Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"
block: don't add or resize partition on the disk with GENHD_FL_NO_PART
procfs: block chmod on /proc/thread-self/comm
parisc: Fix /proc/cpuinfo output for lscpu
drm/amd/display: Add smu write msg id fail retry process
bpf: Fix issue in verifying allow_ptr_leaks
dlm: fix plock lookup when using multiple lockspaces
dccp: Fix out of bounds access in DCCP error handler
x86/sev: Make enc_dec_hypercall() accept a size instead of npages
r8169: fix ASPM-related issues on a number of systems with NIC version from RTL8168h
X.509: if signature is unsupported skip validation
net: handle ARPHRD_PPP in dev_is_mac_header_xmit()
fsverity: skip PKCS#7 parser when keyring is empty
x86/MCE: Always save CS register on AMD Zen IF Poison errors
platform/chrome: chromeos_acpi: print hex string for ACPI_TYPE_BUFFER
mmc: renesas_sdhi: register irqs before registering controller
pstore/ram: Check start of empty przs during init
arm64: sdei: abort running SDEI handlers during crash
s390/dcssblk: fix kernel crash with list_add corruption
s390/ipl: add missing secure/has_secure file to ipl type 'unknown'
s390/dasd: fix string length handling
crypto: stm32 - fix loop iterating through scatterlist for DMA
cpufreq: brcmstb-avs-cpufreq: Fix -Warray-bounds bug
of: property: fw_devlink: Add a devlink for panel followers
usb: typec: tcpm: set initial svdm version based on pd revision
usb: typec: bus: verify partner exists in typec_altmode_attention
USB: core: Unite old scheme and new scheme descriptor reads
USB: core: Change usb_get_device_descriptor() API
USB: core: Fix race by not overwriting udev->descriptor in hub_port_init()
USB: core: Fix oversight in SuperSpeed initialization
x86/sgx: Break up long non-preemptible delays in sgx_vepc_release()
perf/x86/uncore: Correct the number of CHAs on EMR
serial: sc16is7xx: remove obsolete out_thread label
serial: sc16is7xx: fix regression with GPIO configuration
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
Revert "drm/amd/display: Do not set drr on pipe commit"
md: Free resources in __md_stop
NFSv4.2: Fix a potential double free with READ_PLUS
NFSv4.2: Rework scratch handling for READ_PLUS (again)
md: fix regression for null-ptr-deference in __md_stop()
clk: Mark a fwnode as initialized when using CLK_OF_DECLARE() macro
treewide: Fix probing of devices in DT overlays
clk: Avoid invalid function names in CLK_OF_DECLARE()
udf: initialize newblock to 0
Linux 6.1.53
Change-Id: I6f5858bce0f20963ae42515eac36ac14cb686f24
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit c83ad36a18c02c0f51280b50272327807916987f upstream.
Currently, when call_rcu() is invoked twice on the same rcu_head, the
memory info of the rcu_head object is dumped. If the object was not
allocated from the slab allocator, vmalloc_dump_obj() is invoked, which
needs to take the vmap_area_lock spinlock. Because call_rcu() can be
invoked in interrupt context, a spinlock deadlock is possible.
On a Preempt-RT kernel, the rcutorture test also triggers the following
lockdep warning:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
3 locks held by swapper/0/1:
#0: ffffffffb534ee80 (fullstop_mutex){+.+.}-{4:4}, at: torture_init_begin+0x24/0xa0
#1: ffffffffb5307940 (rcu_read_lock){....}-{1:3}, at: rcu_torture_init+0x1ec7/0x2370
#2: ffffffffb536af40 (vmap_area_lock){+.+.}-{3:3}, at: find_vmap_area+0x1f/0x70
irq event stamp: 565512
hardirqs last enabled at (565511): [<ffffffffb379b138>] __call_rcu_common+0x218/0x940
hardirqs last disabled at (565512): [<ffffffffb5804262>] rcu_torture_init+0x20b2/0x2370
softirqs last enabled at (399112): [<ffffffffb36b2586>] __local_bh_enable_ip+0x126/0x170
softirqs last disabled at (399106): [<ffffffffb43fef59>] inet_register_protosw+0x9/0x1d0
Preemption disabled at:
[<ffffffffb58040c3>] rcu_torture_init+0x1f13/0x2370
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.5.0-rc4-rt2-yocto-preempt-rt+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x68/0xb0
dump_stack+0x14/0x20
__might_resched+0x1aa/0x280
? __pfx_rcu_torture_err_cb+0x10/0x10
rt_spin_lock+0x53/0x130
? find_vmap_area+0x1f/0x70
find_vmap_area+0x1f/0x70
vmalloc_dump_obj+0x20/0x60
mem_dump_obj+0x22/0x90
__call_rcu_common+0x5bf/0x940
? debug_smp_processor_id+0x1b/0x30
call_rcu_hurry+0x14/0x20
rcu_torture_init+0x1f82/0x2370
? __pfx_rcu_torture_leak_cb+0x10/0x10
? __pfx_rcu_torture_leak_cb+0x10/0x10
? __pfx_rcu_torture_init+0x10/0x10
do_one_initcall+0x6c/0x300
? debug_smp_processor_id+0x1b/0x30
kernel_init_freeable+0x2b9/0x540
? __pfx_kernel_init+0x10/0x10
kernel_init+0x1f/0x150
ret_from_fork+0x40/0x50
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
The previous patch fixes this by using the deadlock-safe best-effort
version of find_vm_area. However, in case of failure print the fact that
the pointer was a vmalloc pointer so that we print at least something.
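As an illustration only, the best-effort idea described above can be
modelled in user space: try to take the lock without blocking, and if
that fails, still report that the pointer is a vmalloc pointer. This is
a sketch, not the kernel code; area_lock, areas and describe_pointer()
are hypothetical names standing in for vmap_area_lock, the vmap area
tree and the dump helper.

```python
import threading

# Hypothetical stand-ins for vmap_area_lock and the vmap area tree.
area_lock = threading.Lock()
areas = {0x1000: ("demo_caller", 2)}  # start address -> (caller, nr_pages)


def describe_pointer(addr):
    """Describe addr without ever blocking on area_lock."""
    # Non-blocking acquire, i.e. a trylock: a caller that must not wait
    # (for example one running in interrupt context) cannot deadlock here.
    if not area_lock.acquire(blocking=False):
        # Best effort failed, but still print at least something.
        return f"{addr:#x}: vmalloc pointer (details unavailable)"
    try:
        info = areas.get(addr)
    finally:
        area_lock.release()
    if info is None:
        return f"{addr:#x}: vmalloc pointer (no area found)"
    caller, nr_pages = info
    return f"{addr:#x}: {nr_pages}-page vmalloc region allocated by {caller}"
```

If the lock is free the caller gets the full description; if it is
already held, the call returns immediately with the reduced message
instead of spinning.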
Link: https://lkml.kernel.org/r/20230904180806.1002832-2-joel@joelfernandes.org
Fixes: 98f180837a ("mm: Make mem_dump_obj() handle vmalloc() memory")
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It is apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially converted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-uninitialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěna.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added additional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
Replace any vm_next use with vma_find().
Update free_pgtables(), unmap_vmas(), and zap_page_range() to use the
maple tree.
Use the new free_pgtables() and unmap_vmas() in do_mas_align_munmap(). At
the same time, alter the loop to be more compact.
Now that free_pgtables() and unmap_vmas() take a maple tree as an
argument, rearrange do_mas_align_munmap() to use the new tree to hold the
vmas to remove.
Remove __vma_link_list() and __vma_unlink_list() as they are exclusively
used to update the linked list.
Drop linked list update from __insert_vm_struct().
Rework validation of tree as it was depending on the linked list.
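To illustrate why a range-indexed structure can replace vm_next
walking, here is a small user-space model. It is only a sketch: a sorted
Python list with bisect stands in for the maple tree, and areas and
find_from() are hypothetical names, not kernel interfaces.

```python
import bisect

# Each area is a half-open (start, end) range. The list is kept sorted and
# non-overlapping, standing in for the range-indexed tree.
areas = [(0x1000, 0x3000), (0x5000, 0x6000), (0x8000, 0xA000)]
starts = [start for start, _ in areas]


def find_from(addr, limit):
    """Return the areas intersecting [addr, limit), in address order."""
    # Index of the first area starting after addr; the area just before
    # it may still cover addr, so step back if it does.
    i = bisect.bisect_right(starts, addr)
    if i > 0 and areas[i - 1][1] > addr:
        i -= 1
    found = []
    while i < len(areas) and areas[i][0] < limit:
        found.append(areas[i])
        i += 1
    return found
```

The lookup needs no per-area next pointer: the ordering comes from the
index itself, which is the property that lets the linked list go away.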
[yang.lee@linux.alibaba.com: fix one kernel-doc comment]
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1949
Link: https://lkml.kernel.org/r/20220824021918.94116-1-yang.lee@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220906194824.2110408-69-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Remove the RB tree and start using the maple tree for vm_area_struct
tracking.
Drop validate_mm() calls in expand_upwards() and expand_downwards() as the
lock is not held.
Link: https://lkml.kernel.org/r/20220906194824.2110408-18-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If a process does not have enough memory to allocate a new virtual
mapping, we may hit various kinds of errors, e.g. fork cannot allocate
memory or SIGBUS in shmem, but it is difficult to confirm the cause. Add
some debug information to make this scenario easy to check when
__vm_enough_memory() fails.
Link: https://lkml.kernel.org/r/20220726145428.8030-1-wangkefeng.wang@huawei.com
Reported-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
folio_test_hugetlb() calls PageHeadHuge(), which is a function call and
so blocks the compiler from recognizing this redundant load.
After rearranging the code, stack usage drops from 32 to 24 bytes, and
the function is smaller (tested on GCC 12):
Before:
Stack usage:
mm/util.c:845:5:folio_mapcount 32 static
Size:
0000000000000ea0 00000000000000c7 T folio_mapcount
After:
Stack usage:
mm/util.c:845:5:folio_mapcount 24 static
Size:
0000000000000ea0 00000000000000b0 T folio_mapcount
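For readers who want the savings as plain numbers, the hexadecimal size
column quoted above can be converted as follows. This is just arithmetic
on the figures in this message; the variable names are illustrative.

```python
# Size column (hex) of the two symbol-table lines quoted above.
before = int("00000000000000c7", 16)
after = int("00000000000000b0", 16)

saved_text = before - after   # bytes of code removed
saved_stack = 32 - 24         # bytes of stack removed

print(f"text:  {before} -> {after} bytes ({saved_text} saved)")
print(f"stack: 32 -> 24 bytes ({saved_stack} saved)")
```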
Link: https://lkml.kernel.org/r/20220801173155.92008-1-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Most of the MM queue. A few things are still pending.
Liam's maple tree rework didn't make it. This has resulted in a few
other minor patch series being held over for next time.
Multi-gen LRU still isn't merged as we were waiting for mapletree to
stabilize. The current plan is to merge MGLRU into -mm soon and to
later reintroduce mapletree, with a view to hopefully getting both
into 6.1-rc1.
Summary:
- The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
- Some kmemleak fixes from Patrick Wang and Waiman Long
- DAMON updates from SeongJae Park
- memcg debug/visibility work from Roman Gushchin
- vmalloc speedup from Uladzislau Rezki
- more folio conversion work from Matthew Wilcox
- enhancements for coherent device memory mapping from Alex Sierra
- addition of shared pages tracking and CoW support for fsdax, from
Shiyang Ruan
- hugetlb optimizations from Mike Kravetz
- Mel Gorman has contributed some pagealloc changes to improve
latency and realtime behaviour.
- mprotect soft-dirty checking has been improved by Peter Xu
- Many other singleton patches all over the place"
[ XFS merge from hell as per Darrick Wong in
https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]
* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
tools/testing/selftests/vm/hmm-tests.c: fix build
mm: Kconfig: fix typo
mm: memory-failure: convert to pr_fmt()
mm: use is_zone_movable_page() helper
hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
hugetlbfs: cleanup some comments in inode.c
hugetlbfs: remove unneeded header file
hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
hugetlbfs: use helper macro SZ_1{K,M}
mm: cleanup is_highmem()
mm/hmm: add a test for cross device private faults
selftests: add soft-dirty into run_vmtests.sh
selftests: soft-dirty: add test for mprotect
mm/mprotect: fix soft-dirty check in can_change_pte_writable()
mm: memcontrol: fix potential oom_lock recursion deadlock
mm/gup.c: fix formatting in check_and_migrate_movable_page()
xfs: fail dax mount if reflink is enabled on a partition
mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
userfaultfd: don't fail on unrecognized features
hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
...
These drivers are rather uncomfortably hammered into the
address_space_operations hole. They aren't filesystems and don't behave
like filesystems. They just need their own movable_operations structure,
which we can point to directly from page->mapping.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
so it will be consistent with code mm directory and with
Documentation/admin-guide/mm and won't be confused with virtual machines.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Wu XiangCheng <bobwxc@email.cn>
Steps on the way to 5.19-rc1
Resolves merge conflict in:
fs/proc/base.c
mm/util.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I55d6d0cadb4dbd0f978a1e7d14444068bd050a7c
Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)
- Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later)
- Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ
- Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later
- Drop support for system call instruction emulation
- Many other small features and fixes
Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas
Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian
King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank
Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai,
Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof
Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes,
Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas
Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras,
Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang
wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing,
Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng.
* tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
powerpc/64: Include cache.h directly in paca.h
powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
powerpc/xics: Include missing header
powerpc/powernv/pci: Drop VF MPS fixup
powerpc/fsl_book3e: Don't set rodata RO too early
powerpc/microwatt: Add mmu bits to device tree
powerpc/powernv/flash: Check OPAL flash calls exist before using
powerpc/powermac: constify device_node in of_irq_parse_oldworld()
powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
powerpc: Enable the DAWR on POWER9 DD2.3 and above
powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
powerpc: Fix all occurences of "the the"
selftests/powerpc/pmu/ebb: remove fixed_instruction.S
powerpc/platforms/83xx: Use of_device_get_match_data()
powerpc/eeh: Drop redundant spinlock initialization
powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
powerpc/pseries/vas: Call misc_deregister if sysfs init fails
powerpc/papr_scm: Fix leaking nvdimm_events_map elements
...
Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Almost all of MM here. A few things are still getting finished off,
reviewed, etc.
- Yang Shi has improved the behaviour of khugepaged collapsing of
readonly file-backed transparent hugepages.
- Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
- Muchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
runtime enablement of the recent huge page vmemmap optimization
feature.
- Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
- Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
- Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
- David Vernet has done some fixup work on the memcg selftests.
- Peter Xu has taught userfaultfd to handle write protection faults
against shmem- and hugetlbfs-backed files.
- More DAMON development from SeongJae Park - adding online tuning of
the feature and support for monitoring of fixed virtual address
ranges. Also easier discovery of which monitoring operations are
available.
- Nadav Amit has done some optimization of TLB flushing during
mprotect().
- Neil Brown continues to labor away at improving our swap-over-NFS
support.
- David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
- Peng Liu fixed some errors in the core hugetlb code.
- Joao Martins has reduced the amount of memory consumed by
device-dax's compound devmaps.
- Some cleanups of the arch-specific pagemap code from Anshuman
Khandual.
- Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
- Roman Gushchin has done more work on the memcg selftests.
... and, of course, many smaller fixes and cleanups. Notably, the
customary million cleanup serieses from Miaohe Lin"
* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
mm: kfence: use PAGE_ALIGNED helper
selftests: vm: add the "settings" file with timeout variable
selftests: vm: add "test_hmm.sh" to TEST_FILES
selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
selftests: vm: add migration to the .gitignore
selftests/vm/pkeys: fix typo in comment
ksm: fix typo in comment
selftests: vm: add process_mrelease tests
Revert "mm/vmscan: never demote for memcg reclaim"
mm/kfence: print disabling or re-enabling message
include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
mm: fix a potential infinite loop in start_isolate_page_range()
MAINTAINERS: add Muchun as co-maintainer for HugeTLB
zram: fix Kconfig dependency warning
mm/shmem: fix shmem folio swapoff hang
cgroup: fix an error handling path in alloc_pagecache_max_30M()
mm: damon: use HPAGE_PMD_SIZE
tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
nodemask.h: fix compilation error with GCC12
...
randomize_page is an mm function. It is documented like one. It contains
the history of one. It has the naming convention of one. It looks
just like another very similar function in mm, randomize_stack_top().
And it has always been maintained and updated by mm people. There is no
need for it to be in random.c. In the "which shape does not look like
the other ones" test, pointing to randomize_page() is correct.
So move randomize_page() into mm/util.c, right next to the similar
randomize_stack_top() function.
This commit contains no actual code changes.
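For context, what randomize_page() computes can be modelled in user
space roughly as below: a page-aligned address picked at random inside
[start, start + range). This is a sketch, not the kernel source. It
assumes 4 KiB pages and a 64-bit unsigned long, uses the secrets module
as the random source, and clamps instead of wrapping when the range is
smaller than the alignment adjustment; randomize_page_model() and
page_align() are hypothetical names.

```python
import secrets

PAGE_SHIFT = 12                   # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
ULONG_MAX = (1 << 64) - 1         # assumption: 64-bit unsigned long


def page_align(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def randomize_page_model(start, rng):
    """Pick a page-aligned address in [start, start + rng)."""
    aligned = page_align(start)
    if aligned != start:
        # Give up the bytes skipped by aligning start upwards.
        rng = max(0, rng - (aligned - start))
        start = aligned
    if start > ULONG_MAX - rng:
        # Clamp so that start + rng cannot run past the address space.
        rng = ULONG_MAX - start
    pages = rng >> PAGE_SHIFT
    if pages == 0:
        return start
    return start + (secrets.randbelow(pages) << PAGE_SHIFT)
```

A range shorter than one page always yields the (aligned) start address,
so callers get a usable value even when there is nothing to randomize.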
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Patch series "MM changes to improve swap-over-NFS support".
Assorted improvements for swap-via-filesystem.
This is a resend of these patches, rebased on current HEAD. The only
substantial change is that swap_dirty_folio has replaced
swap_set_page_dirty.
Currently swap-via-fs (SWP_FS_OPS) doesn't work for any filesystem. It
has previously worked for NFS but that broke a few releases back. This
series changes to use a new ->swap_rw rather than ->readpage and
->direct_IO. It also makes other improvements.
There is a companion series already in linux-next which fixes various
issues with NFS. Once both series land, a final patch is needed which
changes NFS over to use ->swap_rw.
This patch (of 10):
Many functions declared in include/linux/swap.h are only used within mm/.
Create a new "mm/swap.h" and move some of these declarations there.
Remove the redundant 'extern' from the function declarations.
[akpm@linux-foundation.org: mm/memory-failure.c needs mm/swap.h]
Link: https://lkml.kernel.org/r/164859751830.29473.5309689752169286816.stgit@noble.brown
Link: https://lkml.kernel.org/r/164859778120.29473.11725907882296224053.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: David Howells <dhowells@redhat.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit e7142bf5d2 ("arm64, mm: make randomization selected by
generic topdown mmap layout") introduced a default version of
arch_randomize_brk() provided when
CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT is selected.
powerpc could select CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
but needs to provide its own arch_randomize_brk().
In order to allow that, define generic version of arch_randomize_brk()
as a __weak symbol.
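
A minimal userspace sketch of the weak-default pattern described above,
assuming a GCC/Clang toolchain. WEAK_SKETCH, struct mm_sketch and
arch_randomize_brk_sketch() are hypothetical stand-ins, not the kernel's
definitions:

```c
/* WEAK_SKETCH is a hypothetical stand-in for the kernel's __weak macro. */
#define WEAK_SKETCH __attribute__((weak))

/* Hypothetical, heavily reduced stand-in for struct mm_struct. */
struct mm_sketch {
	unsigned long brk;
};

/*
 * Generic fallback. Because the symbol is weak, a strong definition with
 * the same name in another object file replaces it at link time.
 */
WEAK_SKETCH unsigned long arch_randomize_brk_sketch(struct mm_sketch *mm)
{
	return mm->brk; /* placeholder policy: leave brk unchanged */
}
```

An architecture that wants its own behaviour would simply provide a
non-weak function of the same name in its own source file; no #ifdef is
needed at the call site.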
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b222f1ca06c850daf1b2f26afdb46c6dd97d21ba.1649523076.git.christophe.leroy@csgroup.eu
Since commit 559089e0a9 ("vmalloc: replace VM_NO_HUGE_VMAP with
VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an
opt-in strategy, because it caused a number of problems that weren't
noticed until x86 enabled it too.
One of the issues was fixed by Nick Piggin in commit 3b8000ae18
("mm/vmalloc: huge vmalloc backing pages should be split rather than
compound"), but I'm still worried about page protection issues, and
VM_FLUSH_RESET_PERMS in particular.
However, like the hash table allocation case (commit f2edd118d0:
"page_alloc: use vmalloc_huge for large system hash"), the use of
kvmalloc() should be safe from any such games, since the returned
pointer might be a SLUB allocation, and as such no user should
reasonably be using it in any odd ways.
We also know that the allocations are fairly large, since it falls back
to the vmalloc case only when a kmalloc() fails. So using a hugepage
mapping seems both safe and relevant.
This patch does show a weakness in the opt-in strategy: since the opt-in
flag is in the 'vm_flags', not the usual gfp_t allocation flags, very
few of the usual interfaces actually expose it.
That's not much of an issue in this case that already used one of the
fairly specialized low-level vmalloc interfaces for the allocation, but
for a lot of other vmalloc() users that might want to opt in, it's going
to be very inconvenient.
We'll either have to fix any compatibility problems, or expose it in the
gfp flags (__GFP_COMP would have made a lot of sense) to allow normal
vmalloc() users to use hugepage mappings. That said, the cases that
really matter were probably already taken care of by the hash table
allocation.
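
The fallback order that the reasoning above relies on can be sketched in
userspace as follows. This is an illustration only: every name here is
hypothetical, malloc() stands in for both allocators, and slab_cap merely
simulates the point at which a contiguous allocation would fail:

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_SKETCH 4096u

enum kv_backing { KV_NONE, KV_SLAB, KV_VMALLOC };

struct kv_result {
	void *ptr;
	enum kv_backing backing;
};

/* Stand-in for kmalloc(): pretend contiguous allocation fails above a cap. */
static void *slab_alloc_sketch(size_t size, size_t slab_cap)
{
	return size <= slab_cap ? malloc(size) : NULL;
}

/* Stand-in for the vmalloc path, where a hugepage mapping could be requested. */
static void *vmalloc_sketch(size_t size)
{
	return malloc(size);
}

static struct kv_result kvmalloc_sketch(size_t size, size_t slab_cap)
{
	struct kv_result r = { slab_alloc_sketch(size, slab_cap), KV_SLAB };

	if (r.ptr)
		return r;

	/* A failed sub-page request is not retried through vmalloc. */
	if (size <= PAGE_SIZE_SKETCH) {
		r.backing = KV_NONE;
		return r;
	}

	/* Only large requests reach this point. */
	r.ptr = vmalloc_sketch(size);
	r.backing = r.ptr ? KV_VMALLOC : KV_NONE;
	return r;
}
```

Since the caller cannot tell which backing it received, it has no basis
for changing page protections on the returned pointer, which is the
property the message above depends on.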
Link: https://lore.kernel.org/all/20220415164413.2727220-1-song@kernel.org/
Link: https://lore.kernel.org/all/CAHk-=whao=iosX1s5Z4SF-ZGa-ebAukJoAdUJFk5SPwnofV+Vg@mail.gmail.com/
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Proper emulation of the OSLock feature of the debug architecture
- Scalability improvements for the MMU lock when dirty logging is on
- New VMID allocator, which will eventually help with SVA in VMs
- Better support for PMUs in heterogeneous systems
- PSCI 1.1 support, enabling support for SYSTEM_RESET2
- Implement CONFIG_DEBUG_LIST at EL2
- Make CONFIG_ARM64_ERRATUM_2077057 default y
- Reduce the overhead of VM exit when no interrupt is pending
- Remove traces of 32bit ARM host support from the documentation
- Updated vgic selftests
- Various cleanups, doc updates and spelling fixes
RISC-V:
- Prevent KVM_COMPAT from being selected
- Optimize __kvm_riscv_switch_to() implementation
- RISC-V SBI v0.3 support
s390:
- memop selftest
- fix SCK locking
- adapter interruptions virtualization for secure guests
- add Claudio Imbrenda as maintainer
- first step to do proper storage key checking
x86:
- Continue switching kvm_x86_ops to static_call(); introduce
static_call_cond() and __static_call_ret0 when applicable.
- Cleanup unused arguments in several functions
- Synthesize AMD 0x80000021 leaf
- Fixes and optimization for Hyper-V sparse-bank hypercalls
- Implement Hyper-V's enlightened MSR bitmap for nested SVM
- Remove MMU auditing
- Eager splitting of page tables (new aka "TDP" MMU only) when dirty
page tracking is enabled
- Cleanup the implementation of the guest PGD cache
- Preparation for the implementation of Intel IPI virtualization
- Fix some segment descriptor checks in the emulator
- Allow AMD AVIC support on systems with physical APIC ID above 255
- Better API to disable virtualization quirks
- Fixes and optimizations for the zapping of page tables:
- Zap roots in two passes, avoiding RCU read-side critical
sections that last too long for very large guests backed by 4
KiB SPTEs.
- Zap invalid and defunct roots asynchronously via
concurrency-managed work queue.
- Allow yielding when zapping TDP MMU roots in response to the
root's last reference being put.
- Batch more TLB flushes with an RCU trick. Whoever frees the
paging structure now holds RCU as a proxy for all vCPUs running
in the guest, i.e. to prolong the grace period on their behalf.
It then kicks the vCPUs out of guest mode before doing
rcu_read_unlock().
Generic:
- Introduce __vcalloc and use it for very large allocations that need
memcg accounting"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
KVM: use kvcalloc for array allocations
KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
kvm: x86: Require const tsc for RT
KVM: x86: synthesize CPUID leaf 0x80000021h if useful
KVM: x86: add support for CPUID leaf 0x80000021
KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask
Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
KVM: arm64: fix typos in comments
KVM: arm64: Generalise VM features into a set of flags
KVM: s390: selftests: Add error memop tests
KVM: s390: selftests: Add more copy memop tests
KVM: s390: selftests: Add named stages for memop test
KVM: s390: selftests: Add macro as abstraction for MEM_OP
KVM: s390: selftests: Split memop tests
KVM: s390x: fix SCK locking
RISC-V: KVM: Implement SBI HSM suspend call
RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
RISC-V: Add SBI HSM suspend related defines
RISC-V: KVM: Implement SBI v0.3 SRST extension
...
Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache
Pull folio updates from Matthew Wilcox:
- Rewrite how munlock works to massively reduce the contention on
i_mmap_rwsem (Hugh Dickins):
https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
- Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
Hellwig):
https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
- Convert GUP to use folios and make pincount available for order-1
pages. (Matthew Wilcox)
- Convert a few more truncation functions to use folios (Matthew
Wilcox)
- Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
Wilcox)
- Convert rmap_walk to use folios (Matthew Wilcox)
- Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
- Add support for creating large folios in readahead (Matthew Wilcox)
* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
mm/damon: minor cleanup for damon_pa_young
selftests/vm/transhuge-stress: Support file-backed PMD folios
mm/filemap: Support VM_HUGEPAGE for file mappings
mm/readahead: Switch to page_cache_ra_order
mm/readahead: Align file mappings for non-DAX
mm/readahead: Add large folio readahead
mm: Support arbitrary THP sizes
mm: Make large folios depend on THP
mm: Fix READ_ONLY_THP warning
mm/filemap: Allow large folios to be added to the page cache
mm: Turn can_split_huge_page() into can_split_folio()
mm/vmscan: Convert pageout() to take a folio
mm/vmscan: Turn page_check_references() into folio_check_references()
mm/vmscan: Account large folios correctly
mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
mm/vmscan: Free non-shmem folios without splitting them
mm/rmap: Constify the rmap_walk_control argument
mm/rmap: Convert rmap_walk() to take a folio
mm: Turn page_anon_vma() into folio_anon_vma()
mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
...
This implements the same algorithm as total_mapcount(), which is
transformed into a wrapper function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Linux has dozens of occurrences of vmalloc(array_size()) and
vzalloc(array_size()). Simplify such code by providing
vmalloc_array and vcalloc, as well as the underscored variants that let
the caller specify the GFP flags.
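
As a rough userspace illustration of what such helpers do, the sketch
below uses malloc()/calloc() in place of vmalloc()/vzalloc() and the
GCC/Clang builtin __builtin_mul_overflow() for the size check. The
*_sketch names are hypothetical and the GFP-flag variants are omitted:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Overflow-checked n * size; saturates to SIZE_MAX on overflow. */
static size_t array_size_sketch(size_t n, size_t size)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, size, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Uninitialised array of n elements, or NULL if n * size overflows. */
static void *vmalloc_array_sketch(size_t n, size_t size)
{
	size_t bytes = array_size_sketch(n, size);

	return bytes == SIZE_MAX ? NULL : malloc(bytes);
}

/* Zeroed array of n elements, or NULL if n * size overflows. */
static void *vcalloc_sketch(size_t n, size_t size)
{
	size_t bytes = array_size_sketch(n, size);

	return bytes == SIZE_MAX ? NULL : calloc(1, bytes);
}
```

A call site that used to spell out the multiplication by hand can then
pass the element count and element size separately and let the helper
reject an overflowing product.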
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
syzkaller was recently triggering an oversized kvmalloc() warning via
xdp_umem_create().
The triggered warning was added back in 7661809d49 ("mm: don't allow
oversized kvmalloc() calls"). The rationale for the warning for huge
kvmalloc sizes was as a reaction to a security bug where the size was
more than UINT_MAX but not everything was prepared to handle unsigned
long sizes.
Anyway, the AF_XDP related call trace from this syzkaller report was:
kvmalloc include/linux/mm.h:806 [inline]
kvmalloc_array include/linux/mm.h:824 [inline]
kvcalloc include/linux/mm.h:829 [inline]
xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
__sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
__do_sys_setsockopt net/socket.c:2187 [inline]
__se_sys_setsockopt net/socket.c:2184 [inline]
__x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Björn mentioned that requests for >2GB allocation can still be valid:
The structure that is being allocated is the page-pinning accounting.
AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but
still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/
PAGE_SIZE on 64 bit systems). [...]
I could just change from U32_MAX to INT_MAX, but as I stated earlier
that has a hacky feeling to it. [...] From my perspective, the code
isn't broken, with the memcg limits in consideration. [...]
Linus says:
[...] Pretty much every time this has come up, the kernel warning has
shown that yes, the code was broken and there really wasn't a reason
for doing allocations that big.
Of course, some people would be perfectly fine with the allocation
failing, they just don't want the warning. I didn't want __GFP_NOWARN
to shut it up originally because I wanted people to see all those
cases, but these days I think we can just say "yeah, people can shut
it up explicitly by saying 'go ahead and fail this allocation, don't
warn about it'".
So enough time has passed that by now I'd certainly be ok with [it].
Thus allow call-sites to silence such userspace triggered splats if the
allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call
to kvcalloc() this is already the case, so nothing else needed there.
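
The resulting behaviour can be sketched in userspace roughly as follows.
GFP_NOWARN_SKETCH is a hypothetical flag bit (not the kernel's value), a
counter stands in for the warning splat, and malloc() stands in for the
real allocation:

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define GFP_NOWARN_SKETCH 0x1u /* hypothetical bit, not the kernel value */

static unsigned int oversize_warnings; /* counts would-be warning splats */

static void *kvmalloc_checked_sketch(size_t size, unsigned int flags)
{
	if (size > INT_MAX) {
		/* Warn only when the caller did not opt out. */
		if (!(flags & GFP_NOWARN_SKETCH))
			oversize_warnings++;
		return NULL; /* the allocation fails either way */
	}
	return malloc(size);
}
```

Note that the flag only silences the report: an oversized request still
fails, so callers must handle NULL regardless.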
Fixes: 7661809d49 ("mm: don't allow oversized kvmalloc() calls")
Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Cc: Björn Töpel <bjorn@kernel.org>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented by
previous patches so we can allow the support for kvmalloc. This will
allow some external users to simplify or completely remove their
helpers.
GFP_NOWAIT semantics have not been supported so far, but that has never
been explicitly documented, so let's add a note about it.
ceph_kvmalloc is the first helper to be dropped and changed to kvmalloc.
Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This vendor hook gives us a point at which to check the currently
running task, validate its credentials and perform related
operations.
Bug: 191291287
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: If20bd8bb8311ad10a374033734fbdc7ef61a7704
(cherry picked from commit a5543c9cd718cf3ac51b4065110213e5535d4ee5)
This is the folio equivalent of migrate_page_copy(), which is retained
as a wrapper for filesystems which are not yet converted to folios.
Also convert copy_huge_page() to folio_copy().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
This is a default implementation which calls flush_dcache_page() on
each page in the folio. If architectures can do better, they should
implement their own version of it.
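
A minimal sketch of such a per-page default, using hypothetical
userspace stand-ins for the page and folio types. A counter replaces the
real cache maintenance so the effect is observable:

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real structures look nothing like this. */
struct page_sketch {
	int flushed; /* how many times this page was "flushed" */
};

struct folio_sketch {
	struct page_sketch *pages;
	size_t nr_pages;
};

/* Stand-in for the per-page architecture hook. */
static void flush_dcache_page_sketch(struct page_sketch *page)
{
	page->flushed++;
}

/* Default: visit every page in the folio exactly once. */
static void flush_dcache_folio_sketch(struct folio_sketch *folio)
{
	for (size_t i = 0; i < folio->nr_pages; i++)
		flush_dcache_page_sketch(&folio->pages[i]);
}
```

An architecture that can flush a whole range in one operation would
replace the loop rather than the per-page helper.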
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Convert __page_rmapping to folio_raw_mapping and move it to mm/internal.h.
It's only a couple of instructions (load and mask), so it's definitely
going to be cheaper to inline it than call it. Leave page_rmapping
out of line. Change page_anon_vma() to not call folio_raw_mapping() --
it's more efficient to do the subtraction than the mask.
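
The difference between the mask and the subtraction can be illustrated
with the userspace sketch below. It assumes the low two bits of the
mapping word carry the type flags, with bit 0 marking an anonymous
mapping; the *_SKETCH names and the reduced folio type are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define MAPPING_ANON_SKETCH    0x1u
#define MAPPING_MOVABLE_SKETCH 0x2u
#define MAPPING_FLAGS_SKETCH   (MAPPING_ANON_SKETCH | MAPPING_MOVABLE_SKETCH)

/* Hypothetical reduced folio: only the tagged mapping word. */
struct folio_map_sketch {
	uintptr_t mapping;
};

/* Load and mask: strip the flag bits whatever they are. */
static void *folio_raw_mapping_sketch(const struct folio_map_sketch *folio)
{
	return (void *)(folio->mapping & ~(uintptr_t)MAPPING_FLAGS_SKETCH);
}

/*
 * Once the flags are known to be exactly "anon", subtracting that
 * constant yields the same pointer as masking would.
 */
static void *folio_anon_vma_sketch(const struct folio_map_sketch *folio)
{
	uintptr_t mapping = folio->mapping;

	if ((mapping & MAPPING_FLAGS_SKETCH) != MAPPING_ANON_SKETCH)
		return NULL;
	return (void *)(mapping - MAPPING_ANON_SKETCH);
}
```

The subtraction can fold into the addressing of a later access, which is
presumably where the saving mentioned above comes from.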
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
This function is the equivalent of page_mapped(). It is slightly
shorter as we do not need to handle the PageTail() case. Reimplement
page_mapped() as a wrapper around folio_mapped(). folio_mapped()
is 13 bytes smaller than page_mapped(), but the page_mapped() wrapper
is 30 bytes, for a net increase of 17 bytes of text.
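
The wrapper arrangement can be sketched as below. The types are
hypothetical and the mapcount is simplified to a plain non-negative
counter; the point is only that the page-based entry point resolves its
folio and delegates:

```c
#include <stdbool.h>

/* Hypothetical stand-ins: a folio with a simplified map count. */
struct folio_stub {
	int mapcount;
};

/* Every page, head or tail, points at the folio that contains it. */
struct page_stub {
	struct folio_stub *folio;
};

static struct folio_stub *page_folio_sketch(struct page_stub *page)
{
	return page->folio;
}

/* Folio version: no tail-page case to handle. */
static bool folio_mapped_sketch(const struct folio_stub *folio)
{
	return folio->mapcount > 0;
}

/* Page version kept as a thin wrapper for unconverted callers. */
static bool page_mapped_sketch(struct page_stub *page)
{
	return folio_mapped_sketch(page_folio_sketch(page));
}
```

Callers that already hold a folio skip the lookup entirely, so the extra
wrapper bytes should shrink away as more callers are converted.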
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
These are the folio equivalent of page_mapping() and page_file_mapping().
Add an out-of-line page_mapping() wrapper around folio_mapping()
in order to prevent the page_folio() call from bloating every caller
of page_mapping(). Adjust page_file_mapping() and page_mapping_file()
to use folios internally. Rename __page_file_mapping() to
swapcache_mapping() and change it to take a folio.
This ends up saving 122 bytes of text overall. folio_mapping() is
45 bytes shorter than page_mapping() was, but the new page_mapping()
wrapper is 30 bytes. The major reduction is a few bytes less in dozens
of nfs functions (which call page_file_mapping()). Most of these appear
to be a slight change in gcc's register allocation decisions, which allow:
48 8b 56 08 mov 0x8(%rsi),%rdx
48 8d 42 ff lea -0x1(%rdx),%rax
83 e2 01 and $0x1,%edx
48 0f 44 c6 cmove %rsi,%rax
to become:
48 8b 46 08 mov 0x8(%rsi),%rax
48 8d 78 ff lea -0x1(%rax),%rdi
a8 01 test $0x1,%al
48 0f 44 fe cmove %rsi,%rdi
for a reduction of a single byte. Once the NFS client is converted to
use folios, this entire sequence will disappear.
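
Read back into C, the instruction sequence above looks like a head-page
lookup: load the word at offset 0x8, test bit 0, and return either that
word minus one or the original pointer. The sketch below is an
interpretation under that assumption, with a hypothetical reduced page
layout for an LP64 target:

```c
#include <stdint.h>

/* Hypothetical reduced page layout for an LP64 target. */
struct page_ch_sketch {
	unsigned long flags;     /* offset 0x0 */
	uintptr_t compound_head; /* offset 0x8; bit 0 set means tail page */
};

static struct page_ch_sketch *compound_head_sketch(struct page_ch_sketch *page)
{
	uintptr_t head = page->compound_head; /* mov  0x8(%rsi),%rax  */

	if (head & 1)                         /* test $0x1,%al        */
		return (struct page_ch_sketch *)(head - 1); /* lea -0x1(%rax),%rdi */
	return page;                          /* cmove %rsi,%rdi      */
}
```

A caller that starts from a folio never needs this lookup, which is
consistent with the statement that the sequence disappears after
conversion.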
Also add folio_mapping() documentation.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: David Howells <dhowells@redhat.com>