05e7d49733
414 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Greg Kroah-Hartman
|
012423e6bd |
Merge 5.10.228 into android12-5.10-lts
Changes in 5.10.228 ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2 net: enetc: add missing static descriptor and inline keyword posix-clock: Fix missing timespec64 check in pc_clock_settime() arm64: probes: Remove broken LDR (literal) uprobe support arm64: probes: Fix simulate_ldr*_literal() net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1 fat: fix uninitialized variable mm/swapfile: skip HugeTLB pages for unuse_vma wifi: mac80211: fix potential key use-after-free KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() io_uring/sqpoll: do not allow pinning outside of cpuset io_uring/sqpoll: retain test for whether the CPU is valid io_uring/sqpoll: do not put cpumask on stack s390/sclp_vt220: Convert newlines to CRLF instead of LFCR KVM: s390: Change virtual to physical address access in diag 0x258 handler x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET x86/cpufeatures: Add a IBPB_NO_RET BUG flag x86/entry: Have entry_ibpb() invalidate return predictions x86/bugs: Skip RSB fill at VMEXIT x86/bugs: Do not use UNTRAIN_RET with IBPB on entry blk-rq-qos: fix crash on rq_qos_wait vs. 
rq_qos_wake_function race io_uring/sqpoll: close race on waiting for sqring entries drm/radeon: Fix encoder->possible_clones drm/vmwgfx: Handle surface check failure correctly iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency() iio: light: veml6030: fix ALS sensor resolution iio: light: veml6030: fix IIO device retrieval from embedded device iio: light: opt3001: add missing full-scale range value iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig Bluetooth: Remove debugfs directory on module init failure Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001 xhci: Fix incorrect stream context type macro USB: serial: option: add support for Quectel EG916Q-GL USB: serial: option: add Telit FN920C04 MBIM compositions parport: Proper fix for array out-of-bounds access x86/resctrl: Annotate get_mem_config() functions as __init x86/apic: Always explicitly disarm TSC-deadline timer x86/entry_32: Do not clobber user EFLAGS.ZF x86/entry_32: Clear CPU buffers after register restore in NMI return irqchip/gic-v4: Don't allow a VMOVP on a dying VPE mptcp: track and update contiguous data status mptcp: handle consistently DSS corruption tcp: fix mptcp DSS corruption due to large pmtu xmit nilfs2: propagate directory read errors from nilfs_find_entry() powerpc/mm: Always update max/min_low_pfn in mem_topology_setup() ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2 Linux 5.10.228 Change-Id: I46a08618e1091915449af89690af27a230a28855 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Liu Shixin
|
417d5838ca |
mm/swapfile: skip HugeTLB pages for unuse_vma
commit 7528c4fb1237512ee18049f852f014eba80bbe8d upstream.
I got a bad pud error and lost a 1GB HugeTLB when calling swapoff. The
problem can be reproduced by the following steps:
1. Allocate an anonymous 1GB HugeTLB and some other anonymous memory.
2. Swap out the above anonymous memory.
3. Run swapoff; a bad pud error appears in the kernel log:
mm/pgtable-generic.c:42: bad pud 00000000743d215d(84000001400000e7)
Using ftrace, we can tell that pud_clear_bad() is called by
pud_none_or_clear_bad() in unuse_pud_range(). As a result, the HugeTLB
page is never freed because it is lost from the page table. We can skip
HugeTLB pages for unuse_vma to fix it.
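The fix described in this message amounts to a filter in the per-mapping walk done by swapoff. The following is a minimal user-space sketch of that idea, not the actual patch: `struct vma_model`, its fields, `should_unuse()` and `count_walked()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a mapping; not the kernel's VMA type. */
struct vma_model {
    bool has_anon_vma;  /* mapping contains anonymous memory */
    bool is_hugetlb;    /* mapping is backed by HugeTLB pages */
};

/* Decide whether the swapoff walk should descend into this mapping.
 * HugeTLB mappings are skipped, so their page-table entries are never
 * inspected (and never mistaken for bad entries) by the walk. */
static bool should_unuse(const struct vma_model *vma)
{
    return vma->has_anon_vma && !vma->is_hugetlb;
}

/* Count how many mappings in an array would be walked. */
static size_t count_walked(const struct vma_model *vmas, size_t n)
{
    size_t walked = 0;

    for (size_t i = 0; i < n; i++)
        if (should_unuse(&vmas[i]))
            walked++;
    return walked;
}
```

In this model an anonymous HugeTLB mapping is never walked, which matches the stated intent of skipping HugeTLB pages for unuse_vma.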
Link: https://lkml.kernel.org/r/20241015014521.570237-1-liushixin2@huawei.com
Fixes:
|
||
Greg Kroah-Hartman
|
9100d24dfd |
This is the 5.10.215 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmYaZdgACgkQONu9yGCS aT4oMxAA0pATFAq8RN5f9CmYlMg5HqHgzZ8lJv8P0/reOINhUa+F5sJb1n+x+Ch4 WQbmiFeZRzfsKZ2qKhIdNR0Lg+9JOr/DtYXdSBZ6InfSWrTAIrQ9fjl5Warkmcgg O4WbgF5BVgU3vGFATgxLvnUZwhR1D7WK93oMDunzrT7+OqyncU3f1Uj53ZAu9030 z18UNqnTxDLYH/CMGwAeRkaZqBev9gZ1HdgQWA27SVLqWQwZq0al81Cmlo+ECVmk 5dF6V2pid4qfKGJjDDfx1NS0PVnoP68iK4By1SXyoFV9VBiSwp77nUUyDr7YsHsT u8GpZHr9jZvSO5/xtKv20NPLejTPCRKc06CbkwpikDRtGOocBL8em0GuVqlf8hMs KwDb6ZEzYhXZGPJHbJM+aRD1tq/KHw9X7TrldOszMQPr6lubBtscPbg1FCg3OlcC HUrtub0i275x7TH0dJeRTD8TRE9jRmF+tl7KQytEJM3JRrquFjLyhDj+/VJnZkiB lzj3FRf4zshzgz4+CAeqXO/8Lu8b3fGYmcW1acCmk7emjDcXUKojPj/Aig6T4l7P oCWDY3+w1E6eiyE8BazxY1KUa/41ld0VJnlW5JWGRaDFTJwrk0h6/rvf9qImSckw IGx24UezRyp6NS1op3Qm2iwHLr41pFRfKxNm9ppgH9iBPzOhe38= =pkLL -----END PGP SIGNATURE----- Merge 5.10.215 into android12-5.10-lts Changes in 5.10.215 amdkfd: use calloc instead of kzalloc to avoid integer overflow Documentation/hw-vuln: Update spectre doc x86/cpu: Support AMD Automatic IBRS x86/bugs: Use sysfs_emit() timers: Update kernel-doc for various functions timers: Use del_timer_sync() even on UP timers: Rename del_timer_sync() to timer_delete_sync() wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach media: staging: ipu3-imgu: Set fields before media_entity_pads_init() clk: qcom: gcc-sdm845: Add soft dependency on rpmhpd smack: Set SMACK64TRANSMUTE only for dirs in smack_inode_setxattr() smack: Handle SMACK64TRANSMUTE in smack_inode_setsecurity() arm: dts: marvell: Fix maxium->maxim typo in brownstone dts drm/vmwgfx: stop using ttm_bo_create v2 drm/vmwgfx: switch over to the new pin interface v2 drm/vmwgfx/vmwgfx_cmdbuf_res: Remove unused variable 'ret' drm/vmwgfx: Fix some static checker warnings drm/vmwgfx: Fix possible null pointer derefence with invalid contexts serial: max310x: fix NULL pointer dereference in I2C instantiation media: xc4000: Fix atomicity violation in xc4000_get_frequency KVM: Always flush async 
#PF workqueue when vCPU is being destroyed sparc64: NMI watchdog: fix return value of __setup handler sparc: vDSO: fix return value of __setup handler crypto: qat - fix double free during reset crypto: qat - resolve race condition during AER recovery selftests/mqueue: Set timeout to 180 seconds ext4: correct best extent lstart adjustment logic block: introduce zone_write_granularity limit block: Clear zone limits for a non-zoned stacked queue bounds: support non-power-of-two CONFIG_NR_CPUS fat: fix uninitialized field in nostale filehandles ubifs: Set page uptodate in the correct place ubi: Check for too small LEB size in VTBL code ubi: correct the calculation of fastmap size mtd: rawnand: meson: fix scrambling mode value in command macro parisc: Avoid clobbering the C/B bits in the PSW with tophys and tovirt macros parisc: Fix ip_fast_csum parisc: Fix csum_ipv6_magic on 32-bit systems parisc: Fix csum_ipv6_magic on 64-bit systems parisc: Strip upper 32 bit of sum in csum_ipv6_magic for 64-bit builds PM: suspend: Set mem_sleep_current during kernel command line setup clk: qcom: gcc-ipq6018: fix terminating of frequency table arrays clk: qcom: gcc-ipq8074: fix terminating of frequency table arrays clk: qcom: mmcc-apq8084: fix terminating of frequency table arrays clk: qcom: mmcc-msm8974: fix terminating of frequency table arrays powerpc/fsl: Fix mfpmr build errors with newer binutils USB: serial: ftdi_sio: add support for GMC Z216C Adapter IR-USB USB: serial: add device ID for VeriFone adapter USB: serial: cp210x: add ID for MGP Instruments PDS100 USB: serial: option: add MeiG Smart SLM320 product USB: serial: cp210x: add pid/vid for TDK NC0110013M and MM0110113M PM: sleep: wakeirq: fix wake irq warning in system suspend mmc: tmio: avoid concurrent runs of mmc_request_done() fuse: fix root lookup with nonzero generation fuse: don't unhash root usb: typec: ucsi: Clean up UCSI_CABLE_PROP macros printk/console: Split out code that enables default console serial: Lock 
console when calling into driver before registration btrfs: fix off-by-one chunk length calculation at contains_pending_extent() PCI: Drop pci_device_remove() test of pci_dev->driver PCI/PM: Drain runtime-idle callbacks before driver removal PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities() PCI: Cache PCIe Device Capabilities register PCI: Work around Intel I210 ROM BAR overlap defect PCI/ASPM: Make Intel DG2 L1 acceptable latency unlimited PCI/DPC: Quirk PIO log size for certain Intel Root Ports PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"" dm-raid: fix lockdep waring in "pers->hot_add_disk" mac802154: fix llsec key resources release in mac802154_llsec_key_del mm: swap: fix race between free_swap_and_cache() and swapoff() mmc: core: Fix switch on gp3 partition drm/etnaviv: Restore some id values hwmon: (amc6821) add of_match table ext4: fix corruption during on-line resize nvmem: meson-efuse: fix function pointer type mismatch slimbus: core: Remove usage of the deprecated ida_simple_xx() API phy: tegra: xusb: Add API to retrieve the port number of phy usb: gadget: tegra-xudc: Use dev_err_probe() usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic speakup: Fix 8bit characters from direct synth PCI/ERR: Clear AER status only when we control AER PCI/AER: Block runtime suspend when handling errors nfs: fix UAF in direct writes kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1 PCI: dwc: endpoint: Fix advertised resizable BAR size vfio/platform: Disable virqfds on cleanup ring-buffer: Fix waking up ring buffer readers ring-buffer: Do not set shortest_full when full target is hit ring-buffer: Fix resetting of shortest_full ring-buffer: Fix full_waiters_pending in poll soc: fsl: qbman: Always disable interrupts when taking cgr_lock soc: fsl: qbman: Add helper for sanity checking cgr ops soc: fsl: qbman: Add CGR update function soc: fsl: qbman: Use raw 
spinlock for cgr_lock s390/zcrypt: fix reference counting on zcrypt card objects drm/panel: do not return negative error codes from drm_panel_get_modes() drm/exynos: do not return negative values from .get_modes() drm/imx/ipuv3: do not return negative values from .get_modes() drm/vc4: hdmi: do not return negative values from .get_modes() memtest: use {READ,WRITE}_ONCE in memory scanning nilfs2: fix failure to detect DAT corruption in btree and direct mappings nilfs2: prevent kernel bug at submit_bh_wbc() cpufreq: dt: always allocate zeroed cpumask x86/CPU/AMD: Update the Zenbleed microcode revisions net: hns3: tracing: fix hclgevf trace event strings wireguard: netlink: check for dangling peer via is_dead instead of empty list wireguard: netlink: access device through ctx instead of peer ahci: asm1064: correct count of reported ports ahci: asm1064: asm1166: don't limit reported ports drm/amd/display: Return the correct HDCP error code drm/amd/display: Fix noise issue on HDMI AV mute dm snapshot: fix lockup in dm_exception_table_exit vxge: remove unnecessary cast in kfree() x86/stackprotector/32: Make the canary into a regular percpu variable x86/pm: Work around false positive kmemleak report in msr_build_context() scripts: kernel-doc: Fix syntax error due to undeclared args variable comedi: comedi_test: Prevent timers rescheduling during deletion cpufreq: brcmstb-avs-cpufreq: fix up "add check for cpufreq_cpu_get's return value" netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout netfilter: nf_tables: disallow anonymous set with timeout flag netfilter: nf_tables: reject constant set with timeout Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory xfrm: Avoid clang fortify warning in copy_to_user_tmpl() KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region() ALSA: hda/realtek - Fix headset Mic no show at resume back for Lenovo ALC897 platform USB: usb-storage: Prevent divide-by-0 error in 
isd200_ata_command usb: gadget: ncm: Fix handling of zero block length packets usb: port: Don't try to peer unused USB ports based on location tty: serial: fsl_lpuart: avoid idle preamble pending if CTS is enabled mei: me: add arrow lake point S DID mei: me: add arrow lake point H DID vt: fix unicode buffer corruption when deleting characters fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion tee: optee: Fix kernel panic caused by incorrect error handling xen/events: close evtchn after mapping cleanup printk: Update @console_may_schedule in console_trylock_spinning() btrfs: allocate btrfs_ioctl_defrag_range_args on stack x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix x86/bugs: Add asm helpers for executing VERW x86/entry_64: Add VERW just before userspace transition x86/entry_32: Add VERW just before userspace transition x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH KVM/VMX: Move VERW closer to VMentry for MDS mitigation x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set Documentation/hw-vuln: Add documentation for RFDS x86/rfds: Mitigate Register File Data Sampling (RFDS) KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests perf/core: Fix reentry problem in perf_output_read_group() efivarfs: Request at most 512 bytes for variable names powerpc: xor_vmx: Add '-mhard-float' to CFLAGS serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO mm/memory-failure: fix an incorrect use of tail pages mm/migrate: set swap entry values of THP tail pages properly. 
init: open /initrd.image with O_LARGEFILE wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack() hexagon: vmlinux.lds.S: handle attributes section mmc: core: Initialize mmc_blk_ioc_data mmc: core: Avoid negative index with array access net: ll_temac: platform_get_resource replaced by wrong function usb: cdc-wdm: close race between read and workqueue ALSA: sh: aica: reorder cleanup operations to avoid UAF bugs scsi: core: Fix unremoved procfs host directory regression staging: vc04_services: changen strncpy() to strscpy_pad() staging: vc04_services: fix information leak in create_component() USB: core: Add hub_get() and hub_put() routines usb: dwc2: host: Fix remote wakeup from hibernation usb: dwc2: host: Fix hibernation flow usb: dwc2: host: Fix ISOC flow in DDMA mode usb: dwc2: gadget: LPM flow fix usb: udc: remove warning when queue disabled ep usb: typec: ucsi: Ack unsupported commands usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset scsi: qla2xxx: Split FCE|EFT trace control scsi: qla2xxx: Fix command flush on cable pull scsi: qla2xxx: Delay I/O Abort on PCI error x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports scsi: lpfc: Correct size for wqe for memset() USB: core: Fix deadlock in usb_deauthorize_interface() nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa() tcp: properly terminate timers for kernel sockets ACPICA: debugger: check status of acpi_evaluate_object() in acpi_db_walk_for_fields() bpf: Protect against int overflow for stack access size Octeontx2-af: fix pause frame configuration in GMP mode dm integrity: fix out-of-range warning r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d x86/cpufeatures: Add new word for scattered features Bluetooth: hci_event: set the conn encrypted before conn establishes Bluetooth: 
Fix TOCTOU in HCI debugfs implementation netfilter: nf_tables: disallow timeout for anonymous sets net/rds: fix possible cp null dereference vfio/pci: Disable auto-enable of exclusive INTx IRQ vfio/pci: Lock external INTx masking ops vfio: Introduce interface to flush virqfd inject workqueue vfio/pci: Create persistent INTx handler vfio/platform: Create persistent IRQ handlers vfio/fsl-mc: Block calling interrupt handler without trigger io_uring: ensure '0' is returned on file registration success Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped." mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations x86/srso: Add SRSO mitigation for Hygon processors block: add check that partition length needs to be aligned with block size netfilter: nf_tables: reject new basechain after table flag update netfilter: nf_tables: flush pending destroy work before exit_net release netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() netfilter: validate user input for expected length vboxsf: Avoid an spurious warning if load_nls_xxx() fails bpf, sockmap: Prevent lock inversion deadlock in map delete elem net/sched: act_skbmod: prevent kernel-infoleak net: stmmac: fix rx queue priority assignment erspan: make sure erspan_base_hdr is present in skb->head selftests: reuseaddr_conflict: add missing new line at the end of the output ipv6: Fix infinite recursion in fib6_dump_done(). 
udp: do not transition UDP GRO fraglist partial checksums to unnecessary octeontx2-pf: check negative error code in otx2_open() i40e: fix i40e_count_filters() to count only active/new filters i40e: fix vf may be used uninitialized in this function warning scsi: qla2xxx: Update manufacturer details scsi: qla2xxx: Update manufacturer detail Revert "usb: phy: generic: Get the vbus supply" udp: do not accept non-tunnel GSO skbs landing in a tunnel net: ravb: Always process TX descriptor ring arm64: dts: qcom: sc7180: Remove clock for bluetooth on Trogdor arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit scsi: mylex: Fix sysfs buffer lengths ata: sata_mv: Fix PCI device ID table declaration compilation warning ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone driver core: Introduce device_link_wait_removal() of: dynamic: Synchronize of_changeset_destroy() with the devlink removals x86/mce: Make sure to grab mce_sysfs_mutex in set_bank() s390/entry: align system call table on 8 bytes riscv: Fix spurious errors from __get/put_kernel_nofault x86/bugs: Fix the SRSO mitigation on Zen3/4 x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO mptcp: don't account accept() of non-MPC client as fallback to TCP x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word objtool: Add asm version of STACK_FRAME_NON_STANDARD wifi: ath9k: fix LNA selection in ath_ant_try_scan() VMCI: Fix memcpy() run-time warning in dg_dispatch_as_host() panic: Flush kernel log buffer at the end arm64: dts: rockchip: fix rk3328 hdmi ports node arm64: dts: rockchip: fix rk3399 hdmi ports node ionic: set adminq irq affinity pstore/zone: Add a null pointer check to the psz_kmsg_read tools/power x86_energy_perf_policy: Fix file leak in get_pkg_num() btrfs: handle chunk tree lookup error in 
btrfs_relocate_sys_chunks() btrfs: export: handle invalid inode or root reference in btrfs_get_parent() btrfs: send: handle path ref underflow in header iterate_inode_ref() net/smc: reduce rtnl pressure in smc_pnet_create_pnetids_list() Bluetooth: btintel: Fix null ptr deref in btintel_read_version Input: synaptics-rmi4 - fail probing if memory allocation for "phys" fails pinctrl: renesas: checker: Limit cfg reg enum checks to provided IDs sysv: don't call sb_bread() with pointers_lock held scsi: lpfc: Fix possible memory leak in lpfc_rcv_padisc() isofs: handle CDs with bad root inode but good Joliet root directory media: sta2x11: fix irq handler cast ext4: add a hint for block bitmap corrupt state in mb_groups ext4: forbid commit inconsistent quota data when errors=remount-ro drm/amd/display: Fix nanosec stat overflow SUNRPC: increase size of rpc_wait_queue.qlen from unsigned short to unsigned int Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default" libperf evlist: Avoid out-of-bounds access block: prevent division by zero in blk_rq_stat_sum() RDMA/cm: add timeout to cm_destroy_id wait Input: allocate keycode for Display refresh rate toggle platform/x86: touchscreen_dmi: Add an extra entry for a variant of the Chuwi Vi8 tablet ktest: force $buildonly = 1 for 'make_warnings_file' test type ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment tools: iio: replace seekdir() in iio_generic_buffer usb: typec: tcpci: add generic tcpci fallback compatible usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2 drivers/nvme: Add quirks for device 126f:2262 fbmon: prevent division by zero in fb_videomode_from_videomode() netfilter: nf_tables: release batch on table validation from abort path netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path netfilter: nf_tables: discard table flag update with pending basechain deletion tty: n_gsm: 
require CAP_NET_ADMIN to attach N_GSM0710 ldisc virtio: reenable config if freezing device failed x86/mm/pat: fix VM_PAT handling in COW mappings drm/i915/gt: Reset queue_priority_hint on parking Bluetooth: btintel: Fixe build regression VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler() kbuild: dummy-tools: adjust to stricter stackprotector check scsi: sd: Fix wrong zone_write_granularity value during revalidate x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk x86/head/64: Re-enable stack protection Linux 5.10.215 Change-Id: I45a0a9c4a0683ff5ef97315690f1f884f666e1b5 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Ryan Roberts
|
d85c11c97e |
mm: swap: fix race between free_swap_and_cache() and swapoff()
[ Upstream commit 82b1c07a0af603e3c47b906c8e991dc96f01688e ]
There was previously a theoretical window where swapoff() could run and
teardown a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from a
test case. But there has been agreement based on code review that this is
possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was not free. This isn't present in get_swap_device()
because it doesn't make sense in general due to the race between getting
the reference and swapoff. So I've added an equivalent check directly in
free_swap_and_cache().
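The get/put pattern named in the fix can be modelled in user space as follows. This is a single-threaded sketch under stated assumptions: `struct swap_dev_model` and the `model_*` functions are hypothetical and only imitate the shape of the idea (take a reference, check explicitly that the entry is not free, drop the reference); they are not the kernel's get_swap_device()/put_swap_device() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a swap device; not the kernel's swap_info_struct. */
struct swap_dev_model {
    bool valid;          /* cleared once teardown has begun */
    int refs;            /* users currently inside a get/put section */
    unsigned char *map;  /* per-entry usage counts; NULL after teardown */
    size_t nr_entries;
};

/* Take a reference; fails if the device is going away or the offset is bad. */
static bool model_get(struct swap_dev_model *d, size_t off)
{
    if (!d->valid || off >= d->nr_entries)
        return false;
    d->refs++;
    return true;
}

static void model_put(struct swap_dev_model *d)
{
    d->refs--;
}

/* Teardown may only drop the map when nobody holds a reference.
 * A real swapoff would wait here instead of returning false. */
static bool model_try_teardown(struct swap_dev_model *d)
{
    d->valid = false;
    if (d->refs > 0)
        return false;
    d->map = NULL;
    return true;
}

/* Free one entry. Returns true only if the entry was live. */
static bool model_free_entry(struct swap_dev_model *d, size_t off)
{
    bool freed = false;

    if (!model_get(d, off))
        return false;
    /* Explicit "entry is not already free" check, done under the reference. */
    if (d->map[off] != 0) {
        d->map[off]--;
        freed = true;
    }
    model_put(d);
    return freed;
}
```

The point of the sketch is the ordering: the map is only touched between `model_get()` and `model_put()`, and teardown cannot release it while a reference is held.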
Details of how to provoke one possible issue (thanks to David Hildenbrand
for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in
"count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn
si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are
still referenced by swap entries.
Process 1 still references subpage 0 via swap entry.
Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
[then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls
__try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()->
put_swap_folio()->free_swap_slot()->swapcache_free_entries()->
swap_entry_free()->swap_range_free()->
...
WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff from succeeding after process 2 reclaimed the swap cache
but before process 1 finished its call to swap_page_trans_huge_swapped()?
--8<-----
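The interleaving quoted above can be replayed deterministically in a small model. This is only an illustrative sketch: `struct race_model`, `swapoff_can_complete()` and `replay()` are hypothetical names, the value 512 assumes a 2 MiB folio made of 4 KiB pages, and the `guard` flag stands in for a get/put-style reference held across the critical section.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state for replaying the scenario; not kernel data structures. */
struct race_model {
    long inuse_pages;   /* models si->inuse_pages */
    int refs;           /* references held via the get/put-style guard */
    bool p1_in_window;  /* process 1 is preempted before its check finishes */
};

/* swapoff stops scanning once inuse_pages is zero; with the guard it must
 * also wait until no reference is held. */
static bool swapoff_can_complete(const struct race_model *m, bool guard)
{
    if (guard && m->refs > 0)
        return false;
    return m->inuse_pages == 0;
}

/* Returns true if swapoff could complete while process 1 is still inside
 * its window, i.e. the problematic outcome. */
static bool replay(bool guard)
{
    struct race_model m = { 512, 0, false }; /* assumed: 512 entries in use */

    /* Process 1 quits, drops its entry, then is preempted. */
    if (guard)
        m.refs++;
    m.p1_in_window = true;

    /* Process 2 quits, drops its entry and reclaims the whole folio. */
    if (guard)
        m.refs++;
    m.inuse_pages -= 512;
    if (guard)
        m.refs--;

    /* swapoff checks whether it may finish now. */
    return m.p1_in_window && swapoff_can_complete(&m, guard);
}
```

Without the guard the model reports that swapoff may finish while process 1 is still in its window; with the guard, the reference held by process 1 blocks it.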
Link: https://lkml.kernel.org/r/20240306140356.3974886-1-ryan.roberts@arm.com
Fixes:
|
||
Lincheng Yang
|
6356ed35b9 |
ANDROID: add vendor hook of add/delete/iterate node for swap_avail_heads
Our Android phones panic as follows:
[77522.303024][ T9734] Call trace:
[77522.303039][ T9734] dump_backtrace.cfi_jt+0x0/0x8
[77522.303052][ T9734] dump_stack_lvl+0xc4/0x140
[77522.303061][ T9734] dump_stack+0x1c/0x2c
[77522.303123][ T9734] mrdump_common_die+0x3a8/0x544 [mrdump]
[77522.303177][ T9734] ipanic_die+0x24/0x38 [mrdump]
[77522.303189][ T9734] die+0x340/0x698
[77522.303199][ T9734] bug_handler+0x48/0x108
[77522.303210][ T9734] brk_handler+0xac/0x1a8
[77522.303221][ T9734] do_debug_exception+0xe0/0x1e0
[77522.303233][ T9734] el1_dbg+0x38/0x54
[77522.303242][ T9734] el1_sync_handler+0x40/0x88
[77522.303255][ T9734] el1_sync+0x8c/0x140
[77522.303264][ T9734] plist_requeue+0xd4/0x110
[77522.303297][ T9734] tran_get_swap_pages+0xc8/0x364 [memfusion]
[77522.303329][ T9734] probe_android_vh_get_swap_page+0x1b4/0x220 [memfusion]
[77522.303342][ T9734] get_swap_page+0x258/0x304
[77522.303352][ T9734] shrink_page_list+0xe00/0x1e0c
[77522.303361][ T9734] shrink_inactive_list+0x2f4/0xac8
[77522.303373][ T9734] shrink_lruvec+0x1a4/0x34c
[77522.303383][ T9734] shrink_node_memcgs+0x84/0x3b0
[77522.303391][ T9734] shrink_node+0x2c4/0x6e4
[77522.303400][ T9734] shrink_zones+0x16c/0x29c
[77522.303410][ T9734] do_try_to_free_pages+0xe4/0x2bc
[77522.303418][ T9734] try_to_free_pages+0x388/0x7b4
[77522.303429][ T9734] __alloc_pages_direct_reclaim+0x88/0x278
[77522.303438][ T9734] __alloc_pages_slowpath+0x464/0xb24
[77522.303447][ T9734] __alloc_pages_nodemask+0x1f4/0x3dc
[77522.303458][ T9734] do_anonymous_page+0x164/0x914
[77522.303466][ T9734] handle_pte_fault+0x15c/0x9f8
[77522.303476][ T9734] ___handle_speculative_fault+0x234/0xe18
[77522.303485][ T9734] __handle_speculative_fault+0x78/0x21c
[77522.303497][ T9734] do_page_fault+0x36c/0x754
[77522.303506][ T9734] do_translation_fault+0x48/0x64
[77522.303514][ T9734] do_mem_abort+0x6c/0x164
[77522.303522][ T9734] el0_da+0x24/0x34
[77522.303531][ T9734] el0_sync_handler+0xc8/0xf0
[77522.303539][ T9734] el0_sync+0x1b4/0x1c0
Analysis shows that while iterating the swap_avail_heads list we obtain
node A, but node A may be deleted before we access it, so by the time we
actually access it, it is no longer on the list:
CPU1 thread1                              CPU2 thread2
plist_for_each_entry_safe()
get si->avail_lists[node]
from swap_avail_heads
                                          remove si->avail_lists[node]
                                          from swap_avail_heads
plist_requeue(&si->avail_lists[node])
BUG_ON(plist_node_empty(node)); // trigger
When the get_swap_page vendor hook is used, get_swap_pages() is
overridden and our own spin_lock protects iteration of the
swap_avail_heads list, whereas the native swap_avail_lock still protects
adding and deleting nodes on that list, so the list can be accessed
concurrently.
So add vendor hooks for adding, deleting and iterating nodes of
avail_list, so that our own spin_lock can protect all three operations
on the swap_avail_heads list.
Because enable_swap_info() calls the add_to_avail_list vendor hook,
swap_avail_heads must be initialized first, so also add a vendor hook
for swap_avail_heads_init.
Because the __cgroup_throttle_swaprate vendor hook needs to call
blkcg_schedule_throttle(), export that function as well.
Bug: 225795494
Change-Id: I03107cbda6310fa7ae85e41b8cf1fa8225cafe78
Signed-off-by: Lincheng Yang <lincheng.yang@transsion.com>
Suggested-by: Bing Han <bing.han@transsion.com> |
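The lock mismatch described in the commit above (iteration under one lock, add/delete under another) can be replayed in a small single-threaded model. Everything here is a hypothetical sketch: `enum lock_id`, `struct locks_model`, `struct node_model` and `replay_requeue_bug()` are invented for illustration, and a failed `model_trylock()` stands in for a remover that a real spinlock would keep waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock identities: the vendor's own lock and the native one. */
enum lock_id { VENDOR_LOCK = 0, NATIVE_LOCK = 1 };

struct locks_model {
    bool held[2];
};

struct node_model {
    bool on_list;  /* false once the node has been removed */
};

static bool model_trylock(struct locks_model *l, enum lock_id id)
{
    if (l->held[id])
        return false;
    l->held[id] = true;
    return true;
}

static void model_unlock(struct locks_model *l, enum lock_id id)
{
    l->held[id] = false;
}

/* Replays: thread 1 picks a node while iterating, thread 2 tries to remove
 * it, then thread 1 requeues it. Returns true if the requeue hits a node
 * that is no longer on the list (the condition the BUG_ON reports). */
static bool replay_requeue_bug(enum lock_id iter_lock, enum lock_id modify_lock)
{
    struct locks_model l = { { false, false } };
    struct node_model node = { true };
    bool bug;

    (void)model_trylock(&l, iter_lock);     /* thread 1 starts iterating */

    if (model_trylock(&l, modify_lock)) {   /* thread 2 removal gets in */
        node.on_list = false;
        model_unlock(&l, modify_lock);
    }

    bug = !node.on_list;                    /* thread 1 requeues the node */
    model_unlock(&l, iter_lock);
    return bug;
}
```

With two different locks the removal slips in between picking and requeueing the node; with a single shared lock it cannot, which is the reasoning the commit gives for routing add, delete and iterate through the same vendor-controlled lock.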
||
Greg Kroah-Hartman
|
2d6a4ad08c |
This is the 5.10.178 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRBD58ACgkQONu9yGCS aT5BTxAApbYtClwFI1KGwlvnh9elm2m6NYDZcBleAT8bps1ofI50Bpca0CKgkX8f HLzRid8WE5BW6+3tpDJxBwqEGEGG1Z8bgleaM62PxiNU3CRFKtUuDmS2DiVAK30d PfdvjhOxlwf4f6e+WHSvGXvqxV9w1DtjqG+Lz1jA37sAAj6IithDuSkNrYcsojFF u+zA+17M2KVG8vrTCHZVH/ij9A1w4gWOkhYVCKaC7hKafTsU613YjFTGpqelhvTS 6AfMSTI15E01Qy6FM5OjqmVM4k8UWIydA1WBV7aHLn3y2MzXeaYza8xsg90Qu2V2 49F4Yu53WLkuNV0aDOURnaQ7M1m+Sj8IL/MD7G5iLIDwjN3PDwY5IqwKyYueZ/2P TdlNTTffCC66MiYjAy/A5gPg4bjxvs7aaQkjgahluzWnXUWSdyUvJDg1XYGdhf1l W4E2OcWH0Al6se56255O2eKbvmeOe+IHW22oRoDaAC9+14Lp6KWP9sAh4/zrEcgf /x0YxZekOoWVdVtoP4oS1CE3Rj9v4HmtPT2QVltE7Dag7sn3FtGWTQ+SxZ34gmwY RYCvoCpBF5SNg3tkW/eIwl+6fRryiT/LS9OsUmz+5g0L6mkK5m6ScleIbAGYq6BZ 4mu6CwHuSBX0O/EvRgmVpZpPsKsHypVu86krtTlW/+HcKBrXSuY= =hM8w -----END PGP SIGNATURE----- Merge 5.10.178 into android12-5.10-lts Changes in 5.10.178 gpio: GPIO_REGMAP: select REGMAP instead of depending on it Drivers: vmbus: Check for channel allocation before looking up relids pwm: cros-ec: Explicitly set .polarity in .get_state() pwm: sprd: Explicitly set .polarity in .get_state() KVM: s390: pv: fix external interruption loop not always detected wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded sta net: qrtr: combine nameservice into main module net: qrtr: Fix a refcount bug in qrtr_recvmsg() icmp: guard against too small mtu net: don't let netpoll invoke NAPI if in xmit context sctp: check send stream number after wait_for_sndbuf net: qrtr: Do not do DEL_SERVER broadcast after DEL_CLIENT ipv6: Fix an uninit variable access bug in __ip6_make_skb() gpio: davinci: Add irq chip flag to skip set wake net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe net: stmmac: fix up RX flow hash indirection table when setting channels sunrpc: only free unix grouplist after RCU settles NFSD: callback request does not use correct credential for AUTH_SYS usb: xhci: tegra: fix sleep in atomic call xhci: also avoid the 
XHCI_ZERO_64B_REGS quirk with a passthrough iommu USB: serial: cp210x: add Silicon Labs IFS-USB-DATACABLE IDs usb: typec: altmodes/displayport: Fix configure initial pin assignment USB: serial: option: add Telit FE990 compositions USB: serial: option: add Quectel RM500U-CN modem iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip iio: dac: cio-dac: Fix max DAC write value check for 12-bit iio: light: cm32181: Unregister second I2C client if present tty: serial: sh-sci: Fix transmit end interrupt handler tty: serial: sh-sci: Fix Rx on RZ/G2L SCI tty: serial: fsl_lpuart: avoid checking for transfer complete when UARTCTRL_SBK is asserted in lpuart32_tx_empty nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread() nilfs2: fix sysfs interface lifetime dt-bindings: serial: renesas,scif: Fix 4th IRQ for 4-IRQ SCIFs ALSA: hda/realtek: Add quirk for Clevo X370SNW iio: adc: ad7791: fix IRQ flags scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param() perf/core: Fix the same task check in perf_event_set_output ftrace: Mark get_lock_parent_ip() __always_inline ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct() can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access can: isotp: isotp_ops: fix poll() to not report false EPOLLOUT events tracing: Free error logs of tracing instances ASoC: hdac_hdmi: use set_stream() instead of set_tdm_slots() drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path drm/nouveau/disp: Support more modes by checking with lower bpc ring-buffer: Fix race while reader and writer are on the same page mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() selftests: intel_pstate: ftime() is deprecated drm/bridge: lt9611: Fix PLL being unable to lock Revert "media: ti: cal: fix possible memory leak in cal_ctx_create()" ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown bpftool: Print newline before '}' for struct with padding only fields Revert 
"pinctrl: amd: Disable and mask interrupts on resume" ALSA: emu10k1: fix capture interrupt handler unlinking ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard ALSA: i2c/cs8427: fix iec958 mixer control deactivation ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex() ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp} Bluetooth: Fix race condition in hidp_session_thread btrfs: print checksum type and implementation at mount time btrfs: fix fast csum implementation detection fbmem: Reject FB_ACTIVATE_KD_TEXT from userspace mtdblock: tolerate corrected bit-flips mtd: rawnand: meson: fix bitmask for length in command word mtd: rawnand: stm32_fmc2: remove unsupported EDO mode mtd: rawnand: stm32_fmc2: use timings.mode instead of checking tRC_min clk: sprd: set max_register according to mapping range IB/mlx5: Add support for NDR link speed IB/mlx5: Add support for 400G_8X lane speed RDMA/cma: Allow UD qp_type to join multicast only 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition niu: Fix missing unwind goto in niu_alloc_channels() sysctl: add proc_dou8vec_minmax() ipv4: shrink netns_ipv4 with sysctl conversions tcp: convert elligible sysctls to u8 tcp: restrict net.ipv4.tcp_app_win drm/armada: Fix a potential double free in an error handling path qlcnic: check pci_reset_function result net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() sctp: fix a potential overflow in sctp_ifwdtsn_skip RDMA/core: Fix GID entry ref leak when create_ah fails udp6: fix potential access to stale information net: macb: fix a memory corruption in extended buffer descriptor mode libbpf: Fix single-line struct definition output in btf_dump power: supply: cros_usbpd: reclassify "default case!" 
as debug wifi: mwifiex: mark OF related data as maybe unused i2c: imx-lpi2c: clean rx/tx buffers upon new message efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F verify_pefile: relax wrapper length check asymmetric_keys: log on fatal failures in PE/pkcs7 riscv: add icache flush for nommu sigreturn trampoline net: sfp: initialize sfp->i2c_block_size at sfp allocation scsi: ses: Handle enclosure with just a primary component gracefully x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach() ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size mtd: ubi: wl: Fix a couple of kernel-doc issues ubi: Fix deadlock caused by recursively holding work_sem powerpc/pseries: rename min_common_depth to primary_domain_index powerpc/pseries: Rename TYPE1_AFFINITY to FORM1_AFFINITY powerpc/pseries: Consolidate different NUMA distance update code paths powerpc/pseries: Add a helper for form1 cpu distance powerpc/pseries: Add support for FORM2 associativity powerpc/papr_scm: Update the NUMA distance table for the target node sched/fair: Move calculate of avg_load to a better location sched/fair: Fix imbalance overflow x86/rtc: Remove __init for runtime functions i2c: ocores: generate stop condition after timeout in polling mode watchdog: sbsa_wdog: Make sure the timeout programming is within the limits coresight-etm4: Fix for() loop drvdata->nr_addr_cmp range bug kbuild: check the minimum assembler version in Kconfig kbuild: Switch to 'f' variants of integrated assembler flag kbuild: check CONFIG_AS_IS_LLVM instead of LLVM_IAS riscv: Handle zicsr/zifencei issues between clang and binutils kexec: move locking into do_kexec_load kexec: turn all kexec_mutex acquisitions into trylocks panic, kexec: make __crash_kexec() NMI safe sysctl: Fix data-races in proc_dou8vec_minmax(). 
Linux 5.10.178 Change-Id: I34107ee680c7b081bb0c2782483cbb7ec62252ca Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
Rongwei Wang
|
ea8c42b3b6 |
mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
commit 6fe7d6b992113719e96744d974212df3fcddc76c upstream.
The si->lock must be held when deleting the si from the available list.
Otherwise, another thread can re-add the si to the available list, which
can lead to memory corruption. The only place we have found where this
happens is in the swapoff path. This case can be described as below:
core 0                       core 1
swapoff
del_from_avail_list(si)      waiting
try lock si->lock            acquire swap_avail_lock
                             and re-add si into
                             swap_avail_head
Core 0 then acquires si->lock without noticing that si has already been
added again, and continues to clear SWP_WRITEOK, etc.
A large number of warning messages can easily be triggered inside
get_swap_pages() in some special cases, for example by calling
madvise(MADV_PAGEOUT) on blocks of touched memory concurrently while
running many swapon-swapoff operations (e.g. stress-ng-swap).
In the worst case, however, this scenario can cause a panic. In
swapoff(), the memory used by si can be kept in swap_info[] after a swap
is turned off. This means the memory corruption does not show up
immediately, but only once the si is allocated and reset for a new swap
in the swapon path.
The resulting panic message (with CONFIG_PLIST_DEBUG enabled):
------------[ cut here ]------------
top: 00000000e58a3003, n: 0000000013e75cda, p: 000000008cd4451a
prev: 0000000035b1e58a, n: 000000008cd4451a, p: 000000002150ee8d
next: 000000008cd4451a, n: 000000008cd4451a, p: 000000008cd4451a
WARNING: CPU: 21 PID: 1843 at lib/plist.c:60 plist_check_prev_next_node+0x50/0x70
Modules linked in: rfkill(E) crct10dif_ce(E)...
CPU: 21 PID: 1843 Comm: stress-ng Kdump: ... 5.10.134+
Hardware name: Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : plist_check_prev_next_node+0x50/0x70
lr : plist_check_prev_next_node+0x50/0x70
sp : ffff0018009d3c30
x29: ffff0018009d3c40 x28: ffff800011b32a98
x27: 0000000000000000 x26: ffff001803908000
x25: ffff8000128ea088 x24: ffff800011b32a48
x23: 0000000000000028 x22: ffff001800875c00
x21: ffff800010f9e520 x20: ffff001800875c00
x19: ffff001800fdc6e0 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: 0736076307640766 x14: 0730073007380731
x13: 0736076307640766 x12: 0730073007380731
x11: 000000000004058d x10: 0000000085a85b76
x9 : ffff8000101436e4 x8 : ffff800011c8ce08
x7 : 0000000000000000 x6 : 0000000000000001
x5 : ffff0017df9ed338 x4 : 0000000000000001
x3 : ffff8017ce62a000 x2 : ffff0017df9ed340
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
plist_check_prev_next_node+0x50/0x70
plist_check_head+0x80/0xf0
plist_add+0x28/0x140
add_to_avail_list+0x9c/0xf0
_enable_swap_info+0x78/0xb4
__do_sys_swapon+0x918/0xa10
__arm64_sys_swapon+0x20/0x30
el0_svc_common+0x8c/0x220
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
irq event stamp: 2082270
Now, si->lock is taken before calling 'del_from_avail_list()' so that
other threads see the si deleted and SWP_WRITEOK cleared together, and
will not reinsert it.
This problem exists in versions after stable 5.10.y.
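The locking order described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `spinlock_model`, `swap_info_model`, `swapoff_remove()` and `readd_if_writable()` are hypothetical names invented for illustration, the spinlock is built on C11 `atomic_flag` rather than the kernel's `spinlock_t`, and the available list is reduced to a single boolean. It is not the kernel code itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative flag value; treat it as an assumption of this model. */
#define SWP_WRITEOK 0x2u

/* Minimal spinlock on C11 atomics, standing in for the kernel spinlock_t. */
typedef struct { atomic_flag held; } spinlock_model;

static void model_lock(spinlock_model *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the holder releases the lock */
}

static void model_unlock(spinlock_model *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical stand-in for struct swap_info_struct. */
struct swap_info_model {
    spinlock_model lock;   /* models si->lock */
    unsigned int flags;    /* models si->flags */
    bool on_avail_list;    /* models membership in swap_avail_head */
};

/* Models swap_avail_lock, which protects the available list. */
static spinlock_model swap_avail_lock_model = { ATOMIC_FLAG_INIT };

void si_init(struct swap_info_model *si)
{
    atomic_flag_clear(&si->lock.held);
    si->flags = SWP_WRITEOK;
    si->on_avail_list = true;
}

/* Swapoff side: si->lock is held across both the list removal and the
 * flag clear, so no other thread can observe one without the other. */
void swapoff_remove(struct swap_info_model *si)
{
    model_lock(&si->lock);
    model_lock(&swap_avail_lock_model);
    si->on_avail_list = false;
    model_unlock(&swap_avail_lock_model);
    si->flags &= ~SWP_WRITEOK;
    model_unlock(&si->lock);
}

/* Re-add side: runs under si->lock and only re-adds a writable si. */
bool readd_if_writable(struct swap_info_model *si)
{
    bool added = false;

    model_lock(&si->lock);
    if (si->flags & SWP_WRITEOK) {
        model_lock(&swap_avail_lock_model);
        si->on_avail_list = true;
        model_unlock(&swap_avail_lock_model);
        added = true;
    }
    model_unlock(&si->lock);
    return added;
}
```

Because both paths serialize on the per-device lock in this model, a re-add attempt either completes before the removal starts or sees SWP_WRITEOK already cleared and does nothing.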
Link: https://lkml.kernel.org/r/20230404154716.23058-1-rongwei.wang@linux.alibaba.com
Fixes:
|
||
Greg Kroah-Hartman
|
570621d64f |
Merge 5.10.168 into android12-5.10-lts
Changes in 5.10.168 firewire: fix memory leak for payload of request subaction to IEC 61883-1 FCP region bus: sunxi-rsb: Fix error handling in sunxi_rsb_init() bpf: Fix incorrect state pruning for <8B spill/fill powerpc/imc-pmu: Revert nest_init_lock to being a mutex bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers ALSA: hda/via: Avoid potential array out-of-bound in add_secret_dac_path() bpf: Support <8-byte scalar spill and refill bpf: Fix to preserve reg parent/live fields when copying range info bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX drm/vc4: hdmi: make CEC adapter name unique scsi: Revert "scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT" vhost/net: Clear the pending messages when the backend is removed WRITE is "data source", not destination... READ is "data destination", not source... fix iov_iter_bvec() "direction" argument fix "direction" argument of iov_iter_kvec() virtio-net: execute xdp_do_flush() before napi_complete_done() sfc: correctly advertise tunneled IPv6 segmentation net: phy: dp83822: Fix null pointer access on DP83825/DP83826 devices netrom: Fix use-after-free caused by accept on already connected socket netfilter: br_netfilter: disable sabotage_in hook after first suppression squashfs: harden sanity check in squashfs_read_xattr_id_table net: phy: meson-gxl: Add generic dummy stubs for MMD register access igc: return an error if the mac type is unknown in igc_ptp_systim_to_hwtstamp() can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate ata: libata: Fix sata_down_spd_limit() when no link speed is reported selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking 
virtio-net: Keep stop() to follow mirror sequence of open() net: openvswitch: fix flow memory leak in ovs_flow_cmd_new efi: fix potential NULL deref in efi_mem_reserve_persistent qede: add netpoll support for qede driver qede: execute xdp_do_flush() before napi_complete_done() i2c: mxs: suppress probe-deferral error message scsi: target: core: Fix warning on RT kernels scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddress i2c: rk3x: fix a bunch of kernel-doc warnings platform/x86: dell-wmi: Add a keymap for KEY_MUTE in type 0x0010 table net/x25: Fix to not accept on connected socket iio: adc: stm32-dfsdm: fill module aliases usb: dwc3: dwc3-qcom: Fix typo in the dwc3 vbus override API usb: dwc3: qcom: enable vbus override when in OTG dr-mode usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF Input: i8042 - move __initconst to fix code styling warning Input: i8042 - merge quirk tables Input: i8042 - add TUXEDO devices to i8042 quirk tables Input: i8042 - add Clevo PCX0DX to i8042 quirk table fbcon: Check font dimension limits net: qrtr: free memory on error path in radix_tree_insert() watchdog: diag288_wdt: do not use stack buffers for hardware data watchdog: diag288_wdt: fix __diag288() inline assembly ALSA: hda/realtek: Add Acer Predator PH315-54 efi: Accept version 2 of memory attributes table iio: hid: fix the retval in accel_3d_capture_sample iio: adc: berlin2-adc: Add missing of_node_put() in error path iio:adc:twl6030: Enable measurements of VUSB, VBAT and others iio: imu: fxos8700: fix ACCEL measurement range selection iio: imu: fxos8700: fix incomplete ACCEL and MAGN channels readback iio: imu: fxos8700: fix IMU data bits returned to user space iio: imu: fxos8700: fix map label of channel type to MAGN sensor iio: imu: fxos8700: fix swapped ACCEL and MAGN channels readback iio: imu: fxos8700: fix incorrect ODR mode readback iio: imu: fxos8700: fix 
failed initialization ODR mode assignment iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN iio: imu: fxos8700: fix MAGN sensor scale and unit nvmem: qcom-spmi-sdam: fix module autoloading parisc: Fix return code of pdc_iodc_print() parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case riscv: disable generation of unwind tables mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses fpga: stratix10-soc: Fix return value check in s10_ops_write_init() mm/swapfile: add cond_resched() in get_swap_pages() Squashfs: fix handling and sanity checking of xattr_ids count drm/i915: Fix potential bit_17 double-free nvmem: core: initialise nvmem->id early nvmem: core: fix cell removal on error serial: 8250_dma: Fix DMA Rx completion race serial: 8250_dma: Fix DMA Rx rearm race fbdev: smscufx: fix error handling code in ufx_usb_probe f2fs: fix to do sanity check on i_extra_isize in is_alive() wifi: brcmfmac: Check the count value of channel spec to prevent out-of-bounds reads nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property bpf: Do not reject when the stack read size is different from the tracked scalar size iio:adc:twl6030: Enable measurement of VAC mm/migration: return errno when isolate_huge_page failed migrate: hugetlb: check for hugetlb shared PMD in node migration btrfs: limit device extents to the device size btrfs: zlib: zero-initialize zlib workspace ALSA: hda/realtek: Add Positivo N14KP6-TG ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control() ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 360 tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw of/address: Return an error when no valid dma-ranges are found can: j1939: do not wait 250 ms if the same addr was already claimed xfrm: compat: change expression for switch in xfrm_xlate64 IB/hfi1: Restore allocated resources on failed 
copyout xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr() IB/IPoIB: Fix legacy IPoIB due to wrong number of queues RDMA/usnic: use iommu_map_atomic() under spin_lock() xfrm: fix bug with DSCP copy to v6 from v4 tunnel bonding: fix error checking in bond_debug_reregister() net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY ionic: clean interrupt before enabling queue to avoid credit race uapi: add missing ip/ipv6 header dependencies for linux/stddef.h ice: Do not use WQ_MEM_RECLAIM flag for workqueue net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q" net/mlx5e: IPoIB, Show unknown speed instead of error net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers net/mlx5: fw_tracer, Zero consumer index when reloading the tracer rds: rds_rm_zerocopy_callback() use list_first_entry() selftests: forwarding: lib: quote the sysctl values ALSA: pci: lx6464es: fix a debug loop pinctrl: aspeed: Fix confusing types in return value pinctrl: single: fix potential NULL dereference spi: dw: Fix wrong FIFO level setting for long xfers pinctrl: intel: Restore the pins that used to be in Direct IRQ mode cifs: Fix use-after-free in rdata->read_into_pages() net: USB: Fix wrong-direction WARNING in plusb.c btrfs: free device in btrfs_close_devices for a single device filesystem usb: core: add quirk for Alcor Link AK9563 smartcard reader usb: typec: altmodes/displayport: Fix probe pin assign check ceph: flush cap releases when the session is flushed riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive Fix page corruption caused by racy check in __free_pages Linux 5.10.168 Change-Id: I98d1e73edfaab3ce45c15283ae0964527d5e547e Signed-off-by: Greg Kroah-Hartman 
<gregkh@google.com> |
||
Longlong Xia
|
30187be290 |
mm/swapfile: add cond_resched() in get_swap_pages()
commit 7717fc1a12f88701573f9ed897cc4f6699c661e3 upstream. The softlockup still occurs in get_swap_pages() under memory pressure. The test system has 64 CPU cores, 64GB of memory, and 28 zram devices; the disksize of each zram device is 50MB, with the same priority as si. The stress-ng tool is used to increase memory pressure, causing the system to OOM frequently. The plist_for_each_entry_safe() loop in get_swap_pages() can iterate tens of thousands of times to find available space (extreme case: cond_resched() is not called in scan_swap_map_slots()). Let's add cond_resched() to get_swap_pages() when it fails to find available space, to avoid the softlockup. Link: https://lkml.kernel.org/r/20230128094757.1060525-1-xialonglong1@huawei.com Signed-off-by: Longlong Xia <xialonglong1@huawei.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Cc: Chen Wandun <chenwandun@huawei.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
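The idea of yielding the CPU after each failed attempt can be sketched in userspace as follows. This is an analogy under stated assumptions, not the kernel change: `swap_dev_model` and `get_slot_model()` are hypothetical names, and POSIX `sched_yield()` stands in for the kernel's `cond_resched()`.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for one swap device and its free space. */
struct swap_dev_model {
    int free_slots;
};

/* Walk the devices in order and take a slot from the first one with space.
 * After every failed attempt, yield the CPU so that a long run of full
 * devices cannot monopolize it. Returns the device index, or -1 if all
 * devices are full. *yields reports how often the CPU was given up. */
int get_slot_model(struct swap_dev_model *devs, size_t n, size_t *yields)
{
    *yields = 0;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].free_slots > 0) {
            devs[i].free_slots--;
            return (int)i;
        }
        /* Nothing available here: reschedule point before the next device. */
        sched_yield();
        (*yields)++;
    }
    return -1;
}
```

In this sketch the yield sits only on the failure path, so a successful allocation from the first device pays no extra cost.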
||
Bing Han
|
09f4246296 |
ANDROID: sched: add restricted hooks to replace the former hooks
Fix bug: scheduling while atomic. In these vendor hooks, we schedule due to competition, which leads to a kernel exception. To solve this problem, add restricted hooks to replace the former regular vendor hooks. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: I151125a7119a91d1339d4790a68a6a4796d673e3 |
||
Bing Han
|
6c56a05b87 |
ANDROID: vendor_hooks: Add hooks to extend the struct swap_info_struct
This reverts commit:
|
||
Bing Han
|
7449d8120a |
ANDROID: vendor_hook: Add hook in si_swapinfo()
This reverts commit
|
||
Greg Kroah-Hartman
|
86be1a3d9f |
Revert "ANDROID: vendor_hook: Add hook in si_swapinfo()"
This reverts commit
|
||
Greg Kroah-Hartman
|
d0590b99c9 |
Revert "ANDROID: vendor_hooks: Add hooks to extend the struct swap_info_struct"
This reverts commit
|
||
Bing Han
|
034877c195 |
ANDROID: mm: export swapcache_free_entries
Export swapcache_free_entries so that it can be used in the alternative function android_vh_drain_slots_cache_cpu to swap entries in the swap slot cache; its usage is similar to the usage in drain_slots_cache_cpu. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: Ia89b1728d540c5cc8995a939a918e12c23057266 |
||
Bing Han
|
06c2766cbc |
ANDROID: mm: export symbols used in vendor hook android_vh_get_swap_page()
Three symbols are exported for use in the vendor hook android_vh_get_swap_page: 1) check_cache_active, used to get a swap page from the specified swap location; its usage is similar to the usage in get_swap_page. 2) scan_swap_map_slots, used to get a swap page from the specified swap; its usage is similar to get_swap_pages. 3) swap_alloc_cluster, used to get a swap page from the specified swap; its usage is similar to get_swap_pages. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: Ie24c5d32a16c7cb87905d034095ec8fb070dbe0f |
||
Bing Han
|
4506bcbba5 |
ANDROID: mm: export swap_type_to_swap_info
The function swap_type_to_swap_info is exported to access the swap_info_struct of the specified swap, which is regarded as reserved extended memory. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: I0107e7d561150f1945a4c161e886e9e03383fff6 |
||
Bing Han
|
ed2b11d639 |
ANDROID: vendor_hook: Add hook in si_swapinfo()
Provide a vendor hook android_vh_si_swapinfo to replace the process of updating nr_to_be_unused. When the page is swapped to a specified swap location, nr_to_be_unused should not be updated, because the specified swap is regarded as reserved extended memory. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: Ie41caec345658589bf908fb0f96d038d1fba21f3 |
||
Bing Han
|
667f0d71dc |
ANDROID: vendor_hooks: Add hooks to extend the struct swap_info_struct
Two vendor hooks are added to extend the struct swap_info_struct: android_vh_alloc_si, which extends the allocation of struct swap_info_struct by adding data to record the information of the specified reclaimed location; and android_vh_init_swap_info_struct, which initializes the extension of struct swap_info_struct. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: I0e1d8e38ba7dfd52b609b1c14eb78f8b0ef0f9e6 |
||
Bing Han
|
bc4c73c182 |
ANDROID: vendor_hook: Add hooks in unuse_pte_range() and try_to_unuse()
When the page is unused, a vendor hook android_vh_unuse_swap_page should be called to specify that the page should not be swapped to the specified swap location any more. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: I3fc3675020517f7cc69c76a06150dfb2380dae21 |
||
Bing Han
|
d2fea0ba9a |
ANDROID: vendor_hook: Add hook to update nr_swap_pages and total_swap_pages
The specified swap is regarded as reserved extended memory. So nr_swap_pages and total_swap_pages should not be affected by the specified swap. Provide a vendor hook android_vh_account_swap_pages to replace the updating process of nr_swap_pages and total_swap_pages. When the page is swapped to the specified swap location, nr_swap_pages and total_swap_pages should not be updated. Bug: 234214858 Signed-off-by: Bing Han <bing.han@transsion.com> Change-Id: Ib8dfb355d190399a037b9d9eda478a81c436e224 |
||
Liujie Xie
|
d9845e9e5c |
ANDROID: export walk_page_range and swp_swap_info
Export walk_page_range and swp_swap_info for reading swap from backing device to zram. Bug: 225273514 Signed-off-by: Liujie Xie <xieliujie@oppo.com> Change-Id: If888cfc2823d8003b62bdb177740643696cf6f7e |
||
Greg Kroah-Hartman
|
b1a6760ddf |
Merge branch 'android12-5.10' into android12-5.10-lts
Sync up with android12-5.10 for the following commits: |
||
Suren Baghdasaryan
|
309aa7e7a2 |
FROMLIST: mm, memcg: inline swap-related functions to improve disabled memcg config
Inline mem_cgroup_try_charge_swap, mem_cgroup_uncharge_swap and cgroup_throttle_swaprate functions to perform mem_cgroup_disabled static key check inline before calling the main body of the function. This minimizes the memcg overhead in the pagefault and exit_mmap paths when memcgs are disabled using cgroup_disable=memory command-line option. This change results in ~1% overhead reduction when running PFT test [1] comparing {CONFIG_MEMCG=n} against {CONFIG_MEMCG=y, cgroup_disable=memory} configuration on an 8-core ARM64 Android device. [1] https://lkml.org/lkml/2006/8/29/294 also used in mmtests suite Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lore.kernel.org/patchwork/patch/1458908/ Bug: 191223209 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I18d59090ec908037b39324d1f1bb511d06e9c690 |
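The pattern described here, an inline wrapper that tests the "disabled" condition before calling the out-of-line body, can be sketched as below. All names are hypothetical stand-ins: `memcg_disabled_model` replaces the `mem_cgroup_disabled()` static key check with a plain boolean, and the slow-path body is a placeholder, not the real charging logic.

```c
#include <stdbool.h>

/* Stand-in for the mem_cgroup_disabled() static key (a plain flag here). */
static bool memcg_disabled_model;

/* Counts how often the out-of-line body actually runs. */
static unsigned long slow_path_calls;

/* Out-of-line body: placeholder for the real charging work. */
int try_charge_swap_slow(unsigned long entry)
{
    slow_path_calls++;
    return entry == 0 ? -1 : 0;
}

/* Inline wrapper: the cheap check happens at the call site, so the
 * function call is skipped entirely when the feature is disabled. */
static inline int try_charge_swap_model(unsigned long entry)
{
    if (memcg_disabled_model)
        return 0;
    return try_charge_swap_slow(entry);
}
```

With the check hoisted into the wrapper, the disabled configuration costs one branch per call instead of a call and return.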
||
Suren Baghdasaryan
|
f73d029485 |
FROMLIST: mm, memcg: add mem_cgroup_disabled checks in vmpressure and swap-related functions
Add mem_cgroup_disabled check in vmpressure, mem_cgroup_uncharge_swap and cgroup_throttle_swaprate functions. This minimizes the memcg overhead in the pagefault and exit_mmap paths when memcgs are disabled using cgroup_disable=memory command-line option. This change results in ~2.1% overhead reduction when running PFT test [1] comparing {CONFIG_MEMCG=n, CONFIG_MEMCG_SWAP=n} against {CONFIG_MEMCG=y, CONFIG_MEMCG_SWAP=y, cgroup_disable=memory} configuration on an 8-core ARM64 Android device. [1] https://lkml.org/lkml/2006/8/29/294 also used in mmtests suite Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lore.kernel.org/patchwork/patch/1458906/ Bug: 191223209 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Ic1fc75eb1e4d7a9848cf641b9f232ad3262c490b |
||
Greg Kroah-Hartman
|
948d38f94d |
This is the 5.10.46 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmDTLFUACgkQONu9yGCS aT5eThAApQAh1A++P729NJOTeoewU5YH0/1c+ZVN4nfxxEOApeBpfA4tTDvfHJeI MYx10AI1UiLPHfLtHI5exvG00/Ll4lb0fs2bpVL2b/SQKCm2G3kZf7xOdJOBtoy4 DEaTORhmZ001weapZN+G4oz+FEnNZEyR/rThqKTA0G/PS1MxNl4ZBhY9BrySpH1V Cq7OFX18IbTh3/XXmcPotZa2sXE6Z+jjWQb5GLZ+ZjicbzgLiWWcnrm8bzLahVC4 N7TToeGv9zOLKgrE+HVR52UoFB1+2vRUEaRVOiFbDViLjoF5KWw5rAzioTCvfXW+ g/ldoAuDQBNGUrYfVUrSNwj5JuWCI2Cltt//9f/xGfPPn0HNjAxSM7ExpnMNVhVK 1gjTco+0kWzv2BGjgpNAe7+aLka5sQkLEOYlSExI6VVuF5CCcIywWjWZ6zHG0CF1 7kW8CfINV4BFP+IYw5Gnt5K3hUTulDt+alX9WgsdPxpsZ9gbIscO1/awnRrAyDyO 2EeCbZ3WWSuvFL6qAjJERiDbhDPRaZV0cwGPxzLZ7NN8ZPXLxTVv7Nc6QoiNXYkk E+LYcMua9dxFXjoHA0imKxlxqJD64mh3oUkdpTGOwIxrE5bavnKGrO2B3Nl7zWVn u8mazeKHWpJ+t+dDZ47CjrNTul0SOvryKmog//DCkvAIYSjRzVc= =WRWw -----END PGP SIGNATURE----- Merge 5.10.46 into android12-5.10-lts Changes in 5.10.46 dmaengine: idxd: add missing dsa driver unregister dmaengine: fsl-dpaa2-qdma: Fix error return code in two functions dmaengine: xilinx: dpdma: initialize registers before request_irq dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM dmaengine: QCOM_HIDMA_MGMT depends on HAS_IOMEM dmaengine: SF_PDMA depends on HAS_IOMEM dmaengine: stedma40: add missing iounmap() on error in d40_probe() afs: Fix an IS_ERR() vs NULL check mm/memory-failure: make sure wait for page writeback in memory_failure kvm: LAPIC: Restore guard to prevent illegal APIC register access fanotify: fix copy_event_to_user() fid error clean up batman-adv: Avoid WARN_ON timing related checks mac80211: fix skb length check in ieee80211_scan_rx() mlxsw: reg: Spectrum-3: Enforce lowest max-shaper burst size of 11 mlxsw: core: Set thermal zone polling delay argument to real value at init libbpf: Fixes incorrect rx_ring_setup_done net: ipv4: fix memory leak in netlbl_cipsov4_add_std vrf: fix maximum MTU net: rds: fix memory leak in rds_recvmsg net: dsa: felix: re-enable TX flow control in ocelot_port_flush() net: lantiq: disable interrupt 
before sheduling NAPI netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local ice: add ndo_bpf callback for safe mode netdev ops ice: parameterize functions responsible for Tx ring management udp: fix race between close() and udp_abort() rtnetlink: Fix regression in bridge VLAN configuration net/sched: act_ct: handle DNAT tuple collision net/mlx5e: Remove dependency in IPsec initialization flows net/mlx5e: Fix page reclaim for dead peer hairpin net/mlx5: Consider RoCE cap before init RDMA resources net/mlx5: DR, Allow SW steering for sw_owner_v2 devices net/mlx5: DR, Don't use SW steering when RoCE is not supported net/mlx5e: Block offload of outer header csum for UDP tunnels netfilter: synproxy: Fix out of bounds when parsing TCP options mptcp: Fix out of bounds when parsing TCP options sch_cake: Fix out of bounds when parsing TCP options and header mptcp: try harder to borrow memory from subflow under pressure mptcp: do not warn on bad input from the network selftests: mptcp: enable syncookie only in absence of reorders alx: Fix an error handling path in 'alx_probe()' cxgb4: fix endianness when flashing boot image cxgb4: fix sleep in atomic when flashing PHY firmware cxgb4: halt chip before flashing PHY firmware image net: stmmac: dwmac1000: Fix extended MAC address registers definition net: make get_net_ns return error if NET_NS is disabled net: qualcomm: rmnet: Update rmnet device MTU based on real device net: qualcomm: rmnet: don't over-count statistics ethtool: strset: fix message length calculation qlcnic: Fix an error handling path in 'qlcnic_probe()' netxen_nic: Fix an error handling path in 'netxen_nic_probe()' cxgb4: fix wrong ethtool n-tuple rule lookup ipv4: Fix device used for dst_alloc with local routes net: qrtr: fix OOB Read in qrtr_endpoint_post bpf: Fix leakage under speculation on mispredicted branches ptp: improve max_adj check against unreasonable values net: cdc_ncm: switch to eth%d interface naming lantiq: net: fix duplicated skb 
in rx descriptor ring net: usb: fix possible use-after-free in smsc75xx_bind net: fec_ptp: fix issue caused by refactor the fec_devtype net: ipv4: fix memory leak in ip_mc_add1_src net/af_unix: fix a data-race in unix_dgram_sendmsg / unix_release_sock net/mlx5: E-Switch, Read PF mac address net/mlx5: E-Switch, Allow setting GUID for host PF vport net/mlx5: Reset mkey index on creation be2net: Fix an error handling path in 'be_probe()' net: hamradio: fix memory leak in mkiss_close net: cdc_eem: fix tx fixup skb leak cxgb4: fix wrong shift. bnxt_en: Rediscover PHY capabilities after firmware reset bnxt_en: Fix TQM fastpath ring backing store computation bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path icmp: don't send out ICMP messages with a source address of 0.0.0.0 net: ethernet: fix potential use-after-free in ec_bhf_remove regulator: cros-ec: Fix error code in dev_err message regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting platform/x86: thinkpad_acpi: Add X1 Carbon Gen 9 second fan support ASoC: rt5659: Fix the lost powers for the HDA header phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init() ASoC: fsl-asoc-card: Set .owner attribute when registering card. 
regulator: rtmv20: Fix to make regcache value first reading back from HW spi: spi-zynq-qspi: Fix some wrong goto jumps & missing error code sched/pelt: Ensure that *_sum is always synced with *_avg ASoC: tas2562: Fix TDM_CFG0_SAMPRATE values spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd() regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL ASoC: rt5682: Fix the fast discharge for headset unplugging in soundwire mode pinctrl: ralink: rt2880: avoid to error in calls is pin is already enabled drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device ASoC: qcom: lpass-cpu: Fix pop noise during audio capture begin radeon: use memcpy_to/fromio for UVD fw upload hwmon: (scpi-hwmon) shows the negative temperature properly mm: relocate 'write_protect_seq' in struct mm_struct irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry bpf: Inherit expanded/patched seen count from old aux data bpf: Do not mark insn as seen under speculative path verification can: bcm: fix infoleak in struct bcm_msg_head can: bcm/raw/isotp: use per module netdevice notifier can: j1939: fix Use-after-Free, hold skb ref while in use can: mcba_usb: fix memory leak in mcba_usb usb: core: hub: Disable autosuspend for Cypress CY7C65632 usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection tracing: Do not stop recording cmdlines when tracing is off tracing: Do not stop recording comms if the trace file is being read tracing: Do no increment trace_clock_global() by one PCI: Mark TI C667X to avoid bus reset PCI: Mark some NVIDIA GPUs to avoid bus reset PCI: aardvark: Fix kernel panic during PIO transfer PCI: Add ACS quirk for Broadcom BCM57414 NIC PCI: Work around Huawei Intelligent NIC VF FLR erratum KVM: x86: Immediately reset the MMU context when the SMM flag is cleared KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU KVM: X86: Fix x86_emulator slab cache leak s390/mcck: fix calculation of SIE critical section size 
s390/ap: Fix hanging ioctl caused by wrong msg counter ARCv2: save ABI registers across signal handling x86/mm: Avoid truncating memblocks for SGX memory x86/process: Check PF_KTHREAD and not current->mm for kernel threads x86/ioremap: Map EFI-reserved memory as encrypted for SEV x86/pkru: Write hardware init value to PKRU when xstate is init x86/fpu: Prevent state corruption in __fpu__restore_sig() x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer x86/fpu: Reset state for all signal restore failures crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc mac80211: Fix NULL ptr deref for injected rate info cfg80211: make certificate generation more robust cfg80211: avoid double free of PMSR request drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell. drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue. net: ll_temac: Make sure to free skb when it is completely used net: ll_temac: Fix TX BD buffer overwrite net: bridge: fix vlan tunnel dst null pointer dereference net: bridge: fix vlan tunnel dst refcnt when egressing mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare mm/slub: clarify verification reporting mm/slub: fix redzoning for small allocations mm/slub: actually fix freelist pointer vs redzoning mm/slub.c: include swab.h net: stmmac: disable clocks in stmmac_remove_config_dt() net: fec_ptp: add clock rate zero check tools headers UAPI: Sync linux/in.h copy with the kernel sources perf beauty: Update copy of linux/socket.h with the kernel sources usb: dwc3: debugfs: Add and remove endpoint dirs dynamically usb: dwc3: core: fix kernel panic when do reboot Linux 5.10.46 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I99f37c9f257f90ccdb091306f3d4cfb7c32e3880 |
||
Peter Xu
|
12eb3c2c1a |
mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare
commit 099dd6878b9b12d6bbfa6bf29ce0c8ddd38f6901 upstream.
I found by pure code review that pte_same_as_swp() in unuse_vma()
didn't take the uffd-wp bit into account when comparing ptes.
A false negative from pte_same_as_swp() could cause swapoff to fail
for swap ptes that were wr-protected by userfaultfd.
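A minimal sketch of the comparison issue, assuming a swap pte is modeled as a 64-bit value with two software flag bits: the bit positions and the names `SWP_PTE_SOFT_DIRTY`, `SWP_PTE_UFFD_WP`, `swp_pte_clear_flags()` and `pte_same_as_swp_model()` are hypothetical and chosen only for illustration; real layouts are architecture-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit positions inside a swap pte. */
#define SWP_PTE_SOFT_DIRTY (UINT64_C(1) << 1)
#define SWP_PTE_UFFD_WP    (UINT64_C(1) << 2)

/* Strip every software flag bit that does not identify the swap entry. */
static inline uint64_t swp_pte_clear_flags(uint64_t pte)
{
    return pte & ~(SWP_PTE_SOFT_DIRTY | SWP_PTE_UFFD_WP);
}

/* Compare a pte found in the page table against a clean swap pte.
 * Without clearing the uffd-wp bit first, a write-protected swap pte
 * would never match and would be skipped. */
bool pte_same_as_swp_model(uint64_t pte, uint64_t swp_pte)
{
    return swp_pte_clear_flags(pte) == swp_pte;
}
```

The comparison masks the flags on the page-table side only, since the pte built from the swap entry being searched for carries no flag bits in this model.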
Link: https://lkml.kernel.org/r/20210603180546.9083-1-peterx@redhat.com
Fixes:
|
||
Greg Kroah-Hartman
|
28454baf9c |
This is the 5.10.21 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmBEulcACgkQONu9yGCS aT5+8BAAtgxUCbSKsQOd8wObQSeasrc0DD3r3pAmf08gxVRAwtzupxmeJvNcxU6G XSIeKJZ0c/3JVQhh7390mZKS5IYtEFWaBjbbkwogrx9Vsn12ewDKEjfTqeOjli7B iaOI8qaRN/zYhEgi2J6wWn81WBKlmjWh+bB2LnX7LzOATguVehMAmLsF6KHrHBne ERml4aRXxrj2EwiG55BqDY1jMD6sPPECEqMhQYdSeEJR9jCkvUdpa0SCQ7UwBOav nS/Rmxnx35LjIPUIO4Jk+CLQJnFlsCJjjdW222yJIgFW5NeBbuN4xyONVWwXUddj X6prMXjKMf1fCfxQ124JwiIp98mSz+1nIhu7zMQuALbDh/S1GMufKEq/cHg57Jlh vliQcO4kv9Vq0GPcn667P1OAL+DXAgSyTk8Hajv+do9Cb5c3897LpuiDPgadh+F6 c6xO6MT8zX5L54xRu2ITtaFh8LRaCgMHEFrDcBwIKGvlGwrzIvYxxuoAFIyiN/yL ZIfZWbs1y8XHsHC69f0B7MTC99+TPr/M+a1gnQ9c9B7HBlzGEjoTTgFFhkvKs1lr dLe63Z2NF5ZQhHOSW47lqBkEHhYVW+cSQdPcEUFJu5qs/xuzXyzVZWf0443nHOMN RLgMvIR1DgrLSoWIl8A0cfBVWd2+FUa7IBBtrz6P+RyRLe+ybQ8= =ZowT -----END PGP SIGNATURE----- Merge 5.10.21 into android12-5.10 Changes in 5.10.21 net: usb: qmi_wwan: support ZTE P685M modem Input: elantech - fix protocol errors for some trackpoints in SMBus mode Input: elan_i2c - add new trackpoint report type 0x5F drm/virtio: use kvmalloc for large allocations x86/build: Treat R_386_PLT32 relocation as R_386_PC32 JFS: more checks for invalid superblock sched/core: Allow try_invoke_on_locked_down_task() with irqs disabled udlfb: Fix memory leak in dlfb_usb_probe media: mceusb: sanity check for prescaler value erofs: fix shift-out-of-bounds of blkszbits media: v4l2-ctrls.c: fix shift-out-of-bounds in std_validate xfs: Fix assert failure in xfs_setattr_size() net/af_iucv: remove WARN_ONCE on malformed RX packets smackfs: restrict bytes count in smackfs write functions tomoyo: ignore data race while checking quota net: fix up truesize of cloned skb in skb_prepare_for_shift() riscv: Get rid of MAX_EARLY_MAPPING_SIZE nbd: handle device refs for DESTROY_ON_DISCONNECT properly mm/hugetlb.c: fix unnecessary address expansion of pmd sharing RDMA/rtrs: Do not signal for heatbeat RDMA/rtrs-clt: Use bitmask to check sess->flags RDMA/rtrs-srv: Do not 
signal REG_MR tcp: fix tcp_rmem documentation mptcp: do not wakeup listener for MPJ subflows net: bridge: use switchdev for port flags set through sysfs too net/sched: cls_flower: Reject invalid ct_state flags rules net: dsa: tag_rtl4_a: Support also egress tags net: ag71xx: remove unnecessary MTU reservation net: hsr: add support for EntryForgetTime net: psample: Fix netlink skb length with tunnel info net: fix dev_ifsioc_locked() race condition dt-bindings: ethernet-controller: fix fixed-link specification dt-bindings: net: btusb: DT fix s/interrupt-name/interrupt-names/ ASoC: qcom: Remove useless debug print rsi: Fix TX EAPOL packet handling against iwlwifi AP rsi: Move card interrupt handling to RX thread EDAC/amd64: Do not load on family 0x15, model 0x13 staging: fwserial: Fix error handling in fwserial_create x86/reboot: Add Zotac ZBOX CI327 nano PCI reboot quirk vt/consolemap: do font sum unsigned wlcore: Fix command execute failure 19 for wl12xx Bluetooth: hci_h5: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for btrtl Bluetooth: btusb: fix memory leak on suspend and resume mt76: mt7615: reset token when mac_reset happens pktgen: fix misuse of BUG_ON() in pktgen_thread_worker() ath10k: fix wmi mgmt tx queue full due to race condition net: sfp: add mode quirk for GPON module Ubiquiti U-Fiber Instant Bluetooth: Add new HCI_QUIRK_NO_SUSPEND_NOTIFIER quirk Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data staging: most: sound: add sanity check for function argument staging: bcm2835-audio: Replace unsafe strcpy() with strscpy() brcmfmac: Add DMI nvram filename quirk for Predia Basic tablet brcmfmac: Add DMI nvram filename quirk for Voyo winpad A15 tablet drm/hisilicon: Fix use-after-free crypto: tcrypt - avoid signed overflow in byte count fs: make unlazy_walk() error handling consistent drm/amdgpu: Add check to prevent IH overflow PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse ASoC: Intel: bytcr_rt5640: Add new BYT_RT5640_NO_SPEAKERS 
quirk-flag drm/amd/display: Guard against NULL pointer deref when get_i2c_info fails drm/amd/amdgpu: add error handling to amdgpu_virt_read_pf2vf_data media: uvcvideo: Allow entities with no pads f2fs: handle unallocated section and zone on pinned/atgc f2fs: fix to set/clear I_LINKABLE under i_lock nvme-core: add cancel tagset helpers nvme-rdma: add clean action for failed reconnection nvme-tcp: add clean action for failed reconnection ASoC: Intel: Add DMI quirk table to soc_intel_is_byt_cr() btrfs: fix error handling in commit_fs_roots perf/x86/kvm: Add Cascade Lake Xeon steppings to isolation_ucodes[] ASoC: Intel: sof-sdw: indent and add quirks consistently ASoC: Intel: sof_sdw: detect DMIC number based on mach params parisc: Bump 64-bit IRQ stack size to 64 KB sched/features: Fix hrtick reprogramming ASoC: Intel: bytcr_rt5640: Add quirk for the Estar Beauty HD MID 7316R tablet ASoC: Intel: bytcr_rt5640: Add quirk for the Voyo Winpad A15 tablet ASoC: Intel: bytcr_rt5651: Add quirk for the Jumper EZpad 7 tablet ASoC: Intel: bytcr_rt5640: Add quirk for the Acer One S1002 tablet scsi: iscsi: Restrict sessions and handles to admin capabilities scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE scsi: iscsi: Verify lengths on passthrough PDUs Xen/gnttab: handle p2m update errors on a per-slot basis xen-netback: respect gnttab_map_refs()'s return value xen: fix p2m size in dom0 for disabled memory hotplug case zsmalloc: account the number of compacted pages correctly remoteproc/mediatek: Fix kernel test robot warning swap: fix swapfile read/write offset powerpc/sstep: Check instruction validity against ISA version before emulation powerpc/sstep: Fix incorrect return from analyze_instr() tty: fix up iterate_tty_read() EOVERFLOW handling tty: fix up hung_up_tty_read() conversion tty: clean up legacy leftovers from n_tty line discipline tty: teach n_tty line discipline about the new "cookie continuations" tty: teach the n_tty ICANON case about the new "cookie 
continuations" too media: v4l: ioctl: Fix memory leak in video_usercopy ALSA: hda/realtek: Add quirk for Clevo NH55RZQ ALSA: hda/realtek: Add quirk for Intel NUC 10 ALSA: hda/realtek: Apply dual codec quirks for MSI Godlike X570 board net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 workaround net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips Linux 5.10.21 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I52b1105b73d893779b3886b577accfabe9f83a16 |
||
Jens Axboe
|
04b049ac9c |
swap: fix swapfile read/write offset
commit caf6912f3f4af7232340d500a4a2008f81b93f14 upstream.
We're not factoring in the start of the file for where to write and
read the swapfile, which leads to very unfortunate side effects of
writing where we should not be...
Fixes:
|
||
Will Deacon
|
cab48b24a8 |
BACKPORT: FROMGIT: mm: Use static initialisers for immutable fields of 'struct vm_fault'
In preparation for const-ifying the anonymous struct field of 'struct vm_fault', ensure that it is initialised using designated initialisers. Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Will Deacon <will@kernel.org> Change-Id: Ib2c84bbc4d59fe1811465e59c89f8eb7f73e6229 Bug: 171278850 (cherry picked from commit 8c63ca5bc3e19f11128e8e285dcf20aac6768f97 https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/faultaround) Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
Greg Kroah-Hartman
|
39564d70ad |
This is the 5.10.12 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAVV8MACgkQONu9yGCS aT5EyxAAkIKzfWDIgtxBBui9zNb4af0nik/4Fv+0ynvvMFIJ+9OEh8vrrzASze3E 6w8E5c1TxP2iXiW0/NQqU2UWmdVzO85zAeGMjZGSgzn4AtbZrBd8FIk3g5aNzGEJ xuqlVm+VOmdQ30Lr+yIOE/xwGDhGy+4cCMBQqGdMWk3Bnsk2QHBzSzyLZOJiK8M1 9qTyMvtUdIVDFw5rqWQgtfNkcCfk7dMfjmD1bFVSFiJCnJbHE2Yr8y2MscSeLZ1V csBmg6K/JgEZFJFVamFKfGkAKQp2nI6YIUm3K0oJhp9BYYECJaH0irnkrT5F8rU8 RBvxW+9E+SOmrHoEo9RTfGDnvU0hOrZolmPmj71puT6vHzw/S2npoAanWX+nWD6j dVTT77TKaSovmqp7+Lt9djsb3E9WzKHlIBJIcgcy/uyMpsllmHt6GROYBIa5gFJk LZY6zFrG9l04RYICBuuD6XNcqP56H/WnhBB8us3X5ui5x/3fI+RFBhf/UOXzxUnB KcBzRLCUFugvPdKeXGmjn0FCrj1vpj1/cbqLbDvETq9nF8qp/sXjPHbDpvNHyBOR MpzFgWnNrg2pYlJHidxpj2gog8jvEEdtOHeVW16HpVsvwMClJVcgaBF3US5mT8Zy nNohKtYPx6XjdddDb41NZsWxPHizN7FGnFeJOTZpH0YjNpTNS6c= =etoA -----END PGP SIGNATURE----- Merge 5.10.12 into android12-5.10 Changes in 5.10.12 gpio: mvebu: fix pwm .get_state period calculation Revert "mm/slub: fix a memory leak in sysfs_slab_add()" futex: Ensure the correct return value from futex_lock_pi() futex: Replace pointless printk in fixup_owner() futex: Provide and use pi_state_update_owner() rtmutex: Remove unused argument from rt_mutex_proxy_unlock() futex: Use pi_state_update_owner() in put_pi_state() futex: Simplify fixup_pi_state_owner() futex: Handle faults correctly for PI futexes HID: wacom: Correct NULL dereference on AES pen proximity HID: multitouch: Apply MT_QUIRK_CONFIDENCE quirk for multi-input devices media: Revert "media: videobuf2: Fix length check for single plane dmabuf queueing" media: v4l2-subdev.h: BIT() is not available in userspace RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC iwlwifi: dbg: Don't touch the tlv data kernel/io_uring: cancel io_uring before task works io_uring: inline io_uring_attempt_task_drop() io_uring: add warn_once for io_uring_flush() io_uring: stop SQPOLL submit on creator's death io_uring: fix null-deref in io_disable_sqo_submit io_uring: do sqo disable on install_fd error 
io_uring: fix false positive sqo warning on flush io_uring: fix uring_flush in exit_files() warning io_uring: fix skipping disabling sqo on exec io_uring: dont kill fasync under completion_lock io_uring: fix sleeping under spin in __io_clean_op objtool: Don't fail on missing symbol table mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint mm: fix a race on nr_swap_pages tools: Factor HOSTCC, HOSTLD, HOSTAR definitions printk: fix buffer overflow potential for print_text() printk: fix string termination for record_print_text() Linux 5.10.12 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6d96ec78494ebbc0daf4fdecfc13e522c6bd6b42 |
||
Zhaoyang Huang
|
f472a59aa1 |
mm: fix a race on nr_swap_pages
commit b50da6e9f42ade19141f6cf8870bb2312b055aa3 upstream.
The scenario in which "Free swap = -4kB" appears on my system is caused by several get_swap_pages calls racing with each other while show_swap_cache_info runs simultaneously. No need to add a lock on get_swap_page_of_type as we remove "Presub/PosAdd" here.
ProcessA                     ProcessB                     ProcessC
ngoals = 1                   ngoals = 1
avail = nr_swap_pages(1)     avail = nr_swap_pages(1)
nr_swap_pages(1) -= ngoals
                             nr_swap_pages(0) -= ngoals
                                                          nr_swap_pages = -1
Link: https://lkml.kernel.org/r/1607050340-4535-1-git-send-email-zhaoyang.huang@unisoc.com
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
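The interleaving described in the "mm: fix a race on nr_swap_pages" message can be replayed in user space. The sketch below is a model only, not kernel code: `racy_interleaving` and `SwapCounter` are hypothetical names, and the lock-based variant merely illustrates one way to make the availability check and the subtraction a single step. It is not claimed to be the exact change made by the commit.

```python
import threading


def racy_interleaving(nr_swap_pages=1):
    """Replay the commit message's interleaving step by step (no threads)."""
    ngoals = 1
    # ProcessA and ProcessB both sample the counter before either subtracts.
    avail_a = nr_swap_pages
    avail_b = nr_swap_pages
    if avail_a > 0:
        nr_swap_pages -= ngoals  # ProcessA: 1 -> 0
    if avail_b > 0:              # stale check: ProcessB still believes 1 page is free
        nr_swap_pages -= ngoals  # ProcessB: 0 -> -1
    return nr_swap_pages         # the value ProcessC would report


class SwapCounter:
    """Hypothetical counter whose check and subtract share one lock."""

    def __init__(self, pages):
        self.pages = pages
        self._lock = threading.Lock()

    def take(self, ngoals=1):
        # Reading and updating under the same lock closes the window.
        with self._lock:
            granted = min(ngoals, max(self.pages, 0))
            self.pages -= granted
            return granted
```

With one free page, `racy_interleaving()` ends at -1, which corresponds to the negative "Free swap" figure, while any number of concurrent `SwapCounter.take()` calls hand out at most one page in total.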
Vijayanand Jitta
|
5e07d2eb08 |
ANDROID: mm: Export si_swapinfo
Export si_swapinfo symbol which is used as part of meminfo collection from minidump module. Bug: 176277894 Change-Id: I5dc1672ce649c22dc33d4a544ee5a38f8376becf Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org> |
||
Qian Cai
|
b11a76b37a |
mm/swapfile: do not sleep with a spin lock held
We can't call kvfree() with a spin lock held, so defer it. Fixes a
might_sleep() runtime warning.
Fixes:
|
||
Miaohe Lin
|
822bca52ee |
mm/swapfile.c: fix potential memory leak in sys_swapon
If we failed to drain inode, we would forget to free the swap address
space allocated by init_swap_address_space() above.
Fixes:
|
||
Miaohe Lin
|
7a3d52e45e |
mm/swapfile.c: remove unnecessary goto out in _swap_info_get()
It's unnecessary to goto the out label while out label is just below. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200930102549.1885-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Yu Zhao
|
cc2828b21c |
mm: remove activate_page() from unuse_pte()
We don't initially add anon pages to active lruvec after commit
|
||
Gao Xiang
|
3264631548 |
swap: rename SWP_FS to SWP_FS_OPS to avoid ambiguity
SWP_FS is used to make swap_{read,write}page() go through the filesystem, and it's only used for swap files over NFS for now. Otherwise it will directly submit IO to blockdev according to swapfile extents reported by filesystems in advance. As Matthew pointed out [1], SWP_FS naming is somewhat confusing, so let's rename to SWP_FS_OPS. [1] https://lore.kernel.org/r/20200820113448.GM17456@casper.infradead.org Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200822113019.11319-1-hsiangkao@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
3ad11d7ac8 |
block-5.10-2020-10-12
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+EWUgQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpnoxEADCVSNBRkpV0OVkOEC3wf8EGhXhk01Jnjtl u5Mg2V55hcgJ0thQxBV/V28XyqmsEBrmAVi0Yf8Vr9Qbq4Ze08Wae4ChS4rEOyh1 jTcGYWx5aJB3ChLvV/HI0nWQ3bkj03mMrL3SW8rhhf5DTyKHsVeTenpx42Qu/FKf fRzi09FSr3Pjd0B+EX6gunwJnlyXQC5Fa4AA0GhnXJzAznANXxHkkcXu8a6Yw75x e28CfhIBliORsK8sRHLoUnPpeTe1vtxCBhBMsE+gJAj9ZUOWMzvNFIPP4FvfawDy 6cCQo2m1azJ/IdZZCDjFUWyjh+wxdKMp+NNryEcoV+VlqIoc3n98rFwrSL+GIq5Z WVwEwq+AcwoMCsD29Lu1ytL2PQ/RVqcJP5UheMrbL4vzefNfJFumQVZLIcX0k943 8dFL2QHL+H/hM9Dx5y5rjeiWkAlq75v4xPKVjh/DHb4nehddCqn/+DD5HDhNANHf c1kmmEuYhvLpIaC4DHjE6DwLh8TPKahJjwsGuBOTr7D93NUQD+OOWsIhX6mNISIl FFhP8cd0/ZZVV//9j+q+5B4BaJsT+ZtwmrelKFnPdwPSnh+3iu8zPRRWO+8P8fRC YvddxuJAmE6BLmsAYrdz6Xb/wqfyV44cEiyivF0oBQfnhbtnXwDnkDWSfJD1bvCm ZwfpDh2+Tg== =LzyE -----END PGP SIGNATURE----- Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, seperating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of 
git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ... |
||
Linus Torvalds
|
6734e20e39 |
arm64 updates for 5.10
- Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. - Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding. - Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. - Miscellaneous cleanups and refactoring work. 
-----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+AUXMQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNFc1B/4q2Kabe+pPu7s1f58Q+OTaEfqcr3F1qh27 F1YpFZUYxg0GPfPsFrnbJpo5WKo7wdR9ceI9yF/GHjs7A/MSoQJis3pG6SlAd9c0 nMU5tCwhg9wfq6asJtl0/IPWem6cqqhdzC6m808DjeHuyi2CCJTt0vFWH3OeHEhG cfmLfaSNXOXa/MjEkT8y1AXJ/8IpIpzkJeCRA1G5s18PXV9Kl5bafIo9iqyfKPLP 0rJljBmoWbzuCSMc81HmGUQI4+8KRp6HHhyZC/k0WEVgj3LiumT7am02bdjZlTnK BeNDKQsv2Jk8pXP2SlrI3hIUTz0bM6I567FzJEokepvTUzZ+CVBi =9J8H -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's quite a lot of code here, but much of it is due to the addition of a new PMU driver as well as some arm64-specific selftests which is an area where we've traditionally been lagging a bit. In terms of exciting features, this includes support for the Memory Tagging Extension which narrowly missed 5.9, hopefully allowing userspace to run with use-after-free detection in production on CPUs that support it. Work is ongoing to integrate the feature with KASAN for 5.11. Another change that I'm excited about (assuming they get the hardware right) is preparing the ASID allocator for sharing the CPU page-table with the SMMU. Those changes will also come in via Joerg with the IOMMU pull. We do stray outside of our usual directories in a few places, mostly due to core changes required by MTE. Although much of this has been Acked, there were a couple of places where we unfortunately didn't get any review feedback. Other than that, we ran into a handful of minor conflicts in -next, but nothing that should post any issues. Summary: - Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. 
- Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding. - Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. 
- Miscellaneous cleanups and refactoring work" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) Revert "arm64: initialize per-cpu offsets earlier" arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 ... |
||
Gao Xiang
|
4166343058 |
mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the filesystem, and it's only used for swap files over NFS. So, !SWP_FS means non NFS for now, it could be either file backed or device backed. Something similar goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should be used instead.
FS corruption can be observed with SSD device + XFS + fragmented swapfile due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
  QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
  CONFIG_BLK_DEV_NVME=y
  CONFIG_THP_SWAP=y
Some reproducible steps:
  mkfs.xfs -f /dev/nvme0n1
  mkdir /tmp/mnt
  mount /dev/nvme0n1 /tmp/mnt
  bs="32k"
  sz="1024m" # doesn't matter too much, I also tried 16m
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
  mkswap /tmp/mnt/sw
  swapon /tmp/mnt/sw
  stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
 - FS corruption (e.g. checksum failure)
 - memory corruption at: 0xd2808010
 - segfault
Fixes: |
||
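The flag reasoning in "mm, THP, swap: fix allocating cluster for swapfile by mistake" can be checked with a small truth table. This is a sketch under stated assumptions: the bit values below are illustrative stand-ins for the definitions in include/linux/swap.h, and `cluster_ok_old` / `cluster_ok_new` are hypothetical helper names that model only the condition described in the message, not the surrounding kernel logic.

```python
# Illustrative bit values (assumption); the real ones live in include/linux/swap.h.
SWP_BLKDEV = 1 << 6  # swap area is a block device
SWP_FS = 1 << 8      # swap I/O goes through the filesystem (swap over NFS)


def cluster_ok_old(flags):
    """Condition before the fix: anything that is not swap-over-NFS."""
    return not (flags & SWP_FS)


def cluster_ok_new(flags):
    """Condition after the fix: only real block devices."""
    return bool(flags & SWP_BLKDEV)


# A regular swapfile on a local filesystem such as XFS sets neither flag.
CASES = {
    "swap partition": SWP_BLKDEV,
    "swapfile on XFS": 0,
    "swapfile on NFS": SWP_FS,
}

for name, flags in CASES.items():
    print(f"{name:16} old={cluster_ok_old(flags)!s:5} new={cluster_ok_new(flags)}")
```

The middle case is the one the commit targets: a fragmented swapfile on a local filesystem passes the old `!SWP_FS` test even though it is not a block device, whereas the `SWP_BLKDEV` test rejects it.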
Christoph Hellwig
|
1cb039f3dc |
bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the backing_dev_info shared between the block drivers and the writeback code. To help untangle the dependency, replace it with a queue flag and a superblock flag derived from it. This also helps with the case of e.g. a file system requiring stable writes due to its own checksumming, but not forcing it on other users of the block device like the swap code. One downside is that we can't support the stable_pages_required bdi attribute in sysfs anymore. It is replaced with a queue attribute which also is writable for easier testing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
a8b456d01c |
bdi: remove BDI_CAP_SYNCHRONOUS_IO
BDI_CAP_SYNCHRONOUS_IO is only checked in the swap code, and used to decide if ->rw_page can be used on a block device. Just check for the method instead. The only complication is that zram needs a second set of block_device_operations as it can switch between modes that actually support ->rw_page and those that don't. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
21bd900572 |
mm: split swap_type_of
swap_type_of is used for two entirely different purposes:
 (1) check what swap type a given device/offset corresponds to
 (2) find the first available swap device that can be written to
Mixing both in a single function creates an unreadable mess. Create two separate functions instead, and switch both to pass a dev_t instead of a struct block_device to further simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
ef16e1d98c |
mm: cleanup claim_swapfile
Use blkdev_get_by_dev instead of bdgrab + blkdev_get. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Steven Price
|
8a84802e2a |
mm: Add arch hooks for saving/restoring tags
Arm's Memory Tagging Extension (MTE) adds some metadata (tags) to every physical page. When swapping pages out to disk it is necessary to save these tags, and later restore them when reading the pages back. Add some hooks along with dummy implementations to enable the arch code to handle this.
Three new hooks are added to the swap code:
 * arch_prepare_to_swap() and
 * arch_swap_invalidate_page() / arch_swap_invalidate_area().
One new hook is added to shmem:
 * arch_swap_restore()
Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: add unlock_page() on the error path] [catalin.marinas@arm.com: dropped the _tags suffix] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> |
||
Qian Cai
|
a449bf58e4 |
mm/swapfile: fix and annotate various data races
swap_info_struct si.highest_bit, si.swap_map[offset] and si.flags could be accessed concurrently separately as noticed by KCSAN, === si.highest_bit === write to 0xffff8d5abccdc4d4 of 4 bytes by task 5353 on cpu 24: swap_range_alloc+0x81/0x130 swap_range_alloc at mm/swapfile.c:681 scan_swap_map_slots+0x371/0xb90 get_swap_pages+0x39d/0x5c0 get_swap_page+0xf2/0x524 add_to_swap+0xe4/0x1c0 shrink_page_list+0x1795/0x2870 shrink_inactive_list+0x316/0x880 shrink_lruvec+0x8dc/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 read to 0xffff8d5abccdc4d4 of 4 bytes by task 6672 on cpu 70: scan_swap_map_slots+0x4a6/0xb90 scan_swap_map_slots at mm/swapfile.c:892 get_swap_pages+0x39d/0x5c0 get_swap_page+0xf2/0x524 add_to_swap+0xe4/0x1c0 shrink_page_list+0x1795/0x2870 shrink_inactive_list+0x316/0x880 shrink_lruvec+0x8dc/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 Reported by Kernel Concurrency Sanitizer on: CPU: 70 PID: 6672 Comm: oom01 Tainted: G W L 5.5.0-next-20200205+ #3 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 === si.swap_map[offset] === write to 0xffffbc370c29a64c of 1 bytes by task 6856 on cpu 86: __swap_entry_free_locked+0x8c/0x100 __swap_entry_free_locked at mm/swapfile.c:1209 (discriminator 4) __swap_entry_free.constprop.20+0x69/0xb0 free_swap_and_cache+0x53/0xa0 unmap_page_range+0x7f8/0x1d70 unmap_single_vma+0xcd/0x170 unmap_vmas+0x18b/0x220 exit_mmap+0xee/0x220 mmput+0x10e/0x270 do_exit+0x59b/0xf40 do_group_exit+0x8b/0x180 read to 0xffffbc370c29a64c of 1 bytes by task 6855 on cpu 20: _swap_info_get+0x81/0xa0 _swap_info_get at mm/swapfile.c:1140 free_swap_and_cache+0x40/0xa0 unmap_page_range+0x7f8/0x1d70 unmap_single_vma+0xcd/0x170 unmap_vmas+0x18b/0x220 exit_mmap+0xee/0x220 mmput+0x10e/0x270 do_exit+0x59b/0xf40 do_group_exit+0x8b/0x180 === si.flags === write to 
0xffff956c8fc6c400 of 8 bytes by task 6087 on cpu 23: scan_swap_map_slots+0x6fe/0xb50 scan_swap_map_slots at mm/swapfile.c:887 get_swap_pages+0x39d/0x5c0 get_swap_page+0x377/0x524 add_to_swap+0xe4/0x1c0 shrink_page_list+0x1795/0x2870 shrink_inactive_list+0x316/0x880 shrink_lruvec+0x8dc/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 read to 0xffff956c8fc6c400 of 8 bytes by task 6207 on cpu 63: _swap_info_get+0x41/0xa0 __swap_info_get at mm/swapfile.c:1114 put_swap_page+0x84/0x490 __remove_mapping+0x384/0x5f0 shrink_page_list+0xff1/0x2870 shrink_inactive_list+0x316/0x880 shrink_lruvec+0x8dc/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 The writes are under si->lock but the reads are not. For si.highest_bit and si.swap_map[offset], data race could trigger logic bugs, so fix them by having WRITE_ONCE() for the writes and READ_ONCE() for the reads except those isolated reads where they compare against zero which a data race would cause no harm. Thus, annotate them as intentional data races using the data_race() macro. For si.flags, the readers are only interested in a single bit where a data race there would cause no issue there. [cai@lca.pw: add a missing annotation for si->flags in memory.c] Link: http://lkml.kernel.org/r/1581612647-5958-1-git-send-email-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Marco Elver <elver@google.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/1581095163-12198-1-git-send-email-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Matthew Wilcox (Oracle)
|
6c357848b4 |
mm: replace hpage_nr_pages with thp_nr_pages
The thp prefix is more frequently used than hpage and we should be consistent between the various functions. [akpm@linux-foundation.org: fix mm/migrate.c] Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-6-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Joonsoo Kim
|
3852f6768e |
mm/swapcache: support to handle the shadow entries
Workingset detection for anonymous pages will be implemented in the following patch, and it requires storing the shadow entries in the swapcache. This patch implements the infrastructure to store the shadow entry in the swapcache. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/1595490560-15117-5-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Joonsoo Kim
|
b518154e59 |
mm/vmscan: protect the workingset on anonymous LRU
In the current implementation, a newly created or swapped-in anonymous page starts on the active list. A growing active list triggers rebalancing of the active/inactive lists, so old pages on the active list are demoted to the inactive list. Hence, a page on the active list isn't protected at all.
Following is an example of this situation. Assume that there are 50 hot pages on the active list. Numbers denote the number of pages on the active/inactive lists (active | inactive).
1. 50 hot pages on active list
   50(h) | 0
2. workload: 50 newly created (used-once) pages
   50(uo) | 50(h)
3. workload: another 50 newly created (used-once) pages
   50(uo) | 50(uo), swap-out 50(h)
This patch tries to fix this issue. As with the file LRU, newly created or swapped-in anonymous pages will be inserted into the inactive list. They are promoted to the active list if enough references happen. This simple modification changes the above example as follows.
1. 50 hot pages on active list
   50(h) | 0
2. workload: 50 newly created (used-once) pages
   50(h) | 50(uo)
3. workload: another 50 newly created (used-once) pages
   50(h) | 50(uo), swap-out 50(uo)
As you can see, hot pages on the active list would be protected.
Note that this implementation has a drawback: a page cannot be promoted and will be swapped out if its re-access interval is greater than the size of the inactive list but less than the total size (active + inactive). To solve this potential issue, the following patch will apply workingset detection similar to the one that's already applied to the file LRU.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Link: http://lkml.kernel.org/r/1595490560-15117-3-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
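The 50-page example from "mm/vmscan: protect the workingset on anonymous LRU" can be reproduced with a toy two-list model. This is a minimal sketch, assuming a memory of 100 pages and an active list capped at 50; the function `run`, its parameters, and the eviction rule are illustrative assumptions and do not mirror the real reclaim code.

```python
from collections import deque


def run(policy, total=100, active_max=50):
    """Toy model: index 0 of each deque is the list head (most recent page)."""
    active = deque(["h"] * 50)  # step 1: 50 hot pages on the active list
    inactive = deque()
    swapped_out = []
    for _ in range(2):          # steps 2 and 3: two bursts of used-once pages
        for _ in range(50):
            if policy == "old":
                # Old behaviour: new anon pages start on the active list,
                # pushing the oldest active page down to the inactive list.
                active.appendleft("uo")
                if len(active) > active_max:
                    inactive.appendleft(active.pop())
            else:
                # New behaviour: new anon pages start on the inactive list.
                inactive.appendleft("uo")
            # Memory pressure: reclaim from the tail of the inactive list.
            if len(active) + len(inactive) > total:
                swapped_out.append(inactive.pop())
    return list(active), list(inactive), swapped_out


for policy in ("old", "new"):
    active, inactive, out = run(policy)
    print(policy, set(active), "|", set(inactive), "swap-out", set(out))
```

Under the "old" policy the model swaps out the 50 hot pages, and under the "new" policy it swaps out 50 used-once pages while the hot pages stay on the active list, matching the two tables in the commit message.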