android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Peter Collingbourne	ca53b8f1b4	BACKPORT: mm: make minimum slab alignment a runtime property When CONFIG_KASAN_HW_TAGS is enabled we currently increase the minimum slab alignment to 16. This happens even if MTE is not supported in hardware or disabled via kasan=off, which creates an unnecessary memory overhead in those cases. Eliminate this overhead by making the minimum slab alignment a runtime property and only aligning to 16 if KASAN is enabled at runtime. On a DragonBoard 845c (non-MTE hardware) with a kernel built with CONFIG_KASAN_HW_TAGS, waiting for quiescence after a full Android boot I see the following Slab measurements in /proc/meminfo (median of 3 reboots): Before: 169020 kB After: 167304 kB [akpm@linux-foundation.org: make slab alignment type `unsigned int' to avoid casting] Link: https://linux-review.googlesource.com/id/I752e725179b43b144153f4b6f584ceb646473ead Link: https://lkml.kernel.org/r/20220427195820.1716975-2-pcc@google.com Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Bug: 265364138 (cherry picked from commit d949a8155d139aa890795b802004a196b7f00598) [Zhenhua: fold 587cfd8e66df3515 ("ANDROID: fix alignment of struct shash_desc member") into this change, to keep ABI compatibility] Change-Id: I3749f8de65ef3619724e68a9affb4eefd1ebe737 Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>	2023-01-20 00:46:19 +00:00
Muchun Song	2357d700f8	UPSTREAM: mm: kfence: fix missing objcg housekeeping for SLAB The objcg is not cleared and put for kfence object when it is freed, which could lead to memory leak for struct obj_cgroup and wrong statistics of NR_SLAB_RECLAIMABLE_B or NR_SLAB_UNRECLAIMABLE_B. Since the last freed object's objcg is not cleared, mem_cgroup_from_obj() could return the wrong memcg when this kfence object, which is not charged to any objcgs, is reallocated to other users. A real word issue [1] is caused by this bug. Bug: 254441685 Link: https://lore.kernel.org/all/000000000000cabcb505dae9e577@google.com/ [1] Reported-by: syzbot+f8c45ccc7d5d45fc5965@syzkaller.appspotmail.com Fixes: d3fb45f370d9 ("mm, kfence: insert KFENCE hooks for SLAB") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit ae085d7f9365de7da27ab5c0d16b12d51ea7fca9) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: If17f6048e312e0cf78d01f7c122b84b3fb4a58d8	2022-11-09 13:57:12 +00:00
Greg Kroah-Hartman	e054456ced	Merge 5.10.37 into android12-5.10 Changes in 5.10.37 Bluetooth: verify AMP hci_chan before amp_destroy bluetooth: eliminate the potential race condition when removing the HCI controller net/nfc: fix use-after-free llcp_sock_bind/connect io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers Revert "USB: cdc-acm: fix rounding error in TIOCSSERIAL" usb: roles: Call try_module_get() from usb_role_switch_find_by_fwnode() tty: moxa: fix TIOCSSERIAL jiffies conversions tty: amiserial: fix TIOCSSERIAL permission check USB: serial: usb_wwan: fix TIOCSSERIAL jiffies conversions staging: greybus: uart: fix TIOCSSERIAL jiffies conversions USB: serial: ti_usb_3410_5052: fix TIOCSSERIAL permission check staging: fwserial: fix TIOCSSERIAL jiffies conversions tty: moxa: fix TIOCSSERIAL permission check staging: fwserial: fix TIOCSSERIAL permission check drm: bridge: fix LONTIUM use of mipi_dsi_() functions usb: typec: tcpm: Address incorrect values of tcpm psy for fixed supply usb: typec: tcpm: Address incorrect values of tcpm psy for pps supply usb: typec: tcpm: update power supply once partner accepts usb: xhci-mtk: remove or operator for setting schedule parameters usb: xhci-mtk: improve bandwidth scheduling with TT ASoC: samsung: tm2_wm5110: check of of_parse return value ASoC: Intel: kbl_da7219_max98927: Fix kabylake_ssp_fixup function ASoC: tlv320aic32x4: Register clocks before registering component ASoC: tlv320aic32x4: Increase maximum register in regmap MIPS: pci-mt7620: fix PLL lock check MIPS: pci-rt2880: fix slot 0 configuration FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR PCI: Allow VPD access for QLogic ISP2722 KVM: x86: Defer the MMU unload to the normal path on an global INVPCID PCI: xgene: Fix cfg resource mapping PCI: keystone: Let AM65 use the pci_ops defined in pcie-designware-host.c PM / devfreq: Unlock mutex and free devfreq struct in error path soc/tegra: regulators: Fix locking up when voltage-spread is out of range iio: inv_mpu6050: Fully validate gyro and accel scale writes iio:accel:adis16201: Fix wrong axis assignment that prevents loading iio:adc:ad7476: Fix remove handling sc16is7xx: Defer probe if device read fails phy: cadence: Sierra: Fix PHY power_on sequence misc: lis3lv02d: Fix false-positive WARN on various HP models phy: ti: j721e-wiz: Invoke wiz_init() before of_platform_device_create() misc: vmw_vmci: explicitly initialize vmci_notify_bm_set_msg struct misc: vmw_vmci: explicitly initialize vmci_datagram payload selinux: add proper NULL termination to the secclass_map permissions x86, sched: Treat Intel SNC topology as default, COD as exception async_xor: increase src_offs when dropping destination page md/bitmap: wait for external bitmap writes to complete during tear down md-cluster: fix use-after-free issue when removing rdev md: split mddev_find md: factor out a mddev_find_locked helper from mddev_find md: md_open returns -EBUSY when entering racing area md: Fix missing unused status line of /proc/mdstat mt76: mt7615: use ieee80211_free_txskb() in mt7615_tx_token_put() ipw2x00: potential buffer overflow in libipw_wx_set_encodeext() cfg80211: scan: drop entry from hidden_list on overflow rtw88: Fix array overrun in rtw_get_tx_power_params() mt76: fix potential DMA mapping leak FDDI: defxx: Make MMIO the configuration default except for EISA drm/i915/gvt: Fix virtual display setup for BXT/APL drm/i915/gvt: Fix vfio_edid issue for BXT/APL drm/qxl: use ttm bo priorities drm/panfrost: Clear MMU irqs before handling the fault drm/panfrost: Don't try to map pages that are already mapped drm/radeon: fix copy of uninitialized variable back to userspace drm/dp_mst: Revise broadcast msg lct & lcr drm/dp_mst: Set CLEAR_PAYLOAD_ID_TABLE as broadcast drm: bridge/panel: Cleanup connector on bridge detach drm/amd/display: Reject non-zero src_y and src_x for video planes drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2 ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries ALSA: hda/realtek: Re-order ALC269 HP quirk table entries ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries ALSA: hda/realtek: Re-order ALC662 quirk table entries ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices ALSA: hda/realtek: ALC285 Thinkpad jack pin quirk is unreachable ALSA: hda/realtek: Fix speaker amp on HP Envy AiO 32 KVM: s390: VSIE: correctly handle MVPG when in VSIE KVM: s390: split kvm_s390_logical_to_effective KVM: s390: fix guarded storage control register handling s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility KVM: s390: VSIE: fix MVPG handling for prefixing and MSO KVM: s390: split kvm_s390_real_to_abs KVM: s390: extend kvm_s390_shadow_fault to return entry pointer KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit KVM: x86: Remove emulator's broken checks on CR0/CR3/CR4 loads KVM: nSVM: Set the shadow root level to the TDP level for nested NPT KVM: SVM: Don't strip the C-bit from CR2 on #PF interception KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupported KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch KVM: nVMX: Truncate bits 63:32 of VMCS field on nested check in !64-bit KVM: nVMX: Truncate base/index GPR value on address calc in !64-bit KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read KVM: Destroy I/O bus devices on unregister failure _after_ sync'ing SRCU KVM: Stop looking for coalesced MMIO zones if the bus is destroyed KVM: arm64: Fully zero the vcpu state on reset KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit" Revert "i3c master: fix missing destroy_workqueue() on error in i3c_master_register" ovl: fix missing revert_creds() on error path Revert "drm/qxl: do not run release if qxl failed to init" usb: gadget: pch_udc: Revert `d3cb25a121` completely Revert "tools/power turbostat: adjust for temperature offset" firmware: xilinx: Fix dereferencing freed memory firmware: xilinx: Add a blank line after function declaration firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE) fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER crypto: sun8i-ss - fix result memory leak on error path memory: gpmc: fix out of bounds read and dereference on gpmc_cs[] ARM: dts: exynos: correct fuel gauge interrupt trigger level on GT-I9100 ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250 ARM: dts: exynos: correct PMIC interrupt trigger level on Snow ARM: dts: s5pv210: correct fuel gauge interrupt trigger level on Fascinate family ARM: dts: renesas: Add mmc aliases into R-Car Gen2 board dts files arm64: dts: renesas: Add mmc aliases into board dts files x86/platform/uv: Set section block size for hubless architectures serial: stm32: fix code cleaning warnings and checks serial: stm32: add "_usart" prefix in functions name serial: stm32: fix probe and remove order for dma serial: stm32: Use of_device_get_match_data() serial: stm32: fix startup by enabling usart for reception serial: stm32: fix incorrect characters on console serial: stm32: fix TX and RX FIFO thresholds serial: stm32: fix a deadlock condition with wakeup event serial: stm32: fix wake-up flag handling serial: stm32: fix a deadlock in set_termios serial: stm32: fix tx dma completion, release channel serial: stm32: call stm32_transmit_chars locked serial: stm32: fix FIFO flush in startup and set_termios serial: stm32: add FIFO flush when port is closed serial: stm32: fix tx_empty condition usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS usb: typec: tps6598x: Fix return value check in tps6598x_probe() usb: typec: stusb160x: fix return value check in stusb160x_probe() regmap: set debugfs_name to NULL after it is freed spi: rockchip: avoid objtool warning mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe() mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions() mtd: rawnand: qcom: Return actual error code instead of -ENODEV mtd: don't lock when recursively deleting partitions mtd: maps: fix error return code of physmap_flash_remove() ARM: dts: stm32: fix usart 2 & 3 pinconf to wake up with flow control arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer arm64: dts: qcom: sdm845: fix number of pins in 'gpio-ranges' arm64: dts: qcom: sm8150: fix number of pins in 'gpio-ranges' arm64: dts: qcom: sm8250: fix number of pins in 'gpio-ranges' arm64: dts: qcom: db845c: fix correct powerdown pin for WSA881x crypto: sun8i-ss - Fix memory leak of object d when dma_iv fails to map spi: stm32: drop devres version of spi_register_master regulator: bd9576: Fix return from bd957x_probe() arm64: dts: renesas: r8a77980: Fix vin4-7 endpoint binding spi: stm32: Fix use-after-free on unbind x86/microcode: Check for offline CPUs before requesting new microcode devtmpfs: fix placement of complete() call usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits() usb: gadget: pch_udc: Check if driver is present before calling ->setup() usb: gadget: pch_udc: Check for DMA mapping error usb: gadget: pch_udc: Initialize device pointer before use usb: gadget: pch_udc: Provide a GPIO line used on Intel Minnowboard (v1) crypto: ccp - fix command queuing to TEE ring buffer crypto: qat - don't release uninitialized resources crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init fotg210-udc: Fix DMA on EP0 for length > max packet size fotg210-udc: Fix EP0 IN requests bigger than two packets fotg210-udc: Remove a dubious condition leading to fotg210_done fotg210-udc: Mask GRP2 interrupts we don't handle fotg210-udc: Don't DMA more than the buffer can take fotg210-udc: Complete OUT requests on short packets usb: gadget: s3c: Fix incorrect resources releasing usb: gadget: s3c: Fix the error handling path in 's3c2410_udc_probe()' dt-bindings: serial: stm32: Use 'type: object' instead of false for 'additionalProperties' mtd: require write permissions for locking and badblock ioctls arm64: dts: renesas: r8a779a0: Fix PMU interrupt bus: qcom: Put child node before return soundwire: bus: Fix device found flag correctly phy: ti: j721e-wiz: Delete "clk_div_sel" clk provider during cleanup phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally arm64: dts: mediatek: fix reset GPIO level on pumpkin NFSD: Fix sparse warning in nfs4proc.c NFSv4.2: fix copy stateid copying for the async copy crypto: poly1305 - fix poly1305_core_setkey() declaration crypto: qat - fix error path in adf_isr_resource_alloc() usb: gadget: aspeed: fix dma map failure USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR() drivers: nvmem: Fix voltage settings for QTI qfprom-efuse driver core: platform: Declare early_platform_cleanup() prototype memory: pl353: fix mask of ECC page_size config register soundwire: stream: fix memory leak in stream config error path m68k: mvme147,mvme16x: Don't wipe PCC timer config bits firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool firmware: qcom_scm: Reduce locking section for __get_convention() firmware: qcom_scm: Workaround lack of "is available" call on SC7180 iio: adc: Kconfig: make AD9467 depend on ADI_AXI_ADC symbol mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init irqchip/gic-v3: Fix OF_BAD_ADDR error handling staging: comedi: tests: ni_routes_test: Fix compilation error staging: rtl8192u: Fix potential infinite loop staging: fwserial: fix TIOCSSERIAL implementation staging: fwserial: fix TIOCGSERIAL implementation staging: greybus: uart: fix unprivileged TIOCCSERIAL soc: qcom: pdr: Fix error return code in pdr_register_listener PM / devfreq: Use more accurate returned new_freq as resume_freq clocksource/drivers/timer-ti-dm: Fix posted mode status check order clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe() spi: Fix use-after-free with devm_spi_alloc_* spi: fsl: add missing iounmap() on error in of_fsl_spi_probe() soc: qcom: mdt_loader: Validate that p_filesz < p_memsz soc: qcom: mdt_loader: Detect truncated read of segments PM: runtime: Replace inline function pm_runtime_callbacks_present() cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration ACPI: CPPC: Replace cppc_attr with kobj_attribute crypto: allwinner - add missing CRYPTO_ prefix crypto: sun8i-ss - Fix memory leak of pad crypto: sa2ul - Fix memory leak of rxd crypto: qat - Fix a double free in adf_create_ring cpufreq: armada-37xx: Fix setting TBG parent for load levels clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock cpufreq: armada-37xx: Fix the AVS value for load L1 clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0 cpufreq: armada-37xx: Fix driver cleanup when registration failed cpufreq: armada-37xx: Fix determining base CPU frequency spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible spi: spi-zynqmp-gqspi: add mutex locking for exec_op spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op spi: fsl-lpspi: Fix PM reference leak in lpspi_prepare_xfer_hardware() usb: gadget: r8a66597: Add missing null check on return from platform_get_resource USB: cdc-acm: fix unprivileged TIOCCSERIAL USB: cdc-acm: fix TIOCGSERIAL implementation tty: actually undefine superseded ASYNC flags tty: fix return value for unsupported ioctls tty: Remove dead termiox code tty: fix return value for unsupported termiox ioctls serial: core: return early on unsupported ioctls firmware: qcom-scm: Fix QCOM_SCM configuration node: fix device cleanups in error handling code crypto: chelsio - Read rxchannel-id from firmware usbip: vudc: fix missing unlock on error in usbip_sockfd_store() m68k: Add missing mmap_read_lock() to sys_cacheflush() spi: spi-zynqmp-gqspi: Fix missing unlock on error in zynqmp_qspi_exec_op() memory: renesas-rpc-if: fix possible NULL pointer dereference of resource memory: samsung: exynos5422-dmc: handle clk_set_parent() failure security: keys: trusted: fix TPM2 authorizations platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table ARM: dts: aspeed: Rainier: Fix humidity sensor bus address Drivers: hv: vmbus: Use after free in __vmbus_open() spi: spi-zynqmp-gqspi: fix clk_enable/disable imbalance issue spi: spi-zynqmp-gqspi: fix hang issue when suspend/resume spi: spi-zynqmp-gqspi: fix use-after-free in zynqmp_qspi_exec_op spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails x86/platform/uv: Fix !KEXEC build failure hwmon: (pmbus/pxe1610) don't bail out when not all pages are active Drivers: hv: vmbus: Increase wait time for VMbus unload PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check usb: dwc2: Fix host mode hibernation exit with remote wakeup flow. usb: dwc2: Fix hibernation between host and device modes. ttyprintk: Add TTY hangup callback. serial: omap: don't disable rs485 if rts gpio is missing serial: omap: fix rs485 half-duplex filtering xen-blkback: fix compatibility bug with single page rings soc: aspeed: fix a ternary sign expansion bug drm/tilcdc: send vblank event when disabling crtc drm/stm: Fix bus_flags handling drm/amd/display: Fix off by one in hdmi_14_process_transaction() drm/mcde/panel: Inverse misunderstood flag sched/fair: Fix shift-out-of-bounds in load_balance() afs: Fix updating of i_mode due to 3rd party change rcu: Remove spurious instrumentation_end() in rcu_nmi_enter() media: vivid: fix assignment of dev->fbuf_out_flags media: saa7134: use sg_dma_len when building pgtable media: saa7146: use sg_dma_len when building pgtable media: omap4iss: return error code when omap4iss_get() failed media: rkisp1: rsz: crash fix when setting src format media: aspeed: fix clock handling logic drm/probe-helper: Check epoch counter in output_poll_execute() media: venus: core: Fix some resource leaks in the error path of 'venus_probe()' media: platform: sunxi: sun6i-csi: fix error return code of sun6i_video_start_streaming() media: m88ds3103: fix return value check in m88ds3103_probe() media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30 media: [next] staging: media: atomisp: fix memory leak of object flash media: atomisp: Fixed error handling path media: m88rs6000t: avoid potential out-of-bounds reads on arrays media: atomisp: Fix use after free in atomisp_alloc_css_stat_bufs() drm/amdkfd: fix build error with AMD_IOMMU_V2=m of: overlay: fix for_each_child.cocci warnings x86/kprobes: Fix to check non boostable prefixes correctly selftests: fix prepending $(OUTPUT) to $(TEST_PROGS) pata_arasan_cf: fix IRQ check pata_ipx4xx_cf: fix IRQ check sata_mv: add IRQ checks ata: libahci_platform: fix IRQ check seccomp: Fix CONFIG tests for Seccomp_filters nvme-tcp: block BH in sk state_change sk callback nvmet-tcp: fix incorrect locking in state_change sk callback clk: imx: Fix reparenting of UARTs not associated with stdout power: supply: bq25980: Move props from battery node nvme: retrigger ANA log update if group descriptor isn't found media: i2c: imx219: Move out locking/unlocking of vflip and hflip controls from imx219_set_stream media: i2c: imx219: Balance runtime PM use-count media: v4l2-ctrls.c: fix race condition in hdl->requests list vfio/fsl-mc: Re-order vfio_fsl_mc_probe() vfio/pci: Move VGA and VF initialization to functions vfio/pci: Re-order vfio_pci_probe() vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable drm: xlnx: zynqmp: fix a memset in zynqmp_dp_train() clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct drm/radeon: Fix a missing check bug in radeon_dp_mst_detect() clk: uniphier: Fix potential infinite loop scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check() scsi: pm80xx: Fix potential infinite loop scsi: ufs: ufshcd-pltfrm: Fix deferred probing scsi: hisi_sas: Fix IRQ checks scsi: jazz_esp: Add IRQ check scsi: sun3x_esp: Add IRQ check scsi: sni_53c710: Add IRQ check scsi: ibmvfc: Fix invalid state machine BUG_ON() mailbox: sprd: Introduce refcnt when clients requests/free channels mfd: stm32-timers: Avoid clearing auto reload register nvmet-tcp: fix a segmentation fault during io parsing error nvme-pci: don't simple map sgl when sgls are disabled media: cedrus: Fix H265 status definitions HSI: core: fix resource leaks in hsi_add_client_from_dt() x86/events/amd/iommu: Fix sysfs type mismatch perf/amd/uncore: Fix sysfs type mismatch io_uring: fix overflows checks in provide buffers sched/debug: Fix cgroup_path[] serialization drivers/block/null_blk/main: Fix a double free in null_init. xsk: Respect device's headroom and tailroom on generic xmit path HID: plantronics: Workaround for double volume key presses perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars ASoC: Intel: boards: sof-wm8804: add check for PLL setting ASoC: Intel: Skylake: Compile when any configuration is selected RDMA/mlx5: Fix mlx5 rates to IB rates map wilc1000: write value to WILC_INTR2_ENABLE register KVM: x86/mmu: Retry page faults that hit an invalid memslot Bluetooth: avoid deadlock between hci_dev->lock and socket lock net: lapbether: Prevent racing when checking whether the netif is running libbpf: Add explicit padding to bpf_xdp_set_link_opts bpftool: Fix maybe-uninitialized warnings iommu: Check dev->iommu in iommu_dev_xxx functions iommu/vt-d: Reject unsupported page request modes selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changed libbpf: Add explicit padding to btf_dump_emit_type_decl_opts powerpc/fadump: Mark fadump_calculate_reserve_size as __init powerpc/prom: Mark identical_pvr_fixup as __init MIPS: fix local_irq_{disable,enable} in asmmacro.h ima: Fix the error code for restoring the PCR value inet: use bigger hash table for IP ID generation pinctrl: pinctrl-single: remove unused parameter pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is not zero MIPS: loongson64: fix bug when PAGE_SIZE > 16KB ASoC: wm8960: Remove bitclk relax condition in wm8960_configure_sysclk iommu/arm-smmu-v3: add bit field SFM into GERROR_ERR_MASK RDMA/mlx5: Fix drop packet rule in egress table IB/isert: Fix a use after free in isert_connect_request powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIP ALSA: core: remove redundant spin_lock pair in snd_card_disconnect net: phy: lan87xx: fix access to wrong register of LAN87xx udp: never accept GSO_FRAGLIST packets powerpc/pseries: Only register vio drivers if vio bus exists net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start() bug: Remove redundant condition check in report_bug RDMA/core: Fix corrupted SL on passive side nfc: pn533: prevent potential memory corruption net: hns3: Limiting the scope of vector_ring_chain variable mips: bmips: fix syscon-reboot nodes iommu/vt-d: Don't set then clear private data in prq_event_thread() iommu: Fix a boundary issue to avoid performance drop iommu/vt-d: Report right snoop capability when using FL for IOVA iommu/vt-d: Report the right page fault address iommu/vt-d: Preset Access/Dirty bits for IOVA over FL iommu/vt-d: Remove WO permissions on second-level paging entries iommu/vt-d: Invalidate PASID cache when root/context entry changed ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls HID: lenovo: Use brightness_set_blocking callback for setting LEDs brightness HID: lenovo: Fix lenovo_led_set_tp10ubkbd() error handling HID: lenovo: Check hid_get_drvdata() returns non NULL in lenovo_event() HID: lenovo: Map mic-mute button to KEY_F20 instead of KEY_MICMUTE KVM: arm64: Initialize VCPU mdcr_el2 before loading it ASoC: simple-card: fix possible uninitialized single_cpu local variable liquidio: Fix unintented sign extension of a left shift of a u16 IB/hfi1: Use kzalloc() for mmu_rb_handler allocation powerpc/64s: Fix pte update for kernel memory on radix powerpc/perf: Fix PMU constraint check for EBB events powerpc: iommu: fix build when neither PCI or IBMVIO is set mac80211: bail out if cipher schemes are invalid perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric xfs: fix return of uninitialized value in variable error rtw88: Fix an error code in rtw_debugfs_set_rsvd_page() mt7601u: fix always true expression mt76: mt7615: fix tx skb dma unmap mt76: mt7915: fix tx skb dma unmap mt76: mt7915: fix aggr len debugfs node mt76: mt7615: fix mib stats counter reporting to mac80211 mt76: mt7915: fix mib stats counter reporting to mac80211 mt76: mt7663s: make all of packets 4-bytes aligned in sdio tx aggregation mt76: mt7663s: fix the possible device hang in high traffic KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit ovl: invalidate readdir cache on changes to dir with origin RDMA/qedr: Fix error return code in qedr_iw_connect() IB/hfi1: Fix error return code in parse_platform_config() RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal() cxgb4: Fix unintentional sign extension issues net: thunderx: Fix unintentional sign extension issue RDMA/srpt: Fix error return code in srpt_cm_req_recv() RDMA/rtrs-clt: destroy sysfs after removing session from active list i2c: cadence: fix reference leak when pm_runtime_get_sync fails i2c: img-scb: fix reference leak when pm_runtime_get_sync fails i2c: imx-lpi2c: fix reference leak when pm_runtime_get_sync fails i2c: imx: fix reference leak when pm_runtime_get_sync fails i2c: omap: fix reference leak when pm_runtime_get_sync fails i2c: sprd: fix reference leak when pm_runtime_get_sync fails i2c: stm32f7: fix reference leak when pm_runtime_get_sync fails i2c: xiic: fix reference leak when pm_runtime_get_sync fails i2c: cadence: add IRQ check i2c: emev2: add IRQ check i2c: jz4780: add IRQ check i2c: mlxbf: add IRQ check i2c: rcar: make sure irq is not threaded on Gen2 and earlier i2c: rcar: protect against supurious interrupts on V3U i2c: rcar: add IRQ check i2c: sh7760: add IRQ check powerpc/xive: Drop check on irq_data in xive_core_debug_show() powerpc/xive: Fix xmon command "dxi" ASoC: ak5558: correct reset polarity net/mlx5: Fix bit-wise and with zero net/packet: make packet_fanout.arr size configurable up to 64K net/packet: remove data races in fanout operations drm/i915/gvt: Fix error code in intel_gvt_init_device() iommu/amd: Put newline after closing bracket in warning perf beauty: Fix fsconfig generator drm/amd/pm: fix error code in smu_set_power_limit() MIPS: pci-legacy: stop using of_pci_range_to_resource powerpc/pseries: extract host bridge from pci_bus prior to bus removal powerpc/smp: Reintroduce cpu_core_mask KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid rtlwifi: 8821ae: upgrade PHY and RF parameters wlcore: fix overlapping snprintf arguments in debugfs i2c: sh7760: fix IRQ error path i2c: mediatek: Fix wrong dma sync flag mwl8k: Fix a double Free in mwl8k_probe_hw netfilter: nft_payload: fix C-VLAN offload support netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector netfilter: nftables_offload: special ethertype handling for VLAN vsock/vmci: log once the failed queue pair allocation libbpf: Initialize the bpf_seq_printf parameters array field by field net: ethernet: ixp4xx: Set the DMA masks explicitly gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check RDMA/cxgb4: add missing qpid increment RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails sfc: ef10: fix TX queue lookup in TX event handling vsock/virtio: free queued packets when closing socket net: marvell: prestera: fix port event handling on init net: davinci_emac: Fix incorrect masking of tx and rx error channel mt76: mt7615: fix memleak when mt7615_unregister_device() crypto: ccp: Detect and reject "invalid" addresses destined for PSP nfp: devlink: initialize the devlink port attribute "lanes" net: stmmac: fix TSO and TBS feature enabling during driver open net: renesas: ravb: Fix a stuck issue when a lot of frames are received net: phy: intel-xway: enable integrated led functions RDMA/rxe: Fix a bug in rxe_fill_ip_info() RDMA/core: Add CM to restrack after successful attachment to a device powerpc/64: Fix the definition of the fixmap area ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices ath10k: Fix a use after free in ath10k_htc_send_bundle ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock wlcore: Fix buffer overrun by snprintf due to incorrect buffer size powerpc/perf: Fix the threshold event selection for memory events in power10 powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add') net: phy: marvell: fix m88e1011_set_downshift net: phy: marvell: fix m88e1111_set_downshift net: enetc: fix link error again bnxt_en: fix ternary sign extension bug in bnxt_show_temp() ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry static selftests: mlxsw: Remove a redundant if statement in tc_flower_scale test bnxt_en: Fix RX consumer index logic in the error path. KVM: VMX: Intercept FS/GS_BASE MSR accesses for 32-bit KVM net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro selftests/bpf: Fix field existence CO-RE reloc tests selftests/bpf: Fix core_reloc test runner bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds RDMA/siw: Fix a use after free in siw_alloc_mr RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res net: bridge: mcast: fix broken length + header check for MRDv6 Adv. net:nfc:digital: Fix a double free in digital_tg_recv_dep_req perf tools: Change fields type in perf_record_time_conv perf jit: Let convert_timestamp() to be backwards-compatible perf session: Add swap operation for event TIME_CONV ia64: fix EFI_DEBUG build kfifo: fix ternary sign extension bugs mm/sl?b.c: remove ctor argument from kmem_cache_flags mm: memcontrol: slab: fix obtain a reference to a freeing memcg mm/sparse: add the missing sparse_buffer_fini() in error branch mm/memory-failure: unnecessary amount of unmapping afs: Fix speculative status fetches bpf: Fix alu32 const subreg bound tracking on bitwise operations bpf, ringbuf: Deny reserve of buffers larger than ringbuf bpf: Prevent writable memory-mapping of read-only ringbuf pages arm64: Remove arm64_dma32_phys_limit and its uses net: Only allow init netns to set default tcp cong to a restricted algo smp: Fix smp_call_function_single_async prototype Revert "net/sctp: fix race condition in sctp_destroy_sock" sctp: delay auto_asconf init until binding the first addr Linux 5.10.37 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I5bee89c285d9dd72de967b0e70d96951ae4e06ae	2021-05-15 09:28:55 +02:00
Nikolay Borisov	2e95bc6cfe	mm/sl?b.c: remove ctor argument from kmem_cache_flags [ Upstream commit 3754000872188e3e4713d9d847fe3c615a47c220 ] This argument hasn't been used since `e153362a50` ("slub: Remove objsize check in kmem_cache_flags()") so simply remove it. Link: https://lkml.kernel.org/r/20210126095733.974665-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-14 09:50:45 +02:00
Andrey Konovalov	24690d7d25	FROMGIT: kasan, mm: integrate slab init_on_free with HW_TAGS This change uses the previously added memory initialization feature of HW_TAGS KASAN routines for slab memory when init_on_free is enabled. With this change, memory initialization memset() is no longer called when both HW_TAGS KASAN and init_on_free are enabled. Instead, memory is initialized in KASAN runtime. For SLUB, the memory initialization memset() is moved into slab_free_hook() that currently directly follows the initialization loop. A new argument is added to slab_free_hook() that indicates whether to initialize the memory or not. To avoid discrepancies with which memory gets initialized that can be caused by future changes, both KASAN hook and initialization memset() are put together and a warning comment is added. Combining setting allocation tags with memory initialization improves HW_TAGS KASAN performance when init_on_free is enabled. Link: https://lkml.kernel.org/r/190fd15c1886654afdec0d19ebebd5ade665b601.1615296150.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit 6b548c253039de9a1658bb4c38e13e963f06489d https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Bug: 182930667 Signed-off-by: Alexander Potapenko <glider@google.com> Change-Id: I3bdfe966b27dc93964ad38c9a8385ca744932307	2021-03-24 15:09:18 -07:00
Andrey Konovalov	5a7af11e34	FROMGIT: kasan, mm: integrate slab init_on_alloc with HW_TAGS This change uses the previously added memory initialization feature of HW_TAGS KASAN routines for slab memory when init_on_alloc is enabled. With this change, memory initialization memset() is no longer called when both HW_TAGS KASAN and init_on_alloc are enabled. Instead, memory is initialized in KASAN runtime. The memory initialization memset() is moved into slab_post_alloc_hook() that currently directly follows the initialization loop. A new argument is added to slab_post_alloc_hook() that indicates whether to initialize the memory or not. To avoid discrepancies with which memory gets initialized that can be caused by future changes, both KASAN hook and initialization memset() are put together and a warning comment is added. Combining setting allocation tags with memory initialization improves HW_TAGS KASAN performance when init_on_alloc is enabled. Link: https://lkml.kernel.org/r/c1292aeb5d519da221ec74a0684a949b027d7720.1615296150.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit c7948d4407ed85251c6de1a09589e69e4072abb4 https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Bug: 182930667 Signed-off-by: Alexander Potapenko <glider@google.com> Change-Id: I257917062e2cc5bfb3dbb46508200c30631a00a3	2021-03-24 15:09:18 -07:00
Mike Rapoport	9538c5a8c5	FROMGIT: mm: introduce debug_pagealloc_{map,unmap}_pages() helpers Patch series "arch, mm: improve robustness of direct map manipulation", v7. During recent discussion about KVM protected memory, David raised a concern about usage of __kernel_map_pages() outside of DEBUG_PAGEALLOC scope [1]. Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP it is possible that __kernel_map_pages() would fail, but since this function is void, the failure will go unnoticed. Moreover, there's lack of consistency of __kernel_map_pages() semantics across architectures as some guard this function with #ifdef DEBUG_PAGEALLOC, some refuse to update the direct map if page allocation debugging is disabled at run time and some allow modifying the direct map regardless of DEBUG_PAGEALLOC settings. This set straightens this out by restoring dependency of __kernel_map_pages() on DEBUG_PAGEALLOC and updating the call sites accordingly. Since currently the only user of __kernel_map_pages() outside DEBUG_PAGEALLOC is hibernation, it is updated to make direct map accesses there more explicit. [1] https://lore.kernel.org/lkml/2759b4bf-e1e3-d006-7d86-78a40348269d@redhat.com This patch (of 4): When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the kernel direct mapping after free_pages(). The pages than need to be mapped back before they could be used. Theese mapping operations use __kernel_map_pages() guarded with with debug_pagealloc_enabled(). The only place that calls __kernel_map_pages() without checking whether DEBUG_PAGEALLOC is enabled is the hibernation code that presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP is set. Still, on arm64, __kernel_map_pages() will bail out when DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush() may render some pages not present in the direct map and hibernation code won't be able to save such pages. To make page allocation debugging and hibernation interaction more robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP has to be made more explicit. Start with combining the guard condition and the call to __kernel_map_pages() into debug_pagealloc_map_pages() and debug_pagealloc_unmap_pages() functions to emphasize that __kernel_map_pages() should not be called without DEBUG_PAGEALLOC and use these new functions to map/unmap pages when page allocation debugging is enabled. Link: https://lkml.kernel.org/r/20201109192128.960-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20201109192128.960-2-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Len Brown <len.brown@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 77bc7fd607dee2ffb28daff6d0dd8ae42af61ea8 https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Bug: 182930667 Signed-off-by: Alexander Potapenko <glider@google.com> Change-Id: I9f0dac574bc3a7ea7d88bff051b77eca19610ce9	2021-03-24 15:09:17 -07:00
Andrey Konovalov	3cd65f50cd	FROMGIT: kasan: move _RET_IP_ to inline wrappers Generic mm functions that call KASAN annotations that might report a bug pass _RET_IP_ to them as an argument. This allows KASAN to include the name of the function that called the mm function in its report's header. Now that KASAN has inline wrappers for all of its annotations, move _RET_IP_ to those wrappers to simplify annotation call sites. Link: https://linux-review.googlesource.com/id/I8fb3c06d49671305ee184175a39591bc26647a67 Link: https://lkml.kernel.org/r/5c1490eddf20b436b8c4eeea83fce47687d5e4a4.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit 94e23417b8f73b5749495ce986b14cd5c4d996fb https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm) Bug: 172318110 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Change-Id: If1066e1aade848c3eb29476b3140919698f2a475	2021-02-07 13:41:42 -08:00
Alexander Potapenko	6ba57f3a0c	BACKPORT: mm, kfence: insert KFENCE hooks for SLAB Inserts KFENCE hooks into the SLAB allocator. To pass the originally requested size to KFENCE, add an argument 'orig_size' to slab_alloc(). The additional argument is required to preserve the requested original size for kmalloc() allocations, which uses size classes (e.g. an allocation of 272 bytes will return an object of size 512). Therefore, kmem_cache::size does not represent the kmalloc-caller's requested size, and we must introduce the argument 'orig_size' to propagate the originally requested size to KFENCE. Without the originally requested size, we would not be able to detect out-of-bounds accesses for objects placed at the end of a KFENCE object page if that object is not equal to the kmalloc-size class it was bucketed into. When KFENCE is disabled, there is no additional overhead, since slab_alloc() functions are __always_inline. Link: https://lkml.kernel.org/r/20201103175841.3495947-5-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Co-developed-by: Marco Elver <elver@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joern Engel <joern@purestorage.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sjpark@amazon.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [glider: resolved minor API change in mm/slab_common.c] Bug: 177201466 (cherry picked from commit 840c0553e89413319d67971a321bcc07114da9b8 https://github.com/hnaz/linux-mm v5.11-rc4-mmots-2021-01-21-20-10) Test: CONFIG_KFENCE_KUNIT_TEST=y passes on Cuttlefish Signed-off-by: Alexander Potapenko <glider@google.com> Change-Id: Iab2ba9c7b06b9a234d93ba892be639941861f8ab	2021-02-05 09:20:53 -08:00
Alexander Popov	1e95fcd132	FROMGIT: mm/slab: rerform init_on_free earlier Currently in CONFIG_SLAB init_on_free happens too late, and heap objects go to the heap quarantine not being erased. Lets move init_on_free clearing before calling kasan_slab_free(). In that case heap quarantine will store erased objects, similarly to CONFIG_SLUB=y behavior. Link: https://lkml.kernel.org/r/20201210183729.1261524-1-alex.popov@linux.com Signed-off-by: Alexander Popov <alex.popov@linux.com> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 177201466 (cherry picked from commit a32d654db543843a5ffb248feaec1a909718addd https://github.com/hnaz/linux-mm v5.11-rc4-mmots-2021-01-21-20-10) Test: CONFIG_KFENCE_KUNIT_TEST=y passes on Cuttlefish Signed-off-by: Alexander Potapenko <glider@google.com> Change-Id: I2bf5e70c1524619526efd792bbdd959b813af1e4	2021-02-05 09:20:53 -08:00
Vijayanand Jitta	df2e575fcc	ANDROID: mm: Export get_slabinfo Export get_slabinfo symbol for loadable vendor modules. Bug: 176277895 Change-Id: I01870a370da9bf5db842ff14801d94ef79350560 Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>	2021-01-06 16:16:27 +00:00
Chen Tao	70b6d25ec5	mm: fix some comments formatting Correct one function name "get_partials" with "get_partial". Update the old struct name of list3 with kmem_cache_node. Signed-off-by: Chen Tao <chentao3@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-16 11:11:19 -07:00
Bharata B Rao	d1b2cf6cb8	mm: memcg/slab: uncharge during kmem_cache_free_bulk() Object cgroup charging is done for all the objects during allocation, but during freeing, uncharging ends up happening for only one object in the case of bulk allocation/freeing. Fix this by having a separate call to uncharge all the objects from kmem_cache_free_bulk() and by modifying memcg_slab_free_hook() to take care of bulk uncharging. Fixes: `964d4bd370` ("mm: memcg/slab: save obj_cgroup for non-root slab objects" Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Roman Gushchin <guro@fb.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201009060423.390479-1-bharata@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-13 18:38:31 -07:00
Mateusz Nosek	c1ff3f9549	mm/slab.c: clean code by removing redundant if condition The removed code was unnecessary and changed nothing in the flow, since in case of returning NULL by 'kmem_cache_alloc_node' returning 'freelist' from the function in question is the same as returning NULL. Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: https://lkml.kernel.org/r/20200915230329.13002-1-mateusznosek0@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-13 18:38:27 -07:00
Shakeel Butt	678ff6a7af	mm: slab: fix potential double free in ___cache_free With the commit `10befea91b` ("mm: memcg/slab: use a single set of kmem_caches for all allocations"), it becomes possible to call kfree() from the slabs_destroy(). The functions cache_flusharray() and do_drain() calls slabs_destroy() on array_cache of the local CPU without updating the size of the array_cache. This enables the kfree() call from the slabs_destroy() to recursively call cache_flusharray() which can potentially call free_block() on the same elements of the array_cache of the local CPU and causing double free and memory corruption. To fix the issue, simply update the local CPU array_cache cache before calling slabs_destroy(). Fixes: `10befea91b` ("mm: memcg/slab: use a single set of kmem_caches for all allocations") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reported-by: kernel test robot <rong.a.chen@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ted Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-26 10:15:01 -07:00
Roman Gushchin	74d555bed5	mm: slab: rename (un)charge_slab_page() to (un)account_slab_page() charge_slab_page() and uncharge_slab_page() are not related anymore to memcg charging and uncharging. In order to make their names less confusing, let's rename them to account_slab_page() and unaccount_slab_page() respectively. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/20200707173612.124425-2-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:25 -07:00
Roman Gushchin	849504809f	mm: memcg/slab: remove unused argument by charge_slab_page() charge_slab_page() is not using the gfp argument anymore, remove it. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Link: http://lkml.kernel.org/r/20200707173612.124425-1-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:25 -07:00
Roman Gushchin	10befea91b	mm: memcg/slab: use a single set of kmem_caches for all allocations Instead of having two sets of kmem_caches: one for system-wide and non-accounted allocations and the second one shared by all accounted allocations, we can use just one. The idea is simple: space for obj_cgroup metadata can be allocated on demand and filled only for accounted allocations. It allows to remove a bunch of code which is required to handle kmem_cache clones for accounted allocations. There is no more need to create them, accumulate statistics, propagate attributes, etc. It's a quite significant simplification. Also, because the total number of slab_caches is reduced almost twice (not all kmem_caches have a memcg clone), some additional memory savings are expected. On my devvm it additionally saves about 3.5% of slab memory. [guro@fb.com: fix build on MIPS] Link: http://lkml.kernel.org/r/20200717214810.3733082-1-guro@fb.com Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Link: http://lkml.kernel.org/r/20200623174037.3951353-18-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:25 -07:00
Roman Gushchin	c7094406fc	mm: memcg/slab: deprecate slab_root_caches Currently there are two lists of kmem_caches: 1) slab_caches, which contains all kmem_caches, 2) slab_root_caches, which contains only root kmem_caches. And there is some preprocessor magic to have a single list if CONFIG_MEMCG_KMEM isn't enabled. It was required earlier because the number of non-root kmem_caches was proportional to the number of memory cgroups and could reach really big values. Now, when it cannot exceed the number of root kmem_caches, there is really no reason to maintain two lists. We never iterate over the slab_root_caches list on any hot paths, so it's perfectly fine to iterate over slab_caches and filter out non-root kmem_caches. It allows to remove a lot of config-dependent code and two pointers from the kmem_cache structure. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20200623174037.3951353-16-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:25 -07:00
Roman Gushchin	9855609bde	mm: memcg/slab: use a single set of kmem_caches for all accounted allocations This is fairly big but mostly red patch, which makes all accounted slab allocations use a single set of kmem_caches instead of creating a separate set for each memory cgroup. Because the number of non-root kmem_caches is now capped by the number of root kmem_caches, there is no need to shrink or destroy them prematurely. They can be perfectly destroyed together with their root counterparts. This allows to dramatically simplify the management of non-root kmem_caches and delete a ton of code. This patch performs the following changes: 1) introduces memcg_params.memcg_cache pointer to represent the kmem_cache which will be used for all non-root allocations 2) reuses the existing memcg kmem_cache creation mechanism to create memcg kmem_cache on the first allocation attempt 3) memcg kmem_caches are named <kmemcache_name>-memcg, e.g. dentry-memcg 4) simplifies memcg_kmem_get_cache() to just return memcg kmem_cache or schedule it's creation and return the root cache 5) removes almost all non-root kmem_cache management code (separate refcounter, reparenting, shrinking, etc) 6) makes slab debugfs to display root_mem_cgroup css id and never show :dead and :deact flags in the memcg_slabinfo attribute. Following patches in the series will simplify the kmem_cache creation. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20200623174037.3951353-13-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:25 -07:00
Roman Gushchin	964d4bd370	mm: memcg/slab: save obj_cgroup for non-root slab objects Store the obj_cgroup pointer in the corresponding place of page->obj_cgroups for each allocated non-root slab object. Make sure that each allocated object holds a reference to obj_cgroup. Objcg pointer is obtained from the memcg->objcg dereferencing in memcg_kmem_get_cache() and passed from pre_alloc_hook to post_alloc_hook. Then in case of successful allocation(s) it's getting stored in the page->obj_cgroups vector. The objcg obtaining part look a bit bulky now, but it will be simplified by next commits in the series. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20200623174037.3951353-9-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:24 -07:00
Marco Elver	cfbe1636c3	mm, kcsan: instrument SLAB/SLUB free with "ASSERT_EXCLUSIVE_ACCESS" Provide the necessary KCSAN checks to assist with debugging racy use-after-frees. While KASAN is more reliable at generally catching such use-after-frees (due to its use of a quarantine), it can be difficult to debug racy use-after-frees. If a reliable reproducer exists, KCSAN can assist in debugging such issues. Note: ASSERT_EXCLUSIVE_ACCESS is a convenience wrapper if the size is simply sizeof(var). Instead, here we just use __kcsan_check_access() explicitly to pass the correct size. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/20200623072653.114563-1-elver@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:23 -07:00
Vlastimil Babka	e42f174e43	mm, slab/slub: improve error reporting and overhead of cache_from_obj() cache_from_obj() was added by commit `b9ce5ef49f` ("sl[au]b: always get the cache from its page in kmem_cache_free()") to support kmemcg, where per-memcg cache can be different from the root one, so we can't use the kmem_cache pointer given to kmem_cache_free(). Prior to that commit, SLUB already had debugging check+warning that could be enabled to compare the given kmem_cache pointer to one referenced by the slab page where the object-to-be-freed resides. This check was moved to cache_from_obj(). Later the check was also enabled for SLAB_FREELIST_HARDENED configs by commit `598a0717a8` ("mm/slab: validate cache membership under freelist hardening"). These checks and warnings can be useful especially for the debugging, which can be improved. Commit `598a0717a8` changed the pr_err() with WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported, others are silent. This patch changes it to WARN() so that all errors are reported. It's also useful to print SLUB allocation/free tracking info for the offending object, if tracking is enabled. Thus, export the SLUB print_tracking() function and provide an empty one for SLAB. For SLUB we can also benefit from the static key check in kmem_cache_debug_flags(), but we need to move this function to slab.h and declare the static key there. [1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com [vbabka@suse.cz: avoid bogus WARN()] Link: https://lore.kernel.org/r/20200623090213.GW5535@shao2-debian Link: http://lkml.kernel.org/r/b33e0fa7-cd28-4788-9e54-5927846329ef@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Roman Gushchin <guro@fb.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Garrett <mjg59@google.com> Cc: Jann Horn <jannh@google.com> Cc: Vijayanand Jitta <vjitta@codeaurora.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Link: http://lkml.kernel.org/r/afeda7ac-748b-33d8-a905-56b708148ad5@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:23 -07:00
Vlastimil Babka	d3c58f24be	mm, slab/slub: move and improve cache_from_obj() The function cache_from_obj() was added by commit `b9ce5ef49f` ("sl[au]b: always get the cache from its page in kmem_cache_free()") to support kmemcg, where per-memcg cache can be different from the root one, so we can't use the kmem_cache pointer given to kmem_cache_free(). Prior to that commit, SLUB already had debugging check+warning that could be enabled to compare the given kmem_cache pointer to one referenced by the slab page where the object-to-be-freed resides. This check was moved to cache_from_obj(). Later the check was also enabled for SLAB_FREELIST_HARDENED configs by commit `598a0717a8` ("mm/slab: validate cache membership under freelist hardening"). These checks and warnings can be useful especially for the debugging, which can be improved. Commit `598a0717a8` changed the pr_err() with WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported, others are silent. This patch changes it to WARN() so that all errors are reported. It's also useful to print SLUB allocation/free tracking info for the offending object, if tracking is enabled. We could export the SLUB print_tracking() function and provide an empty one for SLAB, or realize that both the debugging and hardening cases in cache_from_obj() are only supported by SLUB anyway. So this patch moves cache_from_obj() from slab.h to separate instances in slab.c and slub.c, where the SLAB version only does the kmemcg lookup and even could be completely removed once the kmemcg rework [1] is merged. The SLUB version can thus easily use the print_tracking() function. It can also use the kmem_cache_debug_flags() static key check for improved performance in kernels without the hardening and with debugging not enabled on boot. [1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Vijayanand Jitta <vjitta@codeaurora.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/20200610163135.17364-10-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:22 -07:00
Xiao Yang	221503e128	mm/slab.c: update outdated kmem_list3 in a comment kmem_list3 has been renamed to kmem_cache_node long long ago so update it. References: `6744f087ba` ("slab: Common name for the per node structures") `ce8eb6c424` ("slab: Rename list3/l3 to node") Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/20200722033355.26908-1-yangx.jy@cn.fujitsu.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:22 -07:00
Long Li	444050990d	mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order kmalloc cannot allocate memory from HIGHMEM. Allocating large amounts of memory currently bypasses the check and will simply leak the memory when page_address() returns NULL. To fix this, factor the GFP_SLAB_BUG_MASK check out of slab & slub, and call it from kmalloc_order() as well. In order to make the code clear, the warning message is put in one place. Signed-off-by: Long Li <lonuxli.64@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/20200704035027.GA62481@lilong Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:22 -07:00
Kees Cook	dabc3e291d	mm/slab: add naive detection of double free Similar to commit `ce6fa91b93` ("mm/slub.c: add a naive detection of double free or corruption"), add a very cheap double-free check for SLAB under CONFIG_SLAB_FREELIST_HARDENED. With this added, the "SLAB_FREE_DOUBLE" LKDTM test passes under SLAB: lkdtm: Performing direct entry SLAB_FREE_DOUBLE lkdtm: Attempting double slab free ... ------------[ cut here ]------------ WARNING: CPU: 2 PID: 2193 at mm/slab.c:757 ___cache _free+0x325/0x390 [keescook@chromium.org: fix misplaced __free_one()] Link: http://lkml.kernel.org/r/202006261306.0D82A2B@keescook Link: https://lore.kernel.org/lkml/7ff248c7-d447-340c-a8e2-8c02972aca70@infradead.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Randy Dunlap <rdunlap@infradead.org> [build tested] Cc: Roman Gushchin <guro@fb.com> Cc: Christoph Lameter <cl@linux.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: Matthew Garrett <mjg59@google.com> Cc: Jann Horn <jannh@google.com> Cc: Vijayanand Jitta <vjitta@codeaurora.org> Link: http://lkml.kernel.org/r/20200625215548.389774-3-keescook@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-07 11:33:22 -07:00
Joonsoo Kim	97a225e69a	mm/page_alloc: integrate classzone_idx and high_zoneidx classzone_idx is just different name for high_zoneidx now. So, integrate them and add some comment to struct alloc_context in order to reduce future confusion about the meaning of this variable. The accessor, ac_classzone_idx() is also removed since it isn't needed after integration. In addition to integration, this patch also renames high_zoneidx to highest_zoneidx since it represents more precise meaning. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/1587095923-7515-3-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-03 20:09:44 -07:00
Vlastimil Babka	8e57f8acbb	mm, debug_pagealloc: don't rely on static keys too early Commit `96a2b03f28` ("mm, debug_pagelloc: use static keys to enable debugging") has introduced a static key to reduce overhead when debug_pagealloc is compiled in but not enabled. It relied on the assumption that jump_label_init() is called before parse_early_param() as in start_kernel(), so when the "debug_pagealloc=on" option is parsed, it is safe to enable the static key. However, it turns out multiple architectures call parse_early_param() earlier from their setup_arch(). x86 also calls jump_label_init() even earlier, so no issue was found while testing the commit, but same is not true for e.g. ppc64 and s390 where the kernel would not boot with debug_pagealloc=on as found by our QA. To fix this without tricky changes to init code of multiple architectures, this patch partially reverts the static key conversion from `96a2b03f28`. Init-time and non-fastpath calls (such as in arch code) of debug_pagealloc_enabled() will again test a simple bool variable. Fastpath mm code is converted to a new debug_pagealloc_enabled_static() variant that relies on the static key, which is enabled in a well-defined point in mm_init() where it's guaranteed that jump_label_init() has been called, regardless of architecture. [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early] Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz Fixes: `96a2b03f28` ("mm, debug_pagelloc: use static keys to enable debugging") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-13 18:19:02 -08:00
Pengfei Li	dc0a7f7558	mm, slab: remove unused kmalloc_size() The size of kmalloc can be obtained from kmalloc_info[], so remove kmalloc_size() that will not be used anymore. Link: http://lkml.kernel.org/r/1569241648-26908-3-git-send-email-lpf.vector@gmail.com Signed-off-by: Pengfei Li <lpf.vector@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-12-01 06:29:17 -08:00
Pengfei Li	cb5d9fb38c	mm, slab: make kmalloc_info[] contain all types of names Patch series "mm, slab: Make kmalloc_info[] contain all types of names", v6. There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM and KMALLOC_DMA. The name of KMALLOC_NORMAL is contained in kmalloc_info[].name, but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically generated by kmalloc_cache_name(). Patch1 predefines the names of all types of kmalloc to save the time spent dynamically generating names. These changes make sense, and the time spent by new_kmalloc_cache() has been reduced by approximately 36.3%. Time spent by new_kmalloc_cache() (CPU cycles) 5.3-rc7 66264 5.3-rc7+patch 42188 This patch (of 3): There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM and KMALLOC_DMA. The name of KMALLOC_NORMAL is contained in kmalloc_info[].name, but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically generated by kmalloc_cache_name(). This patch predefines the names of all types of kmalloc to save the time spent dynamically generating names. Besides, remove the kmalloc_cache_name() that is no longer used. Link: http://lkml.kernel.org/r/1569241648-26908-2-git-send-email-lpf.vector@gmail.com Signed-off-by: Pengfei Li <lpf.vector@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-12-01 06:29:17 -08:00
Randy Dunlap	87bf4f71af	mm/slab.c: fix kernel-doc warning for __ksize() Fix kernel-doc warning in mm/slab.c: mm/slab.c:4215: warning: Function parameter or member 'objp' not described in '__ksize' Also add Return: documentation section for this function. Link: http://lkml.kernel.org/r/68c9fd7d-f09e-d376-e292-c7b2bdf1774d@infradead.org Fixes: `10d1f8cb39` ("mm/slab: refactor common ksize KASAN logic into slab_common.c") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-10-14 15:04:01 -07:00
Alexander Potapenko	6471384af2	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:46 -07:00
Roman Gushchin	6cea1d569d	mm: memcg/slab: unify SLAB and SLUB page accounting Currently the page accounting code is duplicated in SLAB and SLUB internals. Let's move it into new (un)charge_slab_page helpers in the slab_common.c file. These helpers will be responsible for statistics (global and memcg-aware) and memcg charging. So they are replacing direct memcg_(un)charge_slab() calls. Link: http://lkml.kernel.org/r/20190611231813.3148843-6-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Waiman Long <longman@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:44 -07:00
Roman Gushchin	4348669475	mm: memcg/slab: generalize postponed non-root kmem_cache deactivation Currently SLUB uses a work scheduled after an RCU grace period to deactivate a non-root kmem_cache. This mechanism can be reused for kmem_caches release, but requires generalization for SLAB case. Introduce kmemcg_cache_deactivate() function, which calls allocator-specific __kmem_cache_deactivate() and schedules execution of __kmem_cache_deactivate_after_rcu() with all necessary locks in a worker context after an rcu grace period. Here is the new calling scheme: kmemcg_cache_deactivate() __kmemcg_cache_deactivate() SLAB/SLUB-specific kmemcg_rcufn() rcu kmemcg_workfn() work __kmemcg_cache_deactivate_after_rcu() SLAB/SLUB-specific instead of: __kmemcg_cache_deactivate() SLAB/SLUB-specific slab_deactivate_memcg_cache_rcu_sched() SLUB-only kmemcg_rcufn() rcu kmemcg_workfn() work kmemcg_cache_deact_after_rcu() SLUB-only For consistency, all allocator-specific functions start with "__". Link: http://lkml.kernel.org/r/20190611231813.3148843-4-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Waiman Long <longman@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:44 -07:00
Roman Gushchin	c03914b7aa	mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache() Patch series "mm: reparent slab memory on cgroup removal", v7. # Why do we need this? We've noticed that the number of dying cgroups is steadily growing on most of our hosts in production. The following investigation revealed an issue in the userspace memory reclaim code [1], accounting of kernel stacks [2], and also the main reason: slab objects. The underlying problem is quite simple: any page charged to a cgroup holds a reference to it, so the cgroup can't be reclaimed unless all charged pages are gone. If a slab object is actively used by other cgroups, it won't be reclaimed, and will prevent the origin cgroup from being reclaimed. Slab objects, and first of all vfs cache, is shared between cgroups, which are using the same underlying fs, and what's even more important, it's shared between multiple generations of the same workload. So if something is running periodically every time in a new cgroup (like how systemd works), we do accumulate multiple dying cgroups. Strictly speaking pagecache isn't different here, but there is a key difference: we disable protection and apply some extra pressure on LRUs of dying cgroups, and these LRUs contain all charged pages. My experiments show that with the disabled kernel memory accounting the number of dying cgroups stabilizes at a relatively small number (~100, depends on memory pressure and cgroup creation rate), and with kernel memory accounting it grows pretty steadily up to several thousands. Memory cgroups are quite complex and big objects (mostly due to percpu stats), so it leads to noticeable memory losses. Memory occupied by dying cgroups is measured in hundreds of megabytes. I've even seen a host with more than 100Gb of memory wasted for dying cgroups. It leads to a degradation of performance with the uptime, and generally limits the usage of cgroups. My previous attempt [3] to fix the problem by applying extra pressure on slab shrinker lists caused a regressions with xfs and ext4, and has been reverted [4]. The following attempts to find the right balance [5, 6] were not successful. So instead of trying to find a maybe non-existing balance, let's do reparent accounted slab caches to the parent cgroup on cgroup removal. # Implementation approach There is however a significant problem with reparenting of slab memory: there is no list of charged pages. Some of them are in shrinker lists, but not all. Introducing of a new list is really not an option. But fortunately there is a way forward: every slab page has a stable pointer to the corresponding kmem_cache. So the idea is to reparent kmem_caches instead of slab pages. It's actually simpler and cheaper, but requires some underlying changes: 1) Make kmem_caches to hold a single reference to the memory cgroup, instead of a separate reference per every slab page. 2) Stop setting page->mem_cgroup pointer for memcg slab pages and use page->kmem_cache->memcg indirection instead. It's used only on slab page release, so performance overhead shouldn't be a big issue. 3) Introduce a refcounter for non-root slab caches. It's required to be able to destroy kmem_caches when they become empty and release the associated memory cgroup. There is a bonus: currently we release all memcg kmem_caches all together with the memory cgroup itself. This patchset allows individual kmem_caches to be released as soon as they become inactive and free. Some additional implementation details are provided in corresponding commit messages. # Results Below is the average number of dying cgroups on two groups of our production hosts. They do run some sort of web frontend workload, the memory pressure is moderate. As we can see, with the kernel memory reparenting the number stabilizes in 60s range; however with the original version it grows almost linearly and doesn't show any signs of plateauing. The difference in slab and percpu usage between patched and unpatched versions also grows linearly. In 7 days it exceeded 200Mb. day 0 1 2 3 4 5 6 7 original 56 362 628 752 1070 1250 1490 1560 patched 23 46 51 55 60 57 67 69 mem diff(Mb) 22 74 123 152 164 182 214 241 # Links [1]: commit `68600f623d` ("mm: don't miss the last page because of round-off error") [2]: commit `9b6f7e163c` ("mm: rework memcg kernel stack accounting") [3]: commit `172b06c32b` ("mm: slowly shrink slabs with a relatively small number of objects") [4]: commit `a9a238e83f` ("Revert "mm: slowly shrink slabs with a relatively small number of objects") [5]: https://lkml.org/lkml/2019/1/28/1865 [6]: https://marc.info/?l=linux-mm&m=155064763626437&w=2 This patch (of 10): Initialize kmem_cache->memcg_params.memcg pointer in memcg_link_cache() rather than in init_memcg_params(). Once kmem_cache will hold a reference to the memory cgroup, it will simplify the refcounting. For non-root kmem_caches memcg_link_cache() is always called before the kmem_cache becomes visible to a user, so it's safe. Link: http://lkml.kernel.org/r/20190611231813.3148843-2-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Waiman Long <longman@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:43 -07:00
Marco Elver	10d1f8cb39	mm/slab: refactor common ksize KASAN logic into slab_common.c This refactors common code of ksize() between the various allocators into slab_common.c: __ksize() is the allocator-specific implementation without instrumentation, whereas ksize() includes the required KASAN logic. Link: http://lkml.kernel.org/r/20190626142014.141844-5-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Acked-by: Christoph Lameter <cl@linux.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:42 -07:00
Kees Cook	a64b53780e	mm/slab: sanity-check page type when looking up cache This avoids any possible type confusion when looking up an object. For example, if a non-slab were to be passed to kfree(), the invalid slab_cache pointer (i.e. overlapped with some other value from the struct page union) would be used for subsequent slab manipulations that could lead to further memory corruption. Since the page is already in cache, adding the PageSlab() check will have nearly zero cost, so add a check and WARN() to virt_to_cache(). Additionally replaces an open-coded virt_to_cache(). To support the failure mode this also updates all callers of virt_to_cache() and cache_from_obj() to handle a NULL cache pointer return value (though note that several already handle this case gracefully). [dan.carpenter@oracle.com: restore IRQs in kfree()] Link: http://lkml.kernel.org/r/20190613065637.GE16334@mwanda Link: http://lkml.kernel.org/r/20190530045017.15252-3-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Alexander Potapenko <glider@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-07-12 11:05:41 -07:00
Qian Cai	7878c231da	slab: remove /proc/slab_allocators It turned out that DEBUG_SLAB_LEAK is still broken even after recent recue efforts that when there is a large number of objects like kmemleak_object which is normal on a debug kernel, # grep kmemleak /proc/slabinfo kmemleak_object 2243606 3436210 ... reading /proc/slab_allocators could easily loop forever while processing the kmemleak_object cache and any additional freeing or allocating objects will trigger a reprocessing. To make a situation worse, soft-lockups could easily happen in this sitatuion which will call printk() to allocate more kmemleak objects to guarantee an infinite loop. Also, since it seems no one had noticed when it was totally broken more than 2-year ago - see the commit `fcf88917dd` ("slab: fix a crash by reading /proc/slab_allocators"), probably nobody cares about it anymore due to the decline of the SLAB. Just remove it entirely. Suggested-by: Vlastimil Babka <vbabka@suse.cz> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-05-16 15:51:55 -07:00
Qian Cai	745e10146c	mm/slab.c: fix an infinite loop in leaks_show() "cat /proc/slab_allocators" could hang forever on SMP machines with kmemleak or object debugging enabled due to other CPUs running do_drain() will keep making kmemleak_object or debug_objects_cache dirty and unable to escape the first loop in leaks_show(), do { set_store_user_clean(cachep); drain_cpu_caches(cachep); ... } while (!is_store_user_clean(cachep)); For example, do_drain slabs_destroy slab_destroy kmem_cache_free __cache_free ___cache_free kmemleak_free_recursive delete_object_full __delete_object put_object free_object_rcu kmem_cache_free cache_free_debugcheck --> dirty kmemleak_object One approach is to check cachep->name and skip both kmemleak_object and debug_objects_cache in leaks_show(). The other is to set store_user_clean after drain_cpu_caches() which leaves a small window between drain_cpu_caches() and set_store_user_clean() where per-CPU caches could be dirty again lead to slightly wrong information has been stored but could also speed up things significantly which sounds like a good compromise. For example, # cat /proc/slab_allocators 0m42.778s # 1st approach 0m0.737s # 2nd approach [akpm@linux-foundation.org: tweak comment] Link: http://lkml.kernel.org/r/20190411032635.10325-1-cai@lca.pw Fixes: `d31676dfde` ("mm/slab: alternative implementation for DEBUG_SLAB_LEAK") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-05-14 09:47:45 -07:00
Li RongQing	517f9f1ee5	mm/slab.c: remove unneed check in cpuup_canceled nc is a member of percpu allocation memory, and cannot be NULL. Link: http://lkml.kernel.org/r/1553159353-5056-1-git-send-email-lirongqing@baidu.com Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-05-14 09:47:45 -07:00
Tobin C. Harding	16cb0ec75b	slab: use slab_list instead of lru Currently we use the page->lru list for maintaining lists of slabs. We have a list in the page structure (slab_list) that can be used for this purpose. Doing so makes the code cleaner since we are not overloading the lru list. Use the slab_list instead of the lru list for maintaining lists of slabs. Link: http://lkml.kernel.org/r/20190402230545.2929-7-tobin@kernel.org Signed-off-by: Tobin C. Harding <tobin@kernel.org> Acked-by: Christoph Lameter <cl@linux.com> Reviewed-by: Roman Gushchin <guro@fb.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-05-14 09:47:45 -07:00
Linus Torvalds	8f14772703	Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Ingo Molnar: "Here are the main changes in this tree: - Introduce x86-64 IRQ/exception/debug stack guard pages to detect stack overflows immediately and deterministically. - Clean up over a decade worth of cruft accumulated. The outcome of this should be more clear-cut faults/crashes when any of the low level x86 CPU stacks overflow, instead of silent memory corruption and sporadic failures much later on" * 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) x86/irq: Fix outdated comments x86/irq/64: Remove stack overflow debug code x86/irq/64: Remap the IRQ stack with guard pages x86/irq/64: Split the IRQ stack into its own pages x86/irq/64: Init hardirq_stack_ptr during CPU hotplug x86/irq/32: Handle irq stack allocation failure proper x86/irq/32: Invoke irq_ctx_init() from init_IRQ() x86/irq/64: Rename irq_stack_ptr to hardirq_stack_ptr x86/irq/32: Rename hard/softirq_stack to hard/softirq_stack_ptr x86/irq/32: Make irq stack a character array x86/irq/32: Define IRQ_STACK_SIZE x86/dumpstack/64: Speedup in_exception_stack() x86/exceptions: Split debug IST stack x86/exceptions: Enable IST guard pages x86/exceptions: Disconnect IST index and stack order x86/cpu: Remove orig_ist array x86/cpu: Prepare TSS.IST setup for guard pages x86/dumpstack/64: Use cpu_entry_area instead of orig_ist x86/irq/64: Use cpu entry area instead of orig_ist x86/traps: Use cpu_entry_area instead of orig_ist ...	2019-05-06 15:56:41 -07:00
Qian Cai	1a62b18d51	slab: store tagged freelist for off-slab slabmgmt Commit `51dedad06b` ("kasan, slab: make freelist stored without tags") calls kasan_reset_tag() for off-slab slab management object leading to freelist being stored non-tagged. However, cache_grow_begin() calls alloc_slabmgmt() which calls kmem_cache_alloc_node() assigns a tag for the address and stores it in the shadow address. As the result, it causes endless errors below during boot due to drain_freelist() -> slab_destroy() -> kasan_slab_free() which compares already untagged freelist against the stored tag in the shadow address. Since off-slab slab management object freelist is such a special case, just store it tagged. Non-off-slab management object freelist is still stored untagged which has not been assigned a tag and should not cause any other troubles with this inconsistency. BUG: KASAN: double-free or invalid-free in slab_destroy+0x84/0x88 Pointer tag: [ff], memory tag: [99] CPU: 0 PID: 1376 Comm: kworker/0:4 Tainted: G W 5.1.0-rc3+ #8 Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018 Workqueue: cgroup_destroy css_killed_work_fn Call trace: print_address_description+0x74/0x2a4 kasan_report_invalid_free+0x80/0xc0 __kasan_slab_free+0x204/0x208 kasan_slab_free+0xc/0x18 kmem_cache_free+0xe4/0x254 slab_destroy+0x84/0x88 drain_freelist+0xd0/0x104 __kmem_cache_shrink+0x1ac/0x224 __kmemcg_cache_deactivate+0x1c/0x28 memcg_deactivate_kmem_caches+0xa0/0xe8 memcg_offline_kmem+0x8c/0x3d4 mem_cgroup_css_offline+0x24c/0x290 css_killed_work_fn+0x154/0x618 process_one_work+0x9cc/0x183c worker_thread+0x9b0/0xe38 kthread+0x374/0x390 ret_from_fork+0x10/0x18 Allocated by task 1625: __kasan_kmalloc+0x168/0x240 kasan_slab_alloc+0x18/0x20 kmem_cache_alloc_node+0x1f8/0x3a0 cache_grow_begin+0x4fc/0xa24 cache_alloc_refill+0x2f8/0x3e8 kmem_cache_alloc+0x1bc/0x3bc sock_alloc_inode+0x58/0x334 alloc_inode+0xb8/0x164 new_inode_pseudo+0x20/0xec sock_alloc+0x74/0x284 __sock_create+0xb0/0x58c sock_create+0x98/0xb8 __sys_socket+0x60/0x138 __arm64_sys_socket+0xa4/0x110 el0_svc_handler+0x2c0/0x47c el0_svc+0x8/0xc Freed by task 1625: __kasan_slab_free+0x114/0x208 kasan_slab_free+0xc/0x18 kfree+0x1a8/0x1e0 single_release+0x7c/0x9c close_pdeo+0x13c/0x43c proc_reg_release+0xec/0x108 __fput+0x2f8/0x784 ____fput+0x1c/0x28 task_work_run+0xc0/0x1b0 do_notify_resume+0xb44/0x1278 work_pending+0x8/0x10 The buggy address belongs to the object at ffff809681b89e00 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 0 bytes inside of 128-byte region [ffff809681b89e00, ffff809681b89e80) The buggy address belongs to the page: page:ffff7fe025a06e00 count:1 mapcount:0 mapping:01ff80082000fb00 index:0xffff809681b8fe04 flags: 0x17ffffffc000200(slab) raw: 017ffffffc000200 ffff7fe025a06d08 ffff7fe022ef7b88 01ff80082000fb00 raw: ffff809681b8fe04 ffff809681b80000 00000001000000e0 0000000000000000 page dumped because: kasan: bad access detected page allocated via order 0, migratetype Unmovable, gfp_mask 0x2420c0(__GFP_IO\|__GFP_FS\|__GFP_NOWARN\|__GFP_COMP\|__GFP_THISNODE) prep_new_page+0x4e0/0x5e0 get_page_from_freelist+0x4ce8/0x50d4 __alloc_pages_nodemask+0x738/0x38b8 cache_grow_begin+0xd8/0xa24 ____cache_alloc_node+0x14c/0x268 __kmalloc+0x1c8/0x3fc ftrace_free_mem+0x408/0x1284 ftrace_free_init_mem+0x20/0x28 kernel_init+0x24/0x548 ret_from_fork+0x10/0x18 Memory state around the buggy address: ffff809681b89c00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe ffff809681b89d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe >ffff809681b89e00: 99 99 99 99 99 99 99 99 fe fe fe fe fe fe fe fe ^ ffff809681b89f00: 43 43 43 43 43 fe fe fe fe fe fe fe fe fe fe fe ffff809681b8a000: 6d fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe Link: http://lkml.kernel.org/r/20190403022858.97584-1-cai@lca.pw Fixes: `51dedad06b` ("kasan, slab: make freelist stored without tags") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-04-19 09:46:04 -07:00
Qian Cai	80552f0f7a	mm/slab: Remove store_stackinfo() store_stackinfo() does not seem used in actual SLAB debugging. Potentially, it could be added to check_poison_obj() to provide more information but this seems like an overkill due to the declining popularity of SLAB, so just remove it instead. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-mm <linux-mm@kvack.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: rientjes@google.com Cc: sean.j.christopherson@intel.com Link: https://lkml.kernel.org/r/20190416142258.18694-1-cai@lca.pw	2019-04-17 11:46:27 +02:00
Qian Cai	fcf88917dd	slab: fix a crash by reading /proc/slab_allocators The commit `510ded33e0` ("slab: implement slab_root_caches list") changes the name of the list node within "struct kmem_cache" from "list" to "root_caches_node", but leaks_show() still use the "list" which causes a crash when reading /proc/slab_allocators. You need to have CONFIG_SLAB=y and CONFIG_MEMCG=y to see the problem, because without MEMCG all slab caches are root caches, and the "list" node happens to be the right one. Fixes: `510ded33e0` ("slab: implement slab_root_caches list") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Tobin C. Harding <tobin@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-04-07 19:23:12 -10:00
Nicolas Boichat	6d6ea1e967	mm: add support for kmem caches in DMA32 zone Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables", v6. This is a followup to the discussion in [1], [2]. IOMMUs using ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems. For L1 tables that are bigger than a page, we can just use __get_free_pages with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA). For L2 tables that only take 1KB, it would be a waste to allocate a full page, so we considered 3 approaches: 1. This series, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. [3] This series is the most memory-efficient approach. stable@ note: We confirmed that this is a regression, and IOMMU errors happen on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue most likely starts from commit `ad67f5a654` ("arm64: replace ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek platforms (and maybe others?). [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html [2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html [3] https://patchwork.codeaurora.org/patch/671639/ This patch (of 3): IOMMUs using ARMv7 short-descriptor format require page tables to be allocated within the first 4GB of RAM, even on 64-bit systems. On arm64, this is done by passing GFP_DMA32 flag to memory allocation functions. For IOMMU L2 tables that only take 1KB, it would be a waste to allocate a full page using get_free_pages, so we considered 3 approaches: 1. This patch, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. This change makes it possible to create a custom cache in DMA32 zone using kmem_cache_create, then allocate memory using kmem_cache_alloc. We do not create a DMA32 kmalloc cache array, as there are currently no users of kmalloc(..., GFP_DMA32). These calls will continue to trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK. This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32 kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and unnecessary). Link: http://lkml.kernel.org/r/20181210011504.122604-2-drinkcat@chromium.org Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Sasha Levin <Alexander.Levin@microsoft.com> Cc: Huaisheng Ye <yehs1@lenovo.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Tomasz Figa <tfiga@google.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-29 10:01:37 -07:00
Mike Rapoport	a862f68a8b	docs/core-api/mm: fix return value descriptions in mm/ Many kernel-doc comments in mm/ have the return value descriptions either misformatted or omitted at all which makes kernel-doc script unhappy: $ make V=1 htmldocs ... ./mm/util.c:36: info: Scanning doc for kstrdup ./mm/util.c:41: warning: No description found for return value of 'kstrdup' ./mm/util.c:57: info: Scanning doc for kstrdup_const ./mm/util.c:66: warning: No description found for return value of 'kstrdup_const' ./mm/util.c:75: info: Scanning doc for kstrndup ./mm/util.c:83: warning: No description found for return value of 'kstrndup' ... Fixing the formatting and adding the missing return value descriptions eliminates ~100 such warnings. Link: http://lkml.kernel.org/r/1549549644-4903-4-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:20 -08:00
Alexey Dobriyan	b9726c26dc	numa: make "nr_node_ids" unsigned int Number of NUMA nodes can't be negative. This saves a few bytes on x86_64: add/remove: 0/0 grow/shrink: 4/21 up/down: 27/-265 (-238) Function old new delta hv_synic_alloc.cold 88 110 +22 prealloc_shrinker 260 262 +2 bootstrap 249 251 +2 sched_init_numa 1566 1567 +1 show_slab_objects 778 777 -1 s_show 1201 1200 -1 kmem_cache_init 346 345 -1 __alloc_workqueue_key 1146 1145 -1 mem_cgroup_css_alloc 1614 1612 -2 __do_sys_swapon 4702 4699 -3 __list_lru_init 655 651 -4 nic_probe 2379 2374 -5 store_user_store 118 111 -7 red_zone_store 106 99 -7 poison_store 106 99 -7 wq_numa_init 348 338 -10 __kmem_cache_empty 75 65 -10 task_numa_free 186 173 -13 merge_across_nodes_store 351 336 -15 irq_create_affinity_masks 1261 1246 -15 do_numa_crng_init 343 321 -22 task_numa_fault 4760 4737 -23 swapfile_init 179 156 -23 hv_synic_alloc 536 492 -44 apply_wqattrs_prepare 746 695 -51 Link: http://lkml.kernel.org/r/20190201223029.GA15820@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:19 -08:00
Qian Cai	92d1d07daa	mm/slab.c: kmemleak no scan alien caches Kmemleak throws endless warnings during boot due to in __alloc_alien_cache(), alc = kmalloc_node(memsize, gfp, node); init_arraycache(&alc->ac, entries, batch); kmemleak_no_scan(ac); Kmemleak does not track the array cache (alc->ac) but the alien cache (alc) instead, so let it track the latter by lifting kmemleak_no_scan() out of init_arraycache(). There is another place that calls init_arraycache(), but alloc_kmem_cache_cpus() uses the percpu allocation where will never be considered as a leak. kmemleak: Found object by alias at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 lookup_object+0x84/0xac find_and_get_object+0x84/0xe4 kmemleak_no_scan+0x74/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 kmemleak: Object 0xffff8007b9aa7e00 (size 256): kmemleak: comm "swapper/0", pid 1, jiffies 4294697137 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmemleak_alloc+0x84/0xb8 kmem_cache_alloc_node_trace+0x31c/0x3a0 __kmalloc_node+0x58/0x78 setup_kmem_cache_node+0x26c/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 kmemleak_no_scan+0x90/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 Link: http://lkml.kernel.org/r/20190129184518.39808-1-cai@lca.pw Fixes: `1fe00d50a9` ("slab: factor out initialization of array cache") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-05 21:07:14 -08:00

1 2 3 4 5 ...

719 Commits