2b40885cdc
36576 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
8026d5839b |
This is the 5.10.195 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmUJdfMACgkQONu9yGCS aT7i/w//Wbvt3F9hF/9Rmg9A4J23OWl2o07Z8Fi0a4F4B0FJjuQGSPRvpSvKtIWv +7taCzOw9+Qi52hTR7BK+QpLpEPgMbv1IdgyPu1gtjL4WHuKk1aOeafISYuQDgeZ XSFoV1EGjxkg3wbMZkucnmQVitGxC/iV0ojvxKleiIE9UNzceQclGmmBL0FwmEYp c91XKEACZ5K/spSyyxocP4Fw6mbk98ISiju+74op5EDFry9qnIYa2pU/au3gZvh/ TScOYOQsBojOFTy/wuEfpOiVBK9gLFq8du0J/gHS2aUqswkp/qFcpH7wbS5Po3+l Ja9a76o2B4btMCz6UhyhwzB+0QTQ1Gdea35FHRbF3d4ssNJDqDtwBCHqd3zeMUYo uTDhyTsSGV40Gm9A5Sojyzjgj4X12rQ0ffL+zcXfXe60flE8SNIxR8DiIXPlAsC+ pgNQ5l/HcdJE1abRoTkvpsptaT2sNXgwZZij+VOBI3Vp4wr61U69CfP/QWWPZZF5 ECEh8ZDK1roiEyBjn6njqXmt5vbmNasgI5umgnNPBgKEB2OLXqox6rn9XK0qMJ+X /oiCaL9RveU/QL5qNvV6Z2beXPwT51Vdy8+bQBfb5bUFRGQcTVIWaBRG0ZIHeSGm pG10/VAnCGtNrC6M/HVGd0Wyih+ur65Jz/rNKbkMX69cvJuxPWk= =RAs8 -----END PGP SIGNATURE----- Merge 5.10.195 into android12-5.10-lts Changes in 5.10.195 erofs: ensure that the post-EOF tails are all zeroed ARM: pxa: remove use of symbol_get() mmc: au1xmmc: force non-modular build and remove symbol_get usage net: enetc: use EXPORT_SYMBOL_GPL for enetc_phc_index rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules USB: serial: option: add Quectel EM05G variant (0x030e) USB: serial: option: add FOXCONN T99W368/T99W373 product usb: dwc3: meson-g12a: do post init to fix broken usb after resumption usb: chipidea: imx: improve logic if samsung,picophy-* parameter is 0 HID: wacom: remove the battery when the EKR is off staging: rtl8712: fix race condition Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition configfs: fix a race in configfs_lookup() serial: qcom-geni: fix opp vote on shutdown serial: sc16is7xx: fix broken port 0 uart init serial: sc16is7xx: fix bug when first setting GPIO direction firmware: stratix10-svc: Fix an NULL vs IS_ERR() bug in probe fsi: master-ast-cf: Add MODULE_FIRMWARE macro nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers() nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse pinctrl: amd: Don't show `Invalid config param` errors ASoC: rt5682: Fix a problem with error handling in the io init function of the soundwire ARM: dts: imx: update sdma node name format ARM: dts: imx7s: Drop dma-apb interrupt-names ARM: dts: imx: Adjust dma-apbh node name ARM: dts: imx: Set default tuning step for imx7d usdhc phy: qcom-snps-femto-v2: use qcom_snps_hsphy_suspend/resume error code media: pulse8-cec: handle possible ping error media: pci: cx23885: fix error handling for cx23885 ATSC boards 9p: virtio: make sure 'offs' is initialized in zc_request ASoC: da7219: Flush pending AAD IRQ when suspending ASoC: da7219: Check for failure reading AAD IRQ events ethernet: atheros: fix return value check in atl1c_tso_csum() vxlan: generalize vxlan_parse_gpe_hdr and remove unused args m68k: Fix invalid .section syntax s390/dasd: use correct number of retries for ERP requests s390/dasd: fix hanging device after request requeue fs/nls: make load_nls() take a const parameter ASoc: codecs: ES8316: Fix DMIC config ASoC: atmel: Fix the 8K sample parameter in I2SC master platform/x86: intel: hid: Always call BTNL ACPI method platform/x86: huawei-wmi: Silence ambient light sensor drm/amd/display: Exit idle optimizations before attempt to access PHY ovl: Always reevaluate the file signature for IMA ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer() security: keys: perform capable check only on privileged operations kprobes: Prohibit probing on CFI preamble symbol clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM vmbus_testing: fix wrong python syntax for integer value comparison net: usb: qmi_wwan: add Quectel EM05GV2 idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock netlabel: fix shift wrapping bug in netlbl_catmap_setlong() bnx2x: fix page fault following EEH recovery sctp: handle invalid error codes without calling BUG() scsi: storvsc: Always set no_report_opcodes ALSA: seq: oss: Fix racy open/close of MIDI devices tracing: Introduce pipe_cpumask to avoid race on trace_pipes platform/mellanox: Fix mlxbf-tmfifo not handling all virtio CONSOLE notifications net: Avoid address overwrite in kernel_connect udf: Check consistency of Space Bitmap Descriptor udf: Handle error when adding extent to a file Revert "net: macsec: preserve ingress frame ordering" reiserfs: Check the return value from __getblk() eventfd: Export eventfd_ctx_do_read() eventfd: prevent underflow for eventfd semaphores fs: Fix error checking for d_hash_and_lookup() tmpfs: verify {g,u}id mount options correctly selftests/harness: Actually report SKIP for signal tests refscale: Fix uninitalized use of wait_queue_head_t OPP: Fix passing 0 to PTR_ERR in _opp_attach_genpd() selftests/resctrl: Don't leak buffer in fill_cache() selftests/resctrl: Unmount resctrl FS if child fails to run benchmark selftests/resctrl: Close perf value read fd on errors x86/decompressor: Don't rely on upper 32 bits of GPRs being preserved perf/imx_ddr: don't enable counter0 if none of 4 counters are used s390/pkey: fix/harmonize internal keyblob headers s390/paes: fix PKEY_TYPE_EP11_AES handling for secure keyblobs x86/efistub: Fix PCI ROM preservation in mixed mode cpufreq: powernow-k8: Use related_cpus instead of cpus in driver.exit() bpftool: Use a local bpf_perf_event_value to fix accessing its fields bpf: Clear the probe_addr for uprobe tcp: tcp_enter_quickack_mode() should be static hwrng: nomadik - keep clock enabled while hwrng is registered regmap: rbtree: Use alloc_flags for memory allocations udp: re-score reuseport groups when connected sockets are present bpf: reject unhashed sockets in bpf_sk_assign wifi: mt76: testmode: add nla_policy for MT76_TM_ATTR_TX_LENGTH spi: tegra20-sflash: fix to check return value of platform_get_irq() in tegra_sflash_probe() can: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case of OOM wifi: mwifiex: Fix OOB and integer underflow when rx packets wifi: mwifiex: fix error recovery in PCIE buffer descriptor management selftests/bpf: fix static assert compilation issue for test_cls_*.c crypto: stm32 - Properly handle pm_runtime_get failing crypto: api - Use work queue in crypto_destroy_instance Bluetooth: nokia: fix value check in nokia_bluetooth_serdev_probe() Bluetooth: Fix potential use-after-free when clear keys net: tcp: fix unexcepted socket die when snd_wnd is 0 selftests/bpf: Clean up fmod_ret in bench_rename test script ice: ice_aq_check_events: fix off-by-one check when filling buffer crypto: caam - fix unchecked return value error hwrng: iproc-rng200 - Implement suspend and resume calls lwt: Fix return values of BPF xmit ops lwt: Check LWTUNNEL_XMIT_CONTINUE strictly fs: ocfs2: namei: check return value of ocfs2_add_entry() wifi: mwifiex: fix memory leak in mwifiex_histogram_read() wifi: mwifiex: Fix missed return in oob checks failed path samples/bpf: fix broken map lookup probe wifi: ath9k: fix races between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx wifi: ath9k: protect WMI command response buffer replacement with a lock wifi: mwifiex: avoid possible NULL skb pointer dereference Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave() wifi: ath9k: use IS_ERR() with debugfs_create_dir() net: arcnet: Do not call kfree_skb() under local_irq_disable() mlxsw: i2c: Fix chunk size setting in output mailbox buffer mlxsw: i2c: Limit single transaction buffer size hwmon: (tmp513) Fix the channel number in tmp51x_is_visible() net/sched: sch_hfsc: Ensure inner classes have fsc curve netrom: Deny concurrent connect(). drm/bridge: tc358764: Fix debug print parameter order quota: factor out dquot_write_dquot() quota: rename dquot_active() to inode_quota_active() quota: add new helper dquot_active() quota: fix dqput() to follow the guarantees dquot_srcu should provide ASoC: stac9766: fix build errors with REGMAP_AC97 soc: qcom: ocmem: Add OCMEM hardware version print soc: qcom: ocmem: Fix NUM_PORTS & NUM_MACROS macros arm64: dts: qcom: msm8996: Add missing interrupt to the USB2 controller drm/amdgpu: avoid integer overflow warning in amdgpu_device_resize_fb_bar() ARM: dts: BCM5301X: Harmonize EHCI/OHCI DT nodes name ARM: dts: BCM53573: Describe on-SoC BCM53125 rev 4 switch ARM: dts: BCM53573: Drop nonexistent #usb-cells ARM: dts: BCM53573: Add cells sizes to PCIe node ARM: dts: BCM53573: Use updated "spi-gpio" binding properties drm/etnaviv: fix dumping of active MMU context x86/mm: Fix PAT bit missing from page protection modify mask ARM: dts: s3c64xx: align pinctrl with dtschema ARM: dts: samsung: s3c6410-mini6410: correct ethernet reg addresses (split) ARM: dts: s5pv210: adjust node names to DT spec ARM: dts: s5pv210: add dummy 5V regulator for backlight on SMDKv210 ARM: dts: samsung: s5pv210-smdkv210: correct ethernet reg addresses (split) drm: adv7511: Fix low refresh rate register for ADV7533/5 ARM: dts: BCM53573: Fix Ethernet info for Luxul devices arm64: dts: qcom: sdm845: Add missing RPMh power domain to GCC arm64: dts: qcom: sdm845: Fix the min frequency of "ice_core_clk" drm/amdgpu: Update min() to min_t() in 'amdgpu_info_ioctl' md/bitmap: don't set max_write_behind if there is no write mostly device md/md-bitmap: hold 'reconfig_mutex' in backlog_store() drm/tegra: Remove superfluous error messages around platform_get_irq() drm/tegra: dpaux: Fix incorrect return value of platform_get_irq of: unittest: fix null pointer dereferencing in of_unittest_find_node_by_name() drm/armada: Fix off-by-one error in armada_overlay_get_property() drm/panel: simple: Add missing connector type and pixel format for AUO T215HVN01 ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig drm: xlnx: zynqmp_dpsub: Add missing check for dma_set_mask drm/msm/mdp5: Don't leak some plane state firmware: meson_sm: fix to avoid potential NULL pointer dereference smackfs: Prevent underflow in smk_set_cipso() drm/amd/pm: fix variable dereferenced issue in amdgpu_device_attr_create() drm/msm/a2xx: Call adreno_gpu_init() earlier audit: fix possible soft lockup in __audit_inode_child() bus: ti-sysc: Fix build warning for 64-bit build drm/mediatek: Fix potential memory leak if vmap() fail bus: ti-sysc: Fix cast to enum warning of: unittest: Fix overlay type in apply/revert check ALSA: ac97: Fix possible error value of *rac97 ipmi:ssif: Add check for kstrdup ipmi:ssif: Fix a memory leak when scanning for an adapter drivers: clk: keystone: Fix parameter judgment in _of_pll_clk_init() clk: sunxi-ng: Modify mismatched function name clk: qcom: gcc-sc7180: use ARRAY_SIZE instead of specifying num_parents clk: qcom: gcc-sc7180: Fix up gcc_sdcc2_apps_clk_src ext4: correct grp validation in ext4_mb_good_group clk: qcom: gcc-sm8250: use ARRAY_SIZE instead of specifying num_parents clk: qcom: gcc-sm8250: Fix gcc_sdcc2_apps_clk_src clk: qcom: reset: Use the correct type of sleep/delay based on length PCI: Mark NVIDIA T4 GPUs to avoid bus reset pinctrl: mcp23s08: check return value of devm_kasprintf() PCI: pciehp: Use RMW accessors for changing LNKCTL PCI/ASPM: Use RMW accessors for changing LNKCTL clk: imx8mp: fix sai4 clock clk: imx: composite-8m: fix clock pauses when set_rate would be a no-op vfio/type1: fix cap_migration information leak powerpc/fadump: reset dump area size if fadump memory reserve fails powerpc/perf: Convert fsl_emb notifier to state machine callbacks drm/amdgpu: Use RMW accessors for changing LNKCTL drm/radeon: Use RMW accessors for changing LNKCTL net/mlx5: Use RMW accessors for changing LNKCTL wifi: ath10k: Use RMW accessors for changing LNKCTL powerpc: Don't include lppaca.h in paca.h powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT nfs/blocklayout: Use the passed in gfp flags powerpc/iommu: Fix notifiers being shared by PCI and VIO buses jfs: validate max amount of blocks before allocation. fs: lockd: avoid possible wrong NULL parameter NFSD: da_addr_body field missing in some GETDEVICEINFO replies NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN NFSv4.2: fix handling of COPY ERR_OFFLOAD_NO_REQ media: ad5820: Drop unsupported ad5823 from i2c_ and of_device_id tables media: i2c: tvp5150: check return value of devm_kasprintf() media: v4l2-core: Fix a potential resource leak in v4l2_fwnode_parse_link() drivers: usb: smsusb: fix error handling code in smsusb_init_device media: dib7000p: Fix potential division by zero media: dvb-usb: m920x: Fix a potential memory leak in m920x_i2c_xfer() media: cx24120: Add retval check for cx24120_message_send() scsi: hisi_sas: Print SAS address for v3 hw erroneous completion print scsi: libsas: Introduce more SAM status code aliases in enum exec_status scsi: hisi_sas: Modify v3 HW SSP underflow error processing scsi: hisi_sas: Modify v3 HW SATA completion error processing scsi: hisi_sas: Fix warnings detected by sparse scsi: hisi_sas: Fix normally completed I/O analysed as failed media: rkvdec: increase max supported height for H.264 media: mediatek: vcodec: Return NULL if no vdec_fb is found usb: phy: mxs: fix getting wrong state with mxs_phy_is_otg_host() scsi: RDMA/srp: Fix residual handling scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param() scsi: iscsi: Add length check for nlattr payload scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param() scsi: be2iscsi: Add length check when parsing nlattrs scsi: qla4xxx: Add length check when parsing nlattrs serial: sprd: Assign sprd_port after initialized to avoid wrong access serial: sprd: Fix DMA buffer leak issue x86/APM: drop the duplicate APM_MINOR_DEV macro scsi: qedf: Do not touch __user pointer in qedf_dbg_stop_io_on_error_cmd_read() directly scsi: qedf: Do not touch __user pointer in qedf_dbg_debug_cmd_read() directly scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directly coresight: tmc: Explicit type conversions to prevent integer overflow dma-buf/sync_file: Fix docs syntax driver core: test_async: fix an error code IB/uverbs: Fix an potential error pointer dereference fsi: aspeed: Reset master errors after CFAM reset iommu/qcom: Disable and reset context bank before programming iommu/vt-d: Fix to flush cache of PASID directory table media: go7007: Remove redundant if statement USB: gadget: f_mass_storage: Fix unused variable warning media: ov5640: Enable MIPI interface in ov5640_set_power_mipi() media: i2c: ov2680: Set V4L2_CTRL_FLAG_MODIFY_LAYOUT on flips media: ov2680: Remove auto-gain and auto-exposure controls media: ov2680: Fix ov2680_bayer_order() media: ov2680: Fix vflip / hflip set functions media: ov2680: Fix regulators being left enabled on ov2680_power_on() errors cgroup:namespace: Remove unused cgroup_namespaces_init() scsi: core: Use 32-bit hostnum in scsi_host_lookup() scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock serial: tegra: handle clk prepare error in tegra_uart_hw_init() amba: bus: fix refcount leak Revert "IB/isert: Fix incorrect release of isert connection" RDMA/siw: Balance the reference of cep->kref in the error path RDMA/siw: Correct wrong debug message HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode() HID: multitouch: Correct devm device reference for hidinput input_dev name x86/speculation: Mark all Skylake CPUs as vulnerable to GDS tracing: Fix race issue between cpu buffer write and swap mtd: rawnand: brcmnand: Fix mtd oobsize phy/rockchip: inno-hdmi: use correct vco_div_5 macro on rk3328 phy/rockchip: inno-hdmi: round fractal pixclock in rk3328 recalc_rate phy/rockchip: inno-hdmi: do not power on rk3328 post pll on reg write rpmsg: glink: Add check for kstrdup mtd: spi-nor: Check bus width while setting QE bit mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume() um: Fix hostaudio build errors dmaengine: ste_dma40: Add missing IRQ check in d40_probe cpufreq: Fix the race condition while updating the transition_task of policy virtio_ring: fix avail_wrap_counter in virtqueue_add_packed igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c netfilter: xt_u32: validate user space input netfilter: xt_sctp: validate the flag_info count skbuff: skb_segment, Call zero copy functions before using skbuff frags igb: set max size RX buffer when store bad packet is enabled PM / devfreq: Fix leak in devfreq_dev_release() ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl printk: ringbuffer: Fix truncating buffer size min_t cast scsi: core: Fix the scsi_set_resid() documentation ipmi_si: fix a memleak in try_smi_init() ARM: OMAP2+: Fix -Warray-bounds warning in _pwrdm_state_switch() backlight/gpio_backlight: Compare against struct fb_info.device backlight/bd6107: Compare against struct fb_info.device backlight/lv5207lp: Compare against struct fb_info.device xtensa: PMU: fix base address for the newer hardware arm64: csum: Fix OoB access in IP checksum code for negative lengths media: dvb: symbol fixup for dvb_attach() Revert "scsi: qla2xxx: Fix buffer overrun" scsi: mpt3sas: Perform additional retries if doorbell read returns 0 ntb: Drop packets when qp link is down ntb: Clean up tx tail index on link down ntb: Fix calculation ntb_transport_tx_free_entry() Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset" procfs: block chmod on /proc/thread-self/comm parisc: Fix /proc/cpuinfo output for lscpu dlm: fix plock lookup when using multiple lockspaces dccp: Fix out of bounds access in DCCP error handler X.509: if signature is unsupported skip validation net: handle ARPHRD_PPP in dev_is_mac_header_xmit() fsverity: skip PKCS#7 parser when keyring is empty pstore/ram: Check start of empty przs during init s390/ipl: add missing secure/has_secure file to ipl type 'unknown' crypto: stm32 - fix loop iterating through scatterlist for DMA cpufreq: brcmstb-avs-cpufreq: Fix -Warray-bounds bug usb: typec: bus: verify partner exists in typec_altmode_attention USB: core: Unite old scheme and new scheme descriptor reads USB: core: Change usb_get_device_descriptor() API USB: core: Fix race by not overwriting udev->descriptor in hub_port_init() USB: core: Fix oversight in SuperSpeed initialization usb: typec: tcpci: clear the fault status bit tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY md/md-bitmap: remove unnecessary local variable in backlog_store() udf: initialize newblock to 0 net/ipv6: SKB symmetric hash should incorporate transport ports io_uring: always lock in io_apoll_task_func io_uring: break out of iowq iopoll on teardown io_uring: break iopolling on signal scsi: qla2xxx: Fix deletion race condition scsi: qla2xxx: fix inconsistent TMF timeout scsi: qla2xxx: Fix erroneous link up failure scsi: qla2xxx: Turn off noisy message log scsi: qla2xxx: Remove unsupported ql2xenabledif option fbdev/ep93xx-fb: Do not assign to struct fb_info.dev drm/ast: Fix DRAM init on AST2200 lib/test_meminit: allocate pages up to order MAX_ORDER parisc: led: Fix LAN receive and transmit LEDs parisc: led: Reduce CPU overhead for disk & lan LED computation pinctrl: cherryview: fix address_space_handler() argument dt-bindings: clock: xlnx,versal-clk: drop select:false clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock soc: qcom: qmi_encdec: Restrict string length in decode NFS: Fix a potential data corruption NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info kconfig: fix possible buffer overflow backlight: gpio_backlight: Drop output GPIO direction check for initial power state perf annotate bpf: Don't enclose non-debug code with an assert() x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm() perf top: Don't pass an ERR_PTR() directly to perf_session__delete() watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load pwm: lpc32xx: Remove handling of PWM channels net/sched: fq_pie: avoid stalls in fq_pie_timer() sctp: annotate data-races around sk->sk_wmem_queued ipv4: annotate data-races around fi->fib_dead net: read sk->sk_family once in sk_mc_loop() drm/i915/gvt: Save/restore HW status to support GVT suspend/resume drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() ipv4: ignore dst hint for multipath routes igb: disable virtualization features on 82580 veth: Fixing transmit return status for dropped packets net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr af_unix: Fix data-races around user->unix_inflight. af_unix: Fix data-race around unix_tot_inflight. af_unix: Fix data-races around sk->sk_shutdown. af_unix: Fix data race around sk->sk_err. net: sched: sch_qfq: Fix UAF in qfq_dequeue() kcm: Destroy mutex in kcm_exit_net() igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80 igb: Change IGB_MIN to allow set rx/tx value between 64 and 80 s390/zcrypt: don't leak memory if dev_set_name() fails idr: fix param name in idr_alloc_cyclic() doc ip_tunnels: use DEV_STATS_INC() net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times netfilter: nfnetlink_osf: avoid OOB read net: hns3: fix the port information display when sfp is absent sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() ext4: add correct group descriptors and reserved GDT blocks to system zone ata: sata_gemini: Add missing MODULE_DESCRIPTION ata: pata_ftide010: Add missing MODULE_DESCRIPTION fuse: nlookup missing decrement in fuse_direntplus_link btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART btrfs: use the correct superblock to compare fsid in btrfs_validate_super mtd: rawnand: brcmnand: Fix crash during the panic_write mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write mtd: rawnand: brcmnand: Fix potential false time out warning drm/amd/display: prevent potential division by zero errors perf hists browser: Fix hierarchy mode header perf tools: Handle old data in PERF_RECORD_ATTR perf hists browser: Fix the number of entries for 'e' key ACPI: APEI: explicit init of HEST and GHES in apci_init() arm64: sdei: abort running SDEI handlers during crash scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry scsi: qla2xxx: Consolidate zio threshold setting for both FCP & NVMe scsi: qla2xxx: Fix crash in PCIe error handling scsi: qla2xxx: Flush mailbox commands on chip reset ARM: dts: samsung: exynos4210-i9100: Fix LCD screen's physical size ARM: dts: BCM5301X: Extend RAM to full 256MB for Linksys EA6500 V2 bus: mhi: host: Skip MHI reset if device is in RDDM net: ipv4: fix one memleak in __inet_del_ifa() selftests/kselftest/runner/run_one(): allow running non-executable files kselftest/runner.sh: Propagate SIGTERM to runner child net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc() net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all() hsr: Fix uninit-value access in fill_frame_info() r8152: check budget for r8152_poll() kcm: Fix memory leak in error path of kcm_sendmsg() platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors platform/mellanox: mlxbf-tmfifo: Drop jumbo frames net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() ipv6: fix ip6_sock_set_addr_preferences() typo ixgbe: fix timestamp configuration code kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg(). drm/amd/display: Fix a bug when searching for insert_above_mpcc parisc: Drop loops_per_jiffy from per_cpu struct Linux 5.10.195 Change-Id: I4eef618f573b6d4201e05c9cf56088d77d712d97 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
5103216b86 |
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
commit 3d07fa1dd19035eb0b13ae6697efd5caa9033e74 upstream. The pipe cpumask used to serialize opens between the main and percpu trace pipes is not zeroed or initialized. This can result in spurious -EBUSY returns if underlying memory is not fully zeroed. This has been observed by immediate failure to read the main trace_pipe file on an otherwise newly booted and idle system: # cat /sys/kernel/debug/tracing/trace_pipe cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy Zero the allocation of pipe_cpumask to avoid the problem. Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com Cc: stable@vger.kernel.org Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes") Reviewed-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
8c90c4e619 |
printk: ringbuffer: Fix truncating buffer size min_t cast
commit 53e9e33ede37a247d926db5e4a9e56b55204e66c upstream.
If an output buffer size exceeded U16_MAX, the min_t(u16, ...) cast in
copy_data() was causing writes to truncate. This manifested as output
bytes being skipped, seen as %NUL bytes in pstore dumps when the available
record size was larger than 65536. Fix the cast to no longer truncate
the calculation.
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Reported-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Link: https://lore.kernel.org/lkml/d8bb1ec7-a4c5-43a2-9de0-9643a70b899f@linux.microsoft.com/
Fixes:
|
||
|
c5d30d6aa8 |
tracing: Fix race issue between cpu buffer write and swap
[ Upstream commit 3163f635b20e9e1fb4659e74f47918c9dddfe64e ]
Warning happened in rb_end_commit() at code:
if (RB_WARN_ON(cpu_buffer, !local_read(&cpu_buffer->committing)))
WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142
rb_commit+0x402/0x4a0
Call Trace:
ring_buffer_unlock_commit+0x42/0x250
trace_buffer_unlock_commit_regs+0x3b/0x250
trace_event_buffer_commit+0xe5/0x440
trace_event_buffer_reserve+0x11c/0x150
trace_event_raw_event_sched_switch+0x23c/0x2c0
__traceiter_sched_switch+0x59/0x80
__schedule+0x72b/0x1580
schedule+0x92/0x120
worker_thread+0xa0/0x6f0
It is because the race between writing event into cpu buffer and swapping
cpu buffer through file per_cpu/cpu0/snapshot:
Write on CPU 0 Swap buffer by per_cpu/cpu0/snapshot on CPU 1
-------- --------
tracing_snapshot_write()
[...]
ring_buffer_lock_reserve()
cpu_buffer = buffer->buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
[...]
rb_reserve_next_event()
[...]
ring_buffer_swap_cpu()
if (local_read(&cpu_buffer_a->committing))
goto out_dec;
if (local_read(&cpu_buffer_b->committing))
goto out_dec;
buffer_a->buffers[cpu] = cpu_buffer_b;
buffer_b->buffers[cpu] = cpu_buffer_a;
// 2. cpu_buffer has swapped here.
rb_start_commit(cpu_buffer);
if (unlikely(READ_ONCE(cpu_buffer->buffer)
!= buffer)) { // 3. This check passed due to 'cpu_buffer->buffer'
[...] // has not changed here.
return NULL;
}
cpu_buffer_b->buffer = buffer_a;
cpu_buffer_a->buffer = buffer_b;
[...]
// 4. Reserve event from 'cpu_buffer_a'.
ring_buffer_unlock_commit()
[...]
cpu_buffer = buffer->buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
rb_commit(cpu_buffer)
rb_end_commit() // 6. WARN for the wrong 'committing' state !!!
Based on above analysis, we can easily reproduce by following testcase:
``` bash
#!/bin/bash
dmesg -n 7
sysctl -w kernel.panic_on_warn=1
TR=/sys/kernel/tracing
echo 7 > ${TR}/buffer_size_kb
echo "sched:sched_switch" > ${TR}/set_event
while [ true ]; do
echo 1 > ${TR}/per_cpu/cpu0/snapshot
done &
while [ true ]; do
echo 1 > ${TR}/per_cpu/cpu0/snapshot
done &
while [ true ]; do
echo 1 > ${TR}/per_cpu/cpu0/snapshot
done &
```
To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that above race would be
avoided.
Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes:
|
||
|
629079f502 |
cgroup:namespace: Remove unused cgroup_namespaces_init()
[ Upstream commit 82b90b6c5b38e457c7081d50dff11ecbafc1e61a ]
cgroup_namspace_init() just return 0. Therefore, there is no need to
call it during start_kernel. Just remove it.
Fixes:
|
||
|
98ef243d59 |
audit: fix possible soft lockup in __audit_inode_child()
[ Upstream commit b59bc6e37237e37eadf50cd5de369e913f524463 ]
Tracefs or debugfs maybe cause hundreds to thousands of PATH records,
too many PATH records maybe cause soft lockup.
For example:
1. CONFIG_KASAN=y && CONFIG_PREEMPTION=n
2. auditctl -a exit,always -S open -k key
3. sysctl -w kernel.watchdog_thresh=5
4. mkdir /sys/kernel/debug/tracing/instances/test
There may be a soft lockup as follows:
watchdog: BUG: soft lockup - CPU#45 stuck for 7s! [mkdir:15498]
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
dump_backtrace+0x0/0x30c
show_stack+0x20/0x30
dump_stack+0x11c/0x174
panic+0x27c/0x494
watchdog_timer_fn+0x2bc/0x390
__run_hrtimer+0x148/0x4fc
__hrtimer_run_queues+0x154/0x210
hrtimer_interrupt+0x2c4/0x760
arch_timer_handler_phys+0x48/0x60
handle_percpu_devid_irq+0xe0/0x340
__handle_domain_irq+0xbc/0x130
gic_handle_irq+0x78/0x460
el1_irq+0xb8/0x140
__audit_inode_child+0x240/0x7bc
tracefs_create_file+0x1b8/0x2a0
trace_create_file+0x18/0x50
event_create_dir+0x204/0x30c
__trace_add_new_event+0xac/0x100
event_trace_add_tracer+0xa0/0x130
trace_array_create_dir+0x60/0x140
trace_array_create+0x1e0/0x370
instance_mkdir+0x90/0xd0
tracefs_syscall_mkdir+0x68/0xa0
vfs_mkdir+0x21c/0x34c
do_mkdirat+0x1b4/0x1d4
__arm64_sys_mkdirat+0x4c/0x60
el0_svc_common.constprop.0+0xa8/0x240
do_el0_svc+0x8c/0xc0
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb4
el0_sync+0x160/0x180
Therefore, we add cond_resched() to __audit_inode_child() to fix it.
Fixes:
|
||
|
b275f0ae35 |
bpf: Clear the probe_addr for uprobe
[ Upstream commit 5125e757e62f6c1d5478db4c2b61a744060ddf3f ]
To avoid returning uninitialized or random values when querying the file
descriptor (fd) and accessing probe_addr, it is necessary to clear the
variable prior to its use.
Fixes:
|
||
|
066fbd8bc9 |
refscale: Fix uninitalized use of wait_queue_head_t
[ Upstream commit f5063e8948dad7f31adb007284a5d5038ae31bb8 ]
Running the refscale test occasionally crashes the kernel with the
following error:
[ 8569.952896] BUG: unable to handle page fault for address: ffffffffffffffe8
[ 8569.952900] #PF: supervisor read access in kernel mode
[ 8569.952902] #PF: error_code(0x0000) - not-present page
[ 8569.952904] PGD c4b048067 P4D c4b049067 PUD c4b04b067 PMD 0
[ 8569.952910] Oops: 0000 [#1] PREEMPT_RT SMP NOPTI
[ 8569.952916] Hardware name: Dell Inc. PowerEdge R750/0WMWCR, BIOS 1.2.4 05/28/2021
[ 8569.952917] RIP: 0010:prepare_to_wait_event+0x101/0x190
:
[ 8569.952940] Call Trace:
[ 8569.952941] <TASK>
[ 8569.952944] ref_scale_reader+0x380/0x4a0 [refscale]
[ 8569.952959] kthread+0x10e/0x130
[ 8569.952966] ret_from_fork+0x1f/0x30
[ 8569.952973] </TASK>
The likely cause is that init_waitqueue_head() is called after the call to
the torture_create_kthread() function that creates the ref_scale_reader
kthread. Although this init_waitqueue_head() call will very likely
complete before this kthread is created and starts running, it is
possible that the calling kthread will be delayed between the calls to
torture_create_kthread() and init_waitqueue_head(). In this case, the
new kthread will use the waitqueue head before it is properly initialized,
which is not good for the kernel's health and well-being.
The above crash happened here:
static inline void __add_wait_queue(...)
{
:
if (!(wq->flags & WQ_FLAG_PRIORITY)) <=== Crash here
The offset of flags from list_head entry in wait_queue_entry is
-0x18. If reader_tasks[i].wq.head.next is NULL as allocated reader_task
structure is zero initialized, the instruction will try to access address
0xffffffffffffffe8, which is exactly the fault address listed above.
This commit therefore invokes init_waitqueue_head() before creating
the kthread.
Fixes:
|
||
|
0c0547d2a6 |
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
[ Upstream commit c2489bb7e6be2e8cdced12c16c42fa128403ac03 ] There is race issue when concurrently splice_read main trace_pipe and per_cpu trace_pipes which will result in data read out being different from what actually writen. As suggested by Steven: > I believe we should add a ref count to trace_pipe and the per_cpu > trace_pipes, where if they are opened, nothing else can read it. > > Opening trace_pipe locks all per_cpu ref counts, if any of them are > open, then the trace_pipe open will fail (and releases any ref counts > it had taken). > > Opening a per_cpu trace_pipe will up the ref count for just that > CPU buffer. This will allow multiple tasks to read different per_cpu > trace_pipe files, but will prevent the main trace_pipe file from > being opened. But because we only need to know whether per_cpu trace_pipe is open or not, using a cpumask instead of using ref count may be easier. After this patch, users will find that: - Main trace_pipe can be opened by only one user, and if it is opened, all per_cpu trace_pipes cannot be opened; - Per_cpu trace_pipes can be opened by multiple users, but each per_cpu trace_pipe can only be opened by one user. And if one of them is opened, main trace_pipe cannot be opened. Link: https://lore.kernel.org/linux-trace-kernel/20230818022645.1948314-1-zhengyejian1@huawei.com Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
0111b7bb51 |
kprobes: Prohibit probing on CFI preamble symbol
[ Upstream commit de02f2ac5d8cfb311f44f2bf144cc20002f1fbbd ] Do not allow to probe on "__cfi_" or "__pfx_" started symbol, because those are used for CFI and not executed. Probing it will break the CFI. Link: https://lore.kernel.org/all/168904024679.116016.18089228029322008512.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
f71b0b4a49 |
modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
commit 9011e49d54dcc7653ebb8a1e05b5badb5ecfa9f9 upstream. It has recently come to my attention that nvidia is circumventing the protection added in |
||
|
397f70b65c |
This is the 5.10.194 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmTy4cgACgkQONu9yGCS aT5pDA//SU2uPkDkNZ7p8QEdN3tMS2PPNdhi4Pza1OsJHhBbdfVUg4u31tZJJHpx P4c8iLkqnGHf4AyACvhR+GpX2b8T9pXTSIdv+GHqsNVEWmctYIwHGbDJNcoKtskc zV3OTDjy6hySEc2fhtxIdtNlLeW2g+WMwpgivXjfkrNVBGntiu7aVBQ8e6tk4TP/ yEDtpCPzHKBmFF2hkvimQNeAUZUPaIbnBt+bmvgrcZs0TxphQtz8Qu6jFZFuMuII WQycQB56mCwQty3xZkLSjL0NhIN7SZ5hjto5HzFeEuZ0ZYSO90NxUzuYLI/oTSOE 2pwkCxGfQ5UJHEqfbCxAV1Da+Uzo++XzAtuIP2+y6BJoO8X7IuaolAS4QSjb2J9u C4tNH+NyGO7JM3rxCbIAc+DcdBLIV5fLe2Xnq2SdQASgBok2QKyAHpU5UDKO43r5 hoCUbQ7DLVIChJasIwrMY7OXyI4p+AD4RJEhILa8rTi0L7hdFDgxAGdI8G48LjHt 951Fi+5P0KJ1tnol6FUjtSbcduzfTMzfKoPhy4NdtUCjTnbProbSTPCCeH6eabpD 73jSGFDMEwr9P/NHk4SJO74v/z/v6NfgE4os95ZI5DkC1ygVOS+8f/pUDTh3Zbhx KUbXhIc0oRhnOcUY/aB5ikiAy3r6fS6/BbV55amyf1LtUKOH1QQ= =38AI -----END PGP SIGNATURE----- Merge 5.10.194 into android12-5.10-lts Changes in 5.10.194 module: Expose module_init_layout_section() arm64: module-plts: inline linux/moduleloader.h arm64: module: Use module_init_layout_section() to spot init sections ARM: module: Use module_init_layout_section() to spot init sections mhi: pci_generic: Fix implicit conversion warning Revert "drm/amdgpu: install stub fence into potential unused fence pointers" Revert "MIPS: Alchemy: fix dbdma2" rcu: Prevent expedited GP from enabling tick on offline CPU rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader rcu-tasks: Wait for trc_read_check_handler() IPIs rcu-tasks: Add trc_inspect_reader() checks for exiting critical section Linux 5.10.194 Change-Id: I1e4230071f3ca3c9e811aeba31446875028d7b3d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
d93ba6e46e |
rcu-tasks: Add trc_inspect_reader() checks for exiting critical section
commit 18f08e758f34e6dfe0668bee51bd2af7adacf381 upstream. Currently, trc_inspect_reader() treats a task exiting its RCU Tasks Trace read-side critical section the same as being within that critical section. However, this can fail because that task might have already checked its .need_qs field, which means that it might never decrement the all-important trc_n_readers_need_end counter. Of course, for that to happen, the task would need to never again execute an RCU Tasks Trace read-side critical section, but this really could happen if the system's last trampoline was removed. Note that exit from such a critical section cannot be treated as a quiescent state due to the possibility of nested critical sections. This means that if trc_inspect_reader() sees a negative nesting value, it must set up to try again later. This commit therefore ignores tasks that are exiting their RCU Tasks Trace read-side critical sections so that they will be rechecked later. [ paulmck: Apply feedback from Neeraj Upadhyay and Boqun Feng. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
3e22624f8f |
rcu-tasks: Wait for trc_read_check_handler() IPIs
commit cbe0d8d91415c9692fe88191940d98952b6855d9 upstream. Currently, RCU Tasks Trace initializes the trc_n_readers_need_end counter to the value one, increments it before each trc_read_check_handler() IPI, then decrements it within trc_read_check_handler() if the target task was in a quiescent state (or if the target task moved to some other CPU while the IPI was in flight), complaining if the new value was zero. The rationale for complaining is that the initial value of one must be decremented away before zero can be reached, and this decrement has not yet happened. Except that trc_read_check_handler() is initiated with an asynchronous smp_call_function_single(), which might be significantly delayed. This can result in false-positive complaints about the counter reaching zero. This commit therefore waits for in-flight IPI handlers to complete before decrementing away the initial value of one from the trc_n_readers_need_end counter. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
9190c1f0ae |
rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader
commit 46aa886c483f57ef13cd5ea0a85e70b93eb1d381 upstream. The trc_wait_for_one_reader() function is called at multiple stages of trace rcu-tasks GP function, rcu_tasks_wait_gp(): - First, it is called as part of per task function - rcu_tasks_trace_pertask(), for all non-idle tasks. As part of per task processing, this function add the task in the holdout list and if the task is currently running on a CPU, it sends IPI to the task's CPU. The IPI handler takes action depending on whether task is in trace rcu-tasks read side critical section or not: - a. If the task is in trace rcu-tasks read side critical section (t->trc_reader_nesting != 0), the IPI handler sets the task's ->trc_reader_special.b.need_qs, so that this task notifies exit from its outermost read side critical section (by decrementing trc_n_readers_need_end) to the GP handling function. trc_wait_for_one_reader() also increments trc_n_readers_need_end, so that the trace rcu-tasks GP handler function waits for this task's read side exit notification. The IPI handler also sets t->trc_reader_checked to true, and no further IPIs are sent for this task, for this trace rcu-tasks grace period and this task can be removed from holdout list. - b. If the task is in the process of exiting its trace rcu-tasks read side critical section, (t->trc_reader_nesting < 0), defer this task's processing to future calls to trc_wait_for_one_reader(). - c. If task is not in rcu-task read side critical section, t->trc_reader_nesting == 0, ->trc_reader_checked is set for this task, so that this task is removed from holdout list. - Second, trc_wait_for_one_reader() is called as part of post scan, in function rcu_tasks_trace_postscan(), for all idle tasks. - Third, in function check_all_holdout_tasks_trace(), this function is called for each task in the holdout list, but only if there isn't a pending IPI for the task (->trc_ipi_to_cpu == -1). This function removed the task from holdout list, if IPI handler has completed the required work, to ensure that the current trace rcu-tasks grace period either waits for this task, or this task is not in a trace rcu-tasks read side critical section. Now, considering the scenario where smp_call_function_single() fails in first case, inside rcu_tasks_trace_pertask(). In this case, ->trc_ipi_to_cpu is set to the current CPU for that task. This will result in trc_wait_for_one_reader() getting skipped in third case, inside check_all_holdout_tasks_trace(), for this task. This further results in ->trc_reader_checked never getting set for this task, and the task not getting removed from holdout list. This can cause the current trace rcu-tasks grace period to stall. Fix the above problem, by resetting ->trc_ipi_to_cpu to -1, on smp_call_function_single() failure, so that future IPI calls can be send for this task. Note that all three of the trc_wait_for_one_reader() function's callers (rcu_tasks_trace_pertask(), rcu_tasks_trace_postscan(), check_all_holdout_tasks_trace()) hold cpu_read_lock(). This means that smp_call_function_single() cannot race with CPU hotplug, and thus should never fail. Therefore, also add a warning in order to report any such failure in case smp_call_function_single() grows some other reason for failure. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
ad4f8c117b |
rcu: Prevent expedited GP from enabling tick on offline CPU
commit 147f04b14adde831eb4a0a1e378667429732f9e8 upstream. If an RCU expedited grace period starts just when a CPU is in the process of going offline, so that the outgoing CPU has completed its pass through stop-machine but has not yet completed its final dive into the idle loop, RCU will attempt to enable that CPU's scheduling-clock tick via a call to tick_dep_set_cpu(). For this to happen, that CPU has to have been online when the expedited grace period completed its CPU-selection phase. This is pointless: The outgoing CPU has interrupts disabled, so it cannot take a scheduling-clock tick anyway. In addition, the tick_dep_set_cpu() function's eventual call to irq_work_queue_on() will splat as follows: smpboot: CPU 1 is now offline WARNING: CPU: 6 PID: 124 at kernel/irq_work.c:95 +irq_work_queue_on+0x57/0x60 Modules linked in: CPU: 6 PID: 124 Comm: kworker/6:2 Not tainted 5.15.0-rc1+ #3 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS +rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014 Workqueue: rcu_gp wait_rcu_exp_gp RIP: 0010:irq_work_queue_on+0x57/0x60 Code: 8b 05 1d c7 ea 62 a9 00 00 f0 00 75 21 4c 89 ce 44 89 c7 e8 +9b 37 fa ff ba 01 00 00 00 89 d0 c3 4c 89 cf e8 3b ff ff ff eb ee <0f> 0b eb b7 +0f 0b eb db 90 48 c7 c0 98 2a 02 00 65 48 03 05 91 6f RSP: 0000:ffffb12cc038fe48 EFLAGS: 00010282 RAX: 0000000000000001 RBX: 0000000000005208 RCX: 0000000000000020 RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff9ad01f45a680 RBP: 000000000004c990 R08: 0000000000000001 R09: ffff9ad01f45a680 R10: ffffb12cc0317db0 R11: 0000000000000001 R12: 00000000fffecee8 R13: 0000000000000001 R14: 0000000000026980 R15: ffffffff9e53ae00 FS: 0000000000000000(0000) GS:ffff9ad01f580000(0000) +knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000000de0c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tick_nohz_dep_set_cpu+0x59/0x70 rcu_exp_wait_wake+0x54e/0x870 ? sync_rcu_exp_select_cpus+0x1fc/0x390 process_one_work+0x1ef/0x3c0 ? process_one_work+0x3c0/0x3c0 worker_thread+0x28/0x3c0 ? process_one_work+0x3c0/0x3c0 kthread+0x115/0x140 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 ---[ end trace c5bf75eb6aa80bc6 ]--- This commit therefore avoids invoking tick_dep_set_cpu() on offlined CPUs to limit both futility and false-positive splats. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
ecd62c8512 |
module: Expose module_init_layout_section()
commit 2abcc4b5a64a65a2d2287ba0be5c2871c1552416 upstream. module_init_layout_section() choses whether the core module loader considers a section as init or not. This affects the placement of the exit section when module unloading is disabled. This code will never run, so it can be free()d once the module has been initialised. arm and arm64 need to count the number of PLTs they need before applying relocations based on the section name. The init PLTs are stored separately so they can be free()d. arm and arm64 both use within_module_init() to decide which list of PLTs to use when applying the relocation. Because within_module_init()'s behaviour changes when module unloading is disabled, both architecture would need to take this into account when counting the PLTs. Today neither architecture does this, meaning when module unloading is disabled there are insufficient PLTs in the init section to load some modules, resulting in warnings: | WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc | Modules linked in: crct10dif_common | CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : module_emit_plt_entry+0x184/0x1cc | lr : module_emit_plt_entry+0x94/0x1cc | sp : ffffffc0803bba60 [...] | Call trace: | module_emit_plt_entry+0x184/0x1cc | apply_relocate_add+0x2bc/0x8e4 | load_module+0xe34/0x1bd4 | init_module_from_file+0x84/0xc0 | __arm64_sys_finit_module+0x1b8/0x27c | invoke_syscall.constprop.0+0x5c/0x104 | do_el0_svc+0x58/0x160 | el0_svc+0x38/0x110 | el0t_64_sync_handler+0xc0/0xc4 | el0t_64_sync+0x190/0x194 Instead of duplicating module_init_layout_section()s logic, expose it. Reported-by: Adam Johnston <adam.johnston@arm.com> Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: stable@vger.kernel.org Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
3acf914de4 |
This is the 5.10.193 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmTvUN4ACgkQONu9yGCS aT7lPhAAyUQ/VO7lONIwD9LBkFxDIWEJBn4928UNFrweLuxd/ZJWZXoLJ9zMykqS HmKjjKcZXO5fHQxHGHz+zfWKaAqNqCTuvOTKe6HKWRFIEOWjNKmNCJ3+tNgFjQLz +9G3LXLykpiIc878se7qEZKseiEh56PfE62CCP3+k1cQX6PlIOXCJIq3cntYJJ0j feV3nHPyR0S/sHnTGZFJDijhLUXd2NUaR6O4AxNk4QiBcw2j7i6B2I8FLvH2LkHi CeBKt0wl1RJtTIraQP4cKsxrnKYvn8yxfEuIoUNlGlLR0tt8R1LfXlV9pZt0yMoX 15sS8xkXH8K3+Wau78VuPX8Ar/dckiGU+G1lbrL2Xe4S04dnW3yx/Lq8DoZbFT4s 4BADWBy8zSobCdVkOaz7G0DVp8m6a9O/dNiCt62C8xg8hJdPAfurnfySMfTPUGKG 5oLVIcGnjDG/kN+R1kDmejpHtTCpMR53NEbkfN0UxVnx9aClpLZj4Vl9teOdP/ir mFow0/mNCRZyFlZHV0xU4fwo5xMuwNu3sYee61O12sz3PNA3dgfZawsKfF3vQLy7 eDPeses8AULl4+IaOwbBXuamtlDiARTM8jcnZ9nZr6LM3040d/2g36XbAkdmhQ/S jhwWpj6k9uKbrVLWYPnrR2Cc74toMdn80qJy+0Fu861ILpKCPbs= =8h+N -----END PGP SIGNATURE----- Merge 5.10.193 into android12-5.10-lts Changes in 5.10.193 objtool/x86: Fix SRSO mess NFSv4: fix out path in __nfs4_get_acl_uncached xprtrdma: Remap Receive buffers after a reconnect PCI: acpiphp: Reassign resources on bridge if necessary dlm: improve plock logging if interrupted dlm: replace usage of found with dedicated list iterator variable fs: dlm: add pid to debug log fs: dlm: change plock interrupted message to debug again fs: dlm: use dlm_plock_info for do_unlock_close fs: dlm: fix mismatch of plock results from userspace MIPS: cpu-features: Enable octeon_cache by cpu_type MIPS: cpu-features: Use boot_cpu_type for CPU type based features fbdev: Improve performance of sys_imageblit() fbdev: Fix sys_imageblit() for arbitrary image widths fbdev: fix potential OOB read in fast_imageblit() dm integrity: increase RECALC_SECTORS to improve recalculate speed dm integrity: reduce vmalloc space footprint on 32-bit architectures ALSA: pcm: Fix potential data race at PCM memory allocation helpers drm/amd/display: do not wait for mpc idle if tg is disabled drm/amd/display: check TG is non-null before checking if enabled libceph, rbd: ignore addr->type while comparing in some cases rbd: make get_lock_owner_info() return a single locker or NULL rbd: retrieve and check lock owner twice before blocklisting rbd: prevent busy loop when requesting exclusive lock tracing: Fix cpu buffers unavailable due to 'record_disabled' missed tracing: Fix memleak due to race between current_tracer and trace octeontx2-af: SDP: fix receive link config sock: annotate data-races around prot->memory_pressure dccp: annotate data-races in dccp_poll() ipvlan: Fix a reference count leak warning in ipvlan_ns_exit() net: bgmac: Fix return value check for fixed_phy_register() net: bcmgenet: Fix return value check for fixed_phy_register() net: validate veth and vxcan peer ifindexes ice: fix receive buffer size miscalculation igb: Avoid starting unnecessary workqueues net/sched: fix a qdisc modification with ambiguous command request netfilter: nf_tables: fix out of memory error handling rtnetlink: return ENODEV when ifname does not exist and group is given rtnetlink: Reject negative ifindexes in RTM_NEWLINK net: remove bond_slave_has_mac_rcu() bonding: fix macvlan over alb bond support ibmveth: Use dcbf rather than dcbfl NFSv4: Fix dropped lock for racing OPEN and delegation return clk: Fix slab-out-of-bounds error in devm_clk_release() mm: add a call to flush_cache_vmap() in vmap_pfn() NFS: Fix a use after free in nfs_direct_join_group() nfsd: Fix race to FREE_STATEID and cl_revoked selinux: set next pointer before attaching to list batman-adv: Trigger events for auto adjusted MTU batman-adv: Don't increase MTU when set by user batman-adv: Do not get eth header before batadv_check_management_packet batman-adv: Fix TT global entry leak when client roamed back batman-adv: Fix batadv_v_ogm_aggr_send memory leak batman-adv: Hold rtnl lock during MTU update via netlink lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels radix tree: remove unused variable of: dynamic: Refactor action prints to not use "%pOF" inside devtree_lock media: vcodec: Fix potential array out-of-bounds in encoder queue_setup PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus drm/vmwgfx: Fix shader stage validation drm/display/dp: Fix the DP DSC Receiver cap size x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4 torture: Fix hang during kthread shutdown phase tick: Detect and fix jiffies update stall timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped cgroup/cpuset: Rename functions dealing with DEADLINE accounting sched/cpuset: Bring back cpuset_mutex sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets cgroup/cpuset: Iterate only if DEADLINE tasks are present sched/deadline: Create DL BW alloc, free & check overflow interface cgroup/cpuset: Free DL BW in case can_attach() fails drm/i915: Fix premature release of request's reusable memory ASoC: rt711: add two jack detection modes scsi: snic: Fix double free in snic_tgt_create() scsi: core: raid_class: Remove raid_component_add() clk: Fix undefined reference to `clk_rate_exclusive_{get,put}' pinctrl: renesas: rza2: Add lock around pinctrl_generic{{add,remove}_group,{add,remove}_function} dma-buf/sw_sync: Avoid recursive lock during fence signal mm,hwpoison: refactor get_any_page mm: fix page reference leak in soft_offline_page() mm: memory-failure: kill soft_offline_free_page() mm: memory-failure: fix unexpected return value in soft_offline_page() ASoC: Intel: sof_sdw: include rt711.h for RT711 JD mode mm,hwpoison: fix printing of page flags Linux 5.10.193 Change-Id: I7c6ce55cbc73cef27a5cbe8954131a052b67dac2 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
2d69f68ad4 |
cgroup/cpuset: Free DL BW in case can_attach() fails
commit 2ef269ef1ac006acf974793d975539244d77b28f upstream. cpuset_can_attach() can fail. Postpone DL BW allocation until all tasks have been checked. DL BW is not allocated per-task but as a sum over all DL tasks migrating. If multiple controllers are attached to the cgroup next to the cpuset controller a non-cpuset can_attach() can fail. In this case free DL BW in cpuset_cancel_attach(). Finally, update cpuset DL task count (nr_deadline_tasks) only in cpuset_attach(). Suggested-by: Waiman Long <longman@redhat.com> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> [ Fix conflicts in kernel/cgroup/cpuset.c due to new code being applied that is not applicable on this branch. Reject new code. ] Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
4603c2a104 |
sched/deadline: Create DL BW alloc, free & check overflow interface
commit 85989106feb734437e2d598b639991b9185a43a6 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
c9546921a4 |
cgroup/cpuset: Iterate only if DEADLINE tasks are present
commit c0f78fd5edcf29b2822ac165f9248a6c165e8554 upstream. update_tasks_root_domain currently iterates over all tasks even if no DEADLINE task is present on the cpuset/root domain for which bandwidth accounting is being rebuilt. This has been reported to introduce 10+ ms delays on suspend-resume operations. Skip the costly iteration for cpusets that don't contain DEADLINE tasks. Reported-by: Qais Yousef (Google) <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
5ac05ce568 |
sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
commit 6c24849f5515e4966d94fa5279bdff4acf2e9489 upstream. Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by: Qais Yousef (Google) <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> [ Fix conflicts in kernel/cgroup/cpuset.c and kernel/sched/deadline.c due to pulling new fields and functions. Remove new code and match the patch diff. ] Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
b950133d9a |
sched/cpuset: Bring back cpuset_mutex
commit 111cd11bbc54850f24191c52ff217da88a5e639b upstream.
Turns out percpu_cpuset_rwsem - commit
|
||
|
312713e3ea |
cgroup/cpuset: Rename functions dealing with DEADLINE accounting
commit ad3a557daf6915296a43ef97a3e9c48e076c9dd8 upstream. rebuild_root_domains() and update_tasks_root_domain() have neutral names, but actually deal with DEADLINE bandwidth accounting. Rename them to use 'dl_' prefix so that intent is more clear. No functional change. Suggested-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
b2125926ba |
timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped
commit 62c1256d544747b38e77ca9b5bfe3a26f9592576 upstream. When tick_nohz_stop_tick() stops the tick and high resolution timers are disabled, then the clock event device is not put into ONESHOT_STOPPED mode. This can lead to spurious timer interrupts with some clock event device drivers that don't shut down entirely after firing. Eliminate these by putting the device into ONESHOT_STOPPED mode at points where it is not being reprogrammed. When there are no timers active, then tick_program_event() with KTIME_MAX can be used to stop the device. When there is a timer active, the device can be stopped at the next tick (any new timer added by timers will reprogram the tick). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220422141446.915024-1-npiggin@gmail.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
ae4f109b95 |
tick: Detect and fix jiffies update stall
commit a1ff03cd6fb9c501fff63a4a2bface9adcfa81cd upstream. On some rare cases, the timekeeper CPU may be delaying its jiffies update duty for a while. Known causes include: * The timekeeper is waiting on stop_machine in a MULTI_STOP_DISABLE_IRQ or MULTI_STOP_RUN state. Disabled interrupts prevent from timekeeping updates while waiting for the target CPU to complete its stop_machine() callback. * The timekeeper vcpu has VMEXIT'ed for a long while due to some overload on the host. Detect and fix these situations with emergency timekeeping catchups. Original-patch-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
c7e91047d3 |
torture: Fix hang during kthread shutdown phase
commit d52d3a2bf408ff86f3a79560b5cce80efb340239 upstream. During rcutorture shutdown, the rcu_torture_cleanup() function calls torture_cleanup_begin(), which sets the fullstop global variable to FULLSTOP_RMMOD. This causes the rcutorture threads for readers and fakewriters to exit all of their "while" loops and start shutting down. They then call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads, and before it gets a chance to do so, multiple instances of torture_kthread_stopping() invoke schedule_timeout_interruptible(1) in a tight loop. Tracing confirms that TIMER_SOFTIRQ can then continuously execute timer callbacks. If that TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that task might never invoke kthread_stop(). This commit improves this situation by increasing the timeout passed to schedule_timeout_interruptible() from one jiffy to 1/20th of a second. This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus allowing rcu_torture_cleanup() to carry out the needed kthread_stop() invocations. Testing has shown 100 runs of TREE07 passing reliably, as oppose to the tens-of-percent failure rates seen beforehand. Cc: Paul McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Zhouyi Zhou <zhouzhouyi@gmail.com> Cc: <stable@vger.kernel.org> # 6.0.x Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
b8205dfed6 |
tracing: Fix memleak due to race between current_tracer and trace
[ Upstream commit eecb91b9f98d6427d4af5fdb8f108f52572a39e7 ]
Kmemleak report a leak in graph_trace_open():
unreferenced object 0xffff0040b95f4a00 (size 128):
comm "cat", pid 204981, jiffies 4301155872 (age 99771.964s)
hex dump (first 32 bytes):
e0 05 e7 b4 ab 7d 00 00 0b 00 01 00 00 00 00 00 .....}..........
f4 00 01 10 00 a0 ff ff 00 00 00 00 65 00 10 00 ............e...
backtrace:
[<000000005db27c8b>] kmem_cache_alloc_trace+0x348/0x5f0
[<000000007df90faa>] graph_trace_open+0xb0/0x344
[<00000000737524cd>] __tracing_open+0x450/0xb10
[<0000000098043327>] tracing_open+0x1a0/0x2a0
[<00000000291c3876>] do_dentry_open+0x3c0/0xdc0
[<000000004015bcd6>] vfs_open+0x98/0xd0
[<000000002b5f60c9>] do_open+0x520/0x8d0
[<00000000376c7820>] path_openat+0x1c0/0x3e0
[<00000000336a54b5>] do_filp_open+0x14c/0x324
[<000000002802df13>] do_sys_openat2+0x2c4/0x530
[<0000000094eea458>] __arm64_sys_openat+0x130/0x1c4
[<00000000a71d7881>] el0_svc_common.constprop.0+0xfc/0x394
[<00000000313647bf>] do_el0_svc+0xac/0xec
[<000000002ef1c651>] el0_svc+0x20/0x30
[<000000002fd4692a>] el0_sync_handler+0xb0/0xb4
[<000000000c309c35>] el0_sync+0x160/0x180
The root cause is descripted as follows:
__tracing_open() { // 1. File 'trace' is being opened;
...
*iter->trace = *tr->current_trace; // 2. Tracer 'function_graph' is
// currently set;
...
iter->trace->open(iter); // 3. Call graph_trace_open() here,
// and memory are allocated in it;
...
}
s_start() { // 4. The opened file is being read;
...
*iter->trace = *tr->current_trace; // 5. If tracer is switched to
// 'nop' or others, then memory
// in step 3 are leaked!!!
...
}
To fix it, in s_start(), close tracer before switching then reopen the
new tracer after switching. And some tracers like 'wakeup' may not update
'iter->private' in some cases when reopen, then it should be cleared
to avoid being mistakenly closed again.
Link: https://lore.kernel.org/linux-trace-kernel/20230817125539.1646321-1-zhengyejian1@huawei.com
Fixes:
|
||
|
9c2ceffd4e |
tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
[ Upstream commit b71645d6af10196c46cbe3732de2ea7d36b3ff6d ]
Trace ring buffer can no longer record anything after executing
following commands at the shell prompt:
# cd /sys/kernel/tracing
# cat tracing_cpumask
fff
# echo 0 > tracing_cpumask
# echo 1 > snapshot
# echo fff > tracing_cpumask
# echo 1 > tracing_on
# echo "hello world" > trace_marker
-bash: echo: write error: Bad file descriptor
The root cause is that:
1. After `echo 0 > tracing_cpumask`, 'record_disabled' of cpu buffers
in 'tr->array_buffer.buffer' became 1 (see tracing_set_cpumask());
2. After `echo 1 > snapshot`, 'tr->array_buffer.buffer' is swapped
with 'tr->max_buffer.buffer', then the 'record_disabled' became 0
(see update_max_tr());
3. After `echo fff > tracing_cpumask`, the 'record_disabled' become -1;
Then array_buffer and max_buffer are both unavailable due to value of
'record_disabled' is not 0.
To fix it, enable or disable both array_buffer and max_buffer at the same
time in tracing_set_cpumask().
Link: https://lkml.kernel.org/r/20230805033816.3284594-2-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <vnagarnaik@google.com>
Cc: <shuah@kernel.org>
Fixes:
|
||
|
b23fd871be |
Merge 5.10.192 into android12-5.10-lts
Changes in 5.10.192 mmc: sdhci-f-sdh30: Replace with sdhci_pltfm macsec: Fix traffic counters/statistics macsec: use DEV_STATS_INC() net/mlx5: Refactor init clock function net/mlx5: Move all internal timer metadata into a dedicated struct net/mlx5: Skip clock update work when device is in error state drm/radeon: Fix integer overflow in radeon_cs_parser_init ALSA: emu10k1: roll up loops in DSP setup code for Audigy ASoC: Intel: sof_sdw: add quirk for MTL RVP ASoC: Intel: sof_sdw: add quirk for LNL RVP PCI: tegra194: Fix possible array out of bounds access ARM: dts: imx6dl: prtrvt, prtvt7, prti6q, prtwd2: fix USB related warnings ASoC: Intel: sof_sdw: Add support for Rex soundwire iopoll: Call cpu_relax() in busy loops quota: Properly disable quotas when add_dquot_ref() fails quota: fix warning in dqgrab() dma-remap: use kvmalloc_array/kvfree for larger dma memory remap drm/amdgpu: install stub fence into potential unused fence pointers HID: add quirk for 03f0:464a HP Elite Presenter Mouse RDMA/mlx5: Return the firmware result upon destroying QP/RQ ovl: check type and offset of struct vfsmount in ovl_entry udf: Fix uninitialized array access for some pathnames fs: jfs: Fix UBSAN: array-index-out-of-bounds in dbAllocDmapLev MIPS: dec: prom: Address -Warray-bounds warning FS: JFS: Fix null-ptr-deref Read in txBegin FS: JFS: Check for read-only mounted filesystem in txBegin media: v4l2-mem2mem: add lock to protect parameter num_rdy usb: gadget: u_serial: Avoid spinlock recursion in __gs_console_push media: platform: mediatek: vpu: fix NULL ptr dereference usb: chipidea: imx: don't request QoS for imx8ulp usb: chipidea: imx: add missing USB PHY DPDM wakeup setting gfs2: Fix possible data races in gfs2_show_options() pcmcia: rsrc_nonstatic: Fix memory leak in nonstatic_release_resource_db() Bluetooth: L2CAP: Fix use-after-free Bluetooth: btusb: Add MT7922 bluetooth ID for the Asus Ally drm/amdgpu: Fix potential fence use-after-free v2 ALSA: hda/realtek: Add quirks for Unis H3C Desktop B760 & Q760 ALSA: hda: fix a possible null-pointer dereference due to data race in snd_hdac_regmap_sync() powerpc/kasan: Disable KCOV in KASAN code ring-buffer: Do not swap cpu_buffer during resize process IMA: allow/fix UML builds iio: add addac subdirectory dt-bindings: iio: add AD74413R iio: adc: stx104: Utilize iomap interface iio: adc: stx104: Implement and utilize register structures iio: addac: stx104: Fix race condition for stx104_write_raw() iio: addac: stx104: Fix race condition when converting analog-to-digital bus: mhi: Add MHI PCI support for WWAN modems bus: mhi: Add MMIO region length to controller structure bus: mhi: Move host MHI code to "host" directory bus: mhi: host: Range check CHDBOFF and ERDBOFF irqchip/mips-gic: Get rid of the reliance on irq_cpu_online() irqchip/mips-gic: Use raw spinlock for gic_lock usb: gadget: udc: core: Introduce check_config to verify USB configuration usb: cdns3: allocate TX FIFO size according to composite EP number usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM USB: dwc3: qcom: fix NULL-deref on suspend mmc: bcm2835: fix deferred probing mmc: sunxi: fix deferred probing mmc: core: add devm_mmc_alloc_host mmc: meson-gx: use devm_mmc_alloc_host mmc: meson-gx: fix deferred probing tracing/probes: Have process_fetch_insn() take a void * instead of pt_regs tracing/probes: Fix to update dynamic data counter if fetcharg uses it virtio-mmio: Use to_virtio_mmio_device() to simply code virtio-mmio: don't break lifecycle of vm_dev i2c: bcm-iproc: Fix bcm_iproc_i2c_isr deadlock issue fbdev: mmp: fix value check in mmphw_probe() powerpc/rtas_flash: allow user copy to flash block cache objects tty: n_gsm: fix the UAF caused by race condition in gsm_cleanup_mux tty: serial: fsl_lpuart: Clear the error flags by writing 1 for lpuart32 platforms btrfs: fix BUG_ON condition in btrfs_cancel_balance i2c: designware: Handle invalid SMBus block data response length value net: xfrm: Fix xfrm_address_filter OOB read net: af_key: fix sadb_x_filter validation net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure xfrm: fix slab-use-after-free in decode_session6 ip6_vti: fix slab-use-after-free in decode_session6 ip_vti: fix potential slab-use-after-free in decode_session6 xfrm: add NULL check in xfrm_update_ae_params xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH selftests: mirror_gre_changes: Tighten up the TTL test match drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs ipvs: fix racy memcpy in proc_do_sync_threshold netfilter: nft_dynset: disallow object maps net: phy: broadcom: stub c45 read/write for 54810 team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves i40e: fix misleading debug logs net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset sock: Fix misuse of sk_under_memory_pressure() net: do not allow gso_size to be set to GSO_BY_FRAGS bus: ti-sysc: Flush posted write on enable before reset arm64: dts: rockchip: fix supplies on rk3399-rock-pi-4 arm64: dts: rockchip: use USB host by default on rk3399-rock-pi-4 arm64: dts: rockchip: add ES8316 codec for ROCK Pi 4 arm64: dts: rockchip: add SPDIF node for ROCK Pi 4 arm64: dts: rockchip: fix regulator name on rk3399-rock-4 arm64: dts: rockchip: sort nodes/properties on rk3399-rock-4 arm64: dts: rockchip: Disable HS400 for eMMC on ROCK Pi 4 ASoC: rt5665: add missed regulator_bulk_disable ASoC: meson: axg-tdm-formatter: fix channel slot allocation ALSA: hda/realtek - Remodified 3k pull low procedure serial: 8250: Fix oops for port->pm on uart_change_pm() ALSA: usb-audio: Add support for Mythware XA001AU capture and playback interfaces. cifs: Release folio lock on fscache read hit. mmc: wbsd: fix double mmc_free_host() in wbsd_init() mmc: block: Fix in_flight[issue_type] value error netfilter: set default timeout to 3 secs for sctp shutdown send and recv state af_unix: Fix null-ptr-deref in unix_stream_sendpage(). virtio-net: set queues after driver_ok net: fix the RTO timer retransmitting skb every 1ms if linear option is enabled mmc: f-sdh30: fix order of function calls in sdhci_f_sdh30_remove x86/cpu: Fix __x86_return_thunk symbol type x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk() x86/alternative: Make custom return thunk unconditional objtool: Add frame-pointer-specific function ignore x86/ibt: Add ANNOTATE_NOENDBR x86/cpu: Clean up SRSO return thunk mess x86/cpu: Rename original retbleed methods x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 x86/cpu: Cleanup the untrain mess x86/srso: Explain the untraining sequences a bit more x86/static_call: Fix __static_call_fixup() x86/retpoline: Don't clobber RFLAGS during srso_safe_ret() x86/CPU/AMD: Fix the DIV(0) initial fix attempt x86/srso: Disable the mitigation on unaffected configurations x86/retpoline,kprobes: Fix position of thunk sections with CONFIG_LTO_CLANG objtool/x86: Fixup frame-pointer vs rethunk x86/srso: Correct the mitigation status when SMT is disabled Linux 5.10.192 Change-Id: Id6dcc6748bce39baa640b8f0c3764d1d95643016 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
412095349f |
Merge 5.10.191 into android12-5.10-lts
Changes in 5.10.191
wireguard: allowedips: expand maximum node depth
mmc: moxart: read scr register without changing byte order
ipv6: adjust ndisc_is_useropt() to also return true for PIO
bpf: allow precision tracking for programs with subprogs
bpf: stop setting precise in current state
bpf: aggressively forget precise markings during state checkpointing
selftests/bpf: make test_align selftest more robust
selftests/bpf: Workaround verification failure for fexit_bpf2bpf/func_replace_return_code
selftests/bpf: Fix sk_assign on s390x
dmaengine: pl330: Return DMA_PAUSED when transaction is paused
riscv,mmio: Fix readX()-to-delay() ordering
drm/nouveau/gr: enable memory loads on helper invocation on all channels
drm/shmem-helper: Reset vma->vm_ops before calling dma_buf_mmap()
drm/amd/display: check attr flag before set cursor degamma on DCN3+
hwmon: (pmbus/bel-pfe) Enable PMBUS_SKIP_STATUS_CHECK for pfe1100
radix tree test suite: fix incorrect allocation size for pthreads
x86/pkeys: Revert
|
||
|
0dd121e0e6 |
Revert "tracing: Show real address for trace event arguments"
This reverts commit
|
||
|
06fab437d7 |
Revert "tracing: Fix sleeping while atomic in kdb ftdump"
This reverts commit
|
||
|
3b76d92636 |
tracing/probes: Fix to update dynamic data counter if fetcharg uses it
[ Upstream commit e38e2c6a9efc435f9de344b7c91f7697e01b47d5 ]
Fix to update dynamic data counter ('dyndata') and max length ('maxlen')
only if the fetcharg uses the dynamic data. Also get out arg->dynamic
from unlikely(). This makes dynamic data address wrong if
process_fetch_insn() returns error on !arg->dynamic case.
Link: https://lore.kernel.org/all/168908494781.123124.8160245359962103684.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/20230710233400.5aaf024e@gandalf.local.home/
Fixes:
|
||
|
265a979ded |
tracing/probes: Have process_fetch_insn() take a void * instead of pt_regs
[ Upstream commit 8565a45d0858078b63c7d84074a21a42ba9ebf01 ] In preparation to allow event probes to use the process_fetch_insn() callback in trace_probe_tmpl.h, change the data passed to it from a pointer to pt_regs, as the event probe will not be using regs, and make it a void pointer instead. Update the process_fetch_insn() callers for kprobe and uprobe events to have the regs defined in the function and just typecast the void pointer parameter. Link: https://lkml.kernel.org/r/20210819041842.291622924@goodmis.org Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Stable-dep-of: e38e2c6a9efc ("tracing/probes: Fix to update dynamic data counter if fetcharg uses it") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
66a3b2a121 |
ring-buffer: Do not swap cpu_buffer during resize process
[ Upstream commit 8a96c0288d0737ad77882024974c075345c72011 ] When ring_buffer_swap_cpu was called during resize process, the cpu buffer was swapped in the middle, resulting in incorrect state. Continuing to run in the wrong state will result in oops. This issue can be easily reproduced using the following two scripts: /tmp # cat test1.sh //#! /bin/sh for i in `seq 0 100000` do echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 done /tmp # cat test2.sh //#! /bin/sh for i in `seq 0 100000` do echo irqsoff > /sys/kernel/debug/tracing/current_tracer sleep 1 echo nop > /sys/kernel/debug/tracing/current_tracer sleep 1 done /tmp # ./test1.sh & /tmp # ./test2.sh & A typical oops log is as follows, sometimes with other different oops logs. [ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8 [ 231.713375] Modules linked in: [ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 231.716750] Hardware name: linux,dummy-virt (DT) [ 231.718152] Workqueue: events update_pages_handler [ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 231.721171] pc : rb_update_pages+0x378/0x3f8 [ 231.722212] lr : rb_update_pages+0x25c/0x3f8 [ 231.723248] sp : ffff800082b9bd50 [ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0 [ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a [ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000 [ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510 [ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002 [ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558 [ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001 [ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000 [ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208 [ 231.744196] Call trace: [ 231.744892] rb_update_pages+0x378/0x3f8 [ 231.745893] update_pages_handler+0x1c/0x38 [ 231.746893] process_one_work+0x1f0/0x468 [ 231.747852] worker_thread+0x54/0x410 [ 231.748737] kthread+0x124/0x138 [ 231.749549] ret_from_fork+0x10/0x20 [ 231.750434] ---[ end trace 0000000000000000 ]--- [ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 233.721696] Mem abort info: [ 233.721935] ESR = 0x0000000096000004 [ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits [ 233.722596] SET = 0, FnV = 0 [ 233.722805] EA = 0, S1PTW = 0 [ 233.723026] FSC = 0x04: level 0 translation fault [ 233.723458] Data abort info: [ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000 [ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 233.726720] Modules linked in: [ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 233.727777] Hardware name: linux,dummy-virt (DT) [ 233.728225] Workqueue: events update_pages_handler [ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 233.729054] pc : rb_update_pages+0x1a8/0x3f8 [ 233.729334] lr : rb_update_pages+0x154/0x3f8 [ 233.729592] sp : ffff800082b9bd50 [ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418 [ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003 [ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58 [ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001 [ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000 [ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c [ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0 [ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000 [ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000 [ 233.734418] Call trace: [ 233.734593] rb_update_pages+0x1a8/0x3f8 [ 233.734853] update_pages_handler+0x1c/0x38 [ 233.735148] process_one_work+0x1f0/0x468 [ 233.735525] worker_thread+0x54/0x410 [ 233.735852] kthread+0x124/0x138 [ 233.736064] ret_from_fork+0x10/0x20 [ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060) [ 233.736959] ---[ end trace 0000000000000000 ]--- After analysis, the seq of the error is as follows [1-5]: int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size, int cpu_id) { for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //1. get cpu_buffer, aka cpu_buffer(A) ... ... schedule_work_on(cpu, &cpu_buffer->update_pages_work); //2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to // update_pages_handler, do the update process, set 'update_done' in // complete(&cpu_buffer->update_done) and to wakeup resize process. //----> //3. Just at this moment, ring_buffer_swap_cpu is triggered, //cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer. //ring_buffer_swap_cpu is called as the 'Call trace' below. Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x18/0x28 dump_stack+0x12c/0x188 ring_buffer_swap_cpu+0x2f8/0x328 update_max_tr_single+0x180/0x210 check_critical_timing+0x2b4/0x2c8 tracer_hardirqs_on+0x1c0/0x200 trace_hardirqs_on+0xec/0x378 el0_svc_common+0x64/0x260 do_el0_svc+0x90/0xf8 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb8 el0_sync+0x180/0x1c0 //<---- /* wait for all the updates to complete */ for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //4. get cpu_buffer, cpu_buffer(B) is used in the following process, //the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong. //for example, cpu_buffer(A)->update_done will leave be set 1, and will //not 'wait_for_completion' at the next resize round. if (!cpu_buffer->nr_pages_to_update) continue; if (cpu_online(cpu)) wait_for_completion(&cpu_buffer->update_done); cpu_buffer->nr_pages_to_update = 0; } ... } //5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong, //Continuing to run in the wrong state, then oops occurs. Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
04e774fb67 |
dma-remap: use kvmalloc_array/kvfree for larger dma memory remap
[ Upstream commit 51ff97d54f02b4444dfc42e380ac4c058e12d5dd ] If dma_direct_alloc() alloc memory in size of 64MB, the inner function dma_common_contiguous_remap() will allocate 128KB memory by invoking the function kmalloc_array(). and the kmalloc_array seems to fail to try to allocate 128KB mem. Call trace: [14977.928623] qcrosvm: page allocation failure: order:5, mode:0x40cc0 [14977.928638] dump_backtrace.cfi_jt+0x0/0x8 [14977.928647] dump_stack_lvl+0x80/0xb8 [14977.928652] warn_alloc+0x164/0x200 [14977.928657] __alloc_pages_slowpath+0x9f0/0xb4c [14977.928660] __alloc_pages+0x21c/0x39c [14977.928662] kmalloc_order+0x48/0x108 [14977.928666] kmalloc_order_trace+0x34/0x154 [14977.928668] __kmalloc+0x548/0x7e4 [14977.928673] dma_direct_alloc+0x11c/0x4f8 [14977.928678] dma_alloc_attrs+0xf4/0x138 [14977.928680] gh_vm_ioctl_set_fw_name+0x3c4/0x610 [gunyah] [14977.928698] gh_vm_ioctl+0x90/0x14c [gunyah] [14977.928705] __arm64_sys_ioctl+0x184/0x210 work around by doing kvmalloc_array instead. Signed-off-by: Gao Xu <gaoxu2@hihonor.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
df0f5bd7a8 |
This is the 5.10.190 stable release
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmTWBikACgkQONu9yGCS
aT7+sBAAgFSUVSAURxIgHfi1XvXnGI/eZI/0ub8AqkWRdMUVbpqXR0hsq1/KP5sj
zz6l4QTfFyGm5EQPv3dEDi3Nztsb2dQEaJTIKQV6N4kY133fn6sYIxQE9Rb1jHcp
UoI/WbiOAwRqKd5glsPANByen6U4871Y1h2drBgMXYzP1fVEhPCVxz1XhAyZlVTw
n94774DCiSXaVUt82r5IPOV2HfZAnik4BThCD3oE1v9qN2ugow51GA11R9Mf49lR
mthPqsMhOHl+IPgB+3VaL+DCDDbs6b+izBCNSy5Jdupqsd22RuuLE3a8gSgZSoYB
3NrIqwOPUuVX3yUsvaTfklXMDU8RyKGEAIK/njW2NSFv8ccMa1lymhXnrkjSd7AP
BdHVfpB9k3WJQt6ElMfH6S8KszWFyLgolsT7cG3osi7VgsbDuGW30ApTl7WI7nDB
QgoM3X2FfYYZ8uc7+fT0PU8tk098VqB0c4PQbigN7i0l0J6o1PPrg4Ke9++qaSAs
PZ+rUha4+9whgjjyF05sWSScgwo0Az4qEx5r1Igf3lfDOUnfIPy3cKd+O1zpjrt5
6ZO3CQkYM4IO8mP5ps2t4rJvZuPfAVBi3Ndgtt/JbLlkxbkuhNzTE1TKA7UOZRAW
RK6YlzhHsR5LQEOZb6pVJtO2HTaJSBjTrp4W+xj/e6k51z+I4Co=
=RCz+
-----END PGP SIGNATURE-----
Merge 5.10.190 into android12-5.10-lts
Changes in 5.10.190
KVM: s390: pv: fix index value of replaced ASCE
io_uring: don't audit the capability check in io_uring_create()
gpio: tps68470: Make tps68470_gpio_output() always set the initial value
btrfs: fix race between quota disable and relocation
btrfs: fix extent buffer leak after tree mod log failure at split_node()
i2c: Delete error messages for failed memory allocations
i2c: Improve size determinations
i2c: nomadik: Remove unnecessary goto label
i2c: nomadik: Use devm_clk_get_enabled()
i2c: nomadik: Remove a useless call in the remove function
PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
PCI/ASPM: Factor out pcie_wait_for_retrain()
PCI/ASPM: Avoid link retraining race
dlm: cleanup plock_op vs plock_xop
dlm: rearrange async condition return
fs: dlm: interrupt posix locks only when process is killed
drm/ttm: add ttm_bo_pin()/ttm_bo_unpin() v2
drm/ttm: never consider pinned BOs for eviction&swap
tracing: Show real address for trace event arguments
pwm: meson: Simplify duplicated per-channel tracking
pwm: meson: fix handling of period/duty if greater than UINT_MAX
ext4: fix to check return value of freeze_bdev() in ext4_shutdown()
phy: qcom-snps: Use dev_err_probe() to simplify code
phy: qcom-snps: correct struct qcom_snps_hsphy kerneldoc
phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend
phy: qcom-snps-femto-v2: properly enable ref clock
media: staging: atomisp: select V4L2_FWNODE
i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
net: phy: marvell10g: fix 88x3310 power up
net: hns3: reconstruct function hclge_ets_validate()
net: hns3: fix wrong bw weight of disabled tc issue
vxlan: move to its own directory
vxlan: calculate correct header length for GPE
phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
ethernet: atheros: fix return value check in atl1e_tso_csum()
ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
tcp: Reduce chance of collisions in inet6_hashfn().
ice: Fix memory management in ice_ethtool_fdir.c
bonding: reset bond's flags when down link is P2P device
team: reset team's flags when down link is P2P device
platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
netfilter: nft_set_rbtree: fix overlap expiration walk
netfilter: nftables: add helper function to validate set element data
netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
net/sched: mqprio: refactor nlattr parsing to a separate function
net/sched: mqprio: add extack to mqprio_parse_nlattr()
net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
benet: fix return value check in be_lancer_xmit_workarounds()
tipc: check return value of pskb_trim()
tipc: stop tipc crypto on failure in tipc_node_create
RDMA/mlx4: Make check for invalid flags stricter
drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
drm/msm/adreno: Fix snapshot BINDLESS_DATA size
RDMA/mthca: Fix crash when polling CQ for shared QPs
drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
ASoC: fsl_spdif: Silence output on stop
block: Fix a source code comment in include/uapi/linux/blkzoned.h
dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
dm raid: clean up four equivalent goto tags in raid_ctr()
dm raid: protect md_stop() with 'reconfig_mutex'
ata: pata_ns87415: mark ns87560_tf_read static
ring-buffer: Fix wrong stat of cpu_buffer->read
tracing: Fix warning in trace_buffered_event_disable()
Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
USB: gadget: Fix the memory leak in raw_gadget driver
serial: qcom-geni: drop bogus runtime pm state update
serial: 8250_dw: Preserve original value of DLF register
serial: sifive: Fix sifive_serial_console_setup() section
USB: serial: option: support Quectel EM060K_128
USB: serial: option: add Quectel EC200A module support
USB: serial: simple: add Kaufmann RKS+CAN VCP
USB: serial: simple: sort driver entries
can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
usb: dwc3: don't reset device side if dwc3 was configured as host-only
usb: ohci-at91: Fix the unhandle interrupt when resume
USB: quirks: add quirk for Focusrite Scarlett
usb: xhci-mtk: set the dma max_seg_size
Revert "usb: xhci: tegra: Fix error check"
Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
Documentation: security-bugs.rst: clarify CVE handling
staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
tty: n_gsm: fix UAF in gsm_cleanup_mux
ALSA: hda/relatek: Enable Mute LED on HP 250 G8
hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
btrfs: check for commit error at btrfs_attach_transaction_barrier()
file: always lock position for FMODE_ATOMIC_POS
nfsd: Remove incorrect check in nfsd4_validate_stateid
tpm_tis: Explicitly check for error code
irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
KVM: VMX: Invert handling of CR0.WP for EPT without unrestricted guest
KVM: VMX: Fold ept_update_paging_mode_cr0() back into vmx_set_cr0()
KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
staging: rtl8712: Use constants from <linux/ieee80211.h>
staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
virtio-net: fix race between set queues and probe
s390/dasd: fix hanging device after quiesce/resume
ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
ceph: never send metrics if disable_send_metrics is set
dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
drm/ttm: make ttm_bo_unpin more defensive
ACPI: processor: perflib: Use the "no limit" frequency QoS
ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
cpufreq: intel_pstate: Drop ACPI _PSS states table patching
selftests: mptcp: depend on SYN_COOKIES
io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
ASoC: cs42l51: fix driver to properly autoload with automatic module loading
kprobes/x86: Fix fall-through warnings for Clang
x86/kprobes: Do not decode opcode in resume_execution()
x86/kprobes: Retrieve correct opcode for group instruction
x86/kprobes: Identify far indirect JMP correctly
x86/kprobes: Use int3 instead of debug trap for single-step
x86/kprobes: Fix to identify indirect jmp and others using range case
x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
x86/kprobes: Update kcb status flag after singlestepping
x86/kprobes: Fix JNG/JNLE emulation
io_uring: gate iowait schedule on having pending requests
perf: Fix function pointer case
loop: Select I/O scheduler 'none' from inside add_disk()
arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux
word-at-a-time: use the same return type for has_zero regardless of endianness
KVM: s390: fix sthyi error handling
wifi: cfg80211: Fix return value in scan logic
net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx
net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer()
bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length
net: dsa: fix value check in bcm_sf2_sw_probe()
perf test uprobe_from_different_cu: Skip if there is no gcc
net: sched: cls_u32: Fix match key mis-addressing
mISDN: hfcpci: Fix potential deadlock on &hc->lock
net: annotate data-races around sk->sk_max_pacing_rate
net: add missing READ_ONCE(sk->sk_rcvlowat) annotation
net: add missing READ_ONCE(sk->sk_sndbuf) annotation
net: add missing READ_ONCE(sk->sk_rcvbuf) annotation
net: add missing data-race annotations around sk->sk_peek_off
net: add missing data-race annotation for sk_ll_usec
net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free
net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free
net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free
bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
net: ll_temac: Switch to use dev_err_probe() helper
net: ll_temac: fix error checking of irq_of_parse_and_map()
net: netsec: Ignore 'phy-mode' on SynQuacer in DT mode
net: dcb: choose correct policy to parse DCB_ATTR_BCN
s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
ip6mr: Fix skb_under_panic in ip6mr_cache_report()
vxlan: Fix nexthop hash size
net/mlx5: fs_core: Make find_closest_ft more generic
net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
tcp_metrics: fix addr_same() helper
tcp_metrics: annotate data-races around tm->tcpm_stamp
tcp_metrics: annotate data-races around tm->tcpm_lock
tcp_metrics: annotate data-races around tm->tcpm_vals[]
tcp_metrics: annotate data-races around tm->tcpm_net
tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
scsi: zfcp: Defer fc_rport blocking until after ADISC response
libceph: fix potential hang in ceph_osdc_notify()
USB: zaurus: Add ID for A-300/B-500/C-700
ceph: defer stopping mdsc delayed_work
exfat: use kvmalloc_array/kvfree instead of kmalloc_array/kfree
exfat: release s_lock before calling dir_emit()
mtd: spinand: toshiba: Fix ecc_get_status
mtd: rawnand: meson: fix OOB available bytes for ECC
arm64: dts: stratix10: fix incorrect I2C property for SCL signal
net: tun_chr_open(): set sk_uid from current_fsuid()
net: tap_open(): set sk_uid from current_fsuid()
bpf: Disable preemption in bpf_event_output
open: make RESOLVE_CACHED correctly test for O_TMPFILE
drm/ttm: check null pointer before accessing when swapping
file: reinstate f_pos locking optimization for regular files
tracing: Fix sleeping while atomic in kdb ftdump
fs/sysv: Null check to prevent null-ptr-deref bug
Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb
net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb
fs: Protect reconfiguration of sb read-write from racing writes
ext2: Drop fragment support
mtd: rawnand: omap_elm: Fix incorrect type in assignment
mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op()
powerpc/mm/altmap: Fix altmap boundary check
selftests/rseq: check if libc rseq support is registered
selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
soundwire: bus: add better dev_dbg to track complete() calls
soundwire: bus: pm_runtime_request_resume on peripheral attachment
soundwire: fix enumeration completion
PM / wakeirq: support enabling wake-up irq after runtime_suspend called
PM: sleep: wakeirq: fix wake irq arming
exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
exfat: support dynamic allocate bh for exfat_entry_set_cache
exfat: check if filename entries exceeds max filename length
mt76: move band capabilities in mt76_phy
mt76: mt7615: Fix fall-through warnings for Clang
wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC)
ARM: dts: imx: add usb alias
ARM: dts: imx6sll: fixup of operating points
ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
x86/CPU/AMD: Do not leak quotient data after a division by 0
Linux 5.10.190
Fix up build problem in ext4 due to merge of
|
||
|
f50fa8d8ce |
This is the 5.10.189 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmTSghkACgkQONu9yGCS aT49xRAAgJkv4gAs4lUP9EkwLzBDO59LDHtuihPRWDUHYIkuSvuwxDc6BlXq3xvc IYBKuej6NfLHmyBPBJ5vVXVQvk9eTVASFlShRdCQS1QuvSt4/FUBuLTaz/gfVziL 5Hemr04LA/QSZj44h3va5EZq4g3SIxN1V9dfiACUAUHPdDOsw2pEnm4eSYTvb6XB sU1aSrERP9pegbqNKX4o9WVWYFJOKSi8QeBSTDVx0Q2PV3/MbffRARGzQ+wgGJp0 Q67y7LX7Xpkn52koTMJJYt2Ior3OZNdWwRkKShDjizu2gM0ZfEBd2gdGAXpfJIa2 thYuAusaYs4DpVwTC5NqxE8iiIqOnTfzji7JN5jwhHRmhs5KxOwcO2kSM6hwOGws 6iUznxp8Js6T4YPIjcDdl/GEDdA1Uzcy0DnUOP43WJKqpihBAS1z06FCVTsUQfKu Bt1v2rT3+riZ8jFgniqF8wkxEXG1OpLOOq1BIoYdKHWfS86YAdOMVrFOsVEnJywY akrQOW+6mMrRJ6O7Td3Yn8qYhy2NLcMy9S8xgrA5ewM2BbG5/OrKQk98Z8YBYj3V 3+WUyXEZ0tSpXd9TqkvAP4zYMye9XvuCctUFzIu5izUxobXzYLYN/2nDZaTyK3cU fVOvLdef8P4AJkU5kynzS89oQj0Kz8qKgyI94R8cu0ovFXJv+cM= =lV1+ -----END PGP SIGNATURE----- Merge 5.10.189 into android12-5.10-lts Changes in 5.10.189 init: Provide arch_cpu_finalize_init() x86/cpu: Switch to arch_cpu_finalize_init() ARM: cpu: Switch to arch_cpu_finalize_init() ia64/cpu: Switch to arch_cpu_finalize_init() m68k/cpu: Switch to arch_cpu_finalize_init() mips/cpu: Switch to arch_cpu_finalize_init() sh/cpu: Switch to arch_cpu_finalize_init() sparc/cpu: Switch to arch_cpu_finalize_init() um/cpu: Switch to arch_cpu_finalize_init() init: Remove check_bugs() leftovers init: Invoke arch_cpu_finalize_init() earlier init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init() x86/fpu: Remove cpuinfo argument from init functions x86/fpu: Mark init functions __init x86/fpu: Move FPU initialization into arch_cpu_finalize_init() x86/speculation: Add Gather Data Sampling mitigation x86/speculation: Add force option to GDS mitigation x86/speculation: Add Kconfig option for GDS KVM: Add GDS_NO support to KVM x86/xen: Fix secondary processors' FPU initialization x86/mm: fix poking_init() for Xen PV guests x86/mm: Use mm_alloc() in poking_init() mm: Move mm_cachep initialization to mm_init() x86/mm: Initialize text poking earlier Documentation/x86: Fix backwards on/off logic about YMM support x86/cpu: Add VM page flush MSR availablility as a CPUID feature x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX] tools headers cpufeatures: Sync with the kernel sources x86/bugs: Increase the x86 bugs vector size to two u32s x86/cpu, kvm: Add support for CPUID_80000021_EAX x86/srso: Add a Speculative RAS Overflow mitigation x86/srso: Add IBPB_BRTYPE support x86/srso: Add SRSO_NO support x86/srso: Add IBPB x86/srso: Add IBPB on VMEXIT x86/srso: Fix return thunks in generated code x86/srso: Tie SBPB bit setting to microcode patch detection xen/netback: Fix buffer overrun triggered by unusual packet x86: fix backwards merge of GDS/SRSO bit Linux 5.10.189 Change-Id: Ibaf2cd3f0542d497374bcf135e9faf1791e9af5d Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
1952a4d5e4 |
bpf: aggressively forget precise markings during state checkpointing
[ Upstream commit 7a830b53c17bbadcf99f778f28aaaa4e6c41df5f ] Exploit the property of about-to-be-checkpointed state to be able to forget all precise markings up to that point even more aggressively. We now clear all potentially inherited precise markings right before checkpointing and branching off into child state. If any of children states require precise knowledge of any SCALAR register, those will be propagated backwards later on before this state is finalized, preserving correctness. There is a single selftests BPF program change, but tremendous one: 25x reduction in number of verified instructions and states in trace_virtqueue_add_sgs. Cilium results are more modest, but happen across wider range of programs. SELFTESTS RESULTS ================= $ ./veristat -C -e file,prog,insns,states ~/imprecise-early-results.csv ~/imprecise-aggressive-results.csv | grep -v '+0' File Program Total insns (A) Total insns (B) Total insns (DIFF) Total states (A) Total states (B) Total states (DIFF) ------------------- ----------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- loop6.bpf.linked1.o trace_virtqueue_add_sgs 398057 15114 -382943 (-96.20%) 8717 336 -8381 (-96.15%) ------------------- ----------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- CILIUM RESULTS ============== $ ./veristat -C -e file,prog,insns,states ~/imprecise-early-results-cilium.csv ~/imprecise-aggressive-results-cilium.csv | grep -v '+0' File Program Total insns (A) Total insns (B) Total insns (DIFF) Total states (A) Total states (B) Total states (DIFF) ------------- -------------------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- bpf_host.o tail_handle_nat_fwd_ipv4 23426 23221 -205 (-0.88%) 1537 1515 -22 (-1.43%) bpf_host.o tail_handle_nat_fwd_ipv6 13009 12904 -105 (-0.81%) 719 708 -11 (-1.53%) bpf_host.o tail_nodeport_nat_ingress_ipv6 5261 5196 -65 (-1.24%) 247 243 -4 (-1.62%) bpf_host.o tail_nodeport_nat_ipv6_egress 3446 3406 -40 (-1.16%) 203 198 -5 (-2.46%) bpf_lxc.o tail_handle_nat_fwd_ipv4 23426 23221 -205 (-0.88%) 1537 1515 -22 (-1.43%) bpf_lxc.o tail_handle_nat_fwd_ipv6 13009 12904 -105 (-0.81%) 719 708 -11 (-1.53%) bpf_lxc.o tail_ipv4_ct_egress 5074 4897 -177 (-3.49%) 255 248 -7 (-2.75%) bpf_lxc.o tail_ipv4_ct_ingress 5100 4923 -177 (-3.47%) 255 248 -7 (-2.75%) bpf_lxc.o tail_ipv4_ct_ingress_policy_only 5100 4923 -177 (-3.47%) 255 248 -7 (-2.75%) bpf_lxc.o tail_ipv6_ct_egress 4558 4536 -22 (-0.48%) 188 187 -1 (-0.53%) bpf_lxc.o tail_ipv6_ct_ingress 4578 4556 -22 (-0.48%) 188 187 -1 (-0.53%) bpf_lxc.o tail_ipv6_ct_ingress_policy_only 4578 4556 -22 (-0.48%) 188 187 -1 (-0.53%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 5261 5196 -65 (-1.24%) 247 243 -4 (-1.62%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 5261 5196 -65 (-1.24%) 247 243 -4 (-1.62%) bpf_overlay.o tail_nodeport_nat_ipv6_egress 3482 3442 -40 (-1.15%) 204 201 -3 (-1.47%) bpf_xdp.o tail_nodeport_nat_egress_ipv4 17200 15619 -1581 (-9.19%) 1111 1010 -101 (-9.09%) ------------- -------------------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221104163649.121784-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: ecdf985d7615 ("bpf: track immediate values written to stack by BPF_ST instruction") Signed-off-by: Pu Lehui <pulehui@huawei.com> Tested-by: Luiz Capitulino <luizcap@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
7ca3e7459f |
bpf: stop setting precise in current state
[ Upstream commit f63181b6ae79fd3b034cde641db774268c2c3acf ] Setting reg->precise to true in current state is not necessary from correctness standpoint, but it does pessimise the whole precision (or rather "imprecision", because that's what we want to keep as much as possible) tracking. Why is somewhat subtle and my best attempt to explain this is recorded in an extensive comment for __mark_chain_precise() function. Some more careful thinking and code reading is probably required still to grok this completely, unfortunately. Whiteboarding and a bunch of extra handwaiving in person would be even more helpful, but is deemed impractical in Git commit. Next patch pushes this imprecision property even further, building on top of the insights described in this patch. End results are pretty nice, we get reduction in number of total instructions and states verified due to a better states reuse, as some of the states are now more generic and permissive due to less unnecessary precise=true requirements. SELFTESTS RESULTS ================= $ ./veristat -C -e file,prog,insns,states ~/subprog-precise-results.csv ~/imprecise-early-results.csv | grep -v '+0' File Program Total insns (A) Total insns (B) Total insns (DIFF) Total states (A) Total states (B) Total states (DIFF) --------------------------------------- ---------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- bpf_iter_ksym.bpf.linked1.o dump_ksym 347 285 -62 (-17.87%) 20 19 -1 (-5.00%) pyperf600_bpf_loop.bpf.linked1.o on_event 3678 3736 +58 (+1.58%) 276 285 +9 (+3.26%) setget_sockopt.bpf.linked1.o skops_sockopt 4038 3947 -91 (-2.25%) 347 343 -4 (-1.15%) test_l4lb.bpf.linked1.o balancer_ingress 4559 2611 -1948 (-42.73%) 118 105 -13 (-11.02%) test_l4lb_noinline.bpf.linked1.o balancer_ingress 6279 6268 -11 (-0.18%) 237 236 -1 (-0.42%) test_misc_tcp_hdr_options.bpf.linked1.o misc_estab 1307 1303 -4 (-0.31%) 100 99 -1 (-1.00%) test_sk_lookup.bpf.linked1.o ctx_narrow_access 456 447 -9 (-1.97%) 39 38 -1 (-2.56%) test_sysctl_loop1.bpf.linked1.o sysctl_tcp_mem 1389 1384 -5 (-0.36%) 26 25 -1 (-3.85%) test_tc_dtime.bpf.linked1.o egress_fwdns_prio101 518 485 -33 (-6.37%) 51 46 -5 (-9.80%) test_tc_dtime.bpf.linked1.o egress_host 519 468 -51 (-9.83%) 50 44 -6 (-12.00%) test_tc_dtime.bpf.linked1.o ingress_fwdns_prio101 842 1000 +158 (+18.76%) 73 88 +15 (+20.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 405757 373173 -32584 (-8.03%) 25735 22882 -2853 (-11.09%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 479055 371590 -107465 (-22.43%) 29145 22207 -6938 (-23.81%) --------------------------------------- ---------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- Slight regression in test_tc_dtime.bpf.linked1.o/ingress_fwdns_prio101 is left for a follow up, there might be some more precision-related bugs in existing BPF verifier logic. CILIUM RESULTS ============== $ ./veristat -C -e file,prog,insns,states ~/subprog-precise-results-cilium.csv ~/imprecise-early-results-cilium.csv | grep -v '+0' File Program Total insns (A) Total insns (B) Total insns (DIFF) Total states (A) Total states (B) Total states (DIFF) ------------- ------------------------------ --------------- --------------- ------------------ ---------------- ---------------- ------------------- bpf_host.o cil_from_host 762 556 -206 (-27.03%) 43 37 -6 (-13.95%) bpf_host.o tail_handle_nat_fwd_ipv4 23541 23426 -115 (-0.49%) 1538 1537 -1 (-0.07%) bpf_host.o tail_nodeport_nat_egress_ipv4 33592 33566 -26 (-0.08%) 2163 2161 -2 (-0.09%) bpf_lxc.o tail_handle_nat_fwd_ipv4 23541 23426 -115 (-0.49%) 1538 1537 -1 (-0.07%) bpf_overlay.o tail_nodeport_nat_egress_ipv4 33581 33543 -38 (-0.11%) 2160 2157 -3 (-0.14%) bpf_xdp.o tail_handle_nat_fwd_ipv4 21659 20920 -739 (-3.41%) 1440 1376 -64 (-4.44%) bpf_xdp.o tail_handle_nat_fwd_ipv6 17084 17039 -45 (-0.26%) 907 905 -2 (-0.22%) bpf_xdp.o tail_lb_ipv4 73442 73430 -12 (-0.02%) 4370 4369 -1 (-0.02%) bpf_xdp.o tail_lb_ipv6 152114 151895 -219 (-0.14%) 6493 6479 -14 (-0.22%) bpf_xdp.o tail_nodeport_nat_egress_ipv4 17377 17200 -177 (-1.02%) 1125 1111 -14 (-1.24%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6405 6397 -8 (-0.12%) 309 308 -1 (-0.32%) bpf_xdp.o tail_rev_nodeport_lb4 7126 6934 -192 (-2.69%) 414 402 -12 (-2.90%) bpf_xdp.o tail_rev_nodeport_lb6 18059 17905 -154 (-0.85%) 1105 1096 -9 (-0.81%) ------------- ------------------------------ --------------- --------------- ------------------ ---------------- ---------------- ------------------- Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221104163649.121784-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: ecdf985d7615 ("bpf: track immediate values written to stack by BPF_ST instruction") Signed-off-by: Pu Lehui <pulehui@huawei.com> Tested-by: Luiz Capitulino <luizcap@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
2474ec58b9 |
bpf: allow precision tracking for programs with subprogs
[ Upstream commit be2ef8161572ec1973124ebc50f56dafc2925e07 ] Stop forcing precise=true for SCALAR registers when BPF program has any subprograms. Current restriction means that any BPF program, as soon as it uses subprograms, will end up not getting any of the precision tracking benefits in reduction of number of verified states. This patch keeps the fallback mark_all_scalars_precise() behavior if precise marking has to cross function frames. E.g., if subprogram requires R1 (first input arg) to be marked precise, ideally we'd need to backtrack to the parent function and keep marking R1 and its dependencies as precise. But right now we give up and force all the SCALARs in any of the current and parent states to be forced to precise=true. We can lift that restriction in the future. But this patch fixes two issues identified when trying to enable precision tracking for subprogs. First, prevent "escaping" from top-most state in a global subprog. While with entry-level BPF program we never end up requesting precision for R1-R5 registers, because R2-R5 are not initialized (and so not readable in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is implicitly precise. With global subprogs, though, it's different, as global subprog a) can have up to 5 SCALAR input arguments, which might get marked as precise=true and b) it is validated in isolation from its main entry BPF program. b) means that we can end up exhausting parent state chain and still not mark all registers in reg_mask as precise, which would lead to verifier bug warning. To handle that, we need to consider two cases. First, if the very first state is not immediately "checkpointed" (i.e., stored in state lookup hashtable), it will get correct first_insn_idx and last_insn_idx instruction set during state checkpointing. As such, this case is already handled and __mark_chain_precision() already handles that by just doing nothing when we reach to the very first parent state. st->parent will be NULL and we'll just stop. Perhaps some extra check for reg_mask and stack_mask is due here, but this patch doesn't address that issue. More problematic second case is when global function's initial state is immediately checkpointed before we manage to process the very first instruction. This is happening because when there is a call to global subprog from the main program the very first subprog's instruction is marked as pruning point, so before we manage to process first instruction we have to check and checkpoint state. This patch adds a special handling for such "empty" state, which is identified by having st->last_insn_idx set to -1. In such case, we check that we are indeed validating global subprog, and with some sanity checking we mark input args as precise if requested. Note that we also initialize state->first_insn_idx with correct start insn_idx offset. For main program zero is correct value, but for any subprog it's quite confusing to not have first_insn_idx set. This doesn't have any functional impact, but helps with debugging and state printing. We also explicitly initialize state->last_insns_idx instead of relying on is_state_visited() to do this with env->prev_insns_idx, which will be -1 on the very first instruction. This concludes necessary changes to handle specifically global subprog's precision tracking. Second identified problem was missed handling of BPF helper functions that call into subprogs (e.g., bpf_loop and few others). From precision tracking and backtracking logic's standpoint those are effectively calls into subprogs and should be called as BPF_PSEUDO_CALL calls. This patch takes the least intrusive way and just checks against a short list of current BPF helpers that do call subprogs, encapsulated in is_callback_calling_function() function. But to prevent accidentally forgetting to add new BPF helpers to this "list", we also do a sanity check in __check_func_call, which has to be called for each such special BPF helper, to validate that BPF helper is indeed recognized as callback-calling one. This should catch any missed checks in the future. Adding some special flags to be added in function proto definitions seemed like an overkill in this case. With the above changes, it's possible to remove forceful setting of reg->precise to true in __mark_reg_unknown, which turns on precision tracking both inside subprogs and entry progs that have subprogs. No warnings or errors were detected across all the selftests, but also when validating with veristat against internal Meta BPF objects and Cilium objects. Further, in some BPF programs there are noticeable reduction in number of states and instructions validated due to more effective precision tracking, especially benefiting syncookie test. $ ./veristat -C -e file,prog,insns,states ~/baseline-results.csv ~/subprog-precise-results.csv | grep -v '+0' File Program Total insns (A) Total insns (B) Total insns (DIFF) Total states (A) Total states (B) Total states (DIFF) ---------------------------------------- -------------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- pyperf600_bpf_loop.bpf.linked1.o on_event 3966 3678 -288 (-7.26%) 306 276 -30 (-9.80%) pyperf_global.bpf.linked1.o on_event 7563 7530 -33 (-0.44%) 520 517 -3 (-0.58%) pyperf_subprogs.bpf.linked1.o on_event 36358 36934 +576 (+1.58%) 2499 2531 +32 (+1.28%) setget_sockopt.bpf.linked1.o skops_sockopt 3965 4038 +73 (+1.84%) 343 347 +4 (+1.17%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 64965 64901 -64 (-0.10%) 4619 4612 -7 (-0.15%) test_misc_tcp_hdr_options.bpf.linked1.o misc_estab 1491 1307 -184 (-12.34%) 110 100 -10 (-9.09%) test_pkt_access.bpf.linked1.o test_pkt_access 354 349 -5 (-1.41%) 25 24 -1 (-4.00%) test_sock_fields.bpf.linked1.o egress_read_sock_fields 435 375 -60 (-13.79%) 22 20 -2 (-9.09%) test_sysctl_loop2.bpf.linked1.o sysctl_tcp_mem 1508 1501 -7 (-0.46%) 29 28 -1 (-3.45%) test_tc_dtime.bpf.linked1.o egress_fwdns_prio100 468 435 -33 (-7.05%) 45 41 -4 (-8.89%) test_tc_dtime.bpf.linked1.o ingress_fwdns_prio100 398 408 +10 (+2.51%) 42 39 -3 (-7.14%) test_tc_dtime.bpf.linked1.o ingress_fwdns_prio101 1096 842 -254 (-23.18%) 97 73 -24 (-24.74%) test_tcp_hdr_options.bpf.linked1.o estab 2758 2408 -350 (-12.69%) 208 181 -27 (-12.98%) test_urandom_usdt.bpf.linked1.o urand_read_with_sema 466 448 -18 (-3.86%) 31 28 -3 (-9.68%) test_urandom_usdt.bpf.linked1.o urand_read_without_sema 466 448 -18 (-3.86%) 31 28 -3 (-9.68%) test_urandom_usdt.bpf.linked1.o urandlib_read_with_sema 466 448 -18 (-3.86%) 31 28 -3 (-9.68%) test_urandom_usdt.bpf.linked1.o urandlib_read_without_sema 466 448 -18 (-3.86%) 31 28 -3 (-9.68%) test_xdp_noinline.bpf.linked1.o balancer_ingress_v6 4302 4294 -8 (-0.19%) 257 256 -1 (-0.39%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 583722 405757 -177965 (-30.49%) 35846 25735 -10111 (-28.21%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 609123 479055 -130068 (-21.35%) 35452 29145 -6307 (-17.79%) ---------------------------------------- -------------------------- --------------- --------------- ------------------ ---------------- ---------------- ------------------- Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221104163649.121784-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: ecdf985d7615 ("bpf: track immediate values written to stack by BPF_ST instruction") Signed-off-by: Pu Lehui <pulehui@huawei.com> Tested-by: Luiz Capitulino <luizcap@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
426656e8dd |
tracing: Fix sleeping while atomic in kdb ftdump
commit 495fcec8648cdfb483b5b9ab310f3839f07cb3b8 upstream. If you drop into kdb and type "ftdump" you'll get a sleeping while atomic warning from memory allocation in trace_find_next_entry(). This appears to have been caused by commit |
||
|
3048cb0dc0 |
bpf: Disable preemption in bpf_event_output
commit d62cc390c2e99ae267ffe4b8d7e2e08b6c758c32 upstream.
We received report [1] of kernel crash, which is caused by
using nesting protection without disabled preemption.
The bpf_event_output can be called by programs executed by
bpf_prog_run_array_cg function that disabled migration but
keeps preemption enabled.
This can cause task to be preempted by another one inside the
nesting protection and lead eventually to two tasks using same
perf_sample_data buffer and cause crashes like:
BUG: kernel NULL pointer dereference, address: 0000000000000001
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
...
? perf_output_sample+0x12a/0x9a0
? finish_task_switch.isra.0+0x81/0x280
? perf_event_output+0x66/0xa0
? bpf_event_output+0x13a/0x190
? bpf_event_output_data+0x22/0x40
? bpf_prog_dfc84bbde731b257_cil_sock4_connect+0x40a/0xacb
? xa_load+0x87/0xe0
? __cgroup_bpf_run_filter_sock_addr+0xc1/0x1a0
? release_sock+0x3e/0x90
? sk_setsockopt+0x1a1/0x12f0
? udp_pre_connect+0x36/0x50
? inet_dgram_connect+0x93/0xa0
? __sys_connect+0xb4/0xe0
? udp_setsockopt+0x27/0x40
? __pfx_udp_push_pending_frames+0x10/0x10
? __sys_setsockopt+0xdf/0x1a0
? __x64_sys_connect+0xf/0x20
? do_syscall_64+0x3a/0x90
? entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fixing this by disabling preemption in bpf_event_output.
[1] https://github.com/cilium/cilium/issues/26756
Cc: stable@vger.kernel.org
Reported-by: Oleg "livelace" Popov <o.popov@livelace.ru>
Closes: https://github.com/cilium/cilium/issues/26756
Fixes:
|
||
|
3f7395c382 |
perf: Fix function pointer case
commit 1af6239d1d3e61d33fd2f0ba53d3d1a67cc50574 upstream.
With the advent of CFI it is no longer acceptible to cast function
pointers.
The robot complains thusly:
kernel-events-core.c⚠️cast-from-int-(-)(struct-perf_cpu_pmu_context-)-to-remote_function_f-(aka-int-(-)(void-)-)-converts-to-incompatible-function-type
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Cixi Geng <cixi.geng1@unisoc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
||
|
a6d2fd1703 |
tracing: Fix warning in trace_buffered_event_disable()
[ Upstream commit dea499781a1150d285c62b26659f62fb00824fce ]
Warning happened in trace_buffered_event_disable() at
WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace:
? __warn+0xa5/0x1b0
? trace_buffered_event_disable+0x189/0x1b0
__ftrace_event_enable_disable+0x19e/0x3e0
free_probe_data+0x3b/0xa0
unregister_ftrace_function_probe_func+0x6b8/0x800
event_enable_func+0x2f0/0x3d0
ftrace_process_regex.isra.0+0x12d/0x1b0
ftrace_filter_write+0xe6/0x140
vfs_write+0x1c9/0x6f0
[...]
The cause of the warning is in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
Reproduction script show as below, for analysis, see the comments:
```
#!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was set;
# 2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
# 2. Enable the event registered, then:
# 1) SOFT_DISABLED_BIT was cleared;
# 2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
# set again!!!
cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was cleared again;
# 2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at
fist time soft-mode enabled, and call trace_buffered_event_disable() at
last time soft-mode disabled.
Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes:
|
||
|
0efbdbc453 |
ring-buffer: Fix wrong stat of cpu_buffer->read
[ Upstream commit 2d093282b0d4357373497f65db6a05eb0c28b7c8 ]
When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set
to 0 in order to make sure any read iterators reset themselves. However,
this will mess 'entries' stating, see following steps:
# cd /sys/kernel/tracing/
# 1. Enlarge ring buffer prepare for later reducing:
# echo 20 > per_cpu/cpu0/buffer_size_kb
# 2. Write a log into ring buffer of cpu0:
# taskset -c 0 echo "hello1" > trace_marker
# 3. Read the log:
# cat per_cpu/cpu0/trace_pipe
<...>-332 [000] ..... 62.406844: tracing_mark_write: hello1
# 4. Stop reading and see the stats, now 0 entries, and 1 event readed:
# cat per_cpu/cpu0/stats
entries: 0
[...]
read events: 1
# 5. Reduce the ring buffer
# echo 7 > per_cpu/cpu0/buffer_size_kb
# 6. Now entries became unexpected 1 because actually no entries!!!
# cat per_cpu/cpu0/stats
entries: 1
[...]
read events: 0
To fix it, introduce 'page_removed' field to count total removed pages
since last reset, then use it to let read iterators reset themselves
instead of changing the 'read' pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <vnagarnaik@google.com>
Fixes:
|
||
|
840ce9cfc8 |
tracing: Show real address for trace event arguments
[ Upstream commit efbbdaa22bb78761bff8dfdde027ad04bedd47ce ] To help debugging kernel, show real address for trace event arguments in tracefs/trace{,pipe} instead of hashed pointer value. Since ftrace human-readable format uses vsprintf(), all %p are translated to hash values instead of pointer address. However, when debugging the kernel, raw address value gives a hint when comparing with the memory mapping in the kernel. (Those are sometimes used with crash log, which is not hashed too) So converting %p with %px when calling trace_seq_printf(). Moreover, this is not improving the security because the tracefs can be used only by root user and the raw address values are readable from tracefs/percpu/cpu*/trace_pipe_raw file. Link: https://lkml.kernel.org/r/160277370703.29307.5134475491761971203.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Stable-dep-of: d5a821896360 ("tracing: Fix memory leak of iter->temp when reading trace_pipe") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
1ff14defdf |
mm: Move mm_cachep initialization to mm_init()
commit af80602799681c78f14fbe20b6185a56020dedee upstream. In order to allow using mm_alloc() much earlier, move initializing mm_cachep into mm_init(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221025201057.751153381@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
6ee042fd24 |
x86/mm: Use mm_alloc() in poking_init()
commit 3f4c8211d982099be693be9aa7d6fc4607dff290 upstream. Instead of duplicating init_mm, allocate a fresh mm. The advantage is that mm_alloc() has much simpler dependencies. Additionally it makes more conceptual sense, init_mm has no (and must not have) user state to duplicate. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221025201057.816175235@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |