Kernel for Galaxy S24, rebased on CLO sources (WIP)
Go to file
Thinh Tran a7d15ac085 net/tg3: fix race condition in tg3_reset_task()
[ Upstream commit 16b55b1f2269962fb6b5154b8bf43f37c9a96637 ]

When an EEH error is encountered by a PCI adapter, the EEH driver
modifies the PCI channel's state as shown below:

   enum {
      /* I/O channel is in normal state */
      pci_channel_io_normal = (__force pci_channel_state_t) 1,

      /* I/O to channel is blocked */
      pci_channel_io_frozen = (__force pci_channel_state_t) 2,

      /* PCI card is dead */
      pci_channel_io_perm_failure = (__force pci_channel_state_t) 3,
   };

If the same EEH error then causes the tg3 driver's transmit timeout
logic to execute, the tg3_tx_timeout() function schedules a reset
task via tg3_reset_task_schedule(), which may cause a race condition
between the tg3 and EEH driver as both attempt to recover the HW via
a reset action.

EEH driver gets error event
--> eeh_set_channel_state()
    and set device to one of
    error state above           scheduler: tg3_reset_task() get
                                returned error from tg3_init_hw()
                             --> dev_close() shuts down the interface
tg3_io_slot_reset() and
tg3_io_resume() fail to
reset/resume the device

To resolve this issue, we avoid the race condition by checking the PCI
channel state in the tg3_reset_task() function and skip the tg3 driver
initiated reset when the PCI channel is not in the normal state.  (The
driver has no access to tg3 device registers at this point and cannot
even complete the reset task successfully without external assistance.)
We'll leave the reset procedure to be managed by the EEH driver which
calls the tg3_io_error_detected(), tg3_io_slot_reset() and
tg3_io_resume() functions as appropriate.

Adding the same checking in tg3_dump_state() to avoid dumping all
device registers when the PCI channel is not in the normal state.

Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com>
Tested-by: Venkata Sai Duggi <venkata.sai.duggi@ibm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231201001911.656-1-thinhtr@linux.vnet.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-20 11:50:05 +01:00
arch arm64: dts: rockchip: fix rk356x pcie msg interrupt name 2024-01-20 11:50:04 +01:00
block blk-mq: don't count completed flush data request as inflight in case of quiesce 2024-01-20 11:50:04 +01:00
certs certs: Fix build error when PKCS#11 URI contains semicolon 2023-02-09 11:28:11 +01:00
crypto crypto: pcrypt - Fix hungtask for PADATA_RESET 2023-11-28 17:06:58 +00:00
Documentation dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp 2024-01-01 12:39:04 +00:00
drivers net/tg3: fix race condition in tg3_reset_task() 2024-01-20 11:50:05 +01:00
fs smb: client, common: fix fortify warnings 2024-01-20 11:50:04 +01:00
include wifi: avoid offset calculation on NULL pointer 2024-01-20 11:50:03 +01:00
init proc: sysctl: prevent aliased sysctls from getting passed to init 2023-11-28 17:07:08 +00:00
io_uring net: Declare MSG_SPLICE_PAGES internal sendmsg() flag 2024-01-10 17:10:27 +01:00
ipc ipc: fix memory leak in init_mqueue_fs() 2022-12-31 13:32:01 +01:00
kernel bpf: Fix a verifier bug due to incorrect branch offset comparison with cpu=v4 2024-01-10 17:10:36 +01:00
lib genirq/affinity: Only build SMP-only helper functions on SMP kernels 2024-01-10 17:10:36 +01:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: fix unmap_mapping_range high bits shift bug 2024-01-10 17:10:35 +01:00
net wifi: mac80211: handle 320 MHz in ieee80211_ht_cap_ie_to_sta_ht_cap 2024-01-20 11:50:03 +01:00
rust rust: allocator: Prevent mis-aligned allocation 2023-08-11 12:08:18 +02:00
samples fprobe: Pass entry_data to handlers 2023-10-25 12:03:12 +02:00
scripts sign-file: Fix incorrect return values check 2023-12-20 17:00:19 +01:00
security keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry 2024-01-01 12:38:58 +00:00
sound ASoC: hdac_hda: Conditionally register dais for HDMI and Analog 2024-01-20 11:50:05 +01:00
tools selftests: mptcp: set FAILING_LINKS in run_tests 2024-01-10 17:10:31 +01:00
usr usr/gen_init_cpio.c: remove unnecessary -1 values from int file 2022-10-03 14:21:44 -07:00
virt kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add() 2023-09-13 09:42:46 +02:00
.clang-format inet: ping: use hlist_nulls rcu iterator during lookup 2022-12-01 12:42:46 +01:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore Kbuild: add Rust support 2022-09-28 09:02:20 +02:00
.mailmap 9 hotfixes. 6 for MM, 3 for other areas. Four of these patches address 2022-12-10 17:10:52 -08:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Remove Michal Marek from Kbuild maintainers 2022-11-16 14:53:00 +09:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS genirq/affinity: Move group_cpus_evenly() into lib/ 2024-01-10 17:10:33 +01:00
Makefile Linux 6.1.73 2024-01-15 18:54:51 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.