* acpica:
ACPICA: Update version to 20150930
ACPICA: Debugger: Fix dead lock issue ocurred in single stepping mode
ACPI: Enable build of AML interpreter debugger
ACPICA: Debugger: Add thread ID support so that single step mode can only apply to the debugger thread
ACPICA: Debugger: Fix "terminate" command by cleaning up subsystem shutdown logic
ACPICA: Debugger: Fix "quit/exit" command by cleaning up user commands termination logic
ACPICA: Linuxize: Export debugger files to Linux
ACPICA: iASL: General cleanup of the file suffix #defines
ACPICA: Improve typechecking, both compile-time and runtime
ACPICA: Update NFIT table to rename a flags field
ACPICA: Debugger: Update mutexes used for multithreaded debugger
ACPICA: Update exception code for "file not found" error
ACPICA: iASL: Add symbolic operator support for Index() operator
ACPICA: Remove unnecessary conditional compilation
Pull memremap fix from Dan Williams:
"The new memremap() api introduced in the 4.3 cycle to unify/replace
ioremap_cache() and ioremap_wt() is mishandling the highmem case.
This patch has received a build success notification from a
0day-kbuild-robot run and has received an ack from Ard"
From the commit message:
"The impact of this bug is low for now since the pmem driver is the
only user of memremap(), but this is important to fix before more
conversions to memremap arrive in 4.4"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
memremap: fix highmem support
priv->hwts_*_en indicate if timestamping is enabled/disabled at run
time. But priv->dma_cap.time_stamp and priv->dma_cap.atime_stamp
indicates HW is support for PTPv1/PTPv2.
Signed-off-by: Phil Reid <preid@electromag.com.au>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the use of struct timespec in
dccp_probe to use struct timespec64 instead. timespec uses a 32-bit
seconds field which will overflow in the year 2038 and beyond. timespec64
uses a 64-bit seconds field. Note that the correctness of the code isn't
changed, since the original code only uses the timestamps to compute a
small elapsed interval. This patch is part of a larger attempt to remove
instances of 32-bit timekeeping structures (timespec, timeval, time_t)
from the kernel so it is easier to identify where the real 2038 issues
are.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov says:
====================
ipv4: fix problems from the RTNH_F_LINKDOWN introduction
Fix two problems from the change that introduced RTNH_F_LINKDOWN
flag. The first patch deals with the removal of local route on
DOWN event. The second patch makes sure the RTNH_F_LINKDOWN
flag is properly updated on UP event because the DOWN event
sets it in all cases.
v2->v3:
- use bool for force var
v1->v2:
- forgot to add ifconfig dummy0 down in the test case
- split to 2 patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When nexthop is part of multipath route we should clear the
LINKDOWN flag when link goes UP or when first address is added.
This is needed because we always set LINKDOWN flag when DEAD flag
was set but now on UP the nexthop is not dead anymore. Examples when
LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered:
- link goes down (LINKDOWN is set), then link goes UP and device
shows carrier OK but LINKDOWN remains set
- last address is deleted (LINKDOWN is set), then address is
added and device shows carrier OK but LINKDOWN remains set
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
here add a multipath route where one nexthop is for dummy0:
ip route add 1.2.3.4 nexthop dummy0 nexthop SOME_OTHER_DEVICE
ifconfig dummy0 down
ifconfig dummy0 up
now ip route shows nexthop that is not dead. Now set the sysctl var:
echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown
now ip route will show a dead nexthop because the forgotten
RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD.
Fixes: 8a3d03166f ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1
Fixes: 8a3d03166f ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
platform_driver doesn't need to set .owner, because
platform_driver_register() will set it.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify DSA by pushing the switchdev objects for VLAN add and delete
operations down to its drivers. Currently only mv88e6xxx is affected.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EPHY on GENET v1->v3 is extremely finicky, and will show occasional
failures based on the timing and reset sequence, ranging from duplicate
packets, to extremely high latencies.
Perform an additional software reset, and re-configuration to make sure it is
in a consistent and working state.
Fixes: 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fixes from Thomas Gleixner:
"This set of updates contains:
- Another bugfix for the pathologic vm86 machinery. Clear
thread.vm86 on fork to prevent corrupting the parent state. This
comes along with an update to the vm86 selftest case
- Fix another corner case in the ioapic setup code which causes a
boot crash on some oddball systems
- Fix the fallout from the dma allocation consolidation work, which
leads to a NULL pointer dereference when the allocation code is
called with a NULL device"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vm86: Set thread.vm86 to NULL on fork/clone
selftests/x86: Add a fork() to entry_from_vm86 to catch fork bugs
x86/ioapic: Prevent NULL pointer dereference in setup_ioapic_dest()
x86/dma-mapping: Fix arch_dma_alloc_attrs() oops with NULL dev
Pull irq fixes from Thomas Gleixner:
"The last two one-liners for 4.3 from the irqchip space:
- Regression fix for armada SoC which addresses the fallout from the
set_irq_flags() cleanup
- Add the missing propagation of the irq_set_type() callback to the
parent interrupt controller of the tegra interrupt chip"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/tegra: Propagate IRQ type setting to parent
irqchip/armada-370-xp: Fix regression by clearing IRQ_NOAUTOEN
The SS_LISTEN socket state is defined by both af_vsock.c and
vmci_transport.c. This is risky since the value could be changed in one
file and the other would be out of sync.
Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part
of enum socket_state (SS_CONNECTED, ...). This way it is clear that the
constant is vsock-specific.
The big text reflow in af_vsock.c was necessary to keep to the maximum
line length. Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On certain hardware in certain situations loopback test fails and the
driver gets removed. During mdiobus_unregister() instance of PHY driver
gets disposed. But by this time it has already been started using
phy_connect_direct().
PHY driver uses DELAYED_WORK in order to maintain its state. Attempting
to dispose the driver without calling phy_disconnect() causes deallocation
of DELAYED_WORK being active. This shortly causes a bad crash in timer
code.
The problem can be discovered by enabling CONFIG_DEBUG_OBJECTS_TIMERS and
CONFIG_DEBUG_OBJECTS_FREE
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing of the new UDP bearer has revealed that reception of
NAME_DISTRIBUTOR, LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE
message buffers is not prepared for the case that those may be
non-linear.
We now linearize all such buffers before they are delivered up to the
generic reception layer.
In order for the commit to apply cleanly to 'net' and 'stable', we do
the change in the function tipc_udp_recv() for now. Later, we will post
a commit to 'net-next' moving the linearization to generic code, in
tipc_named_rcv() and tipc_link_proto_rcv().
Fixes: commit d0f91938be ("tipc: add ip/udp media type")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are in a context where we can sleep, and the FEC PHY reset gpio
may be on an I2C expander. Use the cansleep() variant when
setting the GPIO value.
Based on a patch from Russell King for pci-mvebu.c.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa says:
====================
net: clean up interactions of CHECKSUM_PARTIAL and fragmentation
This series fixes wrong checksums on the wire for IPv4 and IPv6. Large
send buffers and especially NFS lead to wrong checksums in both IPv4
and IPv6.
CHECKSUM_PARTIAL skbs should not receive the respective fragmentations
functions, so we add WARN_ON_ONCE to those functions to fix up those as
soon as they get reported.
Changelog:
v2: added v4 checks
v3: removed WARN_ON_ONCES (advice by Tom Herbert)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those warn about them once and handle them gracefully by recalculating
the checksum.
Fixes: commit 32dce968dd ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We cannot reliable calculate packet size on MSG_MORE corked sockets
and thus cannot decide if they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.
The IPv6 code also intended to protect and not use CHECKSUM_PARTIAL in
the existence of IPv6 extension headers, but the condition was wrong. Fix
it up, too. Also the condition to check whether the packet fits into
one fragment was wrong and has been corrected.
Fixes: commit 32dce968dd ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those warn about them once and handle them gracefully by recalculating
the checksum.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We cannot reliable calculate packet size on MSG_MORE corked sockets
and thus cannot decide if they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
AMD Fam17h processors introduce support for the CLZERO
instruction. It zeroes out the 64 byte cache line specified in
RAX.
Add the bit here to allow /proc/cpuinfo to list the feature.
Boris: we're adding this as a separate ->x86_capability leaf
because CPUID_80000008_EBX is going to contain more feature bits
and it will fill out with time.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
[ Wrap code in patch form, fix comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1446207099-24948-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Scalable MCA (SMCA) is a new feature in AMD Fam17h processors
which indicates presence of MCA extensions.
MCA extensions expands existing register space for the MCE banks
and also introduces a new MSR range to accommodate new banks.
Add the detection bit.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Reformat mce_vendor_flags definitions and save indentation levels. Improve comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1446207099-24948-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Relax dependencies on SPI_MASTER for drivers in the SPI menu
that already has this dependency.
- Move out the expander that would be hidden for I2C access if
SPI_MASTER was not selected. Tentatively create a separate
menu for this.
- Move the ZX SoC driver to memory-mapped drivers, this must be
a mistake and only worked because the system has an SPI master
enabled at the same time.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The I2C expander menu already depends on I2C, drop subdependecies
on individual drivers. Keep the instances of depends on I2C=y
though, so these are still restricted to the compiled-in case.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Port-mapped I/O depends on X86 already, so individual drivers need
not specify this dependency.
Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This should be our final batch of fixes for 4.3:
- A patch from Sudeep Holla that fixes annotation of wakeup sources properly,
old unused format seems to have spread through copying.
- Two patches from Tony for OMAP. One dealing with MUSB setup problems due to
runtime PM being enabled too early on the parent device. The other fixes
IRQ numbering for OMAP1.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWNGocAAoJEIwa5zzehBx32UIP/AiO+T1Sb7ZlF+SoQTpMloe1
BCPtb/rHV4nBbgCTOw7ixauUAt50oI5d6d8+QimC9CgeXOch9QtdWhsp85dVfped
hfQmlqGIfRi5ltETk7Wnem4JPiz6g9O8atZc8KaFUzhVMPBbtuxaWV9G6rfo11U2
ps49n73D1CurAIlmGRl47+YpfZl6DvrqiY/YJShU7nEwixIGMYvObN4Wrp1Wlg2H
1nZ/KPEuVDamS5hxGqqZL+P6I+ePnCN53QFcVfTPzAUu3brv5v/HXvHcE/AdaR5b
wtrMGzZ3lfN7WChSS8mGgzUBMvMd+xnfF7Bb5B6zwcSGs+xU4Sey8A465m7OY30t
qdIDbrNB1+1ez62i/yWuH/yAD4QJbQWY3sFIarCdWayqtbh2Yq3d6AoHFydEnTE/
uHziKxFl3Zur21AcV89PQr3tPwf7Qu9CyN6DSdDgh2xyjPMSnZmDEjY4X2S9z4pu
4N7JJUL0e4NFr57pk6nU9vkRcLuRXCQDnGqnsZCDxLxEOpsdthGxog/yaHJGeZU8
2VmqVlwz4ST11X4wt/4XuC5HOWhBesx238UYFjQdPCcQg+YD78ANNHceFm7LUeV7
ddtbwqjuU+MOGW8KjViPmDkimdaGPNm6eSJt+T4g0toHdWyo9ForQysFuXn2XQBX
EVEDAgego+tNhaCCDW+p
=jKgB
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"This should be our final batch of fixes for 4.3:
- A patch from Sudeep Holla that fixes annotation of wakeup sources
properly, old unused format seems to have spread through copying.
- Two patches from Tony for OMAP. One dealing with MUSB setup
problems due to runtime PM being enabled too early on the parent
device. The other fixes IRQ numbering for OMAP1"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
usb: musb: omap2430: Fix regression caused by driver core change
ARM: OMAP1: fix incorrect INT_DMA_LCD
ARM: dts: fix gpio-keys wakeup-source property
This is three essential bug fixes for various SCSI parts. The only affected
users are SCSI multi-path via device handler (basically all the enterprise)
and mvsas users. The dh bugs are an async entanglement in boot resulting in a
serious WARN_ON trip and a use after free on remove leading to a crash with
strict memory accounting. The mvsas bug manifests as a null deref oops but
only on abort sequences; however, these can commonly occur with SATA attached
devices, hence the fix.
Signed-off-by: James Bottomley <JBottomley@Odin.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJWNDcYAAoJEDeqqVYsXL0M20oH/ioR0SAFJx5YJRCU2y3aTmTd
ZCcLaRJTHkdu+rObn+gU54q6eCrg9Z7+gj1DSb/B55nAgi8pK+bMd+Im/X+w4fG5
4WIj9P2HJX5YliDFONf68XWCeFyfAMFScEkLY8raOaFU+KtZGvpb/RQRwce0eAMF
OiV+NK65ENZdVnfTCR3btyBq+FDvPL2dX5a3VK3NpTArrH0BF4/ZuoC7E/tYkKgo
CQ6bGSyvx/loJlopLr3tcf4+tzCi4H8gr0cRLm9SHLbEwP0oXXpN9juVjiUbACIb
6etoyJJ+MjlQLofem2bJ4ftnP8cAju+EllfBkJXD3VEsqTAd0r9+B6Rx3YPlo10=
=jo6V
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is three essential bug fixes for various SCSI parts.
The only affected users are SCSI multi-path via device handler
(basically all the enterprise) and mvsas users. The dh bugs are an
async entanglement in boot resulting in a serious WARN_ON trip and a
use after free on remove leading to a crash with strict memory
accounting. The mvsas bug manifests as a null deref oops but only on
abort sequences; however, these can commonly occur with SATA attached
devices, hence the fix"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi_dh: don't try to load a device handler during async probing
scsi_dh: fix use-after-free when removing scsi device
mvsas: Fix NULL pointer dereference in mvs_slot_task_free
One bugfix for a list corruption in raid5 because of incorrect
locking.
Other for possible data corruption when a recovering device is failed,
removed, and re-added.
Both tagged for -stable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJWNAnCAAoJEDnsnt1WYoG5VzwP/jRfljaxLKUicIGEd9qeaei5
vVtmzPugthzYTfcJRm54ZufQOnjse/uXEBqmvoMJhBeEMbL2VIWbn1i7sMWjgq8W
HVz/N/N8iT5DFWgzXmQQCaxIu17njbEmO8IelcNr3i6tE5wbHJ9WhV5UDGdQgwca
6XjQ8r8QjmN3uVaiNL4JcrEtZfpE8PqgBCJzsQYRZzOp9m3jJlgyJ9YWQA/VQ2Cj
cVXs7tTGTdpPQdgRRFHF/Z/7H+KUcp9XV5ahDRwqDryQkZHuFvwyZFfpLspqvc5P
OJio9WY0y93QmRRsuIC/ig0+CnxDXeqHiRwbprIMvKbPxxaGn4s24VioRDa0PRzT
v9HcjtUhs9q8iTo4TZJrggD+rPm523a9iiDU6SbM2qlM3XUpBis/fjbLN82Nnas+
ADa0/DAsOE0I1WltoaaNUbweuflGw2NnFxnUueqq8dK23/Vabu3vw4tohiNp+/3m
km8Is3j7lGKobQ3AKYIFlbeAqdsyASZkSDfA9IZr3SIeYBWgPljd9n+6BIPOqPNq
S2HtOLke/XX2KEM0BHzgu4XliE9P9+B9lQETI4MehP0rMzTURGKh87du41YHckp1
beX22aeOuv//ok11JTs6M+StREKeTrl+dTStn0U1jt6HyMzseGaNoy3Eib5ePBCb
C2ZZRz4OR2MWvWUFxKms
=QYvv
-----END PGP SIGNATURE-----
Merge tag 'md/4.3-rc7-fixes' of git://neil.brown.name/md
Pull md bug fixes from Neil Brown:
"Two more bug fixes for md.
One bugfix for a list corruption in raid5 because of incorrect
locking.
Other for possible data corruption when a recovering device is failed,
removed, and re-added.
Both tagged for -stable"
* tag 'md/4.3-rc7-fixes' of git://neil.brown.name/md:
Revert "md: allow a partially recovered device to be hot-added to an array."
md/raid5: fix locking in handle_stripe_clean_event()
When RAID-4/5/6 array suffers from missing journal device, we put
the array in read only state. We should not allow trasition to
read-write states (clean and active) before replacing journal device.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Set journal disk ->raid_disk to >=0, I choose raid_disks + 1 instead of
0, because we already have a disk with ->raid_disk 0 and this causes
sysfs entry creation conflict. A lot of places assumes disk with
->raid_disk >=0 is normal raid disk, so we add check for journal disk.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
When journal disk is faulty and we are reassemabling the raid array, the
journal disk is old. We don't allow the journal disk added to the raid
array. Since journal disk is missing in the array, the raid5 will mark
the array readonly.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
If raid array is expected to have journal (eg, journal is set in MD
superblock feature map) and the array is started without journal disk,
start the array readonly.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
If a raid array has journal feature bit set, add a new bit to indicate
this. If the array is started without journal disk existing, we know
there is something wrong.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
There are 3 places the raid5-cache dispatches IO. The discard IO error
doesn't matter, so we ignore it. The superblock write IO error can be
handled in MD core. The remaining are log write and flush. When the IO
error happens, we mark log disk faulty and fail all write IO. Read IO is
still allowed to run. Userspace will get a notification too and
corresponding daemon can choose setting raid array readonly for example.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
raid5-cache uses journal disk rdev->bdev, rdev->mddev in several places.
Don't allow journal disk disappear magically. On the other hand, we do
need to update superblock for other disks to bump up ->events, so next
time journal disk will be identified as stale.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Since superblock is updated infrequently, we do a simple trim of log
disk (a synchronous trim)
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
journal disk can be faulty. The Journal and Faulty aren't exclusive with
each other.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Simplify the bio completion handler by using bio chaining and submitting
bios as soon as they are full.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Factor out code to reserve log space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
This is the only user, and keeping all code initializing the io_unit
structure together improves readbility.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Set up bi_sector properly when we allocate an bio instead of updating it
at submission time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.com>
Split out a helper to allocate a bio for log writes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Remove the only partially used local 'io' variable to simplify the code
flow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
For devices without a volatile write cache we don't need to send a FLUSH
command to ensure writes are stable on disk, and thus can avoid the whole
step of batching up bios for processing by the MD thread.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>