33319 Commits

Author SHA1 Message Date
07708c4af1 KVM: x86: Disallow hypercalls for guest callers in rings > 0
So far unprivileged guest callers running in ring 3 can issue, e.g., MMU
hypercalls. Normally, such callers cannot provide any hand-crafted MMU
command structure as it has to be passed by its physical address, but
they can still crash the guest kernel by passing random addresses.

To close the hole, this patch considers hypercalls valid only if issued
from guest ring 0. This may still be relaxed on a per-hypercall base in
the future once required.

Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:20 +03:00
b927a3cec0 KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl
Now KVM allow guest to modify guest's physical address of EPT's identity mapping page.

(change from v1, discard unnecessary check, change ioctl to accept parameter
address rather than value)

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10 08:33:16 +03:00
a1b37100d9 KVM: Reduce runnability interface with arch support code
Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:13 +03:00
0b71785dc0 KVM: Move kvm_cpu_get_interrupt() declaration to x86 code
It is implemented only by x86.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:13 +03:00
d34e6b175e KVM: add ioeventfd support
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest.  Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.

Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.

However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc).  For these
patterns, the synchronous call is particularly expensive since we really
only want to simply get our notification transmitted asychronously and
return as quickly as possible.  All the sychronous infrastructure to ensure
proper data-dependencies are met in the normal IO case are just unecessary
overhead for signalling.  This adds additional computational load on the
system, as well as latency to the signalling path.

Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd.  This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and perhipheral components.

To test this theory, we built a test-harness called "doorbell".  This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled.  It supports signalling
from either an eventfd, or an ioctl().

We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl().  The other is direct via
ioeventfd.

You can download this test harness here:

ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2

The measured results are as follows:

qemu-mmio:       110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio:  367300 iops, 2.72us rtt

I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy.  However, for now we
can extrapolate based on the data from the NULLIO runs of +2.56us for MMIO,
and -350ns for HC, we get:

qemu-pio:      153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt

these are just for fun, for now, until I can gather more data.

Here is a graph for your convenience:

http://developer.novell.com/wiki/images/7/76/Iofd-chart.png

The conclusion to draw is that we save about 4us by skipping the userspace
hop.

--------------------

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:12 +03:00
090b7aff27 KVM: make io_bus interface more robust
Today kvm_io_bus_regsiter_dev() returns void and will internally BUG_ON
if it fails.  We want to create dynamic MMIO/PIO entries driven from
userspace later in the series, so we need to enhance the code to be more
robust with the following changes:

   1) Add a return value to the registration function
   2) Fix up all the callsites to check the return code, handle any
      failures, and percolate the error up to the caller.
   3) Add an unregister function that collapses holes in the array

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:12 +03:00
e9f4275732 KVM: PIT support for HPET legacy mode
When kvm is in hpet_legacy_mode, the hpet is providing the timer
interrupt and the pit should not be. So in legacy mode, the pit timer
is destroyed, but the *state* of the pit is maintained. So if kvm or
the guest tries to modify the state of the pit, this modification is
accepted, *except* that the timer isn't actually started. When we exit
hpet_legacy_mode, the current state of the pit (which is up to date
since we've been accepting modifications) is used to restart the pit
timer.

The saved_mode code in kvm_pit_load_count temporarily changes mode to
0xff in order to destroy the timer, but then restores the actual
value, again maintaining "current" state of the pit for possible later
reenablement.

[avi: add some reserved storage in the ioctl; make SET_PIT2 IOW]
[marcelo: fix memory corruption due to reserved storage]

Signed-off-by: Beth Kon <eak@us.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:12 +03:00
1000ff8d89 KVM: Add trace points in irqchip code
Add tracepoint in msi/ioapic/pic set_irq() functions,
in IPI sending and in the point where IRQ is placed into
apic's IRR.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:11 +03:00
aec51dc4f1 KVM: Trace mmio
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:07 +03:00
ae8c1c4025 KVM: Trace irq level and source id
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:06 +03:00
bda9020e24 KVM: remove in_range from io devices
This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write
functions. in_range now becomes unused so it is removed from device ops in
favor of read/write callbacks performing range checks internally.

This allows aliasing (mostly for in-kernel virtio), as well as better error
handling by making it possible to pass errors up to userspace.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:05 +03:00
6c47469453 KVM: convert bus to slots_lock
Use slots_lock to protect device list on the bus.  slots_lock is already
taken for read everywhere, so we only need to take it for write when
registering devices.  This is in preparation to removing in_range and
kvm->lock around it.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:05 +03:00
d3efc8efdb KVM: use vcpu_id instead of bsp_vcpu pointer in kvm_vcpu_is_bsp
Change kvm_vcpu_is_bsp to use vcpu_id instead of bsp_vcpu pointer, which
is only initialized at the end of kvm_vm_ioctl_create_vcpu.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:04 +03:00
2023a29cbe KVM: remove old KVMTRACE support code
Return EOPNOTSUPP for KVM_TRACE_ENABLE/PAUSE/DISABLE ioctls.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:03 +03:00
ec04b2604c KVM: Prepare memslot data structures for multiple hugepage sizes
[avi: fix build on non-x86]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:02 +03:00
229456fc34 KVM: convert custom marker based tracing to event traces
This allows use of the powerful ftrace infrastructure.

See Documentation/trace/ for usage information.

[avi, stephen: various build fixes]
[sheng: fix control register breakage]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:59 +03:00
54dee9933e KVM: VMX: conditionally disable 2M pages
Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear
in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled.

[avi: s/largepages_disabled/largepages_enabled/ to avoid negative logic]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:56 +03:00
3f5d18a965 KVM: Return to userspace on emulation failure
Instead of mindlessly retrying to execute the instruction, report the
failure to userspace.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:52 +03:00
988a2cae6a KVM: Use macro to iterate over vcpus.
[christian: remove unused variables on s390]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:52 +03:00
73880c80aa KVM: Break dependency between vcpu index in vcpus array and vcpu_id.
Archs are free to use vcpu_id as they see fit. For x86 it is used as
vcpu's apic id. New ioctl is added to configure boot vcpu id that was
assumed to be 0 till now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:52 +03:00
c5af89b68a KVM: Introduce kvm_vcpu_is_bsp() function.
Use it instead of open code "vcpu_id zero is BSP" assumption.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:51 +03:00
6a4a983973 KVM: Reorder ioctls in kvm.h
Somehow the VM ioctls got unsorted; resort.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:50 +03:00
fa40a8214b KVM: switch irq injection/acking data structures to irq_lock
Protect irq injection/acking data structures with a separate irq_lock
mutex. This fixes the following deadlock:

CPU A                               CPU B
kvm_vm_ioctl_deassign_dev_irq()
  mutex_lock(&kvm->lock);            worker_thread()
  -> kvm_deassign_irq()                -> kvm_assigned_dev_interrupt_work_handler()
    -> deassign_host_irq()               mutex_lock(&kvm->lock);
      -> cancel_work_sync() [blocked]

[gleb: fix ia64 path]

Reported-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:49 +03:00
60eead79ad KVM: introduce irq_lock, use it to protect ioapic
Introduce irq_lock, and use to protect ioapic data structures.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:49 +03:00
b188d2d365 KVM: remove redundant declarations
Changing s390 code in kvm_arch_vcpu_load/put come across this header
declarations. They are complete duplicates, not even useful forward
declarations as nothing using it is in between (maybe it was that in
the past).

This patch removes the two dispensable lines.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:44 +03:00
e733339140 KVM: Downsize max support MSI-X entry to 256
We only trap one page for MSI-X entry now, so it's 4k/(128/8) = 256 entries at
most.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:43 +03:00
c5ff41ce66 KVM: Allow PIT emulation without speaker port
The in-kernel speaker emulation is only a dummy and also unneeded from
the performance point of view. Rather, it takes user space support to
generate sound output on the host, e.g. console beeps.

To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel
speaker port emulation via a flag passed along the new IOCTL. It also
leaves room for future extensions of the PIT configuration interface.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:41 +03:00
721eecbf4f KVM: irqfd
KVM provides a complete virtual system environment for guests, including
support for injecting interrupts modeled after the real exception/interrupt
facilities present on the native platform (such as the IDT on x86).
Virtual interrupts can come from a variety of sources (emulated devices,
pass-through devices, etc) but all must be injected to the guest via
the KVM infrastructure.  This patch adds a new mechanism to inject a specific
interrupt to a guest using a decoupled eventfd mechnanism:  Any legal signal
on the irqfd (using eventfd semantics from either userspace or kernel) will
translate into an injected interrupt in the guest at the next available
interrupt window.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:41 +03:00
890ca9aefa KVM: Add MCE support
The related MSRs are emulated. MCE capability is exported via
extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED.  A new
vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation
such as the mcg_cap. MCE is injected via vcpu ioctl command
KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are
not implemented.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:39 +03:00
23bcf634c8 net_sched: fix estimator lock selection for mq child qdiscs
When new child qdiscs are attached to the mq qdisc, they are actually
attached as root qdiscs to the device queues. The lock selection for
new estimators incorrectly picks the root lock of the existing and
to be replaced qdisc, which results in a use-after-free once the old
qdisc has been destroyed.

Mark mq qdisc instances with a new flag and treat qdiscs attached to
mq as children similar to regular root qdiscs.

Additionally prevent estimators from being attached to the mq qdisc
itself since it only updates its byte and packet counters during dumps.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-09 18:11:23 -07:00
ea6a634ef7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-09-09 17:33:45 -07:00
1ee65e37e9 LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.
This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.

Quote Stephen Smalley

inode_setsecctx:  Change the security context of an inode.  Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context.  Example usage:  NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.

inode_notifysecctx:  Notify the security module of what the security
context of an inode should be.  Initializes the incore security context
managed by the security module for this inode.  Example usage:  NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:24 +10:00
b1ab7e4b2a VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx.
This factors out the part of the vfs_setxattr function that performs the
setting of the xattr and its notification. This is needed so the SELinux
implementation of inode_setsecctx can handle the setting of the xattr while
maintaining the proper separation of layers.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:22 +10:00
bf992fa2bc Merge branch 'master' into for-linus 2009-09-10 00:02:02 +02:00
9b83ccd2f1 ACPI PM: Replace wakeup.prepared with reference counter
The wakeup.prepared flag is used for marking devices that have the
wake-up power already enabled, so that the wake-up power is not
enabled twice in a row for the same device.  This assumes, however,
that device wake-up power will only be enabled once, while the device
is being prepared for a system-wide sleep transition, and the second
attempt is made by acpi_enable_wakeup_device_prep().

With the upcoming PCI wake-up rework this assumption will not hold
any more for PCI bridges and the root bridge whose wake-up power
may be enabled as a result of wake-up enable propagation from other
devices (eg. add-on devices that are not associated with any GPEs).
Thus, there may be many attempts to enable wake-up power on a PCI
bridge or the root bridge during a system power state transition
and it's better to replace wakeup.prepared with a reference counter.

Reviewed-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 14:19:18 -07:00
e80bb09d2c PCI PM: Introduce device flag wakeup_prepared
Introduce a new PCI device flag, wakeup_prepared, to prevent PCI
wake-up preparation code from being executed twice in a row for the
same device and for the same purpose.

Reviewed-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 14:19:11 -07:00
28760489a3 PCI: pcie: Ensure hotplug ports have a minimum number of resources
In general a BIOS may goof or we may hotplug in a hotplug controller.
In either case the kernel needs to reserve resources for plugging
in more devices in the future instead of creating a minimal resource
assignment.

We already do this for cardbus bridges I am just adding a variant
for pcie bridges.

v2: Make testing for pcie hotplug bridges based on a flag.

    So far we only set the flag for pcie but a header_quirk
    could easily be added for the non-standard pci hotplug
    bridges.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 14:10:24 -07:00
9dba910e9d PCI: separate out pci_add_dynid()
Separate out pci_add_dynid() from store_new_id() and export it so that
in-kernel code can add PCI IDs dynamically.  As the function will be
available regardless of HOTPLUG, put it and pull pci_free_dynids()
outside of CONFIG_HOTPLUG.

This will be used by pci-stub to initialize initial IDs via module
param.

While at it, remove bogus get_driver() failure check.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:43:58 -07:00
825c423a35 PCI hotplug: add support for 5.0G link speed
Add support for PCI-E 5.0 GT/s in max_bus_speed and cur_bus_speed.

Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:50 -07:00
6ac3bd5270 PCI/vgaarb: cleanup some warnings + cleanup some comments.
Fix some warnings reported in linux-next + also cleanup some
comment errors noticed by Pekka Paalanen.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:41 -07:00
260d703adc PCI: support for PCI Express fundamental reset
This is the first of three patches that implement a bit field that PCI
Express device drivers can use to indicate they need a fundamental reset
during error recovery.

By default, the EEH framework on powerpc does what's known as a "hot
reset" during recovery of a PCI Express device.  We've found a case
where the device needs a "fundamental reset" to recover properly.  The
current PCI error recovery and EEH frameworks do not support this
distinction.

The attached patch (courtesy of Richard Lary) adds a bit field to
pci_dev that indicates whether the device requires a fundamental reset
during recovery.

These patches supersede the previously submitted patch that implemented
a fundamental reset bit field.

Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: Richard Lary <rlary@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:37 -07:00
deb2d2ecd4 PCI/GPU: implement VGA arbitration on Linux
Background:
Graphic devices are accessed through ranges in I/O or memory space. While most
modern devices allow relocation of such ranges, some "Legacy" VGA devices
implemented on PCI will typically have the same "hard-decoded" addresses as
they did on ISA. For more details see "PCI Bus Binding to IEEE Std 1275-1994
Standard for Boot (Initialization Configuration) Firmware Revision 2.1"
Section 7, Legacy Devices.

The Resource Access Control (RAC) module inside the X server currently does
the task of arbitration when more than one legacy device co-exists on the same
machine. But the problem happens when these devices are trying to be accessed
by different userspace clients (e.g. two server in parallel). Their address
assignments conflict. Therefore an arbitration scheme _outside_ of the X
server is needed to control the sharing of these resources. This document
introduces the operation of the VGA arbiter implemented for Linux kernel.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Tiago Vignatti <tiago.vignatti@nokia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:36 -07:00
1d4a433fc4 PCI: Document pci_ids.h addition policy.
IDs should generally only be added to pci_ids.h when they're shared
across several files in the tree.  IDs that are just used by a single
driver should be defined in the driver instead.

Perhaps documenting this is a good idea to prevent things being moved there,
as it still seems to be happening judging from the git log.

(based on discussion w/gregkh and others).

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:28 -07:00
711d57796f PCI: expose function reset capability in sysfs
Some devices allow an individual function to be reset without affecting
other functions in the same device: that's what pci_reset_function does.
For devices that have this support, expose reset attribite in sysfs.

This is useful e.g. for virtualization, where a qemu userspace
process wants to reset the device when the guest is reset,
to emulate machine reboot as closely as possible.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:24 -07:00
76d56de57a ACPI: export acpi_pci_root and friends
We can simplify ACPI drivers if we can tell whether a handle is an
ACPI PCI root or not.

Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:22 -07:00
a7db504052 PCI: remove pcibios_scan_all_fns()
This was #define'd as 0 on all platforms, so let's get rid of it.

This change makes pci_scan_slot() slightly easier to read.

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:18 -07:00
2bba22c50b sched: Turn off child_runs_first
Set child_runs_first default to off.

It hurts 'optimal' make -j<NR_CPUS> workloads as make jobs
get preempted by child tasks, reducing parallelism.

Note, this patch might make existing races in user
applications more prominent than before - so breakages
might be bisected to this commit.

Child-runs-first is broken on SMP to begin with, and we
already had it off briefly in v2.6.23 so most of the
offenders ought to be fixed. Would be nice not to revert
this commit but fix those apps finally ...

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
[ made the sysctl independent of CONFIG_SCHED_DEBUG, in case
  people want to work around broken apps. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-09 17:30:05 +02:00
24ea602e18 ssb: Implement SDIO host bus support
Add support for communicating with a Sonics Silicon Backplane through a
SDIO interface, as found in the Nintendo Wii WLAN daughter card.

The Nintendo Wii WLAN card includes a custom Broadcom 4318 chip with
a SDIO host interface.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-09-09 11:19:00 -04:00
5d5d9c97ff IPv6/addrconf: Fix minor addrlabel thinko
Fix apparent thinko related to RTM_DELADDRLABEL, introduced by commit
2a8cc6c89039e0530a3335954253b76ed0f9339a ("[IPV6] ADDRCONF: Support
RFC3484 configurable address selection policy table.").

Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-09 03:42:23 -07:00
3e5cd1f257 dmi: extend dmi_get_year() to dmi_get_date()
There are cases where full date information is required instead of
just the year.  Add month and day parsing to dmi_get_year() and rename
it to dmi_get_date().

As the original function only required '/' followed by any number of
parseable characters at the end of the string, keep that behavior to
avoid upsetting existing users.

The new function takes dates of format [mm[/dd]]/yy[yy].  Year, month
and date are checked to be in the ranges of [1-9999], [1-12] and
[1-31] respectively and any invalid or out-of-range component is
returned as zero.

The dummy implementation is updated accordingly but the return value
is updated to indicate field not found which is consistent with how
other dummy functions behave.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-08 21:17:48 -04:00