909 Commits

Author SHA1 Message Date
133f09a169 [SPARC64]: Use more mearningful names for IRQ registry.
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless.  Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:24 -07:00
e450992d13 [SPARC64]: Initial domain-services driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:20 -07:00
13077d8028 [SPARC64]: Export powerd facilities for external entities.
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:16 -07:00
2c4f4ecb7a [SPARC64]: Add domain-services nodes to VIO device tree.
They sit under the root of the MD tree unlike the rest of
the LDC channel based virtual devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:13 -07:00
cb48123584 [SPARC64]: Assorted LDC bug cures.
1) LDC_MODE_RELIABLE is deprecated an unused by anything, plus
   it and LDC_MODE_STREAM were mis-numbered.

2) read_stream() should try to read as much as possible into
   the per-LDC stream buffer area, so do not trim the read_nonraw()
   length by the caller's size parameter.

3) Send data ACKs when necessary in read_nonraw().

4) In read_nonraw() when we get a pure ACK, advance the RX head
   unconditionally past it.

5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
   This helps debugging stream mode LDC channel problems.

6) Decrease verbosity of rx_data_wait() so that it is more useful.
   A debugging message each loop iteration is too much.

7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
   not lp->tx_head.

8) Set the seqid field properly in send_data_nack().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:09 -07:00
5a606b72a4 [SPARC64]: Do not ACK an INO if it is disabled or inprogress.
This is also a partial workaround for a bug in the LDOM firmware which
double-transmits RX inos during high load.  Without this, such an
event causes the kernel to loop forever in the interrupt call chain
ACK'ing but never actually running the IRQ handler (and thus clearing
the interrupt condition in the device).

There is still a bad potential effect when double INOs occur,
not covered by this changeset.  Namely, if the INO is already on
the per-cpu INO vector list, we still blindly re-insert it and
thus we can end up losing interrupts already linked in after
it.

We could deal with that by traversing the list before insertion,
but that's too expensive for this edge case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:05 -07:00
e53e97ce3c [SPARC64]: Add LDOM virtual channel driver and VIO device layer.
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework.  This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.

Built on top of this is a VIO device protocol which has it's
own handshaking and message types.  At this layer attributes
are exchanged (disk size, network device addresses, etc.)
descriptor rings are registered, and data transfers are
triggers and replied to.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:03:18 -07:00
36e235901f PCI: Only build PCI syscalls on architectures that want them
The PCI syscalls are built on every architecture except X86, but only
a few have ever hooked them up.  Use a new Kconfig symbol to save a
couple of kB on the architectures that have never used the syscalls.
Tested on x86 and ia64 only.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:13 -07:00
b8a3a5214d PCI: read revision ID by default
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:09 -07:00
0437e109e1 sched: zap the migration init / cache-hot balancing code
the SMP load-balancer uses the boot-time migration-cost estimation
code to attempt to improve the quality of balancing. The reason for
this code is that the discrete priority queues do not preserve
the order of scheduling accurately, so the load-balancer skips
tasks that were running on a CPU 'recently'.

this code is fundamental fragile: the boot-time migration cost detector
doesnt really work on systems that had large L3 caches, it caused boot
delays on large systems and the whole cache-hot concept made the
balancing code pretty undeterministic as well.

(and hey, i wrote most of it, so i can say it out loud that it sucks ;-)

under CFS the same purpose of cache affinity can be achieved without
any special cache-hot special-case: tasks are sorted in the 'timeline'
tree and the SMP balancer picks tasks from the left side of the
tree, thus the most cache-cold task is balanced automatically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-09 18:51:57 +02:00
a357b8f42e [SPARC64]: Need to set state to IDLE during sun4v IRQ enable.
This fixes hypervisor console interrupts on LDOM guests.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:31 -07:00
1245088400 [SPARC64]: Fix VIRQ enabling.
We were doing the wrong call to turn them on, and also
when enabling we need to forcefully set the state to IDLE.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:09 -07:00
fc395f8d58 [SPARC64]: Fix args to sun4v_ldc_revoke().
First argument is LDC channel ID, then mapping cookie,
then the MTE revoke cookie.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:27 -07:00
56f5c0bd50 [SPARC64]: Fix IO/MEM space sizing for PCI.
In pci_determine_mem_io_space(), do not hard code the region sizes.
Instead, use the values given to us in the ranges property.

Thanks goes to Mikael Petterson for the original Xorg failure
bug repoert, and strace dumps from Mikael and Dmitry Artamonow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:19 -07:00
4a907dec98 [SPARC64]: Wire up cookie based sun4v interrupt registry.
This will be used for logical domain channel interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:04 -07:00
8c2786cfa6 [SPARC64]: Handle PCI bridges without 'ranges' property.
This fixes the IDE controller not showing up on Netra-T1
systems.

Just like Simba bridges, some PCI bridges can lack the
'ranges' OBP property.  So we handle this similarly to
the existing Simba code:

1) In of_device register address resolving, we push the
   translation to the parent.

2) In PCI device scanning, we interrogate the PCI config
   space registers of the PCI bus device in order to resolve
   the resources, just like the generic Linux PCI probing
   code does.

With much help and testing from Fabio, who also reported
the initial problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
2007-06-07 21:59:44 -07:00
ea1ff19ce0 [SPARC64]: Include <linux/rwsem.h> instead of <asm/rwsem.h>.
To be consistent with other architectures, include the generic version
of rwsem.h.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 20:24:50 -07:00
ec4d18f219 [SPARC64]: Fix SBUS IRQ regression caused by PCI-E driver.
We used to access the 64-bit IRQ IMAP and ICLR registers of bus
controllers 4-bytes in and as a 32-bit register word, since only the
low 32-bits were relevant.  This seemed like a good idea at the time.

But the PCI-E controller requires full 8-byte 64-bit access to
these registers, so we switched over to accessing them fully.

SBUS was not adjusted properly, which broke interrupts completely.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:51 -07:00
321566c250 [SPARC64]: Fix 2 bugs in PCI Sabre bus scanning.
If we are on hummingbird, bus runs at 66MHZ.

pbm->pci_bus should be setup with the result of pci_scan_one_pbm()
or else we deref NULL pointers in the error interrupt handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:46 -07:00
a2f9f6bbb3 [SPARC64]: Fix {mc,smt}_capable().
It's not just sun4v hypervisor platforms that should return true
for this, sun4u with UltraSPARC-IV should return true too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:05 -07:00
5cd342df96 [SPARC64]: Make core and sibling groups equal on UltraSPARC-IV.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:02 -07:00
f78eae2e6f [SPARC64]: Proper multi-core scheduling support.
The scheduling domain hierarchy is:

   all cpus -->
      cpus that share an instruction cache -->
          cpus that share an integer execution unit

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:00 -07:00
d887ab3a9b [SPARC64]: Provide mmu statistics via sysfs.
If the system supports hypervisor based statistics, allow them to
be fetched, enabled, and disabled via sysfs.

Enable and disable via the boolean:

/sys/devices/systems/cpu/cpuN/mmustat_enable

Statistic values are provided under:

/sys/devices/systems/cpu/cpuN/mmu_status/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:57 -07:00
48b6735640 [SPARC64]: Fix service channel hypervisor function names.
sed 's/scv/svc/'

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:54 -07:00
d1f253e60a [SPARC64]: Export basic cpu properties via sysfs.
Cache sizes, udelay_val, and clock_tick.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:51 -07:00
eff3414b72 [SPARC64]: Move topology init code into new file, sysfs.c
Also, use per-cpu data for struct cpu.  Calling kmalloc for
each cpu in topology_init() is just plain clumsy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:50 -07:00
54ca412336 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
  sparc64: fix alignment bug in linker definition script
2007-05-31 12:33:16 -07:00
dbbe3cb8cf [SPARC64]: Add missing NCS and SVC hypervisor interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:52:48 -07:00
4096b46f01 sparc64: fix alignment bug in linker definition script
The RO_DATA section were hardcoded to a specific
alignment in include/asm-generic/vmlinux.h.
But for sparc64 this did not match the PAGE_SIZE.

Introduce a new section definition named:
RO_DATA that takes actual alignment as parameter.
RODATA are provided for backward compatibility.

On top of this avoid hardcoding alignment for
sparc64 in reset of the script
Fix is build-tested on sparc64 + x86_64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-29 21:29:00 +02:00
7db35f31cb [SPARC64]: Fill holes in hypervisor APIs and fix KTSB registry.
Several interfaces were missing and others misnumbered or
improperly documented.

Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor.  This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:52:15 -07:00
2d9e2763c2 [SPARC64]: Fix two bugs wrt. kernel 4MB TSB.
1) The TSB lookup was not using the correct hash mask.

2) It was not aligned on a boundary equal to it's size,
   which is required by the sun4v Hypervisor.

wasn't having it's return value checked, and that bug will be fixed up
as well in a subsequent changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:51:38 -07:00
679292993c [SPARC64]: Fix _PAGE_EXEC_4U check in sun4u I-TLB miss handler.
It was using an immediate _PAGE_EXEC_4U value in an 'and'
instruction to perform the test.  This doesn't work because
the immediate field is signed 13-bit, this the mask being
tested against the PTE was 0x1000 sign-extended to 32-bits
instead of just plain 0x1000.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:15 -07:00
7189859f28 [SPARC64]: arch/sparc64/time.c doesn't compile on Ultra 1 (no PCI)
This is bug 8540 on bugzilla.kernel.org

arch/sparc64/time.c contains references to assorted bq4802 stuff if
CONFIG_PCI is not set, and compile fails. I #ifdef'ed out everything
that looks PCI-ish in that file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:02 -07:00
22adb358e8 [SPARC64]: Eliminate NR_CPUS limitations.
Cheetah systems can have cpuids as large as 1023, although physical
systems don't have that many cpus.

Only three limitations existed in the kernel preventing arbitrary
NR_CPUS values:

1) dcache dirty cpu state stored in page->flags on
   D-cache aliasing platforms.  With some build time
   calculations and some build-time BUG checks on
   page->flags layout, this one was easily solved.

2) The cheetah XCALL delivery code could only handle
   a cpumask with up to 32 cpus set.  Some simple looping
   logic clears that up too.

3) thread_info->cpu was a u8, easily changed to a u16.

There are a few spots in the kernel that still put NR_CPUS
sized arrays on the kernel stack, but that's not a sparc64
specific problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:49 -07:00
5cbc307373 [SPARC64]: Use machine description and OBP properly for cpu probing.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:41 -07:00
e01c0d6d8c [SPARC64]: Negotiate hypervisor API for PCI services.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:34 -07:00
22d6a1cba3 [SPARC64]: Report proper system soft state to the hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:29 -07:00
36b48973b8 [SPARC64]: Fix typo in sun4v_hvapi_register error handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:21 -07:00
5840fc66bb [SPARC64]: PCI device scan is way too verbose by default.
These messages were very useful when bringing up the
OBP based PCI device scan code, but it's just a lot
of noise every bootup now especially on big machines.

The messages can be re-enabled via 'ofpci_debug=1' on
the kernel command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:18 -07:00
59db8102bd [SPARC64]: Don't be picky about virtual-dma values on sun4v.
Handle arbitrary base and length values as long as they
are multiples of IO_PAGE_SIZE.

Bug found by Arun Kumar Rao.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:15 -07:00
ca967258b6 all-archs: consolidate .data section definition in asm-generic
With this consolidation we can now modify the .data
section definition in one spot for all archs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
7664709b44 all-archs: consolidate .text section definition in asm-generic
Move definition of .text section to asm-generic.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
03983ab858 [SPARC64]: Fix sched_clock() et al.
SPARC64_NSEC_PER_CYC_SHIFT was set too high.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17 22:55:26 -07:00
c7754d465b [SPARC64]: Add hypervisor API negotiation and fix console bugs.
Hypervisor interfaces need to be negotiated in order to use
some API calls reliably.  So add a small set of interfaces
to request API versions and query current settings.

This allows us to fix some bugs in the hypervisor console:

1) If we can negotiate API group CORE of at least major 1
   minor 1 we can use con_read and con_write which can improve
   console performance quite a bit.

2) When we do a console write request, we should hold the
   spinlock around the whole request, not a byte at a time.
   What would happen is that it's easy for output from
   different cpus to get mixed with each other.

3) Use consistent udelay() based polling, udelay(1) each
   loop with a limit of 1000 polls to handle stuck hypervisor
   console.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-15 20:23:02 -07:00
be35cf01a9 [SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 04:19:01 -07:00
17f34f0ec9 [SPARC64]: Add missing cpus_empty() check in hypervisor xcall handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:01:52 -07:00
49d23cfcec [SPARC64]: Be more resiliant with PCI I/O space regs.
If we miss on the ranges, just toss the translation up to the parent
instead of failing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-13 22:01:18 -07:00
8354c5b726 [SPARC]: Wire up signalfd/timerfd/eventfd syscalls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 22:06:51 -07:00
d037e0532e [SPARC64]: Add support for bq4802 TOD chip, as found on ultra45.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:27 -07:00
95d71e663e [SPARC64]: Correct FIRE_IOMMU_FLUSHINV register offset.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:26 -07:00