I attended Hot Chips 17 on August 15 & 16 at the Stanford Memorial Auditorium at Stanford University. This annual conference looks at interesting developments in chip design. The focus is on technology, not business. It is put on by chip designers for chip designers. About 520 people attended this year, which is up from recent years, but much lower than during The Bubble.
Hot Chips is very good for observing fads and trends. In previous years, the hot trend seemed to be graphics chips or routers. This year, it is low (or reduced) power. Heat is becoming an increasingly important issue: the system costs associated with increasing clock speeds are becoming unmanageable. Also, applications are becoming more mobile, so there are large rewards for reducing power consumption to increase battery life.
The semiconductor industry has been dominated by the performance/cost characteristics of microprocessors, resulting in a 30-year stall in architecture. But now performance/watt is changing the economics of microprocessors, opening the door to more innovation.
The key to increasing performance while decreasing power is parallelism. The cycles become slower, but the amount of work done in each cycle is greater. We saw lots of techniques that had been practiced in machines like the Cray-1 and the Transputer; parallelism may be coming back in style.
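A rough illustration of why (my own numbers, not from any presentation): dynamic power in CMOS scales roughly with capacitance, the square of the supply voltage, and the clock frequency. Running two slower cores at a lower voltage can deliver the same throughput for less power, assuming the work parallelizes.

    P ≈ C · V² · f

    One core:                      P1 ≈ C · V² · f
    Two cores at f/2 and 0.8·V:    P2 ≈ 2 · C · (0.8·V)² · (f/2) = 0.64 · C · V² · f

Same aggregate work per second, roughly a third less power.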
We saw lots of multiprocessors and vector processors. The problem with parallelism is, at its root, the same problem we have with sequential programming: The Software Crisis. It is too difficult to produce large amounts of high quality software, and the problem is exponentially more difficult when the software must run in parallel. This is why Intel and others have been pursuing superscalar architectures: the software is easier. But now low power will force the industry to reconsider parallel programming. A number of vendors showed products that they claimed could take ordinary C programs and automatically turn them into efficient parallel programs. I think such systems will be very disappointing.
An easier approach to the software problem is Threading. A dual-core processor can run two threads at once: one thread for the application, and one thread for the operating system, spyware, malware, and viruses. The benefit, for most applications, is much less than the optimal 2X. To get benefit from multicore processors, we need to change the way we write software. The simple-minded approach of using threads at the application level will result in a significant decrease of reliability: Threads are evil. The likelihood of introducing races and deadlocks is unacceptably high.
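To make the race hazard concrete, here is a minimal sketch in C with POSIX threads (my own example, not anything shown at the conference). Two threads increment a shared counter with no synchronization; the read-modify-write is not atomic, so updates are lost and the final count is usually wrong.

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;                /* shared, unprotected */

    static void *bump(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 1000000; i += 1) {
            counter += 1;                   /* read-modify-write: not atomic */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, bump, NULL);
        pthread_create(&b, NULL, bump, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld\n", counter); /* expected 2000000; the race loses updates */
        return 0;
    }

A mutex fixes this particular case, but finding every such interaction in a large application, and proving that the locks themselves never deadlock, is exactly what makes application-level threading so risky.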
There was a panel on The Next Killer Application. This is generally a product issue, not a technology issue: designers want to do what they want to do, and they hope that there will be customers for it. A killer app is an unexpected, hugely successful product. Two of the panelists thought that the next killer app would be Digital Media, particularly H.264 processors. One thought it would be selling technology starter kits to the billion families in India and China who will soon be coming online. One thought it would be Sync, meaning intelligent access to all of your data, everywhere, at any time. Another thought it would be Model Based Computing.
There were questions about DRM. These engineers do not like DRM, and they think real people will not like it either. There was concern that H.264 and VC1 are being used to overcompress, resulting in lower image quality. There was also a comment that media creation tools might be more important than media consumption tools. There was an opinion that we should be developing products that increase the biological quality of life, not products that keep us distracted until we die. It seems strange to use the word "killer" to describe a product that keeps us alive.
There were a number of chips presented that had media applications.
IBM presented the Cell multiprocessor. It combines a Power Processor with eight SPEs (Synergistic Processor Elements). An SPE is a SIMD data processor. They suggested that Cell could be used for video encoding/decoding. Cells can be linked together to provide greater performance.
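As a rough sketch of what "SIMD" means (generic C, not actual SPE code, which uses 128-bit vector registers and intrinsics): a single vector instruction applies the same operation to several data elements at once, which is why it suits pixel and audio-sample processing.

    #include <stdio.h>

    typedef struct { float lane[4]; } vec4;  /* stand-in for a vector register */

    /* One "vector add": four element additions in a single operation. */
    static vec4 vadd(vec4 a, vec4 b)
    {
        vec4 r;
        for (int i = 0; i < 4; i += 1) {
            r.lane[i] = a.lane[i] + b.lane[i];
        }
        return r;
    }

    int main(void)
    {
        vec4 a = {{1, 2, 3, 4}};
        vec4 b = {{10, 20, 30, 40}};
        vec4 c = vadd(a, b);                 /* scalar code would need four separate adds */
        printf("%g %g %g %g\n", c.lane[0], c.lane[1], c.lane[2], c.lane[3]);
        return 0;
    }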
Toshiba presented SCC (Super Companion Chip), an accessory for Cell. It provides PC interfaces (USB, ATA, Ethernet, etc.) and AV interfaces. An SCC with a Cell could be used to implement an HDTV set or DVR.
Telairity presented a system for doing H.264 encoding using 4 or 8 of its Telairity-1 chips. (This chip was presented at Hot Chips 15.) Each chip contains 5 independent vector/scalar processors. These will be very expensive, professional encoders.
Tensilica presented its HiFi 2 audio processing core. It is low power for portable applications. The architecture allows for application-specific instructions, which can support very efficient codecs.
Philips presented the PNX1700, the latest in its Nexperia family that began 8 years ago with the TM1000. It is intended for media center applications. It includes a super-pipelined VLIW engine, a DVD controller, lots of I/O interfaces, and a scaler/line doubler.
Cradle presented CT3616, a multiprocessor DSP. A chip contains 2 quads; a quad contains 4 general purpose processors and 8 SIMD DSP engines.
NVIDIA presented the GeForce 7800, a 3D graphics chip. Performance of this high-end GPU is limited by the speed of the CPU that drives it, so NVIDIA needs to encourage the rest of the industry to get much better at parallel programming in order to improve its own performance.
Microsoft presented Xbox 360, the successor to the Xbox video game console. The system includes a CPU chip which contains 3 Power Processor cores with vector units, and a high-performance GPU containing 48 shader cores. Microsoft decided that this system did not have to be Microsoft compatible, so the designers were freed from most of the inherent inefficiencies that are associated with Microsoft engineering.
Xbox 360 will have a standard DVD player and HDTV output.