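At this week's Hot Chips, the annual U.S. gathering of top microprocessor architects, ARM was one of the stars of the show with multiple presentations, but Intel is clearly still the dominant force in microprocessors. IBM and Oracle also gave talks showing that their Power 8 and Sparc architectures each still hold a place in the market, and two talks showed that chip stacks are on the rise.
Notably, during the three-day Hot Chips event I never once heard anyone use the acronym "CPU," which seems to have become a dated term. The advent of mainstream 28nm process technology, along with a handful of chips built on more aggressive nodes, has clearly moved the microprocessor into the system-on-chip (SoC) era. This historical trend may be why ARM drew so much attention at the event; x86 may still be the heavyweight architecture, but ARM cores are the ones that show up in most engineers' designs. Highlights from Hot Chips 2014 follow.
ARM CTO Mike Muller described a test chip for Internet of Things (IoT) applications with an embedded sensor hub. Muller said the last time he presented ARM technology at Hot Chips was back in 1992.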

Separately, Muller called on engineers to take part in defining a new Jedec memory standard that will succeed today's DDR4 spec.

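AMD described its first ARM-based server SoC, code-named Seattle, which is scheduled to enter production before the end of this year. The chip uses a standard 64-bit ARM core, which may attract customers with compatibility concerns about competing products that use custom cores, said Insight64 analyst Nathan Brookwood.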


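Applied Micro shows X-Gene road map
Applied Micro's 40nm X-Gene family of 64-bit ARM-based server SoCs is already in volume production. At the conference the company described its second-generation X-Gene, still in development, and a third generation whose plans are only just taking shape.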

Like AMD's Seattle, Applied Micro's second-generation X-Gene chip will have eight 64-bit ARM cores. Initial estimates put its performance 30-100% above the first generation.

Applied Micro's second-generation X-Gene chip architecture

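Denver was reportedly first planned as an x86 chip using technology from Transmeta, but it evolved instead to use a novel execution optimizer in place of the full out-of-order design adopted by competitors.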

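IBM builds new software for Power 8
IBM unveiled the Power 8 processor at last year's Hot Chips; this year it presented a new software stack developed for the chip. The Power architecture is open to all chip and system developers, and IBM rewrote its software stack, including a new version of Linux, a new hypervisor, new firmware and more.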


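Power 8 was hailed as one of the most scalable server processors when it debuted
The M7 is tailored for Oracle's software stack, with features that include Java garbage collection acceleration and an Oracle database query accelerator, said Tirias Research analyst Kevin Krewell.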

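AMD also disclosed details of Kaveri, its client computing CPU/GPU combo chip, first shown last year. It merits a closer look because, thanks to AMD's Heterogeneous Systems Architecture (HSA), it is the first chip to support a coherent connection and full memory sharing between the CPU and GPU. The open question now is who will ship the second HSA chip, and when.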


SK Hynix also detailed progress on its High Bandwidth Memory.

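Startup ThruChip Communications has developed a way to distribute power wirelessly across a chip, said microprocessor veteran Dave Ditzel. Combined with the company's inductive coupling technology for passing data between chips, the technique offers an alternative to expensive and complex through-silicon vias.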

ThruChip Communications' wireless chip-stacking architecture

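Among the hottest topics at this year's Hot Chips from outside the computing field was Microsoft's plan to use Altera FPGAs to accelerate Bing in its data centers, which opens a new application for programmable logic. Baidu also said it supports the shift.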

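Microsoft data centers will use Altera FPGAs to accelerate Bing
Separately, a bitcoin mining startup used a high degree of design reuse to break speed records, delivering full systems in less than a year.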

CUPERTINO, Calif. — ARM commanded much of the spotlight with a half-dozen presentations at Hot Chips, an annual gathering of top microprocessor architects. But the event made it clear Intel is still king of the silicon hill.
IBM and Oracle gave talks showing their Power 8 and Sparc architectures are still very much in the game. And two talks showed that chip stacks are on the rise.
One acronym I never heard uttered during the three-day event was CPU, a term that has become so last century. The advent of mainstream 28nm process technology, along with a handful of chips built on more aggressive nodes, has clearly pushed the microprocessor into the SoC era.
Perhaps this historical trend is why ARM figured so prominently in the proceedings. The x86 may still be the 900-pound gorilla, but the ARM core is the one available for most engineers' designs.
ARM tests IoT
ARM's CTO Mike Muller described a test chip for the Internet of Things that embedded a sensor hub (below), a separate chip in some of today's products. Muller noted that back in 1992 he gave the sole ARM paper at Hot Chips. Fast forward to 2014 and his talk was one of six on ARM chips.
Separately, Muller also called for engineers to get involved in a new Jedec memory standard that will be a follow-on to today's DDR4 spec.
AMD tours Seattle
AMD described Seattle, its first ARM-based server SoC. It aims to have it in production before the end of the year. Use of a standard ARM 64-bit core may attract buyers concerned about possible incompatibility issues with custom cores some competitors are designing, said analyst Nathan Brookwood of Insight64.
As is the case with many server SoCs, memory dominates the Seattle die.
Applied Micro shows X-Gene road map
Applied Micro said it has production volumes of its 40nm X-Gene, one of the first 64-bit ARM-based server SoCs to hit the market. It detailed its second-generation part, now in the lab, and sketched out rough plans for a third generation.
Like AMD's Seattle, Applied's Shadowcat, its second-generation X-Gene, will sport eight 64-bit ARM cores (see above). Initial performance figures show the chip offering a 30-100% performance boost over the first-gen part (see below).
Intel rules the microserver
Intel described Avoton, its second-generation Atom-based server SoC, intended to stave off competition from ARM-based servers. As it turned out, competitors have hardly made it to market. Meanwhile, Intel grabbed the early sockets for low-power server-class SoCs (see below).
Nvidia's Denver stays mobile
Denver, Nvidia's custom 64-bit ARM core, made its debut in a mobile SoC. Reports say Nvidia cancelled its plans for a server part, but the company declined to comment.
Initially, Denver was going to be an x86 chip, reportedly using technology from Transmeta. But it morphed instead into an ARM chip using a novel optimizer as an alternative to doing a full out-of-order design like its server SoC competitors Applied Micro, Broadcom and Cavium. It may take until some time in 2015 before benchmarks with all the chips determine which route was best for performance per watt.
Intel grabs Avago's Axxia
Avago described Axxia, the ARM-based networking SoC it acquired with LSI. Two days later it sold the chip to Intel, which is expected to design a next-generation part based on the x86, the approach Intel took with the mobile application processor it acquired with Infineon, noted analyst Brookwood.
Power 8 revamps software
The big news about IBM's Power 8, disclosed at last year's event, was a new software stack (see above). In an effort to make Power open to any chip or system developer, IBM rewrote its software stack, including a new version of Linux, a new hypervisor, new firmware, and other goodies.
Power 8 was hailed as one of the most scalable server processors when it originally debuted.
Oracle packs 32 Sparc cores
Oracle disclosed its largest Sparc chip to date. The M7 packs 32 multithreaded cores and can expand into a 32-way system with 8,000 threads. 'I was blown away by the scalability of the M7 in the memory hierarchy, interconnect and cache sizes. This is a big ambitious chip,' said analyst Brookwood.
The M7 is tailored for Oracle's software stack, sporting 'features such as Java garbage collection acceleration and an Oracle database query accelerator,' said analyst Kevin Krewell of Tirias Research.
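As a rough check on the "8,000 threads" figure, the arithmetic can be sketched as below. The article gives 32 cores per chip and a 32-way system; the eight hardware threads per core is an assumption not stated in the article, and the function name is illustrative only.

```python
# Figures from the article
CORES_PER_CHIP = 32   # "The M7 packs 32 multithreaded cores"
MAX_SOCKETS = 32      # "can expand into a 32-way system"

# Assumption (not stated in the article): 8 hardware threads per core
THREADS_PER_CORE = 8


def total_threads(sockets=MAX_SOCKETS,
                  cores_per_chip=CORES_PER_CHIP,
                  threads_per_core=THREADS_PER_CORE):
    """Return the hardware thread count for a system of `sockets` chips."""
    return sockets * cores_per_chip * threads_per_core


if __name__ == "__main__":
    print(total_threads(sockets=1))  # threads on a single chip
    print(total_threads())           # threads in a fully populated system
```

Under that assumption a full 32-way system comes to 8,192 threads, which is consistent with the rounded figure quoted above.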
AMD's coherent message
AMD did a deep dive on Kaveri, its client computing CPU/GPU combo first described last year. It was worth a closer look because it's the first chip to support a coherent connection and full memory sharing between the CPU and GPU cores thanks to AMD's Heterogeneous Systems Architecture. Now the big question is, who will have the second HSA-compliant chip? And when?
Chip stacks get real
Xilinx pioneered 2.5-D chip stacks, putting memory and FPGA next to each other on a substrate. SK Hynix said that's just the start, with graphics apps coming next, followed by networking chips, and someday maybe even the Holy Grail of smartphones. The company detailed the advantages of its High Bandwidth Memory offering (see below).
Wireless chip stacks
Startup ThruChip Communications has found a way to wirelessly distribute power across a chip, said microprocessor veteran Dave Ditzel. The technique (above), when combined with its inductive coupling method for passing data between chip dies, provides an alternative to expensive and complex through-silicon vias, he said. The coupling technology is looking even better for those using the new ultra-thin wafers (see below) now emerging from the labs because the thin chips allow for smaller coils, he added.
FPGAs, Bitcoin rising
Two of the hottest Hot Chips talks came from outside the mainstream computing field the conference usually addresses. Microsoft's plan to adopt Altera FPGAs to accelerate Bing in its data centers opens a new door for programmable logic (above), a shift Baidu also said it supports. Separately, a bitcoin mining startup beat speed records, getting full systems out the door in well under a year (see below) thanks to a high degree of design reuse.
