Mike Watts, product manager, NI
Reliability testing has long served as a method of ensuring that semiconductor devices maintain their desired performance over a given lifetime. As integrated circuit (IC) manufacturers continue to introduce new and innovative processes with decreasing device geometries, they need to ensure the additional complexity from these changes does not affect the long-term reliability of their ICs. Additionally, major technology trends in autonomous driving, cloud-based data storage and life sciences are forcing IC suppliers to provide higher assurances of product reliability to their customers who work on mission-critical applications.
The PCI eXtensions for Instrumentation (PXI) modular SMU platform provides scalable, high-density solutions for test applications.
These two trends are driving semiconductor manufacturers to vastly increase the amount of reliability data they collect and analyse while decreasing the cost of test. When faced with this problem of more data at a lower cost, many reliability engineers find they cannot solve it using traditional reliability solutions, so they are turning toward modular, flexible solutions that can be scaled to fit their needs.
Reliability testing
Device reliability is typically modelled as failure rate over time, with the highest failure rates occurring immediately after manufacturing and again after the product has exceeded its useful lifetime.
A typical model of device reliability.
The left side of the graph above shows early failures, which are often caused by defects in the manufacturing process. These failures can be screened during production to minimise the number of defective parts sent to customers. However, the functional tests performed during production cannot identify defects that cause the device to wear out prematurely, and they offer no insight into the product’s usable lifetime. Reliability testing identifies these failure mechanisms and estimates the product’s usable lifetime.
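The shape of this failure-rate model can be reproduced with a short numerical sketch. The version below sums three Weibull hazard terms, which is one common way to approximate the curve; the Weibull form and every parameter value here are assumptions chosen only to produce the characteristic shape, not data from any real device.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of early-failure, random and wearout hazards (hypothetical parameters)."""
    early = weibull_hazard(t, shape=0.5, scale=50_000.0)     # decreasing: shape < 1
    random_ = weibull_hazard(t, shape=1.0, scale=200_000.0)  # constant: shape == 1
    wearout = weibull_hazard(t, shape=5.0, scale=100_000.0)  # increasing: shape > 1
    return early + random_ + wearout


if __name__ == "__main__":
    for hours in (10, 100, 1_000, 10_000, 50_000, 100_000):
        print(f"{hours:>7} h  failure rate = {bathtub_hazard(hours):.3e} per hour")
```

With these placeholder values the failure rate starts high, falls to a flat middle region and rises again as the wearout term takes over.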
Reliability testing involves stressing a device at the extreme ends of its specifications—usually voltage and temperature—to accelerate wearout and model usable lifetime against known failure mechanisms. These tests can be performed on a wafer or packaged part. Wafer-level reliability (WLR) provides more data earlier in the manufacturing process, without the cost and potential damage associated with cutting and packaging the IC.
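To give a feel for how stress accelerates wearout, the sketch below computes a temperature acceleration factor with the Arrhenius model. The article does not name a specific model, so the choice of Arrhenius is an assumption, and the activation energy and temperatures are hypothetical values picked for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor between use and stress temperatures (Arrhenius model)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical mechanism: 0.7 eV activation energy, 55 C use, 125 C stress.
    af = arrhenius_acceleration(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)
    print(f"Acceleration factor: {af:.1f}")
    print(f"1,000 stress hours correspond to roughly {1000 * af:,.0f} use hours")
```

Voltage acceleration is usually handled with a separate, mechanism-specific model and is not covered by this sketch.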
WLR
WLR is a type of parametric testing that extracts information about the device’s usable lifetime and long-term reliability. These tests are typically performed not on the actual IC being developed but on a set of test structures or purpose-built dies that are built into the wafer specifically for gathering parametric data. These test structures consist of fundamental wafer elements (e.g., transistors, capacitors and resistors) or basic circuits (e.g., ring oscillators), which provide insight into the fabrication process. Most WLR tests involve applying a stress, such as voltage or temperature, and measuring the response of the device to monitor for any signs of degradation. Common failure mechanisms evaluated include bias temperature instability (BTI) or negative bias temperature instability (NBTI), hot-carrier-induced degradation (HCI), time-dependent dielectric breakdown (TDDB) and electromigration (EM).
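The stress-then-measure pattern described above can be sketched as a simple loop. Everything in this example is hypothetical: `SimulatedDevice` stands in for a test structure and its instrument, the power-law drift is a made-up degradation model, and none of the names correspond to a real instrument driver API.

```python
class SimulatedDevice:
    """Stand-in for a test structure; degradation follows a made-up power law."""

    def __init__(self, prefactor_v=0.002, exponent=0.2):
        self.prefactor_v = prefactor_v
        self.exponent = exponent
        self.stress_seconds = 0.0

    def apply_stress(self, seconds):
        # A real system would hold the stress voltage and temperature here.
        self.stress_seconds += seconds

    def measure_shift_v(self):
        # A real system would switch to a measurement bias and read the SMU.
        return self.prefactor_v * self.stress_seconds ** self.exponent


def stress_measure(device, checkpoints_s, fail_shift_v):
    """Alternate stress and measurement; stop at the first reading past the limit."""
    log = []
    for target_s in checkpoints_s:
        device.apply_stress(target_s - device.stress_seconds)
        shift_v = device.measure_shift_v()
        log.append((target_s, shift_v))
        if shift_v >= fail_shift_v:
            break
    return log


if __name__ == "__main__":
    checkpoints = [1, 10, 100, 1_000, 10_000, 100_000]  # log-spaced, in seconds
    for t_s, shift_v in stress_measure(SimulatedDevice(), checkpoints, 0.010):
        print(f"t = {t_s:>7} s  parameter shift = {shift_v * 1e3:.2f} mV")
```

Log-spaced checkpoints are used here because degradation is often tracked over several decades of stress time; the failure limit of 10 mV is arbitrary.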
Traditional approach to building WLR systems
Traditional WLR systems vary in both measurement capability and architecture. Specialised WLR systems may involve high-frequency alternating current (AC) or pulsed stimulus; however, most complementary metal-oxide semiconductor (CMOS) devices are tested with direct current (DC) instruments such as source measure units (SMUs), which provide the necessary stress and measurement capability for collecting parametric data. The two main approaches for building WLR systems involve either building a rack-and-stack system from traditional box instrumentation or buying a purpose-built turnkey system.
Rack-and-stack systems
SMUs are traditionally expensive, high-precision DC instruments that tend to limit the number of channels you can place in a standard test rack. Because of these constraints, SMUs are often combined with a low-leakage switching matrix to route signals from the SMU to dozens of test points while minimising the noise, leakage current and thermal EMF associated with relays. This approach works reasonably well when serial testing of a small number of test structures is enough to generate statistically significant reliability data. Additionally, switching is a practical extension of a box instrument that historically has cost US$5,000 to US$10,000 per channel and would otherwise be limited to 20 or 40 channels in a full 19 in. test rack. However, given the performance expectations for the relays, the switching subsystem is often a large and expensive piece of the WLR system.
Turnkey systems
The alternative approach is to purchase a purpose-built turnkey system that is prepackaged with all components such as the oven, test rack, instrumentation and software. Aligning your test requirements with the functionality of the equipment saves development and integration time but requires a large capital budget. These systems are often built with a fixed number of channels, hardware specifications and software, and are serviced by the vendor. System vendors may sell separate systems for wafer-level and packaged-part reliability testing, or they may sell the same system for both applications regardless of the differences in test requirements.
Challenges of traditional WLR systems
The traditional WLR approaches of either building rack-and-stack systems from traditional box instrumentation or buying purpose-built turnkey systems served their purpose for decades. However, many engineers are finding these architectures do not scale well to meet their evolving data and cost requirements.
Rack-and-stack systems are limited by the low channel density of traditional box SMUs. Because reliability stresses often require fixed stimulus times, the best way to increase data velocity, or the amount of data that can be gathered in the same (or less) time, is to increase parallelism. The limited channel density of traditional box SMUs makes it difficult to build high-channel-count systems with a small footprint and often forces engineers to use a switched topology to multiplex the SMU to multiple pins. However, this switched topology quickly becomes a bottleneck because the pins are tested serially instead of in parallel, so it fails to deliver the desired increase in data velocity.
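The effect of parallelism on test time comes down to simple arithmetic when the stress time per structure is fixed. The sketch below is a simplified model: it assumes one channel per test structure and ignores switching, settling and measurement overhead, and the structure count and stress time are hypothetical.

```python
import math


def total_test_time_s(num_structures, stress_time_s, parallel_channels):
    """Time to stress every structure when `parallel_channels` run at once."""
    batches = math.ceil(num_structures / parallel_channels)
    return batches * stress_time_s


if __name__ == "__main__":
    structures = 96          # hypothetical number of test structures
    stress_time_s = 1_000    # hypothetical fixed stress time per structure

    serial = total_test_time_s(structures, stress_time_s, parallel_channels=1)
    parallel = total_test_time_s(structures, stress_time_s, parallel_channels=96)
    print(f"Switched, one SMU: {serial:,} s")
    print(f"SMU per structure: {parallel:,} s")
```

Because the stress time cannot be shortened, the only term left to reduce is the number of batches, which is why channel count dominates throughput.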
Turnkey systems either lack the flexibility needed to modify the test software or hardware as device requirements change, or make such modifications prohibitively expensive.
Because of these challenges, many companies are starting to build parallel test systems using modular instrumentation.
An alternative approach to building WLR systems
The market for test instrumentation has changed dramatically over the past decade with the rise of modular instrumentation platforms such as PCI eXtensions for Instrumentation (PXI) [1]. Modular platforms have grown increasingly desirable for building automated test systems because of their extensive input/output (I/O) capability, compact form factor and flexible software.
Using a modular approach, the footprint of WLR systems can be reduced dramatically without sacrificing measurement quality. The open software architecture allows manufacturers to define the functionality of their system, modify tests and add hardware as their requirements change. This includes integrating the latest multicore processors, maximising system uptime through health and monitoring tools, and adding I/O.
High-density SMUs
By using PXI SMUs as the foundation for WLR systems, hundreds of SMU channels can be added while maintaining a reasonable footprint and cost per channel. NI SMUs are designed for building automated test systems, and the modular architecture can be used to optimise the number of channels and device specifications of the overall system. With this high channel density, placing switches between the SMU and the wafer can be avoided. Instead, each test pad can be connected directly to a high-precision SMU channel. This SMU-per-pin architecture avoids the negative impact that switches have on signal integrity, test time and test routine flexibility, which in turn makes advanced stress-measure algorithms easier to implement.
A highly parallel, SMU-per-pin architecture can significantly reduce total WLR cycle time compared with a traditional multiplexed architecture.
NI SMUs provide a high channel count. A WLR system based on PXI SMUs offers the following:
- High density—Up to 68 SMU channels can be fitted in a single 4U, 19 in. PXI chassis and several chassis mounted in a single automated test rack to achieve hundreds of independent SMU channels per system.
- High-precision measurements—With measurement sensitivity ranging from 10 fA to 10 pA, the measurement quality of the system need not be sacrificed.
- A high-speed sequencing engine—Large, hardware-timed sequences can be streamed to SMUs in the system and all channels can be synchronised. This provides very fast execution rates and deterministic sourcing and sampling.
- A built-in digitiser—With sample rates greater than 600 kS/s, transient device recovery behaviour can be captured without an external oscilloscope.
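The density and digitiser figures in the list above translate into quick sizing estimates. The sketch below uses the 68-channel, 4U chassis figure and the 600 kS/s rate quoted above (treated as a lower bound); the target channel count and capture window are hypothetical, and the function names are illustrative only.

```python
import math

CHANNELS_PER_CHASSIS = 68   # from the list above: up to 68 SMU channels per chassis
CHASSIS_HEIGHT_U = 4        # from the list above: 4U, 19 in. chassis
MIN_SAMPLE_RATE_HZ = 600_000  # quoted as "greater than 600 kS/s"


def chassis_needed(channels):
    """Number of fully populated chassis required for `channels` SMU channels."""
    return math.ceil(channels / CHANNELS_PER_CHASSIS)


def rack_units_needed(channels):
    """Rack height in U taken up by the chassis alone (no controller or oven)."""
    return chassis_needed(channels) * CHASSIS_HEIGHT_U


def samples_captured(window_s, sample_rate_hz=MIN_SAMPLE_RATE_HZ):
    """Samples recorded in a capture window at the given sample rate."""
    return round(window_s * sample_rate_hz)


if __name__ == "__main__":
    channels = 200  # hypothetical system size
    print(f"{channels} channels -> {chassis_needed(channels)} chassis, "
          f"{rack_units_needed(channels)}U of rack space")
    print(f"1 ms recovery window -> at least {samples_captured(0.001)} samples")
```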
High uptime and serviceability
Ensuring system uptime is critical for both inline and offline reliability systems. If an inline system fails, wafer production can come to a halt. Offline reliability tests, which are often executed over the course of months or years, offer critical data on the product’s expected lifetime. Because of these requirements, reliability testers must stay online and collect data continuously throughout the experiment; a failed tester could lead to a failed experiment.
A high-uptime PXI chassis with redundant fans and power supplies.
The PXI SMU platform provides numerous benefits for developing high-uptime, critical applications. For example, a system can be built using a chassis that has redundant, hot-swappable fans and power supplies. If a component malfunctions, it can be replaced without powering down the system and aborting the experiment. Additionally, the health of the system can be remotely monitored for fan speed, temperature, power consumption and other key parameters that may indicate an imminent failure.
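Remote health monitoring of this kind usually reduces to comparing periodic readings against limits. The sketch below shows that check in isolation; the parameter names, limit values and the `check_health` function are all hypothetical, and a real system would obtain the readings from the chassis vendor's monitoring interface.

```python
# Hypothetical acceptable ranges for each monitored parameter.
LIMITS = {
    "fan_rpm": (2000, 6000),
    "temperature_c": (10, 55),
    "power_w": (0, 800),
}


def check_health(readings, limits=LIMITS):
    """Return a list of warnings for missing or out-of-range readings."""
    warnings = []
    for name, (low, high) in limits.items():
        value = readings.get(name)
        if value is None:
            warnings.append(f"{name}: no reading")
        elif not low <= value <= high:
            warnings.append(f"{name}: {value} outside {low}..{high}")
    return warnings


if __name__ == "__main__":
    snapshot = {"fan_rpm": 1200, "temperature_c": 48, "power_w": 410}
    for warning in check_health(snapshot):
        print("WARNING:", warning)
```

A missing reading is reported as a warning rather than ignored, since a silent sensor can itself be an early sign of a fault.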
Access to the latest commercial processors
Parallel test systems must not be bottlenecked by a lack of processing capability or by communication latency. One advantage of building parallel WLR systems with PXI is the ability to use controllers with the latest multicore Intel processors. Additionally, the chassis backplane allows for low-latency communication between the processor and modules as well as module-to-module communication with digital triggers. For parallel WLR systems, this means the detailed sequence execution can be offloaded to each SMU while the controller is reserved for data collection and analysis.
PXI-based test systems provide access to the latest commercial processors.
Summary
Traditional reliability systems have served their purpose for decades; however, they are increasingly unable to deliver and analyse the massive amounts of reliability data now required. To address this need, many companies are turning to modular platforms, such as PXI, for highly parallel WLR systems with high uptime and the latest commercial processors. Using the software-defined architecture of these systems, companies can maintain control of their intellectual property (IP) and scale their systems as requirements change. This approach satisfies their need for more reliability data at a lower cost and positions them well to address ever-changing test requirements in the future.
Reference
[1] PCI eXtensions for Instrumentation (PXI) is a PC-based, modular instrumentation platform for measurement and automation systems. It provides power, cooling and a communication bus to support multiple instrumentation modules within the same enclosure.