A new ultra-high sensitivity, low-power optical receiver based on a decision-feedback equalizer

Alexander V. Rylyakov, Clint L. Schow, Jeffrey A. Kash
IBM Research, T. J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598
sasha@us.ibm.com

Abstract: A decision-feedback equalizer is used to recover data at baud rates far above the bandwidth of a low-noise TIA front-end. The overall 90-nm CMOS DC-coupled clocked receiver has better than -25 dBm power sensitivity at 4 Gb/s, dissipating 1.2 pJ/bit.

OCIS codes: (060.2360) Fiber optics links and subsystems; (060.2380) Fiber optics sources and detectors; (200.4650) Optical interconnects.

1. Introduction
Modern supercomputers and data centers rely on a massive number of parallel optical interconnects to meet the demand for high-bandwidth communication between racks [1]. The upcoming IBM Blue Waters system [2] will be the first to extend the optical links to the module level. Future exascale machines with tens to hundreds of millions of optical links will need high-performance communication circuits that simultaneously achieve low power, high sensitivity and efficiency in a small footprint [3]. A high-sensitivity, low-power receiver is a critical component, determining the overall optical power budget of a short-reach optical interconnect system. This is particularly true for future highly integrated silicon photonic communications, where the optical power from a single “laser power supply” source can be split between multiple channels.

In this work, we investigate an optical receiver based on a high-gain transimpedance amplifier (TIA) that achieves low noise by lowering the bandwidth. The bandwidth reduction causes significant inter-symbol interference (ISI) at the target data rate, which is then cancelled by a decision-feedback equalizer (DFE). DFE is a well-known type of ISI canceller commonly used to overcome the limitations of the communication channel.

Fig. 1. Block diagram of the 90-nm CMOS receiver chip with TIA schematic and simulated 10 Gb/s PD output current $I_{PD}$, differential DFE input $V_{IN}$ and single-ended 1:2 demultiplexed 5 Gb/s final receiver output $D_2$.

Typically, very short reach (module-to-module, chip-to-chip, or on-chip) optical interconnects feature nearly ideal channel performance with little or no need for equalization. The DFE is used here to compensate for the limited bandwidth of the receiver, not that of the channel. This approach also differs substantially from the standard technique of combining a TIA or high-impedance front end with a linear peaking amplifier or equalizer [4]. A peaking amplifier compensates for ISI by introducing additional gain at higher frequencies, resulting in a significant increase in the total input-referred noise. In contrast, a DFE recovers data based only on the input signal and the history of the previous bits, adding only a negligible amount of noise at low bit error rates.

2. Receiver Design

The receiver block diagram is shown in Fig. 1. The output of the external photodiode (PD) is applied to the TIA (schematic shown in the inset of Fig. 1). The DC-coupled TIA produces a single-ended output, which is converted into a differential signal $V_{IN}$ by comparing it to the output of the replica TIA. The half-rate 2-tap DFE comprises the front-end samplers (S/H in Fig. 1) that hold the current data, latches L1 and L2 that keep track of the previous two bits, and summers (the plus signs in Fig. 1) where the two weighted previous bits are subtracted from the current bit to cancel the ISI. The samplers and the latches are clocked on opposite edges of the clock, so that the receiver operates as a conventional 1:2 demultiplexer (see [5] for more details).
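The decision loop described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit itself: bits are represented as +1/-1, samples are normalized to a main tap of 1, the two interleaved half-rate paths are modeled as one serial loop whose decisions are then split into two outputs, and the names `apply_isi` and `dfe_half_rate` are hypothetical. The tap weights 0.4 and 0.1 in the test data are the settings reported in Section 3.

```python
def apply_isi(bits, tap1, tap2, history=(-1, -1)):
    """Toy noise-free channel: add two post-cursors to a +/-1 bit sequence.

    history is the assumed (previous, second-previous) bit before the
    sequence starts; the value is arbitrary and chosen for illustration.
    """
    prev1, prev2 = history
    out = []
    for b in bits:
        out.append(b + tap1 * prev1 + tap2 * prev2)
        prev2, prev1 = prev1, b
    return out


def dfe_half_rate(samples, tap1, tap2, history=(-1, -1)):
    """Behavioral 2-tap DFE returning (even_bits, odd_bits, all_bits)."""
    prev1, prev2 = history
    decisions = []
    for x in samples:
        # Summer: subtract the weighted previous two decisions from the sample.
        corrected = x - tap1 * prev1 - tap2 * prev2
        # Latch: slice the corrected value to a binary decision.
        bit = 1 if corrected > 0 else -1
        decisions.append(bit)
        prev2, prev1 = prev1, bit
    # 1:2 demultiplexing: alternate decisions go to the two outputs.
    return decisions[0::2], decisions[1::2], decisions


bits = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1]
received = apply_isi(bits, 0.4, 0.1)
even_bits, odd_bits, recovered = dfe_half_rate(received, 0.4, 0.1)
```

In this noise-free toy case the feedback terms cancel the two post-cursors exactly, so `recovered` matches `bits`; a real receiver additionally has noise, finite latch sensitivity and loop timing constraints that this model ignores.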

The ISI introduced by a bandwidth-limited TIA is a smooth exponential RC decay, without any reflections or resonances. This is the main reason that a very “light”, 2-tap equalizer is adequate for our application. Unlike the general case of compensating for an external unknown channel, the TIA and the DFE can be optimally co-designed, minimizing noise, area and power dissipation. Another significant advantage of the clocked system is the absence of power-hungry limiting amplifier stages. The gain of the TIA is adequate for the DFE latches to make correct decisions, and the final digital data is regenerated rather than amplified.
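To make the exponential-decay argument concrete, the following sketch computes the main cursor and the first few post-cursors of an NRZ pulse passed through an ideal single-pole low-pass response, sampled at the end of each bit period. The single-pole model, the sampling instant, the function name `single_pole_cursors` and the 1 GHz bandwidth used in the example are assumptions for illustration; the actual TIA bandwidth is not stated in the text.

```python
import math


def single_pole_cursors(bandwidth_hz, bit_rate_bps, n_post=3):
    """Return [main, post1, post2, ...] for an ideal single-pole low-pass.

    An NRZ pulse of unit amplitude sampled at the end of its own bit period
    gives the main cursor (1 - a); each later bit period scales it by a,
    where a = exp(-T / tau) is the decay per bit period.
    """
    tau = 1.0 / (2.0 * math.pi * bandwidth_hz)   # RC time constant
    a = math.exp(-1.0 / (bit_rate_bps * tau))    # decay factor per bit period
    main = 1.0 - a
    return [main * a ** k for k in range(n_post + 1)]


# Hypothetical example: 1 GHz front-end bandwidth at 4 Gb/s.
cursors = single_pole_cursors(1.0e9, 4.0e9)
ratios = [c / cursors[0] for c in cursors[1:]]   # post-cursors relative to main
```

Because the post-cursors form a geometric series, each successive tap is smaller than the previous one by the same factor, which is consistent with the observation that a short equalizer captures most of the ISI in such a response.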

The power dissipation that we report is the sum of the VDD_TIA and VDD_CORE domains. The off-chip clock receiver circuit, the serial interface and the final output data drivers use a separate power supply. The power dissipation associated with those parts is not reported, as it is not relevant to the total power dissipation when the receiver is used in an integrated implementation. The core of the receiver occupies only 60µm x 75µm. With the inclusion of the digital-to-analog converters (DACs), the receiver area grows to 60µm x 230µm. The layout of the DACs was not optimized for area, and their size can be reduced significantly.

3. Experimental Results

The block diagram of the test setup, together with a picture of the receiver test site with the PD, is shown in Fig. 2. The receiver was tested with an Emcore PD [6], wire-bonded to the CMOS chip. The optical data stream was generated with a directly modulated 850-nm Emcore VCSEL. Both the PD and the VCSEL were fiber coupled with lensed fiber probes. The measured power sensitivity curves for 3 Gb/s and 4 Gb/s input data rates are shown in Fig. 3. The input current sensitivity of the receiver was converted into optical power assuming a PD efficiency (responsivity) of 1 A/W. The estimated efficiency of the 850-nm PD, which was used for convenience, was approximately 0.55 A/W. The two half-rate output data streams are also shown in Fig. 3. At 3 Gb/s and -25 dBm input power the receiver was operating with a BER less than $10^{-12}$ on both outputs. At 4 Gb/s the best measured BER was $10^{-9}$, limited by mechanical vibrations of the fiber probe. The receiver bathtub curve at 3 Gb/s is shown in Fig. 4, together with both demultiplexed outputs.
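The conversion between input current and optical power mentioned above can be sketched as follows. The helper names are hypothetical, and the example assumes that the quoted -25 dBm figure maps directly to a photocurrent through the stated 1 A/W efficiency (responsivity); the text does not say whether the power is an average or a modulation amplitude, so the sketch treats it simply as a power level.

```python
import math


def dbm_to_watts(power_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)


def watts_to_dbm(power_w):
    """Convert a power level in watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


def photocurrent_a(power_dbm, responsivity_a_per_w):
    """Photocurrent produced by a given optical power."""
    return dbm_to_watts(power_dbm) * responsivity_a_per_w


def optical_power_dbm(current_a, responsivity_a_per_w):
    """Optical power needed to produce a given photocurrent."""
    return watts_to_dbm(current_a / responsivity_a_per_w)


# Current equivalent of -25 dBm at the assumed 1 A/W (about 3.16 uA).
i_sens = photocurrent_a(-25.0, 1.0)
# Optical power giving the same current at the estimated 0.55 A/W.
p_actual_dbm = optical_power_dbm(i_sens, 0.55)
```

Under these assumptions, the same photocurrent at 0.55 A/W corresponds to an optical power roughly 2.6 dB higher than the 1 A/W figure, since $10\log_{10}(1/0.55) \approx 2.6$ dB.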

Fig. 2. Block diagram of the test setup, with picture of the 90-nm CMOS receiver test site with wire-bonded PD.

The horizontal eye opening, measured by adjusting the position of the input data relative to the clock, was 36% at a BER < $10^{-9}$. Testing was performed with both PRBS7 and PRBS31 data, demonstrating no dependence on the type of the test pattern used. The two DFE taps were set at 40% and 10% of the main tap. Power dissipation in this experiment was 4.6 mW (3.1 mA, 1.0 V for VDD_TIA and 1.7 mA, 0.9 V for VDD_CORE).
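The power and energy-per-bit figures can be cross-checked with a few lines of arithmetic. The function names are hypothetical, and the sketch assumes that the quoted supply currents apply at the 4 Gb/s data rate, which the text implies but does not state explicitly.

```python
def supply_power_w(rails):
    """Total power for a list of (current_a, voltage_v) supply rails."""
    return sum(current * voltage for current, voltage in rails)


def energy_per_bit_j(power_w, bit_rate_bps):
    """Energy dissipated per received bit."""
    return power_w / bit_rate_bps


# VDD_TIA: 3.1 mA at 1.0 V; VDD_CORE: 1.7 mA at 0.9 V (values from the text).
rails = [(3.1e-3, 1.0), (1.7e-3, 0.9)]
total_power = supply_power_w(rails)                  # about 4.63 mW
energy = energy_per_bit_j(total_power, 4.0e9)        # about 1.16 pJ/bit
```

The result, 3.1 mW + 1.53 mW = 4.63 mW and about 1.16 pJ/bit at 4 Gb/s, is consistent with the rounded 4.6 mW and 1.2 pJ/bit figures quoted in the text.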

4. Conclusions

We propose and demonstrate a high-sensitivity (-25 dBm), low-power (1.2 pJ/bit), small-area (60µm x 75µm core), 4 Gb/s 90-nm CMOS optical receiver, based on a new concept of processing the TIA output directly with a DFE. This receiver can be used in highly integrated, massively parallel optical communication applications, enabling high-density, low-power links.

5. References