MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com

# A Multi-Standards HDTV Video Decoder for Blu-Ray Disc Standard

Noriyuki Minegishi, Hidenori Sato, Yoshihiro Ogawa, Anthony Vetro, Fumitaka Izumihara, Masayuki Koyama

TR2008-004 April 2008

#### Abstract

This paper presents an HDTV video decoder core that is able to decode MPEG-2, MPEG-4 AVC and VC-1 formats and is fully compatible with the Blu-ray Disc standard. A novel re-configurable architecture is adopted to achieve reduced hardware, and a data compression method suitable for all video decoding standards is applied to reduce memory data usage and access bandwidth. The circuit volume of the core is 1.5Mgates with 90nm CMOS technology and operation clock frequency is 162MHz for 1080i at 30fr/s.

IEEE International Conference on Consumer Electronics, January 2008

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.

Copyright © Mitsubishi Electric Research Laboratories, Inc., 2008 201 Broadway, Cambridge, Massachusetts 02139



## A Multi-Standards HDTV Video Decoder for Blu-Ray Disc Standard

Noriyuki Minegishi<sup>1</sup>, Hidenori Sato<sup>1</sup>, Yoshihiro Ogawa<sup>1</sup>, Anthony Vetro<sup>2</sup>, Fumitaka Izumihara<sup>3</sup>, Masayuki Koyama<sup>3</sup>

<sup>1</sup> Mitsubishi Electric Corporation, Kanagawa, Japan
<sup>2</sup> Mitsubishi Electric Research Labs, Cambridge, USA
<sup>3</sup> Renesas Technology Corporation, Hyogo, Japan

**Abstract** — This paper presents an HDTV video decoder core that is able to decode MPEG-2, MPEG-4 AVC and VC-1 formats and is fully compatible with the Blu-ray Disc standard. A novel re-configurable architecture is adopted to achieve reduced hardware, and a data compression method suitable for all video decoding standards is applied to reduce memory data usage and access bandwidth. The circuit volume of the core is 1.5Mgates with 90nm CMOS technology and operation clock frequency is 162MHz for 1080i at 30fr/s.

#### INTRODUCTION

Several video compression standards, e.g., MPEG-2, H.264/MPEG-4 AVC and Windows Media Video (VC-1), have been established and are used in practical applications such as recent terrestrial broadcasting and high-compression optical discs. Semiconductor devices that support these standards for multimedia applications must achieve both high performance and cost effectiveness. Several solutions [1]-[3] have been introduced; however, none of them supports high-compression optical disc standards such as Blu-ray. To develop a chip for practical use, hardware size, memory usage and memory access bandwidth have to be considered. We propose a multi-standard video decoder core that adopts dynamic and static re-configurable techniques and a data compression method suitable for all video standards.

### **OVERVIEW OF THE CORE ARCHITECTURE**

The proposed video decoder architecture is shown in Figure 1. Given the requirements of the Advanced profile of VC-1 and the High profile of AVC, real-time entropy decoding cannot be realized at a reasonable clock frequency. Therefore, the overall decode operation is divided into two parts: the VLC decode section and the pixel operation section. The VLC decode section is dimensioned for the maximum bit rate (40 Mbit/s), and the pixel operation section is dimensioned for the maximum frame size and rate (1080i at 30fr/s). With this division, the architecture achieves the maximum performance required by the Blu-ray specification at a clock frequency of 162MHz.
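
As a rough check on these figures, the average cycle budget of the two sections can be estimated from the numbers quoted above. The sketch below assumes 16x16 macroblocks, a 1920-pixel-wide picture, and a coded height rounded up to a whole number of macroblock rows. These are illustrative assumptions, not design data from the core, and all variable names are hypothetical.

```python
# Back-of-the-envelope budget, assuming 16x16 macroblocks and a 1920x1080 picture.
CLOCK_HZ = 162_000_000        # core clock quoted in the paper
MAX_BIT_RATE = 40_000_000     # maximum bit rate quoted in the paper (bit/s)
FRAME_RATE = 30               # frames per second (1080i at 30fr/s)
WIDTH, HEIGHT, MB_SIZE = 1920, 1080, 16   # assumed picture and macroblock size

mbs_per_row = WIDTH // MB_SIZE            # 120
mb_rows = -(-HEIGHT // MB_SIZE)           # ceiling division: 68 rows (1088 lines)
mbs_per_frame = mbs_per_row * mb_rows     # 8160
mbs_per_second = mbs_per_frame * FRAME_RATE

# Average clock cycles available to the pixel operation section per macroblock.
cycles_per_mb = CLOCK_HZ / mbs_per_second

# Average bits the VLC decode section must consume per clock cycle.
bits_per_cycle = MAX_BIT_RATE / CLOCK_HZ

print(f"macroblocks per frame: {mbs_per_frame}")
print(f"cycles per macroblock: {cycles_per_mb:.1f}")
print(f"bits per cycle at the maximum bit rate: {bits_per_cycle:.3f}")
```

Under these assumptions the pixel operation section has roughly 660 cycles per macroblock on average, while the VLC decode section needs to sustain about a quarter of a bit per cycle. Peak demands within a picture can be higher than these averages.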

A hybrid architecture is adopted for VLC decoding to realize both flexibility and high performance. For VLC decoding, a dynamic re-configurable VLC table is introduced to minimize the hardware needed for the quite different variable-length codes of each video standard. Moreover, a data compression method based on Exp-Golomb codes is applied and implemented in the data buffer blocks. To prevent the buffer from running empty, VLC decoding must be performed fast enough. The data compression function reduces external memory usage and the access bandwidth between the core and external memory to satisfy this requirement.



### Dynamic Re-configurable VLC Table

Figure 2a shows a block diagram of the dynamic reconfigurable VLC table corresponding to the video standards of interest. The table consists of an array of matching elements (PEs) that compare the input bitstream with stored binary patterns, an address decoder, a memory that contains table configuration data and decoding results, and control logic. When one of the PEs matches, its PE number is transferred to the address decoder. The address decoder then points to a unique address based on the table type and mapping pages. If the data at that address is a decoding result, the control logic outputs a decoded coefficient; otherwise, the matched PE is treated as a branch point, and the control logic outputs the subsequent pages as part of an address together with the PE group identifiers (R0 to R4).

Figure 2b illustrates, with an example, how the comparison mapping on the PE array changes. An entropy coding table is treated as a tree search structure. At the beginning, the PE group identifiers "R0" and "R2" are activated to compare nodes "n0" to "n4". If PE9 is matched, the control logic disables "R0" and "R2" and activates "R1" and "R4".
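
The tree search over PE groups can be sketched in software as follows. This is a minimal illustration, not the hardware design: the group names reuse the identifiers R0, R1, R2 and R4 from the text, but the stored patterns, decoded symbols and branch structure in `TABLE` are invented for the example, and `PE`, `TABLE`, `START_GROUPS` and `decode_symbols` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple

class PE(NamedTuple):
    pattern: str                   # stored binary pattern compared with the bitstream
    symbol: Optional[int]          # decoded value, or None if this PE is a branch point
    next_groups: Tuple[str, ...]   # PE groups to activate after a branch

# Toy table contents (assumed). Loading a different TABLE models re-configuring
# the same matching hardware for another standard's code tables.
TABLE = {
    "R0": [PE("1", 0, ()), PE("01", 1, ())],
    "R2": [PE("001", 2, ()), PE("000", None, ("R1", "R4"))],
    "R1": [PE("1", 3, ()), PE("01", 4, ())],
    "R4": [PE("00", 5, ())],
}
START_GROUPS = ("R0", "R2")

def decode_symbols(bits: str) -> list:
    """Decode a bit string by matching active PEs and following branch points."""
    symbols, pos, active = [], 0, START_GROUPS
    while pos < len(bits):
        candidates = [pe for group in active for pe in TABLE[group]]
        match = next((pe for pe in candidates if bits.startswith(pe.pattern, pos)), None)
        if match is None:
            raise ValueError(f"no PE matches at bit {pos}")
        pos += len(match.pattern)
        if match.symbol is None:
            active = match.next_groups    # branch: switch the active PE groups
        else:
            symbols.append(match.symbol)  # leaf: output the decoded value
            active = START_GROUPS         # restart from the root of the tree
    return symbols

print(decode_symbols("1" + "01" + "0001" + "00000"))
```

In this toy table the patterns within each set of simultaneously active groups are prefix-free, so at most one PE can match at a time. Short codes resolve in a single comparison step, and only long codes need a second step after a branch.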

Moreover, in the pixel operation unit, the arithmetic unit is shared and the data path connections are configured for each standard to reduce hardware costs.





#### Data Compression Method

Table 1 shows the data compression syntax for coefficients decoded by CABAC. Considering compression rate and encode-decode performance, the Exp-Golomb algorithm is applied. However, the Exp-Golomb compression method is not efficient for large values. Hence, based on our experimental results, we set a 14-bit length limit for LEVEL data compression. The Exp-Golomb table with fixed length code is shown in Table 2. This compression method is applied to transform coefficient and motion vector data.

#### Table 1. Compression Syntax for CABAC Coefficient

|                         | bits |                      |
|-------------------------|------|----------------------|
| block_CABAC(){          |      |                      |
| while(LEVEL!=EOB){      |      |                      |
| (first RUN)= numCoeff-1 |      |                      |
| RUN                     | 1-13 | Unsigned ExpGolomb   |
| LEVEL                   | 1-31 | Signed ExpGolomb     |
|                         |      | +FLC(cmax=14,FLC=16) |
| }                       |      |                      |
| }                       |      |                      |

#### Table 2. Exp-Golomb Table with Fixed Length Code

| Bitstring form                        | Range of codeNum |
|---------------------------------------|------------------|
| 1                                     | 0                |
| $0\ 1\ X_0$                           | 1-2              |
| $0\ 0\ 1\ X_0\ X_1$                   | 3-6              |
| $0\ 0\ 0\ 1\ X_0\ X_1\ X_2$           | 7-14             |
|                                       |                  |
| 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |                  |
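
To make the idea of an Exp-Golomb code with a fixed-length escape concrete, the following sketch encodes and decodes signed LEVEL values. The regular codes follow the bit patterns in the first rows of Table 2 and the signed mapping used by H.264. The escape format is an assumption, since the text does not spell it out: here a value whose regular code would need more than `max_zeros` leading zeros is written as `max_zeros + 1` zeros followed by a `flc_bits`-bit fixed-length field. The defaults 14 and 16 loosely echo `cmax=14` and `FLC=16` in Table 1, but that correspondence is a guess, and all function names are hypothetical.

```python
def ue_encode(code_num: int) -> str:
    """Standard unsigned Exp-Golomb code as a bit string (first rows of Table 2)."""
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary

def signed_to_code_num(value: int) -> int:
    """Signed mapping as in H.264 se(v): 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * value - 1 if value > 0 else -2 * value

def code_num_to_signed(code_num: int) -> int:
    return (code_num + 1) // 2 if code_num % 2 else -(code_num // 2)

def encode_level(value: int, max_zeros: int = 14, flc_bits: int = 16) -> str:
    """Exp-Golomb code with an assumed fixed-length escape for large values."""
    code_num = signed_to_code_num(value)
    code = ue_encode(code_num)
    if (len(code) - 1) // 2 <= max_zeros:
        return code                                   # regular Exp-Golomb code
    if code_num >= 1 << flc_bits:
        raise ValueError("value does not fit the fixed-length field")
    return "0" * (max_zeros + 1) + format(code_num, f"0{flc_bits}b")

def decode_level(bits: str, pos: int = 0, max_zeros: int = 14, flc_bits: int = 16):
    """Return (value, next_position) for the code starting at bits[pos]."""
    zeros = 0
    while zeros <= max_zeros and bits[pos] == "0":
        zeros += 1
        pos += 1
    if zeros > max_zeros:                             # escape prefix detected
        code_num = int(bits[pos:pos + flc_bits], 2)
        pos += flc_bits
    else:                                             # '1' marker plus `zeros` info bits
        code_num = int(bits[pos:pos + zeros + 1], 2) - 1
        pos += zeros + 1
    return code_num_to_signed(code_num), pos

stream = "".join(encode_level(v) for v in (0, 1, -1, 300, 20000))
pos, decoded = 0, []
while pos < len(stream):
    value, pos = decode_level(stream, pos)
    decoded.append(value)
print(decoded)
```

With these assumed defaults the longest code is 15 + 16 = 31 bits, which agrees with the 1-31 bit range listed for LEVEL in Table 1. The escape bounds the worst-case code length, which matters because a plain Exp-Golomb code grows by two bits each time the value range doubles.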

#### **IMPLEMENTATION AND RESULTS**

The core is implemented with a top-down, HDL-based approach. We carried out HDL synthesis with a 90nm CMOS ASIC library. The circuit volume of the core is 1.5Mgates. By applying the dynamic re-configurable VLC table, the circuit size of the table is reduced by 60% compared with a hard-wired logic implementation. We evaluated the proposed compression method with over 300 video sequences. Memory data usage is reduced by 50% and access bandwidth is improved by 12%. As a result, two 512Mbit DDR2 SDRAM devices operating at 324MHz can be used.

#### CONCLUSION

This paper introduced a multi-standards HDTV video decoder with low-cost hardware and external memory. A dynamic reconfigurable VLC table and shared arithmetic units achieve low-cost hardware, and the proposed compression method reduces memory data usage and access bandwidth. Further information on the core architecture and a more detailed explanation of the methodology will be reported in the full-length version of the paper.

#### REFERENCES

- [1] T-M. Liu, et al, "A 125uW Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications," *IEEE Int'l Solid-State Circuit Conf.*, Feb. 2006.
- [2] C-D. Chien, et al, "A 252kgate/71mW Multi-Standard Multi-Channel Video Decoder for High Definition Video Applications," *IEEE Int'l Solid-State Circuit Conf.*, Feb. 2007.
- [3] Y-S. Tung, et al, "DSP-Based Multi-Format Video Decoding Engine for Media Adapter Applications," ICCE 2005.