The TMS320DM6441 (also referenced as DM6441) leverages TI's DaVinci™
technology to meet the networked media encode and decode application processing
needs of next-generation embedded devices.
The DM6441 enables OEMs and ODMs to quickly bring to market devices featuring
robust operating systems support, rich user interfaces, high processing
performance, and long battery life through the maximum flexibility of a fully
integrated mixed processor solution.
The dual-core architecture of the DM6441 provides benefits of both DSP and
Reduced Instruction Set Computer (RISC) technologies, incorporating a
high-performance TMS320C64x+ DSP core and an ARM926EJ-S core.
The ARM926EJ-S is a 32-bit RISC processor core that executes 32-bit or 16-bit
instructions and processes 32-bit, 16-bit, or 8-bit data. The core uses
pipelining so that all parts of the processor and memory system can operate
continuously.
The ARM core incorporates:
- A coprocessor 15 (CP15) and protection module
- Data and program memory management units (MMUs) with translation look-aside
buffers (TLBs).
- Separate 16K-byte instruction and 8K-byte data caches. Both are four-way
set-associative and virtually indexed, virtually tagged (VIVT).
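To make the cache organization concrete, the number of sets in each ARM cache can be worked out from its size and associativity. The sketch below is illustrative only: the 32-byte line length is an assumption taken from the general ARM926EJ-S design and is not stated in this overview, and the function name is hypothetical.

```python
# Sketch: number of sets in a set-associative cache.
# ASSUMPTION: 32-byte cache lines (not stated in this overview).

LINE_BYTES = 32   # assumed line length
WAYS = 4          # four-way set-associative, per the text


def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Return the number of sets for a cache of the given geometry."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("cache size is not a multiple of ways * line size")
    return sets


icache_sets = cache_sets(16 * 1024, WAYS, LINE_BYTES)  # 16K-byte instruction cache
dcache_sets = cache_sets(8 * 1024, WAYS, LINE_BYTES)   # 8K-byte data cache

print(icache_sets, dcache_sets)  # 128 64
```

Under that line-length assumption, the instruction cache has 128 sets and the data cache has 64.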
The TMS320C64x+™ DSPs are the highest-performance fixed-point DSP generation
in the TMS320C6000™ DSP platform. They are based on an enhanced version of the
second-generation high-performance, advanced very-long-instruction-word (VLIW)
architecture developed by Texas Instruments (TI), making these DSP cores an
excellent choice for digital media applications. The C64x is a code-compatible
member of the C6000™ DSP platform. The TMS320C64x+ DSP is an enhancement of the
C64x DSP with added functionality and an expanded instruction set.
Any reference to the C64x DSP or C64x CPU also applies, unless otherwise
noted, to the C64x+ DSP and C64x+ CPU, respectively.
With performance of up to 4104 million instructions per second (MIPS) at a
clock rate of 513 MHz, the C64x+ core offers solutions to high-performance DSP
programming challenges. The DSP core possesses the operational flexibility of
high-speed controllers and the numerical capability of array processors. The
C64x+ DSP core processor has 64 general-purpose registers of 32-bit word length
and eight highly independent functional units: two multipliers for a
32-bit result and six arithmetic logic units (ALUs). The eight functional units
include instructions to accelerate the performance in video and imaging
applications. The DSP core can produce four 16-bit multiply-accumulates (MACs)
per cycle for a total of 2052 million MACs per second (MMACS), or eight 8-bit
MACs per cycle for a total of 4104 MMACS. For more details on the C64x+ DSP, see
the TMS320C64x/C64x+ DSP CPU and Instruction Set Reference Guide
(literature number SPRU732).
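The peak figures quoted above follow from the clock rate multiplied by the work done per cycle. The following is a minimal sketch of that arithmetic, assuming one instruction issued per functional unit per cycle at peak; the names are illustrative and not part of any TI tool.

```python
# Sketch: peak-throughput arithmetic using only figures quoted in the text.

CLOCK_MHZ = 513            # DSP clock rate
FUNCTIONAL_UNITS = 8       # assumed: one instruction per unit per cycle at peak
MACS_16BIT_PER_CYCLE = 4   # four 16-bit MACs per cycle
MACS_8BIT_PER_CYCLE = 8    # eight 8-bit MACs per cycle


def peak_rate_millions(clock_mhz: int, ops_per_cycle: int) -> int:
    """Return peak operations per second, in millions."""
    return clock_mhz * ops_per_cycle


peak_mips = peak_rate_millions(CLOCK_MHZ, FUNCTIONAL_UNITS)
peak_mmacs_16 = peak_rate_millions(CLOCK_MHZ, MACS_16BIT_PER_CYCLE)
peak_mmacs_8 = peak_rate_millions(CLOCK_MHZ, MACS_8BIT_PER_CYCLE)

print(peak_mips, peak_mmacs_16, peak_mmacs_8)  # 4104 2052 4104
```

These are theoretical peaks; sustained throughput depends on how well the code keeps all eight functional units busy.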
The DM6441 also has application-specific hardware logic, on-chip memory, and
additional on-chip peripherals similar to the other C6000 DSP platform devices.
The DM6441 core uses a two-level cache-based architecture. The Level 1 program
cache (L1P) is a 256K-bit direct-mapped cache and the Level 1 data cache (L1D)
is a 640K-bit 2-way set-associative cache. The Level 2 memory/cache (L2)
consists of a 512K-bit memory space that is shared between program and data
space. L2 memory can be configured as mapped memory, cache, or combinations of
the two.
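Because the memory sizes above are given in kilobits, it can help to convert them to kilobytes when comparing against code and buffer sizes. The sketch below performs only that unit conversion; the dictionary and function names are illustrative.

```python
# Sketch: convert the cache sizes quoted in the text from K-bit to K-byte.

BITS_PER_BYTE = 8


def kbit_to_kbyte(kbits: int) -> int:
    """Convert a size in kilobits to kilobytes (exact for multiples of 8)."""
    if kbits % BITS_PER_BYTE:
        raise ValueError("size is not a whole number of kilobytes")
    return kbits // BITS_PER_BYTE


sizes_kbit = {"L1P": 256, "L1D": 640, "L2": 512}
sizes_kbyte = {name: kbit_to_kbyte(kbits) for name, kbits in sizes_kbit.items()}

print(sizes_kbyte)  # {'L1P': 32, 'L1D': 80, 'L2': 64}
```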
The peripheral set includes: two configurable video ports; a 10/100 Mb/s
Ethernet MAC (EMAC) with a management data input/output (MDIO) module; an
inter-integrated circuit (I2C) bus interface; one audio serial port (ASP); two
64-bit general-purpose timers each configurable as two independent 32-bit
timers; one 64-bit watchdog timer; up to 71 pins of general-purpose input/output
(GPIO) with programmable interrupt/event generation modes, multiplexed with
other peripherals; three UARTs with hardware handshaking support on one UART;
three pulse width modulator (PWM) peripherals; and two external memory
interfaces: an asynchronous external memory interface (EMIFA) for slower
memories/peripherals, and a higher speed synchronous memory interface for
DDR2.
The DM6441 device includes a video processing subsystem (VPSS) with two
configurable video/imaging peripherals: one video processing front-end (VPFE)
input used for video capture, and one video processing back-end (VPBE) output
with imaging coprocessor (VICP) used for display.
The video processing front-end (VPFE) consists of a CCD controller (CCDC), a
preview engine (previewer), histogram module, auto-exposure/white balance/focus
module (H3A), and resizer. The CCDC is capable of interfacing to common video
decoders, CMOS sensors, and charge coupled devices (CCDs). The previewer is a
real-time image processing engine that takes raw imager data from a CMOS sensor
or CCD and converts from an RGB Bayer pattern to YUV4:2:2. The histogram and H3A
modules provide statistical information on the raw color data for use by the
DM6441. The resizer accepts image data for separate horizontal and vertical
resizing from 1/4x to 4x in increments of 256/N, where N is between 64 and
1024.
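The resizer's scale factor is 256/N, so the stated range of N maps directly onto the 1/4x to 4x limits. The sketch below shows that mapping and one simple way to pick an N for a requested scale. The helper names are hypothetical, and rounding 256/scale is an illustrative choice rather than a documented TI procedure.

```python
# Sketch: resizer ratio 256/N for N in [64, 1024], per the text.
from fractions import Fraction

N_MIN, N_MAX = 64, 1024


def resize_ratio(n: int) -> Fraction:
    """Return the exact scale factor 256/N for a valid N."""
    if not N_MIN <= n <= N_MAX:
        raise ValueError("N must be between 64 and 1024")
    return Fraction(256, n)


def approximate_n(scale: float) -> int:
    """Pick an N whose 256/N approximates the requested scale, clamped to range."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return max(N_MIN, min(N_MAX, round(256 / scale)))


print(resize_ratio(64), resize_ratio(256), resize_ratio(1024))  # 4 1 1/4
print(approximate_n(2.0), approximate_n(0.5))                   # 128 512
```

Since horizontal and vertical resizing are separate, a different N could be chosen for each axis.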
The video processing back-end (VPBE) consists of an on-screen display engine
(OSD) and a video encoder (VENC). The OSD engine is capable of handling two
separate video windows and two separate OSD windows. Other configurations
include two video windows, one OSD window, and one attribute window allowing up
to eight levels of alpha blending. The VENC provides four analog DACs that run
at 54 MHz, providing a means for composite NTSC/PAL video, S-Video, and/or
component video output. The VENC also provides up to 24 bits of digital output
to interface to RGB888 devices. The digital output is capable of 8/16-bit BT.656
output and/or CCIR.601 with separate horizontal and vertical syncs. VFocus (part
of the VPBE functionality and operationally (e.g., 16-bit multiplexed
address/data) is also provided.
The Ethernet media access controller (EMAC) provides an efficient interface
between the DM6441 and the network. The DM6441 EMAC supports both 10Base-T and
100Base-TX, or 10 Mbit/s (Mbps) and 100 Mbps in either half- or
full-duplex mode, with hardware flow control and quality of service (QoS)
support.
The management data input/output (MDIO) module continuously polls all 32 MDIO
addresses in order to enumerate all PHY devices in the system. Once a PHY
candidate has been selected by the ARM, the MDIO module transparently monitors
its link state by reading the PHY status register. Link change events are stored
in the MDIO module and can optionally interrupt the ARM, allowing the ARM to
poll the link status of the device without continuously performing costly MDIO
accesses.
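The MDIO behavior described above (poll every address, track the selected PHY, latch link changes) can be pictured with a small software model. This is a toy simulation for illustration only: the class, method, and attribute names are hypothetical and do not correspond to the actual MDIO registers or to any TI driver API.

```python
# Sketch: toy model of MDIO polling and link-change latching (not a driver).

MDIO_ADDRESS_COUNT = 32  # the text states that all 32 MDIO addresses are polled


class MdioModel:
    def __init__(self, phys):
        self._phys = dict(phys)   # simulated bus: address -> link-up flag
        self.alive_mask = 0       # bit n set if a PHY answered at address n
        self.selected = None      # PHY address chosen by the host (the ARM)
        self.link_up = False      # last observed link state of the selected PHY
        self.link_event = False   # latched until the host reads it

    def set_link(self, addr, up):
        """Simulation hook: change the link state of a PHY on the bus."""
        self._phys[addr] = up

    def select(self, addr):
        if not 0 <= addr < MDIO_ADDRESS_COUNT:
            raise ValueError("MDIO address out of range")
        self.selected = addr

    def poll_once(self):
        """One polling pass: enumerate PHYs, then check the selected link."""
        self.alive_mask = 0
        for addr in range(MDIO_ADDRESS_COUNT):
            if addr in self._phys:
                self.alive_mask |= 1 << addr
        if self.selected is not None:
            up = self._phys.get(self.selected, False)
            if up != self.link_up:
                self.link_up = up
                self.link_event = True

    def take_link_event(self):
        """Return the latched link-change flag and clear it."""
        event, self.link_event = self.link_event, False
        return event


mdio = MdioModel({1: True, 5: False})
mdio.poll_once()
present = [a for a in range(MDIO_ADDRESS_COUNT) if mdio.alive_mask >> a & 1]
mdio.select(1)
mdio.poll_once()
first_event = mdio.take_link_event()   # link came up on the selected PHY
mdio.poll_once()
idle_event = mdio.take_link_event()    # nothing changed since the last pass
mdio.set_link(1, False)
mdio.poll_once()
drop_event = mdio.take_link_event()    # link dropped
print(present, first_event, idle_event, drop_event)  # [1, 5] True False True
```

The point of the latch is that the host reads one stored flag instead of issuing an MDIO transaction every time it wants the link state.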
The HPI, I2C, SPI, USB2.0, and VLYNQ ports allow the DM6441 to easily control
peripheral devices and/or communicate with host processors. The DM6441 also
provides Memory Stick/Memory Stick Pro card support, MMC/SD with SDIO support,
and a universal serial bus (USB).
The DM6441 also includes a video/imaging coprocessor (VICP) to offload many
video and imaging processing tasks from the DSP core, making more DSP MIPS
available for common video and imaging algorithms. For more information on the
VICP enhanced codecs, such as H.264 and MPEG4, please contact your nearest TI
sales representative.
The rich peripheral set provides the ability to control external peripheral
devices and communicate with external processors. For details on each of the
peripherals, see the related sections later in this document and the associated
peripheral reference guides.
The DM6441 has a complete set of development tools for both the ARM and DSP.
These include C compilers, a DSP assembly optimizer to simplify programming and
scheduling, and a Windows™ debugger interface for visibility into source code