TMS320C6655/57 Fixed and Floating-Point Digital Signal Processor Technical Manual

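The TMS320C6655/57 is a fixed- and floating-point digital signal processor (DSP) from Texas Instruments (TI), intended mainly for high-performance computing and signal-processing applications. The literature number of this document is ZHCS967A, dated August 2012. It contains production data and states that the products conform to the terms of the Texas Instruments standard warranty, although production processing does not necessarily include testing of all parameters.

The TMS320C6655/57 combines fixed-point and floating-point arithmetic and suits applications that need efficient digital signal processing, such as communications, audio processing, image processing, and industrial automation. The data sheet has been revised several times. Revision ZHCS967A, for example, updated the tracer descriptions, the McBSP timing requirements table, and the thermal characteristics data; added a footnote on DDR3 EMIF data to the memory map summary table; and added the CVDD and SmartReflex voltage parameters to the SmartReflex switching table. The DDR3 PLL initialization sequence was moved from the data sheet to the PLL controller user guide.

The data sheet covers, among other topics, the processor architecture overview, instruction set, memory system, peripheral interfaces, power management, debug tools, and detailed electrical characteristics. Developers need this information to understand how to use the TMS320C6655/57 hardware resources effectively when implementing algorithms and designing systems.

On the hardware side, the TMS320C6655/57 provides a rich set of on-chip peripherals, such as the multichannel buffered serial port (McBSP), the external memory interface (EMIF), and high-speed serial ports, which give the system room for expansion and flexibility. SmartReflex is an integrated power-management and temperature-monitoring mechanism that dynamically adjusts voltage and frequency to match the workload, improving energy efficiency and protecting the chip from potential damage.

On the software side, the TMS320C6655/57 supports the C/C++ programming languages. Developers can write and debug applications with the compiler and development toolchain provided by TI, which also supplies development libraries and example code to help developers get started quickly and optimize performance.

With this feature set, the TMS320C6655/57 is well suited to many complex signal-processing tasks. A careful reading of the data sheet lets developers use the processor to its full capability and build efficient, reliable systems.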

ZHCS967A—August 2012

Fixed and Floating-Point Digital Signal Processor

TMS320C6655/57

www.ti.com

Submit Documentation Feedback

2 Device Overview

2.1 Device Characteristics

Table 2-1 Characteristics of the TMS320C6655/57 Processor

HARDWARE FEATURES                                                TMS320C6655    TMS320C6657
(an entry shown once applies to both devices)

Peripherals
  DDR3 Memory Controller (32-bit bus width) [1.5 V I/O]
    (clock source = DDRREFCLKN|P)
  DDR3 Maximum Data Rate                                         1333
  EDMA3 (64 independent channels) [DSP/3 clock rate]             1
  High-speed 1×/2×/4× Serial RapidIO Port (4 lanes)              1
  PCIe (2 lanes)                                                 1
  10/100/1000 Ethernet                                           1
  Management Data Input/Output (MDIO)                            1
  HyperLink                                                      1
  EMIF16                                                         1
  McBSP                                                          2
  SPI                                                            1
  UART                                                           2
  uPP                                                            1
  I2C                                                            1
  64-Bit Timers (configurable)
    (internal clock source = CPU/6 clock frequency)              8 (each configurable as two 32-bit timers)
  General-Purpose Input/Output port (GPIO)                       32

Encoder/Decoder Coprocessors
  VCP2 (clock source = CPU/3 clock frequency)                    2
  TCP3d (clock source = CPU/2 clock frequency)                   1

On-Chip Memory
  CorePac Memory                                                 32KB L1 Program Memory [SRAM/Cache]
                                                                 32KB L1 Data Memory [SRAM/Cache]
                                                                 1024KB L2 Unified Memory/Cache
  ROM Memory                                                     128KB L3 ROM
  Multicore Shared Memory                                        1024KB MSM SRAM

C66x CorePac Revision ID
  CorePac Revision ID Register (address location: 0181 2000h)    See Section 5.5 "C66x CorePac Revision" on page 102

JTAG BSDL_ID
  JTAGID register (address location: 0262 0018h)                 See Section 3.3.3 "JTAG ID (JTAGID) Register Description" on page 71

Frequency (MHz)                                                  1250 (1.25 GHz)
                                                                 1000 (1.0 GHz)
                                                                 -              850 (0.85 GHz)

Cycle Time (ns)                                                  0.8 (1.25 GHz)
                                                                 1 (1.0 GHz)
                                                                 -              1.175 (0.85 GHz)

Voltage
  Core (V)                                                       SmartReflex variable supply
  I/O (V)                                                        1.0 V, 1.5 V, and 1.8 V

Process Technology (μm)                                          0.040 μm

BGA Package
  21 mm × 21 mm                                                  625-Pin Flip-Chip Plastic BGA (CZH or GZH)

Product Status (1)
  Production Data (PD)                                           PD             PD

(1) PRODUCTION DATA information is current as of publication date. Products conform to specifications per the terms of Texas Instruments standard warranty. Production processing does not necessarily include testing of all parameters.

End of Table 2-1
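
The cycle-time entries and the CPU/n clock sources listed in Table 2-1 follow from the CPU frequency by simple arithmetic. The sketch below works through that arithmetic. The function names and dictionary keys are illustrative labels chosen for this example, not TI register or API names.

```python
# Divisors copied from the "clock source = CPU/n" notes in Table 2-1.
# The keys are illustrative labels only (assumption, not TI identifiers).
CLOCK_DIVISORS = {"Timer64": 6, "VCP2": 3, "TCP3d": 2}


def cycle_time_ns(cpu_mhz: float) -> float:
    """Return the CPU cycle time in nanoseconds for a frequency in MHz."""
    if cpu_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / cpu_mhz


def derived_clock_mhz(cpu_mhz: float, block: str) -> float:
    """Return the clock in MHz of a block that runs at CPU/n."""
    return cpu_mhz / CLOCK_DIVISORS[block]


for mhz in (1250, 1000, 850):
    print(f"{mhz} MHz -> {cycle_time_ns(mhz):.3f} ns cycle, "
          f"TCP3d at {derived_clock_mhz(mhz, 'TCP3d'):.1f} MHz")
```

For 850 MHz the computed cycle time is about 1.176 ns, while the table lists 1.175 ns; the two differ only in the last digit.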

2.2 DSP Core Description

The C66x Digital Signal Processor (DSP) extends the performance of the C64x+ and C674x DSPs through enhancements and new features. Many of the new features target increased performance for vector processing. The C64x+ and C674x DSPs support 2-way SIMD operations for 16-bit data and 4-way SIMD operations for 8-bit data. On the C66x DSP, the vector processing capability is improved by extending the width of the SIMD instructions. C66x DSPs can execute instructions that operate on 128-bit vectors. For example, the QMPY32 instruction performs element-to-element multiplication between two vectors of four 32-bit data each. The C66x DSP also supports SIMD for floating-point operations. Improved vector processing capability (each instruction can process multiple data in parallel) combined with the natural instruction-level parallelism of the C6000 architecture (e.g., execution of up to 8 instructions per cycle) results in a very high level of parallelism that DSP programmers can exploit through TI's optimized C/C++ compiler.
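
To illustrate what "element-to-element multiplication between two vectors of four 32-bit data each" means arithmetically, here is a minimal Python model. It is not the QMPY32 instruction itself: the function names are hypothetical, and keeping only the low 32 bits of each product is an assumption made here so that four results fit in a 128-bit vector.

```python
def to_int32(x: int) -> int:
    """Wrap an integer to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def elementwise_mul32(a, b):
    """Multiply two 4-element vectors lane by lane.

    Assumption: each lane keeps only the low 32 bits of its product.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both vectors must hold exactly four elements")
    return [to_int32(to_int32(x) * to_int32(y)) for x, y in zip(a, b)]


print(elementwise_mul32([1, 2, 3, 4], [5, 6, 7, 8]))
```

All four lanes are independent, which is what lets a single SIMD instruction replace four scalar multiplies.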

The C66x DSP consists of eight functional units, two register files, and two data paths as shown in Figure 2-1. The two general-purpose register files (A and B) each contain 32 32-bit registers for a total of 64 registers. The general-purpose registers can be used for data or can be data address pointers. The data types supported include packed 8-bit data, packed 16-bit data, 32-bit data, 40-bit data, and 64-bit data. Multiplies also support 128-bit data. 40-bit-long or 64-bit-long values are stored in register pairs, with the 32 LSBs of data placed in an even register and the remaining 8 or 32 MSBs in the next upper register (which is always an odd-numbered register). 128-bit data values are stored in register quadruplets, with the 32 LSBs of data placed in a register that is a multiple of 4 and the remaining 96 MSBs in the next 3 upper registers.
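
The register-pair and register-quadruplet layout described above can be sketched as follows. This is a plain Python model with hypothetical function names. For quadruplets it assumes the three upper registers hold the remaining words in ascending order of significance, which the text does not state explicitly.

```python
MASK32 = 0xFFFFFFFF
REGS_PER_FILE = 32  # each register file (A or B) holds 32 registers


def split_pair(value: int, even_reg: int) -> dict:
    """Map a 64-bit value onto an even/odd register pair.

    Returns {register index: 32-bit word}; the 32 LSBs go to the even register.
    """
    if even_reg % 2 or not 0 <= even_reg < REGS_PER_FILE:
        raise ValueError("pair must start at an even register index")
    if not 0 <= value < 1 << 64:
        raise ValueError("value must fit in 64 bits")
    return {even_reg: value & MASK32, even_reg + 1: (value >> 32) & MASK32}


def split_quad(value: int, base_reg: int) -> dict:
    """Map a 128-bit value onto four registers starting at a multiple of 4.

    Assumption: words are stored in ascending order of significance.
    """
    if base_reg % 4 or not 0 <= base_reg < REGS_PER_FILE:
        raise ValueError("quadruplet must start at a multiple of 4")
    if not 0 <= value < 1 << 128:
        raise ValueError("value must fit in 128 bits")
    return {base_reg + i: (value >> (32 * i)) & MASK32 for i in range(4)}


print({r: hex(w) for r, w in split_pair(0x1122334455667788, 4).items()})
```

A 40-bit value uses the same pair layout, with only the low 8 bits of the odd register carrying data.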

The eight functional units (.M1, .L1, .D1, .S1, .M2, .L2, .D2, and .S2) are each capable of executing one instruction every clock cycle. The .M functional units perform all multiply operations. The .S and .L units perform a general set of arithmetic, logical, and branch functions. The .D units primarily load data from memory to the register file and store results from the register file into memory.

Each C66x .M unit can perform one of the following fixed-point operations each clock cycle: four 32 × 32 bit multiplies, sixteen 16 × 16 bit multiplies, four 16 × 32 bit multiplies, four 8 × 8 bit multiplies, four 8 × 8 bit multiplies with add operations, and four 16 × 16 multiplies with add/subtract capabilities. There is also support for Galois field multiplication for 8-bit and 32-bit data. Many communications algorithms such as FFTs and modems require complex multiplication. Each C66x .M unit can perform one 16 × 16 bit complex multiply with or without rounding capabilities, two 16 × 16 bit complex multiplies with rounding capability, and a 32 × 32 bit complex multiply with rounding capability. The C66x can also perform two 16 × 16 bit and one 32 × 32 bit complex multiply instructions that multiply a complex number with a complex conjugate of another number with rounding capability.
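
The two kinds of complex multiply mentioned above (plain, and with the conjugate of the second operand) reduce to the arithmetic below. This is a mathematical sketch with hypothetical function names. It uses exact integer arithmetic and omits the rounding, scaling, and packed 16-bit or 32-bit storage that the hardware instructions apply.

```python
def cmpy(a, b):
    """Return a * b for complex integers given as (real, imag) tuples."""
    ar, ai = a
    br, bi = b
    return (ar * br - ai * bi, ar * bi + ai * br)


def cmpy_conj(a, b):
    """Return a * conj(b), i.e. multiply a by the complex conjugate of b."""
    br, bi = b
    return cmpy(a, (br, -bi))


print(cmpy((1, 2), (3, 4)))       # (1 + 2i)(3 + 4i)
print(cmpy_conj((1, 2), (3, 4)))  # (1 + 2i)(3 - 4i)
```

Each complex product needs four real multiplies and two additions, which is why a dedicated instruction pays off in FFT and modem kernels.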

Communication signal processing also requires an extensive use of matrix operations. Each C66x .M unit is capable of multiplying a [1 × 2] complex vector by a [2 × 2] complex matrix per cycle with or without rounding capability. A version also exists allowing multiplication of the conjugate of a [1 × 2] vector with a [2 × 2] complex matrix.
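
The vector-by-matrix operation described above can be written out as follows. The sketch uses Python's built-in complex type and a hypothetical function name. It models only the mathematics, not the fixed-point formats or rounding behaviour of the actual instructions.

```python
def vec_mat_mul(v, m, conjugate_vector=False):
    """Multiply a [1 x 2] complex row vector by a [2 x 2] complex matrix.

    With conjugate_vector=True the vector elements are conjugated first.
    """
    if len(v) != 2 or len(m) != 2 or any(len(row) != 2 for row in m):
        raise ValueError("expected a 1x2 vector and a 2x2 matrix")
    if conjugate_vector:
        v = [complex(x).conjugate() for x in v]
    # Output column j is the dot product of v with column j of m.
    return [v[0] * m[0][j] + v[1] * m[1][j] for j in range(2)]


v = [1 + 2j, 3 - 1j]
m = [[1, 1j],
     [2, 0]]
print(vec_mat_mul(v, m))
print(vec_mat_mul(v, m, conjugate_vector=True))
```

The result is again a [1 × 2] complex vector, built from four complex multiplies and two complex additions.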

Each C66x .M unit also includes IEEE floating-point multiplication operations from the C674x DSP, which includes one single-precision multiply each cycle and one double-precision multiply every 4 cycles. There is also a mixed-precision multiply that allows multiplication of a single-precision value by a double-precision value and an operation allowing multiplication of two single-precision numbers resulting in a double-precision number. The C66x DSP improves the performance over the C674x double-precision multiplies by adding an instruction allowing one double-precision multiply per cycle and also reduces the number of delay slots from 10 down to 4. Each C66x .M unit can also perform one of the following floating-point operations each clock cycle: one, two, or four single-precision multiplies or a complex single-precision multiply.

The .L and .S units can now support up to 64-bit operands. This allows for new versions of many of the arithmetic, logical, and data packing instructions to allow for more parallel operations per cycle. Additional instructions were added yielding performance enhancements of the floating-point addition and subtraction instructions, including the ability to perform one double-precision addition or subtraction per cycle. Conversion to/from integer and single-precision values can now be done on both .L and .S units on the C66x. Also, by taking advantage of the larger

2.3 Memory Map Summary

Table 2-2 shows the memory map address ranges of the TMS320C6655/57 device.

Table 2-2 Memory Map Summary (Part 1 of 5)

Logical 32-bit Address   Physical 36-bit Address   Bytes   Description
Start    End             Start      End

00000000 007FFFFF 0 00000000 0 007FFFFF 8M Reserved

00800000 008FFFFF 0 00800000 0 008FFFFF 1M Local L2 SRAM

00900000 00DFFFFF 0 00900000 0 00DFFFFF 5M Reserved

00E00000 00E07FFF 0 00E00000 0 00E07FFF 32K Local L1P SRAM

00E08000 00EFFFFF 0 00E08000 0 00EFFFFF 1M-32K Reserved

00F00000 00F07FFF 0 00F00000 0 00F07FFF 32K Local L1D SRAM

00F08000 017FFFFF 0 00F08000 0 017FFFFF 9M-32K Reserved

01800000 01BFFFFF 0 01800000 0 01BFFFFF 4M C66x CorePac Registers

01C00000 01CFFFFF 0 01C00000 0 01CFFFFF 1M Reserved

01D00000 01D0007F 0 01D00000 0 01D0007F 128 Tracer_MSMC_0

01D00080 01D07FFF 0 01D00080 0 01D07FFF 32K-128 Reserved

01D08000 01D0807F 0 01D08000 0 01D0807F 128 Tracer_MSMC_1

01D08080 01D0FFFF 0 01D08080 0 01D0FFFF 32K-128 Reserved

01D10000 01D1007F 0 01D10000 0 01D1007F 128 Tracer_MSMC_2

01D10080 01D17FFF 0 01D10080 0 01D17FFF 32K-128 Reserved

01D18000 01D1807F 0 01D18000 0 01D1807F 128 Tracer_MSMC_3

01D18080 01D1FFFF 0 01D18080 0 01D1FFFF 32K-128 Reserved

01D20000 01D2007F 0 01D20000 0 01D2007F 128 Tracer_QM_DMA

01D20080 01D27FFF 0 01D20080 0 01D27FFF 32K-128 Reserved

01D28000 01D2807F 0 01D28000 0 01D2807F 128 Tracer_DDR

01D28080 01D2FFFF 0 01D28080 0 01D2FFFF 32K-128 Reserved

01D30000 01D3007F 0 01D30000 0 01D3007F 128 Tracer_SEM

01D30080 01D37FFF 0 01D30080 0 01D37FFF 32K-128 Reserved

01D38000 01D3807F 0 01D38000 0 01D3807F 128 Tracer_QM_CFG

01D38080 01D3FFFF 0 01D38080 0 01D3FFFF 32K-128 Reserved

01D40000 01D4007F 0 01D40000 0 01D4007F 128 Tracer_CFG

01D40080 01D47FFF 0 01D40080 0 01D47FFF 32K-128 Reserved

01D48000 01D4807F 0 01D48000 0 01D4807F 128 Tracer_L2_0

01D48080 01D4FFFF 0 01D48080 0 01D4FFFF 32K-128 Reserved

01D50000 01D5007F 0 01D50000 0 01D5007F 128 Tracer_L2_1 (C6657) or Reserved (C6655)

01D50080 01D57FFF 0 01D50080 0 01D57FFF 32K-128 Reserved

01D58000 01D5807F 0 01D58000 0 01D5807F 128 Tracer_EMIF16

01D58080 021B3FFF 0 01D58080 0 021B3FFF 4464K-128 Reserved

021B4000 021B47FF 0 021B4000 0 021B47FF 2K McBSP0 Registers

021B4800 021B5FFF 0 021B4800 0 021B5FFF 6K Reserved

021B6000 021B67FF 0 021B6000 0 021B67FF 2K McBSP0 FIFO Registers

021B6800 021B7FFF 0 021B6800 0 021B7FFF 6K Reserved

021B8000 021B87FF 0 021B8000 0 021B87FF 2K McBSP1 Registers

021B8800 021B9FFF 0 021B8800 0 021B9FFF 6K Reserved

021BA000 021BA7FF 0 021BA000 0 021BA7FF 2K McBSP1 FIFO Registers
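
Each Bytes entry in Table 2-2 should equal End - Start + 1, and any address can be classified by finding the row whose range contains it. The sketch below checks a few rows copied from the table. The helper names and the choice of rows are illustrative only; nothing here is a TI API.

```python
# (start, end, description) rows copied from Table 2-2 (logical addresses).
REGIONS = [
    (0x00800000, 0x008FFFFF, "Local L2 SRAM"),
    (0x00E00000, 0x00E07FFF, "Local L1P SRAM"),
    (0x00F00000, 0x00F07FFF, "Local L1D SRAM"),
    (0x01800000, 0x01BFFFFF, "C66x CorePac Registers"),
    (0x021B4000, 0x021B47FF, "McBSP0 Registers"),
]


def region_size(start: int, end: int) -> int:
    """Return the number of bytes in the inclusive range [start, end]."""
    return end - start + 1


def find_region(addr: int):
    """Return the description of the listed region containing addr, or None."""
    for start, end, name in REGIONS:
        if start <= addr <= end:
            return name
    return None


for start, end, name in REGIONS:
    print(f"{name}: {region_size(start, end) // 1024} KB")

# The CorePac Revision ID register address from Table 2-1 (0181 2000h)
# falls inside the CorePac register block.
print(find_region(0x01812000))
```

The same size check is a quick way to spot rows whose End address and Bytes column disagree.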
