- NVIDIA Tegra
- ============
- - .. rubric:: T194
- :name: t194
- T194 has eight NVIDIA Carmel CPU cores in a coherent multi-processor
- configuration. The Carmel cores support the ARM Architecture version 8.2,
- executing both 64-bit AArch64 code and 32-bit AArch32 code. The Carmel
- processors are organized as four dual-core clusters, where each cluster has
- a dedicated 2 MiB Level-2 unified cache. A high speed coherency fabric connects
- these processor complexes and allows heterogeneous multi-processing with all
- eight cores if required.
- - .. rubric:: T186
- :name: t186
- The NVIDIA® Parker (T186) series system-on-chip (SoC) delivers a heterogeneous
- multi-processing (HMP) solution designed to optimize performance and
- efficiency.
- T186 has two NVIDIA Denver 2 ARM® CPU cores and four ARM Cortex®-A57 cores
- in a coherent multiprocessor configuration. The Denver 2 and Cortex-A57 cores
- support ARMv8, executing both 64-bit AArch64 code and 32-bit AArch32 code,
- including legacy ARMv7 applications. The Denver 2 processors each have 128 KB
- Instruction and 64 KB Data Level 1 caches, and share a 2 MB Level 2
- unified cache. The Cortex-A57 processors each have 48 KB Instruction and 32 KB
- Data Level 1 caches, and also share a 2 MB Level 2 unified cache. A
- high speed coherency fabric connects these two processor complexes and allows
- heterogeneous multi-processing with all six cores if required.
- Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
- fully Armv8-A architecture compatible. Each of the two Denver cores
- implements a 7-way superscalar microarchitecture (up to 7 concurrent
- micro-ops can be executed per clock), and includes a 128 KB 4-way L1
- instruction cache, a 64 KB 4-way L1 data cache, and a 2 MB 16-way L2
- cache, which services both cores.
- Denver implements an innovative process called Dynamic Code Optimization,
- which optimizes frequently used software routines at runtime into dense,
- highly tuned microcode-equivalent routines. These are stored in a
- dedicated, 128 MB main-memory-based optimization cache. Once read
- into the instruction cache, the optimized micro-ops are executed, then
- re-fetched and re-executed from the instruction cache for as long as they
- are needed and capacity allows.
- Effectively, this reduces the need to re-optimize the software routines.
- Instead of using hardware to extract the instruction-level parallelism
- (ILP) inherent in the code, Denver extracts the ILP once via software
- techniques, and then executes those routines repeatedly, thus amortizing
- the cost of ILP extraction over the many execution instances.
- Denver also features new low latency power-state transitions, in addition
- to extensive power-gating and dynamic voltage and clock scaling based on
- workloads.
- - .. rubric:: T210
- :name: t210
- T210 has Quad Arm® Cortex®-A57 cores in a switched configuration with a
- companion set of quad Arm Cortex-A53 cores. The Cortex-A57 and A53 cores
- support Armv8-A, executing both 64-bit AArch64 code and 32-bit AArch32 code,
- including legacy Armv7-A applications. The Cortex-A57 processors each have
- 48 KB Instruction and 32 KB Data Level 1 caches; and have a 2 MB shared
- Level 2 unified cache. The Cortex-A53 processors each have 32 KB Instruction
- and 32 KB Data Level 1 caches; and have a 512 KB shared Level 2 unified cache.
- Directory structure
- -------------------
- - plat/nvidia/tegra/common - Common code for all Tegra SoCs
- - plat/nvidia/tegra/soc/txxx - Chip specific code
- Trusted OS dispatcher
- ---------------------
- Tegra supports multiple Trusted OSes.
- - Trusted Little Kernel (TLK): In order to include the 'tlkd' dispatcher in
- the image, pass 'SPD=tlkd' on the command line while preparing a bl31 image.
- - Trusty: In order to include the 'trusty' dispatcher in the image, pass
- 'SPD=trusty' on the command line while preparing a bl31 image.
- This allows other Trusted OS vendors to use the upstream code and include
- their dispatchers in the image without changing any makefiles.
- The Trusted OSes supported on each Tegra platform are:
- - Tegra210: TLK and Trusty
- - Tegra186: Trusty
- - Tegra194: Trusty
- Scatter files
- -------------
- Tegra platforms currently support both scatter files and ld.S scripts. The
- scatter files allow the ARMLINK linker to generate BL31 binaries. For now, a
- single common scatter file, plat/nvidia/tegra/scat/bl31.scat, serves all Tegra
- SoCs. The `LINKER` build variable must point to the ARMLINK binary for
- the scatter file to be used. BL31 image generation with ARMCLANG (compilation)
- and ARMLINK (linking) has been verified on the Tegra186 platforms.
- Preparing the BL31 image to run on Tegra SoCs
- ---------------------------------------------
- .. code:: shell
- CROSS_COMPILE=<path-to-aarch64-gcc>/bin/aarch64-none-elf- make PLAT=tegra \
- TARGET_SOC=<target-soc e.g. t194|t186|t210> SPD=<dispatcher e.g. trusty|tlkd> \
- bl31
- Platforms that want to use a different TZDRAM\_BASE can add
- ``TZDRAM_BASE=<value>`` to the build command line.
- The Tegra platform code expects the BL2 layer to pass a pointer to the
- following platform-specific structure in the 'x1' register. The
- bl31\_early\_platform\_setup() handler uses it to extract the TZDRAM carveout
- base and size for loading the Trusted OS, and the UART port ID to be used. The
- Tegra memory controller driver programs this base/size in order to restrict NS
- accesses.
- typedef struct plat\_params\_from\_bl2 {
- /\* TZ memory size \*/
- uint64\_t tzdram\_size;
- /\* TZ memory base \*/
- uint64\_t tzdram\_base;
- /\* UART port ID \*/
- int uart\_id;
- /\* L2 ECC parity protection disable flag \*/
- int l2\_ecc\_parity\_prot\_dis;
- /\* SHMEM base address for storing the boot logs \*/
- uint64\_t boot\_profiler\_shmem\_base;
- } plat\_params\_from\_bl2\_t;
- Power Management
- ----------------
- The PSCI implementation expects each platform to expose the 'power state'
- parameter to be used during the 'SYSTEM SUSPEND' call. The state-id field
- is implementation defined on Tegra SoCs and is preferably defined by
- tegra\_def.h.
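To make the role of the state-id field concrete, here is a small sketch of packing and unpacking a PSCI 'power state' value. It assumes the original (non-extended) PSCI power\_state format, in which the state-id occupies bits [15:0], the state type is bit 16 and the power level is bits [25:24]. The ``EXAMPLE_`` macros and helper functions are hypothetical names for illustration, and the state-id value used with them is arbitrary: the actual values are implementation defined on Tegra SoCs.

```c
#include <stdint.h>

/* Field layout of the original-format PSCI power_state parameter. */
#define EXAMPLE_PSTATE_ID_SHIFT		0U
#define EXAMPLE_PSTATE_ID_MASK		0xFFFFU
#define EXAMPLE_PSTATE_TYPE_SHIFT	16U
#define EXAMPLE_PSTATE_TYPE_MASK	0x1U
#define EXAMPLE_PSTATE_PWR_LVL_SHIFT	24U
#define EXAMPLE_PSTATE_PWR_LVL_MASK	0x3U

/* Pack power level, state type and state-id into one 32-bit value. */
static uint32_t example_make_power_state(uint32_t pwr_lvl, uint32_t type,
					 uint32_t state_id)
{
	return ((pwr_lvl & EXAMPLE_PSTATE_PWR_LVL_MASK)
			<< EXAMPLE_PSTATE_PWR_LVL_SHIFT) |
	       ((type & EXAMPLE_PSTATE_TYPE_MASK)
			<< EXAMPLE_PSTATE_TYPE_SHIFT) |
	       ((state_id & EXAMPLE_PSTATE_ID_MASK)
			<< EXAMPLE_PSTATE_ID_SHIFT);
}

/* Extract the implementation-defined state-id field. */
static uint32_t example_pstate_id(uint32_t power_state)
{
	return (power_state >> EXAMPLE_PSTATE_ID_SHIFT) & EXAMPLE_PSTATE_ID_MASK;
}

/* Extract the state type: 0 is standby/retention, 1 is power down. */
static uint32_t example_pstate_type(uint32_t power_state)
{
	return (power_state >> EXAMPLE_PSTATE_TYPE_SHIFT) &
	       EXAMPLE_PSTATE_TYPE_MASK;
}

/* Extract the deepest power level targeted by the request. */
static uint32_t example_pstate_pwr_lvl(uint32_t power_state)
{
	return (power_state >> EXAMPLE_PSTATE_PWR_LVL_SHIFT) &
	       EXAMPLE_PSTATE_PWR_LVL_MASK;
}
```

Keeping the packing and unpacking in one place means only the state-id definitions need to change if a platform assigns different values.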
- Tegra configs
- -------------
- - 'tegra\_enable\_l2\_ecc\_parity\_prot': This flag enables the L2 ECC and Parity
- Protection bit, for Arm Cortex-A57 CPUs, during CPU boot. This flag is
- enabled by Tegra SoCs during 'Cluster power up' or 'System Suspend' exit.