Granule Protection Tables Library
=================================

This document describes the design of the Granule Protection Tables (GPT)
library used by Trusted Firmware-A (TF-A). This library provides the APIs needed
to initialize the GPTs based on a data structure containing information about
the system's memory layout, configure the system registers to enable granule
protection checks based on these tables, and transition granules between
different physical address spaces (PAS) at runtime.

Arm CCA adds two new security states for a total of four: root, realm, secure,
and non-secure. In addition to new security states, corresponding physical
address spaces have been added to control memory access for each state. The PAS
access allowed to each security state can be seen in the table below.

.. list-table:: Security states and PAS access rights
   :widths: 25 25 25 25 25
   :header-rows: 1

   * -
     - Root state
     - Realm state
     - Secure state
     - Non-secure state
   * - Root PAS
     - yes
     - no
     - no
     - no
   * - Realm PAS
     - yes
     - yes
     - no
     - no
   * - Secure PAS
     - yes
     - no
     - yes
     - no
   * - Non-secure PAS
     - yes
     - yes
     - yes
     - yes

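
The access rules in the table can also be read as a small predicate. The
following is a minimal sketch for illustration only; the enum and function
names are hypothetical and are not part of the GPT library.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical identifiers, used only to illustrate the table above. */
   enum acc_state { ACC_STATE_ROOT, ACC_STATE_REALM, ACC_STATE_SECURE, ACC_STATE_NS };
   enum acc_pas { ACC_PAS_ROOT, ACC_PAS_REALM, ACC_PAS_SECURE, ACC_PAS_NS };

   /* Returns true if the given security state may access the given PAS. */
   static bool acc_pas_allowed(enum acc_state state, enum acc_pas pas)
   {
       if (state == ACC_STATE_ROOT) {
           return true;    /* Root state can access every PAS. */
       }
       if (pas == ACC_PAS_NS) {
           return true;    /* Every state can access the non-secure PAS. */
       }
       /* Realm and secure states can otherwise access only their own PAS. */
       return (state == ACC_STATE_REALM && pas == ACC_PAS_REALM) ||
              (state == ACC_STATE_SECURE && pas == ACC_PAS_SECURE);
   }

For example, ``acc_pas_allowed(ACC_STATE_REALM, ACC_PAS_SECURE)`` is false,
matching the "no" in the Secure PAS row under Realm state.
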
The GPT can function as either a one-level or a two-level lookup, depending on
how a PAS region is configured. The first step is the level 0 table. Each entry
in the level 0 table controls access to a relatively large region of memory.
With a one-step mapping, the entry is a GPT Block descriptor and the entire
region belongs to a single PAS. With a two-step mapping, the entry is a GPT
Table descriptor that links to a level 1 table. To change the PAS of a region
dynamically, the region must be mapped in a level 1 table.

Level 1 table entries with the same PAS can be combined to form a contiguous
block entry using a GPT Contiguous descriptor. This is explained in more detail
in the following section.

Design Concepts and Interfaces
------------------------------

This section covers some important concepts and data structures used in the GPT
library.

There are three main parameters that determine how the tables are organized and
function: the PPS (protected physical space) which is the total amount of
protected physical address space in the system, PGS (physical granule size)
which is how large each level 1 granule is, and L0GPTSZ (level 0 GPT size) which
determines how much physical memory is governed by each level 0 entry. A granule
is the smallest unit of memory that can be independently assigned to a PAS.

L0GPTSZ is determined by the hardware and is read from the GPCCR_EL3 register.
PPS and PGS are passed into the APIs at runtime and can be determined in
whatever way is best for a given platform, either through some algorithm or hard
coded in the firmware.

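
To make the roles of L0GPTSZ and PGS concrete, the sketch below computes which
level 0 entry governs a physical address and which granule of that level 0
region the address falls in. This is an illustration under stated assumptions,
not library code: the function names are hypothetical, and the sizes are passed
as decoded byte counts rather than as encoded register values.

.. code-block:: c

   #include <stdint.h>

   /* Index of the level 0 entry that governs physical address pa. */
   static uint64_t idx_l0_entry(uint64_t pa, uint64_t l0gptsz_bytes)
   {
       return pa / l0gptsz_bytes;
   }

   /* Index of the granule containing pa within its level 0 region. */
   static uint64_t idx_granule(uint64_t pa, uint64_t l0gptsz_bytes,
                               uint64_t pgs_bytes)
   {
       return (pa % l0gptsz_bytes) / pgs_bytes;
   }

With 1GB level 0 regions and 4KB granules, address ``0x80001000`` is governed
by level 0 entry 2 and lies in granule 1 of that region.
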
GPT setup is split into two parts: table creation and runtime initialization. In
the table creation step, a data structure containing information about the
desired PAS regions is passed into the library, which validates the mappings,
creates the tables in memory, and enables granule protection checks. It also
allocates memory for fine-grained locks adjacent to the L0 tables. In the
runtime initialization step, the runtime firmware locates the existing tables in
memory using the GPT register configuration and saves important data to a
structure used by the granule transition service, which is covered in more
detail below.

In the reference implementation for FVP models, you can find an example of PAS
region definitions in the file ``plat/arm/board/fvp/include/fvp_pas_def.h``.
Table creation API calls can be found in ``plat/arm/common/arm_common.c`` and
runtime initialization API calls can be seen in
``plat/arm/common/arm_bl31_setup.c``.

During table creation, the GPT library opportunistically fuses contiguous GPT L1
entries that have the same PAS. The maximum size of supported contiguous blocks
is defined by the ``RME_GPT_MAX_BLOCK`` build option.

Defining PAS regions
~~~~~~~~~~~~~~~~~~~~

A ``pas_region_t`` structure is a way to represent a physical address space and
its attributes that can be used by the GPT library to initialize the tables.

This structure is composed of the following:

#. The base physical address
#. The region size
#. The desired attributes of this memory region (mapping type, PAS type)

See the ``pas_region_t`` type in ``include/lib/gpt_rme/gpt_rme.h``.

The programmer should provide the API with an array containing ``pas_region_t``
structures. The library then checks the desired memory access layout for
validity and creates tables to implement it.

``pas_region_t`` is a public type; however, it is recommended that the macros
``GPT_MAP_REGION_BLOCK`` and ``GPT_MAP_REGION_GRANULE`` be used to populate
these structures instead of doing it manually, to reduce the risk of future
compatibility issues. These macros take the base physical address, region size,
and PAS type as arguments to generate the ``pas_region_t`` structure. As the
names imply, ``GPT_MAP_REGION_BLOCK`` creates a region using only L0 mapping
while ``GPT_MAP_REGION_GRANULE`` creates a region using L0 and L1 mappings.

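
The shape of such a region array can be sketched as follows. Everything
prefixed with ``ex_`` or ``EX_`` is a hypothetical stand-in defined here so
that the example compiles on its own; the real ``pas_region_t`` layout and the
real mapping macros are the ones in ``include/lib/gpt_rme/gpt_rme.h``, and the
addresses and sizes below are made up.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Stand-ins for illustration only, not the library's definitions. */
   enum ex_map_type { EX_MAP_BLOCK, EX_MAP_GRANULE };
   enum ex_region_pas { EX_REGION_ROOT, EX_REGION_REALM,
                        EX_REGION_SECURE, EX_REGION_NS };

   typedef struct {
       uintptr_t base_pa;          /* Base physical address */
       size_t size;                /* Region size in bytes */
       enum ex_map_type map;       /* Mapping type: L0 only, or L0 and L1 */
       enum ex_region_pas pas;     /* PAS type */
   } ex_pas_region_t;

   /* Same argument order as described above: base, size, PAS type. */
   #define EX_MAP_REGION_BLOCK(base, sz, pas) \
       { (base), (sz), EX_MAP_BLOCK, (pas) }
   #define EX_MAP_REGION_GRANULE(base, sz, pas) \
       { (base), (sz), EX_MAP_GRANULE, (pas) }

   /* Hypothetical layout: one block-mapped region, one granule-mapped. */
   static const ex_pas_region_t ex_regions[] = {
       EX_MAP_REGION_BLOCK(0x00000000u, 0x40000000u, EX_REGION_ROOT),
       EX_MAP_REGION_GRANULE(0x80000000u, 0x40000000u, EX_REGION_NS),
   };

   #define EX_REGION_COUNT (sizeof(ex_regions) / sizeof(ex_regions[0]))

Only the second region in this sketch could have its PAS changed at runtime,
because only it is mapped through level 1 tables.
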
Level 0 and Level 1 Tables
~~~~~~~~~~~~~~~~~~~~~~~~~~

The GPT initialization APIs require memory to be passed in for the tables to be
constructed. The ``gpt_init_l0_tables`` API takes a memory address and size for
building the level 0 tables and for allocating the fine-grained bitlock data
structure. The amount of memory needed for the bitlock structure is controlled
by the ``RME_GPT_BITLOCK_BLOCK`` config, which defines the block size covered by
each bit of the bitlock.

The ``gpt_init_pas_l1_tables`` API takes an address and size for
building the level 1 tables which are linked from level 0 descriptors. The
tables should have PAS type ``GPT_GPI_ROOT`` and a typical system might place
its level 0 table in SRAM and its level 1 table(s) in DRAM.

Granule Transition Service
~~~~~~~~~~~~~~~~~~~~~~~~~~

The Granule Transition Service allows the ownership of memory mapped with
``GPT_MAP_REGION_GRANULE`` to be changed using SMC calls. Non-secure
granules can be transitioned to either realm or secure space, and realm and
secure granules can be transitioned back to non-secure. This library only
allows Level 1 entries to be transitioned. The library may either shatter
contiguous blocks or opportunistically fuse adjacent GPT entries to form a
contiguous block. Depending on the maximum block size, the fuse operation may
propagate to larger block sizes as allowed by the RME architecture. Thus a
larger maximum block size may have a higher runtime cost due to the software
operations needed to fuse entries into bigger blocks. This cost may be offset
by better TLB performance due to the larger block size, and platforms need to
make the trade-off decision based on their particular workload.

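
The permitted transitions described above can be summarized as a predicate.
This is a sketch only, with hypothetical names; it models the rule stated in
this section and does not reproduce the library's implementation or its SMC
interface.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical PAS identifiers for this sketch. */
   enum gts_pas { GTS_PAS_ROOT, GTS_PAS_REALM, GTS_PAS_SECURE, GTS_PAS_NS };

   /* Returns true if a granule may be transitioned from one PAS to another. */
   static bool gts_transition_allowed(enum gts_pas from, enum gts_pas to)
   {
       /* Non-secure granules can move to realm or secure space. */
       bool delegate = (from == GTS_PAS_NS) &&
                       (to == GTS_PAS_REALM || to == GTS_PAS_SECURE);
       /* Realm and secure granules can move back to non-secure. */
       bool undelegate = (from == GTS_PAS_REALM || from == GTS_PAS_SECURE) &&
                         (to == GTS_PAS_NS);

       return delegate || undelegate;
   }

Under this rule a realm granule cannot be moved directly to secure space; it
would first have to return to non-secure.
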
Locking Scheme
~~~~~~~~~~~~~~

During a granule transition, access to the L1 tables is controlled by a lock to
ensure that no more than one CPU is allowed to make changes at any given time.
The granularity of the lock is defined by the ``RME_GPT_BITLOCK_BLOCK`` build
option, which defines the size of the memory block protected by one bit of the
``bitlock`` structure. Setting this option to 0 chooses a single spinlock for
all GPT L1 table entries.

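
As an illustration of what this granularity means, the sketch below maps a
physical address to the lock bit that would cover it. It assumes that
``RME_GPT_BITLOCK_BLOCK`` counts 512MB blocks per bit and that each ``bitlock``
element holds 8 bits, as in the bitlock sample calculation later in this
document, and it further assumes that bit *i* covers the *i*-th block counting
from address zero. The names are hypothetical and the library's actual layout
may differ.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   #define LK_UNIT_BYTES 0x20000000ULL  /* 512MB */
   #define LK_BITS_PER_ELEM 8U          /* bits per bitlock element */

   struct lk_position {
       uint64_t elem;      /* Index of the bitlock element */
       unsigned int bit;   /* Bit within that element */
   };

   /* bitlock_block must be non-zero; 0 selects the single spinlock instead. */
   static struct lk_position lk_locate(uint64_t pa, unsigned int bitlock_block)
   {
       assert(bitlock_block != 0U);

       uint64_t idx = pa / ((uint64_t)bitlock_block * LK_UNIT_BYTES);
       struct lk_position pos = {
           idx / LK_BITS_PER_ELEM,
           (unsigned int)(idx % LK_BITS_PER_ELEM)
       };

       return pos;
   }

Under these assumptions, two CPUs changing granules in different 512MB blocks
take different lock bits and do not contend, while a larger
``RME_GPT_BITLOCK_BLOCK`` value trades lock memory for more contention.
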
Library APIs
------------

The public APIs and types can be found in ``include/lib/gpt_rme/gpt_rme.h`` and this
section is intended to provide additional details and clarifications.

To create the GPTs and enable granule protection checks the APIs need to be
called in the correct order and at the correct time during the system boot
process.

#. Firmware must enable the MMU.
#. Firmware must call ``gpt_init_l0_tables`` to initialize the level 0 tables to
   a default state, that is, initializing all of the L0 descriptors to allow all
   accesses to all memory. The PPS is provided to this function as an argument.
#. The system performs DDR discovery and initialization. The discovered DDR
   region(s) are then added to the L1 PAS regions to be initialized in the next
   step and used by the GTSI at runtime.
#. Firmware must call ``gpt_init_pas_l1_tables`` with a pointer to an array of
   ``pas_region_t`` structures containing the desired memory access layout. The
   PGS is provided to this function as an argument.
#. Firmware must call ``gpt_enable`` to enable granule protection checks by
   setting the correct register values.
#. In systems that make use of the granule transition service, runtime
   firmware must call ``gpt_runtime_init`` to set up the data structures needed
   by the GTSI to find the tables and transition granules between PAS types.

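
The ordering of the table creation calls, and the habit of checking each return
value before moving on, can be sketched as below. The ``stub_`` functions are
parameterless placeholders so that the example compiles on its own; they are
not the library's prototypes, which take the PPS, PGS, memory ranges, and PAS
regions as arguments and are declared in ``include/lib/gpt_rme/gpt_rme.h``.

.. code-block:: c

   #include <stdio.h>

   /* Placeholder stubs for illustration; each stands in for the real call. */
   static int stub_gpt_init_l0_tables(void) { return 0; }
   static int stub_gpt_init_pas_l1_tables(void) { return 0; }
   static int stub_gpt_enable(void) { return 0; }

   /* Runs the table creation steps in order, stopping at the first failure. */
   static int boot_gpt_setup(void)
   {
       int ret;

       /* The MMU is assumed to be enabled before this point. */
       ret = stub_gpt_init_l0_tables();
       if (ret < 0) {
           fprintf(stderr, "L0 table init failed: %d\n", ret);
           return ret;
       }

       /* DDR discovery would happen here, before the L1 tables are built. */
       ret = stub_gpt_init_pas_l1_tables();
       if (ret < 0) {
           fprintf(stderr, "L1 table init failed: %d\n", ret);
           return ret;
       }

       ret = stub_gpt_enable();
       if (ret < 0) {
           fprintf(stderr, "GPT enable failed: %d\n", ret);
       }

       return ret;
   }

The runtime initialization call is left out of this sketch because it is made
later, by the runtime firmware, rather than as part of table creation.
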
API Constraints
~~~~~~~~~~~~~~~

The values allowed by the API for PPS and PGS are enumerated types
defined in the file ``include/lib/gpt_rme/gpt_rme.h``.

Allowable values for PPS along with their corresponding size:

* ``GPCCR_PPS_4GB`` (4GB protected space, 0x100000000 bytes)
* ``GPCCR_PPS_64GB`` (64GB protected space, 0x1000000000 bytes)
* ``GPCCR_PPS_1TB`` (1TB protected space, 0x10000000000 bytes)
* ``GPCCR_PPS_4TB`` (4TB protected space, 0x40000000000 bytes)
* ``GPCCR_PPS_16TB`` (16TB protected space, 0x100000000000 bytes)
* ``GPCCR_PPS_256TB`` (256TB protected space, 0x1000000000000 bytes)
* ``GPCCR_PPS_4PB`` (4PB protected space, 0x10000000000000 bytes)

Allowable values for PGS along with their corresponding size:

* ``GPCCR_PGS_4K`` (4KB granules, 0x1000 bytes)
* ``GPCCR_PGS_16K`` (16KB granules, 0x4000 bytes)
* ``GPCCR_PGS_64K`` (64KB granules, 0x10000 bytes)

Allowable values for L0GPTSZ along with the corresponding size:

* ``GPCCR_L0GPTSZ_30BITS`` (1GB regions, 0x40000000 bytes)
* ``GPCCR_L0GPTSZ_34BITS`` (16GB regions, 0x400000000 bytes)
* ``GPCCR_L0GPTSZ_36BITS`` (64GB regions, 0x1000000000 bytes)
* ``GPCCR_L0GPTSZ_39BITS`` (512GB regions, 0x8000000000 bytes)

Note that the value of the PPS, PGS, and L0GPTSZ definitions is an encoded value
corresponding to the size, not the size itself. The decoded hex representations
of the sizes have been provided for convenience.

The L0 table memory has some constraints that must be taken into account.

* The L0 table must be aligned to either the table size or 4096 bytes, whichever
  is greater. L0 table size is the total protected space (PPS) divided by the
  size of each L0 region (L0GPTSZ) multiplied by the size of each L0 descriptor
  (8 bytes): ``((PPS / L0GPTSZ) * 8)``
* The L0 memory size must be greater than the table size and have enough space
  to allocate an array of ``bitlock`` structures at the end of the L0 table if
  required (``RME_GPT_BITLOCK_BLOCK`` is not 0).
* The L0 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

The L1 memory also has some constraints.

* The L1 tables must be aligned to their size. The size of each L1 table is the
  size of each L0 region (L0GPTSZ) divided by the granule size (PGS) divided by
  the granules controlled in each byte (2): ``((L0GPTSZ / PGS) / 2)``
* There must be enough L1 memory supplied to build all requested L1 tables.
* The L1 memory must fall within a PAS of type ``GPT_GPI_ROOT``.

If an invalid combination of parameters is supplied, the APIs will print an
error message and return a negative value. The return values of APIs should be
checked to ensure successful configuration.

Sample calculation for L0 memory size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=GPCCR_PPS_4GB and L0GPTSZ=GPCCR_L0GPTSZ_30BITS

We can find the total L0 table size with ``((PPS / L0GPTSZ) * 8)``

Substitute values to get this: ``((0x100000000 / 0x40000000) * 8)``

And solve to get 32 bytes. In this case, 4096 is greater than 32, so the L0
tables must be aligned to 4096 bytes.

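
The same calculation can be written as a pair of helper functions. This is a
minimal sketch with hypothetical names; it takes the decoded sizes in bytes,
not the encoded ``GPCCR_*`` values.

.. code-block:: c

   #include <stdint.h>

   #define L0_DESC_BYTES 8ULL      /* Size of each L0 descriptor */
   #define L0_MIN_ALIGN 4096ULL    /* Minimum L0 table alignment */

   /* L0 table size in bytes: ((PPS / L0GPTSZ) * 8). */
   static uint64_t l0_table_size(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
   {
       return (pps_bytes / l0gptsz_bytes) * L0_DESC_BYTES;
   }

   /* Required alignment: the table size or 4096 bytes, whichever is greater. */
   static uint64_t l0_table_align(uint64_t pps_bytes, uint64_t l0gptsz_bytes)
   {
       uint64_t size = l0_table_size(pps_bytes, l0gptsz_bytes);

       return (size > L0_MIN_ALIGN) ? size : L0_MIN_ALIGN;
   }

For the values above, ``l0_table_size(0x100000000, 0x40000000)`` gives 32 and
``l0_table_align`` gives 4096. The table size only exceeds 4096 bytes once there
are more than 512 L0 entries, for example a 1TB PPS with 1GB L0 regions (1024
entries, 8192 bytes).
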
Sample calculation for bitlock array size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PPS=GPCCR_PPS_256TB and RME_GPT_BITLOCK_BLOCK=1

The size of the bitlock array in bits is the total protected space (PPS) divided
by the size of the memory block per bit. The size of the memory block
is ``RME_GPT_BITLOCK_BLOCK`` (number of 512MB blocks per bit) times
512MB (0x20000000). This is then divided by the number of bits in the
``bitlock`` structure (8) to get the size of the bit array in bytes.

In other words, we can find the total size of the ``bitlock`` array
in bytes with ``PPS / (RME_GPT_BITLOCK_BLOCK * 0x20000000 * 8)``.

Substitute values to get this: ``0x1000000000000 / (1 * 0x20000000 * 8)``

And solve to get 0x10000 bytes.

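
As a cross-check, the formula can be evaluated in code. The helper below is a
hypothetical sketch that takes the PPS as a decoded byte count and requires a
non-zero ``RME_GPT_BITLOCK_BLOCK`` value, since 0 selects a single spinlock and
no bitlock array is needed.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   #define BL_UNIT_BYTES 0x20000000ULL  /* 512MB */
   #define BL_BITS_PER_ELEM 8ULL        /* bits in each bitlock structure */

   /* Bitlock array size in bytes: PPS / (BITLOCK_BLOCK * 512MB * 8). */
   static uint64_t bl_array_size(uint64_t pps_bytes, unsigned int bitlock_block)
   {
       assert(bitlock_block != 0U);

       return pps_bytes /
              ((uint64_t)bitlock_block * BL_UNIT_BYTES * BL_BITS_PER_ELEM);
   }

With the values above, ``bl_array_size(0x1000000000000, 1)`` evaluates to
``0x10000``. Doubling ``RME_GPT_BITLOCK_BLOCK`` to 2 halves the array to
``0x8000`` bytes.
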
Sample calculation for L1 table size and alignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let PGS=GPCCR_PGS_4K and L0GPTSZ=GPCCR_L0GPTSZ_30BITS

We can find the size of each L1 table with ``((L0GPTSZ / PGS) / 2)``.

Substitute values: ``((0x40000000 / 0x1000) / 2)``

And solve to get 0x20000 bytes per L1 table.

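
The L1 calculation can be sketched the same way. The function name is
hypothetical and the arguments are decoded sizes in bytes. Because L1 tables
must be aligned to their size, the result is also the required alignment.

.. code-block:: c

   #include <stdint.h>

   #define L1_GRANULES_PER_BYTE 2ULL

   /* L1 table size (and alignment) in bytes: ((L0GPTSZ / PGS) / 2). */
   static uint64_t l1_table_size(uint64_t l0gptsz_bytes, uint64_t pgs_bytes)
   {
       return (l0gptsz_bytes / pgs_bytes) / L1_GRANULES_PER_BYTE;
   }

For the values above, ``l1_table_size(0x40000000, 0x1000)`` gives ``0x20000``.
With 64KB granules and the same L0 region size, each L1 table shrinks to
``0x2000`` bytes.
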