Mangix

Mangix pushed to ag71xx at Mangix/libreCMC

  • 9414697f92 ag71xx: Remove ___cacheline_aligned from ring structs. Qualcomm's struct members and the inner workings of their driver are all different. While this might make sense for their driver, it seems to hurt here. In iperf3, I've seen inconsistent results, including a drop of 100 Mbps on an Archer C7v4. This patch keeps the results high and relatively consistent. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • f74ce430f2 ag71xx: Reorder ring struct for lower cache thrashing. Based on Qualcomm's upstream code, but reordered slightly differently. iperf3 speed goes from ~279 Mbps to ~320 Mbps on an Archer C7v4. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • c8eb9545a0 ag71xx: Move timestamp struct member outside of struct. With this change, the timestamp variable is only used in ag71xx_check_dma_stuck. Small tx speedup. Based on a Qualcomm commit. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • ea2e88f605 ar71xx: use global timestamp for hang check. Shrink the size of struct ag71xx_buf to 8 bytes, which improves cache footprint. Signed-off-by: Felix Fietkau <nbd@nbd.name>
  • b9dc815e4c ag71xx: Reorder ag71xx struct members for better cache performance. Qualcomm claims this improves the D-cache footprint. Original commit message below: From: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Date: Fri, 7 Jun 2013 10:57:28 -0500 Subject: [ag71xx] cluster/align structs for cache perf. Cluster the frequently used, per-packet structures in ag71xx near to each other, and cacheline-align them. Some other re-ordering occurred to move "warmer" structures near the per-packet structures. Signed-off-by: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Signed-off-by: Rosen Penev <rosenp@gmail.com>

3 years ago

Mangix pushed to mangix at Mangix/libreCMC

  • 7d61e1ae43 ag71xx: Remove ___cacheline_aligned from ring structs. Qualcomm's struct members and the inner workings of their driver are all different. While this might make sense for their driver, it seems to hurt here. In iperf3, I've seen inconsistent results, including a drop of 100 Mbps on an Archer C7v4. This patch keeps the results high and relatively consistent. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • a926487e64 ag71xx: Reorder ring struct for lower cache thrashing. Based on Qualcomm's upstream code, but reordered slightly differently. iperf3 speed goes from ~279 Mbps to ~320 Mbps on an Archer C7v4. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • 8e0d2dc709 ag71xx: Move timestamp struct member outside of struct. With this change, the timestamp variable is only used in ag71xx_check_dma_stuck. Small tx speedup. Based on a Qualcomm commit. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • 5c0a9aa5de ar71xx: use global timestamp for hang check. Shrink the size of struct ag71xx_buf to 8 bytes, which improves cache footprint. Signed-off-by: Felix Fietkau <nbd@nbd.name>
  • b64e2e8552 ag71xx: Reorder ag71xx struct members for better cache performance. Qualcomm claims this improves the D-cache footprint. Original commit message below: From: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Date: Fri, 7 Jun 2013 10:57:28 -0500 Subject: [ag71xx] cluster/align structs for cache perf. Cluster the frequently used, per-packet structures in ag71xx near to each other, and cacheline-align them. Some other re-ordering occurred to move "warmer" structures near the per-packet structures. Signed-off-by: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Signed-off-by: Rosen Penev <rosenp@gmail.com>

3 years ago

Mangix created new branch mangix at Mangix/libreCMC

3 years ago

Mangix pushed to ag71xx at Mangix/libreCMC

  • a5d0de214f ag71xx: Remove ___cacheline_aligned from ring structs. Qualcomm's struct members and the inner workings of their driver are all different. While this might make sense for their driver, it seems to hurt here. In iperf3, I've seen inconsistent results, including a drop of 100 Mbps on an Archer C7v4. This patch keeps the results high and relatively consistent. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • 60fcf54838 ag71xx: Reorder ring struct for lower cache thrashing. Based on Qualcomm's upstream code, but reordered slightly differently. iperf3 speed goes from ~279 Mbps to ~320 Mbps on an Archer C7v4. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • bbecb936c5 ag71xx: Move timestamp struct member outside of struct. With this change, the timestamp variable is only used in ag71xx_check_dma_stuck. Small tx speedup. Based on a Qualcomm commit. Signed-off-by: Rosen Penev <rosenp@gmail.com>
  • 9805d6cd83 ar71xx: use global timestamp for hang check. Shrink the size of struct ag71xx_buf to 8 bytes, which improves cache footprint. Signed-off-by: Felix Fietkau <nbd@nbd.name>
  • dd630fe38a ag71xx: Reorder ag71xx struct members for better cache performance. Qualcomm claims this improves the D-cache footprint. Original commit message below: From: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Date: Fri, 7 Jun 2013 10:57:28 -0500 Subject: [ag71xx] cluster/align structs for cache perf. Cluster the frequently used, per-packet structures in ag71xx near to each other, and cacheline-align them. Some other re-ordering occurred to move "warmer" structures near the per-packet structures. Signed-off-by: Ben Menchaca <ben.menchaca@qca.qualcomm.com> Signed-off-by: Rosen Penev <rosenp@gmail.com>

3 years ago

Mangix created new branch ag71xx at Mangix/libreCMC

3 years ago

Mangix forked a repository to Mangix/libreCMC

3 years ago