feature: add support for querying simd16 eu per dss #745

Closed
12 changes: 10 additions & 2 deletions shared/source/os_interface/linux/xe/ioctl_helper_xe.cpp
@@ -464,6 +464,7 @@ bool IoctlHelperXe::getTopologyDataAndMap(const HardwareInfo &hwInfo, DrmQueryTo
StackVec<std::vector<std::bitset<8>>, 2> geomDss;
StackVec<std::vector<std::bitset<8>>, 2> computeDss;
StackVec<std::vector<std::bitset<8>>, 2> euDss;
StackVec<std::vector<std::bitset<8>>, 2> simd16EuDss;

auto topologySize = queryGtTopology.size();
auto dataPtr = queryGtTopology.data();
@@ -472,8 +473,10 @@ bool IoctlHelperXe::getTopologyDataAndMap(const HardwareInfo &hwInfo, DrmQueryTo
geomDss.resize(numTiles);
computeDss.resize(numTiles);
euDss.resize(numTiles);
simd16EuDss.resize(numTiles);
bool receivedDssInfo = false;
bool receivedEuPerDssInfo = false;
bool receivedSimd16EuPerDssInfo = false;
while (topologySize >= sizeof(drm_xe_query_topology_mask)) {
drm_xe_query_topology_mask *topo = reinterpret_cast<drm_xe_query_topology_mask *>(dataPtr);
UNRECOVERABLE_IF(topo == nullptr);
@@ -495,6 +498,10 @@ bool IoctlHelperXe::getTopologyDataAndMap(const HardwareInfo &hwInfo, DrmQueryTo
fillMask(euDss[tileId], topo);
receivedEuPerDssInfo = true;
break;
case DRM_XE_TOPO_SIMD16_EU_PER_DSS:
fillMask(simd16EuDss[tileId], topo);
receivedSimd16EuPerDssInfo = true;
break;
default:
xeLog("Unhandle GT Topo type: %d\n", topo->type);
}
@@ -504,9 +511,10 @@ bool IoctlHelperXe::getTopologyDataAndMap(const HardwareInfo &hwInfo, DrmQueryTo
topologySize -= itemSize;
dataPtr = ptrOffset(dataPtr, itemSize);
}

receivedEuPerDssInfo |= receivedSimd16EuPerDssInfo;

Why don't you simply handle DRM_XE_TOPO_SIMD16_EU_PER_DSS in the same variable as DRM_XE_TOPO_EU_PER_DSS? You are not doing anything differently, and the kernel isn't going to report both. If the kernel reports both in the future, it would mean the platform has two different types of EUs, which would be very odd.
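
To illustrate the suggestion above, here is a minimal, self-contained sketch of both mask types feeding one variable through a shared `case` body. It is not the driver code: `TopoRecord`, `EuMaskState`, `handleRecord` and the `k...` constants are hypothetical stand-ins, and the numeric values simply mirror the ones quoted in this thread (4 from drm-next, 5 from this PR).

```cpp
#include <bitset>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the uAPI constants; values mirror those quoted in this thread.
constexpr uint16_t kTopoEuPerDss = 4;
constexpr uint16_t kTopoSimd16EuPerDss = 5;

// Hypothetical, simplified view of one topology record (not drm_xe_query_topology_mask).
struct TopoRecord {
    uint16_t type;
    std::vector<uint8_t> mask;
};

// Hypothetical accumulator: one EU mask and one flag, regardless of EU type.
struct EuMaskState {
    std::vector<std::bitset<8>> euDss;
    bool receivedEuPerDssInfo = false;
};

inline void handleRecord(EuMaskState &state, const TopoRecord &rec) {
    switch (rec.type) {
    case kTopoEuPerDss:
    case kTopoSimd16EuPerDss:
        // Both types describe EUs per DSS, so they fill the same mask.
        state.euDss.clear();
        for (uint8_t byte : rec.mask) {
            state.euDss.emplace_back(byte);
        }
        state.receivedEuPerDssInfo = true;
        break;
    default:
        // Other record types are ignored in this sketch.
        break;
    }
}
```

With this shape, the later call that consumes the EU mask would not need to choose between two vectors, assuming the kernel reports only one of the two types per platform as stated above.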

bool isComputeDssEmpty = false;
getTopologyData(numTiles, geomDss.begin(), computeDss.begin(), euDss.begin(), topologyData, isComputeDssEmpty);
std::vector<std::bitset<8>> *euDssVector = receivedSimd16EuPerDssInfo ? simd16EuDss.begin() : euDss.begin();
getTopologyData(numTiles, geomDss.begin(), computeDss.begin(), euDssVector, topologyData, isComputeDssEmpty);

auto &dssInfo = isComputeDssEmpty ? geomDss : computeDss;
getTopologyMap(numTiles, dssInfo.begin(), topologyMap);
@@ -835,6 +835,61 @@ TEST(IoctlHelperXeTest, givenComputeDssWhenGetTopologyDataAndMapThenResultsAreCo
}
}

TEST(IoctlHelperXeTest, givenSimd16EuPerDssAndEuPerDssWhenGetTopologyDataAndMapThenPreferSimd16EuPerDss) {

It's not meant to prefer one over the other, as we don't return both for any platform. For a non-mock test, an assert that this is true would be OK.
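
As a rough sketch of the kind of check suggested above, the helper below reports whether any GT carries both EU mask types. Everything here is illustrative: `TopoHeader`, `reportsAtMostOneEuType` and the `k...` constants are hypothetical names, not part of the driver or the uAPI header, and the values only mirror those quoted in this thread.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins; values mirror those quoted in this thread.
constexpr uint16_t kEuPerDss = 4;
constexpr uint16_t kSimd16EuPerDss = 5;

// Hypothetical, simplified record: which GT reported which mask type.
struct TopoHeader {
    uint16_t gtId;
    uint16_t type;
};

// Returns true when no GT reports both EU mask types.
inline bool reportsAtMostOneEuType(const std::vector<TopoHeader> &headers) {
    std::map<uint16_t, unsigned> seen; // gtId -> bit 0: EU_PER_DSS, bit 1: SIMD16_EU_PER_DSS
    for (const auto &h : headers) {
        if (h.type == kEuPerDss) {
            seen[h.gtId] |= 1u;
        } else if (h.type == kSimd16EuPerDss) {
            seen[h.gtId] |= 2u;
        }
    }
    for (const auto &entry : seen) {
        if (entry.second == 3u) {
            return false;
        }
    }
    return true;
}
```

A test running against real hardware could assert on such a helper instead of encoding a preference between the two types.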


auto executionEnvironment = std::make_unique<MockExecutionEnvironment>();
auto drm = DrmMockXe::create(*executionEnvironment->rootDeviceEnvironments[0]);
auto xeIoctlHelper = static_cast<MockIoctlHelperXe *>(drm->getIoctlHelper());
auto &hwInfo = *executionEnvironment->rootDeviceEnvironments[0]->getHardwareInfo();
xeIoctlHelper->initialize();

uint16_t tileId = 0;
for (auto gtId = 0u; gtId < 4u; gtId++) {
drm->addMockedQueryTopologyData(gtId, DRM_XE_TOPO_DSS_GEOMETRY, 8, {0, 0, 0, 0, 0, 0, 0, 0});
drm->addMockedQueryTopologyData(gtId, DRM_XE_TOPO_DSS_COMPUTE, 8, {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff});
drm->addMockedQueryTopologyData(gtId, DRM_XE_TOPO_EU_PER_DSS, 8, {0b1111'1111, 0, 0, 0, 0, 0, 0, 0});
drm->addMockedQueryTopologyData(gtId, DRM_XE_TOPO_SIMD16_EU_PER_DSS, 8, {0b1111, 0, 0, 0, 0, 0, 0, 0});
}

DrmQueryTopologyData topologyData{};
TopologyMap topologyMap{};

auto result = xeIoctlHelper->getTopologyDataAndMap(hwInfo, topologyData, topologyMap);
ASSERT_TRUE(result);

// verify topology data
EXPECT_EQ(1, topologyData.sliceCount);
EXPECT_EQ(1, topologyData.maxSliceCount);

EXPECT_EQ(64, topologyData.subSliceCount);
EXPECT_EQ(64, topologyData.maxSubSliceCount);

EXPECT_EQ(256, topologyData.euCount);
EXPECT_EQ(4, topologyData.maxEuPerSubSlice);

// verify topology map
std::vector<int> expectedSliceIndices = {0};
ASSERT_EQ(expectedSliceIndices.size(), topologyMap[tileId].sliceIndices.size());
ASSERT_TRUE(topologyMap[tileId].sliceIndices.size() > 0);

for (auto i = 0u; i < expectedSliceIndices.size(); i++) {
EXPECT_EQ(expectedSliceIndices[i], topologyMap[tileId].sliceIndices[i]);
}

std::vector<int> expectedSubSliceIndices;
expectedSubSliceIndices.reserve(64u);
for (auto i = 0u; i < 64; i++) {
expectedSubSliceIndices.emplace_back(i);
}

ASSERT_EQ(expectedSubSliceIndices.size(), topologyMap[tileId].subsliceIndices.size());
ASSERT_TRUE(topologyMap[tileId].subsliceIndices.size() > 0);

for (auto i = 0u; i < expectedSubSliceIndices.size(); i++) {
EXPECT_EQ(expectedSubSliceIndices[i], topologyMap[tileId].subsliceIndices[i]);
}
}

TEST(IoctlHelperXeTest, givenOnlyMediaTypeWhenGetTopologyDataAndMapThenSubsliceIndicesNotSet) {

auto executionEnvironment = std::make_unique<MockExecutionEnvironment>();
1 change: 1 addition & 0 deletions third_party/uapi/xe/.version
@@ -0,0 +1 @@
patch: https://lore.kernel.org/intel-xe/[email protected]/
10 changes: 9 additions & 1 deletion third_party/uapi/xe/xe_drm.h
@@ -504,7 +504,14 @@ struct drm_xe_query_gt_list {
* available per Dual Sub Slices (DSS). For example a query response
* containing the following in mask:
* ``EU_PER_DSS ff ff 00 00 00 00 00 00``
* means each DSS has 16 EU.
* means each DSS has 16 SIMD8 EUs. This type may be omitted if device
* doesn't have SIMD8 EUs.
* - %DRM_XE_TOPO_SIMD16_EU_PER_DSS - To query the mask of SIMD16 Execution
* Units (EU) available per Dual Sub Slices (DSS). For example a query
* response containing the following in mask:
* ``SIMD16_EU_PER_DSS ff ff 00 00 00 00 00 00``
* means each DSS has 16 SIMD16 EUs. This type may be omitted if device
* doesn't have SIMD16 EUs.
*/
struct drm_xe_query_topology_mask {
/** @gt_id: GT ID the mask is associated with */
@@ -513,6 +520,7 @@ struct drm_xe_query_topology_mask {
#define DRM_XE_TOPO_DSS_GEOMETRY (1 << 0)
#define DRM_XE_TOPO_DSS_COMPUTE (1 << 1)
#define DRM_XE_TOPO_EU_PER_DSS (1 << 2)

There are other changes in this header that are already available in drm-next. It's probably easier to sync the header with drm-next and add this change on top. That way it won't conflict with future updates.
This is what we have in drm-next:

#define DRM_XE_TOPO_DSS_GEOMETRY        1
#define DRM_XE_TOPO_DSS_COMPUTE         2
#define DRM_XE_TOPO_L3_BANK             3
#define DRM_XE_TOPO_EU_PER_DSS          4

Also, feel free to commit this once DRM_XE_TOPO_SIMD16_EU_PER_DSS reaches drm-next as per rules in https://docs.kernel.org/gpu/drm-uapi.html:

The kernel patch can only be merged after all the above requirements are met, but it must be merged to either drm-next or drm-misc-next before the userspace patches land. uAPI always flows from the kernel, doing things the other way round risks divergence of the uAPI definitions and header files.

#define DRM_XE_TOPO_SIMD16_EU_PER_DSS 5
/** @type: type of mask */
__u16 type;
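
The header comment in this hunk says a mask of ``ff ff 00 00 00 00 00 00`` means 16 EUs per DSS, i.e. the count is the number of set bits across the mask bytes. A minimal sketch of that arithmetic, assuming an 8-byte mask as in the examples (`countEusPerDss` is a hypothetical helper, not a driver function):

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Counts the set bits across all mask bytes; each set bit is one available EU.
inline std::size_t countEusPerDss(const std::array<uint8_t, 8> &mask) {
    std::size_t count = 0;
    for (uint8_t byte : mask) {
        count += std::bitset<8>(byte).count();
    }
    return count;
}
```

Applied to the mocked SIMD16 mask used in the new test, `0b1111` followed by zero bytes, this gives 4, which matches the `maxEuPerSubSlice` value the test expects.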
