Labels: scalability (Related to scalability & performance efforts)
Description
I've observed very poor performance when constructing `Grid.edge_node_connectivity` and `Grid.face_edge_connectivity`.
The timings below were taken on a single NCAR Derecho CPU node.
- AMD EPYC™ 7763 Milan processors
- Dual-socket nodes, 64 cores per socket
- 256 GB DDR4 memory per node
Resolution | Nodes | Faces | Edges | Grid Load Time (s) | Connectivity Construction Time (s) | Total Time (s) |
---|---|---|---|---|---|---|
30km | 1,310,720 | 655,362 | 1,966,080 | 2.023 | 9.37 | 11.393 |
15km | 5,242,880 | 2,621,442 | 7,864,320 | 7.673 | 39.987 | 47.66 |
7.5km | 20,971,520 | 10,485,762 | 31,457,280 | 28.716 | 99.309 | 128.025 |
3.75km | 83,886,080 | 41,943,042 | 125,829,120 | 113.943 | 406.8 | 520.743 |
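The timing for the 15km grid seems inconsistent with the others. Each refinement step increases the number of nodes, faces, and edges by about 4x, so the construction time is expected to scale by about 4x as well. The 30km to 15km and 7.5km to 3.75km steps follow this trend (about 4.3x and 4.1x), but the 15km to 7.5km step scales by only about 2.5x.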
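To make the scaling concrete, the ratios between successive resolutions can be computed directly from the connectivity construction times in the table. This is only arithmetic on the reported numbers; the variable names are chosen for illustration.

```python
# Connectivity construction times (s), copied from the table above.
construction_time = {
    "30km": 9.37,
    "15km": 39.987,
    "7.5km": 99.309,
    "3.75km": 406.8,
}

# Ratio of each resolution's time to the next coarser one.
resolutions = list(construction_time)
ratios = {}
for coarse, fine in zip(resolutions, resolutions[1:]):
    ratios[f"{coarse} -> {fine}"] = construction_time[fine] / construction_time[coarse]

for step, ratio in ratios.items():
    print(f"{step}: {ratio:.2f}x")
# 30km -> 15km: 4.27x
# 15km -> 7.5km: 2.48x
# 7.5km -> 3.75km: 4.10x
```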
Currently, we have the following implementation (`uxarray/uxarray/grid/connectivity.py`, lines 181 to 238 in a6aa629):
    def _build_edge_node_connectivity(face_nodes, n_face, n_max_face_nodes):
        """Constructs the UGRID connectivity variable (``edge_node_connectivity``)
        and stores it within the internal (``Grid._ds``) and through the attribute
        (``Grid.edge_node_connectivity``).

        Additionally, the attributes (``inverse_indices``) and
        (``fill_value_mask``) are stored for constructing other
        connectivity variables.

        Parameters
        ----------
        repopulate : bool, optional
            Flag used to indicate if we want to overwrite the existed `edge_node_connectivity` and generate a new
            inverse_indices, default is False
        """

        padded_face_nodes = close_face_nodes(face_nodes, n_face, n_max_face_nodes)

        # array of empty edge nodes where each entry is a pair of indices
        edge_nodes = np.empty((n_face * n_max_face_nodes, 2), dtype=INT_DTYPE)

        # first index includes starting node up to non-padded value
        edge_nodes[:, 0] = padded_face_nodes[:, :-1].ravel()

        # second index includes second node up to padded value
        edge_nodes[:, 1] = padded_face_nodes[:, 1:].ravel()

        # sorted edge nodes
        edge_nodes.sort(axis=1)

        # unique edge nodes
        edge_nodes_unique, inverse_indices = np.unique(
            edge_nodes, return_inverse=True, axis=0
        )

        # find all edge nodes that contain a fill value
        fill_value_mask = np.logical_or(
            edge_nodes_unique[:, 0] == INT_FILL_VALUE,
            edge_nodes_unique[:, 1] == INT_FILL_VALUE,
        )

        # all edge nodes that do not contain a fill value
        non_fill_value_mask = np.logical_not(fill_value_mask)
        edge_nodes_unique = edge_nodes_unique[non_fill_value_mask]

        # Update inverse_indices accordingly
        indices_to_update = np.where(fill_value_mask)[0]

        remove_mask = np.isin(inverse_indices, indices_to_update)
        inverse_indices[remove_mask] = INT_FILL_VALUE

        # Compute the indices where inverse_indices exceeds the values in indices_to_update
        indexes = np.searchsorted(indices_to_update, inverse_indices, side="right")

        # subtract the corresponding indexes from `inverse_indices`
        for i in range(len(inverse_indices)):
            if inverse_indices[i] != INT_FILL_VALUE:
                inverse_indices[i] -= indexes[i]

        return edge_nodes_unique, inverse_indices, fill_value_mask
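One plausible hotspot in the implementation above is the final pure-Python loop, which visits every entry of `inverse_indices` (`n_face * n_max_face_nodes` elements) one at a time. This has not been profiled here, so treat it as an assumption. The sketch below shows how that step could be expressed with a boolean mask instead. The helper names `shift_loop` and `shift_vectorized` are hypothetical, and `INT_DTYPE` / `INT_FILL_VALUE` are stand-ins assumed to resemble the constants used in uxarray.

```python
import numpy as np

# Assumed stand-ins for uxarray's constants (not imported from uxarray).
INT_DTYPE = np.intp
INT_FILL_VALUE = np.iinfo(INT_DTYPE).min


def shift_loop(inverse_indices, indexes):
    """Mirror of the element-by-element loop at the end of the function above."""
    out = inverse_indices.copy()
    for i in range(len(out)):
        if out[i] != INT_FILL_VALUE:
            out[i] -= indexes[i]
    return out


def shift_vectorized(inverse_indices, indexes):
    """Same update done with a boolean mask (hypothetical replacement)."""
    out = inverse_indices.copy()
    valid = out != INT_FILL_VALUE
    out[valid] -= indexes[valid]
    return out


# Toy data: five unique edges, of which entries 1 and 3 contained a fill value.
indices_to_update = np.array([1, 3], dtype=INT_DTYPE)
inverse_indices = np.array([0, 1, 2, 3, 4, 2, 0], dtype=INT_DTYPE)

# Same preparation steps as in the original function.
remove_mask = np.isin(inverse_indices, indices_to_update)
inverse_indices[remove_mask] = INT_FILL_VALUE
indexes = np.searchsorted(indices_to_update, inverse_indices, side="right")

result_loop = shift_loop(inverse_indices, indexes)
result_vectorized = shift_vectorized(inverse_indices, indexes)

# Both versions should agree on every entry.
assert np.array_equal(result_loop, result_vectorized)
```

On the toy input, the surviving edges 0, 2, and 4 are renumbered to 0, 1, and 2, while entries that pointed at removed edges keep the fill value. Whether this change alone accounts for the slow timings would need to be confirmed by profiling, since `np.unique(..., axis=0)` may also contribute significantly.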
Status
🏗 In progress