Skip to content

Visualize bounding boxes as Napari shapes layer #590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

DPWebster
Copy link

@DPWebster DPWebster commented May 14, 2025

Before submitting a pull request (PR), please read the contributing guide.

Please fill out as much of this template as you can, but if you have any problems or questions, just leave a comment and we will help out :)

Description

Closes #567

Adds a checkbox to toggle loading VIA-Tracks bounding box data as Napari shapes layers and functionality for loading shapes data from a bounding boxes dataset (containing heights/widths and centroid coordinates for a rectangular bounding box).

What is this PR

  • Addition of a new feature

Why is this PR needed?

Allows users to visualize VIA-Tracks bounding boxes data sets as rectangular bounding boxes instead of only visualizing their centroids.

What does this PR do?

Includes functionality for loading a VIA-Tracks file as a Napari shapes layer representation (in addition to points + tracks) which can be toggled on or off via a checkbox.

References

#567

How has this PR been tested?

Pytest run to ensure no existing functionality breaks. New functionality verified by attempting to load sample data (VIA_single-crab_MOCA-crab-1.csv, VIA_multiple-crabs_5-frames_labels.csv) with checkbox ticked. Also attempted to verify existing functionality still works by loading other datasets (DLC, etc.) with and without the checkbox ticked.

Is this a breaking change?

This feature should not break existing functionality.

Does this PR require an update to the documentation?

Yes, gui.md has been updated to reflect the new functionality..

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

@DPWebster
Copy link
Author

Hi @sfmig, this is a basic implementation of the bounding boxes visualization from #567. I still need to add test coverage, documentation and the optional features of #17 and displaying areas but I wanted to get something presentable so I could ask for further input.

  • While developing this I noticed that one entry from the source file will always correspond to one frame in the viewer even if the fps slider is changed or the data indicates that one entry should correspond to multiple frames (e.g. VIA_single-crab_MOCA-crab-1.csv has most of its entires correspond to 5 frames; it should have 168 frames of data, whereas our functions will load 35). I had spent some time working on a fix for this before I noticed it affected all data and not just the bounding box loading functions I was working on. Is this intended behavior or would it be useful to ensure the number of frames we load to the viewer matches up with the time column? (I see that there are some other issues related to this, so that might be something to work on after I finish up here.)

  • Based on discussion in Automatically generate bounding boxes annotations from pose data #102, bounding boxes for poses datasets could be visualized as a convex hull or as a rectangle. Is there any particular preference for which, or both?

@niksirbi niksirbi requested a review from sfmig May 14, 2025 08:47
@sfmig
Copy link
Contributor

sfmig commented May 15, 2025

Hi @DPWebster, thanks for having a go.

This looks like a good start. However, I have not done a detailed review since the PR is still a draft. Whenever you feel that it is as finished as it can be on your side, feel free to mark it as "Ready for review". Someone in the team (probably me) will review it. "Ready to review" usually means the functionality is in, along with tests, updated docs and a PR description.

In the description you mention that you plan to add two more features: visualisation for bboxes for poses datasets, and displaying of bboxes areas. I would strongly suggest dealing with these two features as two separate PRs. The main reason for this is PR size. Small PRs are faster to understand and to review, and often will get merged more quickly. It is also easier to write tests for a small chunk of work.

Re your questions:

While developing this I noticed that one entry from the source file will always correspond to one frame in the viewer even if [...] the data indicates that one entry should correspond to multiple frames (e.g. VIA_single-crab_MOCA-crab-1.csv has most of its entires correspond to 5 frames; it should have 168 frames of data, whereas our functions will load 35)

This is not a problem for us at the moment. It happens because in ds_to_napari_tracks, the time column that napari uses for the slider is "recomputed" to count number of frames based on the input data.

This was the original implementation of the widget, I believe for simplicity. We could instead use the timestamps as specified in the input movement dataset ds. This may make more sense for bboxes datasets (note the use_frame_numbers_from_file argument in the bboxes loading function). For poses, we currently refer all timepoints to the start of the data (so using the dataset timestamps or recomputing them based on the data are equivalent), but we are exploring expanding this to other formats (see #473). I opened an issue to discuss (#595), feel free to chime in with any thoughts.

For now I would recommend using the interpolated version of that dataset instead when you are trying things out. The VIA_single-crab_MOCA-crab-1.csv file is somewhat confusing due to the data being defined for one every 5 frames, but this will rarely be the case.

Re the question about convex hull / rectangles: it is usually a good idea to keep things as consistent and simple as possible, so in this case I believe that would be "rectangles". But probably better to discuss this in its relevant PR when we get to it.

I left a couple more comments in the code, although I realise it is not finished yet. Hope it helps!

@@ -86,3 +86,144 @@ def ds_to_napari_tracks(
properties = _construct_properties_dataframe(ds_)

return data, properties


def ds_to_napari_shapes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of overlap between this function and ds_to_napari_tracks. It would be better if we could factor out the common bits, so that both ds_to_napari_shapes and ds_to_napari_tracks can re-use them. This will also make testing much easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can combine ds_to_napari_tracks and ds_to_napari_shapes into a single function ds_to_napari that takes a movement dataset, and returns the data for the napari points and tracks layers, and for the shape layer.

I had a go, I paste my suggestion below in case it is useful. I tried it and seemed to work ok, but haven't checked/adapted the tests. I removed the docstring for clarity.

The implementation below is slightly different to yours, I was trying to make it a bit more readable by using the data arrays but not sure I succeeded. Feel free to take or leave bits as you see fit, or even to stick to your original approach if you prefer, but I do think it is worth combining these two functions into one.

I would also remove the references to upper-left, lower-left corners because they seem to assume a different coordinate system than the one used by napari. If you think of an image loaded in napari, the origin of the napari coordinate system is at the top left corner of the image, the x coordinate increases from left to right and the y coordinate from top to bottom. So the (xmin, ymin) corner is the top left corner of the bounding box, rather than lower left as in your code.

def ds_to_napari_layers(
    ds: xr.Dataset,
) -> tuple[np.ndarray, pd.DataFrame, np.ndarray]:

    # Get track id and time columns
    track_id_col, time_col = _construct_track_and_time_cols(ds)

    # Reorder axes to (individuals, keypoints, time, space) 
    axes_reordering: tuple[int, ...] = (2, 0, 1)
    if "keypoints" in ds.coords:
        axes_reordering = (3,) + axes_reordering

    # Get data for napari tracks and points layers
    yx_cols = np.transpose(
        ds.position.values,  # from: time, space, keypoints, individuals
        axes_reordering,  # to: individuals, keypoints, time, space
    ).reshape(-1, 2)[:, [1, 0]]  # swap x and y columns
    points_data = np.hstack((track_id_col, time_col, yx_cols))

    # Get data for napari boxes layer if present
    if "shape" in ds.data_vars:
        # Compute bbox corners
        xmin_ymin = ds.position - (ds.shape / 2)
        xmax_ymax = ds.position + (ds.shape / 2)
        xmax_ymin = xmin_ymin + np.stack(
            [
                ds.shape.sel(space="x"),  # xmax = xmin + width
                np.zeros_like(ds.shape.sel(space="x")),  # ymin = ymin + 0
            ],
            axis=1,
        )
        xmin_ymax = xmin_ymin + np.stack(
            [
                np.zeros_like(ds.shape.sel(space="y")),  # xmin = xmin + 0
                ds.shape.sel(space="y"),  # ymax = ymin + height
            ],
            axis=1,
        )

        # Add track_id and time columns to each corner array
        corner_arrays_with_track_id_and_time = [
            np.c_[
                track_id_col,
                time_col,
                np.transpose(corner.values, axes_reordering).reshape(-1, 2),
            ]
            for corner in [xmin_ymin, xmin_ymax, xmax_ymax, xmax_ymin]
        ]

	# Concatenate corner arrays along columns
        corners_array = np.concatenate(
            corner_arrays_with_track_id_and_time, axis=1
        )

        # Reshape to napari expected format
	# (goes through corners counterclockwise from xmin_ymin in image coordinates)
        corners_array = corners_array.reshape(-1, 4, 4)  # last dimension: track_id, time, x, y
        bboxes_shape_data = corners_array[:, :, [0, 1, 3, 2]]  # swap x and y columns

    else:
        bboxes_shape_data = None

   
    # Stack individuals, time and keypoints (if present) dimensions
    # into a new single dimension named "tracks"
    dimensions_to_stack: tuple[str, ...] = ("individuals", "time")
    if "keypoints" in ds.coords:
        dimensions_to_stack += ("keypoints",)  # add last
    ds_ = ds.stack(tracks=sorted(dimensions_to_stack))

     # Construct the properties DataFrame
    properties = _construct_properties_dataframe(ds_)

    return points_data, properties, bboxes_shape_data

# the data, so that one 4x3 array = 1 frame for 1 individual.
# Assume the last time entry corresponds to one frame.

# This block of code was originally intended to be used to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say for now to keep it simple, let's assume each entry in the data array corresponds to one frame as per usual


Notes
-----
A corresponding napari Shapes array can be derived from the Tracks array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this note... if a Shapes array can be derived from the Tracks array, can we re-use our existing function then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note was included by error, as you can probably tell this function was initially based off of ds_to_napari_tracks() and I missed getting rid of this when rewriting the docstring. I'll fix this when I rework this function and add test coverage later today.

Copy link

codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (4fe7d28) to head (aab3b71).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #590   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           32        32           
  Lines         1774      1835   +61     
=========================================
+ Hits          1774      1835   +61     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@DPWebster DPWebster marked this pull request as ready for review May 18, 2025 05:29
@DPWebster DPWebster requested a review from sfmig May 27, 2025 21:41
Copy link
Contributor

@sfmig sfmig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DPWebster,

This is a very nice attempt and thanks for joining the call today! I think it was good to discuss it in a group.

I left some in-line comments but have a couple of general ones.

As discussed, we decided for simplicity to always load the boxes layer if the shape data array is available in the loaded movement dataset. So we would need to remove the checkbox widget - hopefully it also simplifies things a bit with the implementation and tests.

Re the "white bboxes" issue I found, I confirm that it happens if the loaded dataset has nans in the shape array. This may happen if not all individuals are present in all frames. I added a sample dataset to GIN where this is the case (VIA_multiple-crabs_5-frames_labels_missing.csv) to help us check this "manually". However, we should also properly test this (i.e. test that data with nans is loaded with the expected colors for boxes, tracks and points). If you want to do that in this PR that would be great, but it is also fine if we raise an issue and deal with it separately.

I also found some comments and docstrings were somewhat outdated, if you could have a look that would be great.

Thanks again for the work!

@@ -5,7 +5,7 @@ The `movement` graphical user interface (GUI), powered by our custom plugin for
[napari](napari:), makes it easy to view and explore `movement`
motion tracks. Currently, you can use it to
visualise 2D [movement datasets](target-poses-and-bboxes-dataset)
as points and tracks overlaid on video frames.
as points, tracks, and rectangular bounding boxes overlaid on video frames.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
as points, tracks, and rectangular bounding boxes overlaid on video frames.
as points, tracks, and rectangular bounding boxes (if defined) overlaid on video frames.

def _add_shapes_layer(self):
"""Add tracked data to the viewer as a Shapes layer."""
shapes_style = ShapesStyle(
name=f"shapes: {self.file_name}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name=f"shapes: {self.file_name}",
name=f"boxes: {self.file_name}",

Just to avoid overloading the term "shapes" - we already use it to refer to a type of napari layer, for the data variable in the movement dataset holding the bounding boxes width and height, and for the numpy-like term (the number of elements per dimension in a data array). We will also likely have other napari shape layers in the future (for example to define regions of interest, see #377 ).

It would be good to be a bit more specific in the code as well and maybe call it "bboxes layer" rather than shapes layer.

@@ -134,10 +134,12 @@ an expanded `Load tracked data` menu. To load tracked data in napari:
1. Select one of the [supported formats](target-supported-formats) from the `source software` dropdown menu.
2. Set the `fps` (frames per second) of the video the data refers to. Note this will only affect the units of the time variable shown when hovering over a keypoint. If the `fps` is not known, you can set it to 1, which will effectively make the time variable equal to the frame number.
3. Select the file containing the tracked data. You can paste the path to the file directly in the text box, or you can use the file browser button.
4. Click `Load`.
4. Optionally, you may load the selected file as rectangular bounding boxes in addition to keypoints and tracks by ticking the `load bboxes from path?` checkbox. Currently, this is only supported for bounding box datasets, i.e. if `source software` is set to VIA-tracks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we decided to load the shapes layer with the boxes data if available, this should be reworded.


The data will be loaded into the viewer as a
[points layer](napari:howtos/layers/points.html) and as a [tracks layer](napari:howtos/layers/tracks.html).
If bounding box data is selected to be loaded and visualized, it is loaded as a [shapes layer](napari:howtos/layers/shapes.html).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If bounding box data is selected to be loaded and visualized, it is loaded as a [shapes layer](napari:howtos/layers/shapes.html).
If the data for the width and height of the bounding boxes is available, it is loaded as a [napari shapes layer](napari:howtos/layers/shapes.html).


![napari widget with shapes loaded](../_static/napari_shapes_layer.png)

Bounding boxes are represented as rectangles color-coded by individual. Bounding boxes are always labeled and coloured by individual, even for data with keypoints.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Bounding boxes are represented as rectangles color-coded by individual. Bounding boxes are always labeled and coloured by individual, even for data with keypoints.
Bounding boxes are represented as rectangles color-coded by individual.

For now only bounding box datasets will show a shapes layer with bounding boxes, and this kind of datasets don't have keypoints. So we can omit that last sentence.

The name of the colormap to use, otherwise use the edge_colormap.

"""
# Set points and text to be colored by selected property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only color them by individual (because bboxes datasets don't have keypoints) I wonder if this can be simplified

@@ -166,6 +189,8 @@ def _on_load_clicked(self):
# Add the data as a points and a tracks layers
self._add_points_layer()
self._add_tracks_layer()
if self.shapes is not None and self.bboxes_checkbox.isChecked():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.shapes is not None and self.bboxes_checkbox.isChecked():
if self.shapes is not None:

Comment on lines 211 to 227
# Also convert to napari Shapes array if supported and selected
# Warn user if conversion isn't supported.
if (
self.bboxes_checkbox.isChecked()
and self.source_software in SUPPORTED_BBOXES_FILES
):
self.shapes = ds_to_napari_shapes(ds)
if (
self.bboxes_checkbox.isChecked()
and self.source_software not in SUPPORTED_BBOXES_FILES
):
self.shapes = None
show_warning(f"{self.source_software} to bboxes not supported.")

# Also warn via logger for integration with tests.
logger.warning(f"{self.source_software} to bboxes not supported.")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we could just have:

Suggested change
# Also convert to napari Shapes array if supported and selected
# Warn user if conversion isn't supported.
if (
self.bboxes_checkbox.isChecked()
and self.source_software in SUPPORTED_BBOXES_FILES
):
self.shapes = ds_to_napari_shapes(ds)
if (
self.bboxes_checkbox.isChecked()
and self.source_software not in SUPPORTED_BBOXES_FILES
):
self.shapes = None
show_warning(f"{self.source_software} to bboxes not supported.")
# Also warn via logger for integration with tests.
logger.warning(f"{self.source_software} to bboxes not supported.")
if self.shapes is None:
logger.warning(f"{self.source_software} to bboxes not supported.")

Do you know why we need the double warning? I'm confused about that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The show_warning was intended to provide feedback in the GUI that a user was trying to load bounding boxes from a file format that didn't currently support that. The tests also relied on this error message (to validate that we weren't trying to execute any of the functions for converting datasets to Shapes layers on datasets that those functions don't support) but I wasn't able to get the tests to recognize that output so I added the logger.warning one as a placeholder. I probably should have gone back and polished that before marking this for review but in any case I don't think there's a need for the show_warning now that creating a Shapes layer will be automatic.

expected_n_colors,
):
"""Test that set_color_by updates the color and color cycle of
the point markers and the text.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the point markers and the text.
the bounding boxes and the text.

properties_df=self.properties.iloc[self.data_not_nan, :],
)
self.shapes_layer = self.viewer.add_shapes(
self.shapes[:, :, 1:],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the issue with nans:

Suggested change
self.shapes[:, :, 1:],
self.shapes[self.data_not_nan, :, 1:],

Copy link

@DPWebster DPWebster requested a review from sfmig June 11, 2025 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Visualise bounding boxes as Shapes layer in napari
3 participants