Commit 5b019d8

feat(LAB-3634): import labels from shapefiles (#1890)
Co-authored-by: paulruelle <[email protected]>
1 parent 71a3758 commit 5b019d8

8 files changed: +1332 -0 lines changed

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
+<!-- FILE AUTO GENERATED BY docs/utils.py DO NOT EDIT DIRECTLY -->
+<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/import_labels_from_shapefiles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+# Importing Labels from Shapefiles
+
+This tutorial explains how to use the `kili.append_labels_from_shapefiles` function in the Kili SDK to import geometries
+from shapefiles and convert them to annotations in your Kili projects.
+
+## Introduction
+
+Shapefiles are a standard geospatial data format that stores the location, shape, and attributes of geographic features.
+They are commonly used in Geographic Information Systems (GIS) to represent points, lines, and polygons.
+
+The `append_labels_from_shapefiles` function automatically converts this spatial data into Kili annotations.
+
+
+## Prerequisites
+
+Before using this feature, make sure you have the following:
+
+- A Kili project of type `IMAGE` or `GEOSPATIAL`
+- One or more shapefiles (`.shp`)
+- Knowledge of the coordinate system (EPSG code) of your shapefiles
+
+## Installation
+
+This feature requires additional dependencies. Install them using:
+
+```bash
+pip install 'kili[gis]'
+```
+
+_This command installs the libraries needed for geospatial data manipulation, such as pyproj._
+
+
+## Function Structure
+
+The `append_labels_from_shapefiles` function takes the following parameters:
+
+| Parameter           | Type                | Description                                                |
+|---------------------|---------------------|------------------------------------------------------------|
+| `project_id`        | str                 | The Kili project identifier                                |
+| `asset_external_id` | str                 | The external identifier of the asset to annotate           |
+| `shapefile_paths`   | List[str]           | List of paths to the shapefiles                            |
+| `job_names`         | List[str]           | List of job names corresponding to each shapefile          |
+| `category_names`    | List[str]           | List of category names corresponding to each shapefile     |
+| `from_epsgs`        | Optional[List[int]] | Source EPSG code of each shapefile (default: EPSG:4326)    |
+
+
+## Supported Geometry Types
+
+The function supports the following shapefile geometry types:
+
+- Points (Type 1) - Converted to "marker" type annotations in Kili
+- Polylines (Type 3) - Converted to "polyline" type annotations in Kili
+- Polygons (Type 5) - Converted to "semantic" type (mask) annotations in Kili
+
+## Basic Usage Example
+
+Here's a simple usage example:
+
+```python
+from kili.client import Kili
+
+# Initialize Kili client
+kili = Kili(api_key="your_api_key")
+
+# Import labels from shapefiles
+kili.append_labels_from_shapefiles(
+    project_id="your_project_id",
+    asset_external_id="satellite_image.tif",
+    shapefile_paths=["roads.shp", "buildings.shp", "water_points.shp"],
+    job_names=["ROADS", "BUILDINGS", "HYDROLOGY"],
+    category_names=["ROAD", "BUILDING", "WATER_POINT"],
+    from_epsgs=[4326, 4326, 4326]  # All shapefiles are in WGS84 (EPSG:4326)
+)
+```
+
+## Advanced Example with Different Coordinate Systems
+
+If your shapefiles use different coordinate systems:
+
+```python
+
+from kili.client import Kili
+
+kili = Kili(api_key="your_api_key")
+
+kili.append_labels_from_shapefiles(
+    project_id="your_project_id",
+    asset_external_id="sentinel2_image.jp2",
+    shapefile_paths=["observation_points.shp", "parcels.shp", "protected_areas.shp"],
+    job_names=["OBSERVATIONS", "PARCELS", "ZONES"],
+    category_names=["OBS_POINT", "AGRICULTURAL_PARCEL", "NATURE_RESERVE"],
+    from_epsgs=[4326, 2154, 3857]  # WGS84, Lambert93, Web Mercator
+)
+```
+
+In this example, each shapefile uses a different coordinate system. The function will automatically convert all
+geometries to WGS84 (EPSG:4326) before importing them into Kili.
+
+
+## Troubleshooting
+
+### Problem: ImportError
+
+```bash
+ImportError: This function requires the 'gis' extra dependencies.
+Install them with: pip install kili[gis] or pip install 'kili[gis]'
+```
+
+Solution: Install the GIS dependencies:
+
+```bash
+pip install 'kili[gis]'
+```
+
+### Problem: Incorrect Coordinates or Invisible Annotations
+
+Possible solutions:
+
+- Check that you have specified the correct EPSG code for each shapefile
+- Make sure the image in Kili is correctly georeferenced
+- Check if the image has geospatial metadata with:
+
+```python
+asset = kili.assets(project_id="your_project_id", fields=["jsonContent"])[0]
+print(asset['jsonContent'])
+```
+
+### Problem: Categories Not Found
+
+Solution: Check that the category names exactly match those in your Kili ontology:
+
+```python
+project = kili.projects(project_id="your_project_id", fields=["jsonInterface"])[0]
+print(project["jsonInterface"])
+```
+
+
+## Conclusion
+
+The `append_labels_from_shapefiles` function greatly simplifies the import of geospatial data into Kili, allowing you
+to easily convert your existing GIS data into annotations usable for machine learning and manual annotation.
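
The geometry mapping described in the tutorial above depends on each shapefile's shape type code. As a minimal sketch, and not the SDK's implementation, that code can be read directly from the 100-byte `.shp` header, where the ESRI format stores it as a little-endian 32-bit integer at byte offset 32. The names `read_shape_type`, `kili_tool_for`, and `KILI_TOOL_BY_SHAPE_TYPE` are hypothetical helpers introduced for illustration; the mapping only restates the tutorial's list.

```python
import struct

# Mapping restated from the tutorial's "Supported Geometry Types" list.
# The dictionary name is hypothetical; it is not part of the Kili SDK.
KILI_TOOL_BY_SHAPE_TYPE = {1: "marker", 3: "polyline", 5: "semantic"}

SHP_FILE_CODE = 9994  # magic number stored big-endian in the first 4 header bytes


def read_shape_type(shp_path: str) -> int:
    """Return the shape type code stored in a .shp main-file header."""
    with open(shp_path, "rb") as shp_file:
        header = shp_file.read(100)
    if len(header) < 100:
        raise ValueError(f"{shp_path} is too short to be a shapefile")
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != SHP_FILE_CODE:
        raise ValueError(f"{shp_path} does not start with the shapefile file code")
    (shape_type,) = struct.unpack("<i", header[32:36])
    return shape_type


def kili_tool_for(shp_path: str) -> str:
    """Look up the annotation type the tutorial lists for this shapefile."""
    shape_type = read_shape_type(shp_path)
    if shape_type not in KILI_TOOL_BY_SHAPE_TYPE:
        raise ValueError(f"Shape type {shape_type} is not in the tutorial's supported list")
    return KILI_TOOL_BY_SHAPE_TYPE[shape_type]
```

Running `kili_tool_for` on each path before the import is one way to confirm that a file contains a geometry type the tutorial lists, and that the job it is paired with uses the matching tool.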

docs/tutorials.md

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ For other specific use cases, see these tutorials:
 - [Importing OCR pre-annotations](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/ocr_pre_annotations/)
 - [Importing segmentation pre-annotations](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/pixel_level_masks/)
 - [Importing DINOv2 classification pre-annotations](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/finetuning_dinov2/)
+- [Importing labels from shapefiles](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/import_labels_from_shapefiles/)
 
 Additionally, we’ve devoted one [tutorial](https://python-sdk-docs.kili-technology.com/latest/sdk/tutorials/inference_labels/) to explaining the most common use cases for importing and using model-generated labels: actively monitoring the quality of a model currently deployed to production to detect issues like data drift, and using a model to speed up the process of label creation.
 

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ nav:
       - Segmentation Pre-annotations: sdk/tutorials/pixel_level_masks.md
       - Inference Labels: sdk/tutorials/inference_labels.md
       - DINOv2 Classification Pre-annotations: sdk/tutorials/finetuning_dinov2.md
+      - Import labels from shapefiles (GIS): sdk/tutorials/import_labels_from_shapefiles.md
     - Converting Labels:
       - DICOM: sdk/tutorials/medical_imaging.md
       - Tagtog: sdk/tutorials/tagtog_to_kili.md

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -50,6 +50,8 @@ dependencies = [
     "pip-system-certs >= 4.0.0, < 5.0.0; platform_system=='Windows'",
     "pyrate-limiter >= 3, < 4",
     "shapely >= 1.8, < 3",
+    "pyproj >= 2.6.1, < 3; python_version < '3.9'",
+    "pyproj == 3.6.1; python_version >= '3.9'",
 ]
 urls = { homepage = "https://github.com/kili-technology/kili-python-sdk" }
 
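
The `pyproj` dependency added above handles coordinate reprojection, and the tutorial states that geometries are converted to WGS84 (EPSG:4326) before import. To give a feel for what such a conversion does, here is a minimal standard-library sketch for one case only: Web Mercator (EPSG:3857) to longitude/latitude, using the closed-form inverse of the spherical Mercator projection. The function name `web_mercator_to_wgs84` is hypothetical, and this is not how the SDK or pyproj implements the transform; other systems such as Lambert-93 (EPSG:2154) need a full projection library.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857 (WGS84 semi-major axis)


def web_mercator_to_wgs84(x: float, y: float) -> Tuple[float, float]:
    """Convert EPSG:3857 coordinates in metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# The projection origin maps to the intersection of the equator and prime meridian.
lon, lat = web_mercator_to_wgs84(0.0, 0.0)
print(lon, lat)
```

A quick sanity check like this can help with the "Incorrect Coordinates" case in the tutorial: if the converted values fall far from where the image is located, the EPSG code passed in `from_epsgs` is probably wrong.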

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "234a6586cee4560f",
+   "metadata": {},
+   "source": [
+    "<a href=\"https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/main/recipes/import_labels_from_shapefiles.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
+    "\n",
+    "# Importing Labels from Shapefiles\n",
+    "\n",
+    "This tutorial explains how to use the `kili.append_labels_from_shapefiles` function in the Kili SDK to import geometries\n",
+    "from shapefiles and convert them to annotations in your Kili projects.\n",
+    "\n",
+    "## Introduction\n",
+    "\n",
+    "Shapefiles are a standard geospatial data format that stores the location, shape, and attributes of geographic features.\n",
+    "They are commonly used in Geographic Information Systems (GIS) to represent points, lines, and polygons.\n",
+    "\n",
+    "The `append_labels_from_shapefiles` function automatically converts this spatial data into Kili annotations.\n",
+    "\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "Before using this feature, make sure you have the following:\n",
+    "\n",
+    "- A Kili project of type `IMAGE` or `GEOSPATIAL`\n",
+    "- One or more shapefiles (`.shp`)\n",
+    "- Knowledge of the coordinate system (EPSG code) of your shapefiles\n",
+    "\n",
+    "## Installation\n",
+    "\n",
+    "This feature requires additional dependencies. Install them using:\n",
+    "\n",
+    "```bash\n",
+    "pip install 'kili[gis]'\n",
+    "```\n",
+    "\n",
+    "_This command installs the libraries needed for geospatial data manipulation, such as pyproj._\n",
+    "\n",
+    "\n",
+    "## Function Structure\n",
+    "\n",
+    "The `append_labels_from_shapefiles` function takes the following parameters:\n",
+    "\n",
+    "| Parameter           | Type                | Description                                                |\n",
+    "|---------------------|---------------------|------------------------------------------------------------|\n",
+    "| `project_id`        | str                 | The Kili project identifier                                |\n",
+    "| `asset_external_id` | str                 | The external identifier of the asset to annotate           |\n",
+    "| `shapefile_paths`   | List[str]           | List of paths to the shapefiles                            |\n",
+    "| `job_names`         | List[str]           | List of job names corresponding to each shapefile          |\n",
+    "| `category_names`    | List[str]           | List of category names corresponding to each shapefile     |\n",
+    "| `from_epsgs`        | Optional[List[int]] | Source EPSG code of each shapefile (default: EPSG:4326)    |\n",
+    "\n",
+    "\n",
+    "## Supported Geometry Types\n",
+    "\n",
+    "The function supports the following shapefile geometry types:\n",
+    "\n",
+    "- Points (Type 1) - Converted to \"marker\" type annotations in Kili\n",
+    "- Polylines (Type 3) - Converted to \"polyline\" type annotations in Kili\n",
+    "- Polygons (Type 5) - Converted to \"semantic\" type (mask) annotations in Kili\n",
+    "\n",
+    "## Basic Usage Example\n",
+    "\n",
+    "Here's a simple usage example:\n",
+    "\n",
+    "```python\n",
+    "from kili.client import Kili\n",
+    "\n",
+    "# Initialize Kili client\n",
+    "kili = Kili(api_key=\"your_api_key\")\n",
+    "\n",
+    "# Import labels from shapefiles\n",
+    "kili.append_labels_from_shapefiles(\n",
+    "    project_id=\"your_project_id\",\n",
+    "    asset_external_id=\"satellite_image.tif\",\n",
+    "    shapefile_paths=[\"roads.shp\", \"buildings.shp\", \"water_points.shp\"],\n",
+    "    job_names=[\"ROADS\", \"BUILDINGS\", \"HYDROLOGY\"],\n",
+    "    category_names=[\"ROAD\", \"BUILDING\", \"WATER_POINT\"],\n",
+    "    from_epsgs=[4326, 4326, 4326]  # All shapefiles are in WGS84 (EPSG:4326)\n",
+    ")\n",
+    "```\n",
+    "\n",
+    "## Advanced Example with Different Coordinate Systems\n",
+    "\n",
+    "If your shapefiles use different coordinate systems:\n",
+    "\n",
+    "```python\n",
+    "\n",
+    "from kili.client import Kili\n",
+    "\n",
+    "kili = Kili(api_key=\"your_api_key\")\n",
+    "\n",
+    "kili.append_labels_from_shapefiles(\n",
+    "    project_id=\"your_project_id\",\n",
+    "    asset_external_id=\"sentinel2_image.jp2\",\n",
+    "    shapefile_paths=[\"observation_points.shp\", \"parcels.shp\", \"protected_areas.shp\"],\n",
+    "    job_names=[\"OBSERVATIONS\", \"PARCELS\", \"ZONES\"],\n",
+    "    category_names=[\"OBS_POINT\", \"AGRICULTURAL_PARCEL\", \"NATURE_RESERVE\"],\n",
+    "    from_epsgs=[4326, 2154, 3857]  # WGS84, Lambert93, Web Mercator\n",
+    ")\n",
+    "```\n",
+    "\n",
+    "In this example, each shapefile uses a different coordinate system. The function will automatically convert all\n",
+    "geometries to WGS84 (EPSG:4326) before importing them into Kili.\n",
+    "\n",
+    "\n",
+    "## Troubleshooting\n",
+    "\n",
+    "### Problem: ImportError\n",
+    "\n",
+    "```bash\n",
+    "ImportError: This function requires the 'gis' extra dependencies.\n",
+    "Install them with: pip install kili[gis] or pip install 'kili[gis]'\n",
+    "```\n",
+    "\n",
+    "Solution: Install the GIS dependencies:\n",
+    "\n",
+    "```bash\n",
+    "pip install 'kili[gis]'\n",
+    "```\n",
+    "\n",
+    "### Problem: Incorrect Coordinates or Invisible Annotations\n",
+    "\n",
+    "Possible solutions:\n",
+    "\n",
+    "- Check that you have specified the correct EPSG code for each shapefile\n",
+    "- Make sure the image in Kili is correctly georeferenced\n",
+    "- Check if the image has geospatial metadata with:\n",
+    "\n",
+    "```python\n",
+    "asset = kili.assets(project_id=\"your_project_id\", fields=[\"jsonContent\"])[0]\n",
+    "print(asset['jsonContent'])\n",
+    "```\n",
+    "\n",
+    "### Problem: Categories Not Found\n",
+    "\n",
+    "Solution: Check that the category names exactly match those in your Kili ontology:\n",
+    "\n",
+    "```python\n",
+    "project = kili.projects(project_id=\"your_project_id\", fields=[\"jsonInterface\"])[0]\n",
+    "print(project[\"jsonInterface\"])\n",
+    "```\n",
+    "\n",
+    "\n",
+    "## Conclusion\n",
+    "\n",
+    "The `append_labels_from_shapefiles` function greatly simplifies the import of geospatial data into Kili, allowing you\n",
+    "to easily convert your existing GIS data into annotations usable for machine learning and manual annotation.\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
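
For the "Categories Not Found" case in the notebook above, scanning the printed `jsonInterface` by eye can be tedious. The sketch below lists the category names declared for each job. It assumes the interface follows a `jobs -> <job name> -> content -> categories` layout; the helper name `categories_by_job` and the sample interface are hypothetical and only illustrate the idea.

```python
from typing import Dict, List


def categories_by_job(json_interface: dict) -> Dict[str, List[str]]:
    """Map each job name to the sorted category names declared in its ontology."""
    result = {}
    for job_name, job in json_interface.get("jobs", {}).items():
        categories = job.get("content", {}).get("categories", {})
        result[job_name] = sorted(categories)
    return result


# Hypothetical interface, shaped like the value printed by the troubleshooting snippet.
sample_interface = {
    "jobs": {
        "ROADS": {"content": {"categories": {"ROAD": {"name": "Road"}}}},
        "BUILDINGS": {"content": {"categories": {"BUILDING": {"name": "Building"}}}},
    }
}

available = categories_by_job(sample_interface)
# Check whether a category name is declared under the job it is paired with.
print("WATER_POINT" in available.get("ROADS", []))
```

Comparing each `(job_names[i], category_names[i])` pair against this mapping before calling `append_labels_from_shapefiles` makes a misspelled or misplaced category easier to spot.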

src/kili/presentation/client/label.py

Lines changed: 46 additions & 0 deletions
@@ -42,6 +42,7 @@
 from kili.services.export.types import CocoAnnotationModifier, LabelFormat, SplitOption
 from kili.use_cases.asset.utils import AssetUseCasesUtils
 from kili.use_cases.label import LabelUseCases
+from kili.use_cases.label.process_shapefiles import get_json_response_from_shapefiles
 from kili.use_cases.label.types import LabelToCreateUseCaseInput
 from kili.use_cases.project.project import ProjectUseCases
 from kili.utils.labels.parsing import ParsedLabel
@@ -1257,3 +1258,48 @@ def is_rectangle(coco_annotation, coco_image, kili_annotation):
         except NoCompatibleJobError as excp:
             warnings.warn(str(excp), stacklevel=2)
             return None
+
+    @typechecked
+    def append_labels_from_shapefiles(
+        self,
+        project_id: str,
+        asset_external_id: str,
+        shapefile_paths: List[str],
+        job_names: List[str],
+        category_names: List[str],
+        from_epsgs: Optional[List[int]] = None,
+    ):
+        """Import and convert shapefiles into annotations for a specific asset in a Kili project.
+
+        This method processes shapefile geometries (points, polylines, and polygons), converts them
+        to the appropriate Kili annotation format, and appends them as labels to the specified asset.
+        Each shapefile's geometries are associated with a job and category name in the Kili project.
+
+        Args:
+            project_id: The ID of the Kili project to add the labels to.
+            asset_external_id: The external ID of the asset to label.
+            shapefile_paths: List of file paths to the shapefiles to be processed.
+            job_names: List of job names in the Kili project, corresponding to each shapefile.
+                Each job name must match an existing job in the project.
+            category_names: List of category names corresponding to each shapefile.
+                Each category name must exist in the corresponding job's ontology.
+            from_epsgs: Optional list of EPSG codes specifying the coordinate reference systems
+                of the shapefiles. If not provided, EPSG:4326 (WGS84) is assumed for all files.
+                All geometries will be transformed to EPSG:4326 before being added to Kili.
+
+        Note:
+            This function requires the 'gis' extra dependencies.
+            Install them with: pip install kili[gis] or pip install 'kili[gis]'
+        """
+        json_response = get_json_response_from_shapefiles(
+            shapefile_paths=shapefile_paths,
+            job_names=job_names,
+            category_names=category_names,
+            from_epsgs=from_epsgs,
+        )
+
+        self.append_labels(
+            project_id=project_id,
+            json_response_array=[json_response],
+            asset_external_id_array=[asset_external_id],
+        )
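
Because the method above pairs each shapefile with a job name, a category name, and optionally an EPSG code by list position, a mismatch in list lengths is an easy mistake to make. A small pre-flight check can catch it before any file is read. This is a hypothetical helper: `check_shapefile_arguments` is not part of the SDK, and whether the SDK performs a similar check internally is not shown in this diff.

```python
from typing import List, Optional


def check_shapefile_arguments(
    shapefile_paths: List[str],
    job_names: List[str],
    category_names: List[str],
    from_epsgs: Optional[List[int]] = None,
) -> List[int]:
    """Validate the parallel lists and return one EPSG code per shapefile."""
    expected = len(shapefile_paths)
    if len(job_names) != expected or len(category_names) != expected:
        raise ValueError("job_names and category_names need one entry per shapefile")
    if from_epsgs is None:
        # The docstring says EPSG:4326 (WGS84) is assumed when no codes are given.
        return [4326] * expected
    if len(from_epsgs) != expected:
        raise ValueError("from_epsgs needs one entry per shapefile")
    return list(from_epsgs)


epsgs = check_shapefile_arguments(
    ["roads.shp", "buildings.shp"], ["ROADS", "BUILDINGS"], ["ROAD", "BUILDING"]
)
print(epsgs)
```

Returning the resolved EPSG list keeps the default explicit, so the same value can be logged or passed on as `from_epsgs` to the real call.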
