
Commit 6727773

Merge pull request #387 from rustprooflabs/docs-data-files
Improve Docs with `--pgosm-date` details and behavior
2 parents c946501 + 862d549 commit 6727773

5 files changed: +113 -42 lines changed


docs/book.toml

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -13,5 +13,5 @@ git-repository-url = "https://github.com/rustprooflabs/pgosm-flex"
1313
git-repository-icon = "fa-github"
1414
edit-url-template = "https://github.com/rustprooflabs/pgosm-flex/edit/main/docs/{path}"
1515

16-
[preprocessor.variables.variables]
17-
pgosm_flex_version = "0.10.0"
16+
#[preprocessor.variables.variables]
17+
#pgosm_flex_version = "0.10.0"

docs/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@
1313
- [Layersets](./layersets.md)
1414
- [Indexes](./custom-indexes.md)
1515
- [Configure Postgres](./configure-postgres.md)
16+
- [Data Files](./data-files.md)
1617
- [Query examples](./query.md)
1718
- [Routing](./routing.md)
1819
- [Processing Time](./performance.md)

docs/src/common-customization.md

Lines changed: 2 additions & 40 deletions
@@ -27,6 +27,8 @@ exactly what `--region` and `--subregion` options to choose.
2727
This can be a bit confusing as larger subregions can contain smaller subregions.
2828
Feel free to [start a discussion](https://github.com/rustprooflabs/pgosm-flex/discussions/new/choose) if you need help figuring this part out!
2929

30+
> See the [Data Files](data-files.md) section for steps to change this behavior.
31+
3032
If you want to load the entire United States subregion, instead of
3133
the District of Columbia subregion, the `docker exec` command is changed to the
3234
following.
@@ -48,46 +50,6 @@ docker exec -it \
4850
--region=north-america
4951
```
5052

51-
## Specific input file
52-
53-
The automatic Geofabrik download can be overridden by providing PgOSM Flex
54-
with the path to a valid `.osm.pbf` file using `--input-file`.
55-
This option overrides the default file handling, archiving, and MD5
56-
checksum validation. With `--input-file` you can use a custom `osm.pbf`
57-
you created, or use it to simply remove the need for an internet connection
58-
from the instance running the processing.
59-
60-
> Note: The `--region` option is always required, the `--subregion` option can be used with `--input-file` to put the information in the `subregion` column of `osm.pgosm_flex`.
61-
62-
63-
### Small area / custom extract
64-
65-
Some of the smallest subregions provided by Geofabrik are quite large compared
66-
to the area of interest for a project.
67-
The `osmium` tool makes it quick and easy to
68-
[extract a bounding box](https://docs.osmcode.org/osmium/latest/osmium-extract.html).
69-
The following example extracts an area roughly around Denver, Colorado.
70-
It takes about 3 seconds to extract the 3.2 MB `denver.osm.pbf` output from
71-
the 239 MB input.
72-
73-
```bash
74-
osmium extract --bbox=-105.0193,39.7663,-104.9687,39.7323 \
75-
-o denver.osm.pbf \
76-
colorado-2023-04-18.osm.pbf
77-
```
78-
79-
The PgOSM Flex procesing time for the smaller Denver region takes less than 20 seconds on a
80-
typical laptop, versus 11 minutes for all of Colorado.
81-
82-
```bash
83-
docker exec -it \
84-
pgosm python3 docker/pgosm_flex.py \
85-
--ram=8 \
86-
--region=custom \
87-
--subregion=denver \
88-
--input-file=denver.osm.pbf \
89-
--layerset=everything
90-
```
9153

9254
## Customize load to PostGIS
9355

docs/src/customizations.md

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@
55
- [Layersets](./layersets.md)
66
- [Layers](./layers.md)
77
- [Configure Postgres](./configure-postgres.md)
8+
- [Data Files](./data-files.md)

docs/src/data-files.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
# Data Files
2+
3+
PgOSM Flex will automatically manage downloads of the appropriate data and `.md5`
4+
files from the [Geofabrik download server](https://download.geofabrik.de/).
5+
When using the default behavior, PgOSM Flex will automatically start downloading
6+
the two necessary files:
7+
8+
* `<region/subregion>-latest.osm.pbf`
9+
* `<region/subregion>-latest.osm.pbf.md5`
10+
11+
The data path on the host machine is defined via the `docker run` command. This
12+
documentation always uses `~/pgosm-data` per the [quick start](quick-start.md).
13+
14+
```bash
15+
docker run --name pgosm -d --rm \
16+
-v ~/pgosm-data:/app/output \
17+
...
18+
```
19+
20+
> See the [Selecting Region and Sub-region](common-customization.md#selecting-region-and-subregion)
21+
> section for more about the default behavior.
22+
23+
24+
25+
There are two methods to override this default behavior: specify `--pgosm-date`
26+
or use `--input-file`.
27+
If you have manually saved files in the path used by PgOSM Flex using `-latest`
28+
in the filename, they **will be overwritten** if you are not using one of the
29+
methods described below.
30+
31+
32+
## Specific date with `--pgosm-date`
33+
34+
Use `--pgosm-date` to specify a specific date for the data. The date specified
35+
must be in `yyyy-mm-dd` format.
36+
This mode requires you have a valid `.pbf` and matching `.md5` file in order to
37+
function. The following example shows the `docker exec` command along with
38+
a `--pgosm-date` defined.
39+
40+
```bash
41+
docker exec -it \
42+
pgosm python3 docker/pgosm_flex.py \
43+
--ram=8 \
44+
--region=north-america/us \
45+
--subregion=district-of-columbia \
46+
--pgosm-date=2024-05-14
47+
```
48+
49+
The output from running should confirm it finds and uses the file with the
50+
specified date.
51+
Remember, the paths reported from Docker (`/app/output/`) report the
52+
container-internal path, not your local path on the host.
53+
54+
```bash
55+
INFO:pgosm-flex:geofabrik:PBF File exists /app/output/district-of-columbia-2024-05-14.osm.pbf
56+
INFO:pgosm-flex:geofabrik:PBF & MD5 files exist. Download not needed
57+
INFO:pgosm-flex:geofabrik:Copying Archived files
58+
INFO:pgosm-flex:pgosm_flex:Running osm2pgsql
59+
```
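The container-internal paths in the log can be translated back to host paths with a small helper. This is only a sketch: it assumes the `-v ~/pgosm-data:/app/output` volume mapping used throughout this documentation, and `host_path` is a hypothetical helper name, not part of PgOSM Flex.

```python
from pathlib import Path, PurePosixPath

# Container-side mount point from the `docker run -v ~/pgosm-data:/app/output` example.
CONTAINER_DIR = PurePosixPath("/app/output")


def host_path(container_path: str, host_dir: Path) -> Path:
    """Map a path reported inside the container to the matching host path.

    Raises ValueError if the path is not under /app/output.
    """
    relative = PurePosixPath(container_path).relative_to(CONTAINER_DIR)
    return host_dir.joinpath(*relative.parts)


if __name__ == "__main__":
    reported = "/app/output/district-of-columbia-2024-05-14.osm.pbf"
    print(host_path(reported, Path.home() / "pgosm-data"))
```

If your `docker run` command mounts a different host directory, pass that directory as `host_dir` instead.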
60+
61+
If a date is specified without matching file(s) it will raise an error and exit.
62+
63+
```bash
64+
ERROR:pgosm-flex:geofabrik:Missing PBF file for 2024-05-15. Cannot proceed.
65+
```
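To avoid this error, you can check for the dated files before running `docker exec`. The sketch below is not part of PgOSM Flex; the function names are hypothetical. The `.osm.pbf` name follows the log output shown above, while the `.osm.pbf.md5` name is an assumption based on the `-latest` naming pattern.

```python
import re
from datetime import date
from pathlib import Path

# Strict yyyy-mm-dd shape, as required by --pgosm-date.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def dated_file_names(subregion: str, pgosm_date: str) -> tuple:
    """Return the (pbf, md5) file names expected for a subregion and date."""
    if not DATE_PATTERN.fullmatch(pgosm_date):
        raise ValueError(f"Date must be yyyy-mm-dd, got {pgosm_date!r}")
    date.fromisoformat(pgosm_date)  # raises ValueError for impossible dates
    pbf_name = f"{subregion}-{pgosm_date}.osm.pbf"
    # Assumption: the MD5 file is the PBF name with `.md5` appended.
    return pbf_name, f"{pbf_name}.md5"


def missing_files(data_dir: Path, subregion: str, pgosm_date: str) -> list:
    """List the expected file names that are not present in data_dir."""
    names = dated_file_names(subregion, pgosm_date)
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    data_dir = Path.home() / "pgosm-data"  # host path from the quick start
    missing = missing_files(data_dir, "district-of-columbia", "2024-05-14")
    print("Missing files:", missing if missing else "none")
```

The check runs against the host directory (`~/pgosm-data`), which is mounted into the container as `/app/output`.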
66+
67+
68+
## Specific input file with `--input-file`
69+
70+
The automatic Geofabrik download can be overridden by providing PgOSM Flex
71+
with the path to a valid `.osm.pbf` file using `--input-file`.
72+
This option overrides the default file handling, archiving, and MD5
73+
checksum validation. With `--input-file` you can use a custom `osm.pbf`
74+
you created, or use it to simply remove the need for an internet connection
75+
from the instance running the processing.
76+
77+
> Note: The `--region` option is always required, the `--subregion` option can be used with `--input-file` to put the information in the `subregion` column of `osm.pgosm_flex`.
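Because `--input-file` skips the MD5 checksum validation, you may want to verify a downloaded file yourself before using it. The following is a minimal sketch, not PgOSM Flex code: the function names are hypothetical, and it assumes the `.md5` file uses the common `md5sum` layout with the hex digest as the first whitespace-separated token.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(pbf_path: Path, md5_path: Path) -> bool:
    """Compare a PBF file against the digest stored in its .md5 file."""
    # Assumption: first token in the .md5 file is the hex digest.
    expected = md5_path.read_text().split()[0].lower()
    return md5_of_file(pbf_path) == expected
```

Reading in chunks keeps memory use low for large `.osm.pbf` files. A custom extract you created with `osmium` has no published checksum, so this check only applies to files that came with a matching `.md5` file.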
78+
79+
80+
### Small area / custom extract
81+
82+
Some of the smallest subregions provided by Geofabrik are quite large compared
83+
to the area of interest for a project.
84+
The `osmium` tool makes it quick and easy to
85+
[extract a bounding box](https://docs.osmcode.org/osmium/latest/osmium-extract.html).
86+
The following example extracts an area roughly around Denver, Colorado.
87+
It takes about 3 seconds to extract the 3.2 MB `denver.osm.pbf` output from
88+
the 239 MB input.
89+
90+
```bash
91+
osmium extract --bbox=-105.0193,39.7663,-104.9687,39.7323 \
92+
-o denver.osm.pbf \
93+
colorado-2023-04-18.osm.pbf
94+
```
95+
96+
The PgOSM Flex processing time for the smaller Denver region takes less than 20 seconds on a
97+
typical laptop, versus 11 minutes for all of Colorado.
98+
99+
```bash
100+
docker exec -it \
101+
pgosm python3 docker/pgosm_flex.py \
102+
--ram=8 \
103+
--region=custom \
104+
--subregion=denver \
105+
--input-file=denver.osm.pbf \
106+
--layerset=everything
107+
```
