Reducing the size of GeoJSON files with geojson-shave
Giant GeoJSON files can be a nightmare, crashing your IDE, GIS software or browser (and potentially causing you to tear your hair out in frustration!).
I use GeoJSON files quite often so I decided to create a command-line tool that reduces the size of GeoJSON files.
You can view the project homepage here.
You can install the tool using pip:
$ pip install geojson-shave
Usage
geojson-shave reduces the size of GeoJSON files by truncating latitude/longitude coordinates to the specified number of decimal places, eliminating unnecessary whitespace and (optionally) replacing the value of each properties key with null or an empty dictionary.
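To give a feel for where the savings come from, here is a minimal sketch using only Python's standard json module. It is not the tool's actual code: the sample feature and the helper name serialized_size are made up for illustration, and it simply compares the serialized length of the same feature before and after each kind of reduction.

```python
import json

# Hypothetical sample feature, used only for illustration.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [100.123456789, 0.987654321]},
    "properties": {"name": "Example road", "lanes": 2},
}


def serialized_size(obj, **dumps_kwargs):
    """Return the number of characters in the JSON serialization of obj."""
    return len(json.dumps(obj, **dumps_kwargs))


# Pretty-printed output, with indentation and newlines.
pretty = serialized_size(feature, indent=4)

# The same data with all optional whitespace removed.
compact = serialized_size(feature, separators=(",", ":"))

# A copy with shorter coordinates and an emptied properties value.
shaved = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [round(c, 5) for c in feature["geometry"]["coordinates"]],
    },
    "properties": {},
}
smallest = serialized_size(shaved, separators=(",", ":"))

print(pretty, compact, smallest)
```

Each step should print a smaller number than the one before it. On a real file with thousands of coordinate pairs, the coordinate digits usually account for most of the bytes.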
Simply pass the file path of your GeoJSON file and it will truncate the coordinates to 5 decimal places, outputting to the current working directory:
$ geojson-shave roads.geojson
Alternatively, you can specify the number of decimal places you want the coordinates truncated to:
$ geojson-shave roads.geojson -d 3
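If you are unsure how many decimal places to keep, a rough back-of-the-envelope calculation can help. The sketch below assumes one degree of latitude is about 111,320 metres, a common rule-of-thumb value; the constant and the function name are my own and not part of geojson-shave.

```python
# Approximate length of one degree of latitude in metres (rule-of-thumb value).
METRES_PER_DEGREE = 111_320


def resolution_in_metres(decimal_places):
    """Approximate ground distance covered by one unit in the last decimal place."""
    return METRES_PER_DEGREE * 10 ** -decimal_places


for places in (3, 5, 7):
    print(places, "decimal places is roughly", resolution_in_metres(places), "metres")
```

By this estimate, 5 decimal places is on the order of a metre and 3 decimal places is on the order of a hundred metres. A degree of longitude covers less ground the further you are from the equator, so treat these as upper bounds for longitude.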
You can also specify if you only want certain Geometry object types in the file to be processed:
$ geojson-shave roads.geojson -g LineString Polygon
Note that the -g option doesn't apply to objects nested within a GeometryCollection.
And to reduce the file size even further, you can nullify the properties value of Feature objects:
$ geojson-shave roads.geojson -p
You can also output to a location other than the current working directory:
$ geojson-shave roads.geojson -o ../data/output.geojson
How I did it
To fully understand how the command-line tool works you can read the source code, but to truncate coordinates I used a recursive function:
def _create_coordinates(coordinates, precision):
    """Create truncated coordinates."""
    new_coordinates = []
    for item in coordinates:
        if isinstance(item, list):
            # Nested list: recurse until we reach the numbers themselves.
            new_coordinates.append(_create_coordinates(item, precision))
        else:
            # A single number: shorten it to the requested precision.
            item = round(item, precision)
            new_coordinates.append(float(item))
    return new_coordinates
Because there are different types of GeoJSON Geometry objects with varying levels of nested coordinates, I had to use recursion, a technique I hadn't used before but had fun employing.
For example, you can see the difference between the coordinates of a Point object and a Polygon object:
{
    "type": "Point",
    "coordinates": [100.0, 0.0]
},
{
    "type": "Polygon",
    "coordinates": [
        [
            [100.0, 0.0],
            [101.0, 0.0],
            [101.0, 1.0],
            [100.0, 1.0],
            [100.0, 0.0]
        ],
        [
            [100.8, 0.8],
            [100.8, 0.2],
            [100.2, 0.2],
            [100.2, 0.8],
            [100.8, 0.8]
        ]
    ]
}
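As a quick check of the recursive approach, the sketch below runs the function from this post (repeated in a slightly condensed form so the block is self-contained) on Point-shaped and Polygon-shaped coordinates. The sample values are made up for illustration.

```python
def _create_coordinates(coordinates, precision):
    """Shorten every number in a (possibly nested) coordinate list."""
    new_coordinates = []
    for item in coordinates:
        if isinstance(item, list):
            # Nested list: recurse one level deeper.
            new_coordinates.append(_create_coordinates(item, precision))
        else:
            new_coordinates.append(float(round(item, precision)))
    return new_coordinates


# A Point has one flat pair; a Polygon nests pairs inside rings.
point = [100.123456789, 0.987654321]
polygon = [[[100.123456789, 0.0], [101.0, 0.55555555], [100.123456789, 0.0]]]

short_point = _create_coordinates(point, 3)
short_polygon = _create_coordinates(polygon, 3)

print(short_point)    # [100.123, 0.988]
print(short_polygon)  # [[[100.123, 0.0], [101.0, 0.556], [100.123, 0.0]]]
```

The same function handles both shapes without knowing the geometry type, and the nesting of the output matches the input. Note that round() rounds to the nearest value rather than cutting digits off, which is why 0.987654321 becomes 0.988 at three decimal places.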