Aerial Bold Data


In November 2014, Benedikt and Joey began their planetary search for letterforms. Aerial Bold is the first global database of letterforms sourced both from a human crowd and from an algorithm. For each letterform, the database records its classification (e.g. "A", "B", "C", etc.) along with other metadata: the feature's location (e.g. city, country), width, height, rotation angle from north, and readability and beauty ratings.

The database is structured as a GeoJSON file, a web-friendly and open spatial data format, containing tens of thousands of locations and descriptive letterform metadata gathered through the Letter Hunt campaign and the letterform machine learning algorithm.
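As a rough illustration of working with such a file, the sketch below parses a GeoJSON document and tallies features by a property. It assumes the file is a standard FeatureCollection with a top-level "features" array, which this page does not confirm; the tiny inline SAMPLE and the helper name count_by_property are hypothetical stand-ins, not part of the released data.

```python
import json
from collections import Counter

# Hypothetical stand-in for the real file; the values are illustrative only.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "geometry": null,
     "properties": {"letter": "E", "type": "manual"}},
    {"type": "Feature", "geometry": null,
     "properties": {"letter": "O", "type": "automated"}},
    {"type": "Feature", "geometry": null,
     "properties": {"letter": "E", "type": "automated"}}
  ]
}
"""


def count_by_property(collection, key):
    """Tally the features of a FeatureCollection by one property value."""
    return Counter(
        (feature.get("properties") or {}).get(key)
        for feature in collection.get("features", [])
    )


collection = json.loads(SAMPLE)

# "type" separates manually found features from algorithmically found ones.
by_source = count_by_property(collection, "type")
by_letter = count_by_property(collection, "letter")

print(by_source)
print(by_letter)
```

For the real file, json.load on an open file handle would replace json.loads(SAMPLE); the tallying step stays the same.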

The letterform database is not to be confused with the Aerial Bold Font, which translates the archetypal letterforms found in the database into an OpenType vector font family.

Data Providers


The creation of the Aerial Bold database was made possible by the generous contribution of aerial imagery from Mapbox and data provided by the OpenStreetMap community and the United States Geological Survey (USGS). Without the support of Mapbox and these communities and organizations, the Aerial Bold Project would not have been possible.

We used open source technologies for generating and processing the data. Namely, we relied on Turf.js, Leaflet.js, Node.js / Express.js, MongoDB, QGIS, Python, Theano, scikit-learn, Fiona, Shapely, and GDAL.

Data Sample


The image shows the GeoJSON data (the orange bounding box) overlaid onto aerial imagery. The properties of the data are shown on the right of the image.



Structure

  • letter: the alphanumeric letterform type.
  • zoom: the zoom level at which the feature was detected.
  • type: how the feature was found, either manual or automated.
  • authorName: the name of the person who found the feature (if any).
  • authorURL: the URL of the person who found the feature (if any).
  • angle: the angle from true north.
  • width: the length of the shortest side, in meters.
  • height: the length of the longest side, in meters.
  • modified: the date the feature was modified (if any).
  • country: the country in which the feature lives.
  • countryCode: the abbreviation of the country in which the feature lives.
  • subregion: the subregion in which the feature lives (if any).
  • continent: the continent in which the feature lives.

{
    "_id": "54c90f1ca7bd530000efba11",
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [-77.2291, 38.7280],
                [-77.2265, 38.7272],
                [-77.2280, 38.7243],
                [-77.2306, 38.7251],
                [-77.2291, 38.7280]
            ]
        ]
    },
    "properties": {
        "letter": "E",
        "zoom": 16,
        "type": "manual",
        "authorName": "J. Lee & B. Groß",
        "authorURL": "http://aerial-bold.com",
        "angle": 21.30628543570892,
        "width": 130.7849676099804,
        "height": 184.4465210270137,
        "inBing": false,
        "inMapbox": true,
        "beautiful": 3,
        "readable": 1,
        "modified": "2015-05-03T00:56:08.858Z",
        "country": "United States",
        "countryCode": "USA",
        "subregion": "Northern America",
        "continent": "North America"
    },
    "type": "Feature"
}
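A record like the one above can be sanity-checked against the documented fields. The following is a minimal sketch, not the project's own tooling: the helper check_feature and the EXPECTED_PROPERTIES set are hypothetical, and the set only includes fields from the list above that are not marked "(if any)". The sample also carries inBing, inMapbox, beautiful, and readable, which the list does not describe, so the check ignores them.

```python
# Hypothetical set of required fields: those in the field list above
# that are not marked "(if any)".
EXPECTED_PROPERTIES = {
    "letter", "zoom", "type", "angle", "width", "height",
    "country", "countryCode", "continent",
}


def check_feature(feature):
    """Return a list of problems found in one feature (empty if none)."""
    problems = []
    if feature.get("type") != "Feature":
        problems.append("top-level type is not 'Feature'")

    geometry = feature.get("geometry") or {}
    if geometry.get("type") != "Polygon":
        problems.append("geometry is not a Polygon")
    else:
        rings = geometry.get("coordinates") or [[]]
        ring = rings[0]
        # A GeoJSON linear ring has at least 4 positions and is closed.
        if len(ring) < 4 or ring[0] != ring[-1]:
            problems.append("outer ring is not a closed linear ring")

    properties = feature.get("properties") or {}
    missing = sorted(EXPECTED_PROPERTIES - set(properties))
    if missing:
        problems.append("missing properties: " + ", ".join(missing))

    # The field list defines width as the shortest side, height as the longest.
    if "width" in properties and "height" in properties:
        if properties["width"] > properties["height"]:
            problems.append("width exceeds height")
    return problems


# Abbreviated copy of the sample record shown above.
sample = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-77.2291, 38.7280], [-77.2265, 38.7272], [-77.2280, 38.7243],
            [-77.2306, 38.7251], [-77.2291, 38.7280],
        ]],
    },
    "properties": {
        "letter": "E", "zoom": 16, "type": "manual",
        "angle": 21.30628543570892,
        "width": 130.7849676099804, "height": 184.4465210270137,
        "country": "United States", "countryCode": "USA",
        "continent": "North America",
    },
}

print(check_feature(sample))
```

Returning a list of problems rather than raising on the first one makes it easier to review a whole file of tens of thousands of features in a single pass.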

        

State of the Data


When we started our Kickstarter, we set out to map the entire world. Our vision was to develop an algorithm that could automagically find all of Earth's letterforms visible in the available aerial imagery, and we succeeded... except there is more work to be done. What our project revealed is that although an algorithm can find and classify the position and type of a letterform in an aerial image, manual effort is still needed to draw the letterform's bounding box and to rate the letterform on its beauty and readability, properties that only a human can attribute.

The current state of the Aerial Bold dataset (2016-02-08) covers 22 different countries; most of the results come from the USA, France, Germany, the UK, Australia, Spain, Canada, Korea, the Netherlands, New Zealand, Portugal, and Switzerland. We'd love to see more diversity in the countries and landscapes as well as in the letterforms. We now have all of the tools to continue the planetary search for letterforms, but covering the entire world will be a process that we hope to continue as time goes on.

Data Release / Licensing


The Data will be released under a TBA license to:

  • Kickstarter Backers: TBA
  • Publicly: TBA