Foursquare Open Source Places: A new foundational dataset

3 points, posted a day ago
by jjwiseman

2 Comments

jjwiseman

a day ago

This data is about 10 GB, available on S3, and is really easy to query from DuckDB. If you have fast internet, you can even run a trial query without downloading anything:

  SELECT
      COUNT(*)
  FROM
      's3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/places-*.snappy.parquet'
  WHERE
      name = 'Wendy''s';
This takes less than a minute with fast home internet.

Curious what parts of the planet have the highest density of fast food restaurants? Download the files so you have local copies, then make sure you have the H3 extension installed:

  INSTALL h3 FROM community; LOAD h3;
Now:

  SELECT
      printf('%x', cell) AS h3,
      COUNT(*) AS count
  FROM (
      SELECT
          name,
          H3_LATLNG_TO_CELL(latitude, longitude, 4) AS cell
      FROM 'places-*.snappy.parquet'
      WHERE array_contains(fsq_category_labels, 'Dining and Drinking > Restaurant > Fast Food Restaurant')
  )
  GROUP BY
      cell
  ORDER BY
      count DESC;
The top H3 cell will have ID 841ec91ffffffff. Paste it into https://observablehq.com/@nrabinowitz/h3-index-inspector and you'll see that it's… Istanbul! Write another query to find out what the most common fast food places are in Istanbul… (Burger King and McDonald's, and it's not even close).
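
A rough sketch of that follow-up query, reusing only the functions from the query above. It assumes the local parquet files and the loaded h3 extension from before, and it groups on the raw name column, so spelling variants of the same chain are counted separately. The LIMIT of 10 is an arbitrary choice, and note that a resolution-4 cell covers a wider area than the city itself:

```sql
-- Most common fast food names inside the top resolution-4 H3 cell.
-- The cell ID is the one reported by the density query above.
SELECT
    name,
    COUNT(*) AS count
FROM 'places-*.snappy.parquet'
WHERE array_contains(fsq_category_labels, 'Dining and Drinking > Restaurant > Fast Food Restaurant')
  -- Same hex formatting as the density query, compared against the top cell.
  AND printf('%x', H3_LATLNG_TO_CELL(latitude, longitude, 4)) = '841ec91ffffffff'
GROUP BY
    name
ORDER BY
    count DESC
LIMIT 10;
```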

Install the spatial extension and you can find all venues within 100 meters of you. Find all the Taco Bell/Pizza Hut combos. Go crazy.
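
If you want to try the 100-meter search before installing anything, here is a sketch that swaps the spatial extension for the haversine formula written in plain SQL. The `me` CTE and its coordinates (a point in central Istanbul) are made up for illustration, and the 6371000 m Earth radius is a spherical approximation, so treat the distances as approximate. It also scans every file, so expect it to be slow:

```sql
-- Venues within 100 meters of a point, via the haversine formula.
-- `me` is a hypothetical helper holding an example location; substitute your own.
WITH me AS (
    SELECT 41.0082 AS lat, 28.9784 AS lon
)
SELECT
    name,
    distance_m
FROM (
    SELECT
        p.name,
        -- d = 2 * R * asin(sqrt(a)), with R = 6371000 meters (assumed mean radius)
        2 * 6371000 * asin(sqrt(
            pow(sin(radians(p.latitude - me.lat) / 2), 2)
            + cos(radians(me.lat)) * cos(radians(p.latitude))
              * pow(sin(radians(p.longitude - me.lon) / 2), 2)
        )) AS distance_m
    FROM 'places-*.snappy.parquet' AS p, me
)
WHERE distance_m < 100
ORDER BY distance_m;
```

Rows with a missing latitude or longitude produce a NULL distance and drop out of the WHERE clause on their own.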

[Also see https://simonwillison.net/2024/Nov/20/foursquare-open-source...]
