NYC Subway Turnstile Counts Data aggregated by day and station complex for the year 2020. Updated weekly.
nyc-transit-data/turnstile_daily_counts_2020
NYC Subway Turnstile Counts Data aggregated by day and station complex for the year 2020. Updated weekly, usually on Monday mornings. (The pipeline is mostly automated but must be triggered manually.)
Where the Data Came From
This aggregation was created from weekly raw turnstile counts published by the New York MTA at http://web.mta.info/developers/turnstile.html
The raw data were imported into a PostgreSQL database for processing, and aggregated to calendar days for each station complex.
The process is outlined in this blog post, and the code for the data pipeline is available on GitHub.
Caveats
This aggregation is a best effort to make a clean and usable dataset of station-level counts. Some assumptions and important decisions were made to arrive at the finished product.
The dataset excludes turnstile observation windows (4 hours) that resulted in entries or exits of over 10,000. This threshold excludes the obviously spurious numbers that come from the counters rolling over, but false readings that fall within the threshold may still be included.
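The threshold rule above can be sketched as a simple filter. This is a minimal illustration, not the pipeline's actual code: the function name `is_plausible` and the sample counts are hypothetical, and only the 10,000 cutoff comes from the text.

```python
THRESHOLD = 10_000  # cutoff stated in the caveat above


def is_plausible(entries: int, exits: int, threshold: int = THRESHOLD) -> bool:
    """Return True when neither count for a window exceeds the threshold."""
    return entries <= threshold and exits <= threshold


# Hypothetical per-window (entries, exits) counts, made up for illustration.
windows = [(120, 95), (3_500_000, 40), (800, 10_001), (10_000, 10_000)]

# Keep only the windows whose entries AND exits are within the threshold.
kept = [w for w in windows if is_plausible(*w)]
print(kept)
```

A window is dropped as a whole when either its entries or its exits exceed the cutoff, which matches the "entries or exits of over 10,000" wording.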
The turnstile counts were aggregated to calendar days using the end time of each 4-hour observation window, with the day boundary placed at 2am rather than midnight. An observation window that ends at 2am counts toward the same day, but a window ending between midnight and 1:59am counts toward the previous day.
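One way to reproduce the day assignment described in the example (2am counts for the same day, anything earlier counts for the previous day) is to shift the window's end time back by two hours before taking the date. The sketch below assumes that interpretation; `assigned_date` is a hypothetical helper, not a function from the pipeline.

```python
from datetime import date, datetime, timedelta

# Assumption: the day boundary sits at 2am, so subtract 2 hours before truncating.
DAY_OFFSET = timedelta(hours=2)


def assigned_date(window_end: datetime) -> date:
    """Return the calendar day a window is counted toward, given its end time."""
    return (window_end - DAY_OFFSET).date()


# A window ending at 2:00am stays on March 2.
print(assigned_date(datetime(2020, 3, 2, 2, 0)))
# A window ending at 1:59am is counted toward March 1.
print(assigned_date(datetime(2020, 3, 2, 1, 59)))
```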
The last date in the dataset contains a small number of entries and exits that will be aggregated into the next week's worth of data, and should not be used.
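When working with the data, the incomplete final day can be dropped before analysis. The sketch below is one possible approach; it assumes the `date` column holds ISO-formatted strings (`YYYY-MM-DD`), which the schema does not state, and `drop_last_date` and the sample rows are hypothetical.

```python
from datetime import date


def drop_last_date(rows):
    """Return the rows with every record on the latest date removed."""
    if not rows:
        return []
    # Assumption: dates are ISO strings such as "2020-03-02".
    last = max(date.fromisoformat(r["date"]) for r in rows)
    return [r for r in rows if date.fromisoformat(r["date"]) != last]


# Hypothetical rows, made up for illustration.
rows = [
    {"complex_id": "001", "date": "2020-03-05", "entries": 1000, "exits": 900},
    {"complex_id": "001", "date": "2020-03-06", "entries": 12, "exits": 7},
    {"complex_id": "002", "date": "2020-03-06", "entries": 3, "exits": 5},
]
complete = drop_last_date(rows)
print(complete)
```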
PATH and Roosevelt Island Tramway
The dataset also includes turnstile counts for the PATH train system and the Roosevelt Island Tramway.
Spurious Data in Early Versions
Versions prior to QmPkGqJ318gcok69Noj3gw3coby8FDrab3x1hBisFcU3Yq were built with a pipeline that had a major error, causing inaccurate numbers near the transition between weekly input files.
CSV Configuration
Schema
title | type | description |
---|---|---|
stop_name | string | the name associated with the station, based on lookup from nyc-transit-data/turnstile-station-list |
daytime_routes | string | the daytime routes associated with the station, based on lookup from nyc-transit-data/turnstile-station-list |
division | string | the subway division of the station, based on lookup from nyc-transit-data/turnstile-station-list |
line | string | the subway line of the station, based on lookup from nyc-transit-data/turnstile-station-list |
borough | string | the single-letter borough identifier for the station, based on lookup from nyc-transit-data/turnstile-station-list |
structure | string | the structure type of the station, based on lookup from nyc-transit-data/turnstile-station-list |
gtfs_longitude | number | the WGS84 longitude of the station, based on lookup from nyc-transit-data/turnstile-station-list |
gtfs_latitude | number | the WGS84 latitude of the station, based on lookup from nyc-transit-data/turnstile-station-list |
complex_id | string | the complex id. If not a complex, the station id |
date | string | the calendar date |
entries | integer | the sum of entries for all turnstile observation windows associated with this station/complex that ended on this calendar date |
exits | integer | the sum of exits for all turnstile observation windows associated with this station/complex that ended on this calendar date |
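
As a rough illustration of how the schema might be consumed, the sketch below reads rows with Python's standard `csv` module and sums entries and exits per `complex_id`. It assumes the CSV has a header row whose column names match the schema titles. The sample rows are invented for the example and are not real counts, and `totals_by_complex` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows following the schema above (not real data).
SAMPLE_CSV = """stop_name,daytime_routes,division,line,borough,structure,gtfs_longitude,gtfs_latitude,complex_id,date,entries,exits
Example Station A,1 2,IRT,Example Line,M,Subway,-73.99,40.75,001,2020-03-02,1000,900
Example Station A,1 2,IRT,Example Line,M,Subway,-73.99,40.75,001,2020-03-03,1100,950
Example Station B,A,IND,Example Line,Q,Subway,-73.95,40.68,002,2020-03-02,500,450
"""


def totals_by_complex(csv_file):
    """Sum the entries and exits columns for each complex_id."""
    totals = defaultdict(lambda: {"entries": 0, "exits": 0})
    for row in csv.DictReader(csv_file):
        # complex_id is kept as a string, as in the schema, so leading zeros survive.
        bucket = totals[row["complex_id"]]
        bucket["entries"] += int(row["entries"])
        bucket["exits"] += int(row["exits"])
    return dict(totals)


totals = totals_by_complex(io.StringIO(SAMPLE_CSV))
print(totals)
```

To run this against a downloaded file, pass an open file handle instead of the `io.StringIO` wrapper.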