Research Datasets

Cellphone-based Population Flow Data

Data Description

The cellphone-based population flows were extracted from SafeGraph data by the Geoinformation and Big Data Research Lab at the Center for GIScience and Geospatial Big Data (CeGIS) in collaboration with BDHSC for academic research purposes.

SafeGraph is a data company that aggregates anonymized location data from mobile devices to provide insights into physical places. The data are aggregated from a panel of Global Positioning System (GPS) points from about 10% of anonymous mobile devices (45 million) and record the foot traffic, or visitation patterns, of over 5 million places or points of interest (POIs) in the United States (US). Types of places include restaurants, parks, schools, and healthcare facilities. The home location of each device is determined at the census block group (CBG) level as the common nighttime location of that device over a six-week period [1]. The recorded number of mobile devices correlates strongly with US Census population counts, with a Pearson correlation coefficient of 0.97 at the county level [2].
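
The home-location rule described above can be illustrated with a minimal sketch. This is not SafeGraph's implementation: the nighttime window (18:00 to 07:00), the function names, and the CBG identifiers are assumptions chosen for illustration, and the input is assumed to already cover the six-week observation period.

```python
from collections import Counter
from datetime import datetime


def is_nighttime(ts, start_hour=18, end_hour=7):
    """Return True if ts falls in the assumed nighttime window (18:00-07:00)."""
    return ts.hour >= start_hour or ts.hour < end_hour


def infer_home_cbg(pings):
    """Pick the CBG seen most often at night for one device.

    pings: iterable of (datetime, cbg_id) pairs for a single device,
    assumed to span the six-week observation period.
    Returns None when the device has no nighttime observations.
    """
    night_counts = Counter(cbg for ts, cbg in pings if is_nighttime(ts))
    if not night_counts:
        return None
    return night_counts.most_common(1)[0][0]


# Illustrative pings with made-up CBG identifiers.
pings = [
    (datetime(2020, 3, 2, 23, 15), "450790026021"),  # night
    (datetime(2020, 3, 3, 2, 40), "450790026021"),   # night
    (datetime(2020, 3, 3, 13, 5), "450790028003"),   # day
    (datetime(2020, 3, 3, 14, 5), "450790028003"),   # day
    (datetime(2020, 3, 3, 15, 5), "450790028003"),   # day
    (datetime(2020, 3, 3, 22, 30), "450790111012"),  # night
]
home_cbg = infer_home_cbg(pings)
```

In this toy example the daytime CBG has the most pings overall, but only nighttime pings are counted, so the device is assigned to the CBG where it was seen twice at night.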

This dataset contains monthly and weekly population flows originating from over 230,000 CBGs (each device's home CBG) to over 5 million POIs in the US from 01/01/2018 to 08/30/2022. These population flows are called "Origin-Destination-Time (ODT)" flows because each flow is a visitation record (number of visitors) from a CBG (Origin) to a POI (Destination) during a specific period (Time). In total, the dataset has 9.5 billion ODT flows. The flows can be requested in two formats: 1) ODT flows filtered by time (year, month, week) and geographic location, and 2) ODT flows aggregated spatially and/or temporally. We also developed an interactive geospatial web portal, called ODT Flow Explorer [3], that allows researchers to query and visualize the spatially aggregated human mobility flow data at various geographic levels.
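
The two request formats can be sketched as follows. This is a hypothetical illustration rather than the dataset's actual schema or API: the ODTFlow record, its field names, the POI identifiers, and the sample values are all assumptions. The only external fact relied on is that a CBG GEOID has 12 digits, of which the first two are the state FIPS code and the first five the county FIPS code.

```python
from collections import defaultdict
from typing import NamedTuple


class ODTFlow(NamedTuple):
    """One hypothetical monthly ODT record (field names are assumptions)."""
    origin_cbg: str  # 12-digit census block group GEOID (Origin)
    poi_id: str      # made-up POI identifier (Destination)
    year: int        # Time: year
    month: int       # Time: month
    visitors: int    # number of visitors from origin_cbg to poi_id


def filter_flows(flows, year=None, month=None, state_fips=None):
    """Format 1: keep flows matching the given time and location filters."""
    return [
        f for f in flows
        if (year is None or f.year == year)
        and (month is None or f.month == month)
        and (state_fips is None or f.origin_cbg.startswith(state_fips))
    ]


def aggregate_to_county_year(flows):
    """Format 2: roll origins up from CBG to county and months up to years."""
    totals = defaultdict(int)
    for f in flows:
        county_fips = f.origin_cbg[:5]  # state (2 digits) + county (3 digits)
        totals[(county_fips, f.poi_id, f.year)] += f.visitors
    return dict(totals)


# Made-up sample records for illustration only.
flows = [
    ODTFlow("450790026021", "poi-park", 2020, 3, 12),
    ODTFlow("450790028003", "poi-park", 2020, 3, 8),
    ODTFlow("450790026021", "poi-park", 2020, 4, 5),
    ODTFlow("450630205051", "poi-park", 2020, 3, 4),
    ODTFlow("370210001001", "poi-cafe", 2021, 1, 7),
]

sc_2020 = filter_flows(flows, year=2020, state_fips="45")
county_totals = aggregate_to_county_year(sc_2020)
```

Note that this sketch simply sums visitor counts across months, so a device that visits in two different months contributes to both; whether that is appropriate depends on the research question.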

Applications (sample published work): This population flow data can help answer questions such as the following in different application scenarios: How often do people in a neighborhood visit a place (e.g., a park, restaurant, or HIV testing facility)? How long do they stay? Which neighborhoods do they come from? Studies have used such data to examine socioeconomic disparities in travel behavior [4,5], food consumption and health awareness [6,7], restaurant visitation dynamics [8,9], the spread of COVID-19 [10,11], disparities in social distancing [12,13], and park visitation [14-16], among other topics.

References

1. SafeGraph. Social Distancing Metrics. https://docs.safegraph.com/docs/social-distancing-metrics
2. Squire, R. (2019, October 17). What About Bias in the SafeGraph Dataset? SafeGraph. https://www.safegraph.com/blog/what-about-bias-in-the-safegraph-dataset
3. Li, Z., Huang, X., Hu, T., Ning, H., Ye, X., Huang, B., & Li, X. (2021). ODT FLOW: Extracting, analyzing, and sharing multi-source multi-scale human mobility. PLOS ONE, 16(8), e0255259.
4. Brough, R., Freedman, M., & Phillips, D. C. (2021). Understanding socioeconomic disparities in travel behavior during the COVID‐19 pandemic. Journal of Regional Science, 61(4), 753-774.
5. Lamb, M. R., Kandula, S., & Shaman, J. (2021). Differential COVID‐19 case positivity in New York City neighborhoods: Socioeconomic factors and mobility. Influenza and Other Respiratory Viruses, 15(2), 209-217.
6. Zhao, H., & Banerjee, T. (2022). Unhealthy eating and health awareness: evidence from restaurant visitors using consumers’ cell phone geo-location data. Applied Economics Letters, 1-5.
7. Zhou, H., Kim, G., Wang, J., & Wilson, K. (2022). Investigating the association between the socioeconomic environment of the service area and fast food visitation: A context-based crystal growth approach. Health & Place, 76, 102855.
8. Wang, S., Huang, X., She, B., & Li, Z. (2022). Diverged landscape of restaurant recovery: the effect of COVID-19 on the restaurant industry in the United States. Available at SSRN 4225156.
9. Huang, X., Bao, X., Li, Z., Zhang, S., & Zhao, B. (2022). Black Businesses Matter: A Longitudinal Study of Black-Owned Restaurants in the COVID-19 Pandemic Using Geospatial Big Data. Annals of the American Association of Geographers, 1-17.
10. Zeng, C., Zhang, J., Li, Z., Sun, X., Olatosi, B., Weissman, S., & Li, X. (2021). Spatial-temporal relationship between population mobility and COVID-19 outbreaks in South Carolina: A time series forecasting analysis. Journal of Medical Internet Research. https://doi.org/10.2196/27045
11. Ning, H., Li, Z., Qiao, S., Zeng, C., Zhang, J., Olatosi, B., & Li, X. (2022). Revealing geographic transmission pattern of COVID-19 using neighborhood-level simulation with human mobility data and SEIR model: A Case Study of South Carolina. medRxiv.
12. Huang, X., Li, Z., Jiang, Y., Ye, X., Deng, C., Zhang, J., & Li, X. (2021). The characteristics of multi-source mobility datasets and how they reveal the luxury nature of social distancing in the U.S. International Journal of Digital Earth. https://doi.org/10.1080/17538947.2021.1886358
13. Huang, X., Li, Z., Lu, J., Wang, S., Wei, H., & Chen, B. (2020). Time-series clustering for home dwell time during COVID-19: What can we learn from it? ISPRS International Journal of Geo-Information. https://doi.org/10.3390/ijgi9110675
14. Kupfer, J. A., Li, Z., Ning, H., & Huang, X. (2021). Using mobile device data to track the effects of the COVID-19 Pandemic on spatiotemporal patterns of national park visitation. Sustainability, 13(16), 9366.
15. Wei, H., Huang, X., Wang, S., Lu, J., Li, Z., & Zhu, L. (2022). A data-driven investigation on park visitation and income mixing of visitors in New York City. Environment and Planning B: Urban Analytics and City Science. https://doi.org/10.1177/23998083221130708
16. Liang, Y., Yin, J., Pan, B., Lin, M., & Chi, G. (2021). Assessing the validity of SafeGraph data for visitor monitoring in Yellowstone National Park.

Mode of Access

Remote online access

Level of Access

USC Researchers