How to understand the difference between raw_visitor_counts and the visitor totals from visitor_home_cbgs
For each park in Los Angeles, I calculated the sum of raw_visitor_counts. I also calculated the total number of visitors across all home CBGs listed in visitor_home_cbgs for each park. The two totals are highly correlated (Pearson correlation coefficient of 0.99), but the raw_visitor_counts total is much larger than the visitor_home_cbgs total.
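To make the comparison concrete, here is a minimal sketch of the two per-park totals. It assumes visitor_home_cbgs is stored as a JSON string that maps CBG IDs to visitor counts. The sample rows, the "park" key, and the totals_by_park helper are hypothetical and only illustrate the calculation.

```python
import json

# Hypothetical sample rows: one row per park per week.
# visitor_home_cbgs is assumed to be a JSON string of {cbg_id: visitor_count}.
rows = [
    {"park": "Park A", "raw_visitor_counts": 120,
     "visitor_home_cbgs": '{"060371011101": 40, "060371011102": 25}'},
    {"park": "Park A", "raw_visitor_counts": 80,
     "visitor_home_cbgs": '{"060371011101": 30}'},
    {"park": "Park B", "raw_visitor_counts": 50,
     "visitor_home_cbgs": '{"060371012201": 12, "060371012202": 9}'},
]


def totals_by_park(rows):
    """Return {park: (sum of raw_visitor_counts, sum of home-CBG counts)}."""
    totals = {}
    for row in rows:
        raw_total, cbg_total = totals.get(row["park"], (0, 0))
        # Treat a missing or empty field as "no home CBGs reported".
        cbg_counts = json.loads(row["visitor_home_cbgs"] or "{}")
        totals[row["park"]] = (
            raw_total + row["raw_visitor_counts"],
            cbg_total + sum(cbg_counts.values()),
        )
    return totals


totals = totals_by_park(rows)
for park, (raw_total, cbg_total) in totals.items():
    # Share of raw visitors that appear in visitor_home_cbgs.
    print(park, raw_total, cbg_total, cbg_total / raw_total)
```

If the share printed in the last column differs between hot and cool weeks, the two series could trend in opposite directions even though their overall correlation is high.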
I understand that visitor_home_cbgs does not include all visitors, but my analysis gives two opposite conclusions depending on which field I use. With raw_visitor_counts, the number of park visitors decreases during hot weather. With visitor_home_cbgs, the number of park visitors increases during hot weather. How can I explain this in terms of the differences in how the two fields are collected?
Also, does the number of visitors in 'visitor_home_cbgs' refer to the number of unique visitors for the entire week?