P. Wang, C.-Y. Chan, A. Muralidharan, S. Fang
Pages: 65-76
Abstract
The appropriate choice of sample size plays a key role in any statistical study involving data. This is particularly evident in traffic flow analysis, since general traffic patterns can be observed meaningfully only if a sufficient amount of data is included in the analysis. A proper sample size not only supports the validity of the analysis but also improves work efficiency by saving considerable resources in data collection. In this paper, we describe our study of traffic data at intersections on arterial roads, which determines the minimum sample size for traffic pattern analysis based on similarity analysis of traffic time series. Given the characteristics of traffic flow time series, Euclidean distance was used as the measure for exploring the minimum sample size, and the analysis was carried out on separate sub-datasets: weekday data, weekend data, and weekly data. Seasonal impact was also taken into consideration. Results show that 1) the minimum sample sizes in winter and spring, or in summer and fall, are quite similar in both the weekday and weekly data analyses, even at intersections with different configurations, and 2) for intersections with similar attributes and traffic conditions, the results match closely in both the weekday and weekend data studies. It is concluded that the proposed methodology is universally applicable for determining the minimum sample size at intersections. The findings of our study can provide valuable guidance on the amount of data required to achieve meaningful statistical observations of traffic patterns.
Keywords: minimum sample size; Euclidean distance; time series of traffic flows; intersection traffic data; traffic pattern analysis
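The abstract does not spell out the exact procedure, so the following is only a minimal sketch of one way a Euclidean-distance criterion could yield a minimum sample size. It assumes that each day is an equally long list of flow counts, that the mean profile of all available days serves as the reference pattern, and that a user-chosen tolerance defines "similar enough". The names mean_profile, euclidean_distance, minimum_sample_size, and tolerance are hypothetical and not taken from the paper.

```python
import math


def mean_profile(days):
    """Element-wise mean of equally long daily flow profiles."""
    n = len(days)
    return [sum(values) / n for values in zip(*days)]


def euclidean_distance(a, b):
    """Euclidean distance between two equally long time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_sample_size(days, tolerance):
    """Smallest n such that the mean profile of the first k days stays
    within `tolerance` of the full-sample mean for every k >= n.

    Assumption (not from the paper): the full-sample mean stands in
    for the "true" traffic pattern. Returns None for empty input.
    """
    if not days:
        return None
    reference = mean_profile(days)
    distances = [
        euclidean_distance(mean_profile(days[:n]), reference)
        for n in range(1, len(days) + 1)
    ]
    # Walk backward from the full sample until the criterion first fails.
    smallest = len(days)
    for n in range(len(days), 0, -1):
        if distances[n - 1] <= tolerance:
            smallest = n
        else:
            break
    return smallest


# Toy data: four "days" with two flow counts each (illustrative only).
toy_days = [[10, 20], [14, 24], [12, 22], [12, 22]]
print(minimum_sample_size(toy_days, tolerance=0.5))
```

Under these assumptions, the same function could be applied separately to weekday, weekend, and weekly sub-datasets, or to each season, and the resulting sizes compared.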