Assessment of Trend and Seasonality in Road Accident Data: An Iranian Case Study

Background: Road traffic accidents and their related deaths have become a major concern, particularly in developing countries. Iran has adopted a series of policies and interventions to control the high number of accidents occurring over the past few years. In this study we used a time series model to understand the trend of accidents and to ascertain the viability of applying ARIMA models to data from Taybad city. Methods: This study is a cross-sectional study. We used data on accidents occurring in Taybad between 2007 and 2011. We obtained the data from the Ministry of Health (MOH) and used the time series method with a time lag of one month. After plotting the trend, non-stationarity in variance and in mean was removed using Box-Cox transformation and differencing, respectively. The ACF and PACF plots were used to check stationarity. Results: The traffic accidents in our study had an increasing trend over the five years of study. Based on the ACF and PACF plots obtained after applying Box-Cox transformation and differencing, the data did not fit a time series model. Therefore, neither an ARIMA structure nor seasonality was observed. Conclusion: Traffic accidents in Taybad have an upward trend. In addition, we expected an AR, MA or ARIMA model with a seasonal component to fit the data, yet this was not observed in this analysis. Several reasons may have contributed to this situation, such as uncertainty about the quality of the data, weather changes, and behavioural factors that are not taken into account by time series analysis.


Background
Road traffic injuries (RTIs) are a major public health problem to which inadequate attention has been paid (1). According to the World Health Organization (WHO), traffic accidents and related deaths are an emerging global epidemic (2). Worldwide, approximately 1.2 million people are killed and up to 50 million people injured annually as a result of road accidents (2). Road traffic accidents are decreasing in developed nations, whereas they are increasing in developing countries (2). In low-income and middle-income countries it is forecast that the number of road traffic deaths and injuries will increase by as much as 80% between 2000 and 2020 (3). The cost of these fatalities, disabilities and injuries can have a significant impact on health and on social and economic development (4). The application of policies and interventions to control traffic accidents can decrease this cost. Petrol rationing, improved traffic enforcement, the installation of speed bumps, legislation, and the enforcement of helmet use for cyclists and motorcyclists are examples of such interventions (5).
However, certain factors affecting traffic accidents, such as weather conditions, are outside human control. Adverse weather during snowy and rainy seasons clearly affects the severity and occurrence of road accidents (6). Moreover, the number of journeys varies from season to season. The high number of journeys in certain seasons can significantly affect traffic accidents (7).
As previously noted, interventions can be set up to reduce the occurrence of accidents, and they can be evaluated by statistical methods such as time series analysis (8). Time series analysis is a statistical method applied to explore possible patterns in data over time, with the ultimate aim of forecasting future events.
In Iran, the rate of traffic accidents is very high and they account for a considerable percentage of deaths (9). It was estimated that over 30,000 Iranians die due to road traffic accidents annually (10). In response to this, reducing the number of car accidents has become a main priority for public health policies. These policies include improving road safety, enforcing rules for safe driving, and increasing police monitoring activities. Furthermore, there have been indirect interventions, such as ending the sale of subsidised fuel to reduce fuel consumption, which have been implemented in recent years and may have the effect of minimising the occurrence of accidents. We used time series to understand the trend of accidents in Taybad and to evaluate the effectiveness of several road safety interventions on accident occurrence.
doi: 10.15171/ijhpm.2013.08
Taybad is the capital city of Taybad county, Razavi Khorasan province, located in the east of Iran. Its population is approximately 48,000 and it is located close to the border with Afghanistan.

Methods
This study is a descriptive analytic study. We extracted data on traffic accidents for different Iranian cities from the Ministry of Health (MOH). The data were examined to find the most complete data set. We selected the data on Taybad road accidents, which were complete over the years of study. This data source contains the number of traffic accidents attributed to motor vehicles, motorcycles, and pedestrians.
Time series analysis is used for sequences of observations ordered in time (Y_t). In this sequence, Y_t denotes the observation at time t. The time period between two observations is referred to as a lag. Time series analysis is applied in different fields such as economics, management, social science and medicine (11). This method was suggested by Box and Jenkins in 1976. The time series approach is often referred to as ARIMA. ARIMA is an acronym for Auto-Regressive Integrated Moving Average model, which consists of moving average (MA) and autoregressive (AR) models. In time series analysis it is important that the mean and variance of the series are stationary. A process is stationary if its mean and variance do not change over time. The time series plot is used to assess stationarity. If a series is non-stationary, there are ways to remove the non-stationarity, such as differencing and Box-Cox transformation (12). In time series, it is assumed that observations are usually not independent of each other. The observed number of traffic accidents in the previous month is usually a good indicator of the number of traffic accidents in the current month.
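The two stationarity remedies named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the software used in the study; the function names `box_cox` and `difference` and the monthly counts are hypothetical.

```python
import math

def box_cox(values, lam):
    """Box-Cox power transform: (y**lam - 1) / lam, or log(y) when lam == 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def difference(values, lag=1):
    """Return values[t] - values[t - lag] for every t >= lag."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]

# Hypothetical monthly counts with a linear upward trend (not the Taybad data).
counts = [200 + 5 * t for t in range(12)]

stabilised = box_cox(counts, lam=1.0)  # lam = 1 only shifts the series by -1
detrended = difference(stabilised)     # first difference removes the linear trend

print(detrended)  # eleven values, all equal to 5.0
```

A constant differenced series, as here, indicates that one round of differencing was enough to remove the trend in the mean; a transformation parameter close to 1 leaves the shape of the series essentially unchanged.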
The outcome measure was the number of traffic accidents. A monthly time lag was used in this study. Models were identified through the ACF (autocorrelation function) and PACF (partial autocorrelation function). The residuals of the candidate models, together with the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion), were used to select the best-fitting model for future forecasting.
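As an illustration of how the two information criteria rank candidate models, the sketch below applies their standard definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximised log-likelihood. The candidate labels and log-likelihood values are hypothetical and are not results from this study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidates: (label, maximised log-likelihood, number of parameters).
candidates = [("ARIMA(1,1,0)", -310.2, 2), ("ARIMA(1,1,1)", -309.8, 3)]
n_obs = 60  # assumed: 60 monthly observations

ranked = sorted(candidates, key=lambda c: aic(c[1], c[2]))
best_label = ranked[0][0]

for label, ll, k in ranked:
    print(label, round(aic(ll, k), 1), round(bic(ll, k, n_obs), 1))
```

In this made-up case the extra parameter of the second model improves the log-likelihood too little to offset its penalty, so the simpler model is preferred under both criteria.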
In this analysis, the first step was to plot the sequence of accident data over the 5 years from 2007 to 2011. Because the trend plot showed non-stationarity in both variance and mean, Box-Cox transformation and differencing were applied to remove non-stationarity from the variance and the mean, respectively. Subsequently, ACF and PACF plots were used to determine the AR (autoregressive) and MA (moving average) orders of a possible ARIMA model.
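Reading an ACF plot amounts to checking which sample autocorrelations fall outside an approximate 95% band of plus or minus 1.96 divided by the square root of n. The sketch below, using only the Python standard library, is an assumption about how such a check could be coded and is not the procedure used in the study; the helper names `acf` and `significant_lags` and the synthetic series are hypothetical.

```python
import math

def acf(values, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def significant_lags(values, max_lag):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(values))
    r = acf(values, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]

# Hypothetical series of 60 months with a strict 12-month cycle (not the Taybad data).
seasonal = [10.0 if t % 12 == 0 else 0.0 for t in range(60)]

print(significant_lags(seasonal, 12))  # → [12]
```

A series with monthly seasonality would be expected to show a spike at lag 12 like this one, whereas a series with no spikes outside the band at any lag gives no basis for choosing AR or MA orders.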

Results
The total number of accidents in Taybad between January 2007 and December 2011 was 20,720. As shown in Figure 1, there is an increasing trend in the occurrence of accidents in Taybad, which is observed in all groups: pedestrians, motorcyclists and motor vehicles.
The overall monthly average number of accidents was 345 (SD=118.85). The highest and lowest average of accidents was seen in the fourth year, with 446 and 229 accidents, respectively. The highest number of accidents in the first year was seen in month 12, with 326 accidents. In the second year, it was seen in month 2, with 446 accidents. In the third, fourth and fifth years, the highest number of accidents was seen in months 5, 6 and 4, respectively. The total number of accidents in these months was 572, 531 and 430, respectively.
In this analysis, the series was non-stationary in mean and variance and showed an increasing trend over the time period (Figure 1). The estimated lambda for the Box-Cox transformation was 0.92, with a 95% confidence interval (lower CL 0.36, upper CL 1.47). To confirm that the non-stationarity had been removed, a trend plot was subsequently produced. Box-Cox transformation and differencing ensured that the series was stationary in variance and in mean (Figure 2). Figures 3 and 4 show the ACF and PACF plots of accidents after Box-Cox transformation and differencing, respectively. Based on these figures, the transformed and differenced series could not be fitted by a time series (ARIMA) model.

Discussion
Our results showed that traffic accidents had an upward trend in Taybad over the period under study, which was an unexpected outcome given the considerable number of interventions implemented over recent years. These included a range of policies and activities, such as the enforcement of rules on the use of car seat belts and of safety helmets for motorcyclists, and the introduction of new penalties for offending motorists. Petrol and gas rationing, launched in April 2007, was an additional intervention. It was considered to be an intervention because of the gradual fuel subsidy cuts over time. Additionally, the establishment of pedestrian bridges, street widening, and training programmes for motorists have been carried out over recent years.
Despite the implementation of interventions to control traffic accidents, the Iranian automobile industry has accelerated production of cars and motorcycles in recent years (9). Therefore, one explanation is that the growing trend of accidents is partially due to the increasing number of cars over the past few years.
Although time series analysis has been applied on many occasions in road accident studies (8,12-15), our assumptions on the ARIMA model and seasonality were not satisfied by our results, and our data appeared to be inappropriate for analysis with the time series method.
Failure to apply time series analysis in this study can be explained by several factors, including a lack of high quality data, which may affect the quality of findings (16). Furthermore, a seasonality pattern in road accidents was expected, as a number of accidents in winter may be due to snow and slippery roads (6,17). However, we have experienced a relatively long period of drought, which has led to winters with low precipitation, which in turn has reduced the occurrence of snow and blizzards.
Another factor which could affect the level of accidents is the number of journeys made, which also varies according to season (7). Therefore, it was expected that there would be a peak level of travel for holidays in the summer, which is the hot season. We expected to observe this as a spike in the ACF and PACF. However, we did not find such a pattern. This could be explained by the fact that the month of Ramadan occurred in summer during the period of study. In the month of Ramadan, those who are fasting cannot complete their fasting obligation while travelling (18), and hence may abstain from travelling.
Furthermore, although time series models would have been the preferred way to explain and predict road safety events (19), some academics criticise time series analysis, arguing that its results are not robust enough to control for seasonal effects (20,21). A final point worth mentioning is that all interventions are aimed at drivers or road safety, and there have been few policies targeting pedestrians. Pedestrian behaviour is also important in traffic accidents. Certain behavioural factors, such as distracting activities (for example, using a mobile phone while crossing the road) and a lack of adherence to traffic regulations by pedestrians, can affect traffic patterns (22,23).
Although this study, which used data from one city, is the first to apply the time series model, we acknowledge that it has certain limitations, such as uncertainty about the quality of the data and a lack of data on pedestrian behaviour and on detailed local and national traffic control interventions, which were not available to us. The number of car owners may also have increased, yet we had no such data to include in the analysis. It is therefore suggested that a combination of quantitative and qualitative research be carried out to gain a more in-depth understanding of the situation.
Statistical methods have been used widely in the evaluation and explanation of various events in medical sciences, and many researchers have found them appealing. However, the application of sophisticated methods such as time series requires certain assumptions and prerequisites, such as adequate quality of data; otherwise results may be misleading.