BY Ronnie Williams, PE, PTOE


When a Department of Transportation (DOT) or agency performs a highway safety study, the study typically focuses on an existing corridor with a documented safety issue. Such studies often are reactive, looking for the root causes of crashes and testing a series of solutions in an existing corridor. The results are safety plans that lead to the recommendation of innovative or proven solutions, which help to reduce or eliminate factors that contribute to the crashes in a corridor.

Safety studies require a great amount of data, including existing crash information, daily traffic volumes and roadway characteristics such as the number and width of lanes, the type of shoulder and even the horizontal and vertical curvature of the roadway. Inaccurate or poor data is a detriment to a study’s fidelity.

Many state agencies utilize internally developed databases that commonly include data on interstates and other major routes. However, they often have limited data when it comes to minor streets and roadways because those facilities are typically maintained at the city or county level. In fact, according to the U.S. Government Accountability Office (GAO), locally maintained roads account for about 77% of all public roads in the nation, while state-maintained roads represent just 20% of total road mileage.

Incomplete data can make it challenging to identify hazardous locations, and it can be a roadblock to identifying and reporting on potential solutions for them. As a result, agencies can have difficulty applying a data-driven, strategic approach to highway safety. Without data to back up their plans, agencies may have difficulty securing funding and addressing their top traffic safety priorities.


By using existing data from third-party sources, an agency can add value and information to new highway safety research. This wealth of data can give agencies new insights for their studies and improve the quality of their transportation safety data.

When agencies find ways to obtain missing data, as well as improve or validate their existing datasets, they uncover blind spots and deliver a more comprehensive approach to highway safety analysis. By thoroughly evaluating all of the roadways within a jurisdiction, an agency is better equipped to create a safety plan that will help set goals, prioritize projects and develop sound budgeting. As an agency becomes more fully informed, the agency is better positioned to develop long-range plans to help reduce crashes and also to implement steps to proactively address potential safety issues.


The transportation industry has reached an inflection point. With electric and autonomous vehicles becoming more common on our roadways, agencies are more focused on maintaining the existing transportation network. As a result, there is a greater focus on comprehensive highway safety modeling as a means not only to create safer roadways, but also to improve mobility on them.

To achieve all this while being mindful of the funding available for improvements, agencies continually look for ways to improve processes and increase efficiency. As agencies integrate and analyze available quantitative data, they can more easily identify sites at high risk and develop effective safety plans for both design and construction.



Many agencies maintain a system database, which includes existing system attributes such as number and width of lanes. While this data is generally readily available for interstates and other major routes, it is typically less complete for minor roadways.


By assessing data deficiencies and adding missing, relevant datasets, agencies can build a more complete picture of a roadway’s safety challenges. Specifically, agencies can identify pre-crash, crash environment and post-crash third-party datasets and integrate them into their existing systems, which raises data quality and advances highway safety analysis.
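As a rough illustration of the integration step described above, the following Python sketch fills missing roadway attributes in an agency inventory from a third-party source. The segment IDs, field names and values are hypothetical, and the rule that agency values are never overwritten is an assumption made for the example rather than a documented agency practice.

```python
# Hypothetical agency inventory keyed by segment ID; None marks a missing value.
agency_inventory = {
    "SEG-001": {"lanes": 4, "lane_width_ft": 12.0},
    "SEG-002": {"lanes": 2, "lane_width_ft": None},
    "SEG-003": {"lanes": None, "lane_width_ft": None},
}

# Hypothetical third-party attributes covering some of the same segments.
third_party = {
    "SEG-002": {"lanes": 2, "lane_width_ft": 11.0},
    "SEG-003": {"lanes": 2, "lane_width_ft": 10.5},
}


def fill_gaps(inventory, supplement):
    """Return a copy of inventory with missing values filled from supplement.

    Agency values are never overwritten. Each record lists the fields that
    were filled so the source of every value stays traceable.
    """
    merged = {}
    for seg_id, attrs in inventory.items():
        extra = supplement.get(seg_id, {})
        record = dict(attrs)
        filled = []
        for field, value in attrs.items():
            if value is None and extra.get(field) is not None:
                record[field] = extra[field]
                filled.append(field)
        record["filled_from_third_party"] = filled
        merged[seg_id] = record
    return merged


merged = fill_gaps(agency_inventory, third_party)
for seg_id, record in merged.items():
    print(seg_id, record)
```

Recording which fields came from the third-party source lets an analyst validate or weight those values separately from the agency's own measurements.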

Pre-crash datasets may be driver- or vehicle-oriented, such as citation histories or other crash predictors from Department of Motor Vehicles (DMV) datasets. Other pre-crash datasets consist of meteorological or naturalistic data, such as the findings from the Strategic Highway Research Program 2 (SHRP2).

Crowdsourced data sources, such as Waze and other GPS navigation tools that provide user-submitted travel times and route details, are also available. One particularly helpful resource is the data available through HERE Technologies. With aggregated data sourced from millions of datasets, HERE has developed databases that include roadway geometry, vehicle speeds and real-time traffic conditions, among other attributes.

Crash environment datasets are something all states are required to maintain in their statewide crash records. These systems provide a field that depicts the crash location, which can reveal more details about the built environment in which the crash occurred, such as roadway and land-use characteristics. These systems include official crash reports and associated details pertaining to individual crashes.

An additional third-party source to augment an agency’s existing data is Google Earth. This free resource requires users to extract data, such as lane width and shoulder width, largely by hand, although some automation is possible. Even so, it can provide a quick dataset without the need for a field visit.

Post-crash datasets can include hospitalization data, as well as medical insurance claims data, emergency medical system (EMS) data and vital statistics, many of which are managed by state Departments of Health.


Vehicle event data recorder (EDR) technology can provide a detailed picture of the seconds right before and after a crash. These automotive “black boxes” record crash data and save moment-by-moment statistics, including speed, acceleration and braking. They may also record information from inside the car. The National Highway Traffic Safety Administration (NHTSA) has issued a final rule that provides standards for the data collected by EDRs. While EDRs are not currently mandated, many automotive manufacturers have implemented some form of EDR data collection.


As agencies work to improve their traffic data systems through assessments and the integration of existing data, it is helpful to bring a third-party perspective to the process. As technology advances, so does the ability to collect not only more data, but also more detailed data.

  1. Vehicle data systems include information on the identification and ownership of vehicles registered in the state. Data should be available for vehicle make, model, year of manufacture, body type and vehicle history, including odometer readings. This information supports the analysis of vehicle-related factors that may contribute to a state’s crash experience.

  2. Driver data systems include information about the state’s population of licensed drivers, as well as convicted traffic violators who are not licensed in the state. The information about people licensed in the state typically includes personal identification, driver’s license number, license status, driver restrictions, certain convictions in prior states, crash history, citations and violations, and driver education data.

  3. Roadway data systems include roadway location, identification, classification and physical characteristics — such as surface type, presence of traffic control devices and intersections — and usage, such as travel by vehicle type. Roadway information is typically available for all public roadways, including local roads and others not maintained by the state.

  4. Crash data systems document the time, location, environment and characteristics of a crash, such as the sequence of events for a motor vehicle crash. By integrating and linking to other third-party data systems, an agency can better use the crash component to identify roadways, vehicles, drivers, occupants and pedestrians involved in crashes, and document the consequences of such crashes, whether they involve fatalities, injuries, property damage and/or citations.

  5. Citation and adjudication data systems track a citation through its life cycle: distribution of the citation to a state, county or local law enforcement officer, issuance to an offender, the citation’s disposition and any resulting conviction recorded in the driver-history database. Third-party information and datasets typically identify the type of violation, location, date and time, enforcement agency, court of jurisdiction and final resolution.

  6. Injury surveillance data systems incorporate information from trauma centers and emergency medical services, as well as hospital records on inpatient/discharge, rehabilitation and morbidity, to monitor injury causes, magnitude, costs and outcomes. These systems give agencies the information to track the number, types and severity of injuries sustained by people in motor vehicle crashes.
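To make the idea of linking data systems concrete, the sketch below joins crash records to roadway records on a shared segment identifier and derives a simple crashes-per-mile figure. All records, field names and the `segment_id` key are hypothetical. Real systems may link on route and milepost or on coordinates instead, so this is a minimal sketch under that assumption.

```python
from collections import Counter

# Hypothetical roadway data system records, keyed by segment ID.
roadway = {
    "SEG-001": {"functional_class": "Interstate", "length_mi": 2.0},
    "SEG-002": {"functional_class": "Local", "length_mi": 0.5},
}

# Hypothetical crash data system records.
crashes = [
    {"crash_id": "C1", "segment_id": "SEG-001", "severity": "injury"},
    {"crash_id": "C2", "segment_id": "SEG-002", "severity": "property_damage"},
    {"crash_id": "C3", "segment_id": "SEG-002", "severity": "fatal"},
    {"crash_id": "C4", "segment_id": "SEG-999", "severity": "injury"},
]


def link_crashes(crash_records, roadway_records):
    """Attach roadway attributes to each crash; report crashes with no match."""
    linked, unlinked = [], []
    for crash in crash_records:
        segment = roadway_records.get(crash["segment_id"])
        if segment is None:
            unlinked.append(crash["crash_id"])
        else:
            linked.append({**crash, **segment})
    return linked, unlinked


linked, unlinked = link_crashes(crashes, roadway)

# Crash frequency per segment, normalized by segment length.
crashes_per_segment = Counter(row["segment_id"] for row in linked)
crashes_per_mile = {
    seg_id: count / roadway[seg_id]["length_mi"]
    for seg_id, count in crashes_per_segment.items()
}

print("Crashes per mile:", crashes_per_mile)
print("Crashes that could not be linked:", unlinked)
```

The list of unlinked crashes is as useful as the linked set, because it points to segments that are missing from the roadway inventory.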

When an agency assesses the depth and breadth of its data systems as listed above, it should also evaluate the quality of its data systems based on six performance measures:

  1. Timeliness refers to the varying times by which data is entered, updated or made available for analysis. To provide a meaningful analysis, this information should be available within 90 days of a crash.

  2. Consistency refers to all reporting jurisdictions within a state that collect the same data elements over time and remain consistent with nationally accepted and published guidelines and standards, such as Model Minimum Uniform Crash Criteria. Injury surveillance data should be consistent with statewide formats and follow national standards, such as those published by the Centers for Disease Control and Prevention.

  3. Completeness refers to verifying that all necessary state data and associated elements are collected completely, which results in fewer missing or unknown values. Roadway information should provide complete and accurate details regarding the number of miles of roadway, number and type of highway structures, traffic volumes, traffic control devices, speeds, signage and more.

  4. Accuracy refers to a state’s use of quality control methods to verify the accuracy and reliability of information, such as edit checks. For vehicle data, states should use current technologies designed for these purposes.

  5. Accessibility refers to information that is easily accessible to the principal users or relevant communities. For example, citation and adjudication data should be available to driver control personnel, law enforcement, court officials and agencies that have administrative oversight responsibilities related to courts.

  6. Integration refers to information that can be linked to other information sources to evaluate relationships between specific roadway, crash, vehicle and human factors at the time of a crash. For example, health-outcome data can be associated with specific medical and financial consequences. Across all data systems, the GAO found that states met the data integration performance measure 13% of the time.
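Two of these measures, timeliness and completeness, lend themselves to simple calculations. The sketch below computes the share of crash records entered within 90 days of the crash and the share of non-missing values per field. The records and field names are hypothetical, and these formulas are one plausible way to express the measures, not an official scoring method.

```python
from datetime import date

# Hypothetical crash records; None marks a missing value.
records = [
    {"crash_date": date(2023, 1, 10), "entered_date": date(2023, 2, 1),
     "lane_width_ft": 12.0, "severity": "injury"},
    {"crash_date": date(2023, 1, 15), "entered_date": date(2023, 6, 1),
     "lane_width_ft": None, "severity": "fatal"},
    {"crash_date": date(2023, 3, 1), "entered_date": date(2023, 3, 20),
     "lane_width_ft": 11.0, "severity": None},
    {"crash_date": date(2023, 4, 5), "entered_date": date(2023, 7, 1),
     "lane_width_ft": None, "severity": "injury"},
]


def timeliness(rows, max_days=90):
    """Share of records entered within max_days of the crash date."""
    if not rows:
        return 0.0
    on_time = sum(
        1 for row in rows
        if (row["entered_date"] - row["crash_date"]).days <= max_days
    )
    return on_time / len(rows)


def completeness(rows, fields):
    """Share of records with a non-missing value, reported per field."""
    if not rows:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) is not None) / len(rows)
        for field in fields
    }


timeliness_share = timeliness(records)
completeness_by_field = completeness(records, ["lane_width_ft", "severity"])

print(f"Entered within 90 days: {timeliness_share:.0%}")
for field, share in completeness_by_field.items():
    print(f"{field} complete: {share:.0%}")
```

Tracking these shares over time gives an agency a baseline for judging whether added third-party data is actually closing gaps.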


By supplementing existing data with third-party datasets, agencies can produce more complete datasets that improve the evaluation of highway safety. This improvement can lead to increased efficiency, detailed analysis that focuses on the root causes of safety concerns and improved consideration for project funding. It may also allow for a proactive approach that identifies locations with the potential to develop safety issues in the future.


