Value of Time Estimation

Estimating the Value of Time (VOT) Parameter for Utah’s Travel Demand Model

This notebook replicates the analysis in '_archive/_Source - Med Income & Value of Time - 2022-08-09.xlsb' using Utah household and household income data from the 2019-2023 American Community Survey 5-Year Estimates. The VOTs are segmented by occupancy and purpose (work vs. personal).
Author
Affiliation

Pukar Bhandari

Published

October 14, 2025

1 Introduction

This analysis estimates the Value of Time (VOT) parameters for Utah’s travel demand model using income data from the American Community Survey (ACS) 2019-2023 5-Year Estimates. The VOT represents how much travelers value their time, expressed in cents per minute, and is a critical parameter for evaluating transportation projects and predicting mode choice behavior.

The methodology segments travelers by income level (low, average, and high) and trip purpose (work vs. personal trips), recognizing that different travelers value their time differently. Work trips typically have higher VOT since travel time directly impacts productivity, while personal trips reflect individuals’ willingness to trade time for money in their leisure activities.

This notebook replicates and updates the methodology from the archived Excel workbook, providing a transparent, reproducible workflow using open source data science tools.

2 Environment Setup

Install Required Packages

This section prepares the computing environment by loading necessary Python libraries and configuring project-specific settings. We use pandas and numpy for data manipulation, geopandas for spatial operations, and pygris for seamless access to Census data. The visualization libraries (matplotlib and seaborn) will help us understand income distributions and validate our calculations.

!conda install -c conda-forge numpy pandas geopandas matplotlib seaborn python-dotenv openpyxl
!pip install pygris adjustText

Load Libraries

Import all required libraries for the analysis. The pygris library enables direct access to Census Bureau geographic data and ACS estimates through their API.

Show the code
# For Analysis
import numpy as np
import pandas as pd
import geopandas as gpd
import warnings

# For Visualization
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import seaborn as sns
from adjustText import adjust_text

# Census data query libraries & modules
from pygris import blocks, block_groups, counties, states
from pygris.helpers import validate_state, validate_county
from pygris.data import get_census

# misc
import datetime
import os
from pathlib import Path
import requests

from dotenv import load_dotenv
load_dotenv()
False

Environment Variables

We establish Utah-specific geographic parameters for the analysis. The project coordinate reference system is EPSG:3566, which identifies the NAD83 / Utah Central (ftUS) State Plane system, not NAD83 / UTM zone 12N (that would be EPSG:26912). Either projected system supports distance measurements within Utah, but the units differ (US survey feet versus meters). The state FIPS code (49) uniquely identifies Utah in federal datasets.

Show the code
PROJECT_CRS = "EPSG:3566"  # NAD83 / Utah Central (ftUS); NAD83 / UTM zone 12N would be EPSG:26912
STATE_FIPS = validate_state("UT")
Using FIPS code '49' for input 'UT'
Tip

Need a Census API key? Get one for free at census.gov/developers.

Create a .env file in the project directory and add your Census API key as CENSUS_API_KEY=your-key-here. This enables fetching US Census data from the Census API.

Show the code
# Set your API key into environment (alternative to .env file)
os.environ['CENSUS_API_KEY'] = 'your_api_key_here'

3 Define Helper Functions

To maintain code reusability and follow DRY (Don’t Repeat Yourself) principles, we define helper functions for common operations throughout the analysis.

Fetch Excel Files from BLS or BTS

This utility function automates downloading data files from federal agencies. It checks if a file already exists locally before attempting to download, avoiding unnecessary network requests and respecting the agencies’ servers. The function includes proper HTTP headers to ensure reliable downloads.

Show the code
def fetch_excel(path, url):
    """
    Download Excel file if it doesn't exist locally.

    Parameters:
    -----------
    path : str or Path
        Local file path to save the Excel file
    url : str
        URL to download the Excel file from
    """
    # Convert to Path object if string
    filepath = Path(path)

    # Download file if it doesn't exist
    if not filepath.exists():
        filepath.parent.mkdir(parents=True, exist_ok=True)

        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }

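        # Fail loudly on HTTP errors instead of saving an error page as a workbook
        response = requests.get(url, headers=headers, timeout=60)
        response.raise_for_status()
        filepath.write_bytes(response.content)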

4 Lookup Tables

Lookup tables serve as reference datasets that map categories to values. These tables ensure consistency across calculations and make the code more maintainable by centralizing key parameter definitions.

Income Category Lookup

The ACS reports household income in 16 predefined brackets rather than individual values. To perform calculations with this grouped data, we need to estimate representative income values for each bracket. This lookup table defines the boundaries of each income bracket and calculates midpoint values.

For the highest income bracket ($200,000+), which has no upper limit, we use $300,000 as the representative value. This is a working assumption rather than a true midpoint: it reflects the view that high-income households typically concentrate between $200,000 and $400,000, with $300,000 as a conservative central estimate. The reference table below defines the lower and upper limit of each income bracket in ACS Table B19001 and the midpoint used for median estimation.

Show the code
lookup_hhinc = pd.DataFrame({
  "Income Category": [
    "HH_LT_10K", "HH_10_15K", "HH_15_20K", "HH_20_25K", "HH_25_30K", "HH_30_35K",
    "HH_35_40K", "HH_40_45K", "HH_45_50K", "HH_50_60K", "HH_60_75K",
    "HH_75_100K", "HH_100_125K", "HH_125_150K", "HH_150_200K", "HH_GT_200K"
  ],
  "Lower Limit": [
    0, 10000, 15000, 20000, 25000, 30000,
    35000, 40000, 45000, 50000, 60000,
    75000, 100000, 125000, 150000, 200000
  ],
  "Upper Limit": [
    9999, 14999, 19999, 24999, 29999, 34999,
    39999, 44999, 49999, 59999, 74999,
    99999, 124999, 149999, 199999, np.inf
  ]
})

# Compute midpoint and round it
lookup_hhinc['Midpoint'] = (
  (lookup_hhinc['Lower Limit'] + lookup_hhinc['Upper Limit']) / 2
).round()

# Replace infinite midpoint (last category) with 300000
lookup_hhinc.loc[np.isinf(lookup_hhinc["Upper Limit"]), "Midpoint"] = 300000

lookup_hhinc
Income Category Lower Limit Upper Limit Midpoint
0 HH_LT_10K 0 9999.0 5000.0
1 HH_10_15K 10000 14999.0 12500.0
2 HH_15_20K 15000 19999.0 17500.0
3 HH_20_25K 20000 24999.0 22500.0
4 HH_25_30K 25000 29999.0 27500.0
5 HH_30_35K 30000 34999.0 32500.0
6 HH_35_40K 35000 39999.0 37500.0
7 HH_40_45K 40000 44999.0 42500.0
8 HH_45_50K 45000 49999.0 47500.0
9 HH_50_60K 50000 59999.0 55000.0
10 HH_60_75K 60000 74999.0 67500.0
11 HH_75_100K 75000 99999.0 87500.0
12 HH_100_125K 100000 124999.0 112500.0
13 HH_125_150K 125000 149999.0 137500.0
14 HH_150_200K 150000 199999.0 175000.0
15 HH_GT_200K 200000 inf 300000.0

5 Raw Data Sources

This section retrieves the foundational datasets needed for VOT estimation. We access data directly from authoritative federal sources to ensure accuracy and reproducibility.

Consumer Price Index

The CPI-U-RS (Consumer Price Index for All Urban Consumers - Research Series) provides the most consistent measure of inflation over time. While we use current-year dollars in this analysis, having CPI data available enables future inflation adjustments and facilitates comparisons with historical VOT estimates. The research series is specifically designed for longitudinal analysis, maintaining methodological consistency that the standard CPI-U lacks.

Data Source: CPI-U-RS, all items (file r-cpi-u-rs-allitems.xlsx) [Source: Bureau of Labor Statistics]

Show the code
# Set file path and URL for CPI data
filepath_cpi = Path("_data/bls/r-cpi-u-rs-allitems.xlsx")
url_cpi = "https://www.bls.gov/cpi/research-series/r-cpi-u-rs-allitems.xlsx"

# Ensure the file exists, Download if not
fetch_excel(path=filepath_cpi, url=url_cpi)

# Read Excel file
df_CPI = pd.read_excel(
    filepath_cpi,
    sheet_name="All items",
    usecols="A:N",
    skiprows=5,
    engine='openpyxl'
)

df_CPI
YEAR JAN FEB MAR APR MAY JUNE JULY AUG SEP OCT NOV DEC AVG
0 1977 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 100.0 NaN
1 1978 100.5 101.1 101.8 102.7 103.6 104.5 105.0 105.5 106.1 106.7 107.3 107.8 104.4
2 1979 108.7 109.7 110.7 111.8 113.0 114.1 115.1 116.0 117.1 117.9 118.5 119.5 114.3
3 1980 120.8 122.4 123.8 124.7 125.7 126.7 127.5 128.6 129.9 130.7 131.5 132.4 127.1
4 1981 133.6 135.2 136.3 137.1 137.9 138.7 139.7 140.7 141.8 142.4 142.9 143.4 139.1
5 1982 144.2 144.7 144.9 145.0 146.1 147.5 148.5 148.8 149.5 150.2 150.5 150.6 147.5
6 1983 151.0 151.1 151.2 152.4 153.2 153.7 154.3 154.8 155.6 156.0 156.1 156.3 153.8
7 1984 157.2 158.0 158.3 159.0 159.5 159.9 160.4 161.1 161.8 162.2 162.2 162.3 160.2
8 1985 162.5 163.2 163.9 164.6 165.2 165.6 165.9 166.2 166.8 167.2 167.7 168.1 165.6
9 1986 168.6 168.1 167.3 166.9 167.4 168.2 168.2 168.5 169.4 169.5 169.5 169.6 168.4
10 1987 170.6 171.3 172.0 172.9 173.4 174.0 174.3 175.3 176.1 176.5 176.6 176.5 174.1
11 1988 177.0 177.3 178.0 178.9 179.5 180.1 180.8 181.6 182.7 183.2 183.3 183.4 180.5
12 1989 184.3 184.9 185.9 187.2 188.1 188.5 189.0 189.2 189.8 190.6 190.9 191.1 188.3
13 1990 193.0 193.9 194.9 195.2 195.5 196.5 197.3 199.0 200.6 201.7 202.0 202.0 197.6
14 1991 202.9 203.1 203.3 203.5 204.1 204.5 204.7 205.2 206.1 206.2 206.7 206.8 204.8
15 1992 207.2 207.7 208.6 209.0 209.3 209.7 210.1 210.6 211.2 211.8 212.1 211.9 209.9
16 1993 212.6 213.3 214.0 214.5 215.0 215.2 215.3 215.7 216.0 216.7 216.9 216.7 215.2
17 1994 217.1 217.6 218.4 218.6 218.9 219.5 220.0 220.7 221.1 221.2 221.5 221.4 219.7
18 1995 222.2 222.9 223.6 224.3 224.7 225.2 225.3 225.7 226.1 226.7 226.5 226.4 225.0
19 1996 227.5 228.3 229.4 230.2 230.8 230.9 231.3 231.5 232.3 233.0 233.4 233.4 231.0
20 1997 234.1 234.7 235.2 235.5 235.4 235.8 235.9 236.3 237.1 237.5 237.4 237.0 236.0
21 1998 237.4 237.8 238.1 238.6 239.0 239.1 239.4 239.7 240.0 240.5 240.5 240.3 239.2
22 1999 240.9 241.2 241.9 243.6 243.6 243.7 244.4 245.1 246.2 246.7 246.8 246.8 244.2
23 2000 247.6 249.1 251.1 251.2 251.4 252.9 253.4 253.4 254.8 255.1 255.3 255.1 252.5
24 2001 256.8 257.9 258.5 259.5 260.5 261.1 260.3 260.4 261.4 260.6 260.1 259.1 259.7
25 2002 259.8 260.8 262.2 263.7 263.6 263.9 264.1 265.0 265.4 265.9 265.9 265.3 263.8
26 2003 266.4 268.6 270.2 269.5 269.1 269.5 269.7 270.7 271.6 271.4 270.6 270.3 269.8
27 2004 271.7 273.2 274.9 275.8 277.3 278.2 277.8 277.9 278.5 280.0 280.2 279.1 277.0
28 2005 279.6 281.3 283.5 285.4 285.2 285.2 286.5 288.0 291.5 292.2 289.8 288.6 286.4
29 2006 290.8 291.4 293.1 295.5 296.9 297.6 298.4 299.1 297.6 296.0 295.6 296.0 295.7
30 2007 296.9 298.5 301.2 303.1 305.0 305.6 305.5 304.9 305.8 306.4 308.3 308.1 304.1
31 2008 309.6 310.5 313.2 315.1 317.7 320.9 322.6 321.3 320.9 317.6 311.6 308.3 315.8
32 2009 309.7 311.2 312.0 312.8 313.7 316.4 315.8 316.6 316.8 317.1 317.3 316.7 314.7
33 2010 317.8 317.9 319.2 319.8 320.0 319.7 319.8 320.2 320.4 320.8 320.9 321.5 319.8
34 2011 323.0 324.6 327.8 329.9 331.5 331.1 331.4 332.3 332.8 332.2 331.9 331.1 330.0
35 2012 332.6 334.0 336.6 337.6 337.2 336.7 336.2 338.1 339.6 339.5 337.9 337.0 336.9
36 2013 338.0 340.8 341.7 341.3 341.9 342.8 342.9 343.3 343.8 342.9 342.2 342.2 342.0
37 2014 343.5 344.8 347.0 348.2 349.4 350.1 349.9 349.3 349.6 348.7 346.9 344.9 347.7
38 2015 343.4 344.9 346.9 347.7 349.4 350.7 350.7 350.2 349.7 349.5 348.8 347.6 348.3
39 2016 348.2 348.5 350.0 351.7 353.1 354.2 353.7 354.0 354.8 355.3 354.7 354.9 352.8
40 2017 356.9 358.0 358.3 359.4 359.7 360.0 359.8 360.9 362.8 362.5 362.6 362.3 360.3
41 2018 364.3 366.0 366.7 368.3 369.8 370.3 370.4 370.7 371.1 371.8 370.6 369.3 369.1
42 2019 370.0 371.5 373.6 375.6 376.4 376.5 377.2 377.1 377.5 378.4 378.2 377.8 375.8
43 2020 379.2 380.2 379.5 377.2 377.2 379.4 381.3 382.6 383.1 383.2 382.9 383.2 380.8
44 2021 384.9 387.1 390.0 393.2 396.7 400.5 402.4 403.1 404.2 407.6 409.5 410.8 399.2
45 2022 414.3 418.2 423.9 426.3 431.0 436.9 436.8 436.7 437.6 439.4 439.0 437.6 431.5
46 2023 441.1 443.6 445.0 447.3 448.4 449.9 450.7 452.7 453.8 453.6 452.7 452.3 449.3
47 2024 454.7 457.6 460.5 462.3 463.1 463.2 463.8 464.1 464.9 465.4 465.2 465.3 462.5

What we find: The CPI-U-RS table reports monthly and annual-average index values (December 1977 = 100), not inflation rates directly. The ratio of the index between two years gives the factor needed to convert income values from one year's dollars to another's. For travel demand modeling, using inflation-adjusted dollars ensures that VOT parameters remain comparable over multi-year planning horizons.
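As a minimal sketch of how such an adjustment could be computed, the snippet below takes the ratio of annual-average index values for two years. The AVG values are copied from the table above; the function name `inflation_factor` and the 2019-to-2023 pairing are illustrative assumptions and are not part of the original workflow, which keeps the inflation factor at 1.0.

```python
# Annual-average CPI-U-RS values copied from the AVG column above
cpi_avg = {2019: 375.8, 2023: 449.3}


def inflation_factor(cpi_by_year, from_year, to_year):
    """Ratio that converts from_year dollars into to_year dollars (hypothetical helper)."""
    return cpi_by_year[to_year] / cpi_by_year[from_year]


factor_2019_to_2023 = inflation_factor(cpi_avg, 2019, 2023)
print(f"{factor_2019_to_2023:.4f}")
```

With these two index values the factor is about 1.196, so an income expressed in 2019 dollars would be scaled up by roughly 20% to express it in 2023 dollars.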

American Community Survey (ACS) 5-Year Estimates

Define Census Variables

The ACS 5-Year Estimates provide the most reliable small-area income statistics available. Unlike the 1-Year estimates, which have large margins of error, the 5-year data aggregates five years of survey responses to produce stable estimates suitable for sub-state analysis. We extract data from two key tables:

  • Table B19013: Median household income (direct estimate)
  • Table B19001: Household count by income bracket (distribution)

The income brackets in B19001 align perfectly with our lookup table, allowing us to calculate weighted averages and validate the median income from B19013.

Show the code
# Define variables to download
acs_variables = {
    'B19013_001E': 'HH_MED_INC',  # Median Household Income in the Past 12 Months (in 2023 Inflation-Adjusted Dollars)
    'B19013_001M': 'HH_MED_INC_MOE',  # Margin of Error for Median Household Income
    'B19001_001E': 'HH_TOTAL',  # Total Households
    'B19001_002E': 'HH_LT_10K',  # Less than $10,000
    'B19001_003E': 'HH_10_15K',  # $10,000 to $14,999
    'B19001_004E': 'HH_15_20K',  # $15,000 to $19,999
    'B19001_005E': 'HH_20_25K',  # $20,000 to $24,999
    'B19001_006E': 'HH_25_30K',  # $25,000 to $29,999
    'B19001_007E': 'HH_30_35K',  # $30,000 to $34,999
    'B19001_008E': 'HH_35_40K',  # $35,000 to $39,999
    'B19001_009E': 'HH_40_45K',  # $40,000 to $44,999
    'B19001_010E': 'HH_45_50K',  # $45,000 to $49,999
    'B19001_011E': 'HH_50_60K',  # $50,000 to $59,999
    'B19001_012E': 'HH_60_75K',  # $60,000 to $74,999
    'B19001_013E': 'HH_75_100K',  # $75,000 to $99,999
    'B19001_014E': 'HH_100_125K',  # $100,000 to $124,999
    'B19001_015E': 'HH_125_150K',  # $125,000 to $149,999
    'B19001_016E': 'HH_150_200K',  # $150,000 to $199,999
    'B19001_017E': 'HH_GT_200K'  # $200,000 or more
}

State Level Data

We begin with state-level data for Utah to establish baseline income statistics. While our final VOT parameters may use more granular geographic data in the future, state-level estimates provide a robust foundation with minimal sampling error. The large sample size at the state level ensures that our calculated VOT parameters are statistically reliable.

Show the code
# Fetch state boundaries from TIGER/Line shapefiles
gdf_ut_bound = states(
  year=2023,
  cache=True
).to_crs(PROJECT_CRS)

# Filter for Utah only
gdf_ut_bound = gdf_ut_bound[gdf_ut_bound['STATEFP'] == str(STATE_FIPS)]

# Fetch Income data from ACS 5-year estimates for Utah
df_ut_income = get_census(
  dataset="acs/acs5",
  year=2023,
  variables=list(acs_variables.keys()),
  params={
      # "key": f"{os.getenv('CENSUS_API_KEY')}", # FIXME: This causes error
      "for": f"state:{STATE_FIPS}"
    },
    return_geoid=True,
    guess_dtypes=True
)

# Join ACS data to the state boundary and apply readable column names
gdf_ut_income = gdf_ut_bound[['GEOID', 'STATEFP', 'NAME', 'geometry']].merge(
    df_ut_income, on = "GEOID"
).to_crs(PROJECT_CRS).rename(columns=acs_variables)

# Preview data
gdf_ut_income
GEOID STATEFP NAME geometry HH_MED_INC HH_MED_INC_MOE HH_TOTAL HH_LT_10K HH_10_15K HH_15_20K ... HH_35_40K HH_40_45K HH_45_50K HH_50_60K HH_60_75K HH_75_100K HH_100_125K HH_125_150K HH_150_200K HH_GT_200K
0 49 49 Utah POLYGON ((900313.399 6302435.171, 900580.099 6... 91750 634 1094896 33918 22999 22352 ... 33072 32113 35150 69627 104883 159368 134089 102926 124090 137560

1 rows × 23 columns

What we find: Utah’s ACS median household income ($91,750, with a margin of error of $634) provides the anchor point for all subsequent calculations. The household counts across income brackets show the shape of Utah’s income distribution, which typically differs from national patterns due to the state’s unique demographic characteristics (larger household sizes, younger population, and specific economic structure).

6 Intermediate Calculations

With raw data in hand, we now perform the intermediate calculations needed to derive VOT parameters. These steps transform income distributions into the specific metrics required by the travel demand model.

Income Groupings (Approximate Income Quartiles)

Travel demand models often segment travelers by income level because income strongly influences mode choice, route selection, and willingness to pay for time savings. Rather than using 16 separate income categories (which would create excessive model complexity), we aggregate households into four income groups approximating quartiles.

The quartile approach aims for roughly 25% of households in each income group, providing balanced sample sizes for model estimation. Because ACS income brackets don’t align with quartile boundaries, each bracket is assigned as a whole to the quartile in which its cumulative household share (measured at the bracket’s upper end) falls, so the resulting groups only approximate 25% each.

Income Group 1 (Low Income): The first quartile of households by income. This group typically has the lowest VOT because they face tighter budget constraints and may be more willing to spend time to save money.

Income Groups 2-4 (Higher Income): The upper three quartiles, which we sometimes aggregate as “high income” in contrast to the lowest quartile. These households generally have higher VOT because their opportunity cost of time is greater.

Show the code
# Create a copy of lookup table to work with
df_inc_group = lookup_hhinc.copy()

# Get the income category columns from gdf_ut_income
# Extract just the income bracket counts (excluding totals and medians)
income_cols = [col for col in gdf_ut_income.columns if col.startswith('HH_')
               and col not in ['HH_TOTAL', 'HH_MED_INC', 'HH_MED_INC_MOE']]

# Create a mapping between lookup categories and gdf_ut_income columns
# They should already match, but let's be explicit
df_inc_group['State HH'] = df_inc_group['Income Category'].map(
    gdf_ut_income[income_cols].iloc[0].to_dict()
)

# Calculate percentage of households
total_hh = df_inc_group['State HH'].sum()
df_inc_group['% HH'] = (df_inc_group['State HH'] / total_hh * 100).round(1)

# Calculate cumulative percentage
df_inc_group['Cum % HH'] = df_inc_group['% HH'].cumsum().round(1)

# Assign income groups based on quartiles (25%, 50%, 75%, 100%)
df_inc_group['Inc Group'] = pd.cut(
    df_inc_group['Cum % HH'],
    bins=[0, 25, 50, 75, 100],
    labels=['Inc Group 1', 'Inc Group 2', 'Inc Group 3', 'Inc Group 4'],
    include_lowest=True
)

# Calculate HH_MedInc_Product (HH * Midpoint)
df_inc_group['HH_MedInc_Product'] = df_inc_group['State HH'] * df_inc_group['Midpoint']

# Display the dataframe
df_inc_group
Income Category Lower Limit Upper Limit Midpoint State HH % HH Cum % HH Inc Group HH_MedInc_Product
0 HH_LT_10K 0 9999.0 5000.0 33918 3.1 3.1 Inc Group 1 1.695900e+08
1 HH_10_15K 10000 14999.0 12500.0 22999 2.1 5.2 Inc Group 1 2.874875e+08
2 HH_15_20K 15000 19999.0 17500.0 22352 2.0 7.2 Inc Group 1 3.911600e+08
3 HH_20_25K 20000 24999.0 22500.0 24376 2.2 9.4 Inc Group 1 5.484600e+08
4 HH_25_30K 25000 29999.0 27500.0 28040 2.6 12.0 Inc Group 1 7.711000e+08
5 HH_30_35K 30000 34999.0 32500.0 30333 2.8 14.8 Inc Group 1 9.858225e+08
6 HH_35_40K 35000 39999.0 37500.0 33072 3.0 17.8 Inc Group 1 1.240200e+09
7 HH_40_45K 40000 44999.0 42500.0 32113 2.9 20.7 Inc Group 1 1.364802e+09
8 HH_45_50K 45000 49999.0 47500.0 35150 3.2 23.9 Inc Group 1 1.669625e+09
9 HH_50_60K 50000 59999.0 55000.0 69627 6.4 30.3 Inc Group 2 3.829485e+09
10 HH_60_75K 60000 74999.0 67500.0 104883 9.6 39.9 Inc Group 2 7.079602e+09
11 HH_75_100K 75000 99999.0 87500.0 159368 14.6 54.5 Inc Group 3 1.394470e+10
12 HH_100_125K 100000 124999.0 112500.0 134089 12.2 66.7 Inc Group 3 1.508501e+10
13 HH_125_150K 125000 149999.0 137500.0 102926 9.4 76.1 Inc Group 4 1.415232e+10
14 HH_150_200K 150000 199999.0 175000.0 124090 11.3 87.4 Inc Group 4 2.171575e+10
15 HH_GT_200K 200000 inf 300000.0 137560 12.6 100.0 Inc Group 4 4.126800e+10
Show the code
# Prepare data for seaborn
df_plot = df_inc_group.copy()
df_plot['Income Label'] = df_plot['Income Category'].str.replace('HH_', '').str.replace('_', ' - ')
df_plot['index'] = range(len(df_plot))

# Define colors for each income group
palette = {'Inc Group 1': '#3498db', 'Inc Group 2': '#2ecc71',
           'Inc Group 3': '#f39c12', 'Inc Group 4': '#e74c3c'}

# Set seaborn style and context
sns.set_style("whitegrid")
sns.set_context("notebook")

# Create barplot using seaborn
# plt.figure(figsize=(12, 6))
sns.barplot(
    data=df_plot,
    x='index',
    y='State HH',
    hue='Inc Group',
    palette=palette,
    legend=True,
    dodge=False
)

# Customize plot
plt.xlabel('Income Category', fontsize=11, fontweight='bold')
plt.ylabel('Number of Households', fontsize=11, fontweight='bold')
plt.title('Utah Household Income Distribution by Quartile',
          fontsize=13, fontweight='bold', pad=20)

# Format y-axis with comma separator
plt.gca().yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# Set x-axis labels
plt.xticks(range(len(df_plot)), df_plot['Income Label'],
           rotation=45, ha='right', fontsize=9)

# Customize legend
plt.legend(loc='upper left', frameon=True, fontsize=10, title='')

# Grid styling
plt.grid(axis='y', alpha=0.3, linestyle='--')
plt.gca().set_axisbelow(True)

# Remove top and right spines for cleaner look
sns.despine()

plt.tight_layout()
plt.show()

Household Income Distribution by Quartile

What we find: The visualization shows the shape of Utah’s income distribution. Household counts are largest in the $75,000-$99,999 bracket (159,368) and remain high through the upper brackets, including the open-ended $200,000+ bracket (137,560), while each bracket below $50,000 holds fewer than 36,000 households. Because bracket widths grow with income, bar heights are not directly comparable as densities. The quartile assignments allow us to create a simplified four-group segmentation suitable for model implementation while preserving the essential income-based variation in time values.

Median Income (in Model Base Year Dollars)

To calculate VOT, we need a single representative income value for each income segment. The median is preferable to the mean because it’s less sensitive to extreme values and better represents the “typical” household in each group.

We work with two income measures:

  1. Weighted average of bracket midpoints: Using our lookup table midpoints and household counts. Strictly, this estimates the mean of the grouped data rather than the median.
  2. ACS-reported median: The direct estimate from Table B19013

The weighted average is noticeably higher than the ACS median (here $113,712 versus $91,750), both because income distributions are right-skewed, which pulls the mean above the median, and because of our simplified midpoint assumptions. We calculate a correction factor for the full population (the ACS median divided by the weighted average, about 0.81 here), then apply this same factor to the weighted averages of the income groups. This ensures internal consistency while respecting the authoritative ACS estimate.

Show the code
# Define income categories
categories = {
    'Average': df_inc_group,
    'Low Inc': df_inc_group[df_inc_group['Inc Group'] == 'Inc Group 1'],
    'High Inc': df_inc_group[df_inc_group['Inc Group'] != 'Inc Group 1']
}

# Calculate metrics for each category
summary_data = {}
for cat_name, cat_df in categories.items():
    summary_data[cat_name] = {
        'Sum HH': cat_df['State HH'].sum(),
        'Sum HH * Inc': cat_df['HH_MedInc_Product'].sum(),
    }
    # Calculate unadjusted median income
    summary_data[cat_name]['Unadj Med Inc'] = (
        summary_data[cat_name]['Sum HH * Inc'] / summary_data[cat_name]['Sum HH']
    )

# Calculate correction factor from Average category
actual_median_income = gdf_ut_income['HH_MED_INC'].iloc[0]
correction_factor = actual_median_income / summary_data['Average']['Unadj Med Inc']
inflation_factor = 1.0

# Apply correction and inflation factors
for cat_name in summary_data:
    summary_data[cat_name]['Correction Factor'] = correction_factor
    summary_data[cat_name]['Adj Med Income'] = (
        summary_data[cat_name]['Unadj Med Inc'] * correction_factor
    )
    summary_data[cat_name]['Inflation Adj Factor'] = inflation_factor
    summary_data[cat_name]['Median Income'] = (
        summary_data[cat_name]['Adj Med Income'] * inflation_factor
    )

# Convert to DataFrame
df_summary = pd.DataFrame(summary_data).T

# Format for display
format_specs = {
    'Sum HH': '{:,.0f}',
    'Sum HH * Inc': '{:,.0f}',
    'Unadj Med Inc': '${:,.0f}',
    'Correction Factor': '{:.4f}',
    'Adj Med Income': '${:,.0f}',
    'Inflation Adj Factor': '{:.4f}',
    'Median Income': '${:,.0f}'
}

df_median_income = df_summary.copy()
for col, fmt in format_specs.items():
    df_median_income[col] = df_summary[col].apply(lambda x: fmt.format(x))

df_median_income
Category  Sum HH     Sum HH * Inc     Unadj Med Inc  Correction Factor  Adj Med Income  Inflation Adj Factor  Median Income
Average   1,094,896  124,503,122,500  $113,712       0.8069             $91,750         1.0000                $91,750
Low Inc   262,353    7,428,247,500    $28,314        0.8069             $22,845         1.0000                $22,845
High Inc  832,543    117,074,875,000  $140,623       0.8069             $113,463        1.0000                $113,463

What we find:

  • Average Median Income: The overall median for all Utah households, which matches the ACS B19013 estimate by construction
  • Low Income Median: The adjusted representative income for Income Group 1 (about 24% of households), which includes lower-wage workers, retirees on fixed incomes, and households with part-time employment
  • High Income Median: The adjusted representative income for Income Groups 2-4 combined (about 76% of households), which includes professional workers, dual-income households, and higher-wage earners

These three values form the foundation for calculating income-differentiated VOT parameters.
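As a cross-check on the bracketed data, a grouped-data median can also be estimated by linear interpolation inside the bracket that contains the 50th percentile. The sketch below is not the method used in this notebook or the archived workbook; the function name `interpolated_median` is hypothetical, and the bracket bounds and household counts are copied from the income group table above. It assumes households are spread evenly within each bracket.

```python
# Bracket lower bounds and Utah household counts, copied from the table above
lower = [0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
         50000, 60000, 75000, 100000, 125000, 150000, 200000]
households = [33918, 22999, 22352, 24376, 28040, 30333, 33072, 32113, 35150,
              69627, 104883, 159368, 134089, 102926, 124090, 137560]


def interpolated_median(lower_bounds, counts):
    """Linear interpolation inside the bracket holding the 50th percentile."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one household")
    half = total / 2
    cumulative = 0
    for i, count in enumerate(counts):
        if cumulative + count >= half:
            if i + 1 >= len(lower_bounds):
                # The top bracket has no upper bound, so interpolation is undefined
                raise ValueError("median falls in the open-ended top bracket")
            width = lower_bounds[i + 1] - lower_bounds[i]
            return lower_bounds[i] + (half - cumulative) / count * width
        cumulative += count
    raise ValueError("median bracket not found")


median_estimate = interpolated_median(lower, households)
print(round(median_estimate))
```

For the counts above, the median falls in the $75,000-$99,999 bracket and the interpolated value is roughly $92,300, close to the ACS-reported $91,750. That agreement suggests the bracket counts and the reported median are consistent with each other.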

Value of Time (in Model Base Year Dollars)

The Value of Time converts annual income into an hourly rate, then applies purpose-specific multipliers to reflect how travelers trade off time and money for different trip types. The fundamental assumption is that VOT relates to income—higher earners value time more—but the relationship isn’t one-to-one.

Research from revealed preference studies (toll usage patterns) and stated preference surveys consistently shows that:

  • People value work trip time at 35-50% of their wage rate (employers care about productivity, commuters care about stress and lost leisure)
  • People value personal trip time at 25-35% of their wage rate (pure leisure trade-off)
  • Commercial vehicles have higher VOT reflecting business operating costs beyond driver wages

The percentages we apply come from regional travel behavior studies, calibrated to match observed patterns in toll lane usage, route choice, and mode choice behavior.

Calculate Hourly Income

We convert annual median income to an hourly rate by dividing by 2,080 hours (52 weeks × 40 hours/week). This assumes full-time, year-round employment, which is a standard convention in VOT estimation. While not every household member works full-time, this standardization allows consistent comparison across income groups and aligns with how wages are typically expressed.

Show the code
# Convert annual median income to hourly rate (assuming 2080 work hours/year)
df_hourly = pd.DataFrame({
    'Median Income': df_summary['Median Income'],
    'Hourly Rate': df_summary['Median Income'] / 2080
}, index=['Average', 'Low Inc', 'High Inc'])

df_hourly.style.format({
    'Median Income': '${:,.0f}',
    'Hourly Rate': '${:.2f}'
})
  Median Income Hourly Rate
Average $91,750 $44.11
Low Inc $22,845 $10.98
High Inc $113,463 $54.55

What we find: The hourly rates provide an intuitive way to think about the opportunity cost of time. For example, the average household’s hourly rate of $44.11 means that an hour spent in traffic has a notional cost of about $44 in lost productive time or leisure, before any purpose-specific percentage is applied.

Calculate VOT in cents per minute

These hardcoded percentages represent the fraction of hourly wage that travelers implicitly value their travel time at, based on observed behavior in previous studies. The percentages differ by trip purpose because:

Work Trips (higher %): Include both the opportunity cost to the traveler AND the employer’s interest in worker productivity. However, it’s typically less than 100% of the wage because commute time is partially compensated through location choice (people choose home/work locations balancing commute time against housing costs and wages).

Personal Trips (lower %): Reflect pure leisure time trade-offs. People are willing to spend more time traveling when they have flexibility and the marginal utility of saved time is lower.

Income Effects: In the percentages used here, low-income travelers are assigned a higher share of their hourly income than high-income travelers for both work trips (62% vs. 34%) and personal trips (49% vs. 27%). Because low-income hourly rates are much lower, their VOT in absolute terms is still the lowest of the three segments.

Important

These percentages have been calibrated in previous model validation efforts to match observed behavior patterns in the region.

Show the code
# Define VOT as percentage of hourly income for each trip purpose and income group
# Hardcoded VOT percentages from previous calculations
vot_pct = pd.DataFrame({
    'Work': [0.39, 0.62, 0.34],
    'Personal': [0.30, 0.49, 0.27]
}, index=['Average', 'Low Inc', 'High Inc'])
Show the code
# Calculate VOT in cents per minute (hourly rate * percentage * 100 / 60)
vot_cents_min = ((df_hourly['Hourly Rate'].values[:, None] * vot_pct) * 100 / 60).round(0)

# Values are whole cents per minute, so no dollar sign is applied
vot_cents_min.style.format({
    'Work': '{:.0f}',
    'Personal': '{:.0f}'
})
  Work Personal
Average 29 22
Low Inc 11 9
High Inc 31 25

Calculate and Display Work & Personal VOT

We now calculate the final VOT values in cents per minute, the standard unit for travel demand models. The conversion from dollars per hour to cents per minute (multiplying by 100, dividing by 60) makes the values easier to work with in network assignment algorithms where travel times are typically in minutes.
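The arithmetic behind a single cell can be traced with a small sketch. The function name `vot_cents_per_minute` is hypothetical; the 2,080-hour year, the $91,750 median income, and the 39% work-trip share are the values used elsewhere in this notebook.

```python
HOURS_PER_YEAR = 2080  # 52 weeks x 40 hours, as assumed in this notebook


def vot_cents_per_minute(annual_income, share_of_wage):
    """Return (unrounded, rounded) VOT in cents per minute (hypothetical helper)."""
    hourly_rate = annual_income / HOURS_PER_YEAR        # dollars per hour
    vot_dollars_per_hour = hourly_rate * share_of_wage  # dollars per hour of travel
    unrounded = vot_dollars_per_hour * 100 / 60         # cents per minute
    return unrounded, round(unrounded)


# Average household, work trips: $91,750 median income and a 39% share
unrounded, rounded = vot_cents_per_minute(91750, 0.39)
print(f"{unrounded:.2f} -> {rounded}")
```

For this case the unrounded value is about 28.67 cents per minute, which rounds to 29. Rounding to whole cents per minute can shift the equivalent hourly value by up to $0.30, which is why the unrounded and equivalent hourly columns in the result tables differ slightly.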

Work VOT Results: These values represent how much travelers implicitly pay (in toll charges, fuel costs, or inconvenience) to save one minute of travel time on work-related trips. Higher-income travelers have higher work VOT, reflecting both their higher opportunity cost and their greater ability to pay for time savings.

Show the code
# Display Work VOT results
df_vot_work = pd.DataFrame({
    '% of Income': vot_pct['Work'],
    'Unrounded ($/hr)': df_hourly['Hourly Rate'] * vot_pct['Work'],
    'VOT (¢/min)': vot_cents_min['Work'],
    'Equivalent ($/hr)': vot_cents_min['Work'] * 60 / 100
}, index=['Average', 'Low Inc', 'High Inc'])

df_vot_work.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})
Category  % of Income  Unrounded ($/hr)  VOT (¢/min)  Equivalent ($/hr)
Average   39%          $17.20            29           $17.40
Low Inc   62%          $6.81             11           $6.60
High Inc  34%          $18.55            31           $18.60

Personal VOT Results: Personal trip VOT is lower than work VOT in every income group. The gap between income groups is similar to that for work trips: for both purposes the high-income value is roughly 2.7 to 2.8 times the low-income value.

Show the code
# Display Personal VOT results
df_vot_personal = pd.DataFrame({
    '% of Income': vot_pct['Personal'],
    'Unrounded ($/hr)': df_hourly['Hourly Rate'] * vot_pct['Personal'],
    'VOT (¢/min)': vot_cents_min['Personal'],
    'Equivalent ($/hr)': vot_cents_min['Personal'] * 60 / 100
}, index=['Average', 'Low Inc', 'High Inc'])

df_vot_personal.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})
  % of Income Unrounded ($/hr) VOT (¢/min) Equivalent ($/hr)
Average 30% $13.23 22 $13.20
Low Inc 49% $5.38 9 $5.40
High Inc 27% $14.73 25 $15.00

Model Application: These VOT values feed directly into the mode choice and route choice components of the travel demand model. For example, if a toll road saves 10 minutes compared to a free alternative, a traveler will choose the toll road if the toll cost is less than (10 minutes × VOT). The model predicts what fraction of travelers find the trade-off worthwhile based on their income segment and trip purpose distribution.
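
The toll trade-off described above can be sketched as a simple break-even calculation. This is a deliberately simplified, deterministic illustration and not the model's actual choice logic, which works with shares of travelers; the function names and the $2.00 toll are assumptions, while the VOT values come from the work-trip results.

```python
def breakeven_toll_dollars(minutes_saved: float, vot_cents_per_min: float) -> float:
    """Largest toll (in dollars) worth paying for the given time savings."""
    return minutes_saved * vot_cents_per_min / 100


def prefers_toll_road(toll_dollars: float, minutes_saved: float,
                      vot_cents_per_min: float) -> bool:
    """True if the toll is below the monetary value of the time saved."""
    return toll_dollars < breakeven_toll_dollars(minutes_saved, vot_cents_per_min)


# Work-trip VOT in cents per minute, from the results table
vot_by_segment = {"Average": 29, "Low Inc": 11, "High Inc": 31}

toll = 2.00          # hypothetical toll in dollars
minutes_saved = 10   # time savings from the example in the text

for segment, vot in vot_by_segment.items():
    limit = breakeven_toll_dollars(minutes_saved, vot)
    choice = "toll road" if prefers_toll_road(toll, minutes_saved, vot) else "free route"
    print(f"{segment}: break-even toll ${limit:.2f} -> {choice}")
```

Under these assumptions, the average and high-income work segments would accept a $2.00 toll for a 10-minute saving, while the low-income segment (break-even $1.10) would not.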

Calculate and Display Truck VOT

Commercial vehicle VOT requires a different framework than passenger vehicles because it reflects business operating costs rather than personal wage rates. Truck VOT includes:

  • Driver wages and benefits
  • Vehicle operating costs (fuel, maintenance, depreciation)
  • Cargo value and time-sensitivity
  • Business overhead and profit margins

The percentages (relative to average household income) are calibrated to match observed commercial vehicle behavior, particularly toll road usage patterns. Larger trucks have higher VOT because they carry more valuable cargo, have higher operating costs per hour, and typically serve time-sensitive delivery schedules.

Truck Categories:

  • Light Trucks: Small commercial vehicles, delivery vans, service vehicles
  • Medium Trucks: Box trucks, small semi-trailers, regional delivery vehicles
  • Heavy Trucks: Large semi-trailers, long-haul freight, bulk cargo carriers
Important

These percentages have been calibrated in previous model validation efforts to match observed behavior patterns in the region.

# Calculate Truck VOT (using Average income as base)
truck_pct = pd.Series([0.65, 0.87, 1.10], index=['Light', 'Medium', 'Heavy'])
df_vot_trucks = pd.DataFrame({
    '% of Income': truck_pct,
    'Unrounded ($/hr)': df_hourly.loc['Average', 'Hourly Rate'] * truck_pct,
    'VOT (¢/min)': ((df_hourly.loc['Average', 'Hourly Rate'] * truck_pct) * 100 / 60).round(0)
})

df_vot_trucks['Equivalent ($/hr)'] = df_vot_trucks['VOT (¢/min)'] * 60 / 100

df_vot_trucks.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})
  % of Income Unrounded ($/hr) VOT (¢/min) Equivalent ($/hr)
Light 65% $28.67 48 $28.80
Medium 87% $38.38 64 $38.40
Heavy 110% $48.52 81 $48.60

What we find: Heavy truck VOT is about $48.60/hour (81¢/min), and even light trucks, at $28.80/hour, exceed every passenger segment. These values reflect the high cost of keeping valuable cargo and expensive equipment idle in traffic. Such high VOT is consistent with commercial vehicles using toll facilities disproportionately and with freight routing being highly sensitive to congestion and travel time reliability.

7 Export Results

The final step packages our calculated VOT parameters in formats ready for model implementation and documentation.

Create Final VOT Table

This table assembles all VOT parameters into a single reference dataset matching the travel demand model’s naming conventions. Each parameter corresponds to a specific traveler type and trip purpose combination:

Parameter Definitions:

  • VOT_Auto_Wrk: Standard work trip VOT (average income)
  • VOT_Auto_Per: Standard personal trip VOT (average income)
  • VOT_Auto_Ext: External trips (average of work and personal, used for through-trips)
  • VOT_LT, VOT_MD, VOT_HV: Light, medium, and heavy truck VOT
  • VOT_Auto_Wrk_Lo, VOT_Auto_Per_Lo: Low-income work and personal trip VOT
  • VOT_Auto_Wrk_Hi, VOT_Auto_Per_Hi: High-income work and personal trip VOT

The table displays values in both cents per minute (for model input) and dollars per hour (for intuitive interpretation).

# Build the final VOT parameters table
df_vot_params = pd.DataFrame({
    'Parameter': [
        'VOT_Auto_Wrk', 'VOT_Auto_Per', 'VOT_Auto_Ext',
        'VOT_LT', 'VOT_MD', 'VOT_HV',
        'VOT_Auto_Wrk_Lo', 'VOT_Auto_Per_Lo',
        'VOT_Auto_Wrk_Hi', 'VOT_Auto_Per_Hi'
    ],
    'VOT (cent/min)': [
        vot_cents_min.loc['Average', 'Work'],
        vot_cents_min.loc['Average', 'Personal'],
        (vot_cents_min.loc['Average', 'Work'] + vot_cents_min.loc['Average', 'Personal']) / 2,
        df_vot_trucks.loc['Light', 'VOT (¢/min)'],
        df_vot_trucks.loc['Medium', 'VOT (¢/min)'],
        df_vot_trucks.loc['Heavy', 'VOT (¢/min)'],
        vot_cents_min.loc['Low Inc', 'Work'],
        vot_cents_min.loc['Low Inc', 'Personal'],
        vot_cents_min.loc['High Inc', 'Work'],
        vot_cents_min.loc['High Inc', 'Personal']
    ]
})

df_vot_params['VOT ($/hr)'] = df_vot_params['VOT (cent/min)'] * 60 / 100

df_vot_params.style.format({
    'VOT (cent/min)': '{:.0f}¢',
    'VOT ($/hr)': '${:.1f}'
})
  Parameter VOT (cent/min) VOT ($/hr)
0 VOT_Auto_Wrk 29¢ $17.4
1 VOT_Auto_Per 22¢ $13.2
2 VOT_Auto_Ext 26¢ $15.3
3 VOT_LT 48¢ $28.8
4 VOT_MD 64¢ $38.4
5 VOT_HV 81¢ $48.6
6 VOT_Auto_Wrk_Lo 11¢ $6.6
7 VOT_Auto_Per_Lo 9¢ $5.4
8 VOT_Auto_Wrk_Hi 31¢ $18.6
9 VOT_Auto_Per_Hi 25¢ $15.0
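
One detail of the table is worth noting: VOT_Auto_Ext is the mean of 29 and 22, which is 25.5. The `{:.0f}` display format shows it as 26¢, while the $/hr column is computed from the unrounded 25.5. Because the code does not round the average, the stored value is 25.5 rather than 26. A small sketch of the arithmetic, using the values from the table:

```python
# Average-income work and personal VOT in cents per minute (from the table)
work_vot = 29.0
personal_vot = 22.0

# External-trip VOT is the simple mean of the two, left unrounded
external_vot = (work_vot + personal_vot) / 2

# Display formatting rounds the value for presentation only
displayed = "{:.0f}¢".format(external_vot)
dollars_per_hour = external_vot * 60 / 100

print(external_vot, displayed, f"${dollars_per_hour:.1f}")
```

If the model expects whole cents per minute, this parameter would need explicit rounding before export; whether it does is not stated here.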

Export to CSV

We export the final VOT parameters to a CSV file for easy integration into the travel demand model configuration. This file becomes the authoritative source for VOT parameters, documenting the values used in model runs and providing traceability back to the source data and methodology.

# Create output directory if it doesn't exist
output_dir = Path("_output")
output_dir.mkdir(parents=True, exist_ok=True)

# Export to CSV
df_vot_params.to_csv(
    output_dir / "value_of_time.csv",
    index=False
)

The exported file can be version-controlled alongside the model, ensuring that any changes to VOT assumptions are tracked and documented. When updating the model to new base years or recalibrating with new income data, this notebook can be re-run to generate updated VOT parameters consistently.
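
When the notebook is re-run, a lightweight check on the exported file can catch accidental changes. The sketch below is one way such a check might look and is not part of the original workflow: the function name `check_vot_rows` is hypothetical, only the standard library is used, and the sample text mimics the column names written by `to_csv` with three rows from the final table.

```python
import csv
import io

# Parameter names listed in the final VOT table
EXPECTED_PARAMETERS = {
    "VOT_Auto_Wrk", "VOT_Auto_Per", "VOT_Auto_Ext",
    "VOT_LT", "VOT_MD", "VOT_HV",
    "VOT_Auto_Wrk_Lo", "VOT_Auto_Per_Lo",
    "VOT_Auto_Wrk_Hi", "VOT_Auto_Per_Hi",
}


def check_vot_rows(rows, required=EXPECTED_PARAMETERS):
    """Return a list of problems found in parsed CSV rows (empty if none)."""
    problems = []
    seen = set()
    for row in rows:
        name = row["Parameter"]
        seen.add(name)
        cents = float(row["VOT (cent/min)"])
        dollars = float(row["VOT ($/hr)"])
        if cents <= 0:
            problems.append(f"{name}: non-positive VOT")
        # The $/hr column should equal cents/min * 60 / 100
        if abs(dollars - cents * 60 / 100) > 1e-6:
            problems.append(f"{name}: $/hr inconsistent with cent/min")
    for name in sorted(set(required) - seen):
        problems.append(f"missing parameter: {name}")
    return problems


# In-memory sample standing in for the exported file (assumed layout)
sample_csv = """Parameter,VOT (cent/min),VOT ($/hr)
VOT_Auto_Wrk,29.0,17.4
VOT_Auto_Per,22.0,13.2
VOT_Auto_Ext,25.5,15.3
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
problems = check_vot_rows(
    rows, required={"VOT_Auto_Wrk", "VOT_Auto_Per", "VOT_Auto_Ext"}
)
print(problems or "all checks passed")
```

To check the real export, the same function could be applied to rows read from `_output/value_of_time.csv` with the default `required` set.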

Tip

Download the output files: