Value of Time Estimation

1 Introduction

This analysis estimates the Value of Time (VOT) parameters for Utah’s travel demand model using income data from the American Community Survey (ACS) 2019-2023 5-Year Estimates. The VOT represents how much travelers value their time, expressed in cents per minute, and is a critical parameter for evaluating transportation projects and predicting mode choice behavior.

The methodology segments travelers by income level (low, average, and high) and trip purpose (work vs. personal trips), recognizing that different travelers value their time differently. Work trips typically have higher VOT since travel time directly impacts productivity, while personal trips reflect individuals’ willingness to trade time for money in their leisure activities.

This notebook replicates and updates the methodology from the archived Excel workbook, providing a transparent, reproducible workflow using open source data science tools.

2 Environment Setup

Install Required Packages

This section prepares the computing environment by loading necessary Python libraries and configuring project-specific settings. We use pandas and numpy for data manipulation, geopandas for spatial operations, and pygris for seamless access to Census data. The visualization libraries (matplotlib and seaborn) will help us understand income distributions and validate our calculations.

!conda install -y -c conda-forge numpy pandas geopandas matplotlib seaborn python-dotenv openpyxl requests
!pip install pygris adjustText

Load Libraries

Import all required libraries for the analysis. The pygris library enables direct access to Census Bureau geographic data and ACS estimates through their API.

# For Analysis
import numpy as np
import pandas as pd
import geopandas as gpd
import warnings

# For Visualization
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import seaborn as sns
from adjustText import adjust_text

# Census data query libraries & modules
from pygris import blocks, block_groups, counties, states
from pygris.helpers import validate_state, validate_county
from pygris.data import get_census

# misc
import datetime
import os
from pathlib import Path
import requests

from dotenv import load_dotenv
load_dotenv()

False

Environment Variables

We establish Utah-specific geographic parameters for the analysis. EPSG:3566 is the NAD83 / Utah Central (ftUS) State Plane coordinate reference system, a projected CRS with coordinates in US survey feet that is suited to distance measurements within Utah. The state FIPS code (49) uniquely identifies Utah in federal datasets.

PROJECT_CRS = "EPSG:3566"  # NAD83 / Utah Central (ftUS), State Plane
STATE_FIPS = validate_state("UT")

Using FIPS code '49' for input 'UT'

Tip

Need a Census API key? Get one for free at census.gov/developers.

Create a .env file in the project directory and add your Census API key as a line of the form CENSUS_API_KEY=your-key-here. This enables fetching US Census data from the Census API.

# Set your API key into environment (alternative to .env file)
os.environ['CENSUS_API_KEY'] = 'your_api_key_here'
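
One possible source of key-related request errors is sending a placeholder value, or the literal string "None" that results from formatting an unset environment variable into an f-string. The sketch below, in plain Python, builds the query parameters and adds a `key` entry only when a usable value is configured. The helper name `build_census_params` and the set of placeholder strings are hypothetical, and whether this resolves a particular error is an assumption to verify against your own setup.

```python
import os

# Hypothetical placeholder strings that should be treated as "no key set"
PLACEHOLDER_KEYS = {"", "None", "your_api_key_here", "your-key-here"}

def build_census_params(state_fips, env=None):
    """Build Census API query parameters, adding 'key' only if one is configured."""
    env = os.environ if env is None else env
    params = {"for": f"state:{state_fips}"}
    api_key = (env.get("CENSUS_API_KEY") or "").strip()
    if api_key not in PLACEHOLDER_KEYS:
        params["key"] = api_key
    return params

# No key configured: the parameters are built without a 'key' entry
print(build_census_params("49", env={}))
# Key configured: it is passed through unchanged
print(build_census_params("49", env={"CENSUS_API_KEY": "abc123"}))
```

Passing the environment as an argument keeps the helper easy to test without touching the real `os.environ`.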

3 Define Helper Functions

To maintain code reusability and follow DRY (Don’t Repeat Yourself) principles, we define helper functions for common operations throughout the analysis.

Fetch Excel Files from BLS or BTS

This utility function automates downloading data files from federal agencies. It checks if a file already exists locally before attempting to download, avoiding unnecessary network requests and respecting the agencies’ servers. The function includes proper HTTP headers to ensure reliable downloads.

def fetch_excel(path, url):
    """
    Download Excel file if it doesn't exist locally.

    Parameters:
    -----------
    path : str or Path
        Local file path to save the Excel file
    url : str
        URL to download the Excel file from
    """
    # Convert to Path object if string
    filepath = Path(path)

    # Download file if it doesn't exist
    if not filepath.exists():
        filepath.parent.mkdir(parents=True, exist_ok=True)

        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        }

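        # Fail loudly on HTTP errors instead of saving an error page as .xlsx
        response = requests.get(url, headers=headers, timeout=60)
        response.raise_for_status()
        filepath.write_bytes(response.content)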

4 Lookup Tables

Lookup tables serve as reference datasets that map categories to values. These tables ensure consistency across calculations and make the code more maintainable by centralizing key parameter definitions.

Income Category Lookup

The ACS reports household income in 16 predefined brackets rather than individual values. To perform calculations with this grouped data, we need to estimate representative income values for each bracket. This lookup table defines the boundaries of each income bracket and calculates midpoint values.

For the highest income bracket ($200,000+), which has no upper limit, we use $300,000 as the representative value in place of a midpoint. This reflects the observation that high-income households typically concentrate between $200,000 and $400,000, which makes $300,000 a conservative central estimate. The brackets themselves are those used in ACS Table B19001.

lookup_hhinc = pd.DataFrame({
  "Income Category": [
    "HH_LT_10K", "HH_10_15K", "HH_15_20K", "HH_20_25K", "HH_25_30K", "HH_30_35K",
    "HH_35_40K", "HH_40_45K", "HH_45_50K", "HH_50_60K", "HH_60_75K",
    "HH_75_100K", "HH_100_125K", "HH_125_150K", "HH_150_200K", "HH_GT_200K"
  ],
  "Lower Limit": [
    0, 10000, 15000, 20000, 25000, 30000,
    35000, 40000, 45000, 50000, 60000,
    75000, 100000, 125000, 150000, 200000
  ],
  "Upper Limit": [
    9999, 14999, 19999, 24999, 29999, 34999,
    39999, 44999, 49999, 59999, 74999,
    99999, 124999, 149999, 199999, np.inf
  ]
})

# Compute midpoint and round it
lookup_hhinc['Midpoint'] = (
  (lookup_hhinc['Lower Limit'] + lookup_hhinc['Upper Limit']) / 2
).round()

# Replace infinite midpoint (last category) with 300000
lookup_hhinc.loc[np.isinf(lookup_hhinc["Upper Limit"]), "Midpoint"] = 300000

lookup_hhinc

	Income Category	Lower Limit	Upper Limit	Midpoint
0	HH_LT_10K	0	9999.0	5000.0
1	HH_10_15K	10000	14999.0	12500.0
2	HH_15_20K	15000	19999.0	17500.0
3	HH_20_25K	20000	24999.0	22500.0
4	HH_25_30K	25000	29999.0	27500.0
5	HH_30_35K	30000	34999.0	32500.0
6	HH_35_40K	35000	39999.0	37500.0
7	HH_40_45K	40000	44999.0	42500.0
8	HH_45_50K	45000	49999.0	47500.0
9	HH_50_60K	50000	59999.0	55000.0
10	HH_60_75K	60000	74999.0	67500.0
11	HH_75_100K	75000	99999.0	87500.0
12	HH_100_125K	100000	124999.0	112500.0
13	HH_125_150K	125000	149999.0	137500.0
14	HH_150_200K	150000	199999.0	175000.0
15	HH_GT_200K	200000	inf	300000.0
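
Because the top bracket is open-ended, its assumed value influences any household-weighted average built from this table. The sketch below, in plain Python, illustrates that sensitivity using the Utah household counts reported in the state-level results of this notebook. The alternative top-bracket values ($250,000 and $400,000) are illustrative assumptions, not recommendations, and `weighted_mean_income` is a hypothetical helper name.

```python
# Bracket midpoints from the lookup table (top bracket handled separately)
midpoints = [5000, 12500, 17500, 22500, 27500, 32500, 37500, 42500,
             47500, 55000, 67500, 87500, 112500, 137500, 175000]

# Utah household counts per bracket (ACS 2019-2023, Table B19001),
# copied from the state-level results of this notebook
households = [33918, 22999, 22352, 24376, 28040, 30333, 33072, 32113,
              35150, 69627, 104883, 159368, 134089, 102926, 124090, 137560]

def weighted_mean_income(top_value):
    """Household-weighted mean of bracket midpoints for a given top-bracket value."""
    values = midpoints + [top_value]
    total_income = sum(h * v for h, v in zip(households, values))
    return total_income / sum(households)

for top_value in (250_000, 300_000, 400_000):
    mean = weighted_mean_income(top_value)
    print(f"top bracket = ${top_value:,}: weighted mean = ${mean:,.0f}")
```

With the $300,000 assumption this reproduces the unadjusted statewide value of $113,712 used later in the analysis.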

5 Raw Data Sources

This section retrieves the foundational datasets needed for VOT estimation. We access data directly from authoritative federal sources to ensure accuracy and reproducibility.

Consumer Price Index

The R-CPI-U-RS (the BLS retroactive CPI series for all urban consumers, estimated using current methods) provides a consistent measure of inflation over time. While we use current-year dollars in this analysis, having CPI data available enables future inflation adjustments and facilitates comparisons with historical VOT estimates. The series is designed for longitudinal analysis: it applies current methodology to earlier years, a consistency that the standard CPI-U lacks.

Data Source: R-CPI-U-RS, all items [Source: Bureau of Labor Statistics]

# Set file path and URL for CPI data
filepath_cpi = Path("_data/bls/r-cpi-u-rs-allitems.xlsx")
url_cpi = "https://www.bls.gov/cpi/research-series/r-cpi-u-rs-allitems.xlsx"

# Ensure the file exists, Download if not
fetch_excel(path=filepath_cpi, url=url_cpi)

# Read Excel file
df_CPI = pd.read_excel(
    filepath_cpi,
    sheet_name="All items",
    usecols="A:N",
    skiprows=5,
    engine='openpyxl'
)

df_CPI

	YEAR	JAN	FEB	MAR	APR	MAY	JUNE	JULY	AUG	SEP	OCT	NOV	DEC	AVG
0	1977	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	100.0	NaN
1	1978	100.5	101.1	101.8	102.7	103.6	104.5	105.0	105.5	106.1	106.7	107.3	107.8	104.4
2	1979	108.7	109.7	110.7	111.8	113.0	114.1	115.1	116.0	117.1	117.9	118.5	119.5	114.3
3	1980	120.8	122.4	123.8	124.7	125.7	126.7	127.5	128.6	129.9	130.7	131.5	132.4	127.1
4	1981	133.6	135.2	136.3	137.1	137.9	138.7	139.7	140.7	141.8	142.4	142.9	143.4	139.1
5	1982	144.2	144.7	144.9	145.0	146.1	147.5	148.5	148.8	149.5	150.2	150.5	150.6	147.5
6	1983	151.0	151.1	151.2	152.4	153.2	153.7	154.3	154.8	155.6	156.0	156.1	156.3	153.8
7	1984	157.2	158.0	158.3	159.0	159.5	159.9	160.4	161.1	161.8	162.2	162.2	162.3	160.2
8	1985	162.5	163.2	163.9	164.6	165.2	165.6	165.9	166.2	166.8	167.2	167.7	168.1	165.6
9	1986	168.6	168.1	167.3	166.9	167.4	168.2	168.2	168.5	169.4	169.5	169.5	169.6	168.4
10	1987	170.6	171.3	172.0	172.9	173.4	174.0	174.3	175.3	176.1	176.5	176.6	176.5	174.1
11	1988	177.0	177.3	178.0	178.9	179.5	180.1	180.8	181.6	182.7	183.2	183.3	183.4	180.5
12	1989	184.3	184.9	185.9	187.2	188.1	188.5	189.0	189.2	189.8	190.6	190.9	191.1	188.3
13	1990	193.0	193.9	194.9	195.2	195.5	196.5	197.3	199.0	200.6	201.7	202.0	202.0	197.6
14	1991	202.9	203.1	203.3	203.5	204.1	204.5	204.7	205.2	206.1	206.2	206.7	206.8	204.8
15	1992	207.2	207.7	208.6	209.0	209.3	209.7	210.1	210.6	211.2	211.8	212.1	211.9	209.9
16	1993	212.6	213.3	214.0	214.5	215.0	215.2	215.3	215.7	216.0	216.7	216.9	216.7	215.2
17	1994	217.1	217.6	218.4	218.6	218.9	219.5	220.0	220.7	221.1	221.2	221.5	221.4	219.7
18	1995	222.2	222.9	223.6	224.3	224.7	225.2	225.3	225.7	226.1	226.7	226.5	226.4	225.0
19	1996	227.5	228.3	229.4	230.2	230.8	230.9	231.3	231.5	232.3	233.0	233.4	233.4	231.0
20	1997	234.1	234.7	235.2	235.5	235.4	235.8	235.9	236.3	237.1	237.5	237.4	237.0	236.0
21	1998	237.4	237.8	238.1	238.6	239.0	239.1	239.4	239.7	240.0	240.5	240.5	240.3	239.2
22	1999	240.9	241.2	241.9	243.6	243.6	243.7	244.4	245.1	246.2	246.7	246.8	246.8	244.2
23	2000	247.6	249.1	251.1	251.2	251.4	252.9	253.4	253.4	254.8	255.1	255.3	255.1	252.5
24	2001	256.8	257.9	258.5	259.5	260.5	261.1	260.3	260.4	261.4	260.6	260.1	259.1	259.7
25	2002	259.8	260.8	262.2	263.7	263.6	263.9	264.1	265.0	265.4	265.9	265.9	265.3	263.8
26	2003	266.4	268.6	270.2	269.5	269.1	269.5	269.7	270.7	271.6	271.4	270.6	270.3	269.8
27	2004	271.7	273.2	274.9	275.8	277.3	278.2	277.8	277.9	278.5	280.0	280.2	279.1	277.0
28	2005	279.6	281.3	283.5	285.4	285.2	285.2	286.5	288.0	291.5	292.2	289.8	288.6	286.4
29	2006	290.8	291.4	293.1	295.5	296.9	297.6	298.4	299.1	297.6	296.0	295.6	296.0	295.7
30	2007	296.9	298.5	301.2	303.1	305.0	305.6	305.5	304.9	305.8	306.4	308.3	308.1	304.1
31	2008	309.6	310.5	313.2	315.1	317.7	320.9	322.6	321.3	320.9	317.6	311.6	308.3	315.8
32	2009	309.7	311.2	312.0	312.8	313.7	316.4	315.8	316.6	316.8	317.1	317.3	316.7	314.7
33	2010	317.8	317.9	319.2	319.8	320.0	319.7	319.8	320.2	320.4	320.8	320.9	321.5	319.8
34	2011	323.0	324.6	327.8	329.9	331.5	331.1	331.4	332.3	332.8	332.2	331.9	331.1	330.0
35	2012	332.6	334.0	336.6	337.6	337.2	336.7	336.2	338.1	339.6	339.5	337.9	337.0	336.9
36	2013	338.0	340.8	341.7	341.3	341.9	342.8	342.9	343.3	343.8	342.9	342.2	342.2	342.0
37	2014	343.5	344.8	347.0	348.2	349.4	350.1	349.9	349.3	349.6	348.7	346.9	344.9	347.7
38	2015	343.4	344.9	346.9	347.7	349.4	350.7	350.7	350.2	349.7	349.5	348.8	347.6	348.3
39	2016	348.2	348.5	350.0	351.7	353.1	354.2	353.7	354.0	354.8	355.3	354.7	354.9	352.8
40	2017	356.9	358.0	358.3	359.4	359.7	360.0	359.8	360.9	362.8	362.5	362.6	362.3	360.3
41	2018	364.3	366.0	366.7	368.3	369.8	370.3	370.4	370.7	371.1	371.8	370.6	369.3	369.1
42	2019	370.0	371.5	373.6	375.6	376.4	376.5	377.2	377.1	377.5	378.4	378.2	377.8	375.8
43	2020	379.2	380.2	379.5	377.2	377.2	379.4	381.3	382.6	383.1	383.2	382.9	383.2	380.8
44	2021	384.9	387.1	390.0	393.2	396.7	400.5	402.4	403.1	404.2	407.6	409.5	410.8	399.2
45	2022	414.3	418.2	423.9	426.3	431.0	436.9	436.8	436.7	437.6	439.4	439.0	437.6	431.5
46	2023	441.1	443.6	445.0	447.3	448.4	449.9	450.7	452.7	453.8	453.6	452.7	452.3	449.3
47	2024	454.7	457.6	460.5	462.3	463.1	463.2	463.8	464.1	464.9	465.4	465.2	465.3	462.5

What we find: The table reports monthly and annual-average index levels (December 1977 = 100) rather than inflation rates. The ratio of two years' index values gives the factor for converting incomes between those years' dollars. For travel demand modeling, using inflation-adjusted dollars ensures that VOT parameters remain comparable over multi-year planning horizons.
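
As a minimal sketch of how the index could be used for an inflation adjustment, the plain-Python example below computes a conversion factor as the ratio of two annual-average index values. The values are copied from the AVG column of the table above; the function name `inflation_factor` and the $80,000 example income are hypothetical.

```python
# Annual-average index values copied from the AVG column of the table above
cpi_avg = {2019: 375.8, 2023: 449.3, 2024: 462.5}

def inflation_factor(from_year, to_year, cpi=cpi_avg):
    """Factor that converts dollars of from_year into dollars of to_year."""
    return cpi[to_year] / cpi[from_year]

factor_2019_to_2023 = inflation_factor(2019, 2023)
print(f"2019 -> 2023 factor: {factor_2019_to_2023:.4f}")

# Example: express a hypothetical $80,000 income in 2019 dollars as 2023 dollars
print(f"${80_000 * factor_2019_to_2023:,.0f}")
```

When the income data and the model base year use the same dollars, the factor is 1.0, which matches the inflation factor applied in this analysis.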

American Community Survey (ACS) 5-Year Estimates

Define Census Variables

The ACS 5-Year Estimates provide the most reliable small-area income statistics available. Unlike the 1-Year estimates, which have large margins of error, the 5-year data aggregates five years of survey responses to produce stable estimates suitable for sub-state analysis. We extract data from two key tables:

Table B19013: Median household income (direct estimate)
Table B19001: Household count by income bracket (distribution)

The income brackets in B19001 align perfectly with our lookup table, allowing us to calculate weighted averages and validate the median income from B19013.

# Define variables to download
acs_variables = {
    'B19013_001E': 'HH_MED_INC',  # Median Household Income in the Past 12 Months (in 2023 Inflation-Adjusted Dollars)
    'B19013_001M': 'HH_MED_INC_MOE',  # Margin of Error for Median Household Income
    'B19001_001E': 'HH_TOTAL',  # Total Households
    'B19001_002E': 'HH_LT_10K',  # Less than $10,000
    'B19001_003E': 'HH_10_15K',  # $10,000 to $14,999
    'B19001_004E': 'HH_15_20K',  # $15,000 to $19,999
    'B19001_005E': 'HH_20_25K',  # $20,000 to $24,999
    'B19001_006E': 'HH_25_30K',  # $25,000 to $29,999
    'B19001_007E': 'HH_30_35K',  # $30,000 to $34,999
    'B19001_008E': 'HH_35_40K',  # $35,000 to $39,999
    'B19001_009E': 'HH_40_45K',  # $40,000 to $44,999
    'B19001_010E': 'HH_45_50K',  # $45,000 to $49,999
    'B19001_011E': 'HH_50_60K',  # $50,000 to $59,999
    'B19001_012E': 'HH_60_75K',  # $60,000 to $74,999
    'B19001_013E': 'HH_75_100K',  # $75,000 to $99,999
    'B19001_014E': 'HH_100_125K',  # $100,000 to $124,999
    'B19001_015E': 'HH_125_150K',  # $125,000 to $149,999
    'B19001_016E': 'HH_150_200K',  # $150,000 to $199,999
    'B19001_017E': 'HH_GT_200K'  # $200,000 or more
}

State Level Data

We begin with state-level data for Utah to establish baseline income statistics. While our final VOT parameters may use more granular geographic data in the future, state-level estimates provide a robust foundation with minimal sampling error. The large sample size at the state level ensures that our calculated VOT parameters are statistically reliable.

# Fetch state boundaries from TIGER/Line shapefiles
gdf_ut_bound = states(
  year=2023,
  cache=True
).to_crs(PROJECT_CRS)

# Filter for Utah only
gdf_ut_bound = gdf_ut_bound[gdf_ut_bound['STATEFP'] == str(STATE_FIPS)]

# Fetch Income data from ACS 5-year estimates for Utah
df_ut_income = get_census(
  dataset="acs/acs5",
  year=2023,
  variables=list(acs_variables.keys()),
  params={
    # "key": f"{os.getenv('CENSUS_API_KEY')}", # FIXME: This causes error
    "for": f"state:{STATE_FIPS}"
  },
  return_geoid=True,
  guess_dtypes=True
)

# Join ACS data to the state boundary and rename columns to readable names
gdf_ut_income = gdf_ut_bound[['GEOID', 'STATEFP', 'NAME', 'geometry']].merge(
    df_ut_income, on = "GEOID"
).to_crs(PROJECT_CRS).rename(columns=acs_variables)

# Preview data
gdf_ut_income

	GEOID	STATEFP	NAME	geometry	HH_MED_INC	HH_MED_INC_MOE	HH_TOTAL	HH_LT_10K	HH_10_15K	HH_15_20K	...	HH_35_40K	HH_40_45K	HH_45_50K	HH_50_60K	HH_60_75K	HH_75_100K	HH_100_125K	HH_125_150K	HH_150_200K	HH_GT_200K
0	49	49	Utah	POLYGON ((900313.399 6302435.171, 900580.099 6...	91750	634	1094896	33918	22999	22352	...	33072	32113	35150	69627	104883	159368	134089	102926	124090	137560

1 rows × 23 columns

What we find: Utah’s median household income from the ACS ($91,750, with a margin of error of ±$634) provides the anchor point for all subsequent calculations. The household counts across income brackets show the shape of Utah’s income distribution, which typically differs from national patterns due to the state’s demographic characteristics (larger household sizes, younger population, and specific economic structure).

6 Intermediate Calculations

With raw data in hand, we now perform the intermediate calculations needed to derive VOT parameters. These steps transform income distributions into the specific metrics required by the travel demand model.

Income Groupings (Approximate Income Quartiles)

Travel demand models often segment travelers by income level because income strongly influences mode choice, route selection, and willingness to pay for time savings. Rather than using 16 separate income categories (which would create excessive model complexity), we aggregate households into four income groups approximating quartiles.

The quartile approach aims for roughly 25% of households in each group. Because ACS income brackets don’t align with quartile boundaries, the groups are only approximate: each bracket is assigned whole to the quartile in which its cumulative household share (the share of households at or below the bracket’s upper limit) falls, so group sizes can deviate noticeably from 25%.

Income Group 1 (Low Income): The first quartile of households by income. This group typically has the lowest VOT because they face tighter budget constraints and may be more willing to spend time to save money.

Income Groups 2-4 (Higher Income): The upper three quartiles, which we sometimes aggregate as “high income” in contrast to the lowest quartile. These households generally have higher VOT because their opportunity cost of time is greater.

# Create a copy of lookup table to work with
df_inc_group = lookup_hhinc.copy()

# Get the income category columns from gdf_ut_income
# Extract just the income bracket counts (excluding totals and medians)
income_cols = [col for col in gdf_ut_income.columns if col.startswith('HH_')
               and col not in ['HH_TOTAL', 'HH_MED_INC', 'HH_MED_INC_MOE']]

# Create a mapping between lookup categories and gdf_ut_income columns
# They should already match, but let's be explicit
df_inc_group['State HH'] = df_inc_group['Income Category'].map(
    gdf_ut_income[income_cols].iloc[0].to_dict()
)

# Calculate percentage of households
total_hh = df_inc_group['State HH'].sum()
df_inc_group['% HH'] = (df_inc_group['State HH'] / total_hh * 100).round(1)

# Calculate cumulative percentage
df_inc_group['Cum % HH'] = df_inc_group['% HH'].cumsum().round(1)

# Assign income groups based on quartiles (25%, 50%, 75%, 100%)
df_inc_group['Inc Group'] = pd.cut(
    df_inc_group['Cum % HH'],
    bins=[0, 25, 50, 75, 100],
    labels=['Inc Group 1', 'Inc Group 2', 'Inc Group 3', 'Inc Group 4'],
    include_lowest=True
)

# Calculate HH_MedInc_Product (HH * Midpoint)
df_inc_group['HH_MedInc_Product'] = df_inc_group['State HH'] * df_inc_group['Midpoint']

# Display the dataframe
df_inc_group

	Income Category	Lower Limit	Upper Limit	Midpoint	State HH	% HH	Cum % HH	Inc Group	HH_MedInc_Product
0	HH_LT_10K	0	9999.0	5000.0	33918	3.1	3.1	Inc Group 1	1.695900e+08
1	HH_10_15K	10000	14999.0	12500.0	22999	2.1	5.2	Inc Group 1	2.874875e+08
2	HH_15_20K	15000	19999.0	17500.0	22352	2.0	7.2	Inc Group 1	3.911600e+08
3	HH_20_25K	20000	24999.0	22500.0	24376	2.2	9.4	Inc Group 1	5.484600e+08
4	HH_25_30K	25000	29999.0	27500.0	28040	2.6	12.0	Inc Group 1	7.711000e+08
5	HH_30_35K	30000	34999.0	32500.0	30333	2.8	14.8	Inc Group 1	9.858225e+08
6	HH_35_40K	35000	39999.0	37500.0	33072	3.0	17.8	Inc Group 1	1.240200e+09
7	HH_40_45K	40000	44999.0	42500.0	32113	2.9	20.7	Inc Group 1	1.364802e+09
8	HH_45_50K	45000	49999.0	47500.0	35150	3.2	23.9	Inc Group 1	1.669625e+09
9	HH_50_60K	50000	59999.0	55000.0	69627	6.4	30.3	Inc Group 2	3.829485e+09
10	HH_60_75K	60000	74999.0	67500.0	104883	9.6	39.9	Inc Group 2	7.079602e+09
11	HH_75_100K	75000	99999.0	87500.0	159368	14.6	54.5	Inc Group 3	1.394470e+10
12	HH_100_125K	100000	124999.0	112500.0	134089	12.2	66.7	Inc Group 3	1.508501e+10
13	HH_125_150K	125000	149999.0	137500.0	102926	9.4	76.1	Inc Group 4	1.415232e+10
14	HH_150_200K	150000	199999.0	175000.0	124090	11.3	87.4	Inc Group 4	2.171575e+10
15	HH_GT_200K	200000	inf	300000.0	137560	12.6	100.0	Inc Group 4	4.126800e+10
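
To see how closely the four groups approximate true quartiles, the plain-Python sketch below repeats the cumulative-share assignment on the household counts from the table above and reports each group's share of households. It uses unrounded cumulative percentages, whereas the notebook rounds to one decimal first; for these data the assignments are the same. The variable names are hypothetical.

```python
from bisect import bisect_left

# Utah household counts per bracket, copied from the table above
households = [33918, 22999, 22352, 24376, 28040, 30333, 33072, 32113,
              35150, 69627, 104883, 159368, 134089, 102926, 124090, 137560]

QUARTILE_EDGES = [25, 50, 75]  # upper edges of groups 1-3, in percent

total = sum(households)
group_households = [0, 0, 0, 0]
cumulative = 0
for count in households:
    cumulative += count
    cum_pct = cumulative / total * 100
    # Right-inclusive bins, as in the pd.cut call: (0, 25], (25, 50], ...
    group = bisect_left(QUARTILE_EDGES, cum_pct)
    group_households[group] += count

group_share_pct = [round(n / total * 100, 1) for n in group_households]
for number, share in enumerate(group_share_pct, start=1):
    print(f"Inc Group {number}: {share}% of households")
```

The groups hold about 24%, 16%, 27%, and 33% of households, which shows how wide brackets near a quartile boundary shift households into the next group.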

# Prepare data for seaborn
df_plot = df_inc_group.copy()
df_plot['Income Label'] = df_plot['Income Category'].str.replace('HH_', '').str.replace('_', ' - ')
df_plot['index'] = range(len(df_plot))

# Define colors for each income group
palette = {'Inc Group 1': '#3498db', 'Inc Group 2': '#2ecc71',
           'Inc Group 3': '#f39c12', 'Inc Group 4': '#e74c3c'}

# Set seaborn style and context
sns.set_style("whitegrid")
sns.set_context("notebook")

# Create barplot using seaborn
# plt.figure(figsize=(12, 6))
sns.barplot(
    data=df_plot,
    x='index',
    y='State HH',
    hue='Inc Group',
    palette=palette,
    legend=True,
    dodge=False
)

# Customize plot
plt.xlabel('Income Category', fontsize=11, fontweight='bold')
plt.ylabel('Number of Households', fontsize=11, fontweight='bold')
plt.title('Utah Household Income Distribution by Quartile',
          fontsize=13, fontweight='bold', pad=20)

# Format y-axis with comma separator
plt.gca().yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# Set x-axis labels
plt.xticks(range(len(df_plot)), df_plot['Income Label'],
           rotation=45, ha='right', fontsize=9)

# Customize legend
plt.legend(loc='upper left', frameon=True, fontsize=10, title='')

# Grid styling
plt.grid(axis='y', alpha=0.3, linestyle='--')
plt.gca().set_axisbelow(True)

# Remove top and right spines for cleaner look
sns.despine()

plt.tight_layout()
plt.show()

Household Income Distribution by Quartile

What we find: The visualization reveals the shape of Utah’s income distribution. Household counts peak in the $75,000 to $125,000 brackets, fewer than a quarter of households fall below $50,000, and the open-ended $200,000+ bracket holds 12.6% of households. Because bracket widths vary from $5,000 to $50,000, bar heights are not directly comparable as densities. The quartile assignments allow us to create a simplified four-group segmentation suitable for model implementation while preserving the essential income-based variation in time values.

Median Income (in Model Base Year Dollars)

To calculate VOT, we need a single representative income value for each income segment. The median is preferable to the mean because it’s less sensitive to extreme values and better represents the “typical” household in each group.

We combine two income measures:

Weighted average of bracket midpoints: Using our lookup table midpoints and household counts. Strictly, this estimates mean income rather than the median
ACS-reported median: The direct estimate from Table B19013

Because household income is right-skewed, the weighted average sits well above the ACS median (about $113,700 versus $91,750 statewide). We calculate a correction factor, the ratio of the ACS median to the statewide weighted average (about 0.81), and apply it to each income group’s weighted average. This anchors the all-household value to the official ACS median, under the assumption that the same ratio holds within each income group.

# Define income categories
categories = {
    'Average': df_inc_group,
    'Low Inc': df_inc_group[df_inc_group['Inc Group'] == 'Inc Group 1'],
    'High Inc': df_inc_group[df_inc_group['Inc Group'] != 'Inc Group 1']
}

# Calculate metrics for each category
summary_data = {}
for cat_name, cat_df in categories.items():
    summary_data[cat_name] = {
        'Sum HH': cat_df['State HH'].sum(),
        'Sum HH * Inc': cat_df['HH_MedInc_Product'].sum(),
    }
    # Calculate unadjusted median income
    summary_data[cat_name]['Unadj Med Inc'] = (
        summary_data[cat_name]['Sum HH * Inc'] / summary_data[cat_name]['Sum HH']
    )

# Calculate correction factor from Average category
actual_median_income = gdf_ut_income['HH_MED_INC'].iloc[0]
correction_factor = actual_median_income / summary_data['Average']['Unadj Med Inc']
inflation_factor = 1.0

# Apply correction and inflation factors
for cat_name in summary_data:
    summary_data[cat_name]['Correction Factor'] = correction_factor
    summary_data[cat_name]['Adj Med Income'] = (
        summary_data[cat_name]['Unadj Med Inc'] * correction_factor
    )
    summary_data[cat_name]['Inflation Adj Factor'] = inflation_factor
    summary_data[cat_name]['Median Income'] = (
        summary_data[cat_name]['Adj Med Income'] * inflation_factor
    )

# Convert to DataFrame
df_summary = pd.DataFrame(summary_data).T

# Format for display
format_specs = {
    'Sum HH': '{:,.0f}',
    'Sum HH * Inc': '{:,.0f}',
    'Unadj Med Inc': '${:,.0f}',
    'Correction Factor': '{:.4f}',
    'Adj Med Income': '${:,.0f}',
    'Inflation Adj Factor': '{:.4f}',
    'Median Income': '${:,.0f}'
}

df_median_income = df_summary.copy()
for col, fmt in format_specs.items():
    df_median_income[col] = df_summary[col].apply(lambda x: fmt.format(x))

df_median_income

	Sum HH	Sum HH * Inc	Unadj Med Inc	Correction Factor	Adj Med Income	Inflation Adj Factor	Median Income
Average	1,094,896	124,503,122,500	$113,712	0.8069	$91,750	1.0000	$91,750
Low Inc	262,353	7,428,247,500	$28,314	0.8069	$22,845	1.0000	$22,845
High Inc	832,543	117,074,875,000	$140,623	0.8069	$113,463	1.0000	$113,463

What we find:

Average Median Income: The value for all Utah households, which matches the ACS B19013 estimate by construction
Low Income Median: The corrected weighted-average income for Income Group 1 (approximately the bottom quartile), representing lower-wage workers, retirees on fixed incomes, and households with part-time employment
High Income Median: The corrected weighted-average income for Income Groups 2-4 combined, representing professional workers, dual-income households, and higher-wage earners

These three values form the foundation for calculating income-differentiated VOT parameters.
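
As an independent cross-check on the statewide figure, a median can also be estimated directly from the bracket counts by linear interpolation within the bracket that contains the 50th percentile. This is a different technique from the correction-factor approach used in this analysis, and it assumes incomes are spread evenly within each bracket. The function name `interpolated_median` is hypothetical.

```python
# Bracket lower limits and Utah household counts (ACS 2019-2023, Table B19001)
lower_limits = [0, 10000, 15000, 20000, 25000, 30000, 35000, 40000,
                45000, 50000, 60000, 75000, 100000, 125000, 150000, 200000]
households = [33918, 22999, 22352, 24376, 28040, 30333, 33072, 32113,
              35150, 69627, 104883, 159368, 134089, 102926, 124090, 137560]

def interpolated_median(lower_limits, counts):
    """Median from grouped data, assuming an even spread within each bracket."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one household")
    half = total / 2
    cumulative = 0
    for i, count in enumerate(counts):
        if cumulative + count >= half:
            if i + 1 >= len(lower_limits):
                raise ValueError("median falls in the open-ended top bracket")
            width = lower_limits[i + 1] - lower_limits[i]
            return lower_limits[i] + (half - cumulative) / count * width
        cumulative += count

median_estimate = interpolated_median(lower_limits, households)
print(f"Interpolated median: ${median_estimate:,.0f} (ACS B19013: $91,750)")
```

The interpolated value lands within about 1% of the ACS median. Exact agreement is not expected, since the published median is not computed from these bracket counts with this simple rule.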

Value of Time (in Model Base Year Dollars)

The Value of Time converts annual income into an hourly rate, then applies purpose-specific multipliers to reflect how travelers trade off time and money for different trip types. The fundamental assumption is that VOT relates to income—higher earners value time more—but the relationship isn’t one-to-one.

Research from revealed preference studies (toll usage patterns) and stated preference surveys consistently shows that:

People value work trip time at 35-50% of their wage rate (employers care about productivity, commuters care about stress and lost leisure)
People value personal trip time at 25-35% of their wage rate (pure leisure trade-off)
Commercial vehicles have higher VOT reflecting business operating costs beyond driver wages

The percentages we apply come from regional travel behavior studies, calibrated to match observed patterns in toll lane usage, route choice, and mode choice behavior.

Calculate Hourly Income

We convert annual median income to an hourly rate by dividing by 2,080 hours (52 weeks × 40 hours/week). This assumes full-time, year-round employment, which is a standard convention in VOT estimation. While not every household member works full-time, this standardization allows consistent comparison across income groups and aligns with how wages are typically expressed.

# Convert annual median income to hourly rate (assuming 2080 work hours/year)
df_hourly = pd.DataFrame({
    'Median Income': df_summary['Median Income'],
    'Hourly Rate': df_summary['Median Income'] / 2080
}, index=['Average', 'Low Inc', 'High Inc'])

df_hourly.style.format({
    'Median Income': '${:,.0f}',
    'Hourly Rate': '${:.2f}'
})

	Median Income	Hourly Rate
Average	$91,750	$44.11
Low Inc	$22,845	$10.98
High Inc	$113,463	$54.55

What we find: The hourly rates provide an intuitive way to think about the opportunity cost of time. For example, the average household’s hourly rate is $44.11, so spending an hour in traffic has a notional cost of about $44 in lost productive time or leisure.

Calculate VOT in cents per minute

These hardcoded percentages represent the fraction of hourly wage that travelers implicitly value their travel time at, based on observed behavior in previous studies. The percentages differ by trip purpose because:

Work Trips (higher %): Include both the opportunity cost to the traveler AND the employer’s interest in worker productivity. However, it’s typically less than 100% of the wage because commute time is partially compensated through location choice (people choose home/work locations balancing commute time against housing costs and wages).

Personal Trips (lower %): Reflect pure leisure time trade-offs. People are willing to spend more time traveling when they have flexibility and the marginal utility of saved time is lower.

Income Effects: In the calibrated percentages used here, low-income travelers value time at a higher share of their hourly income than high-income travelers for both work trips (62% versus 34%) and personal trips (49% versus 27%). Because their hourly income is much lower, their absolute VOT is still lower than that of high-income travelers.

Important

These percentages have been calibrated in previous model validation efforts to match observed behavior patterns in the region.

# Define VOT as percentage of hourly income for each trip purpose and income group
# Hardcoded VOT percentages from previous calculations
vot_pct = pd.DataFrame({
    'Work': [0.39, 0.62, 0.34],
    'Personal': [0.30, 0.49, 0.27]
}, index=['Average', 'Low Inc', 'High Inc'])

# Calculate VOT in cents per minute (hourly rate * percentage * 100 / 60)
vot_cents_min = (vot_pct.mul(df_hourly['Hourly Rate'], axis=0) * 100 / 60).round(0)

# Values are cents per minute, so format them without a dollar sign
vot_cents_min.style.format({
    'Work': '{:.0f}',
    'Personal': '{:.0f}'
})

	Work	Personal
Average	29	22
Low Inc	11	9
High Inc	31	25

Calculate and Display Work & Personal VOT

We now calculate the final VOT values in cents per minute, the standard unit for travel demand models. The conversion from dollars per hour to cents per minute (multiplying by 100, dividing by 60) makes the values easier to work with in network assignment algorithms where travel times are typically in minutes.
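
The full chain from annual income to cents per minute can be written as a single function. The plain-Python sketch below follows the steps described in this analysis (divide by 2,080 hours, apply the trip-purpose percentage, convert dollars per hour to cents per minute, round). The function name `vot_cents_per_minute` is hypothetical; the inputs are the median incomes and percentages reported above.

```python
WORK_HOURS_PER_YEAR = 2080  # 52 weeks x 40 hours, as assumed in this analysis

def vot_cents_per_minute(annual_income, pct_of_wage):
    """Convert annual income and a share of hourly wage into VOT in cents/minute."""
    hourly_rate = annual_income / WORK_HOURS_PER_YEAR       # dollars per hour
    vot_dollars_per_hour = hourly_rate * pct_of_wage        # valued share of wage
    return round(vot_dollars_per_hour * 100 / 60)           # cents per minute

# Average income, work trips: $91,750 per year at 39% of the hourly rate
print(vot_cents_per_minute(91_750, 0.39))  # 29
```

For the average segment, $91,750 / 2,080 is about $44.11 per hour; 39% of that is about $17.20 per hour, or 28.7 cents per minute, which rounds to 29.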

Work VOT Results: These values represent how much travelers implicitly pay (in toll charges, fuel costs, or inconvenience) to save one minute of travel time on work-related trips. Higher-income travelers have higher work VOT, reflecting both their higher opportunity cost and their greater ability to pay for time savings.

# Display Work VOT results
df_vot_work = pd.DataFrame({
    '% of Income': vot_pct['Work'],
    'Unrounded ($/hr)': df_hourly['Hourly Rate'] * vot_pct['Work'],
    'VOT (¢/min)': vot_cents_min['Work'],
    'Equivalent ($/hr)': vot_cents_min['Work'] * 60 / 100
}, index=['Average', 'Low Inc', 'High Inc'])

df_vot_work.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})

	% of Income	Unrounded ($/hr)	VOT (¢/min)	Equivalent ($/hr)
Average	39%	$17.20	29	$17.40
Low Inc	62%	$6.81	11	$6.60
High Inc	34%	$18.55	31	$18.60

Personal VOT Results: Personal trip VOT is consistently lower than work VOT across all income groups. The ratio of high-income to low-income VOT is similar for both purposes (about 2.8), because the personal-trip percentages are lower than the work-trip percentages by a similar proportion in each income group.

# Display Personal VOT results
df_vot_personal = pd.DataFrame({
    '% of Income': vot_pct['Personal'],
    'Unrounded ($/hr)': df_hourly['Hourly Rate'] * vot_pct['Personal'],
    'VOT (¢/min)': vot_cents_min['Personal'],
    'Equivalent ($/hr)': vot_cents_min['Personal'] * 60 / 100
}, index=['Average', 'Low Inc', 'High Inc'])

df_vot_personal.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})

	% of Income	Unrounded ($/hr)	VOT (¢/min)	Equivalent ($/hr)
Average	30%	$13.23	22	$13.20
Low Inc	49%	$5.38	9	$5.40
High Inc	27%	$14.73	25	$15.00

Model Application: These VOT values feed directly into the mode choice and route choice components of the travel demand model. For example, if a toll road saves 10 minutes compared to a free alternative, a traveler will choose the toll road if the toll cost is less than (10 minutes × VOT). The model predicts what fraction of travelers find the trade-off worthwhile based on their income segment and trip purpose distribution.
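
The toll trade-off described above can be illustrated with a deterministic break-even calculation. This is a simplified sketch, not the model's actual choice formulation, which predicts shares of travelers rather than a yes/no outcome. The function names and the $2.00 toll are hypothetical; the VOT values are the work-trip results from this analysis.

```python
def break_even_toll_cents(minutes_saved, vot_cents_per_min):
    """Largest toll (in cents) worth paying for the given time savings."""
    return minutes_saved * vot_cents_per_min

def prefers_toll_road(toll_cents, minutes_saved, vot_cents_per_min):
    """True if the toll costs less than the value of the time saved."""
    return toll_cents < break_even_toll_cents(minutes_saved, vot_cents_per_min)

# Work-trip VOT in cents per minute, from the results above
work_vot = {"Average": 29, "Low Inc": 11, "High Inc": 31}

toll_cents = 200      # hypothetical $2.00 toll
minutes_saved = 10    # time savings from the example above

for segment, vot in work_vot.items():
    limit = break_even_toll_cents(minutes_saved, vot)
    choice = "toll road" if prefers_toll_road(toll_cents, minutes_saved, vot) else "free route"
    print(f"{segment}: break-even toll ${limit / 100:.2f} -> {choice}")
```

Under these assumptions the average and high-income segments would pay the $2.00 toll (break-even tolls of $2.90 and $3.10), while the low-income segment (break-even $1.10) would not.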

Calculate and Display Truck VOT

Commercial vehicle VOT requires a different framework than passenger vehicles because it reflects business operating costs rather than personal wage rates. Truck VOT includes:

Driver wages and benefits
Vehicle operating costs (fuel, maintenance, depreciation)
Cargo value and time-sensitivity
Business overhead and profit margins

The percentages (applied to the hourly rate of the average-income group) are calibrated to match observed commercial vehicle behavior, particularly toll road usage patterns. Larger trucks have higher VOT because they carry more valuable cargo, have higher operating costs per hour, and typically serve time-sensitive delivery schedules.

Truck Categories:

Light Trucks: Small commercial vehicles, delivery vans, service vehicles
Medium Trucks: Box trucks, small semi-trailers, regional delivery vehicles
Heavy Trucks: Large semi-trailers, long-haul freight, bulk cargo carriers

Important

These percentages have been calibrated in previous model validation efforts to match observed behavior patterns in the region.

Show the code

# Truck VOT percentages, applied to the Average income hourly rate in the next cell
truck_pct = pd.Series([0.65, 0.87, 1.10], index=['Light', 'Medium', 'Heavy'])

Show the code

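# The Average income hourly rate is the base for every truck class
truck_hourly = df_hourly.loc['Average', 'Hourly Rate'] * truck_pct

df_vot_trucks = pd.DataFrame({
    '% of Income': truck_pct,
    'Unrounded ($/hr)': truck_hourly,
    # Convert $/hr to ¢/min and round to whole cents
    'VOT (¢/min)': (truck_hourly * 100 / 60).round(0)
})

# Convert the rounded ¢/min value back to $/hr for comparison
df_vot_trucks['Equivalent ($/hr)'] = df_vot_trucks['VOT (¢/min)'] * 60 / 100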

df_vot_trucks.style.format({
    '% of Income': '{:.0%}',
    'Unrounded ($/hr)': '${:.2f}',
    'VOT (¢/min)': '{:.0f}',
    'Equivalent ($/hr)': '${:.2f}'
})

	% of Income	Unrounded ($/hr)	VOT (¢/min)	Equivalent ($/hr)
Light	65%	$28.67	48	$28.80
Medium	87%	$38.38	64	$38.40
Heavy	110%	$48.52	81	$48.60

What we find: Heavy truck VOT reaches 81¢/min ($48.60/hour), nearly three times the average work-trip VOT for autos (29¢/min), reflecting the high cost of keeping valuable cargo and expensive equipment idle in traffic. This high VOT is consistent with commercial vehicles disproportionately using toll facilities and with freight routing being highly sensitive to congestion and travel time reliability.

7 Export Results

The final step packages our calculated VOT parameters in formats ready for model implementation and documentation.

Create Final VOT Table

This table assembles all VOT parameters into a single reference dataset matching the travel demand model’s naming conventions. Each parameter corresponds to a specific traveler type and trip purpose combination:

Parameter Definitions:

VOT_Auto_Wrk: Standard work trip VOT (average income)
VOT_Auto_Per: Standard personal trip VOT (average income)
VOT_Auto_Ext: External trips (unrounded mean of the average-income work and personal VOT, used for through-trips)
VOT_LT, VOT_MD, VOT_HV: Light, medium, and heavy truck VOT
VOT_Auto_Wrk_Lo, VOT_Auto_Per_Lo: Low-income work and personal trip VOT
VOT_Auto_Wrk_Hi, VOT_Auto_Per_Hi: High-income work and personal trip VOT

The table displays values in both cents per minute (for model input) and dollars per hour (for intuitive interpretation).

Show the code

# Build the final VOT parameters table
df_vot_params = pd.DataFrame({
    'Parameter': [
        'VOT_Auto_Wrk', 'VOT_Auto_Per', 'VOT_Auto_Ext',
        'VOT_LT', 'VOT_MD', 'VOT_HV',
        'VOT_Auto_Wrk_Lo', 'VOT_Auto_Per_Lo',
        'VOT_Auto_Wrk_Hi', 'VOT_Auto_Per_Hi'
    ],
    'VOT (cent/min)': [
        vot_cents_min.loc['Average', 'Work'],
        vot_cents_min.loc['Average', 'Personal'],
        (vot_cents_min.loc['Average', 'Work'] + vot_cents_min.loc['Average', 'Personal']) / 2,
        df_vot_trucks.loc['Light', 'VOT (¢/min)'],
        df_vot_trucks.loc['Medium', 'VOT (¢/min)'],
        df_vot_trucks.loc['Heavy', 'VOT (¢/min)'],
        vot_cents_min.loc['Low Inc', 'Work'],
        vot_cents_min.loc['Low Inc', 'Personal'],
        vot_cents_min.loc['High Inc', 'Work'],
        vot_cents_min.loc['High Inc', 'Personal']
    ]
})

df_vot_params['VOT ($/hr)'] = df_vot_params['VOT (cent/min)'] * 60 / 100

df_vot_params.style.format({
    'VOT (cent/min)': '{:.0f}¢',
    'VOT ($/hr)': '${:.1f}'
})

	Parameter	VOT (cent/min)	VOT ($/hr)
0	VOT_Auto_Wrk	29¢	$17.4
1	VOT_Auto_Per	22¢	$13.2
2	VOT_Auto_Ext	26¢	$15.3
3	VOT_LT	48¢	$28.8
4	VOT_MD	64¢	$38.4
5	VOT_HV	81¢	$48.6
6	VOT_Auto_Wrk_Lo	11¢	$6.6
7	VOT_Auto_Per_Lo	9¢	$5.4
8	VOT_Auto_Wrk_Hi	31¢	$18.6
9	VOT_Auto_Per_Hi	25¢	$15.0
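One detail in the table is worth checking: VOT_Auto_Ext is the only parameter that is not a whole number of cents. The sketch below, using the two average-income values from the table, shows why the row reads 26¢ alongside $15.3 rather than $15.6. Whether the model expects a whole-cent value for this parameter is not stated here, so the rounded variant is shown only for comparison.

```python
work_vot = 29      # VOT_Auto_Wrk, cents/min
personal_vot = 22  # VOT_Auto_Per, cents/min

# The parameter itself is the unrounded mean
ext_vot = (work_vot + personal_vot) / 2

# The '{:.0f}' display format rounds for presentation only
shown_cents = f"{ext_vot:.0f}¢"

# The $/hr column is computed from the unrounded value
shown_dollars = f"${ext_vot * 60 / 100:.1f}"

# For comparison: $/hr if the value were rounded to whole cents first
rounded_dollars = f"${round(ext_vot) * 60 / 100:.1f}"

print(ext_vot, shown_cents, shown_dollars, rounded_dollars)
```

The unformatted DataFrame holds 25.5 for this parameter; the styling affects only how the notebook displays it.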

Export to CSV

We export the final VOT parameters to a CSV file for easy integration into the travel demand model configuration. This file becomes the authoritative source for VOT parameters, documenting the values used in model runs and providing traceability back to the source data and methodology.

Show the code

from pathlib import Path  # harmless if already imported during setup

# Create output directory if it doesn't exist
output_dir = Path("_output")
output_dir.mkdir(parents=True, exist_ok=True)

# Export to CSV
df_vot_params.to_csv(
    output_dir / "value_of_time.csv",
    index=False
)

The exported file can be version-controlled alongside the model, ensuring that any changes to VOT assumptions are tracked and documented. When updating the model to new base years or recalibrating with new income data, this notebook can be re-run to generate updated VOT parameters consistently.
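When the notebook is re-run, a quick sanity check on the exported file can catch a missing or non-positive parameter before it reaches the model. The following is a minimal sketch using only the standard library; the function name check_vot_csv is hypothetical, and the demonstration file is a throwaway that mimics the export layout with values copied from the final table.

```python
import csv
import tempfile
from pathlib import Path

EXPECTED_PARAMETERS = [
    "VOT_Auto_Wrk", "VOT_Auto_Per", "VOT_Auto_Ext",
    "VOT_LT", "VOT_MD", "VOT_HV",
    "VOT_Auto_Wrk_Lo", "VOT_Auto_Per_Lo",
    "VOT_Auto_Wrk_Hi", "VOT_Auto_Per_Hi",
]


def check_vot_csv(path):
    """Return {parameter: cents/min} after basic checks (hypothetical helper)."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    params = {row["Parameter"]: float(row["VOT (cent/min)"]) for row in rows}
    missing = [name for name in EXPECTED_PARAMETERS if name not in params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    if any(value <= 0 for value in params.values()):
        raise ValueError("VOT values must be positive")
    return params


# Demonstration on a temporary file, not the real export
demo_rows = [
    ("VOT_Auto_Wrk", 29.0), ("VOT_Auto_Per", 22.0), ("VOT_Auto_Ext", 25.5),
    ("VOT_LT", 48.0), ("VOT_MD", 64.0), ("VOT_HV", 81.0),
    ("VOT_Auto_Wrk_Lo", 11.0), ("VOT_Auto_Per_Lo", 9.0),
    ("VOT_Auto_Wrk_Hi", 31.0), ("VOT_Auto_Per_Hi", 25.0),
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "value_of_time.csv"
    lines = ["Parameter,VOT (cent/min),VOT ($/hr)"]
    lines += [f"{name},{cents},{cents * 60 / 100}" for name, cents in demo_rows]
    demo_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    params = check_vot_csv(demo_path)

print(len(params), "parameters passed the checks")
```

To check the real export, the same function could be pointed at the value_of_time.csv file written to the _output directory.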

Download the output files:

value_of_time.csv