Chapter 1 What is HMDA Data?

The Home Mortgage Disclosure Act (HMDA) was enacted by Congress in 1975. It is implemented through Regulation C, which the Consumer Financial Protection Bureau (CFPB) has administered since 2011. HMDA requires many financial institutions to maintain, report, and publicly disclose loan-level information about mortgages. This information is crucial for understanding and monitoring trends in housing finance, and for ensuring compliance with fair lending laws. 1

HMDA data covers loan applications (including those that were denied or withdrawn), loan originations, and loan purchases. Each record captures several kinds of information:

  • Loan Characteristics: Information about the loan amount, type of loan, and purpose of the loan (e.g., home purchase, refinance).
  • Applicant Information: Demographic details of the loan applicants, including race, ethnicity, sex, and income.
  • Property Information: Data about the location and type of property being financed.
  • Action Taken: The outcome of the loan application, for example whether the loan was originated or the application was denied or withdrawn.

1.1 Why Use HMDA Data?

This book focuses on teaching data analysis in R, and the HMDA dataset serves as a good running example for several reasons:

  1. Real-World Relevance: HMDA data provides a real-world context that makes the learning process more engaging and practical.
  2. Comprehensive Dataset: The dataset includes a wide range of variables, making it suitable for demonstrating various data analysis techniques.
  3. Publicly Available: HMDA data is publicly accessible, allowing you to follow along with the examples and practice on your own.

By the end of this book, you will not only have a solid understanding of data analysis in R but also be equipped with practical skills that can be applied to other datasets and domains.

Let’s get started on this journey of exploring data analysis with R, using the HMDA data as our guide!


  1. If you would like to learn more about HMDA data, please see: https://www.consumerfinance.gov/data-research/hmda/