Research Publication Title

Modeling Interest Rate

Presenter Information

Hanwen Chen

Major

Mathematics

Faculty Mentor

Jebessa Mijena

Keywords

Interest Rate, Linear Regression, Tree-Based Methods

Abstract

The goal of this research is to develop a model that accurately predicts the interest rate on a loan from the information provided by the client. We examined 66 variables associated with a client, such as the debt-to-income ratio, the employment length, and the loan term, to see how these predictors correlate with the interest rate. In the first phase, we collected financial data from LendingClub, an American peer-to-peer lending company, and removed uncorrelated predictors and missing values from the data set. In the second phase, we divided the data into training and test sets. The training set contains the financial data of 318,257 people, and the test set contains the financial data of 136,396 people. We analyzed the data by applying different statistical methods, such as Linear Regression, Shrinkage Methods, Dimension Reduction Methods, and Tree-Based Methods, to study the association between the interest rate and the remaining predictors. All model computations were done in the R statistical software. We evaluated the performance of these models by comparing the predicted interest rate with the actual interest rate on the test data. In the last phase, we picked the model with the most accurate predictions. Although the predictors chosen by the backward selection method and the forward selection method differ slightly, we found that three predictors are the most critical in predicting the interest rate: the term of the loan, the last FICO scores, and the initial listing status of the loan, recorded as a whole or fractional loan.
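The split-fit-evaluate workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the study was carried out in R on LendingClub data, whereas this example uses Python and synthetic, made-up data with a single hypothetical credit-score-like predictor. The 70/30 split is an assumption inferred from the reported sample sizes (318,257 of 454,653 records is about 70%), and root-mean-square error is one possible way to compare predicted and actual rates on the test set; the abstract does not say which error measure was used.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_simple_linear(rows):
    """Ordinary least squares fit of y = intercept + slope * x on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(rows, intercept, slope):
    """Root-mean-square difference between predicted and actual y values."""
    squared = sum((y - (intercept + slope * x)) ** 2 for x, y in rows)
    return math.sqrt(squared / len(rows))


# Synthetic data (an assumption for illustration only, not LendingClub data):
# a score-like predictor and an interest rate that falls as the score rises.
rng = random.Random(42)
data = []
for _ in range(1000):
    score = rng.uniform(600, 850)
    rate = 30.0 - 0.025 * score + rng.gauss(0, 1.0)
    data.append((score, rate))

train, test = train_test_split(data)
intercept, slope = fit_simple_linear(train)  # fit on training data only
test_rmse = rmse(test, intercept, slope)     # evaluate on held-out data
print(f"train={len(train)} test={len(test)} test RMSE={test_rmse:.3f}")
```

Fitting on the training rows and scoring only on the held-out rows is what makes the error comparable across models; the same test-set measure could then be computed for each candidate method before choosing among them.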

