Modeling Interest Rate
Abstract
The goal of this research is to develop a model that accurately predicts the interest rate on loans from the information provided by clients. We examined the 66 variables associated with a client, such as the debt-to-income ratio, the employment length, and the term of the loan, to see how these predictors correlate with the interest rate. In the first phase, we collected financial data from LendingClub, an American peer-to-peer lending company, and removed uncorrelated predictors and missing values from the data set. In the second phase, we divided the data into training and test sets. The training set contains the financial data of 318,257 people, and the test set contains the financial data of 136,396 people. We analyzed the data by applying different statistical methods, such as Linear Regression, Shrinkage Methods, Dimension Reduction Methods, and Tree-Based Methods, to study the association between the interest rate and the remaining predictors. All model computations were done in the R statistical software. We evaluated the performance of these models by comparing the predicted interest rate with the actual interest rate on the test data. In the last phase, we picked the model with the most accurate predictions. Although the predictors chosen by the backward selection method and the forward selection method differ slightly, we found that three predictors are most critical in predicting the interest rate: the term of the loan, the last FICO score, and the initial listing status of the loan (recorded as a whole or fractional loan).
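The split-fit-evaluate workflow described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R, uses only the standard library, and runs on synthetic data, so it is not the authors' code or data. The 70/30 split fraction is inferred from the reported sample sizes (318,257 of 454,653 records is about 70%). The single predictor `fico`, the data-generating formula, the use of RMSE as the error measure, and all function names are hypothetical choices made for illustration.

```python
import random

def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def fit_simple_linear(train):
    """Ordinary least squares for rate = intercept + slope * x (one predictor)."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def rmse_on_test(predict, test):
    """Root mean squared difference between predicted and actual rates."""
    squared_errors = [(y - predict(x)) ** 2 for x, y in test]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5

# Synthetic stand-in data (assumption): the rate falls as the FICO score rises.
rng = random.Random(42)
rows = []
for _ in range(1000):
    fico = rng.uniform(600, 850)
    rate = 30.0 - 0.025 * fico + rng.gauss(0.0, 1.0)
    rows.append((fico, rate))

train, test = split_train_test(rows, train_fraction=0.7, seed=0)

# Candidate model 1: simple linear regression on the single predictor.
intercept, slope = fit_simple_linear(train)
rmse_linear = rmse_on_test(lambda x: intercept + slope * x, test)

# Candidate model 2: a baseline that always predicts the training mean rate.
mean_rate = sum(y for _, y in train) / len(train)
rmse_baseline = rmse_on_test(lambda x: mean_rate, test)

# Pick the model with the smaller error on the held-out test data.
best = "linear" if rmse_linear < rmse_baseline else "baseline"
print(f"linear RMSE={rmse_linear:.3f}, baseline RMSE={rmse_baseline:.3f}, best={best}")
```

The same comparison extends to any number of candidate models: each is fitted on the training set only, scored on the same test set, and the one with the lowest test error is kept.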