Improving Collections through ML-Based Collections Analytics


Collections teams in banks and other financial institutions often face the difficult task of deciding on the best steps to take when loan and card repayments are overdue. As a result, pending dues often roll into higher days past due (DPD) buckets. Even when teams have a list of actions to follow or a preferred channel for reaching out (calling, text messaging, WhatsApp, or visiting in person), that choice may not be the most effective or cost-efficient one. There is also the question of when to reach out to customers. The ideal action is one that keeps costs low and leads to a resolution of the loan. Applying machine learning to collection systems can help here by recommending the actions most likely to achieve a resolution at the lowest cost.


To develop a machine learning solution that optimizes cost and achieves resolutions in collection systems, we began by analyzing the data available in the collections system. This included customer demographics, loans taken by customers, past repayment behavior, and the actions that had been taken, along with their outcomes and resolutions. Next, we applied an unsupervised machine learning model to segment the past loan data and assign a label to each loan. This helped us understand and classify the different types of loans based on their characteristics and patterns. Using the labeled loan data, we developed supervised machine learning models to predict the same labels for new loans. These models were trained on past loan data and classified new loans based on their features. To further optimize the actions taken for each loan, we developed another machine learning model based on past actions, their results, and resolutions. This model predicted the initial action to take for each loan based on its labeled characteristics. Finally, we developed real-time machine learning models that considered the previous actions taken, the resolution received, and other relevant data to predict the best next action for that particular loan.
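
The first two steps (segmenting past loans, then labeling new ones) can be illustrated with a minimal sketch. It is not the production pipeline: it uses a plain k-means clustering as a stand-in for the segmentation model and a nearest-centroid rule as a stand-in for the supervised classifiers. The two features, the sample values, and the function names are all hypothetical, and real features would need scaling so that one of them does not dominate the distance.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=20):
    """Plain k-means: returns (centroids, labels) for a list of feature tuples."""
    centroids = [points[i] for i in range(k)]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each loan goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(fmean(dim) for dim in zip(*members))
    return centroids, labels


def predict_segment(loan, centroids):
    """Label a new loan with the segment of the closest centroid."""
    return min(range(len(centroids)), key=lambda c: dist(loan, centroids[c]))


# Hypothetical features per loan: (days past due, share of instalments paid on time)
history = [(5, 0.9), (90, 0.2), (10, 0.8), (120, 0.1), (7, 0.95), (100, 0.15)]

centroids, segments = kmeans(history, k=2)
new_loan_segment = predict_segment((15, 0.7), centroids)

print("segments for past loans:", segments)
print("segment for new loan:", new_loan_segment)
```

In this toy data the two segments separate loans with low days past due from loans with high days past due; which segment counts as curable is a business decision made after inspecting the clusters.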


To improve the efficiency of loan collections, we develop several machine learning models that analyze each loan and predict the best course of action for it. The first step is an unsupervised learning model that segments the raw loan data by curability, classifying each loan as either curable or incurable. The segmented loans are then further labeled by level of risk, ranging from high to very low. Next, we develop supervised learning models on the labeled data to predict the curability and risk of each loan. We train logistic regression, decision tree, and random forest classifiers and evaluate their performance using metrics such as precision, Gini, AUROC, accuracy, sensitivity, and specificity. These metrics tell us how well the models predict curability and risk. After this, we use a classification model to predict the initial action to take for each loan based on its curability and risk level. This model takes into account all the data available for each loan, including customer demographics, loan history, and past actions taken, to suggest the most effective action for the current situation. Finally, we create real-time machine learning models that update their predictions based on the previous actions taken and the resolution received for each loan. This allows us to continuously optimize our approach and arrive at the most efficient and cost-effective solutions for loan collections. For the unsupervised step we use K-nearest neighbors, grouping loans by their similarity to one another. The model provides a better understanding of the relationships between loans and allows us to label them based on their characteristics.
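
To make the evaluation metrics concrete, here is a small sketch of how they can be computed for a binary curable/incurable classifier. The labels, scores, and the 0.5 threshold are made-up illustration values, not results from our models. Gini is derived from AUROC with the usual relation Gini = 2 × AUROC − 1.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity from binary labels (1 = curable)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
    }


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(y_true, scores):
    """Gini coefficient of a scoring model, defined as 2 * AUROC - 1."""
    return 2 * auroc(y_true, scores) - 1


# Hypothetical ground truth and model scores for eight loans.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
gini_score = gini(y_true, scores)

print(metrics)
print("AUROC:", auc, "Gini:", gini_score)
```

Accuracy, precision, sensitivity, and specificity depend on the chosen threshold, while AUROC and Gini summarize how well the scores rank curable loans above incurable ones across all thresholds.
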
In conclusion, the machine learning models we develop provide a comprehensive approach to loan collections: they classify loans by curability and risk, predict the initial action, and update the recommended approach in real time. The final models predict the following:

  1. Curability or non-curability of each loan.
  2. Channel to use for reaching out to customers (calling, text messaging, WhatsApp, or visiting in person).
  3. Day and time to reach out to customers.
  4. Updated action list after each action is performed.

These models can significantly reduce costs and increase resolution rates in the loan collections process.
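
The idea of updating the recommended action after every outcome can be sketched as follows. This is a simplified, frequency-based stand-in for the real-time models rather than the models themselves: it tracks resolution rates per segment and channel and picks the untried channel with the highest expected value net of cost. The class name, the cost figures, the segment name, and the value assigned to a resolution are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-contact costs; real values depend on the institution.
ACTION_COST = {"call": 1.0, "sms": 0.1, "whatsapp": 0.2, "visit": 10.0}


class NextActionRecommender:
    """Tracks resolution rates per (segment, action) and ranks actions by expected net value."""

    def __init__(self, value_of_resolution):
        self.value_of_resolution = value_of_resolution
        self.attempts = defaultdict(int)
        self.resolved = defaultdict(int)

    def record(self, segment, action, was_resolved):
        """Update the counts as soon as an outcome comes back."""
        self.attempts[(segment, action)] += 1
        self.resolved[(segment, action)] += int(was_resolved)

    def resolution_rate(self, segment, action):
        """Smoothed success rate; an untried action starts at 0.5."""
        key = (segment, action)
        return (self.resolved[key] + 1) / (self.attempts[key] + 2)

    def next_action(self, segment, already_tried=()):
        """Best remaining action for this segment, or None if every action was tried."""
        candidates = [a for a in ACTION_COST if a not in already_tried]
        if not candidates:
            return None
        return max(
            candidates,
            key=lambda a: self.resolution_rate(segment, a) * self.value_of_resolution
            - ACTION_COST[a],
        )


rec = NextActionRecommender(value_of_resolution=20.0)

# Hypothetical outcome history for one segment.
for outcome in (True, True, True, False):
    rec.record("curable_low_risk", "sms", outcome)
for outcome in (True, False, False, False):
    rec.record("curable_low_risk", "call", outcome)

first = rec.next_action("curable_low_risk")
second = rec.next_action("curable_low_risk", already_tried=("sms",))
print(first, second)
```

Each call to `record` changes the rates immediately, so the recommendation for the next loan in the same segment already reflects the latest outcome.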

