Private sector banks with a strong customer base in the retail, corporate and NR segments need to assess their customers' income levels, which helps them target specific customers and create customer-centric campaigns.
With the help of real estate portals, we can extract the property information required to assess real estate prices and thus predict the income levels of the people living in those properties.


1. Extract data points such as outright prices, real estate details, area/sub-area etc. from portals such as Magicbricks, Square Yards and PropTiger across cities.

2. Preprocessing and cleaning of the unstructured data
I. Clean the data using NLP, prebuilt libraries and statistical measures: extract missing titles from property descriptions, and impute missing values and outliers in floor number and total floor count.
II. Standardize the unstructured columns such as price, property type etc.
III. Generate amenities from existing columns
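
As a rough illustration of the standardization and imputation steps above, here is a minimal sketch using only the Python standard library. The unit spellings, the function names `parse_price` and `impute_floor_numbers`, and the choice of median imputation are assumptions made for this example, not details taken from the project.

```python
import re
import statistics

# Multipliers for common Indian price units (the accepted spellings are an assumption).
UNIT_MULTIPLIERS = {"cr": 1e7, "crore": 1e7, "lac": 1e5, "lakh": 1e5, "k": 1e3}
PRICE_PATTERN = re.compile(r"([\d,]*\.?\d+)\s*([a-z]*)")


def parse_price(raw):
    """Convert free text such as 'Rs. 1.25 Cr' to a rupee amount, or None if no number is found."""
    if not raw:
        return None
    match = PRICE_PATTERN.search(raw.lower())
    if match is None:
        return None
    amount = float(match.group(1).replace(",", ""))
    unit = match.group(2).rstrip("s")  # 'lacs' -> 'lac', 'crores' -> 'crore'
    # Unknown or missing units are treated as plain rupees.
    return amount * UNIT_MULTIPLIERS.get(unit, 1.0)


def impute_floor_numbers(floors, total_floors):
    """Replace missing or implausible floor numbers with the median of the valid ones."""

    def is_valid(floor, total):
        return floor is not None and floor >= 0 and (total is None or floor <= total)

    valid = [f for f, t in zip(floors, total_floors) if is_valid(f, t)]
    fallback = statistics.median(valid)  # raises StatisticsError if nothing is valid
    cleaned = []
    for floor, total in zip(floors, total_floors):
        if is_valid(floor, total):
            cleaned.append(floor)
        else:
            # Never impute a floor above the building height when it is known.
            cleaned.append(fallback if total is None else min(fallback, total))
    return cleaned


print(parse_price("Rs. 1.25 Cr"))
print(impute_floor_numbers([2, None, 40, 5], [10, 12, 8, 5]))
```

A floor number higher than the building's total floor count is treated here as an outlier; other outlier rules, such as percentile cut-offs, would fit the same structure.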
3. Create an unsupervised ML model to cluster the respective areas
I. Build a K-means clustering model on the cleaned and preprocessed data to segment the properties on the basis of location, price, amenities, area, furnishing status, project status etc.

II. Apply the ML model to cluster the properties into segments such as Elite, Super Affluent, Affluent, Mass Affluent, Afforders, Aspirers and Simplers.
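
The clustering step can be sketched as follows. This is a dependency-free toy version of K-means, not the project's model; in practice a library implementation such as scikit-learn's `KMeans` would be the more usual choice. The toy feature values, the function names, and the idea of naming clusters by ranking them on the first feature (assumed here to be price, highest first) are all assumptions made for illustration.

```python
import math
import random

# Segment names from the text, assumed to run from most to least affluent.
SEGMENTS = ["Elite", "Super Affluent", "Affluent", "Mass Affluent",
            "Afforders", "Aspirers", "Simplers"]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm on tuples of already-scaled numeric features."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(point, centroids) for point in points]
    return centroids, labels


def name_clusters(centroids, names):
    """Map cluster index to a name, ranking clusters by the first feature, highest first."""
    ranked = sorted(range(len(centroids)), key=lambda i: centroids[i][0], reverse=True)
    return {cluster: names[rank] for rank, cluster in enumerate(ranked)}


# Toy (price, area) features on a made-up scale: two obvious groups.
points = [(1.0, 2.0), (1.2, 2.4), (0.9, 1.8), (10.0, 20.0), (10.2, 20.4), (9.9, 19.8)]
centroids, labels = kmeans(points, k=2)
names = name_clusters(centroids, SEGMENTS[:2])
print([names[label] for label in labels])
```

With real data, categorical columns such as furnishing status would first need to be encoded numerically and all features scaled, since K-means relies on Euclidean distance.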

4. Match the input customer address with the master database to map it to the segment in which the customer falls, using a combination of fuzzy matching, regex, Google API and NLP algorithms.
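
The address-matching step might look roughly like the sketch below, which covers only the regex and fuzzy-matching parts using Python's standard `re` and `difflib` modules. The geocoding and NLP components are omitted. The master entries are made-up placeholder addresses, and the names `normalize` and `match_segment` and the 0.6 similarity cutoff are assumptions for this example.

```python
import difflib
import re

# Hypothetical master database: normalized address -> segment from the clustering step.
# The addresses below are invented placeholders, not real localities.
MASTER = {
    "lakeview residency sector 12 northcity": "Elite",
    "hilltop enclave phase 2 southtown": "Affluent",
    "riverside colony block c eastville": "Aspirers",
}


def normalize(address):
    """Lowercase, drop 6-digit PIN codes and punctuation, and collapse whitespace."""
    text = address.lower()
    text = re.sub(r"\b\d{6}\b", " ", text)
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def match_segment(address, master, cutoff=0.6):
    """Return the segment of the closest master address, or None if nothing is similar enough."""
    cleaned = normalize(address)
    hits = difflib.get_close_matches(cleaned, list(master), n=1, cutoff=cutoff)
    return master[hits[0]] if hits else None


# A misspelled, differently punctuated input still maps to the intended entry.
print(match_segment("Lakeveiw Residency, Sector-12, Northcity 400001", MASTER))
print(match_segment("xyz", MASTER))
```

Returning `None` below the cutoff keeps weak matches out of the segment mapping, so those addresses can be routed to a fallback such as a geocoding lookup.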


The output provides detailed analytics for each portal with respect to city, area, prices, property features etc.

