Alternative Data: the value of company data in credit risk assessment

Insights 26 March 2019

Data quality

One of the main problems with credit risk analysis is the lack of data for the creditworthiness assessment.

This may sound curious in a world overwhelmed by data; in fact, the issue is data quality rather than quantity.

Most of the assessment models are based on companies’ public balance sheet data, forgetting the main source of information: the company itself.

Companies’ internal data (from clients’ payment behavior to the information included in private documents) are a unique, up-to-date source of information: they make it possible to forecast scenarios not yet reflected in market trends and to identify risk factors in advance.

Yet, in order to exploit them, you need a proper risk analysis model, able to integrate data from different sources and to provide reliable outcomes.

Dataset

The acquisition of a proper dataset is the first step in the implementation of a credit risk assessment model.

When choosing data sources, the following considerations should be taken into account:

  • the model should be able to distinguish between healthy and defaulted companies: the dataset should then include data on companies no longer in business;
  • the model should provide the risk profile of any company, without geographical limits and regardless of the business sector;
  • the model should have excellent forecasting capabilities and be able to identify risky companies in advance.

Because they provide a huge amount of public data, data providers are the first source of Big Data and make it possible to set up a good starting dataset. Yet does a model built on Big Data alone meet the requirements listed above? Unfortunately not.

As wide-ranging as these databases may be, they can’t cover all countries in the same way. For instance, in Italy partnerships are not required to disclose their financial statements, while in other countries, including the Netherlands, companies are required to publish only some specific data. Due to this information gap, it is not possible to implement a credit risk assessment model with a global approach based on Big Data only.

Alternative Data

That’s why it is so important to include Alternative Data in the model.

Alternative Data can be defined as the data provided by non-traditional sources. They can be classified as follows:

  • data produced by individuals’ online activities, like social media insights and search engines analytics;
  • data generated by sensors, like satellite and geolocation data;
  • data generated through business process, like transaction data or data provided by companies’ internal software and databases.

Alternative Data tend to be unstructured, i.e. they come in a non-numerical format (like images and text).

In order to exploit them, it is therefore necessary to:

  1. develop a model capable of integrating both structured and unstructured data;
  2. develop an analysis framework able to integrate data both from traditional sources and from “alternative” ones.
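
As a rough illustration of what integrating structured and unstructured data can mean in practice, the Python sketch below concatenates a few numeric ratios with simple term counts extracted from a free-text note. The term list, the function names and the sample values are hypothetical and chosen only for this example; a real system would rely on far richer text processing.

```python
import re
from collections import Counter

# Hypothetical list of risk-related terms (an assumption for illustration only).
RISK_TERMS = ["late", "overdue", "dispute", "default"]


def text_features(document: str) -> list[float]:
    """Count how often each risk term appears in a free-text document."""
    tokens = re.findall(r"[a-z]+", document.lower())
    counts = Counter(tokens)
    return [float(counts[term]) for term in RISK_TERMS]


def combine_features(ratios: dict[str, float], document: str) -> list[float]:
    """Concatenate structured ratios (sorted by name) with the text features."""
    structured = [ratios[name] for name in sorted(ratios)]
    return structured + text_features(document)


# Made-up sample values: two financial ratios plus an internal note.
vector = combine_features(
    {"liquidity": 1.4, "solvency": 0.6},
    "Client paid late twice; one invoice is overdue and under dispute.",
)
print(vector)  # [1.4, 0.6, 1.0, 1.0, 1.0, 0.0]
```

The resulting vector places both kinds of information in one numeric representation that a single model could consume.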

The assessment model

Since Alternative Data are typically unstructured, they can’t be analyzed with standard analysis tools, but only with Artificial Intelligence and deep-learning neural networks.

This is the approach used by the creditworthiness assessment model MORE (Multi Objective Rating Evaluation), which processes input data, both structured and unstructured, through non-linear networks and integrates them into multivariate models based on Game Theory.

Game Theory is a mathematical theory that investigates how subjects (players) interact with each other in a strategic environment (game) to maximize their payoff.

MORE analyzes all the economic and financial areas of a company (solvency, liquidity, profitability, etc.) and evaluates the best strategic interaction among the data in order to achieve the Pareto optimum: a state of efficiency in which no alternative state would make any player (here, an economic and financial area) better off without making another worse off.

The top score is assigned to the companies whose different areas are best balanced and achieve the Pareto optimum.
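
The notion of Pareto optimality can be made concrete with a small Python sketch. It is not the MORE algorithm, whose internals are not described here. It only shows the textbook dominance check, using hypothetical candidate states with made-up scores per area, where higher is assumed to be better.

```python
def dominates(a: dict[str, float], b: dict[str, float]) -> bool:
    """True if a is at least as good as b in every area and strictly better in one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(candidates: dict[str, dict[str, float]]) -> list[str]:
    """Return the names of the candidates that no other candidate dominates."""
    return sorted(
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


# Hypothetical states with made-up area scores (higher is better).
states = {
    "A": {"solvency": 0.8, "liquidity": 0.6, "profitability": 0.7},
    "B": {"solvency": 0.7, "liquidity": 0.6, "profitability": 0.7},
    "C": {"solvency": 0.5, "liquidity": 0.9, "profitability": 0.6},
}
print(pareto_front(states))  # ['A', 'C']
```

State B is excluded because A matches or beats it in every area. A and C both remain, because improving one area by moving between them would worsen another.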

oplon Risk Platform

As mentioned above, a valid analysis model is just the first step: to include Alternative Data you also need a proper analysis framework able to collect data from different sources.

In order to meet these requirements, we developed oplon Risk Platform, a risk management solution based on MORE that provides the analysis tools used by Credit Rating Agencies to assess companies’ creditworthiness (credit score, probability of default, sensitivity and cash flow analysis, etc.). Thanks to the integration with MORE, oplon includes in the evaluation process both public data supplied by data providers and companies’ private and internal data, whether structured or unstructured.

The assessment process is organized in analysis steps: first you choose the analysis to be performed, then oplon automatically calculates the company’s MORE Score from the financial statement, which can be downloaded from a data provider or uploaded manually.

In the subsequent analysis steps, companies’ Alternative Data can be included in the following ways:

  • by filling in the qualitative questionnaires: some analysis steps include qualitative questionnaires aimed at integrating companies’ internal information, like clients’ payment behavior;
  • by uploading internal files and documents: thanks to Natural Language Processing algorithms, oplon can automatically read and integrate text files or images;
  • by implementing custom models: oplon makes it possible to integrate custom models within the platform, developed independently or upon request. The models can be customized according to the users’ needs; for example, it is possible to request customized pricing models or include data provided by the sensors of 4.0 manufacturing technologies.

All data are integrated into the MORE algorithm, which identifies potential risk factors and automatically updates the company's credit assessment.