The importance of data when it comes to lower mid-market business lending


October 1, 2019

If you are an individual or a small or medium-sized business aiming to borrow a few thousand to a few hundred thousand dollars, technology has made it possible for lenders to provide an extremely efficient service and for you to get a near-instantaneous credit decision. This is at the core of alternative and peer-to-peer lending companies such as Lending Club, Kabbage and OnDeck.

For larger or more complex loans, however, technology has made fewer inroads, and lending at the higher end (i.e. loans of $500k to $25m) remains a fundamentally manual and expensive process. A bank choosing between a $100m loan with a 1% arrangement fee, delivering $1m in fees, and a $10m loan with the same 1% fee, delivering $100k in fees, will naturally focus on the bigger deal. As long as the process is manual, this presents a barrier to the underwriting of smaller loans. In the mid-market, where loans are too big for automated decision models but too small for the unit economics of the manual approach to make commercial sense, the market has been characterised by fairly inflexible, product-centric lending which does not necessarily meet borrowers’ needs.
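The unit economics can be made concrete with a little arithmetic. The sketch below reproduces the two fee figures from the text and adds a third, smaller loan. The fixed cost of a manual underwriting process (`ASSUMED_MANUAL_COST`) is a hypothetical figure chosen purely for illustration and does not come from any lender's actual cost base.

```python
# Hypothetical figure, not from the article: assumed fixed cost of one
# manual underwriting process, used only to illustrate the argument.
ASSUMED_MANUAL_COST = 75_000


def arrangement_fee(loan_amount: int, fee_bps: int = 100) -> int:
    """Fee in dollars; 100 basis points is the 1% fee used in the text."""
    return loan_amount * fee_bps // 10_000


for loan in (100_000_000, 10_000_000, 1_000_000):
    fee = arrangement_fee(loan)
    net = fee - ASSUMED_MANUAL_COST
    print(f"loan ${loan:>11,}  fee ${fee:>9,}  net of assumed cost ${net:>9,}")
```

Under this assumption the fee on the smallest loan no longer covers the cost of assessing it, which is the gap a semi-automated process would aim to close.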

While a fully automated decision process will probably never be appropriate for loans in the millions of dollars, artificial intelligence and large-scale data analysis can help bridge the gap between fully automated and fully manual credit assessment, allowing an efficient, semi-automated process while still preserving some of the customisation necessary to address the complex needs of small and medium-sized businesses. This has been the experience of OakNorth Bank, a small and medium-sized business lender in the UK which uses the OakNorth Credit Intelligence Suite. The bank has grown its lending book in the UK from zero at the start of trading in September 2015 to several billion dollars today, with no credit losses and yields that supported a profit of $40m in 2018.

The suite allows banks and lending institutions to significantly improve and accelerate their credit decisioning and monitoring capabilities. The traditional method relies on backward-looking historical data sourced from the borrower, and on scenario analysis based on standard haircuts that are not necessarily linked to industry drivers (Level 1 and 2 analysis). The suite supplements this with technology and massive data sets to model a forward-looking view that is informed by industry benchmarks, macroeconomic drivers, and scenario analysis specific to each business (Level 3 and 4 analysis).

Sentiment analysis, for example, can help to summarise qualitative text data into scores which are more readily included in other analysis. Below, we present an example of a restaurant chain, comparing review scores on a popular rating site with the sentiment of the text in the reviews. The size of the bubble indicates the number of reviews in the sample. Restaurants on the diagonal of this plot are normal: a good score corresponds to positive sentiment, and a poor score to more negative sentiment. The more interesting cases are the restaurants where the score is positive but the sentiment of the text is less so, or vice versa. Our most anomalous example (labelled “R1” below) is a restaurant with few reviews, so it may be that the score has been unduly affected by a small number of negative reviews and that it will revert to the diagonal as more data points are added. Repeating this analysis for multiple review sites, and for a peer set including this chain’s competitors, would give a sense of the external perception of the quality of these restaurants and how they compare with peers. This would be important for understanding the credit case, and it allows data-driven validation of the set of comparable businesses used for analysis.
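To make the score-versus-sentiment comparison concrete, here is a minimal sketch in Python. It is not the method used in the analysis above: the tiny word lists stand in for a proper sentiment model, the review data is invented for illustration, and the names (`text_sentiment`, `score_sentiment_gap`, the 0.4 threshold) are all assumptions. Both measures are rescaled to the range 0 to 1, so a restaurant "on the diagonal" has a gap close to zero.

```python
from statistics import mean

# Hypothetical mini-lexicon; a real analysis would use a trained sentiment model.
POSITIVE = {"great", "delicious", "friendly", "excellent", "fresh"}
NEGATIVE = {"slow", "cold", "rude", "bland", "dirty"}


def text_sentiment(text: str) -> float:
    """Score in [-1, 1]: (positive - negative) / number of matched words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def score_sentiment_gap(reviews: list[tuple[int, str]]) -> dict[str, float]:
    """Summarise one restaurant from (stars 1-5, review text) pairs."""
    stars = mean((s - 1) / 4 for s, _ in reviews)                       # rescaled to [0, 1]
    sentiment = mean((text_sentiment(t) + 1) / 2 for _, t in reviews)  # rescaled to [0, 1]
    return {"n": len(reviews), "stars": stars,
            "sentiment": sentiment, "gap": stars - sentiment}


# Invented toy data, for illustration only.
reviews_by_restaurant = {
    "A": [(5, "Great food, friendly staff!"),
          (4, "Fresh and delicious."),
          (2, "Slow service, cold food.")],
    "B": [(4, "Rude staff and bland food."),
          (5, "Slow, cold, dirty.")],
}

for name, reviews in reviews_by_restaurant.items():
    summary = score_sentiment_gap(reviews)
    # The 0.4 threshold is arbitrary; in practice it would be tuned on real data.
    flag = "off-diagonal" if abs(summary["gap"]) > 0.4 else "on-diagonal"
    print(name, summary["n"], round(summary["gap"], 2), flag)
```

The review count `n` is kept in the summary because, as noted above, a large gap based on only a handful of reviews deserves less weight than one based on many.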


Clustering is another set of techniques, which can be used to create a data-driven segmentation of a given dataset into a set of categories. As such, it allows external validation of assumptions that the analyst might hold a priori. The example below comes from an analysis of a spirits manufacturer: when looking at sales data for many brands in the industry, it was difficult for the analyst to discern any particular trends. However, clustering highlighted two brands which were very different. One was seeing a significant sales decline and the other a corresponding sales growth. This was related to a demographic change in drinking fashions: young people were switching from one brand to the other. This is an interesting example of AI making data more intuitive and understandable: once the analyst saw the clusters, they were immediately able to explain them.
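As a rough illustration of how clustering can surface brands like these, the sketch below groups brands by their average year-on-year sales growth using a plain one-dimensional k-means. The article does not say which algorithm or features were actually used, so k-means, the single growth feature, the choice of three clusters and the sales figures are all assumptions made for the example.

```python
def yoy_growth(sales: list[float]) -> float:
    """Average year-on-year growth rate across a sales series."""
    rates = [(b - a) / a for a, b in zip(sales, sales[1:])]
    return sum(rates) / len(rates)


def kmeans_1d(values: list[float], k: int = 3, iterations: int = 20) -> list[int]:
    """Plain k-means on a single feature (k >= 2); returns one label per value."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres evenly over the sorted values.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels


# Invented annual sales figures, for illustration only.
sales_by_brand = {
    "Brand A": [100, 102, 101, 103],
    "Brand B": [80, 81, 80, 82],
    "Brand C": [120, 119, 121, 122],
    "Brand D": [90, 75, 62, 50],   # steady decline
    "Brand E": [40, 50, 62, 78],   # roughly matching growth
}

growth = [yoy_growth(s) for s in sales_by_brand.values()]
labels = kmeans_1d(growth, k=3)

for brand, g, lab in zip(sales_by_brand, growth, labels):
    print(f"{brand}: growth {g:+.1%}, cluster {lab}")
```

With this toy data the three flat brands land in one cluster while the declining and growing brands each sit in a cluster of their own, which is the kind of pattern that would prompt an analyst to look for an explanation.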

About the author