Internal data is the first place companies look when they think about analytics and insights. However, they shouldn’t overlook the wealth of value that can be gained from mining external and third-party datasets.

Information about your own operations, such as sales transactions and operational performance, can tell you what has happened in the past and help you make educated guesses about what will happen in the future. External data sources can help you understand what your competition is doing and how trends such as consumer behavior patterns, market dynamics, or even the weather can impact your performance. To my mind, an understanding of both is essential today if you intend to take advantage of the transformative opportunities provided by data and analytics.

Artificial intelligence (AI) and machine learning, fueled by data, are quickly becoming a transformative force in many industries and markets. However, not every company has the resources of an Amazon or a Walmart that allow it to generate vast quantities of proprietary, internal data from a customer base of millions. Fortunately, external data can be just as helpful and has the benefit of being readily available to more or less anyone.

During the Covid-19 pandemic, rapidly changing behavior meant that businesses’ existing models for predicting demand or forecasting change became obsolete overnight. Much of their internal data suddenly held little use. In this period, companies often found that external data was key to building new models to predict how people would react to changing circumstances. Data on internet search traffic proved valuable for a range of purposes, from tracking the spread of the virus, to predicting where behavior changes would be most severe, to understanding people’s new priorities in a changing world.

External datasets may be publicly available; for example, many governments make a wide range of information available through portals such as data.gov and data.gov.uk. Alternatively, they may be privately held and made available for free (for example, Google’s basic search and trends data services) or for a fee. Companies such as Nielsen and Experian provide marketing and demographic data from many sources, and niche providers have emerged carrying specialist datasets of value to many different industries.

When one US glass manufacturer sought to diversify its revenue streams, it found that it could predict where window repairs were most likely to be needed by analyzing publicly available crime data. By streamlining its supply chain and preparing mobile repair units, it was able to quickly build a profitable new business unit providing emergency repairs. On an industry-wide scale, finance and bank card companies have long used external data from credit reference agencies to assess the risk of lending to individual customers. And property businesses use public databases of property sales to estimate the value of houses they buy, sell, and lease.

Special mention must also go to the role that external data plays in the transformative power of the “digital twin.” This is a simulated version of a business, an object, or a process that can be used to predict how different variables will affect its performance in the real world. While the “twin” model is generally built using internal data, external data can be used to simulate the “world” that the twin exists in. For instance, Goodyear creates simulated versions of its tires using data from its manufacturing processes. It then uses external data on the structure and condition of road surfaces, along with weather data, to generate realistic environments that can be used to predict the performance of new tire prototypes.

Of course, there’s no such thing as a free lunch, and you will find challenges when it comes to working with external data, even if it’s provided at no cost. One is that, because you have no direct control over how the data is captured, you may find yourself overly reliant on a provider that might vanish or drastically alter the way it operates. If you’ve invested resources in building analytics tools around these services and they suddenly aren’t available anymore, this could be a problem.

On top of that, there can be technical issues. Dealing with multiple datasets from different providers means you have to make sure your data is in a format that can be easily correlated and merged. The most valuable insights often come from combining several entirely separate datasets. A data engineering or cleansing job is usually necessary to get it all into a state where this is possible.
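To give a flavor of what that cleansing job can involve, here is a minimal Python sketch. Everything in it is an assumption made purely for illustration: the internal sales records, the external weather feed, the field names, and the helper functions `normalize_key` and `merge_datasets` are all hypothetical. It shows two sources that describe the same days and regions in different formats being brought onto a common key before they are combined.

```python
from datetime import datetime

# Hypothetical internal sales records: ISO dates, lowercase region names.
internal_sales = [
    {"date": "2021-03-01", "region": "north", "units_sold": 120},
    {"date": "2021-03-01", "region": "south", "units_sold": 95},
    {"date": "2021-03-02", "region": "north", "units_sold": 80},
]

# Hypothetical external weather feed: day/month/year dates, inconsistent
# casing and stray whitespace in the region names.
external_weather = [
    {"Day": "01/03/2021", "Area": "North ", "rain_mm": 12.5},
    {"Day": "01/03/2021", "Area": "SOUTH", "rain_mm": 0.0},
]


def normalize_key(date_text, date_format, region_text):
    """Convert a provider-specific date and region into a shared (date, region) key."""
    day = datetime.strptime(date_text.strip(), date_format).date()
    return (day.isoformat(), region_text.strip().lower())


def merge_datasets(sales, weather):
    """Attach rainfall to each sales row; rain_mm is None when no match exists."""
    weather_by_key = {
        normalize_key(row["Day"], "%d/%m/%Y", row["Area"]): row["rain_mm"]
        for row in weather
    }
    merged = []
    for row in sales:
        key = normalize_key(row["date"], "%Y-%m-%d", row["region"])
        merged.append({
            "date": key[0],
            "region": key[1],
            "units_sold": row["units_sold"],
            "rain_mm": weather_by_key.get(key),
        })
    return merged


merged_rows = merge_datasets(internal_sales, external_weather)
for row in merged_rows:
    print(row)
```

Notice that the third sales row has no matching weather record, so its rainfall value is left empty rather than guessed. In a real project, gaps like this are often exactly where the cleansing effort goes.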

Finally, remember that because you may need numerous data sources, ranging from satellite imagery and meteorological data to anonymized customer data, you may have to build and maintain relationships with several different data suppliers. This brings compliance issues, as you’ll always need to ensure that the information you’re buying has been collected and processed lawfully and ethically. The amount of compliance and regulation around data use is growing by the day. As a data processor, you could face a potentially expensive penalty if your suppliers aren’t all above-board.

If your organization can put plans and strategies in place to manage all of this, then working with external data has the potential to be advantageous. It means your data and analytics strategy is no longer just about “you” but becomes about building an understanding of the environments and ecosystems in which your company operates. It can enable you to streamline and drive efficiencies through your existing business models or even transform them to allow you to create entirely new ones.

Using different data types, including external data, unstructured data, and real-time data, is one of the topics covered in depth in the second edition of my book, Data Strategy: How to Profit From a World of Big Data, Analytics, and Artificial Intelligence.