
Accelerating global development

Ten months ago, we released our first Global Product that was fast enough for everyone to use. This expanded our view of the world’s companies from the UK to twelve countries in Europe and America. 

Since then, we’ve worked hard to bring features from our UK product to our Global Product. 

In January, we added full-text search. In March, we added jobs and skills data, including a unified skills naming system across all countries. In June, we added Dealroom data and delivered big performance improvements to ML List building. 

This month, we’re focusing on the coverage and accuracy of our company-to-domain matching. 

Company Domains 

We can’t classify what a company does without the text from its website. And to get that web text we need to know a company’s domain. This is one of the hardest things we do at The Data City. 

For over seven years, we have developed increasingly powerful methods for finding a company’s domain. Today, we have a domain for 1.5 million British companies. In September 2020, just four years ago, we matched only a little over half as many: 800,000. 

Unfortunately for us, lessons learned in Britain are often not applicable to other countries. Different postcode systems (or no widely used postcode system at all), different legal structures, different naming conventions for companies, and much more mean we can’t find company domains in France, Italy, Germany, Ireland or the USA in the same way as we do in Britain.  

Today, in France, a country with roughly the same population and a similarly sized economy to Britain, we have a domain name for just 401,798 companies. 

After a focused two weeks of work, we are increasing this to 853,320, more than double the previous figure (about 2.1x). We now have a domain name for about 58% of the 1.47 million French companies with at least one employee.  
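As a quick check, the ratios can be recomputed from the company counts quoted above. This is a minimal Python sketch: the variable names are illustrative, and the denominator uses the rounded "1.47 million" figure, so the percentages are approximate.

```python
# Company counts quoted in the post.
domains_before = 401_798
domains_after = 853_320
# Rounded figure from the post ("1.47 million"), so results are approximate.
companies_with_employees = 1_470_000

# How many times more French companies now have a domain.
increase = domains_after / domains_before

# Share of French companies with at least one employee that have a domain.
coverage_before = domains_before / companies_with_employees
coverage_after = domains_after / companies_with_employees

print(f"Increase: {increase:.2f}x")                 # Increase: 2.12x
print(f"Coverage before: {coverage_before:.1%}")    # Coverage before: 27.3%
print(f"Coverage after: {coverage_after:.1%}")      # Coverage after: 58.0%
```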

Even more exciting are the improvements we’ve made for the largest companies. This is important since together these represent the majority of the French economy. 

New methods mean that we now have a domain for two thirds of the nearly 30,000 French companies with over 100 employees, and we estimate that those domains are correct almost every time. We’ve tripled the number of big French companies for which we have a correct domain. We’ve also cut our error rate tenfold. We estimate the error rate by manually checking the domains of a random, representative sample of 644 companies. 
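To give a sense of how precise an estimate from a manually checked sample of 644 companies can be, here is a minimal Python sketch that puts a 95% Wilson score interval around a sample accuracy. The post does not say how many of the 644 checked domains were correct, nor which interval method (if any) was used, so the count of 625 below is a purely hypothetical value chosen to land near the 97% accuracy quoted in this post. The function name `wilson_interval` is also our own.

```python
import math

def wilson_interval(correct: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

sample_size = 644        # sample size stated in the post
correct_in_sample = 625  # HYPOTHETICAL: not stated in the post

accuracy = correct_in_sample / sample_size
low, high = wilson_interval(correct_in_sample, sample_size)
print(f"Sample accuracy: {accuracy:.1%}")
print(f"95% interval: {low:.1%} to {high:.1%}")
```

Under that assumed count, the interval spans a few percentage points on either side of the sample accuracy. A sample of this size can support a claim like "about 97%", but not a claim accurate to a tenth of a percentage point.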

For smaller companies, the news is good too. We have a correct domain name for twice as many French companies, and we do even better for the companies in innovative sectors that are more likely to be growing quickly. 

What next? 

This is a huge improvement in the coverage and accuracy of our French company-to-domain matches. With 64% coverage and an estimated 97% accuracy for companies with over 100 employees, we now cover the companies that make up the majority of the French economy well. 

But this is just the start. 

We are already working to increase the coverage and accuracy of our company domains in France further. We’re getting very promising results. 

While that happens, we’re looking at where to go next. If you’ve been using our Global Product, or if you’re thinking about using it, please let us know which countries or regions you’d most like us to give the French treatment to. 

We can’t guarantee that the improvements we’ve achieved for France will be possible everywhere. But we’re pretty confident. 

Interested in discovering how our Global Platform can help you understand what companies do? Visit our Global Product page and sign up for a free trial today.

You can also sign up for our upcoming global webinar where we’ll be showcasing the platform for the first time. 

About the authors