Working out loud

Estimating Company Group Structure

Bijen Shah, a data science and analytics student at the University of Westminster, spent some time at The Data City working alongside Jack Lewis to identify the parents (and ultimate parent) of company chains and group structures. The following article documents his experience working on this problem.

During my career break studying for an MSc in Data Science and Analytics at the University of Westminster, I wanted to do an industry project as part of my master’s dissertation. I am grateful to the University for helping me find one with the support of the CDRC Masters Dissertation Scheme. After a successful interview with The Data City, I worked on a project with Jack Lewis to identify the parents (and ultimate parent) of company chains and group structures.

You can read my dissertation here.

Understanding the problem

Here, users see three companies from the THEOREM SOLUTIONS group. The group is ultimately owned by THEOREM SOLUTIONS HOLDINGS LIMITED (13198079). The other two companies, THEOREM SOLUTIONS LIMITED (02384114) and THEOREM SOLUTIONS GROUP LIMITED (07098741), are its child companies. Users want to see this relationship between the companies, possibly by collapsing the child companies under the Ultimate Parent Company.

The solution

As part of my dissertation project, “Identify Multi-level Hierarchies and Tree Structure of Group Chain Companies using Graph data and Relational Databases”, I created a web app that lets users apply its findings. The web app gives users access to the new dataset before it is fully integrated into The Data City product. It takes a data download from the platform and produces a new Excel file that makes use of the new dataset. Please get in touch if you would like to explore the web app as a partial solution to the problem of multiple entities of the same company appearing in a data download.

The project experience is a valuable addition to my growing career in the data field. I gained valuable insights from working on a project in a scale-up environment, and I learnt about new technologies and data products from the team's knowledge-sharing sessions, which will help with future opportunities.

The Data City’s team works with agile methodologies and was always flexible and supportive in providing resources, knowledge, and regular meetings. I was excited to meet the whole team in person at their Leeds office, to get hands-on feedback on my ways of working, and to learn theirs.


We orchestrated an ETL (Extract, Transform and Load) procedure to find company chains and estimate groups of companies using simple SQL queries, low-code Python Pandas functions, and open data from the Open Ownership Register. The Open Ownership Register is a Beneficial Ownership Data Standard (BODS) database which holds beneficial ownership data from the UK, Denmark, and Slovakia.

The expectation was to use Open Ownership’s data to find the immediate and ultimate parent of each company registered in the UK. This would allow users of The Data City product to better understand a company's financials, growth, and fundamentals, and to approach it for sales, marketing, collaboration, or business deals accordingly. Using simple SQL queries and staging the data in different tables, we were able to identify company chains and the ultimate parent company of each company in a chain.

For example, Disley Holdings (UK) is an Ultimate Parent Company which owns the following chain of companies (see below). To discover this information from a tabular SQL dataset, we had to run recursive queries to find subsidiaries and their subsidiaries at all levels, based on the company numbers available in their beneficiaries’ information. We also found that a company might have multiple parent companies and multiple ultimate parent companies, which was a complicated scenario for our recursive query. We tackled this problem by recursing only in a top-down manner, starting from a list of Ultimate Parent Companies.
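The top-down recursion described above can be sketched with a recursive SQL query. This is a minimal illustration and not the project's actual queries: the `ownership` table, the `find_chains` function, the `max_depth` cap, and the sample company labels are all assumptions made for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical ownership edges (parent, child); the labels are made up.
EDGES = [
    ("U1", "P1"),
    ("U1", "P2"),
    ("P1", "C1"),
    ("P2", "C1"),  # C1 has two parents, both under U1
    ("U2", "C2"),
]


def find_chains(edges, max_depth=50):
    """Return (ultimate_parent, company, level) rows, walking top-down."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ownership (parent TEXT, child TEXT)")
    conn.executemany("INSERT INTO ownership VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE chain(ultimate, company, level) AS (
            -- Seed: assume an ultimate parent never appears as a child.
            SELECT DISTINCT parent, parent, 0
            FROM ownership
            WHERE parent NOT IN (SELECT child FROM ownership)
            UNION
            -- Step: move one level down from every company found so far.
            SELECT chain.ultimate, ownership.child, chain.level + 1
            FROM chain
            JOIN ownership ON ownership.parent = chain.company
            WHERE chain.level < ?  -- depth cap guards against ownership loops
        )
        -- A company reachable by several paths is kept once, at its
        -- shallowest level under each ultimate parent.
        SELECT ultimate, company, MIN(level) AS level
        FROM chain
        GROUP BY ultimate, company
        ORDER BY ultimate, level, company
        """,
        (max_depth,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in find_chains(EDGES):
        print(row)
```

Starting from the ultimate parents means a company with two parents in the same group simply shows up once under that group, while a company owned by two different groups would appear once under each ultimate parent.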

U – Ultimate Parent Companies
P – Parent Companies
C – Company

This data is now stored in a tabular format where each company under an ultimate parent company is listed with its respective relational numbers, as shown below.

The data allows The Data City to answer a key question its users kept asking: “Who owns this list of companies I am looking at, and can The Data City help find their details?” The Data City previously had only limited information on group structure, and mainly struggled with collapsing a whole group structure into one ultimate parent company.

As the project progressed, we were able to add more value by not only finding ultimate parent companies but also estimating company groups whose members were not declared as beneficiaries in their company statements. For example, the Bizibl group consists of three companies: Bizibl Group Limited, Bizibl Technologies Limited and Bizibl Finance Limited. They are all owned by the same two-person entity, and all have the same registered address.
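One way to picture this kind of estimation is to bucket companies that share both the same set of owners and the same registered address. The sketch below is only an assumption about how such a rule could look, not the method used in the project; the `estimate_groups` function, the record fields, and all sample names and addresses are hypothetical.

```python
from collections import defaultdict

# Hypothetical company records; names, owners and addresses are made up.
COMPANIES = [
    {"name": "Alpha Group Limited", "owners": {"Person A", "Person B"},
     "address": "1 Example Street, Leeds"},
    {"name": "Alpha Technologies Limited", "owners": {"Person B", "Person A"},
     "address": "1 Example Street, Leeds "},
    {"name": "Alpha Finance Limited", "owners": {"Person A", "Person B"},
     "address": "1 EXAMPLE STREET, LEEDS"},
    {"name": "Beta Limited", "owners": {"Person C"},
     "address": "1 Example Street, Leeds"},
]


def estimate_groups(companies):
    """Group companies that share the same owners and registered address."""
    buckets = defaultdict(list)
    for company in companies:
        # frozenset ignores owner order; the address is lightly normalised.
        key = (frozenset(company["owners"]), company["address"].strip().lower())
        buckets[key].append(company["name"])
    # Only buckets with more than one company count as an estimated group.
    return sorted(sorted(names) for names in buckets.values() if len(names) > 1)


if __name__ == "__main__":
    for group in estimate_groups(COMPANIES):
        print(group)
```

Requiring both signals to match is a deliberately conservative choice: a shared address alone would also group unrelated companies registered at the same formation agent or accountant.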

This data is now available to The Data City’s users as a web app and will soon be live on the platform, as follows.

I am proud to share that what we learned and achieved in this project has become one of the motivations for the architecture of The Data City's global product. Stay tuned for that product, which will allow users to sell to, market to, invest in, learn about, and do business with companies globally.

I would like to thank The Data City, Jack Lewis, CDRC and the University of Westminster for giving me the opportunity to learn, work, and grow with them. I have gained valuable experience and new skills for future opportunities.

About the author