Working out loud

Quantifying highly influential directors using network centrality analysis

As part of the UCL MSc Business Analytics programme, students are required to collaborate with a company on the analysis and research for their dissertation. Fortunately, my programme director was aware of the CDRC Masters Dissertation Scheme, which is how I came across The Data City’s project. After a successful round of interviews with The Data City, I was assigned to explore director networks using the data available to the company. Alongside the project, I was lucky to be hired as a part-time business analyst, which gave me further experience with the technology The Data City uses. Completing my dissertation project, “Quantifying highly influential directors using network centrality analysis”, was a very satisfying moment for me, as I had been given responsibility for tackling a real-life problem. I also gained valuable soft and technical skills within a start-up company, which has helped prepare me for my future career.

During the project, The Data City was consistently supportive and provided all the resources I needed. I was fortunate enough to take part in some of the business projects, alongside regular team meetings and updates. This was incredibly useful, especially for someone like me who had never set foot in an organisation before. I was met with an amazing and friendly team who were always happy to help me out or just have a chat.

As for the content of my dissertation project, we built a unique methodology to create a network of directors within an emerging economic sector. Within each sector, directors are scored and ranked using our analytical tools.  
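As a rough illustration of the idea (not the code used in the project), the sketch below links two directors whenever they sit on the board of the same company and then scores each director by degree centrality. The appointment data, the director ids and the function names are all made up for the example, and degree centrality is only one of several measures that could be used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical appointments (director id, company name); made up for illustration.
appointments = [
    ("D1", "ALPHA LTD"), ("D2", "ALPHA LTD"), ("D3", "ALPHA LTD"),
    ("D1", "BETA LTD"), ("D4", "BETA LTD"),
    ("D5", "GAMMA LTD"),
]


def build_director_network(appointments):
    """Return {director: set of directors sharing at least one company}."""
    boards = defaultdict(set)
    for director, company in appointments:
        boards[company].add(director)

    # Every director gets a node, even if they share no board with anyone.
    neighbours = {director: set() for director, _ in appointments}
    for members in boards.values():
        for a, b in combinations(sorted(members), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def degree_centrality(neighbours):
    """Share of the other directors that each director is directly linked to."""
    n = len(neighbours)
    if n < 2:
        return {director: 0.0 for director in neighbours}
    return {director: len(adj) / (n - 1) for director, adj in neighbours.items()}


network = build_director_network(appointments)
scores = degree_centrality(network)

# Rank from most to least connected; ties are broken by director id.
ranking = sorted(scores, key=lambda d: (-scores[d], d))
for rank, director in enumerate(ranking, start=1):
    print(rank, director, round(scores[director], 3))
```

In this toy example, D1 sits on two boards and therefore bridges two groups of directors, so it comes out on top.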

Our methodology also tackles a key issue that we identified during the dissertation project: company directors do not have a unique identifier within the dataset. Directors in the open data provided by Companies House have a Person Number; however, we quickly noticed that a director can have multiple Person Numbers, each holding different records of that director’s employment history. To illustrate the problem with Companies House’s open dataset, we will use Alex Craven, one of the founders of The Data City, as an example. Below is the results page from Companies House when we search for “Alexander James Craven”.

Each row in the results has its own Person Number and, having discussed this with Alex Craven, we confirmed that all four records do belong to him. If we generated a director network from the original dataset, these four records would appear in the network as four different people, despite being the same person. A unique identifier is therefore critical for generating a representative network of the industry. We need to assign a single Person Number to Alex Craven so that we can link the different companies he has been a director of.
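To make the identifier problem concrete, here is a minimal sketch of one way to collapse several Person Numbers into a single director id. It assumes that records with the same normalised name and the same month and year of birth belong to the same person, which is a simplification and not necessarily the matching rule used in the project. The field names, Person Numbers and company names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records; every value below is made up for illustration.
records = [
    {"person_number": "P001", "name": "Doe, Jane Example", "birth": "1970-01", "company": "ALPHA LTD"},
    {"person_number": "P002", "name": "JANE EXAMPLE DOE", "birth": "1970-01", "company": "BETA LTD"},
    {"person_number": "P003", "name": "Jane  Example Doe", "birth": "1970-01", "company": "GAMMA LTD"},
    {"person_number": "P004", "name": "JOHN SAMPLE ROE", "birth": "1982-06", "company": "ALPHA LTD"},
]


def normalise_name(name):
    """Upper-case, turn 'SURNAME, FORENAMES' into 'FORENAMES SURNAME', collapse spaces."""
    name = name.upper()
    if "," in name:
        surname, forenames = name.split(",", 1)
        name = f"{forenames} {surname}"
    return " ".join(name.split())


def assign_director_ids(records):
    """Map every Person Number to one canonical id per (name, birth) group."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalise_name(rec["name"]), rec["birth"])
        groups[key].append(rec["person_number"])

    mapping = {}
    for person_numbers in groups.values():
        canonical = min(person_numbers)  # arbitrary but deterministic choice
        for person_number in person_numbers:
            mapping[person_number] = canonical
    return mapping


director_id = assign_director_ids(records)

# With a single id per person, their companies can be linked together.
companies = defaultdict(set)
for rec in records:
    companies[director_id[rec["person_number"]]].add(rec["company"])

print(director_id)
print(dict(companies))
```

A rule this simple would wrongly merge two different people who share a name and birth month, so in practice any match would need checking against further evidence, such as the confirmation from Alex described above.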

Here we present the director network for the HealthTech industry in Leeds.

There are three distinct clusters of directors within the network; in this case, each cluster represents a large corporate organisation with many subsidiaries and directors. Using our methodology, we present the top 10 most influential directors within the Leeds HealthTech industry. We believe this methodology is capable of generating any director network and identifying some of the highly influential directors within any industry. We could even do this for every director in the UK!

Name                                 Weighted OWA  Normalised OWA  Rank
SOUTHBY PETER JOHN                   0.047         1               1
O’HANLON SHAUN                       0.036         0.769           2
GIBB MOIRA MARGARET                  0.008         0.167           3
SPENCER CHRISTOPHER MICHAEL KENNEDY  0.007         0.144           4
RIDDELL SEAN DOUGLAS                 0.003         0.065           5
TAYLOR IAN                           0.003         0.058           6
BENSON CHRISTINE                     0.003         0.055           7
STABLES DAVID LINDSAY                0.002         0.045           8
THORBURN ANDREW JOHN                 0.002         0.033           9
WILCOCK STEPHEN JOHN                 0.002         0.031           10
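For readers unfamiliar with the column names, the sketch below shows how scores of this kind could be produced, assuming that OWA stands for ordered weighted averaging: each director’s centrality values are sorted from largest to smallest and combined using a fixed set of weights. The centrality measures, their values and the weights are all hypothetical, since the post does not list the ones used in the project. The Normalised OWA column looks consistent with dividing each score by the top score (the top-ranked director has a value of 1), and the sketch assumes that too.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted largest first."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def normalise_by_max(scores):
    """Scale scores so that the highest-scoring director gets 1."""
    top = max(scores.values())
    if top == 0:
        return {name: 0.0 for name in scores}
    return {name: score / top for name, score in scores.items()}


# Hypothetical centrality values per director, e.g. (degree, betweenness, closeness).
centralities = {
    "D1": (0.75, 0.50, 0.80),
    "D2": (0.50, 0.00, 0.60),
    "D3": (0.25, 0.00, 0.40),
}
weights = (0.5, 0.3, 0.2)  # hypothetical OWA weights

owa_scores = {d: owa(values, weights) for d, values in centralities.items()}
normalised = normalise_by_max(owa_scores)

ranking = sorted(owa_scores, key=owa_scores.get, reverse=True)
for rank, director in enumerate(ranking, start=1):
    print(director, round(owa_scores[director], 3), round(normalised[director], 3), rank)
```

Because the weights attach to the sorted position rather than to a particular measure, the choice of weights controls how much a director’s strongest measure counts relative to their weakest.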

With the project drawing to a close, I believe the CDRC Masters Dissertation Scheme has truly broadened my perspective on solving real-life problems. It has improved my project management skills, which I will be able to bring to my future role. Overall, I would love to take this opportunity to thank the CDRC Masters Dissertation Scheme and The Data City for this project. I can’t thank The Data City enough for their support and assistance during what I believe to be the best experience to kick-start my career as a Data Scientist/Business Analyst.

Written by: Jason Li 

About the author

Jason Li

Jason is a master’s student at UCL studying Business Analytics. His course, alongside his role at The Data City, focuses on data and machine learning. During his studies, he has participated in numerous group projects in the field of big data analytics, from predictive analytics and machine learning algorithms for recommendation engines to sentiment analysis and text summarisation using natural language processing. He is passionate about using data to generate useful insights.