A review by Veronica Hera (UCL BSc Politics, Philosophy and Economics, 2018-2021) and Shivam Gujral (UCL BSc Economics, 2019-2022). This review is based on work undertaken between April and September 2020 to inform a research project conducted jointly by Prof Cloda Jenkins (UCL Economics) and Dr Tom O’Grady (UCL Political Science and Q-Step). The research was funded by the UCL Social and Historical Science Dean’s Strategic Educational Enhancement Fund.
Big Data, Data Science, and Machine Learning have made their way into the vocabulary of the social sciences, especially at a time when people have become aware of the exponential rate at which data is being generated (Harding et al, 2018). Organisations in both the public and private sectors are using advanced data analysis techniques driven by big data to make better, quicker, and more efficient decisions. This raises the question of whether graduates are equipped with the necessary skills and ready to apply them when they start their careers. Our research aims to answer this question by looking at a diverse set of organisations, ranging from government authorities to research institutes, and understanding how they approach data analysis and what skills they look for when recruiting new joiners.
Through our desktop research and interviews with representatives of these organisations, we found that organisations differ in how important they consider data analysis to be for policymaking. Typically, organisations that collect data themselves tend to see it as the main driver of policymaking, while those that outsource data collection focus more on backing theoretical claims with relevant statistics and modelling. In terms of data analysis, some organisations perform a more basic analysis and focus on presenting data in a clear and understandable way, while others use more complex analysis to write reports for more specialised audiences such as economists, data scientists, or policymakers. Lastly, while several organisations mainly use R and Python, others have their own bespoke software, either for visualisation or for producing results from datasets. Nevertheless, when it comes to recruiting, all organisations are looking for graduates who know core statistical concepts and have basic skills in R, Excel or Python, or the potential to learn them quickly on the job, as well as an ability to conduct project-based work and explain their findings in an understandable manner to wider audiences.
The era of “Big Data” has brought about many opportunities and challenges, including the increasing popularity of heavily detailed datasets at organisations like the European Central Bank (ECB). We also observe that the presence of big data has led to a paradigm shift: datasets are no longer just recorded as transactions, but are also used for analytical purposes. For example, records of household purchases from Tesco could now be used to explore topics such as the increase in household consumption during Covid-19. With regard to analysis, the tools used differ by sector and by the type of analysis performed, but the most popular were Python and R, followed by STATA and Excel. We also saw SPSS and MATLAB used in some organisations. For visualisation, organisations used the ggplot2 package in R and Excel, followed by Tableau. However, many organisations also use bespoke internal software, for which they provide in-house training.
Although more complex techniques are used in some places, most organisations are happy if new joiners are acquainted with the most basic theoretical concepts and their practical applications. Some, like the ECB, may have ‘data centres’ which new joiners could potentially use to gain a better understanding of data analysis or to resolve any queries they may have. The most commonly required ability for new joiners is knowledge of core statistical and social science concepts and an understanding of general policy analysis. For students within the UCL Economics department, this means a good grasp of economic theory; for UCL Q-Step students, it means good knowledge of their respective areas within Geography, Political Science, and Population Health. Alongside this, skills such as econometric modelling and core statistical and data modelling techniques are always a plus.
So, what does this mean for students and those designing modules and programmes?
In terms of graduates’ skills, we noticed a consensus among recruiters that students lack practical skills and that, despite the strong theoretical background they gain from university, many of them have trouble bridging the gap between academic training and the real world. Graduate recruiters suggest that university courses should teach students how to recognise patterns, develop their own datasets and then conduct the analysis, while also encouraging more independent thinking about data analysis design and method selection. Moreover, many believe that data scientists and social scientists complement each other, and that an ability to convey abstract ideas in an understandable manner to an audience without prior technical knowledge is very useful in presenting data.
All in all, speaking from the perspective of the Economics and Q-Step departments, students are offered a wide range of opportunities to gain practical experience, such as taking part in research activities, skills labs, and Explore Econ. Our research recommends that students take up activities outside the classroom to build their skills and make good use of the available resources. Students across these departments have gained basic skills in software like STATA and R, and extra-curricular activities offer them a good chance to apply these skills in an academically oriented research format.
We also concluded that designing new modules would be valuable. Examples include “How to analyse Big Data”, which would teach students how to handle big datasets and help them understand the evolving discipline, and “The impact of Big Data in social sciences”, which would have a broader scope, explaining how data science has influenced various disciplines and showcasing real-life examples from fields such as public health and political science. Furthermore, modules like “Data design and interpretation” would prompt students to think more independently about designing datasets, selecting methods and presenting statistical findings to wider audiences. These could be whole modules or could be incorporated into existing or other new data analysis modules.
To conclude, the growing presence of data analysis can lead to a paradigm shift within the social sciences. Although it is essential to have a good grasp of the conceptual foundations, it is also increasingly important to be aware of the technical advancements in the subject!
- Harding MH. Big Data in economics. IZA World of Labor [Internet]. 2018 [cited 2020 Aug 26]. Available from: https://wol.iza.org/articles/big-data-in-economics/long.
- Data Analysis in Policymaking research report