By Rosa Borge, Associate Professor and Researcher, UOC-IN3.
Big Data analytics can shape a smarter society if this new paradigm contributes to knowledge in the Social Sciences. The analysis of Big Data is already helping to guide decisions in socially relevant areas such as the public health system, aid to zones affected by natural disasters and the improvement of educational provision. One of its related fields, social network analysis, is also helping us to better understand the causes and consequences of the connections among people, actors, organizations or systems.
However, these specialized tools and analytics suffer from methodological and theoretical problems that we should take into account. The methodological problems include the errors, losses and blocked or protected data encountered when collecting and assembling large Internet-based data sets, all of which lead to samples that are hardly representative.
In addition, there is usually a lack of information about the distribution of many important variables that characterise the population involved in the data collection, so it is not possible to carry out explanatory analysis. Nevertheless, I believe that these methodological problems can be solved with more careful research design and with more skilful, interdisciplinary teams.
With regard to the theoretical hindrances, I think it is crucial to embed Big Data analytics within Social Science theories of human interaction and behaviour, always taking into account the context and the social institutions that shape those interactions and that behaviour.
Analysing and visualizing millions of data points without a Social Science theory could lead us to observe patterns and rules where none exist. In that sense, merely presenting “social network graphs” and measuring connections through the usual mathematical algorithms could be illustrative but socially meaningless. Careful prior reflection on research questions, hypotheses and derived indicators is needed.
In addition, we should examine whether the representations obtained and the algorithms used are equivalent to similar concepts and models in the Social Sciences (e.g. polarization, homophily, centrality, tie strength, influentials, diffusion, etc.). In most cases, the measurements and the models do not capture the same thing.
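As a minimal illustration of this point, the Python sketch below builds a hypothetical toy network (invented for the example, not taken from any data set) in which two well-connected actors, A and C, are linked only through a bridging actor, B. It then computes two standard formalizations of "centrality": degree (the share of other actors one is directly tied to) and closeness (the inverse of the average shortest-path distance to everyone else). All names and the network itself are assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy network: two hubs (A and C), each with four contacts,
# joined only through a bridging actor B.
EDGES = [
    ("A", "a1"), ("A", "a2"), ("A", "a3"), ("A", "a4"),
    ("C", "c1"), ("C", "c2"), ("C", "c3"), ("C", "c4"),
    ("A", "B"), ("B", "C"),
]


def build_graph(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Share of the other nodes that each node is directly tied to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest-path distances from each node.

    Assumes the graph is connected, as the toy network above is.
    """
    n = len(graph)
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        scores[source] = (n - 1) / sum(dist.values())
    return scores


def top_nodes(scores):
    """Return every node that attains the maximum score, sorted by name."""
    best = max(scores.values())
    return sorted(node for node, score in scores.items() if score == best)


graph = build_graph(EDGES)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

print("Highest degree:   ", top_nodes(degree))
print("Highest closeness:", top_nodes(closeness))
```

In this toy case the two measures disagree: A and C come out on top by degree, whereas B, with only two ties, comes out on top by closeness. Which of them is the "central" or "influential" actor is a theoretical question that the algorithm cannot answer on its own.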
At this stage, I think that, far from shying away from Big Data, we, as social scientists, have to see it as an opportunity to gain more knowledge in the interest of society, while bearing in mind that classical methodological and theoretical concerns remain fully relevant to this new paradigm.