PISA (Slice 1)

A lot can be said about PISA (Programme for International Student Assessment) 2015, the triennial test and survey completed by a sample of 15-year-olds in OECD and other nations. Databases and interactive data visualisations are available on the PISA website. Potentially, anyone with skills in data analysis can use these sources to explore their own questions, and research will no doubt be forthcoming. In the meantime, this first slice of PISA looks at news reports of the PISA results.

Results of the 2015 PISA test were published on 6th December 2016. UK-based news sources have been keen to report on the performance of the UK, and its constituent countries, in comparison with the other nations taking part in PISA 2015. BBC News ran with the headline: PISA Tests: Singapore top in global education rankings, reporting that, in comparison, the “UK remains a middle-ranking performer”. The Telegraph asks: Where does the UK rank in the international school league tables? While it reports that the UK has climbed up the ranks for science and reading, it cautions that the average point score has dropped in both subjects, though only by one point in reading. As the Telegraph report goes on to say, “only 11 per cent of students in the UK are top performers”, in comparison with Singapore, where 35 per cent of pupils are ‘top performers’. Similarly, The Guardian focuses on the apparent lack of success of UK schools with the headline: UK schools fail to climb international league table. The Independent too warns that UK schools are falling behind leading countries. These reports suggest that there is little to celebrate in the latest PISA results. The position of the UK in the PISA rankings signifies, according to news reports, that we are falling behind.

However, where average (mean) scores form the basis for comparison, we should not be surprised to see some variation in results between schools, as highlighted by the case of Alexandra Park School in North London. As BBC News reported, the average score of the pupils in this school surpasses the average score of pupils in Singapore, the top performing country taking part in PISA. So, while the UK is ‘middle ranking’, some UK schools are not. Some score higher, some score lower. Perhaps in an effort to highlight which country is to blame for bringing down the UK average, the differential performance of constituent parts of the UK has also received attention. The PISA scores for Wales are lower than those of other UK countries, and fall below the OECD average. The BBC reports that Wales is “still worst in UK in world education tests” as the performance of Wales’ pupils has failed to improve on previous PISA tests. Wales Online reminds us that politicians may be judged by, and held accountable for, the educational performance of a country’s pupils, reporting that First Minister Carwyn Jones has been described as a ‘failure’, partly as a result of the country’s performance in PISA 2015.

These reports, in highlighting the apparently mediocre position of the UK in global league tables, suggest that the comparative performance of the UK should be a concern for both educators and policy makers. Grek (2009) discusses a politics of comparison where the PISA ranks provoke policy responses in an attempt to improve a nation’s position in future tests. Future slices on this blog will pick up on these and other issues related to PISA.


Big Data and Social Science

This was a training course organised by the NCRM (National Centre for Research Methods). Held at the LSE in Holborn, and facilitated by Frauke Kreuter, two days were dedicated to considering the ways in which social scientists could engage with Big Data. The content of the two days is supported by a book, Big Data and Social Science: A Practical Guide to Methods and Tools. It was a shame I could only find a hard copy at the time of purchase, as it really is a weighty tome and not something one wants to carry around.

What is Big Data?

This is a good question. One response, that Big Data is “anything that is too big to fit onto your computer” (Foster et al, 2017: p3), reveals the temporality of this as a defining characteristic. As the capacity of personal computers increases, so does the ability to handle vast amounts of data using a personal computer or laptop. So, this may not be a good yardstick for defining Big Data. Still, it gives us an indication of the ‘Bigness’ of Big Data. Three key characteristics of Big Data are volume (large datasets), velocity (data that may be in real time, or streamed), and variety (data in various formats and from multiple sources). This is discussed in more detail in Chapter 5 of Big Data and Social Science: A Practical Guide to Methods and Tools.

Accessing Big Data

References to the proliferation of Big Data and the datafication of everyday life can be found in social scientific literature (boyd and Crawford, 2012; van Dijck, 2014; McFarland et al, 2016). While data may be ‘everywhere’, it is important to know where to look, as well as to develop the skills needed to access the data. Techniques such as web scraping were discussed. This involves searching for data on the web and extracting it.

There are tools such as Beautiful Soup to facilitate web scraping, and we discussed Selector Gadget, which helps the user identify the code needed to select different parts of web pages. However, one of the challenges is that web sites change, meaning that this might not be a reliable way of extracting data. Further, web scraping may be illegal in some circumstances, where the providers have not given permission for their data to be accessed in this way.
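As a rough illustration of what extracting data from a page involves, here is a minimal sketch. It is not taken from the course or the book. It uses Python's built-in html.parser rather than Beautiful Soup, so that it runs without installing anything, and the page fragment, country names and scores are invented for the example.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they end up empty and are skipped.
            if self._row:
                self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment; a real scraper would download the page first.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Score</th></tr>
  <tr><td>Country A</td><td>512</td></tr>
  <tr><td>Country B</td><td>498</td></tr>
</table>
"""


def scrape_rows(html):
    """Return the data rows of every table in the given HTML text."""
    parser = TableCellParser()
    parser.feed(html)
    return parser.rows


rows = scrape_rows(SAMPLE_HTML)
print(rows)
```

The fragility mentioned above is visible here: the parser depends on the page using tr and td tags in this particular arrangement, so a redesign of the site could silently break it.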

Another approach is to use an Application Programming Interface, or API. In non-technical terms, this means ‘reading the data and putting it into something else’. It is distinct from web scraping: rather than extracting data from pages designed for human readers, the researcher requests data through an interface that the provider has designed for programs to use. Chapter 2 in Big Data and Social Science: A Practical Guide to Methods and Tools provides more details on the methods and tools used in collecting data from web sources.
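To make ‘reading the data and putting it into something else’ a little more concrete, the sketch below shows the two halves of a typical API exchange: assembling a request URL and turning the structured reply into Python values. The endpoint, parameter names and response layout are all hypothetical, and no network request is made; the response text is a stand-in written for this example.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only to show how a request URL is assembled.
BASE_URL = "https://api.example.org/v1/posts"


def build_request_url(base_url, **params):
    """Append query parameters to the endpoint in a URL-safe form."""
    return f"{base_url}?{urlencode(params)}"


def extract_records(response_text):
    """Turn the JSON text an API returns into a list of (id, text) pairs."""
    payload = json.loads(response_text)
    return [(item["id"], item["text"]) for item in payload["results"]]


# Stand-in for the body of a real HTTP response (invented for this example).
SAMPLE_RESPONSE = (
    '{"results": [{"id": 1, "text": "first post"},'
    ' {"id": 2, "text": "second post"}]}'
)

url = build_request_url(BASE_URL, query="big data", limit=2)
records = extract_records(SAMPLE_RESPONSE)
print(url)
print(records)
```

Because the reply arrives as labelled fields rather than as page layout, there is no need to hunt through HTML for the right elements, which is the main practical difference from scraping.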

Record Linkage

Big Data may be generated from more than one, indeed several, datasets. Tokle and Bender (2017) highlight the ways in which Big Data differs from the more usual survey data used by social scientists. Survey data usually contains all the data relevant to the area of research interest. Social scientists using Big Data may have to use data from several sources. This relates to the ‘organic’ characteristic of Big Data. That is, it is typically data that is found, rather than designed (as in survey data), and may come from the myriad everyday transactions of human activity. These include credit card transactions and social media use.

Researchers using Big Data may want to ‘match’ cases that appear in more than one dataset. In other words, data on individuals may be linked across datasets. This might be very useful to a researcher trying to gain a complete picture of the activity of interest.
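The idea of matching cases can be sketched in a few lines. This is only an illustration under simple assumptions, not the method described in the book: it links two small, invented datasets by exact agreement on a tidied-up name and a date of birth. Linkage in practice often has to cope with misspellings and missing values, which this sketch does not attempt.

```python
def normalise(name):
    """Lower-case a name and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def link_records(left, right):
    """Return (left id, right id) pairs for records sharing a normalised name and date of birth."""
    # Index the right-hand dataset by its matching key for quick lookup.
    index = {}
    for rec in right:
        key = (normalise(rec["name"]), rec["dob"])
        index.setdefault(key, []).append(rec)

    linked = []
    for rec in left:
        key = (normalise(rec["name"]), rec["dob"])
        for match in index.get(key, []):
            linked.append((rec["id"], match["id"]))
    return linked


# Two invented datasets describing (possibly) the same people.
survey = [
    {"id": "S1", "name": "Ada  Smith", "dob": "2001-04-12"},
    {"id": "S2", "name": "Ben Jones", "dob": "2000-11-30"},
]
transactions = [
    {"id": "T7", "name": "ada smith", "dob": "2001-04-12"},
    {"id": "T8", "name": "Ben Jones", "dob": "1999-01-01"},
]

links = link_records(survey, transactions)
print(links)
```

Here only the first pair is linked: the two ‘Ben Jones’ records disagree on date of birth, so they are treated as different people.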

Of course, in linking records, there is the possibility that individuals will be identified. We discussed how this meant that informed consent, usually essential for social scientists, is not enforceable. In fact, Big Data threatens informed consent as a value of social research. The consequences of using an individual’s data cannot, yet, be known. Such ethical concerns urgently need addressing by social scientists (boyd and Crawford, 2012). Chapter 3 in Big Data and Social Science: A Practical Guide to Methods and Tools covers more on record linkage and matching.


The discussion of data visualisation was the most animated part of the session and is testimony to the ability of visualisations to tell a story with data. Of course, this is nothing new. Historically, visualisations of data, including Nightingale’s coxcombs, Du Bois’ hand-coloured charts of Black life in the USA, John Snow’s cholera map and Minard’s visualisation of Napoleon’s march on and retreat from Moscow, have been used to tell powerful stories that data presented as raw statistics or in tabular form could not.

We discussed how there is now an expectation that visualisations will be interactive. One example we explored was Baby Name Voyager, which provided some fun as we entered various names. However, a shocking, dramatic visualisation was explored in Out of Sight, Out of Mind, which displays animations of drone strikes in Pakistan and the resulting fatalities.

Data visualisations are not just a way of presenting results; they are also used for presenting findings of work in progress, which has value for Learning Analytics. Chapter 9 in Big Data and Social Science: A Practical Guide to Methods and Tools covers visualisations in more detail.

What has this to do with Education?

Another way of phrasing this might be: why would Big Data not have anything to do with education? Education and educational practices have long been the subject of quantification (Smith, 2016). Today:

“Schools are increasingly caught up in the data/information frenzy”  (Smith, 2016: 2).

Big Data has become part of the way in which education is governed (Sellar, 2015; Selwyn, 2015; Williamson, 2015). In particular, student performance data is increasingly used for accountability purposes. Leaders and managers of educational institutions will rapidly need to become familiar with Big Data analytics. Within Higher Education, institutions routinely collect data from every student transaction (lecture attendance, library visits, assignment submissions), constituting a wealth of digital data on students. Students may not be aware that we collect and use this data, and again this raises more ethical issues that researchers are engaged with. Along with Learning Analytics, this data may be used to identify those students at risk of failing or dropping out. As Learning Analytics develops, JISC has published a review of Learning Analytics practice in the UK and internationally.

A two-day course couldn’t cover everything, or produce Big Data experts. Other sessions included text analysis and machine learning, which both have relevance to education and are covered in more detail in Big Data and Social Science: A Practical Guide to Methods and Tools.

