8th February 2017

Takeaway the Lab: Data Analysis To-Go



Neal Thomas Barsch, MSc in Economics for Development (2016)

By 2013, the digital universe had grown to an estimated 4.4 zettabytes of total stored data [1]. This is 4.4×10¹² gigabytes, or about 660 million years’ worth of HD video. In the lab, or anywhere connected to the Internet, collected data is used to make predictions about my behaviour every day. Advertisements on Facebook reflect my “likes” or Internet cookies, and search suggestions appear based on what I have been reading. The IP address of the network I am connected to is location-tracked, allowing companies (however you feel about it) to track roughly where I am. Algorithms run instantly so that a product I shop for on Amazon appears as an advertisement on an unrelated site I subsequently visit. Yet data collection in the developing world is an entirely different story. Paper surveys are still the norm, and most analysis to date requires collecting data in the field and then returning to the lab to analyse it. What if instead we could bring the lab along with us to the field? What if we could predict, live and offline in a rural village in the Philippines, the same things we do back in a lab?

My research centres on a cultural phenomenon in the Philippines called the “Sari-Sari” store. Literally translated, “Sari-Sari” means “variety-variety,” and these stores are much like a typical corner store: family-owned and selling soft drinks, crisps, household goods, and other small food items. It’s estimated there are more than a million Sari-Sari stores, spread across every corner of the island nation. For comparison, the entire UK has around 50,000 pubs. Accounting for the difference in land area, imagine that for every pub you passed in Oxford or London there were instead sixteen pubs, and you would be at the density of Sari-Sari stores [2]. This ubiquity gives the stores enormous potential to help solve some of the biggest problems in the Philippines. What if these stores could provide crucial financial services to unbanked populations? What if people could save for their children’s education safely and securely, build capital to use as collateral for loans, and save to start businesses? And perhaps most importantly, how do we lower the cost of establishing and providing these services to the level where banks will actually be interested, consumers pay little or no fees, and the market is sustainable?

It is this last point that interests me most in my research into prediction models that work live and offline in the field. Not every Sari-Sari store will make a good mobile banking branch, and under the traditional model it is difficult and costly to find and recruit the right ones. We have to find the stores that are trusted in the community and, furthermore, that are used to dealing with cash. Culturally, it is extremely tough to go to a random Sari-Sari store, especially in a tight-knit rural community, and ask ‘how much cash do you have?’ or ‘how much does your community trust you?’ What we can do is take account of the store’s stock and variety, ask how often the store restocks, what people buy most often, and what the store’s business hours are, and even note the materials from which the store is built. With this information (what econometricians would call proxy variables), we can quickly build a picture of how the store fits into the community and functions on a day-to-day basis without asking sensitive financial or personal questions.

The algorithmic models we use are built into the tablet surveys themselves and run offline. They constantly predict and assess in the background as the survey proceeds (the math of the regressions is computed live on the tablet, with no connection required). The survey then automatically determines the probability that certain questions will be relevant, skips irrelevant questions, emphasises relevant ones, and stops when a store is either clearly a fit (or clearly not) for recruitment into the mobile financial services programme. This not only makes assessment easier for field workers, because the survey identifies stores to recruit automatically from the responses, but also saves them time by skipping irrelevant questions, which allows for more efficient and sustainable recruitment.
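To make the stopping rule concrete, here is a minimal Python sketch of a survey that halts once the outcome can no longer change. It is an illustration under stated assumptions, not the model used in the field: the question names, the logistic-regression coefficients, the 0.7 and 0.3 thresholds, and the scaling of every answer to the range 0 to 1 are all hypothetical. After each answer, it computes the lowest and highest fit probability that the unanswered questions could still produce, and stops as soon as both fall on the same side of a threshold.

```python
import math

# Hypothetical logistic-regression coefficients for proxy variables.
# Every answer is assumed to be scaled to the range [0, 1].
INTERCEPT = -2.0
WEIGHTS = {
    "stock_variety": 1.5,
    "restock_frequency": 1.2,
    "opening_hours": 0.8,
    "sturdy_building": 0.5,
}
ACCEPT, REJECT = 0.7, 0.3  # assumed decision thresholds


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def probability_bounds(answers):
    """Lowest and highest fit probability still reachable given the answers so far."""
    known = INTERCEPT + sum(WEIGHTS[q] * v for q, v in answers.items())
    remaining = [w for q, w in WEIGHTS.items() if q not in answers]
    # An unanswered question can contribute anywhere between 0 and its weight.
    low = known + sum(min(w, 0.0) for w in remaining)
    high = known + sum(max(w, 0.0) for w in remaining)
    return sigmoid(low), sigmoid(high)


def run_survey(ask):
    """Ask the most influential questions first; stop once the outcome is settled."""
    answers = {}
    for question in sorted(WEIGHTS, key=lambda q: -abs(WEIGHTS[q])):
        answers[question] = ask(question)
        low, high = probability_bounds(answers)
        if low >= ACCEPT:
            return "recruit", answers         # clearly a fit
        if high <= REJECT:
            return "do not recruit", answers  # clearly not a fit
    return "undecided", answers               # every question asked, still in between
```

With these made-up coefficients, a store that scores 1.0 on every question is accepted after three questions, and one that scores 0.0 on every question is rejected after three, so the fourth question is never asked in either case. A store answering 0.5 throughout goes through all four questions and remains undecided.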

The progress made with the tablet models doesn’t stop with the recruitment analysis. If the store fits the model, the tablet can be programmed to display a recruitment video, answer frequently asked questions, and take application forms for any programme, including the mobile banking pilot. These processes are automatic and require no return-to-office assessment, and the automation makes it extremely easy for field workers to use the surveys with minimal cost and training. Going even further with the “field-lab” capability, we assess and build cell signal maps using the same tablets the field workers carry to take surveys. The tablets are programmed to record the GPS coordinates, the cell signal strength of each SIM card (we use dual-SIM devices to build two network maps at once), the ID of the tower the tablet is connected to, and other network information every 20 metres the tablets move (all automatically and in the background). When we go into the field, the lab now truly comes with us.
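The distance-triggered logging described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the software running on the tablets: the `SignalLogger` class, the record layout, and the per-SIM reading fields are assumptions. Only the 20-metre spacing comes from the text. The sketch uses the haversine formula to measure how far the device has moved since the last stored record and keeps a new record only once that distance reaches 20 metres.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
LOG_EVERY_M = 20            # logging interval taken from the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class SignalLogger:
    """Keeps one record per LOG_EVERY_M metres travelled (hypothetical design)."""

    def __init__(self):
        self.records = []

    def on_fix(self, lat, lon, sim_readings):
        """Handle a new GPS fix; return True if a record was stored.

        sim_readings is an assumed list of per-SIM dicts, for example
        {"sim": 1, "dbm": -85, "tower_id": "A"}.
        """
        if self.records:
            last = self.records[-1]
            if haversine_m(last["lat"], last["lon"], lat, lon) < LOG_EVERY_M:
                return False  # not far enough from the last stored point
        self.records.append({"lat": lat, "lon": lon, "sims": sim_readings})
        return True
```

Measuring from the last stored record, rather than from the previous fix, means that slow movement still produces a record once the accumulated displacement reaches 20 metres.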

So far, the prediction models I have built have been used to survey nearly five thousand Sari-Sari stores in the Philippines, collect over a million cell signal data points, and recruit hundreds of stores into pilot programmes for business training and mobile money projects.

These applications of the “field-lab” technology only scratch the surface of what is possible. Disaster prediction models can be built into tablets (and already are to some extent) to direct relief workers to the most affected areas, on a house-by-house basis, using built-in modelling and satellite imagery assessments. On the microenterprise side, models will be able to direct relief and predict where stores could serve as immediate aid points for affected areas, meeting urgent food and water needs. International organisations would then be able to contribute funds directly to stores so they can distribute these products to the community straight away. In education, homework built with the technology will be able to predict exactly where individual students’ weaknesses lie, focus the curriculum, and tailor lessons to each student. In business, prediction algorithms could be used to assess business opportunities in impoverished areas, tailored to reflect the actual situation of each area, which would help families pull themselves out of poverty. The possibilities are truly endless. The field, rather than the lab, is the new frontier of data technology.


[1] International Data Corp. as cited by Science and Technology Research News (https://goo.gl/lFcLgK)

[2] UK Campaign for Real Ale Pub Tracker Number (https://goo.gl/eJrBho)

