8th February 2017

Takeaway the Lab: Data Analysis To-Go


Neal Thomas Barsch, MSc in Economics for Development (2016)

By 2013, the digital universe had grown to an estimated 4.4 zettabytes of total stored data [1]. This is 4.4×10¹² gigabytes, or about 660 million years’ worth of HD video. In the lab, or connected to the Internet, collected data is used to make predictions about my behaviour every day. Advertisements on Facebook are related to my “likes” or Internet cookies, and search suggestions appear based on what I have been reading. The network IP address I am connected to is location-tracked, allowing companies (however you feel about it) to generally track where I am. Algorithms instantly run so that a product I shop for on Amazon appears as an advertisement on an unrelated site I subsequently visit. Yet data collection in the developing world is an entirely different story. Paper surveys are still the norm, and most analysis to date requires collecting data, then returning to the lab to analyse it. What if instead we could bring the lab along with us to the field? What if we could predict, live and offline in a rural village in the Philippines, the same things we do back in a lab?

My research centres on a cultural phenomenon in the Philippines called the “Sari-Sari” store. Literally translated, “Sari-Sari” means “variety-variety,” and these stores are much like your typical corner store: family owned, selling soft drinks, crisps, household goods, and other small food items. It’s estimated there are more than a million Sari-Sari stores, reaching every corner of the island nation. For comparison, the entire UK has around 50,000 pubs. Accounting for the difference in geographical land area, imagine that for every pub you passed in Oxford or London there were instead sixteen pubs, and then you would be on the level of Sari-Sari stores [2]. This ubiquity is massively powerful in its potential to solve some of the biggest problems in the Philippines. What if these stores could provide crucial financial services to unbanked populations? What if people could save for their children’s education safely and securely, build up capital to use as collateral for loans, and save to start businesses? And perhaps most importantly, how do we lower the cost of establishing and providing these services to the level where banks will actually be interested, consumers pay little or no fees, and the market is sustainable?

It is the last point I am most interested in with my research into prediction models that work live and offline in the field. Not every Sari-Sari store will be a good mobile banking branch, and it’s difficult and costly in the traditional model to find and recruit the right ones. We have to find the trusted stores in the community, and furthermore stores that are used to dealing with cash. Culturally, it is extremely tough to go to a random Sari-Sari store, especially in a tight-knit rural community, and ask ‘how much cash do you have?’ or ‘how much does your community trust you?’ What we can do is take account of the stocks and variety of the store, ask how often the store restocks, what people buy most often, the business hours of the store, and even take into account the materials from which the store is built. With this information (what econometricians would call proxy variables), we can quickly build a picture of how the store fits into the community and functions on a day-to-day basis without asking sensitive financial or personal questions.

The algorithmic models we use are built into the tablet surveys themselves and run offline. They constantly predict and assess the survey in the background (the math of the regressions runs live on the tablet, with no connection needed). The survey then automatically determines the probability that certain questions will be relevant, skips irrelevant questions, emphasises relevant ones, and stops when a store is either clearly a fit (or clearly not) for recruitment into the mobile financial services programme. This makes assessment easier for field workers, because the survey recruits stores automatically based on responses. It also saves them time by skipping irrelevant questions, and allows for more efficient and sustainable recruitment.
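To make the idea concrete, here is a minimal sketch of a survey that re-scores a store after every answer and stops early once the prediction is clear. It assumes a simple logistic regression; the question names, coefficients, default values and the 0.1 / 0.9 thresholds are all hypothetical illustrations, not the models actually deployed on the tablets.

```python
import math

# Hypothetical logistic-regression coefficients (illustrative only).
INTERCEPT = -3.4
QUESTIONS = [
    # (question name, coefficient, assumed average answer used until it is asked)
    ("restocks_per_week", 0.8, 1.0),
    ("product_varieties_tens", 0.5, 2.0),
    ("open_hours_per_day_tens", 1.0, 1.0),
    ("concrete_walls", 1.2, 0.5),
]


def fit_probability(answers):
    """Predicted probability that the store is a good fit.

    Questions not yet answered contribute their assumed average value.
    """
    score = INTERCEPT
    for name, coef, default in QUESTIONS:
        score += coef * answers.get(name, default)
    return 1.0 / (1.0 + math.exp(-score))


def run_survey(ask, lower=0.1, upper=0.9):
    """Ask questions in order and stop as soon as the prediction is clear.

    `ask` is any callable that takes a question name and returns a number.
    Returns (decision, probability, list of questions actually asked).
    """
    answers = {}
    p = fit_probability(answers)
    for name, _, _ in QUESTIONS:
        answers[name] = ask(name)
        p = fit_probability(answers)  # re-score after every answer
        if p >= upper:
            return "recruit", p, list(answers)
        if p <= lower:
            return "reject", p, list(answers)
    return "review", p, list(answers)  # never became clear-cut


if __name__ == "__main__":
    busy_store = {"restocks_per_week": 4, "product_varieties_tens": 5,
                  "open_hours_per_day_tens": 1.4, "concrete_walls": 1}
    decision, p, asked = run_survey(busy_store.__getitem__)
    print(decision, round(p, 3), asked)
```

In this sketch a store that restocks very often is accepted after a single question, while a borderline store is asked everything and flagged for manual review. A real deployment would estimate the coefficients from previously surveyed stores rather than hard-coding them.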

The progress made with the tablet models doesn’t stop with the recruitment analysis. If the store fits into the model, then the tablet can be programmed to display a recruitment video, answer frequently asked questions, and take application forms for any programme, including for the mobile banking pilot. These processes are automatic and require no return-to-office assessment, and the automation makes it extremely easy for field workers to use the surveys with minimal cost and training. Going even further with the “field-lab” capability, we assess and build cell signal maps using the same tablets the field workers carry to take surveys. The tablets are programmed to record the GPS coordinates, cell signal strength of each SIM card (we use dual-SIM devices to build two network maps at once), tower ID the tablet is connected to, and other network information every 20 metres the tablets move (all automatic and in the background). When we go into the field, the lab now truly comes with us.
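The "every 20 metres" rule can be illustrated with a short sketch. It assumes each raw reading is a dictionary with `lat` and `lon` keys (plus whatever signal fields are recorded) and uses the standard haversine formula for distance. The field names, such as `sim1_dbm` and `tower_id`, are hypothetical, and the tablets' actual logging code is not described in this article.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
LOG_EVERY_M = 20.0          # distance between logged points, from the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def thin_readings(readings, min_step_m=LOG_EVERY_M):
    """Keep a reading only if it is at least `min_step_m` from the last kept one."""
    kept = []
    for r in readings:
        if not kept or haversine_m(kept[-1]["lat"], kept[-1]["lon"],
                                   r["lat"], r["lon"]) >= min_step_m:
            kept.append(r)
    return kept


if __name__ == "__main__":
    # Hypothetical walk due north in steps of about 11 metres.
    walk = [{"lat": 14.0 + i * 0.0001, "lon": 121.0,
             "sim1_dbm": -85 - i, "sim2_dbm": -92 + i, "tower_id": "T-001"}
            for i in range(5)]
    for point in thin_readings(walk):
        print(point)
```

Filtering on distance rather than on time means a field worker standing still at one store does not flood the map with duplicate points, while a worker on a motorbike still produces evenly spaced coverage.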

So far, the prediction models I have built have been used to survey nearly five thousand Sari-Sari stores in the Philippines, collect over a million cell signal data points, and recruit hundreds of stores into pilot programmes for business training and mobile money projects.

These applications of the “field-lab” technology are only scratching the surface of what is possible. Disaster prediction models can be built into tablets (and already are to some extent) to direct relief workers to the most affected areas, on a house-by-house basis, using built-in modelling and satellite imagery assessments. On the microenterprise side, models will be able to direct relief and predict which stores could serve as distribution points for affected areas, meeting immediate food and water needs. International organisations would then be able to contribute funds directly to stores so they can immediately distribute these products to the community. In education, homework built with the technology will be able to predict exactly where each student’s weaknesses are, and tailor the curriculum and lessons accordingly. In business, prediction algorithms could be used to assess business opportunities in impoverished areas, tailored to reflect the actual situation of each area, which would help families pull themselves out of poverty. The possibilities are truly endless. The field, rather than the lab, is the new frontier of data technology.


[1] International Data Corp. as cited by Science and Technology Research News (https://goo.gl/lFcLgK)

[2] UK Campaign for Real Ale Pub Tracker Number (https://goo.gl/eJrBho)

