8th February 2017

Takeaway the Lab: Data Analysis To-Go




Neal Thomas Barsch, MSc in Economics for Development (2016)

By 2013, the digital universe had grown to an estimated 4.4 zettabytes of total stored data [1]. This is 4.4×10¹² gigabytes, or about 660 million years’ worth of HD video. In the lab, or wherever I am connected to the Internet, collected data is used to make predictions about my behaviour every day. Advertisements on Facebook are related to my “likes” or Internet cookies, and search suggestions appear based on what I have been reading. The network IP address I am connected to is location-tracked, allowing companies (however you feel about it) to generally track where I am. Algorithms run instantly so that a product I shop for on Amazon appears as an advertisement on an unrelated site I subsequently visit. Yet data collection in the developing world is an entirely different story. Paper surveys and paper-based data collection are still the norm, and most analysis to date requires collecting data in the field, then returning to the lab to analyse it. What if instead we could bring the lab along with us to the field? What if we could predict, live and offline in a rural village in the Philippines, the same things we do back in a lab?

My research centres on a cultural phenomenon in the Philippines called the “Sari-Sari” store. Literally translated, “Sari-Sari” means “variety-variety,” and these stores are usually like your typical corner store: family-owned and selling soft drinks, crisps, household goods, and other small food items. It’s estimated there are more than a million Sari-Sari stores, spread across every corner of the island nation. For comparison, the entire UK has around 50,000 pubs. Accounting for the difference in geographical land area, imagine that for every pub you passed in Oxford or London there were instead sixteen pubs, and then you would be on the level of Sari-Sari stores [2]. This ubiquity is massively powerful in its potential to solve some of the biggest problems in the Philippines. What if these stores could provide crucial financial services to unbanked populations? What if people could save for their children’s educations safely and securely, gain capital for collateral for loans, and save to start businesses? And perhaps most importantly, how do we lower the cost of establishing and providing these services to the level where banks will actually be interested, consumers pay little or no fees, and the market is sustainable?
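The “sixteen pubs” comparison can be reproduced with a quick density calculation. The sketch below is only illustrative: the store and pub counts are the ones quoted above, while the two area figures are rounded approximations assumed for this example and are not taken from the article.

```python
# Counts quoted in the text.
SARI_SARI_STORES = 1_000_000
UK_PUBS = 50_000

# Assumed, rounded areas in square kilometres (not from the article).
PHILIPPINES_AREA_KM2 = 300_000
UK_AREA_KM2 = 243_000


def density_ratio(stores, store_area_km2, pubs, pub_area_km2):
    """Return how many stores there are per pub once area is accounted for."""
    stores_per_km2 = stores / store_area_km2
    pubs_per_km2 = pubs / pub_area_km2
    return stores_per_km2 / pubs_per_km2


ratio = density_ratio(
    SARI_SARI_STORES, PHILIPPINES_AREA_KM2, UK_PUBS, UK_AREA_KM2
)
print(f"Sari-Sari stores per UK pub, per unit area: {ratio:.1f}")
```

Under these assumed areas the ratio comes out at roughly sixteen, in line with the figure in the text.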

It is the last point I am most interested in with my research into prediction models that work live and offline in the field. Not every Sari-Sari store will be a good mobile banking branch, and it’s difficult and costly in the traditional model to find and recruit the right ones. We have to find the trusted stores in the community, and furthermore stores that are used to dealing with cash. Culturally, it is extremely tough to go to a random Sari-Sari store, especially in a tight-knit rural community, and ask ‘how much cash do you have?’ or ‘how much does your community trust you?’ What we can do is take account of the stocks and variety of the store, ask how often the store restocks, what people buy most often, the business hours of the store, and even take into account the materials from which the store is built. With this information (what econometricians would call proxy variables), we can quickly build a picture of how the store fits into the community and functions on a day-to-day basis without asking sensitive financial or personal questions.

The algorithmic models we use are built directly into the tablet surveys themselves, so they run offline. They constantly predict and assess the survey in the background (the math of the regressions runs live on the tablet, with no connection needed). The survey then automatically determines the probability that certain questions will be relevant, skips irrelevant questions, emphasises relevant ones, and stops when a store is clearly a fit (or clearly not) for recruitment into the mobile financial services programme. This not only makes assessment easier for field workers, as the survey recruits stores automatically based on responses, but also saves them time by skipping irrelevant questions, allowing for more efficient and sustainable recruitment.
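To make the early-stopping idea concrete, here is a minimal sketch of a survey that stops as soon as the outcome can no longer change. It is not the model used in the project: it assumes a pre-fitted logistic regression, and the question names, weights, answer ranges, intercept, and thresholds are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical pre-fitted logistic regression (illustrative values only).
# question name -> (weight, smallest possible answer, largest possible answer)
MODEL = {
    "restocks_per_week": (0.8, 0.0, 7.0),
    "product_variety": (0.05, 0.0, 100.0),
    "hours_open_per_day": (0.1, 0.0, 24.0),
    "concrete_walls": (1.0, 0.0, 1.0),
}
INTERCEPT = -8.0
ACCEPT = 0.8  # recruit once the probability cannot fall below this
REJECT = 0.2  # reject once the probability cannot rise above this


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def probability_bounds(answers):
    """Lowest and highest fit probability still reachable given the answers so far."""
    low = high = INTERCEPT
    for name, (weight, lo, hi) in MODEL.items():
        if name in answers:
            low += weight * answers[name]
            high += weight * answers[name]
        else:
            # Unanswered question: take its worst and best possible contribution.
            low += min(weight * lo, weight * hi)
            high += max(weight * lo, weight * hi)
    return sigmoid(low), sigmoid(high)


def run_survey(ask):
    """Ask questions in order and stop once the decision is already determined."""
    answers = {}
    for name in MODEL:
        answers[name] = ask(name)
        low, high = probability_bounds(answers)
        if low >= ACCEPT:
            return "recruit", answers
        if high <= REJECT:
            return "reject", answers
    # Every question was asked and the store is still borderline.
    return "needs_review", answers


strong_store = {
    "restocks_per_week": 7,
    "product_variety": 100,
    "hours_open_per_day": 12,
    "concrete_walls": 1,
}
decision, asked = run_survey(lambda name: strong_store[name])
print(decision, "after", len(asked), "questions")
```

With these made-up numbers, the strong store is recruited after only two of the four questions, because no answer to the remaining two could pull its probability below the acceptance threshold. A production survey would also reorder or skip questions by relevance, which this sketch does not attempt.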

The progress made with the tablet models doesn’t stop with the recruitment analysis. If the store fits the model, the tablet can be programmed to display a recruitment video, answer frequently asked questions, and take application forms for any programme, including the mobile banking pilot. These processes are automatic and require no return-to-office assessment, and the automation makes it extremely easy for field workers to use the surveys with minimal cost and training. Going even further with the “field-lab” capability, we assess and build cell signal maps using the same tablets the field workers carry to take surveys. The tablets are programmed to record the GPS coordinates, the cell signal strength of each SIM card (we use dual-SIM devices to build two network maps at once), the ID of the tower the tablet is connected to, and other network information every 20 metres the tablets move (all automatically and in the background). When we go into the field, the lab now truly comes with us.
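The “every 20 metres” logging rule can be sketched as a distance check against the last recorded point. This is an assumption-laden illustration rather than the tablets’ actual software: the `SignalLogger` class, its field names, and the sample readings are hypothetical, and distance is estimated with the standard haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres
MIN_SPACING_M = 20.0          # spacing between logged points, from the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class SignalLogger:
    """Hypothetical logger that keeps one sample per 20 metres of movement."""

    def __init__(self):
        self.samples = []

    def update(self, lat, lon, sim1_dbm, sim2_dbm, tower_id):
        """Record a sample if the tablet moved far enough; return True if recorded."""
        if self.samples:
            last = self.samples[-1]
            if haversine_m(last["lat"], last["lon"], lat, lon) < MIN_SPACING_M:
                return False
        self.samples.append({
            "lat": lat,
            "lon": lon,
            "sim1_dbm": sim1_dbm,  # signal strength on the first SIM
            "sim2_dbm": sim2_dbm,  # signal strength on the second SIM
            "tower_id": tower_id,
        })
        return True


logger = SignalLogger()
logger.update(14.6000, 121.0000, -85, -97, "tower-A")  # first fix is always kept
logger.update(14.6001, 121.0000, -86, -96, "tower-A")  # about 11 m away: skipped
logger.update(14.6002, 121.0000, -88, -95, "tower-A")  # about 22 m away: kept
print(len(logger.samples), "samples recorded")
```

Measuring from the last recorded point, rather than from the previous GPS fix, keeps the map evenly spaced even when fixes arrive many times per second.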

So far, the prediction models I have built have been used to survey nearly five thousand Sari-Sari stores in the Philippines, collect over a million cell signal data points, and recruit hundreds of stores into pilot programmes for business training and mobile money projects.

These applications of the “field-lab” technology only scratch the surface of what is possible. Disaster prediction models can be built into tablets (and already are to some extent) to direct relief workers to the most affected areas, on a house-by-house basis, using built-in modelling and satellite imagery assessments. On the microenterprise side, models will be able to direct relief and predict where stores could serve as immediate aid points for affected areas, meeting urgent food and water needs. International organisations would then be able to contribute funds directly to stores so they can immediately distribute these products to the community. In education, homework built with the technology will be able to predict exactly where each student’s weaknesses are and tailor the curriculum and lessons accordingly. In business, prediction algorithms could be used to assess business opportunities in impoverished areas, tailored to reflect the actual situation of each area, which would help families pull themselves out of poverty. The possibilities are truly endless. The field, rather than the lab, is the new frontier of data technology.

 

[1] International Data Corp. as cited by Science and Technology Research News (https://goo.gl/lFcLgK)

[2] UK Campaign for Real Ale Pub Tracker Number (https://goo.gl/eJrBho)

 

