Since 2013, I have been experimenting with family data science – or the process of drawing deeper understanding and insights to help my wife and daughters grow, stay healthy and be happy. I am just a curious dad who is convinced that a little 21st century IT along with a stream of the right data and analytics can help reinstate a healthy dose of household conscientiousness in between the joy, pressures and chaos of everyday family life.
Imagine starting your morning with your smart home assistant gently advising you to take it a little easier over the coming days. It suggests this because it detected an emerging acute cough and reminds you that during this same period of seasonal change over the past five years, you have tended towards multiple days of coughing and/or bronchitis. Imagine how, in the busyness of every day, this small nugget of timely information helps you adjust and avert a more serious bout of bronchitis, for example.
Looking ahead a few years, imagine smart sensors spread throughout the home, maybe embedded in the walls. These sensors casually record observations regarding your family’s growth, health and happiness. They observe coughs and colds, stress or excitement, and other aspects you control. You allow this because the data collected by these sensors feeds analytical processes that deliver highly personalized and timely well-being insights.
Recent advances in cloud computing, machine learning and the emerging discipline of Data Science are enabling these unprecedented opportunities, and allowing us to rethink how we nurture, encourage and care for the members of our family.
A few years ago I started tracking key aspects of my family’s day-to-day well-being including symptoms, medicines taken, doctor’s visits, activity, vitals and more. I use this data to conduct casual exploration and N-of-1 experimentation to address issues and opportunities affecting my family’s well-being—a process I refer to as Family Data Science.
I was curious what this data would reveal about my ongoing struggles with seasonal allergies. Since my early twenties, I have suffered from mild to severe chronic seasonal allergies. Symptoms typically begin in February in the form of mild itchiness and continue through March, April and May in the form of nasal congestion, fatigue, sleepiness, sinus pressure, dry mouth, sore throat, loose stools and more. I manage to hang on in good years, waiting for Spring to fully blossom and symptoms to slowly subside. In bad years, I require visits to the doctor’s office for sinus infections or similar respiratory ailments. Equally frustrating is the feeling of having wasted Spring’s most beautiful months battling seasonal allergies.
My aim with this project was to uncover insights that will help me adjust and better cope with seasonal allergies in the years ahead.
How did I do it?
My self-tracking projects are powered by ostlog – an open-source Personal Well-being Library. For this project, I used ostlog to record daily symptoms and medicines to a single integrated database as illustrated in Figure 1 below.
Figure 1: Self-tracking with ostlog
The first step was to quantify the impact seasonal allergies were having on my health and well-being. A quick analysis of my 2016 data revealed a distinct spike in allergy-related symptoms during the month of April (see Figure 2).
Note that for this project I excluded seasonal allergies resulting from the onset of Autumn (i.e. October and November).
Figure 2: Monthly allergy symptoms 2016
The data confirmed what I have known for a long time: Spring allergies were at their worst during peak pollen months such as April. What I didn’t realize, however, was the exact nature of this impact. Which of the various types of allergy symptoms was I suffering from the most?
To my surprise, digestive symptoms occurred almost as often as respiratory symptoms in April 2016 (see Figure 3). Before starting this project, I never directly associated digestive symptoms with seasonal allergies.
Based on these findings, I structured this project around the following three questions:
What if any link existed between seasonal allergies and digestive symptoms?
How could I reduce allergy-related respiratory symptoms?
How could I improve antihistamine usage through Spring? (i.e. minimize their use and side-effects while maximizing temporary relief)
What Did I Learn?
I did some research on the link between seasonal allergies and digestion and found this interesting article by the Capital Research Vitality Center:
An estimated 80 percent of the immune system resides in the gut, and when digestive problems set in, immune problems are sure to follow. A chronically inflamed gut—which causes indigestion, heartburn, bloating, pain, diarrhea, constipation, irritable bowel disorders, and more—sends the immune system into overdrive.
As a result, the body becomes hypersensitive and overreacts to stuff it shouldn’t, including pollen, grass, and other triggers associated with spring.
Because allergy symptoms frequently start with poor digestive function, the gut is a great place to start for relief.
I saw this approach in addressing gut health as a potential natural remedy for my chronic seasonal allergy symptoms.
So I started 2017 by eating significantly more anti-inflammatory foods including clementines, broccoli, and spinach. I also eliminated the occasional glass of wine or beer to avoid the irritative effects of alcohol on the intestinal lining.
The next step was to find new ways to reduce respiratory symptoms. Here my research revealed a potential negative impact of excess mucus on both the respiratory and digestive systems. During the worst months of allergy season, I tend to get congested, with mucus buildup making it difficult to breathe freely. To reduce the mucus buildup, I adopted a very simple idea. My wife and I regularly use saline nasal sprays on our daughters to help keep their respiratory pathways mucus free during winter cold and flu season. I wondered if these same nasal sprays would help limit mucus buildup during allergy season. So as part of this project, I also performed two saline rinses a day through March, April and May 2017.
Regarding antihistamine usage (see Figure 3), the curious fact is that I refrained from taking any during the peak pollen month of April 2016. In general I wanted to minimize or avoid their use when possible (especially their side-effects), but I also wanted to be more selective for those days when I really needed the additional relief. I changed my approach through Spring 2017 by limiting doses to just one pill at the first sign of worsening allergy symptoms, or on those days when I knew I would spend the better part of the day outdoors.
To summarize, based on my research I tested the following hypotheses:
By improving gut health, I would have a more effective method and natural remedy to reduce the effects of seasonal allergies.
By keeping my respiratory pathways mucus free with the use of saline nasal rinses, I could improve respiration throughout Spring allergy season, with possible benefits to the digestive system too.
By relying less on antihistamines as a primary remedy, but better timing their use when additional relief was absolutely necessary, I could further reduce the effects of seasonal allergies while minimizing the drug’s side-effects.
The results have been pleasantly surprising. With the exception of a stomach virus suffered in March (most likely unrelated to seasonal allergies but accounted for in Figure 4), I’ve gone through the peak pollen months of Spring 2017 with a noticeable improvement in my ability to cope with allergies (see Figures 4 and 5).
Regarding the saline nasal rinses, of the three types of allergy-related symptoms that affect me the most (i.e. skin irritations, respiratory, digestive) the biggest improvement (see Figure 6) was the reduction in respiratory symptoms, which I assume is a direct result of these nasal rinses. Bottom line, I was finally able to take in and smell the Spring air more days than not in 2017!
Regarding antihistamine usage, I took single doses at the first sign of worsening symptoms. Limiting their use also implied limiting their side-effects, which can be as frustrating as the allergy symptoms themselves.
The results are positive yet more tests are needed in the coming years to confirm the effectiveness of this approach. In general I feel very hopeful to have found what amounts to an effective natural remedy to better cope with Spring allergies.
 Symptoms are recorded and specially marked the first day they are observed. Subsequent observations of the original symptom are not counted twice provided the original continues to persist at least once in any 7-day period starting from the date of the first observation, otherwise the symptom is recorded as a new occurrence.
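The counting rule in this note can be read in more than one way. The sketch below shows one reading, under a stated assumption: a symptom seen again fewer than 7 days after its most recent observation continues the same occurrence, while a gap of 7 days or more starts a new one. The function name and the record layout are hypothetical and are not part of ostlog.

```python
from datetime import date, timedelta


def count_occurrences(observations, window_days=7):
    """Count distinct occurrences per symptom.

    observations: iterable of (date, symptom) pairs, in any order.
    Assumption: an observation continues the current occurrence when the
    same symptom was last seen fewer than `window_days` days earlier;
    otherwise it starts a new occurrence.
    """
    window = timedelta(days=window_days)
    last_seen = {}  # symptom -> date of its most recent observation
    counts = {}     # symptom -> number of distinct occurrences
    for day, symptom in sorted(observations):
        previous = last_seen.get(symptom)
        if previous is None or day - previous >= window:
            counts[symptom] = counts.get(symptom, 0) + 1
        last_seen[symptom] = day
    return counts


# Made-up observations, for illustration only.
sample = [
    (date(2017, 4, 1), "cough"),
    (date(2017, 4, 3), "cough"),       # 2-day gap: same occurrence
    (date(2017, 4, 9), "cough"),       # 6-day gap: same occurrence
    (date(2017, 4, 20), "cough"),      # 11-day gap: new occurrence
    (date(2017, 4, 2), "itchy eyes"),
]
print(count_occurrences(sample))
```

Counting this way keeps a lingering week-long cough from inflating the monthly totals, while a relapse after a symptom-free week still registers as a separate event.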
Digital innovations have given rise to the Quantified Self, a movement enabling individuals to capitalize on the insight-generating power of self-tracking. People self-track to gain deeper insights regarding their mind, body and other aspects of their well-being. These insights can help people make better day-to-day decisions regarding their performance, health, and happiness. For example, they can allow a patient to preempt a doctor’s visit, or transform a necessary visit into an informative, data-driven discussion. Others do it to collect the data required to train personal well-being algorithms that will soon integrate with their smart home assistants.
As a husband and father, I wondered how this could benefit my family too. Could self-tracking help us become more conscientious in our day-to-day? Could it help improve our well-being? Three years ago, I started tracking various aspects of my family’s growth, health and happiness through a practice I refer to as Family Data Science. I wasn’t interested in tracking minute-by-minute calories, moods, or steps. Instead, I wanted a daily record of exceptional events that could be analyzed across time and other dimensions.
My experience is proving valuable in three ways. First, the data allows us to recall events over longer periods and greater detail than our memory alone. Second, the act of self-tracking introduces a moment of pause and reflection in the busyness of every day, and this helps boost conscientiousness. Third, applying machine learning to this data enables unique integration opportunities with today’s growing demand for smart home assistants.
Conventional wisdom teaches us that saving for retirement by maximizing investment account contributions at a young age is a sound strategy for individuals and their future generations. Doing so allows interest to compound while also minimizing tax liability. In this era of digital innovations, Family Data Science is offering similar benefits in the area of health and well-being. Starting early allows families to make better decisions through all stages of their life, and this leads to better outcomes. The value of data compounds over time as well, providing a larger body of evidence to enable deeper and more accurate discovery of insights.
Topol, E. (2015). The Patient Will See You Now: The Future of Medicine Is in Your Hands. Basic Books.
What if I could apply a little family data science to help answer the question “Which neighborhood is right for my family?” In other words, I want to rank future neighborhoods of interest in such a way that the ones on top are guaranteed to satisfy the needs of my family.
Seven years ago, my wife and I moved to Rome, Italy. We moved apartments and neighborhoods quite a number of times just to keep up with the changing demands of a growing family. In two cases, it didn’t take long for us to realize we had moved to a neighborhood that didn’t really suit us.
Looking back over this period, a few factors emerged for those neighborhoods that did work for us, which were curiously missing in those that didn’t.
In this article, I propose a method to rank neighborhoods of interest according to criteria important to us. I then apply this method to automatically rank past and present neighborhoods we’ve lived in in order to see if the ranking reflects our preferences.
I wanted to use some of these metrics, but I needed to apply them at the neighborhood level not just the city level as done by the researchers. One idea for doing this is to rely on remote sensing of satellite images. With geospatial platforms such as Google Earth Engine, this is very much doable.
In the sections below, I use Google Earth Engine to compute neighborhood-related features that are based on the “four generators of diversity” proposed by Jane Jacobs. The final ranking for my neighborhoods of interest will be based on these features.
Our newborn loves to sit back in her stroller and listen to the wind move its way through the tree leaves. Our small girls, on the other hand, love the freedom to scoot or bike around without having to worry about cars and trucks running them over. This means that our ideal neighborhood will have a combination of green as well as park areas closed off to traffic.
I rely on the standard Normalized Difference Vegetation Index (NDVI) to calculate a neighborhood’s ‘greenness’. NDVI works because satellites equipped with near-infrared sensors accurately record the reflection of solar radiation by green vegetation on the ground: healthy vegetation reflects strongly in the near-infrared band and absorbs most red light.
NDVI is a numerical indicator and can be computed by analyzing satellite images in Google Earth Engine (see Image 1).
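For readers who want to see the arithmetic behind the index, here is a minimal sketch of the standard definition, NDVI = (NIR - Red) / (NIR + Red), in plain Python. It does not call Google Earth Engine. The function names, the sample reflectance values, and the choice to average per-pixel NDVI over a neighborhood are assumptions made for illustration, not necessarily how the figures in this article were produced.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance values."""
    total = nir + red
    if total == 0:
        # Convention chosen for this sketch: a pixel with no signal counts as 0.
        return 0.0
    return (nir - red) / total


def mean_ndvi(pixels):
    """Average NDVI over an iterable of (nir, red) pairs, e.g. the pixels
    falling inside a neighborhood boundary (hypothetical aggregation)."""
    values = [ndvi(nir, red) for nir, red in pixels]
    if not values:
        raise ValueError("no pixels supplied")
    return sum(values) / len(values)


# Made-up reflectance pairs: one vegetated pixel, one paved pixel.
sample_pixels = [(0.5, 0.1), (0.2, 0.2)]
print(round(mean_ndvi(sample_pixels), 2))
```

Values close to 1 indicate dense green vegetation, and values near 0 indicate bare or built-up surfaces, which is why a neighborhood average works as a rough greenness score.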
Ranking our Rome neighborhoods of interest by their NDVI confirms what we already knew – Trastevere is one of the greenest and also happens to be our favorite neighborhood in Rome.
Della Vittoria – .33 NDVI
Trastevere – .25 NDVI
Ostiense – .23 NDVI
Testaccio – .19 NDVI
Prati – .13 NDVI
Multiple Land Use
Neighborhoods that serve multiple functions are another aspect of urban life important to us. It is really great when we can greet the neighbors, pick our favorite flavor at the local ice-cream shop, get a haircut and take the kids to the children’s park, all within a short walk of our front door. The foot traffic generated by these types of neighborhoods creates an urban vibrancy unlike that experienced in single-use neighborhoods.
Land Use Mix is one of the variables used by the researchers in the paper I referred to earlier. The underlying principle for Land Use Mix was proposed by Jacobs when she suggested a city district should serve more than one primary function, preferably more than two.
The researchers concluded that the city of Rome has a high Land Use Mix. Living here for the past seven years, however, taught us that not all neighborhoods exhibit the same levels of Land Use Mix.
In my quest to pick the perfect neighborhood, I will also calculate Land Use Mix for the neighborhoods we have lived in using the following formula:

$$\mathrm{LUM}_i = -\sum_{j=1}^{n} \frac{P_{i,j}\,\ln P_{i,j}}{\ln n}$$

where $P_{i,j}$ refers to the share of square footage with land use $j$ in district $i$ (expressed as a fraction between 0 and 1), and $n$ is the number of possible land uses. A land use with a zero share contributes nothing to the sum. For the purpose of my article, a ‘district’ is synonymous with ‘neighborhood’, which I consider to be equivalent to the Rioni of Rome or the Quarters of Rome. Just like the researchers, I will use a value of $n=3$ (1=residential, 2=parks/squares/water, 3=businesses/commercial/government). Dedicated single-use neighborhoods will have $\mathrm{LUM}_i$ equal to zero. When land use is equally divided in all $n$ ways, $\mathrm{LUM}_i$ will equal one. The higher $\mathrm{LUM}_i$, the more mixed the neighborhood’s land use.
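The two boundary cases described above (zero for a single-use neighborhood, one for an even split) can be checked with a short sketch. It assumes the standard entropy-based form of the index, in which each land-use share is weighted by its logarithm and the total is normalized by the logarithm of the number of categories. The function name and the sample shares are hypothetical.

```python
import math


def land_use_mix(shares):
    """Entropy-based Land Use Mix for one neighborhood.

    shares: fraction of area per land-use category, summing to 1
            (e.g. residential, parks/squares/water, business/government).
    Returns a value between 0 (single use) and 1 (evenly mixed).
    """
    n = len(shares)
    if n < 2:
        raise ValueError("need at least two land-use categories")
    if abs(sum(shares) - 1.0) > 1e-6:
        raise ValueError("shares must sum to 1")
    # Categories with a zero share contribute nothing to the entropy.
    entropy = sum(p * math.log(1.0 / p) for p in shares if p > 0)
    return entropy / math.log(n)


print(land_use_mix([1.0, 0.0, 0.0]))            # single-use neighborhood
print(land_use_mix([1 / 3, 1 / 3, 1 / 3]))      # evenly mixed neighborhood
print(round(land_use_mix([0.6, 0.3, 0.1]), 2))  # made-up intermediate case
```

Normalizing by the logarithm of the category count is what keeps the score between zero and one, so neighborhoods can be compared directly.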
Ranking our Rome neighborhoods of interest by their Land Use Mix begins to reveal some interesting characteristics:
Trastevere .73 LUM
Della Vittoria .64 LUM
Testaccio .44 LUM
Prati .44 LUM
Ostiense .34 LUM
Neighborhood Block Size
Jacobs believed people didn’t like walking down long blocks, and instead would avoid them at all costs. She believed short blocks offered more navigation options between Point A and B. With short blocks, pedestrian traffic is more easily distributed and this distribution helps create more viable locations for smaller businesses to mix into residential areas.
This observation plays out on the streets of Rome every day. Neighborhoods composed mostly of small, sometimes quirkily shaped blocks are full of small mom-and-pop stores sitting at the base of multi-level residential buildings.
Not surprisingly, applying the formula for block size in my neighborhoods of interest reveals the following:
Della Vittoria .011
Many times numbers simply confirm what our intuition knew all along. In this exercise, I took the principles proposed by Jacobs and some of the metrics utilized by researchers to rank a set of Roman neighborhoods based on characteristics important to me and my family. I then used Google Earth Engine and remote sensing of satellite images to quantify these characteristics. Having lived in all five neighborhoods of interest, we have strong opinions regarding these neighborhoods and how they compare to one another. Not surprisingly, the final ranking reflects these opinions.
Cervero, R. (1989). Land-Use Mixing and Suburban Mobility. University of California Transportation Center.