My Data Principles

In my still-early journey practicing family data science, I’ve come to understand a few key areas where data can help a lot, and others where it won’t help at all.

First, it doesn’t really make sense to try to quantify creative endeavors, for example hours spent at dance class. At the end of the day, passion and motivation will determine how much of these activities my family engages in (and how often!).

The same applies to long-term plans – I don’t believe I will quantify my way towards achieving the big moonshots / BHAGs in life. I do believe these types of goals should be articulated, and maybe even journaled in the process of attaining them, but that’s about it from a quantification standpoint.

Instead, I think it is better to measure those things where I am confident that conformance to a set standard or level can help determine personal growth, health and/or happiness.  Examples of this include tracking sleep, nutrition, health or cash flow.

Power BI your Family Data Science

Since 2013, I have been experimenting with family data science – or the process of drawing deeper understanding and insights to help my wife and daughters grow, stay healthy and be happy.

I generate and maintain household datasets that record key aspects of my family’s day-to-day lives. I use these datasets to drive my family data science experiments and they can be grouped in one of two ways.

Power BI Visualizations

The first group is mostly transactional in nature. Credit card purchases, blood test results or dance practice attendance are just a few examples of datasets in this group.

The second group is mostly analytical in nature. These datasets aggregate, or count the different aspects recorded in the day-to-day transactional events.
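As a rough illustration of the difference between the two groups, here is a minimal Python sketch that rolls a transactional log up into an analytical summary. The records, field names and the `hours_by_person` helper are made up for this example; they are not the actual household datasets.

```python
from collections import defaultdict

# Hypothetical transactional records: one row per dance practice attended.
attendance_log = [
    {"date": "2016-03-01", "person": "daughter_1", "hours": 1.5},
    {"date": "2016-03-03", "person": "daughter_2", "hours": 1.0},
    {"date": "2016-03-08", "person": "daughter_1", "hours": 1.5},
    {"date": "2016-03-10", "person": "daughter_2", "hours": 2.0},
]


def hours_by_person(records):
    """Aggregate transactional rows into total hours per person."""
    totals = defaultdict(float)
    for row in records:
        totals[row["person"]] += row["hours"]
    return dict(totals)


# The "analytical" dataset is simply a count or sum over the transactions.
practice_summary = hours_by_person(attendance_log)
print(practice_summary)
```

The transactional rows stay untouched, so the same log can later be summarized by month, by activity or by any other dimension.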

During the first few years of running my family data science experiments, I happily collected and aggregated this information. For example, I could tell you the number of hours my daughters spent at dance practice, how many times my wife and oldest daughter suffered from bronchitis during winter months, or the months during which I sent the most work-related emails – no major insights but clarity nonetheless.

These analytics can be amusing for a little while, but as dad and husband, I wanted to discover answers to more probing questions. What happens to my daughter’s respiratory issues when we cut lactose for a period of six months? How do my wife’s sleep patterns improve when she goes swimming on a regular basis? Does my standing heart rate decrease during weeks of healthy eating and intense exercise?

In recent months, I’ve been using Microsoft Excel 2016, and its powerful Get and Transform features, to automatically connect to my household datasets in the Microsoft Azure cloud. Once in Excel, I map these datasets to a Power Pivot data model. Preparing the data this way gives me the freedom to discover and correlate seemingly independent observations, for example whether weeks with increased workout activity improve the family’s sleep patterns and mood.
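To show what "correlating" two such observations can look like outside of Excel, here is a small Python sketch that computes a Pearson correlation coefficient. This is a swapped-in technique for illustration, not the Power Pivot model described above, and the weekly figures are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up weekly figures, one entry per week (not real family data).
weekly_workout_minutes = [60, 90, 120, 150, 180]
weekly_avg_sleep_hours = [6.5, 6.8, 7.0, 7.4, 7.5]

r = pearson(weekly_workout_minutes, weekly_avg_sleep_hours)
print(f"workout vs. sleep correlation: {r:.2f}")
```

A value near 1 or -1 only says the two series move together; with a handful of weeks it says nothing about cause and effect.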

Excel’s potential for this kind of analysis is great. Unfortunately, the Get and Transform and Power Pivot features are clouded by the traditional spreadsheet features. When they coexist in one monolithic software package, the result is slow processing and instability. Excel crashed on me quite a number of times as it connected to and imported 10MB from fourteen datasets in the Azure cloud. The process of analyzing and refreshing the data model was simply too slow for the type of experimentation I was trying to perform.

This is where Power BI Desktop comes to the rescue. Power BI repackages Excel’s Get and Transform and Power Pivot features into an intuitive report authoring environment. Where Excel buries the discovery and visualization features among its traditional spreadsheet capabilities, Power BI brings them front and center.

The end result is that Power BI lets you quickly visualize your data in numerous and diverse ways.   This is a much needed improvement over similar approaches using Excel 2016.

April Fools from BP Monitor

Since 2013, I have been experimenting with family data science – or the process of drawing deeper understanding and insights to help my wife and daughters grow, stay healthy and be happy.

Home blood pressure monitoring is something I do on a regular basis. About a year ago I purchased a Medisana wrist-based monitor, despite the fact that the American Heart Association discourages these types of wrist-based blood pressure monitors due to their inaccuracy!

Nevertheless, the price and reviews were right and I was confident I could overcome the inaccuracy risks by simply using the device as recommended by the manufacturer.

Figure 1 shows a screenshot of my recent BP readings in iPhone’s Health app. Look closely and you will see my BP readings over the past month, including multiple readings for April 1. The high reading on April 1 was 153/99 mmHg, by far an all-time high for me. Was I at risk for hypertension or was something else going on here?

A quick visit to my local pharmacy’s upper arm-based blood pressure monitor resulted in a more respectable 124/82 mmHg blood pressure reading. This confirmed my suspicion that my wrist-based monitor had gone out of whack. I took a few more readings at the pharmacy over the course of the day and then another one in the evening using my wrist-based monitor. The verdict was in – my wrist-based monitor was playing an April Fools’ joke by inflating my readings by over 20%!
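The "over 20%" figure can be checked with a few lines of Python. This sketch compares only the two readings quoted in this post (153/99 on the wrist monitor, 124/82 at the pharmacy); the `percent_above` helper is a name made up for the example, and the two readings were not taken at the same moment, so treat the result as a rough gap rather than a calibration.

```python
def percent_above(reading, reference):
    """Return how far `reading` sits above `reference`, as a percentage."""
    return (reading - reference) / reference * 100


wrist = (153, 99)  # systolic, diastolic in mmHg, wrist-based monitor
arm = (124, 82)    # systolic, diastolic in mmHg, pharmacy upper-arm monitor

systolic_gap = percent_above(wrist[0], arm[0])
diastolic_gap = percent_above(wrist[1], arm[1])

print(f"systolic: {systolic_gap:.1f}% higher")    # systolic: 23.4% higher
print(f"diastolic: {diastolic_gap:.1f}% higher")  # diastolic: 20.7% higher
```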

Figure 1: Blood Pressure readings in March

School night anxiety?

Since 2013, I have been experimenting with family data science – or the process of drawing deeper understanding and insights to help my wife and daughters grow, stay healthy and be happy.  A recent analysis in one of my family datasets revealed an interesting observation regarding our kids and the days of the week in which they tend to experience the most symptoms (e.g. cough, runny nose, congestion).

The dataset in question is a log of symptoms I keep to track the things affecting my family over the course of the year.

Pivoting my way through this dataset, I landed on a table showing the frequency of my daughters’ symptoms by days of the week (see table 1 below).

Table 1: Symptoms by day of week

When I showed my wife this table, we couldn’t help reacting to the fact that the largest share of symptoms (thirty-two percent) occurred on Sundays. Of course there can be many reasons for this, but we wondered whether good ol’ Monday back-to-school anxiety may be stressing the kids unnecessarily.
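For readers who prefer code to pivot tables, a day-of-week breakdown like the one in table 1 can be sketched in Python as follows. The log entries and the `share_by_weekday` helper are invented for illustration; the real symptom log is not reproduced here.

```python
from collections import Counter
from datetime import date

# Fixed English names so the output does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up log entries: (date the symptom was recorded, symptom).
symptom_log = [
    (date(2016, 1, 3), "cough"),        # Sunday
    (date(2016, 1, 10), "runny nose"),  # Sunday
    (date(2016, 1, 13), "congestion"),  # Wednesday
    (date(2016, 2, 7), "cough"),        # Sunday
    (date(2016, 2, 9), "cough"),        # Tuesday
]


def share_by_weekday(entries):
    """Return the fraction of log entries recorded on each day of the week."""
    counts = Counter(DAY_NAMES[day.weekday()] for day, _ in entries)
    total = sum(counts.values())
    # Days with no entries are left out of the result.
    return {name: counts[name] / total for name in DAY_NAMES if counts[name]}


weekday_share = share_by_weekday(symptom_log)
for name, share in weekday_share.items():
    print(f"{name}: {share:.0%}")
```

Note that this counts the day a symptom was recorded, which is not necessarily the day it first appeared.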

Digging in a little more, table 2 reveals that 97% of these Sunday symptoms occurred in quarters when school was in session. In other words, the quarter with almost no Sunday symptoms is the same quarter when school is mostly closed for summer!

Table 2: Sunday symptoms by quarter

Numbers can be deceiving. Tables 1 and 2 support the school night hypothesis, but table 2 also suggests an entirely different one that has more to do with the onset of colder weather: 80% of the Sunday symptoms in table 2 occurred during the mostly cold Northern Hemisphere autumn and winter months. Another factor complicating the original hypothesis is the date on which I recorded the symptoms. I tend to have more free time on Sundays than on other days of the week, so it is entirely possible I was using that free time to record symptoms which, in actuality, appeared before Sunday.

Conclusive one way or the other? Definitely not. With only a few years of data, it is clearly too early to reach any credible conclusions on this. Nonetheless, the data does make for some interesting observations, which is one of the benefits of practicing a little family data science.

Sharing account passwords with the rest of the family

Consider the unfortunate event where I am hit by a bus, or a more common one where I am on a business trip without internet access to complete an urgent bank transaction.   How could I ensure that my wife or daughters (when they are older) can easily access and administer the accounts I depend on?

I have 47 of these accounts covering personal email accounts, bank accounts, student loan accounts, Dropbox, social media accounts and a host of other websites. They also include accounts to manage my family’s digital records including marriage certificates, birth certificates, passports, national identity cards and much more.

I want to ensure that my family has instant access to these accounts without needing me to provide the login credentials. I also want to enable my wife and daughters to share their account information with the rest of the family if they choose to do so.

If you are also facing this problem then take a look at the recently announced 1Password family edition.


When counting things just won’t cut it

Welcome to familysmarts.net.  In this third post I move past the stage of simply ‘counting’ things in the household datasets I generate, to a new stage where I seek a different type of data to support probing family data science related questions.

If you are new to this site, here is a brief overview. Since 2013, I have been experimenting with family data science – or the process of drawing deeper understanding and insights to help my wife and daughters grow, stay healthy and be happy. I am not (yet) one of those obsessed quantified-self data geeks, nor do I buy into the fitness tracker fads. I also have no intention of replacing intuition and the good ol’ role of parenting with smart bots that guide your loved ones towards doing the right thing. Rather, I am just a curious dad who is convinced that a little 21st century IT, along with a stream of the right data and analytics, can help reinstate a healthy dose of household conscientiousness in between the joy, pressures and chaos of everyday family life.

I generate and maintain datasets that record key aspects of my family’s day-to-day lives.  I use these datasets to drive most of my family data science experiments and they can be grouped in one of two ways.

The first group is mostly transactional in nature.  Credit card purchases, blood test results or dance practice attendance are just a few examples of datasets in this group.

The second group is mostly analytical in nature.  These datasets aggregate, or count the different aspects recorded in the day-to-day transactional events. 

During the first few years of running my family data science experiments, I happily collected and aggregated this information. For example, I could tell you the number of hours my daughters spent at dance practice, how many times my wife and oldest daughter suffered from bronchitis during winter months, or the months during which I sent the most work-related emails.

These analytics can be amusing for a little while, but as dad and husband, I wanted answers to more probing questions. What happens to my daughter’s respiratory issues when we cut lactose for a period of six months? How do my wife’s sleep patterns improve when she goes swimming on a regular basis? Does my standing heart rate decrease during weeks of healthy eating and intense exercise?

To answer these types of questions I needed a new type of dataset.  One that would do less ‘counting’ of day-to-day events and instead more ‘explaining’ of how these events come together in unique ways that help my family grow, stay healthy and be happy. 

sergio@familysmarts.net