Personal Biosensors and the Internet of Medical Things


There is a tsunami of new devices and apps out there that will help you record everything from the number of steps you took in a day to calories and caffeine ingested, sleep quality, weight, blood pressure, and blood glucose levels.  The next revolution in medicine will be the Internet of Medical Things (IoMT): uniquely tagged devices that help monitor blood pressure, blood glucose, physical activity, temperature, sleep, and even motion.  Along with patient-entered data from tablets, mobile devices, and conventional desktop computers, data from these devices will change the face of medicine, increase our ability to engage patients in their own health behaviors, and provide massive amounts of data for population health studies on an unprecedented scale.

Personal biosensor devices (PBD’s) like Fitbit and Jawbone have become all the rage, with many corporations looking to provide PBD’s to employees with the goal of improving employee health.  Often the devices are paired with financial incentives to motivate people to change behavior.  As reported last year in Wired, the company Citizen has a program in which employees have voluntarily agreed to share their fitness, productivity, and happiness data.  Many vendors, such as FitLinxx, SparkPeople, and Endomondo, specifically offer employer packages.

Mobile apps are branching out and rapidly linking with these devices, allowing geospatial and biometric data to be coupled.  The volume of data generated by these devices, already in use by hundreds of thousands, if not millions, of people, will be staggering.  This past year, clinical research and clinical trials started to incorporate data from smartphones and PBD’s.

At present, it is unclear whether apps or PBD’s will alter health behavior.  Despite their ubiquity, there is little data on improvement in glucose control by diabetics who use such mobile software to manage their blood sugars.  Do weight loss and calorie counting apps really achieve their goals?  I think that it’s fair to say that anecdotal evidence suggests great promise in many cases.  From a personal standpoint, my Fitbit has made me more aware of my sedentary computer habits, and motivated me to take more steps and get out and run more.  My favorite recent awareness-raising app, pointed out to me by my colleague Joshua Schwimmer, is UpCoffee by Jawbone.  I had no idea of the half-life of caffeine before I downloaded the app!

The impact of PBD’s and apps may not be all good, or all predictable.  Sometimes, personal bio-sensing apps can actually lead to bad outcomes.  An article by Alice Gregory in the New Republic last year describes how calorie counting mobile fitness apps can worsen eating disorders.  Given the studies that have described the addictive properties of electronic devices and the internet, and the underlying biology, it is not surprising that these problems can be exacerbated in people with addictive or compulsive behavior tendencies or illnesses.

Where all this leads, we don’t know yet.  Certainly to very large data sets and something far beyond telemedicine.  Something exciting is happening in medicine and research.  I hope that this will lead to the ability to crowdsource population health research questions and studies beyond our wildest imagination.  What would you study if you had access to data from a million PBD’s?

Geospatial Data and HIPAA

How have privacy regulations affected the use of GIS data?

Since 1854, when John Snow used geospatial mapping to locate the well spreading cholera in London, GIS data has been a cornerstone of public health and epidemiology research.  Today, a wealth of data sources is available for research.  For example, locate a patient within a census tract in the United States, and a variety of information such as average income in the area, demographic data, and other census information can be linked directly to your patient-specific study data.  Alternatively, in an innovative study from Brazil, GIS mapping software was used to determine that the distance an expectant mother had to travel through urban transportation networks to reach healthcare was an important risk factor for death during pregnancy.  Similar studies have used GIS data to examine infant mortality, rural population HIV mortality, and tuberculosis control measures.  While geocoding large amounts of data for medical epidemiology studies can be extremely informative, you need to be careful not to run afoul of government privacy laws, especially the HIPAA Privacy Rule in the United States.

The Health Insurance Portability and Accountability Act (HIPAA) rules define protected health information (PHI), which may include diagnoses, test results, payment or visit information.  The intent was to protect people against disclosure of health information in conjunction with information that could reveal their identity.  This identification information consists of 18 identifiers, such as name, social security number, and date of birth.  The definition of “identifiable information” also includes any data that would allow another person to re-identify a person directly or indirectly without access to a specific code or key.  For geospatial information, the personal identifiers include a person’s street address and ZIP code.  GIS coordinates are considered an “equivalent geocode”, meaning that they are as good as a street address.  Imagine a map plotting the location of eight people infected with HIV in a sparsely populated rural area.  It would not take much to match that data up with a specific person.  The point is that all such information needs to be de-identified before it can be released or worked on outside of a HIPAA-compliant data storage and analysis environment.

De-identification of GIS data in healthcare research can be thought of as a two-part process:  de-identifying data while obtaining the set of coordinates used to plot a person’s location, called geocoding, and de-identifying the data when presenting the results of your research.

Geocoding is the process of translating an address into a set of XY coordinates that can be used to plot a location on a map.  You could do this easily by feeding a list of addresses into one of several geocoding services on the internet such as bulkgeocoder, Google, MapQuest, CloudMade, or ArcGIS Online.  But, if you have lists of patient data, this could be a massive HIPAA violation.  The best way to make sure you are HIPAA compliant is to use a geocoding firm with which you have a business associate agreement (BAA) that will take your information and generate the geocodes in a HIPAA-compliant and secure environment.  An important best practice is to process a list of addresses that have been separated from any other information, and can only be linked by a secure, randomized key.  Once the geocoding service returns your data, you can link it back to your complete research file.  It is unclear, however, whether submitting a list of addresses using an e-mail address containing information about a diagnosis (e.g. Researcher@DiabetesInstituteResearch.Org) outside of a BAA would constitute a breach, since one might infer the diagnosis of people at addresses on the list from the organization name.  Best to consult your organization’s privacy officer about this issue.
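
The address-separation practice described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not a compliance tool: the field names ("address", "key", "lat", "lon") and the function names are hypothetical, and the geocoding service is assumed to return each key alongside its coordinates.

```python
import secrets


def split_for_geocoding(records):
    """Separate addresses from all other data, linked only by a random key.

    `records` is a list of dicts that each contain an "address" field
    (a hypothetical field name chosen for this sketch).
    """
    address_rows = []   # the only file that goes to the geocoding service
    research_rows = []  # stays inside the secure environment
    for record in records:
        key = secrets.token_hex(16)  # random; encodes nothing about the patient
        address_rows.append({"key": key, "address": record["address"]})
        rest = {name: value for name, value in record.items() if name != "address"}
        rest["key"] = key
        research_rows.append(rest)
    # Shuffle so that row order does not mirror the order of the research file.
    secrets.SystemRandom().shuffle(address_rows)
    return address_rows, research_rows


def relink(research_rows, geocoded_rows):
    """Attach returned coordinates to the research rows using the random key.

    `geocoded_rows` is assumed to look like {"key": ..., "lat": ..., "lon": ...}.
    """
    coords = {row["key"]: (row["lat"], row["lon"]) for row in geocoded_rows}
    linked = []
    for row in research_rows:
        if row["key"] in coords:
            lat, lon = coords[row["key"]]
            linked.append(dict(row, lat=lat, lon=lon))
    return linked
```

The address list carries nothing but a key and an address, so the service never sees a diagnosis.  The key-to-patient mapping lives only in the research file, which never leaves your secure environment.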

Once you have done your analysis and wish to publish plotted geocoded data, it must be done in such a way that no one can identify an individual by examining the data set alone or in combination with other publicly available data.  Think of the map of firearm owners in Westchester County published by a local newspaper.  If it had been a map of people with a diagnosis of leukemia, it would have been a HIPAA violation.  De-identification methods can be quite sophisticated, such as statistical de-identification.  The Department of Health and Human Services has sponsored an interesting workshop discussing these issues.  Several methods are available to avoid this pitfall:

  • Point aggregation – combining points into geographic bins, such as ZIP code areas, counties, states, or other areas.  This way, no individual data point is identifiable as a person, but the bins must have a sufficient population and subject density.
  • Geostatistical analysis – One example is creating a probability map, where any area represents the probability of a study subject having a particular condition or value.  Again, no individual points are plotted.
  • “Jittering” data involves adding or subtracting some random values to a precise GIS location so that an individual point is not precisely located on a diagram.
  • Data point displacement by translation, rotation, or change of scale.
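
Two of the methods above, point aggregation and jittering, are simple enough to sketch in Python.  The minimum bin size of 11 and the 0.01-degree offset (roughly a kilometer of latitude) are illustrative assumptions only, not values taken from the HIPAA rules, and the function names are hypothetical.

```python
import random
from collections import Counter


def aggregate_by_area(area_codes, min_count=11):
    """Count subjects per geographic bin and drop bins that are too small.

    `area_codes` holds one bin label (e.g. a county name) per subject.
    `min_count` is an assumed suppression threshold for this sketch.
    """
    counts = Counter(area_codes)
    return {area: n for area, n in counts.items() if n >= min_count}


def jitter(lat, lon, max_offset_deg=0.01, rng=random):
    """Move a point by a random offset of at most max_offset_deg per axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )
```

Small bins are suppressed here instead of published, because a count of two or three in a sparsely populated area can be as revealing as a plotted point.  Jittering alone may not be enough in such areas either, so the offset should be chosen with the local population density in mind.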

Resolution of the map is also important, as is the population density of the area you are plotting data for.  One needs to be careful, as well, that the de-identification methods do not change the validity of your research results.

So, the use of large GIS data sets is a tremendous opportunity for population health research, but requires specific practices with respect to de-identification when analyzing and publishing that data.  Geocode and aggregate carefully!

Norovirus, Networks, and Big-Data

Another norovirus outbreak has been in the news, related to a group of cases on a cruise ship.  With over 700 passengers and crew falling ill, it is one of the largest outbreaks on a cruise ship ever reported.  Norovirus is a highly contagious member of the Caliciviridae family, and contains multiple genotypes and subtypes.  Small mutations in the norovirus genome lead to new strains, similar to the phenomenon of antigenic drift in influenza viruses.  Larger mutations can lead to pandemic strains when the prevailing population immunity to older strains is no longer effective against the new strain.  The United States is in the midst of the norovirus season, with a new strain being responsible for most cases.

How is Big Data Science revolutionizing the tracking and prediction of norovirus outbreaks?  The US Centers for Disease Control and Prevention now tracks norovirus outbreaks through a combination of traditional outbreak surveillance, as reported by public health departments around the US, and confirmation by molecular testing of specimens from symptomatic individuals.  But an alternative Big Data approach, real-time social media monitoring, is being tested in the UK by the Food Standards Agency (FSA).  Tweet the hashtag #Barf in London, and your tweet will be added to the FSA statistics, along with the geographic location.  About 50% of gastrointestinal illnesses in the US and UK are caused by norovirus, so tweets and Google searches about stomach cramps, vomiting, and diarrhea have a high likelihood of being norovirus related!  FSA researchers found that an upswing in hashtags describing GI symptoms occurred 3-4 weeks before an outbreak was identified by traditional laboratory surveillance.
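
To make the idea of an "upswing" concrete, here is a minimal Python sketch that flags weeks in which symptom-hashtag counts rise well above the recent baseline.  It is not the FSA's method, which is not described here.  The trailing four-week baseline and the two-standard-deviation cutoff are assumptions chosen purely for illustration.

```python
from statistics import mean, stdev


def flag_upswings(weekly_counts, baseline_weeks=4, threshold_sd=2.0):
    """Return the indices of weeks whose count exceeds the recent baseline.

    A week is flagged when its count is greater than the mean of the
    preceding `baseline_weeks` counts plus `threshold_sd` standard deviations.
    Requires baseline_weeks >= 2 so a standard deviation can be computed.
    """
    flagged = []
    for i in range(baseline_weeks, len(weekly_counts)):
        baseline = weekly_counts[i - baseline_weeks:i]
        cutoff = mean(baseline) + threshold_sd * stdev(baseline)
        if weekly_counts[i] > cutoff:
            flagged.append(i)
    return flagged
```

A real system would also need to handle seasonality, changes in overall tweet volume, and hashtags used as jokes, none of which this sketch attempts.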

#Vomit:  Predicting Norovirus Outbreaks with Twitter

So how can Big Data Science contribute to solutions?  Recognizing outbreaks in real time using Big Data analytics is a start.  Taming data velocity and volume is key here.  Early recognition can lead to containment and to public health strategies that can limit the outbreak.  But potential solutions go beyond larger public health responses.  One of the major ways individuals can prevent the spread of the virus, and keep themselves from being infected, is simple good hygiene such as hand washing.  Norovirus outbreaks occur more frequently in places where people are living together and have risk factors such as being elderly, immunosuppressed, or very young.  Day care centers, nursing homes, and hospitals are the key areas.  In a novel application of Big Data Science real-time analytics, IBM has developed a method of tracking hand washing among healthcare workers after each patient contact.  An RFID tag carried by the worker, coupled with sensors that record entry into the room, exit, and use of a hand sanitizer dispenser, has led to pronounced increases in hand washing.  The data is still out on whether this will reduce infectious outbreaks or their spread, but if the promise bears out, look for such systems in high-risk areas such as institutional kitchens, day care centers, and other areas.  It does seem a bit Big Brother-ish, which is a topic for my next post…

For now… wash your hands, tweet your symptoms, and stay healthy!