Facebook and Healthcare data, the contrarian view.

Recently, I read an article that Facebook had been considering partnering with hospitals to connect their social data with the hospitals’ patient data in order to provide improved services to patients. Facebook had decided, in the wake of the Cambridge Analytica fiasco, to put those plans on hold. Here is the video version of the report, which is definitely worth watching.

I thought this was disappointing, because I know that many patients rely on social media generally, and Facebook in particular, to coordinate patient care. Connecting healthcare data with a patient’s social graph, when done with permission and with limited and intelligent goals, could result in real improvements in patient care, especially for our most vulnerable populations.

I tweeted as much:

I have been surprised by the subsequent reactions; very few of my tweets seem to garner this much attention or engagement. Given this reaction, I thought it wise to more carefully defend my position.

I do not think anyone should claim “expertise” in anything as nebulous and unknowable as healthcare cybersecurity currently is, but I am definitely comfortable saying that I am not a novice.

I have spent time thinking carefully about the intersection of healthcare information systems, cybersecurity and privacy. This has led me to be frequently at odds with other cybersecurity experts who are legitimately concerned about the dangers of connecting too early.

The problem that I see again and again is knee-jerk policy reactions to technology potential and, more generally, a tendency toward talking-head histrionics regarding healthcare information privacy. Probably the most extreme of these, historically, has come from my friend Dr. Deborah Peel. Dr. Peel has continued to suggest that all health information exchange halt until it can be made entirely secure and entirely respect patient privacy and ongoing consent.

The problem with that approach is that it tends to drive healthcare data exchange efforts “underground”. The discussion about Facebook’s change in policies is a good example of such fear-mongering. Note how CNBC chose to frame the news that Facebook was NOT going to reach out to hospitals. Let me quote some of the article, highlighting some of the terms that I find concerning.

Facebook sent a doctor on a secret mission to ask hospitals to share patient data

Facebook was in talks with top hospitals and other medical groups as recently as last month about a proposal to share data about the social networks of their most vulnerable patients.

The idea was to build profiles of people that included their medical conditions, information that health systems have, as well as social and economic factors gleaned from Facebook.

Now, CNBC is not as given as some of the other networks to outright fear-mongering, but I do need to quibble with this type of reporting. First, if you read the article closely you will see that the project intended to link data using a two-sided hashing mechanism. This serves to protect the privacy of both the Facebook user data, and the hospitals’ patient data. The headline makes it seem like it would be trivial for both the hospital and Facebook to identify these patients. Of course, such a dataset would be relatively simple to re-identify given how much of Facebook’s user data is public information. And it is highly unlikely that either Facebook or the hospitals intended to release this merged dataset to the public. Still, de-identifying a dataset like this is a useful precaution to ensure that researchers are not tempted to violate patient privacy.
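To make the idea concrete, here is a minimal sketch of what linking on hashed identifiers can look like. The article does not describe the actual mechanism, so everything here is an assumption for illustration: the shared key, the use of an email address as the matching field, and the function names pseudonymize and link_records are all hypothetical. In practice each party would hash its own identifiers before any exchange; both steps are shown in one place only to keep the example short.

```python
import hashlib
import hmac

# Hypothetical secret agreed on by both parties (an assumption, not from the article).
SHARED_KEY = b"example-shared-secret"


def pseudonymize(identifier: str, key: bytes = SHARED_KEY) -> str:
    """Return a keyed SHA-256 hash of a normalized identifier."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def link_records(hospital_rows, social_rows):
    """Join two datasets on hashed identifiers; the output holds no raw identifiers."""
    hospital = {pseudonymize(r["email"]): r["diagnosis"] for r in hospital_rows}
    social = {pseudonymize(r["email"]): r["close_friends"] for r in social_rows}
    # Keep only the people who appear in both datasets.
    return {
        h: {"diagnosis": hospital[h], "close_friends": social[h]}
        for h in hospital.keys() & social.keys()
    }


# Made-up sample rows, purely for illustration.
hospital_rows = [
    {"email": "Pat@example.com", "diagnosis": "CHF"},
    {"email": "lee@example.com", "diagnosis": "COPD"},
]
social_rows = [
    {"email": "pat@example.com ", "close_friends": 2},
    {"email": "sam@example.com", "close_friends": 40},
]
merged = link_records(hospital_rows, social_rows)
```

The merged result is keyed only by the hash, so a researcher working with it sees a diagnosis next to a social measure without seeing who the person is. As noted above, this is a precaution and not a guarantee: anyone holding the key and a list of candidate identifiers could recompute the hashes.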

This type of de-identification strategy would have made the resulting dataset almost useless for Facebook’s main profit center: selling targeted advertising. But the article makes it sound like this is the aim, because the partnership was seeking to “build a profile”. A profile that is not connected to an identity is… well, it’s not a profile. A profile is a kind of aggregation of multiple people, and a dossier is about one specific person. De-identified data is not really either one of those things. It is about a specific person, unlike a profile, but unlike a dossier, no identity is attached. In this respect, building a “profile” does not seem like such a big deal; it’s an aggregate of people, a single average that is useful to help understand many potential individuals…

In fact, “building a profile” is clearly not the aim of such an endeavor, but only an intermediate goal. The reason such a “double anonymized” merged dataset would be useful is that you could learn how to help patients by studying it. The research might help the hospitals, and Facebook, to understand how to better serve the patients that they both have as customers. A non-public anonymized dataset like this, shared only between a limited set of researchers representing the two parties who contributed data, is pretty hard to abuse.

In fact, this is exactly the type of research that both Facebook and the hospitals have an independent and shared ethical obligation to undertake. There are more patients who share clinical information across Facebook than across any software designed for that purpose (typically those products are called PHR systems). Facebook users use the platform every day to coordinate caregiving for their friends and family. They use it to coordinate whose turn it is to make dinner, and to coordinate which “friend” is going to show up and hold their loved one’s hand. They use it to promote the go-fund-me pages that have frequently taken the place of comprehensive health insurance in this country. They use it to request prayers, when the pain is really bad, and the pills no longer work.

Is this a good idea? Well, there are many who would warn that sharing data publicly like this is dangerous and they are profoundly correct. But this sharing is not done because users trust Facebook, quite the contrary. Facebook is tolerated, as a gateway to the friendships and family members that Facebook users so desperately need when they become seriously ill.

That use-case is not what Mark Zuckerberg imagined in his Harvard dorm. Frankly, it was a case of great foresight for Mark to guess that people might use his young platform to get laid. But as Facebook has become the de-facto mechanism to connect with friends and family, especially across generations, it has also become a very common place for patients to connect with their care community. Or at least the parts of their care community that are NOT professional clinicians.

The professional clinicians not only fail to connect with patients’ friend and family networks, they also fail to connect digitally with each other. Instead, they are rewarded for hoarding their portion of a patient’s medical data to themselves, a problem regularly referred to as the “silo” problem in healthcare informatics.

For more than three decades, there has been no technical reason why patient data is not regularly shared between healthcare providers. However, our healthcare system continues to financially reward providers who hoard rather than share data. Healthcare technologists like myself do not, despite appearances to the contrary, work to make data sharing possible. Instead, we spend our careers desperately seeking technical solutions to health data exchange that are politically palatable.

So I hope you can understand that when two parts of the healthcare eco-system start to consider collaborating in a way that helps patients, this is something that we should celebrate with… concerned… optimism.

And I am concerned. I am very concerned that Facebook’s basic structure does little to protect those who share healthcare information across its network already.

Just as we should be concerned that Apple’s recently announced hospital integrations will serve to reduce the investments that hospitals make in other patient-data sharing methods, which might serve to widen the digital health divide. Poor people in this country have trouble affording iPhones, which could soon be one of the few ways to conveniently access their own hospital data. But we should cautiously celebrate Apple’s work in this area.

We should be concerned that Google has recently announced multiple new Health IT API initiatives, despite having unceremoniously shut down its previous healthcare API offering.

We should celebrate Grindr’s efforts to encourage regular STD testing, even if this action has been clearly overshadowed by the news that they were sharing HIV status with third party companies.

Look, if you are this far in the article and thinking that I am defending Facebook’s egregious cybersecurity mistakes, its constantly over-reaching data grabs and generally cavalier (even sometimes malicious) attitude towards personal privacy, then you are missing my point entirely. As twitter user _j3lena_ pointed out correctly, it is only reasonable to assume that there are dozens of other organizations that have Facebook data on the same scale as Cambridge Analytica. That is just the one that we see. (updated to acknowledge _j3lena_’s comments)

Facebook has been a privacy nightmare for years, and I am very hopeful that they might see their failure in these areas as an existential threat. Because they should go out of business if they cannot ensure that their platform is something more than a monetized privacy-abuse vector. Facebook deserves to go the way of the Dodo if they cannot help their users differentiate between real and fake news. Make no mistake, Russians advertising on Facebook is a big problem, but this pales in comparison to the personal consequences for a person who is convinced not to give their child a vaccine because of a Facebook group.

My point is just this. We need to give companies credit when they embrace security best-practices as they pursue ethically reasonable goals. Like leveraging a hashing-based de-identification scheme in an attempt to use patients’ data to help clinicians improve the care they give those patients. We need to criticize, and if needed, boycott and regulate companies that abuse our data. We need to have national policies that create real consequences for companies that abuse their positions of trust.

But we also need to give credit where credit is due, and Facebook was probably trying to do some good work with this hospital collaboration.

I hope this better explains my tweet.


Good Questions:

As per always, the Twitter community has given me new things to think about.

  • First, it is not clear what it means for this to be done “in secret”. If this deal included non-disclosure agreements, that is problematic.
  • Second, and this is something that I did not get into, but that CNBC did a good job emphasizing, especially in the video version of the report, is that it is not clear how, or if, explicit patient consent would have been involved.


Added several good points from Twitter, and as @corbinpetro pointed out, it’s CNBC and not CBS.


The Federal Government Recommends JSON

It is the policy of many governments to support transparency with the release of Open Data. But few understand how important it is that this Open Data be released in machine-readable, openly available formats. I have already written a lengthy blog post about how, most of the time, the CSV standard is the right data standard to use for releasing large open data sets. But really JSON, XML, HTML, RTF, TXT, TSV and PDF files, which are all open standard file formats, each have their place as appropriate data standards for governments to use as they release Open Data.

But it can be difficult to explain to someone inside a government or non-profit, who is already releasing Open Data, that CSV is a good standard, but XLSX (Microsoft Excel) is not. For many people, a CSV really is an Excel file, so there is no difference in their direct experience. But for those of us who want to parse, ETL or integrate that data automatically, there is a world of difference in the level of effort required between a clean CSV and a messy XLSX file (not to mention the cybersecurity implications).

A few months ago (sorry, I get distracted) Project Open Data, which is a policy website maintained and governed jointly by the Office of Management and Budget and the Office of Science and Technology Policy of the US Federal Government, updated its website to include W3C and IETF as sources of Open Data Format Standards, by accepting a pull request that I made. As I had expected, not including IETF and W3C in the list of sources of Open Standards was an omission and not a conspiracy (sometimes I panic).
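For readers who have not had to do this, here is a rough sketch of how little code a clean CSV needs, using only the Python standard library. The column names and numbers are invented for illustration and do not come from any real data release. An XLSX file, by contrast, is a zipped bundle of XML documents that the standard library cannot read directly, so the same task starts with installing a third-party library and deciding which sheet, header row and merged cells to trust.

```python
import csv
import io

# Invented sample data standing in for a clean CSV release.
sample = "provider_id,state,patient_count\n1001,TX,250\n1002,CA,410\n1003,TX,90\n"


def total_by_state(csv_text):
    """Sum the patient_count column for each state in a CSV document."""
    totals = {}
    # DictReader uses the first row as column names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        state = row["state"]
        totals[state] = totals.get(state, 0) + int(row["patient_count"])
    return totals


totals = total_by_state(sample)
```

With a real release, the only change would be passing an open file object to csv.DictReader instead of the in-memory string.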

This is a very important resource for those of us who advocate for Open Data. It means that we can use a single URL link, specifically, this one:


To indicate that it is the policy of the United States Federal Government that it not only release Open Data, but that it do so using specific standards that are also open. Now that the W3C and IETF are added, the following data standards are by proxy included in the new policy regarding open data standards:

Obviously these four standards make up almost all of the machine readable Open Data that is already easy to work with, and with a few exceptions represent the data formats that 95% (my guesstimate) of all Government data should be released in. In short, while there are certainly other good standards, and even cases where we must tolerate proprietary standards for data, most of the data that we need to release should be released in one of these four data formats.

For those of us who advocate for reasonableness in Open Data releases, this is a pretty big deal. We can now simply include a few links to publicly available policy documents rather than arguing independently for the underlying principles.

And because the entire Project Open Data website is so clear, concise and well-written, and because it comes with the implicit endorsement of the US Federal Government (OMB and OSTP), this is a wonderful new resource for advocating with National, State, City, Local and International governments for the release of Open Data using reasonable data formats. Hell, we might even be able to get some of the NGOs to consider releasing data correctly because of this. My hope is that this will make complaining about proprietary format data releases easier, and therefore more frequent, and help us to educate data releasers on how to make their data more useful. Which in turn will make it easier for data scientists, data journalists, academics and other data wonks to create impact using the data.

My applause to the maintainers and contributors to Project Open Data.







My political bias

I think, when one starts to write what could be politically explosive blog posts, it is good form to reveal one’s political biases and opinions. I am about to write several, so this is a nice preface that I can just link to, in order to explain my perspective on current political issues.

Let me assure you. I join you, in your disgust for that other party. No matter what party you consider the other party and what party you consider “your” party.

I grew up conservative, in a household that had fought for the Reagan Administration in more ways than one. Modern “conservative” values look nothing like what I grew up with. I still find the core messages of the conservative ideal appealing. I do not want the government doing what corporations, or non-profits should be doing and tend to prefer small government.

I also find mainstream liberal ideas persuasive: the notion that no one should be afraid of getting sick, and that sometimes the government needs to step in when corporations or criminals start to abuse people who are not in a position to defend themselves.

I am utterly dissatisfied with Obamacare, which is much better than any current Trumpcare proposal. Obamacare is deeply problematic in many ways. But if it is in a “death spiral” it is only to the degree that such a spiral can be caused by the current administration pulling the rug out from underneath it. The currently proposed TrumpCare options are orders of magnitude worse. So bad in fact that I think they are more likely straw-man negotiation tactics between the middle-right and far-right components of the Republican Party.

All of which is to say, that if I have any biases, they are against every current political party. I did not choose to vote in the last presidential election, because I felt strongly that I had no viable political options. One candidate had demonstrated that she was very willing to “hard wire” her victory by rigging the outcome of the Democratic Party election. Everyone continues to emphasize how terrible the Russian hacking was, but the Democrats still wrote all the damning emails. And the only conservative thing about Trump is that he is A. not Hillary Clinton and B. willing to pretend to be against abortion on demand… which for the religious right meant that he was tolerable, despite being the antithesis of everything else they believe in.

In short, I believe very strongly that it’s basically all bullshit, and both parties have completely betrayed the US citizenry by substantially betraying their own core values. This is likely the result of dark money in politics and the only political donations I am currently willing to give are to organizations like RootStrikers.

I have no illusions that the United States is the “best” country in the world. Everyone who says that is loading up the word “best” with their own desires, and then brow-beating the rest of us into submission if we disagree. But we are a pretty badass country full of badass people who are talented, brilliant, assertive, clever and moral. Why are we being given such poor choices in our leadership?

As a healthcare data wonk, I will not mince words. As far as healthcare goes, here is the basic reality: It’s necessarily very expensive and everyone is pretending that if they were in charge then it would not be expensive. Obamacare at least was a respectable try at solving this problem, and so far Trumpcare is not a respectable attempt. I hope that changes, because if it does not it could be pretty bad, especially for poor people.

I hope Trump and Congress can fix this, because they have basically destroyed anything they could that Obamacare needed to succeed, in order to ensure that their “death spiral” criticism is valid. It’s like shooting an animal and then saying: “See, this here animal is wounded and useless. No good at all. Sad.”


By undermining Obamacare, without having a viable replacement plan in place, Trump is taking an awful risk with millions of lives. I hope his gamble pays off, for all our sakes.

Just wanted to be clear where I stood on things, since I think my readers have a right to know.


Drones and healthcare. A brain dump.

Here are my thoughts about drones in healthcare.

  • Most people are not really aware of how blindingly fast small drones are. Most demonstrations have them moving at a snail’s pace. They are incredibly quick and can cover large distances in the short time that they have battery life.
  • This makes drones ideal for the delivery of small light-weight packages. We can easily foresee a time when very small doses of medications are delivered each day by a drone. This will help patients to adhere to a medication schedule much more cleanly… which will likely reveal the degree to which patients are actually making very different drug taking choices than what their providers think they are. This could have both positive and negative impacts for patients who rely on opioids for pain control.
  • If we do start to deliver expensive medications via drone, then shooting them down will become a sport for criminals. This is a likely outcome for any drone-based delivery system. Of course, the cameras on drones should also make it very difficult to avoid being caught. Especially if large groups of drones team together. Imagine the behaviors of wasps or bees when “one” of their own is molested.
  • It is possible that groups of drones could be used instead of helicopters for airlifting patients. It’s easier to show than describe. This might lead to “get outside so you can be gotten” being an important part of instructions for handling strokes and/or heart attacks. Perhaps doors will become smart enough to allow teams of drones in for airlift purposes.
  • Drones are surprisingly capable of cooperating to accomplish complex goals. So the notion of multiple drones working together to airlift an unconscious person from inside a house all the way to either emergency care, or another (fully charged) group of emergency lift drones, is not unreasonable.
  • We can imagine a feature of future luxury homes being a locally available series of “emergency airlift drones” that are capable of detecting the need for help, or responding to shouts etc etc. This could lead to another layer of healthcare disparities.
  • This disparity might be easier to resolve by having apartment complexes have “shared” emergency drone pods, that can respond to emergencies in the local area.
  • Drones can swim and fly. I expect that this will become a part of swimming pools, with one drone capable of rescuing a drowning person, and handing them over to drones capable of doing further airlift. Drowning is a major source of childhood deaths, and I expect that in the same way that “rails” are advocated for swimming pools today, tomorrow AI drown detection and drowned person retrieval will become standard for pools, lakes and rivers.
  • In the interim, drones will be used to detect local health hazards. For instance, you can use a drone fly-over to detect swimming pools that do not have rails (an ironic tie in) but you can also use them to search for mosquito breeding sources, like the massive puddles that currently form outside my apartment (sore subject, I did say it was a brain dump).
  • Drones will become a source of hyper accurate environmental data. Have questions about local air quality? That will soon be sampled at a rate, and with an accuracy, hundreds of times greater than today. This will lead to the ability for public health issues to become local enforcement issues. This could become accurate enough to sort out when a heavy smoker is impacting a local school yard or park. Of course, one might expect that witch-hunts around drone data might become common.
  • For instance, stalking and private hyper-tracking of individuals will become a problem. Now rather than sitting outside a woman’s home, a disgruntled ex-boyfriend can just program a drone to track her every move. Or, police might choose to constantly monitor the movements of convicted sex predators. “Observation rights” are about to be a thing. And observation and scrutiny are known to have mental health implications.
  • The ability to deliver medications via drone will not be limited to legitimate sources. In fact illegal drug delivery is already happening, because drug dealers do not give a shit about FAA regulations the way that Amazon does.
  • In general automated delivery of every kind, groceries etc etc, will give people less reason to leave the house. This will reduce walking and make hyper-sedentary behaviors easier. Given the correlation between the health of urbanites who are currently forced to walk and suburbanites who drive everywhere, this could create a third even more sedentary population. That is going to be expensive.

That’s all for now, I expect I will add to this…


Self-driving cars and healthcare. A brain dump.

Here are my thoughts on how self-driving cars relate to healthcare. In no particular order.

  • First, it is very likely that self-driving technology is already well past the reliability and safety of any human driver. This makes for a classic engineering ethics debate. How long will our society tolerate a technical solution to a problem (driving) that we know is much less safe than another solution (AI drivers), just because we are used to a particular paradigm (human drivers)?
  • Second, there will be a very strange middle stage when self-driving becomes available in new cars. AI drivers are likely to be vastly more cautious than normal drivers. Some are very concerned that this will cause a problem with humans “bullying” AI drivers. But there is also going to be a health disparity created here. People who can afford new cars could be nearly immune from car accidents, creating a new kind of haves/have nots.
  • We can expect that the geo structure of modern life will change substantially. The introduction of highways to the US caused a migration to suburbs by making a longer drive shorter. Soon it may become both affordable and popular to live in very rural areas, because long commutes will be easier to handle. If you can read, type and make phone calls safely as your car handles a two hour commute to work, long drives will become much more tolerable. This could increase populations in very rural areas, making emergency services drive longer for more people. This could dramatically increase the need for automated drone-based life-flights that are capable of air-lifting people much more cheaply.
  • Similarly, very urban areas could entirely lose parking facilities. Instead of parking cars, they would either drive themselves away from urban areas to use cheaper parking, or perhaps stay active in “uber-mode” delivering other passengers to different destinations. Car ownership could become something that is done only by the very rich (who choose to afford their own self-driving car rather than subject themselves to the availability of a pool of unowned cars) or the very poor (who choose to drive themselves in older cars, and face the problem of vanishing parking).
  • As human-driven cars age, they will become subject to more mechanical problems and become more unreliable, increasing the delta between their safety and that of self-driving cars (which will be smart enough to seek their own maintenance).
  • It is entirely possible that humans will choose to unionize drivers. So even though a truck is capable of driving itself, a human “monitor” will be required to be present for “accountability” purposes (but really just so they stay employed). We are already seeing that “sitting is the new smoking” and these “driver monitors” could lead to a hyper-sedentary lifestyle that could have very negative impacts. Drivers are already subject to lots of unhealthy behaviors, which could be made much worse by disengaging them from any energy expending processes at all.
  • The impact on policing cannot be over emphasized. Many small police departments rely entirely on traffic tickets for revenue. In fact, many small towns are entirely run on ticket revenue. Large police departments also rely on traffic violations for funding policing. In some ways, one might consider the police presence as something that is enabled by the fact that roaming police are constantly able to gain revenue by ticketing traffic violations. Depending on how this issue is resolved we could see police targeting “pedestrian violations” much more heavily, or we could see an uptick in violence as a result of lowered policing. When you consider the possibility of “drone police cars” that are themselves lowering the cost of police presence… it becomes very difficult to predict how policing, and as a result, violent crime, will change in response to self-driving technology.
  • A significant share of organ donations are made as the result of traffic accidents. This could lead to a critical shortage of donations. This shortage will likely cause the funding for organ printing to sky-rocket. This is similar to the “parity costs” for solar energy. There is a kind of “parity cost” for organ printing in healthcare and dramatically reduced accident rates as the result of self driving cars could change that calculus very quickly. However, we may face a decade(ish) worth of organ shortage (and corresponding shortages) before organ printing works fully, and after accident based organ donation trails off.
  • Accidents of any kind are generally estimated to be the sixth, fifth or fourth leading cause of death in the United States. Auto accidents are the most common cause of accident. If you eliminate this as a “way to die” it will put greater pressure on heart disease, stroke and age related disorders like Parkinson’s and Alzheimer’s disease. This is a good problem to have, of course, but it needs to be accounted for.

That’s all I can think of off the top of my head.