Drones and healthcare. A brain dump.

Here are my thoughts about drones in healthcare.

  • Most people are not really aware of how blindingly fast small drones are. Most demonstrations have them moving at a snail's pace. In reality they are incredibly quick and can cover large distances in the short time that their batteries last.
  • This makes drones ideal for the delivery of small, lightweight packages. We can easily foresee a time when very small doses of medications are delivered each day by drone. The notion that drug delivery could be “bursty” could have major impacts, specifically:
    • In the future, people will be able to call 911 and have an epi-pen delivered anywhere in a major city in a matter of minutes, much faster than an ambulance could arrive. People with serious allergies will have “panic buttons” on their cell phones that enable on-demand delivery of epinephrine via drone.
    • Daily delivery of meds will enable patients to adhere to a medication schedule much more cleanly… which will likely reveal the degree to which patients are actually making very different drug-taking choices than their providers think they are.
    • This could have both positive and negative impacts for patients who rely on opioids for pain control.
  • If we do start to deliver expensive medications via drone, then shooting them down will become a sport for criminals. This is a likely outcome for any drone-based delivery system. Of course, the cameras on drones should also make it very difficult to avoid being caught. Especially if large groups of drones team together. Imagine the behaviors of wasps or bees when “one” of their own is molested.
  • It is possible that groups of drones could be used instead of helicopters for airlifting patients. It's easier to show than to describe. This might lead to “get outside so you can be gotten” being an important part of instructions for handling strokes and/or heart attacks. Perhaps doors will become smart enough to allow teams of drones in for airlift purposes.
  • Drones are surprisingly capable of cooperating to accomplish complex goals. So the notion of multiple drones working together to airlift an unconscious person from inside a house, either all the way to emergency care or to another (fully charged) group of emergency lift drones, is not unreasonable.
  • We can imagine a feature of future luxury homes being a locally available set of “emergency airlift drones” capable of detecting the need for help, or responding to shouts, etc. This could add another layer of healthcare disparities.
  • This disparity might be easier to resolve by having apartment complexes host “shared” emergency drone pods that can respond to emergencies in the local area.
  • Drones can swim and fly. I expect that this will become a part of swimming pools, with one drone capable of rescuing a drowning person and handing them over to drones capable of further airlift. Drowning is a major source of childhood deaths, and I expect that in the same way that “rails” are advocated for swimming pools today, tomorrow AI drowning detection and drowned-person retrieval will become standard for pools, lakes and rivers.
  • In the interim, drones will be used to detect local health hazards. For instance, you can use a drone fly-over to detect swimming pools that do not have rails (an ironic tie in) but you can also use them to search for mosquito breeding sources, like the massive puddles that currently form outside my apartment (sore subject, I did say it was a brain dump).
  • Drones will become a source of hyper-accurate environmental data. Have questions about local air quality? That issue will soon be sampled at hundreds of times the resolution currently available. This will allow public health issues to become local enforcement issues. It could become accurate enough to sort out when a heavy smoker is impacting a local school yard or park. Of course, one might expect that witch-hunts around drone data will become common.
  • For instance, stalking and private hyper-tracking of individuals will become a problem. Now, rather than sitting outside a woman’s home, a disgruntled ex-boyfriend can just program a drone to track her every move. Or the public might choose to constantly monitor the movements of convicted sex predators. “Observation rights” are about to be a thing. And observation and scrutiny are known to have mental health implications.
  • The ability to deliver medications via drone will not be limited to legitimate sources. In fact illegal drug delivery is already happening, because drug dealers do not give a shit about FAA regulations the way that Amazon does.
  • In general, automated delivery of every kind, groceries etc., will give people less reason to leave the house. This will reduce walking and make hyper-sedentary behaviors easier. Given the difference in health between urbanites who are currently forced to walk and suburbanites who drive everywhere, this could create a third, even more sedentary population. That is going to be expensive.

That's all for now; I expect I will add to this…


Self-driving cars and healthcare. A brain dump.

Here are my thoughts on how self-driving cars relate to healthcare. In no particular order.

  • First, it is very likely that self-driving technology is already well past the reliability and safety of any human driver. This makes for a classic engineering ethics debate: how long will our society tolerate a technical solution to a problem (driving) that we know is much less safe than another solution (AI drivers), just because we are used to a particular paradigm (human drivers)?
  • Second, there will be a very strange middle stage when self-driving becomes available in new cars. AI drivers are likely to be vastly more cautious than normal drivers. Some are very concerned that this will cause a problem with humans “bullying” AI drivers. But there is also going to be a health disparity created here: people who can afford new cars could be nearly immune from car accidents, creating a new kind of haves/have-nots.
  • We can expect that the geo structure of modern life will change substantially. The introduction of highways to the US caused a migration to suburbs by making a longer drive shorter. Soon it may become both affordable and popular to live in very rural areas, because long commutes will be easier to handle. If you can read, type and make phone calls safely as your car handles a two hour commute to work, long drives will become much more tolerable. This could increase populations in very rural areas, making emergency services drive longer for more people. This could dramatically increase the need for automated drone-based life-flights that are capable of air-lifting people much more cheaply.
  • Similarly, very urban areas could entirely lose parking facilities. Instead of parking, cars would either drive themselves away from urban areas to use cheaper parking, or perhaps stay active in “uber-mode,” delivering other passengers to different destinations. Car ownership could become something that is done only by the very rich (who choose to afford their own self-driving car rather than subject themselves to the availability of a pool of unowned cars) or the very poor (who choose to drive themselves in older cars, and face the problem of vanishing parking).
  • As cars age, they become subject to more mechanical problems and grow less reliable, which will increase the safety delta between human-maintained cars and self-driving cars (which will be smart enough to seek out their own maintenance).
  • It is entirely possible that humans will choose to unionize drivers. So even though a truck is capable of driving itself, a human “monitor” will be required to be present for “accountability” purposes (but really just so they stay employed). We are already seeing that “sitting is the new smoking,” and these “driver monitors” could end up in a hyper-sedentary lifestyle that could have very negative impacts. Drivers are already subject to lots of unhealthy behaviors, which could be made much worse by disengaging them from any energy-expending processes at all.
  • The impact on policing cannot be overemphasized. Many small police departments rely entirely on traffic tickets for revenue; in fact, many small towns are entirely run on ticket revenue. Large police departments also rely on traffic violations to fund policing. In some ways, one might consider police presence as something enabled by the fact that roaming police are constantly able to gain revenue by ticketing traffic violations. Depending on how this issue is resolved, we could see police targeting “pedestrian violations” much more heavily, or we could see an uptick in violence as a result of lowered policing. When you consider the possibility of “drone police cars” that themselves lower the cost of police presence… it becomes very difficult to predict how policing, and as a result violent crime, will change in response to self-driving technology.
  • A large share of donated organs come from the victims of traffic accidents, so self-driving cars could lead to a critical shortage of donations. This shortage will likely cause funding for organ printing to sky-rocket. This is similar to the “parity costs” for solar energy: there is a kind of parity cost for organ printing in healthcare, and dramatically reduced accident rates as the result of self-driving cars could change that calculus very quickly. However, we may face a decade(ish) of organ shortages (and the corresponding harm) after accident-based organ donation trails off and before organ printing fully works.
  • Accidents of any kind are generally estimated to be the sixth, fifth or fourth leading cause of death in the United States, and auto accidents are the most common cause of accidental death. If you eliminate this as a “way to die,” it will put greater pressure on heart disease, stroke and age-related disorders like Parkinson’s and Alzheimer’s disease. This is a good problem to have, of course, but it needs to be accounted for.

That's all I can think of off the top of my head.


NoSQL and Technical Debt

So I just had a chance encounter with a former professor of mine, Dr. Hicks from Trinity University.

While I was at Trinity University, mumble mumble long time ago, Dr. Hicks assured me that I needed to take his database class, the course in the Trinity CS department that was most focused on SQL. That class had a sterling reputation: it was regarded as difficult, but practical and comprehensive. Dr. Hicks is a good-natured and capable teacher, and approached the topic with sophistication, enthusiasm and affection. But databases and SQL are a complicated topic, and no amount of nice-professoring is going to make it easier. It was a hard class, and at the time, avoiding a hard class seemed like the right decision for me.

I did not take this class, which I profoundly regret. Not a moralizing regret, of course; more of a “that decision cost me” regret. Since leaving Trinity, I have had to teach myself database design, database querying, normalization, denormalization, query optimization, data importing and in-SQL ETL processes.

Over the last 10 years, I estimate that I have spent at least 1000 hours teaching myself these methods outside of a single class. In terms of missed-opportunity dollars, as well as simple compensation, I am sure it has cost me upwards of $100k (this is what happens when you are an entrepreneur: when you take too long to do some work, you suffer as both the client and the programmer). I really wish someone had taken me through the basics, so that I would have only had to teach myself the advanced database topics that I apply more rarely. As it is, I have lots of legacy code that I am moderately embarrassed by. Not because it is bad code, but because it is code that I wrote to solve problems that are well-solved in a generic manner inside almost all modern database engines.

Dr. Hicks also mentioned to me that he was deliberating how much he should consider including NoSQL technologies in his class. He indicated that students regarded the NoSQL topics as more modern and valuable, and regarded the SQL topics with some distaste.

This prompted the following in-person rant from me on Technical Debt, which I thought might be interesting to my readers (why are you here again?) and perhaps some of Dr. Hicks’s potential students.

His students made the understandable but dangerous error of seeing a new type of technology as exclusively a progression from an old one. NoSQL is not a replacement for SQL, it is solving a different problem in a different way. Both helicopters and airplanes fly, but they do so in different ways that are optimized to solve different problems. They have different benefits and handicaps.

The right way to think about NoSQL is as an excellent solution to the scaling problem, at the expense of supporting complex queries. NoSQL is very simple to query effectively, precisely because complex queries are frequently impossible at the database level. SQL is much harder to learn to query, because one must understand almost everything about the underlying data structures as well as the way the SQL language works, before query writing can even start.
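
The tradeoff is easier to see in code than in prose. Here is a minimal sketch using Python's built-in sqlite3 for the SQL side and a plain dict as a stand-in for a key-value NoSQL store; the patient/prescription data is invented for illustration.

```python
import sqlite3

# Relational side: an upfront schema buys you ad-hoc queries later.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prescription (patient_id INTEGER, drug TEXT);
""")
db.execute("INSERT INTO patient VALUES (1, 'Ada')")
db.execute("INSERT INTO prescription VALUES (1, 'epinephrine')")

# A join the engine answers for you, because it knows the structure.
rows = db.execute("""
    SELECT patient.name, prescription.drug
    FROM patient JOIN prescription ON prescription.patient_id = patient.id
""").fetchall()

# Key-value side: trivial to fetch by key...
kv = {"patient:1": {"name": "Ada", "drugs": ["epinephrine"]}}
record = kv["patient:1"]

# ...but "who takes epinephrine?" becomes a full scan in application
# code, because the store cannot join or filter across records for you.
takers = [v["name"] for v in kv.values() if "epinephrine" in v["drugs"]]
```

The key-value lookup is one line and scales beautifully; the price is that every question other than “fetch by key” is now your application's problem.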

Most of the time, the right way to think about any technology is:

  • What does this make easy?
  • What does this make hard?
  • What mistakes does this encourage?
  • What mistakes does it prevent?

Almost all modern programming languages are grappling with the “powerful tool I can use to shoot myself in the foot” problem. Computer science shows us that any Turing-complete programming language can fully emulate any other programming language. So you can use Fortran to make web pages if you want, but PHP makes it easy to make web pages. You can use PHP to do data science, but R makes that easy. And of course, languages like Python seek to be “pretty good” at everything.

NoSQL tends to encourage a very specific and dangerous type of technical debt, because it enables a programmer to skip upfront data architecture in favor of “store it in a pile and sort it out later.” This is roughly equivalent to storing all of your clothes, clean and otherwise, in large piles on the floor. Adulthood usually means using closets and dressers, since ordering your clothes storage has roughly the same benefits as ordering your data storage.

SQL forces several good habits that avoid the “pile on the floor effect”. To use SQL you have to ask yourself, as an upfront task:

  • What does my data look like now?
  • What relationships are represented in my data?
  • How and Why will I need to query this data in the future?
  • How much data of various types do I expect to get?
  • What will my data look like in the future?
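
Answering those questions up front turns into concrete schema decisions. A sketch, again in sqlite3, with an invented provider/claim schema (the tables and fields are mine, purely for illustration):

```python
import sqlite3

# Answers to "what does my data look like?" and "what relationships
# exist?" become types, keys, and constraints the engine enforces.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE provider (
        npi  TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE claim (
        id           INTEGER PRIMARY KEY,
        provider_npi TEXT NOT NULL REFERENCES provider(npi),
        amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0)
    );
""")
db.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the link
db.execute("INSERT INTO provider VALUES ('1234567890', 'Dr. Example')")
db.execute("INSERT INTO claim VALUES (1, '1234567890', 9900)")

# The engine now defends those upfront decisions: a claim with a
# negative amount is rejected outright instead of landing on the pile.
try:
    db.execute("INSERT INTO claim VALUES (2, '1234567890', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Every question you answered at design time is one less class of garbage data you have to sort out of the pile later.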

With NoSQL, you get to defer these decisions. With NoSQL, you get to just throw the data on the pile, in a very generic manner, and figure out later how you want to use the data. Because of NoSQL’s underlying emphasis on scaling, you can be sure that you can defer these decisions without losing data. If all you need to do is CRUD, at scale, and data analysis is secondary, NoSQL can be ideal. Most of the time, however, data analysis is critical to the operation of an application. When you have to have both scaling and data analysis… well, that is a true data science topic… there is no bottom in that pond.

SQL is not the only querying language that enforces this discipline. Neo4J has created a query language called Cypher that has many of the same underlying benefits as the SQL language, but is designed for querying graph structures rather than tables. Unlike traditional NoSQL databases, Neo4J enforces upfront thinking about data structures, much like a SQL database; it just uses a different underlying data structure: a graph instead of a table. In fact, with time, having experience with both SQL and graph databases, I have started to understand when the data I am working with “wants” to be in a graph database vs. a traditional tabular SQL database. (Hint: if everything is many-to-many and shortest paths or similar things matter… then you probably want a graph database.)

Indeed, it is not a requirement that you forgo careful data structures in NoSQL systems. NoSQL experts very quickly realize that using schemas for data is a good idea, even if doing so is not enforced by the engine.

The key underlying concept of the Technical Debt metaphor is that a programmer must consciously decide how much debt to incur, in order to avoid the crisis of software that requires so much eventual maintenance that no further progress can be made on it. Essentially, there is something like “software design bankruptcy” that we should stay far away from.

As with financial debt, bankruptcy is not actually the worst state to reach with technical debt. The worst state, in both finances and technology, is poverty created and sustained by interest payments, what people sometimes call “debt slavery.” Another state to avoid is taking on no debt at all. Debt is a ready source of capital, and can be used to dramatically accelerate both technical and financial progress.

Also like real life, most individuals manage debt poorly, and the few individuals who learn to use debt wisely have a significant advantage.

But the first step to managing debt wisely is to recognize when you are taking debt on, and to ensure that it is done with intention and forethought. Make no mistake, forging ahead without designing your data structures is a kind of hedonism, not dissimilar from purchasing drinks you cannot afford on your credit card.

If you are looking forward to a career with data, not learning SQL is the technical-debt equivalent of taking a payday loan. By learning SQL carefully, you will learn to forecast and plan your data strategy, which in many cases is at the heart of your application. Even if you later abandon SQL for the sake of another database with some other benefit, the habits you learn from careful data structure planning will always be valuable. Even if you never actually use a SQL database in your career.







Better NDC downloads from the FDA

Recently, the FDA Division of Drug Information, Center for Drug Evaluation and Research dramatically improved how their NDC search tool data downloads work in response to some complaints they received from… someone. Most notably they:

  • Added the NDC Package Code (the NDC-10 code with the dashes) to each row as a distinct field. This is the only field that is unique per row!
  • Added the ability to download the results in plain CSV. (Previously you could only get Microsoft Excel files, which is a proprietary format.)


NDC search and data download improvements

This makes the download functionality much more useful, and IMHO, that improvement makes the searching generally much more worthwhile.

Data hounds like me just download the entire NDC database, which is already available as open data. But these files use non-standard data formats and require special ETL processing to work with conveniently. Now, you can make useful subsets of the NDC data and then download those subsets in an open standard. Those CSV files will make working with the data in spreadsheets (other than Excel) and automatic import into databases much easier.
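
As a sketch of why the per-row-unique package code matters: with Python's standard csv module, a downloaded subset drops straight into a loop or a database import. The rows below are invented and in the rough shape of the download, not actual NDC data.

```python
import csv, io

# Hypothetical rows in the shape of the improved download; the codes
# and names here are made up, not real NDC entries.
sample = (
    "NDC Package Code,Proprietary Name,Dosage Form\n"
    "12345-678-90,Exampledrug,INJECTION\n"
    "12345-678-91,Exampledrug,INJECTION\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
codes = [r["NDC Package Code"] for r in rows]

# One unique package code per row means the field can serve directly
# as a primary key on import, with no ETL gymnastics first.
assert len(codes) == len(set(codes))
```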

This is especially welcome given my recent rant about using simple download formats. I think it is really important to recognize the folks at the FDA who work every day to ensure that medication information is a little more useful to the public.

Thank you!



Open Data Frustrations

First, let me say that I applaud and salute anyone who releases open data about anything as relevant as healthcare data. It is a tough and thankless slog to properly build, format and document open data files. Really, if you work on this please know that I appreciate you. I value your time and your purpose in life.

But please get your shit together.

Get your shit together

Please do not make your own data format standards. Please use a standard that does not require me to buy any proprietary expensive software to read. The best open standards have RFCs. Choose one of those.

And most of all: if a comma-delimited file will work for your data, just use a CSV. If you were thinking, “but what if I have commas in my data?”… well, you are just wrong. CSV is an actual standard (RFC 4180). It has ways to escape commas, and most importantly, you do not need to think about that. All you need to do is use the CSV export functionality of whatever you are working with. It will automatically do the right thing for you.
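
To make the “commas in my data” point concrete, here is Python's standard csv module round-tripping a field that contains both a comma and quotes. The drug name is invented; the escaping behavior is the point.

```python
import csv, io

# A field containing a comma AND embedded quotes (made-up name).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["NDC Package Code", "Proprietary Name"])
writer.writerow(["12345-678-90", 'Examplol, "Mix 50/50"'])

# The writer quoted and escaped the awkward field per the CSV
# standard, and reading it back round-trips the value exactly.
rows = list(csv.reader(io.StringIO(buf.getvalue())))
```

You never touched a quoting rule; the library did the right thing, which is the whole argument for using the standard.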

You are not doing yourself any favors creating a fixed length file structure. Soon, you will find that you did not really account for how long last names are. Or you will find an address that is longer than 40 characters. Or the people at the FDA will add another digit to sort out NDC codes… or whatever. CSV files mean that you do not have to think about how many characters your data fields use. More importantly, it means that I do not need to think about it either.
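
Here is what that fragility looks like from the consumer's side, with a hypothetical fixed-width layout of my own invention:

```python
import csv, io

# A hypothetical fixed-width layout: 10 chars of last name, 28 of
# address, 2 of state. Every consumer must hard-code these offsets.
record = "SMITH".ljust(10) + "123 MAIN ST".ljust(28) + "TX"

last_name = record[0:10].strip()
address   = record[10:38].strip()
state     = record[38:40]

# If the publisher ever widens the address field, the state slice
# silently reads the wrong characters; nothing fails loudly. A CSV
# row carries its own field boundaries, so nobody counts characters.
row = next(csv.reader(io.StringIO("SMITH,123 MAIN ST,TX\n")))
```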

You might be thinking “We should use JSON for this!” or “XML is an open standard.” Yes, thank you for choosing other good open formats… but for very large data sets, you probably just want to use a CSV file. The people at CMS thought JSON would be a good standard to use for the Qualified Health Plan data… and they did in fact design the standard so you could keep the JSON files to a reasonable size. But the health insurance companies have no incentive to keep their JSON files a reasonable size, and so they have multi-gigabyte JSON files. That is hugely painful to download and a pain to parse.

Just use CSV.

I was recently working with the MAX Provider Characteristics files from Medicaid. Here are the issues I had.

  • They have one zip file from 2009 which empties into a directory with the same name as the zip file. That means that the zip file will not open, because it is trying to write to a directory with the same name as the original file. I have to admit, I am amazed that this mistake is even possible.
  • In 2009, the zip files made subdirectories. In 2010 and 2011 they dumped to the current directory, tar-bomb style. (Either way is fine; pick one.)
  • Sometimes the file names of the ‘txt’ files are ALL CAPS and sometimes not, even in the same year’s data.
  • Sometimes the state codes are upper case like ‘WI’ and ‘WV’, sometimes they are camel case ‘Wy’ and ‘Wa’, sometimes they are lowercase ‘ak’ and ‘al’. Of course, we also have ‘aZ’.
  • Usually the structure is StateCode.year.maxpc.txt, like GA.2010.maxpc.txt. Except for that one time when they wrote it FL.Y2010.MAXPC.TXT.
  • The actual data in the files is in fixed-length format. Each year, you have to confirm that all of the field lengths are the same in order to ensure that your parser will continue to work.
  • They included instructions for importing the data files in SAS, the single most expensive data processing tool available. Which is, of course, what they were using to export the data.
  • They did not include instructions for any of the most popular programming languages. SAS does not even make the top-20 list.
  • There are multiple zip files, each with multiple files inside. We can afford a download that is over 100 MB in size. Just make one. single. CSV file. Please.
  • Sometimes the files end in .txt. Other times they just end in a ‘.’ (period).
  • The files are not just text files; they have some cruft at the beginning that ensures they are interpreted as binary files.
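
To show what this inconsistency costs downstream, here is the kind of defensive parsing it forces. The filename variants are quoted from the list above; the tolerant pattern that accepts all of them is my own construction.

```python
import re

# Accepts GA.2010.maxpc.txt, FL.Y2010.MAXPC.TXT, aZ.2011.maxpc. and
# similar variants: optional 'Y', any letter case, optional extension.
PATTERN = re.compile(r"^([A-Za-z]{2})\.Y?(\d{4})\.maxpc(?:\.txt|\.)$",
                     re.IGNORECASE)

def parse_name(filename):
    """Return (STATE, year) for any of the observed naming variants."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    state, year = m.groups()
    return state.upper(), int(year)  # 'aZ', 'Wy' and 'ak' all normalize
```

Every quirk a publisher introduces becomes another branch like this in every consumer's ETL code, which is the real cost of inconsistent naming.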

Now how does that make me feel as someone trying to make use of these files? Pretty much like you might expect.

I love open data in healthcare. But please, please, start using easy to use and simple data standards. Get your shit together. I spend too much time hacking on ETL, I need to focus on things that change the world. And guess what… you need me to focus on those things too.

So if you are reading this (and you might very well be because I specifically referred you to this rant), please do the right thing.

Soon, this advice will likely be formally compatible with the Open Data policies of the Federal Government.

  1. Use an open standard for your data
  2. Use CSV if you can
  3. Are you ABSOLUTELY SURE that you cannot use CSV?
  4. Use JSON if you cannot use CSV
  5. Use XML if you cannot use CSV or JSON
  6. Are you looking to compress and manage massive amounts of data, moving it around at a furious rate, in an almost-binary compressed format? Perhaps try Protocol Buffers.
  7. Find the Protocol Buffers page confusing? It's because you should be using CSV. So just use CSV.
  8. Make your data and file naming consistent, so that a machine can process it.

This way, we can have all of the wonderful tools for CSV data processing available to us. Joy!

Thank you.

Updated Mar 22 2017 (added protocol buffers and links)


Google Intrusion Detection Problems

(Update: Less than four hours after tweeting out this blog post, we got our access turned back on. So the Google support team is definitely listening on social media. We very much appreciate that, because it resolves this issue as a crisis. We are still concerned by the “auto-off” trend and the missing button. But we will be working to make sure there is a better long term solution. Will update this post as appropriate moving forward.)

So today our Google Cloud account was suspended. This is a pretty substantial problem, since we had committed to leveraging Google Cloud at DocGraph. We presumed that Google Cloud was as mature and battle-tested as other carrier-grade cloud providers like Microsoft, Rackspace and Amazon. But it has just been made painfully clear to us that Google Cloud is not mature at all.

Last Thursday, we were sent a message titled “Action required: Critical Problem with your Google Cloud Platform / API Project ABCDE123456789”. Here is that message:


Which leads to our first issue: Google refers to the project by its ID, not its project name. It took us a considerable amount of time to figure out what they were talking about when they said “625834285688”. We probably lost a day trying to figure out what that meant. This is the first indication that they would be communicating with us in a manner that was deeply biased towards how they view their cloud service internally, totally ignoring what we were seeing from the outside. While that was the first issue, it was nowhere near the biggest.

The biggest issue is that it was not possible to complete the “required action”. That's right: Google threatened to shut our cloud account down in 3 days unless we did something… but made it impossible to complete that action.

Note that they do not actually detail the action needed, in the “action required” email. Instead they refer to a FAQ, where you find these instructions:


So we did that… and guess what, we could not find the blue “Request an appeal” button anywhere. So we played a little “where's Waldo” on the Google Cloud console.

  • We looked where they instructed us to.
  • We looked at the obvious places
  • We looked at the not-obvious places

As far as we can tell, there was no “Request an appeal” button anywhere in our interface. Essentially, this makes actually following the request impossible.

So we submitted a support request saying “Hey you want us to click something, but we cannot find it” and also “what exactly is the problem you have with our account in any case?”

However, early yesterday morning, despite our reaching out to their support services to figure out what was going on, Google shut our entire project down. Note that we did not say “shut down the problematic server” or even “shut down all your servers”. Google Cloud shut down the entire project. Although we use multiple Google Cloud APIs, we thought it made sense to keep everything we were doing in the same project. For those wondering, that is a significant design flaw, since Google has fully automated systems that can shut down entire projects and that cannot be manually overridden. (Or at least, they were not manually overridden for us.)

We have lost access to multiple critical data stores because Google has an automated threat detection system that is incapable of handling false positives. This is the big takeaway: it is not safe to use any part of Google Cloud Services, because their threat detection system has a fully automated allergic reaction to anything it has not seen before, and it is capable of taking down all of your cloud services, without limitation.

So how are we going to get out of this situation? Google offers support plans where you can talk to a person if you have a problem. We view it as problematic that interrupting an “allergic reaction” is treated as a “support issue”. However, we would be willing to purchase top-tier support in order to get this resolved quickly. But there does not appear to be an option to purchase access to a human to get this resolved. Apparently, we should have thought about that before our project was suspended.

Of course, we are very curious as to why our account was suspended. As data journalists, we are heavy users of little-known web services. We suspect that one of our API client implementations looked to Google's threat detection algorithms like it was “hacking” in one way or another. There are other, much less likely explanations, but that is our best guess as to what is happening.

But we have no idea what the problem is, because Google has given us no specific information about where to look. If we were actually doing something nefarious, we would know which server was the problem. We would know exactly how we were breaking the rules. But because we are (in one way or another) a false positive in their system, we have no idea where to even start looking for the traffic pattern that Google finds concerning.

Now when we are logged in, we simply see an “appeal” page that asserts, boldly, “Project is in violation of Google's Terms of Service”. There is no conversation capacity, and filling out the form appears to simply loop back to the form itself.

It hardly matters; Google's support system is so completely broken that this issue represents a denial-of-service attack vector. The simplest way to take down any infrastructure that relies on Google would be to hack a single server, and then send really obvious hack attempts outbound from that server. Because Google ignores inbound support requests and has a broken “action required” mechanism, the automated systems will take down an entire company's cloud infrastructure, no matter what.

Obviously, we will give Google a few hours to see if they can fix the problem, and we will update this article if they respond in a reasonable timeframe, but we will likely have to move our infrastructure to a cloud provider that has a mature user interface and support ticketing system. While Google Cloud offers some powerful features, it is not safe to use until Google abandons its “guilty until proven innocent, without an option to prove it” threat response model.







Mourning Jess Jacobs

Yesterday, Jess Jacobs died.

Like so many others on Twitter, I knew Jess just well enough to be profoundly frustrated as I watched helplessly as the healthcare system failed her again and again. Today, the small world of Internet patient advocates mourns for her across blogs and Twitter. The world of people who are trying to fix healthcare from underneath is small, and relationships that form on social media around a cause can be intense. There is nothing like an impossible, uphill battle to make lasting friendships. Now this community is responding to the loss of not only one of our own, but one of our favorites.

Is the NSA sitting on medical device vulnerabilities?

Today is not a fun day to read Slashdot if you care about healthcare cybersecurity. First, it highlights how the DEA is strong-arming states into divulging the contents of their prescription databases.

Second, and even more troubling, was the claim that the NSA is looking to exploit medical devices. The story was broken by Intercept reporter Jenna McLaughlin. Since then, the article has been picked up by the Verge, whose title is even more extreme: “The NSA wants to monitor pacemakers and other medical devices”. Jenna did not specifically mention where she heard the comments, but her Twitter feed gave me a hint.

The comments were from NSA deputy director Richard Ledgett, the same guy who countered Snowden’s TED talk with one of his own. He was speaking at the Defense One Tech Summit. It is incredibly hard to find, but his comments are available as a video; he goes on at almost exactly the three-hour mark. I tried to embed the talk below, YMMV.

In one sense this has been blown out of proportion. Patrick Tucker is the moderator/interviewer here, and he is the one pressing Ledgett on the issue of biomedical devices. Start at 3:15 for the discussion on medical devices.

Ledgett insists that targeting medical devices is not a priority for the NSA. But the troubling thing is his answer to the first question:

Question: “What is your estimation of their security?”

Answer: “Varies a lot depending on the device and the manufacturer.”

The problem with this is that I know of no examples of the NSA releasing data on insecure medical devices. In fact, the FDA has recently released information about specific medical devices that were insecure, without giving credit to the NSA.

This means that the NSA is investigating the security of medical devices, but not releasing that information to the public. Ironically, it is a quantified-self device that is most illustrative here. Ledgett specifically highlights Fitbit, which I know had some pretty strange wireless behavior (that many regarded as insecure, in its early versions). So we know they have looked at one specific device, but there has been no release of information from the NSA on that device. At least I cannot find any.

If indeed the NSA is investigating medical devices, and is not releasing all of that information to the FDA, device manufacturers, and the public, then that is a huge problem.

I am still thinking about this, but it does not look good.

I suppose I should also mention that I ran across the interesting fact that Osama Bin Laden was using a CPAP machine.

Update: I have submitted a FOIA request for access to vulnerabilities about “healing devices” and it has been denied.

Clinton’s Server Politifact

Most of the time that I spend as a security-wonk is focused on email security. This is due almost entirely to my involvement as one of the architects of the Direct Project, which is a specification for using secure encrypted email in healthcare settings.

Which is why I was surprised by a recent analysis from Politifact evaluating something that Hillary Clinton said about her email servers. I should mention that I am apolitical. I care, but both US parties fail to resonate. So I have no reason to pick one side of this debate over the other. I am interested in the implications and perceptions of Hillary’s email system, however, because it is very revealing of basic attitudes about email systems.

For those that do not know, Politifact is an organization that evaluates the veracity of specific statements that politicians make. Given my attitude about politics, you can understand why I am a fan of such a service. The statement from Hillary Clinton that Politifact was evaluating was that “my predecessors did the same thing” regarding her email practices.

Politifact said:

And there’s a big difference between a private account, which is generally free and simple to start, and a private server, which requires a more elaborate setup…. The unorthodox approach has opened up questions about her system’s level of security.

later concluding:

This is a misleading claim chiefly because only one prior secretary of state regularly used email, Colin Powell. Powell did use a personal email address for government business, however he did not use a private server kept at his home, as Clinton did.

We rate this claim Mostly False.

The central assumption that Politifact is making is that Clinton’s email server was fundamentally less secure than using a service. Specifically, Colin Powell used AOL. In fact, for the average person, you probably are better off using a service like AOL. But Hillary Clinton and Colin Powell are hardly average people. There are considerable advantages to having your own email server and your own domain if you are particularly concerned with security.

First, all of the email services are constantly the targets of hackers. If someone broke into AOL, they could find Colin Powell’s account as a side-effect of the overall hack. It would be a bonus for hacking a system that is already regarded as a high-value target. Second, it is still relatively easy to spoof email, which means it is fairly simple for someone to send emails pretending to be a particular person on a public email service. So if I had wanted to pretend to be Colin Powell, it would have been a little easier to get away with it, given that he was using an email service. It would be much easier to set up specific defenses (there is not that much you can do without encryption of some kind) to combat spoofing on your own server.
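To make the spoofing point concrete: the standard defense on your own domain is SPF, where DNS announces which servers may send mail for that domain, and receivers check the sending IP against it. Below is a deliberately toy sketch of the receiving side of that check; the record and the IP addresses are made-up examples, and real SPF (RFC 7208) has many more mechanisms than `ip4`.

```python
import ipaddress

def spf_allows(spf_record, sender_ip):
    """Toy SPF check: is sender_ip covered by the record's ip4 mechanisms?

    Real SPF (RFC 7208) also handles a, mx, include, redirect, and more;
    this sketch only understands ip4:<cidr>, failing everything else.
    """
    ip = ipaddress.ip_address(sender_ip)
    for term in spf_record.split():
        if term.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return True
    return False  # fell through to the -all (reject) at the end

# Hypothetical record a private mail domain might publish in DNS
record = "v=spf1 ip4:203.0.113.0/24 -all"
print(spf_allows(record, "203.0.113.45"))  # True: sent from an allowed server
print(spf_allows(record, "198.51.100.7"))  # False: a spoofer's server is rejected
```

The point of owning the domain is that you get to publish (and tighten) that record yourself, rather than inheriting whatever policy AOL sets.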

Unless Colin Powell had some special relationship with AOL (which is actually a real possibility), login attempts to his account from eastern Europe would not have been flagged in any way. On a private server, however, you could always say “Is Secretary Clinton in eastern Europe today? No. Then that login attempt is a problem.” Of course, if you are not watching the logs on your private server, then this advantage is negated.
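That “is she actually in eastern Europe today?” check is nothing more than an allow-list over the login log. A minimal sketch, with hypothetical usernames, country codes, and log entries:

```python
# Hypothetical login log entries: (username, source country code)
login_log = [
    ("secretary", "US"),
    ("secretary", "US"),
    ("secretary", "RO"),  # login from Romania while the owner is in the US
]

# Countries where each account owner is known to be today
expected_locations = {"secretary": {"US"}}

def flag_suspicious(log, expected):
    """Return log entries whose source country is not where the user should be."""
    return [(user, country) for user, country in log
            if country not in expected.get(user, set())]

print(flag_suspicious(login_log, expected_locations))  # [('secretary', 'RO')]
```

On a one-person server this kind of rule is trivial to enforce; on a service with millions of users, travel makes it far noisier.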

As it turns out, Clinton was also publicly serving up Windows Remote Desktop on her server, which makes it unlikely that she was taking the steps needed to get the security benefits. Even with that information, however, I cannot see the merit in the assumption that using AOL vs hosting your own Exchange server is fundamentally less or more secure for a public official like this.
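For what it is worth, checking whether a host exposes RDP to the Internet takes a single TCP connection attempt to port 3389. A minimal sketch; the hostname is a placeholder, and you should only probe machines you own:

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Attempt a TCP connection; True means something is listening there."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or the name did not resolve
        return False

# 3389 is the default RDP port; "mail.example.com" is a placeholder host
if port_is_open("mail.example.com", 3389):
    print("RDP appears to be exposed to the Internet")
```

This is exactly the kind of check that Internet-wide scanners run constantly, which is why an exposed RDP port does not stay unnoticed for long.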

Ultimately, when you trust an organization like AOL you are effectively trusting thousands of people all at once. Clinton probably trusted somewhere between 10 and 100 people with the contents of her email server. Colin Powell probably trusted somewhere between 1,000 and 10,000. If I were making suggestions for the security of my grandmother’s email… I would go with AOL. If I were making suggestions for the Secretary of State? It is much less clear, and would depend a lot on how the two different email systems were configured and regularly used.

As is almost always the case, the Wikipedia article on the subject is a tremendous source of the kinds of detail that a security researcher like me would need to evaluate whether there were security advantages or disadvantages to hosting your own server. But still it is obvious that Colin Powell trusted state secrets to a massive Internet provider and Hillary Clinton trusted state secrets to a small team of generalist (i.e. not security) consultants. Neither of those decisions was well-informed by proper security thinking for securing emails that might eventually become state secrets.

So from my perspective as a security researcher with a focus on email security, it is a pretty fair statement for Hillary to say, in effect, “My recent bad decision about email is equivalent to previous bad decisions made by members of the other party.”

Which means I think Politifact got it wrong. What is more interesting is why. They got it wrong because they made some flawed assumptions about email security. This is deeply ironic, because that is precisely the same mistake that both Clinton and Powell made about exactly the same issue.

But I also think this is a problem with the way that technical options are presented. Politifact quotes Clifford Neuman as saying “you would need to stay current on patches”. I can promise you that this is not all Clifford had to say on the matter, but it is the only thing that Politifact chose to surface. The reality of the technical issues is a huge debate about whether Software as a Service is more secure than locally deployed and supported software. In reality, locally deployed software clearly can be made more secure, because one can choose to enforce parameters that improve security at the expense of convenience (like two-factor authentication, for instance). However, Software as a Service is usually more secure in practice because you have teams of people ensuring that bare minimums are always met.
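Two-factor authentication is a good example of how little magic is involved in these “parameters”: the TOTP codes a phone app generates are defined by RFC 6238 and fit in about a dozen lines over HMAC-SHA1. A self-contained sketch, using the shared secret from the RFC’s published test vectors:

```python
import base64
import hashlib
import hmac
import struct

def totp(secret_b32, t, digits=6, step=30):
    """Time-based one-time password (RFC 6238) over HMAC-SHA1."""
    key = base64.b32decode(secret_b32)
    counter = int(t) // step  # which 30-second window we are in
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation per RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Base32 of the ASCII secret "12345678901234567890" from the RFC test vectors
secret = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(secret, t=59))  # prints 287082, matching the RFC 6238 test vector
```

A server and a phone computing the same code from the same shared secret and clock is the whole protocol; in practice you would pass `time.time()` for `t`. Enforcing it on your own server is a configuration decision; on a 2008-era service like AOL it was simply not on offer.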

I really could not care less about Clinton’s or Powell’s choices when it comes to email servers. It is a little silly to accuse people who get to decide what is classified and what is not of mishandling classified information. Personally, I think the fact that Clinton was exposing an RDP connection to the public Internet is the only thing I have heard that is truly scandalous here, and it is clearly not the focus of the media circus. I do not care at all about the political side of this.

I am very concerned, however, about how novices think about complex security and privacy issues. How did Politifact, which is charged with getting to the bottom of this issue, discuss precisely none of these complex technical issues? The conclusion they reached is pretty shallow. Which I do not think is their failing… I think it is a symptom of dogmatic thinking in InfoSec messaging.

Still not done thinking about this.