Why I Chose Cloudera

Evaluating opportunities across the landscape of healthcare data, analytics, and AI companies, I began sorting organizations into two distinct buckets: companies producing insights from above the waterline (the top of the iceberg) and companies capable of producing insights from below the waterline (the bottom of the iceberg). The significance of the data iceberg analogy is that everything above the waterline consists of what we already know and what we think we need to know. Everything below the waterline consists of the unknown. Unfortunately, in the healthcare industry, the bottom of the iceberg is believed to be far greater than what sits above the waterline. This divide has been exacerbated by the healthcare industry’s persistent failure to adopt technological advancements; it lags behind nearly 80% of other industries in innovative technology adoption.

Examining opportunities through the lens of the data iceberg analogy made it clear that the path to a meaningful, lasting impact is deriving insights from both above and below the waterline. This is especially difficult in a healthcare industry that is still struggling to adopt basic interoperability. I therefore defined criteria for evaluating opportunities around three key areas: how well the company enables its clients to innovate, how well it prepares its clients for the future, and company culture. Viewed through this lens, Cloudera stood out. Here’s why:

Openness Leads to Innovation

Vendor lock-in, closed systems, data silos, and interoperability issues have plagued healthcare for decades. The collaborative nature of open source software stands in direct opposition to the closed standards that have held healthcare back, and it signals a commitment to continual technology growth and advancement. Cloudera’s data, analytics, and AI platform is capable of producing insights into the unknown by ingesting and integrating data from edge sources as well as legacy systems. It is deployable across a variety of options – on-premises, public, private, hybrid, and multi-cloud – enabling companies to embrace the future while still incorporating legacy data.

Layered throughout all of these criteria is the inherent need for governance and security. Open source software allows collaboration from all sources, which is why many developers have described it as the Wild West. Health records are highly sensitive information that should not be subject to a free-for-all environment. Cloudera understands this and builds strong security and governance tools into the core foundation of its offering.

Preparing for the Future

At a SXSW session in March titled “EQ in the ER: Building Empathy Through XR,” Rasu Shrestha, MD, MBA, Chief Strategy Officer and EVP at Atrium Health, and Stephen Ibaraki, Chairman & General Partner of the Redds Capital VC Fund, discussed technology in healthcare today and the role they see it playing in the near future. These innovators and leaders did an excellent job of highlighting that, despite the historical failure to adopt technology, we are on the precipice of major technological changes in the healthcare industry. Stephen Ibaraki conveyed the findings of the WEF Future of Jobs Report, which forecasts dramatic increases in adoption rates for big data, biotech, machine learning, wearable technology, blockchain, IoT, AR/VR, 3D printing, stationary robots, and non-humanoid robots by 2022. Even if only a small portion of that forecasted adoption comes to fruition, it would represent a significant and much-needed acceleration in technology adoption for the healthcare industry.

However, if organizations move too quickly to adopt these edge-based technologies, they run the risk of creating new data silos. For healthcare organizations to maximize the benefits of data from edge technologies, they need a strong foundation for data management, protection, and governance. Cloudera provides that foundation, adapts with technological advancements, and gives organizations versatility without a loss of governance or security. Cloudera’s platform is software-based rather than hardware-based, providing the versatility to be compatible with any deployment option. It has the flexibility, agility, and elasticity needed to empower people to get clear, actionable insights from complex data anywhere, from the Edge to AI.

Utilizing a new, modern approach to enterprise data management such as Cloudera’s Enterprise Data Cloud eliminates siloed analytics, allowing healthcare organizations to create a coherent data strategy. Only a comprehensive, coherent approach to data and analytics provides the foundation needed to enable insights from below the waterline.

With the Cloudera/Hortonworks merger completed, the new Cloudera is uniquely positioned to help healthcare organizations speed innovation and technology adoption. The merger not only eliminated each company’s primary competitor but also added strength and breadth to the combined product offering.

A Culture for, of and by the People

I gravitate toward company cultures that prioritize the needs of the customer, drive innovation, demand excellence, promote a team-like atmosphere, and invest in the development of their employees. I want a company that values what I bring to the table but also feels strongly about its ability to help me continue to grow and develop. I am not a finished product. I take pride in my ability to learn new skill sets and to continue developing and refining existing ones. It’s important that a company understand and embrace that mentality.

From the very first conversation with the Cloudera team through the last, it was clear that our expectations around culture fit aligned. It was evident that Cloudera prioritizes hiring exceptionally talented individuals, values their contributions, and empowers their personal and career growth. The company’s embrace of open source, open standards, and open markets further highlights its belief in innovation and collaboration.

It’s an honor to join such a talented team and innovative company. I believe strongly that Cloudera’s Edge to AI capability will enable healthcare organizations to accelerate innovation by deriving more meaningful, actionable insights that shed light on what’s below the waterline.

Hype or Reality: AI Uses in Healthcare?

It’s hard to miss the advertisements for virtual assistants from Amazon, Google, and Apple (all powered by AI), or the TV commercials from Microsoft and IBM promoting their AI capabilities. The hype about AI is real.

Whenever technology is hyped up, I’m always a little skeptical about whether it’s producing the outcomes to justify the hype. Therefore, let’s explore whether or not AI is producing real, tangible differences in healthcare.

What AI is not

Let’s start by clarifying a few common misconceptions about the technology. First, AI is not machines becoming self-aware. It is not Skynet from Terminator or V.I.K.I. from I, Robot. It is not a parkour robot that will take over the world, although that robot may one day save lives on the battlefield or during natural disasters. In other words, do not confuse real-life AI with entertainment’s version of AI. The AI being used in production today is not that advanced. I recently wrote an article comparing my newborn to some of today’s most advanced AI algorithms, and even toddlers are more advanced than today’s AI.

The days of AI being able to hop on the internet and learn something new, as Baymax does in the animated movie Big Hero 6, are far down the road. Between now and then there are a great many hurdles to overcome. In the movie, Baymax uses information from the internet to teach itself about emotional pain. While the lack of data preparation would be a hurdle today, the bigger issue is the lack of data validation. The validity of data is one of the key reasons why AI is not widely trusted in the medical community. In addition to data validity, clinicians question the lack of model explainability and the potential for bias. All of these are very valid concerns that must be addressed if AI is going to be trusted with our most valuable commodity: human life.

What AI is

AI is the tip of the spear of the analytic journey, and its role is to predict, automate, and optimize. However, there are a number of data preparation steps that must be completed before AI is applied. Data preparation consumes the vast majority of the time, resources, and cost associated with an AI project.
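To make that point concrete, here is a minimal, hypothetical sketch of the kind of preparation work that precedes any model training. The file name, column names, and cleaning rules are illustrative assumptions, not drawn from a real dataset:

```python
import pandas as pd

# Hypothetical raw extract of patient encounters (file and column names are illustrative).
raw = pd.read_csv("encounters.csv")

# Typical preparation steps that happen long before any model is trained.
clean = raw.drop_duplicates(subset="encounter_id")             # remove duplicate records
clean = clean.dropna(subset=["patient_id", "encounter_date"])  # drop rows missing key fields
clean["encounter_date"] = pd.to_datetime(clean["encounter_date"], errors="coerce")
clean = clean[clean["encounter_date"].notna()]                 # discard unparseable dates

# Normalize inconsistent categorical codes into a single vocabulary.
clean["department"] = clean["department"].str.strip().str.lower()
```

Even this toy example shows why preparation dominates the effort: every step encodes a decision about what "valid" data means before any learning happens.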

AI is considered “intelligent” because it has the ability to learn, but that doesn’t mean it possesses intelligence on par with humans. Far from it. AI requires vast amounts of data to learn very simple and specific tasks. A more apt comparison is to a smart, well-trained canine. We put canines’ strengths, like their superior sense of smell, to work in service of humans. The animals go through specific, repetitious training that results in a targeted, defined outcome. Similarly, machines have a superior ability to process, calculate, and analyze data compared to humans. Therefore, we use specific, repetitious training to achieve targeted, defined outcomes such as finding anomalies in vast amounts of data, matching keywords, and identifying patterns.

The key similarity between canines and today’s AI is that the training results in a very specific outcome. To illustrate this point, think of all the different types of scents canines are trained to detect – low blood sugar, explosives, drugs, missing people, and so on. Even though the foundational function of detecting a smell remains the same, a given canine is only trained to detect a single scent. Similarly, today’s AI is most effective when targeted at accomplishing a single task. Humans, on the other hand, are able to build on the knowledge they’ve learned and apply it to new situations, effectively teaching themselves.

Where AI fits in Healthcare today

Organizations invest in technology for one or more of three primary reasons: 1) growing revenue, 2) reducing costs, and 3) increasing customer satisfaction. AI’s versatility allows it to hit one, two, or all three of these drivers in a single solution, which is why it’s such a hot topic.

Every major healthcare provider and insurance organization in the United States is already using AI or exploring ways to use it. Some of those uses are behind the scenes in research and development and may never reach production. For this article, I’m going to focus on the areas where AI is delivering quantifiable results.

Consumer Experience

The key to a good customer service experience is the speed and accuracy of information. When the consumer has to wait on hold, repeat themselves multiple times to a virtual agent, and re-validate their account with a live person after already validating with the virtual agent, customer satisfaction suffers. Chatbots and virtual assistants like Amazon’s Alexa, Google Home, and Apple’s Siri have raised our expectations for virtual agents. Their natural language processing capabilities make interacting with them easy and intuitive. As a result, upgrading virtual agents to meet these new consumer expectations is one of the top priorities of the healthcare organizations I work with. According to the latest third-party analyst ratings, Watson sits head and shoulders above the rest. IBM’s commitment to hybrid cloud infrastructure allows organizations to utilize the power of Watson without compromising compliance or security. Additionally, IBM has announced that Watson is now available on third-party clouds such as Microsoft Azure, AWS, and Google Cloud.

Healthcare organizations are also beginning to prioritize the accuracy and relevance of the information available to consumers through search engines and third-party websites. Information like each location’s hours of operation, the days physicians are available at a particular location, whether a physician is accepting new patients, and which insurance carriers they accept should be readily available and always up to date. When this information is incorrect, it not only affects consumers but can also affect billing. Updating this information manually can be time-consuming and inefficient, as some search engines may not reflect the changes for days, weeks, or even months. The company leading the way in forging the Digital Knowledge Management space is Yext. Yext’s Healthcare Knowledge Engine is a digital brand management platform that allows healthcare organizations to better manage the digital portion of the consumer-to-patient journey. The company recently announced Yext Brain, which it describes as “your AI strategy’s central nervous system.” Yext is a company on the rise, and I expect big things from them as they continue to carve out their market.

As a side note, the rise of the consumer in healthcare is one of the most interesting and relevant developments in today’s healthcare landscape. It signifies a shift in the power dynamic between the care provider and the care receiver. As value-based care initiatives and the role individuals play in their own care continue to develop and evolve, so too will this dynamic.

Analytics & Business Intelligence

With AI as the tip of the spear of the analytic journey, it’s only natural that the analytics and BI space has been an early adopter of AI technology. AI’s ability to predict, automate, and optimize while continuously learning and evolving is a natural fit for tasks like flagging billing and coding errors, clinical decision support, financial and supply chain management, facility management, and NLP for clinical note-taking. Those are just a few examples of where AI fits in this rapidly evolving and growing space.

There are very innovative companies delivering real outcomes in this space. One that has caught my attention is Health Catalyst, which recently ascended to unicorn status (a company valuation of $1 billion). Even more impressive are the outcomes the company has delivered for its clients along the way. Its website boasts an impressive 178 success stories, and Health Catalyst CEO Dan Burton told Forbes that last year alone the company recorded over 250 projects that resulted in “measurable clinical, financial or operational improvements.”

Care Management

Care management is one of the most exciting uses for AI technology because of the ability to produce clear, measurable outcomes that positively affect quality of life. AI is capable of deriving a deeper level of insight than traditional analytics. Utilizing AI to augment human intelligence in this way can produce a continuous loop of care development, oversight, and growth.

A few examples of how AI is being used in care management include actively monitoring patients with chronic diseases and alerting care providers to pertinent changes in condition in real time, alerting pharmacists to medication conflicts, matching patients with the treatments that prove most effective, proactively identifying sepsis, preventing 30-day hospital readmissions, and monitoring mother and fetus during pregnancy. AI use in this space is growing rapidly because the benefits and ROI are often easily quantifiable.

A company excelling in this space through the use of AI is HealthEC. Recently ranked first on the 2019 “Best in KLAS” report, HealthEC excels at collecting ever-increasing amounts of data from disparate sources, analyzing it, and helping organizations across the care continuum improve their operations. The company boasts the ability to access 100 percent of available electronic health data, structured and unstructured. Its use of AI allows it to deliver enhanced insights across the entire healthcare landscape and to connect payers, providers, patients, labs, and hospitals through a single platform.

Diagnostics & Imaging

One area where AI technology has matured significantly is image recognition. This capability is a natural fit for dermatology and radiology. A recent study showed that AI can diagnose skin cancer from a picture more accurately than a dermatologist. This does not mean we don’t need dermatologists, but rather that AI can be used as a front-line diagnostic tool to help catch skin cancer earlier, given the convenience of snapping and uploading a picture from a smartphone. That convenience factor has a real chance to make a significant impact on early detection rates for skin cancer. When skin cancer is found and removed early, it is almost always curable.

In radiology and imaging, AI use is growing rapidly. While there is great optimism among radiologists that AI will be able to provide more substantive value at scale, there are hurdles to clear before that can happen. Marrying the technology to current workflows and building trust in the data used to train AI algorithms are two examples, but it’s clear that AI use in radiology is only going to increase.

Final Verdict

Even though AI is currently being overhyped by the marketing teams of many large technology companies, AI use in healthcare is real. AI is producing legitimate business outcomes with a quantifiable ROI, which is table stakes for any new technology to gain a foothold in an industry. Furthermore, the current push for value-based care makes it necessary for healthcare organizations to derive deeper and more meaningful insights from the mounds of data available to them. As companies like IBM continue to develop and refine software tools designed to improve the visibility, control, and explainability of AI models, confidence in AI and its usage will only increase.

Lessons in Machine Learning from My Newborn Baby

Our little man at less than 72 hours old.

In the months leading up to my son’s birth earlier this year, I was spending time increasing my knowledge and understanding of Machine Learning (ML). Then the baby arrived earlier than expected, and all the extra learning activities were put on hold, replaced with sleepless nights, diaper changes, and all things baby. While watching my newborn son acclimate to this new world, it struck me that I was watching machine learning in real life.

Like the rest of us, my son was born with only a basic set of instincts that I refer to as his base-level operating system. We didn’t have to teach him how to breathe, cry, sleep, pee, or poop. Those functions, along with the ability to learn, came hardcoded. Everything else he’s learning in real time. As I watched this happen, I began to recognize patterns in his learning that resembled ML techniques in the form of Supervised, Unsupervised, and Reinforcement Learning. Allow me to explain…

Supervised Learning (SL)

Newborn babies have no idea where or how to get food. Their stomach signals their brain that it’s time for a fill-up, and the brain has no idea how to resolve the situation. So the baby hits the panic button and starts to scream and cry. That’s when the baby’s mom or a caretaker guides the baby through the feeding process. This is a structured process with a targeted outcome. By the end of the feeding, the baby has learned how to react to the feeling of hunger in order to get food and how to recognize and consume food from the correct source. The baby will use this lesson to differentiate its food source from a similarly shaped object such as a pacifier/binky or a finger. Trust me, I tried several times to trick my son with a binky to buy time until my wife was ready to feed him, and he was not having it (probably an early sign that he’s a genius, just my unbiased opinion).

The way babies learn to feed is a real-life example of how we use supervised machine learning. To follow the example, we are the mother/caretaker, the algorithm is the baby, and the food, the binky, and anything else the baby puts in its mouth (which is everything it can) is the data. The goal is to achieve a specific, targeted outcome, so we provide the algorithm with the right food source to teach it how to identify the data needed to achieve that outcome. After the guided sessions, the algorithm will have learned how to differentiate its food source from a binky, finger, toe, and so on.

The best use cases for supervised machine learning are ones where there is a specific, targeted outcome or value you would like to predict from your data. Supervised ML is the most commonly used ML training style, with a wide variety of use cases, including sentiment analysis, predicting customer churn, predicting employee performance, and internet/email fraud detection.
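For readers who want to see the idea in code, here is a minimal supervised learning sketch using scikit-learn. The "food vs. decoy" framing and the toy feature values are my own illustrative assumptions, not a real dataset; the point is simply that a caretaker-style set of labeled examples is what drives the learning:

```python
from sklearn.linear_model import LogisticRegression

# Toy labeled examples: each item is [softness, has_milk] and the label says
# whether it is a real food source (1) or a decoy like a binky or finger (0).
X = [
    [0.90, 1],  # bottle / food source
    [0.80, 1],  # food source
    [0.70, 0],  # binky: soft, but no milk
    [0.60, 0],  # finger
    [0.95, 1],  # food source
    [0.50, 0],  # toe
]
y = [1, 1, 0, 0, 1, 0]

# The "caretaker" supplies the labels; the model learns the mapping from features to outcome.
model = LogisticRegression()
model.fit(X, y)

# Given a new object, the trained model predicts whether it is food.
print(model.predict([[0.75, 0]]))  # likely 0: soft, but no milk, so it's a decoy
```

The essential ingredient is the label column y: supervised learning only works when someone has already told the algorithm what the right answer looks like.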

Unsupervised Learning (UL)

When my son was born, there were a number of people in the room other than my wife and me – doctors, nurses, a midwife, and, because he was born a few weeks early, a specialized team from the preemie NICU that, thankfully, was not needed. To my son this was all new data. His mind had to begin classifying shapes, sounds, colors, smells, things that moved, things that didn’t, and so on. Over time his mind began to differentiate and classify all this new data, and he began recognizing familiar shapes – namely my wife and me. When other people held him, the expression on his face remained unchanged even though they were smiling at him. When they handed him back to my wife or me, he would look at our faces and smile. We did not teach him to smile or to react a specific way when he sees us. His brain observed a pattern of familiarity, classified our faces as favorable, and determined that smiling is the appropriate reaction when he sees us. He is not able to put labels on my wife and me such as “Parents” or “Mom and Dad,” but he has learned to distinguish us from other people and chooses to acknowledge us by smiling.

An unsupervised algorithm (the “baby”) learns through observation – recognizing patterns, categorizing, and differentiating data to determine the best response or action without guided input. However, these algorithms cannot determine the correct label for the groups they create – i.e., Mom, Dad, Parents, etc.

Unsupervised machine learning is meant to uncover previously unknown patterns in data, but since you don’t know what the outcomes should be, there is no direct way to determine accuracy. That makes real-world applications of this approach difficult, but there are still use cases being pursued, such as visual recognition and robotics.
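As a hedged illustration, here is what the "group the faces without labels" idea looks like in code, using k-means clustering on made-up feature vectors. The features (how often a person is seen, how familiar their voice is) and the cluster count are assumptions invented for the sketch:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up feature vectors for "people the baby sees": [how often seen, voice familiarity].
observations = np.array([
    [0.95, 0.90],  # mom
    [0.90, 0.85],  # dad
    [0.10, 0.20],  # visiting nurse
    [0.15, 0.10],  # doctor
    [0.92, 0.88],  # mom again
    [0.12, 0.15],  # another visitor
])

# No labels are provided; the algorithm groups similar observations on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(observations)

# Two groups emerge, but the algorithm cannot name them "parents" vs. "others".
print(kmeans.labels_)
```

Notice that the output is just cluster numbers; attaching a meaningful label like "Mom and Dad" to a cluster still requires a human.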

Reinforcement Learning (RL)

Our oldest child, version 1.0 if you will, was an easy baby. He started sleeping through the night in his crib at 7 weeks old. The latest iteration, version 2.0, has been much more challenging. It took a lot of trial and error to figure out what worked and what didn’t. When he cried, we would go through all the usual culprits – feeding him, changing his diaper, holding him, rocking him, putting him in his car seat, driving him around, etc. – and each time we got it wrong, he would cry harder to let us know. When we finally found the correct option, he stopped crying and went back to being a satisfied, happy baby. Over time, we learned from past interactions and became more effective at predicting the correct choice the first time.

In this example, our baby is training us through reinforcement learning. My wife and I are the algorithm. We make a predictive decision about which action will satisfy our crying baby based on our analysis of the information at hand, and he responds with positive or negative reinforcement. It’s a trial-and-error process in which we learn from our errors to achieve the desired outcome.

Examples of how Reinforcement Learning is being used include tutoring systems and personalized learning, learning treatment policies in the medical sciences, and Salesforce’s use of deep RL for abstractive text summarization (a technique for automatically generating summaries based on content “abstracted” from an original text document). You can see RL in action in this short video from Google DeepMind, which created a reinforcement learning program that plays old Atari video games.
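To ground the analogy, here is a minimal tabular Q-learning sketch. The "crying baby" states, actions, and rewards are invented purely for illustration and bear no relation to a production RL system; the sketch just shows the trial-and-error loop of acting, getting a reward, and updating an estimate:

```python
import random

# Invented toy problem: the "agent" (us, the parents) picks an action when the baby cries
# and receives a reward of +1 if the baby calms down, -1 if the crying gets worse.
actions = ["feed", "change_diaper", "rock", "car_ride"]
correct = {"hungry": "feed", "wet": "change_diaper", "tired": "rock", "bored": "car_ride"}
states = list(correct.keys())

# Q-table: expected value of each action in each state, learned from trial and error.
Q = {s: {a: 0.0 for a in actions} for s in states}
alpha, epsilon = 0.5, 0.2  # learning rate and exploration rate

for episode in range(500):
    state = random.choice(states)
    # Explore occasionally; otherwise exploit what we've learned so far.
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(Q[state], key=Q[state].get)
    reward = 1.0 if action == correct[state] else -1.0
    # Move the estimate toward the observed reward (single-step episodes, so no bootstrap term).
    Q[state][action] += alpha * (reward - Q[state][action])

# After enough trial and error, the learned policy picks the right response on the first try.
print({s: max(Q[s], key=Q[s].get) for s in states})
```

The key idea is that no one hands the agent labeled answers; it discovers the best action for each state purely from the rewards its choices produce.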

Conclusion

Our son is a genius who already understands machine learning techniques and is intentionally using them to boost his cognitive development. All kidding aside, I am not surprised to find commonalities between the way my baby is learning and the techniques used to train machines. It’s logical that machines would be taught to learn in ways similar to how people learn. There’s no reason to reinvent that wheel. And even though ML has been around for over 50 years, it is still very much in its infancy. ML hit a growth spurt over the last ten years or so that has seen its adoption rate skyrocket, but we are still just scratching the surface of what can be done with this technology.

As with all new technology, there are concerns that need to be addressed. The biggest concern for ML is whether or not we can trust the models to be free from bias. Companies like IBM are working to provide better insight into and explainability of ML models, which will help build trust. As a parent, I also relate to bias concerns. I want my kids to make their own choices, but it is also my responsibility to teach them essential lessons about how to behave, interact with others, and so on, and those lessons can overlap with their ability to make their own choices. As my kids grow and their cognitive abilities expand, they will begin processing the lessons my wife and I teach them in ways we will not be able to predict. They will draw unintended conclusions from our teachings that will require us to correct their understanding of the salient points or reteach the lessons altogether. This mirrors the challenges facing data scientists as ML progresses and becomes more widely used. A recent example of this issue is Amazon’s recruiting tool that began showing bias against women and, as a result, had to be abandoned by the company. Despite this concern, I am confident that men and women much smarter than I am will resolve bias through model explainability and transparency. Parenting bias, on the other hand, will likely continue in perpetuity.