Sun, 26 February 2017
If you're going to apply machine learning (ML) in a business context, you need a lot of data, and algorithms across the board perform better with more recent, rich, and relevant data. Today, there are companies whose entire business models are predicated on helping others make sense of and use of this type of information. In this episode, we speak with the CTO and Co-Founder of one such company—Palo Alto-based Cloudera. CTO Amr Awadallah, PhD, speaks with us this week about where he sees "data lakes" (or "data hubs", Cloudera's preferred term) and warehouses play an important role in ML applications in business. Based on his experiences helping a variety of companies in many countries set up data lakes, Amwadallah is able to distill and communicate these uses in three broad categories that apply across industries as companies look to solve tougher problems and ask more complex questions using unstructured data.