The Future of Data Science: Q&A with MIT Professional Education’s Devavrat Shah

In this guest blog post, Devavrat Shah, co-director of MIT Professional Education’s new online course, Data Science: Data to Insights, discusses the evolution of new technologies and how data science professionals can ensure their skill set remains comprehensive and able to meet challenges as they arise in the digital marketplace.

We are in the midst of a digital revolution. The amount of data being generated in the enterprise today is staggering, and organizations are struggling to keep up with the influx. And while the promise of big data remains, much still stands in the way of converting massive amounts of information into insights that can be used to drive business decisions.

Data scientists and data analysts are poised to deliver on this promise and lead the digital transformation by enabling organizations to capitalize on big data, and use it to create opportunities and innovations. But first they will be required to boost their skills, understand modern tools, and learn the various methods and techniques now available. Professor Devavrat Shah shares his thoughts on how IT professionals can get ahead and compete in our new digital world.

Digital transformation is driving massive change. What is the biggest challenge on the horizon for data science professionals?

The biggest challenge data science professionals face is the inability to transform into efficient “Data Science Machines.” A recent survey from IT research firm Gartner showed that 59% of IT professionals thought their organizations were not prepared for the changes needed to bring about a digital business approach. One of the most pressing problems they cited was the shortage of technical skills.

While it is easy to get caught up in the hype surrounding big data and analytics, the reality is that turning avalanches of data into meaningful business insights creates challenges that require data science skills to evolve. Otherwise, big data will simply become too big, too fast or too hard to process, analyze and convert into insights.

Professionals must prepare by moving beyond the details of infrastructure implementation and beginning to focus on how to turn data into decisions. We have built the infrastructure that can store and process massive amounts of data, but we still lack the critical ability to seamlessly stitch the various pieces of data together to make accurate predictions that lead to high-impact decisions. This is one of the defining challenges of our time – but data science professionals who tackle it effectively will no doubt experience tremendous success in their careers.

How can data science professionals ensure they have the right skills not just to succeed but excel in our new big data world?

In my view, success for data science professionals depends on becoming well-trained data scientists who can perform data processing and computation at a massive scale. To achieve this, professionals must invest time in ongoing education through institutions with multidisciplinary programs that include elements from engineering, mathematical sciences, and social sciences. Converting big data into meaningful information begins with skilled professionals who are educated across these disciplines to be both data scientists and statisticians.

How do professionals determine where to prioritize their focus given the various new technologies and advances?

An effective approach is to study the state of the practice. Learn what is happening at top companies such as Amazon, Google and Netflix. How are these modern consumer-facing companies able to process data at scale and extract the meaningful information that has led to their massive success?

Look at domains outside your industry or area of expertise. Are there any trends in the strategies and technologies that others have adopted? Computer languages such as Python have been used successfully in scientific computing and highly quantitative domains such as physics for more than a decade. Python was used to improve the Space Shuttle mission design and has powered much of Google’s internal infrastructure. How can business analysts and data scientists in companies of all sizes benefit from Python when it comes to big data and analytics?

The key isn’t knowing any one technology, model or practice per se. Professionals should be well versed in a variety of tools, perspectives and approaches so they can identify which methods and models are most appropriate for a particular use case.

What are some of the common pitfalls in big data analytics and how can professionals avoid them?

One of the most common mistakes organizations make is failing to capture the right data needed to make the right decisions. For example, a consumer-facing company that makes a drastic strategic change based on a large number of negative reviews may be misled: consumers are far more likely to leave feedback when they are unhappy than when they are satisfied, so it is important to understand the overall context.
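
As a rough illustration of the review example, the short Python sketch below shows how a small unhappy minority can dominate the reviews a company actually sees. Every number in it (customer count, share of unhappy customers, and the chance that each group leaves a review) is a made-up assumption for illustration, not data from any real company, and the function name is hypothetical.

```python
# Minimal sketch of review selection bias.
# All figures are hypothetical and chosen only to illustrate the effect.

def observed_negative_share(n_customers, unhappy_rate,
                            p_review_unhappy, p_review_happy):
    """Return the expected share of reviews that are negative."""
    unhappy = n_customers * unhappy_rate
    happy = n_customers - unhappy
    negative_reviews = unhappy * p_review_unhappy  # unhappy customers who post
    positive_reviews = happy * p_review_happy      # happy customers who post
    total = negative_reviews + positive_reviews
    if total == 0:
        raise ValueError("no reviews expected under these assumptions")
    return negative_reviews / total


true_share = 0.10  # assumed: 10% of customers are actually unhappy
review_share = observed_negative_share(
    n_customers=1000,
    unhappy_rate=true_share,
    p_review_unhappy=0.50,  # assumed: half of unhappy customers leave a review
    p_review_happy=0.05,    # assumed: 1 in 20 happy customers leaves a review
)

print(f"Unhappy customers: {true_share:.0%}")
print(f"Negative reviews:  {review_share:.0%}")
```

Under these assumed rates, roughly half of all reviews are negative even though only one customer in ten is unhappy, which is why the reviews alone are a poor basis for a drastic strategic change.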

Data science is all about having the data you need. But as the volume of information continues to skyrocket, the variety and velocity of data will grow as well. A fundamental challenge facing data science professionals is what data to collect and keep. And as more data is being collected, extracting value from that data is only going to become more complex. Data scientists and data analysts will need to rely on statistical and machine-learning approaches to extract information from data automatically. Machine learning will become critical in order to deliver insights to the right decision makers at the right time.

What does the future hold for data scientists?

Over the next five years, data scientists will develop the ability to use all sorts of data in real time. This will fuel the need for more intricate predictions and computations at scale, which in turn will spark new data science paradigms driven by the needs of future applications.

More and more data will be used to drive key business decisions, and will enable innovations like “Deep Learning” that allow for accurate predictions and decision making. Further, modern applications have brought to the fore new statistical paradigms such as recommendation systems, which are key enablers for many modern businesses, whether media portals, e-commerce portals, or social interaction platforms.

Regardless of how things evolve, one thing is clear: Skilled data scientists, statisticians, and business analysts will be the key to unlocking the endless possibilities of big data.

Devavrat Shah, co-director of the Data Science: Data to Insights course, is a professor in MIT’s Department of Electrical Engineering and Computer Science, director of the SDSC, and a core faculty member at the IDSS. He is also a member of MIT’s Laboratory for Information and Decision Systems (LIDS) and the Operations Research Center (ORC).