Data science is quickly becoming one of the most important fields of study in the business world. It is an interdisciplinary field that combines mathematics, computer science, and statistics. Moreover, data science helps businesses make better decisions, gain a competitive edge, and discover innovative solutions. This article explores what data science is, its applications, and its six components.
Ignoring The Hype, What Exactly Is Data Science?
Have you seen the terms “data science”, “big data”, “data analytics” or “data scientist” thrown around without being sure what they mean? Well, you’re not alone! The term “data science” is relatively new, having come into widespread use only this century. Unfortunately, some marketers have taken advantage of this new term to gain attention for their products, resulting in a wide variety of definitions for data-related terms. To cut through the hype, we need a concise definition of data science. The following definition is the best I have seen.
“Data science is the discipline of making data useful.” - Cassie Kozyrkov, Chief Decision Scientist, Google
I like this short definition because it focuses on the essence of what data science is. First, it highlights that data science is a science that uses the scientific method. Second, it is a discipline with its own domain knowledge, methodologies, and tool sets. Lastly, this definition explicitly states that this science is focused on data and the problem of how to make data useful. For more specifics on the fundamentals of data science, including its merits, history, and methods, see SC Tech Insights’ Data Science Definition – The Truth About This Discipline And Its Massive Growth.
“The next Darwin is more likely to be a data wonk than a naturalist wandering through an exotic landscape.” - David Weinberger, Author, Technologist, Speaker
What Is Data Science Good For?
Data science is making a huge impact across many industries. It is also a multidisciplinary field involving statistics, computer science, and artificial intelligence, to name a few. Because it is so broad, its specific benefits can be hard to see. Below are some specific applications of data science.
“We don’t use these technologies because they are huge, connected, and complex. We use them because they work.” - David Weinberger
- Healthcare. For example, data science helps identify and predict disease, and assists with personalized health care recommendations.
- eCommerce. As an example, data science is behind the automated “smart” ad placement and personalized product recommendations.
- Law Enforcement. Examples include data-driven crime predictions, facial recognition tools, and tax fraud enforcement.
- Transportation. Examples include optimizing shipping routes, modeling the most effective traffic patterns, and getting hot food delivered quickly.
For more examples, see Builtin’s 22 Data Science Applications and Examples.
What Is Data Science? – Its Fundamentals Described In 6 Parts
Data science, its domain knowledge, and its software tool sets are growing rapidly, as is the amount of data in general. Additionally, data scientists apply data science across multiple disciplines, organizations, and industries. David Donoho, a professor of statistics at Stanford, describes and classifies the various activities of data science in 50 Years of Data Science. Specifically, he describes GDS (Greater Data Science), the science of learning from data, as having six divisions, summarized as follows:
1. Exploring and Preparing Data.
A less scientific word for this is “data cleansing”. This division has two parts: exploration and preparation of the data. First, exploration is a sanity check of the data’s basic properties. Second, preparation addresses data anomalies, reformatting / re-coding, and regrouping of data.
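To make the two parts concrete, here is a minimal sketch using only Python's standard library. The shipment records, field names, and the choice to fill bad weights with the median are all assumptions made for illustration; they are not part of Donoho's description.

```python
from statistics import median

# Hypothetical shipment records; the field names are invented for this example.
records = [
    {"id": 1, "weight_kg": 12.5, "mode": "Truck"},
    {"id": 2, "weight_kg": None, "mode": "truck "},
    {"id": 3, "weight_kg": -4.0, "mode": "RAIL"},
    {"id": 4, "weight_kg": 30.0, "mode": "rail"},
]


def explore(rows):
    """Exploration: sanity-check basic properties of the data."""
    weights = [r["weight_kg"] for r in rows]
    return {
        "rows": len(rows),
        "missing_weight": sum(w is None for w in weights),
        "negative_weight": sum(w is not None and w < 0 for w in weights),
    }


def prepare(rows):
    """Preparation: handle anomalies and re-code the categorical field."""
    valid = [
        r["weight_kg"]
        for r in rows
        if r["weight_kg"] is not None and r["weight_kg"] >= 0
    ]
    fill = median(valid)  # assumed policy: replace bad weights with the median
    cleaned = []
    for r in rows:
        w = r["weight_kg"]
        if w is None or w < 0:
            w = fill
        # Re-code "Truck", "truck ", "RAIL" into consistent lowercase labels.
        cleaned.append({"id": r["id"], "weight_kg": w, "mode": r["mode"].strip().lower()})
    return cleaned


summary = explore(records)
cleaned = prepare(records)
```

Running `explore` again on `cleaned` is a quick way to confirm that the anomalies flagged in the first pass are gone.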
2. Representing and Transforming Data.
Because data comes in a wide range of formats and computing environments keep changing, data often needs to be transformed. This can include importing data into a database or handling it as a live stream. In other cases, the data may be acoustic, image, or sensor data that needs a mathematical representation.
“AI in the form of machine learning, and especially deep learning, is letting us benefit from data we used to exclude as too vast, messy, and trivial.” - David Weinberger
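As one small illustration of transforming data from one representation to another, the sketch below imports CSV text into an in-memory SQLite database using Python's standard library. The CSV content, the table name `readings`, and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export; the columns are illustrative only.
raw_csv = "sensor,reading\nA,1.5\nB,2.5\nA,3.0\n"


def load_into_db(csv_text):
    """Parse CSV text and import it into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, reading REAL)")
    rows = [
        (row["sensor"], float(row["reading"]))  # convert text to a numeric type
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


conn = load_into_db(raw_csv)
# Once the data is in a database, it can be queried with SQL.
totals = dict(conn.execute("SELECT sensor, SUM(reading) FROM readings GROUP BY sensor"))
```

The same idea applies to other sources: the goal is to move data from whatever form it arrives in to a form the rest of the project can work with.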
3. Computing with Data.
Data scientists face many challenges and opportunities from the ever-expanding range of computer science technologies, methodologies, and programming languages. They must also decide which computing tools to use for each data science project.
4. Modeling Data.
The data scientist must decide which data model to use, relying on an existing model or a combination of models to achieve the intended outcome. There are many data modeling resources based on traditional academic statistics, as well as more modern resources such as machine learning code repositories.
“Evolution has given us minds tuned for survival and only incidentally for truth.” - David Weinberger
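For a sense of what a simple, traditional statistical model looks like in practice, here is a minimal sketch of ordinary least squares for a straight line, written in plain Python. The distance and cost figures are made-up sample data, and a straight line is only one of many models a data scientist might choose.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Hypothetical data: shipping distance (km) versus shipping cost.
distance_km = [10, 20, 30, 40]
cost = [25, 45, 65, 85]

slope, intercept = fit_line(distance_km, cost)
estimate = predict(50, slope, intercept)
```

In a real project, the choice between a model like this and a machine learning model would depend on the data and the intended outcome.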
5. Visualizing and Presenting Data.
Data visualization has moved well beyond simple histograms, scatterplots, and time series plots. Data visualization and presentation are now an advanced and evolving discipline. This includes a wide range of dashboard technologies, as well as more complicated technologies for monitoring data pipelines that consist of streaming and distributed data.
6. Science about Data Science.
Data science is rapidly evolving, and data scientists are identifying commonly occurring analysis and processing workflows. Moreover, just as in other sciences, data scientists are distributing artifacts and publishing scientific results to expand the knowledge of the data science community.
Greetings! As an independent supply chain tech expert with 30+ years of hands-on experience, I take great pleasure in providing actionable insights to logistics leaders. My background includes implementing 100s of innovative solutions using emerging technologies and a data-centric development approach. I have also provided business intelligence (BI) solutions for 1,000s of shippers.