Why on Earth is Data Science?
Data science is a field that extracts insights and knowledge from data for use in a broad range of contexts. It draws on interdisciplinary methods, processes, and algorithms, and serves as an umbrella category for disciplines that include statistics, data analysis, machine learning, and closely related methods. Its purpose is to understand a subject through the use of data. The field has boomed in recent years, and many companies now seek data science professionals to solve complex business problems. Thus the role of a data scientist can be defined as: “A data scientist is a statistician who can code.” Data science is the study of how data can be used and analyzed in order to make new discoveries, or it can be defined more simply as “the discipline of making the data useful.”
“Big data” technologies, such as Hadoop, HBase, CouchDB, and others, have received considerable media attention recently. Like traditional technologies, big data technologies are used for many tasks, including data engineering. Occasionally they are used to implement data-mining techniques directly, but more often the well-known big data technologies handle data processing in support of data-mining techniques and other data-science activities. One way to think about the state of big data technologies is to draw an analogy with the business adoption of internet technologies. In Web 1.0, businesses busied themselves with getting the basic internet technologies in place so that they could establish a web presence, build electronic commerce capability, and improve operating efficiency. We can think of ourselves as being in the era of Big Data 1.0, with firms engaged in building capabilities to process large data.
Understanding these ideas matters beyond the data-science team. Managers in enterprises without substantial data-science resources should still understand the basic principles in order to engage consultants on an informed basis. Investors in data-science ventures need to understand the fundamental principles in order to assess investment opportunities accurately. More generally, businesses are increasingly driven by data analytics, and there is great professional advantage in being able to interact competently with and within such businesses. Understanding the fundamental concepts, and having frameworks for organizing data-analytic thinking, not only allows one to interact competently but also helps one to envision opportunities for improving data-driven decision making and to recognize data-oriented competitive threats.
Underlying the extensive collection of techniques for mining data is a much smaller set of fundamental concepts comprising data science. In order for data science to flourish as a field, rather than to drown in the flood of popular attention, we must think beyond the algorithms, techniques, and tools in common use. We must think about the core principles and concepts that underlie the techniques, and also the systematic thinking that fosters success in data-driven decision making. There is strong evidence that business performance can be improved substantially via data-driven decision making, big data technologies, and data-science techniques based on big data. Data science supports data-driven decision making—and sometimes allows making decisions automatically at massive scale—and depends upon technologies for “big data” storage and engineering. However, the principles of data science are its own and should be considered and discussed explicitly in order for data science to realize its potential.