Data Science: Where did it come from and what is it now?
The field of data science is relatively new, though the ideas that it encompasses are not. Data science is an outgrowth of statistical analysis and has been made possible by the ever-growing computing power developed over the past few decades. In a nutshell, data science is concerned with using data to make decisions. With the growing availability and ease of collecting data through computers, the science (and business) of analyzing data has become a part of just about every aspect of life. And with desktop programs, anyone with a computer and access to data can be involved in the field of data science.
The practice of collecting, organizing and displaying data and using the data to make decisions is not new at all. People have collected data on population, manufacturing of products, sales, revenues and many more things for hundreds of years. Consider that the first census of the United States was held in 1790. But for most of that time, this type of data collection and use involved only a minimal amount of analysis to inform decisions.
The field of statistics grew out of the collecting and displaying of data, but statistics went further and developed into its own branch of mathematics, with its own theories and processes that dove deep into analyzing data. Inferential statistics is the part of the field of statistics that combines data analysis with probability theory to guide research and decisions based on data. Over the years, statisticians developed processes that could analyze data, and as the field of computer science grew, programmers developed software that could quickly carry out the computations needed for the inferential statistical processes.
The programs that were developed to analyze data were, and still are, very powerful and are capable of analyzing data in many ways. The problem years ago was having the data available to be analyzed. Some entities, such as large businesses, were able to collect data in separate locations and bring them together for analysis on a larger scale. But then came the Internet, and the availability of data grew tremendously. With the Internet, the ability to collect and transmit data brought life to this new field of data science.
Data Science Today
Today, with the interconnectivity the Internet provides, a great deal of data is available to be analyzed. For example, a chain of stores throughout a region or country or even the entire world can have all the data from all its stores continuously fed into a central computing center. A healthcare network can pool all the data on its patients and employee contacts. A research study can be carried out in far-apart locations with all the data from it collected nearly instantaneously in one place. And of course, all the information that is put directly onto the Internet through millions of websites can be compiled and shared easily by the site operators.
The amount of information that can be collected now is really overwhelming, and that is where the field of data science comes into play. Data science is all about analyzing the mammoth amounts of data that are collected, making sense of them and putting them to use in decision-making. Take the chain of stores as an example. It may sell millions of products each day. With applications of data science, it can run programs that find which products sell the most, perhaps in different regions of the country or even at different times of the day. It can analyze which products are returned and whether differences in prices between products or between stores make a difference in sales. What the store owners can do with the data is almost limitless because of the powerful computer programs that have been developed and continue to be developed.
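To make the store-chain example a little more concrete, here is a minimal sketch in Python of how a program might find the best-selling product in each region. The sales records, the region and product names, and the function `top_products_by_region` are all made up for illustration; a real chain would pull millions of rows from a database rather than a short list.

```python
from collections import Counter, defaultdict

# Hypothetical sales records: (region, hour_of_day, product, units_sold).
# These values are invented purely for illustration.
sales = [
    ("North", 9, "coffee", 40),
    ("North", 14, "sandwich", 25),
    ("North", 9, "bagel", 30),
    ("South", 9, "coffee", 15),
    ("South", 14, "iced tea", 35),
    ("South", 18, "iced tea", 20),
]


def top_products_by_region(records, n=1):
    """Return the n best-selling products (with unit totals) for each region."""
    totals = defaultdict(Counter)
    for region, _hour, product, units in records:
        # Add this sale to the running total for the product in its region.
        totals[region][product] += units
    return {region: counts.most_common(n) for region, counts in totals.items()}


top = top_products_by_region(sales)
for region, best in top.items():
    print(region, best)
```

The same grouping idea would work for times of day: keying the totals on the hour instead of the region would show which products sell best in the morning or the evening.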
In health care, data on patients, whether in studies or just in routine care, can be brought into one database so that healthcare professionals, especially researchers, can analyze them. For instance, with much more data available, the effects of a particular drug, good and bad, can be determined. Or data on a particular type of surgery can be analyzed to determine what method of that surgery might truly be the best. As long as the data are in the system, analytics can be run to draw out the information someone wants.
And then there is data mining. Simply put, data mining is searching data to find patterns. This is being done more often and more effectively by all types of entities, not the least of which are businesses. Businesses use data, individual and aggregate, in an effort to find what products particular customers or types of customers are likely to buy. The payoff for businesses that do this can be great, so their investment in the development of analytical tools continues to be great. And so the field of data science will continue to expand.
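As a small illustration of what "searching data to find patterns" can mean, the following Python sketch counts which pairs of products tend to be bought together. It is a simplified pair-counting approach, not any particular business's method, and the shopping baskets, the function `frequent_pairs` and the `min_count` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of products per customer visit.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "jam"},
    {"milk", "cereal", "bread"},
]


def frequent_pairs(baskets, min_count=2):
    """Return product pairs that appear together in at least min_count baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort first so each pair is always recorded in the same order.
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return {pair: count for pair, count in pair_counts.items() if count >= min_count}


pairs = frequent_pairs(baskets)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

A business could use a pattern like this to suggest jam to a customer who has just bought bread, which is the kind of payoff that drives investment in analytical tools.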
The history of data science has really just begun. For those growing up in this “big data” Internet world, having so much information – much of it at their fingertips – will not seem strange. As with many aspects of historical development – such as in the technology that has led to this new science – future generations will learn to control data processing to their benefit, both personally and, one can only hope, socially. What many today look at as information overload will become commonplace in the years to come, and the science to analyze it will change in ways that we cannot imagine at this moment in time.