There are many ways to collect data, including surveys, interviews, and research experiments. With these methods, data is usually collected with the knowledge and consent of the people involved. But in recent decades, with the growth of the Internet and other computer systems, data science has expanded well beyond these means of collecting data. Today, a great deal of data is collected automatically, often without people being aware of it. Here, we will look at some everyday activities that allow for automatic data collection. Those who work in fields that involve data science need to know about these methods and keep up with new ways of collecting information as they emerge.
Data Collection on the Internet
Foremost among data collection sources is the Internet. Nearly every click and every piece of text entered online can be recorded. When you browse a website, the site records which items you clicked on and what search terms you entered. On most sites (Google and Facebook are among the exceptions), the information is also collected, with the permission of the main site, by numerous other companies whose sole purpose is to gather, organize, and resell it. Other companies then use this information to target advertising.
Facebook, Google, and some other sites collect information automatically, but they do not sell it. For example, on Facebook, your connections and the groups you join are analyzed, and that information is used to suggest other people with whom you might want to connect. On Google, you may see advertisements based on your previous searches, but these come from sources approved by Google. These websites use the data they collect to improve their services and your experience on the site. In addition, they let you decide what information is shared publicly, so you have some control over the situation.
Cell Phones & Data Collection
Another source of data is the cell phone, the usage of which is tracked by cell phone companies. When calls are made, to whom they are made and from whom they are received, how many texts are sent, and how much time is spent on the Internet are among the many pieces of data collected from your phone. Even without GPS on your phone, your movements can be tracked through the cell phone towers to which your phone connects as you move. If you have GPS turned on, your phone works out a much more precise position from satellite signals, and apps and services on the phone can record that position. The satellites themselves do not track you; the record is made by the software that reads the location. Information sent through Bluetooth connections may also be recorded. We all need to assume that anything we do on our phones might be seen by someone who did not receive express permission to see it.
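To make the tower-based tracking described above concrete, here is a minimal sketch of how a log of tower connections could be turned into a coarse path. The tower names, coordinates, and the Connection record are hypothetical and chosen only for illustration; real carrier records are more detailed and are not public.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical tower locations as (latitude, longitude); not real data.
TOWERS = {
    "tower_a": (40.7128, -74.0060),
    "tower_b": (40.7306, -73.9866),
    "tower_c": (40.7580, -73.9855),
}


@dataclass
class Connection:
    """One assumed log entry: which tower a phone used, and when."""
    tower_id: str
    seen_at: datetime


def coarse_path(connections):
    """Return the ordered list of distinct tower locations a phone passed."""
    path = []
    for conn in sorted(connections, key=lambda c: c.seen_at):
        location = TOWERS[conn.tower_id]
        # Skip repeats while the phone stays on the same tower.
        if not path or path[-1] != location:
            path.append(location)
    return path


# The log may arrive out of order; sorting by time restores the route.
log = [
    Connection("tower_b", datetime(2024, 1, 1, 9, 15)),
    Connection("tower_a", datetime(2024, 1, 1, 9, 0)),
    Connection("tower_a", datetime(2024, 1, 1, 9, 5)),
    Connection("tower_c", datetime(2024, 1, 1, 9, 40)),
]
path = coarse_path(log)
print(path)
```

Even this simple ordering of tower locations shows how a rough route can be recovered without GPS.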
Point of Purchase
When you shop in a store, every item is scanned at the point of sale. Companies collect this information to track inventory and determine which items are selling. But if you pay with a credit or debit card, information such as the location, date, time of day, and amount of the purchase is also recorded. This information, among other things, may be used in data mining to target customers and by credit bureaus in calculating credit scores.
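As an illustration of how recorded card purchases might feed data mining, the sketch below groups a handful of made-up transactions into a per-card spending profile. The CardTransaction fields, card IDs, and store names are assumptions for this example, not an actual retailer or card-network format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CardTransaction:
    """One assumed point-of-sale record for a card payment."""
    card_id: str
    store: str
    timestamp: str      # date and time of the purchase, ISO 8601 text
    amount_cents: int   # amount in cents, to avoid rounding issues


def spending_by_store(transactions):
    """Total the amount each card spent at each store."""
    totals = defaultdict(lambda: defaultdict(int))
    for t in transactions:
        totals[t.card_id][t.store] += t.amount_cents
    # Convert to plain dictionaries for easier printing and comparison.
    return {card: dict(stores) for card, stores in totals.items()}


# Made-up records for illustration only.
transactions = [
    CardTransaction("card_1", "Main St Grocery", "2024-01-01T09:00", 1250),
    CardTransaction("card_1", "Main St Grocery", "2024-01-08T09:10", 750),
    CardTransaction("card_1", "Airport Cafe", "2024-01-09T06:30", 500),
    CardTransaction("card_2", "Main St Grocery", "2024-01-02T18:00", 2000),
]
profile = spending_by_store(transactions)
print(profile)
```

A profile like this, built from nothing more than card, store, time, and amount, is the kind of summary that could be used to target customers.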
Data Tracking in Transportation
Toll booth devices such as E-ZPass are used to collect more than tolls. Electronic pass readers at toll plazas and elsewhere along the highways are also used to collect location data. Autoblog reported that in New York City alone, there were 149 E-ZPass readers by July 2014, producing hundreds of thousands of records on drivers every day. Automatic license plate readers, in the form of high-speed cameras mounted on police cars, road signs, and bridges, can also track the movement of cars. This information is provided to law enforcement officials and is often kept in regional databases.
Capturing license plates can be a useful tool in catching criminals, but collecting all license plate data lends itself not only to data mining but also to a myriad of privacy issues. Not surprisingly, the American Civil Liberties Union (ACLU) is working to reduce the amount of information collected from the general public, to keep the records that are collected private, and to have that information destroyed within a reasonable amount of time. The ACLU argues that there is no reasonable need to collect and store this information about innocent motorists.
As more and more data is collected and the general public becomes aware of it, concern over information privacy is sure to grow. To learn more about data collection and privacy concerns, consider pursuing a degree in data science.