“Data Science” searches over the years


Google Books Ngram Viewer
This morning, I was browsing Data Science content online and casually typed “Data Science” into the Google search bar. Google often shows a small line graph of a word's historical usage alongside the search results, and I usually glance past it. This time, though, the graph looked genuinely intriguing.
For those of you who do not know what Google Books Ngram Viewer is, here is how Wikipedia describes it: “it is an online search engine that charts the frequencies of any set of search strings using a yearly count of n-grams found in printed sources published between 1500 and 2019 in Google’s text corpora in English, Chinese (simplified), French, German, Hebrew, Italian, Russian, or Spanish. There are also some specialized English corpora, such as American English, British English, and English Fiction.”

The Google Ngram Viewer program can search for a word or phrase, including misspellings or gibberish. The n-grams are matched with the text within the selected corpus, optionally using case-sensitive spelling (which compares the exact use of uppercase letters), and, if found in 40 or more books, are then displayed as a graph.
The Google Ngram Viewer supports searches for parts of speech and wildcards. It is routinely used in research.
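To make the description above a little more concrete, here is a minimal sketch of how a yearly bigram frequency could be computed. It is not Google's implementation: the function names, the toy corpus, and the choice to divide by the total number of bigrams in each year are my own assumptions, and the 40-book threshold is applied across the whole toy corpus purely for illustration.

```python
from collections import Counter


def bigrams(text, case_sensitive=False):
    """Return the adjacent word pairs (bigrams) in a piece of text."""
    if not case_sensitive:
        text = text.lower()
    words = text.split()
    return list(zip(words, words[1:]))


def yearly_frequency(books, phrase, min_books=40, case_sensitive=False):
    """Hypothetical helper: share of each year's bigrams that match `phrase`.

    `books` is an iterable of (year, text) pairs. An empty dict is returned
    when the phrase appears in fewer than `min_books` books (assumed rule).
    """
    if not case_sensitive:
        phrase = phrase.lower()
    target = tuple(phrase.split())

    hits = Counter()    # matches of the phrase per year
    totals = Counter()  # all bigrams seen per year
    books_with_phrase = 0

    for year, text in books:
        grams = bigrams(text, case_sensitive)
        count = grams.count(target)
        hits[year] += count
        totals[year] += len(grams)
        if count:
            books_with_phrase += 1

    if books_with_phrase < min_books:
        return {}
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Toy corpus, invented for this example only.
toy_corpus = [
    (1962, "data science is the use of computers in research"),
    (1962, "the science of data"),
    (1975, "Data Science and data science methods"),
]

# The threshold is lowered to 2 because the toy corpus has only three "books".
freq = yearly_frequency(toy_corpus, "data science", min_books=2)
for year, share in freq.items():
    print(year, round(share, 4))
```

Plotting these yearly shares against the year gives the kind of line chart the Ngram Viewer shows, just on a vastly smaller scale.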
Now let me take you through something really interesting and worth your reading time.
As I said, I casually searched “Data Science” on Google today. I know that Google shows this historical view of a word's usage for many searches, and I sometimes look at it and sometimes ignore it.
But this time, the bigram “Data Science” really caught my interest.
What you see below is a line chart depicting the usage of the bigram “Data Science” in Google’s text corpus, which is built from printed sources published between 1500 and 2019.

Interesting Observation 01
The phrase “Data Science” first shows up in the late 1930s. Surprisingly, it persisted for only a couple of years and faded away again in the early 1940s.
This was very intriguing to me because, on record, Data Science as a discipline was founded and practiced much later, in the 1970s.
This is where I spent some time researching. As mentioned on https://scientistcafe.com/ids/a-brief-history-of-data-science.html:
In the early 19th century, when Legendre and Gauss came up with the least-squares method for linear regression, probably only physicists would use it to fit linear regression for their data. Now, nearly anyone can build linear regression using spreadsheet with just a little bit of self-guided online training. In 1936, Fisher came up with linear discriminant analysis. In the 1940s, we had another widely used model: logistic regression.
This suggests that, while there are many new and exciting developments in Data Science, many of the techniques we use today rest on years of hard work by statisticians, computer scientists, mathematicians, and scientists in many other fields.
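To show how approachable the least-squares method mentioned in the quoted passage has become, here is a minimal sketch of fitting a straight line in plain Python. The function name `fit_line` and the sample points are made up for illustration; the formulas are the standard closed-form solution for a single predictor.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented sample points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

A spreadsheet's built-in trendline does essentially the same arithmetic behind the scenes.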
Interesting Observation 02
Usage of the phrase “Data Science” spiked again around the late 1950s and then dropped off in the early 1960s.
The term data science first appeared in the 1960s and was used to refer to the use of computers in scientific research.
Interesting Observation 03
Starting in the early 1960s, usage of the phrase (bigram) “Data Science” rose continuously until the early 1980s.
As noted above, the term data science first appeared in the 1960s, when it referred to the use of computers in scientific research. Then, in the 1970s, we entered the Digital Age.
There was a rapid shift in the industry when IBM introduced personal computers to the mainstream public. With the introduction of computers and PCs, we advanced into a digital age that opened up a flood of information and digital data. New computer technology increased data collection, storage, and manipulation capabilities through novel programming, software, and database systems. As a result, statistics became computerized, giving rise to new practices such as data mining and data analytics.