Colin Middleton

Static Peak, Grand Teton National Park, WY

: Home :

Decoding the Data Scientist: A Look at This Evolving Role

Introduction

The term “data scientist” has become increasingly prevalent in recent years, often hailed as a pivotal role in modern organizations. However, for those outside the field, understanding exactly what a data scientist does can be a challenging task. This article aims to demystify this relatively new profession, exploring its origins, the reasons behind its often ambiguous definition, and the various specialized roles that fall under its umbrella [1].

While the term “data science” might sound contemporary, its roots can be traced back several decades. The term appeared as early as the 1960s, sometimes as an alternative name for statistics [2]. Peter Naur used the term in his 1974 publication, defining it as the application of data and data processes in building and handling models of reality [1]. In 1997, C. F. Jeff Wu suggested that statistics should be renamed data science to better reflect its evolving nature beyond just describing data [5]. However, it wasn’t until the early 21st century that “data science” solidified as a distinct profession, driven by the increasing availability of large and complex datasets, often referred to as “big data” [1]. The rise of the internet and advances in technology led to an explosion of data, creating a need for professionals who can extract meaningful insights from it [3].

One of the first things to understand about the title “data scientist” is that it is a relatively new and broadly defined term [1]. Unlike more established roles with clear boundaries, the responsibilities of a data scientist can vary significantly depending on the organization, industry, and specific needs [9]. This ambiguity stems from the interdisciplinary nature of the field, which draws upon elements of statistics, computer science, mathematics, and domain-specific knowledge [5]. The rapid evolution of technology and the ever-increasing amounts of data mean that the role of a data scientist is still taking shape [8]. This lack of a rigid definition can sometimes lead to confusion, as different companies might use the same title to describe roles with vastly different responsibilities [8].

Despite the general nature of the title, the work that falls under “data science” typically involves a systematic process of extracting knowledge and insights from data to inform decision-making [6]. This often includes several key stages: obtaining data from various sources, cleaning and preparing the data for analysis, exploring the data to identify patterns and trends, building models to make predictions or classifications, and communicating the findings to stakeholders in a clear and understandable way [6]. Data scientists utilize a variety of tools and techniques, including programming languages like Python and R, statistical methods, machine learning algorithms, and data visualization software [7].
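To make the stages above concrete, here is a minimal sketch in Python using only the standard library. The records, variable names, and the choice of a straight-line model are all assumptions made for illustration; a real project would typically pull data from external sources and use dedicated libraries.

```python
import statistics

# 1. Obtain: hypothetical records of (ad_spend, sales); None marks a missing value.
raw_records = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8)]

# 2. Clean: drop any record with a missing field.
clean = [(x, y) for x, y in raw_records if x is not None and y is not None]

# 3. Explore: compute simple summary statistics for each column.
xs = [x for x, _ in clean]
ys = [y for _, y in clean]
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)

# 4. Model: fit a least-squares line y = slope * x + intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in clean)
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# 5. Communicate: turn the fitted model into a plain-language statement.
report = (
    f"Based on {len(clean)} complete records, each extra unit of spend "
    f"is associated with about {slope:.2f} extra units of sales."
)
print(report)
```

Each numbered comment maps to one stage of the process; in practice the stages are revisited repeatedly rather than run once in order.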

Given the breadth of skills and responsibilities associated with the general title of data scientist, it often encompasses several more specific roles. Understanding these distinctions can provide a clearer picture of the different types of work involved in the field. Below are some of the more common specific roles that a “data scientist” may fill.

Constituent Roles

1. Data Analyst

Often considered a foundational role, a data analyst focuses on interpreting existing data to identify trends, patterns, and insights that can help organizations make better decisions [10]. They work with structured data, often using tools like SQL and Excel to query and manipulate datasets [17]. Data analysts are skilled at data visualization, creating charts and graphs to communicate their findings to both technical and non-technical audiences [2].

Example of a data dashboard.
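As a rough illustration of the kind of SQL query an analyst might run before charting results, the sketch below builds a small in-memory SQLite table and summarizes revenue by region. The table name, columns, and figures are hypothetical and chosen only for the example.

```python
import sqlite3

# Build a throwaway in-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("West", 120.0), ("West", 80.0), ("East", 50.0), ("East", 70.0), ("East", 30.0)],
)

# Aggregate order count and revenue per region, largest revenue first.
rows = conn.execute(
    "SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY revenue DESC"
).fetchall()

for region, n_orders, revenue in rows:
    print(f"{region}: {n_orders} orders, {revenue:.2f} total revenue")

conn.close()
```

A summary table like this is often the input to a dashboard chart rather than the final deliverable.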

Typical Tasks:

2. Machine Learning Engineer

This role focuses on building and deploying machine learning models that can automate tasks, make predictions, or provide recommendations [10]. Machine learning engineers have a strong background in computer programming and are skilled in implementing machine learning algorithms using various frameworks and libraries [27]. They work closely with data scientists to take models from research to production [23].

Diagram of a neural network, an example of the internal structure of a machine learning model.

Typical Tasks:

3. Data Engineer

Data engineers are responsible for building and maintaining the infrastructure that allows data to be collected, stored, and processed efficiently [10]. They focus on the “plumbing” of data, ensuring that data is readily available and in the right format for data scientists and analysts to use [12]. They often work with large-scale data systems and cloud technologies [32].

Typical Extract, Transform, and Load (ETL) process diagram.
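The ETL pattern can be sketched in a few lines of Python. This is a toy example under stated assumptions: the CSV content, the `users` table, and the cleaning rules (skip rows with unparseable dates, uppercase country codes, label blanks as UNKNOWN) are all invented for illustration, and production pipelines usually rely on dedicated orchestration and storage tools.

```python
import csv
import io
import sqlite3
from datetime import date

# Hypothetical raw export with a blank country and a malformed date.
RAW_CSV = """user_id,signup_date,country
1,2024-01-05,us
2,2024-01-06,
3,not-a-date,ca
4,2024-02-10,CA
"""

def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: validate dates and normalize country codes."""
    cleaned = []
    for row in rows:
        try:
            signup = date.fromisoformat(row["signup_date"])
        except ValueError:
            continue  # skip rows whose date cannot be parsed
        country = row["country"].strip().upper() or "UNKNOWN"
        cleaned.append((int(row["user_id"]), signup.isoformat(), country))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(user_id INTEGER PRIMARY KEY, signup_date TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT * FROM users ORDER BY user_id").fetchall()
print(loaded)
```

Keeping the three steps as separate functions makes each one easy to test and replace independently.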

Typical Tasks:

4. Data Architect

A data architect focuses on the overall strategy for how data will be stored, managed, and used within an organization [10]. They design the blueprint for the data infrastructure, considering factors like data security, data governance, and future scalability [10]. They often work at a higher level than data engineers, focusing on the strategic vision for data management [42].

Example of the design of a database schema.

Typical Tasks:

5. Business Intelligence Analyst

While sometimes considered a separate field, the work of a business intelligence (BI) analyst often overlaps with that of a data scientist, particularly in terms of analyzing data to provide business insights [18]. BI analysts typically focus on using data to understand past and present business performance and identify areas for improvement [18].

Business intelligence value chain.

Typical Tasks:

6. Statistician

Statisticians represent one of the oldest and most foundational disciplines contributing to modern data science [51]. They focus on the development and application of mathematical and statistical methods to collect, analyze, interpret, and draw conclusions from data [53]. Statisticians typically have strong mathematical and theoretical backgrounds, with expertise in probability theory, statistical inference, and experimental design [52].

95% Confidence Interval.
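As a small worked example of statistical inference, the sketch below computes an approximate 95% confidence interval for a sample mean. The sample values are made up, and the interval uses the normal approximation; for a sample this small, a critical value from the t distribution would give a somewhat wider and more appropriate interval.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical measurements.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

n = len(sample)
mean = statistics.mean(sample)

# Standard error of the mean, using the sample standard deviation.
std_err = statistics.stdev(sample) / math.sqrt(n)

# Two-sided 95% critical value from the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

lower = mean - z * std_err
upper = mean + z * std_err
print(f"mean = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval describes the uncertainty of the estimated mean, not the spread of individual observations.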

Typical Tasks:

Summary

In conclusion, “data scientist” is a multifaceted role, often acting as an umbrella term for various specialized roles centered on extracting knowledge and insights from data. While the title itself is new, general, and often poorly defined, understanding the specific roles it encompasses, such as data analyst, machine learning engineer, data engineer, data architect, business intelligence analyst, and statistician, provides a clearer picture of the diverse work involved in this rapidly evolving field [9]. As organizations continue to generate and rely on increasing amounts of data, the demand for professionals with these skills is likely to keep growing [20].

My Data Science Focus: Where I Fit into the Data Science Landscape

Throughout my career, I’ve developed expertise across several key domains within the broader data science field. My experience reflects the multidisciplinary nature of data science discussed above, with particular depth in the following areas:

Predictive Modeling and Statistical Analysis

My work at Pearl Health demonstrates my expertise in developing sophisticated forecasting models using techniques ranging from classical time series methods (exponential smoothing, theta modeling) to multivariate regression. This experience spans both the statistician and data scientist roles, applying rigorous statistical methodology to real-world healthcare operational challenges.
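For readers unfamiliar with exponential smoothing, here is a minimal sketch of its simplest variant. It is illustrative only and is not the production model described above; the function name, the demand series, and the smoothing weight `alpha` are assumptions made for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Each new level is a weighted average of the latest observation
    (weight alpha) and the previous level (weight 1 - alpha).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialize the level with the first observation
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical demand history.
demand = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(demand, alpha=0.5)

# The one-step-ahead forecast is the most recent smoothed level.
forecast = levels[-1]
print(levels, forecast)
```

A larger `alpha` reacts faster to recent changes, while a smaller one produces a smoother, slower-moving forecast.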

My published research on homelessness prediction further showcases my capabilities in evaluating diverse predictive algorithms including logistic regression, neural networks, and survival analysis models (Cox Proportional Hazards). Addressing extreme class imbalance through minority oversampling demonstrates my ability to handle complex machine learning challenges that require both technical skill and domain understanding.
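The text above does not say which oversampling method was used, so the sketch below shows only the most basic form, random oversampling, in which minority-class rows are duplicated until the classes are balanced. The function name, data, and seed are hypothetical.

```python
import random
from collections import Counter

def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels

# Hypothetical imbalanced data: 8 negatives and 2 positives.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.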

Data Engineering and Pipeline Development

While my primary focus has been on analysis and modeling, I’ve developed substantial data engineering capabilities. My work architecting enterprise-grade data ingestion and versioning pipelines demonstrates how data scientists often need to build the infrastructure that enables their analytical work. The comprehensive tracking system I designed for pricing tool inputs represents the operational data systems that bridge raw data and actionable insights.

My experience refactoring mission-critical codebases (12k+ lines of Python) and engineering proprietary I/O systems that significantly improved input accuracy shows my commitment to creating robust, maintainable data infrastructure—skills typically associated with the data engineering domain.

Business Analytics and Communication

A significant portion of my work has involved translating complex technical concepts into actionable business insights. Leading individualized analyses for strategic prospects and developing comprehensive tracking systems for sales funnel optimization demonstrate my ability to function as a business analyst, connecting data insights to organizational decision-making.

My experience presenting research findings to the Spokane City Council highlights my capability to communicate technical results to non-technical stakeholders—a crucial skill that spans all data science roles.

Tool Development and Optimization

Throughout my career, I’ve consistently focused on building and optimizing tools that transform data into utility. From migrating client reporting tools from SAS to Python (achieving a 10x runtime improvement) to developing The Wordler application and contributing to customized algorithms for document clustering, I’ve demonstrated versatility in creating practical data applications.


My experience illustrates how modern data scientists often need to operate across traditional role boundaries, developing capabilities that span from engineering to communication. While my strongest technical skills are in statistical modeling and predictive analytics, I’ve cultivated the breadth necessary to deliver end-to-end data solutions in complex environments.

Credit

I largely used Google’s Gemini (Deep Research) and Anthropic’s Claude (3.7 Sonnet) to write this article for me and to generate all the graphics. I acted as editor.

Sources

  1. The History Of Data Science and Pioneers You Should Know - Worcester Polytechnic Institute, accessed March 19, 2025, https://onlinestemprograms.wpi.edu/blog/history-data-science-and-pioneers-you-should-know
  2. What is Data Science? - AWS, accessed March 19, 2025, https://aws.amazon.com/what-is/data-science/
  3. A Brief History of Data Science - DATAVERSITY, accessed March 19, 2025, https://www.dataversity.net/brief-history-data-science/
  4. onlinestemprograms.wpi.edu, accessed March 19, 2025, https://onlinestemprograms.wpi.edu/blog/history-data-science-and-pioneers-you-should-know#:~:text=1974,and%20handling%20models%20of%20reality.%22
  5. Data science - Wikipedia, accessed March 19, 2025, https://en.wikipedia.org/wiki/Data_science
  6. What Is Data Science? Definition, Skills, Applications & More, accessed March 19, 2025, https://seas.harvard.edu/news/what-data-science-definition-skills-applications-more
  7. What Is Data Science? Definition, Tools, Techniques, & More, accessed March 19, 2025, https://ischool.syracuse.edu/what-is-data-science/
  8. Overcoming Some of the Worst Parts of Being a Data Scientist - Towards Data Science, accessed March 19, 2025, https://towardsdatascience.com/overcoming-some-of-the-worst-parts-of-being-a-data-scientist-3237d20f356f/
  9. Hiring a Data Scientist: Decoding the Ambiguities - Datahut Blog, accessed March 19, 2025, https://www.blog.datahut.co/post/hiring-a-data-scientist-decoding-the-ambiguities
  10. 12 Data Science Job Titles — Which Role Is Right for You? - Built In, accessed March 19, 2025, https://builtin.com/data-science/data-science-jobs
  11. The ambiguity of data science team roles and the need for a data science workforce framework - ResearchGate, accessed March 19, 2025, https://www.researchgate.net/publication/322512207_The_ambiguity_of_data_science_team_roles_and_the_need_for_a_data_science_workforce_framework
  12. seas.harvard.edu, accessed March 19, 2025, https://seas.harvard.edu/news/what-data-science-definition-skills-applications-more#:~:text=Data%20science%20and%20engineering%20also,use%20them%20for%20decision%2Dmaking.
  13. What is a Data Scientist and How Can You Succeed in This Field? - News@TheU, accessed March 19, 2025, https://news.miami.edu/uonline/stories/2024/07/what-is-a-data-scientist.html
  14. What Does a Data Analyst Do? Roles, Skills & Tools Explained, accessed March 19, 2025, https://ischool.syracuse.edu/what-does-a-data-analyst-do/
  15. What Does a Data Analyst Do? Roles, Skills, and Salary, accessed March 19, 2025, https://graduate.northeastern.edu/knowledge-hub/what-does-a-data-analyst-do/
  16. What Does a Data Analyst Do? - SNHU, accessed March 19, 2025, https://www.snhu.edu/about-us/newsroom/stem/what-does-a-data-analyst-do
  17. What Does a Data Analyst Do? Your 2025 Career Guide - Coursera, accessed March 19, 2025, https://www.coursera.org/articles/what-does-a-data-analyst-do-a-career-guide
  18. 8 Key Data Science Roles Explained, accessed March 19, 2025, https://365datascience.com/career-advice/types-of-data-science-roles-explained/
  19. Data analyst job profile - Prospects.ac.uk, accessed March 19, 2025, https://www.prospects.ac.uk/job-profiles/data-analyst
  20. Data Scientists : Occupational Outlook Handbook - Bureau of Labor Statistics, accessed March 19, 2025, https://www.bls.gov/ooh/math/data-scientists.htm
  21. 5 Data Analytics Projects for Beginners - Coursera, accessed March 19, 2025, https://www.coursera.org/articles/data-analytics-projects-for-beginners
  22. The Roles and Responsibilities of a Data Analyst - Pecan AI, accessed March 19, 2025, https://www.pecan.ai/blog/the-roles-and-responsibilities-of-a-data-analyst/
  23. Key Insights on 7 Data Science Roles, Responsibilities and Skills, accessed March 19, 2025, https://und.edu/blog/data-science-roles-and-responsibilities.html
  24. 11 Data Science Careers That Are Shaping the Future, accessed March 19, 2025, https://graduate.northeastern.edu/knowledge-hub/data-science-careers-shaping-our-future/
  25. www.run.ai, accessed March 19, 2025, https://www.run.ai/guides/machine-learning-engineering#:~:text=Machine%20learning%20engineers%20build%20software,in%20production%20and%20at%20scale.
  26. What Is a Machine Learning Engineer? (+ How to Get Started) - Coursera, accessed March 19, 2025, https://www.coursera.org/articles/what-is-machine-learning-engineer
  27. What is a Machine Learning Engineer? The Ultimate Guide - NVIDIA Run:ai, accessed March 19, 2025, https://www.run.ai/guides/machine-learning-engineering
  28. What Is a Machine Learning Engineer? (2025 Guide) - BrainStation, accessed March 19, 2025, https://brainstation.io/career-guides/what-is-a-machine-learning-engineer
  29. Machine Learning Engineer Job Description - LinkedIn Business, accessed March 19, 2025, https://business.linkedin.com/talent-solutions/resources/how-to-hire-guides/machine-learning-engineer/job-description
  30. Machine Learning Engineer Job Description: A Complete Guide - Caltech Bootcamps, accessed March 19, 2025, https://pg-p.ctme.caltech.edu/blog/ai-ml/machine-learning-engineer-job-description
  31. 100+ Machine Learning Projects with Source Code [2025] - GeeksforGeeks, accessed March 19, 2025, https://www.geeksforgeeks.org/machine-learning-projects/
  32. What is a Data Engineer? - Splunk, accessed March 19, 2025, https://www.splunk.com/en_us/blog/learn/data-engineer-role-responsibilities.html
  33. What Is a Data Engineer? A Guide to This In-Demand Career - Coursera, accessed March 19, 2025, https://www.coursera.org/articles/what-does-a-data-engineer-do-and-how-do-i-become-one
  34. Working as a data engineer - Randstad USA, accessed March 19, 2025, https://www.randstadusa.com/job-seeker/career-advice/job-profiles/data-engineer/
  35. Data Engineer Roles and Responsibilities, Salaries and Jobs - Randstad, accessed March 19, 2025, https://www.randstad.com/career-advice/careers/data-engineer/
  36. Data engineer - Government Digital and Data Profession Capability Framework, accessed March 19, 2025, https://ddat-capability-framework.service.gov.uk/role/data-engineer
  37. Data Engineer Job Description - Virginia Office of Data Governance and Analytics, accessed March 19, 2025, https://www.odga.virginia.gov/media/governorvirginiagov/chief-data-officer/css/Job-Description—Data-Engineer-Sample.pdf
  38. Key Data Engineer Skills and Responsibilities - Simplilearn.com, accessed March 19, 2025, https://www.simplilearn.com/data-engineer-role-article
  39. Top 10 Data Engineering Projects for 2025 - Simplilearn.com, accessed March 19, 2025, https://www.simplilearn.com/tutorials/big-data-tutorial/data-engineering-projects
  40. www.splunk.com, accessed March 19, 2025, https://www.splunk.com/en_us/blog/learn/data-engineer-role-responsibilities.html#:~:text=A%20data%20engineer’s%20primary%20responsibility,writing%20code%20for%20required%20customizations.
  41. Data Architect - Texas State Auditor’s Office, accessed March 19, 2025, https://hr.sao.texas.gov/Compensation/JobDescriptions/R0317.pdf
  42. What Does a Data Architect Do? A Career Guide - Coursera, accessed March 19, 2025, https://www.coursera.org/articles/data-architect
  43. What is a Data Architect? Responsibilities, Skills & Salary Explored - Splunk, accessed March 19, 2025, https://www.splunk.com/en_us/blog/learn/data-architect-role-responsibilities.html
  44. Data architect - Government Digital and Data Profession Capability Framework, accessed March 19, 2025, https://ddat-capability-framework.service.gov.uk/role/data-architect
  45. Data Architect Roles & Responsibilities Guide for 2025 - Atlan, accessed March 19, 2025, https://atlan.com/data-architect-roles-and-responsibilities/
  46. 15-2051.01 - Business Intelligence Analysts - O*NET, accessed March 19, 2025, https://www.onetonline.org/link/summary/15-2051.01
  47. A Day in the Life of a Business Intelligence Analyst, accessed March 19, 2025, https://post.edu/blog/a-day-in-the-life-of-a-business-intelligence-analyst/
  48. Business Intelligence Analyst Job Description and Templates - Userpilot Blog, accessed March 19, 2025, https://userpilot.com/blog/business-intelligence-analyst-job-description/
  49. Example Job Description for Business Intelligence Analyst - Yardstick, accessed March 19, 2025, https://www.yardstick.team/job-description/business-intelligence-analyst
  50. Business Intelligence Analyst Job Description - Deel, accessed March 19, 2025, https://www.deel.com/job-description-templates/business-intelligence-analyst
  51. What is a Statistician? - American Statistical Association, accessed April 10, 2025, https://www.amstat.org/your-career/what-is-a-statistician
  52. Statistician - Bureau of Labor Statistics, accessed April 10, 2025, https://www.bls.gov/ooh/math/mathematicians-and-statisticians.htm
  53. What Does a Statistician Do? - Coursera, accessed April 10, 2025, https://www.coursera.org/articles/statistician
  54. The Role of Statisticians in Data Science - Royal Statistical Society, accessed April 10, 2025, https://rss.org.uk/news-publication/news-publications/2019/general-news/the-role-of-statisticians-in-data-science/
  55. Statistician: Occupational Outlook Handbook - U.S. Bureau of Labor Statistics, accessed April 10, 2025, https://www.bls.gov/ooh/math/statisticians.htm
  56. Statistical Methods for Data Science - Harvard University, accessed April 10, 2025, https://online-learning.harvard.edu/course/statistical-methods-data-science
  57. The American Statistician: The Future of Statistics and Data Science - Taylor & Francis Online, accessed April 10, 2025, https://www.tandfonline.com/toc/utas20/72/1
  58. Foundations of Statistics for Data Scientists - Stanford University, accessed April 10, 2025, https://statistics.stanford.edu/courses/foundations-statistics-data-scientists
  59. Statistician vs. Data Scientist: What’s the Difference? - DataCamp, accessed April 10, 2025, https://www.datacamp.com/blog/statistician-vs-data-scientist-whats-the-difference
  60. Statistical Science in the Age of Artificial Intelligence - Annual Review of Statistics, accessed April 10, 2025, https://www.annualreviews.org/doi/10.1146/annurev-statistics-031219-041141
