job summary:
The Data Scientist is responsible for developing, maintaining, and optimizing big data solutions using the Databricks Unified Analytics Platform.
This role supports data engineering, machine learning, and analytics initiatives within an organization that relies on large-scale data processing.
Duties include:
Designing and developing scalable data pipelines
Implementing ETL/ELT workflows
Optimizing Spark jobs
Integrating with Azure Data Factory
Automating deployments
Collaborating with cross-functional teams
Ensuring data quality, governance, and security.
location: Austin, Texas
job type: Contract
salary: $47 - $52 per hour
work hours: 8am to 5pm
education: Bachelor's degree
responsibilities:
Minimum Requirements:
Candidates who do not meet or exceed the minimum stated requirements (skills/experience) will still be presented to customers but may not be chosen for this opportunity.
Years | Required/Preferred | Experience
4 | Required | Implement ETL/ELT workflows for both structured and unstructured data
4 | Required | Automate deployments using CI/CD tools
4 | Required | Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
4 | Required | Design and maintain data models, schemas, and database structures to support analytical and operational use cases
4 | Required | Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
4 | Required | Implement data validation and quality checks to ensure accuracy and consistency
4 | Required | Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
4 | Required | Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
4 | Required | Proficiency in Python and R programming languages
4 | Required | Strong SQL querying and data manipulation skills
4 | Required | Experience with the Azure cloud platform
4 | Required | Experience with DevOps, CI/CD pipelines, and version control systems
4 | Required | Experience working in agile, multicultural environments
4 | Required | Strong troubleshooting and debugging capabilities
3 | Required | Design and develop scalable data pipelines using Apache Spark on Databricks (a minimal pipeline sketch follows this list)
3 | Required | Optimize Spark jobs for performance and cost-efficiency
3 | Required | Integrate Databricks solutions with cloud services (Azure Data Factory)
3 | Required | Ensure data quality, governance, and security using Unity Catalog or Delta Lake
3 | Required | Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
3 | Required | Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
1 | Preferred | Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow)
1 | Preferred | Databricks Certified Associate Developer for Apache Spark
1 | Preferred | Azure Data Engineer Associate
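To make the Databricks pipeline expectation above concrete, here is a minimal sketch of the kind of work involved: ingest raw files, apply a PySpark transformation, and persist the result as a Delta table. The storage path, table name, and column names are hypothetical placeholders, not details from this posting.

# Minimal ELT sketch on Databricks: ingest raw JSON, cleanse, write to Delta.
# All paths, table names, and columns below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession named `spark` already exists; this builder is
# only needed when running the sketch outside a notebook.
spark = SparkSession.builder.appName("orders-elt-sketch").getOrCreate()

# Extract: read semi-structured landing data from a hypothetical ADLS path.
raw = spark.read.json("abfss://landing@exampleaccount.dfs.core.windows.net/orders/")

# Transform: basic typing, cleansing, and de-duplication.
orders = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .dropDuplicates(["order_id"])
)

# Load: write a Delta table partitioned by date for downstream analytics.
(
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("analytics.orders_clean")
)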
qualifications:
Must-Have Qualifications (Required)
Data Pipeline & ETL/ELT Expertise: Proven experience (4+ years) in designing, implementing, and optimizing scalable ETL/ELT workflows for both structured and unstructured data, including strong SQL querying and data manipulation skills.
Big Data & Databricks: At least 3 years of experience designing and developing scalable data pipelines using Apache Spark on Databricks. This includes a deep understanding of Spark architecture (RDDs, DataFrames, Spark SQL) and experience optimizing Spark jobs for performance.
Cloud & DevOps: Hands-on experience with the Azure cloud platform and integrating Databricks with other cloud services (e.g., Azure Data Factory). Required experience with DevOps, CI/CD pipelines, and version control systems to automate deployments.
Data Governance & Quality: Experience implementing data validation and quality checks, and contributing to data governance initiatives (metadata management, data lineage, data cataloging); a minimal data-quality check sketch follows this list.
Databricks Specific: Experience ensuring data quality, governance, and security using Unity Catalog or Delta Lake.
Programming & Modeling: Proficiency in Python (and R is a strong plus) and extensive experience designing and maintaining data models, schemas, and database structures for analytical and operational use cases.
Security & Compliance: Experience implementing data security measures, including encryption, access controls, and auditing to ensure compliance.
Collaboration & Agility: Strong collaboration skills with cross-functional teams (Data Scientists, Analysts, Stakeholders) and experience working in agile, multicultural environments.
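As a companion to the governance and quality expectations above, the following is a minimal sketch of a data-quality gate. It assumes the hypothetical analytics.orders_clean Delta table from the earlier pipeline sketch and enforces two illustrative rules; failing the run is one simple way to surface violations in a Databricks Job or CI/CD pipeline.

# Minimal data-quality gate sketch; table and rules are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-quality-check").getOrCreate()

orders = spark.table("analytics.orders_clean")

# Rule 1: the primary key must be unique and non-null.
dup_keys = orders.groupBy("order_id").count().filter(F.col("count") > 1).count()
null_keys = orders.filter(F.col("order_id").isNull()).count()

# Rule 2: amounts must be non-negative.
negative_amounts = orders.filter(F.col("amount") < 0).count()

# Fail the job (and therefore the scheduled run) on any violation.
violations = {
    "duplicate_keys": dup_keys,
    "null_keys": null_keys,
    "negative_amounts": negative_amounts,
}
failed = {rule: count for rule, count in violations.items() if count > 0}
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")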
Nice-to-Have Qualifications (Preferred)
Experience with or knowledge of ML libraries and tools such as MLflow, Scikit-learn, or TensorFlow; a minimal MLflow tracking sketch follows this list.
Certifications such as Azure Data Engineer Associate or Databricks Certified Associate Developer for Apache Spark.
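For the preferred ML tooling, here is a minimal sketch of experiment tracking with MLflow and scikit-learn; the dataset, parameters, and metric are illustrative only and are not taken from this posting.

# Minimal MLflow tracking sketch with an illustrative scikit-learn model.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-baseline"):
    params = {"n_estimators": 100, "max_depth": 6}
    model = RandomForestRegressor(**params, random_state=42).fit(X_train, y_train)

    mse = mean_squared_error(y_test, model.predict(X_test))

    # Log parameters, the evaluation metric, and the fitted model for later comparison.
    mlflow.log_params(params)
    mlflow.log_metric("mse", mse)
    mlflow.sklearn.log_model(model, artifact_path="model")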
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact HRsupport@randstadusa.com.
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.