Data Engineer
Bengaluru
Job No. atci-5510173-s2010015
Full-time
Job Description
Project Role : Data Engineer
Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.
Must have skills : Palantir Foundry
Good to have skills : NA
Minimum 7.5 year(s) of experience is required
Educational Qualification : 15 years full time education
Summary:
As a Lead Data Engineer, you will design, build, and enhance applications in Palantir Foundry to meet business processes and requirements. The role includes creating efficient data pipelines and ensuring the integrity and quality of data throughout its lifecycle. You will be responsible for implementing processes that extract, transform, and load data to facilitate seamless migration and deployment across various systems. This position requires continuous collaboration with different teams to optimize data workflows and support organizational data needs effectively.
Work experience: Minimum 6 years
Job Requirements & Key Responsibilities :
Responsible for designing, developing, testing, and supporting data pipelines and applications on Palantir Foundry.
Configure and customize Workshop to design and implement workflows and ontologies.
Collaborate with data engineers and stakeholders to ensure successful deployment and operation of Palantir Foundry applications.
Work with stakeholders, including the product owner and the data and design teams, to assist with data-related technical issues, understand requirements, and design the data pipeline.
Work independently, troubleshoot issues, and optimize performance.
Communicate design processes, ideas, and solutions clearly and effectively to the team and clients.
Assist junior team members in improving efficiency and productivity.
Technical Experience :
Must have Skills : Palantir Foundry, PySpark
Proficiency in PySpark, Python, and SQL, with a demonstrable ability to write and optimize SQL and Spark jobs.
Hands-on experience with Palantir Foundry services such as Data Connection, Code Repositories, Contour, Data Lineage, and Health Checks.
Good to have: working experience with Workshop, Ontology, and Slate.
Hands-on experience in data engineering and building data pipelines (code and no-code) on Palantir Foundry for ELT/ETL data migration, data refinement, and data quality checks.
Experience ingesting data from external source systems using Data Connection and syncs.
Good knowledge of Spark architecture and hands-on experience with performance tuning and code optimization.
Proficient in managing both structured and unstructured data, with expertise in handling various file formats such as CSV, JSON, Parquet, and ORC.
Experience developing scalable architectures and managing large data sets.
Good understanding of data-loading mechanisms and the ability to implement strategies for change data capture (CDC).
Nice to have: experience with test-driven development and CI/CD workflows.
Experience with version control software such as Git and with major hosting services (e.g., Azure DevOps, GitHub, Bitbucket, GitLab).
Adherence to code best practices: guidelines that enhance code readability, maintainability, and overall quality.
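For illustration only (not an additional requirement), the change-data-capture skill listed above amounts to applying a feed of inserts, updates, and deletes to a keyed target. The plain-Python sketch below shows the core merge logic; in Foundry this would typically run as a PySpark incremental transform, and the record shape and operation codes here are hypothetical:

```python
# Minimal CDC-style merge: apply a change feed of (op, row) pairs to a
# target keyed by "id". Ops: "I" = insert, "U" = update, "D" = delete.
# Record shape and op codes are illustrative, not a Foundry API.

def apply_cdc(target, changes):
    """target: dict mapping id -> row; changes: iterable of (op, row)."""
    result = dict(target)  # leave the original target untouched
    for op, row in changes:
        if op in ("I", "U"):
            result[row["id"]] = row        # insert new key or overwrite existing
        elif op == "D":
            result.pop(row["id"], None)    # tolerate deletes of missing keys
    return result

base = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b"}}
feed = [("U", {"id": 1, "v": "a2"}), ("D", {"id": 2}), ("I", {"id": 3, "v": "c"})]
print(apply_cdc(base, feed))
```

Applying the feed upserts ids 1 and 3 and drops id 2, yielding the current snapshot of the target; a real pipeline would additionally order changes by a sequence number or timestamp before merging.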
Good to Have Skills :
Knowledge of Big Data tools and technologies.
Organizational and project management experience.
Educational Qualification: 15 years of full-time education
Job Requirements
15 years of full-time education