Tookitaki is looking for a Data Engineer who is familiar with the Spark platform and can design, optimize, and maintain data/machine-learning (ML) pipelines on it.
Responsibilities:
Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
Driving optimization, testing, and tools to improve quality.
Reviewing and approving high-level and detailed designs to ensure that the solution meets business needs and aligns with the data & analytics architecture principles and roadmap.
Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
Following proper SDLC (Code review, sprint process).
Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
Building robust and scalable data infrastructure (both batch and real-time) to support the needs of internal and external users.
Understanding data security standards and using appropriate security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
Supporting and contributing to the development of guidelines and standards for data ingestion.
Working with the data science and business analytics teams to assist with data ingestion and data-related technical issues.
Designing and documenting the development & deployment flow.
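To give a concrete flavour of the pipeline work described above, here is a minimal Scala/Spark batch-ingestion sketch. All paths, column names, and the job name are illustrative assumptions, not part of the role description, and a real job would take its cluster settings from spark-submit:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object IngestEvents {
  def main(args: Array[String]): Unit = {
    // Session for illustration; master/resources come from spark-submit in production.
    val spark = SparkSession.builder()
      .appName("ingest-events")
      .getOrCreate()

    // Hypothetical raw input path, read with schema-on-read.
    val raw = spark.read.json("hdfs:///data/raw/events/")

    // Typical cleanup: drop malformed rows and derive a partition column.
    val cleaned = raw
      .filter(F.col("event_id").isNotNull)
      .withColumn("event_date", F.to_date(F.col("event_ts")))

    // Partitioned Parquet output for downstream Hive/analytics access.
    cleaned.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("hdfs:///data/curated/events/")

    spark.stop()
  }
}
```

A job like this would be packaged as a jar and submitted with `spark-submit`, with the code-review and sprint process mentioned above applied to it like any other production code.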
Requirements:
Experience developing REST API services using a Scala framework.
Ability to troubleshoot and optimize complex queries on the Spark platform.
Expertise in optimizing big-data pipelines, architectures, and data sets.
Experience in Big Data access and storage techniques.
Experience in doing cost estimation based on design and development.
Excellent debugging skills across the technical stack mentioned above, including analysis of server and application logs.
Highly organized, self-motivated, proactive, and able to propose the best design solutions.
Good time management and multitasking skills to meet deadlines, working both independently and as part of a team.
Ability to analyze and understand complex problems.
Ability to explain technical information in business terms.
Ability to communicate clearly and effectively, both verbally and in writing.
Strong skills in user requirements gathering, maintenance, and support.
Excellent understanding of Agile Methodology.
Good experience in Data Architecture, Data Modelling, and Data Security.
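As an example of the Spark query troubleshooting and optimization called for above, a common first step is reading the physical plan and replacing an expensive shuffle join with a broadcast join. This is a hedged sketch; "orders" and "countries" are hypothetical tables, not anything from the posting:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object TuningDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tuning-demo")
      .getOrCreate()

    val orders    = spark.table("orders")    // large fact table (assumed)
    val countries = spark.table("countries") // small dimension table (assumed)

    // Shuffle-joining a large table with a tiny one is a common hotspot;
    // broadcasting the small side avoids the shuffle entirely.
    val joined = orders.join(F.broadcast(countries), Seq("country_code"))

    // Inspecting the physical plan is the usual troubleshooting step:
    // look for BroadcastHashJoin rather than SortMergeJoin.
    joined.explain()

    spark.stop()
  }
}
```

The same plan-first habit applies to the server- and application-log analysis listed above: confirm what Spark actually executed before changing code or configuration.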
Must-Have Skills:
Spark: 3 to 4 years
Scala: minimum 2 years
Hadoop, Hive, HBase: minimum 1 year