MINDTEL GLOBAL PRIVATE LIMITED

Big Data Engineer - Scala/Spark

Job Location

India

Job Description

Locations: Pune, Noida, and Hyderabad
Experience Range: 5 years
Notice Period: Immediate to 60 days (currently serving notice accepted)

We are looking for a highly skilled and experienced Big Data Engineer to join our growing data engineering team. You will play a critical role in designing, building, and maintaining scalable big data solutions that empower our organization with valuable insights and drive data-driven business decisions. The ideal candidate has a strong foundation in Scala and Apache Spark, coupled with practical experience using cloud platforms such as AWS or Azure to implement robust and efficient data architectures. You will collaborate closely with data scientists, analysts, and other stakeholders to understand their data requirements and deliver high-quality, reliable data pipelines.

Key Responsibilities:

- Design, develop, and implement scalable and efficient big data processing pipelines using Scala and Apache Spark.
- Build robust ETL (Extract, Transform, Load) and ELT processes for batch and streaming data.
- Optimize data processing workflows for performance, reliability, and cost-effectiveness.
- Design and implement scalable, fault-tolerant data architectures on either the AWS or Azure cloud platform.
- Utilize cloud-native services for data storage (e.g., S3, Azure Data Lake Storage), data processing (e.g., EMR, Databricks, Azure Synapse Analytics), and data warehousing (e.g., Redshift, Snowflake, Azure Synapse).
- Implement data lakes and data warehouses to support various analytical needs.
- Develop and maintain data ingestion frameworks to efficiently and reliably ingest data from diverse sources (e.g., relational databases, NoSQL databases, APIs, streaming platforms).
- Ensure data quality and integrity during the ingestion process.
- Implement and enforce data quality checks and validation processes throughout the data lifecycle.
- Adhere to data security best practices and compliance requirements (e.g., GDPR, HIPAA) when designing and implementing data solutions on cloud platforms.
- Implement alerting and monitoring mechanisms to proactively identify and resolve potential issues.
- Participate in on-call rotations as needed to support critical data pipelines.
- Stay up to date with the latest trends and technologies in big data and cloud computing.
- Identify opportunities for process improvement and optimization within the data engineering landscape.
- Evaluate and recommend new tools and technologies to enhance our data infrastructure.

Required Skills:

- 5 years of hands-on experience in a Data Engineer or Big Data Engineer role.
- Proven ability to write clean, efficient, and maintainable code in Scala.
- Extensive experience with Apache Spark for large-scale data processing, including Spark SQL and Spark Streaming; Spark MLlib is nice to have.
- Hands-on experience with data-related services on either Amazon Web Services (AWS) or Microsoft Azure (at least one is mandatory), including services for storage, compute, and data processing.
- Proficiency in building complex data pipelines, data lakes, and batch/streaming ETL/ELT processes.
- Solid understanding of the principles and challenges of distributed systems and how they apply to big data processing.
- Good understanding of different data modeling techniques and a proven ability to optimize data pipelines for performance and scalability.
- Experience with various data storage technologies, including distributed file systems (e.g., HDFS), object storage (e.g., S3, Azure Blob Storage), and data lakes (e.g., ADLS Gen2).
- Excellent analytical and problem-solving skills with the ability to diagnose and resolve complex technical issues.
- Ability to work effectively both independently and as part of a collaborative team.
- Experience with workflow orchestration tools such as Apache Airflow, Apache Oozie, or similar.
- Familiarity with containerization technologies (Docker) and orchestration platforms (Kubernetes).
- Knowledge of DevOps principles and experience implementing CI/CD pipelines for data engineering workflows.
- Exposure to real-time data processing tools and frameworks such as Apache Kafka, Apache Flink, or similar.
- Experience working with NoSQL databases (e.g., Cassandra, MongoDB, HBase).
- Experience with data warehousing solutions such as Amazon Redshift, Snowflake, or Azure Synapse Analytics.
- Relevant cloud certifications such as AWS Certified Data Analytics - Specialty or Azure Data Engineer Associate.

(ref:hirist.tech)

Posted Date: 4/10/2025

Contact Information

Contact Human Resources
MINDTEL GLOBAL PRIVATE LIMITED

UID: 5138056595
