Senior Java Developer + Data Engineer
MUST HAVE:
- Minimum of 10-12 years of hands-on development experience implementing batch and event-driven applications using Java, Kafka, Spark, Scala, PySpark and Python.
- Experience with Apache Kafka and Connectors, Java and Spring Boot for building event-driven services, and Python for building ML pipelines.
- Develop data pipelines responsible for ingesting large amounts of different kinds of data from various sources.
- Help evolve the data architecture and work on next-generation real-time pipeline algorithms and architecture, in addition to supporting and maintaining current pipelines and legacy systems.
- Write code and develop worker nodes for business logic, ETL and orchestration processes.
- Develop algorithms for better attribution rules and category classifiers.
- Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive search, discovery, and recommendations.
- Work closely with architects, engineers, data analysts, data scientists, contractors/consultants and project managers to assess project requirements and to design, develop and support data ingestion and API services.
- Work with Data Scientists in building feature engineering pipelines and integrating machine learning models during the content enrichment process.
- Able to influence priorities while working with various partners, including engineers, the project management office and leadership.
- Mentor junior team members, define architecture, review code, do hands-on development and deliver work in sprint cycles.
- Participate in design discussions with Architects and other team members for the design of new systems and re-engineering of components of existing systems.
- Wear an Architect hat when required, bringing new ideas, thought leadership and forward thinking to the table.
- Take a holistic approach to building solutions by thinking of the big picture and overall solution.
- Work on moving away from legacy systems to next-generation architecture.
- Take complete ownership from requirements through solution design, development, production launch and post-launch production support. Participate in code reviews and regular on-call rotations.
- Desire to apply the best solution in the industry, apply correct design patterns during development and learn best practices and data engineering tools and technologies.
- Perform any other functions and duties assigned and necessary for smooth and efficient operation.
EDUCATION & EXPERIENCE:
- BS or MS in Computer Science (or related field) with 12+ years of hands-on software development experience working on large-scale data processing pipelines.
- Must-have skills: Apache Spark, Scala and PySpark, with 2-4 years of experience building production-grade batch pipelines that handle large volumes of data.
- Must have 8+ years of experience in Java and APIs / microservices.
- Must have 5+ years of experience in Python.
- 5+ years of experience in understanding and writing complex SQL and stored procedures for processing raw data, ETL, data validation, using databases such as SQL Server, Redis and other NoSQL DBs.
- Knowledge of big data technologies such as Hadoop and HDFS.
- Expertise in building event-driven pipelines with Kafka and Java / Spark.
- Expertise with the AWS stack, such as EMR, EC2 and S3.
- Experience working with APIs to collect and ingest data, as well as building APIs for business logic.
- Experience setting up, maintaining, and debugging production systems and infrastructure.
- Experience in building fault-tolerant and resilient systems.
- Experience in building worker nodes, knowledge of REST principles and data engineering design patterns.
- In-depth knowledge of Java, Spring Boot, Spark, Scala, PySpark, Python, orchestration tools, ESB, SQL, stored procedures, Docker, RESTful web services, Kubernetes, CI/CD, observability techniques, Kafka, release processes, caching strategies, versioning, B&D, BitBucket / Git, the AWS cloud ecosystem, NoSQL databases and Hazelcast.
- Strong software development, architecture diagramming, problem-solving and debugging skills.
- Phenomenal communication and influencing skills.
NICE TO HAVE:
- Exposure to Machine Learning (ML), LLMs, AI-assisted coding and building with AI.
- Knowledge of Elastic APM, ELK stack and search technologies such as Elasticsearch / Solr.
- Some experience with workflow orchestration tools such as Apache Airflow or Apache NiFi.