Worked as a Senior Data Engineer serving internal investment teams and delivering the data products they needed.

ACHIEVEMENTS:

  • Project: Large Datasets Pipelines - Developed PySpark and Scala data pipelines that increased processing efficiency for large datasets stored on the Hadoop Distributed File System (HDFS).

  • Project: Python Web Scrapers - Created a Java and Spring-based scraping framework with reusable components, enabling rapid prototyping of new scrapers. Improved data ingestion and crawling efficiency using Python and Java scrapers.

  • Project: Decommissioning of Data Management Platform (DMP) - Re-architected a monolithic J2EE data platform into Python microservices for improved scalability. Developed flexible data models and pipelines, and built an internal Data Lake for efficient storage of, and access to, datasets from multiple sources.

  • Project: Decommissioning of MALT Refinery, Timeseries MIS - Revamped time-series data catalog management by replacing a C# desktop app with a web application. Built the front end with Angular and the back end with Java Spring Boot, improving accessibility and usability.