Data Arc

Project Resources

Explore a selection of my open-source projects, code samples, and development resources.

Featured Projects

COVID-19 Data Engineering

Analyzed the effectiveness of COVID-19 lockdown policies using GCP, Apache Kafka, and Spark. The system combined real-time and batch processing.

GCP, Kafka, Spark
View on GitHub

EV Charging Trends

Analyzed U.S. alternative fuel station data to uncover trends in EV charging infrastructure and forecast future growth.

Data Analysis, Python, Prophet
View on GitHub

Air Quality Data Pipeline

Built a web scraping pipeline using Scrapy and Requests to collect and process air quality data from EPA AirNow for geospatial analysis.

Scrapy, Python, GIS
View on GitHub

Document Processor

A document processing system that handles multiple file formats, including PDFs, with text extraction and analysis capabilities.

Python, NLP, PDF
View on GitHub

Data Engineering Practice

A collection of data engineering exercises and solutions, covering ETL processes, data pipelines, and big data technologies.

Python, ETL, Big Data
View on GitHub

Anomaly or Incident Detection

Implementations of ML/AI algorithms, including autoencoders and Isolation Forest, for predictive analysis.

Python, TensorFlow, Isolation Forest
View on GitHub

Skills