About Gyan Bahadur
Languages
- English | Native or bilingual
Experience
- DHIS2 | Data Engineer | SOFTWARE PUBLISHING | July 2020 - Today (5 years and 11 months)
- Designed and implemented scalable ETL/ELT pipelines using Apache Spark (PySpark/Scala), Databricks, Microsoft Fabric, and Azure Data Factory, processing millions of public health records including HIV/AIDS surveillance, immunization, maternal health, and disease reporting used in WHO and Save the Children-supported programs.
- Architected a Medallion Data Lake (Bronze, Silver, Gold layers) with strong data modeling practices (dimensional and analytical modeling) to standardize healthcare data ingestion, transformation, and analytics for disease monitoring, outbreak tracking, and national health reporting.
- Engineered multi-source data ingestion pipelines from relational databases (SQL Server, PostgreSQL), APIs (DHIS2 and external health systems), CSV, Excel, JSON, and flat files, enabling unified processing of heterogeneous public healthcare data.
- Built and tuned high-performance Spark pipelines using caching, in-memory computation, and distributed processing optimizations, enabling efficient processing of large-scale public health datasets in Databricks and Microsoft Fabric environments.
- Collaborated with public health stakeholders, including WHO-aligned initiatives and Save the Children programs, to define data models, ETL rules, validation frameworks, and standardized health indicators, ensuring accurate reporting for HIV/AIDS and other critical health programs.
- Built and maintained cloud-based data lakes on AWS S3 and Azure Data Lake, implementing scalable transformations using AWS Glue, Microsoft Fabric, and Spark with optimized partitioning and caching strategies for large datasets.
- Developed Power BI dashboards integrated with SQL, APIs, DHIS2 systems, and Microsoft Fabric datasets, enabling visualization of health program performance, treatment outcomes, and key public health indicators at scale.
- OpenMRS | Jr. Data Engineer | SOFTWARE PUBLISHING | July 2019 - July 2020 (1 year) | Dallas, United States
- Designed and developed scalable ETL pipelines using AWS Glue and PySpark to ingest and transform large-scale healthcare data (millions of patient and medicine records) from S3 and external systems into Amazon Redshift for analytics and reporting.
- Automated data discovery and querying using AWS Glue Crawlers, Data Catalog, and Amazon Athena, enabling efficient access to high-volume patient and clinical datasets for analytics teams.
- Built a reusable and scalable ETL framework using Spark (Python/Scala) to standardize ingestion, transformation, and loading of millions of healthcare records including patient history, prescriptions, and treatment data into Hive and HBase.
- Designed and optimized data models for healthcare analytics, structuring raw, staging, and curated layers (Medallion-style modeling) to support efficient querying and reporting on patient and medicine datasets.
- Optimized Hive table design with partitioning and bucketing strategies, significantly improving performance for millions of patient-level and pharmaceutical records.
- Implemented event-driven data pipelines using AWS Lambda and S3 triggers, enabling automated and near real-time processing of incoming patient and medicine data at scale.
- Orchestrated end-to-end workflows using Apache Airflow DAGs, ensuring reliable scheduling, dependency management, and monitoring of large-scale healthcare data pipelines.
- Developed and optimized distributed processing jobs using PySpark and Spark SQL, efficiently handling millions of records across patient demographics, prescriptions, and clinical events.
- Built real-time streaming pipelines using Apache Kafka and Apache Flink, and containerized workloads using Docker and Kubernetes, enabling scalable processing of high-volume healthcare data streams.
- Solulab Inc | Software Developer Intern | DIGITAL AND IT | January 2019 - July 2019 (6 months) | Ahmedabad, India
- Developed and integrated RESTful APIs using FastAPI and PostgreSQL into the Bevvi application, enabling seamless data exchange and functionality with third-party services.
- Designed a responsive user interface (UI) using React and Material UI for the Bevvi application, increasing mobile traffic by 25% and improving user satisfaction.
- Utilized Jenkins for Continuous Integration and Continuous Deployment (CI/CD), reducing deployment times by 40% and improving release consistency and reliability.
- Designed and implemented scalable AWS cloud infrastructure using services like EC2, S3, and DynamoDB, ensuring optimal performance and cost efficiency.
- Automated serverless workflows using AWS Lambda and API Gateway, reducing operational overhead and enabling event-driven
Education
- Master of Science | University of the Cumberlands | 2025 | Computer Science
- Bachelor in Science & Technology | Maharaja Ranjit Singh Punjab Technical University | 2019 | Computer Science & Engineering
Certifications
- Algorithmic Toolbox | UC San Diego | 2024