Hello, I am

PRAYAG VERMA

A Software Engineer

Specializing in the Data Domain!

ABOUT

Prayag Verma

Hello, I'm Prayag Verma.

An M.S. in IT graduate from the University of Texas at Dallas

I'm a software engineer specializing in the data domain, with over four years of experience working with global companies like Infosys, Amdocs, and Briston Infotech.

Currently, I'm actively looking for opportunities in data-driven domains such as data engineering, data/solution architecture, ETL/DWH development, and ETL testing.

EDUCATION

Academic Journey

2023 - 2025

University of Texas at Dallas

Information Technology and Management (ITM)

2015 - 2019

Anna University, Chennai

Computer Science and Engineering (CSE)

2013 - 2015

Makatpurh High School, Giridih

Senior Secondary School (+2)


RESUME

SUMMARY

I'm an innovative, independent, and deadline-driven Data Engineer and Architect with over four years of hands-on experience designing, developing, and testing user-centered enterprise data warehouse solutions.

I've worked across diverse domains, including telecommunications, health insurance, and retail, bringing expertise in data engineering, solution architecture, ETL processing, data pipeline streamlining, and data warehousing to every project.

EDUCATION

2023 - 2025
University of Texas at Dallas
Information Technology and Management (ITM)
2015 - 2019
Anna University, Chennai
Computer Science and Engineering (CSE)
2013 - 2015
Makatpurh High School, Giridih
Senior Secondary School (+2)

DOMAIN KNOWLEDGE

Data Engineering, Data Architect, Solution Architect, Data Analyst, Data Science, Big Data, ETL Development, ETL Testing, Business Analyst, SDE

COURSEWORK

AWS Cloud Solution Architecture, Big Data, Business Data Warehousing, Advanced Statistics for Data Science, Business Analytics with R, Database Foundation for Business Analytics, Predictive Analytics for Data Science, and Prescriptive Analytics.

PROFESSIONAL EXPERIENCE

Infosys (Data Engineer)

  • Created and maintained scalable data pipelines using Azure Data Factory and Databricks, demonstrating Ownership by ensuring seamless integration of structured and unstructured data and improving processing efficiency by 10%.
  • Led the migration of flat files and tables (10+ TB each) from IPC workflows to ADLS, Snowflake, and Azure Synapse Analytics, showcasing Think Big by enabling cloud-native scalability and automation.
  • Optimized databases via denormalization and table restructuring, deployed auto-scaling with Azure Monitor (Frugality: cut costs by 8%), and integrated API services, improving query performance by 12% and data consistency by 15%.

Amdocs (Data Engineer)

  • Designed scalable ADF pipelines to ingest and process terabytes of relational and non-relational real-time streaming data, leveraging Teradata, ADLS, Databricks, Azure Event Hubs, and Spark, increasing efficiency by 25%.
  • Migrated 150+ KornShell-based legacy OLTP systems to Azure Data Lake and Snowflake, while building real-time streaming solutions using Azure Event Hubs and Airflow, displaying Are Right, A Lot by ensuring accurate real-time data processing.
  • Modeled Power BI dashboards using DAX, providing real-time customer and billing insights, while manifesting Earn Trust by documenting data mappings, technical specifications, and API integrations for 100% transparency and compliance.

Briston Infotech (Data Engineer)

  • Streamlined cross-domain ETL solutions for the healthcare and retail industries using Informatica PowerCenter, ensuring seamless integration of tabular and non-tabular data into Snowflake and Oracle E-DWH with 99% reliability.
  • Spearheaded query performance tuning by implementing indexing, fact and dimension tables, and star/snowflake schema data modeling techniques, cutting query execution time by 8%.
  • Automated 100+ ETL workflows using the TIDAL scheduler, reducing manual intervention and ensuring seamless data integration.

SKILLS

PROGRAMMING LANGUAGES

Python: 90%
NumPy: 85%
Pandas: 75%
Shell Scripting: 80%

ETL / ORCHESTRATION

ADF: 90%
Airflow: 85%
AWS Glue: 75%
PowerCenter: 70%
DBT: 70%
TIDAL: 65%

STREAMING

Fabric: 80%
Kafka: 75%
Flink: 65%

VISUALIZATION

Power BI: 95%
Tableau: 75%

CI/CD & VERSION CONTROL

Azure Pipelines: 70%
Jenkins: 65%
GitHub Actions: 70%
GitLab: 75%

DATABASES

MySQL: 90%
Oracle: 85%
PL/SQL: 80%
MS SQL Server: 75%
Teradata: 70%
SnowSQL: 80%
MongoDB: 75%
NoSQL: 65%

CLOUD TECHNOLOGIES

Databricks: 90%
ADLS: 95%
Cosmos DB: 80%
Stream Analytics: 70%
Teradata: 65%
Redshift: 70%
DynamoDB: 75%
Athena: 65%
QuickSight: 65%

BIG DATA TECHNOLOGIES

Hadoop: 80%
Hive: 85%
HDFS: 90%
Spark SQL: 70%
Scala: 60%
Sqoop: 75%
Impala: 75%
MapReduce: 65%
HBase: 60%

CONCEPTS/METHODOLOGY

SDLC: 90%
STLC: 95%
Agile: 95%
Data Modeling: 80%
Data Warehousing: 85%
Data Architecting: 90%

PROJECTS

Car Hail Damage Repair

A comprehensive solution that streamlines vehicle damage assessment using AI technology. This innovative system allows users to upload images of damaged vehicles and receive instant analysis of damage severity and repair options through a portable device.

User-Friendly Web App with Flask
Smart Image Processing with OpenCV and Pillow
Machine Learning Analysis powered by TensorFlow
Efficient Data Processing with NumPy
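
For illustration only, here is a minimal sketch of how such an upload-and-score flow could be wired together with Flask, Pillow, NumPy, and TensorFlow. The model file, route name, and severity labels are placeholders, not the project's actual code.

```python
# Minimal sketch of an image upload-and-score flow (assumed file/model names).
import numpy as np
from flask import Flask, jsonify, request
from PIL import Image
import tensorflow as tf

app = Flask(__name__)
# "damage_classifier.h5" is a placeholder; the real model artifact may differ.
model = tf.keras.models.load_model("damage_classifier.h5")
LABELS = ["minor", "moderate", "severe"]  # hypothetical severity classes

@app.route("/assess", methods=["POST"])
def assess():
    # Read the uploaded image and normalize it to the model's input shape.
    img = Image.open(request.files["image"].stream).convert("RGB").resize((224, 224))
    batch = np.asarray(img, dtype="float32")[None, ...] / 255.0
    probs = model.predict(batch)[0]
    return jsonify({"severity": LABELS[int(np.argmax(probs))],
                    "confidence": float(np.max(probs))})

if __name__ == "__main__":
    app.run(debug=True)
```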

Digital License Management

A comprehensive platform for generating and managing digital license keys with seamless API integration. This solution simplifies the software licensing process for developers and businesses, ensuring secure key distribution and validation.

Secure License Key Generation
API Integration for Third-party Applications
User Authentication and Authorization
Integrated Support System

ETLQC - Data Testing Platform

A robust web-based application designed to validate and test data across various sources including flat files, relational databases, and APIs. ETLQC ensures data accuracy and reliability throughout ETL/ELT processes, making it an essential tool for data engineers and quality assurance teams.

Multi-source Data Validation
Automated Testing Workflows
Comprehensive Reporting Dashboard
ETL/ELT Process Integration

Car Auction Data Analysis

A comprehensive analysis of car auction data encompassing both electric and non-electric vehicles from various manufacturers. This project leverages Python and Object-Oriented Programming principles to extract valuable insights from automotive market data.

Electric vs. Non-Electric Vehicle Analysis
Price Prediction Models
Automotive Market Trend Analysis
OOP-based Data Processing Framework

Microsoft Azure Projects

A collection of end-to-end ETL/Data Engineering solutions implemented using Microsoft Azure services. This repository showcases expertise in cloud-based data processing, from minor tasks to full-scale applications, demonstrating versatile skills in modern data architecture.

Cloud-native Data Solutions
Real-time Data Processing
Scalable ETL Architectures
End-to-End Data Pipeline Management

SQL From Zero to Hero

A comprehensive educational repository featuring a carefully curated series of SQL exercises designed to take users from beginner to advanced level. Each exercise includes detailed schemas, challenging questions, and thoroughly explained solutions to build strong SQL fundamentals.

Progressive Learning Path
Real-world Problem Scenarios
Detailed Explanations and Best Practices
Diverse Database Schema Examples
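
To give a flavor of the exercise format (schema, question, solution), here is a small illustrative example using an invented schema rather than one taken from the repository:

```python
# One self-contained exercise in the spirit of the series (illustrative schema and data).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'Asha', 120.0), (2, 'Asha', 80.0), (3, 'Ben', 200.0), (4, 'Ben', 40.0);
""")

# Question: which customers have spent more than 150 in total?
solution = """
    SELECT customer, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 150
    ORDER BY total_spent DESC;
"""
for row in conn.execute(solution):
    print(row)   # ('Ben', 240.0), ('Asha', 200.0)
```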

CERTIFICATION

BLOGS

Coming Soon

Stay tuned for insightful articles and tutorials on data engineering, cloud technologies, and more!

CONTACT

Let's Connect

Feel free to reach out for opportunities, collaborations, or just to say hello!

Location

Seattle, USA

Website

www.prayagverma.com

Send A Message


FAQ

Frequently Asked Questions

Here are some common questions about my background, skills, and how we can work together.

What are your core areas of expertise in data engineering?

My expertise centers around building scalable data pipelines, data warehousing solutions, and ETL/ELT processes. I specialize in cloud platforms like Azure and AWS, with strong skills in Python, SQL, Spark, and modern data tools like Databricks, ADF, and Airflow. I'm particularly strong in designing data architectures that balance performance, cost, and maintainability.

How do you approach data quality and governance in your projects?

I believe data quality is the foundation of any successful data initiative. My approach includes implementing robust validation rules, automated testing pipelines, and monitoring systems to catch issues early. For governance, I work to establish clear data ownership, lineage tracking, and documentation practices. I've developed custom data quality frameworks that integrate with ETL processes to ensure consistency and reliability throughout the data lifecycle.
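
As a rough illustration of what a rule-based check of this kind can look like (column names, rules, and thresholds are made up for the example, not taken from any specific framework):

```python
# Illustrative rule-based data quality checks on a batch (hypothetical rules).
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable rule violations for a batch."""
    failures = []
    if df["customer_id"].isna().any():        # completeness rule
        failures.append("customer_id contains nulls")
    if not df["order_id"].is_unique:           # uniqueness rule
        failures.append("order_id has duplicates")
    if (df["amount"] < 0).any():                # validity rule
        failures.append("amount has negative values")
    return failures

# Example batch; in a pipeline this would come from the ETL step under test.
batch = pd.DataFrame({"order_id": [1, 2, 2],
                      "customer_id": ["A", None, "C"],
                      "amount": [10.0, -5.0, 30.0]})
for issue in run_quality_checks(batch):
    print("QUALITY FAILURE:", issue)
```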

Can you tell me about your experience with real-time data processing?

I've designed and implemented several real-time data processing systems using technologies like Apache Kafka, Azure Event Hubs, and Stream Analytics. One notable project involved creating a real-time customer analytics platform that processed millions of events per hour with sub-second latency. This system used a combination of stream processing for immediate insights and batch processing for historical analysis, providing business users with both real-time dashboards and comprehensive reporting capabilities.
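
For illustration, a minimal streaming-consumer sketch using kafka-python; the topic name, payload fields, and the toy running aggregate are assumptions for the example, not the actual platform code:

```python
# Minimal Kafka consumer sketch keeping a tiny rolling aggregate per customer.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "customer-events",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="latest",
)

totals: dict[str, float] = {}
for message in consumer:
    event = message.value
    totals[event["customer_id"]] = totals.get(event["customer_id"], 0.0) + event["amount"]
    print(event["customer_id"], "running total:", totals[event["customer_id"]])
```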

How do you handle large-scale data migrations?

Large-scale data migrations require careful planning and execution. My approach involves thorough source system analysis, detailed mapping documentation, and creating a robust testing strategy before any migration begins. I typically implement the migration in phases, starting with a proof of concept followed by incremental migrations when possible. Throughout the process, I use automated validation to verify data integrity and completeness. I also design fallback mechanisms and maintain parallel systems during the transition period to minimize business disruption.
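
A minimal sketch of the kind of automated reconciliation that can run after each migration phase; the table, column, and in-memory databases below are illustrative stand-ins, and real checks would run against the actual source and target connections:

```python
# Compare row counts and a numeric checksum between source and target tables.
import sqlite3

def reconcile(src, tgt, table: str, numeric_col: str) -> bool:
    """Return True if source and target agree on basic completeness checks."""
    checks = [
        f"SELECT COUNT(*) FROM {table}",
        f"SELECT ROUND(COALESCE(SUM({numeric_col}), 0), 2) FROM {table}",
    ]
    ok = True
    for sql in checks:
        src_val = src.execute(sql).fetchone()[0]
        tgt_val = tgt.execute(sql).fetchone()[0]
        if src_val != tgt_val:
            print(f"MISMATCH for {sql!r}: source={src_val} target={tgt_val}")
            ok = False
    return ok

# Tiny demo with in-memory databases standing in for the legacy and cloud systems.
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
for db in (src, tgt):
    db.execute("CREATE TABLE billing (id INTEGER, amount REAL)")
    db.executemany("INSERT INTO billing VALUES (?, ?)", [(1, 10.5), (2, 20.0)])
print("migration phase valid:", reconcile(src, tgt, "billing", "amount"))
```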

How can I contribute to your blog section?

I welcome guest contributions on topics related to data engineering, cloud technologies, analytics, or software development! To contribute, simply reach out through the contact form with the subject "Blog Contribution" and include a brief outline of your proposed topic. The ideal length is 1000-2000 words, and I encourage practical, hands-on content that provides value to readers. You'll receive full attribution for your work, and it's a great way to share your knowledge with the community while gaining exposure for your expertise.

What topics are suitable for blog contributions?

The blog welcomes a wide range of technical topics, including but not limited to:

  • Data engineering best practices and patterns
  • Cloud architecture and implementation guides
  • ETL/ELT techniques and tools
  • Performance optimization for data pipelines
  • Big data technologies and frameworks
  • Data modeling and warehouse design
  • Data quality and testing strategies
  • Python, SQL, or Spark tutorials
  • Real-world case studies and problem solving
  • Emerging technologies in the data space

The most valuable contributions share practical insights, provide code examples when relevant, and offer actionable takeaways for readers.

What technologies do you enjoy working with most?

I particularly enjoy working with Azure Databricks, Python, and modern data pipeline orchestration tools like Apache Airflow. I find Databricks especially powerful for its unified analytics platform that combines the best of data engineering and data science capabilities. On the AWS side, I'm excited about the capabilities of services like Glue, Redshift, and Step Functions for building serverless data workflows. I'm also increasingly interested in the intersection of data engineering with MLOps, and how we can build better pipelines to support model training and deployment.

Are you available for freelance or consulting work?

Yes, I'm selectively available for freelance consulting on data engineering projects, particularly those involving complex data architectures, performance optimization, or cloud migrations. I can provide services ranging from architecture review and technical guidance to hands-on implementation and team mentoring. If you have a project in mind, please reach out through the contact form with details about your needs and timeline, and we can discuss how I might be able to help.