Simon Fraser

Hadoop Developer @ Health | Santé

About Simon Fraser

Simon Fraser is a Hadoop Developer with a Bachelor's Degree in Computer Science from the University of Waterloo. He has extensive experience in data extraction, transformation, and loading processes, as well as developing complex MapReduce programs and workflows in Oozie.

Work at Ontario Ministry of Health and Long-Term Care

Simon Fraser has worked as a Hadoop Developer at the Ontario Ministry of Health and Long-Term Care since 2013, where he is responsible for a range of data management tasks. His work includes creating Hive tables based on specific business requirements and developing Hive queries to compare raw data with Enterprise Data Warehouse (EDW) reference tables. He has implemented static and dynamic partitioning in Hive to improve data access efficiency.

Education and Expertise

Simon Fraser studied Computer Science at the University of Waterloo from 1986 to 1990 and earned his Bachelor's Degree there. This academic background provides the foundation for his role as a Hadoop Developer. He has expertise in extract, transform, and load (ETL) processes, particularly with Hadoop and related technologies.

Technical Skills in Hadoop Development

As a Hadoop Developer, Simon Fraser has developed MapReduce programs in Java to parse raw data and populate staging tables. He has implemented complex MapReduce programs, including map-side joins using the distributed cache. His skills extend to defining workflows in Oozie and automating data loading into HDFS with Pig. He has also converted existing SQL queries into HiveQL and designed custom Writable formats in MapReduce.

Data Management and Analysis

Simon Fraser has extensive experience in data management and analysis. He has moved data between HDFS and relational databases using Sqoop and refined website clickstream data from Omniture logs into Hive. His work includes developing User Defined Functions (UDFs) for Pig and Hive in Java, as well as creating Hive tables to transform and analyze data stored in HDFS.
