Data Engineer, Speed Products, Science and Analytics
DESCRIPTION
As a Data Engineer, you will work in one of the world's largest and most complex data warehouse environments. You will design, implement, and support scalable data infrastructure solutions that integrate with multiple heterogeneous data sources, aggregate and retrieve data quickly and safely, and curate data for use in reporting, analysis, machine learning models, and ad-hoc data requests. You will manage multiple Redshift clusters that support reporting needs for the transportation org. You will be exposed to cutting-edge AWS big data technologies. You should have excellent business and communication skills so that you can work with business owners and tech leaders to gather infrastructure requirements, design data infrastructure, and build data pipelines and datasets that meet business needs. You stay abreast of emerging technologies, investigating and implementing them where appropriate.
Key job responsibilities
- Design and develop the pipelines required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Python and AWS big data technologies.
- Oversee and continually improve production operations, including optimizing data delivery, re-designing infrastructure for greater scalability, code deployments, bug fixes and overall release management and coordination.
- Establish and maintain best practices for the design, development and support of data integration solutions, including documentation.
- Work closely with Product teams, Data Scientists, Software Developers, and Business Intelligence Engineers to explore new data sources and deliver the data.
- Read, write, and debug data processing and orchestration code written in Python, Scala, etc., following coding best practices (e.g., version control, code review).
BASIC QUALIFICATIONS
- 1+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience with one or more query languages (e.g., SQL, PL/SQL, DDL, MDX, HiveQL, SparkSQL, Scala)
- Experience with one or more scripting languages (e.g., Python, KornShell)
PREFERRED QUALIFICATIONS
- Experience with big data technologies such as Hadoop, Hive, Spark, and EMR
- Experience with an ETL tool such as Informatica, ODI, SSIS, BODI, or DataStage
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $91,200/year in our lowest geographic market up to $185,000/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.