Amazon’s eCommerce Foundation (eCF) organization is responsible for the core components that drive the Amazon website and customer experience. Serving millions of customer page views and orders per day, eCF builds for scale.
As an organization within eCF, the Business Data Technologies (BDT) group is no exception. We collect petabytes of data from thousands of data sources inside and outside Amazon, including the Amazon catalog system, inventory system, customer order system, page views on the website, and Alexa systems. We also support Amazon subsidiaries such as IMDb and Audible. We provide interfaces for our internal customers to access and query the data hundreds of thousands of times per day, using Amazon Web Services (AWS) Redshift, Hive, and Spark. We build scalable solutions that grow with the Amazon business.
BDT is growing, and the data processing landscape is shifting. Our data is consumed by thousands of teams across Amazon, including Research Scientists, Machine Learning Specialists, Business Analysts, and Data Engineers. Amazon.com is seeking an outstanding Data Engineer to join the BDT Content team. The BDT Content team manages the core Amazon business data from hundreds of source systems. Amazon.com has a culture of data-driven decision-making and demands business intelligence that is timely, accurate, and actionable. If you join the Amazon.com BDT Content team, your work will have an immediate influence on day-to-day decision-making at Amazon.com.
As an Amazon.com Data Engineer II, you will be working in one of the world's largest cloud-based data lakes. You should be skilled in the architecture of data warehouse solutions for the enterprise using multiple platforms (EMR, RDBMS, Columnar, Cloud). You should have extensive experience in the design, creation, management, and business use of extremely large datasets. You should have excellent business and communication skills so that you can work with business owners to develop and define key business questions and build datasets that answer those questions. Above all, you should be passionate about working with huge datasets and love bringing datasets together to answer business questions and drive change.
As a Data Engineer II on Amazon.com’s Business Data Technologies team, you will design, develop, implement, test, and operate large-scale, high-volume, high-performance data structures for analytics and deep learning. Implement data ingestion routines, both real-time and batch, using best practices in data modeling and ETL/ELT processes, leveraging AWS technologies and big data tools. Gather business and functional requirements and translate these requirements into robust, scalable, operable solutions that work well within the overall data architecture. Analyze source data systems and drive best practices in source teams. Participate in the full development life cycle, end-to-end, from design, implementation, and testing to documentation, delivery, support, and maintenance. Produce comprehensive, usable dataset documentation and metadata. Evaluate and make decisions around dataset implementations designed and proposed by peer data engineers. Evaluate and make decisions around the use of new or existing software products and tools. Mentor junior data engineers.
· A desire to work in a collaborative, intellectually curious environment.
· Degree in Computer Science, Engineering, Mathematics, or a related field and 4-5+ years of industry experience
· Must have one year of experience in the following skill(s):
· Developing and operating large-scale data structures for business intelligence analytics using ETL/ELT processes, OLAP technologies, data modeling, and SQL
· Experience with at least one relational database technology such as Redshift, Oracle, MySQL or MS SQL
· Experience with at least one massively parallel processing data technology such as Redshift, Teradata, Netezza, Spark, or a Hadoop-based big data solution
· Industry experience as a Data Engineer or related specialty (e.g., Software Engineer, Business Intelligence Engineer, Data Scientist) with a track record of manipulating, processing, and extracting value from large datasets.
· Coding proficiency in at least one modern programming language (Python, Ruby, Java, etc.)
· Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
· Experience building data products incrementally and integrating and managing datasets from multiple sources
· Query performance tuning skills using Unix profiling tools and SQL
· Experience leading large-scale data warehousing and analytics projects, including using AWS technologies (Redshift, S3, EC2, Data Pipeline) and other big data technologies
· Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
· Experience with Linux/UNIX, including using it to process large data sets
· Experience with AWS
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success and we make recruiting decisions based on your experience and skills. We welcome applications from all members of society irrespective of age, gender, disability, sexual orientation, race, religion or belief.