Principal Data Engineer
Spotter, named to the TIME100 Most Influential Companies list this year, empowers top YouTube creators to accelerate their business and unleash their full creative potential by giving them access to the capital, knowledge, and community they need to succeed at scale. As the top provider of creator-friendly growth capital, Spotter tailors our investments to meet the unique needs of each creator we partner with, giving them the freedom to create without compromise.
Creators are free to reinvest their funds however they choose, from hiring a team to building their own production studios and everything in between, all while maintaining total control over their catalogs, their channels, and their future earnings. In addition to funding, Spotter provides creators with in-depth data insights into the performance of their existing content, enabling them to leverage the full value of their library, project the value of future uploads, and identify how to improve performance going forward.
Featured in Forbes, Fast Company, Variety, Axios, and more, Spotter has already deployed over $850 million to YouTube creators to reinvest in themselves and accelerate their growth. Spotter has licensed a content catalog of over 725,000 videos, which generate 88 billion monthly watch-time minutes. With our curated premium video catalog, we deliver a unique, scaled media solution to advertisers and ad agencies that is transparent, efficient, and 100% brand safe.
What You’ll Do:
Are you ready to help lead the charge in shaping the data-driven future of Spotter? We're in search of an exceptional Principal Data Engineer who will play a pivotal role in designing, building, and optimizing scalable data infrastructure. You will build data pipelines for the acquisition and transformation of large datasets, and optimize storage and querying across varied data to support a wide range of use cases, from analytics to creator products to operations, using both traditional and ML-focused access patterns. You will be a key player in empowering us to make data-informed decisions that fuel our innovation and growth.
- Develop and maintain scalable data pipelines, including single- and multi-node ETL solutions
- Build data quality assurance steps for new and existing pipelines
- Create derived datasets with augmented properties
- Build analytics-ready datasets to power internal and creator-facing tools
- Troubleshoot issues when they arise, working directly with internal data consumers
- Automate pipeline runs with scheduling and orchestration tools
- Work with large-scale datasets
- Integrate various external APIs to enrich data
- Set up database tables for analytics users to consume the data collected by the Data Engineering team
- Work with big data technologies to improve data availability and data quality in the cloud (AWS)
- Lead development of projects involving other team members and act as a mentor
- Actively participate in team discussions about technology, architecture, and solution choices for new projects, and help improve existing code and pipelines
Who You Are:
- Bachelor’s degree, preferably in Computer Science or Computer Information Systems
- 6+ years of software engineering experience
- 5+ years of data engineering experience with Apache Spark or Apache Flink
- 4+ years of experience running software and services in the cloud
- Proficiency in working with DataFrame APIs (Pandas and Spark) for parallel and single node processing
- Proficiency in Python, Scala, or similar languages, and with modern data-optimized file formats such as Parquet and Avro
- Proficiency with SQL on RDBMS and data warehouse solutions like Redshift
- Hands-on experience with data lake technologies like Delta Lake and Iceberg
- Experience with large-scale, parallelized data acquisition from external APIs
- Experience supporting ML/AI projects: deploying pipelines to compute features and using models for inference on large datasets
Additional Valued Skills:
- Experience with YouTube APIs
- Experience with AWS Glue metastore
- Experience with Data-Mesh approaches
- Experience with data cataloging, data lineage and data governance tools and approaches
- Experience with vector databases
Benefits:
- Medical and vision insurance covered up to 100%
- Dental insurance
- 401(k) matching
- Stock options
- Complimentary gym access
- Autonomy and upward mobility
- Diverse, equitable, and inclusive culture, where your voice matters
In compliance with local law, we are disclosing the compensation, or a range thereof, for roles that will be performed in Culver City. Actual salaries will vary and may be above or below the range based on various factors including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The overall market range for roles in this area at Spotter is typically $100K-$500K salary per year. The range listed is just one component of Spotter's total compensation package for employees. Other rewards may include annual discretionary bonus and equity.
COVID-19 Vaccination Policy
Spotter requires proof of being fully vaccinated for COVID-19 as a condition of commencing employment.
Spotter is an equal opportunity employer. Spotter does not discriminate in employment on the basis of race, religion, creed, color, national origin, ancestry, citizenship, physical or mental disability, medical condition, genetic characteristics or information, marital status, sex (including pregnancy, childbirth, breastfeeding, and related medical conditions), gender, gender identity, gender expression, age, sexual orientation, military status, veteran status, use of or request for family or medical leave, political affiliation, or any other status protected under applicable federal, state or local laws.
Equal access to programs, services and employment is available to all persons. Those applicants requiring reasonable accommodations as part of the application and/or interview process should notify a representative of the Human Resources Department.