Site Reliability Engineer

Job Locations 
US-NY-New York
US-CA-Santa Monica
US-CA-San Francisco


MediaMath is a global technology company that's leading the movement to revolutionize traditional marketing and empowering marketers to unleash the power of goal-based marketing at scale, transparently across the enterprise. Our platform - TerminalOne Marketing Operating System - handles billions of transactions every hour and hundreds of millions of internet users every day, which means every solution must be built to scale. Our breakthroughs create new marketplaces and solve long-standing problems in an industry that is constantly evolving. Our engineers are building the leading technology platform to power the new digital marketing ecosystem, and we are looking for driven, curious innovators to join our team.


MediaMath is currently seeking a Site Reliability Engineer II. As an SRE II, you will be front-and-center in the effort to keep our distributed services fast and reliable, 100% of the time. Our systems range from a large, globally distributed RTB bidding platform that services more than 3 million real-time transactions a second, to clusters of user databases that host tens of billions of records, to AWS EMR clusters, and more.


  • Manage the scalability, performance, and availability of the MediaMath RTB bidding platform by solving for reliability against existing systems and services spanning the entire stack.

  • Develop tools and automation to minimize delivery time and increase developer productivity.
  • Participate in the design and development of new and evolving services, architecture, and performance standards.

  • Participate in and strongly influence capacity planning and service performance analysis and tuning.
  • Influence the development of best practices for deployment, monitoring, and alerting.

  • Support team members in the development of a SOA strategy and migration path.

  • Respond to and resolve emergent issues. Be on-call periodically as part of a shared team.
  • Begin to mentor and coach junior team members.
  • This is not an exhaustive list of responsibilities. Other duties may be assigned, as needed. MediaMath retains the right to change job duties at any time.
  • As part of our global technology team, you may be required to work off-hours or be on-call on a rotating basis.
  • You are considered a “security employee”: your role has a particularly noteworthy security aspect, and you are required to undergo additional training annually.
  • Administer and ensure logical security in carrying out all job duties.
  • Support security incident response and monitoring, as needed.



The top qualification for this role, above all else, is a strong desire to be part of something big, where input is encouraged and results are rewarded.


Experience Requirements

  • 5-7 years of relevant work experience, including experience in a high-volume, production distributed systems environment.


Preferred Skills

  • Extensive working experience with Linux systems (Debian-based).
  • Familiarity with cloud infrastructure, such as AWS.
  • High-level shell fluency, plus one or more scripting languages (Python, Perl, or similar).
  • Experience managing and deploying full stack, distributed services.
  • Experience with container technologies (Docker, Vagrant, LXC, etc.).
  • Experience with system automation tools (Ansible, Chef, Puppet, SaltStack, etc.).
  • Experience with monitoring, alerting, and pipeline analysis tools (Nagios, Sensu, Graphite, Riemann, Logstash, etc.).
  • Excellent analytical skills, coupled with a strong sense of ownership, urgency and drive.
  • Experience with queuing/data-pipelining solutions (Storm, RabbitMQ, Amazon Kinesis, ZeroMQ, Kafka, etc.).
  • Experience with SQL/NoSQL systems such as PostgreSQL, MongoDB, Redis, Cassandra, DynamoDB, etc.


How do we reward our outstanding Mathletes? We start with company equity; comprehensive medical, dental, and vision coverage; short-term and long-term disability and life insurance; open paid time off; free on-site chair massages; and our 401(k) with 401(k) matching. We then serve up flexible spending accounts, bagel Fridays, free snacks and sodas, and the latest and greatest technology you need to do your job (including a cellphone bill allowance). And the cherry on top? Regular happy hours and events, including potlucks, trivia nights, and pick-up basketball, to name a few.


MediaMath employees can expect to work in an environment governed by Math Values: Obsess About Outcomes; Innovate to Scale; Win/Win Wins; Make Decisions, Take Responsibility.