We aim to break down the walls between development and operations, and to help find and build solutions that let teams deliver software updates in a highly stable and operationally sound way. We are strongly invested in the AWS Cloud, infrastructure-as-code, and monitoring-as-code. We favor the practical and pragmatic over the ideal, which includes finding right-sized solutions. We are anticipatory and forward-looking, reliable, and biased toward taking action. We understand that without our customers our efforts are worthless, and that operational changes are likely to have a direct impact on user experience. We understand that uptime is paramount, and we work backwards from there.
location: Boston, Massachusetts
job type: Contract
salary: $55 - $65 per hour
work hours: 9am to 5pm
responsibilities:
Essential Accountabilities:
Leadership:
- Listening to the needs of our teams, learning how they work best, and delivering solutions.
- The ability to collaborate with product teams and technical principals to prioritize our efforts.
- Stay current on industry trends; conceive and present to management ways to improve current practices, strengthen our standing in the marketplace, and remain on the cutting edge of technology.
- Ability to take ownership of a project, drive it forward, "sell" it to other teams inside the company as a solution for a given problem, and work with teams to drive adoption.
- If you see an opportunity to solve a problem or otherwise make something better, take the initiative.
- Mentor team members; foster growth by setting high-reaching goals and providing support as needed to achieve them.
- Hands-on design, understanding, and troubleshooting of highly distributed, large-scale production systems, both modern and legacy, monolithic and microservice-based.
- Co-ownership with the development teams over reliability, uptime, capacity, and performance.
- Ensuring the repeatability, traceability, and transparency of our infrastructure automation.
- Identifying highest-impact opportunities to optimize existing systems; ensuring "right-sized" and cost-optimized solutions in consideration of technical and business constraints.
- System design consulting for teams seeking to leverage or improve their production infrastructure.
- Anticipate, build, and plan capacity for upcoming product/feature launches.
- Working with application teams and product principals to fully operationalize software/systems projects (including security requirements).
- Being part of an on-call rotation spread amongst the rest of the team. (The better we do at the things above, the quieter the rotation is!)
- Must have the skills of a senior (or higher) level software application engineer.
- Must have the skills of a senior (or higher) level cloud operations engineer.
- Ability to translate knowledge and ideas into written documentation and one-pagers.
- Excellent presentation and communication skills.
- Mastery of AWS services (IAM, EC2, S3, EBS/EFS, ELB/ALB, Auto Scaling, RDS and replication techniques, VPC, Subnets, Elastic IP, Route 53, CloudWatch, CloudFront, Lambda, CloudFormation, ECS, SNS, ElastiCache).
- Expertise in container/container-fleet-orchestration technologies (Kubernetes, ECS, Docker).
- Expertise integrating continuous integration and continuous delivery (CI/CD) into the software development lifecycle of one or more applications (using Jenkins, CircleCI, Travis CI, or other modern CI tools).
- Expertise in infrastructure automation technologies (e.g., Terraform, CloudFormation).
- Expertise with Lean/Agile deployment processes (e.g., blue/green, zero downtime, canary, and DNS strategies).
- Significant experience troubleshooting interactions among concurrent and distributed systems.
- Cloud database operations and deployment experience (e.g., RDS MySQL/Postgres/Aurora), as well as caching operations and deployments (e.g., Memcached, Redis).
- Ability to design and manage escalation response plans, from monitoring, to reaction/response/remediation, to retrospection/post-mortem, in culturally aligned (proactive, customer-focused, collaborative, proven-with-data) ways.
- Familiarity with site and infrastructure monitoring systems (e.g., CloudWatch, Datadog, New Relic, Sumo Logic, ThousandEyes).
- Cloud and container-native Linux administration/build/management skills (e.g., AMIs, Packer).
- Strong problem-solving, root cause understanding, and systems engineering skills.
- Expertise with distributed source code management systems and branching workflows (e.g., Git/Mercurial, Git Flow, GitHub Flow).
- B.S. Degree in Computer Science (or related technical field, or equivalent industry experience).
- A non-trivial background in open source is a HUGE plus.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.