Cloud & Dev Operations Engineer

ThirdLaw
Full-time
Remote
Operations

About the Challenge We're Tackling:

As enterprises integrate LLMs into their existing applications, traditional observability tools fall short in addressing the unique safety and operational risks posed by LLM interactions. These tools are adept at monitoring conventional metrics like rate limits, latency, and cost breakdowns but lack the capacity to assess the stochastic risks inherent in LLM inputs, outputs, and inter-LLM communications. This gap represents the primary barrier to confidently deploying LLMs in enterprise settings. At ThirdLaw, we empower IT and Security teams with the tools to answer the foundational question: "Is this OK?" and take decisive action when it isn't. We provide the next-generation monitoring solutions necessary to evaluate, investigate, and mitigate the unique risks associated with LLM deployments.

About the role:

AI is reshaping software development, enterprise knowledge management, and the way work gets done. By giving IT and Security professionals the tools to make sure AI is doing everything it should, and nothing it shouldn’t, you’ll be enabling the safest path to a wave of incredible AI-powered innovation. This role is responsible for ensuring the availability, reliability, and performance of cloud infrastructure and services. This includes CI/CD, automation, and infrastructure as code, as well as sensible and cost-effective choices on cloud infrastructure and services.

What you’ll be doing:

  • Cloud Operations: You’ll work within a small but mighty team of AI engineers and backend engineers to provision and manage cloud resources, establish observability and incident response, enforce security and compliance controls, and optimize costs.

  • Deployment Operations: Build the deployment and maintenance infrastructure to support complex hybrid deployments that work across cloud-hosted and customer-hosted platform components.

  • Development Operations: Build and maintain CI/CD pipelines, use Terraform to enable repeatable and scalable infrastructure, manage deployments, and ensure fast identification and resolution of issues across any environment.

  • Every day, you will lay the foundation for our service. Most of this work is first-tracks / ground up / from scratch, with your impact as clear as day. This is an enterprise solution and has real expectations around reliability, security and scalability.

  • Start-up responsibility: You are the first and often last stop on whether our service is good or great.

Skills and Qualities you’ll need to bring:

  • Cloud Infra Expert: Expertise in major cloud providers such as AWS and Azure, including provisioning and managing services (EC2, VPC, S3, etc.). Experience with IaC tools like Terraform, CloudFormation, or Ansible to manage and version infrastructure declaratively.

  • CI/CD Pipeline Development: Proficiency in setting up and managing CI/CD pipelines with tools like Jenkins, GitLab CI, or GitHub Actions to automate software builds, tests, and deployments.

  • Containerization & Orchestration: Skills in Docker and Kubernetes for packaging applications, scaling deployments, and managing dependencies. Good understanding of Docker and Kubernetes internals.

  • Infrastructure Monitoring & Incident Management: Familiarity with monitoring tools (e.g., CloudWatch, Datadog) and incident management best practices to ensure high uptime and performance.

  • Security & Compliance: Skills in managing identity and access management (IAM), encryption, network security, and compliance frameworks (e.g., SOC 2, GDPR).

  • Cost Optimization: Ability to analyze cloud usage patterns, manage budgets, and implement cost-saving measures (e.g., rightsizing, spot instances).

  • AI-first: Interest and willingness to learn concepts in artificial intelligence, machine learning, and deep neural networks. You are excited about the possibilities of LLMs.

Nice-to-have:

  • Ideally, you live in the Bay Area or want to be here enough to collaborate in person sometimes, but we are able to work with anyone in the continental United States.

Join us as we pursue our mission to unlock the boundless possibilities of generative AI by ensuring AI trust and safety. We're looking for people who bring thoughtful ideas and aren't afraid to challenge the norm. Our team is small and focused, valuing autonomy and real impact over titles and management. We need strong technical skills, a proactive mindset, and clear written communication, as much of our work is asynchronous. Our product is new and operates in a rapidly changing ecosystem of generative AI; we are builders with the ability to dispatch ambiguity to solve customer pain. If you're organized, take initiative, and want to work closely with customers to shape our products, you'll fit in well here.