Software Architect - Agentic Evals Job at Datagrid AI, Santa Clara, CA

  • Datagrid AI
  • Santa Clara, CA

Job Description

Fully remote, except for occasional in-person meetings in San Francisco to collaborate.

Bay Area residency required.

We believe that everyone deserves their own personal army of AI helpers with deep access to company data to automate any task. Datagrid continuously ingests business data from 100+ sources, makes it all available to AI, and eliminates grunt work such as categorizing 10k support tickets in minutes.

We are a Series-A startup headquartered in San Francisco, but operate as a distributed company. We offer competitive salaries and health benefits, along with equity and respect for work/life balance.

Join our tight-knit team that ships fast and pushes the boundaries of AI! In the last few months, our Agents learned to use Microsoft Teams, write SQL queries, and automate tasks on complex schedules like “MWF at half past 9”. Our Agents live where people work (Slack, Microsoft Teams, etc.) and automatically take useful actions like producing safety reports from worksite photos.

Responsibilities

Datagrid Agents operate where our customers work: across Teams, Slack, and even SMS. Agents make multistep plans, leverage vectorized data from 100+ sources, use tools like Docusign, and manipulate the Datagrid app. We cannot possibly test all of this manually.

Your job will be to:

  • Work closely with an ex-Googler who built Gemini evals to create a harness for evaluating Agent performance, make that harness available both for local development and in CI/CD pipelines, and set up alerting for when Agents misbehave.
  • Influence and contribute to the extension of Datagrid’s Agentic capabilities.
  • Choose the best open- or closed-source components to build out the testing infrastructure.
  • Integrate publicly available benchmarks such as RAGBench into the testing system.
  • Grant subject matter experts the ability to add to the test library using customer queries, manually authored cases, and synthetically generated questions.
  • Expose evaluation performance so the company can track improvement over time.

Desired Experience

  • Proven track record of building test harnesses for Chat Agents from 0 ⇒ 1.
  • 10+ years of B2B software engineering experience.
  • Ability to write effective LLM prompts without assistance.
  • Proficiency with Node.js and server-side frameworks such as NestJS or Next.js.
  • Familiarity with JavaScript frameworks such as React or AngularJS.
  • Experience with databases such as Weaviate and BigQuery.
  • Experience working with GCP or similar cloud providers.

Salary Range: $200k - $240k

Equity

100% covered medical, dental and vision

401k

All candidates for this role will be asked the following interview question: “Work with me to design a system to evaluate the Agent’s performance at SQL queries.” We don’t expect you to have the perfect answer, but will evaluate you on your ability to clearly explain your thinking.

Job Tags

Local area, Remote job
