Build an Autonomous Wet-Lab Protocol Planner and Validator Using Salesforce CodeGen for Agentic Experiment Design and Safety Optimization

In this tutorial, we build a Wet-Lab Protocol Planner & Validator that acts as an agent for experimental design and execution. We write the system in Python and integrate Salesforce’s CodeGen-350M-mono model to generate free-text optimization suggestions. We structure the pipeline into modular components: ProtocolParser for extracting structured data, such as steps, durations, and temperatures, from textual protocols; InventoryManager for validating reagent availability and expiry; SchedulePlanner for generating timelines and finding parallelizable steps; and SafetyValidator for identifying biosafety or chemical hazards. The LLM then proposes optimizations, closing the loop between perception, planning, validation, and refinement.

import re, json, pandas as pd
from datetime import datetime, timedelta
from collections import defaultdict
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch


MODEL_NAME = "Salesforce/codegen-350M-mono"
print("Loading CodeGen model (30 seconds)...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
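model = AutoModelForCausalLM.from_pretrained(
   MODEL_NAME,
   # float16 is intended for GPU inference; fall back to float32 on CPU.
   torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
   device_map="auto",  # requires the `accelerate` package
)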
print("✓ Model loaded!")

We begin by importing essential libraries and loading the Salesforce CodeGen-350M-mono model locally for lightweight, API-free inference. We initialize both the tokenizer and model with float16 precision and automatic device mapping to ensure compatibility and speed on Colab GPUs.

class ProtocolParser:
   def read_protocol(self, text):
       steps = []
       lines = text.split('\n')
       for i, line in enumerate(lines, 1):
           step_match = re.search(r'^(\d+)\.\s+(.+)', line.strip())
           if step_match:
               num, name = step_match.groups()
               context="\n".join(lines[i:min(i+4, len(lines))])
               duration = self._extract_duration(context)
               temp = self._extract_temp(context)
               safety = self._check_safety(context)
               steps.append({
                   'step': int(num), 'name': name, 'duration_min': duration,
                   'temp': temp, 'safety': safety, 'line': i, 'details': context[:200]
               })
       return steps
  
   def _extract_duration(self, text):
       text = text.lower()
       if 'overnight' in text: return 720
       match = re.search(r'(\d+)\s*(?:hour|hr|h)(?:s)?(?!\w)', text)
       if match: return int(match.group(1)) * 60
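       # Check ranges such as "10-15 min" before single values; otherwise the
       # single-value pattern matches "15 min" first and the range is never reached.
       match = re.search(r'(\d+)\s*-\s*(\d+)\s*(?:min|minute)', text)
       if match: return (int(match.group(1)) + int(match.group(2))) // 2
       match = re.search(r'(\d+)\s*(?:min|minute)(?:s)?', text)
       if match: return int(match.group(1))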
       return 30
  
   def _extract_temp(self, text):
       text = text.lower()
       # Freezer first; the lookbehind stops "4°" from matching inside "24°" or "-4°".
       if re.search(r'-\s*(?:20|80)\s*°', text): return 'FREEZER'
       if re.search(r'(?<![\d.-])4\s*°', text): return '4C'
       if re.search(r'(?<![\d.-])37\s*°', text): return '37C'
       # Room temperature is the default when nothing else is stated.
       return 'RT'
  
   def _check_safety(self, text):
       flags = []
       text_lower = text.lower()
       if re.search(r'bsl-[23]|biosafety', text_lower): flags.append('BSL-2/3')
       if re.search(r'caution|corrosive|hazard|toxic', text_lower): flags.append('HAZARD')
       if 'sharp' in text_lower or 'needle' in text_lower: flags.append('SHARPS')
       if 'dark' in text_lower or 'light-sensitive' in text_lower: flags.append('LIGHT-SENSITIVE')
       if 'flammable' in text_lower: flags.append('FLAMMABLE')
       return flags


class InventoryManager:
   def __init__(self, csv_text):
       from io import StringIO
       self.df = pd.read_csv(StringIO(csv_text))
       self.df['expiry'] = pd.to_datetime(self.df['expiry'])
  
   def check_availability(self, reagent_list):
       issues = []
       for reagent in reagent_list:
           reagent_clean = reagent.lower().replace('_', ' ').replace('-', ' ')
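           # Escape the words so characters like "(" or "+" cannot break the regex.
           words = [re.escape(w) for w in reagent_clean.split()[:2]]
           if not words: continue
           matches = self.df[self.df['reagent'].str.lower().str.contains(
               '|'.join(words), na=False, regex=True
           )]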
           if matches.empty:
               issues.append(f"❌ {reagent}: NOT IN INVENTORY")
           else:
               row = matches.iloc[0]
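               # NOTE: the source listing is garbled from here through extract_reagents.
               # The lines below are a reconstruction; the thresholds are assumptions.
               if row['expiry'] < pd.Timestamp.now():
                   issues.append(f"⚠️ {reagent}: EXPIRED on {row['expiry'].date()}")
               elif (row['expiry'] - pd.Timestamp.now()).days < 30:
                   issues.append(f"⚠️ {reagent}: expires soon ({row['expiry'].date()})")
               if row['quantity'] < 10:
                   issues.append(f"⚠️ {reagent}: LOW STOCK ({row['quantity']} {row['unit']})")
       return issues
  
   def extract_reagents(self, text):
       # Reconstructed heuristic: capture the word before a common reagent noun.
       pattern = r'\b([a-z][a-z\-]*\s+(?:antibody|buffer|substrate|solution))\b'
       matches = re.findall(pattern, text, re.IGNORECASE)
       reagents = {m.strip().lower() for m in matches if len(m) > 2}
       return list(reagents)[:15]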

We define the ProtocolParser and InventoryManager classes to extract structured experimental details and verify reagent inventory. We parse each protocol step for duration, temperature, and safety markers, while the inventory manager validates stock levels, expiry dates, and reagent availability through fuzzy matching.
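To make the duration heuristic concrete, here is a minimal standalone sketch. The function name `extract_duration` is hypothetical and not part of the tutorial's classes. It follows the parser's conventions (overnight counts as 720 minutes, 30 minutes is the default) and assumes that ranges should be tested before single values, so that "10-15 minutes" averages to 12 rather than matching "15 minutes".

```python
import re

def extract_duration(text: str) -> int:
    """Return an estimated duration in minutes for one protocol line."""
    text = text.lower()
    if "overnight" in text:
        return 720  # parser convention: overnight = 12 h
    hours = re.search(r"(\d+)\s*(?:hours?|hrs?|h)\b", text)
    if hours:
        return int(hours.group(1)) * 60
    # Ranges first: "10-15 min" should average, not match "15 min".
    rng = re.search(r"(\d+)\s*-\s*(\d+)\s*min", text)
    if rng:
        return (int(rng.group(1)) + int(rng.group(2))) // 2
    single = re.search(r"(\d+)\s*min", text)
    if single:
        return int(single.group(1))
    return 30  # default when no duration is stated

# Lines taken from the sample ELISA protocol used later in the tutorial.
for line in [
    "Incubate at 4°C overnight (12-16 hours)",
    "Incubate 1 hour at room temperature",
    "Incubate 30 minutes at room temperature",
    "Incubate 10-15 minutes (monitor color development)",
    "Wash 5× with PBS-T",
]:
    print(f"{extract_duration(line):>4} min  <- {line}")
```

The last line has no stated duration, so it falls through to the 30-minute default; a real protocol would likely want wash steps timed separately.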

class SchedulePlanner:
   def make_schedule(self, steps, start_time="09:00"):
       schedule = []
       current = datetime.strptime(f"2025-01-01 {start_time}", "%Y-%m-%d %H:%M")
       day = 1
       for step in steps:
           start, start_day = current, day
           end = current + timedelta(minutes=step['duration_min'])
           if step['duration_min'] > 480:
               # Overnight steps start today and finish when work resumes at
               # 09:00 the next day (timedelta also avoids malformed dates past day 9).
               day += 1
               end = (current + timedelta(days=1)).replace(hour=9, minute=0)
           schedule.append({
               'step': step['step'], 'name': step['name'][:40],
               'start': start.strftime("%H:%M"), 'end': end.strftime("%H:%M"),
               'duration': step['duration_min'], 'temp': step['temp'],
               'day': start_day, 'can_parallelize': step['duration_min'] > 60,
               'safety': ', '.join(step['safety']) if step['safety'] else 'None'
           })
           # The next step begins when this one ends.
           current = end
       return schedule

We implement the SchedulePlanner and SafetyValidator to design efficient experiment timelines and enforce lab safety standards. We dynamically generate daily schedules, identify parallelizable steps, and validate potential risks, such as unsafe pH levels, hazardous chemicals, or biosafety-level requirements.
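The agent loop later in this tutorial calls `planner.optimize_parallelization(schedule)` and `SafetyValidator().validate(steps)`. The following is a minimal sketch of what those two pieces might look like, not the tutorial's exact rules. It assumes that steps flagged `can_parallelize` on the same day can overlap, so only the longest one counts toward the total, and that safety checks are simple regex rules over each step's `details` text. The rule list and the pH limits (4 to 10) are hypothetical.

```python
import re
from collections import defaultdict

def optimize_parallelization(schedule):
    """Group parallelizable steps by day; return (tips, minutes_saved)."""
    by_day = defaultdict(list)
    for item in schedule:
        if item["can_parallelize"]:
            by_day[item["day"]].append(item)
    tips, saved = [], 0
    for day, items in sorted(by_day.items()):
        if len(items) < 2:
            continue
        # Assumption: overlapping steps cost only as long as the longest one.
        gain = sum(i["duration"] for i in items) - max(i["duration"] for i in items)
        steps = ", ".join(str(i["step"]) for i in items)
        tips.append(f"Day {day}: steps {steps} could overlap (save ~{gain} min)")
        saved += gain
    return tips, saved

class SafetyValidator:
    # Hypothetical rule set: (regex on step details, message).
    RULES = [
        (r"bsl-[23]", "biosafety cabinet required"),
        (r"corrosive|toxic|flammable", "hazardous chemical: check PPE"),
    ]

    def validate(self, steps):
        risks = []
        for step in steps:
            text = step.get("details", "").lower()
            for pattern, message in self.RULES:
                if re.search(pattern, text):
                    risks.append(f"Step {step['step']}: {message}")
            # Flag strongly acidic or basic buffers (assumed limits: pH 4-10).
            for ph in re.findall(r"\bph\s*(\d+(?:\.\d+)?)", text):
                if not 4 <= float(ph) <= 10:
                    risks.append(f"Step {step['step']}: extreme pH {ph}")
        return risks

schedule = [
    {"step": 1, "day": 1, "duration": 120, "can_parallelize": True},
    {"step": 2, "day": 1, "duration": 90, "can_parallelize": True},
    {"step": 3, "day": 1, "duration": 30, "can_parallelize": False},
]
print(optimize_parallelization(schedule))
steps = [{"step": 6, "details": "Add 50 μL stop solution (2M H2SO4) - CAUTION: corrosive"}]
print(SafetyValidator().validate(steps))
```

To match the call in the agent loop, `optimize_parallelization` would sit on `SchedulePlanner` as a method. Sequentially dependent steps cannot truly overlap, so the reported saving is best read as an upper bound.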

def llm_call(prompt, max_tokens=200):
   try:
       inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
       outputs = model.generate(
           **inputs, max_new_tokens=max_tokens, do_sample=True,
           temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id
       )
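       # Decode only the newly generated tokens; slicing the decoded string by
       # len(prompt) breaks when the prompt was truncated or re-tokenized.
       new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
       return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
   except Exception:
       # Canned fallback so the pipeline still completes if generation fails.
       return "Batch similar temperature steps together. Pre-warm instruments."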


def agent_loop(protocol_text, inventory_csv, start_time="09:00"):
   print("\n🔬 AGENT STARTING PROTOCOL ANALYSIS...\n")
   parser = ProtocolParser()
   steps = parser.read_protocol(protocol_text)
   print(f"📄 Parsed {len(steps)} protocol steps")
   inventory = InventoryManager(inventory_csv)
   reagents = inventory.extract_reagents(protocol_text)
   print(f"🧪 Identified {len(reagents)} reagents: {', '.join(reagents[:5])}...")
   inv_issues = inventory.check_availability(reagents)
   validator = SafetyValidator()
   safety_risks = validator.validate(steps)
   planner = SchedulePlanner()
   schedule = planner.make_schedule(steps, start_time)
   parallel_opts, time_saved = planner.optimize_parallelization(schedule)
   total_time = sum(s['duration'] for s in schedule)
   optimized_time = total_time - time_saved
   opt_prompt = f"Protocol has {len(steps)} steps, {total_time} min total. Key bottleneck optimization:"
   optimization = llm_call(opt_prompt, max_tokens=80)
   return {
       'steps': steps, 'schedule': schedule, 'inventory_issues': inv_issues,
       'safety_risks': safety_risks, 'parallelization': parallel_opts,
       'time_saved': time_saved, 'total_time': total_time,
       'optimized_time': optimized_time, 'ai_optimization': optimization,
       'reagents': reagents
   }

We construct the agent loop, integrating perception, planning, validation, and revision into a single, coherent flow. We prompt CodeGen with a short summary of the protocol (step count and total duration) and attach its free-text suggestion to the results alongside the rule-based parallelization hints.
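The prompt in `agent_loop` carries only the step count and the total duration. One possible refinement, sketched here with a hypothetical helper `build_optimization_prompt` that is not part of the tutorial, is to list the longest steps so the model has concrete bottlenecks to complete against. Because CodeGen-350M-mono is trained mainly on Python code, the sketch frames the prompt as code comments; whether this improves the suggestions is an untested assumption.

```python
def build_optimization_prompt(steps, top_n=3):
    """Format the longest steps as a comment-style prompt for a code model."""
    total = sum(s["duration_min"] for s in steps)
    longest = sorted(steps, key=lambda s: s["duration_min"], reverse=True)[:top_n]
    lines = [f"# Protocol: {len(steps)} steps, {total} min total."]
    for s in longest:
        lines.append(
            f"# Step {s['step']} ({s['name']}): {s['duration_min']} min at {s['temp']}"
        )
    lines.append("# Key bottleneck optimization:")
    return "\n".join(lines)

# Illustrative step records in the shape produced by ProtocolParser.
steps = [
    {"step": 1, "name": "Coating", "duration_min": 720, "temp": "4C"},
    {"step": 2, "name": "Blocking", "duration_min": 60, "temp": "RT"},
    {"step": 3, "name": "Sample Incubation", "duration_min": 120, "temp": "RT"},
]
print(build_optimization_prompt(steps, top_n=2))
```

The returned string can be passed to `llm_call` in place of `opt_prompt`.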

def generate_checklist(results):
   md = "# 🔬 WET-LAB PROTOCOL CHECKLIST\n\n"
   md += f"**Total Steps:** {len(results['schedule'])}\n"
   md += f"**Estimated Time:** {results['total_time']} min ({results['total_time']//60}h {results['total_time']%60}m)\n"
   md += f"**Optimized Time:** {results['optimized_time']} min (save {results['time_saved']} min)\n\n"
   md += "## ⏱️ TIMELINE\n"
   current_day = 0  # start at 0 so the first day's header is printed too
   for item in results['schedule']:
       if item['day'] > current_day:
           md += f"\n### Day {item['day']}\n"
           current_day = item['day']
       parallel = " 🔄" if item['can_parallelize'] else ""
       md += f"- [ ] **{item['start']}-{item['end']}** | Step {item['step']}: {item['name']} ({item['temp']}){parallel}\n"
   md += "\n## 🧪 REAGENT PICK-LIST\n"
   for reagent in results['reagents']:
       md += f"- [ ] {reagent}\n"
   md += "\n## ⚠️ SAFETY & INVENTORY ALERTS\n"
   all_issues = results['safety_risks'] + results['inventory_issues']
   if all_issues:
       for risk in all_issues:
           md += f"- {risk}\n"
   else:
       md += "- ✅ No critical issues detected\n"
   md += "\n## ✨ OPTIMIZATION TIPS\n"
   for tip in results['parallelization']:
       md += f"- {tip}\n"
   md += f"- 💡 AI Suggestion: {results['ai_optimization']}\n"
   return md


def generate_gantt_csv(schedule):
   df = pd.DataFrame(schedule)
   return df.to_csv(index=False)

We create output generators that transform results into human-readable Markdown checklists and Gantt-compatible CSVs. We ensure that every execution produces clear summaries of reagents, time savings, and safety or inventory alerts for streamlined lab operations.
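As a sketch of how a downstream tool might consume the Gantt CSV, the snippet below reads a CSV string with the same columns as the schedule records and totals the minutes per day. The helper name `minutes_per_day` and the sample rows are illustrative assumptions, not output from the tutorial's run. The standard `csv` module is used because the `safety` column can contain commas, which arrive quoted.

```python
import csv
from collections import defaultdict
from io import StringIO

def minutes_per_day(gantt_csv: str) -> dict:
    """Sum the 'duration' column per 'day' from a Gantt CSV string."""
    totals = defaultdict(int)
    for row in csv.DictReader(StringIO(gantt_csv)):
        totals[int(row["day"])] += int(row["duration"])
    return dict(totals)

# Illustrative rows using the schedule's column layout.
sample = (
    "step,name,start,end,duration,temp,day,can_parallelize,safety\n"
    '1,Coating,09:00,09:00,720,4C,1,True,"BSL-2/3, HAZARD"\n'
    "2,Blocking,09:00,10:00,60,RT,2,False,None\n"
    "3,Sample Incubation,10:00,12:00,120,RT,2,True,None\n"
)
print(minutes_per_day(sample))  # {1: 720, 2: 180}
```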

SAMPLE_PROTOCOL = """ELISA Protocol for Cytokine Detection


1. Coating (Day 1, 4°C overnight)
  - Dilute capture antibody to 2 μg/mL in coating buffer (pH 9.6)
  - Add 100 μL per well to 96-well plate
  - Incubate at 4°C overnight (12-16 hours)
  - BSL-2 cabinet required


2. Blocking (Day 2)
  - Wash plate 3× with PBS-T (200 μL/well)
  - Add 200 μL blocking buffer (1% BSA in PBS)
  - Incubate 1 hour at room temperature


3. Sample Incubation
  - Wash 3× with PBS-T
  - Add 100 μL diluted samples/standards
  - Incubate 2 hours at room temperature


4. Detection Antibody
  - Wash 5× with PBS-T
  - Add 100 μL biotinylated detection antibody (0.5 μg/mL)
  - Incubate 1 hour at room temperature


5. Streptavidin-HRP
  - Wash 5× with PBS-T
  - Add 100 μL streptavidin-HRP (1:1000 dilution)
  - Incubate 30 minutes at room temperature
  - Work in dark


6. Development
  - Wash 7× with PBS-T
  - Add 100 μL TMB substrate
  - Incubate 10-15 minutes (monitor color development)
  - Add 50 μL stop solution (2M H2SO4) - CAUTION: corrosive
"""


SAMPLE_INVENTORY = """reagent,quantity,unit,expiry,lot
capture antibody,500,μg,2025-12-31,AB123
blocking buffer,500,mL,2025-11-30,BB456
PBS-T,1000,mL,2026-01-15,PT789
detection antibody,8,μg,2025-10-15,DA321
streptavidin HRP,10,mL,2025-12-01,SH654
TMB substrate,100,mL,2025-11-20,TM987
stop solution,250,mL,2026-03-01,SS147
BSA,100,g,2024-09-30,BS741"""


results = agent_loop(SAMPLE_PROTOCOL, SAMPLE_INVENTORY, start_time="09:00")
print("\n" + "="*70)
print(generate_checklist(results))
print("\n" + "="*70)
print("\n📊 GANTT CSV (first 400 chars):\n")
print(generate_gantt_csv(results['schedule'])[:400])
print("\n🎯 Time Savings:", f"{results['time_saved']} minutes via parallelization")

We run the full pipeline on a sample ELISA protocol and a reagent inventory dataset. We print the agent’s outputs, including the schedule, parallelization gains, and AI-suggested improvements, demonstrating how our planner functions as a self-contained lab assistant.

In conclusion, we demonstrated how agentic AI principles can support reproducibility and safety in wet-lab workflows. By parsing free-form experimental text into structured, actionable plans, we automated protocol validation, reagent management, and schedule optimization in a single pipeline. Running CodeGen locally keeps the suggestion step on-device, so protocol data does not leave the machine. The result is a working planner that generates Gantt-compatible schedules, Markdown checklists, and AI-generated optimization tips, which can serve as a foundation for autonomous laboratory planning systems.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.
