NLP Engineer & Computer Vision – Rebuild OCR→LLM Comic Translation Pipeline (Convex + Python) - Contract to Hire

Work from home Full-time role Hiring

We’re hiring an experienced Computer Vision + NLP Engineer to rebuild our entire Korean → English comic translation tool from scratch. The current system works, but we need a clean, modular, much faster, and more accurate version built on top of Convex instead of Supabase for real-time updates. You will replicate the existing workflow exactly, and improve it across accuracy, performance, and architecture. This is a rebuild from scratch.

---

# Current Validated Workflow (What You Will Rebuild and Improve)

Our existing tool processes full chapters with this pipeline:

1. Upload – Chapter images are uploaded.
2. Text Detection – CRAFT generates bounding boxes around text.
3. Text Extraction (OCR) – Gemini 2.5 Pro extracts Korean text inside each bounding box.
4. Panel Detection – OpenCV identifies comic panels in each image.
5. Panel Filtering – Gemini 2.5 Pro removes inaccurate/outlier panels.
6. Alignment – Remaining text boxes are matched to their correct panels.
7. Translation – Gemini 2.5 Pro produces English translations using panel and chapter context.

This workflow is already validated and must behave the same, just faster, cleaner, more accurate, and modular.

---

# Your Job in This Project

Rebuild this entire system from zero with a modern, maintainable architecture that gives us:

Better accuracy

  • More precise bounding boxes
  • Higher OCR accuracy (including stylized Korean fonts)
  • Better panel detection and filtering
  • More consistent, human-like translations
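To make the alignment step of the workflow concrete (matching text boxes to their panels), here is a minimal sketch, not our current implementation. It assumes axis-aligned boxes and assigns each text box to the panel that covers the largest share of its area. The names `Box`, `overlap_area`, `align_text_to_panels`, and the `min_fraction` threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes (0.0 if they do not overlap)."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(0.0, w) * max(0.0, h)


def align_text_to_panels(
    text_boxes: List[Box],
    panels: List[Box],
    min_fraction: float = 0.5,  # assumed threshold, tune on real chapters
) -> List[Optional[int]]:
    """For each text box, return the index of the panel covering most of it.

    A text box is left unassigned (None) when no panel covers at least
    `min_fraction` of its area.
    """
    assignments: List[Optional[int]] = []
    for tb in text_boxes:
        best_idx: Optional[int] = None
        best_frac = 0.0
        for i, panel in enumerate(panels):
            frac = overlap_area(tb, panel) / tb.area if tb.area > 0 else 0.0
            if frac > best_frac:
                best_idx, best_frac = i, frac
        assignments.append(best_idx if best_frac >= min_fraction else None)
    return assignments
```

Unassigned boxes (for example, sound effects drawn outside any panel) come back as `None`, so a later step can decide how to handle them.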

Much faster overall performance

  • Dramatically reduced processing time per chapter
  • Efficient batching and async operations
  • Minimal latency from upload to final results
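As an illustration of the kind of async batching we have in mind, here is a minimal sketch using only Python's standard `asyncio`. It runs one worker per page while capping how many run at once, which is useful when the worker calls a rate-limited OCR or LLM API. `run_bounded`, `fake_ocr`, and `max_concurrency` are hypothetical names, and the worker is a stand-in rather than a real OCR call.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrency: int = 4,
) -> List[R]:
    """Run `worker` on every item, with at most `max_concurrency` in flight.

    Results are returned in the same order as `items`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: T) -> R:
        # The semaphore blocks here once the concurrency limit is reached.
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_ocr(page: int) -> str:
    """Stand-in for a real OCR request (assumption: one call per page)."""
    await asyncio.sleep(0.01)
    return f"page-{page}"


if __name__ == "__main__":
    print(asyncio.run(run_bounded(range(5), fake_ocr, max_concurrency=2)))
```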

A modular, replaceable architecture

Every step must be isolated behind a clear interface so we can easily swap components:

  • Replace CRAFT → PaddleOCR / Donut / YOLOv8 detector
  • Replace Gemini → GPT or another LLM
  • Replace panel detector without touching text logic
  • Swap OCR engines freely (Paddle, Donut, TrOCR, GPT fallback)
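One way to read "isolated behind a clear interface" is sketched below, using `typing.Protocol` so that any class with the right method can be plugged in. This is an illustration under our own assumptions, not a required design: the interface names (`TextDetector`, `OcrEngine`, `Translator`), the `Pipeline` class, and the stub implementations are all hypothetical. A panel detector would follow the same pattern.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

BoxCoords = Tuple[int, int, int, int]  # x1, y1, x2, y2


class TextDetector(Protocol):
    def detect(self, image: bytes) -> List[BoxCoords]: ...


class OcrEngine(Protocol):
    def read(self, image: bytes, box: BoxCoords) -> str: ...


class Translator(Protocol):
    def translate(self, texts: Sequence[str], context: str) -> List[str]: ...


@dataclass
class Pipeline:
    """Depends only on the interfaces, never on a concrete model."""
    detector: TextDetector
    ocr: OcrEngine
    translator: Translator

    def run(self, image: bytes, context: str = "") -> List[Tuple[str, str]]:
        boxes = self.detector.detect(image)
        korean = [self.ocr.read(image, box) for box in boxes]
        english = self.translator.translate(korean, context)
        return list(zip(korean, english))


# Stub components for illustration only; real ones would wrap CRAFT,
# an OCR engine, and an LLM client.
class StubDetector:
    def detect(self, image: bytes) -> List[BoxCoords]:
        return [(0, 0, 10, 10), (20, 0, 30, 10)]


class StubOcr:
    def read(self, image: bytes, box: BoxCoords) -> str:
        return f"text@{box[0]}"


class StubTranslator:
    def translate(self, texts: Sequence[str], context: str) -> List[str]:
        return [t.upper() for t in texts]


pipeline = Pipeline(StubDetector(), StubOcr(), StubTranslator())
```

Swapping a component then means passing a different object to `Pipeline`; `run` itself does not change.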

Modular means no rewrites when upgrading models.

Convex-based backend

  • Real-time updates streamed to the frontend
  • Job orchestration in Convex
  • Stable state management
  • Partial outputs instead of waiting for entire chapter completion
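On the Python side, partial outputs could look roughly like the sketch below: each page is published as soon as it finishes instead of after the whole chapter. It uses only standard `asyncio`. `stream_chapter`, `fake_translate_page`, and the shape of the update dictionary are hypothetical, and `publish` is a placeholder for whatever writes progress to the backend (for example, a call that triggers a Convex mutation); no real Convex API is shown here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable

Update = Dict[str, Any]


async def stream_chapter(
    pages: Iterable[int],
    process_page: Callable[[int], Awaitable[Any]],
    publish: Callable[[Update], Awaitable[None]],
) -> None:
    """Process all pages concurrently and publish each result when ready."""
    tasks = [asyncio.ensure_future(process_page(page)) for page in pages]
    total = len(tasks)
    # as_completed yields awaitables in completion order, not page order.
    for done, finished in enumerate(asyncio.as_completed(tasks), start=1):
        result = await finished
        await publish({"done": done, "total": total, "result": result})


async def fake_translate_page(page: int) -> Dict[str, Any]:
    """Stand-in for the full per-page pipeline (assumption)."""
    await asyncio.sleep(0.01 * (3 - page))
    return {"page": page, "status": "translated"}


async def print_update(update: Update) -> None:
    """Placeholder publisher; a real one would write to the backend."""
    print(update)


if __name__ == "__main__":
    asyncio.run(stream_chapter(range(3), fake_translate_page, print_update))
```

Because results arrive in completion order, each update carries its page number so the frontend can place it correctly.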

---

# What You Must Deliver For The $2,000 Milestone

1. Fully rebuilt pipeline implementing all steps (upload → detection → OCR → panels → alignment → translation).
2. Modular architecture where detection, OCR, panel logic, and translation can be swapped independently.
3. Convex integration for real-time syncing, job progress, and results.
4. Significant accuracy improvements over the current system.
5. Significant performance improvements (faster processing end-to-end).
6. Clean project structure with documentation for all modules and interfaces.

---

# Tech Stack You Will Use

  • Python – OCR, detection, panel processing, AI orchestration
  • TypeScript – Convex + frontend integration
  • Convex – backend database, jobs, and real-time sync
  • OCR Tools – CRAFT for text detection
  • LLMs – Gemini, GPT

---

# Required Skills

Must have

  • Strong OCR experience
  • Experience with LLM-based translation/localization
  • Python + TypeScript proficiency
  • Ability to design clean, modular system architectures
  • Experience rebuilding/refactoring complex pipelines

---

# To Apply

Please include:

  • Relevant projects (OCR, CV, LLM translation, or modular system rebuilds)
  • Examples where you improved accuracy, performance, or architecture
  • A short explanation of how you would:

1. Design a modular detection → OCR → panel → translation pipeline
2. Improve bounding boxes and OCR for stylized Korean fonts
3. Integrate Convex for real-time progress streaming to the frontend
