Founding Data and Search Engineer Job at Sail AI, Dallas, TX

  • Sail AI
  • Dallas, TX

Job Description

🚀 We're Hiring: Founding Data and Search Engineer

📍 Location: Hybrid – Dallas Area, Texas

💼 Compensation: Competitive, based on experience, plus equity (early-stage AI startup)

🔎 About Us – Sail AI

Sail AI is building AI-powered SaaS platforms for consumer and enterprise markets. Our flagship product is an AI-driven lifestyle recommendation engine that blends robust web crawling, semantic search, vector indexing, and real-time personalization to help users discover meaningful experiences. We’re solving complex discovery and search challenges using AI, LLMs, and comprehensive knowledge-based systems.

🎯 Role Overview

We’re looking for a Founding Data and Search Engineer to build and own our end-to-end data infrastructure and search capabilities from scratch. As a core member of our team, you’ll architect advanced web crawlers, develop indexing strategies, and build the knowledge graph architectures that power our AI-driven recommendations. If you’re looking for high-impact work, early-stage equity, and the chance to shape product decisions, this role is for you.

✅ Key Responsibilities

🕸 Web Crawling & Data Ingestion

  • Build distributed, resilient web crawlers from scratch to handle complex, dynamic websites.
  • Implement anti-bot evasion measures (IP rotation, CAPTCHA solving) to improve crawler success rates.
  • Develop data ingestion pipelines for continuous harvesting and normalization of web data.

🔍 Advanced Search & Indexing Systems

  • Integrate and optimize Elasticsearch/OpenSearch with vector databases (e.g., Pinecone, Weaviate) to enable hybrid semantic + keyword search.
  • Create cutting-edge indexing strategies, including vector-based embeddings and graph-based approaches.
  • Collaborate with engineers to fine-tune LLM-driven ranking algorithms and recommendation models.

🧠 Knowledge Graph Architecture

  • Design and implement scalable knowledge graphs and vector indexing systems to unify structured/unstructured data.
  • Establish data lineage and governance frameworks to ensure reproducibility and traceability.
  • Orchestrate multi-step AI workflows using tools like LangChain, LangGraph, etc.

⚙️ Early-Stage Data Infrastructure

  • Architect the foundational data infrastructure in the cloud (AWS, Azure, or GCP) with a focus on scalability and cost-efficiency.
  • Champion security and compliance for sensitive data through encryption, access controls, and adherence to regulations.
  • Drive CI/CD best practices (GitHub Actions, Jenkins) and containerization (Docker, Kubernetes) for frictionless deployments.

🤝 High-Impact Collaboration

  • Own critical product features end-to-end, shaping the intelligence that powers our recommendations.
  • Work closely with team members to align technical roadmaps with business goals.
  • Contribute to agile sprints and strategic roadmap discussions, directly influencing company growth and trajectory.

🚦 Core Technical Competencies (REQUIRED)

Crawl & Scrape Public, API, and Social Media Data

  • Develop robust scraping/crawling solutions (e.g., Scrapy, Selenium) with anti-bot evasion tactics.
  • Automate data extraction from diverse sources—public websites, APIs, social feeds.

Build Data Pipelines & Knowledge Base

  • Ingest, transform, and clean raw data (ETL/ELT workflows).
  • Store curated data in databases, knowledge graphs, or data lakes.
  • Maintain data lineage, versioning, and integrity for downstream usage.

Create Vectors & Indexes

  • Generate embedding vectors (e.g., with BERT- or GPT-based models) for semantic retrieval.
  • Construct indexes via Elasticsearch/OpenSearch or vector DBs (Milvus, Pinecone).
  • Integrate knowledge-graph or hybrid indexing strategies for advanced queries.

Train & Fine-Tune Models (RAG, Custom Fine-Tuning)

  • Use the curated knowledge base for retrieval-augmented generation.
  • Fine-tune large language models with domain-specific data.
  • Continuously iterate for relevance, performance, and accuracy.

📚 Requirements

Education: Bachelor’s in Computer Science, Data Engineering, or a related field (Master’s/PhD a plus).

Experience: 5+ years in data engineering with a focus on search, web scraping, or knowledge systems.

Technical Skills:

  • Proficiency in Python (Scrapy, FastAPI) or Java (Spring Boot).
  • Expertise in Elasticsearch/OpenSearch, vector databases, and distributed crawling with anti-bot evasion techniques.
  • Skilled with CI/CD pipelines (GitHub Actions, Jenkins) and containerization (Docker, Kubernetes).

Soft Skills:

  • Strong ownership mentality, problem-solving, and cross-team communication.
  • Comfortable in a fast-paced, early-stage startup environment.

🌟 Preferred Qualifications

  • Familiarity with LLM orchestration (LangChain, LlamaIndex) and recommendation systems.
  • Hands-on experience with knowledge graphs (Neo4j, Amazon Neptune) or graph analytics.
  • Cloud certifications (GCP, AWS, Azure).

🎉 Why Join Sail AI?

  • Founding Role & Equity: Own a meaningful stake in our high-growth AI startup.
  • High-Impact Problems: Tackle cutting-edge challenges in web crawling, semantic search, and advanced knowledge representation.
  • Innovative Technology: Pioneer the latest in LLMs, vector search, and knowledge graph engineering.
  • Collaborative Culture: Join a passionate team that values curiosity, creativity, and results.
  • Flexibility: Hybrid work setup with the freedom to shape your schedule.

✉️ How to Apply

Send your resume and GitHub portfolio to jobs@raegroup.net with the subject line:

“Founding Data and Search Engineer Application – [Your Name]”

🔗 #Hiring #BackendEngineer #AIjobs #SearchTech #VectorSearch #WebScraping #KnowledgeBase #RecommendationSystems #SemanticSearch #LLM #StartupJobs #RemoteTechJobs
