
Client-to-Course Matching

Overview

Training catalogs are huge. Providers list hundreds of courses, each with different schedules, locations, language levels, and prerequisites, plus plenty of “this is basically the same course but with a new name” entries.

Client profiles are messy too: goals, background, language, mobility, life constraints, learning style. Keyword search can’t connect those dots. Manual searching works, but it’s slow and depends on whoever happens to know the catalog best.

Key Design Decision

Decision: rules first, semantics second.

If you let an LLM rank everything from the start, it will confidently recommend stuff the client literally can’t attend. So the system runs in two stages:

Stage 1: Hard constraint filtering

Deterministic filtering removes courses that don’t fit reality: travel distance/time, schedule conflicts, language requirements the client doesn’t meet, missing prerequisites, and other “this is a no” conditions.
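A minimal sketch of what this kind of filter could look like. The field names (`travel_minutes`, `min_language_level`, and so on) and the integer language scale are assumptions made for illustration, not the system's actual schema. Each check is a plain comparison, so the same input always yields the same result, and the reasons for exclusion can be shown to the case worker.

```python
from dataclasses import dataclass


@dataclass
class Course:
    # Hypothetical fields; a real catalog will look different.
    course_id: str
    travel_minutes: int
    weekdays: frozenset[str]
    min_language_level: int  # assumed scale, e.g. 1 = A1 ... 6 = C2
    prerequisites: frozenset[str] = frozenset()


@dataclass
class Client:
    max_travel_minutes: int
    unavailable_weekdays: frozenset[str]
    language_level: int
    completed: frozenset[str] = frozenset()


def violations(client: Client, course: Course) -> list[str]:
    """Return every hard constraint the course breaks for this client."""
    reasons = []
    if course.travel_minutes > client.max_travel_minutes:
        reasons.append("travel time too long")
    if course.weekdays & client.unavailable_weekdays:
        reasons.append("schedule conflict")
    if course.min_language_level > client.language_level:
        reasons.append("language level not met")
    if not course.prerequisites <= client.completed:
        reasons.append("missing prerequisites")
    return reasons


def hard_filter(client: Client, courses: list[Course]) -> list[Course]:
    """Keep only courses with no hard-constraint violations."""
    return [c for c in courses if not violations(client, c)]
```

Returning the list of reasons rather than a bare yes/no is a design choice in this sketch: it keeps the “this is a no” decisions explainable when someone asks why a course never showed up.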

Stage 2: Semantic ranking

Once the list is down to a manageable size (usually ~20–50 courses), the model ranks the remaining options for fit: career direction, skill gaps, course focus, format preference (online vs in-person), and what the client actually said they want.

Output is a ranked list with short reasoning per option. It doesn’t “decide.” It gives the case worker a strong shortlist that’s easy to sanity-check.
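The shape of the ranking step can be sketched like this. The scorer is deliberately pluggable: `overlap_scorer` below is a toy word-overlap stand-in so the example runs without any model, and it is not how the real system scores fit. All names here (`Ranked`, `rank`, `overlap_scorer`) are hypothetical.

```python
from typing import Callable, NamedTuple


class Ranked(NamedTuple):
    course_id: str
    score: float
    reasoning: str


# A scorer takes (client goals, course description) and returns (score, reasoning).
Scorer = Callable[[str, str], tuple[float, str]]


def overlap_scorer(client_goals: str, course_description: str) -> tuple[float, str]:
    """Toy stand-in for the model call: Jaccard overlap of lowercase word sets."""
    goals = set(client_goals.lower().split())
    desc = set(course_description.lower().split())
    shared = goals & desc
    union = goals | desc
    score = len(shared) / len(union) if union else 0.0
    if shared:
        reasoning = "shared terms: " + ", ".join(sorted(shared))
    else:
        reasoning = "no shared terms"
    return score, reasoning


def rank(
    client_goals: str,
    shortlist: dict[str, str],
    scorer: Scorer = overlap_scorer,
) -> list[Ranked]:
    """Score each shortlisted course and return them best-first with reasoning."""
    results = [
        Ranked(course_id, *scorer(client_goals, description))
        for course_id, description in shortlist.items()
    ]
    return sorted(results, key=lambda r: r.score, reverse=True)
```

Swapping in a model-backed scorer would only require a function with the same signature. The ranked output keeps the reasoning string next to each score, so the case worker can check the “why” and not just the order.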

Constraint and Tradeoff

Constraint: course data is not clean.

When catalogs come from scraped pages or third-party APIs, you get missing schedules, outdated locations, and fields that are “technically present” but basically useless.

So instead of pretending the data is perfect, the system surfaces the gaps as explicit flags on each result: “schedule missing”, “last updated 6 months ago”, “requirements unclear”. That way the case worker knows what needs verification before recommending anything.
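Here is a rough sketch of how such flags might be derived from a course record. The record fields, the 180-day staleness threshold, and the exact flag wording are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CourseRecord:
    # Hypothetical fields; scraped or third-party data may leave any of them empty.
    course_id: str
    schedule: Optional[str]
    requirements: Optional[str]
    last_updated: Optional[date]


def quality_flags(
    record: CourseRecord,
    today: date,
    stale_after_days: int = 180,  # assumed threshold, roughly six months
) -> list[str]:
    """Return human-readable warnings about missing or stale fields."""
    flags = []
    if not (record.schedule or "").strip():
        flags.append("schedule missing")
    if not (record.requirements or "").strip():
        flags.append("requirements unclear")
    if record.last_updated is None:
        flags.append("last update unknown")
    else:
        age_days = (today - record.last_updated).days
        if age_days > stale_after_days:
            flags.append(f"last updated {age_days} days ago")
    return flags
```

Passing `today` in explicitly, instead of reading the clock inside the function, keeps the check deterministic and easy to test.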

Tradeoff: you don’t get magical certainty. You get honest results that are safer to use. In this domain, that’s the whole point.

What This Connects To

The selected matches feed into the report generation system. If a course is recommended, the report can reference the match reasoning so the “why this course” part is traceable and defensible.
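One way to picture that handoff is a small, serializable record per selected match. This is a sketch under assumptions: the `MatchRecord` fields and the JSON payload format are invented for illustration, and the actual interface to the report generator is not described here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MatchRecord:
    # Hypothetical shape of what gets passed along to report generation.
    client_id: str
    course_id: str
    rank: int
    reasoning: str
    data_flags: list[str]


def to_report_payload(selected: list[MatchRecord]) -> str:
    """Serialize selected matches so the report can cite the match reasoning."""
    return json.dumps([asdict(m) for m in selected], indent=2, ensure_ascii=False)
```

Keeping the reasoning and any data flags attached to the match means the report can quote them directly, without having to reconstruct the “why” after the fact.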
