Evaluate AvelinLabs in your workflow

AvelinLabs is a workforce intelligence layer for AI-native hiring systems. This guide shows how to evaluate whether it fits your workflow: run real job text through the documented beta endpoints, then review the response fields and examples, including confidence and uncertainty signals, review routing, weak-signal handling, ambiguity detection, explanations, and response-derived metrics.

The practical question is:

How do I evaluate whether AvelinLabs fits my workflow?

AvelinLabs is not:

a job board
a raw job-data feed
an ATS
a recruiting agency
a generic AI wrapper

Treat it as a structured intelligence layer that helps your product turn role text into occupation alignment, O*NET-grounded classification, skill evidence, confidence-aware signals, and review-oriented explanations.

Who this guide is for

This guide is for teams evaluating where AvelinLabs would fit in an existing product, platform, or data workflow:

HR tech platforms
recruiting and staffing platforms
workforce analytics teams
job-data enrichment providers
AI-native hiring and workforce products

What you can evaluate today

During beta, use the documented APIs and examples to evaluate whether AvelinLabs returns useful decision-support signals for your workflow. The most useful first endpoint is POST /api/v1/job/analyze because it returns ranked occupation candidates, evidence, explanations, confidence, uncertainty, ambiguity, weak-signal context, and routing signals.

You can evaluate:

occupation alignment
O*NET-grounded classification
skill evidence
confidence and uncertainty
ambiguity detection
weak-signal handling
review routing
explanation and evidence coverage
response-derived metrics

Start with the Job Analyze API reference, then use the Response Field Reference and examples guide while reviewing outputs.

For concrete strong, title-only, vague, ambiguous, and noisy input comparisons, see Input quality examples.
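A first call to POST /api/v1/job/analyze could be prepared along these lines. This is a minimal sketch, not the documented client: the base URL, the `Authorization: Bearer` header, and the `title` and `description` body fields are assumptions made for illustration, so check the Job Analyze API reference for the actual request shape and authentication scheme. The function only builds the request; sending it is left as a commented-out step.

```python
import json
import urllib.request

# Hypothetical base URL: replace with the one provided with your beta access.
BASE_URL = "https://api.example.com"
ANALYZE_PATH = "/api/v1/job/analyze"  # endpoint named in this guide


def build_analyze_request(api_key: str, title: str, description: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Job Analyze endpoint."""
    # Assumed body shape; the real field names are defined in the API reference.
    payload = {"title": title, "description": description}
    return urllib.request.Request(
        url=BASE_URL + ANALYZE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; confirm how runtime API keys are passed.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_analyze_request(
    api_key="YOUR_RUNTIME_API_KEY",
    title="Data Analyst",
    description="Builds SQL reports and dashboards for the finance team.",
)

# To send it once the assumptions above are confirmed:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     response = json.load(resp)
```

Keeping request construction separate from sending makes it easy to log or diff the exact payloads used for strong and weak inputs during the evaluation.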

30-60 minute evaluation flow

1. Start with a representative job description.

Use a real role from your product workflow, support queue, customer data model, or sample intake process. Remove secrets, credentials, unnecessary personal data, and content that is not needed for analysis.

2. Run the Job Analyze endpoint.

Send the role title and skill-bearing description to POST /api/v1/job/analyze using a runtime API key. Use the documented request shape in the Job Analyze API reference. If you want runnable examples, use the official examples repository.

3. Inspect ranked occupation results.

Review results[] as a ranked set of candidate occupations. Do not evaluate only the first title. Look at whether the ranked candidates make sense for the role, whether alternatives are plausible, and whether the top result is clearly stronger than the rest.

4. Check O*NET codes and occupation labels.

Review results[].onet_code and results[].title. Decide whether the O*NET mapping is usable for your downstream workflow, such as normalization, profile lookup, enrichment, analytics, or reviewer display.

5. Review matching skills and explanations.

Inspect results[].matching_skills, job_signals, and results[].explanation.why_match. A useful output should give your reviewers or product UI enough evidence to understand why an occupation was returned.

6. Inspect confidence, trust, and uncertainty.

Review confidence, confidence_level, trust_score, and uncertainty.total. These fields are decision-support signals. Use them to design customer-specific thresholds for display, review, automation, or input improvement.

7. Check decision routing and ambiguity signals.

Review decision.decision, decision.reason, is_ambiguous, domain_is_ambiguous, weak_signal_detected, and weak_signal_reason. These fields help decide whether the output can be accepted into a low-risk workflow, routed to human review, rejected for your use case, or sent back upstream for better input.

8. Try a weak, vague, or noisy job description.

Test at least one title-only, generic, ambiguous, overly short, or noisy input. For example, compare a strong “Data Analyst” role description with a sparse input such as “Analyst, helps teams with reports.”

9. Compare outputs across strong vs weak inputs.

A strong input should usually provide clearer occupation evidence, stronger skill coverage, and more useful explanations. A weak input may produce lower confidence, higher uncertainty, review routing, ambiguity signals, or weak-signal warnings.

10. Decide what can be automated, reviewed, rejected, or improved upstream.

Map response fields into your own workflow policy. Decide which cases can use automatic display or enrichment, which require reviewer approval, which should be rejected for your workflow, and which should ask the user or upstream system for better job text.
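The field review and policy mapping in the steps above can be sketched as a small triage function. The field names come from this guide; everything else is an assumption made for illustration: that confidence, uncertainty.total, and the ambiguity and weak-signal flags sit at the top level of the response, that confidence and uncertainty are numbers on a 0 to 1 scale, and that the thresholds and outcome labels shown here stand in for your own policy. The sample response is hand-written to show the shape and is not real API output.

```python
from typing import Any

# Placeholder thresholds for illustration; tune them against your own batch.
MIN_CONFIDENCE = 0.8
MAX_UNCERTAINTY = 0.3


def triage(response: dict[str, Any]) -> str:
    """Map one Job Analyze response to a workflow outcome (hypothetical labels)."""
    results = response.get("results") or []
    if not results:
        return "reject"  # nothing to display or review

    # Sparse or noisy input: ask upstream for better job text.
    if response.get("weak_signal_detected"):
        return "improve_input"

    # Ambiguity is a review signal, not an error.
    if response.get("is_ambiguous") or response.get("domain_is_ambiguous"):
        return "review"

    top = results[0]
    has_evidence = bool(top.get("matching_skills")) and bool(
        (top.get("explanation") or {}).get("why_match")
    )
    confidence = response.get("confidence") or 0.0
    uncertainty = (response.get("uncertainty") or {}).get("total")
    if uncertainty is None:
        uncertainty = 1.0  # treat missing uncertainty as maximally uncertain

    # Confidence alone is not the rule: evidence and uncertainty must agree.
    if has_evidence and confidence >= MIN_CONFIDENCE and uncertainty <= MAX_UNCERTAINTY:
        return "auto_accept"
    return "review"


# Hand-written sample shaped like the fields this guide names; not real output.
sample = {
    "results": [
        {
            "onet_code": "00-0000.00",
            "title": "Example Occupation",
            "matching_skills": ["SQL", "reporting"],
            "explanation": {"why_match": "Placeholder explanation text."},
        }
    ],
    "confidence": 0.86,
    "uncertainty": {"total": 0.12},
    "is_ambiguous": False,
    "domain_is_ambiguous": False,
    "weak_signal_detected": False,
}

print(triage(sample))  # auto_accept
print(triage({**sample, "weak_signal_detected": True}))  # improve_input
```

The sketch deliberately ignores decision.decision so that your own policy can be compared against the routing the API suggests, rather than silently replacing one with the other.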

What to measure during evaluation

Measure outputs over a small batch of representative roles. Include strong inputs, weak inputs, ambiguous roles, and noisy examples that reflect your real product intake.

The Response Field Reference metrics section describes practical metrics you can derive from responses. For a first evaluation, track:

classification coverage
O*NET mapping coverage
high-confidence occupation coverage
auto-accept, review, and ambiguity rates
weak-signal rate
uncertainty distribution
skill-evidence coverage
explanation coverage

Define your denominator before testing. For example, decide whether a “usable” response means any occupation result, an occupation with results[].onet_code, a high-confidence result, a result with enough explanation coverage, or a result that satisfies your review policy.
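A few of these rates could be derived from a batch of saved responses as in the sketch below. It assumes each response is a parsed JSON object with the fields named in this guide, that the flags sit at the top level, that O*NET codes follow the standard O*NET-SOC pattern (two digits, hyphen, four digits, dot, two digits), and that the denominator is every response in the batch. The two-item batch is hand-written with placeholder values.

```python
import re
from typing import Any, Callable

# Standard O*NET-SOC code shape, e.g. "15-2051.00".
ONET_CODE = re.compile(r"^\d{2}-\d{4}\.\d{2}$")


def top_result(response: dict[str, Any]) -> dict[str, Any]:
    """Return the first ranked result, or an empty dict if there is none."""
    results = response.get("results") or []
    return results[0] if results else {}


def batch_metrics(responses: list[dict[str, Any]]) -> dict[str, float]:
    """Compute coverage rates; the denominator is all responses in the batch."""
    total = len(responses)
    if total == 0:
        return {}

    def rate(predicate: Callable[[dict[str, Any]], bool]) -> float:
        return sum(1 for r in responses if predicate(r)) / total

    return {
        "classification_coverage": rate(lambda r: bool(r.get("results"))),
        "onet_mapping_coverage": rate(
            lambda r: bool(ONET_CODE.match(str(top_result(r).get("onet_code", ""))))
        ),
        "ambiguity_rate": rate(lambda r: bool(r.get("is_ambiguous"))),
        "weak_signal_rate": rate(lambda r: bool(r.get("weak_signal_detected"))),
        "skill_evidence_coverage": rate(lambda r: bool(top_result(r).get("matching_skills"))),
        "explanation_coverage": rate(
            lambda r: bool((top_result(r).get("explanation") or {}).get("why_match"))
        ),
    }


# Hand-written placeholder batch: one well-evidenced response, one sparse one.
batch = [
    {
        "results": [{
            "onet_code": "00-0000.00",
            "title": "Example Occupation",
            "matching_skills": ["SQL"],
            "explanation": {"why_match": "Placeholder explanation text."},
        }],
        "is_ambiguous": False,
        "weak_signal_detected": False,
    },
    {
        "results": [{"onet_code": "00-0000.00", "title": "Example Occupation",
                     "matching_skills": [], "explanation": {}}],
        "is_ambiguous": True,
        "weak_signal_detected": True,
    },
]

metrics = batch_metrics(batch)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Changing the denominator, for example to only responses with at least one result, is a one-line change to `total` and the filtered list, which is why it helps to fix that choice before comparing runs.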

What good evaluation inputs look like

Strong inputs usually include:

clear job title
responsibilities
skills
seniority
domain context
enough text to infer occupation and skill evidence

Weak inputs may be:

title-only
overly short
vague
ambiguous
generic
noisy
missing evidence
not clearly occupational

How to interpret strong vs weak results

AvelinLabs should not always return high confidence. Expect weak, vague, noisy, or ambiguous inputs to produce more cautious signals, and treat that caution as part of what you are evaluating.

When input evidence is strong, look for aligned occupation candidates, O*NET codes, matching skills, and explanations that a reviewer can understand. Then evaluate whether confidence, trust, uncertainty, and decision routing are consistent with your workflow policy.

When input evidence is weak, do not assume a low-confidence response is wrong. A low-confidence or high-uncertainty response can be useful if it tells your product to route the case to review, request better job text, surface ambiguity, or avoid automatic downstream use.

Good evaluation compares both cases. The goal is not to force every response into an automatic result. The goal is to see whether AvelinLabs gives your workflow enough structure to decide what to automate, what to review, and what to improve upstream.
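One way to make that comparison concrete is to lay the decision-support fields from a strong and a weak response side by side. The sketch below assumes the fields named in this guide sit at the top level of each response; both sample responses are hand-written placeholders, not real API output.

```python
from typing import Any

SIGNAL_FIELDS = (
    "confidence",
    "confidence_level",
    "trust_score",
    "is_ambiguous",
    "domain_is_ambiguous",
    "weak_signal_detected",
    "weak_signal_reason",
)


def signals(response: dict[str, Any]) -> dict[str, Any]:
    """Flatten the decision-support fields of one response into a single row."""
    row = {name: response.get(name) for name in SIGNAL_FIELDS}
    row["uncertainty.total"] = (response.get("uncertainty") or {}).get("total")
    decision = response.get("decision") or {}
    row["decision.decision"] = decision.get("decision")
    row["decision.reason"] = decision.get("reason")
    return row


def differing_signals(strong: dict[str, Any], weak: dict[str, Any]) -> dict[str, tuple]:
    """Return only the fields whose values differ, as (strong, weak) pairs."""
    a, b = signals(strong), signals(weak)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


# Placeholder responses for illustration only.
strong_response = {
    "confidence": 0.9,
    "uncertainty": {"total": 0.1},
    "weak_signal_detected": False,
}
weak_response = {
    "confidence": 0.4,
    "uncertainty": {"total": 0.6},
    "weak_signal_detected": True,
    "weak_signal_reason": "placeholder reason",
}

diff = differing_signals(strong_response, weak_response)
for field, (strong_value, weak_value) in diff.items():
    print(f"{field}: strong={strong_value!r} weak={weak_value!r}")
```

Fields that do not differ between a strong and a weak input are worth a closer look before you build thresholds on them.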

How not to use AvelinLabs

Do not treat outputs as final automated hiring decisions.
Do not use confidence alone as a final decision rule.
Do not assume every low-confidence response is wrong.
Do not interpret beta APIs as final stable contracts.
Do not claim legal or compliance suitability from the API alone.

Use AvelinLabs outputs as decision-support signals inside your own product policy, review process, and customer workflow.

Evaluation checklist

[ ] I tested at least one strong job description.
[ ] I tested at least one weak or ambiguous job description.
[ ] I reviewed occupation titles and O*NET codes.
[ ] I inspected matching skills and explanations.
[ ] I checked confidence, uncertainty, and trust signals.
[ ] I reviewed decision routing and ambiguity fields.
[ ] I checked weak-signal fields for sparse or noisy inputs.
[ ] I defined customer-specific thresholds for automation vs review.
[ ] I measured classification, O*NET mapping, confidence, routing, weak-signal, uncertainty, skill-evidence, and explanation coverage.
[ ] I identified where AvelinLabs would fit in my product workflow.
[ ] I identified what should remain human-reviewed.