UGC Scanner — Interactive Tutorial

Model Lab
Demo Tour

A guided, animated walkthrough of the exact workflow you follow on the Model Lab page — from checking your dataset and training a model, through scoring jobs and probing site groups, to exporting the Runtime JSON that powers the browser extension.

🎯 9-step workflow
📊 Animated stats
🔍 Spotlight effects
🖱️ Animated cursor
💬 Live toasts

Demo Complete!

The cycle never stops: every label you add and every model you retrain gives the model more to learn from, and every scored job you inspect shows where it still needs work. When you're happy with your F1 score, export the Runtime JSON and load it into the browser extension.

47 Labeled Candidates
0.91 Best Precision
0.89 Best F1
8 Sites Probed
→ Open Model Lab

Interactive Tutorial

Model Lab Demo

An animated walkthrough of every step you take on the Model Lab page. Use the controls on the left to step through manually or to let the demo auto-play.

Overview

0 Labeled Candidates
0 Positive Labels
0 Negative Labels
0 Uncertain Labels
0 Candidate Rows
0 Reviewed Items
0 Hostnames
0 Saved Models
0 Feature Families
💡 Labeled Candidates is the number that drives model quality. Training needs at least 20 balanced labels (a mix of Positive and Negative) to produce a reliable model. Aim for 50 or more.
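The tip above can be sketched as a small readiness check. This is a minimal illustration, not the Model Lab's own logic: the function name, the label strings, and the 30% minority share used to define "balanced" are all assumptions, and Uncertain labels are assumed not to count toward the total.

```python
from collections import Counter

MIN_LABELS = 20            # minimum stated in the tip
TARGET_LABELS = 50         # recommended target stated in the tip
MIN_MINORITY_SHARE = 0.3   # assumption: one way to define "balanced"


def label_readiness(labels):
    """Classify a list of label strings as 'not ready', 'minimum met', or 'target met'.

    Only 'positive' and 'negative' labels are counted (an assumption);
    anything else, such as 'uncertain', is ignored.
    """
    counts = Counter(labels)
    pos, neg = counts["positive"], counts["negative"]
    total = pos + neg
    if total == 0:
        return "not ready"
    minority_share = min(pos, neg) / total
    if total < MIN_LABELS or minority_share < MIN_MINORITY_SHARE:
        return "not ready"
    if total < TARGET_LABELS:
        return "minimum met"
    return "target met"
```

For example, 10 Positive and 10 Negative labels would meet the minimum, while 19 Positive and 1 Negative would not, because the set is too lopsided under the assumed threshold.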

Train Variants

Training uses human-reviewed candidate rows only. Leave the job filter blank to use all stored labeled candidates, or enter comma-separated Job IDs to limit training to specific jobs.
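The job filter behaviour described above can be sketched as follows. The row fields `job_id` and `label` and both function names are hypothetical; the real dataset schema is not shown on this page.

```python
def parse_job_filter(raw):
    """Turn the filter text into a set of Job IDs, or None when blank (meaning all jobs)."""
    ids = {part.strip() for part in raw.split(",") if part.strip()}
    return ids or None


def select_training_rows(rows, raw_filter):
    """Keep only human-reviewed rows, optionally limited to the listed jobs.

    A row counts as human-reviewed when its 'label' field is not None
    (an assumption about how review status is stored).
    """
    job_ids = parse_job_filter(raw_filter)
    return [
        row for row in rows
        if row["label"] is not None
        and (job_ids is None or row["job_id"] in job_ids)
    ]
```

A blank filter, or one containing only commas and spaces, falls back to every labeled row.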

↓ Download Candidate Dataset CSV
keyword-aware

Keyword-Aware Model

Uses text keywords (e.g. "comments", "reply", "posted by") combined with structural HTML features. Best accuracy for UGC detection.

Feature Count: 42
🔤 Keywords: included
Algorithm: Logistic Regression
↓ Download Variant Dataset
structure-only

Structure-Only Model

Uses only DOM shape signals — element depth, sibling counts, ARIA roles. No text keywords. Useful as a baseline to measure keyword contribution.

Feature Count: 28
🏗 Keywords: excluded
Algorithm: Logistic Regression
↓ Download Variant Dataset
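To make the difference between the two variants concrete, here is a rough sketch of logistic regression scoring with and without keyword features. The `kw_` prefix, the feature names, and the weights are invented for illustration; the actual feature families and trained coefficients are not listed on this page.

```python
import math

KEYWORD_PREFIX = "kw_"  # hypothetical naming convention for keyword features


def structure_only(features):
    """Drop keyword features, leaving only structural (DOM shape) signals."""
    return {name: value for name, value in features.items()
            if not name.startswith(KEYWORD_PREFIX)}


def logistic_score(features, weights, bias=0.0):
    """Standard logistic regression: sigmoid of the weighted feature sum.

    Features without a weight contribute nothing to the sum.
    """
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative values only.
candidate = {"kw_reply": 1.0, "kw_posted_by": 1.0, "depth": 4.0, "sibling_count": 12.0}
weights = {"kw_reply": 1.2, "kw_posted_by": 0.8, "depth": -0.1, "sibling_count": 0.05}

keyword_aware_score = logistic_score(candidate, weights)
structure_only_score = logistic_score(structure_only(candidate), weights)
```

Comparing the two scores for the same candidate is one way to see how much the keywords contribute, which is the baseline role the structure-only variant plays.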

Trained Models

Use Runtime JSON when the browser extension needs a compact deployable inference bundle. Click Use to select a model for scoring jobs or the site-group probe.

Artifact | Variant | Created | Precision | Recall | F1 | Top-1
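For reference, the Precision, Recall, and F1 columns follow the standard definitions, sketched below from raw counts. The function name is an assumption, and Top-1 is left out because this page does not define how it is computed.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from confusion counts.

    Each metric is reported as 0.0 when its denominator is zero.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so it only gets high when both are high.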

Score Existing Job

Use a trained model against a completed job. This scores every stored candidate, re-ranks each page, and shows whether the top candidate is confident or needs manual review.
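The re-ranking step can be pictured roughly like this. The 0.8 confidence threshold, the tuple layout, and the status strings are assumptions made for the sketch; the page does not state the cut-off the Model Lab actually uses.

```python
from collections import defaultdict

CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off for a "confident" top candidate


def rank_pages(scored):
    """Group scored candidates by page, sort by score, and flag the top one.

    `scored` is an iterable of (page_url, candidate_id, score) tuples.
    """
    by_page = defaultdict(list)
    for page, candidate, score in scored:
        by_page[page].append((score, candidate))

    result = {}
    for page, items in by_page.items():
        items.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
        top_score, top_candidate = items[0]
        result[page] = {
            "ranking": [candidate for _, candidate in items],
            "top": top_candidate,
            "status": "confident" if top_score >= CONFIDENCE_THRESHOLD
                      else "needs review",
        }
    return result
```

Pages whose best candidate falls under the threshold are the ones worth opening for manual review.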

Recent Jobs
badffc6c-8464-4675-af85-1f4e3a762a31 | running | 227/332 | 89 detected
cc919aa8-d163-43c4-9fa9-ccdf799aab28 | completed_with_errors | 77/79 | 34 detected

Site Group Probe

Paste or upload URLs or hostnames. The probe matches them against pages you have already scanned, scores with the selected model, and groups results by hostname.
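The matching and grouping described above might look something like this sketch, which accepts both full URLs and bare hostnames. It assumes matching is by exact hostname and leaves out the scoring step; the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def hostname_of(entry):
    """Extract a lowercase hostname from a URL or a bare hostname."""
    entry = entry.strip()
    if "://" not in entry:
        entry = "//" + entry  # lets urlsplit treat a bare hostname as a netloc
    return (urlsplit(entry).hostname or "").lower()


def probe(entries, scanned_urls):
    """Group already-scanned page URLs by hostname, limited to the pasted entries."""
    wanted = {hostname_of(entry) for entry in entries}
    wanted.discard("")  # ignore blank or unparseable lines
    groups = defaultdict(list)
    for url in scanned_urls:
        host = hostname_of(url)
        if host in wanted:
            groups[host].append(url)
    return dict(groups)
```

Entries with no scanned pages simply produce no group, which mirrors the idea that the probe only works against pages you have already scanned.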